SE::Yandex::SQI - Site Quality Index Check in Yandex

Scraper Overview

SE::Yandex::SQI checks the site quality index (SQI) in Yandex. It is a very fast scraper, reaching 3000-7000 requests per minute.

You can use automatic query multiplication, substitution of subqueries from files, and enumeration of alphanumeric combinations and lists to obtain the maximum possible number of results. Result filtering lets you clean the output immediately by removing unwanted entries (using minus-words).

The A-Parser functionality allows you to save the settings of the SE::Yandex::SQI scraper for further use (presets), set a parsing schedule, and much more.

Results can be saved in the form and structure you need, thanks to the built-in powerful template engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected Data

  • Site Quality Index (Yandex SQI)
  • Data on the presence of badges on the site (1 - badge received, 0 - no badge):
    • User's Choice
    • Popular site
    • Secure connection
    • Turbo pages
    • Is the site official
  • For the "User's Choice" and "Popular site" badges, it is possible to obtain the readiness level to receive the badge as an intermediate value from 0 to 1, for example 0.4.
  • Number of reviews, rating, and score
  • Store rating in product search and store rating on Yandex Market (if this data is available for the searched site)

Use Cases

  • Evaluation of site usefulness from Yandex's point of view
  • Gathering titles

Queries

Each query is the domain of the site to check. The domain can be specified with or without the protocol, for example:

yandex.ru
google.com
vk.com
facebook.com
https://a-parser.com
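
When the domain list is assembled from mixed sources, it may help to normalize and de-duplicate it before passing it to the scraper. The following is a minimal Python sketch; `normalize_domains` is a hypothetical helper, not part of A-Parser, and stripping the protocol is optional because the scraper accepts both forms:

```python
from urllib.parse import urlsplit

def normalize_domains(lines):
    """Return unique lower-cased host names, preserving first-seen order."""
    seen = set()
    result = []
    for raw in lines:
        item = raw.strip()
        if not item:
            continue
        # urlsplit only fills netloc when a scheme or a leading "//" is present
        host = urlsplit(item if "//" in item else "//" + item).netloc.lower()
        if host and host not in seen:
            seen.add(host)
            result.append(host)
    return result

domains = normalize_domains(["yandex.ru ", "https://a-parser.com", "", "YANDEX.ru"])
```

The resulting list can be written one domain per line to a queries file.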

Output Results Examples

A-Parser supports flexible result formatting thanks to the built-in template engine Template Toolkit, which allows it to output results in an arbitrary form, as well as in a structured form, for example CSV or JSON.

Default Output

Result format:

$query: $sqi\n

Example of a result that shows the initial query and its SQI:

facebook.com: 130000
yandex.ru: -1
https://a-parser.com: 110
google.com: 120000
vk.com: 340000

If the SQI for the domain is unavailable, the result will be -1.
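
For post-processing outside A-Parser, the default output can be read back into a dictionary. This is a sketch in Python, assuming the default `$query: $sqi\n` format shown above; the function name `parse_default_output` is hypothetical. It splits on the last `": "` because a query may itself contain a colon (as in `https://`), and it maps -1 to `None`:

```python
def parse_default_output(text):
    """Map each query to its SQI, using None where the scraper reported -1."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ": " because a query may itself contain "://"
        query, _, value = line.rpartition(": ")
        sqi = int(value)
        results[query] = None if sqi == -1 else sqi
    return results

sample = """facebook.com: 130000
yandex.ru: -1
https://a-parser.com: 110
"""
parsed = parse_default_output(sample)
```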

Output in CSV Table

Result format:

[% tools.CSVline(query, sqi, rating); %]

File name:

$datefile.format().csv

Initial text:

Domain,SQI,Rating

Tip

The "Initial text" option appears in the Task Editor only after "More options" is activated. In "Initial text", enter the column names separated by commas and leave the second line empty, so that the header ends with a line break.
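
A saved CSV file can then be read with any standard CSV library. The sketch below uses Python's `csv` module and assumes the column order from the result format above (query, sqi, rating); `read_sqi_csv` and the sample rows are illustrative, not produced by A-Parser:

```python
import csv
import io

def read_sqi_csv(stream):
    """Yield (query, sqi, rating) tuples, skipping the header row."""
    reader = csv.reader(stream)
    next(reader, None)  # header row written via "Initial text"
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or truncated lines
        query, sqi, rating = row[:3]
        yield query, sqi, rating

# Illustrative content; a real file would be opened with
# open(path, newline="", encoding="utf-8")
sample = io.StringIO(
    'col1,col2,col3\n"google.com","122000","87"\n\n"vk.com","326000","73"\n'
)
rows = list(read_sqi_csv(sample))
```

Values are kept as strings here because a field may be empty when the data is unavailable for a site.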

Saving in SQL Format

Result format:

[% "INSERT INTO sqi VALUES('" _ query _ "', '" _ sqi _ "', '" _ rating _ "')\n" %]

Example of a result:

INSERT INTO sqi VALUES('google.com', '122000', '87')
INSERT INTO sqi VALUES('yandex.ru', 'none', '92')
INSERT INTO sqi VALUES('https://a-parser.com', '200', '')
INSERT INTO sqi VALUES('vk.com', '326000', '73')
INSERT INTO sqi VALUES('facebook.com', '117000', '66')
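
To check that the generated statements load cleanly, they can be run against a scratch database. This is a minimal sketch using Python's `sqlite3`; the three-column text schema for the `sqi` table is an assumption, as the table definition is not given here, and `load_inserts` is a hypothetical helper:

```python
import sqlite3

def load_inserts(statements):
    """Run the generated INSERT lines against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: three text columns matching the template's value order
    conn.execute("CREATE TABLE sqi (query TEXT, sqi TEXT, rating TEXT)")
    for line in statements.splitlines():
        line = line.strip()
        if line:
            conn.execute(line)
    conn.commit()
    return conn

sample = """INSERT INTO sqi VALUES('google.com', '122000', '87')
INSERT INTO sqi VALUES('https://a-parser.com', '200', '')
"""
conn = load_inserts(sample)
count = conn.execute("SELECT COUNT(*) FROM sqi").fetchone()[0]
```

Note that the template above inserts values without escaping single quotes, so this approach is only suitable for trusted output such as domain names.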

Dump Results to JSON

General result format:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.query = query;
obj.sqi = p1.sqi;
obj.rating = p1.rating;

obj.json %]

Initial text:

[

Final text:

]

Example of a result:

[{"query":"vk.com","rating":73,"sqi":326000},
{"query":"google.com","rating":87,"sqi":122000},
{"query":"https://a-parser.com","rating":"","sqi":200},
{"query":"yandex.ru","rating":92,"sqi":"none"},
{"query":"facebook.com","rating":66,"sqi":117000}]

Tip

To make the "Initial text" and "Final text" options available in the Task Editor, you need to activate "More options".
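
Because the initial and final text wrap the objects in `[` and `]`, the saved file is a single JSON array and can be loaded directly. A short Python sketch, assuming the field names from the template above; `sqi_by_query` is a hypothetical helper that keeps only entries with a numeric SQI:

```python
import json

def sqi_by_query(json_text):
    """Return {query: sqi}, keeping only entries whose SQI is an integer."""
    entries = json.loads(json_text)
    return {e["query"]: e["sqi"] for e in entries if isinstance(e["sqi"], int)}

sample = '''[{"query":"vk.com","rating":73,"sqi":326000},
{"query":"yandex.ru","rating":92,"sqi":"none"}]'''
result = sqi_by_query(sample)
```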

Possible Settings

Parameter | Default value | Description
AntiGate preset | default | Preset selection for Util::AntiGate; see the Util::AntiGate documentation for setup details
AntiGate preset for old captcha | default | Same as AntiGate preset, but used only for regular (old, single-image) captchas. If no preset is selected here, the preset chosen in AntiGate preset is used for such captchas.
Auto-Solve ClickCaptcha | | Automatic solving of click captchas (without using services)
Experimental img captcha max count | 1 | Maximum number of repeated captcha images per attempt
Use sessions | | Saves good sessions, allowing even faster scraping with fewer errors