SE::Yandex::SQI - Checking the Site Quality Index in Yandex
Overview of the scraper

SE::Yandex::SQI checks the site quality index in Yandex. It is a very fast scraper, with a working speed of 3000-7000 requests per minute. You can use automatic query multiplication, substitution of sub-queries from files, and iteration over alphanumeric combinations and lists to get the maximum possible number of results. Result filtering lets you clean the output immediately, removing unwanted entries by stop words.
A-Parser lets you save the parsing settings of the SE::Yandex::SQI scraper as presets for future use, set a parsing schedule, and much more.
Saving results is possible in the format and structure you need, thanks to the powerful built-in templating engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL and CSV.
Collected data
- Site Quality Index (Yandex SQI)
- Presence of badges on the site (1 - badge obtained, 0 - no badge):
  - Users' Choice
  - Popular Site
  - Secure Connection
  - Turbo Pages
  - Whether the site is official
- For the badges "Users' Choice" and "Popular Site", you can get the degree of readiness to receive the badge as an intermediate value from 0 to 1, for example 0.4.
- Number of reviews, rating, and score
- Store rating in product search and store rating on Yandex Market (if this data is available for the searched site)
Use cases
- Assessing site usefulness from Yandex's perspective
- Collecting titles
Queries
The domain of the searched site must be specified as queries. You can specify it with or without the protocol, for example:
yandex.ru
google.com
vk.com
facebook.com
https://a-parser.com
Output results examples
A-Parser supports flexible result formatting thanks to the powerful built-in templating engine Template Toolkit, which allows it to output results in an arbitrary form as well as in a structured one, such as CSV or JSON.
Default output
Result format:
$query: $sqi\n
Example of a result showing the initial query and its SQI:
facebook.com: 130000
yandex.ru: -1
https://a-parser.com: 110
google.com: 120000
vk.com: 340000
If the SQI for the domain is unavailable, the result will be -1.
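The default output is easy to post-process outside A-Parser. The following is a minimal Python sketch, not part of A-Parser itself, that reads `$query: $sqi` lines and maps the `-1` marker to `None`. The function name `parse_sqi_line` is an assumption made for illustration; the sample lines are taken from the example above.

```python
from typing import Optional, Tuple


def parse_sqi_line(line: str) -> Tuple[str, Optional[int]]:
    """Split one '$query: $sqi' line into (query, sqi)."""
    # Split on the LAST ': ' so that queries such as 'https://a-parser.com'
    # keep their protocol prefix intact.
    query, raw = line.rstrip("\n").rsplit(": ", 1)
    sqi = int(raw)
    # -1 means the SQI is unavailable for this domain.
    return query, (None if sqi == -1 else sqi)


sample = """facebook.com: 130000
yandex.ru: -1
https://a-parser.com: 110
"""

results = dict(parse_sqi_line(line) for line in sample.splitlines())
print(results)
```

Splitting from the right is the only design choice worth noting: a plain `split(":")` would break on queries that include the protocol.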
Output to CSV table
Result format:
[% tools.CSVline(query, sqi, rating); %]
File name:
$datefile.format().csv
Initial text:
Domain,SQI,Rating
For the "Initial text" option to be available in the Task Editor, you need to activate "More options". In "Initial text", write the column names separated by commas and leave the second line empty.
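A resulting CSV file can be read back with any CSV library. The sketch below uses Python's standard `csv` module and assumes three columns in the order used by the result format (query, SQI, rating). The header line is skipped, so its exact text does not matter. The helper name `read_sqi_csv` and the sample rows are illustrative assumptions; the exact quoting produced by `tools.CSVline` is not shown in this document, and `csv.reader` accepts both quoted and unquoted fields.

```python
import csv
import io


def read_sqi_csv(stream):
    """Return a list of dicts with 'query', 'sqi' and 'rating' keys."""
    reader = csv.reader(stream)
    next(reader, None)  # skip the header written via "Initial text"
    rows = []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        query, sqi, rating = row[:3]
        rows.append({"query": query, "sqi": sqi, "rating": rating})
    return rows


# Illustrative input; a real run would open the saved .csv file instead.
sample = io.StringIO(
    'Domain,SQI,Rating\n'
    '"google.com","122000","87"\n'
    '"vk.com","326000","73"\n'
)
rows = read_sqi_csv(sample)
print(rows)
```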
Saving in SQL format
Result format:
[% "INSERT INTO sqi VALUES('" _ query _ "', '" _ sqi _ "', '" _ rating _ "')\n" %]
Example result:
INSERT INTO sqi VALUES('google.com', '122000', '87')
INSERT INTO sqi VALUES('yandex.ru', 'none', '92')
INSERT INTO sqi VALUES('https://a-parser.com', '200', '')
INSERT INTO sqi VALUES('vk.com', '326000', '73')
INSERT INTO sqi VALUES('facebook.com', '117000', '66')
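To try the generated statements, they can be loaded into a throwaway database. This is a minimal sketch using Python's built-in `sqlite3` module. The table definition (three text columns named `query`, `sqi`, `rating`) is an assumption, since the document does not specify a schema. The statements are executed one per line because the template emits no trailing semicolons.

```python
import sqlite3

# Statements copied from the example result above.
statements = """INSERT INTO sqi VALUES('google.com', '122000', '87')
INSERT INTO sqi VALUES('yandex.ru', 'none', '92')
INSERT INTO sqi VALUES('https://a-parser.com', '200', '')
"""

conn = sqlite3.connect(":memory:")
# Assumed schema: every value is quoted in the template, so TEXT is used.
conn.execute("CREATE TABLE sqi (query TEXT, sqi TEXT, rating TEXT)")

for line in statements.splitlines():
    if line.strip():
        conn.execute(line)
conn.commit()

# Keep only rows whose SQI starts with a digit, and cast it to an integer.
numeric = conn.execute(
    "SELECT query, CAST(sqi AS INTEGER) FROM sqi "
    "WHERE sqi GLOB '[0-9]*' ORDER BY query"
).fetchall()
print(numeric)
```

Note that the template concatenates values directly into the statement, so this approach is only suitable for trusted input such as domain names without quote characters.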
Dump results to JSON
General result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.sqi = p1.sqi;
obj.rating = p1.rating;
obj.json %]
Initial text:
[
Final text:
]
Example result:
[{"query":"vk.com","rating":73,"sqi":326000},
{"query":"google.com","rating":87,"sqi":122000},
{"query":"https://a-parser.com","rating":"","sqi":200},
{"query":"yandex.ru","rating":92,"sqi":"none"},
{"query":"facebook.com","rating":66,"sqi":117000}]
For the "Initial text" and "Final text" options to be available in the Task Editor, you need to activate "More options".
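Because the initial and final text wrap the objects in `[` and `]`, the saved file is a single valid JSON array. The sketch below, which assumes Python's standard `json` module and a hypothetical helper `to_int_or_none`, loads such a dump and normalizes the non-numeric placeholders seen in the example (`"none"` and the empty string) to `None` before sorting by SQI.

```python
import json

# A subset of the example result above.
raw = """[{"query":"vk.com","rating":73,"sqi":326000},
{"query":"https://a-parser.com","rating":"","sqi":200},
{"query":"yandex.ru","rating":92,"sqi":"none"}]"""


def to_int_or_none(value):
    """Keep integers as-is; map placeholders such as '' or 'none' to None."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


records = [
    {
        "query": item["query"],
        "sqi": to_int_or_none(item["sqi"]),
        "rating": to_int_or_none(item["rating"]),
    }
    for item in json.loads(raw)
]

# Rank only the sites for which an SQI value is available.
ranked = sorted(
    (r for r in records if r["sqi"] is not None),
    key=lambda r: r["sqi"],
    reverse=True,
)
print([r["query"] for r in ranked])
```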
Possible settings
| Parameter | Default value | Description |
|---|---|---|
| AntiGate preset | default | Selects a Util::AntiGate preset; see the Util::AntiGate documentation for configuration details |
| AntiGate preset for old captcha | default | Same as AntiGate preset, but used only for regular (old, single image) captchas. If a preset is not selected here, the preset selected in AntiGate preset will be used for such captchas. |
| Experimental img captcha max count | 5 | Maximum number of repeated captcha images per attempt |
| Preferred captcha type | Click | Selects the preferred captcha type: Click or Puzzle |
| Use sessions | ☑ | Saves good sessions, which makes scraping faster and reduces errors |
