SE::Yandex::WordStat - WordStat scraper: collects keywords and impression statistics

Scraper Overview

Wordstat is a Yandex service for assessing user interest in various topics and for selecting keywords for SEO and contextual advertising. Yandex Wordstat can also be used to assess the seasonality and geographical dependence of search queries.

The Yandex WordStat keyword scraper supports automatic query multiplication, which helps you collect the maximum number of results. A-Parser can also automatically follow related queries to a specified depth.

A-Parser lets you save scraping settings for later use (presets), set a scraping schedule, and much more. To obtain the maximum possible number of results when scraping Yandex Wordstat, you can combine automatic query multiplication, substitution of subqueries from files, and iteration over alphanumeric combinations and lists.

Results can be saved in whatever form and structure you need: the built-in Template Toolkit templating engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Use Cases for the Scraper

Deep parsing of Wordstat

Using the Yandex WordStat scraper for deep parsing.

Assessing frequency by WordStat

Accounts

For the SE::Yandex::WordStat scraper to work, Yandex accounts are required. Accounts can be registered using the SE::Yandex::Register scraper or simply by adding existing accounts to the file files/SE-Yandex/accounts.txt in the supported format.

Alternatively, you can enable "on the fly" account registration.

Collected Data

The number of impressions for the specified query
The date of statistics update
A list of all keywords related to the specified one and the number of their monthly impressions
A list of all additional keywords that users searched for and the number of their monthly impressions

Capabilities

Scrapes the maximum number of results provided by Wordstat - 40 pages of 50 search results each
Supports the selection of search region (with subgroups)
Can automatically re-enter found keywords as new queries (the Parse to level option)
Ability to select several regions at once for assessment
Supports automatic bypass of Smart captcha, and can bypass graphic captcha using the AntiCaptcha service or any other service that supports its API
Choice of device type
Ability to choose the method of authorization
Ability to register accounts "on the fly"
Supports working with extended account format and can answer the secret question (if the answer is in info). It also uses the saved proxy for authorization (if it is in info).
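
The Parse to level option can be pictured as a breadth-first expansion: keywords found for one query become the queries of the next level. The Python sketch below illustrates only that idea. `fetch_related` is a hypothetical stand-in for a Wordstat lookup, the sample data is invented for the illustration, and nothing here reflects how A-Parser implements the option internally.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def expand_keywords(
    seeds: Iterable[str],
    fetch_related: Callable[[str], List[str]],
    max_level: int,
) -> Dict[str, int]:
    """Breadth-first expansion; maps each keyword to the level it was found at."""
    found: Dict[str, int] = {}
    queue = deque()
    for seed in seeds:
        if seed not in found:
            found[seed] = 0  # initial queries are level 0
            queue.append(seed)
    while queue:
        query = queue.popleft()
        level = found[query]
        if level >= max_level:
            continue  # keywords at the deepest level are kept but not re-queried
        for keyword in fetch_related(query):
            if keyword not in found:  # skip keywords that were already seen
                found[keyword] = level + 1
                queue.append(keyword)
    return found


# Hypothetical in-memory data standing in for real Wordstat responses.
SAMPLE = {
    "окна": ["окна москва", "окна пвх"],
    "окна москва": ["пластиковые окна москва", "окна пвх"],
    "окна пвх": ["окна пвх москва"],
}

result = expand_keywords(["окна"], lambda q: SAMPLE.get(q, []), max_level=1)
deeper = expand_keywords(["окна"], lambda q: SAMPLE.get(q, []), max_level=2)
```

Tracking already-seen keywords keeps the number of queries from growing with duplicates, which matters because each extra level can multiply the query count.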

Use Cases

Estimating the amount of traffic by keyword (frequency)
Searching for new keywords of similar themes
Collecting large databases of keywords of various themes
Any other task that involves scraping Yandex.WordStat in one form or another

Queries

As queries, specify keywords exactly as you would enter them in the Wordstat search form, for example:

окна москва  
"окна москва"  
!окна !москва
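
The three example queries differ only in their Wordstat operators: no operators for a broad match, quotation marks to fix the number of words, and `!` before a word to fix its form. If you prepare query files programmatically, a small Python helper along these lines could generate the variants from a plain phrase. The function name is illustrative and not part of A-Parser.

```python
from typing import List


def wordstat_variants(phrase: str) -> List[str]:
    """Return the broad, quoted, and exact-form variants of a phrase."""
    words = phrase.split()
    plain = " ".join(words)
    return [
        plain,                             # broad match
        '"' + plain + '"',                 # quotes: fixed number of words
        " ".join("!" + w for w in words),  # "!": fixed word forms
    ]


variants = wordstat_variants("окна москва")
for line in variants:
    print(line)
```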

Output Results Examples

A-Parser supports flexible result formatting thanks to the built-in Template Toolkit, which can output results in any form, including structured formats such as CSV or JSON.

Default Output

Result format:

$query - $totalcount, updated: $updatedate\nkeywords:\n$keys.format('$key: $count\n')\nadditional keywords:\n$search.format('$key: $count\n')

The result displays the original query, the number of its impressions, the date of statistics update, a list of related keywords and their monthly impressions, a list of additional keywords and their monthly impressions:

!окна !москва - 10368, updated: 16/05/2013  
keywords:  
окна москва: 32367  
пластиковые окна москва: 8994  
окна пвх москва: 4813  
купить окна москва: 2561  
окна цены москва: 1706  
москва работа окна: 1547  
вакансии окна москва: 1187  
деревянные окна москва: 1087  
служба +одного окна москва: 1021  
...  
additional keywords:  
производство окон пвх: 8512  
окна rehau: 15686  
окна salamander: 1576  
окна kbe: 3798  
окна кбе: 6089  
окна кве: 3227  
остекление балконов: 83216  
беседки: 471213  
остекление лоджий: 26366  
офисные перегородки: 18740  
монтаж окон: 26223  
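
If results saved in the default format need to be processed later, the text can be turned back into structured data. The following Python sketch assumes a single result block laid out exactly as in the example above. The function name and the returned field names (borrowed from the template variables) are illustrative choices.

```python
import re
from typing import Any, Dict

# Header line: "<query> - <totalcount>, updated: <updatedate>"
HEADER = re.compile(r"^(?P<query>.+) - (?P<total>\d+), updated: (?P<date>\S+)$")


def parse_default_output(text: str) -> Dict[str, Any]:
    """Parse one result block written in the default result format."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError("unexpected header line: " + lines[0])
    result: Dict[str, Any] = {
        "query": match.group("query"),
        "totalcount": int(match.group("total")),
        "updatedate": match.group("date"),
        "keys": [],
        "search": [],
    }
    section = None
    for line in lines[1:]:
        if line == "keywords:":
            section = "keys"
        elif line == "additional keywords:":
            section = "search"
        elif section is not None:
            key, sep, count = line.rpartition(": ")
            if sep and count.isdigit():  # skips filler lines such as "..."
                result[section].append((key, int(count)))
    return result


SAMPLE = """!окна !москва - 10368, updated: 16/05/2013
keywords:
окна москва: 32367
пластиковые окна москва: 8994
...
additional keywords:
окна rehau: 15686
"""

parsed = parse_default_output(SAMPLE)
```

For anything beyond quick inspection, one of the structured formats described below is usually easier to consume than parsing free text.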

Output in CSV Table

Result format:

[% FOREACH i IN keys;
  tools.CSVline(query, i.key, i.count);
END %]

Example of result:

парсер сайтов, парсер сайтов, 8055
парсер сайтов, бесплатный парсер сайтов, 1122
парсер сайтов, парсер официальный сайт, 666
парсер сайтов, сайты облачный парсер, 507
парсер сайтов, парсер email +с сайта, 477
парсер сайтов, парсер сайта скачать, 434
парсер сайтов, парсер адресов сайтов, 390
парсер сайтов, парсер сайтов онлайн, 366
парсер сайтов, турбо парсер сайтов, 342
парсер сайтов, турбо парсер официальный сайт, 309
парсер сайтов, облачный парсер официальный сайт, 308
парсер сайтов, парсер сайтов excel, 276
парсер сайтов, слиза парсер сайт, 259
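
A CSV result file can then be loaded with any standard CSV reader. The Python sketch below assumes three columns (query, keyword, impressions) as in the example and tolerates optional spaces after the commas. The function name is illustrative, and the exact quoting produced by `tools.CSVline` is not assumed.

```python
import csv
import io
from typing import List, Tuple

SAMPLE = (
    "парсер сайтов,  парсер сайтов, 8055\n"
    "парсер сайтов,  бесплатный парсер сайтов,   1122\n"
    "парсер сайтов,  парсер официальный сайт,    666\n"
)


def load_rows(text: str) -> List[Tuple[str, str, int]]:
    """Read (query, keyword, impressions) rows, converting counts to int."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for record in reader:
        if len(record) != 3:
            continue  # ignore blank or malformed lines
        query, key, count = (field.strip() for field in record)
        rows.append((query, key, int(count)))
    return rows


rows = load_rows(SAMPLE)
total = sum(count for _, _, count in rows)
```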

Saving in SQL Format

Result format:

[% FOREACH i IN keys;
  "INSERT INTO keys VALUES('" _ query _ "', '";
  i.key _ "', '";
  i.count _ "')\n";
END %]

Example of result:

INSERT INTO keys VALUES('тест', 'тест', '10837937')
INSERT INTO keys VALUES('тест', 'тест драйв', '1164338')
INSERT INTO keys VALUES('тест', 'тесто +для теста', '879980')
INSERT INTO keys VALUES('тест', 'тесты онлайн', '792560')
INSERT INTO keys VALUES('тест', 'тест драйв видео', '550164')
INSERT INTO keys VALUES('тест', 'рецепт теста', '484489')
INSERT INTO keys VALUES('тест', 'тесты +с ответами', '449401')
INSERT INTO keys VALUES('тест', 'тест 2014', '427602')
INSERT INTO keys VALUES('тест', 'тесты бесплатно', '315144')
INSERT INTO keys VALUES('тест', 'бесплатные тесты', '315096')
INSERT INTO keys VALUES('тест', 'тесты +для девочек', '309355')
INSERT INTO keys VALUES('тест', 'тесты +по темам', '293917')
INSERT INTO keys VALUES('тест', 'игры тесты', '288989')
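
Because the SQL template builds statements by string concatenation, a keyword containing a single quote would produce an invalid statement. One alternative, sketched here in Python, is to export plain (query, keyword, impressions) rows and load them with parameterized statements. The table name `wordstat_keys`, its columns, and the use of SQLite are illustrative assumptions, not something A-Parser prescribes, and the last sample row is a made-up keyword included only to show quote handling.

```python
import sqlite3

rows = [
    ("тест", "тест драйв", 1164338),
    ("тест", "тесты +с ответами", 449401),
    ("тест", "o'reilly тест", 10),  # hypothetical keyword containing a quote
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wordstat_keys (query TEXT, keyword TEXT, impressions INTEGER)"
)
# Placeholders let the driver handle quoting, so no manual escaping is needed.
conn.executemany("INSERT INTO wordstat_keys VALUES (?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT keyword, impressions FROM wordstat_keys ORDER BY impressions DESC"
).fetchall()
```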

Dumping Results to JSON

General result format:

[% IF notFirst;
  ",\n";
ELSE;
  notFirst = 1;
END;

obj = {};
obj.updatedate = p1.updatedate;
obj.totalcount = p1.totalcount;
obj.keys = [];

FOREACH item IN p1.keys;
    obj.keys.push({
        key = item.key
        count = item.count
    });
END;

obj.json %]

Initial text: [

Final text: ]

Example of result:

[{
    "updatedate": "12.03.2014",
    "totalcount": "10837937",
    "keys": [
        {
            "count": "10837937",
            "key": "тест"
        },
        {
            "count": "1164338",
            "key": "тест драйв"
        },
        {
            "count": "879980",
            "key": "тесто +для теста"
        },
        {
            "count": "792560",
            "key": "тесты онлайн"
        }
    ]
}]
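
In the JSON example the counts are emitted as strings, so a consumer will typically convert them to numbers after loading. A minimal Python sketch, assuming a result file with the structure shown above (the function name is illustrative, and the sample is shortened to two keywords):

```python
import json
from typing import Any, Dict, List

SAMPLE = """[{
    "updatedate": "12.03.2014",
    "totalcount": "10837937",
    "keys": [
        {"count": "10837937", "key": "тест"},
        {"count": "1164338", "key": "тест драйв"}
    ]
}]"""


def load_wordstat_json(text: str) -> List[Dict[str, Any]]:
    """Load the JSON dump and convert string counts to integers."""
    records = json.loads(text)
    for record in records:
        record["totalcount"] = int(record["totalcount"])
        for item in record["keys"]:
            item["count"] = int(item["count"])
    return records


records = load_wordstat_json(SAMPLE)
top = max(records[0]["keys"], key=lambda item: item["count"])
```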

Possible Settings

Note: the common settings for all scrapers apply in addition to the parameters below.

Parameter	Default Value	Description
Pages count	`10`	Number of pages to scrape
Region	`All`	Search region
Remove + from keywords	`☐`	Remove the plus sign (+) from found queries
AntiGate preset	`default`	Preconfigure the Util::AntiGate scraper (specify your access key and other parameters), then select the created preset here
AntiGate preset for Login	`default`	AntiGate preset used during login. Preconfigure the Util::AntiGate scraper, then select the created preset here
Type	`All`	Choice of device type
Accounts	`Only from "accounts.txt"`	Method of working with accounts: `Always auto register` - always automatically register accounts "on the fly", a configured preset must be selected in the SE::Yandex::Register preset parameter. `Auto register if no more in "accounts.txt"` - first use existing accounts from accounts.txt, and if they run out - use automatic registration "on the fly", for which a configured preset must be selected in the SE::Yandex::Register preset parameter. `Only from "accounts.txt"` - use only existing accounts from accounts.txt, and if they run out - wait a set time (parameter Wait new accounts in "accounts.txt") for new ones to appear
Wait new accounts in "accounts.txt"	`0`	Waiting time for new accounts to appear in accounts.txt
Remove bad accounts	`Always, except wrong login/password`	Automatic removal of "bad" accounts: `Always` - always remove. `Always, except wrong login/password` - always remove, except when Yandex reports an incorrect login/password. Yandex can return this message for a fully working account when the IP is banned, so such accounts can optionally be kept for reuse. `Never` - never remove. Regardless of the selected option, accounts are not removed on proxy/browser errors
SE::Yandex::Register preset	`default`	Choice of preset settings for SE::Yandex::Register
Authorization method	`HTTP`	Authorization method: `HTTP` - fast, not resource-intensive. `Chrome` - slow, resource-intensive, theoretically can extend the life of accounts
Chrome headless	`☑`	If the option is enabled, the browser will not be displayed
Use sessions	`☑`	Using sessions
Do not reset session if authorization passed	`☑`	Do not reset the session upon errors if the scraper has already authorized
Use Wordstat 2	`☐`	Using Wordstat 2

Useful links

Detailed analysis of scraping keyword frequency and competition using the SE::Yandex and SE::Yandex::WordStat scrapers.
