
SE::Yandex::WordStat::ByRegion

Overview of Yandex WordStat scraper by region

Data collected by SE::Yandex::WordStat::ByRegion scraper

Wordstat is a Yandex service for evaluating user interest in various topics and for selecting keywords for SEO and contextual advertising. Yandex Wordstat also lets you evaluate the seasonality and geographic dependence of search queries.

The Yandex WordStat scraper by region supports automatic query multiplication, which helps you collect as many results as possible. A-Parser can also automatically follow related queries to a specified depth.

A-Parser lets you save parsing settings as presets for reuse, set a parsing schedule, and much more. To obtain the maximum possible number of results, you can use automatic query multiplication, substitution of subqueries from files, and enumeration of alphanumeric combinations and lists.

Saving results is possible in the form and structure that you need, thanks to the built-in powerful Template Toolkit template engine, which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Accounts

The SE::Yandex::WordStat::ByRegion scraper requires Yandex accounts. You can register accounts with the SE::Yandex::Register scraper or add existing accounts to the files/SE-Yandex/accounts.txt file in the supported format.

Alternatively, you can enable "on-the-fly" account registration.

Collected data

  • Total number of impressions per query
  • Statistics of keywords by regions and cities:
      • Region/City
      • Number of views per month
      • Regional popularity in %

Capabilities

  • Supports automatic bypass of Smart captcha, and can bypass graphic captcha using the AntiCaptcha service or any other service that supports its API
  • Choice of device type
  • Ability to choose an authentication method
  • Ability to register accounts "on the fly"
  • Supports the advanced account format and can answer the secret question (if the answer is present in info). It also uses the saved proxy for authorization (if one is present in info).

Use cases

  • Estimation of traffic volume by keyword by region

Queries

  • Queries must be specified as keywords, just as if they were entered directly into the Wordstat search form, for example:
test

Results

  • The result displays the number of impressions per query, statistics of keywords by regions and cities, the number of views per month, and regional popularity:
test - Total views: 872855
Views by regions:
Москва и Московская область 147107, 85%
Центр 194716, 77%
Северо-Запад 55815, 70%
Юг 31759, 67%
Поволжье 86006, 66%
...
Views by cities:
Чита 2937, 113%
Санкт-Петербург 35713, 73%
Белгород 2737, 58%
Иваново 1773, 55%
Калуга 2196, 64%
Кострома 1166, 49%
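
If the default output above is saved to a file, it can be turned back into structured data for further processing. The following Python sketch is not part of A-Parser; it assumes the output has exactly the layout shown in the example (a "Total views" header, then "Views by regions:" and "Views by cities:" sections), and the function name parse_result is hypothetical.

```python
import re

# Assumed row layout, taken from the example: "<name> <count>, <popularity>%".
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<count>\d+),\s*(?P<popularity>\d+)%$")
# Assumed header layout: "<query> - Total views: <total>".
HEADER = re.compile(r"^(?P<query>.+) - Total views: (?P<total>\d+)$")


def parse_result(text):
    """Parse one query's block of the default output into a dict."""
    parsed = {"query": None, "total": None, "regions": [], "cities": []}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blank lines and the elision marker
        header = HEADER.match(line)
        if header:
            parsed["query"] = header["query"]
            parsed["total"] = int(header["total"])
        elif line == "Views by regions:":
            section = "regions"
        elif line == "Views by cities:":
            section = "cities"
        elif section:
            row = ROW.match(line)
            if row:
                parsed[section].append({
                    "name": row["name"],
                    "count": int(row["count"]),
                    "popularity": int(row["popularity"]),
                })
    return parsed


sample = """test - Total views: 872855
Views by regions:
Москва и Московская область 147107, 85%
Центр 194716, 77%
Views by cities:
Чита 2937, 113%
Санкт-Петербург 35713, 73%
"""

result = parse_result(sample)
print(result["total"], len(result["regions"]), result["cities"][0])
```

For anything beyond a quick check, the JSON or CSV output formats described below are easier to consume than this text layout.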

Result output options

A-Parser supports flexible result formatting thanks to the built-in Template Toolkit template engine, which allows it to output results in any form, as well as in a structured form, such as CSV or JSON.

Outputting results in JSON

Result format:

[%  data = {}; 
data.regions = [];
data.totalcount = totalcount;

FOREACH i IN regions;
item = {};
item.popularity = i.popularity;
item.region = i.region;
item.count = i.count;
data.regions.push(item);
END;

data.json %]

Result example:

{
"regions": [
{
"count": "1902795",
"popularity": 88,
"region": "Москва и Московская область"
},
{
"count": "2992864",
"popularity": 96,
"region": "Центр"
},
{
"count": "926138",
"popularity": 95,
"region": "Северо-Запад"
},
{
"count": "647140",
"popularity": 112,
"region": "Юг"
},
{
"count": "34894",
"popularity": 77,
"region": "Север"
}
],
"totalcount": "10837937"
}
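
A consumer of this JSON output might look like the Python sketch below. It is an illustration rather than part of A-Parser: the helper names load_regions and above_threshold are hypothetical, and the threshold of 100 is an arbitrary choice. Note that in the example the counts arrive as strings while popularity is a number, so the sketch converts both to integers.

```python
import json


def load_regions(raw_json):
    """Decode the JSON result; return (total count, regions with numeric fields)."""
    data = json.loads(raw_json)
    regions = [
        {
            "region": item["region"],
            "count": int(item["count"]),          # emitted as a string in the example
            "popularity": int(item["popularity"]),
        }
        for item in data["regions"]
    ]
    return int(data["totalcount"]), regions


def above_threshold(regions, threshold):
    """Return regions whose popularity exceeds threshold, highest first."""
    hits = [item for item in regions if item["popularity"] > threshold]
    return sorted(hits, key=lambda item: item["popularity"], reverse=True)


sample = """
{
  "regions": [
    {"count": "1902795", "popularity": 88, "region": "Москва и Московская область"},
    {"count": "647140", "popularity": 112, "region": "Юг"},
    {"count": "34894", "popularity": 77, "region": "Север"}
  ],
  "totalcount": "10837937"
}
"""

total, regions = load_regions(sample)
for item in above_threshold(regions, 100):
    print(item["region"], item["popularity"], item["count"])
```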

Outputting results in CSV

Result format:

[% FOREACH i IN regions;
tools.CSVline(query, i.popularity, i.region, i.count);
END %]

Result example:

"тест",88,"Москва и Московская область",1902795
"тест",96,"Центр",2992864
"тест",95,"Северо-Запад",926138
"тест",112,"Юг",647140
"тест",124,"Поволжье",1927873
"тест",64,"Запад",60975
"тест",86,"Восток",427304

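The CSV produced by the template above has four columns in the order passed to tools.CSVline: query, popularity, region, count. A minimal Python sketch for reading it back follows; the RegionRow type and read_rows function are hypothetical names, and the sample rows are copied from the example.

```python
import csv
import io
from typing import NamedTuple


class RegionRow(NamedTuple):
    query: str
    popularity: int
    region: str
    count: int


def read_rows(stream):
    """Read rows in the column order query, popularity, region, count."""
    rows = []
    for record in csv.reader(stream):
        if not record:
            continue  # tolerate blank lines between rows
        query, popularity, region, count = record
        rows.append(RegionRow(query, int(popularity), region, int(count)))
    return rows


sample = (
    '"тест",88,"Москва и Московская область",1902795\n'
    '"тест",112,"Юг",647140\n'
)

rows = read_rows(io.StringIO(sample))
top = max(rows, key=lambda row: row.count)
print(top.region, top.count)
```

When reading a real results file instead of the in-memory sample, open it with newline="" as the csv module recommends; the file encoding (UTF-8 is assumed here) depends on how the results were saved.
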
Dumping results to SQL

Result format:

[% FOREACH i IN regions;
"INSERT INTO regions VALUES('" _ query _ "', '"; i.popularity _ "', '"; i.count _ "', '"; i.region _ "')\n";
END %]

Result example:

INSERT INTO regions VALUES('тест', '88', '1902795', 'Москва и Московская область')
INSERT INTO regions VALUES('тест', '96', '2992864', 'Центр')
INSERT INTO regions VALUES('тест', '95', '926138', 'Северо-Запад')
INSERT INTO regions VALUES('тест', '112', '647140', 'Юг')
INSERT INTO regions VALUES('тест', '124', '1927873', 'Поволжье')
INSERT INTO regions VALUES('тест', '64', '60975', 'Запад')
INSERT INTO regions VALUES('тест', '86', '427304', 'Восток')
INSERT INTO regions VALUES('тест', '80', '89569', 'Юг')
INSERT INTO regions VALUES('тест', '75', '356560', 'Центр')
INSERT INTO regions VALUES('тест', '77', '34894', 'Север')
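
To try the generated statements out, they can be loaded into a throwaway SQLite database. The Python sketch below is only an illustration: the table layout and the column names (keyword, popularity, views, region) are assumptions chosen to match the value order in the template, since the document does not define the regions table.

```python
import sqlite3


def load_statements(connection, statements):
    """Create an assumed 'regions' table and run each INSERT line against it."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS regions "
        "(keyword TEXT, popularity INTEGER, views INTEGER, region TEXT)"
    )
    loaded = 0
    for line in statements.splitlines():
        line = line.strip()
        # Only run lines that look like the template's output.
        if line.startswith("INSERT INTO regions"):
            connection.execute(line)
            loaded += 1
    connection.commit()
    return loaded


sample = """INSERT INTO regions VALUES('тест', '88', '1902795', 'Москва и Московская область')
INSERT INTO regions VALUES('тест', '112', '647140', 'Юг')
"""

connection = sqlite3.connect(":memory:")
print(load_statements(connection, sample))
row = connection.execute(
    "SELECT region, views FROM regions ORDER BY views DESC LIMIT 1"
).fetchone()
print(row)
```

Because the template builds the statements by string concatenation, a query or region name containing an apostrophe would produce invalid SQL. If that matters for your data, exporting CSV or JSON and inserting with parameterized queries is the safer route.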

See also: Results filters

Possible settings

Parameter | Default value | Description
AntiGate preset | default | Configure the Util::AntiGate scraper beforehand (specify your access key and other parameters), then select the created preset here
AntiGate preset for Login | default | AntiGate preset used for login. Configure the Util::AntiGate scraper with the required parameters, then select the created preset here
Type | All | Select device type
Accounts | Only from "accounts.txt" | Select account working method. Always auto register: always register accounts automatically "on the fly"; requires a configured preset selected in the SE::Yandex::Register preset parameter. Auto register if no more in "accounts.txt": first use existing accounts from accounts.txt and, when they run out, switch to automatic registration "on the fly"; requires a configured preset selected in the SE::Yandex::Register preset parameter. Only from "accounts.txt": use only existing accounts from accounts.txt and, when they run out, wait for the specified time (Wait new accounts in "accounts.txt" parameter) for new ones to appear
Wait new accounts in "accounts.txt" | 0 | Waiting time for new accounts to appear in accounts.txt
Remove bad accounts | Always, except wrong login/password | Automatic removal of "bad" accounts. Always: always remove. Always, except wrong login/password: always remove, except when Yandex reports that the login/password is incorrect; Yandex can return this message for a fully working account when it bans the IP, so such accounts can optionally be kept for reuse. Never: never remove. Regardless of the selected option, accounts are not deleted in case of proxy/browser errors
SE::Yandex::Register preset | default | Select a settings preset for SE::Yandex::Register
Authorization method | HTTP | Authorization method. HTTP: fast, not resource-intensive. Chrome: slow, resource-intensive, theoretically can extend the life of accounts
Chrome headless | | If the option is enabled, the browser will not be displayed
Use sessions | | Use sessions
Do not reset session if authorization passed | | Do not reset the session in case of errors if the scraper has already been authorized