Shop::Yandex::Market - Scraper of products from Yandex.Market
Scraper Overview
With the Yandex.Market product scraper, you can extract data from product cards, build a database of product links, track price dynamics and changes in the number of sellers, collect ratings and review counts, and gather product images.
A-Parser lets you save scraping settings as presets for later use, schedule scraping runs, and much more. Automatic query multiplication, substitution of subqueries from files, and iteration over alphanumeric combinations and lists help you get the maximum possible number of results.
Collected Data
- Product name
- Product link
- Product image
- Price and old price
- Currency
- Rating and number of comments
- Number of sellers
- Additional information
- Number of purchases and product views
Use Cases
- Collecting product links
- Assessing product popularity
- Monitoring the dynamics of prices and product popularity
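Price monitoring comes down to comparing the results of two runs. The sample product links in this document carry a show-uid parameter that differs from one example to the next, so the full link is not a stable key. The sketch below is one possible approach, not part of A-Parser: it assumes the URL path identifies a product (the sku parameter is ignored, which may merge variants) and that prices are digit groups separated by spaces, as in the examples. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def product_key(cardlink: str) -> str:
    """Return the URL path, dropping per-run parameters such as show-uid."""
    return urlsplit(cardlink).path


def parse_price(text: str) -> int:
    """Convert a price string such as '46 244' to an integer."""
    return int("".join(ch for ch in text if ch.isdigit()))


def price_changes(old: dict, new: dict) -> dict:
    """Map product key -> price difference for products whose price changed.

    Both arguments map a product link to its price string from one run.
    """
    old_prices = {product_key(link): parse_price(p) for link, p in old.items()}
    new_prices = {product_key(link): parse_price(p) for link, p in new.items()}
    return {
        key: new_prices[key] - old_prices[key]
        for key in old_prices.keys() & new_prices.keys()
        if new_prices[key] != old_prices[key]
    }
```

Products that appear in only one of the two runs are skipped; reporting them separately would be a natural extension.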
Queries
Specify keywords or a category link as queries, for example:
xiaomi redmi note
https://market.yandex.ru/catalog/54726/list?local-offers-first=0&deliveryincluded=0&onstock=1
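When a query list mixes keywords and category links, it can be convenient to generate it with a script. The following sketch assumes the URL pattern of the example link and copies its parameter names as-is; their meaning is not documented here, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# URL pattern taken from the example category link.
BASE = "https://market.yandex.ru/catalog/{category_id}/list"


def category_query(category_id: int, filters: dict) -> str:
    """Build a category link with the given query-string filters."""
    return BASE.format(category_id=category_id) + "?" + urlencode(filters)


# One keyword query and one category query, one per line in the final list.
queries = [
    "xiaomi redmi note",
    category_query(
        54726,
        {"local-offers-first": 0, "deliveryincluded": 0, "onstock": 1},
    ),
]
print("\n".join(queries))
```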
Output Results Examples
A-Parser supports flexible result formatting through the built-in Template Toolkit template engine, which can output results in any form, including structured formats such as CSV or JSON.
Output of the name, minimum price, and rating of the product
Result format:
$products.format('Name: $title, Minimum price: $amountfrom, Rating: $rating\n')
Example of result:
Name: Смартфон Apple iPhone 11 64GB, Minimum price: 46 244, Rating: 4.7
Name: Смартфон Apple iPhone Xr 64GB, Minimum price: 36 990, Rating: 4.7
Name: Смартфон Apple iPhone 12 64GB, Minimum price: 60 840, Rating: 4.7
Name: Смартфон Apple iPhone SE 2020 64GB, Minimum price: 33 490, Rating: 4.5
Name: Смартфон Apple iPhone Xr 128GB, Minimum price: 43 450, Rating: 4.7
Output in CSV table
Result format:
[% FOREACH item IN products;
tools.CSVline(item.cardlink, item.title, item.amountfrom, item.rating, item.commentscount);
END %]
Example of result:
https://market.yandex.ru/product--smartfon-apple-iphone-11-64gb/558171067?nid=54726&show-uid=16206538929466307988916001&context=search&text=iphone&sku=101106266737,"Смартфон Apple iPhone 11 64GB","46 244",4.7,810
https://market.yandex.ru/product--smartfon-apple-iphone-xr-64gb/175941311?nid=54726&show-uid=16206538929466307988916002&context=search&text=iphone&sku=101103379766,"Смартфон Apple iPhone Xr 64GB","36 990",4.7,624
https://market.yandex.ru/product--smartfon-apple-iphone-12-64gb/722976004?nid=54726&show-uid=16206538929466307988916003&context=search&text=iphone&sku=101077347750,"Смартфон Apple iPhone 12 64GB","60 840",4.7,103
https://market.yandex.ru/product--smartfon-apple-iphone-se-2020-64gb/661221015?nid=54726&show-uid=16206538929466307988916004&context=search&text=iphone&sku=101099789863,"Смартфон Apple iPhone SE 2020 64GB","33 490",4.5,358
Initial text:
Product link, Product name, Minimum price, Rating, Number of comments
In the Result Format, the Template Toolkit is used to output the $products array in a FOREACH loop.
To make the "Initial text" option available in the Task Editor, activate "More options". In "Initial text", enter the column names separated by commas and leave the second line empty so that the header is followed by a line break.
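A file produced this way can be read back with any CSV library. The sketch below uses Python's standard csv module and reads columns by position, so the header wording does not matter. It assumes every row has all five fields filled in and that prices are digit groups separated by spaces; the Product type and read_products name are hypothetical.

```python
import csv
import io
from typing import NamedTuple

# Header plus two rows copied from the example output.
SAMPLE = """\
Product link, Product name, Minimum price, Rating, Number of comments
https://market.yandex.ru/product--smartfon-apple-iphone-11-64gb/558171067?nid=54726&show-uid=16206538929466307988916001&context=search&text=iphone&sku=101106266737,"Смартфон Apple iPhone 11 64GB","46 244",4.7,810
https://market.yandex.ru/product--smartfon-apple-iphone-se-2020-64gb/661221015?nid=54726&show-uid=16206538929466307988916004&context=search&text=iphone&sku=101099789863,"Смартфон Apple iPhone SE 2020 64GB","33 490",4.5,358
"""


class Product(NamedTuple):
    link: str
    title: str
    min_price: int
    rating: float
    comments: int


def read_products(stream) -> list:
    """Parse the CSV output into typed records, skipping the header row."""
    reader = csv.reader(stream)
    next(reader, None)  # header written via "Initial text"
    products = []
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        link, title, price, rating, comments = row
        digits = "".join(ch for ch in price if ch.isdigit())
        products.append(
            Product(link, title, int(digits), float(rating), int(comments))
        )
    return products


products = read_products(io.StringIO(SAMPLE))
for product in products:
    print(product.title, product.min_price, product.rating, product.comments)
```

For a real file, pass open(path, newline="", encoding="utf-8") instead of the StringIO object.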
Saving in SQL format
Result format:
[% FOREACH item IN products;
"INSERT INTO products VALUES('" _ item.title _ "', '"; item.cardlink _ "', '"; item.amountfrom _ "', '"; item.rating _ "')\n";
END %]
Example of result:
INSERT INTO products VALUES('Смартфон Apple iPhone 11 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-11-64gb/558171067?nid=54726&show-uid=16206542754162480526716001&context=search&text=iphone&sku=101106266737', '46 244', '4.7')
INSERT INTO products VALUES('Смартфон Apple iPhone Xr 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-xr-64gb/175941311?nid=54726&show-uid=16206542754162480526716002&context=search&text=iphone&sku=101103379766', '36 990', '4.7')
INSERT INTO products VALUES('Смартфон Apple iPhone 12 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-12-64gb/722976004?nid=54726&show-uid=16206542754162480526716003&context=search&text=iphone&sku=101077347750', '60 840', '4.7')
INSERT INTO products VALUES('Смартфон Apple iPhone SE 2020 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-se-2020-64gb/661221015?nid=54726&show-uid=16206542754162480526716004&context=search&text=iphone&sku=101099789863', '33 490', '4.5')
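The generated statements name no columns, so the target table must have four columns in the same order as the template: title, link, price, rating. The sketch below loads two of the example statements into an in-memory SQLite database to show one schema that fits. The column names and the load_statements helper are hypothetical; only the table name products comes from the template. Note that the template does not escape single quotes, so a title containing an apostrophe would produce an invalid statement.

```python
import sqlite3

# Two statements copied from the example output.
STATEMENTS = [
    "INSERT INTO products VALUES('Смартфон Apple iPhone 11 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-11-64gb/558171067?nid=54726&show-uid=16206542754162480526716001&context=search&text=iphone&sku=101106266737', '46 244', '4.7')",
    "INSERT INTO products VALUES('Смартфон Apple iPhone SE 2020 64GB', 'https://market.yandex.ru/product--smartfon-apple-iphone-se-2020-64gb/661221015?nid=54726&show-uid=16206542754162480526716004&context=search&text=iphone&sku=101099789863', '33 490', '4.5')",
]


def load_statements(statements) -> sqlite3.Connection:
    """Create a matching table and execute the generated INSERT statements."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: column order follows the template, names are made up.
    conn.execute(
        "CREATE TABLE products (title TEXT, link TEXT, price TEXT, rating TEXT)"
    )
    for statement in statements:
        conn.execute(statement)
    conn.commit()
    return conn


conn = load_statements(STATEMENTS)
rows = conn.execute(
    "SELECT title, price, rating FROM products ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

All values arrive as quoted strings, which is why the sketch declares every column as TEXT; converting price and rating to numbers would be a separate step.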
Dumping results to JSON
General result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.items = [];
FOREACH item IN p1.products;
obj.items.push({
link = item.cardlink
name = item.title
amountfrom = item.amountfrom
});
END;
obj.json %]
Initial text:
[
End text:
]
Example of result:
[
{
"query": "https://market.yandex.ru/catalog--mobilnye-telefony/54726/list?text=iphone&hid=91491&was_redir=1&rt=10&cpa=0&onstock=0&local-offers-first=0",
"items": [
{
"link": "https://market.yandex.ru/product--smartfon-apple-iphone-11-64gb/558171067?nid=54726&show-uid=16206548825917275667016001&context=search&text=iphone&sku=101106266737",
"amountfrom": "46 244",
"name": "Смартфон Apple iPhone 11 64GB"
},
{
"link": "https://market.yandex.ru/product--smartfon-apple-iphone-xr-64gb/175941311?nid=54726&show-uid=16206548825917275667016002&context=search&text=iphone&sku=101103379766",
"amountfrom": "36 990",
"name": "Смартфон Apple iPhone Xr 64GB"
},
{
"link": "https://market.yandex.ru/product--smartfon-apple-iphone-12-64gb/722976004?nid=54726&show-uid=16206548825917275667016003&context=search&text=iphone&sku=101077347750",
"amountfrom": "60 840",
"name": "Смартфон Apple iPhone 12 64GB"
},
{
"link": "https://market.yandex.ru/product--smartfon-apple-iphone-se-2020-64gb/661221015?nid=54726&show-uid=16206548825917275667016004&context=search&text=iphone&sku=101099789863",
"amountfrom": "33 490",
"name": "Смартфон Apple iPhone SE 2020 64GB"
}
]
}
]
To make the "Initial text" and "End text" options available in the Task Editor, you need to activate "More options".
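The notFirst flag in the template is what keeps the output valid JSON: every object except the first is preceded by a comma, and the "Initial text" and "End text" values supply the enclosing brackets. The following Python sketch mimics that logic outside A-Parser to make the mechanism visible. It is an illustration only; the function name and the sample data are hypothetical.

```python
import json


def render_json_array(results: list) -> str:
    """Mimic the template: opening bracket, comma-separated objects, closing bracket."""
    parts = ["[\n"]  # plays the role of "Initial text"
    not_first = False
    for obj in results:  # one object per processed query
        if not_first:
            parts.append(",\n")  # separator before every object but the first
        else:
            not_first = True
        parts.append(json.dumps(obj, ensure_ascii=False))
    parts.append("\n]")  # plays the role of "End text"
    return "".join(parts)


sample = [
    {"query": "iphone", "items": []},
    {"query": "xiaomi redmi note", "items": []},
]
text = render_json_array(sample)
print(text)

# The rendered text parses back into the original structure.
parsed = json.loads(text)
```

Placing the comma before each object rather than after it avoids a trailing comma, which strict JSON parsers reject.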
Possible Settings
Parameter | Default Value | Description |
---|---|---|
AntiGate preset | default | Preset to use for Util::AntiGate; see the Util::AntiGate documentation for setup details |
AntiGate preset for old captcha | default | Similar to AntiGate preset, but is used only for regular (old, single-image) captchas. If no preset is selected here, then the preset chosen in AntiGate preset will be used for such captchas. |
Auto-Solve ClickCaptcha | ☐ | Automatic solving of click captcha (without using services) |
Experimental img captcha max count | 1 | Maximum number of image captcha retries per attempt |
Pages count | 5 | Number of pages to scrape |
Search region ID | Not set | Region for scraping |