SE::Rambler - search results scraper for Rambler

Scraper Overview

Rambler search results scraper. With the Rambler scraper, you can get large databases of links ready for further use. You can use queries in the same way as you enter them in the Rambler search bar, including search operators (site, ip, etc.).

A-Parser functionality allows you to save the Rambler scraper parsing settings for further use (presets), set up a parsing schedule, and much more. You can use automatic query multiplication, substitution of subqueries from files, iteration of alphanumeric combinations and lists to obtain the maximum possible number of results.

Results can be saved in the format and structure you need thanks to the powerful built-in Template Toolkit templating engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected Data

  • Number of results in the output
  • Links, anchors, and snippets from the output
  • List of related keywords (hints)

Capabilities

  • Support for Rambler search operators (url:, site:, inurl:, host:, rhost:, domain:)
  • Scrapes up to 25 pages, from 10 to 50 results per page
  • Scrapes related keywords ($hints)
  • Ability to use solving services to bypass captchas
  • Choice of output device: regular desktop, Android mobile, or iOS mobile

Use Cases

  • Collecting link databases
  • Evaluating competition for keywords
  • Finding backlinks (mentions) of websites
  • All cases when it is necessary to scrape Rambler search results

Queries

Specify queries in the same way as in the Rambler search. For example, if you need only links from one site, enter in the query field:

"купить двери" site:http://kp.ru

Query Substitutions

You can use built-in macros to expand queries. For example, if we want to get a very large database of forums, we will specify several main queries in different languages:

forum
форум
foro
论坛

In the query format, we specify an iteration over character combinations from a to zzzz. This method rotates the search output as much as possible and yields many new unique results:

$query {az:a:zzzz}

This macro creates 475254 additional queries for each original search query (26 + 26^2 + 26^3 + 26^4), for a total of 4 x 475254 = 1901016 search queries. That is an impressive number, but not a problem for A-Parser: at a speed of 2000 requests per minute, the task will be processed in about 16 hours.
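The arithmetic above can be checked with a short Python sketch. It only models the expansion and is not how A-Parser implements the macro: the helper names `az_range` and `expand` are hypothetical, and the order in which combinations are generated (a, b, ..., z, aa, ...) is an assumption.

```python
from itertools import islice, product
from string import ascii_lowercase


def az_range(max_len):
    """Yield 'a', 'b', ..., 'z', 'aa', ... up to max_len letters (assumed order)."""
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def expand(queries, max_len):
    """Append every generated suffix to every base query, as in '$query {az:a:zzzz}'."""
    for query in queries:
        for suffix in az_range(max_len):
            yield f"{query} {suffix}"


base_queries = ["forum", "форум", "foro", "论坛"]

# Closed-form count: 26 + 26^2 + 26^3 + 26^4 suffixes per base query.
per_query = sum(26 ** n for n in range(1, 5))
total = per_query * len(base_queries)
hours = total / 2000 / 60  # at the 2000 requests per minute quoted above

print(per_query, total, round(hours, 1))          # 475254 1901016 15.8
print(list(islice(expand(base_queries, 4), 3)))   # ['forum a', 'forum b', 'forum c']
```

The generator is lazy, so the full list of 1901016 queries is never held in memory. Only the first three are materialized here for illustration.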

Using Operators

You can use search operators in the query format, so that they are automatically added to each query in your list:

site:$query
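To illustrate what this query format does to a list of queries, here is a minimal Python sketch. It uses the standard library's `string.Template`, whose `$name` placeholder syntax happens to look similar. It is not A-Parser's own substitution engine, and `example.com` is a placeholder domain.

```python
from string import Template

# The query format from above; $query is replaced by each line of the query list.
query_format = Template("site:$query")

queries = ["kp.ru", "example.com"]
formatted = [query_format.substitute(query=q) for q in queries]

print(formatted)  # ['site:kp.ru', 'site:example.com']
```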

Output Results Examples

A-Parser supports flexible formatting of results thanks to the built-in templating engine Template Toolkit, allowing it to output results in any form, as well as in a structured form, for example CSV or JSON.

Exporting a list of links

Similar to SE::Google.

Related keywords

Result format:

$hints.format('$hint\n')

Example of result:

habrahabr
habr
habrahabr ru
xabra
livebusiness
эврика
электронный бухгалтер
остров эльба
эльба электронный бухгалтер
хаброхабр
...
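The effect of the result format above (one related keyword per line) can be modeled in Python as follows. This is only a sketch of the behavior under the assumption that the format string is applied once per element of `$hints`. The function `format_hints` is hypothetical and is not part of A-Parser or Template Toolkit.

```python
from string import Template


def format_hints(hints, item_format):
    """Apply item_format to each hint and concatenate the results."""
    template = Template(item_format)
    return "".join(template.substitute(hint=h) for h in hints)


hints = ["habrahabr", "habr", "habrahabr ru"]
output = format_hints(hints, "$hint\n")

print(output, end="")
```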

Saving in SQL format

Similar to SE::Google.

Dumping results to JSON

Similar to SE::Google.

Results processing

A-Parser can process results directly during scraping. This section covers the most popular use cases for the Rambler scraper.

Similar to SE::Google.

Extracting domains

Similar to SE::Google.

Removing tags from anchors and snippets

Similar to SE::Google.

Possible settings

Parameter Name | Default Value | Description
Device | Desktop | Selection of output device: regular desktop, Android mobile, or iOS mobile
Pages count | 5 | Number of pages to scrape (from 1 to 25)
Links per page | 10 | Number of results per page (10/15/30/50)
Rambler region ID |  | Ability to specify a region. You need to specify the region ID. How to find the ID of the required region is described here
Sort | Sites by relevance | Selection of result sorting option
Results filtering | Moderate | Selection of result filtering option
Results language | Any language | Selection of search results language
Serp time | Anytime | Selection of results period
Results type | Any format | Selection of result type (mime type)
Exact match |  | Exact match to the query
Disable autocorrect |  | Disables autocorrection, allowing scraping based on the specified query
Use sessions |  | Saves good sessions, allowing even faster scraping with fewer errors
AntiGate preset | default | Determines whether to use Util::AntiGate to bypass captcha
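The limits on Pages count and Links per page determine an upper bound on the number of results a single query can return. The Python sketch below models those two settings with their documented defaults and ranges. The class `RamblerSettings` and its field names are hypothetical and do not correspond to any A-Parser API.

```python
from dataclasses import dataclass

ALLOWED_LINKS_PER_PAGE = (10, 15, 30, 50)


@dataclass(frozen=True)
class RamblerSettings:
    """Hypothetical model of two scraper settings and their documented limits."""
    pages_count: int = 5        # documented default, allowed range 1 to 25
    links_per_page: int = 10    # documented default, allowed values 10/15/30/50

    def __post_init__(self):
        if not 1 <= self.pages_count <= 25:
            raise ValueError("pages_count must be between 1 and 25")
        if self.links_per_page not in ALLOWED_LINKS_PER_PAGE:
            raise ValueError("links_per_page must be one of 10, 15, 30, 50")

    @property
    def max_results_per_query(self):
        """Upper bound only: the search engine may return fewer results."""
        return self.pages_count * self.links_per_page


print(RamblerSettings().max_results_per_query)        # 50
print(RamblerSettings(25, 50).max_results_per_query)  # 1250
```

With the defaults, one query yields at most 50 results. At the maximum settings it yields at most 1250, which is why query multiplication is the main lever for building large link databases.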