SE::Startpage - startpage.com search results scraper
Scraper Overview
Scraper for Startpage search results. With the Startpage scraper, you can collect large databases of links ready for further use. Queries work the same way as in the Startpage search bar, including search operators (site, inurl, etc.).
A-Parser functionality allows you to save the parsing settings of the Startpage scraper for further use (presets), set up a parsing schedule, and much more. You can use automatic query multiplication, substitution of subqueries from files, enumeration of alphanumeric combinations and lists to get the maximum possible number of results.
Results can be saved in whatever form and structure you need thanks to the built-in Template Toolkit templating engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected Data
- Links, anchors, and snippets from the results
Capabilities
- Supports country selection, search language, and page language
- Ability to output anchors, links, and snippets together or separately
- Ability to specify the number of displayed results
- Choice of the number of results per page (10 or 20)
Use Cases
- Collecting databases of links, anchors, and snippets
- Getting a list of sites most frequently mentioned in search engines
- Any other use cases for obtaining information
Queries
Queries are specified as words or phrases, in the same way as they are entered in the search engine. Example:
test
site:http://test.ru
red roses
Query Substitutions
You can use built-in macros to expand queries. For example, to get a very large database of forums, specify several main queries in different languages:
forum
форум
foro
论坛
In the query format, specify an enumeration of characters from a to zzzz. This method rotates the search results as much as possible and yields many new unique results:
$query {az:a:zzzz}
This macro will create 475254 additional queries for each original search query, which in total will give 4 x 475254 = 1901016 search queries. That is an impressive number, but it is not a problem for A-Parser: at a speed of 2000 requests per minute, this task will be processed in just 16 hours.
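The figures above can be checked with a short Python sketch. It assumes the macro enumerates every lowercase Latin string of length 1 to 4 (a, b, ..., z, aa, ..., zzzz); the helper name `az_range` is hypothetical and is not part of A-Parser.

```python
from itertools import islice, product
from string import ascii_lowercase


def az_range(max_len):
    """Yield a, b, ..., z, aa, ab, ... up to max_len letters (assumed macro behavior)."""
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


# Number of suffixes from "a" to "zzzz": 26 + 26^2 + 26^3 + 26^4
per_query = sum(26 ** n for n in range(1, 5))
total = 4 * per_query        # four base queries (forum, форум, foro, 论坛)
hours = total / 2000 / 60    # at 2000 requests per minute

print(per_query)                     # 475254
print(total)                         # 1901016
print(round(hours, 1))               # 15.8
print(list(islice(az_range(4), 3)))  # ['a', 'b', 'c']
```

The exact duration is about 15.8 hours, which the text rounds to 16.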
Using Operators
You can use search operators in the query format so that they are automatically added to each query from your list:
site:$query
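To illustrate what this query format does, here is a minimal Python sketch that substitutes each query from a list into the `$query` placeholder. It uses Python's `string.Template` only as a stand-in for A-Parser's own substitution; the function name `apply_format` and the sample domains are hypothetical.

```python
from string import Template


def apply_format(query_format, queries):
    """Substitute each query for $query in the format string."""
    template = Template(query_format)
    # safe_substitute leaves any other text in the format untouched
    return [template.safe_substitute(query=q) for q in queries]


print(apply_format("site:$query", ["example.com", "example.org"]))
# ['site:example.com', 'site:example.org']
```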
Output Results Examples
A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit templating engine, which can output results in any form, including structured formats such as CSV or JSON.
Exporting a list of links
Links + anchors + snippets with position output
Output of links, anchors, and snippets in CSV table
Saving in SQL format
Dumping results to JSON
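As a rough illustration of the structured outputs named in this section, the Python sketch below serializes a few sample results to CSV and JSON. It does not reproduce A-Parser's Template Toolkit result formats; the sample data and the field names (`position`, `link`, `anchor`, `snippet`) are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical sample results; the field names are assumptions.
results = [
    {"position": 1, "link": "https://example.com/", "anchor": "Example",
     "snippet": "An example page"},
    {"position": 2, "link": "https://example.org/a", "anchor": "Another",
     "snippet": "Text with, a comma"},
]


def to_csv(rows):
    """Render rows as CSV text with a header line; values are quoted when needed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["position", "link", "anchor", "snippet"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Render rows as a JSON array, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


print(to_csv(results))
print(to_json(results))
```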
Results processing
A-Parser can process results directly during scraping. This section lists the most popular use cases for the Startpage scraper.
Link deduplication
Link deduplication by domain
Extracting domains
Removing tags from anchors and snippets
Filtering links by inclusion
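The processing steps named in this section (deduplication by domain, domain extraction, tag removal, and filtering by inclusion) can be sketched outside A-Parser in plain Python. This is only an illustration of the ideas, not A-Parser's implementation; all function names and sample links are hypothetical, and treating `www.example.com` and `example.com` as the same domain is an assumption of this sketch.

```python
import re
from urllib.parse import urlsplit


def domain_of(link):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = urlsplit(link).hostname or ""
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(links):
    """Keep the first link seen for each domain, preserving order."""
    seen = set()
    unique = []
    for link in links:
        domain = domain_of(link)
        if domain not in seen:
            seen.add(domain)
            unique.append(link)
    return unique


def strip_tags(text):
    """Remove simple HTML tags such as <b>...</b> from an anchor or snippet."""
    return re.sub(r"<[^>]+>", "", text)


def filter_by_inclusion(links, needle):
    """Keep only links that contain the given substring."""
    return [link for link in links if needle in link]


links = [
    "https://www.example.com/forum/1",
    "https://example.com/forum/2",
    "https://example.org/blog",
]
print(dedupe_by_domain(links))
print([domain_of(link) for link in links])
print(strip_tags("Red <b>roses</b> delivery"))
print(filter_by_inclusion(links, "forum"))
```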
Possible settings
Parameter name | Default value | Description |
---|---|---|
Pages count | 5 | Number of pages to scrape (from 1 to 50) |
Family filter | Filter depending on search | Selection of filtering level (Filter all results / Filter depending on search / Do not filter my results) |
Period | Any time | Selection of results period (Any time / Past 24 hours / Past week / Past month / Past year) |
Links per page | 10 | Number of results per page (10 / 20) |
Results language | English | Selection of results language |
Page language | English | Selection of page language |
Search country | All | Selection of the country for search |
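To show how the two numeric settings interact, the following Python sketch validates the documented ranges and computes an upper bound on results per query, assuming it is simply pages multiplied by links per page. The class and field names are hypothetical and are not A-Parser option names.

```python
from dataclasses import dataclass


@dataclass
class StartpageSettings:
    # Hypothetical field names; defaults follow the table above.
    pages_count: int = 5
    links_per_page: int = 10

    def validate(self):
        """Check the ranges documented in the settings table."""
        if not 1 <= self.pages_count <= 50:
            raise ValueError("pages_count must be between 1 and 50")
        if self.links_per_page not in (10, 20):
            raise ValueError("links_per_page must be 10 or 20")

    def max_results_per_query(self):
        """Upper bound only; the search engine may return fewer results."""
        self.validate()
        return self.pages_count * self.links_per_page


print(StartpageSettings().max_results_per_query())        # 50
print(StartpageSettings(50, 20).max_results_per_query())  # 1000
```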