SE::DuckDuckGo - DuckDuckGo Search Results Scraper
Parser Overview
DuckDuckGo search results scraper. With the DuckDuckGo scraper, you can obtain large databases of links ready for further use. You can use queries in the same way as you enter them in the DuckDuckGo search bar, including search operators (intitle, inurl, site, etc.). For more details, see the official DuckDuckGo Search Syntax page.
A-Parser functionality allows you to save the parsing settings of the DuckDuckGo scraper for further use (presets), set up a parsing schedule, and much more. You can use automatic query multiplication, substitution of subqueries from files, enumeration of alphanumeric combinations and lists to obtain the maximum possible number of results.
Results can be saved in the format and structure you need, thanks to the powerful built-in Template Toolkit templating engine, which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected Data
- Links, anchors, and snippets from the results
Capabilities
- Support for all DuckDuckGo search operators (intitle:, inurl:, site:, etc.). For more details about search operators, see the official DuckDuckGo Search Syntax page
- Scrapes the maximum number of results returned by DuckDuckGo: 10 pages of 10 results each
- Total number of results: 100
- Ability to scrape by selected location (Region option)
- Ability to choose the output language (Language option)
Use Cases
- Collecting link databases - for A-Poster, XRumer, AllSubmitter, etc.
- Checking site indexing
- Searching for backlinks (mentions) of a site
- Any other use cases involving scraping DuckDuckGo in one way or another
Queries
Queries should be specified as search phrases, for example:
Football
тест
site:a-parser.com
парсер site:a-parser.com
test -site:tests.com
IoT filetype:pdf
Query Substitutions
You can use built-in macros to expand queries. For example, if we want to obtain a very large database of forums, we will specify several main queries in different languages:
forum
форум
foro
论坛
In the query format, we will specify the enumeration of characters from a to zzzz. This method allows for maximum rotation of search results and obtaining a multitude of new unique results:
$query {az:a:zzzz}
This macro will create 475254 additional queries for each original search query (26 + 26^2 + 26^3 + 26^4 combinations), which will result in 4 x 475254 = 1901016 search queries in total. That is an impressive figure, but no problem for A-Parser: at a speed of 2000 queries per minute, this task will be completed in about 16 hours.
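The arithmetic above can be checked with a short Python sketch. It is only an illustration: it assumes the macro enumerates the 26 lowercase Latin letters for every length from 1 to 4, and the function name `combination_count` is hypothetical, not part of A-Parser.

```python
import string

# Assumption: the {az:a:zzzz} macro enumerates lowercase a-z strings
# of every length from 1 ("a") to 4 ("zzzz").
ALPHABET = string.ascii_lowercase


def combination_count(max_len, alphabet_size=len(ALPHABET)):
    """Count all strings of length 1..max_len over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


per_query = combination_count(4)       # combinations appended to each query
seed_queries = 4                       # forum, форум, foro, 论坛
total = per_query * seed_queries       # total search queries
rate_per_minute = 2000                 # speed quoted in the text
hours = total / rate_per_minute / 60   # estimated run time in hours

print(per_query)        # 475254
print(total)            # 1901016
print(round(hours, 1))  # 15.8
```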
Using Operators
You can use search operators in the query format, so that the operator is automatically added to each query in your list:
site:$query
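To make the effect of the query format concrete, here is a minimal Python sketch of the substitution. It only mimics the behavior described above with plain string replacement; `apply_query_format` is a hypothetical helper, and A-Parser's own macro handling is not shown here.

```python
def apply_query_format(query_format, queries):
    """Replace the $query placeholder with each query from the list."""
    return [query_format.replace("$query", q) for q in queries]


# Example input list: each line is a domain to check.
queries = ["a-parser.com", "example.com"]
expanded = apply_query_format("site:$query", queries)

for line in expanded:
    print(line)
```

With the format `site:$query`, a list of domains becomes a list of `site:` queries without editing the list itself.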
Output Results Examples
A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit templating engine, which allows it to output results in any form, including structured formats such as CSV or JSON.
Export of the list of links
Links + anchors + snippets with position output
Output of links, anchors and snippets in CSV table
Saving related keywords
Checking the indexing of links
Saving in SQL format
Dumping results to JSON
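As a rough illustration of what the structured outputs look like, the Python sketch below serializes a few results to CSV and JSON. This is not A-Parser's template syntax: the sample rows, the field names (`position`, `link`, `anchor`, `snippet`) and the helper functions are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical sample data; real results come from the scraper.
results = [
    {"position": 1, "link": "https://example.com/",
     "anchor": "Example Domain", "snippet": "An example page."},
    {"position": 2, "link": "https://example.org/forum",
     "anchor": "Example Forum", "snippet": "A forum, with commas."},
]

FIELDS = ["position", "link", "anchor", "snippet"]


def to_csv(rows):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Render rows as a JSON array, keeping non-ASCII text readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


print(to_csv(results))
print(to_json(results))
```

The `csv` module quotes fields that contain commas, which is why a dedicated writer is preferable to joining values by hand.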
Results processing
A-Parser allows processing the results directly during scraping. This section lists the most popular use cases for the DuckDuckGo scraper:
Link deduplication
Link deduplication by domain
Extracting domains
Removing tags from anchors and snippets
Filtering links by inclusion
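The operations listed above can be sketched in plain Python to show what each one does to the data. This is an illustration under stated assumptions, not A-Parser's implementation: all function names are hypothetical, and the tag stripping uses a simple regular expression that is adequate for short anchors and snippets but is not a full HTML parser.

```python
import re
from urllib.parse import urlsplit


def domain_of(link):
    """Extract the lowercased host name from a link."""
    return urlsplit(link).hostname or ""


def dedupe(links):
    """Drop exact duplicate links, keeping the first occurrence."""
    return list(dict.fromkeys(links))


def dedupe_by_domain(links):
    """Keep only the first link seen for each domain."""
    seen, kept = set(), []
    for link in links:
        domain = domain_of(link)
        if domain not in seen:
            seen.add(domain)
            kept.append(link)
    return kept


def strip_tags(text):
    """Remove HTML tags such as <b>...</b> from an anchor or snippet."""
    return re.sub(r"<[^>]+>", "", text)


def filter_by_inclusion(links, needle):
    """Keep only links that contain the given substring."""
    return [link for link in links if needle in link]


links = [
    "https://example.com/a",
    "https://example.com/a",
    "https://example.com/b",
    "https://forum.example.org/",
]
print(dedupe(links))
print(dedupe_by_domain(links))
print([domain_of(link) for link in dedupe_by_domain(links)])
print(strip_tags("<b>Forum</b> index"))
print(filter_by_inclusion(links, "forum"))
```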
Possible settings
Parameter name | Default value | Description |
---|---|---|
Pages count | 5 | Number of pages to scrape (from 1 to 10) |
Region | US (English) | Location selection |
Language | English (United States) | Language selection |
Safe search | Moderate | Ability to enable "Safe search" |
Serp time | Any time | Search period |
Use HTTP/2 | ☐ | Determines whether to use HTTP/2 instead of HTTP/1.1 |
User agent | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0 | User-Agent header when requesting pages |