SE::Yahoo - Yahoo Search Results Scraper

Overview of the scraper
The Yahoo search results scraper allows you to obtain large link databases ready for further use. You can use queries exactly as you would enter them into the Yahoo search box, including search operators (site:, ip:, etc.).
A-Parser's functionality allows you to save Yahoo scraper settings for future use (presets), set up scraping schedules, and much more. You can use automatic query generation and expansion, substitution of subqueries from files, and iteration over alphanumeric combinations and lists to get the maximum possible number of results.
Results can be saved in the format and structure you need thanks to the powerful built-in Template Toolkit templating engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected data
- Links, anchors, and snippets from the results
- Related keywords list
- Advertising results

Capabilities
- Supports all Yahoo search operators (site:, ip:, etc.)
- Scrapes the maximum number of results provided by Yahoo - 50 pages with 100 elements in the results
- Can automatically scrape more than 1000 results per query by inserting additional characters into the query (via the Parse all results option)
- Ability to scrape in depth using related keywords (Parse related to level)
- Ability to search for related keywords
- Supports specifying result time (via Result time)
Use cases
- Collecting link databases - for A-Poster, XRumer, AllSubmitter, etc.
- Evaluating competition for keywords
- Finding backlinks (mentions) of websites
- Checking website indexation
- Finding websites on the same IP address
- Finding vulnerable websites
- Any other use cases that involve scraping Yahoo in some form
Queries
Enter queries exactly as you would type them into the Yahoo search box, for example:
test
okna Moskva   
site:http://lenta.ru  
ip:222.36.12.12
Query Substitutions
You can use built-in macros to multiply queries. For example, if we want to get a very large database of forums, we'll list a few main queries in different languages:
forum
forum
foro
论坛
In the query format, we specify the iteration of characters from a to zzzz. This method rotates the search results as much as possible and retrieves many new unique results:
$query {az:a:zzzz}
This macro will create 475254 additional queries for each initial search query, totaling 4 x 475254 = 1901016 search queries. This number is impressive, but it's no problem for A-Parser. At a rate of 2000 queries per minute, this task will be processed in just 16 hours.
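The figures above can be checked with a short Python sketch. This is not A-Parser code: `az_range` is a hypothetical stand-in for the `{az:a:zzzz}` macro, assumed here to enumerate every lowercase Latin string from `a` to `zzzz`, and `expand` assumes each string is appended to the query after a space, as in the format shown.

```python
from itertools import product
from string import ascii_lowercase


def az_range(max_len):
    """Yield every lowercase string of length 1..max_len: a, b, ..., zz, ...

    Hypothetical stand-in for the {az:a:zzzz} macro (assumed behavior).
    """
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def expand(queries, max_len=4):
    """Mimic the format `$query {az:a:zzzz}`: one new query per suffix."""
    for query in queries:
        for suffix in az_range(max_len):
            yield f"{query} {suffix}"


base_queries = ["forum", "forum", "foro", "论坛"]  # the four queries listed above
per_query = sum(26 ** n for n in range(1, 5))      # 26 + 676 + 17576 + 456976
total = per_query * len(base_queries)
hours = total / 2000 / 60                          # at 2000 queries per minute

print(per_query, total, round(hours, 1))  # 475254 1901016 15.8
```

The generator form keeps memory use flat: the queries are produced one at a time rather than held in a list of almost two million strings.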
Using Operators
You can use search operators in the query format, so they will be automatically added to every query in your list:
site:$query
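To illustrate what this format does, the following Python sketch applies the template `site:$query` to a list of queries. It uses Python's `string.Template`, whose `$name` placeholder syntax happens to look the same. The input list is made up for the example, and A-Parser performs this substitution internally.

```python
from string import Template

query_format = Template("site:$query")   # same format string as above
queries = ["lenta.ru", "example.com"]    # illustrative input list

# Each input query is substituted into the format in turn.
final_queries = [query_format.substitute(query=q) for q in queries]
print(final_queries)  # ['site:lenta.ru', 'site:example.com']
```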
Output Results Options
A-Parser supports flexible result formatting thanks to the built-in Template Toolkit templating engine, which allows it to output results in any desired form, as well as in structured formats like CSV or JSON.
Exporting a List of Links
Links + Anchors + Snippets with Position Output
Outputting links, anchors, and snippets to a CSV table
Saving Related Keywords
Keyword Competition
Checking Link Indexation
Saving in SQL Format
Dumping Results to JSON
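A-Parser produces these formats through Template Toolkit templates. As an illustration of the target shapes only, the Python sketch below writes the same hypothetical result rows as CSV and as JSON. The field names `position`, `link`, `anchor`, and `snippet` are assumptions chosen to mirror the collected data listed earlier; they are not A-Parser variable names.

```python
import csv
import io
import json

# Hypothetical scraped rows; real data would come from the scraper's output.
rows = [
    {"position": 1, "link": "https://example.com/", "anchor": "Example",
     "snippet": "An example, with a comma"},
    {"position": 2, "link": "https://example.org/a", "anchor": "Other",
     "snippet": 'A "quoted" snippet'},
]

FIELDS = ["position", "link", "anchor", "snippet"]


def to_csv(rows):
    """Return the rows as CSV text with a header line; quoting is automatic."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Return the rows as a JSON array, keeping non-ASCII text readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


print(to_csv(rows))
print(to_json(rows))
```

Using a CSV writer rather than joining fields by hand matters here because snippets routinely contain commas and quotes.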
Results Processing
A-Parser allows results to be processed directly during scraping. This section presents the most popular use cases for the Yahoo scraper:
Link Deduplication
Link Deduplication by Domain
Extracting Domains
Removing Tags from Anchors and Snippets
Filtering Links by Inclusion
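A-Parser applies these operations inside the scraper itself. To make the underlying logic concrete, here is a minimal Python sketch that performs the same five steps on an already exported list of links. All helper names (`domain_of`, `unique`, `strip_tags`) and the sample data are hypothetical and are not part of A-Parser.

```python
import re
from urllib.parse import urlsplit


def domain_of(link):
    """Return the host part of a URL, lowercased ('' if there is none)."""
    return (urlsplit(link).hostname or "").lower()


def unique(items, key=lambda x: x):
    """Keep the first item seen for each key, preserving input order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


def strip_tags(text):
    """Remove simple HTML tags such as <b>...</b> from an anchor or snippet."""
    return re.sub(r"<[^>]+>", "", text)


links = [
    "https://example.com/a",
    "https://example.com/a",
    "https://example.com/b",
    "http://forum.example.org/topic?id=1",
]

unique_links = unique(links)                      # link deduplication
unique_by_domain = unique(links, key=domain_of)   # one link per domain
domains = unique(domain_of(l) for l in links)     # domain extraction
forum_links = [l for l in links if "forum" in l]  # filtering by inclusion
clean_anchor = strip_tags("Best <b>forum</b> software")
```

The regular expression in `strip_tags` only handles simple inline markup such as highlighted keywords; arbitrary HTML would need a real parser.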
Available Settings
| Parameter Name | Default Value | Description | 
|---|---|---|
| Pages count | 5 | Number of pages to scrape (from 1 to 50) | 
| Serp time | All time | SERP time (time-dependent search, parameter "tbs=": All time / Past 24 hours / Past week / Past month) | 
| Safe Search | Moderate | Safe search option (Off / Moderate / Strict) | 
| Yahoo domain | United States (English) | Yahoo domain selection | 
| Yahoo language | Any | Yahoo language selection, allows choosing search language | 
| Yahoo country | Any | Country selection, allows choosing the country for the search | 
| Not found is error | ☐ | Treat absence of results as an error |