SE::Baidu - Baidu Search Results Parser

Baidu Parser Overview

SE::Baidu is a parser for Baidu search results. With it, you can collect large databases of links ready for further use. Queries are written the same way you would enter them in the Baidu search bar, including search operators (filetype, site, intitle).

A-Parser lets you save Baidu parser settings as presets for later use, set up a parsing schedule, and much more. To get the maximum possible number of results, you can use automatic query multiplication, substitution of subqueries from files, and enumeration of alphanumeric combinations and lists.

The Baidu parser can save results in whatever form and structure you need. The built-in Template Toolkit template engine lets you apply additional logic to results and output data in various formats, including JSON, SQL, and CSV.

Baidu Parser Use Cases

Parsing full Baidu links

This resource shows how to parse full links

Baidu suggestions

Multilevel parsing of Baidu suggestions

JS parser JS::SE::Baidu::Suggest

Creating JS parsers. Getting Baidu suggestions

List of collected data

  • Links
  • Snippets
  • Anchors
  • Total number of results
  • List of related words
  • Number of search results pages

Data collected by the SE::Baidu parser

Capabilities

  • Parses up to 5000 results per query
  • Supports all Baidu search operators (filetype:, site:, intitle:)
  • Collects search results and related keywords
  • Converts truncated links to full ones (option Get full links)

Usage scenarios

  • Collecting link databases - for A-Poster, XRumer, AllSubmitter, etc.
  • Keyword competition assessment
  • Checking site indexing
  • Collecting pages that contain specified keywords in the page title

Query examples

  • Queries must be specified as search phrases, for example:
test
site:www.baidu.com
百度产品大全
intitle:парсер

Query substitutions

You can use built-in macros to multiply queries. For example, to get a very large database of forums, specify several main queries in different languages:

forum
форум
foro
论坛

In the query format, specify the enumeration of characters from a to zzzz. This method rotates the search results as much as possible and yields many new unique results:

$query {az:a:zzzz}

This macro creates 475254 additional queries for each original search query (26 + 26^2 + 26^3 + 26^4 combinations), which gives 4 x 475254 = 1901016 search queries in total. The figure is impressive, but it is not a problem for A-Parser: at a speed of 2000 requests per minute, the task is processed in about 16 hours.
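The arithmetic above can be checked with a short Python sketch. It is not A-Parser code: the `az_range` helper is a hypothetical stand-in for the `{az:a:zzzz}` macro, and the order in which it yields combinations is an assumption made for illustration.

```python
from itertools import islice, product
from string import ascii_lowercase


def az_range(max_len):
    """Yield every lowercase string of length 1..max_len: a, b, ..., z, aa, ab, ...

    Hypothetical stand-in for the {az:a:zzzz} macro; the ordering is assumed.
    """
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


# Number of suffixes from "a" to "zzzz": 26 + 26^2 + 26^3 + 26^4
suffixes_per_query = sum(26 ** n for n in range(1, 5))

base_queries = 4           # forum, plus its three translations
requests_per_minute = 2000  # speed quoted in the text

total_queries = base_queries * suffixes_per_query
hours = total_queries / requests_per_minute / 60

# First few expanded queries for the format "$query {az:a:zzzz}"
first_three = [f"forum {suffix}" for suffix in islice(az_range(4), 3)]

print(suffixes_per_query, total_queries, round(hours, 1))
print(first_three)
```

The estimate works out to a little under 16 hours, which matches the rounded figure quoted above.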

Using operators

You can use search operators in the query format, so that the operator is automatically added to each query from your list:

site:$query
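As an illustration of what this substitution does, the sketch below applies the same format string to a small list of queries using Python's `string.Template`, which happens to share the `$query` placeholder syntax. This only mimics the behaviour described here; it is not how A-Parser implements it, and the sample domains are placeholders.

```python
from string import Template

# Query format from the text; $query is replaced by each input query
query_format = Template("site:$query")

# Sample input queries (placeholders for illustration)
queries = ["www.baidu.com", "example.com"]

expanded = [query_format.substitute(query=q) for q in queries]
print(expanded)
```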

Result output options

A-Parser supports flexible formatting of results through the built-in Template Toolkit template engine. Results can be output in any free-form layout or in a structured format such as CSV or JSON.
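To make the two structured formats concrete, here is a minimal Python sketch that writes the same result rows as CSV and as JSON. It is written in Python rather than Template Toolkit, and the field names (`link`, `anchor`, `snippet`) and sample rows are assumptions for illustration, not A-Parser's actual variable names or real search results.

```python
import csv
import io
import json

# Hypothetical result rows; field names are assumed for this sketch
results = [
    {"link": "https://example.com/a", "anchor": "Example A", "snippet": "First sample snippet"},
    {"link": "https://example.com/b", "anchor": "Example B", "snippet": "Second, with a comma"},
]


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["link", "anchor", "snippet"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array, keeping non-ASCII text readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


print(to_csv(results))
print(to_json(results))
```

The CSV writer quotes the second snippet automatically because it contains a comma, which is the main reason to use a CSV library rather than joining fields by hand.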

Same as in SE::Google.

Keyword competition

Same as in SE::Google.

Saving in SQL format

Same as in SE::Google.

Dumping results in JSON

Similarly to SE::Google.

Results processing

A-Parser can process results directly during parsing. This section covers the most popular use cases for the Baidu parser.

Similarly to SE::Google.

Domain extraction

Similarly to SE::Google.
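For readers who want to see the idea outside A-Parser, the following Python sketch extracts unique domains from a list of links using the standard library. It is an illustration of the technique only, not the A-Parser mechanism; the `extract_domains` name and the sample links are hypothetical.

```python
from urllib.parse import urlsplit


def extract_domains(links):
    """Return unique host names from the links, in first-seen order."""
    seen = {}
    for link in links:
        host = urlsplit(link).hostname  # lowercased host, or None if absent
        if host:
            seen.setdefault(host, None)
    return list(seen)


# Sample links (placeholders for illustration)
links = [
    "https://www.baidu.com/s?wd=test",
    "http://Example.com/page",
    "https://www.baidu.com/other",
]
domains = extract_domains(links)
print(domains)
```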

Removing tags from anchors and snippets

Similarly to SE::Google.
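As a rough illustration of the same cleanup done outside A-Parser, this Python sketch strips HTML tags from an anchor or snippet with the standard-library `html.parser` module. The `strip_tags` helper is hypothetical, and the sample fragment is made up for the example.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(fragment):
    """Return the fragment with tags removed and character references decoded."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


clean = strip_tags("<em>Baidu</em> parser &amp; more")
print(clean)
```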

Possible settings

| Parameter name | Default value | Description |
| --- | --- | --- |
| Pages count | 5 | Number of pages to parse (from 1 to 100) |
| Links per page | 50 | Number of links in the search results on each page (10 / 20 / 50) |
| Get full links | ☐ | Conversion of truncated links to full ones (disabled by default) |
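
The limits listed above also explain the figure of up to 5000 results per query: 100 pages at 50 links per page. The Python sketch below models the three settings and checks their ranges. The `BaiduSettings` class and its field names are hypothetical and exist only for this illustration; they are not part of A-Parser.

```python
from dataclasses import dataclass

ALLOWED_LINKS_PER_PAGE = (10, 20, 50)


@dataclass
class BaiduSettings:
    """Hypothetical model of the settings above, with their documented defaults."""

    pages_count: int = 5
    links_per_page: int = 50
    get_full_links: bool = False

    def __post_init__(self):
        if not 1 <= self.pages_count <= 100:
            raise ValueError("pages_count must be between 1 and 100")
        if self.links_per_page not in ALLOWED_LINKS_PER_PAGE:
            raise ValueError("links_per_page must be 10, 20, or 50")

    def max_results(self):
        """Upper bound on results per query for these settings."""
        return self.pages_count * self.links_per_page


default_limit = BaiduSettings().max_results()
highest_limit = BaiduSettings(pages_count=100, links_per_page=50).max_results()
print(default_limit, highest_limit)
```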