Results Builder

Results Builder lets you transform the results from each scraper before they are formatted and saved to disk.

Capabilities

  • Splitting the result into parts using a regular expression or an arbitrary delimiter
  • Replacing text in the result, either as a literal substring or by regular expression
  • Extracting the domain or main domain from a link
  • Converting the result to uppercase or lowercase
  • Removing HTML tags (<b>text</b> -> text)
  • Converting HTML entities to their Unicode equivalents (&copy; -> ©)
  • Retrieving data using XPath queries
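
For readers who want to reason about these operations outside the interface, the splitting, replacing, and case-conversion capabilities can be approximated in plain Python. This is only an illustrative sketch, not the tool's implementation: the function names and sample strings are hypothetical.

```python
import re

# Hypothetical helpers that mimic a few Results Builder operations.

def split_by_delimiter(text, delimiter):
    """Split on a fixed delimiter string."""
    return text.split(delimiter)

def split_by_regex(text, pattern):
    """Split wherever the regular expression matches."""
    return re.split(pattern, text)

def replace_substring(text, old, new):
    """Replace every literal occurrence of `old` with `new`."""
    return text.replace(old, new)

def replace_regex(text, pattern, new):
    """Replace every regular-expression match with `new`."""
    return re.sub(pattern, new, text)

def change_case(text, upper=True):
    """Convert the whole string to uppercase or lowercase."""
    return text.upper() if upper else text.lower()

parts = split_by_regex("Red, Green;Blue", r"\s*[,;]\s*")
masked = replace_regex("order 66 of 100", r"\d+", "#")
```

Each helper takes one result string and returns a new value, which mirrors how a builder reads a source result and writes a transformed one.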

Results Builder screenshot

Examples

Parsing domains

Saving only domains when parsing links from search engines:

Parsing domains example

The source result is the link element of each entry in the serp array of the first scraper (p1). The main-domain extraction function is applied to each element, and the new result is saved under the same name (the link element in the serp array), so the result format does not need to change.
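
The effect of this builder can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `serp` list of dictionaries and the function names are hypothetical, and the "main domain" here is naively taken as the last two labels of the host name, which is wrong for suffixes such as co.uk (a real implementation would need the Public Suffix List).

```python
from urllib.parse import urlsplit

def extract_domain(link):
    """Return the host name of a link, or an empty string if there is none."""
    return urlsplit(link).hostname or ""

def extract_main_domain(link):
    """Naive main domain: keep only the last two labels of the host name."""
    host = extract_domain(link)
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host

# Hypothetical stand-in for the serp array of the first scraper.
serp = [
    {"link": "https://blog.example.com/post/1"},
    {"link": "http://www.example.org/"},
]

# Overwrite each link in place, so the output format stays unchanged.
for item in serp:
    item["link"] = extract_main_domain(item["link"])
```

Writing the value back under the same key is what lets the existing result format keep working without modification.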

Parsing snippets with cleaning

Saving snippets from search engines with HTML tags removed and HTML entities converted.

By default, anchors and snippets are parsed with all nested tags, which preserves the formatting shown in the search results. If only plain text is needed, you can use Results Builder:

Parsing snippets with cleaning example

In this example, two Results Builders are applied to the snippets in sequence: the first removes HTML tags, and the second converts HTML entities.
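
The same two-step cleaning can be approximated in Python as a minimal sketch. The function names and the sample snippet are hypothetical, and the regular expression is a simplification that does not handle every edge case of real HTML (for example, a ">" inside an attribute value).

```python
import html
import re

# Simplified pattern: anything between "<" and the next ">".
TAG_RE = re.compile(r"<[^>]+>")

def remove_html_tags(text):
    """Step 1: drop tags but keep the text between them."""
    return TAG_RE.sub("", text)

def decode_html_entities(text):
    """Step 2: convert entities such as &copy; to Unicode characters."""
    return html.unescape(text)

def clean_snippet(snippet):
    """Apply both steps in the same order as the two builders."""
    return decode_html_entities(remove_html_tags(snippet))

cleaned = clean_snippet("<b>Acme</b> &copy; 2024 &amp; partners")
```

Removing tags before decoding entities means that an escaped sequence such as &lt;b&gt; ends up as literal text in the output instead of being stripped as a tag.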

Parsing with XPath

Parsing links from search results using XPath:

Parsing with XPath example

This example shows parsing links from the google.com search engine. The XPath query used is:

//*[@id="rso"]/div[3]/div/div[1]/a/@href
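
To see what a query of this shape selects, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath. Two assumptions apply: the sample markup is invented and well-formed (real search result pages are not valid XML and change often), and because ElementTree cannot return attributes directly, the final `/@href` step is replaced by reading the `href` attribute from each matched element.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample markup; not real search engine output.
SAMPLE = """
<html><body>
<div id="rso">
  <div><div><div><a href="https://example.com/one">One</a></div></div></div>
  <div><div><div><a href="https://example.org/two">Two</a></div></div></div>
  <div><div>
    <div><a href="https://example.net/three">Three</a></div>
    <div><a href="https://example.net/skip">Skip</a></div>
  </div></div>
</div>
</body></html>
"""

def extract_links(markup):
    """Return href values matched by the query, minus the /@href step."""
    root = ET.fromstring(markup.strip())
    anchors = root.findall(".//*[@id='rso']/div[3]/div/div[1]/a")
    return [a.get("href") for a in anchors if a.get("href") is not None]

links = extract_links(SAMPLE)
```

Note that `div[3]` and `div[1]` are positional: the query picks the third block under the `rso` container and, inside it, only the first inner `div`, so a layout change on the page can silently alter what is matched.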