Results Builder

The Results Builder lets you transform the results of each scraper before they are formatted and saved to disk.

Capabilities

  • Splitting the result into parts by a regular expression or an arbitrary delimiter
  • Replacing a substring in the result, either literally or by a regular expression
  • Extracting the domain or main domain from a link
  • Converting the result to upper or lower case
  • Removing HTML tags (<b>text</b> -> text)
  • Converting HTML entities into their Unicode equivalents (&copy; -> ©)
  • Retrieving data using XPath queries
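
The text transformations in this list can be illustrated outside the product. The following is a minimal Python sketch, not the Results Builder's own implementation; the sample strings and variable names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sample result; real scraper output will differ.
result = "Apple, Banana;Cherry"

# Split by a regular expression (comma or semicolon, optional spaces).
parts_regex = re.split(r"\s*[,;]\s*", result)

# Split by an arbitrary literal delimiter.
parts_plain = result.split(";")

# Replace a literal substring.
replaced_plain = result.replace("Banana", "Mango")

# Replace with a regular expression (every run of digits becomes "#").
replaced_regex = re.sub(r"\d+", "#", "order 42 of 100")

# Convert to upper or lower case.
upper = result.upper()
lower = result.lower()

print(parts_regex, parts_plain, replaced_plain, replaced_regex, upper, lower)
```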

Examples

Domain parsing

Saving only domains when parsing links from search engines:

The source is the link elements of the serp array from the first scraper. A function that extracts the main domain from a link is applied to each element. The new result is saved under the same name (the link element in the serp array), so the result format does not need to be changed.
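
The same idea can be sketched in Python. This is an illustration under stated assumptions, not the Results Builder's code: the `serp` list of dictionaries and both function names are hypothetical, and the "main domain" rule here naively keeps the last two host labels, which is wrong for suffixes such as `co.uk`. A production version would need a public suffix list.

```python
from urllib.parse import urlsplit


def extract_domain(link: str) -> str:
    """Return the lowercased host of a link, without port or path."""
    return urlsplit(link).hostname or ""


def extract_main_domain(link: str) -> str:
    """Naive assumption: the main domain is the last two host labels."""
    host = extract_domain(link)
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


# Hypothetical stand-in for the serp array of the first scraper.
serp = [
    {"link": "https://www.example.com/page?x=1"},
    {"link": "http://blog.example.org:8080/post"},
]

# Overwrite each link in place, so the output format stays unchanged.
for item in serp:
    item["link"] = extract_main_domain(item["link"])

print(serp)
```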

Snippet parsing with cleaning

Saving snippets from search engines with HTML tags removed and HTML entities converted.

By default, anchors and snippets are parsed with all nested tags, which preserves the same formatting as in the search engine output. If you only need plain text, you can use the Results Builder to clean them.

In this example, two Results Builders are applied to the snippets in sequence: the first removes HTML tags and the second converts HTML entities.
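
A rough Python equivalent of this two-step chain is shown below. It is a sketch only: the function names and the sample snippet are hypothetical, and the regular expression is a simple approximation that may mishandle unusual markup such as a `>` inside an attribute value.

```python
import html
import re

# Simple approximation of a tag: "<", anything except ">", then ">".
TAG_RE = re.compile(r"<[^>]+>")


def remove_html_tags(text: str) -> str:
    """Step 1: drop tags and keep the text between them."""
    return TAG_RE.sub("", text)


def decode_html_entities(text: str) -> str:
    """Step 2: turn entities such as &copy; into Unicode characters."""
    return html.unescape(text)


# Hypothetical snippet as a search engine might return it.
snippet = "<b>Example</b> &copy; 2020 &amp; more"

# Apply the two transformations in sequence, tags first.
cleaned = decode_html_entities(remove_html_tags(snippet))
print(cleaned)
```

Removing tags before decoding entities matters: decoding first could turn an escaped `&lt;b&gt;` into a literal `<b>`, which the tag step would then strip as if it were markup.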

Parsing using XPath

Parsing links from search results using XPath:

This example shows how to parse links from Google search results. The following XPath query is used:

//*[@id="rso"]/div[3]/div/div[1]/a/@href
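
To see what this query selects, here is a minimal Python sketch run against a small, made-up page fragment; the markup in `PAGE` is an assumption and does not reflect Google's actual layout. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and has no `/@href` step, so the sketch selects the `a` elements with the same path and then reads the `href` attribute in code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a results page.
PAGE = """
<html><body>
<div id="rso">
  <div>first block</div>
  <div>second block</div>
  <div>
    <div>
      <div><a href="https://example.com/first">First</a></div>
      <div>snippet text</div>
    </div>
  </div>
</div>
</body></html>
"""

root = ET.fromstring(PAGE)

# Same path as the query above, minus the trailing /@href step.
anchors = root.findall(".//*[@id='rso']/div[3]/div/div[1]/a")

# Read the attribute that /@href would have returned.
links = [a.get("href") for a in anchors]
print(links)
```

Position-based steps such as `div[3]` depend on the exact page layout, so a query like this usually needs adjusting whenever the search engine changes its markup.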