Results Builder

Results Builder lets you transform the results from each scraper before they are formatted and saved to disk.

Capabilities

Splitting the result into parts with a regular expression or an arbitrary delimiter
Replacing a substring in the result, either literally or with a regular expression
Extracting the domain or main domain from a link
Converting the result to upper/lower case
Removing HTML tags (<b>text</b> -> text)
Converting HTML entities into their Unicode equivalents (&copy; -> ©)
Retrieving data using XPath queries
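
To make the string transformations listed above concrete, here is a minimal Python sketch of the same operations. It is not Results Builder configuration and does not use the tool's own functions; the sample string and variable names are assumptions made for illustration.

```python
import re

# Hypothetical sample result, used only for illustration.
result = "apple, banana;cherry"

# Split with a regular expression: a comma or semicolon plus optional spaces.
parts_by_regex = re.split(r"[,;]\s*", result)

# Split with an arbitrary fixed delimiter.
parts_by_delimiter = result.split(";")

# Replace a literal substring.
replaced_literal = result.replace("banana", "kiwi")

# Replace with a regular expression.
replaced_regex = re.sub(r"[,;]\s*", " | ", result)

# Convert to upper and lower case.
upper_case = result.upper()
lower_case = upper_case.lower()
```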

Examples

Domain parsing

Saving only domains when parsing links from search engines:

The source is the link elements of the serp array from the first scraper. A function that extracts the main domain from a link is applied to each element, and the new result is saved under the same name (the link element in the serp array), so the result format does not need to change.
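
The transformation described above can be sketched in Python as follows. This only illustrates the idea and is not how the tool implements it. The function names and sample links are hypothetical. The "main domain" step is a naive assumption that keeps the last two labels of the host, so it gives wrong answers for suffixes such as co.uk; a real implementation would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def extract_domain(link: str) -> str:
    """Return the lowercased host of a link, without scheme, port, or path."""
    return urlsplit(link).hostname or ""


def extract_main_domain(link: str) -> str:
    """Naive assumption: the main domain is the last two labels of the host."""
    labels = extract_domain(link).split(".")
    return ".".join(labels[-2:])


# Hypothetical stand-in for the serp array produced by the first scraper.
serp = [
    {"link": "https://www.example.com/page?x=1"},
    {"link": "http://blog.example.org:8080/post"},
]

# Overwrite each link in place, so the output format stays the same.
for item in serp:
    item["link"] = extract_main_domain(item["link"])
```

Writing the result back under the same key mirrors the point made above: downstream formatting keeps working because the element name is unchanged.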

Snippet parsing with cleaning

Saving snippets from search engines with HTML tags removed and HTML entities converted.

By default, anchors and snippets are parsed with all nested tags, which preserves the formatting shown in the search engine output. If only plain text is needed, you can use the Results Builder:

In this example, two Results Builders are applied to the snippets in sequence: the first removes HTML tags and the second converts HTML entities.
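
The two-step cleanup can be sketched in Python as below. This illustrates the order of operations and is not the tool's implementation. The function names and sample snippet are hypothetical, and the regular expression is a simplification that assumes no ">" inside attribute values.

```python
import html
import re

# Simplified tag pattern: anything between "<" and the next ">".
TAG_RE = re.compile(r"<[^>]+>")


def remove_html_tags(text: str) -> str:
    """Drop HTML tags and keep the text between them."""
    return TAG_RE.sub("", text)


def decode_html_entities(text: str) -> str:
    """Convert HTML entities such as &amp; into Unicode characters."""
    return html.unescape(text)


# Hypothetical snippet as it might arrive with nested markup.
snippet = "<b>Fast</b> &amp; simple &copy; 2024"

# Step 1 removes tags, step 2 decodes entities.
clean = decode_html_entities(remove_html_tags(snippet))
```

Removing tags before decoding entities means that escaped markup such as &lt;b&gt; survives as literal text instead of being stripped as a tag.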

Parsing using XPath

Parsing links from search results using XPath:

This example parses links from Google search results. The following XPath query is used:

//*[@id="rso"]/div[3]/div/div[1]/a/@href
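
To show how a query of this shape selects a link, here is a minimal Python sketch using only the standard library. The markup is a hypothetical, well-formed stand-in for a results page, not real Google output. The ElementTree module supports only a subset of XPath and cannot select attributes with /@href, so the sketch selects the a element and then reads its href attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page: three result blocks inside the "rso" container.
PAGE = """
<html><body>
<div id="rso">
  <div><div><div><a href="https://first.example/">First</a></div></div></div>
  <div><div><div><a href="https://second.example/">Second</a></div></div></div>
  <div><div><div><a href="https://third.example/">Third</a></div></div></div>
</div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Same path as the query above, minus the trailing /@href step.
anchors = root.findall(".//*[@id='rso']/div[3]/div/div[1]/a")

# Read the attribute that /@href would have returned.
links = [a.get("href") for a in anchors]
```

The positional step div[3] selects only the third result block, so this path yields a single link for the sample page.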
