Presentation and Formatting of Results

Available Formats for Saving Results

A-Parser formats results with the Template Toolkit templating engine, which makes it easy to save parsing results in various formats:

  • In text files as a list: one result per line, separated by a delimiter, in an arbitrary format
  • In CSV files with the possibility of further import into Excel, Google Docs, etc.
  • In XML, JSON, and other data storage formats
  • In HTML, generating pages on the fly
  • In SQL dump format for direct import into a database or directly writing to a SQLite database
  • In binary format for saving images (jpg, png, gif, ...), documents (pdf, docx, ...), executable files and archives (exe, dmg, zip, ...) and any other types of data

Editing the Result Format

The Result format is a template that shapes the results into the desired form. It is applied to each query-result pair.

Result formats
  • General result format is set in the Result format field
  • A separate result format for each scraper can be set in that scraper's settings, in its Result format field

A-Parser supports working with multiple scrapers in one task. In the general result format, you must indicate which scraper each result comes from:

  • $p1 - results from the first scraper (SE::Google in the screenshot); $p2 - results from the second scraper (SE::Bing in the screenshot)
  • The ordinal number of the scraper is displayed to the left of the scraper selection field
  • $p1.preset and $p2.preset mean that the result format is taken from the settings of the corresponding scrapers
  • In this example, $p1.preset can be replaced with $p1.serp.format('$link\n') which will have the same effect, while the result format from the settings will no longer be used
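To make the idea of formatting an array with a per-element template more concrete, here is a minimal Python sketch. It is only a model of the behaviour described above, not A-Parser's implementation: the function name format_array and the sample data are assumptions, and Python's string.Template stands in for the Template Toolkit syntax.

```python
from string import Template


def format_array(items, item_template):
    """Render item_template once per array element and join the pieces."""
    return "".join(Template(item_template).substitute(item) for item in items)


# Hypothetical sample data standing in for the $serp array of the first scraper.
serp = [
    {"link": "http://www.speedtest.net/", "anchor": "Speedtest.net by Ookla"},
    {"link": "http://test-ipv6.com/", "anchor": "Test your IPv6."},
]

# Rough analogue of $p1.serp.format('$link\n'): one link per line.
output = format_array(serp, "$link\n")
print(output, end="")
```

Each element contributes one rendered copy of the template, so a template ending in a newline yields one result per line.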

Result format can be specified in a convenient multi-line editor by clicking on the corresponding icon in the editing field:

Result Format editing field

The following variables are available in the general result format:

  • $query - the query after formatting
  • $query.* - all variables related to the query, described in the article Templates in queries
  • $p1, $p2, ... - variables for accessing the parsing results of each scraper separately (see Viewing available results)
  • $p1.query, $p2.query, ... - queries after formatting taking into account the query format specified in the settings of each scraper

Prepend and Append Text

Each result file can have its own Prepend and Append text. Typical uses:

  • For forming the header of a CSV file
  • For initial and final tags of an XML file
  • For the header, heading, and footer of HTML files
  • For any other applications
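As a rough illustration of how Prepend and Append text wraps the formatted results, consider the following Python sketch. It does not call A-Parser. The function name build_result_file and the sample rows are hypothetical and serve only to show a CSV header being written once, ahead of the per-query results.

```python
import csv
import io


def build_result_file(prepend, rows, append=""):
    """Write the prepend text once, then every row, then the append text."""
    buffer = io.StringIO()
    buffer.write(prepend)
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    buffer.write(append)
    return buffer.getvalue()


# Hypothetical rows; in a real task these would come from the result format.
rows = [
    ("http://www.speedtest.net/", "Speedtest.net by Ookla"),
    ("http://test-ipv6.com/", "Test your IPv6."),
]

# The prepend text plays the role of the CSV header line.
csv_text = build_result_file("link,anchor\n", rows)
print(csv_text, end="")
```

The same pattern would apply to the opening and closing tags of an XML file: the opening tag goes in the prepend text and the closing tag in the append text.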

To activate this feature, click on the More options button at the bottom of the Task Editor

Fields for start and end text

The prepend and append text supports the Template Toolkit templating engine. Available variables:

  • $query - the query after formatting
  • $query.* - all variables related to the query, described in the article Templates in queries
Note: These variables are available only when each query is saved to a separate file, or when the same variables are used in the Result file name format.

Result file name format

A-Parser allows you to use templates in the names of the resulting files as well, which allows you to automatically create files and folders based on the current date, by the query's serial number, by the query itself, and in any other format.

File name field

The following variables are supported in the File name field:

  • All variables available for the General result format
  • $queriesfile - the path and name of the file with queries; if the queries are entered through the form, it contains queries_from_text.txt
  • $datefile - a date plugin object of the Template Toolkit templating engine, configured with the date format %b-%d_%H-%M-%S; when rendered, it outputs the current date and time, for example May-08_20-08-38. The format can be changed in Additional settings

By default, the file name is created based on the date and time at the start of the task
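The date format string uses strftime-style specifiers, so its effect can be previewed outside A-Parser. The Python sketch below is only such a preview: the year is arbitrary (the example above shows no year), and %b depends on the locale, so the month abbreviation is "May" only under an English or default C locale.

```python
from datetime import datetime

# Format string quoted in the text above.
DATEFILE_FORMAT = "%b-%d_%H-%M-%S"

# May 8th, 20:08:38; the year is an arbitrary assumption.
sample_moment = datetime(2014, 5, 8, 20, 8, 38)

file_stamp = sample_moment.strftime(DATEFILE_FORMAT)
print(file_stamp)
```

Here %b is the abbreviated month name, %d the zero-padded day, and %H, %M, %S the zero-padded hour, minute and second.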

Complex example

reports/$queriesfile/${query}.txt
  • A folder named reports will be created
  • A subfolder with the name of the query file will be created
  • In the subfolder, one file will be created for each query used in the task; the query itself is used as the file name, with the .txt extension
Tip: The $query variable is written as ${query} so that the .txt extension is not interpreted as part of the variable name. See the Template Toolkit documentation for more details.
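The effect of the braces can be reproduced with a toy template expander. The Python sketch below is an assumption-laden model, not Template Toolkit itself: the regular expression, the functions lookup and expand, the sample variable values, and the choice to render a missing member as an empty string are all made up for illustration.

```python
import re

# Matches ${name.member} or $name.member; a dot continues the variable path.
VARIABLE = re.compile(r"\$\{(\w+(?:\.\w+)*)\}|\$(\w+(?:\.\w+)*)")


def lookup(path, variables):
    """Follow a dotted path; a missing member renders as an empty string."""
    value = variables
    for part in path.split("."):
        if isinstance(value, dict) and part in value:
            value = value[part]
        else:
            return ""
    return str(value)


def expand(template, variables):
    """Replace every variable reference in the template with its value."""
    return VARIABLE.sub(
        lambda m: lookup(m.group(1) or m.group(2), variables), template
    )


# Hypothetical values for the two variables used in the example.
variables = {"queriesfile": "keywords", "query": "speed test"}

with_braces = expand("reports/$queriesfile/${query}.txt", variables)
without_braces = expand("reports/$queriesfile/$query.txt", variables)
print(with_braces)
print(without_braces)
```

In this model the unbraced form treats txt as a member of query, finds nothing, and loses both the query and the extension, while the braced form ends the variable name at the closing brace and keeps .txt as literal text.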

⏩ Video. Naming result files

This video presents several examples of naming the result file:

  1. Numbering the result file according to the queries.
  2. Numbering the result file + part of the query name.
  3. Naming the result file by the query, if the query is a link.

Viewing available results

Each scraper has its own set of results. To view the list of available results, hover over the scraper: a tooltip displays the simple results and the arrays, along with their nested elements:

List of available results in a tooltip

Yellow highlights results common to all scrapers:
  • $query - the query passed to the scraper after formatting
  • $query.orig - the original query (as it was in the file or in the query input field)
  • $query.first - the first query when using nested parsing options (Parse all results or Parse to level)
  • $info.success - information about the success of parsing this query
  • $info.retries - the number of attempts used for this query
  • $info.stats - statistics of the scraper's work for this query
  • $pages.$i.data - an array with raw server responses for the possibility of extracting additional information independently
Green highlights results available only for the SE::Bing scraper:
  • $totalcount - the number of search results
  • $ads with elements $link, $anchor, $visiblelink, $snippet, $position, and $page - an array with a list of ads
  • $related.$i.key - an array with a list of related keywords
  • $serp with elements $link, $anchor, $snippet, $cache - an array with the main search engine results
Note: For arrays, the variable $i is shown explicitly to indicate that there are multiple elements, which can be accessed by index (position number) or iterated over in a loop.

Tip: The result $pages.$i.data is automatically replaced by $data for scrapers that do not "navigate through pages" within a single request, for example DeepL::Translator.

Presentation of Results

A-Parser was created for scraping information of all kinds. To support this, two types of results were introduced:

  • Simple Results (Flat)
  • Arrays of Results (Array)

Let's consider each type using the example of the scraper SE::Google, with a screenshot of the search results:

Screenshot of Google search results

Simple Results

Simple Results are used when one query corresponds to exactly one result. Examples:

  • The number of results for a query ($totalcount)
  • Whether the query contains a typo ($misspell, not shown in the screenshot)

Other examples:

  • The translated text ($translated) in the scraper DeepL::Translator
  • The number of referring domains ($domains), trust value ($trustflow), backlinks ($backlinks), etc. in the scraper Rank::MajesticSEO

Simple results are stored in regular variables (the prefix $ followed by a name in Latin script)

Arrays of Results

Arrays of Results are used when one query corresponds to a list of results, where each item in the list may contain several nested elements. Take Google search results as an example: the scraper represents them as the array $serp. For clarity, the first 5 results are written out as a table:

Link ($link) | Anchor ($anchor) | Snippet ($snippet)
http://www.speedtest.net/ | Speedtest.net by Ookla - The Global Broadband Speed Test | Test your Internet connection bandwidth to locations around the world with this interactive broadband speed test from Ookla.
http://en.wikipedia.org/wiki/Test_cricket | Test cricket - Wikipedia, the free encyclopedia | Test cricket is the longest form of the sport of cricket. Test matches are played between national representative teams with "Test status", as determined by the ...
http://www.speakeasy.net/speedtest/ | Speakeasy Speed Test | Saturday 03-May 2014, 11:04:29 AM Your IP: The Speakeasy Speed Test requires Flash v7 or higher. Please update your browser. See Pricing Or Call Today
http://www.humanmetrics.com/cgi-win/jtypes2.asp | Personality test based on C. Jung and I. Briggs Myers type theory | Humanmetrics Jung Typology Test™ instrument uses methodology, questionnaire, scoring and software that are proprietary to Humanmetrics, and shall not be ...
http://test-ipv6.com/ | Test your IPv6. | This will test your browser and connection for IPv6 readiness, as well as show you your current IPV4 and IPv6 address. ... Test your IPv6 connectivity. JavaScript ...

Each search position is recorded in an array with 3 nested elements - link ($link), anchor ($anchor), snippet ($snippet)

Another example - a list of related keywords, which is stored in the array $related:

Keyword ($key)
test wwe
depression test
test my speed
wonderlic test
test personality
act test
jiggle test
bipolar test

As you can see, this array has only one nested element - keyword ($key)

Array elements are numbered starting from 0. Examples of accessing individual array elements:

  • $serp.0.link - the first link from the search results
  • $serp.3.anchor - the fourth anchor from the search results
  • $related.0.key - the first related keyword
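The zero-based, dotted access paths can be modelled with ordinary lists and dictionaries. The Python sketch below is only an analogy and assumes that an array behaves like a list of records. The helper resolve is hypothetical, and the sample data is taken from the tables above.

```python
def resolve(path, results):
    """Follow a dotted path such as 'serp.3.anchor' through dicts and lists."""
    value = results
    for part in path.split("."):
        # A list is indexed by position (starting at 0), a dict by key.
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


# Data copied from the tables above (snippets omitted for brevity).
results = {
    "serp": [
        {"link": "http://www.speedtest.net/",
         "anchor": "Speedtest.net by Ookla - The Global Broadband Speed Test"},
        {"link": "http://en.wikipedia.org/wiki/Test_cricket",
         "anchor": "Test cricket - Wikipedia, the free encyclopedia"},
        {"link": "http://www.speakeasy.net/speedtest/",
         "anchor": "Speakeasy Speed Test"},
        {"link": "http://www.humanmetrics.com/cgi-win/jtypes2.asp",
         "anchor": "Personality test based on C. Jung and I. Briggs Myers type theory"},
    ],
    "related": [{"key": "test wwe"}, {"key": "depression test"}],
}

first_link = resolve("serp.0.link", results)
fourth_anchor = resolve("serp.3.anchor", results)
first_keyword = resolve("related.0.key", results)
print(first_link, fourth_anchor, first_keyword, sep="\n")
```

Index 3 selects the fourth element because counting starts at 0, which matches the examples in the list above.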

The formatting of simple results and arrays is described in more detail below.

Formatting Principles

After the scraper has collected data into simple results and arrays, the data needs to be output (saved to a file) in the desired format. For convenience and flexibility, A-Parser uses the Template Toolkit templating engine. Let's look at the most frequently used constructs with the help of the Templates Testing tool. First, select a project for the scraper SE::Google:

Templates testing