
Features of Template Operation in A-Parser

Templates Testing

A-Parser has a special tool for debugging and testing templates: Templates Testing

The .format method for arrays

In A-Parser, most results are presented as arrays with nested elements. In technical terms, the results are an array of hashes, where each hash has a fixed set of keys. Take the SE::Google scraper as an example: its results contain the array $serp with the elements $link, $anchor, and $snippet (among others):

"serp" : [
{
"link" : "http://www.speedtest.net/",
"anchor" : "Speedtest.net by Ookla - The Global Broadband Speed <b>Test</b>",
"snippet" : "<b>Test</b> your Internet connection bandwidth to locations around the world with this <br>interactive broadband speed <b>test</b> from Ookla."
},
{
"link" : "http://www.speakeasy.net/speedtest/",
"anchor" : "Speakeasy Speed <b>Test</b>",
"snippet" : "Speakeasy Speed <b>Test</b> - Broadband Speed <b>Test</b>. Go to MegaPath Speed <b>Test</b> ... <br>02:38:36 PM Your IP: The Speakeasy Speed <b>Test</b> requires Flash v7 or higher."
},
{
"link" : "http://en.wikipedia.org/wiki/Test_cricket",
"anchor" : "<b>Test</b> cricket - Wikipedia, the free encyclopedia",
"snippet" : "<b>Test</b> cricket is the longest form of the sport of cricket. <b>Test</b> matches are played <br>between national representative teams with &quot;<b>Test</b> status&quot;, as determined by the&nbsp;..."
}
]

To iterate over such an array and output its data, use the .format method, which renders every array element according to a given format and joins the results. For example, to output all links separated by a newline:

$serp.format('$link\n')

As a result, each link is written on its own line:

http://www.speedtest.net/
http://www.speakeasy.net/speedtest/
http://en.wikipedia.org/wiki/Test_cricket

Outputting snippets:

$serp.format('$snippet\n')
<b>Test</b> your Internet connection bandwidth to locations around the world with this <br>interactive broadband speed <b>test</b> from Ookla.
Speakeasy Speed <b>Test</b> - Broadband Speed <b>Test</b>. Go to MegaPath Speed <b>Test</b> ...<br>02:38:36 PM Your IP: The Speakeasy Speed <b>Test</b> requires Flash v7 or higher.
<b>Test</b> cricket is the longest form of the sport of cricket. <b>Test</b> matches are played <br>between national representative teams with &quot;<b>Test</b> status&quot;, as determined by the&nbsp;...

Links, anchors, and snippets simultaneously:

$serp.format('Link: $link, Anchor: $anchor, Snippet: $snippet\n')
Link: http://www.speedtest.net/, Anchor: Speedtest.net by Ookla - The Global Broadband Speed <b>Test</b>, Snippet: <b>Test</b> your Internet connection bandwidth to locations around the world with this <br>interactive broadband speed <b>test</b> from Ookla.
Link: http://www.speakeasy.net/speedtest/, Anchor: Speakeasy Speed <b>Test</b>, Snippet: Speakeasy Speed <b>Test</b> - Broadband Speed <b>Test</b>. Go to MegaPath Speed <b>Test</b> ... <br>02:38:36 PM Your IP: The Speakeasy Speed <b>Test</b> requires Flash v7 or higher.
Link: http://en.wikipedia.org/wiki/Test_cricket, Anchor: <b>Test</b> cricket - Wikipedia, the free encyclopedia, Snippet: <b>Test</b> cricket is the longest form of the sport of cricket. <b>Test</b> matches are played <br>between national representative teams with &quot;<b>Test</b> status&quot;, as determined by the&nbsp;...

The format string can also reference the original query (or any other available variable), which makes it possible to tie each array element to the query that produced it:

$serp.format('$query: $link\n')
test: http://www.speedtest.net/
test: http://www.speakeasy.net/speedtest/
test: http://en.wikipedia.org/wiki/Test_cricket
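One way to picture how a format string can see both element keys and outer variables such as $query is a two-level lookup. The Python sketch below models this with ChainMap; the helper name format_with_scope is hypothetical, and the lookup order (element keys first, outer variables second) is an assumption of the sketch, not documented A-Parser behaviour.

```python
from collections import ChainMap
from string import Template

serp = [
    {"link": "http://www.speedtest.net/"},
    {"link": "http://www.speakeasy.net/speedtest/"},
    {"link": "http://en.wikipedia.org/wiki/Test_cricket"},
]


def format_with_scope(items, fmt, outer):
    """Render fmt per element; names missing in the element fall back to outer."""
    template = Template(fmt)
    return "".join(
        template.substitute(ChainMap(item, outer))  # element first, then outer
        for item in items
    )


# Equivalent in spirit to $serp.format('$query: $link\n') for the query "test"
result = format_with_scope(serp, "$query: $link\n", {"query": "test"})
print(result, end="")
```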

The .json method for objects

All data in A-Parser is available as variables. The .json method serializes such data (converts it to a String) in JSON format. For example:

$results.json
Example
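For readers more familiar with general-purpose languages, the effect of .json is comparable to a standard JSON serialization call. The Python sketch below shows the idea; the shape of the results object here is a made-up illustration and does not reproduce the real structure of $results.

```python
import json

# Hypothetical data; the real $results object has a scraper-specific structure.
results = {
    "query": "test",
    "serp": [
        {"link": "http://www.speedtest.net/", "anchor": "Speedtest.net by Ookla"},
    ],
}

# Serialization turns the nested structure into a single string.
serialized = json.dumps(results)
print(serialized)
```

The string can be written to a result file and parsed back later by any JSON-aware tool.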

Static template flag in Result File Name

The isStaticTemplate() flag makes a dynamic template in the Result File Name static. When the flag is present in the Result File Name, the template is executed once at the start of the task and is treated as static from then on. This allows more flexible file naming while keeping the ability to get links to the files through the API method getTaskResultsFile.

Usage example:

[% isStaticTemplate(); tools.js.eval('Date.now()') %]
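The difference between a dynamic and a static file name template can be sketched as "render on every use" versus "render once and reuse". The Python model below is purely illustrative: the class FileNameTemplate and the counter standing in for Date.now() are hypothetical and say nothing about how A-Parser implements the flag internally.

```python
from itertools import count


class FileNameTemplate:
    """Toy model: a static template is rendered once, up front, and reused."""

    def __init__(self, render, static=False):
        self._render = render
        self._fixed = render() if static else None

    def name(self):
        # Dynamic templates are re-rendered on every call.
        return self._fixed if self._fixed is not None else self._render()


ticks = count(1000)  # stand-in for Date.now(): a new value on every call


def render():
    return f"{next(ticks)}.txt"


dynamic = FileNameTemplate(render)
dynamic_names = [dynamic.name() for _ in range(3)]

static = FileNameTemplate(render, static=True)
static_names = [static.name() for _ in range(3)]

print(dynamic_names)
print(static_names)
```

Because the static name never changes after the first render, it can be known in advance, which is what makes a stable link to the result file possible.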

Available variables

When formatting queries

When formatting results

When forming the result file name

When filtering results

Variables Interpolation

By default, templates are enclosed between the tags [% and %]; everything outside the tags is plain text that is passed to the result as is. A-Parser additionally enables variable interpolation, which allows accessing variables in the text using the $ symbol. In addition, \n is interpolated as a newline.

Example:

Total results for query $query: $totalcount\n

The values of the corresponding variables are substituted for $query and $totalcount, and \n is replaced by a newline. The equivalent notation without interpolation:

Total results for query [% query %]: [% totalcount; "\n" %]
Note that in Template Toolkit templates, variables are written without the $.
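The interpolation rules described above ($name is replaced by the variable's value, the two characters \n become a newline) can be sketched in a few lines of Python. This is a simplified model with a hypothetical function name; it ignores the [% ... %] tags and any edge cases the real engine handles.

```python
import re


def interpolate(text, variables):
    # Turn the literal two-character sequence backslash + n into a newline.
    text = text.replace("\\n", "\n")
    # Replace each $name with the value of the matching variable.
    return re.sub(r"\$(\w+)", lambda m: str(variables[m.group(1)]), text)


line = interpolate(
    r"Total results for query $query: $totalcount\n",
    {"query": "test", "totalcount": 1250},
)
print(line, end="")
```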

Template Use Cases

Formatting queries

Example

In this example, the search operator site: will be added to each domain from the file Alexa top500.txt, and substitutions from the words.txt file will be added after a space.
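The query construction described here can be approximated outside A-Parser as follows. The sketch assumes that every domain is combined with every word, which is one plausible reading of the description; the sample domains and words are placeholders, not the contents of the actual files.

```python
from itertools import product


def build_queries(domains, words):
    """Prefix each domain with site: and append each word after a space."""
    return [f"site:{domain} {word}" for domain, word in product(domains, words)]


# Placeholder values standing in for the lines of the two input files.
queries = build_queries(["example.com", "example.org"], ["news", "blog"])
for query in queries:
    print(query)
```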

Formatting results

Template in result format

In this example, the query, the number of results in the output, and the number of related keywords will be displayed. A list of collected anchors will also be displayed.

Templates for result filtering

Filtering template

To set a template, select Custom Template.
In this example, only those queries for which fewer than 5 results were collected are output as the result.
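The filtering condition itself is a simple threshold check. The Python sketch below mirrors it on made-up data; the function name and the sample counts are hypothetical, and the actual filter is written as an A-Parser template rather than in Python.

```python
def keep_query(result_count, limit=5):
    """Keep a query only when fewer than `limit` results were collected."""
    return result_count < limit


# Hypothetical queries mapped to the number of results collected for each.
collected = {"rare phrase": 3, "test": 10, "obscure term": 0}
kept = [query for query, n in collected.items() if keep_query(n)]
print(kept)
```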

Templates when using the Use Regex option

Templates with regex using

In this example, the scraper collects sentences that contain the word passed as the second argument of the query. The algorithm is as follows: the Query Builder splits the query into a link and a word using the specified delimiter; the scraper follows the link and extracts the text; the word from the query is substituted into the regular expression, which is then used to collect the matching sentences.
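The steps of this algorithm can be sketched in Python. Everything specific here is an assumption made for illustration: the space delimiter, the function names, the sentence pattern, and the hard-coded page text that stands in for the downloaded page. The regular expression used in the actual example may differ.

```python
import re


def split_query(query, delimiter=" "):
    """Split the query into a link and a word at the first delimiter."""
    link, word = query.split(delimiter, 1)
    return link, word


def sentences_with(word, text):
    """Collect sentences containing `word` as a whole word, ignoring case."""
    pattern = re.compile(
        r"[^.!?]*\b" + re.escape(word) + r"\b[^.!?]*[.!?]",
        re.IGNORECASE,
    )
    return [match.group(0).strip() for match in pattern.finditer(text)]


link, word = split_query("http://example.com/page parser")

# Stand-in for the text the scraper would extract from the page at `link`.
page_text = "A parser reads input. The weather is nice. Every Parser needs tests!"
found = sentences_with(word, page_text)
print(found)
```

Escaping the word with re.escape before inserting it into the pattern keeps characters such as dots or brackets in the query from being interpreted as regex syntax.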

Download example

How to import the example into A-Parser

eJyNVE1T2zAQ/StUEwYo1EkoDK0vncA003YCoSScHLejxmtXjWwZSYZkQv57d2XH
diiHXmTp6b3d1X54zSw3C3OrwYA1zA/WLHd75rMIYl5Iy05YzrUBTdcB+zK9Hvn+
FJb289JqPrdKI6Pmrpld5YDqeWGsSu/AlCZ0ufEDZlFqEKEvC+kmgSUKDoNZ0Tvr
997R5zQOgx/em0+zWTab6fA42N97KECvvAWs9vZ37t4GuA+Pj1hlalr6F0nttUK4
1nyFoPve8JQwY7XIElNT6Y0VyMJNGNb4UOmUU1Y6ed+rVF7swMODTglgQAdHjakJ
f4SpQkksJDTwEE+V907ELdDt1tKRZ5eULR5FwgqVcVn6pbCaWO4zgclAfaaQS3kR
YIZapQi5rJbgahtzwDruTNkunPZ7qWF+zKWBE2Yw1CHHQKKXN8KC5ljjcU7xIL5m
KhtIOYJHkA3N2b8shIywTwYxir5Wwtcp439sbOrntV09gn7SGENtxZ0ux9eNKlIj
lWyTIUUqLJ7NlSoyKlcPwQVAXufshmip0lC7qSxX3nEUcsgiZDYlG+QNtPMMNzBG
FXpOpsskn2wnoGyKSS4FVcQATlGZEPZMJEWF2Uqwranfao8tBwjOVRaLZIzJ0CKC
bTMU2RRnd5xdqTSXQDnKCimxxAbumlYbmKqkdGge+1J85VzsTL1VSppvk/LZuRYY
8DkFmGJV2l4rk3Mu5f3dqH3DmvbEw29rc+N3u7rwnsRC5BAJ7imddOnUdVPfp/X9
B7dGtJ6dun3PrdwhLc7Zhduft3DesvDxZ0sctfa/WqSL5/8hMXqfhURhtTD7VKnq
p1j/Otev/hr99QZ79I+5LdlUBOIihtU02IDM72/+Ar1q4iA=

Templates in parser settings

Example of random user-agent substitution

Preset macros configuration

In A-Parser, you can configure template macros and preset variables that will be globally available for all templates; global macros can be specified in Settings -> Advanced Settings.

By default, it already contains the predefined $datefile object, which is used to format the time in the result file name.

Adding and using a macro

This example shows how to set a global variable. This can be useful, for example, if you need to use the same cookies in several Instagram scrapers.

Setup example:

Example
caution

Note the -%] syntax of the template's closing bracket. It is needed to remove the newline that follows the tag; otherwise, an empty line would be added at the beginning of the output whenever any template is used.

Usage example:

Example