Presentation and formatting of results
Available formats for saving results
To format results in A-Parser, the Template Toolkit template engine is used, which makes it easy to save parsing results in various formats:
- In text files as a list: one result per line, separated by a delimiter, in any format
- In CSV files that can then be imported into Excel, Google Docs, etc.
- In XML, JSON, and other data storage formats
- In HTML, by generating pages on the fly
- In SQL dump format for direct import into a database, or by writing directly to an SQLite database
- In binary format for saving images (jpg, png, gif, ...), documents (pdf, docx, ...), executable files and archives (exe, dmg, zip, ...), and any other data types
Editing result format
Result format allows you to format results to the desired view using templates and is applied to each query-result combination.
- The common result format is set in the Result format field
- The result format for each parser can be set separately in the parser settings, under Result format
A-Parser supports working with several parsers in one task. In the common result format, you must indicate which parser each result comes from:
- $p1 - results from the first parser (SE::Google in the screenshot), $p2 - results from the second parser (SE::Bing in the screenshot)
- The parser number is displayed to the left of the parser selection field
- $p1.preset and $p2.preset mean that the result format is taken from the settings of the corresponding parser
- In this example, $p1.preset can be replaced with $p1.serp.format('$link\n'), which has the same effect, except that the result format from the parser settings is not used
Result format can be specified in a convenient multiline editor by clicking on the corresponding icon in the editing field:
The following variables are available in the common result format:
- $query - query after formatting
- $query.* - all variables related to the query, described in the article Templates in queries
- $p1, $p2, ... - variables for accessing parsing results for each parser separately (View possible results for each parser)
- $p1.query, $p2.query, ... - queries after formatting with the query format specified in the settings of each parser
Prepend and append text
A separate Prepend/Append text can be specified for each result file. Typical uses:
- To form the header of a CSV file
- For initial and final tags of an XML file
- For the header, body, and footer of HTML files
- For any other options
To activate this feature, click the button at the bottom of the Task Editor
The Prepend and Append text supports the Template Toolkit template engine. The following variables are available:
- $query - query after formatting
- $query.* - all variables related to the query, described in the article Templates in queries
Important! These variables are only available when saving each query to a separate file or when using these same variables in the Result file name format.
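How the prepend text, the formatted results, and the append text end up in one file can be modelled outside A-Parser. The Python sketch below is not A-Parser code: it assumes that a result file is simply the prepend text, followed by every formatted result, followed by the append text. The function name, the semicolon delimiter, and the second sample row are hypothetical.

```python
def build_result_file(prepend, formatted_results, append):
    """Assumed model: prepend text + all formatted results + append text."""
    return prepend + "".join(formatted_results) + append


# Hypothetical CSV case: the header line lives in the prepend text,
# and every query contributes one already formatted line.
rows = ['"test";3910000000\n', '"speed test";278000000\n']
csv_text = build_result_file('"query";"totalcount"\n', rows, "")

# Hypothetical XML case: opening and closing tags come from prepend/append.
xml_text = build_result_file("<results>\n", ["<q>test</q>\n"], "</results>\n")

print(csv_text, end="")
print(xml_text, end="")
```

The same model covers the HTML case: the page header goes into the prepend text, the footer into the append text, and the result format produces the body.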
Result file name format
A-Parser supports templates in result file names, which makes it possible to create files and folders automatically based on the current date, the sequence number of the query, the query itself, or any other format.
The following variables are supported in the File name field:
- All variables available for the Common result format
- $queriesfile - the path and file name of the queries file; if the queries are specified through the form, it contains queries_from_text.txt
- $datefile - the date plugin object of the Template Toolkit template engine, configured with the date format %b-%d_%H-%M-%S; when formatted, it outputs the current date and time in the form May-08_20-08-38. The format can be changed in the Additional settings
By default, the file name is created based on the date and time at the start of the task.
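The date pattern %b-%d_%H-%M-%S uses standard strftime codes, so its effect can be checked outside A-Parser. The Python sketch below is only an illustration: the function name, the year 2021, and the .txt extension are assumptions, and the month abbreviation depends on the active locale (the default C locale gives English names).

```python
from datetime import datetime

# Default date format quoted in the description of $datefile.
DATEFILE_FORMAT = "%b-%d_%H-%M-%S"


def datefile_name(moment, extension=".txt"):
    """Build a file name from a moment in time (hypothetical helper)."""
    return moment.strftime(DATEFILE_FORMAT) + extension


# 8 May, 20:08:38; the year is arbitrary because the format omits it.
name = datefile_name(datetime(2021, 5, 8, 20, 8, 38))
print(name)
```

Because the pattern contains no year, tasks started at the same date and time in different years would produce the same name under this format.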
Complex example
reports/$queriesfile/${query}.txt
- A reports folder will be created.
- A subfolder with the name of the query file will be created.
- In the subfolder, as many files will be created as there are queries used in the task, and the name of the file will be the query itself with the .txt extension.
The $query variable is written in the format ${query} to prevent interpolation of the .txt extension as part of the variable, see the documentation for the Template Toolkit templating engine for more information.
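Why the braces matter can be shown with a small model of the interpolation rule. The Python sketch below is not the Template Toolkit implementation: it assumes that a dot after a variable name continues the variable path and that an unknown path renders as an empty string (the real engine may behave differently, for example by reporting an error). The names render and lookup and the sample query are hypothetical.

```python
import re

# Matches ${name.path} or $name.path; outside braces a dot followed by
# word characters is treated as part of the variable path.
_VAR = re.compile(r"\$\{([\w.]+)\}|\$(\w+(?:\.\w+)*)")


def lookup(variables, path):
    """Walk a dotted path through nested dicts; unknown parts give ''."""
    value = variables
    for key in path.split("."):
        value = value.get(key, "") if isinstance(value, dict) else ""
    return str(value)


def render(template, variables):
    """Replace every variable reference in the template with its value."""
    return _VAR.sub(lambda m: lookup(variables, m.group(1) or m.group(2)), template)


variables = {"queriesfile": "queries_from_text.txt", "query": "speed test"}

with_braces = render("reports/$queriesfile/${query}.txt", variables)
without_braces = render("reports/$queriesfile/$query.txt", variables)
print(with_braces)     # the query followed by the .txt extension
print(without_braces)  # "$query.txt" was read as one path, so the name is lost
```

In this model the version without braces looks for a txt element inside the query and finds nothing, which is the failure the ${query} form avoids.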
⏩ Video. Naming result files
In this video, we will give several examples of naming the result file:
- Numbering the result file according to the queries.
- Numbering the result file + part of the query name.
- Naming the result file according to the query, if the query is a link.
Viewing available results
Each scraper has its own set of results. To view the list of available results, hover the pointer over the scraper: the pop-up tooltip displays the simple results and the arrays, together with their nested elements:
- $query - the query passed to the scraper after formatting
- $query.orig - the original query (as it was in the file or in the query input field)
- $query.first - the first query when using nested parsing options (Parse all results or Parse to level)
- $info.success - information about the success of parsing this query
- $info.retries - the number of attempts used for this query
- $info.stats - the scraper's work statistics for this query
- $pages.$i.data - an array of unprocessed server responses for the possibility of extracting additional information independently

- $totalcount - the number of search results
- $misspell - whether there is a typo in the query
- $detected_geo - the detected geo
- $ads with elements $link, $anchor, $visiblelink, $snippet, $position, and $page - an array with a list of ads
- $related.$i.key - an array with a list of related keywords
- $rich.$i.name - an array with extended snippets
- $serp with elements $link, $anchor, $snippet, $amp - an array with the main search engine results
Note that for arrays, the variable $i is explicitly specified, which means that there are several elements, and they can be accessed by index (position number) or iterated over each element in a loop.
For scrapers that do not "go through pages" within one query, such as DeepL::Translator, the $pages.$i.data result is automatically replaced with $data.
Results representation
A-Parser was created for parsing information of any kind. For this purpose, 2 types of results were introduced:
- Simple results (Flat)
- Result arrays (Array)
Let's consider each type using the SE::Google scraper as an example. A screenshot of the search results:
Simple results
Simple results - when one query corresponds to one result, examples:
- The number of results for the query ($totalcount)
- Whether the query contains a typo ($misspell, not shown in the screenshot)
Other examples:
- The Alexa Rank value ($rank) in the Rank::Alexa scraper
- The translated text ($translated) in the DeepL::Translator scraper
- The number of referring domains ($domains), the trust value ($trustflow), backlinks ($backlinks), etc. in the Rank::MajesticSEO scraper
Simple results are stored in regular variables (the $ prefix followed by a name in Latin letters)
Result arrays
Result arrays - when one query corresponds to a list of results, and each element of the list can in turn contain several nested elements. Let's take the Google search results as an example: in the scraper they are represented by the $serp array. For clarity, the first 5 search results are written down in a table:
Link($link) | Anchor($anchor) | Snippet($snippet) |
---|---|---|
http://www.speedtest.net/ | Speedtest.net by Ookla - The Global Broadband Speed Test | Test your Internet connection bandwidth to locations around the world with this interactive broadband speed test from Ookla. |
http://en.wikipedia.org/wiki/Test_cricket | Test cricket - Wikipedia, the free encyclopedia | Test cricket is the longest form of the sport of cricket. Test matches are played between national representative teams with "Test status", as determined by the ... |
http://www.speakeasy.net/speedtest/ | Speakeasy Speed Test | Saturday 03-May 2014, 11:04:29 AM Your IP: The Speakeasy Speed Test requires Flash v7 or higher. Please update your browser. See Pricing Or Call Today |
http://www.humanmetrics.com/cgi-win/jtypes2.asp | Personality test based on C. Jung and I. Briggs Myers type theory | Humanmetrics Jung Typology Test™ instrument uses methodology, questionnaire, scoring and software that are proprietary to Humanmetrics, and shall not be ... |
http://test-ipv6.com/ | Test your IPv6. | This will test your browser and connection for IPv6 readiness, as well as show you your current IPV4 and IPv6 address. ... Test your IPv6 connectivity. JavaScript ... |
Each search result is one element of the array and has 3 nested elements - link ($link), anchor ($anchor), snippet ($snippet)
Another example is a list of related keywords, which is stored in the $related array:
Keyword($key) |
---|
test wwe |
depression test |
test my speed |
wonderlic test |
test personality |
act test |
jiggle test |
bipolar test |
As can be seen, this array has only one nested element - keyword ($key)
The numbering of array elements starts from 0. Examples of accessing individual array elements:
- $serp.0.link - the first link from the search results
- $serp.3.anchor - the fourth anchor from the search results
- $related.0.key - the first related keyword
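The dotted access paths map naturally onto nested lists and dictionaries. The Python sketch below is only a model of that idea, not A-Parser internals: the resolve function is hypothetical, and the data is taken from the first four rows of the table above and the first two related keywords.

```python
result = {
    "serp": [
        {"link": "http://www.speedtest.net/",
         "anchor": "Speedtest.net by Ookla - The Global Broadband Speed Test"},
        {"link": "http://en.wikipedia.org/wiki/Test_cricket",
         "anchor": "Test cricket - Wikipedia, the free encyclopedia"},
        {"link": "http://www.speakeasy.net/speedtest/",
         "anchor": "Speakeasy Speed Test"},
        {"link": "http://www.humanmetrics.com/cgi-win/jtypes2.asp",
         "anchor": "Personality test based on C. Jung and I. Briggs Myers type theory"},
    ],
    "related": [{"key": "test wwe"}, {"key": "depression test"}],
}


def resolve(data, path):
    """Follow a dotted path: numeric parts index lists, other parts are keys."""
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # positions are counted from 0
        else:
            current = current[part]
    return current


first_link = resolve(result, "serp.0.link")
fourth_anchor = resolve(result, "serp.3.anchor")
first_keyword = resolve(result, "related.0.key")
print(first_link, fourth_anchor, first_keyword, sep="\n")
```

Index 3 returns the fourth anchor because counting starts from 0, which matches the examples in the list above.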
More details about formatting simple results and arrays will be described below.
Formatting principles
After the scraper has collected the data into simple results and arrays, the data must be displayed (saved to a file) in the required format. For convenience and functionality, A-Parser uses the Template Toolkit template engine. Let's look at frequently used constructions using the Template testing tool, with the project for the SE::Google scraper selected:
The screenshot shows 3 fields:
- JSON - internal representation of data in the scraper
- Template - the template according to which the result is formatted
- Result - the data converted according to the specified template, in the form in which the result will be written to the file
By changing the template, we can change the appearance of the result. Let's consider the following template.
Text in the Template field:
Report for query: $query
Competition: $totalcount
List of links, anchors, and snippets:
$serp.format('$link $anchor\n$snippet\n\n')
The main rules are:
- Ordinary text is output to the result as is, without changes
- To output a simple result, place the variable containing it, with the $ prefix, in the right place
- To format arrays, the format method is used, which will be described below
- \n is responsible for a line break
Formatting arrays
Let's consider the construction:
$serp.format('$link $anchor\n$snippet\n\n')
This entry means that the format method is called on the $serp array with the parameter '$link $anchor\n$snippet\n\n'. The format method concatenates all elements of the array into a single string according to the template given in the parameter. Here the template means: for each element of the $serp array, output the link and the anchor separated by a space, then the snippet on a new line, followed by two line breaks, which leaves an empty line between results.
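The behaviour described above, rendering the template once per element and joining the pieces, can be imitated in a few lines of Python. This is a sketch of the idea only, not the A-Parser implementation: format_array is a hypothetical name, Python's string.Template stands in for the real template syntax, and the anchors and snippets are shortened sample values.

```python
from string import Template


def format_array(items, template):
    """Render the template for each element and concatenate the results."""
    compiled = Template(template)
    return "".join(compiled.safe_substitute(item) for item in items)


serp = [
    {"link": "http://www.speedtest.net/",
     "anchor": "Speedtest.net by Ookla",
     "snippet": "Test your Internet connection bandwidth."},
    {"link": "http://test-ipv6.com/",
     "anchor": "Test your IPv6.",
     "snippet": "This will test your browser and connection for IPv6 readiness."},
]

text = format_array(serp, "$link $anchor\n$snippet\n\n")
print(text, end="")
```

Each element contributes a link-and-anchor line, a snippet line, and an empty line, so the output for two elements is six lines long.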
Using the template engine
Outputting variables
To use the template engine, insert [% %] tags and put the logic to be executed inside them.
Looping through an array
To output array elements, use the FOREACH construction:
[%
FOREACH i IN p1.list;
i.cms _ "\n";
END
%]
For more information and examples on the template engine, see Features of working with templates in A-Parser.
Examples
Outputting competition
Outputting the competition for a query (the number of results for a query) for all search engine scrapers (SE::Google, SE::Yandex, ...).
Result format:
$query: $totalcount\n
Result:
test: 3910000000
viagra: 278000000
окна пвх: 3220000
...
Parsing links
Outputting links from search engine results.
Result format:
$serp.format('$link\n')
Result:
http://www.speedtest.net/
http://www.speakeasy.net/speedtest/
http://en.wikipedia.org/wiki/Test_cricket
http://www.humanmetrics.com/cgi-win/jtypes2.asp
http://html5test.com/
http://test-ipv6.com/
...
Parsing suggestions
Outputting search engine suggestions.
Result format:
$results.format('$suggest\n')
Result:
тестовый сервер танки онлайн
тесты гиа по русскому языку
тесто для блинов рецепт
тестикула
тесто для пиццы на молоке
Outputting data about the response
In Net::HTTP and scrapers based on it, additional output is available:
- $proxy - the proxy through which the request was executed
- $headers - response headers
- $code - response code
- $reason - response status
Output of variable values in JSON
$results.json
The .json method outputs data in JSON format.
Output of all request redirects
For this task, the variable $response is available, which gives access to any request variables, including all previous redirects.
Result format:
$response.Redirects.format('$URI\n--> ')$response.URI
Result:
Output in JSON using a template engine to record the date
The example shows the output of the Net::Whois scraper results in JSON format.
Each record contains the domain that was checked, the date of the check, and the check result. As the Result format shows, the date is obtained using the Template Toolkit template engine.
Result format:
{
"domain": "$query",
"date": "[% USE d = date(format = '%d.%m.20%y', locale = 'C');d.format() %]",
"expire": "$p1.expire_date",
},
Example result:
[{
"domain": "a-parser.com",
"date": "05.05.2021",
"expire": "25.02.2022",
},
]
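Note that the example result contains trailing commas (after the last field and after the last object), which strict JSON parsers reject. If the file is meant to be read by other software, this can be checked before use. The Python sketch below does only that check: the function name is hypothetical, and the second string is a hand-edited variant of the example result, not A-Parser output.

```python
import json

# The example result exactly as shown above.
as_shown = """[{
"domain": "a-parser.com",
"date": "05.05.2021",
"expire": "25.02.2022",
},
]"""

# The same data with both trailing commas removed by hand.
without_trailing_commas = """[{
"domain": "a-parser.com",
"date": "05.05.2021",
"expire": "25.02.2022"
}
]"""


def is_strict_json(text):
    """Return True if Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_strict_json(as_shown))
print(is_strict_json(without_trailing_commas))
```

Consumers that tolerate trailing commas will read the original output as is; strict ones need the commas handled, either in the template or in a post-processing step.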
Download example
How to import an example into A-Parser
eJxtVG1v2jAQ/ivWCUQrZaxM2pdM+0BRkTYx6ErRPgQ0efWFeXXszHYYVZT/3nMS
EtruQyTf23P33EtK8Nw9uluLDr2DOCkhr98QwxJ9HP/4baRj79jNkWe5QjaBCHJu
HdrgnZw5kUFgygvlYbeLgFDo6ebGZjyglVvNGNuCMBmXegsxvQd/C7RPW4hONu6x
sSRDtlnfMME+s6C8SGsYkkZDMR5m4w9Xw6dRxJR54FQUqWejy09i3LhdXLLhrkfF
Yy5tizvIJ+NG/tkkI6eKPugKXvMD3hsqOJUKe/WcpCXPkAyDEBmsXbqxP3py5UJI
L43mqmEdOtR3YqMl0aV4bcg3MJfo5tZkfa66HaeOJW17gCCKOvZ7EwNxypXDCByV
OucUKl5bpEfLvbGrPNRD+hKMniq1wAOq3q3Gvy6kEjTOaUpBX9rA/7us3mBUHb3z
VAe0/yzVALG3BYHUwvXqWx8kzMLsibj4RbSVzKQn2c1MocOuXJHyETHvWrYMLcuM
xS5LA9zmpvXNUQtyTPqJTfNWV1Y7eEXkxWBeKh+MTuV+RRSsFHjyLPQ93clKz0y4
gsBMF0rRYBze9Qsyde0ggtC18E3wrE4R2Lf3EoE3Rrmva9KF+7KSFvBjKDCjXp5n
bSFp69XmbnFugbOlqrMnJ/F9c3Ku3tLAkNZ3b2ixiFq16865+weU50cdlxWN64+7
bZwCdK2MgDrkaBYQT6pnHsF5pA==
Checking a site for presence in Google News by keyword
Result format:
[%
linksToOneString = p1.serp.format('$link. ');
matches = linksToOneString.match('.+?(' _ p1.query.domain _ ').+?');
IF matches.0;
p1.query.orig _';yes' _ "\n";
ELSE;
p1.query.orig _';no' _ "\n";
END
%]
Example result:
парсер гугл|a-parser.com;no
парсер гугл|forbes.ru;yes
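The logic of this template can be traced step by step in Python. The sketch below is a model, not A-Parser code: it assumes the match call behaves like an unanchored regular expression search, and the function names and URLs are hypothetical.

```python
import re


def site_in_results(links, domain):
    """Mirror the template: join links with '. ' and search for the domain."""
    joined = "".join(link + ". " for link in links)  # like serp.format('$link. ')
    # Same pattern as in the template; the domain is inserted unescaped,
    # so a dot in it matches any character.
    pattern = ".+?(" + domain + ").+?"
    return re.search(pattern, joined) is not None


def report_line(original_query, links, domain):
    """Build one output line: the original query plus ';yes' or ';no'."""
    verdict = ";yes" if site_in_results(links, domain) else ";no"
    return original_query + verdict + "\n"


# Hypothetical links standing in for the Google News results.
links = ["https://www.forbes.ru/some-article", "https://example.com/news/1"]

found = report_line("парсер гугл|forbes.ru", links, "forbes.ru")
missing = report_line("парсер гугл|a-parser.com", links, "a-parser.com")
print(found, missing, sep="")
```

In Python, wrapping the domain in re.escape would make the dots match literally; whether that matters depends on how strict the check needs to be.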
Download example
How to import an example into A-Parser
eJylVVFv2jAQ/iuR1YpWo1EQ7UuqaaIMpk6stKV9AoRccqQeju3aCQVR/vvOTkjS
ruNlEoq4u+++O9+dz1uSUrM0txoMpIaE4y1R7j8JyagXhj+kjDl4Z173GeZLL5IJ
ZcLDX2EQ8Go8WNNEcSBNoqg2oC3PuOaOhggWNOMpaW5JulGA7HIFWrPIGlmEsqIx
zGUmEENWlGeIae3+DccwqkIS8WrIATRnYmkUumCQGn9wwCXmNfr7x0Psz3Wozg4m
omtQTkU8O4xXWq43GlLNwNQ8W0EQkN102iTYLKyr6UudUNu08fFEuNM+yKGAETqK
2PvqqZZvK+YvHO6kcWQxvtc4vZwIVMyfwSDqo6PvTCcN/8u3k4Y3sywvGeiNX8zB
DAnQ5lg+CTvzJmSCFuLM132viOQHqCi5pGYIbVxuwDTeufQGo95nQCE/4G6+T8Tx
lJTFGNEVPEgsxoK54dvXCKUbmtj6HUU0BWvdF+TUT9d28mgUsZRJQXleUTvKVZUf
BXvJx00i1maFbelrmaAqBUfgUt13Y0yOnEyQInO+d7kPCReUG2gSg6n2KSYSlZZU
Z2hgKWiaSj1UNh1Ub4kUHc4HsAJe+Tv6q4zxCK9dZ4FO14Xj55DhXxy78nT1UDiG
rxpzKFmcdDX8VXlFciBjPHj0hMfmLGEpyqbrrnBIAlQuAVRZshtbskRqKMMUzEV0
XEEKhJ35qmMdVaneHcMtKiMzPbfUeY2b+ztk3PSNFGe2IQZwKeUFIW8WJG1f9i75
IJPprmpyLQYq51IsWDwsruV+HDLxgGtzKLrSLj5bJpFxjk02cF8NW8cUTbVCdd6P
zl0XwlZyvyQxScnNz1F+cqUZ5nxhE0ywMfWoBeWccv54P6hbSDWgKEyy4Ly9cN/A
fs/zb8tpLnKN54S2E9rV//bTGz3L97o/lwneuf/iwvv2hBsAt55NO4VYYh9sN939
CUkvf0vssoO1oiKCKL8SO9ej4oEp36tt/ZkJtzucy9/mNgfZqlsI6rB9BofOPil/
AGn6WSM=
Output of timestamp value in date format
Sometimes the results contain no regular date but do contain a timestamp value, as in the Social::Instagram::Tag scraper. This value can be converted to a date format using the Template Toolkit template engine.
Result format:
[%
USE date;
p1.query.orig _ ": total posts - " _ p1.postscount _ "\nPosts:\n";
FOREACH i IN p1.posts;
d = date.format(i.time, format => '%d.%m.20%y');
i.link _ " - " _ d _ ":\n";
i.text _ "\n";
END
%]
Example result:
sport: total posts - 96500663
Posts:
https://www.instagram.com/p/COfJHshAkeD/ - 05.05.2021:
Quelques exemples de notre nouvelle campagne de communication personnalisable avec le nom des clubs 😀
Vous préférez quel visuel : 1, 2, 3, 4, 5 ? 🤔
#clubnormand #tennis #padel #beachtennis #tenniscourt #padelcourt #beachtenniscourt #lnt #LigueNormandieTennis #🎾 #sport #normandie #normandietourisme
https://www.instagram.com/p/COfJG7olavg/ - 05.05.2021:
💥 Sau màn lật đổ “Bà già” thành công, Nửa xanh thành Milan chính thức vượt qua Nửa đỏ về số lần lên đỉnh nước Ý nhiều nhất lịch sử.
-----------------------------
➖ Website: https://webthethao247.com/
➖ https://g.page/webthethao247?share
#wtt247 #webthethao247 #thethao #sport #bongda #SerieA #InterMilan #Juventus #ACMilan
https://www.instagram.com/p/COfJG1Hg7ax/ - 05.05.2021:
Which Skill was better 1 or 2? 🤔👇
Follow @ftb4ll for more 💥
Follow @ftb4ll for more 💥
Follow @ftb4ll for more 💥
________________________________________
Leave a Like 👍🏽
Subscribe for more 🔔
Leave your thoughts in the Comments 💬
________________________________________
❌Ignore the Tags ❌
#football #soccer #fussball #futbol #fifa #championsleague #bundesliga #ucl #footballmemes #goal #transfer #sports #penalty #ultimateteam #pacybits #fut #ultras #laliga #freekick #referee #sport #calcio #messi #ronaldo #skills #premierleague #foul #footballseason
https://www.instagram.com/p/COfIlXqhfAa/ - 05.05.2021:
Be Fuckin’ Ready 🤣🤣🤣
Get ready to fly!!!! 🏐🏐🏐🏐
Follow - @crackonkings
#beachball #nalin&kane #trance #music #90s #onyerhead #festival #party #afterparty #love #summer #uk #happy #sesh #crackon #football #sport #festivaloutfit #festivalfashion #sun #dj #dancing #club #festivalgirl #house #techno #rave
...
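The conversion itself is ordinary Unix timestamp arithmetic and can be verified outside A-Parser. The Python sketch below is an illustration under assumptions: the function name and the sample timestamp are hypothetical, and the time zone is fixed to UTC, whereas the template engine may use the local time zone of the machine.

```python
from datetime import datetime, timezone


def timestamp_to_date(timestamp, fmt="%d.%m.%Y"):
    """Convert a Unix timestamp (seconds) to a formatted UTC date string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(fmt)


# 1620172800 is midnight UTC on 5 May 2021 (a hypothetical sample value).
formatted = timestamp_to_date(1620172800)
# The pattern from the template above gives the same text for this year.
template_style = timestamp_to_date(1620172800, "%d.%m.20%y")
print(formatted, template_style)
```

The %Y code prints the four-digit year directly, while the 20%y form used in the template only works for years 2000 to 2099.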
Download example
How to import an example into A-Parser
eJx1VNuO2jAQ/RXLAm1XotFupb6k2kosBZWKAuXyRFDlYoNcHDtrOxSE+PfOODfY
dvOUOZ6Zc+Zin6lnbu+mVjjhHY1XZ5qFfxrTudlIpuJ4qJ1nO8vSOF6wHXlPFjIV
AKUZ8YZw5gURR7CUoB2aMeuExUyrNxKAExdblitPO2fqT5kALnMQ1kqOGSQHe2ts
yjwoCW70wFSObqt2opfzfiD9lOhEZ4/RSy7sKTJW7shPktAYRHmmSGacdyA2oQCD
W7A3Jtc+uCWJniISw09CQ67BZNbv9r4SSYbjOgJPPCdPgTIqZL2TkYcWdEhhkqfP
5K7No3YafXhon+7uIYbAJyMl9T6wVTp4IbHiLLy8ONaaAtwff0l0e00v63WHFj1w
g0AFLWihsmJE9eGcHcTCYNtkGEIVA9aYpdi4FsrH06qE+8gfMQPjXHppNFMFA06u
YV1q+RIarw34YqelcANrUoBQdgmeKnUr2go2hRR5iP1RxNB4y5QTHepA6oCBEP76
RHphmTd2kqEewM/U6K5SI3EQqnEL+Z9zqTisWXcLQcMy8P8uk39yXOryrqlgAf9Y
0FBnCdbz5HsTxc3I7KBy/gvqVjKVHmzXw50C9AHAvRBZ3bMx9iw1VtQ0ZeaSHS5d
JjRuezOybtZAN2XcjOUW3Bi9lbtJeYMqz1wv4GZPdM/gzcS6dK4UjMWJWbMeXVeO
AY1G4OvgXqDA0qubS70xyn2bF1IzK2H9PqLAFDp5zVqm3DCllrPR9QltVgoMlxmL
eTewqDsDKwRlXNb1g1K/UOe3npX4fIEZ/XbTIgALQnfAoDMOBkDjx8tfNDez1Q==