Template Toolkit Tools
In the Template Toolkit there is a global variable $tools
that contains a set of tools available in any template and inside JS scrapers. There is also a variable $tools.error
, which contains an error description if any tool fails during operation.
Adding queries $tools.query.*
This tool allows you to add queries to the existing ones during a task, forming them from already parsed results. It can be used as an analogue of the Parse to level function in scrapers where that function is not implemented. There are two methods:
[% tools.query.add(query, maxLevel) %]
- adds a single query
[% tools.query.addAll(array, item, maxLevel) %]
- adds an array of queries
The maxLevel
parameter specifies the level up to which queries will be added and is optional: if omitted, the scraper will keep adding new queries as long as they appear. It is also recommended to enable the Unique queries option to avoid loops and unnecessary scraper work.
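A minimal sketch of both methods in a template (here p1.serp is assumed to be an array of parsed result objects and 'link' the name of the field to take new queries from; both calls add queries up to level 2):

```
[% tools.query.add(p1.serp.0.link, 2) %]
[% tools.query.addAll(p1.serp, 'link', 2) %]
```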
It is possible to set an arbitrary level for subqueries. This can be used to separate logic, i.e. when each level implements its own functionality.
Example:
[% tools.query.add(query, lvl = 1) %]
- adds a query at a specific level.
Example for JS scrapers:
this.query.add({
    query: "some query",
    lvl: 1,
});
The result of the preset shown in the screenshot:
парсер:
parser
what is parsing in programming
parsing in compiler
compiler and parser development
what is syntax analysis
difference between lexical analysis and syntax analysis
syntax analyzer
parser programming language
parser:
parser definition
xml parser
parser generator
parser swtor
parser c++
ffxiv parser
html parser
parser java
what is parsing in programming:
parse wikipedia
parser compiler
what is a parser
parsing programming languages
definition of parser
parsing c++
parser define
parsing java
html parser:
online html parser
html parser php
html parser java
...
Parsing JSON structures $tools.parseJSON()
This tool allows you to deserialize data in JSON format into variables (an object) available in the template. Example of use:
[% tools.parseJSON(data) %]
After deserialization, you can refer to the keys of the received object as ordinary variables and arrays.
If a string with invalid JSON is passed as the argument, the scraper will write an error to $tools.error.
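For instance, assuming data contains the string {"title": "Test", "tags": ["a", "b"]} (the key names here are purely illustrative), a sketch of accessing the keys after deserialization:

```
[% tools.parseJSON(data) %]
Title: [% title %]
First tag: [% tags.0 %]
```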
Output to CSV $tools.CSVline
This tool automatically converts values to CSV format and adds a line break, so it is enough to list the necessary variables in the result format, and the output will be a valid CSV file ready for import into Google Docs/Excel/etc.
Example of use:
[% tools.CSVline(query, p1.serp.0.link, p2.title) %]
Video using $tools.CSVline():
Working with SQLite DB $tools.sqlite.*
This tool allows you to easily and fully work with SQLite databases. There are three methods:
$tools.sqlite.get()
- a method that fetches a single row from the database using SELECT, for example:
[% res = tools.sqlite.get('results/test.sqlite', 'SELECT COUNT(*) AS count FROM test') %]
$tools.sqlite.run()
- a method that allows you to perform operations with the database (INSERT, DROP, etc.), for example:
[% res = tools.sqlite.run('results/test.sqlite', 'INSERT INTO test VALUES(?)', 'test') %]
$tools.sqlite.all()
- a method that returns all rows from a table, for example:
[% res = tools.sqlite.all('results/test.sqlite', 'SELECT * FROM test') %]
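All three methods can be combined into one workflow; a sketch (the file, table, and column names here are illustrative):

```
[% tools.sqlite.run('results/test.sqlite', 'CREATE TABLE IF NOT EXISTS test(value TEXT)') %]
[% tools.sqlite.run('results/test.sqlite', 'INSERT INTO test VALUES(?)', query) %]
[% count = tools.sqlite.get('results/test.sqlite', 'SELECT COUNT(*) AS count FROM test') %]
[% rows = tools.sqlite.all('results/test.sqlite', 'SELECT * FROM test') %]
```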
Substitution of user-agent $tools.ua.*
This tool is designed to replace the user-agent in scrapers that use it (for example, Net::HTTP). There are two methods:
$tools.ua.list()
- returns the complete list of available user agents
$tools.ua.random()
- returns a random user agent from the list
The list of all user agents is stored in the files/tools/user-agents.txt
file, which can be edited if necessary.
When using this tool in the User agent parameter of a scraper, it must be specified explicitly:
[% tools.ua.random() %]
JS support in tools $tools.js.*
This tool allows you to add your own JS functions and use them directly in the template. Node.js modules are also supported.
Functions are added in Tools -> JavaScript Editor (the screenshot below shows the Tools.prototype.sum(a, b)
function added):
Example of using the created function:
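In plain JavaScript the function from the screenshot boils down to the sketch below (the Tools constructor here is only a stand-in for illustration; inside A-Parser the function is defined in the JavaScript Editor and called from a template as [% tools.js.sum(2, 3) %]):

```javascript
// Stand-in constructor, for illustration only; A-Parser provides its own Tools object
function Tools() {}

// The sum(a, b) function as added in the JavaScript Editor
Tools.prototype.sum = function (a, b) {
    return a + b;
};

// Direct call, mirroring what [% tools.js.sum(2, 3) %] would output
var tools = new Tools();
console.log(tools.sum(2, 3)); // 5
```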
Working with base64 $tools.base64.*
This tool allows you to work with base64 directly in the scraper. It has two methods:
$tools.base64.encode()
- encodes text to base64
$tools.base64.decode()
- decodes a base64 string back to text
Example of use:
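Conceptually these methods behave like Node.js's built-in Buffer conversions; a sketch of the equivalent transformation outside the scraper (not the tool's internal implementation):

```javascript
// Equivalent of tools.base64.encode: text -> base64
const encoded = Buffer.from('A-Parser', 'utf8').toString('base64');
console.log(encoded); // "QS1QYXJzZXI="

// Equivalent of tools.base64.decode: base64 -> text
const decoded = Buffer.from(encoded, 'base64').toString('utf8');
console.log(decoded); // "A-Parser"
```

In a template the same pair would be [% tools.base64.encode('A-Parser') %] and [% tools.base64.decode('QS1QYXJzZXI=') %].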
Data reference $tools.data.*
This tool is essentially an object that contains a large amount of predefined data - languages, regions, domains for search engines, etc. The complete list of elements (which may change in the future) includes:
"YandexWordStatRegions", "TopDomains", "CountryCodes", "YahooLocalDomains", "GoogleDomains", "BingTranslatorLangs", "Top1000Words", "GoogleLangs", "GoogleInterfaceLangs", "EnglishMonths", "GoogleTrendsCountries"
Each of these elements is an array or hash of data, and you can view the contents by outputting the data, for example, in JSON:
[% tools.data.GoogleDomains.json() %]
Data storage in memory $tools.memory.*
A simple key/value storage in memory, shared across all tasks, API requests, etc. It is reset when the scraper is restarted. There are three methods:
[% tools.memory.set(key, value) %]
- sets value for key
[% tools.memory.get(key) %]
- returns the value stored for key
[% tools.memory.delete(key) %]
- deletes the record for key
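A sketch of passing a value between templates (the key name 'last_query' is illustrative):

```
[% tools.memory.set('last_query', query) %]
[% tools.memory.get('last_query') %]
[% tools.memory.delete('last_query') %]
```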
Getting information about the A-Parser version $tools.aparser.version()
This tool allows you to get information about the A-Parser version and display it in the result.
Example of use:
[% tools.aparser.version() %]
Getting task ID and number of threads $tools.task.*
This tool allows you to get the ID of the current task and the number of threads it uses. There are two methods:
[% tools.task.id %]
- returns the task ID
[% tools.task.threadsCount %]
- returns the number of threads used in the task