Query Formatting
The query format lets you add substitutions and reshape each query into the desired form using templates. It is applied to every query.
Query formats

- Query format for the 1st parser
- Query format for the 2nd parser
- Common query format
There are two ways to specify a template:
- Common query format: processed first and supports substitutions
- Query format for each parser: sets a specific format for an individual parser
Consider the example in the screenshot. Suppose the queries come from a file with a list of domains like this:
google.com
a-parser.com
yandex.ru
The common query format is set as:
http://$query
The string http:// is prepended to each original query (domain), so the query is transformed: google.com -> http://google.com
The query format for the 1st parser is left unchanged, so that parser receives the query http://google.com
The query format for the 2nd parser looks as follows:
site:$query
The query for this parser is transformed as follows: http://google.com -> site:http://google.com
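The two-stage order described above (common format first, then each parser's own format) can be sketched in Python. This is only an illustration of the order of operations, not A-Parser's implementation: the names apply_format, COMMON_FORMAT, PARSER_FORMATS and format_for_parsers are hypothetical, and the sketch handles only the plain $query placeholder.

```python
# Illustrative sketch only: names below are hypothetical, not part of A-Parser.

COMMON_FORMAT = "http://$query"
# The 1st parser keeps the query as is, the 2nd parser adds the site: prefix.
PARSER_FORMATS = ["$query", "site:$query"]


def apply_format(template: str, query: str) -> str:
    """Replace every $query placeholder in the template with the query text."""
    return template.replace("$query", query)


def format_for_parsers(original: str) -> list[str]:
    """Apply the common format first, then each parser's own format."""
    common = apply_format(COMMON_FORMAT, original)
    return [apply_format(fmt, common) for fmt in PARSER_FORMATS]


if __name__ == "__main__":
    for domain in ["google.com", "a-parser.com", "yandex.ru"]:
        print(domain, "->", format_for_parsers(domain))
```

For google.com the sketch yields http://google.com for the 1st parser and site:http://google.com for the 2nd, matching the example above.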
Templates in queries
The query format fully supports the Template Toolkit template engine. The following variables are available:
- $query - query after formatting via the common query format
- $query.num - serial number of the query
- $query.lvl - query nesting level when using the options Parse to level or Parse all results
- $query.orig - original query before formatting
- $query.first - the first query when using the options Parse to level or Parse all results
- $query.prev - the query from the previous level; works for HTML::LinkExtractor, $tools.query.add and, in JS parsers, this.query.add
- All variables created via the Query Builder
Substitution macros
The common query format supports the following macros:
| Macro | Description | Examples |
|---|---|---|
| {az:START:END} | Substitution of an alphanumeric character sequence. START is the beginning of the sequence and END is the end. The length of END must be greater than or equal to the length of START. Characters at the end of the END sequence must be after (in alphabetical order) characters at the beginning of the START sequence. Any UTF-8 character sequences can be used | {az:a:z} - substitution of all characters from a to z (a, b, c, ..., y, z). {az:aaa:zzz} - substitution of all characters from aaa to zzz (aaa, aab, aac, ..., zzy, zzz). {az:a:zz} - substitution of all characters from a to zz (a, b, c, ... aa, ab, ..., zy, zz). {az:00:99} - substitution of all numbers from 00 to 99 (00, 01, 02, ..., 98, 99). {az:а:яяя} - substitution of all Cyrillic characters from а to яяя (а, б, ... аа, аб, ... яяю, яяя) |
| {each:WORD1,WORD2,...} | Substitution of specified words WORD1, WORD2, etc., length is unlimited | {each:green,blue,red,black} - substitution of words green, blue, red, black. {each:,buy,sell} - substitution of an empty word, then buy and sell |
| {subs:NAME} | Substitution of additional words from files in the queries/subs/ folder. Instead of NAME, you must specify the filename without the .txt extension | {subs:zones} - substitution of all lines from the file queries/subs/zones.txt |
| {num:START:END} | Iterates through the numbers in the specified range. START is the beginning of the interval and END is the end. Fractional numbers are supported. | {num:1:1000} - substitution of all numbers from 1 to 1000 (1, 2, 3, ..., 999, 1000) |
| {num:START:END:STEP} | Iterates through the numbers in the specified range with the specified step. START is the beginning of the interval, END is the end, and STEP is the step. Fractional numbers are supported. | {num:0:1000:10} - substitution of all numbers from 0 to 1000 with a step of 10 (0, 10, 20, ..., 990, 1000) |
| {num:END:START} | Iterates through the numbers in the specified range in reverse order. END is the end of the interval and START is the beginning. Fractional numbers are supported. | {num:1000:1} - substitution of all numbers from 1000 to 1 (1000, 999, 998, ..., 2, 1) |
| {num:END:START:STEP} | Iterates through the numbers in the specified range in reverse order with the specified step. END is the end of the interval, START is the beginning, and STEP is the step. Fractional numbers are supported. | {num:1000:1:10} - substitution of all numbers from 1000 to 1 with a step of 10 (1000, 990, 980, ..., 10, 1) |
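To make the table above more concrete, here is a rough Python sketch of how the {num} and {each} macros could be expanded into lists of values. It is an approximation written for illustration, not A-Parser's code: expand_num and expand_each are hypothetical names, and the sketch simply stops at the last value that still lies inside the range, so it does not model how the real macro treats an END that the step does not land on exactly.

```python
from decimal import Decimal

# Illustrative sketch only: expand_num and expand_each are hypothetical helpers.


def expand_num(start: str, end: str, step: str = "1") -> list[str]:
    """Enumerate numbers from start to end (ascending or descending) by step.

    Decimal is used so that fractional steps do not accumulate float errors.
    """
    first, last, size = Decimal(start), Decimal(end), abs(Decimal(step))
    if size == 0:
        raise ValueError("step must be non-zero")
    direction = 1 if last >= first else -1  # reverse order when start > end
    values = []
    current = first
    while (current - last) * direction <= 0:
        values.append(str(current))
        current += size * direction
    return values


def expand_each(words: str) -> list[str]:
    """Split the comma-separated word list; an empty item stays an empty word."""
    return words.split(",")


if __name__ == "__main__":
    print(expand_num("0", "30", "10"))   # ascending with a step
    print(expand_num("3", "1"))          # reverse order
    print(expand_each(",buy,sell"))      # empty word, then buy and sell
```

Keeping the values as strings mirrors the fact that each value is substituted into the query text.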
⏩ Video: Substitution macros
This video covers:
- the {num} macro, using page-by-page crawling and coordinate iteration in the Maps::Google parser as examples
- the {az} macro, using parsing with inurl: as an example, to increase the number of queries and, accordingly, results
- the {each} macro, using suggestion parsing as an example, to generate word combinations
Combining substitution macros
Substitution macros can be combined. Complex example:
$query site:{subs:zones} {az:aa:zz}
Suppose one of the queries for parsing is viagra, and the file queries/subs/zones.txt contains the following list of zones: com, net, org. The following set of combinations will then be sent for parsing:
viagra site:com ab
...
viagra site:net jj
...
viagra site:org zz
The total number of queries is the product of the number of values each part can take:
1 query (viagra) x 3 zones ({subs:zones}) x 676 character variations ({az:aa:zz}) = 2028 queries
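The count above can be checked with a short Python sketch. It is an illustration under stated assumptions, not A-Parser's algorithm: expand_az and combine are hypothetical names, the zone list is passed in directly instead of being read from queries/subs/zones.txt, the alphabet is assumed to be the contiguous code points from the first character of START to the last character of END, and the order in which combinations are generated is an assumption.

```python
from itertools import product

# Illustrative sketch only: expand_az and combine are hypothetical helpers.


def expand_az(start: str, end: str) -> list[str]:
    """Enumerate sequences from start to end, e.g. aa..zz or a..zz.

    Assumes start is a run of the lowest character and end a run of the
    highest one, as in the documented examples ({az:aa:zz}, {az:00:99}).
    """
    alphabet = [chr(code) for code in range(ord(start[0]), ord(end[-1]) + 1)]
    values = []
    for length in range(len(start), len(end) + 1):
        values.extend("".join(chars) for chars in product(alphabet, repeat=length))
    return values


def combine(query: str, zones: list[str]) -> list[str]:
    """Build every '$query site:{subs:zones} {az:aa:zz}' combination."""
    suffixes = expand_az("aa", "zz")
    return [f"{query} site:{zone} {suffix}" for zone in zones for suffix in suffixes]


if __name__ == "__main__":
    queries = combine("viagra", ["com", "net", "org"])
    print(len(queries))  # 1 query x 3 zones x 676 variations
```

With 26 letters, {az:aa:zz} gives 26 x 26 = 676 variations, so the three zones produce 2028 queries for the single source query.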