A-Parser Integration with Redis: Advanced API
Comparison with HTTP API
The A-Parser Redis API was developed as a more performant replacement for the oneRequest and bulkRequest methods and to support additional use cases:
- Redis acts as the request and results server
- the ability to request results asynchronously or in a blocking mode
- the ability to connect multiple scrapers (on the same server or on different servers) to process requests from a single entry point
- the ability to set the number of threads for request processing and view operation logs
- the ability to set timeouts for operations
- automatic expiration (Expire) of unclaimed results
Preliminary Setup
Unlike the A-Parser HTTP API, to use the Redis API you must first configure and run a job with the scraper API::Server::Redis:
- install and run a Redis server (locally or remotely)
- create a settings preset for the scraper API::Server::Redis and specify:
  - Redis Host - Redis server address, default 127.0.0.1
  - Redis Port - Redis server port, default 6379
  - Redis Queue Key - key name for data exchange with A-Parser, default aparser_redis_api; you can create separate queues and process them with different jobs or different A-Parser copies
  - Result Expire(TTL) - result lifetime in seconds, used for automatic control and removal of unclaimed results, default 3600 seconds (1 hour)
- add a job with the scraper API::Server::Redis
  - for requests, you must specify {num:1:N}, where N must match the number of threads specified in the job
  - you can also enable the logging option, which makes the log for each request available for viewing
Example of setting up a job with API::Server::Redis

Running A-Parser together with Redis using docker-compose
With this launch method, you can specify the service name instead of an IP for the Redis server address (Redis Host); in the examples below it is redis
If A-Parser has not been run previously via docker-compose
- Download and unpack the distribution (you first need to get the one-time link in the Personal Account, as described here):
curl -O https://a-parser.com/members/onetime/ce42f308eaa577b5/aparser.tar.gz
tar zxf aparser.tar.gz
rm -f aparser.tar.gz
Create the file docker-compose.yml and place the following content in it:
- Basic option without a password or an exposed port; Redis will only be available within the Docker network:
version: '3'
services:
  a-parser:
    image: aparser/runtime:latest
    command: ./aparser
    restart: always
    volumes:
      - ./aparser:/app
    ports:
      - 9091:9091
  redis:
    image: redis:latest
    restart: always
- Option with a password and an exposed port; Redis will be available externally, so using a password is highly recommended:
version: '3'
services:
  a-parser:
    image: aparser/runtime:latest
    command: ./aparser
    restart: always
    volumes:
      - ./aparser:/app
    ports:
      - 9091:9091
  redis:
    image: redis:latest
    restart: always
    command: redis-server --requirepass PASSWORD_FOR_REDIS_HERE
    ports:
      - 6379:6379
Instead of PASSWORD_FOR_REDIS_HERE, choose and specify a password that will be used for authorization in Redis.
Start the containers:
docker compose up -d
If A-Parser has already been run previously via docker-compose
Edit the file docker-compose.yml, adding the following content at the end:
- Basic option without a password or an exposed port; Redis will only be available within the Docker network:
  redis:
    image: redis:latest
    restart: always
- Option with a password and an exposed port; Redis will be available externally, so using a password is highly recommended:
  redis:
    image: redis:latest
    restart: always
    command: redis-server --requirepass PASSWORD_FOR_REDIS_HERE
    ports:
      - 6379:6379
Instead of PASSWORD_FOR_REDIS_HERE, choose and specify a password that will be used for authorization in Redis.
Start the containers:
docker compose up -d
If A-Parser has already been launched and its configuration has not changed, it will not be restarted, and Docker will simply add and start Redis.
Executing Requests
The Redis API is built on Redis Lists; list operations allow adding an unlimited number of requests to the queue (limited only by RAM) and receiving results either in blocking mode with a timeout (blpop) or in asynchronous mode (lpop).
- all settings except useproxy, proxyChecker and proxybannedcleanup are taken from the preset of the called scraper + overrideOpts
- the settings useproxy, proxyChecker and proxybannedcleanup are taken from the preset of API::Server::Redis + overrideOpts
A request is added to Redis with the lpush command; each request is an array [queryId, parser, preset, query, overrideOpts, apiOpts] serialized as JSON:
- parser, preset, query - correspond to their counterparts in the oneRequest API request
- queryId - generated together with the request; we recommend using a sequence number from your database or a good random value; the result can be obtained using this ID
- overrideOpts - overrides of the scraper preset settings for this request
- apiOpts - additional API processing parameters
When requesting through Redis, the result formatting stage is skipped, as the entire result is transmitted in JSON for subsequent programmatic processing.
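The request array described above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function names build_request and submit are hypothetical, and client is assumed to be any object exposing an lpush(key, value) method, such as a redis-py connection.

```python
import json
import uuid

QUEUE_KEY = "aparser_redis_api"  # default Redis Queue Key from the preset


def build_request(parser, preset, query, query_id=None,
                  override_opts=None, api_opts=None):
    """Return (query_id, payload); payload is the JSON array passed to lpush."""
    if query_id is None:
        # Random ID; a sequence number from your database works as well
        query_id = uuid.uuid4().hex
    payload = json.dumps(
        [query_id, parser, preset, query, override_opts or {}, api_opts or {}]
    )
    return query_id, payload


def submit(client, parser, preset, query, queue_key=QUEUE_KEY, **kwargs):
    """Push one request onto the queue and return its ID for later retrieval."""
    query_id, payload = build_request(parser, preset, query, **kwargs)
    client.lpush(queue_key, payload)
    return query_id
```

Assuming the redis-py package is installed, a call could look like submit(redis.Redis(host="127.0.0.1", port=6379), "Net::HTTP", "default", "https://ya.ru"); the returned ID is then used to build the result key.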
redis-cli
Example of executing requests; for testing you can use redis-cli:
127.0.0.1:6379> lpush aparser_redis_api '["some_unique_id", "Net::HTTP", "default", "https://ya.ru"]'
(integer) 1
127.0.0.1:6379> blpop aparser_redis_api:some_unique_id 0
1) "aparser_redis_api:some_unique_id"
2) "{\"data\":\"<!DOCTYPE html><html.....
Various Use Cases
Asynchronous Check for the Result
lpop aparser_redis_api:some_unique_id
Returns the result if the request has already been processed, or nil if it is still being processed
Blocking Result Retrieval
blpop aparser_redis_api:some_unique_id 0
This request blocks until the result is received; you can also specify a maximum timeout (in seconds) for receiving the result, after which the command returns nil
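The two retrieval modes above can be wrapped in small helpers. This is a sketch under assumptions: the function names are hypothetical, and client is assumed to follow the redis-py conventions, where lpop returns None for a missing key and blpop returns a (key, value) pair or None on timeout.

```python
import json


def result_key(query_id, queue_key="aparser_redis_api"):
    # Per-request result key, as in the redis-cli examples: <queue key>:<query id>
    return f"{queue_key}:{query_id}"


def poll_result(client, query_id, queue_key="aparser_redis_api"):
    """Asynchronous check (LPOP): parsed result, or None if not ready yet."""
    raw = client.lpop(result_key(query_id, queue_key))
    return None if raw is None else json.loads(raw)


def wait_result(client, query_id, timeout=0, queue_key="aparser_redis_api"):
    """Blocking wait (BLPOP): parsed result, or None once the timeout expires.

    timeout=0 blocks indefinitely, like `blpop <key> 0`.
    """
    reply = client.blpop([result_key(query_id, queue_key)], timeout=timeout)
    if reply is None:
        return None
    _key, raw = reply  # BLPOP replies with the key name and the popped value
    return json.loads(raw)
```

Both helpers remove the result from Redis when they return it, so each result can be read only once.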
Saving Results to a Single Queue
By default, A-Parser saves the result of each request under its own unique key aparser_redis_api:query_id, which allows organizing multithreaded processing, with each thread sending requests and receiving results separately
In some cases, it is necessary to process results in a single thread as they arrive, in which case it is convenient to save results to a single results queue (the key must differ from the key for requests)
To do this, you need to specify the output_queue key for apiOpts:
lpush aparser_redis_api '["some_unique_id", "Net::HTTP", "default", "https://ya.ru", {}, {"output_queue": "aparser_results"}]'
Retrieving the result from the shared queue:
127.0.0.1:6379> blpop aparser_results 0
1) "aparser_results"
2) "{\"queryId\":\"some_unique_id\",\"results\":{\"data\":\"<!DOCTYPE html><html class=...
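A single-threaded consumer of the shared results queue might look like the sketch below. It assumes the message shape shown in the output above (an object with queryId and results fields) and a redis-py-style client; consume_results, handle and max_items are hypothetical names introduced for illustration.

```python
import json


def consume_results(client, handle, output_queue="aparser_results",
                    timeout=5, max_items=None):
    """Pop results from the shared queue one at a time and pass them to handle.

    Stops after max_items results or when BLPOP times out on an empty queue.
    Returns the number of results handled.
    """
    handled = 0
    while max_items is None or handled < max_items:
        reply = client.blpop([output_queue], timeout=timeout)
        if reply is None:  # the queue stayed empty for `timeout` seconds
            break
        _key, raw = reply
        item = json.loads(raw)
        # queryId ties the result back to the request that produced it
        handle(item["queryId"], item["results"])
        handled += 1
    return handled
```

Because all results arrive through one key, the queryId field is the only link between a result and its request, so the handler should look up the request by that ID.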
Implementation Example (SpySERP Use Case)
Suppose we are creating a SaaS service that evaluates domain parameters; for simplicity, we will check only the domain registration date
Our service consists of 2 pages:
- /index.php - landing page with the domain input form
- /results.php?domain=google.com - page with the service's results
To improve the user experience, we want our service pages to load instantly, and the data waiting process to look natural and display a loader
When results.php is requested, we first send a request to the A-Parser Redis API, generating a unique request_id:
lpush aparser_redis_api '["request-1", "Net::Whois", "default", "google.com", {}, {}]'
After that, we can display the page to the user and show a loader in the data display area; due to the absence of delays, the server response will be limited only by the Redis connection speed (usually within 10ms)
A-Parser will start processing the request even before the user's browser receives the first content; after the browser loads all the necessary resources and scripts, we send an AJAX request to retrieve the data and display the result:
/get-results.php?request_id=request-1
The get-results.php script performs a blocking request to Redis with a 15-second timeout:
blpop aparser_redis_api:request-1 15
And returns the response immediately as soon as it is received from A-Parser; if we receive a null result due to a timeout, we can display a data retrieval error to the user
Thus, by sending the request to A-Parser when the page is first opened (/results.php), we reduce the time the user waits for data (/get-results.php) by the time the browser spends waiting for content, loading scripts, and executing the AJAX request
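The two server-side steps of this flow can be sketched in Python (the example service itself uses PHP scripts, so this only illustrates the logic). The function names, the HTTP status codes and the error message are assumptions made for illustration; client is assumed to be a redis-py-style connection.

```python
import json

QUEUE_KEY = "aparser_redis_api"


def start_lookup(client, domain, request_id, queue_key=QUEUE_KEY):
    """Step 1 (results page): enqueue the Net::Whois request, then render the page."""
    request = [request_id, "Net::Whois", "default", domain, {}, {}]
    client.lpush(queue_key, json.dumps(request))
    return request_id


def get_results(client, request_id, timeout=15, queue_key=QUEUE_KEY):
    """Step 2 (AJAX endpoint): block up to `timeout` seconds for the result.

    Returns (http_status, json_body); the status codes are an assumption.
    """
    reply = client.blpop([f"{queue_key}:{request_id}"], timeout=timeout)
    if reply is None:
        # No result arrived within the timeout: report a retrieval error
        return 504, json.dumps({"error": "timed out waiting for result"})
    _key, raw = reply
    body = raw if isinstance(raw, str) else raw.decode("utf-8")
    return 200, body
```

start_lookup returns immediately after the push, so the page can be rendered without waiting; the blocking wait happens only inside the AJAX endpoint.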