
Proxy Checkers

This section displays statistics for all proxy checkers. Each proxy checker is a module that, when enabled, runs continuously, checks proxies, and thereby maintains an up-to-date list of live proxies.

You can add an unlimited number of proxy checkers and select one or several of them for each task, or even for each scraper within a task. This makes it possible to use one set of proxies for scraping Google and a completely different set for Yandex within the same task.

Overview of Proxy Checker

At the top, the total number of live proxies and the number of running (working) proxy checkers are displayed. At the top right, there is a button to add a new proxy checker. The procedure for adding proxy checkers is described in the Proxy Settings section.

Below is a list of all existing proxy checkers in the form of cards with information about each proxy checker. The following information is displayed on each card:

  • Working directory - the folder containing the proxy checker's files, inside aparser/files/proxy
  • Update time - the time of the last check of the uploaded proxy list
  • Number of proxies in the check queue and the total number of uploaded proxies
  • Number of live proxies
  • Download status or date of the next download from proxy sources
  • Number of sources from which proxies were last successfully downloaded and the total number of sources in this proxy checker
  • The current status of proxy checking

The Enabled checkbox next to the proxy checker control buttons allows you to enable/disable the proxy checker.

The first in the list of proxy checkers is always the default proxy checker. It serves as a template for new proxy checkers and cannot be edited or deleted.

File Structure

The working files of the proxy checker are located in the folder files/proxy/<name of the proxy checker>:

  • proxy.txt - proxies are loaded from this file; put your list of proxies here
  • sites.txt - put the list of proxy sources here (links to proxy lists, one link per line)
  • alive.txt - live proxies are saved to this file every 5 seconds if the corresponding option is enabled
  • regex.txt - regular expressions for parsing proxies from external sources (one regular expression per line; $1 should be the IP address and $2 the port)
note

If you have links to proxy sources, specify them in the sites.txt file and leave the proxy.txt file empty.
For the "default" proxy checker, the files are located directly in the root directory files/proxy/.
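
To illustrate the regex.txt convention (first group is the IP address, second group is the port), here is a minimal Python sketch that applies such a pattern to a page of text. The pattern and the `extract_proxies` helper are assumptions made for this example; they are not A-Parser's built-in expressions, and A-Parser's own regex syntax may differ in details.

```python
import re
from typing import List

# Hypothetical pattern in the spirit of regex.txt:
# group 1 captures the IP address, group 2 captures the port.
PROXY_PATTERN = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})\s*:\s*(\d{2,5})")


def extract_proxies(page_text: str) -> List[str]:
    """Return unique ip:port strings found in page_text, in order of appearance."""
    seen = set()
    result = []
    for ip, port in PROXY_PATTERN.findall(page_text):
        proxy = f"{ip}:{port}"
        if proxy not in seen:
            seen.add(proxy)
            result.append(proxy)
    return result


sample = "<td>10.0.0.1:8080</td><td>10.0.0.2 : 3128</td><td>10.0.0.1:8080</td>"
print(extract_proxies(sample))
```

A pattern like this can be tried against a saved copy of a source page before it is added to regex.txt.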

Adding and Configuring a Proxy Checker

Go to the "Proxy Checker" menu and click "Add Checker" or select "Edit" from the drop-down menu in an existing proxy checker. You will be taken to the proxy checker settings page.

Add Proxy Checker

If necessary, set the required number of threads for checking proxies (Checking Threads), select the type of proxy (Proxy Type), and change other settings. The default parameter values are suitable for most tasks. Save the settings as a new proxy checker. It is not possible to change and save the settings of the default proxy checker.

Proxy sources are specified in files inside the folder named after the created proxy checker (files/proxy/.../):

  • links to proxy sources in sites.txt
  • a list of proxies in proxy.txt

Proxy sources in the working directory
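
If the source files are prepared by a script rather than by hand, the layout can be sketched as follows. This is only an illustration under stated assumptions: the `write_sources` helper, the checker name `my_checker`, and the example.com link are hypothetical, and the script assumes the files/proxy/<name of the proxy checker> layout described in File Structure.

```python
import tempfile
from pathlib import Path


def write_sources(aparser_root, checker_name, links):
    """Write proxy source links to sites.txt, one per line, and leave proxy.txt empty."""
    checker_dir = Path(aparser_root) / "files" / "proxy" / checker_name
    checker_dir.mkdir(parents=True, exist_ok=True)
    sites = "".join(f"{link}\n" for link in links)
    (checker_dir / "sites.txt").write_text(sites, encoding="utf-8")
    # When links are used, the note in File Structure says proxy.txt stays empty.
    (checker_dir / "proxy.txt").write_text("", encoding="utf-8")
    return checker_dir


# Demo in a temporary directory so no real installation is touched.
with tempfile.TemporaryDirectory() as root:
    folder = write_sources(root, "my_checker", ["http://example.com/proxies.txt"])
    print(sorted(p.name for p in folder.iterdir()))
```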

Proxies with IP access

Proxies with IP access are configured in a similar way.

List of proxies with the same login and password for all proxies

This method suits lists in the ip:port format where the login and password are the same for every proxy.

In the checker settings, specify:

  • Authorization login
  • Authorization password
  • Use proxy authorization

Setup: list of proxies with the same login and password for all proxies

List of proxies with different passwords for each proxy

In this case, the list of proxies should be in the format login:password@ip:port. In the checker settings, it is enough to enable Use proxy authorization.

Setup: list of proxies with different passwords for each proxy
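
The two list formats above can be compared with a small Python sketch. The `Proxy` tuple and the `parse_proxy_line` helper are hypothetical names for this example only; the sketch shows how a line in either format maps to host, port, login, and password, and does not claim to reproduce how A-Parser parses the list internally.

```python
from typing import NamedTuple, Optional


class Proxy(NamedTuple):
    host: str
    port: int
    login: Optional[str] = None
    password: Optional[str] = None


def parse_proxy_line(line, default_login=None, default_password=None):
    """Parse 'ip:port' or 'login:password@ip:port'.

    default_login/default_password model the shared credentials from the
    checker settings; per-proxy credentials in the line take precedence.
    """
    line = line.strip()
    login, password = default_login, default_password
    if "@" in line:
        # Split on the last '@' so that a password containing '@' still works.
        credentials, line = line.rsplit("@", 1)
        login, _, password = credentials.partition(":")
    host, _, port = line.rpartition(":")
    return Proxy(host, int(port), login, password)


print(parse_proxy_line("10.0.0.2:3128", "shared", "secret"))
print(parse_proxy_line("user1:pass1@10.0.0.1:8080"))
```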

⏩ Video: connecting a proxy with authorization

Choosing a proxy checker for a task

note

These settings are needed only to separate tasks by proxy checker. You can skip this section if all tasks should use all available proxies.

Go to Settings -> Threads Settings and select the required preset or create a new one (the Add new button).

In the Proxy Checkers field, select one or several proxy checkers (they must be enabled to be used) and click Save. You can also select all proxy checkers at once with All (the default value).

Choosing a proxy checker for a task

Now you can use the created Threads Config with the specified proxies in your tasks by selecting it in the Task Editor.

Select threads config

You can also override the proxy checker in each scraper using the override function - Proxy Checker.

Override Proxy Checker

The Exclude from "All" option in the proxy checker settings excludes its proxies from general use in A-Parser. This is useful when certain proxies should be available only to specific tasks or specific scrapers:

  • for a task, you must explicitly select the excluded proxy checker
  • for a specific scraper, you must set the excluded proxy checker in the scraper settings

Changes in logic

Previously, if a specific proxy checker was selected in the task and a different proxy checker was specified in the scraper, the scraper waited for a proxy. Now the scraper's own setting takes priority:

  • "All" - uses all proxies selected for the task
  • a specific proxy checker - uses it, even if it is not selected in the task
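
The selection rules in this section can be summarized with a simplified Python model. It is a sketch of the documented behavior, not A-Parser's code: the `Checker` class and both function names are hypothetical, and treating a disabled checker as contributing no proxies is an assumption based on the requirement that checkers be enabled to be used.

```python
from dataclasses import dataclass

ALL = "All"


@dataclass
class Checker:
    name: str
    enabled: bool = True
    exclude_from_all: bool = False


def task_checkers(task_choice, checkers):
    """Checkers a task draws from: 'All' skips excluded ones, explicit names do not."""
    if task_choice == ALL:
        return [c.name for c in checkers if c.enabled and not c.exclude_from_all]
    return [c.name for c in checkers if c.enabled and c.name in task_choice]


def scraper_checkers(scraper_choice, task_choice, checkers):
    """The scraper setting wins: 'All' inherits the task's set, a name overrides it."""
    if scraper_choice == ALL:
        return task_checkers(task_choice, checkers)
    return task_checkers([scraper_choice], checkers)


checkers = [Checker("google"), Checker("yandex"), Checker("private", exclude_from_all=True)]
print(scraper_checkers(ALL, ALL, checkers))
print(scraper_checkers("private", ["google"], checkers))
```

In this model, the second call returns the excluded checker even though the task selected a different one, which mirrors the "Changes in logic" rule above.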

Proxy checker parameters

| Parameter Name | Default Value | Description |
| --- | --- | --- |
| Loading type | Replace | Defines whether to keep previously loaded proxies or not. Add - always adds new proxies to the general list; Replace - replaces old proxies with newly loaded ones |
| Load threads count | 5 | Number of threads for loading proxies from websites |
| Load interval | 30 | Interval between full rechecks of the proxy list |
| Load timeout | 30 | Timeout for a request to the proxy site |
| Load max size | 524288 | Maximum page size with proxies; if the page is larger, it is trimmed to the specified size |
| Load limit count | 0 | Limit on the number of proxies to be loaded, 0 to disable |
| No check proxies | | Allows disabling proxy checking. All loaded proxies are automatically considered alive |
| Proxies type | HTTP, SOCKS5 | Choice of which types of proxies to check and in what sequence; if both HTTP and SOCKS are specified simultaneously, the proxy will be rechecked for the SOCKS protocol if the HTTP check fails |
| Check threads | 15 | Number of threads for checking proxies |
| Check url | http://work.a-poster.info:25000/ | Link to the proxy checking script; currently the check is carried out through the scraper server, this behavior may change in the future |
| Check interval | 30 | Interval between full rechecks of all proxies |
| Check timeout | 5 | Proxy timeout |
| Check max size | 5120 | Maximum download page size when checking proxies |
| Check anonymous | | Check proxies for anonymity; if selected, it is mandatory to specify External IP |
| External IP | | External IP address of the computer/server; must be specified if the Check anonymous option is enabled |
| Exclude from "All" | | By default, in each scraper, the value "All" is selected as the proxy checker, i.e., all available proxy checkers are used. If the option is enabled, the proxy checker will be excluded from All |
| Save alive proxies to file | No | Save alive proxies to the file files/proxy/alive.txt |
| Use proxy authorization | | Use authorization for proxies by login/password |
| Authorization login | | Login for authorization |
| Authorization password | | Password for authorization |
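
As a compact way to keep the defaults and the one stated dependency between parameters in view, here is a hedged Python sketch. The `CheckerSettings` class and its snake_case field names are hypothetical and are not an A-Parser configuration format; numeric defaults are copied from the parameter list above, and options with no listed default are assumed to be off or empty.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CheckerSettings:
    # Defaults copied from the parameter list; field names are hypothetical.
    loading_type: str = "Replace"
    load_threads_count: int = 5
    load_interval: int = 30
    load_timeout: int = 30
    load_max_size: int = 524288
    load_limit_count: int = 0
    proxies_type: Tuple[str, ...] = ("HTTP", "SOCKS5")
    check_threads: int = 15
    check_url: str = "http://work.a-poster.info:25000/"
    check_interval: int = 30
    check_timeout: int = 5
    check_max_size: int = 5120
    # No default is listed for these two; "off" and empty are assumed.
    check_anonymous: bool = False
    external_ip: str = ""

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means the settings are consistent."""
        problems = []
        if self.check_anonymous and not self.external_ip:
            problems.append("Check anonymous requires External IP")
        if self.loading_type not in ("Add", "Replace"):
            problems.append("Loading type must be Add or Replace")
        return problems


print(CheckerSettings().validate())
print(CheckerSettings(check_anonymous=True).validate())
```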

Installing the verification script on your hosting

note

By default, A-Parser checks proxies through its own verification script, so installing the script on your own hosting is not required.

To use your own script, upload the following PHP script to your hosting or server and specify the link to it in Check url:

<?php

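// Print all server and request variables (including request headers) as received by this host.
print_r($_SERVER);
// Print the POST fields received with the request.
print_r($_POST);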

?>

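The script above echoes back what the server received, which is what makes an anonymity check possible. How A-Parser evaluates the response is not described here; the Python sketch below shows one plausible reading, assuming a proxy counts as non-anonymous when the External IP appears anywhere in the echoed output. The `response_reveals_ip` helper and the sample response are hypothetical.

```python
import re


def response_reveals_ip(response_body: str, external_ip: str) -> bool:
    """Return True if external_ip occurs in the body as a whole address.

    The digit lookarounds prevent 198.51.100.2 from matching inside 198.51.100.20.
    """
    pattern = r"(?<!\d)" + re.escape(external_ip) + r"(?!\d)"
    return re.search(pattern, response_body) is not None


# Hypothetical fragment of the script's print_r output, received through a proxy
# that forwards the client's address in a header.
sample_response = """Array
(
    [REMOTE_ADDR] => 203.0.113.7
    [HTTP_X_FORWARDED_FOR] => 198.51.100.20
)
"""

print(response_reveals_ip(sample_response, "198.51.100.20"))
```
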
And specify one of the proxy lists:

- **[http://work.a-poster.info/prx/perm_socks.txt](http://work.a-poster.info/prx/perm_socks.txt)** - Each port has its own proxy with its own outgoing IP address. The proxy is fixed to its port as long as it is online. This list is updated every 30 seconds and always contains current and live proxies.
- **[http://work.a-poster.info/prx/rand_socks.txt](http://work.a-poster.info/prx/rand_socks.txt)** - The outgoing IP address changes for each connection to the proxy. The IP address is chosen randomly from all live proxies. This list is fixed and there is no need to update it.