
Proxy Checkers

This section displays statistics on the operation of all proxy checkers. A proxy checker is a module that, when enabled, runs continuously, checks proxies, and thereby maintains an up-to-date list of live proxies.

You can add an unlimited number of proxy checkers and select one or more of them for each task or even for each scraper within a task. This allows you to use one set of proxies for a Google scraper, for example, and completely different ones for a Yandex scraper within the same task.

Overview of Proxy Checker

The total number of live proxies and the number of running (active) proxy checkers are displayed at the top. The button for adding a new proxy checker is in the top right. More details about the procedure for adding proxy checkers are described in the section Proxy setup.

Below is a list of all existing proxy checkers in the form of cards with information about each proxy checker. The following information is displayed on each card:

  • Working directory - folder with files of the proxy checker in aparser/files/proxy
  • Update time - the time of the last check of the loaded proxy list
  • Number of proxies in the check queue and the total number of loaded proxies
  • Number of live proxies
  • Download status or date of next download from proxy sources
  • Number of sources from which proxies were successfully loaded last time, and the total number of sources in this proxy checker
  • Current proxy check status

The Enabled checkbox next to the proxy checker management buttons allows you to enable/disable the proxy checker.

The default proxy checker is always first in the list. It serves as a template for new proxy checkers and cannot be edited or deleted.

File structure

The working files of the proxy checker are located in the folder files/proxy/<proxy checker name>:

  • proxy.txt - proxies are loaded from this file; you need to put the proxy list here
  • sites.txt - you need to put the list of proxy sources in this file (links to proxies, one link per line)
  • alive.txt - live proxies are saved to this file every 5 seconds if the corresponding option is enabled
  • regex.txt - this file contains a list of regular expressions for scraping proxies from external sources (one regular expression per line, $1 must contain the IP address, $2 - the port)
note

If you have links to proxy sources, list them in the sites.txt file and leave the proxy.txt file empty.
For the "default" proxy checker, the files are located in the root of the files/proxy/ folder.
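
To illustrate the regex.txt convention described above (first capture group is the IP address, second is the port), here is a minimal Python sketch. The pattern, the function name extract_proxies, and the sample text are assumptions made for illustration; they are not taken from A-Parser's bundled regex.txt, and A-Parser applies its own regex engine rather than this code.

```python
import re

# Hypothetical pattern in the style of a regex.txt entry:
# group 1 captures the IP address, group 2 captures the port.
PATTERN = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})\s*:\s*(\d{2,5})")


def extract_proxies(page_text: str) -> list[str]:
    """Return unique ip:port strings found in page_text, in order of appearance."""
    seen = set()
    result = []
    for match in PATTERN.finditer(page_text):
        proxy = f"{match.group(1)}:{match.group(2)}"
        if proxy not in seen:
            seen.add(proxy)
            result.append(proxy)
    return result


if __name__ == "__main__":
    sample = "free list: 1.2.3.4:80, 5.6.7.8 : 3128, and again 1.2.3.4:80"
    print(extract_proxies(sample))
```

Testing a pattern this way against a saved copy of a source page can help confirm that both groups capture what is expected before adding the expression to regex.txt.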

Adding and configuring a proxy checker

Go to the "Proxy Checker" menu and click "Add Checker" or select "Edit" from the dropdown menu in an existing proxy checker. This takes you to the proxy checker configuration page.

Add Proxy Checker

If necessary, set the required number of threads for checking proxies (Check Threads), select the proxy type (Proxy Type), and change other settings. The default parameter values are suitable for most tasks. Save the settings as a new proxy checker. You cannot change and save settings for the default proxy checker.

Proxy sources are specified in files inside the folder named after the created proxy checker (files/proxy/.../):

  • links in sites.txt
  • list of proxies in proxy.txt
Proxy sources in the working directory

Proxies with IP access

Proxies with IP access are set up similarly.

List of proxies with the same login and password for all proxies

This method is suitable when the proxy list has the format ip:port and the login and password are the same for every proxy in the list.

In the checker settings, specify:

  • login
  • password
  • Use proxy authorization
Setup: list of proxies with the same login password for all proxies

List of proxies with different passwords for each proxy

In this case, the proxy list must have the format login:password@ip:port, and in the checker settings it is enough to enable Use proxy authorization.

Setup: list of proxies with different passwords for each proxy
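
The two list formats above can be compared with a small Python sketch that parses either ip:port or login:password@ip:port lines. The Proxy class and parse_proxy_line function are hypothetical names used for illustration only. Letting per-line credentials take precedence over the shared ones is an assumption of this sketch, not documented A-Parser behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proxy:
    host: str
    port: int
    login: Optional[str] = None
    password: Optional[str] = None


def parse_proxy_line(
    line: str,
    default_login: Optional[str] = None,
    default_password: Optional[str] = None,
) -> Proxy:
    """Parse 'ip:port' or 'login:password@ip:port' into a Proxy.

    Assumption: credentials written in the line override the shared defaults.
    """
    line = line.strip()
    login, password = default_login, default_password
    if "@" in line:
        # Split on the last '@' so a password containing '@' stays intact.
        credentials, line = line.rsplit("@", 1)
        login, _, password = credentials.partition(":")
    host, _, port = line.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"unrecognized proxy line: {line!r}")
    return Proxy(host, int(port), login, password)


if __name__ == "__main__":
    # Same login/password for the whole list (set once, like in the checker settings).
    print(parse_proxy_line("1.2.3.4:8080", "user", "pass"))
    # Individual credentials embedded in each line.
    print(parse_proxy_line("bob:s3cret@5.6.7.8:3128"))
```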

⏩ Video: connecting proxies with authorization

Choosing a proxy checker for a task

note

These settings are needed only when different tasks should work with different proxy checkers; you can skip this section if all tasks should use all available proxies.

Go to the Settings menu, then select the desired Threads Config preset or create a new one (Add new button).

In the Proxy Checkers field, select one or more proxy checkers (a proxy checker must be enabled to be used) and click Save. You can also select all proxy checkers at once by choosing All (the default value).

Choosing a proxy checker for a task

Now you can use the created Threads Config, with the specified proxies, in your tasks by selecting it in the Task Editor.

Select threads config

You can also override the proxy checker for an individual scraper using the Override function (Proxy Checker option).

Override Proxy Checker

The option Exclude from "All" in the proxy checker settings allows excluding its proxies from general access in A-Parser. This option is useful when certain proxies need to be made available only from specific tasks or only for specific scrapers:

  • the task requires explicit selection of the excluded proxy checker
  • a specific scraper needs to be configured to use the excluded proxy checker

Changes in logic

Previously, if a specific proxy checker was selected in the task, and a different proxy checker was specified in the scraper, the scraper waited for proxies. Now, the settings of the specific scraper are prioritized:

  • "All" - uses all proxies selected for the task
  • specific proxy checker - uses it, even if it is not selected in the task
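
The selection rules above (task-level choice, scraper-level override, and Exclude from "All") can be summarized in a short Python sketch. This is an illustrative model under stated assumptions, not A-Parser's implementation: the function resolve_checkers, the dictionary layout, and the checker names are all hypothetical.

```python
ALL = "All"


def resolve_checkers(task_choice, scraper_choice, checkers):
    """Return the set of proxy checker names a scraper would draw proxies from.

    checkers: dict mapping name -> {"enabled": bool, "exclude_from_all": bool}
    task_choice / scraper_choice: the string "All" or a list of checker names.
    """
    # Only enabled proxy checkers can be used at all.
    enabled = {name for name, cfg in checkers.items() if cfg["enabled"]}
    # "All" at task level means every enabled checker not excluded from "All".
    general_pool = {n for n in enabled if not checkers[n]["exclude_from_all"]}

    def expand(choice, meaning_of_all):
        if choice == ALL:
            return meaning_of_all
        # An explicit selection also reaches checkers excluded from "All".
        return enabled & set(choice)

    task_set = expand(task_choice, general_pool)
    # Scraper "All" inherits the task's set; an explicit scraper choice wins
    # even when that checker is not selected in the task.
    return expand(scraper_choice, task_set)


if __name__ == "__main__":
    checkers = {
        "default": {"enabled": True, "exclude_from_all": False},
        "google": {"enabled": True, "exclude_from_all": False},
        "private": {"enabled": True, "exclude_from_all": True},
    }
    print(resolve_checkers(ALL, ALL, checkers))
    print(resolve_checkers(["google"], ["private"], checkers))
```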

Proxy checker parameters

| Parameter name | Default value | Description |
| --- | --- | --- |
| Loading type | Replace | Determines whether to keep previously loaded proxies or not: Add always adds new proxies to the general list, Replace replaces old proxies with newly loaded ones |
| Load threads count | 5 | Number of threads for loading proxies from sites |
| Load interval | 30 | Interval between complete re-checking of the list of sites |
| Load timeout | 30 | Timeout for a request to the proxy site |
| Load max size | 524288 | Maximum size of the page with proxies; if the page is larger, it is trimmed to the specified size |
| Load limit count | 0 | Limit on the number of loaded proxies, 0 to disable |
| No check proxies | | Allows disabling proxy checking. All loaded proxies are automatically considered live |
| Proxies type | HTTP, SOCKS5 | Selection of which proxy types to check and in what order; if both HTTP and SOCKS are specified, upon failed check with HTTP the proxy will be re-checked with SOCKS protocol |
| Check threads | 15 | Number of threads for checking proxies |
| Check url | http://work.a-poster.info:25000/ | Link to the proxy checking script; currently, checking is done through the scraper server, this behavior may change in the future |
| Check interval | 30 | Interval between complete re-checks of all proxies |
| Check timeout | 5 | Proxy timeout |
| Check max size | 5120 | Maximum size of the page to download when checking proxies |
| Check anonymous | | Check proxies for anonymity; if selected, External IP must be specified |
| External IP | | External IP address of the computer/server, must be specified if the Check anonymous option is enabled |
| Exclude from "All" | | By default, in each scraper, the proxy checker is set to "All", meaning all available proxy checkers are used. If this option is enabled, the proxy checker will be excluded from All. |
| Save alive proxies to file | No | Save live proxies to the file files/proxy/alive.txt |
| Use proxy authorization | | Use authorization for proxies via login/password |
| Authorization login | | Login for authorization |
| Authorization password | | Password for authorization |
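
The Proxies type behavior (try each selected type in order, falling back to the next one after a failed check) can be sketched as follows. This is a simplified model, not A-Parser's code: detect_proxy_type and the probe callback are hypothetical names, and the actual network check is deliberately left to the caller.

```python
from typing import Callable, Optional, Sequence


def detect_proxy_type(
    proxy: str,
    types: Sequence[str],
    probe: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each proxy type in the configured order.

    probe(proxy, proxy_type) is a caller-supplied function that performs the
    real check (for example, a request to the Check url within the Check
    timeout) and returns True on success. Returns the first type that passes,
    or None if the proxy fails every check and is therefore not live.
    """
    for proxy_type in types:
        if probe(proxy, proxy_type):
            return proxy_type
    return None


if __name__ == "__main__":
    # Fake probe for demonstration: pretend this proxy only speaks SOCKS5.
    def fake_probe(proxy: str, proxy_type: str) -> bool:
        return proxy_type == "SOCKS5"

    print(detect_proxy_type("1.2.3.4:1080", ["HTTP", "SOCKS5"], fake_probe))
```

Passing the probe in as a parameter keeps the ordering logic separate from networking, which also makes it easy to exercise without real proxies.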

Installing the verification script on the hosting service

note

By default, A-Parser checks proxies using its own check script, without the need to install a script on your hosting service

Upload the following PHP script to your hosting or server and specify the link to it in Check url:

<?php

print_r($_SERVER);
print_r($_POST);

?>
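
The script above simply echoes the request's server variables and POST data back to the client. One plausible use of such output for the Check anonymous option is sketched below in Python: a proxy is treated as anonymous if your External IP does not appear anywhere in the echoed data. This criterion is an assumption made for illustration; the exact rule A-Parser applies is not described in this section, and looks_anonymous and the sample response are hypothetical.

```python
import re


def looks_anonymous(response_body: str, external_ip: str) -> bool:
    """Return True if external_ip does not occur in the echoed request data.

    Assumption: a non-anonymous proxy leaks the client's real address in some
    header (for example X-Forwarded-For), which the check script prints back.
    Digit boundaries prevent 1.2.3.4 from matching inside 11.2.3.45.
    """
    pattern = rf"(?<!\d){re.escape(external_ip)}(?!\d)"
    return re.search(pattern, response_body) is None


if __name__ == "__main__":
    # Shortened, made-up example of print_r($_SERVER) output.
    body = """Array
(
    [REMOTE_ADDR] => 203.0.113.7
    [HTTP_X_FORWARDED_FOR] => 198.51.100.20
)
"""
    print(looks_anonymous(body, "198.51.100.20"))  # real IP leaked by the proxy
    print(looks_anonymous(body, "192.0.2.55"))     # real IP not present
```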