
Proxy setup

Dec 23, 2020

  • Main capabilities

    • Simultaneous support for HTTP, SOCKS4 and SOCKS5 proxies
    • Multithreaded checking
    • Loading proxies from a local file
    • Multithreaded loading from external sources
    • Anonymity check
    • Login/password authorization for both HTTP and SOCKS proxies, including per-proxy credentials in the format login:password@ip:port
    • Arbitrary regular expressions for the IP address and port when parsing proxies from external sources
    • Output of checked proxies to a file
    • Multiple proxy sources in a single task
    • Since version 1.2.948, support for domain proxies in the formats domain:port and login:password@domain:port

    File structure

    Working directory of the proxy checker:
    It contains a folder for each proxy checker. Each folder holds the following files:
    • proxy.txt - proxies are loaded from this file (in ip:port format); put your proxy list here
    • sites.txt - the list of proxy sources (links to pages with proxies, one link per line)
    • alive.txt - live proxies are saved to this file every 5 seconds if the corresponding option is enabled
    • regex.txt - the list of regular expressions for parsing proxies from external sources (one regular expression per line; $1 must capture the IP address and $2 the port)
    For the "default" proxy checker, the files are located directly in the root directory files/proxy/
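    The regex.txt convention above (first capture group is the IP address, second is the port) can be illustrated with a minimal Python sketch. The two patterns and the function name extract_proxies are hypothetical examples, not the expressions shipped with A-Parser, and A-Parser's own regex engine may differ from Python's re module.

```python
import re

# Hypothetical patterns in the spirit of regex.txt:
# group 1 must capture the IP address, group 2 the port.
PATTERNS = [
    r"(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})",                    # plain ip:port
    r"<td>(\d{1,3}(?:\.\d{1,3}){3})</td>\s*<td>(\d{2,5})</td>",  # IP and port in table cells
]

def extract_proxies(page_text, patterns=PATTERNS):
    """Return unique ip:port strings found by any pattern, in first-seen order."""
    found = []
    seen = set()
    for pattern in patterns:
        for match in re.finditer(pattern, page_text):
            proxy = f"{match.group(1)}:{match.group(2)}"
            if proxy not in seen:
                seen.add(proxy)
                found.append(proxy)
    return found

sample = "<td>10.0.0.1</td><td>8080</td> 192.168.1.5:3128"
proxies = extract_proxies(sample)
```

    Each pattern is applied to the whole page in turn, so a source that mixes several layouts can still be covered by adding one expression per layout.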


    Proxy checkers are managed in the Proxy checker tab, where you can add, delete, enable, or disable them. The tab also shows statistics for each proxy checker, a graph of live proxies, and statistics on source processing.

    Adding and configuring a proxy checker

    1. Open "Proxy checker" and press "Add Proxy Checker", or choose the "Edit" item in the drop-down menu. The proxy checker configuration page opens; select the default preset.
    2. If necessary, set the desired number of threads for checking proxies (Check threads), select the proxy type (Proxies type), and change other settings. If you use proxies from A-Parser (from the Members Area), it is enough to disable proxy checking (tick No check proxies) and leave everything else at its defaults.
    3. Save the preset (Save for an existing preset, Save as new for a new one).
    4. Go back to "Proxy checker" and check that the newly created checker is enabled; if not, enable it.
    5. Open the directory specified in "Working path".
    6. Next, specify the proxy sources: links in sites.txt, a list of proxies in proxy.txt. If you use proxies from A-Parser (from the Members Area), put the link from the Proxy tab of the Members Area into sites.txt, after first saving your IP in that same tab.
    7. Go back to A-Parser, open "Check proxy", and make sure that "Total alive" for the edited proxy checker is greater than 0. This means the proxies are configured correctly.
    The default parameter values are suitable for most tasks.

    Choosing a proxy checker for a task (parser)

    1. Go to "Settings - Configs Presets" and select the desired preset or create a new one (Save as new).
    2. In the "Proxy Checkers" field, select one or more proxy checkers (a proxy checker must be enabled to be used) and save (Save).
    3. You can now use the created Config preset, with the specified proxies, in your own tasks by selecting it in the Task editor.
    4. You can also override the proxy checker in each parser using the function Override - Proxy Checker.
    The Exclude from "All" option in the proxy checker settings excludes its proxies from general circulation in A-Parser. This option is useful when certain proxies must be available only to certain tasks or only to certain scrapers. To use an excluded proxy checker:
    1. explicitly select the excluded proxy checker for the task, or
    2. set the excluded proxy checker for the particular scraper.

    Changes in logic.

    Previously, if one proxy checker was selected in a task and a different proxy checker was specified in the scraper, the scraper would wait for a proxy. Now the settings of the particular scraper take priority:
    1. "All" - the scraper uses all the proxy checkers selected for the task
    2. a particular proxy checker - the scraper uses it even if it is not selected in the task
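
    The priority rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior; the function name resolve_checkers and the checker names are hypothetical and do not correspond to anything in A-Parser itself.

```python
def resolve_checkers(task_checkers, scraper_choice="All"):
    """Pick the proxy checkers a scraper ends up using.

    task_checkers: names of the proxy checkers selected for the task.
    scraper_choice: "All", or one specific proxy checker set on the scraper.
    """
    if scraper_choice == "All":
        # "All" means every proxy checker selected for the task.
        return list(task_checkers)
    # A specific checker wins even if the task did not select it.
    return [scraper_choice]

from_task = resolve_checkers(["default", "private"])
overridden = resolve_checkers(["default"], scraper_choice="private")
```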

    Using proxies with authorization

    1. If the login and password are identical for all proxies, specify them in the proxy checker settings.
    Save the proxies in the files/proxy/proxy.txt file, or specify links to the source sites in the files/proxy/sites.txt file.
    Proxies need to be specified in the format ip:port

    2. If the login and password differ between proxies, configure the proxy checker settings accordingly.
    Proxies need to be specified in the format login:password@ip:port
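
    To make the two line formats concrete, here is a minimal Python sketch that splits a proxy line into its parts. The Proxy type and the parse_proxy function are hypothetical helpers written for this illustration; they show the format only and are not how A-Parser parses its files.

```python
from typing import NamedTuple, Optional

class Proxy(NamedTuple):
    host: str                       # IP address or domain
    port: int
    login: Optional[str] = None     # None when the line carries no credentials
    password: Optional[str] = None

def parse_proxy(line: str) -> Proxy:
    """Parse 'host:port' or 'login:password@host:port'."""
    line = line.strip()
    login = password = None
    if "@" in line:
        # Split on the last '@' so the host part is always on the right.
        credentials, line = line.rsplit("@", 1)
        login, _, password = credentials.partition(":")
    host, sep, port = line.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"not a host:port proxy: {line!r}")
    return Proxy(host, int(port), login, password)

with_auth = parse_proxy("user:secret@10.0.0.1:8080")
without_auth = parse_proxy("proxy.example.com:3128")
```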

    Briefly about each parameter

    • Loading type - Defines whether previously loaded proxies are kept: Add always adds new proxies to the general list, Replace replaces the old proxies with the newly loaded ones
    • Load threads count - Number of threads that load proxies from the sites
    • Load interval - Interval between full re-scans of the list of sites
    • Load timeout - Timeout for a request to a site with proxies
    • Load max size - The maximum size of a page with proxies; a larger page is cut off at this size
    • Load limit count - Limit on the number of downloaded proxies, 0 to disable
    • No check proxies - Turns off proxy checking. All loaded proxies are automatically considered live
    • Proxies type - Which proxy types to check and in what order; if HTTP and SOCKS are both specified, a proxy that fails the HTTP check is checked again using the SOCKS protocol
    • Check threads - Number of proxy checker threads
    • Check url - The link to the proxy check script; at the moment the check goes through the parser server, and this behavior may change in the future
    • Check interval - Interval between full rechecks of all proxies
    • Check timeout - Proxy timeout
    • Check max size - The maximum size of the page downloaded when checking a proxy
    • Check anonymous - Checks proxies for anonymity; if selected, External IP must be specified
    • External IP - External IP address of the computer or server; must be specified if the Check anonymous option is enabled
    • Save alive proxies to file - Saves live proxies to files/proxy/alive.txt
    • Use proxy authorization - Uses login/password authorization for proxies
    • Authorization login - Login for authorization
    • Authorization password - Password for authorization
    • Exclude from "All" - By default, "All" is selected as the proxy checker in each parser, i.e. all available proxy checkers are used. If this option is enabled, the proxy checker is excluded from "All"
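
    As an illustration of how Loading type and Load limit count might interact, here is a small Python sketch. The function name merge_loaded is hypothetical, and two details are assumptions not stated above: that the limit applies to each freshly loaded batch, and that duplicates are dropped.

```python
def merge_loaded(current, loaded, loading_type="Add", limit=0):
    """Combine the existing proxy list with a freshly loaded batch.

    loading_type: "Add" keeps the old proxies, "Replace" discards them.
    limit: maximum number of loaded proxies to accept, 0 disables the limit
           (assumption: the limit applies to the new batch only).
    """
    if limit > 0:
        loaded = loaded[:limit]
    if loading_type == "Replace":
        result = []
    elif loading_type == "Add":
        result = list(current)
    else:
        raise ValueError(f"unknown loading type: {loading_type!r}")
    seen = set(result)
    for proxy in loaded:
        # Assumption: duplicates are skipped, keeping first-seen order.
        if proxy not in seen:
            seen.add(proxy)
            result.append(proxy)
    return result

added = merge_loaded(["a:1"], ["b:2", "a:1", "c:3"], "Add")
replaced = merge_loaded(["a:1"], ["b:2", "a:1", "c:3"], "Replace", limit=2)
```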

    Installing the check script on your hosting

    Upload the following PHP script to your hosting or server:

    and then specify the link to it in Check url.