Frequently Asked Questions
1. Questions related to demo, payment, and purchase
1.1. How to download results in the Demo version?
In the Demo version, results are not available for download; we provide them on request. Contact us and tell us which scraper you are interested in, and we will send you the results (within the demo, their number is limited).
1.2. Do I need to pay extra for anything after purchasing A-Parser?
No. More details: licenses and add-ons, purchase page.
1.3. Where and how can I pay for proxies?
When purchasing a license, you are provided with bonus proxies.
Lite - 20 threads for 2 weeks, Pro and Enterprise - 50 threads for a month.
You can buy more threads or extend them in the Members Area on the Shop tab, Proxy subsection.
1.4. Could you set up a task for me for money?
Technical support related to the operation of A-Parser is provided for free. For paid assistance in setting up tasks, you can contact here: Paid services for setting up tasks, help with configuration and training to work with A-Parser.
1.5. Can I make a payment for the scraper through Privat24 bank? Through KIWI?
The list of payment systems we work with is indicated here: buy A-Parser.
1.6. If I need to scrape only the number of indexed pages in Yandex, which scraper should I buy?
For such purposes, the Lite version is sufficient, but the Pro version is more practical and flexible to work with.
1.7. Where can I view information about my license?
1.8. Is it possible to use purchased proxies from multiple IPs?
No.
2. Questions about installation, launch, and updates
2.1. I click the Download button - but the archive does not download. What should I do?
Check that you have free space on your hard drive and temporarily disable your antivirus. Follow the installation instructions. Also see How to start working.
2.2. I bought the Enterprise version, but PRO is still being installed. What should I do?
Delete the previous version. In the Members Area, check that your IP address is entered correctly. Before downloading, press the Update button, then download the newer version. More details in the installation instructions.
2.3. I installed the program, but it does not start, what should I do?
Check the running applications, disable your antivirus, and check the amount of free RAM. Also check in the Members Area that your IP address is entered correctly. More details: installation instructions.
2.4. What if I have a dynamic IP address?
Nothing serious: A-Parser supports working with dynamic IP addresses. Each time your IP changes, you need to enter the new address in the Members Area. To avoid these manipulations, it is recommended to use a static IP address.
2.5. What are the optimal server, computer parameters for installing the scraper?
All system requirements can be viewed here: system requirements.
2.6. I launched a task. The scraper crashed and won't start again, what should I do?
Stop the server, check that the process is not hanging in memory, and try launching it again. You can also start A-Parser with all tasks stopped: launch it with the -stoptasks parameter. See details about starting with parameters.
2.7. What password should I enter when opening the address 127.0.0.1:9091?
If this is the first launch, the password is empty. If not, it is the one you set. If you forgot the password: reset password.
2.8. In the Members Area, I enter my IP, but it does not change in the Your current IP field. Why?
The Your current IP field displays the IP address that is currently detected for you, and it should not change. This is the address you should enter in the IP 1 field.
2.9. Can I run two copies at the same time?
You can run two copies on the same machine only if they have different ports specified in the configuration file.
You can run two A-Parsers on different machines simultaneously only if you have purchased an additional IP in the Members Area.
2.10. Does the scraper have hardware binding?
No. Your IP is used for license control.
2.11. A question about updating: should I only update the .exe? And what are the files config/config.db and files/Rank-CMS/apps.json for?
Unless otherwise specified, update only the .exe. The first file stores the A-Parser configuration; the second is a database used for CMS detection, i.e., for the operation of the Rank::CMS scraper.
2.12. I have Win Server 2008 Web Edition - the scraper won't start...
A-Parser will not work on this version of the OS. The only option is to change the OS.
2.13. I have a quad-core processor. Why does A-Parser use only one core?
A-Parser uses 2 to 4 cores; the additional cores are used only for filtering, the Results Builder, and Parse custom result.
2.14. I started getting a segmentation error (segmentation fault). What should I do?
Most likely your IP has changed. Check it in the Members Area.
2.15. I have Linux. A-Parser started, but it does not open in the browser. How to solve?
Check the firewall - most likely it is blocking access.
2.16. I have Windows 7. A-Parser started, but it does not open in the browser and there is no Node.js process in the task manager. How to solve?
You need to check for Windows updates and install the latest available. Specifically, you need the Windows 7 SP1 update.
2.17. A-Parser won't start, and aparser.log shows the error FATAL: padding_depad failed: Invalid argument provided. at ./Crypt/Mode/CBC.pm line 20.
Most likely one of the tasks (in the /config/tasks/ folder) is damaged, usually as a result of a disk error (for example, if the PC was powered off without a proper shutdown). You can find out more by starting A-Parser with the -morelogs flag.
Solution: start A-Parser with the -stoptasks parameter. If that doesn't help, clear the entire /config/tasks/ folder. If the problem is still not resolved, reinstall the scraper in a new directory and copy the config over from the old one (if it is not damaged).
3. Questions about setting up A-Parser and other settings
3.1. How to configure the proxy checker?
Detailed instructions are located here: proxy configuration.
3.2. There are no live proxies - why?
Check your internet connection and the proxy checker settings. If everything is configured correctly, it means your proxy list contains no working servers at the moment; either use other proxies or try again later. If you are using our proxies, check the IP address in the Members Area, Proxy section. It is also possible that your provider blocks access to third-party DNS; try the steps described here: http://a-parser.com/threads/1240/#post-3582
3.3. How to connect AntiGate?
Detailed instructions for setting up AntiGate here.
3.4. I changed the parameters in the scraper settings, but they did not apply. Why?
The default preset cannot be changed. If you make any changes, click Save as new preset and then use the saved preset in your task.
3.5. Is it possible to change the settings of a running task?
Yes, but not all of them. Pause the running task, then select Edit from the drop-down menu.
3.6. How to import a preset?
Click the button next to the task selection field in the Task Editor. Details here.
3.7. How to configure the scraper so that it does not use proxies?
In the settings of the desired scraper, uncheck Use proxy.
3.8. I don't have the Add Override / Override option button!
You can add this option directly in the Task Editor. See Scraper options.
3.9. How to overwrite the same file with results?
When composing a task, set the option Overwrite file.
3.10. Where to change the password for the scraper?
3.11. I set 6 million keys for scraping and specified that all domains must be unique. How do I make sure that, when I add another 6 million keys, only unique domains that do not intersect with the previous scrape are recorded?
You need to use the Save uniqueness option when composing the first task, and specify the saved database in the second. Details in Additional task editor options.
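For illustration, the Save uniqueness option works like a persistent set of already-recorded domains that survives between tasks. A minimal Python sketch of the same idea (the file name is hypothetical, not an A-Parser internal):

import os

# Keep a persistent set of domains across runs; record only unseen ones.
SEEN_DB = "seen_domains.txt"  # hypothetical file name

seen = set()
if os.path.exists(SEEN_DB):
    with open(SEEN_DB, encoding="utf-8") as f:
        seen = {line.strip() for line in f}

def record_if_new(domain: str) -> bool:
    """Return True and store the domain only if no earlier run recorded it."""
    if domain in seen:
        return False
    seen.add(domain)
    with open(SEEN_DB, "a", encoding="utf-8") as f:
        f.write(domain + "\n")
    return True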
3.12. How to bypass the limitation of 1000 results for Google?
Use the option Parse all results.
3.13. How to bypass the limitation of 1024 threads on Linux?
3.14. What is the thread limit on Windows?
Up to 10,000 threads.
3.15. How to make requests unique?
Use the Unique requests option in the Requests block in the Task Editor.
3.16. How to disable proxy checking?
In Settings - Proxy checker settings, select the required proxy checker and check the Do not check proxy option. Save and select the saved preset.
3.17. What is Proxy ban time? Can I set it to 0?
It is the time, in seconds, for which a proxy is excluded from work after it gets banned. Yes, you can set it to 0.
3.18. What is the difference between Exact Domain and Top Level Domain in the SE::Google::Position scraper?
Exact Domain is a strict match: if the search result is www.domain.com and we are looking for domain.com, there will be no match. Top Level Domain checks the entire top-level domain, so in the same case there will be a match.
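As an illustration only (this is not A-Parser's actual code), the two matching modes can be sketched in Python like this:

# A sketch of the two matching modes in SE::Google::Position.
def exact_domain(result_host: str, target: str) -> bool:
    # Strict comparison: www.domain.com does not equal domain.com
    return result_host == target

def top_level_domain(result_host: str, target: str) -> bool:
    # The whole registered domain matches, subdomains included
    return result_host == target or result_host.endswith("." + target)

print(exact_domain("www.domain.com", "domain.com"))      # False
print(top_level_domain("www.domain.com", "domain.com"))  # True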
3.19. If I run a test scrape - everything works, if a regular one - I get an error Some error.
Most likely the problem is with DNS, try to follow this instruction for DNS setup.
3.20. Where is the Result Format set?
3.21. In SE::Google there is no Dutch language, although it is available in Google settings. Why?
The Dutch language is in the list; it appears there under the name Dutch. Details in the improvement for adding the Dutch language.
4. Questions about scraping and errors during scraping
4.1. What are threads?
All modern processors can execute tasks in multiple threads, which significantly increases execution speed. By way of comparison: a regular bus that transports a certain number of people per unit of time is single-threaded processing, while a double-decker bus that transports twice as many people in the same time is multi-threaded processing. A-Parser can process up to 10,000 threads simultaneously.
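In code, the same idea is many requests in flight at once instead of one after another. A minimal Python sketch using a thread pool (the URL is a placeholder):

from concurrent.futures import ThreadPoolExecutor
import urllib.request

# A sketch of multi-threaded processing: 20 worker threads fetch the same
# list of URLs far faster than a single sequential loop would.
urls = ["https://example.com/"] * 50  # placeholder URLs

def fetch(url: str) -> int:
    with urllib.request.urlopen(url) as resp:
        return len(resp.read())

with ThreadPoolExecutor(max_workers=20) as pool:
    sizes = list(pool.map(fetch, urls))
print(sum(sizes))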
4.2. The task does not start - it says Some Error - why?
Check the IP address in the Members Area.
4.3. All requests are failing, what should I do?
Most likely the task is composed incorrectly or the wrong request format is used. Also check whether there are any live proxies. You can also try increasing the Request retries option (more details here: failed requests).
4.4. How many accounts do I need to register to scrape 1,000,000 keywords with SE::Yandex::Wordstat?
It's impossible to say exactly how many accounts are needed, as an account may stop being valid after an unknown number of requests. But you can always register new accounts using the scraper SE::Yandex::Register or simply add existing accounts to the file files/SE-Yandex/accounts.txt.
4.5. The task does not start, it says Error: Lock 100 threads failed(20 of limit 100 used) what should I do?
You need to increase the maximum available number of threads in the scraper settings, or reduce it in the task settings. Detailed in Settings.
4.6. Is it possible to run 2 tasks simultaneously?
Yes, A-Parser supports the execution of multiple tasks simultaneously. The number of tasks that can run at the same time is regulated in Settings - General settings: Maximum active tasks.
4.7. Where is the file with the results located?
On the Queue of tasks tab, after the completion of each task, you can download the results of the work. Physically, they are located in the results folder.
4.8. Can I download the file with results if the scraping is not finished?
No, you cannot download the results until the task is finished. But you can copy the results file from the aparser/results folder while the task is stopped or paused.
4.9. Can your scraper scrape 1,000,000 links with a single query?
Yes, using the option Scrape all results / Parse all results.
4.10. Is it possible to scrape Rank::CMS, Net::Whois without proxies?
4.11. How to scrape links from Google?
It is necessary to use SE::Google.
4.12. Can a scraper follow links?
Yes, the HTML::LinkExtractor scraper can do this when the Parse to level option is used.
4.13. Google scraping is very slow, what to do?
First of all, check the task logs: it is possible that all requests are failing. If so, find out why the requests fail and fix it. When scraping with SE::Google, the task log often shows failed attempts caused by Google showing captchas; this is normal. You can connect AntiGate to solve captchas so the scraper does not cycle through retries. There is also an article describing the factors that affect scraping speed: speed and principle of scrapers operation.
4.14. Can your scraper scrape links with text only in Japanese?
Yes, for this you need to set the necessary language in the scraper settings, as well as use Japanese keywords.
4.15. Can your scraper scrape links only in the domain zone .de or .ru?
Yes. For this, you need to use a filter; a sketch of the idea is below.
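As a sketch of the idea (not the exact A-Parser filter syntax), a regular expression over each result link might look like this in Python:

import re

# Keep only links whose host ends in .de or .ru; the links are examples.
zone_re = re.compile(r"^https?://[^/]*\.(de|ru)(/|$)")

links = [
    "https://www.example.de/page",
    "https://example.com/",
    "http://shop.example.ru",
]
print([link for link in links if zone_re.search(link)])  # .de and .ru only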
4.16. How to get each result on a new line in the file?
When formatting the result, use \n. Example:
$serp.format('$link\n')
4.17. How to scrape the top 10 sites from Google?
Here is the preset:
eyJwcmVzZXQiOiJUT1AxMCIsInZhbHVlIjp7InByZXNldCI6IlRPUDEwIiwicGFy
c2VycyI6W1siU0U6Okdvb2dsZSIsImRlZmF1bHQiLHsidHlwZSI6Im92ZXJyaWRl
IiwiaWQiOiJwYWdlY291bnQiLCJ2YWx1ZSI6MX0seyJ0eXBlIjoib3ZlcnJpZGUi
LCJpZCI6ImxpbmtzcGVycGFnZSIsInZhbHVlIjoxMH0seyJ0eXBlIjoib3ZlcnJp
ZGUiLCJpZCI6InVzZXByb3h5IiwidmFsdWUiOmZhbHNlfV1dLCJyZXN1bHRzRm9y
bWF0IjoiJHAxLnByZXNldCIsInJlc3VsdHNTYXZlVG8iOiJmaWxlIiwicmVzdWx0
c0ZpbGVOYW1lIjoiJGRhdGVmaWxlLmZvcm1hdCgpLnR4dCIsImFkZGl0aW9uYWxG
b3JtYXRzIjpbXSwicmVzdWx0c1VuaXF1ZSI6Im5vIiwicXVlcnlGb3JtYXQiOlsi
JHF1ZXJ5Il0sInVuaXF1ZVF1ZXJpZXMiOmZhbHNlLCJzYXZlRmFpbGVkUXVlcmll
cyI6ZmFsc2UsIml0ZXJhdG9yT3B0aW9ucyI6eyJvbkFsbExldmVscyI6ZmFsc2Us
InF1ZXJ5QnVpbGRlcnNBZnRlckl0ZXJhdG9yIjpmYWxzZX0sInJlc3VsdHNPcHRp
b25zIjp7Im92ZXJ3cml0ZSI6ZmFsc2V9LCJkb0xvZyI6Im5vIiwia2VlcFVuaXF1
ZSI6Ik5vIiwibW9yZU9wdGlvbnMiOmZhbHNlLCJyZXN1bHRzUHJlcGVuZCI6IiIs
InJlc3VsdHNBcHBlbmQiOiIiLCJxdWVyeUJ1aWxkZXJzIjpbXSwicmVzdWx0c0J1
aWxkZXJzIjpbXSwiY29uZmlnT3ZlcnJpZGVzIjpbXX19
4.18. I added a task, went to the Task Queue tab - and it's not there! Why?
Either a mistake was made when creating the task, or it has already been completed and moved to Completed.
4.19. It says the file is not in utf-8, but I didn't change it, it's utf-8, what to do?
Check again. Also, try changing the encoding anyway, for example using Notepad++.
4.20. The file with the results is all in one line, although I set a line break in the task - why?
In the additional settings of A-Parser, set the line break to CRLF (Windows).
If you have already scraped without this option, use a more advanced viewer, for example Notepad++.
4.21. How much time does it take to check the frequency of queries on Yandex for 1,000 queries?
This indicator is highly dependent on the task parameters, server characteristics, quality of proxies, etc., so it is impossible to give a definitive answer.
4.22. How do I set up the scraper so that the result is query-link?
Result format:
$p1.serp.format('$query: $link\n')
The result will be:
query: link 1
query: link 2
query: link 3
4.23. How do I re-scrape unsuccessful queries and where are they stored?
For unsuccessful queries to be saved, select the appropriate option in the Queries block of the Task Editor. Unsuccessful queries are stored in queries\failed. To re-scrape them, create a new task and specify the file with unsuccessful queries as the queries file.
4.24. How to get rid of HTML tags when scraping text?
Use the Remove HTML tags option in the Results Builder.
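Conceptually, the option strips markup and keeps the text. A naive Python equivalent for illustration (real-world HTML may need a proper parser):

import re

# Strip anything that looks like a tag, keep the visible text.
html = "<p>Hello, <b>world</b>!</p>"
print(re.sub(r"<[^>]+>", "", html))  # Hello, world!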
4.25. How to make the scraper parse only domains?
Use the Extract Domain option in the Results Builder.
4.26. What is the maximum file size for requests that can be used in the scraper?
The sizes of request files and result files are unlimited and can reach terabytes.
4.27. Why, when I enter text into the request field, does the scraper output Queries length limited to 8192 characters?
This happens because the length of requests entered in the request field is limited to 8192 characters. To use longer requests, use files as the request source.
4.28. What does Pending threads - 3 mean?
It means that there are not enough proxies: 3 threads are waiting for a free proxy. Reduce the number of threads or increase the number of proxies.
4.29. In the test parsing, it writes 596 SOCKS proxy error: Hello read error(Connection reset by peer) (0 KB) and does not parse, why?
This indicates that the proxies are not working.
4.30. What is the difference between the language of the results and the country of search in the Google scraper?
The difference is as follows: the country of search ties the results to a specific country. For example, if you search for купить окна (buy windows) with a binding to a specific country, priority will be given to websites offering to buy windows in that country. The language of the results is simply the language in which the results should be presented.
4.31. I can't scrape a specific website. What could be the reason?
Often the reason is that the site blocks requests with an outdated user agent. This can be solved with a newer user agent, or with the following code in the User agent parameter:
[% tools.ua.random() %]
4.32. The scraper hangs, crashes. In the log, there is a line syswrite: No space left on device
The scraper has run out of space on the hard drive. Free up more disk space.
4.33. My scraper started giving none in the results (or clearly incorrect results)
4.34. A window with the inscription Failed fetch news constantly appears
4.35. How to display the first n results of the search output?
4.36. How to track the chain of redirects?
4.37. How to check if a link is indexed on the donor site?
For such purposes, there is a separate scraper: Check::BackLink. More details in the discussion.
4.38. The scraper crashes on Linux. The log contains the following entry: EV: error in callback (ignoring): syswrite() on closed filehandle at AnyEvent/Handle.pm line...
Most likely, you need to tune the number of threads, as written in Documentation: Tuning Linux for more threads.
4.39. Where can I view all possible parameters for their use through the API?
Getting an API request in the interface.
Also, you can generate the full task config in JSON. To do this, take the task code and decode it from base64 (see the sketch below).
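A minimal Python sketch of the decoding step (the sample config below is made up purely for the demo; paste a real task code, such as the one in question 4.17, into task_code):

import base64
import json

# An A-Parser task code is a base64-encoded JSON blob. For a self-contained
# demo we encode a tiny made-up sample first; replace it with a real code.
sample = base64.b64encode(json.dumps({"preset": "TOP10"}).encode())
task_code = sample  # paste the real base64 task code here instead

config = json.loads(base64.b64decode(task_code))
print(json.dumps(config, indent=2, ensure_ascii=False))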
4.40. I download images using Net::HTTP, but for some reason, they are all corrupted. What to do?
1) Check the Max body size parameter: it may need to be increased. 2) Check the line break format in the A-Parser settings (Additional settings - Line break): for images not to be corrupted, the UNIX format must be used.
4.41. How to get admin contact from WHOIS?
This task is easily solved using the Parse custom result function and a regular expression (see the sketch below). Details in the discussion.
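For illustration, a typical WHOIS response contains lines such as Admin Email:. A Python sketch of extracting that field (field names vary between registries, so the pattern is only an example):

import re

# Pull an admin contact out of raw WHOIS text with a regular expression.
whois_text = """\
Domain Name: EXAMPLE.COM
Admin Email: admin@example.com
"""
match = re.search(r"Admin Email:\s*(\S+)", whois_text)
if match:
    print(match.group(1))  # admin@example.com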
4.42. Regular expression for scraping phone numbers
4.43. Identifying websites without a mobile version
4.44. How to find out the name of the ns-server?
4.45. How to scrape links to Yandex cache?
4.46. How to scrape links to all pages of a website?
4.47. How to scrape the title from a page?
4.48. How to scrape all websites in a given domain zone?
4.49. How to collect all URLs with parameters?
4.50. How to filter results by multiple criteria and split them in the report?
4.51. How to simplify the filter construction?
4.52. How to sort by files depending on the result?
4.53. Create new result directory every X number of files (English)
4.54. First steps working with WordStat
4.55. Collecting text blocks >1000 characters
4.56. Outputting a certain amount of text from a page
This is also solved using Template Toolkit. More details in the discussion.
4.57. Checking competition and inclusion in the title in Google
4.58. Filtering by the number of query occurrences in the anchor and snippet
4.59. How to get the content of an article in one line?
4.60. How to compare two string dates?
4.61. How to scrape highlighted words from a snippet?
4.62. Example of a task using multiple scrapers
4.63. How to shuffle lines in the result and how to output a random number of results?
4.64. How to sign the result with MD5?
4.65. How to convert a date from Unix timestamp to string representation?
4.66. Parse to level, how to parse with a limit?
4.67. The scraper crashes on Linux when starting a task. The log contains the following lines: Can't call method "if_list" on an undefined value at IO/Interface/Simple.pm...
You need to execute the following command in the console:
apt-get --reinstall --purge install netbase
4.68. Error Cannot init Parser: Error: Failed to launch the browser process! [0429/082706.472999:ERROR:zygote_host_impl_linux.cc(90)] Running as root without --no-sandbox is not supported...
You need to start A-Parser not as root. Specifically: as the root user, create a new user without root privileges (or use an existing one), allow this user to interact with the A-Parser directory, then log in as the new user and launch A-Parser from it.
To create a user under the root user, you can use this guide.
To allow the created user to interact with the A-Parser directory, give the user permissions. To do this, log in as the root user and grant them with the command:
chown -R user:user aparser
4.69. Error Cannot init Parser: Error: Failed to launch the browser process! [0429/102002.619437:FATAL:zygote_host_impl_linux.cc(117)] No usable sandbox! Update your kernel or see...
Under the root user, execute the command:
sysctl -w kernel.unprivileged_userns_clone=1
A-Parser restart is not required.
For CentOS 7 the solution is in this topic.
Under the root user, execute the command:
echo "user.max_user_namespaces=15000" >> /etc/sysctl.conf
Then reload the sysctl settings with the command:
sysctl -p
4.70. JavaScript execution error(): Error: Failed to launch the browser process! /aparser/dist/nodejs/node_modules/puppeteer/.local-chromium/linux-884014/chrome-linux/chrome: error while loading shared libraries: libatk-1.0.so.0: cannot open shared object file: No such file or directory...
The error occurs due to the absence of libraries in the OS needed for Chrome to work.
A list of necessary libraries for Chrome to work can be found in Chrome headless doesn't launch on UNIX.
4.71. Why isn't the captcha being solved? The log shows that A-Parser received question marks instead of the captcha answer from XEvil
In the region settings, change the region to Russian, but only on the Advanced tab. Changing it there does not affect captcha solving, whereas changing it in both places will cause encoding problems in XRumer itself.