1.2.912 - NodeJS update, performance improvement, adaptation to changes in recaptchas

Support Artur · Jun 11, 2020

We have completed the transition to NodeJS as the main engine for scrapers and present a new stable version, 1.2.912, with support for NodeJS 14.2.0. This update combines many improvements: increased performance, reduced memory consumption, a completely new network stack, and support for native NodeJS modules, which lets you use the full range of packages from the npm registry in A-Parser.

This update also includes changes to how the Google scraper works with ReCaptcha2. Our team was one of the first to find a way to bypass the new version of the recaptcha, and we tested it together with the RuCaptcha service, to whom we extend special thanks. At the moment, correct captcha bypass has been tested with RuCaptcha, Anti-Captcha, XEvil and CapMonster.

In addition, many optimizations were made in the core of A-Parser, and performance was significantly increased when using a large number of tasks or large proxy lists. The Rank::CMS scraper has been completely rewritten and stabilized, with added support for the new apps.json format and for user rules.

Improvements

NodeJS updated to v14.2.0, V8 to 8.1

Added support for the data-s parameter in recaptchas for SE::Google, and added the ReCaptcha2 pass proxy option

Increased thread limit to 10,000 for Windows OS

Significantly improved performance with a large number of active proxies and/or tasks: the stack for working with proxies was completely rewritten, and work with large lists was optimized

Added new scraper Rank::KeysSo

Rank::Archive completely rewritten in JS

Improved performance and compatibility of regular expressions

Added automatic token retrieval to SE::Google::KeywordPlanner

Added the ability to scrape links to cached pages and to scrape mobile results in SE::Bing

In the Util::ReCaptcha2 scraper, specifying the Provider url is now optional when the CapMonster or XEvil provider is selected

Added the ability to specify an arbitrary date range in SE::Google::Trends

Added a choice of regular expression engine and support for a user-supplied file with features to Rank::CMS

Added the option Don't scrape if no other sizes to SE::Yandex::ByImage, which lets you skip collecting results when the desired image is not available in other sizes

[NodeJS] Fixed this.cookies.getAll()

[NodeJS] Added protection against endless loops and long-running regular expressions

[JS scrapers] Added follow_meta_refresh option for this.request

[JS scrapers] Added bypass_cloudflare option for this.request

[JS scrapers] Underscore replaced by Lodash

[JS scrapers] Added a mark in the log when calling other scrapers

[JS scrapers] The previous proxy is now reused after a request to another scraper

[JS scrapers] Added destroy() method
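
For orientation, here is a minimal sketch of how the new this.request options listed above might be passed from a JS scraper. Only the names follow_meta_refresh, bypass_cloudflare, this.request and destroy() come from the list; the StubScraper base class, the argument order (method, URL, query parameters, options), the option values and the purpose of destroy() are assumptions made so that the sketch runs in plain Node.js outside A-Parser.

```javascript
// Hypothetical stand-in for the base class that A-Parser gives to JS scrapers.
// It only records what was passed to this.request; no network request is made.
class StubScraper {
    constructor() {
        this.calls = [];
        this.destroyed = false;
    }

    async request(method, url, queryParams, opts) {
        this.calls.push({ method, url, queryParams, opts });
        return { success: true, data: '<html></html>' };
    }
}

class ExampleScraper extends StubScraper {
    async parse(set, results) {
        // Argument order and option values are assumptions, not confirmed API.
        const response = await this.request('GET', set.query, {}, {
            follow_meta_refresh: true, // as the name suggests: follow meta refresh redirects
            bypass_cloudflare: true,   // as the name suggests: try to pass the CloudFlare check
        });
        results.success = response.success ? 1 : 0;
        return results;
    }

    // Assumed here to be a cleanup hook; the list only says the method was added.
    async destroy() {
        this.destroyed = true;
    }
}

const scraper = new ExampleScraper();
const pending = scraper.parse({ query: 'https://example.com/' }, {});
pending.then((results) => console.log(results, scraper.calls[0].opts));
```

In a real scraper the base class and the response object are provided by A-Parser, so only the options object passed to this.request would carry over from this sketch.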

Corrections due to changes in search results

Many fixes in SE::Google

Fixed SE::Youtube, including scraping by tags

Fixed collection of links in Shop::eBay

Fixed phone scraping in Maps::Google

Fixed work with captchas in SE::Yandex::ByImage

In Rank::Social::Signal the variable $facebook_comment was removed because it is no longer relevant

Fixed SE::Startpage, Rank::Linkpad, Social::Instagram::post, SE::Yandex::Translate

Corrections

Fixed a bug due to which the selected proxy checker was ignored

Fixed the Decode HTML entities and Extract domain functions in Result Constructor

Fixed problem with encoding detection

Fixed an error when using $tools.query

Fixed a bug in Rank::MajesticSEO where all attempts were used up when there were no results

Fixed HTTP/2 support

Fixed a bug where the scraper crashed when it could not write to alive.txt

Fixed captcha handling in SE::Yandex::Register and Check::RosKomNadzor

Fixed the difference in requests sent via Net::HTTP and JS

Fixed a bug in SE::Yahoo

Fixed bugs in Rank::CMS when choosing an application without a category

[NodeJS] Fixed calculation of scraper code execution time

[JS scrapers] Fixed the content-length header not being sent in POST requests with an empty body

[JS scrapers] Fixed the CloudFlare bypass

[JS scrapers] Fixed work with sessions

[JS scrapers] Fixed overrides for this.parser.request

[JS scrapers] Fixed an error in encoding detection
