1.2.912 - NodeJS update, performance improvement, adaptation to changes in recaptchas

Discussion in 'News' started by Support Artur, Jun 11, 2020.


    We have completed the transition to NodeJS as the main engine for scrapers and present the new stable version 1.2.912 with support for NodeJS 14.2.0. This update brings many improvements, including higher performance, lower memory consumption, a completely new network stack, and support for native NodeJS modules, which lets you use the full range of packages from the npmjs catalog in A-Parser.

    This update also changes how the Google scraper works with ReCaptcha2. Our team was among the first to find a way to get past the new version of the recaptcha, and we tested it together with the RuCaptcha service, to whom we owe special thanks. So far, correct captcha solving has been confirmed with RuCaptcha, Anti-Captcha, XEvil and CapMonster.

    In addition, many optimizations were made in the A-Parser core, and performance is significantly higher when running a large number of tasks or using large proxy lists. The Rank::CMS scraper has been completely rewritten and stabilized, with added support for the new apps.json format and for user rules.

    Improvements
    • NodeJS updated to v14.0.0, v8 to 8.1
    • Added support for the data-s parameter in recaptchas for SE::Google, and added the ReCaptcha2 pass proxy option
    • Increased thread limit to 10,000 for Windows OS
    • Significantly improved performance with a large number of active proxies and/or jobs: the proxy handling stack was completely rewritten and work with large lists was optimized
    • Added new scraper Rank::KeysSo
    • Completely rewritten in JS: Rank::Archive
    • Improved performance when using regular expressions, as well as improved compatibility
    • In SE::Google::KeywordPlanner added automatic token retrieval
    • In SE::Bing added the ability to scrape links to cached pages, as well as the ability to scrape mobile results
    • In the scraper Util::ReCaptcha2, specifying the Provider url is now optional when the provider is Capmonster or Xevil
    • In SE::Google::Trends added the ability to specify an arbitrary date range
    • In Rank::CMS added a choice of regular expression engine and support for a custom rules file
    • In SE::Yandex::ByImage added the option Don't scrape if no other sizes, which disables the collection of results when the desired image is not available in other sizes
    • [NodeJS] Fixed this.cookies.getAll()
    • [NodeJS] Added protection against endless loops and long-running regular expressions
    • [JS scrapers] Added follow_meta_refresh option for this.request
    • [JS scrapers] Added bypass_cloudflare option for this.request
    • [JS scrapers] Underscore replaced by Lodash
    • [JS scrapers] Added a mark in the log when calling other scrapers
    • [JS scrapers] The previous proxy is now reused after a request to another scraper
    • [JS scrapers] Added destroy() method
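
The follow_meta_refresh option in the list above concerns pages that redirect through a <meta http-equiv="refresh"> tag instead of an HTTP 3xx status. As a rough illustration of what following such a redirect involves, here is a minimal sketch in plain NodeJS. It is not A-Parser's implementation: findMetaRefreshTarget is a hypothetical helper name, and the regex-based parsing is a simplification that a real HTML parser would handle more robustly.

```javascript
// Hypothetical helper, not part of the A-Parser API: finds the target of a
// <meta http-equiv="refresh"> tag and resolves it against the page URL.
function findMetaRefreshTarget(html, baseUrl) {
  // Locate a <meta> tag whose http-equiv attribute is "refresh".
  const tag = html.match(/<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i);
  if (!tag) return null;

  // The content attribute looks like "5; url=/next" (delay, then target).
  const content = tag[0].match(/content\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
  if (!content) return null;
  const value = content[1] !== undefined ? content[1] : content[2];

  // A bare delay with no url= part only reloads the same page.
  const target = value.match(/^\s*\d+\s*[;,]\s*url\s*=\s*(.+?)\s*$/i);
  if (!target) return null;

  // Strip optional quotes around the URL, then resolve relative paths.
  const raw = target[1].replace(/^['"]|['"]$/g, '');
  return new URL(raw, baseUrl).href;
}

const page = '<html><head><meta http-equiv="refresh" content="0; url=/landing?x=1"></head></html>';
console.log(findMetaRefreshTarget(page, 'https://example.com/start'));
// prints "https://example.com/landing?x=1"
```

A client that follows meta refresh would request the returned URL next, ideally with a cap on the number of hops so that a page refreshing to itself cannot loop forever.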
    Fixes related to changes in search results
    Corrections
    • Fixed a bug due to which the selected proxy checker was ignored
    • Fixed work of Decode HTML entities and Extract domain functions in Result Constructor
    • Fixed problem with encoding detection
    • Fixed error using $tools.query
    • Fixed a bug in Rank::MajesticSEO where all retry attempts were used up when there were no results
    • Fixed work of http2
    • Fixed a scraper crash that occurred when alive.txt could not be written
    • Fixed captcha handling in SE::Yandex::Register and Check::RosKomNadzor
    • Fixed the difference between requests sent via Net::HTTP and via JS
    • Fixed a bug in SE::Yahoo
    • Fixed bugs in Rank::CMS when choosing an application without a category
    • [NodeJS] Fixed calculation of scraper code execution time
    • [JS scrapers] Fixed the content-length header not being sent in POST requests with an empty body
    • [JS scrapers] Fixed work of CloudFlare bypass
    • [JS scrapers] Fixed work with sessions
    • [JS scrapers] Fixed work with overrides for this.parser.request
    • [JS scrapers] Fixed an error in encoding detection
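
The content-length fix in the list above matters because some servers reject or stall on a POST that carries neither a body length nor a transfer encoding, and HTTP clients normally send content-length: 0 in that case. The sketch below shows the idea in plain NodeJS. It is an illustration only: buildPostHeaders is a hypothetical helper, not A-Parser code.

```javascript
// Hypothetical helper, not A-Parser code: builds headers for a POST request
// and always includes content-length, even when the body is empty.
function buildPostHeaders(body, extraHeaders = {}) {
  // Treat a missing body (null or undefined) as an empty string.
  const payload = body == null ? '' : String(body);
  return {
    ...extraHeaders,
    // byteLength counts bytes, not characters, so multi-byte text is sized correctly.
    'content-length': String(Buffer.byteLength(payload, 'utf8')),
  };
}

console.log(buildPostHeaders('')['content-length']); // prints "0"
```

Measuring the length in bytes rather than characters keeps the header correct for non-ASCII form data.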