
1.2.1076 - 3 new scrapers. Completing the transition to Node.js. Integration of puppeteer into the build

Discussion in 'News' started by Support Artur, Dec 21, 2020.

    Improvements

    • As part of moving the main built-in scrapers to the new Node.js platform, the scrapers have been completely rewritten and updated:
    • Major improvements from migrating these scrapers to Node.js:
      • performance increased roughly 1.5 times
      • unification of HTTP engine with JavaScript scrapers, unified bypass of CloudFlare
    • Added new scrapers:
    • In HTML::EmailExtractor, added the Skip non-HTML blocks option, which disables collecting email addresses inside script, style, and similar tags
    • In SE::Google::Translate, added new variables:
      • $translit_orig - original text in transliteration
      • $translit_translated - translated text in transliteration
      • $variants.$i.text - a list of translation options for the original text
    • In SE::Bing, updated the list of regions and languages
    • In Social::Instagram::Profile and Social::Instagram::Post, added the ability to collect the number of video views
    • In SE::Yandex::Translate, added the ability to disable the use of sessions
    • In Net::HTTP, added the ability to specify the user-agent for Chrome
    • In Rank::MOZ, fixed the error that occurred when calling the scraper from the JS method this.parser.request()
    • In Rank::CMS, added support for the new apps.json and the ability to use Net::HTTP
    • In Net::Whois, updated support for all zones
    • Added the Exclude from "All" option for proxycheckers and changed the selection logic:
      • "All" - uses all proxies selected for tasks
      • a specific proxychecker - is used even if it is not selected in the task
    • Added support for outdated SSL versions
    • JS scrapers: added the tlsOpts option for this.request(), which passes settings for HTTPS connections
    • JS scrapers: updated Node.js from 14.2.0 to 14.15.0
    • JS scrapers: the puppeteer module is included in the A-Parser build and does not require a separate installation
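
For the tlsOpts option of this.request() listed above, a minimal sketch of how such an options object might be assembled is shown here. The helper name legacyTlsRequestOpts is hypothetical, and it is an assumption that tlsOpts accepts standard Node.js TLS settings such as minVersion and rejectUnauthorized, and that this.request() takes its options as the fourth argument; check the A-Parser documentation before relying on either.

```javascript
// Hypothetical helper (not part of A-Parser): builds an options object
// for this.request() with a tlsOpts section.
// Assumption: tlsOpts fields are standard Node.js TLS options.
function legacyTlsRequestOpts(extra = {}) {
  return {
    ...extra,
    tlsOpts: {
      minVersion: 'TLSv1',        // permit handshakes with outdated servers
      ...(extra.tlsOpts || {}),   // caller-supplied TLS settings take priority
    },
  };
}

// Inside a JS scraper the call might look like this (assumed signature,
// not runnable outside A-Parser):
//   const response = await this.request('GET', url, {}, legacyTlsRequestOpts());

const opts = legacyTlsRequestOpts({ tlsOpts: { rejectUnauthorized: false } });
console.log(opts);
```

Keeping the TLS settings in one helper means a scraper can relax them only for the hosts that need it, rather than for every request.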
    Corrections due to changes in the SERP
    Bug fixes
    • In SE::Yandex, fixed the handling of Extra query string
    • Fixed the regex in HTML::EmailExtractor, which caused errors in some cases
    • Fixed the behavior of SE::Google::KeywordPlanner when a query returns no results
    • Maps::Yandex fixed and migrated to puppeteer
    • Fixed a bug in the priorities for choosing a proxychecker
    • JS scrapers: fixed follow_meta_refresh
    • API: fixed the rawResults parameter
