
Parse custom results

Jul 2, 2015
  • A-Parser can process any result with a regular expression; the Parse custom result option is used for this purpose.
    Main uses of this option:
    • Searching for phone numbers, emails and other information in snippets and in the text of sites
    • Extracting a certain part of a link
    • Searching for arbitrary information in an HTML document using the Net::HTTP parser
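    The first two uses can be sketched outside A-Parser with an ordinary regular expression library. This is a minimal illustration in Python, not A-Parser's own engine; the snippet text, the link and both patterns are assumptions made up for the example, and the email pattern is deliberately simple.

```python
import re

# Hypothetical snippet text, standing in for a search result snippet.
snippet = "Contact sales@example.com or support@example.org, phone +1 202 555 0143."

# Deliberately simple email pattern (an assumption, not an A-Parser preset).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Find every email address in the snippet.
emails = EMAIL_RE.findall(snippet)

# Hypothetical link; the capturing group keeps only the host part.
link = "https://example.com/catalog/item-42?ref=search"
DOMAIN_RE = re.compile(r"^https?://([^/]+)")

match = DOMAIN_RE.search(link)
domain = match.group(1) if match else None

print(emails)
print(domain)
```

    In A-Parser the same patterns would go into the RegEx field without delimiters, with the capturing group saved under a name of your choice.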
    How it works:
    • In Parse result, select the parser result to process; it can be a simple result or an array
    • The regular expression is specified without delimiters, and flags can be specified after it. The regular expression syntax and the supported flags are described in the article Use of the Regular Expressions
    • In Result type, select the result type: Flat (simple result) or Array (array of results). If an array is selected as the initial result, or the regular expression uses the g flag, the result is always saved to an array. The array name is entered in the Name field
    • Each capturing group of the regular expression can be saved as a separate element. The element name is entered in the corresponding field ($1 to, $2 to, ...), where the digit is the number of the capturing group
      Template Toolkit can be used in the RegEx field, which makes it possible to use the request as part of the regular expression
    The new results can be used in result formatting, in Results builder, in filters and result uniqueness, or in a subsequent Parse custom result option
    This option is similar to Results builder with RegEx Match
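    The Flat/Array rule and the naming of capturing groups described above can be emulated roughly as follows. This is a sketch in Python, not A-Parser code: the function parse_custom_result, its parameters and the sample data are hypothetical, and applying the expression to each element of an array source is an assumption about how array input is handled.

```python
import re

def parse_custom_result(source, pattern, names, flags=""):
    """Hypothetical emulation of the option; names[0] names $1, names[1] names $2, ..."""
    re_flags = 0
    if "i" in flags:
        re_flags |= re.IGNORECASE  # i: case-insensitive matching
    if "s" in flags:
        re_flags |= re.DOTALL      # s: dot also matches line breaks
    regex = re.compile(pattern, re_flags)

    def to_item(match):
        # Save each capturing group under the name given for it.
        return {name: match.group(i) for i, name in enumerate(names, start=1)}

    is_array = isinstance(source, list)
    items = []
    for text in (source if is_array else [source]):
        if "g" in flags:
            # g: collect every match in the text.
            items.extend(to_item(m) for m in regex.finditer(text))
        else:
            m = regex.search(text)
            if m:
                items.append(to_item(m))

    # An array source or the g flag always yields an array result.
    if is_array or "g" in flags:
        return items
    return items[0] if items else None

names = ["key", "value"]
print(parse_custom_result("apple=3; pear=5", r"(\w+)=(\d+)", names, "g"))
print(parse_custom_result("apple=3; pear=5", r"(\w+)=(\d+)", names))
```

    The first call returns a list with one item per match; the second, without g and with a simple source, returns a single flat item.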
    Example: parsing image links from the source HTML code

    • Use the Net::HTTP parser to obtain the source code of the page
    • Apply a regular expression with the flags isg to $data (the downloaded page) and save the result to the array images, in the element src
    • In the result format, specify that all src elements are output separated by line breaks
    • For the request http://a-parser.com/, the result file will contain the following list: