Regex Only Extracting First Header

Discussion in 'A-Parser Support Forum' started by scrapefun, May 30, 2022.

  1. scrapefun

    scrapefun A-Parser Enterprise License
    A-Parser Enterprise

    Joined:
    Feb 24, 2015
    Messages:
    184
    Likes Received:
    34
    I'm trying to extract all of the H2 headings from a list of URLs, but only the first H2 is extracted instead of all of them.

    Is my regex incorrect? Here it is:
    <h2.*>([^<>]*)</h2>

    eJylVm1T4zYQ/isZDTfAFefFcA1nOCjHDNN2OEJJrl+SlFHsja0iSz5JzsuF/Peu
    bMeO04Nh2g+ZWKvdZx+tnl17RQzVT/pegQajiTdckSR7Jh7p+xHElByRhCoNyu4O
    yR0Yz/t1MLhHewBTmnJDjlbELBPAEDkDpVgAuMkCXE+liqlBwMyNzChPrdvetxTU
    8mz4rqEjmSRMhM1206fKND59auzvNy4b7YbX6DTejc+Gezqj0cyhDvb3Amro2f6h
    3YqABhjsVpuRa7dGI0HWFSs/1UbGDxsWBR1viAcLQduDICQZ250QFhgwIr/Y0BHx
    RiN9OSIHzZ8uD0foWLgMclymwxKuMFGl6BKN2f8dja1Nb+qYe9o6FhnXL1cuUXKx
    VGAUyxhuKtdpk1eCFBWBjNl3eDRcP06xNKASxcRW7Y1K4RWEmC4eNQLUUp6cfuj+
    3H41c2RM4lZBU8r1a2n0E0sGt/0/QbHp8s3kQimDaxlskRu67fb4/9x06+DSs8J7
    lioA9TzBZgDzbJTkHJa4DJ83CrXPeJH+k0zN4Y4Q3qSCAqeuA5ubvOEEOfGK93nk
    Nt9fHAz/Or8Yvz88b0XuxX8T56aF6rQiF0mNx6XtJusv27lJp1kMiHKzT2cwkLbb
    GYfKfIOrIontWbC7m0Y9bJqFRaBBwAyTgvI8g01eZf0q2LdMgkKir50Z2Aw3SsZo
    MpABZINkw25YDBZ7wWkW+0ceUwjyiGikekORSLC7wwwoaqTqJZYP2ldEiivOb2EG
    vHLL8D+njKNa9NUUg34rAn/s0vsXxro83nYqVPpcIYcSJVt97n2pogJ5K8NNMTiL
    mcG1vpapsBfTRuMTQFLW7M66xVJBmaZALrLjyE9ABNvq1VdJZaodo3YtdaMvxZSF
    vaJVN56pGGAr9cS1jBMO9lwi5RyvRcNDJY8rXVyDXVQEd4OvsxS2CTYvHGKk5Pr3
    fk4VRxzK74MlGGMlt7MWkD7l/OvD7fYOqSRVzC/ttVrz+bxpwI9oMGNaqqYv45aA
    uW4ZOsE43YqZr6SWU+PoVE2pDw5Oauejc3zacY+7bmskdqEUDegW0EsA84gaZw7O
    nArjGOlogDrWnOFwn2sfhFGUZ4AVVgwqxDauYS4cfB3sZMF68B2KCicOBApmDOlV
    POtxWGJ8VsCBanBsOzs6AV87J67bwV8dcxFQJ7CilwnqJMOswe1WScY6TFE9mafE
    KYmCfbFQJgInTUIsK2iHvVyvMj5DBeGk5aHqjt8DAbkTDmPmc/hBaiYct+26+S0F
    srwoW+EtGyywKGbneDChywx+0vpSAvcL4PvsTAjfaef/fV8Bkh0Ueut0O/j6bU3E
    Y7fbaR+ffDzFbxsraAOhxFGH7bYel59o5XfcautDzVutcXz8re9zH9tr1gNt2LQa
    S028zvofV/18gQ==
     
  2. Support

    Support Administrator
    Staff Member A-Parser Enterprise

    Joined:
    Mar 16, 2012
    Messages:
    4,557
    Likes Received:
    2,167
    Yes. Use this corrected regex:
    Code:
    <h2[^>]*>([^<]+)</h2>
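
A minimal sketch of why the two patterns can behave differently, assuming the headings sit on a single line (as in minified HTML). Python's `re` module is used only for illustration, and the sample `html` string is hypothetical; A-Parser's own regex engine and options may differ in details. In the original pattern, the greedy `.*` can run past the end of the first opening tag and swallow later headings, while `[^>]*` stops at the first `>`, so each `<h2>` is matched separately.

```python
import re

# Hypothetical sample input: three H2 headings on one line.
html = '<h2 class="a">First</h2><p>text</p><h2>Second</h2><h2 id="x">Third</h2>'

# Original pattern: greedy ".*" may consume everything up to the last usable ">".
original = re.compile(r'<h2.*>([^<>]*)</h2>')

# Corrected pattern from the reply: "[^>]*" cannot leave the opening tag.
corrected = re.compile(r'<h2[^>]*>([^<]+)</h2>')

original_matches = original.findall(html)
corrected_matches = corrected.findall(html)

print(original_matches)   # ['Third']
print(corrected_matches)  # ['First', 'Second', 'Third']
```

On this sample the original pattern yields a single match because the whole line is consumed by one overly long match; the corrected pattern yields one match per heading.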
     
  3. scrapefun

    Works perfectly. Thanks.
     