little regex help needed

Discussion in 'A-Parser Support Forum' started by Kamika2917, Dec 26, 2018.

  1. Kamika2917

    Kamika2917 A-Parser Enterprise License
    A-Parser Enterprise

    Joined:
    Dec 25, 2018
    Messages:
    2
    Likes Received:
    0
    Hello, I am fairly new to A-Parser and need a little help. I want to skip duplicate domains while scraping, and I also want to skip some TLDs. By skipping duplicate domains I mean that a domain such as

    support.google.com
    should appear only once and not again, not even as google.com. By skipping TLDs I mean excluding ones like .br, .vn and .tw. I need one more thing as well: I want to extract only those URLs that contain a specific character, such as the = sign or another one.
     
  2. Support Денис

    Support Денис A-Parser Enterprise License
    Staff Member A-Parser Enterprise

    Joined:
    Jun 12, 2017
    Messages:
    511
    Likes Received:
    134
    Hello.
    To solve this problem, you can use filters.
    If the links must contain a specific symbol:
    [IMG]
    If they must not contain it, choose:
    [IMG]
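
Since the screenshots did not survive, here is a minimal Python sketch of the same "contains" / "does not contain" logic, run outside A-Parser. It is not A-Parser's filter syntax; the names `REQUIRED`, `BLOCKED_TLD` and `keep_url` are hypothetical, and it assumes every URL includes a scheme such as `https://`.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns for illustration only; these are not A-Parser settings.
REQUIRED = re.compile(r"=")                                   # URL must contain "="
BLOCKED_TLD = re.compile(r"\.(?:br|vn|tw)$", re.IGNORECASE)   # host ends in .br, .vn or .tw


def keep_url(url: str) -> bool:
    """Return True if the URL contains "=" and its host is not on a blocked TLD."""
    # Assumes the URL has a scheme; without one, hostname is None and no TLD is checked.
    host = urlsplit(url).hostname or ""
    if BLOCKED_TLD.search(host):
        return False
    return bool(REQUIRED.search(url))


urls = [
    "https://example.com/page?id=1",
    "https://example.com.br/page?id=1",
    "https://example.com/about",
]
kept = [u for u in urls if keep_url(u)]
```

The TLD pattern is anchored to the end of the host name, so `.br` inside a path or query string does not cause a URL to be dropped.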
     
  3. Kamika2917

    Thanks, lovely. What about a filter for subdomains, or for skipping duplicate domains while scraping?
     
  4. Support Денис

    To separate domains from subdomains, you can use the Results Builder.
    [IMG]
    To avoid duplicates, use unique
    [IMG]
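
The two steps above (reduce each link to its domain, then keep only unique ones) can be sketched in Python as follows. This is an illustration outside A-Parser, not how the Results Builder is configured. The helpers `base_domain` and `unique_by_domain` are hypothetical, and taking the last two labels of the host is a simplifying assumption that fails for suffixes such as `.co.uk`; a Public Suffix List lookup would be needed for those.

```python
from urllib.parse import urlsplit


def base_domain(url: str) -> str:
    """Reduce a URL to its last two host labels, e.g. support.google.com -> google.com."""
    # Assumption: the registrable domain is always the last two labels.
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def unique_by_domain(urls):
    """Keep the first URL seen for each base domain, preserving input order."""
    seen = set()
    result = []
    for url in urls:
        domain = base_domain(url)
        if domain and domain not in seen:
            seen.add(domain)
            result.append(url)
    return result


links = [
    "https://support.google.com/a",
    "https://google.com/b",
    "https://example.org/?x=1",
    "https://www.example.org/",
]
deduped = unique_by_domain(links)
```

With this approach `support.google.com` and `google.com` count as the same domain, so only the first of the two is kept, which matches the behaviour asked for in the first post.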
     
