HTML Link Extractor: only crawl certain directories?

Discussion in 'A-Parser Support Forum' started by scrapefun, Jan 28, 2022.

  1. scrapefun

    scrapefun A-Parser Enterprise License
    A-Parser Enterprise

    Joined:
    Feb 24, 2015
    Messages:
    184
    Likes Received:
    34
    Is it possible to only follow links in certain directories using the Link Extractor?

    I want to scrape the Twitter profile directory and only want to follow links that contain "/i/directory/profiles/" in the URL.

    I still want to collect all the links on those pages, but only follow links that contain that URL format (/i/directory/profiles/).
     
  2. scrapefun

    Never mind, I discovered the "$followlinks.$i.link - Link" filter and have it working as needed now.
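
    For anyone reading later, here is a rough Python sketch of the behavior that filter gives: every link found on a page is still collected, but only links containing the directory substring are queued for following. This is not A-Parser code or its API; the function name split_links and the sample URLs are made up purely for illustration.

```python
# Illustrative only: mimics "collect everything, follow only matching links".
# FOLLOW_MARKER is the substring from the question; split_links is a hypothetical helper.
FOLLOW_MARKER = "/i/directory/profiles/"


def split_links(links, marker=FOLLOW_MARKER):
    """Return (collected, to_follow) for the links found on one page.

    collected: every link, unchanged.
    to_follow: only the links whose URL contains `marker`.
    """
    collected = list(links)
    to_follow = [link for link in collected if marker in link]
    return collected, to_follow


if __name__ == "__main__":
    page_links = [
        "https://twitter.com/i/directory/profiles/a",
        "https://twitter.com/about",
    ]
    collected, to_follow = split_links(page_links)
    print(len(collected), "collected;", len(to_follow), "queued to follow")
```

    In A-Parser itself the equivalent is just the filter on $followlinks.$i.link mentioned above, so no custom code should be needed.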
     
    Support likes this.
