Scraping Google Shopping

Discussion in 'A-Parser Support Forum' started by scrapefun, Jul 6, 2015.

  1. scrapefun

    scrapefun A-Parser Enterprise License
    A-Parser Enterprise

    Joined:
    Feb 24, 2015
    Messages:
    184
    Likes Received:
    34
    I am trying to scrape Google Shopping, but with a bit of a twist that has two main parts.

    The First Part:
    In addition to the shopping results, I also need to scrape the values from the left-hand navigation. Unfortunately, that navigation is dynamic and changes with the search query, so the side navigation values for "tv" are different from those for "mobile phone".

    Mobile Phone:
    https://www.google.co.uk/webhp?hl=en&gws_rd=ssl#hl=en-GB&tbm=shop&q=mobile+phone

    Tv:
    https://www.google.co.uk/webhp?hl=en&gws_rd=ssl#hl=en-GB&tbm=shop&q=tv

    The navigation groups are always under "<div class="sr__group">" elements, with each new group starting after the "<div class="sr__title">" element, and for all of them the data is under "<span class="sr__link-text">".

    On the surface it looks fairly routine, because the divs involved stay the same from search to search. But because the options and groups change so often, I am stuck on how to express this hierarchy in a regex that can parse and store the different elements while staying flexible enough to work with the changing navigation groups.
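
    The hierarchy described above (an sr__group container, one sr__title per group, any number of sr__link-text spans) can be handled in two passes instead of one large regex: first cut the HTML into one chunk per group, then pull the title and link texts out of each chunk. Below is a minimal Python sketch of that idea. It assumes the rendered HTML is already available as a string. The sample markup and the name parse_navigation are hypothetical; only the three class names come from the post.

```python
import re

# Hypothetical sample markup; only the class names come from the post.
SAMPLE_HTML = """
<div class="sr__group">
  <div class="sr__title">Brand</div>
  <a href="#"><span class="sr__link-text">Samsung</span></a>
  <a href="#"><span class="sr__link-text">Sony</span></a>
</div>
<div class="sr__group">
  <div class="sr__title">Screen size</div>
  <a href="#"><span class="sr__link-text">32 inches</span></a>
</div>
"""

GROUP_START = re.compile(r'<div class="sr__group">')
TITLE = re.compile(r'<div class="sr__title">(.*?)</div>', re.S)
LINK_TEXT = re.compile(r'<span class="sr__link-text">(.*?)</span>', re.S)


def parse_navigation(html):
    """Return {group title: [link texts]} for however many groups appear."""
    groups = {}
    # Splitting on the group opener leaves one chunk per group;
    # the first piece is whatever precedes the first group, so skip it.
    for chunk in GROUP_START.split(html)[1:]:
        title = TITLE.search(chunk)
        if title is None:
            continue  # a group without a title is ignored in this sketch
        values = [text.strip() for text in LINK_TEXT.findall(chunk)]
        groups[title.group(1).strip()] = values
    return groups


nav = parse_navigation(SAMPLE_HTML)
```

    Because the group names are read from the page and not hard-coded, the same two patterns cover "tv", "mobile phone", or any other query, as long as the class names stay the same.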

    Not sure if I am making any sense :)

    Part 2:
    For the shopping results I need to parse the data highlighted in this screenshot:

    shopping_parse.png
    Thanks for the help in advance. I am just horrible with regex. But maybe it will help some others out as well.
     
  2. Support

    Support Administrator
    Staff Member A-Parser Enterprise

    Joined:
    Mar 16, 2012
    Messages:
    4,547
    Likes Received:
    2,164
    This page is almost completely generated by JS, so parsing it with A-Parser will not work.
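
    One visible hint of this in the URLs from the first post: the search parameters (tbm=shop, q=tv) sit after the "#", in the URL fragment, and a client does not send the fragment to the server. The short Python sketch below uses only the standard library to split one of those URLs and show which part would actually be requested. It illustrates the fragment only and makes no claim about how the page renders its results.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the first post.
url = "https://www.google.co.uk/webhp?hl=en&gws_rd=ssl#hl=en-GB&tbm=shop&q=tv"
parts = urlsplit(url)

# What an HTTP client requests: path plus query string, without the fragment.
requested = parts.path + "?" + parts.query

# The shopping search parameters live only in the fragment,
# which stays on the client side.
fragment_params = parse_qs(parts.fragment)
```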
     
  3. scrapefun

    Oops, sorry, I should have checked that before posting my question. Thanks for the help.
     