I need to extract some additional information from Google Images results and am not sure how to go about it.

On the Google Images results page, each image generates a URL like this:

href="http://www.google.com/imgres?imgurl...BQ&tbm=isch&ved=0CDQQMygCMAI&biw=1366&bih=631"

I need to extract the values of these parameters:

imgurl=
imgrefurl=
tbnid=

And finally, is there a way to extract the file type of the image into a variable as well (jpg, png, etc.)? Something like $filetype?

So for the final result, I would like this stored on each line:

$query;$loop.count;$imgurl;$imgrefurl;$tbnid.$filetype\n
The source code for the pages is contained in the $pages array. After analyzing it, you can see that each picture is represented by a JSON object, which has all the data that you need. Therefore, the task is reduced to scraping these objects and outputting the necessary data. Spoiler: Preset Code: eJxtVE1v2zAM/SsG0aJJEXjoYRf3C2nXbC26uEvaU5QVQswYTmXJk+QshZH/Pkp2 7CZrDo5JkY+Pj7QqsNy8mSeNBq2BaFZB4d8hggSXvBQWBlBwbVC74xlM76Lou1Kp wCi6z3mKhgLa0Arse4GUvCiNVfkETY2g65doRlhNCrcc5u4kxQ0lXCTZuncdBSsj eY6XDGa/GcxPGfSvg4XgxpBLp685Wr47ueoxVoWn14xt+xeMfSGEK2gQn2saJm2L Nx6uNX8np/8fUynyZRZz0wa6PmFV89vO5617pHTOnTCz42AUT+6Gtz8Clxncj4Pi LPQg50wyuzJKBpeBVUqY0Gv3MI3HPRcQetw+hQX0+1Oifg9eg5PzE3oKpYpwoUpp W5dDClW5b+sDO0sObOswGTDiAlTpbvwtOJ537U35Gp8V9bHMBHbuEVmNHkdEEt1p uPQ99/qh3bgx8iTJbKYkF7UYTqpOoBeZUUeULxXFuuYyNCOtcnJZ9AC+452QMzjy tluD0uf+qnMgWnJhcACGqI44EUkOT0hMza3SceH4kL8CJYdCPOIaRRfm8W/KTCS0 v8MlJd03iZ+HxP9hbNv2PpZao/6riUOL4q2b+GeXlahHle7EeEMsWnnGzpMrjS1i A9IUoq+xQJlQZDedYdG59hjvTWDfuVBymaUxcdVZgrvIUj7TJx/LW5UXAl0LshSC JmBw0m3C0DSKO6MjeJh860vsXRZ+7R+mNdVCZ7RpXx3BnET7WLWBXHAhXiaPH0+g 2x6/OcbBLmglU0XL4u4ov00RKO2bHQBuCi4TJH3OtvTNtjdWe69Vn91bUbWlOa3M Ux3sOvXOAZBkhibj4P4BnL/CrQ==
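
For anyone who wants to see the extraction logic spelled out, here is a minimal Python sketch. It is not the preset's code and does not touch the JSON objects mentioned above. It only assumes that an imgres link has the query-string form shown in the question. The function names `parse_imgres` and `format_line` and the example URL are made up for illustration. The file type is taken to be the extension of the path in imgurl, which will be empty when the image URL has no extension.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def parse_imgres(href):
    """Return (imgurl, imgrefurl, tbnid, filetype) from an imgres-style link."""
    # parse_qs splits the query string and percent-decodes the values.
    params = parse_qs(urlsplit(href).query)
    imgurl = params.get("imgurl", [""])[0]
    imgrefurl = params.get("imgrefurl", [""])[0]
    tbnid = params.get("tbnid", [""])[0]
    # Assumption: the file type is the extension of the image URL's path.
    extension = posixpath.splitext(urlsplit(imgurl).path)[1]
    filetype = extension.lstrip(".").lower()
    return imgurl, imgrefurl, tbnid, filetype


def format_line(query, count, href):
    """Build one output line in the layout requested in the question."""
    imgurl, imgrefurl, tbnid, filetype = parse_imgres(href)
    return f"{query};{count};{imgurl};{imgrefurl};{tbnid}.{filetype}\n"


if __name__ == "__main__":
    # Hypothetical link; real result pages may encode or order things differently.
    example = (
        "http://www.google.com/imgres?imgurl=http://example.com/pics/cat.JPG"
        "&imgrefurl=http://example.com/page.html&tbnid=AbC123&tbm=isch"
    )
    print(format_line("cats", 1, example), end="")
```

The same idea carries over to the preset: once imgurl is available, the extension can be cut from the end of its path and written out next to tbnid.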