SE::Google::Cache - Checking the presence of pages in Google cache

Google Cache parser overview

The Google Cache parser checks the presence of a page in the Google cache.

Results can be saved in whatever format and structure you need: the built-in Template Toolkit engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected data

  • Date of page indexing in the cache
  • Date of page indexing in Unix format
  • Presence of the page in the cache
  • Page content without the Google toolbar

Requests

  • Each request is the URL of a page to check, for example:
https://a-parser.com
https://lenta.ru/

Use cases

  • Determining the presence of a page in the Google cache
  • Getting the date of the last Google snapshot
  • Getting the date of the last Google snapshot in Unix format
  • Getting the content of a page that is in the cache

Results

  • By default, the result shows the queried URL, presence in the cache (1 or 0), and the caching date:
https://lenta.ru/: 1 -  25 Dec 2020 10:44:05 GMT
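
A downstream script can split a line in this default format back into its fields. The following is a minimal Python sketch, not part of A-Parser: the helper name `parse_result_line` and the regular expression are illustrative assumptions inferred from the single sample line above. The layout of a line for a page that is absent from the cache is not shown in this document, so the sketch simply assumes the date may be empty in that case.

```python
import re

# Assumed layout, inferred from the sample line: "<url>: <0|1> - <date>"
RESULT_LINE = re.compile(r"^(?P<url>\S+): (?P<exists>[01]) -\s*(?P<date>.*)$")

def parse_result_line(line):
    """Split one default-format result line into its fields (hypothetical helper)."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    return {
        "url": match["url"],
        "cached": match["exists"] == "1",
        # Assumption: the date may be empty when the page is not cached.
        "date": match["date"].strip() or None,
    }

parsed = parse_result_line("https://lenta.ru/: 1 -  25 Dec 2020 10:44:05 GMT")
```

The date is kept as a string here because the default format carries no Unix timestamp; add `timestamp` to the result format if numeric dates are needed.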

Result output options

Output to CSV

Result format:

[% tools.CSVline(query, exists, date, timestamp) %]

Example of result:

https://a-parser.com/wiki/index/,1," 18 Mar 2021 20:05:44 GMT",1616097944
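
To show how such a CSV line might be consumed, here is a short Python sketch that reads the four columns and checks that the human-readable date and the Unix timestamp describe the same moment. The function name `parse_cache_csv_line` and the dictionary keys are hypothetical, and the date layout (`%d %b %Y %H:%M:%S GMT`) is an assumption based on the example above.

```python
import csv
from datetime import datetime, timezone

def parse_cache_csv_line(line):
    """Parse one line produced by the CSV result format (hypothetical helper)."""
    # Column order follows the template: query, exists, date, timestamp.
    query, exists, date, timestamp = next(csv.reader([line]))
    # The date field carries a leading space in the sample, so strip it first.
    snapshot = datetime.strptime(date.strip(), "%d %b %Y %H:%M:%S GMT")
    return {
        "query": query,
        "exists": exists == "1",
        "date": snapshot.replace(tzinfo=timezone.utc),
        "timestamp": int(timestamp),
    }

row = parse_cache_csv_line(
    'https://a-parser.com/wiki/index/,1," 18 Mar 2021 20:05:44 GMT",1616097944'
)
# Both representations should refer to the same instant.
dates_agree = int(row["date"].timestamp()) == row["timestamp"]
```

Using the `csv` module rather than splitting on commas keeps the quoted date field intact.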

Possible settings

Parameter | Default value | Description
Use sessions | | Saves good sessions, allowing faster parsing with fewer errors
Util::ReCaptcha2 preset | default | Determines whether to use Util::ReCaptcha2 to bypass reCAPTCHA
Remove toolbar | | Specifies whether to remove the toolbar from the page