SE::Google::Compromised - Checking for the presence of the "This site may be hacked" notice in Google

Overview of the scraper

The Google Compromised scraper allows you to check for the presence of the This site may be hacked notice in Google search results. With the Google Compromised scraper, you can check your own domain databases for the notice. You can learn more about this notice in Google Search Help.

A-Parser functionality allows you to save scraping settings for future use (presets), set a scraping schedule, and much more.

Saving results is possible in the form and structure you need, thanks to the built-in powerful template engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Data collected

  • Checking for the presence of the This site may be hacked notice in Google

Data collected by the SE::Google::Compromised scraper

Capabilities

  • Supports all the features of the SE::Google scraper.

Use cases

  • Checking a list of domains for the presence of the This site may be hacked notice in Google
  • Monitoring your own domains

Queries

As queries, specify the URLs of the sites you want to check, for example:

http://a-parser.com/  
http://www.yandex.ru/
http://google.com/
http://russbehnke.com/
http://www.bmlaroca.cat/
http://vk.com/
http://facebook.com/
http://youtube.com/

Query substitutions

You can use built-in macros to automatically substitute subqueries from files. For example, to check a list of sites against a database of keywords, specify several main queries:

ria.ru
lenta.ru
rbc.ru
yandex.ru

In the query format, specify a macro that substitutes additional words from the Keywords.txt file. This method lets you check a database of sites against a database of keywords and get positions as a result:

$query {subs:Keywords}

For each original search query, this macro creates one additional query per line in the file, so in total the macro produces [number of original queries (domains)] x [number of queries in the Keywords file] = [total number of queries].
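The multiplication described above can be sketched in plain Python (this is an illustration of the expansion logic, not A-Parser's internal implementation; the keyword list is a stand-in for the contents of Keywords.txt):

```python
# Sketch of how the {subs:Keywords} macro multiplies queries:
# every original query is paired with every keyword, producing
# len(queries) * len(keywords) final queries.
from itertools import product

queries = ["ria.ru", "lenta.ru", "rbc.ru", "yandex.ru"]
keywords = ["news", "sport", "weather"]  # stand-in for Keywords.txt

expanded = [f"{q} {kw}" for q, kw in product(queries, keywords)]

print(len(expanded))  # 4 original queries x 3 keywords = 12 queries
```

With 4 domains and a 3-line keyword file, the macro yields 12 queries in total.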

You can also specify the protocol in the query format so that you can use only domains as queries:

http://$query 

This format will prepend http:// to each query.

Examples of output results

A-Parser supports flexible result formatting thanks to the built-in template engine Template Toolkit, which allows you to output results in any form, as well as in structured formats such as CSV or JSON.

Export of the list for checking the presence of the notice

Result format:

$query: $compromised\n

An example of a result showing the URL and whether it has the This site may be hacked notice in Google:

http://a-parser.com/: 0
http://www.bmlaroca.cat/: 1
http://russbehnke.com/: 0
http://www.yandex.ru/: 0
http://google.com/: 0
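Output in the `$query: $compromised` format above is easy to post-process. A hypothetical sketch in Python, assuming results were saved to a text file in exactly that format, that collects only the URLs flagged with the notice (value 1):

```python
# Parse "$query: $compromised" lines and keep flagged URLs.
results_text = """\
http://a-parser.com/: 0
http://www.bmlaroca.cat/: 1
http://russbehnke.com/: 0
"""

flagged = []
for line in results_text.splitlines():
    # rpartition splits on the last ": ", so URLs containing colons are safe
    url, _, value = line.rpartition(": ")
    if value == "1":
        flagged.append(url)

print(flagged)  # ['http://www.bmlaroca.cat/']
```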


Keyword competition

Similar to SE::Google.

Checking link indexing

Similar to SE::Google.

Saving in SQL format

Similar to SE::Google.

Dumping results to JSON

Similar to SE::Google.

Results processing

A-Parser allows processing results directly during scraping. In this section, we have collected the most popular use cases for the SE::Google::Compromised scraper.

Saving domains with validation value "1"

Add a filter and select the $compromised - Is site compromised variable from the dropdown list. Choose the type String equals, then enter the value 1 in the String field. With this filter, you can discard all results with unwanted values.
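The effect of the String equals filter can be mimicked in plain Python (a sketch of the filtering logic only; the row dictionaries are illustrative, not A-Parser's data model):

```python
# Keep a result only when its $compromised value equals the string "1".
rows = [
    {"query": "http://a-parser.com/", "compromised": "0"},
    {"query": "http://www.bmlaroca.cat/", "compromised": "1"},
    {"query": "http://russbehnke.com/", "compromised": "0"},
]

kept = [r for r in rows if r["compromised"] == "1"]
print([r["query"] for r in kept])  # ['http://www.bmlaroca.cat/']
```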

Example of filtering
Download example

How to import example into A-Parser

eJx1VE1z2jAQ/SseTQ7tDDFw6MU3woROOzSkITkBB2GtiYosCUnmYzz8965kYxta
btZ+vH37dtclcdRu7asBC86SZFESHb5JQubPSfJdqY2AJBmrXBuVcwsseowmXDgw
ERxprgWQHtHUWDA+f3EvDaMYZLQQjvRK4k4asITagzGceQjO8J0pk1OHBEIY2VNR
+LCHXQHmlEQPaYu3XEpyvo+EYceTAWc42A7ScDAYdNOy0AkG1DUTkl5RrvzvVTDs
Ciq6YPittONK4sOCtOS8Wl2g7CT04tnrYVyL2jjndA/vqiIArRmFhReah54ZdeC9
cSXKl6+xO3oEyhj3NamoKnjV26ofku8COakw1uuGAkywITQ5CABBzAu7RS0uQYgi
5P6uckiSUWGhRyxSnVAkwm49HKWhTplZ0ADtJVFyJMQU9iDasID/VHDBcEVGGSb9
qBP/HzL7B+PctNcthRM/GOTQoITX0+xXm8XUVG2wc7bGvgXPucO3HatC+sEM0LgF
0I1mL16zXBloytTIdXU8Ew3Sr1c7spFuTVdtXI3l2pgqmfHNrF7ZS2Qh3/EWZ9If
jQDflyyEwLFYeGvXY2TrMfhHS/A2eRxK+NYvR0ecUsL+nFdUteG4ft88wRyV7Fat
IVMqxMfbtOsh7Urh49M5nfT79LG6/hgvpx9FS1nbD4dDfKKSwTE2RdexCT+H23BT
WLuGT7mtPFcw61xQo1Iap9T1iafmYKNwaVG486r5/TR/sfLuTygpz7gWf+xrleE1
9PFow2HYcMnD81/p/MfQ
Tip: see also Results filters.


Extracting domains

Same as in SE::Google.

Removing tags from anchors and snippets

Same as in SE::Google.


Possible settings

Supports all settings of the SE::Google scraper, plus the following:

  • Pages count (default: 1): number of search results pages to scrape (from 1 to 10)
  • Links per page (default: 10): number of links in the search results on each page (from 10 to 100)
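A hedged sketch of validating the documented ranges of these two settings before launching a task (the function and parameter names here are illustrative, not A-Parser's internal identifiers):

```python
# Validate the documented ranges of the two extra settings.
def validate_settings(pages_count: int = 1, links_per_page: int = 10) -> dict:
    if not 1 <= pages_count <= 10:
        raise ValueError("Pages count must be between 1 and 10")
    if not 10 <= links_per_page <= 100:
        raise ValueError("Links per page must be between 10 and 100")
    return {"pages_count": pages_count, "links_per_page": links_per_page}

print(validate_settings())  # defaults pass: {'pages_count': 1, 'links_per_page': 10}
```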