SE::Google::SafeBrowsing - Domain Check in Google Blacklist

Overview of the scraper

The Google Safe Browsing scraper checks whether a domain is included in Google's blacklist of suspicious sites, so you can run your own domain databases against that blacklist. You can learn more about this blacklist notice in Google Search Help.

A-Parser's functionality allows you to save scraping settings for later use (presets), schedule scraping, and much more.

Thanks to A-Parser's multi-threaded operation, the request processing speed can reach 3800-4000 requests per minute.

Results can be saved in whatever form and structure you need: the powerful built-in templating engine, Template Toolkit, lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected data

  • Inclusion in the list of suspicious sites

Use cases

  • Checking a list of domains for inclusion in the Google blacklist
  • Monitoring your domains for inclusion in the Google blacklist

Queries

Each query must be the URL of the site you want to check, for example:

http://a-parser.com/
http://www.yandex.ru/
http://facebook.com/
http://youtube.com/
http://perfect-soft.net/
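
If your source list contains bare domains rather than URLs, a small script can turn them into query lines in the form shown above. This is a minimal sketch, assuming a domain-level check; the helper name `to_query` and the sample list are hypothetical and not part of A-Parser.

```python
from urllib.parse import urlsplit


def to_query(domain: str) -> str:
    """Turn a bare domain or a URL into an http://host/ style query line."""
    value = domain.strip()
    # Assumption: entries without a scheme are plain HTTP sites.
    if "://" not in value:
        value = "http://" + value
    parts = urlsplit(value)
    # Keep only scheme and host; any path is dropped for a domain-level check.
    return f"{parts.scheme}://{parts.netloc}/"


domains = ["a-parser.com", "www.yandex.ru", "http://facebook.com/"]
queries = [to_query(d) for d in domains]
print("\n".join(queries))
```

The resulting lines can be saved to a file and used as the query list for the task.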

Examples of output results

A-Parser supports flexible result formatting thanks to the built-in templating engine, Template Toolkit, which can output results in an arbitrary form as well as in a structured form such as CSV or JSON.

Exporting the blacklist check results

Result format:

$query: $exists\n

An example result showing the URL and whether it is in the Google blacklist:

http://youtube.com/: 0
http://www.yandex.ru/: 0
http://a-parser.com/: 0
http://perfect-soft.net: 1
http://facebook.com/: 0
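
A result file in this format is easy to post-process outside A-Parser. The following is a minimal sketch, assuming each line has the `$query: $exists` shape shown above; the function name `parse_results` is hypothetical. Because the URL itself contains a colon, the line is split on the last `": "` only.

```python
def parse_results(text: str) -> dict[str, bool]:
    """Map each checked URL to True if it is listed in the blacklist."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ": " so the "http://" colon stays inside the URL.
        url, flag = line.rsplit(": ", 1)
        results[url] = flag == "1"
    return results


sample = """http://youtube.com/: 0
http://www.yandex.ru/: 0
http://a-parser.com/: 0
http://perfect-soft.net: 1
http://facebook.com/: 0
"""

listed = [url for url, hit in parse_results(sample).items() if hit]
print(listed)
```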

Outputting results to a CSV table

The built-in utility $tools.CSVline allows you to create correct tabular documents ready for import into Excel or Google Sheets.

General result format:

[% tools.CSVline(query.orig,p1.exists) %]

File name:

$datefile.format().csv

Initial text:

Site,Check Result

Result example:

Site,Check Result
http://youtube.com/,0
http://www.yandex.ru/,0
http://a-parser.com/,0
http://perfect-soft.net,1
http://facebook.com/,0

Tip

The Template Toolkit templating engine is used in the General Result Format to output each query together with its blacklist check result.

In the result file name, you simply need to change the file extension to csv.

For the "Initial text" option to be available in the Task Editor, you need to activate "More options". In "Initial text", we write the column names separated by commas and leave the second line blank.
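
Besides Excel or Google Sheets, the CSV file can be read by a script. This is a minimal sketch, assuming the header row `Site,Check Result` from the "Initial text" above; the sample data is embedded as a string here, whereas in practice you would open the result file instead.

```python
import csv
import io

sample = """Site,Check Result
http://youtube.com/,0
http://www.yandex.ru/,0
http://a-parser.com/,0
http://perfect-soft.net,1
http://facebook.com/,0
"""

# DictReader uses the first row (the "Initial text") as column names.
reader = csv.DictReader(io.StringIO(sample))
listed = [row["Site"] for row in reader if row["Check Result"] == "1"]
print(listed)
```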

Saving in SQL format

Result format:

[% "INSERT INTO volumes VALUES('" _ query.query _ "', '" _ exists _ "')\n" %]

Example result:

INSERT INTO volumes VALUES('http://www.yandex.ru/', '0')
INSERT INTO volumes VALUES('http://a-parser.com/', '0')
INSERT INTO volumes VALUES('http://perfect-soft.net', '1')
INSERT INTO volumes VALUES('http://facebook.com/', '0')
INSERT INTO volumes VALUES('http://youtube.com/', '0')
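
To illustrate how such statements might be loaded, here is a minimal sketch using an in-memory SQLite database. The table name `volumes` follows the result format above; the column names `url` and `listed` and the choice of SQLite are assumptions. The generated lines carry no trailing semicolons, so each line is executed as a separate statement.

```python
import sqlite3

statements = """INSERT INTO volumes VALUES('http://www.yandex.ru/', '0')
INSERT INTO volumes VALUES('http://a-parser.com/', '0')
INSERT INTO volumes VALUES('http://perfect-soft.net', '1')
"""

conn = sqlite3.connect(":memory:")
# Assumed schema: two text columns matching the two inserted values.
conn.execute("CREATE TABLE volumes (url TEXT, listed TEXT)")

for line in statements.splitlines():
    if line.strip():
        conn.execute(line)  # one INSERT per line

rows = conn.execute("SELECT url FROM volumes WHERE listed = '1'").fetchall()
print(rows)
```

Note that the template builds the SQL by plain string concatenation, so a query containing a single quote would produce an invalid statement; this sketch assumes the URLs contain none.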

Dump results to JSON

General result format:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.query = p1.query.orig;
obj.exists = p1.exists;

obj.json %]

Initial text:

[

Final text:

]

Example result:

[{"query":"http://www.yandex.ru/","exists":"0"},
{"query":"http://youtube.com/","exists":"0"},
{"query":"http://facebook.com/","exists":"0"},
{"query":"http://a-parser.com/","exists":"0"},
{"query":"http://perfect-soft.net","exists":"1"}]

Tip

For the "Initial text" and "Final text" options to be available in the Task Editor, you need to activate "More options".
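
The JSON dump can be consumed by any JSON parser. This is a minimal sketch, assuming the array layout shown in the example result; note that `exists` arrives as the string "0" or "1", not as a number.

```python
import json

sample = """[{"query":"http://www.yandex.ru/","exists":"0"},
{"query":"http://youtube.com/","exists":"0"},
{"query":"http://facebook.com/","exists":"0"},
{"query":"http://a-parser.com/","exists":"0"},
{"query":"http://perfect-soft.net","exists":"1"}]"""

records = json.loads(sample)
# Compare against the string "1", since the values are quoted in the output.
listed = [r["query"] for r in records if r["exists"] == "1"]
print(listed)
```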

Results processing

A-Parser allows you to process results directly during scraping. This section covers the most popular use cases for the SE::Google::SafeBrowsing scraper.

Saving domains with a check value of "1"

Add a filter and, from the drop-down list, select the check value variable $exists - Listed as suspicious. Set the type to String equals, then enter the value 1 in the String field. With this filter, only results with a check value of 1 are saved, and all results with an unwanted value are removed.

Download example

How to import the example into A-Parser

eJx1VEuP2jAQ/ivI4tBKEMqhl9wAlaoVXbYLe2I5mGTCujger+3wUJT/3rETEth2
b57H983bJXPcHuyjAQvOsnhTMh3eLGarb3H8HXEvIY5XPIOpwZMVat8b9uZCgunB
medaAhswzY0F4/GbD2HklkLGC+nYoGTuooFi4BGMEannECnJySskBxKOXBbenmLO
hWJVh8iEdGDIhdL0XDGDs7CU+6AxrWs/eCu4vGEa0xu1E6hIsKAsq7bbK4udo8m5
J+vrcdR0oDWu+BHWWMeGTu2b8MBzT95PuQNvjbJA9Olz5M6egaep8DG5rCP4FnVR
n5V4C8kpJF96GgF2bjAnlYNA4JWXa3Yb1g8yI4oiYH/XGBZnXFoYMEupzjklkr63
CGoNd2iWoQekLxmqiZQLOILs3AL/tBAypXlOMgL9aID/d1n+w1G15d2GojmfDOXQ
sgRpuvzVoVJc4N7PfEd1S5ELR7KdYaH8YL6Q8gCg2549+J7laKAN0zA30WmnNSi/
VN3IJrpT3ZVxN5Z7ZYIqE/tls6hXz0Kt6XCWaob+BHxdqpCSxmLhqVuPiW3G4IUu
wffgWQjhS78eCHOI0v5c1alqI2j9vvoEc+rkbdSGMuFSPj8tbi2sWykSXp3T8WjE
h/WpRgnmoxfVaE+nU3ThKoVzZIpOnfEEdoiHe+cLFq7Ywb1Sg8kgcUOLmYtUOJ6E
TmKPtKzUsGrb/hHtV1N+/FPEZUX78Mc+1hDfPA8gHU3BhhMeV38BeN+pvw==

Tip

See also: Results Filters

Possible settings

Parameter Name | Default Value | Description
Check | Domain | Select check type (Domain / Full link)