
SE::Google::SafeBrowsing - Checking a domain in Google's blacklist

Overview

The Google Safe Browsing scraper checks whether a domain is on Google's blacklist, so you can run your own domain databases against it. You can learn more about this feature in Google Search Help.

A-Parser's functionality allows you to save parsing settings for future use (presets), set a parsing schedule, and much more.

Thanks to the multi-threaded operation of A-Parser, the processing speed can reach 3800-4000 requests per minute.


Results can be saved in whatever form and structure you need thanks to the powerful built-in Template Toolkit template engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected Data

  • Presence in the list of suspicious sites


Use Cases

  • Checking a list of domains for inclusion in Google's blacklist
  • Monitoring your own domains for inclusion in Google's blacklist

Queries

As queries, specify the URLs of the sites you want to check, for example:

http://a-parser.com/
http://www.yandex.ru/
http://facebook.com/
http://youtube.com/
http://perfect-soft.net/
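
If your domain database stores bare domains rather than full URLs, a small pre-processing step outside A-Parser can bring them to the form shown above. The following is a minimal sketch; the helper name `to_query_url` and the sample entries are hypothetical, and it assumes plain `http://` URLs are what you want to submit.

```python
from urllib.parse import urlsplit


def to_query_url(entry: str) -> str:
    """Turn a bare domain or a URL into scheme://host/path form."""
    entry = entry.strip()
    if "://" not in entry:
        # Assumption: default to http:// as in the query examples above.
        entry = "http://" + entry
    parts = urlsplit(entry)
    path = parts.path or "/"  # add the trailing slash for bare hosts
    return f"{parts.scheme}://{parts.netloc}{path}"


raw_entries = ["a-parser.com", "http://www.yandex.ru/", " facebook.com "]
queries = [to_query_url(e) for e in raw_entries]
print("\n".join(queries))
```

Query strings and fragments are dropped by this helper, which is usually acceptable when the check is made per domain.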

Output Results Examples

A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit template engine, which lets it output results in any form, including structured formats such as CSV or JSON.

Exporting the blacklist check results

Result format:

$query: $exists\n

An example of a result showing the URL and whether it is on Google's blacklist:

http://youtube.com/: 0
http://www.yandex.ru/: 0
http://a-parser.com/: 0
http://perfect-soft.net: 1
http://facebook.com/: 0
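
A result file in this format is easy to post-process outside A-Parser. The sketch below is only an illustration: the function name `parse_result_line` is hypothetical, and the sample text is the example output above. It splits each line on the last ": " because the URL itself contains "://".

```python
def parse_result_line(line: str) -> tuple[str, bool]:
    """Split 'URL: flag' into the URL and a boolean 'listed' value."""
    # rsplit with maxsplit=1 keeps the "://" inside the URL intact.
    url, flag = line.rsplit(": ", 1)
    return url, flag.strip() == "1"


sample_output = """\
http://youtube.com/: 0
http://www.yandex.ru/: 0
http://a-parser.com/: 0
http://perfect-soft.net: 1
http://facebook.com/: 0
"""

results = dict(
    parse_result_line(line)
    for line in sample_output.splitlines()
    if line.strip()
)
listed = [url for url, is_listed in results.items() if is_listed]
print(listed)
```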

Outputting Results to a CSV Table

The built-in utility $tools.CSVline allows you to create correctly formatted table rows, ready for import into Excel or Google Sheets.

General result format:

[% tools.CSVline(query.orig,p1.exists) %]

File name:

$datefile.format().csv

Initial text:

Site,Check result

Result example:

Site,Check result
http://youtube.com/,0
http://www.yandex.ru/,0
http://a-parser.com/,0
http://perfect-soft.net,1
http://facebook.com/,0

Tip

In the General Result Format, the Template Toolkit template engine is used to output the query and blacklist check.

In the file name of the results, you just need to change the file extension to csv.

To make the "Initial Text" option available in the Task Editor, activate "More options". In "Initial Text", enter the column names separated by commas and leave the second line empty.
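
Once the CSV file is written, it can be read back with any CSV library. Below is a minimal sketch using Python's standard `csv` module; the inline `csv_text` stands in for the saved file, and the header row is skipped rather than matched by name, so its wording does not matter.

```python
import csv
import io

# Stand-in for the contents of the saved .csv file.
csv_text = """\
Site,Check result
http://youtube.com/,0
http://www.yandex.ru/,0
http://a-parser.com/,0
http://perfect-soft.net,1
http://facebook.com/,0
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # skip the column-name row from "Initial Text"

# Keep only rows whose second column (the check result) is "1".
listed_sites = [row[0] for row in reader if row and row[1] == "1"]
print(listed_sites)
```

For a real file, replace `io.StringIO(csv_text)` with `open(path, newline="", encoding="utf-8")`.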

Saving in SQL Format

Result format:

[% "INSERT INTO serp VALUES('" _ query.query _ "', '" _ exists _ "')\n" %]

Result example:

INSERT INTO serp VALUES('http://www.yandex.ru/', '0')
INSERT INTO serp VALUES('http://a-parser.com/', '0')
INSERT INTO serp VALUES('http://perfect-soft.net', '1')
INSERT INTO serp VALUES('http://facebook.com/', '0')
INSERT INTO serp VALUES('http://youtube.com/', '0')
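
To try the generated statements, they can be loaded into any SQL database. The sketch below uses an in-memory SQLite database; the two-column schema (`url`, `listed`) is an assumption made for illustration, since the target table layout is not defined here.

```python
import sqlite3

# The example output above, one statement per line.
statements = """\
INSERT INTO serp VALUES('http://www.yandex.ru/', '0')
INSERT INTO serp VALUES('http://a-parser.com/', '0')
INSERT INTO serp VALUES('http://perfect-soft.net', '1')
INSERT INTO serp VALUES('http://facebook.com/', '0')
INSERT INTO serp VALUES('http://youtube.com/', '0')
""".splitlines()

conn = sqlite3.connect(":memory:")
# Hypothetical schema: both values are quoted in the output, so store them as text.
conn.execute("CREATE TABLE serp (url TEXT, listed TEXT)")

for stmt in statements:
    if stmt.strip():
        conn.execute(stmt)
conn.commit()

listed_urls = [
    row[0] for row in conn.execute("SELECT url FROM serp WHERE listed = '1'")
]
print(listed_urls)
```

Note that the result format concatenates the query into the statement without escaping, so a URL containing a single quote would produce an invalid statement.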

Dumping Results to JSON

General result format:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.query = p1.query.orig;
obj.exists = p1.exists;

obj.json %]

Initial text:

[

Final text:

]

Result example:

[{"query":"http://www.yandex.ru/","exists":"0"},
{"query":"http://youtube.com/","exists":"0"},
{"query":"http://facebook.com/","exists":"0"},
{"query":"http://a-parser.com/","exists":"0"},
{"query":"http://perfect-soft.net","exists":"1"}]

Tip

To make the "Initial Text" and "Final Text" options available in the Task Editor, you need to activate "More options".
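
Because the initial and final text wrap the records in [ and ], the resulting file is a single JSON array that standard parsers can load directly. A minimal sketch with Python's `json` module follows; `json_text` stands in for the saved file. Note that `exists` arrives as the string "1" or "0", not as a number.

```python
import json

# Stand-in for the contents of the saved JSON file.
json_text = """[{"query":"http://www.yandex.ru/","exists":"0"},
{"query":"http://youtube.com/","exists":"0"},
{"query":"http://facebook.com/","exists":"0"},
{"query":"http://a-parser.com/","exists":"0"},
{"query":"http://perfect-soft.net","exists":"1"}]"""

records = json.loads(json_text)

# Compare against the string "1", matching how the value is serialized.
listed_queries = [r["query"] for r in records if r["exists"] == "1"]
print(listed_queries)
```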

Results Processing

A-Parser allows you to process results directly during scraping. This section presents the most popular cases for the SE::Google::SafeBrowsing scraper.

Saving domains with a check value of "1"

Add a filter and, in the dropdown menu, select the check-value variable $exists - Listed as suspicious. Choose the type "String equals", then enter the required value, 1, in the String field. With this filter, all results with the unwanted value are removed.

Download example

How to import an example into A-Parser

eJx1VEuP2jAQ/ivI4tBKEMqhl9wAlaoVXbYLe2I5mGTCujger+3wUJT/3rETEth2
b57H983bJXPcHuyjAQvOsnhTMh3eLGarb3H8HXEvIY5XPIOpwZMVat8b9uZCgunB
medaAhswzY0F4/GbD2HklkLGC+nYoGTuooFi4BGMEannECnJySskBxKOXBbenmLO
hWJVh8iEdGDIhdL0XDGDs7CU+6AxrWs/eCu4vGEa0xu1E6hIsKAsq7bbK4udo8m5
J+vrcdR0oDWu+BHWWMeGTu2b8MBzT95PuQNvjbJA9Olz5M6egaep8DG5rCP4FnVR
n5V4C8kpJF96GgF2bjAnlYNA4JWXa3Yb1g8yI4oiYH/XGBZnXFoYMEupzjklkr63
CGoNd2iWoQekLxmqiZQLOILs3AL/tBAypXlOMgL9aID/d1n+w1G15d2GojmfDOXQ
sgRpuvzVoVJc4N7PfEd1S5ELR7KdYaH8YL6Q8gCg2549+J7laKAN0zA30WmnNSi/
VN3IJrpT3ZVxN5Z7ZYIqE/tls6hXz0Kt6XCWaob+BHxdqpCSxmLhqVuPiW3G4IUu
wffgWQjhS78eCHOI0v5c1alqI2j9vvoEc+rkbdSGMuFSPj8tbi2sWykSXp3T8WjE
h/WpRgnmoxfVaE+nU3ThKoVzZIpOnfEEdoiHe+cLFq7Ywb1Sg8kgcUOLmYtUOJ6E
TmKPtKzUsGrb/hHtV1N+/FPEZUX78Mc+1hDfPA8gHU3BhhMeV38BeN+pvw==

Tip

See also: Results filters

Possible settings

Parameter Name    Default Value    Description
Check             Domain           Choice of check type (Domain / Full link)