
Cloudflare::Radar - Cloudflare Radar scraper


Scraper Overview

The Cloudflare Radar scraper allows you to quickly determine the category of a website by its domain name.

Results can be saved in whatever form and structure you need: the built-in Template Toolkit templating engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected Data

Data is collected from the radar.cloudflare.com service:

  • Website categories

Use Cases

  • Determining which categories a domain's website belongs to

Queries

Specify a list of domains as queries, for example:

a-parser.com
yandex.ru
google.com
vk.com
facebook.com
youtube.com
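
If the source list contains full URLs, stray whitespace, or duplicates, it may help to normalize it into bare domains before pasting it in as queries. The following is a minimal Python sketch of that preparation step; the helper names `to_domain` and `build_queries` are hypothetical and not part of A-Parser, and the sample input is made up for illustration.

```python
from urllib.parse import urlsplit


def to_domain(raw: str) -> str:
    """Reduce a URL or host string to a bare lowercase domain."""
    raw = raw.strip()
    # urlsplit only fills in the host when a scheme or leading // is present.
    if "://" not in raw:
        raw = "//" + raw
    return (urlsplit(raw).hostname or "").lower()


def build_queries(lines):
    """Return unique, non-empty domains in first-seen order."""
    seen = {}
    for line in lines:
        domain = to_domain(line)
        if domain:
            seen.setdefault(domain, None)
    return list(seen)


# Illustrative input only: a URL, a padded domain, an empty line, a duplicate.
queries = build_queries(
    ["https://A-Parser.com/docs/", "yandex.ru  ", "google.com", "", "google.com"]
)
print("\n".join(queries))
```

The sketch keeps subdomains such as `www.` untouched, since whether to strip them depends on what you want to categorize.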

Output Results Examples

A-Parser supports flexible result formatting through the built-in Template Toolkit templating engine, which can output results in any form, including structured formats such as CSV or JSON.

Default Output

Result format:

$query: $categories.format('$name, ')\n

Example result, listing the category names for each domain:

a-parser.com: Business, Business & Economy, 
yandex.ru: News & Media, Entertainment,
vk.com: Social Networks, Society & Lifestyle,
youtube.com: Video Streaming, Entertainment,
facebook.com: Social Networks, Society & Lifestyle,
google.com: Search Engines, Technology,
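
Because the default format appends `, ` after every category name, each line ends with a trailing separator. If the default output is post-processed outside A-Parser, a small parser can drop it. This is a Python sketch under the assumption that category names never contain a comma; `parse_default_line` is a hypothetical helper.

```python
def parse_default_line(line: str):
    """Split 'domain: Cat A, Cat B, ' into (domain, [categories])."""
    domain, _, rest = line.partition(":")
    # Splitting on "," and discarding empty pieces removes the trailing separator.
    categories = [part.strip() for part in rest.split(",") if part.strip()]
    return domain.strip(), categories


# Two lines taken from the example result above.
lines = [
    "a-parser.com: Business, Business & Economy, ",
    "yandex.ru: News & Media, Entertainment,",
]
parsed = dict(parse_default_line(line) for line in lines)
print(parsed)
```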

Output in CSV Table

Result format:

[% FOREACH categories;
tools.CSVline(name, desc);
END %]

Example result:

Business,"Sites related to business."
"Business & Economy","Sites that are related to business, economy, finance, education, science and technology."
"Social Networks","Sites that facilitate interaction and networking between people."
"Society & Lifestyle","Sites related to lifestyle that are not included in other categories like fashion, food & drink etc."
"Social Networks","Sites that facilitate interaction and networking between people."
"Society & Lifestyle","Sites related to lifestyle that are not included in other categories like fashion, food & drink etc."
"Search Engines","Sites that allow users to search for content using keywords."
Technology,"Sites related to technology that are not included in the science category."
"News & Media","Sites related to news and media."
Entertainment,"Sites related to entertainment that are not includeded in other categories like Comic books, Audio streaming, Video streaming etc."
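
The CSV template above writes only the category name and description, so the rows are not tied to a domain and the same category can appear more than once. A Python sketch of how such a file might be read into a name-to-description lookup with the standard `csv` module follows; `load_descriptions` is a hypothetical helper and the sample is an excerpt of the example result.

```python
import csv
import io

# Excerpt of the example result, including one repeated category.
sample = '''Business,"Sites related to business."
"Social Networks","Sites that facilitate interaction and networking between people."
"Social Networks","Sites that facilitate interaction and networking between people."
'''


def load_descriptions(stream):
    """Map each category name to its description, keeping the first occurrence."""
    descriptions = {}
    for row in csv.reader(stream):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        name, desc = row
        descriptions.setdefault(name, desc)
    return descriptions


descriptions = load_descriptions(io.StringIO(sample))
for name, desc in descriptions.items():
    print(f"{name}: {desc}")
```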

Dump Results to JSON

General result format:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.query = query;
obj.categories = [];

FOREACH item IN p1.categories;
obj.categories.push({
name = item.name
desc = item.desc
});
END;

obj.json %]

Start Text:

[

End Text:

]
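
The template, Start Text, and End Text above work together: the Start Text opens the array, every result after the first is preceded by a comma, and the End Text closes the array. The Python sketch below imitates that assembly to show why the combined output is valid JSON. It is an illustration only, not how A-Parser is implemented; `render_dump` and its parameters are hypothetical.

```python
import json


def render_dump(results, start_text="[", end_text="]"):
    """Join per-query JSON objects the way the template does."""
    parts = [start_text]
    not_first = False
    for query, categories in results:
        if not_first:
            parts.append(",\n")  # separator before every result except the first
        else:
            not_first = True
        obj = {
            "query": query,
            "categories": [{"name": n, "desc": d} for n, d in categories],
        }
        parts.append(json.dumps(obj, separators=(",", ":")))
    parts.append(end_text)
    return "".join(parts)


dump = render_dump(
    [
        ("a-parser.com", [("Business", "Sites related to business.")]),
        ("google.com", [("Search Engines", "Sites that allow users to search for content using keywords.")]),
    ]
)
decoded = json.loads(dump)  # parses cleanly because of the separators and brackets
print(len(decoded))
```

Without the Start Text and End Text, the output would be a comma-separated sequence of objects, which a JSON parser rejects.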

Example result:

[{"query":"yandex.ru","categories":[{"desc":"Sites related to news and media.","name":"News & Media"},{"desc":"Sites related to entertainment that are not includeded in other categories like Comic books, Audio streaming, Video streaming etc.","name":"Entertainment"}]},{"query":"google.com","categories":[{"desc":"Sites that allow users to search for content using keywords.","name":"Search Engines"},{"desc":"Sites related to technology that are not included in the science category.","name":"Technology"}]},{"query":"a-parser.com","categories":[{"desc":"Sites related to business.","name":"Business"},{"desc":"Sites that are related to business, economy, finance, education, science and technology.","name":"Business & Economy"}]}]
Tip: To make the "Start Text" and "End Text" options available in the Task Editor, enable "More options".
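
A JSON dump in the shape shown in the example result can be loaded with any standard JSON library. As a Python sketch, the hypothetical helper `categories_by_domain` below reduces it to a mapping from domain to category names; the sample is a one-record excerpt of the example result.

```python
import json

# One record taken from the example result.
sample = (
    '[{"query":"google.com","categories":['
    '{"desc":"Sites that allow users to search for content using keywords.","name":"Search Engines"},'
    '{"desc":"Sites related to technology that are not included in the science category.","name":"Technology"}]}]'
)


def categories_by_domain(text: str):
    """Map each queried domain to the list of its category names."""
    records = json.loads(text)
    return {rec["query"]: [cat["name"] for cat in rec["categories"]] for rec in records}


result = categories_by_domain(sample)
print(result)
```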

Possible Settings

Parameter Name | Default Value | Description
Bypass CloudFlare with Chrome Max Pages | 10 | Max. number of pages when bypassing CF via Chrome
Bypass CloudFlare with Chrome Headless | | If the option is enabled, the browser will not be displayed during CF bypass via Chrome
Use session | | Saves good sessions, allowing for even faster scraping with fewer errors.