SE::YouTube - a full-fledged YouTube scraper

Scraper Overview

This scraper collects YouTube search results. It lets you build large databases of video links ready for further use. Queries are entered the same way as in the YouTube search bar. The scraper collects basic data on each video in multi-threaded mode; to obtain complete data about each video, use SE::YouTube::Video.

A-Parser functionality allows you to save YouTube scraper parsing settings for future use (presets), set a parsing schedule, and much more. You can use automatic query multiplication, substitution of subqueries from files, iteration of alphanumeric combinations and lists to get the maximum possible number of results.

Saving results is possible in the form and structure that you need, thanks to the built-in powerful template engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected Data

Data is collected from the service http://www.youtube.com/.

  • Main output
    • Video link
    • Video title (title)
    • Video description
    • Username
    • Link to the preview image
    • View count
    • Video length
    • Video upload date
    • Channel subscriber count
  • "Related queries" array
    • Keyword
    • Link to the preview

Capabilities

  • Maximum number of pages for scraping on YouTube - 50 pages
  • Country selection from where the search is performed
  • Search by date of upload
  • Choice of result type (videos, channels, playlists)
  • Choice of video duration
  • Advanced search settings (subtitles, 3D, HD, live streaming, Creative Commons license)
  • Sorting by relevance, upload date, rating, view count
  • Additionally scrapes link to the video preview
  • Ability to choose interface language
  • Ability to enable safe search mode

Use Cases

  • Search, collection, and analysis of information on YouTube

Queries

Specify search phrases as queries, for example:

Футбол  
Ниагарский водопад
Speak in english
Cats and dogs
Автомобили

Query Substitutions

You can use built-in macros for query multiplication. For example, to get a very large database of forums, specify several main queries in different languages:

forum
форум
foro

In the query format, specify a character iteration from a to zzzz. This method rotates the search output as much as possible and yields many new unique results:

$query {az:a:zzzz}

This macro creates 475254 additional queries for each original search query (26 + 26^2 + 26^3 + 26^4). For the three queries above, that gives 3 x 475254 = 1425762 search queries in total. This is an impressive number, but it is not a problem for A-Parser: at a speed of 2000 queries per minute, such a task is processed in about 12 hours.
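The arithmetic behind this estimate can be checked outside A-Parser. The following is a minimal Python sketch, not A-Parser code; the function names `count_combinations` and `estimate_hours` are hypothetical, and the figure of 2000 queries per minute is simply the speed quoted above.

```python
import string


def count_combinations(min_len: int, max_len: int,
                       alphabet: str = string.ascii_lowercase) -> int:
    """Count all strings over `alphabet` whose length is in [min_len, max_len]."""
    return sum(len(alphabet) ** n for n in range(min_len, max_len + 1))


def estimate_hours(total_queries: int, queries_per_minute: int) -> float:
    """Rough processing time in hours at a constant query rate."""
    return total_queries / queries_per_minute / 60


# Iteration from "a" to "zzzz" covers lengths 1 through 4.
per_query = count_combinations(1, 4)

# The seed queries listed above.
seeds = ["forum", "форум", "foro"]
total = per_query * len(seeds)

print(per_query)                            # 475254
print(total)                                # total generated queries
print(round(estimate_hours(total, 2000), 1))  # estimated hours
```

Changing the length range or the number of seed queries shows how quickly the task size grows, which helps when choosing the iteration range.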

Output Results Examples

A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit, which can output results in any form, including structured formats such as CSV or JSON.

Export list of links

Works the same way as in SE::Google.

Result format:

[% FOREACH item IN p1.serp;    loop.count _ ' - ' _ item.link _ ' - ' _ item.title _ ' - ' _ item.desc _ "\n"; END %]

Example of result:

1 - https://www.youtube.com/watch?v=dm_T7H6J2U8 - НАСКОЛЬКО ТЫ УМНЫЙ? Простой Тест, который не пройдут многие взрослые - В этом видео вы сможете проверить насколько вы умны. Вас ждет <b>тест</b>, состоящий из простых вопросов школьной ...
2 - https://www.youtube.com/watch?v=iDAYNEV9Kxg - Уникальный японский тест на старость мозга! Обязательно проверь себя! - Уникальный японский <b>тест</b> на старость мозга! Обязательно проверь себя! Данный <b>тест</b> разработан в Японии. Как ...
3 - https://www.youtube.com/watch?v=0PEy2_sSy6A - Этот Простой Тест Раскроет Ваш Самый Потаенный Страх - Наше подсознание — довольно темное место, для его понимания нужны долгие годы психоанализа. И этот ...
4 - https://www.youtube.com/watch?v=j6K9nIugzAY - India vs England 2nd Test Day 4 Highlights 2021| Royal Sports Tv - India vs England 2nd <b>Test</b> Day 4 Highlights 2021 India vs England 2nd <b>Test</b> Day 4 Highlights 2021 | ind vs eng <b>test</b> series India vs ...
5 - https://www.youtube.com/watch?v=ALDqwSMVYKQ - ТЕСТ НА ПСИХИКУ/ 929 СЕКУНД СМЕХА/ЛУЧШИЕ ПРИКОЛЫ ЗА ФЕВРАЛЬ 2021 РЖАКА/ПОПРОБУЙ НЕ СМЕЙСЯ! BEST COUB - Телеграм канал: https://t.me/CrazyHumor129k НА КАНАЛЕ ВЫ НАЙДЕТЕ 929СЕКУНД ОТМЕННОГО СМЕХА ПОД ЛУЧШИЕ ...
6 - https://www.youtube.com/watch?v=6X1puBtvc_s - Сериал Тест на беременность 1 серия - русский сериал 2015 HD - Премьера сериала - <b>Тест</b> на беременность 1 серия - русский сериал 2015 После смерти пациентки гинеколог Наталья ...
7 - https://www.youtube.com/watch?v=hXuhVD7Dwp0 - Тест! Оптические Иллюзии, Которые Откроют Вам Неожиданную Правду О Вас! - <b>Тест</b>! Оптические Иллюзии, Которые Откроют Вам Неожиданную Правду О Вас! Существует множество различных типов ...
8 - https://www.youtube.com/watch?v=BYA8lY4o33A - Тест! КАКОЕ ЖИВОТНОЕ ВАШ ТАЛИСМАН? Какой хищник прячется в вашей душе? Точный тест на характер - <b>Тест</b>! КАКОЕ ВЫ БОЖЕСТВЕННОЕ ЖИВОТНОЕ? Какой хищник прячется в вашей душе? Точный <b>тест</b> на характер Для того ...
9 - https://www.youtube.com/watch?v=V-kqty2vAm4 - Тест! КТО-ТО ТАЙНО В ТЕБЯ ВЛЮБЛЕН! УЗНАЙ КТО! - <b>Тест</b>! КТО-ТО ТАЙНО В ТЕБЯ ВЛЮБЛЕН! УЗНАЙ КТО! Вы часто ощущаете себя одиноко и мечтаете найти настоящую ...
10 - https://www.youtube.com/watch?v=9HtbSe_oJto - Пройди этот Тест и проверь своё Внимание - В этом видео мы проверим насколько развито твое внимание. Тебя ждут разные типы заданий с несколькими уровнями ...
...

The built-in tool tools.CSVline allows you to create correct table documents, ready for import into Excel or Google Sheets.

General result format:

[%  FOREACH i IN p1.serp;    tools.CSVline(i.link, i.title, i.desc); END  %]

File name:

$datefile.format().csv

Initial text:

Link,Anchor,Snippet

Tip

In the General result format, the Template Toolkit is used to output the $serp array in a FOREACH loop.

In the file name of the results, you just need to change the file extension to csv.

To make the "Initial text" option available in the Task Editor, you need to activate "More options". In the "Initial text," we write the names of the columns separated by commas, and make the second line empty.
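What makes a CSV line "correct" for import is mainly the quoting of fields that contain commas, quotes, or line breaks. The sketch below illustrates that quoting in plain Python with the standard `csv` module. It is not the implementation of tools.CSVline, whose exact quoting rules may differ; `csv_line` is a hypothetical helper name.

```python
import csv
import io


def csv_line(*fields: str) -> str:
    """Return one CSV row with every field quoted and inner quotes doubled."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue()


# A title containing both a comma and double quotes stays in one column.
row = csv_line(
    "https://www.youtube.com/watch?v=dm_T7H6J2U8",
    'A "quoted" title, with a comma',
    "Snippet text",
)
print(row, end="")
```

Without such quoting, a comma inside a title would split the row into extra columns when the file is opened in Excel or Google Sheets.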

Keyword competition

Works the same way as in SE::Google.

Saving in SQL format

Result format:

[%  FOREACH serp;   "INSERT INTO serp VALUES('" _ query _ "', '";   link _ "', '";  title _ "')\n"; END  %]

Example of result:

INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=dm_T7H6J2U8', 'НАСКОЛЬКО ТЫ УМНЫЙ? Простой Тест, который не пройдут многие взрослые')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=iDAYNEV9Kxg', 'Уникальный японский тест на старость мозга! Обязательно проверь себя!')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=0PEy2_sSy6A', 'Этот Простой Тест Раскроет Ваш Самый Потаенный Страх')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=BYA8lY4o33A', 'Тест! КАКОЕ ЖИВОТНОЕ ВАШ ТАЛИСМАН? Какой хищник прячется в вашей душе? Точный тест на характер')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=5Se6w0lOkyY', 'Новый Renault Duster.Тест-драйв.Anton Avtoman.')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=Ko8cFdoOV6U', 'Тест! ЧТО ТЫ ЗА ДЕВУШКА ТАКАЯ? Кого в тебе больше ЛЕДИ или ПАЦАНКИ?')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=j6K9nIugzAY', 'India vs England 2nd Test Day 4 Highlights 2021| Royal Sports Tv')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=9HtbSe_oJto', 'Пройди этот Тест и проверь своё Внимание')
INSERT INTO serp VALUES('тест', 'https://www.youtube.com/watch?v=V-kqty2vAm4', 'Тест! КТО-ТО ТАЙНО В ТЕБЯ ВЛЮБЛЕН! УЗНАЙ КТО!')
...
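Note that the template above concatenates values into the statement as they are, so a title containing a single quote would yield an invalid INSERT. The following Python sketch shows one common way to handle this when post-processing results, by doubling single quotes as in standard SQL. The helper names `sql_literal` and `insert_statement` are hypothetical, and some databases need additional escaping (for example, of backslashes), so treat this as an illustration rather than a complete solution.

```python
def sql_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any single quotes inside it."""
    return "'" + value.replace("'", "''") + "'"


def insert_statement(query: str, link: str, title: str) -> str:
    """Build an INSERT in the same shape as the template output above."""
    values = ", ".join(sql_literal(v) for v in (query, link, title))
    return "INSERT INTO serp VALUES(" + values + ")"


# A title with an apostrophe no longer breaks the statement.
print(insert_statement(
    "test",
    "https://www.youtube.com/watch?v=j6K9nIugzAY",
    "It's a test",
))
```

When loading the data programmatically, parameterized queries offered by the database driver are usually a safer choice than building statements as strings.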

Dump results to JSON

Works the same way as in SE::Google.

Results processing

A-Parser allows you to process results directly during scraping. This section presents the most popular cases for the YouTube scraper.

Works the same way as in SE::Google.

Extracting domains

Works the same way as in SE::Google.

Removing tags from video titles and descriptions

Add a Results Constructor and select the source $p1.serp.$i.title - Title from the dropdown list. Set the type to Remove HTML tags.

Add another Results Constructor and select the source $p1.serp.$i.desc - Description from the dropdown list. Set the type to Remove HTML tags.
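To see what this step does to the data, here is a minimal Python sketch of the same transformation applied to a description like those in the example output above, where matched words are wrapped in <b> tags. It is not how A-Parser implements "Remove HTML tags"; the regular expression is a deliberately simple assumption and `strip_tags` is a hypothetical name.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <b> or </b>.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text: str) -> str:
    """Remove HTML tags, then decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub("", text))


desc = "India vs England 2nd <b>Test</b> Day 4 Highlights 2021"
print(strip_tags(desc))
```

A regular expression is enough for simple highlighting tags like these; arbitrary HTML would call for a real parser.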

Download example

How to import example into A-Parser

eJyVVMtu2zAQ/BWBMJAGUI3k0ItujlsjLZw4tZ1D4fjASCuBDUWyJOXGEPTv3aVo
y07TQ2/kPmb2MWTLPHcv7sGCA+9YtmmZCWeWsdWXLPuhm3XzDMkSar2D5HZ9N088
r1xSWl0na+EluISrIvkMLrfCeKGVYykz3DqwBLg5xUFPASVvpGdpy/zeAPIgsLWi
IKco8G54BbluFMawHZcNxlx3/w4vta25x6ID6iGDjZDfjHvnh4uRp1KTj8mowEKf
ntTFJeu225T1eW4W4ijNXI/jBI7OFd/BWhOVkDCYZ3i753UgK7gH8h4IL8f+lRB4
UQiaCZc9A01kYH1U4lcoVmmMxaMV4GY4WTR5CABk3B+q27BRuDOEaELu9z6HZSWX
DlLmsNQZx0KKtx7hwXKv7SLuKGuZVhMp57ADOYQF/JtGyALXNykx6WtMfD9k8RdG
d2zvlAqX9ttiDUeUcLtZ3A1ZhZ7rCjsvnrFvKWrh8e6mQQkZu0LjC4A5zuyeZlZr
C0eaiBzZUdMGFClkWNnEDKazNs7WcmJsmdONzZFvc5VuGGkKU4OYGMknatKG53Hr
a0k7t5bv0XgIJuX0GaTi9/BIk/8FFxI6TMi1KkW1iE/i0Eaj1viqF2qqayOBhq4a
KVEzDpaDdicuaoQuw/TeJk8DRSCNLxdr0NJ9W/VzNFZgSZ/SWPUpa4TMuZSPy/mp
hw16D1p3BJvjI6o0ypvmHvSfsUrrKjw6eDX4zwAuz9sGui11H3+Z4+/Vnv01Wduh
qn66hz6KuqQYtOG4HEqG/pU/xum9JQ==
Tip

You can add the Results Constructor as many times as you need.

See also: Results Constructor

Similarly as in SE::Google.

Possible settings

Parameter Name | Default Value | Description
Device | Desktop | Choice of output type (Desktop/Mobile)
Pages count | 10 | Number of pages for scraping (from 1 to 50)
Search from country | Auto (Based on IP) | Choice of country from which the search is performed
Interface language | English | Choice of interface language
Restricted mode | | Enable/disable "Safe Search" mode
Uploaded time | All time | Search by upload date
Result type | Video | Choice of result type
Duration | All | Choice of video duration
Features | All | Advanced search settings
Sort by | Relevancy | Sorting of results
Advanced filters (param sp=) | | Allows specifying complex filter combinations. To do this, take the value of the sp parameter from the URL in the browser and insert it into this field. This value takes priority over the filters set in the scraper settings.
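Copying the sp value by hand from a long URL is error-prone, so it can help to extract it with a small script. The Python sketch below returns the value exactly as it appears in the URL, without percent-decoding, since the source does not state which form the field expects. The URL and its sp value are placeholders, and `extract_sp` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlsplit


def extract_sp(url: str) -> Optional[str]:
    """Return the raw value of the `sp` query parameter, or None if absent."""
    for pair in urlsplit(url).query.split("&"):
        name, _, value = pair.partition("=")
        if name == "sp":
            return value  # left exactly as written in the URL
    return None


# Placeholder URL: apply filters in the browser, then copy the real address.
url = "https://www.youtube.com/results?search_query=cats&sp=EXAMPLE123"
print(extract_sp(url))
```

The returned string is what goes into the "Advanced filters (param sp=)" field.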