SE::Bing::Video - Bing Video scraper
Scraper Overview
SE::Bing::Video is a scraper for Bing Video Search. It lets you collect databases of video links, and it accepts queries in the same form you would type them into the Bing search bar.
A-Parser lets you save SE::Bing::Video scraper settings as presets for future use, set a scraping schedule, and much more. To obtain the maximum possible number of results, you can use automatic query multiplication, substitution of subqueries from files, permutation of alphanumeric combinations, and lists.
Results can be saved in whatever form and structure you need: the built-in Template Toolkit templating engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected Data
- Video links
- Title
- The name of the service where the video is located
- Duration, number of views, and publication date
- Links to video previews
Capabilities
- Choice of the number of search results pages
- Choice of location
Use Cases
- Collecting videos for filling your blogs, tubes, doorways...
- Collecting textual data
Queries
Specify search phrases as queries, for example:
Cats
Football
Waterfall
Speak in english
cars
Query Substitutions
You can use built-in macros for query multiplication. For example, suppose we want to build a very large database of forums. We start with several main queries in different languages:
forum
форум
foro
论坛
In the query format, we specify a permutation of characters from a to zzzz. This method rotates the search results as much as possible and yields many new unique results:
$query {az:a:zzzz}
This macro creates 475254 additional queries for each original search query, which gives 4 x 475254 = 1901016 search queries in total. The figure is impressive, but it is not a problem for A-Parser: at a speed of 2000 queries per minute, such a task is processed in about 16 hours.
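The arithmetic above can be checked with a short Python sketch. It is not part of A-Parser: the function names `suffix_count` and `expand` are hypothetical, and the sketch assumes the macro appends every lowercase Latin string of length 1 to 4 to the query, in the order a, b, ..., z, aa, ab, and so on.

```python
from itertools import islice, product
from string import ascii_lowercase


def suffix_count(alphabet_size: int = 26, max_len: int = 4) -> int:
    """Count all strings of length 1..max_len over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


def expand(query: str, max_len: int = 4):
    """Yield 'query a', 'query b', ... up to 'query ' + 'z' * max_len (assumed order)."""
    for n in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=n):
            yield f"{query} {''.join(letters)}"


per_query = suffix_count()   # 26 + 26**2 + 26**3 + 26**4
total = 4 * per_query        # four base queries: forum, форум, foro, 论坛
hours = total / 2000 / 60    # at 2000 queries per minute

print(per_query, total, round(hours, 1))
print(list(islice(expand("forum"), 3)))
```

The sum 26 + 676 + 17576 + 456976 gives the 475254 suffixes per query, and 1901016 queries at 2000 per minute come to roughly 15.8 hours, which matches the rounded figure of 16 hours.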
Output Results Examples
A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit templating engine, which lets it output results in any form, including structured formats such as CSV or JSON.
Default Output
Result format:
$serp.format('$link\n')
Example of result:
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR
https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=D8EB9E5532894EACFB73D8EB9E5532894EACFB73&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=9CB33DC7E23801445F3F9CB33DC7E23801445F3F&&FORM=VRDGAR
https://talksport.com/football/1685319/troy-deeney-forest-green-rovers-manager/
https://ca.sports.yahoo.com/news/best-30-mens-cricketers-britain-140144281.html
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=B9593E6DF96A59F4D941B9593E6DF96A59F4D941&&FORM=VRDGAR
https://www.msn.com/en-gb/sport/golf/6-golf-tips-golf-monthly/vi-AA1lNrLU
https://sports.yahoo.com/best-30-mens-cricketers-britain-140144281.html
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=14632A97F627B502518514632A97F627B5025185&&FORM=VRDGAR
Output in CSV Table
Result format:
[% FOREACH item IN serp;
tools.CSVline(query, item.link, item.name, item.preview, item.duration);
END %]
Example of result:
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR,"England's Mary Earps wins 2023 Sports Personality of th",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,3:35
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR,"1972 Chevy Super Sport Nova",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,0:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR,"1968 Super Sport Chevelle",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,0:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=FBBB3E08963152230A54FBBB3E08963152230A54&&FORM=VRDGAR,"We had to have some hard conversations - Marsters",https://tse2.mm.bing.net/th?id=OVF.O3Nq%2bBQ%2bjnbhZnbfYxDA7w&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,7:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR,"Ja Morant Hits Buzzer-Beater, Seals Victory Post-Suspension",https://tse2.mm.bing.net/th?id=OVF.ON%2fSFfXw5e9WwzZEMbbEeQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:09
sport,https://www.bbc.co.uk/sport/football/67723705,"Ollie Watkins: Aston Villa striker explains controversia",https://tse3.mm.bing.net/th?id=OVF.Hc9LkZQ9XhYo%2bFbAtxpLGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,
sport,https://www.bbc.com/sport/articles/c2vyevn0g7zo,"Anthony Ogogo: 'Why I used to hide being a Norwich City fan'",https://tse3.mm.bing.net/th?id=OVF.kvcGexXDQxqqCSiNRXEkRg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:15
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=9FDCCE66150310EB99CB9FDCCE66150310EB99CB&&FORM=VRDGAR,"Aaron Rodgers Eyes Future Beyond 40 Despite Achilles ",https://tse4.mm.bing.net/th?id=OVF.fMSU0FvKihMc8q2TjXg%2fkw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:13
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=361720861BF1297ADE98361720861BF1297ADE98&&FORM=VRDGAR,"Dillon Brooks, Ime Udoka Penalized For Outbursts At R",https://tse1.mm.bing.net/th?id=OVF.3TNSq7yVIFY84%2fQsm5KyJQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:12
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=B9593E6DF96A59F4D941B9593E6DF96A59F4D941&&FORM=VRDGAR,"Manchester United, Arsenal and the battle for Mary Earps",https://tse3.mm.bing.net/th?id=OVF.bK8xXZhzmQ0PD8CbFvDaGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:18
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=14632A97F627B502518514632A97F627B5025185&&FORM=VRDGAR,"Miller desperate for debut",https://tse2.mm.bing.net/th?id=OVF.a8MhMzLvFmPQ5fqRbc3l0g&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,3:38
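For downstream processing, a file in this layout can be read with a standard CSV parser. The following Python sketch is only an illustration: the column names come from the order of the arguments in the `tools.CSVline(...)` call above, the helper names `read_rows` and `duration_seconds` are hypothetical, and the `h:mm:ss` duration form is an assumption (the sample only shows `m:ss`).

```python
import csv
import io

# Column order taken from the CSVline(...) call in the result format.
FIELDS = ["query", "link", "name", "preview", "duration"]

# Two rows copied from the example output; in practice, read the result file instead.
SAMPLE = '''sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR,"Ja Morant Hits Buzzer-Beater, Seals Victory Post-Suspension",https://tse2.mm.bing.net/th?id=OVF.ON%2fSFfXw5e9WwzZEMbbEeQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:09
sport,https://www.bbc.co.uk/sport/football/67723705,"Ollie Watkins: Aston Villa striker explains controversia",https://tse3.mm.bing.net/th?id=OVF.Hc9LkZQ9XhYo%2bFbAtxpLGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,
'''


def duration_seconds(value: str):
    """Convert 'm:ss' (or, by assumption, 'h:mm:ss') to seconds; empty becomes None."""
    if not value:
        return None
    total = 0
    for part in value.split(":"):
        total = total * 60 + int(part)
    return total


def read_rows(text: str):
    """Parse CSV text into dicts and add a numeric 'seconds' field."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    rows = []
    for row in reader:
        row["seconds"] = duration_seconds(row["duration"])
        rows.append(row)
    return rows


rows = read_rows(SAMPLE)
for row in rows:
    print(row["name"], row["seconds"])
```

The quoted title in the first row contains a comma, which is why a real CSV parser is preferable to splitting on commas. The second row has an empty duration, as some results in the example do, so the sketch maps it to `None` rather than failing.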
Saving in SQL Format
Result format:
[% FOREACH serp; "INSERT INTO serp VALUES('" _ query _ "', '"; directLink _ "', '"; name.replace("\n", '\n') _ "', '"; author _ "')\n"; END %]
Example of result:
INSERT INTO serp VALUES('sport', 'https://www.youtube.com/watch?v=d5sxT8CACHM', 'England's Mary Earps wins 2023 Sports Personality of th', 'BBC Sport')
INSERT INTO serp VALUES('sport', 'https://sports.yahoo.com/best-30-mens-cricketers-britain-140144281.html', 'Best 30 men's cricketers in Britain right now', 'Tim Wigmore')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/more-sports/when-usain-bolt-and-andre-de-grasse-smile-the-whole-world-smiles-with-them-olympic-memories/vi-AA1lMZ2W', 'When Usain Bolt and Andre de Grasse smile, the whole worl', 'The Independent News')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/more-sports/1968-super-sport-chevelle/vi-AA1lMLLn', '1968 Super Sport Chevelle', 'FOX 13 Tampa Bay')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI', 'Benefits Of Winning The Masters Golf', 'Dailymotion')
INSERT INTO serp VALUES('sport', 'https://www.independent.co.uk/sport/darts/world-darts-championship-live-stream-scores-results-b2467256.html', 'PDC World Darts Championship LIVE: Results', 'Michael Jones')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/nfl/aaron-rodgers-eyes-future-beyond-40-despite-achilles-setback/vi-AA1lNg0R', 'Aaron Rodgers Eyes Future Beyond 40 Despite Achilles S', 'unbranded - Sport')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-gb/sport/golf/6-golf-tips-golf-monthly/vi-AA1lNrLU', '6 Golf Tips | Golf Monthly', 'Dailymotion')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/autos/news/1972-chevy-super-sport-nova/vi-AA1lN3Px', '1972 Chevy Super Sport Nova', 'FOX 13 Tampa Bay')
INSERT INTO serp VALUES('sport', 'https://www.youtube.com/watch?v=1DtqwboJVFc', 'Desi Cricket Pakistan Final Match Bhutto Club Vs GB Cal', 'Desi Sport GB')
INSERT INTO serp VALUES('sport', 'https://ca.sports.yahoo.com/news/best-30-mens-cricketers-britain-140144281.html', 'Best 30 men's cricketers in Britain right now', 'Tim Wigmore')
INSERT INTO serp VALUES('sport', 'https://www.independent.co.uk/sport/football/mary-earps-manchester-united-arsenal-spoty-b2467111.html', 'Manchester United, Arsenal and the battle for Mary Earps', 'Ben Fleming')
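Some titles in the sample output contain apostrophes (for example "England's"), and in standard SQL an unescaped apostrophe ends the string literal early. If the data is loaded programmatically, one way to sidestep this is to pass the values as parameters. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table schema and column names are assumptions for illustration, and `sql_literal` is a hypothetical helper showing the usual quote-doubling rule.

```python
import sqlite3


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the scraper itself does not define column names.
conn.execute("CREATE TABLE serp (query TEXT, link TEXT, name TEXT, author TEXT)")

# One row from the example output, including the apostrophe in the title.
row = (
    "sport",
    "https://www.youtube.com/watch?v=d5sxT8CACHM",
    "England's Mary Earps wins 2023 Sports Personality of th",
    "BBC Sport",
)
# Placeholders let the driver handle quoting.
conn.execute("INSERT INTO serp VALUES (?, ?, ?, ?)", row)

stored = conn.execute("SELECT name FROM serp").fetchone()[0]
print(stored)
print(sql_literal(row[2]))
```

Parameterized inserts keep the title intact without any manual escaping. If plain SQL text is required instead, doubling each single quote as in `sql_literal` produces a literal that standard SQL accepts.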
Dump Results to JSON
Result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.videos = [];
FOREACH item IN p1.serp;
obj.videos.push({
link = item.link
name = item.name
duration = item.duration
author = item.author
preview = item.preview
});
END;
obj.json %]
Start Text:
[
End Text:
]
Example of result:
{
"videos": [{
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR",
"preview": "https://tse1.mm.bing.net/th?id=OVF.BbkN01YgJzwRV0nBF%2ff%2fQQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "England's Mary Earps wins 2023 Sports Personality of th",
"author": "BBC Sport",
"duration": "3:35"
}, {
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR",
"preview": "https://tse3.mm.bing.net/th?id=OVF.SPaQMo8Zrt%2fF5bGyKS0rQA&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "1972 Chevy Super Sport Nova",
"author": "FOX 13 Tampa Bay",
"duration": "0:51"
}, {
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR",
"preview": "https://tse3.mm.bing.net/th?id=OVF.d1Q3sVw%2fHfzK9x2Z%2fV5Qkg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "1968 Super Sport Chevelle",
"author": "FOX 13 Tampa Bay",
"duration": "0:51"
}, {
"link": "https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI",
"preview": "https://tse4.mm.bing.net/th?id=OVF.0Qa9k1McfmxqQgQudnQ%2bnw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "Benefits Of Winning The Masters Golf",
"author": "Dailymotion",
"duration": "1:46"
}, {
"link": "https://www.skysports.com/watch/video/13034880/radek-szaganskis-142-checkout-propels-him-to-round-1-victory",
"preview": "https://tse4.mm.bing.net/th?id=OVF.GBYcZsZ4KRxIcMCTRyvclw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "Radek Szaganski’s 142 checkout propels him to Rou",
"author": "",
"duration": "0:41"
}], "query": "sport"
}
To make the "Start Text" and "End Text" options available in the Task Editor, you need to activate "More options".
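With the start and end text set to `[` and `]`, the result file as a whole is a JSON array containing one object per query. A minimal Python sketch for consuming such a file follows. The helper name `links_by_query` is hypothetical, and the sample is a shortened copy of the example above (one video, `preview` omitted) wrapped in the array brackets.

```python
import json

# Shortened copy of the example result, wrapped in the start/end text "[" and "]".
SAMPLE = """[
{
  "videos": [{
    "link": "https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI",
    "name": "Benefits Of Winning The Masters Golf",
    "author": "Dailymotion",
    "duration": "1:46"
  }],
  "query": "sport"
}
]"""


def links_by_query(text: str) -> dict:
    """Group video links by the query that produced them."""
    result = {}
    for entry in json.loads(text):
        links = [video["link"] for video in entry["videos"]]
        result.setdefault(entry["query"], []).extend(links)
    return result


grouped = links_by_query(SAMPLE)
for query, links in grouped.items():
    print(query, len(links))
```

In practice the text would come from the result file rather than an inline string. Without the start and end text, the file would be a comma-separated run of objects and `json.loads` would reject it.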
Possible Settings
Parameter | Default Value | Description |
---|---|---|
Pages count | 1 | Number of pages for scraping |
Region | Based on IP | Region selection. List of regions. |
Interface language | Any | Interface language selection. List of languages. |