Skip to main content

SE::DuckDuckGo::Images - Image Scraper

Dogpile Images

Overview of the scraper

The image scraper for DuckDuckGo search results. The SE::DuckDuckGo::Images scraper allows you to get databases of image links or images ready for further use. You can use queries in the same way as you would enter them in the DuckDuckGo search bar

A-Parser's functionality allows you to save the DuckDuckGo scraper's scraping settings for future use (presets), ), set up a scraping schedule, and much more. You can use automatic query multiplication, substitution of subqueries from files, iteration of alphanumeric combinations and lists to get the maximum possible number of results.

Saving results is possible in the form and structure you need, thanks to the powerful built-in templating engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL and CSV.

Scraper use cases

A-Parser Allows you to use a chain of tasks, where the second task starts after the first one is completed, and the links from the first task are used as queries for the second one

Download Example

How to import the example into A-Parser

eJyNVk1T2zAQ/SuMJofQhsQcevGFCdC0dCihEE4hnVHjtSuQJSPJKYzJf+9KMv6q
E3rIjLXaL2nfe0pBDNWP+lqBBqNJuCxI5r5JSM7z9aP9fZEHLKUJ6INI/hFc0ggU
GZGMKg3KxizJ7ecwrN3D8ML5o1MEMc25IavViGBe/NQzqVJq8w+y43FZrNq8pRtY
SNyMGYfaPMPVFU3BRkXUgN0dxy7R8HBsnm0GGkXMMCko9xVsZ3XVO8GechuvjWIi
QX9cKgZ6pmSKZgMuiTW+vHW4JAO3Jpgmd/E/fAwJY8o1jIjGdmcUm4m6O8yAokaq
eWZ7QntBpJhyfgkb4LWby3+aM453qqcxBl2Ugf0u839ybKsjNkttQP1R2EOVxa1O
59/rqEheygRPHv3Cc3OWMoNrfSZzYYcToPERIKvu7UqiJZUKqjJG5VAVRwRlICJ0
rKc2zWpT6xStybSNaylilsyxf8UiePPMxQJhOhdnMs042GMRD7GD8yYkcw03NWCm
uhyKXVTtdlOduYL2HkqojoiRkutvt77xTDHE4yfbborX2uyhvNo15fzu5rLVXY0v
m1kmsJZMEOtrIJEIKTzXdlS0SHQFJgy/LhbXDd6gi4IEnjEL3owBYY7MSwbh+MOJ
4+RkmInk9SGDk+Q1YfGhu3z0X6ATxjDLC6XoS0keezi/s861kam/rWpiaP8N1A3j
zVQmijltsNRSw/LFHWFDuQOIkALq/MhQ42YiHVws8UDoZiVPOO9XVhHSwFNOOdk2
9aLmvpehiWflWOTp0aAov9FhO7aK4tP2IEbknO8Aeh+Ge+Xhf9CzF4RBzTsh3xWt
rhruQFoTVP0yi5pJe0hYEC1ztbZpvNBZ7Nvh2uskq1GFvclw+XOy+nh4fz8enoRt
yA16MFeCwIdvV6P6UemjbY9cdOQ26BOjipk9Khw0+deldEvFgh1C1H033LjeU/Vg
v6J3t1tqHmz7NCbY8yb1a+W+xyBoPgS2oJsB3vmxG5IXo+rPQLHrXQ8LDK3lCpe4
ftDXPt4ivfTBytqx/3j7F6rw6z8=

Capabilities

  • Selecting the number of pages to scrape
  • Searching by region
  • Selecting the output language
  • Selecting Safe Search
  • Specifying the image size
  • Selecting the image type
  • Selecting the layout
  • Selecting by color

Collected data

  • Image links
  • Image anchors
  • Page links
  • Height and width
  • Thumbnail links

Use Cases

  • Collecting images to fill your blogs, tubes, doorways...
  • Collecting avatar databases

Queries

Search phrases should be specified as queries, for example:

Audi  
Box
Byron
hunting and fishing

Query substitutions

You can use built-in macros to multiply queries, for example, if we want to get a very large database of forums, we specify several main queries in different languages:

forum
forum
foro
论坛

In the query format, we specify iterating through characters from a to zzzz, , this method allows for maximum rotation of search results and obtaining many new unique results:

$query {az:a:zzzz}

This macro will create 475254 additional queries for each initial search query, which totals 4 x 475254 = 1901016 search queries, an impressive number, but not a problem for A-Parser. At a speed of 2000 queries per minute, such a task will be processed in just 16 hours.

Output results examples

A-Parser supports flexible result formatting thanks to the built-in templating engine Template Toolkit, which allows it to output results in an arbitrary form, as well as in a structured form, such as CSV or JSON

Default output

Result format:

$serp.format('$link\n')

Result example:

https://viralcats.net/blog/wp-content/uploads/2017/12/Mean-looking-cat-Viral-Cats-03.jpg
http://mymodernmet.com/wp/wp-content/uploads/2017/03/gabrielius-khiterer-stray-cats-8.jpg
http://fishsubsidy.org/wp-content/uploads/2020/01/abyssinian-cats.jpg
https://cdn2.theweek.co.uk/sites/theweek/files/2017/11/131117-wd-cats.jpg
https://www.israelhayom.com/wp-content/uploads/2020/04/why-cats-are-best-pets-worshipped-animals-1559234295.jpg
https://s-i.huffpost.com/gen/964776/images/o-CATS-KILL-BILLIONS-facebook.jpg
https://external-preview.redd.it/gxbKXOj-OF1_RSHa7Ncp8Gs_OFFP5i6V7SU5DPT2t1E.jpg?auto=webp&s=b6e85ba0f1517dc629d21208a7d9db992d550ba9
http://www.zastavki.com/pictures/originals/2013/Animals_Cats_Sleeping_gray_kitten_036760_.jpg
http://mcdaniel.hu/wp-content/uploads/2015/01/6784063-cute-cats-hd.jpg
https://img.webmd.com/dtmcms/live/webmd/consumer_assets/site_images/article_thumbnails/reference_guide/why_cats_sneeze_ref_guide/1800x1200_why_cats_sneeze_ref_guide.jpg
http://www.zastavki.com/pictures/originals/2013/Animals___Cats_Silver_beautiful_Scottish_Fold_cat_045199_.jpg

Output in CSV table

Result format:

[% FOREACH item IN serp;
tools.CSVline(query, item.link, item.width, item.height, item.page, item.thumb);
END %]

Result example:

cats,https://viralcats.net/blog/wp-content/uploads/2017/12/Mean-looking-cat-Viral-Cats-03.jpg,462,722,https://viralcats.net/blog/2017/12/30/10-kitties-that-you-dont-want-to-mess-with/,https://tse2.mm.bing.net/th?id=OIP.AdkhgipoWbJwiQBp9VIWpgAAAA&pid=Api
cats,http://mymodernmet.com/wp/wp-content/uploads/2017/03/gabrielius-khiterer-stray-cats-8.jpg,750,1028,https://mymodernmet.com/gabrielius-khiterer-stray-cat-photos/,https://tse2.mm.bing.net/th?id=OIP.ZjfS8JQc9sahsK0-w8dRFAHaKJ&pid=Api
cats,http://fishsubsidy.org/wp-content/uploads/2020/01/abyssinian-cats.jpg,1204,1445,http://fishsubsidy.org/category/cat/cat-breeds/,https://tse3.mm.bing.net/th?id=OIP.uHEu4-5TLJ6SSgDree6ahQHaI4&pid=Api
cats,https://cdn2.theweek.co.uk/sites/theweek/files/2017/11/131117-wd-cats.jpg,1400,788,https://www.theweek.co.uk/94877/why-are-so-many-australian-towns-introducing-cat-curfews,https://tse3.mm.bing.net/th?id=OIP.iYyPimFLj1_wgKEsTsggQgHaEK&pid=Api
cats,https://www.israelhayom.com/wp-content/uploads/2020/04/why-cats-are-best-pets-worshipped-animals-1559234295.jpg,2119,1415,https://www.israelhayom.com/2020/04/23/2-nyc-cats-test-positive-for-coronavirus-officials-recommend-pet-precautions/,https://tse1.mm.bing.net/th?id=OIP.U7274nc_llbuQTChXpKVNgHaE8&pid=Api
cats,https://s-i.huffpost.com/gen/964776/images/o-CATS-KILL-BILLIONS-facebook.jpg,1536,1536,https://www.huffingtonpost.com/2013/01/30/domestic-cats-kill-billions-mice-birds-annually-study_n_2575833.html,https://tse1.mm.bing.net/th?id=OIP.ETFxELWtgKQwMlcoccq-SAHaHa&pid=Api
cats,https://external-preview.redd.it/gxbKXOj-OF1_RSHa7Ncp8Gs_OFFP5i6V7SU5DPT2t1E.jpg?auto=webp&s=b6e85ba0f1517dc629d21208a7d9db992d550ba9,1920,2560,https://www.reddit.com/r/cats/comments/2k2pio/my_very_ugly_cat/,https://tse1.mm.bing.net/th?id=OIP.t2BxlpEwcGrXJJQSToWVBAHaJ4&pid=Api
cats,http://www.zastavki.com/pictures/originals/2013/Animals_Cats_Sleeping_gray_kitten_036760_.jpg,2560,1600,http://www.zastavki.com/eng/Animals/Cats/wallpaper-36760.htm,https://tse4.mm.bing.net/th?id=OIP.3c_ISLWidlMWXHfjqkpB2wHaEo&pid=Api
cats,http://mcdaniel.hu/wp-content/uploads/2015/01/6784063-cute-cats-hd.jpg,2560,1600,http://mcdaniel.hu/cat-adoption-101/,https://tse4.mm.bing.net/th?id=OIP.QdEkrZjd1c_VN_aUtleoFgHaEo&pid=Api

Saving in SQL format

Result format:

[% FOREACH serp;
"INSERT INTO serp VALUES('" _ query _ "', '"; link _ "', '"; page _ "', '"; thumb _ "')\n";
END %]

Result example:

INSERT INTO serp VALUES('cats', 'https://viralcats.net/blog/wp-content/uploads/2017/12/Mean-looking-cat-Viral-Cats-03.jpg', 'https://viralcats.net/blog/2017/12/30/10-kitties-that-you-dont-want-to-mess-with/', 'https://tse2.mm.bing.net/th?id=OIP.AdkhgipoWbJwiQBp9VIWpgAAAA&pid=Api')
INSERT INTO serp VALUES('cats', 'http://mymodernmet.com/wp/wp-content/uploads/2017/03/gabrielius-khiterer-stray-cats-8.jpg', 'https://mymodernmet.com/gabrielius-khiterer-stray-cat-photos/', 'https://tse2.mm.bing.net/th?id=OIP.ZjfS8JQc9sahsK0-w8dRFAHaKJ&pid=Api')
INSERT INTO serp VALUES('cats', 'http://fishsubsidy.org/wp-content/uploads/2020/01/abyssinian-cats.jpg', 'http://fishsubsidy.org/category/cat/cat-breeds/', 'https://tse3.mm.bing.net/th?id=OIP.uHEu4-5TLJ6SSgDree6ahQHaI4&pid=Api')
INSERT INTO serp VALUES('cats', 'https://cdn2.theweek.co.uk/sites/theweek/files/2017/11/131117-wd-cats.jpg', 'https://www.theweek.co.uk/94877/why-are-so-many-australian-towns-introducing-cat-curfews', 'https://tse3.mm.bing.net/th?id=OIP.iYyPimFLj1_wgKEsTsggQgHaEK&pid=Api')
INSERT INTO serp VALUES('cats', 'https://www.israelhayom.com/wp-content/uploads/2020/04/why-cats-are-best-pets-worshipped-animals-1559234295.jpg', 'https://www.israelhayom.com/2020/04/23/2-nyc-cats-test-positive-for-coronavirus-officials-recommend-pet-precautions/', 'https://tse1.mm.bing.net/th?id=OIP.U7274nc_llbuQTChXpKVNgHaE8&pid=Api')
INSERT INTO serp VALUES('cats', 'https://s-i.huffpost.com/gen/964776/images/o-CATS-KILL-BILLIONS-facebook.jpg', 'https://www.huffingtonpost.com/2013/01/30/domestic-cats-kill-billions-mice-birds-annually-study_n_2575833.html', 'https://tse1.mm.bing.net/th?id=OIP.ETFxELWtgKQwMlcoccq-SAHaHa&pid=Api')
INSERT INTO serp VALUES('cats', 'https://external-preview.redd.it/gxbKXOj-OF1_RSHa7Ncp8Gs_OFFP5i6V7SU5DPT2t1E.jpg?auto=webp&s=b6e85ba0f1517dc629d21208a7d9db992d550ba9', 'https://www.reddit.com/r/cats/comments/2k2pio/my_very_ugly_cat/', 'https://tse1.mm.bing.net/th?id=OIP.t2BxlpEwcGrXJJQSToWVBAHaJ4&pid=Api')
INSERT INTO serp VALUES('cats', 'http://www.zastavki.com/pictures/originals/2013/Animals_Cats_Sleeping_gray_kitten_036760_.jpg', 'http://www.zastavki.com/eng/Animals/Cats/wallpaper-36760.htm', 'https://tse4.mm.bing.net/th?id=OIP.3c_ISLWidlMWXHfjqkpB2wHaEo&pid=Api')
INSERT INTO serp VALUES('cats', 'http://mcdaniel.hu/wp-content/uploads/2015/01/6784063-cute-cats-hd.jpg', 'http://mcdaniel.hu/cat-adoption-101/', 'https://tse4.mm.bing.net/th?id=OIP.QdEkrZjd1c_VN_aUtleoFgHaEo&pid=Api')

Dump results to JSON

Общий формат результата:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.query = query;
obj.images = [];

FOREACH item IN p1.serp;
obj.images.push({
width = item.width
height = item.height
link = item.link
pagelink = item.pagelink
thumb = item.thumb
});
END;

obj.json %]

Начальный текст:

[

Конечный текст:

]

Result example:

[{
"images": [
{
"link": "https://viralcats.net/blog/wp-content/uploads/2017/12/Mean-looking-cat-Viral-Cats-03.jpg",
"width": 462,
"thumb": "https://tse2.mm.bing.net/th?id=OIP.AdkhgipoWbJwiQBp9VIWpgAAAA&pid=Api",
"height": 722
},
{
"link": "http://mymodernmet.com/wp/wp-content/uploads/2017/03/gabrielius-khiterer-stray-cats-8.jpg",
"width": 750,
"thumb": "https://tse2.mm.bing.net/th?id=OIP.ZjfS8JQc9sahsK0-w8dRFAHaKJ&pid=Api",
"height": 1028
},
{
"link": "http://fishsubsidy.org/wp-content/uploads/2020/01/abyssinian-cats.jpg",
"width": 1204,
"thumb": "https://tse3.mm.bing.net/th?id=OIP.uHEu4-5TLJ6SSgDree6ahQHaI4&pid=Api",
"height": 1445
},

],
"query": "cats"
}]
tip

For the options "Start text" and "End text" to be available in the Task Editor, , you need to activate "More options".

Available settings

ParameterDefault ValueDescription
Pages count5Number of pages to scrape
LocationUnited StatesSearch by region
LanguageEnglish of United StatesOutput language
Safe searchModerateSafe search
SizeAllImage size
TypeAllImage type
LayoutAllLayout
ColorAllColor