Reddit::Comments - Reddit comments scraper

Overview of Reddit::Comments scraper

Reddit::Comments is a scraper for comments on Reddit.

It collects a list of comments from Reddit, along with detailed information about each one.

You can use automatic query multiplication, substitution of subqueries from files, iteration over alphanumeric combinations and lists to get the maximum possible number of results.
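To illustrate what iterating over alphanumeric combinations means in practice, here is a minimal Python sketch that appends every combination of characters to a base keyword. It is not A-Parser's own mechanism or syntax; the function name `expand_queries` and its parameters are hypothetical and exist only for this example.

```python
from itertools import product
from string import ascii_lowercase, digits


def expand_queries(base, alphabet=ascii_lowercase + digits, length=1):
    """Yield `base` followed by each combination of `length` characters.

    Hypothetical helper: it only illustrates the idea of multiplying one
    query into many, not how A-Parser implements it.
    """
    for combo in product(alphabet, repeat=length):
        yield f"{base} {''.join(combo)}"


# One base keyword becomes 36 queries: "wordpress a" ... "wordpress 9".
queries = list(expand_queries("wordpress", length=1))
```

Each generated query can surface comments that the bare keyword alone would not return, which is why this kind of expansion increases the total number of results.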

A-Parser lets you save the Reddit::Comments scraper settings for future use (presets), set a scraping schedule, and much more.

Results can be saved in whatever form and structure you need, thanks to the powerful built-in Template Toolkit templating engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Collected data

Array of comments:

  • Link to the comment
  • Content of the comment (in markdown)
  • Rating and number of awards
  • Date of comment creation
  • Author of the comment and their flair
  • Link to the post to which the comment relates
  • Title of the post and its flair
  • Post rating, number of comments on it, and number of awards
  • Date of post creation
  • Community in which the post was published
  • Author of the post and their flair
  • Content of the post: text in markdown, link to media content, and link to an external resource

Capabilities

  • Specifying the number of pages for scraping
  • Specifying the method of sorting results
  • Ability to scrape within a specific community

Use cases

  • Any scenario that requires collecting comments left on Reddit posts

Queries

Supports 2 types of queries:

Keywords

Example:

wordpress features
parser

By default, the result will display a list of links to comments, for example:

https://www.reddit.com/r/node/comments/14lmqbq/how_to_work_with_xlsx_files/jpy3r5a/
https://www.reddit.com/r/StardewValley/comments/14qidly/having_problems_installing_stardew_valley/jqnalwz/
https://www.reddit.com/r/elasticsearch/comments/14pr86i/how_to_parsing_this_lin_logstash/jqkstjw/
https://www.reddit.com/r/vexillology/comments/14fh5th/flag_of_riga_michigan/jp10w17/
https://www.reddit.com/r/Marvel/comments/14otc3t/hank_pym_is_a_really_humble_guy_the_mighty/jqf27xy/
https://www.reddit.com/r/math/comments/14p1lkg/from_the_perspective_of_you_mathematicians_what/jqgug4q/
https://www.reddit.com/r/Wordpress/comments/14okx06/help_looking_for_a_specific_plugin_for_booking/jqhwtu5/
https://www.reddit.com/r/osr/comments/13u8g7s/difference_between_whitebox_whitehack/jlzhthi/
...
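The comment links in this output follow a regular pattern: community, post ID, post slug, and comment ID. As a sketch of how such links could be post-processed outside A-Parser, the Python snippet below splits one link into those parts. The `CommentLink` type and `parse_comment_link` function are hypothetical names, and the path layout is an assumption inferred from the sample links above.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class CommentLink(NamedTuple):
    community: str
    post_id: str
    post_slug: str
    comment_id: str


def parse_comment_link(url: str) -> Optional[CommentLink]:
    """Split a comment link into its parts, or return None if it does not match.

    Assumed path layout (inferred from the sample output):
    /r/<community>/comments/<post_id>/<post_slug>/<comment_id>/
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) != 6 or parts[0] != "r" or parts[2] != "comments":
        return None
    return CommentLink(parts[1], parts[3], parts[4], parts[5])


link = parse_comment_link(
    "https://www.reddit.com/r/node/comments/14lmqbq/how_to_work_with_xlsx_files/jpy3r5a/"
)
```

Links that do not have the six expected path segments, such as a bare community link, yield `None` rather than a partially filled record.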

The scraper also supports searching by keyword within a specific community. To do this, the query must contain the keyword, followed by a space and the link to the community. Example:

jesus https://www.reddit.com/r/atheism/
stage 3 https://www.reddit.com/r/Audi/

By default, the result will display a list of links to comments in the specified communities, for example:

https://www.reddit.com/r/atheism/comments/14dp1rv/sen_josh_hawley_shares_his_mindblowingly_stupid/jor20zd/
https://www.reddit.com/r/atheism/comments/14kt69e/why_do_my_christian_friends_view_my_atheism_as_an/jpsgbe5/
https://www.reddit.com/r/atheism/comments/14p6yir/finally_happened_the_one_babysitter_we_can_get/jqhk48s/
https://www.reddit.com/r/Audi/comments/14nyn9m/excuse_me_we_late/jqbdu2a/
https://www.reddit.com/r/Audi/comments/14oqxce/talk_me_inout_of_buying_this_gorgeous_audi_s5/jqev0p6/
https://www.reddit.com/r/Audi/comments/14pqr8a/is_this_a_good_deal_in_your_guys_opinions/jql4wnb/
...
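The community query format (keyword, space, community link) can be checked before a query list is submitted. The Python sketch below splits a query line into its keyword and optional community link. It is only an illustration of the format described above: the function name `split_query` is hypothetical, and how A-Parser parses queries internally is not stated here.

```python
from typing import Optional, Tuple

COMMUNITY_PREFIX = "https://www.reddit.com/r/"


def split_query(query: str) -> Tuple[str, Optional[str]]:
    """Split 'keyword <community link>' into (keyword, community link or None).

    Assumption: the community link, when present, is the last
    space-separated token and starts with COMMUNITY_PREFIX.
    """
    query = query.strip()
    keyword, sep, last = query.rpartition(" ")
    if sep and last.startswith(COMMUNITY_PREFIX):
        return keyword.strip(), last
    return query, None


keyword, community = split_query("stage 3 https://www.reddit.com/r/Audi/")
```

Because only the last token is treated as the link, multi-word keywords such as `stage 3` are kept intact.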

Result output options

A-Parser supports flexible formatting of results thanks to the built-in templating engine Template Toolkit, which allows it to output results in any form, as well as in structured formats, such as CSV or JSON.

Possible settings

Parameter      Default value    Description
Pages count    5                Number of search result pages
Sort           Relevance        Sorting of results