Reddit::Comments - Reddit comments scraper

Overview of the Reddit::Comments scraper

Reddit::Comments - scraper for Reddit comments.

Collects a list of comments and a set of information for each of them from the service of the same name.

You can use automatic query multiplication, substitution of subqueries from files, iteration of alphanumeric combinations and lists to get the maximum possible number of results.

A-Parser functionality allows you to save the scraping settings for the Reddit::Comments scraper for future use (presets), set a scraping schedule, and much more.

Results can be saved in the form and structure you need thanks to the built-in Template Toolkit templating engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Data Collected

Array of comments:

  • Comment link
  • Comment content (in markdown)
  • Rating and number of awards
  • Comment creation date
  • Comment author and their flair
  • Link to the post this comment belongs to
  • Post title and its flair
  • Post rating, number of comments, and number of awards
  • Post creation date
  • Community where the post was published
  • Post author and their flair
  • Post content: text in markdown, link to media content, and link to an external resource

Capabilities

  • Specify the number of pages to scrape
  • Specify the sort order for results
  • Ability to scrape within a specific community

Use Cases

  • Any scenario where collecting comments left on Reddit posts is required

Queries

2 query options are supported:

Keywords

Example:

wordpress features
parser

By default, a list of links to comments will be output, for example:

https://www.reddit.com/r/node/comments/14lmqbq/how_to_work_with_xlsx_files/jpy3r5a/
https://www.reddit.com/r/StardewValley/comments/14qidly/having_problems_installing_stardew_valley/jqnalwz/
https://www.reddit.com/r/elasticsearch/comments/14pr86i/how_to_parsing_this_lin_logstash/jqkstjw/
https://www.reddit.com/r/vexillology/comments/14fh5th/flag_of_riga_michigan/jp10w17/
https://www.reddit.com/r/Marvel/comments/14otc3t/hank_pym_is_a_really_humble_guy_the_mighty/jqf27xy/
https://www.reddit.com/r/math/comments/14p1lkg/from_the_perspective_of_you_mathematicians_what/jqgug4q/
https://www.reddit.com/r/Wordpress/comments/14okx06/help_looking_for_a_specific_plugin_for_booking/jqhwtu5/
https://www.reddit.com/r/osr/comments/13u8g7s/difference_between_whitebox_whitehack/jlzhthi/
...
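
For downstream processing, the default output can be split into its parts. The following is a minimal Python sketch, not part of A-Parser; it assumes every link follows the layout seen in the sample above (/r/<community>/comments/<post id>/<slug>/<comment id>/), which is inferred from these examples rather than guaranteed. The names CommentRef and parse_comment_link are hypothetical.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class CommentRef(NamedTuple):
    """Parts of a comment link (hypothetical helper type)."""
    community: str
    post_id: str
    slug: str
    comment_id: str


def parse_comment_link(url: str) -> Optional[CommentRef]:
    """Split a comment link into its parts, or return None if it does not match.

    Assumed layout (inferred from the sample output):
    /r/<community>/comments/<post_id>/<slug>/<comment_id>/
    """
    # Drop empty segments produced by leading and trailing slashes.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) != 6 or parts[0] != "r" or parts[2] != "comments":
        return None
    return CommentRef(parts[1], parts[3], parts[4], parts[5])


if __name__ == "__main__":
    link = "https://www.reddit.com/r/node/comments/14lmqbq/how_to_work_with_xlsx_files/jpy3r5a/"
    print(parse_comment_link(link))
```

Links that do not match the assumed layout yield None instead of raising, so a mixed result file can be filtered without extra error handling.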

The scraper supports searching by keyword within a specific community. To do this, you must specify the keyword and, separated by a space, the link to the community in the query. Example:

jesus https://www.reddit.com/r/atheism/
stage 3 https://www.reddit.com/r/Audi/

By default, a list of links to comments will be output, for example:

https://www.reddit.com/r/atheism/comments/14dp1rv/sen_josh_hawley_shares_his_mindblowingly_stupid/jor20zd/
https://www.reddit.com/r/atheism/comments/14kt69e/why_do_my_christian_friends_view_my_atheism_as_an/jpsgbe5/
https://www.reddit.com/r/atheism/comments/14p6yir/finally_happened_the_one_babysitter_we_can_get/jqhk48s/
https://www.reddit.com/r/Audi/comments/14nyn9m/excuse_me_we_late/jqbdu2a/
https://www.reddit.com/r/Audi/comments/14oqxce/talk_me_inout_of_buying_this_gorgeous_audi_s5/jqev0p6/
https://www.reddit.com/r/Audi/comments/14pqr8a/is_this_a_good_deal_in_your_guys_opinions/jql4wnb/
...
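
The two query forms described in this section (a plain keyword, or a keyword followed by a space and a community link) can be generated programmatically when preparing a query file. This is a minimal Python sketch under that assumption; build_queries is a hypothetical helper, not an A-Parser function, and the community URL pattern is taken from the examples above.

```python
from typing import Iterable, List, Optional


def build_queries(keywords: Iterable[str], community: Optional[str] = None) -> List[str]:
    """Build one query line per keyword.

    `community` may be a bare name ("Audi"), "r/Audi", or a full community
    link; it is normalized to https://www.reddit.com/r/<name>/ (assumed form).
    """
    queries = []
    for kw in keywords:
        kw = kw.strip()
        if not kw:
            continue  # skip blank lines in the keyword list
        if community:
            # Keep only the last path segment as the community name.
            name = community.strip().strip("/").split("/")[-1]
            queries.append(f"{kw} https://www.reddit.com/r/{name}/")
        else:
            queries.append(kw)
    return queries


if __name__ == "__main__":
    for line in build_queries(["jesus"], "atheism"):
        print(line)
```

Writing the returned lines to a text file, one per line, gives a list in the same shape as the query examples shown in this section.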

Result Output Options

A-Parser supports flexible result formatting thanks to the built-in template engine Template Toolkit, which allows it to output results in an arbitrary form, as well as in a structured one, such as CSV or JSON.

Possible Settings

Parameter     Default Value   Description
Pages count   5               Number of result pages
Sort          Relevance       Sort order for results