Reddit::Posts - Scraper for Reddit Posts

Overview of the Reddit::Posts Scraper

Reddit::Posts - Reddit posts scraper.

Collects a list of posts from Reddit, along with a set of details for each post.

You can use automatic query multiplication, substitution of sub-queries from files, iteration over alphanumeric combinations and lists to get the maximum possible number of results.
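To illustrate what iterating over alphanumeric combinations means for a query, here is a minimal Python sketch. It is not A-Parser's own mechanism or syntax; the function name `expand_query` and its parameters are hypothetical and exist only for this example.

```python
from itertools import product
from string import ascii_lowercase, digits


def expand_query(base, length=1, alphabet=ascii_lowercase + digits):
    """Yield the base query followed by every combination of the given length.

    Hypothetical helper: it only mimics the idea of query multiplication.
    """
    for combo in product(alphabet, repeat=length):
        yield f"{base} {''.join(combo)}"


queries = list(expand_query("bitcoin"))
print(len(queries))  # 26 letters + 10 digits = 36 queries
print(queries[:3])   # ['bitcoin a', 'bitcoin b', 'bitcoin c']
```

Each generated query can surface posts that the base keyword alone would not return, which is how this technique increases the total number of results.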

A-Parser lets you save scraping settings for the Reddit::Posts scraper for future use (presets), set a scraping schedule, and much more.

Results can be saved in the format and structure you require, thanks to the powerful built-in templating engine Template Toolkit, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.

Data Collected

Array of posts:

  • Link to the post
  • Title and flair
  • Rating, number of comments, and number of awards
  • Creation date
  • Community where the post was published
  • Author and their flair
  • Post content: text in markdown, link to media content, and link to external resource
  • Whether the post is an ad

Capabilities

  • Specify the number of pages for scraping
  • Specify the sorting method for results
  • Select the time range for results
  • Ability to scrape within a specific community

Use Cases

  • Any scenario where data about Reddit posts needs to be retrieved

Queries

Several query options are supported: links and keywords.

Links

Example:

https://www.reddit.com/t/bitcoin/
https://www.reddit.com/t/kim_kardashian/

By default, a list of links to posts will be output, for example:

https://www.reddit.com/r/Bitcoin/comments/14nbyy2/i_took_out_a_35000_loan_to_buy_bitcoin_1_year/
https://www.reddit.com/r/CryptoCurrency/comments/14guprs/bitcoin_is_up_75_since_jim_cramer_told_investors/
https://www.reddit.com/r/Bitcoin/comments/14opp2t/this_guy_was_paid_32_bitcoin_to_hold_up_this_sign/
https://www.reddit.com/r/CryptoCurrency/comments/14ivx43/nearly_69_of_all_bitcoin_supply_did_not_move_in/
https://www.reddit.com/r/CryptoCurrency/comments/149vy0o/bitcoin_dips_below_25k_for_the_first_time_in_3/
...

Time and sorting parameters included in a link are also taken into account; in that case, the corresponding settings defined in the configuration are ignored. Example:

https://www.reddit.com/r/nba/
https://www.reddit.com/r/OrlandoMagic/top/?t=month
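As a rough illustration of how such a link carries its sorting and time range, the following Python sketch splits a community link into its parts. It is not the scraper's implementation; the function name `describe_reddit_link` is hypothetical, and the set of recognized sort names is an assumption based on the sort segments commonly seen in Reddit URLs.

```python
from urllib.parse import urlparse, parse_qs

# Assumed set of sort names that can follow the community name in the path.
KNOWN_SORTS = {"hot", "new", "top", "rising", "controversial"}


def describe_reddit_link(url):
    """Return the community, sort, and time range encoded in a community link."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    is_community = len(segments) >= 2 and segments[0] == "r"
    community = segments[1] if is_community else None
    # The sort, if present, is the path segment right after the community name.
    sort = segments[2] if len(segments) >= 3 and segments[2] in KNOWN_SORTS else None
    # The time range is passed in the "t" query parameter.
    time_range = parse_qs(parsed.query).get("t", [None])[0]
    return {"community": community, "sort": sort, "time": time_range}


print(describe_reddit_link("https://www.reddit.com/r/OrlandoMagic/top/?t=month"))
print(describe_reddit_link("https://www.reddit.com/r/nba/"))
```

For the second link, both `sort` and `time` come back as `None`, which corresponds to the case where the values from the scraper settings would apply.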

By default, a list of links to posts will be output, for example:

https://www.reddit.com/r/OrlandoMagic/comments/14a5br2/
https://www.reddit.com/r/OrlandoMagic/comments/14nqfk1/keep_mo_or_no_mo/
https://www.reddit.com/r/nba/comments/14nfzki/202324_nba_free_agent_tracker/
https://www.reddit.com/user/Grammarly/comments/14ghtld/verbessere_deine_schreibfertigkeit_auf_englisch/
https://www.reddit.com/r/nba/comments/14r4l4s/vernon_dillon_brooks_took_991_shots_last_year_he/
https://www.reddit.com/r/nba/comments/14ql1es/highlight_matt_devlin_inexplicably_yells_punjabi/
https://www.reddit.com/user/TelekomShop/comments/yqkina/der_highspeedhotspot_zum_mitnehmen_die_speedbox/
https://www.reddit.com/r/nba/comments/14qysvi/michael_jordan_with_the_spin_hanging_onehanded/
https://www.reddit.com/r/nba/comments/14qxrep/dwyane_wade_leads_the_redeem_team_with_27_points/
...

Keywords

Example:

wordpress features
parser

By default, a list of links to posts will be output, for example:

https://www.reddit.com/r/ShitpostXIV/comments/14511em/i_am_a_proud_grey_parser/
https://www.reddit.com/r/opengl/comments/147sbjk/4_hours_of_my_obj_parser_so_far/
https://www.reddit.com/r/Compilers/comments/14pi9xh/demystifying_pratt_parsers/
https://www.reddit.com/r/ZETTAHOST/comments/11qdg99/how_to_change_the_wordpress_featured_image_size/
https://www.reddit.com/r/Wordpress/comments/14p1k2p/what_features_is_wordpress_missing_i_want_to_help/
https://www.reddit.com/r/Wordpress/comments/13q8g5x/is_it_possible_and_advisable_to_build_a_website/
...

The scraper supports searching by keyword within a specific community. To do this, write the keyword, then a space, then a link to the community. Example:

jesus https://www.reddit.com/r/atheism/
stage 3 https://www.reddit.com/r/Audi/
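The query format above (keyword, space, community link) can be sketched in Python as follows. This is only an illustration of the format, not the scraper's own parsing code; `split_query` and `COMMUNITY_PREFIX` are hypothetical names introduced for the example.

```python
# Assumed prefix of a community link, taken from the examples above.
COMMUNITY_PREFIX = "https://www.reddit.com/r/"


def split_query(query):
    """Split a query into (keyword, community_link).

    community_link is None when the query is a plain keyword.
    Queries that consist of a link only are a different query type
    and are not handled by this sketch.
    """
    text = query.strip()
    # Only the last space matters, so multi-word keywords stay intact.
    keyword, separator, last_token = text.rpartition(" ")
    if separator and last_token.startswith(COMMUNITY_PREFIX):
        return keyword.strip(), last_token
    return text, None


print(split_query("stage 3 https://www.reddit.com/r/Audi/"))
print(split_query("wordpress features"))
```

Splitting on the last space rather than the first keeps keywords such as "stage 3" whole.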

By default, a list of links to posts will be output, for example:

https://www.reddit.com/r/Audi/comments/vi6cs5/thoughts_on_used_stage_3_2017_a3/
https://www.reddit.com/r/Audi/comments/lfvjuo/just_picked_up_this_beauty_stage_3_b5_s4/
https://www.reddit.com/r/Audi/comments/ssr8ui/anyone_else_track_their_audis_ttrs_stage_3_big/
https://www.reddit.com/r/atheism/comments/14lq0y6/heaven_and_hell_are_not_what_jesus_preached/
https://www.reddit.com/r/atheism/comments/13gxzj6/so_jesus_freaks_can_shove_their_religion_onto/
https://www.reddit.com/r/atheism/comments/13b8kl6/chris_pratt_compares_his_struggles_to_jesus/
https://www.reddit.com/r/atheism/comments/137k88b/artwork_of_jesus_surrounded_by_hot_leather/
...

Output Options

A-Parser supports flexible result formatting thanks to the built-in templating engine Template Toolkit, which allows it to output results in an arbitrary form, as well as in structured formats, such as CSV or JSON.
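If the default output (a plain list of post links) is all you have, it can also be turned into JSON or CSV outside A-Parser. The Python sketch below is one way to do that, assuming the link shapes shown in the examples on this page; it does not use Template Toolkit, and all function names in it are hypothetical.

```python
import csv
import io
import json
import re

# Matches /r/<community>/comments/<id>/ and /user/<name>/comments/<id>/,
# with an optional trailing title slug (some links in the examples have none).
POST_LINK = re.compile(
    r"^https://www\.reddit\.com/(?P<kind>r|user)/(?P<name>[^/]+)"
    r"/comments/(?P<post_id>[^/]+)/(?:(?P<slug>[^/]+)/)?$"
)

FIELDS = ["kind", "name", "post_id", "slug"]


def parse_post_link(link):
    """Return the parts of a post link as a dict, or None if it does not match."""
    match = POST_LINK.match(link.strip())
    return match.groupdict() if match else None


def to_json(links):
    """Serialize all recognizable post links as a JSON array."""
    rows = [row for row in map(parse_post_link, links) if row is not None]
    return json.dumps(rows, indent=2)


def to_csv(links):
    """Serialize all recognizable post links as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for row in map(parse_post_link, links):
        if row is not None:
            writer.writerow(row)  # a missing slug is written as an empty cell
    return buffer.getvalue()


links = [
    "https://www.reddit.com/r/OrlandoMagic/comments/14a5br2/",
    "https://www.reddit.com/r/OrlandoMagic/comments/14nqfk1/keep_mo_or_no_mo/",
]
print(to_json(links))
print(to_csv(links))
```

Doing the same inside A-Parser with a result template avoids the extra step, so this approach is mainly useful when the links have already been exported.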

Possible Settings

Parameter | Default Value | Description
Pages count | 5 | Number of output pages
Sort | Relevance | Result sorting
Time | All time | Result time range
Use HTTP/2 transport | | Determines whether to use HTTP/2 instead of HTTP/1.1