Reddit::PostInfo - Scraper for Reddit Post Information

Overview of the Reddit::PostInfo scraper

Reddit::PostInfo - scraper for post information on Reddit.

Collects information about the post, including comments.

You can use automatic query replication, substitution of subqueries from files, iteration through alphanumeric combinations and lists to obtain the maximum possible number of results.

A-Parser's functionality allows you to save the scraping settings for the Reddit::PostInfo scraper for future use (presets), set a scraping schedule, and much more.

Results can be saved in the format and structure you need, thanks to the powerful built-in templating engine Template Toolkit, which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL and CSV.

Data collected

  • Link to the post
  • Title and flair
  • Score, number of comments, and number of awards
  • Creation date
  • Community where the post was published
  • Author and their flair
  • Post content: text in markdown, link to media content, and link to an external resource
  • Whether the post is sponsored

Array of comments:

  • ID
  • Parent ID
  • Link
  • Author
  • Text (cleaned of tags)
  • Text (with tags)
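
Because each comment carries its own ID and a parent ID, the flat array can be regrouped into a reply tree after scraping. The following is a minimal Python sketch of that idea, not part of A-Parser itself. The key names (`id`, `parent_id`, `author`, `text`) and the sample values are assumptions made for illustration; the actual field names and the way the parent of a top-level comment is reported depend on the scraper's output and on your result format.

```python
from collections import defaultdict


def build_comment_tree(comments):
    """Group a flat list of comment dicts into a nested tree by parent ID.

    Assumes (hypothetically) that every comment has an "id" key and a
    "parent_id" key. Comments whose parent is not in the list are treated
    as top-level replies to the post.
    """
    known_ids = {c["id"] for c in comments}
    children = defaultdict(list)
    for c in comments:
        parent = c.get("parent_id")
        key = parent if parent in known_ids else None
        children[key].append(c)

    def attach(parent_key):
        # Copy each comment and add its replies recursively.
        return [
            {**c, "replies": attach(c["id"])}
            for c in children.get(parent_key, [])
        ]

    return attach(None)


# Made-up sample data, only to show the shape of the result.
sample = [
    {"id": "c1", "parent_id": "post", "author": "alice", "text": "First"},
    {"id": "c2", "parent_id": "c1", "author": "bob", "text": "Reply to first"},
    {"id": "c3", "parent_id": "post", "author": "carol", "text": "Second"},
]
tree = build_comment_tree(sample)
```

Note that when the number of scraped comments is limited, some replies may point to parents that were not collected; the sketch places those at the top level rather than dropping them.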

Capabilities

  • Ability to limit the number of comments to scrape

Queries

One query type is supported: a link to a Reddit post.

Example:

https://www.reddit.com/r/Audi/comments/151atr5/audi_r8_high_speed_crash_294_km/
https://www.reddit.com/r/Lexus/comments/1dc7r2m/anyone_come_from_audi_to_lexus/

By default, the result contains information about the post without comments.
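
Before loading a large list of queries, it can help to check that each line looks like a post link. The sketch below is one possible pre-check in Python, written outside A-Parser. The URL pattern is inferred only from the two example queries above (`/r/<community>/comments/<post id>/<slug>/`); whether the scraper accepts other URL forms is not stated here, and the function name `parse_post_url` is hypothetical.

```python
import re
from urllib.parse import urlparse

# Path pattern inferred from the example queries; this is an assumption,
# not a documented rule of the scraper.
POST_PATH = re.compile(
    r"^/r/(?P<community>[^/]+)/comments/(?P<post_id>[A-Za-z0-9]+)"
    r"(?:/(?P<slug>[^/]*))?/?$"
)


def parse_post_url(url):
    """Return the community, post ID and slug of a post URL, or None."""
    parts = urlparse(url.strip())
    host = (parts.hostname or "").lower()
    is_reddit = host == "reddit.com" or host.endswith(".reddit.com")
    if parts.scheme not in ("http", "https") or not is_reddit:
        return None
    match = POST_PATH.match(parts.path)
    return match.groupdict() if match else None


queries = [
    "https://www.reddit.com/r/Audi/comments/151atr5/audi_r8_high_speed_crash_294_km/",
    "https://www.reddit.com/r/Lexus/comments/1dc7r2m/anyone_come_from_audi_to_lexus/",
    "https://www.reddit.com/r/Audi/",  # community page, not a post
]
valid = [q for q in queries if parse_post_url(q) is not None]
```

In this sketch `valid` keeps only the two post links, and the community page is filtered out before it reaches the scraper.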

Output options

A-Parser supports flexible result formatting thanks to the built-in templating engine Template Toolkit, which allows it to output results in an arbitrary form, as well as in a structured format, such as CSV or JSON.

Available settings

Parameter | Default value | Description
Max comments count | 50 | Number of comments to scrape