SE::Yandex::Balaboba - text scraper for Balaboba
Scraper Overview
SE::Yandex::Balaboba is a text scraper for Balaboba: it collects texts generated by the service of the same name.
You can use automatic query multiplication, substitution of subqueries from files, enumeration of alphanumeric combinations and lists to obtain the maximum possible number of results.
A-Parser functionality allows you to save the settings of the SE::Yandex::Balaboba scraper for further use (presets), set up a parsing schedule, and much more.
Results can be saved in the form and structure you need: the powerful built-in Template Toolkit templating engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected Data
- Generated text
- Style by which the text was generated
- Link to the image
Capabilities
- Scraping unique texts with the ability to choose the text style (parameter Style): Recipes (RU), Short stories (RU), Recipes (EN), and others
- Choosing the style number that can be seen in the browser and scraping text in that style when it is not available in the style selection option (parameter ID of custom style)
Use Cases
- Mass collection of unique texts
Queries
Queries should be specified as phrases from which the generation will begin, for example:
Жили были
Query Substitutions
You can use built-in macros to automatically substitute subqueries from files. For example, to add a list of extra words to each query, first specify several main queries:
Жили были
Fantasy
Tower defense
In the query format, specify a macro that substitutes additional words from the keywords.txt file. This method increases the variability of queries many times over:
{subs:keywords} $query
For each original query, this macro creates as many additional queries as there are entries in the file, so the total is [number of original queries] x [number of entries in the keywords.txt file] = [total number of queries].
For example, if the keywords.txt file contains:
free
online
then the substitution macro will turn the 3 main queries into 6:
free Жили были
online Жили были
free Fantasy
online Fantasy
free Tower defense
online Tower defense
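The multiplication described above can be reproduced outside A-Parser to estimate how many queries a job will produce. The following is a minimal Python sketch, not A-Parser's own implementation: the function name `expand_queries` is hypothetical, the keyword list stands in for the contents of keywords.txt, and the macro is modeled as plain string replacement over the three main queries listed earlier.

```python
from itertools import product


def expand_queries(queries, keywords, query_format="{subs:keywords} $query"):
    """Simulate the substitution macro: one output query per (query, keyword) pair.

    Hypothetical helper for illustration only; A-Parser performs this internally.
    """
    expanded = []
    # product() keeps the results grouped by original query, keywords in file order
    for query, keyword in product(queries, keywords):
        line = query_format.replace("{subs:keywords}", keyword).replace("$query", query)
        expanded.append(line)
    return expanded


queries = ["Жили были", "Fantasy", "Tower defense"]
keywords = ["free", "online"]  # assumed contents of keywords.txt

expanded = expand_queries(queries, keywords)
print(len(expanded))  # number of original queries x number of keywords
for line in expanded:
    print(line)
```

Running the sketch before launching a large job gives a quick check that the keyword file and the query format combine as intended.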
Output Results Examples
A-Parser supports flexible result formatting thanks to the built-in Template Toolkit templating engine, which can output results in any form, including structured formats such as CSV or JSON.
Default Output
Result format:
$style: $text\n
Result example:
Без стиля (RU): Жили были три поросенка, три брата.
И у каждого из них был дом.
Это были очень дружные поросята.
Они помогали друг другу во всем, а если что-нибудь случалось с одним из них, то другой брат всегда приходил на помощь.
Однажды пошел сильный снег, и братья решили спрятаться от него в своих домах.
Но тут из-за угла вышел серый волк.
Он был голоден и увидел, что в домах не было дверей.
Тогда волк решил зайти в первый дом и съесть поросенка.
Волк быстро открыл дверь и заглянул туда.
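If the default output is post-processed by a script, a single result can be separated back into its style and text. The sketch below is a hedged Python illustration: the helper `split_result` is hypothetical, and it assumes that one result follows the `$style: $text\n` format and that the style name itself contains no `": "` sequence.

```python
def split_result(block):
    """Split one default-format result into (style, text) at the first ': '.

    Hypothetical helper; assumes the style name contains no ': ' sequence.
    """
    style, _, text = block.partition(": ")
    return style.strip(), text.strip()


# Shortened copy of the result example above
sample = (
    "Без стиля (RU): Жили были три поросенка, три брата.\n"
    "И у каждого из них был дом.\n"
)

style, text = split_result(sample)
print(style)
print(text.splitlines()[0])
```

Because the generated text can span several lines, a results file holding many results would need an unambiguous separator in the result format; choosing one is left to the template.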
Possible Settings
Parameter | Default Value | Description |
---|---|---|
Style | Random (All languages) | Choose the text style |
ID of custom style | | Set the style number for text generation |
Repeat if Balaboba reports about error | ☑ | Retry parsing if Balaboba shows an error message |
Repeat if Balaboba reports about bad query | ☑ | Retry parsing if Balaboba shows a message about an incorrect query |