
SE::Yandex::Speller - Checking pages for text errors via Yandex.Speller

Overview

SE::Yandex::Speller

SE::Yandex::Speller finds spelling errors in Russian, Ukrainian, or English text on the specified page via the Yandex.Speller service. The service's language models include hundreds of millions of words and phrases.

A-Parser lets you save settings for the SE::Yandex::Speller parser as presets for future use, set up parsing schedules, and much more.

Results can be saved in any format and structure you need thanks to the built-in Template Toolkit, which lets you apply additional logic to results and output data in various formats, including JSON, SQL, and CSV.

Collected data

  • Text blocks where errors were found

Capabilities

  • Determining the number of blocks containing errors
  • Outputting possible reasons for errors in the text

Use cases

  • Finding the number of text blocks containing errors
  • Checking website pages for spelling errors

Queries

The parser can accept both keywords (text strings) and page links as input. The query type is determined automatically.

  • Examples of queries as text strings:
Text for checking with Yandex Speller parser
Query with an eror
  • Examples of queries as page addresses to check:
https://a-parser.com/
https://en.wikipedia.org/wiki/Parsing
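
The automatic query-type detection described above can be illustrated with a minimal Python sketch. The function name classify_query and the rule it applies (an http or https scheme plus a host means a page address) are assumptions made for illustration; this document does not describe the rule A-Parser actually uses.

```python
from urllib.parse import urlparse


def classify_query(query: str) -> str:
    """Return 'url' for http(s) page addresses, otherwise 'text'.

    Hypothetical heuristic: the parser's real detection rule is not
    described in this document.
    """
    parsed = urlparse(query.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "text"


for q in ["Query with an eror", "https://a-parser.com/"]:
    print(q, "->", classify_query(q))
```

Under this heuristic, a plain sentence is checked as text and an address is fetched and checked as a page.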

Output results examples

A-Parser supports flexible result formatting thanks to the built-in Template Toolkit, which allows it to output results in any form, as well as in structured formats like CSV or JSON.

Default output

Result format:

$query: $total\n$errors.format('$word ($suggest) - $type\n')

Result example:

Query with an eror: 1
eror (error) - Word not found in dictionary.
Text for checking with Yandex Speller parser: 0
https://a-parser.com/: 10
sugestions (suggestions) - Word not found in dictionary.
datta (data) - Word not found in dictionary.
MOZ (DMOZ) - Word not found in dictionary.
NodeJS (Node JS) - Word not found in dictionary.
Develop (Developing) - Word not found in dictionary.
...
https://en.wikipedia.org/wiki/Parsing: 183
• العربية (• العربية) - Text contains too many errors.
• বাংলা (• বাংলা) - Text contains too many errors.
...
material (material) - Word not found in dictionary.
parsed (passed) - Word not found in dictionary.
they (that) - Word not found in dictionary.
...
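
To make the layout of the default format easier to follow, here is a rough Python equivalent. It is only a sketch: render_default is a hypothetical helper, the error records are assumed to be dictionaries with word, suggest, and type keys, and $total is assumed to equal the number of listed errors.

```python
def render_default(query, errors):
    r"""Mimic `$query: $total\n$errors.format('$word ($suggest) - $type\n')`.

    Assumption: $total is the number of entries in the errors list.
    """
    lines = [f"{query}: {len(errors)}"]
    for e in errors:
        lines.append(f"{e['word']} ({e['suggest']}) - {e['type']}")
    return "\n".join(lines) + "\n"


sample = [{"word": "eror", "suggest": "error",
           "type": "Word not found in dictionary."}]
print(render_default("Query with an eror", sample), end="")
```

For the sample record this prints the same two lines as the first entry in the result example above.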

Saving in SQL format

Result format:

[% FOREACH errors;
"INSERT INTO errors VALUES('" _ word _ "', '" _ suggest _ "', '" _ type _ "')\n";
END %]

Result example:

INSERT INTO errors VALUES('SaaS', 'Seas', 'Word not found in dictionary.')
INSERT INTO errors VALUES('freelancers', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('Affiliate Marketers', 'Affiliate Marketers', 'Word not found in dictionary.')
INSERT INTO errors VALUES('Youtube', 'YouTube', 'Incorrect use of uppercase and lowercase letters.')
INSERT INTO errors VALUES('emails', 'mails', 'Word not found in dictionary.')
INSERT INTO errors VALUES('WordStat', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('Link building', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('outreach', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('Alexa', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('SEMRush', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('Ahrefs', 'Href', 'Word not found in dictionary.')
INSERT INTO errors VALUES('MajesticSEO', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('SerpStat', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('freelancers', '', 'Word not found in dictionary.')
INSERT INTO errors VALUES('SaaS', 'Saab,Seas,SAS', 'Word not found in dictionary.')
INSERT INTO errors VALUES('SaaS', 'Seas,SAS', 'Word not found in dictionary.')
INSERT INTO errors VALUES('NodeJS', 'Nodes', 'Word not found in dictionary.')
INSERT INTO errors VALUES('NodeJS', 'Nodes', 'Word not found in dictionary.')
INSERT INTO errors VALUES('async', 'sync', 'Word not found in dictionary.')
INSERT INTO errors VALUES('lead generation', 'lead generation', 'Word not found in dictionary.')
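
Note that the template above concatenates values into the statement as-is, so a word containing an apostrophe (for example Parser'a) would produce invalid SQL. The Python sketch below shows one way to post-process results safely by doubling single quotes. The helpers sql_quote and to_insert are hypothetical names, and the record shape (word, suggest, type) is assumed from the result format.

```python
def sql_quote(value: str) -> str:
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def to_insert(error: dict) -> str:
    """Build one INSERT statement in the same layout as the template."""
    fields = (error["word"], error["suggest"], error["type"])
    return "INSERT INTO errors VALUES(" + ", ".join(sql_quote(f) for f in fields) + ")"


record = {"word": "Parser'a", "suggest": "",
          "type": "Word not found in dictionary."}
print(to_insert(record))
```

When the data is loaded by a program rather than a script file, parameterized queries in the database driver are usually the safer choice.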

Dump results to JSON

General result format:

[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;

obj = {};
obj.errors = p1.errors;

obj.json %]

Initial text:

[

Final text:

]

Result example:

[{"errors": [{"word":"SaaS","suggest":"Seas","type":"Word not found in dictionary."},{"word":"freelancers","suggest":"","type":"Word not found in dictionary."},{"word":"Affiliate Marketers","suggest":"Affiliate Marketers","type":"Word not found in dictionary."},{"word":"Youtube","suggest":"YouTube","type":"Incorrect use of uppercase and lowercase letters."},{"word":"emails","suggest":"mails","type":"Word not found in dictionary."},{"word":"WordStat","suggest":"","type":"Word not found in dictionary."},{"word":"Linkbuilding","suggest":"","type":"Word not found in dictionary."},{"word":"outreach","suggest":"","type":"Word not found in dictionary."},{"word":"Alexa","suggest":"","type":"Word not found in dictionary."},{"word":"SEMRush","suggest":"","type":"Word not found in dictionary."},{"word":"Ahrefs","suggest":"Href","type":"Word not found in dictionary."},{"word":"MajesticSEO","suggest":"","type":"Word not found in dictionary."},{"word":"SerpStat","suggest":"","type":"Word not found in dictionary."},{"word":"freelancers","suggest":"","type":"Word not found in dictionary."},{"word":"SaaS","suggest":"Saab,Seas,SAS","type":"Word not found in dictionary."},{"word":"SaaS","suggest":"Seas,SAS","type":"Word not found in dictionary."},{"word":"NodeJS","suggest":"Nodes","type":"Word not found in dictionary."},{"word":"Parser'a","suggest":"","type":"Word not found in dictionary."},{"word":"NodeJS","suggest":"Nodes","type":"Word not found in dictionary."},{"word":"async","suggest":"sync","type":"Word not found in dictionary."},{"word":"lead generation","suggest":"lead generation","type":"Word not found in dictionary."},{"word":"Parse","suggest":"Paring","type":"Word not found in dictionary."},{"word":"Instagram","suggest":"","type":"Word not found in dictionary."},{"word":"marketplaces","suggest":"","type":"Word not found in dictionary."},{"word":"marketplaces","suggest":"","type":"Word not found in dictionary."},{"word":"marketplace","suggest":"","type":"Word not found in dictionary."},{"word":"Instagram","suggest":"","type":"Word not found in dictionary."},{"word":"Bing","suggest":"","type":"Word not found in dictionary."},{"word":"news sites","suggest":"","type":"Word not found in dictionary."},{"word":"Redis","suggest":"","type":"Word not found in dictionary."},{"word":"scrape","suggest":"","type":"Word not found in dictionary."},{"word":"captchas","suggest":"","type":"Word not found in dictionary."},{"word":"XEvil","suggest":"Evil,Devil","type":"Word not found in dictionary."},{"word":"CapMonster","suggest":"Cap Monster","type":"Word not found in dictionary."},{"word":"Captcha","suggest":"","type":"Word not found in dictionary."},{"word":"RuCaptcha","suggest":"","type":"Word not found in dictionary."},{"word":"scrape","suggest":"dispute","type":"Word not found in dictionary."},{"word":"scrape","suggest":"","type":"Word not found in dictionary."},{"word":"scrape","suggest":"request","type":"Word not found in dictionary."},{"word":"brief","suggest":"","type":"Word not found in dictionary."},{"word":"tickets","suggest":"","type":"Word not found in dictionary."},{"word":"Parser’om","suggest":"","type":"Word not found in dictionary."},{"word":"Parser'om","suggest":"","type":"Word not found in dictionary."},{"word":"tools","suggest":"nodes,aces,tools","type":"Word not found in dictionary."}]}]
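
The way the JSON template, the initial text, and the final text combine into one valid array can be sketched in Python. This is an illustration of the separator logic only (the notFirst flag), not A-Parser's implementation; dump_results is a hypothetical helper and each element of results stands for the errors list of one query.

```python
import json


def dump_results(results):
    """Join per-query JSON objects the way the template does."""
    parts = ["["]                  # "Initial text"
    not_first = False
    for errors in results:
        if not_first:
            parts.append(",\n")    # separator before every object but the first
        else:
            not_first = True
        parts.append(json.dumps({"errors": errors}))
    parts.append("]")              # "Final text"
    return "".join(parts)


text = dump_results([
    [{"word": "eror", "suggest": "error",
      "type": "Word not found in dictionary."}],
    [],
])
data = json.loads(text)            # the assembled output parses as a JSON array
print(len(data), "queries,", len(data[0]["errors"]), "error(s) in the first")
```

Writing the comma before each object instead of after it avoids a trailing comma, which JSON does not allow.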

Possible settings

Parameter | Default value | Description
Languages | English, Russian, Ukrainian | Languages to check
Options | Skip words written in capital letters, e.g., "MIC"; skip words with numbers, e.g., "avp17x4534"; skip internet addresses, email addresses, and filenames; ignore Roman numerals ("I, II, III, ...") | Spell-check options
HTML::TextExtractor preset | default | Preset for HTML::TextExtractor. Allows specifying text parsing settings
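
For readers who call Yandex.Speller directly, the four default options listed above appear to correspond to bit flags that the service combines into a single options number. The sketch below is an assumption on two counts: the flag names and values are quoted from memory of the public Yandex.Speller API documentation and should be verified there, and this document does not say how A-Parser passes its settings to the service.

```python
# Flag values as recalled from the Yandex.Speller API documentation.
# Assumption: verify against the current docs before relying on them.
IGNORE_UPPERCASE = 1          # skip words written in capital letters
IGNORE_DIGITS = 2             # skip words with numbers
IGNORE_URLS = 4               # skip internet addresses, emails, filenames
IGNORE_ROMAN_NUMERALS = 2048  # ignore Roman numerals


def combine(*flags: int) -> int:
    """Combine individual option flags into one bitmask."""
    value = 0
    for flag in flags:
        value |= flag
    return value


default_options = combine(IGNORE_UPPERCASE, IGNORE_DIGITS,
                          IGNORE_URLS, IGNORE_ROMAN_NUMERALS)
print(default_options)
```

Because each flag occupies its own bit, an option can be tested later with a bitwise AND, for example default_options & IGNORE_URLS.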