Rank::CMS - identification of more than 600 types of CMS based on characteristics. Identifies all popular forums, blogs, CMS, guestbooks, wikis, and many other types of engines
Overview
Rank::CMS – identifies more than 600 types of CMS based on characteristics. Identifies all popular forums, blogs, CMS, guestbooks, wikis, and many other types of engines.The functionality of A-Parser allows you to save the scraping settings of the Rank::CMS scraper for further use (presets), set a scraping schedule, and much more.
Saving results is possible in the form and structure that you need, thanks to the built-in powerful template engine Template Toolkit which allows you to apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected Data
- CMS Name
- Category Name
List of Supported CMS
"1C-Bitrix", "2z Project", "3dCart", "Accessible Portal", "actionhero.js", "Adobe CQ5", "Ametys", "Amiro.CMS", "AMPcms", "Anchor CMS", "AsciiDoc", "Backdrop", "Banshee", "BIGACE", "Bolt", "BrowserCMS", "Business Catalyst", "Cargo", "Chameleon", "Ckan", "CMS Made Simple", "CMSimple", "Concrete5", "Contao", "Contenido", "Contens", "ContentBox", "Cotonti", "CPG Dragonfly", "CppCMS", "Craft CMS", "Danneo CMS", "DataLife Engine", "DedeCMS", "Django CMS", "DNN", "Dotclear", "Drupal", "DTG", "Dynamicweb", "e107", "Eleanor CMS", "EPiServer", "eSyndiCat", "ExpressionEngine", "eZ Publish", "FlexCMP", "GetSimple CMS", "Google Sites", "Graffiti CMS", "Grav", "Green Valley CMS", "GX WebManager", "Hippo", "Hotaru CMS", "IBM WebSphere Portal", "ImpressCMS", "ImpressPages", "Indexhibit", "Indico", "InProces", "InstantCMS", "io4 CMS", "Jalios", "Jekyll", "Joomla", "Kentico CMS", "Koala Framework", "Koken", "Kolibri CMS", "Komodo CMS", "Koobi", "Kooboo CMS", "Kotisivukone", "LEPTON", "Liferay", "LightMon Engine", "Lithium", "LiveStreet CMS", "Locomotive", "M.R. Inc Wild CMS", "Mambo", "MaxSite CMS", "Methode", "Microsoft SharePoint", "MODx", "Moguta.CMS", "Mono.net", "Movable Type", "Mozard Suite", "Mura CMS", "Mynetcap", "Nepso", "October CMS", "Odoo", "OpenCms", "openEngine", "OpenNemas", "OpenText Web Solutions", "Ophal", "Orchard CMS", "Pagekit", "PANSITE", "papaya CMS", "PencilBlue", "Percussion", "PHP-Fusion", "phpCMS", "phpSQLiteCMS", "phpwind", "Pligg", "Plone", "Posterous", "Quick.CMS", "RBS Change", "RCMS", "RiteCMS", "Roadiz CMS", "S.Builder", "Sarka-SPIP", "SDL Tridion", "Serendipity", "Silva", "SilverStripe", "SIMsite", "Sitecore", "SiteEdit", "Sivuviidakko", "SmartSite", "sNews", "Solodev", "SPIP", "Squarespace", "Squiz Matrix", "Subrion", "swift.engine", "Textpattern CMS", "Thelia", "TiddlyWiki", "Tiki Wiki CMS Groupware", "Twilight CMS", "TYPO3 CMS", "TYPO3 Neos", "uCore", "Umbraco", "Unbounce", "Ushahidi", "viennaCMS", "Vignette", "VIVVO", "webEdition", "WebGUI", "WebPublisher", "Webs", "WebsiteBaker", "WebsPlanet", "Weebly", "Wix", "Wolf CMS", "WordPress", "XOOPS"
Capabilities
- Identification of 161 types of CMS based on features
- Identifies all popular forums, blogs, CMS, guestbooks, wikis, and many other types of engines based on a large and high-quality Wappalyzer feature database (over 800 technologies in total)
- Ability to select a category or specific engines for recognition
- Ability to specify an arbitrary User-Agent
- Ability to modify and supplement the feature database
- Ability to use your own file with features (the custom-apps.json file must be structurally similar to the usual apps.json and located in the files/Rank-CMS path, if everything is done correctly, new categories and applications for selection will appear in the Check list option at the end of the list)
Use cases
- Filtering by engines
- Sorting large databases by engines
Queries
As queries, you need to specify a list of domains, for example:
http://a-parser.com/
http://techcrunch.com/
http://vkusnologia.ru/
http://blogautomobile.fr/
http://avto-blogger.ru/
http://www.cyberforum.ru/
Output results examples
A-Parser supports flexible formatting of results thanks to the built-in Template Toolkit, which allows it to output results in any form, as well as in structured formats, such as CSV or JSON
Default output
Result format:
$query - $cms\n
Example of result:
http://blogautomobile.fr/- WordPress
http://a-parser.com/ - XenForo
http://vkusnologia.ru/ - WordPress
http://avto-blogger.ru/ - WordPress
http://techcrunch.com/ - WordPress
http://www.cyberforum.ru/ - 1C-Bitrix
Saving in SQL format
Result format:
[% "INSERT INTO cms VALUES('" _ query _ "', '" _ cms _ "', '" _ cat _ "')\n" %]
Example of result:
INSERT INTO cms VALUES('http://yandex.ru', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://vk.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://facebook.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://a-parser.com', 'WordPress', 'CMS')
INSERT INTO cms VALUES('http://youtube.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://google.com', 'unknown', 'unknown')
Dump results to JSON
Общий формат результата:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.cms = p1.cms;
obj.cat = p1.cat;
obj.json %]
Начальный текст:
[
Конечный текст:
]
Example of result:
[
{"cat":"unknown","cms":"unknown","query":"http://google.com"},
{"cat":"unknown","cms":"unknown","query":"http://yandex.ru"},
{"cat":"unknown","cms":"unknown","query":"http://facebook.com"},
{"cat":"CMS","cms":"WordPress","query":"http://a-parser.com"},
{"cat":"unknown","cms":"unknown","query":"http://vk.com"},
{"cat":"unknown","cms":"unknown","query":"http://youtube.com"}
]
To make the "Start text" and "End text" options available in the Task Editor, you need to activate "More options".
Possible settings
Parameter | Default value | Description |
---|---|---|
User agent | _Automatically substituted user-agent of the current version of Chrome_ | Allows you to impersonate a certain browser or search engine |
Log long running regex | ☐ | Determines whether to record slow regular expressions |
Check list | cms, message-boards, wikis | Choice of engines to check |
Emulate browser headers | ☑ | Ability to emulate browser headers |
RegExp engine | Built-in | Choice of regular expression engine |
Use Net::HTTP | ☐ | Ability to use the Net::HTTP scraper for requests |
Net::HTTP preset | default | Ability to specify a preset with settings |