Rank::CMS - detecting over 600 types of CMS based on features. Detects all popular forums, blogs, CMS, guestbooks, wikis, and many other engine types
Overview of the scraper

Rank::CMS identifies over 600 types of CMS based on features. It identifies all popular forums, blogs, CMS, guestbooks, wikis, and many other types of engines. A-Parser's functionality allows you to save Rank::CMS scraper settings for future use (presets), set a scraping schedule, and much more.
Results can be saved in whatever form and structure you need: the built-in Template Toolkit templating engine lets you apply additional logic to the results and output data in various formats, including JSON, SQL, and CSV.
Collected data
- CMS name
- Category name
List of supported CMS
"1C-Bitrix", "2z Project", "3dCart", "Accessible Portal", "actionhero.js", "Adobe CQ5", "Ametys", "Amiro.CMS", "AMPcms", "Anchor CMS", "AsciiDoc", "Backdrop", "Banshee", "BIGACE", "Bolt", "BrowserCMS", "Business Catalyst", "Cargo", "Chameleon", "Ckan", "CMS Made Simple", "CMSimple", "Concrete5", "Contao", "Contenido", "Contens", "ContentBox", "Cotonti", "CPG Dragonfly", "CppCMS", "Craft CMS", "Danneo CMS", "DataLife Engine", "DedeCMS", "Django CMS", "DNN", "Dotclear", "Drupal", "DTG", "Dynamicweb", "e107", "Eleanor CMS", "EPiServer", "eSyndiCat", "ExpressionEngine", "eZ Publish", "FlexCMP", "GetSimple CMS", "Google Sites", "Graffiti CMS", "Grav", "Green Valley CMS", "GX WebManager", "Hippo", "Hotaru CMS", "IBM WebSphere Portal", "ImpressCMS", "ImpressPages", "Indexhibit", "Indico", "InProces", "InstantCMS", "io4 CMS", "Jalios", "Jekyll", "Joomla", "Kentico CMS", "Koala Framework", "Koken", "Kolibri CMS", "Komodo CMS", "Koobi", "Kooboo CMS", "Kotisivukone", "LEPTON", "Liferay", "LightMon Engine", "Lithium", "LiveStreet CMS", "Locomotive", "M.R. Inc Wild CMS", "Mambo", "MaxSite CMS", "Methode", "Microsoft SharePoint", "MODx", "Moguta.CMS", "Mono.net", "Movable Type", "Mozard Suite", "Mura CMS", "Mynetcap", "Nepso", "October CMS", "Odoo", "OpenCms", "openEngine", "OpenNemas", "OpenText Web Solutions", "Ophal", "Orchard CMS", "Pagekit", "PANSITE", "papaya CMS", "PencilBlue", "Percussion", "PHP-Fusion", "phpCMS", "phpSQLiteCMS", "phpwind", "Pligg", "Plone", "Posterous", "Quick.CMS", "RBS Change", "RCMS", "RiteCMS", "Roadiz CMS", "S.Builder", "Sarka-SPIP", "SDL Tridion", "Serendipity", "Silva", "SilverStripe", "SIMsite", "Sitecore", "SiteEdit", "Sivuviidakko", "SmartSite", "sNews", "Solodev", "SPIP", "Squarespace", "Squiz Matrix", "Subrion", "swift.engine", "Textpattern CMS", "Thelia", "TiddlyWiki", "Tiki Wiki CMS Groupware", "Twilight CMS", "TYPO3 CMS", "TYPO3 Neos", "uCore", "Umbraco", "Unbounce", "Ushahidi", "viennaCMS", "Vignette", "VIVVO", "webEdition", "WebGUI", "WebPublisher", "Webs", "WebsiteBaker", "WebsPlanet", "Weebly", "Wix", "Wolf CMS", "WordPress", "XOOPS"
Capabilities
- Identification of 161 types of CMS based on features
- Identifies all popular forums, blogs, CMS, guestbooks, wikis, and many other types of engines based on the large and high-quality Wappalyzer feature database (over 800 technologies in total)
- Ability to select a category or specific engines for recognition
- Ability to specify a custom User-Agent
- Ability to modify and supplement the feature database
- Ability to use your own feature file (the custom-apps.json file must have a structure similar to the regular apps.json and be located at files/Rank-CMS; if done correctly, new categories and applications for selection will appear at the end of the list in the Check list option)
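As a rough illustration of the custom feature file mentioned in the last item, the Python sketch below builds one hypothetical entry and checks that its patterns compile. It assumes that custom-apps.json follows the legacy Wappalyzer apps.json layout (a top-level "apps" object whose entries carry "cats", "meta", "html" and "website" keys). The engine name, patterns, category ID and website are invented for the example, so compare the result with the regular apps.json before relying on it.

```python
import json
import re


def build_custom_apps():
    """Return a dict shaped like a legacy Wappalyzer apps.json (assumed layout)."""
    return {
        "apps": {
            "Example Engine": {                           # hypothetical engine name
                "cats": [1],                              # assumed category ID
                "meta": {"generator": "Example Engine"},  # <meta name="generator"> pattern
                "html": "<link[^>]+/example-engine/",     # pattern matched against page HTML
                "website": "http://example.com",
            }
        }
    }


def check_patterns(data):
    """Return the names of engines that have a pattern Python's re cannot compile."""
    bad = []
    for name, app in data["apps"].items():
        patterns = [app.get("html", "")] + list(app.get("meta", {}).values())
        for pattern in patterns:
            try:
                re.compile(pattern)
            except re.error:
                bad.append(name)
                break
    return bad


if __name__ == "__main__":
    data = build_custom_apps()
    print("broken entries:", check_patterns(data))
    # Write the output to files/Rank-CMS/custom-apps.json once it has been reviewed.
    print(json.dumps(data, indent=2))
```

Python's `re` module is only a rough stand-in here: the scraper's default RegExp engine is RE2, which accepts a narrower syntax, so a pattern that passes this check may still need adjusting.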
Use cases
- Filtering by engines
- Sorting large databases by engines
Queries
Queries should be a list of domains, given as full URLs, for example:
http://a-parser.com/  
http://techcrunch.com/  
http://vkusnologia.ru/  
http://blogautomobile.fr/  
http://avto-blogger.ru/  
http://www.cyberforum.ru/
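If the source list mixes bare domains and full URLs, it can be brought to the form shown above before it is loaded as queries. The following Python sketch is one possible way to do that; the helper names `to_query` and `build_queries` are illustrative, and defaulting to `http://` when no scheme is given is an assumption.

```python
from urllib.parse import urlsplit


def to_query(raw):
    """Turn a bare domain or a URL into the 'scheme://host/' form."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "http://" + raw          # assumed default scheme
    parts = urlsplit(raw)
    return f"{parts.scheme}://{parts.netloc.lower()}/"


def build_queries(lines):
    """Normalize every non-empty line and drop duplicates, keeping the order."""
    seen = set()
    queries = []
    for line in lines:
        if not line.strip():
            continue
        query = to_query(line)
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries


if __name__ == "__main__":
    raw_list = ["a-parser.com", "http://techcrunch.com/", "TechCrunch.com", ""]
    for query in build_queries(raw_list):
        print(query)
```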
Output results examples
A-Parser supports flexible result formatting through the built-in Template Toolkit templating engine, which can output results in any custom layout as well as in structured formats such as CSV or JSON.
Default output
Result format:
$query - $cms\n
Example result:
http://blogautomobile.fr/ - WordPress  
http://a-parser.com/ - XenForo  
http://vkusnologia.ru/ - WordPress  
http://avto-blogger.ru/ - WordPress  
http://techcrunch.com/ - WordPress  
http://www.cyberforum.ru/ - 1C-Bitrix
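For the "sorting large databases by engines" use case, a result file in the default format can be post-processed outside A-Parser. The Python sketch below groups queries by detected engine. It assumes every line follows the `$query - $cms` format and that the query contains no spaces; the function names are illustrative.

```python
from collections import defaultdict


def parse_line(line):
    """Split one '$query - $cms' result line into (query, cms)."""
    query, sep, cms = line.strip().partition(" - ")
    if not sep:
        raise ValueError(f"unexpected result line: {line!r}")
    return query, cms


def group_by_cms(lines):
    """Map each engine name to the list of queries detected as that engine."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():
            query, cms = parse_line(line)
            groups[cms].append(query)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "http://a-parser.com/ - XenForo",
        "http://vkusnologia.ru/ - WordPress",
        "http://techcrunch.com/ - WordPress",
    ]
    for cms, queries in sorted(group_by_cms(sample).items()):
        print(cms, len(queries), queries)
```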
Saving in SQL format
Result format:
[% "INSERT INTO cms VALUES('" _ query _ "', '" _ cms _ "', '" _ cat _ "')\n" %]
Example result:
INSERT INTO cms VALUES('http://yandex.ru', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://vk.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://facebook.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://a-parser.com', 'WordPress', 'CMS')
INSERT INTO cms VALUES('http://youtube.com', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://google.com', 'unknown', 'unknown')
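The generated statements can be loaded into any database that has a three-column `cms` table. The Python sketch below uses an in-memory SQLite database as an example. The column names `query`, `cms` and `cat` are assumptions that mirror the order of values in the template, and the statements are executed one per line because the template does not emit a trailing semicolon.

```python
import sqlite3

# A few lines as produced by the template above.
GENERATED_SQL = """\
INSERT INTO cms VALUES('http://yandex.ru', 'unknown', 'unknown')
INSERT INTO cms VALUES('http://a-parser.com', 'WordPress', 'CMS')
INSERT INTO cms VALUES('http://google.com', 'unknown', 'unknown')
"""


def load_results(conn, sql_text):
    """Create the table if needed and run each generated INSERT line."""
    conn.execute("CREATE TABLE IF NOT EXISTS cms (query TEXT, cms TEXT, cat TEXT)")
    for line in sql_text.splitlines():
        if line.strip():
            conn.execute(line)
    conn.commit()


def count_by_cms(conn):
    """Return a dict that maps engine name to the number of rows."""
    return dict(conn.execute("SELECT cms, COUNT(*) FROM cms GROUP BY cms"))


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_results(connection, GENERATED_SQL)
    print(count_by_cms(connection))
```

The template concatenates values without escaping, so a value that contains a single quote would produce an invalid statement. Only run a file like this when you trust its contents.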
Dump results to JSON
General result format:
[% IF notFirst;
  ",\n";
ELSE;
  notFirst = 1;
END;
obj = {};
obj.query = query;
obj.cms = p1.cms;
obj.cat = p1.cat;
obj.json %]
Initial text:
[
Final text:
]
Example result:
[
    {"cat":"unknown","cms":"unknown","query":"http://google.com"},
    {"cat":"unknown","cms":"unknown","query":"http://yandex.ru"},
    {"cat":"unknown","cms":"unknown","query":"http://facebook.com"},
    {"cat":"CMS","cms":"WordPress","query":"http://a-parser.com"},
    {"cat":"unknown","cms":"unknown","query":"http://vk.com"},
    {"cat":"unknown","cms":"unknown","query":"http://youtube.com"}
]
For the "Initial text" and "Final text" options to be available in the Job Editor, you need to activate "More options".
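Because the initial and final text wrap the records in `[` and `]`, the whole result file is a single JSON array and can be read with any JSON library. A minimal Python sketch follows, assuming the `cat`, `cms` and `query` keys shown in the example result; the `detected_only` helper is illustrative.

```python
import json

# A shortened copy of the example result above.
SAMPLE = """[
    {"cat":"unknown","cms":"unknown","query":"http://google.com"},
    {"cat":"CMS","cms":"WordPress","query":"http://a-parser.com"},
    {"cat":"unknown","cms":"unknown","query":"http://vk.com"}
]"""


def detected_only(records):
    """Keep only the records where an engine was actually recognized."""
    return [record for record in records if record["cms"] != "unknown"]


if __name__ == "__main__":
    records = json.loads(SAMPLE)
    for record in detected_only(records):
        print(record["query"], record["cms"], record["cat"])
```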
Possible settings
| Parameter | Default value | Description | 
|---|---|---|
| User agent | _Automatically substitutes the user-agent of the current Chrome version_ | Allows representing yourself as a specific browser or search engine | 
| Log long running regex | ☐ | Determines whether to record slow regular expressions | 
| Check list | cms, message-boards, wikis | Selection of engines for checking | 
| Emulate browser headers | ☑ | Ability to emulate browser headers | 
| RegExp engine | RE2 | Selection of the regular expression engine | 
| Use Net::HTTP | ☐ | Ability to use the Net::HTTP scraper for queries | 
| Net::HTTP preset | default | Ability to specify a preset with settings |