User API, interaction with other programs and scripts
A-Parser can be controlled through an API, which makes it possible to integrate the parser into complex systems and to use its features from other programs and scripts.
The API is available only to holders of the Enterprise license.
Contents
- Principle of work
- Control Modules A-Parser via API
- Code samples
- List of supported requests
- ping
- info
- oneRequest
- bulkRequest
- getParserPreset
- getProxies
- addTask
- getTaskState
- getTaskConf
- getTaskResultsFile
- deleteTaskResultsFile
- changeTaskStatus
- moveTask
- getTasksList
- getParserInfo
- update
- getAccountsCount
- Getting the JSON request via the interface for the addTask method
Principle of work
Interaction with the parser takes place over HTTP, with requests and responses serialized as JSON. Send a POST request to http://IP-of-server:9091/API with a JSON-serialized structure as the request body:
Code:
{
    "password" : "pass",
    "action" : "oneRequest",
    "data" : {
        "query" : "test",
        "parser" : "SE::Google",
        "preset" : "Pages Count use Proxy"
    }
}
Where:
- password - the A-Parser password
- action - the request type
- data - the request parameters, specific to each request type
The response has the following structure:
Code:
{
    "success" : 1,
    "data" : "answer"
}
Where:
- success - whether the request succeeded
- data - the response; it can be a scalar or a structure, depending on the request type
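The request and response envelope described above can be wrapped in a small helper. The following is a minimal Python sketch, not an official client: the names build_request and call_api, the default address and the default password are illustrative assumptions, and treating a falsy success value as an error is also an assumption, since the format of failed responses is not described here.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:9091/API"  # assumed address, same as in the samples


def build_request(action, data=None, password="pass"):
    """Serialize an API request into the JSON body described above."""
    payload = {"password": password, "action": action}
    if data is not None:
        payload["data"] = data
    return json.dumps(payload).encode("utf-8")


def call_api(action, data=None, password="pass", url=API_URL, timeout=60):
    """POST the request and return the "data" field of the response."""
    request = urllib.request.Request(
        url,
        data=build_request(action, data, password),
        headers={"Content-Type": "text/plain; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # Assumption: a falsy "success" value means the request failed.
    if not reply.get("success"):
        raise RuntimeError("A-Parser API request failed: %r" % (reply,))
    return reply.get("data")
```

With a running server, call_api("oneRequest", {"query": "test", "parser": "SE::Google", "preset": "Pages Count use Proxy"}) would send the request shown above and return the contents of its data field.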
Control Modules A-Parser via API
Code samples
PHP:
$aparser = 'http://127.0.0.1:9091/API';

// Build the JSON request body
$request = json_encode(array(
    'action' => 'oneRequest',
    'data' => array (
        'parser' => 'SE::Google',
        'preset' => 'Pages Count use Proxy',
        'query' => 'test'
    ),
    'password' => 'pass'
));

$ch = curl_init($aparser);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $request);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Pass all headers in one call: a second CURLOPT_HTTPHEADER call replaces the first
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Content-Type: text/plain; charset=UTF-8',
    'Content-Length: ' . strlen($request)
));

$response = curl_exec($ch);
curl_close($ch);

// Decode the response into an associative array and print the result string
$response = json_decode($response, true);
echo $response['data']['resultString'];
On Perl
Code:
use LWP;
use JSON::XS;

my $aparser = 'http://127.0.0.1:9091/API';

my $request = encode_json {
    'action' => 'oneRequest',
    'data' => {
        'parser' => 'SE::Google',
        'preset' => 'Pages Count use Proxy',
        'query' => 'test'
    },
    'password' => 'pass'
};

my $ua = LWP::UserAgent->new();
my $response = $ua->post(
    $aparser,
    'Content-Type' => 'text/plain; charset=UTF-8',
    'Content-Length' => length $request,
    'Content' => $request
);

if($response->is_success) {
    my $json = decode_json $response->content();
    print $json->{'data'}->{'resultString'};
}
else {
    warn 'Response fail: ', $response->status_line();
};
List of supported requests
ping
Checks that the server is running. Request example:
Code:
{
    "password" : "pass",
    "action" : "ping"
}
Response example:
Code:
{
    "success" : 1,
    "data" : "pong"
}
info
Returns general information and the list of all available parsers. Request example:
Code:
{
    "password" : "pass",
    "action" : "info"
}
Response example:Code:{ "success" : 1, "data" : { " tasksInQueue: 1, pid: '6044', [ "SE::Google", "SE::Google::PR", "SE::Google::Maps", "SE::Google::Images", "SE::Google::Suggest", "SE::Google::Position", "SE::Google::Trends", "SE::Google::TrustCheck", "SE::Google::Compromised", "SE::Google::SafeBrowsing", "SE::AOL", "SE::AOL::Suggest", "SE::Ask", "SE::Baidu", "SE::Bing", "SE::Bing::Images", "SE::Bing::Suggest", "SE::Bing::Translator", "SE::Bing::LangDetect", "SE::Comcast", "SE::Dogpile", "SE::DuckDuckGo", "SE::MailRu", "SE::MailRu::Position", "SE::Seznam", "SE::Yahoo", "SE::Yahoo::Suggest", "SE::Yandex", "SE::Yandex::TIC", "SE::Yandex::Catalog", "SE::Yandex::Direct", "SE::Yandex::Direct::Frequency", "SE::Yandex::Register", "SE::Yandex::Position", "SE::Yandex::Suggest", "SE::Yandex::WordStat", "SE::Yandex::WordStat::ByDate", "SE::Yandex::WordStat::ByRegion", "SE::YouTube", "SE::QIP", "SE::QIP::Position", "SEO::Ping", "Check::BackLink", "HTML::LinkExtractor", "HTML::TextExtractor", "HTML::TextExtractor::LangDetect", "Net::Whois", "Net::HTTP", "Net::DNS", "Rank::CMS", "Rank::Ahrefs", "Rank::Alexa", "Rank::Alexa::API", "Rank::Archive", "Rank::Category", "Rank::DMOZ", "Rank::Linkpad", "Rank::MajesticSEO", "Rank::Mustat", "Rank::OpenSiteExplorer", "Rank::SEMrush", "Util::AntiGate" ], activeProxyCheckerThreads: 15, version: '1.2.220', activeThreads: 0, workingTasks: 0 } }
oneRequest
A single parsing request; any parser and preset can be used. The result is a string built according to the result format specified in the preset, together with the full log of the parser's work.
Request example:
Code:
{
    "password" : "pass",
    "action" : "oneRequest",
    "data" : {
        "query" : "test",
        "parser" : "SE::Google",
        "configPreset": "default",
        "preset" : "Pages Count use Proxy"
    }
}
- parser - the parser that executes the request
- preset - the preset to use
- configPreset - the config preset to use
- query - the query
- rawResults - optional parameter; if set, the response contains the array results with all results supported by the specified parser instead of the result string resultString
- needData - optional parameter indicating whether to include data/pages in the response; disabled by default to save memory
- options - an array of additional options applied to the parser; for example, override redefines values from the preset
Response example:
Code:
{
    "success" : 1,
    "data" : {
        "resultString" : "test: 3060000000\n",
        "logs" : [
            [ 0, 1347516765, "Parser SE::Google::0 parse query test" ],
            [ 0, 1347516765, "Wait for proxy" ],
            [ 0, 1347516765, "Use proxy socks://*" ],
            [ 0, 1347516765, "Parse page 1" ],
            [ 0, 1347516779, "Request(1): http://www.google.com/search?ie=utf-8&oe=utf-8&hl=en&q=test1&num=10 - 200 OK (38.79 KB)" ],
            [ 0, 1347516779, "Total grabbed 10 links" ],
            [ 0, 1347516779, "Parse response: 1" ],
            [ 0, 1347516779, "Thread complete work" ]
        ]
    }
}
- resultString - the result string
- logs - array with the request execution logs
Request example with rawResults and overridden preset options:
Code:
{
    "password" : "pass",
    "action" : "oneRequest",
    "data" : {
        "options" : [
            { "value" : 1, "type" : "override", "id" : "pagecount" },
            { "value" : 10, "type" : "override", "id" : "linksperpage" }
        ],
        "query" : "test",
        "rawResults" : 1,
        "parser" : "SE::Google",
        "doLog" : 0,
        "preset" : "default"
    }
}
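The options array in the request above can be generated from a plain mapping of preset option ids to values. This is a Python sketch under stated assumptions: one_request_data is a hypothetical helper name, and only the override option type shown above is covered.

```python
import json


def one_request_data(query, parser, preset="default", overrides=None,
                     raw_results=False):
    """Build the "data" part of a oneRequest call.

    overrides maps preset option ids (for example "pagecount") to new values.
    """
    data = {"query": query, "parser": parser, "preset": preset}
    if overrides:
        data["options"] = [
            {"type": "override", "id": option_id, "value": value}
            for option_id, value in overrides.items()
        ]
    if raw_results:
        data["rawResults"] = 1
    return data


request = {
    "password": "pass",
    "action": "oneRequest",
    "data": one_request_data(
        "test",
        "SE::Google",
        overrides={"pagecount": 1, "linksperpage": 10},
        raw_results=True,
    ),
}
# Print the JSON body that would be sent to the API.
print(json.dumps(request, indent=4))
```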
bulkRequest
A bulk parsing request; any parser and preset can be used, and the number of threads used for parsing is specified. The result is a string built according to the result format specified in the preset, together with the full log of the parser's work for each thread.
Request example:
Code:
{
    "password" : "pass",
    "action" : "bulkRequest",
    "data" : {
        "parser" : "SE::Google",
        "preset" : "Pages Count use Proxy",
        "configPreset": "default",
        "threads" : 5,
        "rawResults" : 1,
        "queries" : [ "test1", "test2", "test3", "test4", "test5" ]
    }
}
- parser - the parser that executes the request
- preset - the preset to use
- configPreset - the config preset to use
- threads - the number of threads used for parsing
- queries - array of queries
- rawResults - optional parameter; if set, the response contains the array results with all results supported by the specified parser instead of the result string resultString
- needData - optional parameter indicating whether to include data/pages in the response; disabled by default to save memory
- options - an array of additional options applied to the parser; for example, override redefines values from the preset
Response example:
Code:{ "success" : 1, "data" : { "logs" : { "4" : { "1" : [ [ 0, 1350399396, "Parser SE::Google::0 parse query test5" ], [ 0, 1350399396, "Wait for proxy" ], [ 0, 1350399396, "Use proxy socks://176.9.9.90:22515" ], [ 0, 1350399396, "Parse page 1" ], [ 0, 1350399403, "Request(1): http://www.google.com/search?ie=utf-8&oe=utf-8&hl=en&q=test5&num=10 - 503 Service Unavailable (0 KB)" ], [ 0, 1350399403, "Parse response: 3" ], [ 0, 1350399403, "Wait for proxy" ], [ 0, 1350399403, "Use proxy socks://176.9.9.90:21917" ], [ 0, 1350399408, "Request(2): http://www.google.com/search?ie=utf-8&oe=utf-8&hl=en&q=test5&num=10 - 200 OK (39.84 KB)" ], [ 0, 1350399408, "Total grabbed 10 links" ], [ 0, 1350399408, "Parse response: 1" ], [ 0, 1350399408, "Thread complete work" ] ] }, <cutted logs of 4 threads> }, "results" : [ { "related" : [], "formatresult" : "{query}: {totalcount}\\n", "query" : "test1", "serp" : [ <cutted array serp> ], "totalcount" : "12200000", "origquery" : "test1" }, { "related" : [], "formatresult" : "{query}: {totalcount}\\n", "query" : "test2", "serp" : [ <cutted array serp> ], "totalcount" : "14200000", "origquery" : "test2" }, <cutted 3 results> ] } }
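With rawResults set, each element of results carries the fields of the parser used. As a rough Python illustration, the function below (a hypothetical helper, not part of the API) collects totalcount per query from a response shaped like the example above; the field names are taken from that SE::Google example and will differ for other parsers.

```python
def total_counts(response):
    """Map each query to its totalcount from a bulkRequest rawResults reply."""
    if not response.get("success"):
        raise ValueError("request was not successful")
    return {
        item["query"]: int(item["totalcount"])
        for item in response["data"]["results"]
    }


# Trimmed copy of the response example above (logs and serp omitted).
sample = {
    "success": 1,
    "data": {
        "results": [
            {"query": "test1", "totalcount": "12200000", "origquery": "test1"},
            {"query": "test2", "totalcount": "14200000", "origquery": "test2"},
        ]
    },
}
print(total_counts(sample))  # {'test1': 12200000, 'test2': 14200000}
```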
getParserPreset
Returns the settings of the specified parser and preset.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getParserPreset",
    "data" : {
        "parser" : "SE::Google",
        "preset" : "default"
    }
}
Response example:Code:{ "success" : 1, "data" : { "queryformat" : "$query", "parsenotfound" : 1, "gl" : "", "pagecount" : 5, "do_gzip" : 1, "domain" : "www.google.com", "timeout" : 60, "useproxy" : 1, "antigatepreset" : "default", "extraquery" : "", "serptime" : "", "usesessions" : 0, "filter" : 1, "linksperpage" : 100, "serp" : "", "useantigate" : 0, "proxyretries" : 10, "requestdelay" : 0, "proxybannedcleanup" : 300, "formatresult" : "$serp.format('$link\\n')", "lr" : "", "usecaptchakiller" : 0, "max_size" : 204800 } }
getProxies
Requests the list of live proxies. Returns the live proxies from all proxy checkers.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getProxies"
}
Response example:
Code:
{
    "success" : 1,
    "data" : {
        "127.0.0.1:23486" : [ "socks" ],
        "127.0.0.1:23140" : [ "socks" ],
        "127.0.0.1:21971" : [ "http" ]
    }
}
The first array element is the proxy type, which is either http or socks. If login/password authorization is set for the proxy, the second and third elements contain the login and the password.
Optionally, a list of proxy checkers can be specified in data to return proxies from those checkers only:
Code:
{
    "data" : {
        "checkers" : [ "Elite proxies", "free proxies" ]
    }
}
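The per-proxy arrays are easier to work with once converted into named fields. The Python sketch below assumes the layout described above (type first, then optional login and password); Proxy and parse_proxies are illustrative names, and the credentials in the sample are made up.

```python
from typing import NamedTuple, Optional


class Proxy(NamedTuple):
    address: str
    kind: str  # "http" or "socks"
    login: Optional[str] = None
    password: Optional[str] = None


def parse_proxies(data):
    """Convert the "data" object of a getProxies reply into Proxy tuples."""
    proxies = []
    for address, fields in data.items():
        # fields is [type] or [type, login, password]
        proxies.append(Proxy(address, *fields[:3]))
    return proxies


sample = {
    "127.0.0.1:23486": ["socks"],
    "127.0.0.1:21971": ["http", "user", "secret"],  # hypothetical credentials
}
for proxy in parse_proxies(sample):
    print(proxy)
```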
addTask
Adds a task to the queue; all parameters are the same as those set in the interface under Add Task.
Request example:Code:{ "password" : "pass", "action" : "addTask", "data" : { "resultsFileName" : "api-test.txt", "parsers" : [ [ "SE::Google", "default", { "value" : "$serp.format('$link\\n')", "id" : "formatresult", "type" : "override" }, { "value" : 1, "id" : "parseAll", "type" : "options" } ] ], "uniqueQueries" : 0, "keepUnique" : 0, "resultsPrepend" : "", "queries" : [ "bla", "blala" ], "configPreset" : "default", "moreOptions" : 0, "queriesFrom" : "text", "resultsUnique" : "no", "doLog" : "no", "queryFormat" : "$query", "resultsSaveTo" : "file", "configOverrides" : [ [ "asyncthreads", 10 ] ], "resultsFormat" : "$p1.preset", "resultsAppend" : "", "queryBuilders" : [] } }
Response example:
Code:
{
    "success" : 1,
    "data" : "181"
}
The response contains the id of the added task.
To add a task with queries read from a file:
Code:
"queriesFrom": "file",
"queriesFile": "queries.txt",
A preset created in advance through the interface can also be used:
Code:
{
    "password" : "pass",
    "action" : "addTask",
    "data" : {
        "queriesFrom" : "text",
        "queries" : [ "google.com", "yandex.ru" ],
        "configPreset" : "default",
        "preset" : "Analyze Domains"
    }
}
In this case only the queries need to be specified. Any other parameter can also be specified; it overrides the value from the preset.
Added in version 1.1.358
The removeOnRestart flag indicates that the task is deleted when the parser is restarted.
Code:
{
    "password" : "pass",
    "action" : "addTask",
    "data" : {
        "queriesFrom" : "text",
        "queries" : [ "google.com", "yandex.ru" ],
        "configPreset" : "default",
        "preset" : "Analyze Domains",
        "removeOnRestart" : 1
    }
}
Added in version 1.1.417
The removeOnComplete flag indicates that the task is deleted after completion.
Code:
{
    "password" : "pass",
    "action" : "addTask",
    "data" : {
        "queriesFrom" : "text",
        "queries" : [ "google.com", "yandex.ru" ],
        "configPreset" : "default",
        "preset" : "Analyze Domains",
        "removeOnComplete" : 1
    }
}
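The preset-based variants above differ only in a few keys, so the data object can be assembled by one function. This is a Python sketch only: preset_task_data is a hypothetical helper, and the extra keyword arguments are passed through unchanged as task parameters that override the preset.

```python
def preset_task_data(preset, queries, config_preset="default",
                     remove_on_restart=False, remove_on_complete=False,
                     **overrides):
    """Build the "data" part of an addTask call that uses a saved preset."""
    data = {
        "queriesFrom": "text",
        "queries": list(queries),
        "configPreset": config_preset,
        "preset": preset,
    }
    if remove_on_restart:
        data["removeOnRestart"] = 1
    if remove_on_complete:
        data["removeOnComplete"] = 1
    # Any extra parameter is used instead of the value stored in the preset.
    data.update(overrides)
    return data


task = preset_task_data(
    "Analyze Domains",
    ["google.com", "yandex.ru"],
    remove_on_complete=True,
    resultsFileName="domains.txt",  # example override of a preset value
)
print(task)
```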
getTaskState
Returns the state of a task by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getTaskState",
    "data" : {
        "taskUid" : "181"
    }
}
Response example:Code:{ success: 1, data: { status: 'completed', stats: '<b>Overall stats</b><br>Runtime: 0:02:17<br>HTTP requests: 54<br><br><b>1. SE::Google</b><br>Queries done: 3<br>Successful queries: 2<br>Proxies used: 8.33 (per query)<br>Retries used: 8.66 (per query)<br>HTTP requests: 18 (per query)<br>CAPTCHAs shows: 0, 0 (per query)<br>CAPTCHAs sended: 0<br>CAPTCHAs recognized: 0<br>CAPTCHAs bad: 0<br>', state: { totalFail: 1, totalWaitProxyThreads: 0, minimized: 0, queriesDoneCount: 3, avgSpeed: 1, activeThreads: 0, startTime: 1515584474, changeTime: 1515584611, queriesCount: 3, logExists: 1, runTime: 137, uniqueResultsCount: 'none', requests: '54', addTime: 1515584018, additionalCount: 0, queriesDoneCountAtStart: 0, lastQuery: 'inurl:?login', curSpeed: 4, started: 1, resultsCount: 919 } } }
The response contains the status of the task and its state (statistics).
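A common use of getTaskState is to wait until a task has finished. The loop below is a Python sketch under assumptions: wait_for_task is a hypothetical helper, the transport is passed in as send(action, data) and must return the data field of the response, and only the completed status shown in the example above is treated as final.

```python
import time


def wait_for_task(send, task_uid, poll_interval=5.0, timeout=3600.0,
                  sleep=time.sleep):
    """Poll getTaskState until the task reports the status "completed".

    send(action, data) performs the API call and returns the "data"
    field of the reply; it is supplied by the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        reply = send("getTaskState", {"taskUid": task_uid})
        if reply["status"] == "completed":
            return reply
        if time.monotonic() >= deadline:
            raise TimeoutError("task %s did not complete in time" % task_uid)
        sleep(poll_interval)
```

Passing the transport as an argument keeps the loop independent of the HTTP client in use and makes it easy to exercise with a stub.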
An array of ids can also be passed, for example:
Code:
{
    "password": "pass",
    "action": "getTaskState",
    "data": {
        "taskUid": [ "5338", "5337", "5336" ]
    }
}
Response example:
Code:{ success: 1, data: [ { "state": { "queriesDoneCountAtStart": 0, "addTime": 1525951886, "startTime": 1525951887, "requests": "119", "totalFail": 0, "totalWaitProxyThreads": 0, "additionalCount": 107, "changeTime": 1525951980, "curSpeed": 72, "avgSpeed": 69, "resultsCount": 12529, "lastTotalFail": 0, "uniqueResultsCount": "none", "activeThreads": 0, "lastQuery": "https://google.com", "logExists": 1, "queriesCount": 1, "runTime": 93, "minimized": 0, "started": 1, "queriesDoneCount": 108 }, "status": "completed", "stats": "<b>Overall stats</b><br>Runtime: 0:01:33<br>HTTP requests: 119<br><br><b>1. Net::HTTP</b><br>Queries done: 108<br>Successful queries: 108<br>Proxies used: 1.1 (per query)<br>Retries used: 1.1 (per query)<br>HTTP requests: 1.1 (per query)" }, { "stats": "<b>Overall stats</b><br>Runtime: 0:01:41<br>HTTP requests: 119<br><br><b>1. Net::HTTP</b><br>Queries done: 108<br>Successful queries: 108<br>Proxies used: 1.1 (per query)<br>Retries used: 1.1 (per query)<br>HTTP requests: 1.1 (per query)", "state": { "queriesDoneCountAtStart": 0, "addTime": 1525951729, "totalFail": 0, "totalWaitProxyThreads": 0, "additionalCount": 107, "changeTime": 1525951831, "startTime": 1525951730, "requests": "119", "lastQuery": "https://a-parser.com", "avgSpeed": 64, "curSpeed": 7, "uniqueResultsCount": "none", "activeThreads": 0, "lastTotalFail": 0, "resultsCount": 12529, "minimized": 0, "started": 1, "runTime": 101, "queriesCount": 1, "queriesDoneCount": 108, "logExists": 1 }, "status": "completed" }, { "status": "completed", "state": { "logExists": 1, "queriesCount": 1, "runTime": 192, "started": 1, "minimized": 0, "queriesDoneCount": 108, "curSpeed": 2, "avgSpeed": 33, "lastTotalFail": 0, "resultsCount": 12529, "activeThreads": 0, "uniqueResultsCount": "none", "lastQuery": "https://microsoft.com/", "startTime": 1525951384, "requests": "121", "totalFail": 0, "additionalCount": 107, "totalWaitProxyThreads": 0, "changeTime": 1525951576, "queriesDoneCountAtStart": 0, 
"addTime": 1525951382 }, "stats": "<b>Overall stats</b><br>Runtime: 0:03:12<br>HTTP requests: 121<br><br><b>1. Net::HTTP</b><br>Queries done: 108<br>Successful queries: 108<br>Proxies used: 1.12 (per query)<br>Retries used: 1.12 (per query)<br>HTTP requests: 1.12 (per query)" } ] }
getTaskConf
Returns the configuration of a task by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getTaskConf",
    "data" : {
        "taskUid" : "181"
    }
}
Response example:Code:{ "success" : 1, "data" : { "resultsFileName" : "Aug-26_08-30-00.txt", "parsers" : [ [ "SE::Google::PR", "default" ] ], "resultsPrepend" : "", "queriesFrom" : "text", "doLog" : "no", "resultsSaveTo" : "file", "resultsFormat" : "$p1.preset", "queryBuilders" : [], "resultsAppend" : "", "preset" : "default", "uniqueQueries" : 0, "keepUnique" : 0, "configPreset" : "default", "queries" : [ "google.com", "yandex.com" ], "resultsBuilders" : [], "moreOptions" : 0, "resultsUnique" : "no", "queryFormat" : "$query", "configOverrides" : [] } }
The response contains the task settings, including the results file name.
getTaskResultsFile
Added in version 1.1.5
Returns a link for downloading the results of a task by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getTaskResultsFile",
    "data" : {
        "taskUid" : "181"
    }
}
Response example:
Code:
{
    "success" : 1,
    "data" : "http://127.0.0.1:9091/downloadResults?fileName=Jul-29_11-23-42.txt&token=zodrurcj"
}
The file can be downloaded from this link once, without authorization (a one-time token is used).
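Because the link works only once, it is worth saving the file to disk in the same step that fetches it. The following Python sketch assumes the link format shown in the response example; results_file_name and download_results are illustrative names.

```python
import os
import shutil
import urllib.request
from urllib.parse import parse_qs, urlparse


def results_file_name(link):
    """Extract the fileName query parameter from a download link."""
    params = parse_qs(urlparse(link).query)
    return params["fileName"][0]


def download_results(link, directory="."):
    """Download the results file in a single request and return its path."""
    # basename() guards against path separators in the reported name.
    path = os.path.join(directory, os.path.basename(results_file_name(link)))
    with urllib.request.urlopen(link) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path
```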
deleteTaskResultsFile
Added in version 1.1.198
Deletes the results file of a task by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "deleteTaskResultsFile",
    "data" : {
        "taskUid" : "181"
    }
}
Response example:
Code:
{
    "success" : 1
}
changeTaskStatus
Changes the status of a task by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "changeTaskStatus",
    "data" : {
        "taskUid" : "181",
        "toStatus" : "deleting"
    }
}
Response example:
Code:
{
    "success" : 1
}
A task can be switched to one of only four statuses: starting, pausing, stopping, deleting.
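Since only four target statuses are accepted, a client can reject anything else before sending the request. A small Python sketch follows; change_status_request is a hypothetical helper that only builds the request structure and does not send it.

```python
ALLOWED_STATUSES = ("starting", "pausing", "stopping", "deleting")


def change_status_request(task_uid, to_status, password="pass"):
    """Build a changeTaskStatus request, rejecting unsupported statuses."""
    if to_status not in ALLOWED_STATUSES:
        raise ValueError(
            "toStatus must be one of %s, got %r"
            % (", ".join(ALLOWED_STATUSES), to_status)
        )
    return {
        "password": password,
        "action": "changeTaskStatus",
        "data": {"taskUid": str(task_uid), "toStatus": to_status},
    }


print(change_status_request(181, "pausing"))
```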
moveTask
Added in version 1.0.236
Moves a task within the queue by its id.
Request example:
Code:
{
    "password" : "pass",
    "action" : "moveTask",
    "data" : {
        "taskUid" : "181",
        "direction" : "start"
    }
}
Response example:
Code:
{
    "success" : 1
}
Possible directions:
- start - to the beginning of the queue
- end - to the end of the queue
- up - one position up
- down - one position down
getTasksList
Added in version 1.1.268
Returns the list of active tasks. If the optional parameter completed: 1 is sent, the list of completed tasks is returned instead.
Request example:
Code:
{
    "password" : "pass",
    "action" : "getTasksList",
    "data" : {
        "completed" : "1"
    }
}
Response example:Code:{ "data" : [ "2291", "2324", "2331", "2384", "2398", "2434", "2445", "3482", "3481", "2547", "2554", "2555", "2561", "2566", "2568", "2575", "2576", "2577", "2589", "2590", "2594", "2596", "2601", "2613", "2618", "2623", "2624" ], "success" : 1 }
getParserInfo
Added in version 1.1.350
Returns the list of all results that the specified parser can return.
Request example:
Code:
{
    "password" : "123",
    "action" : "getParserInfo",
    "data" : {
        "parser" : "SE::Google"
    }
}
Response example:Code:{ "data" : { "results" : { "arrays" : { "serp" : [ "Main serp list", [ [ "link", "Link" ], [ "anchor", "Anchor" ], [ "snippet", "Snippet" ] ] ], "ads" : [ "Ads list", [ [ "link", "Link" ], [ "anchor", "Anchor" ], [ "visiblelink", "Visible link" ], [ "snippet", "Snippet" ], [ "position", "Block position" ], [ "page", "Page" ] ] ], "related" : [ "Related keywords", [ [ "key", "Keyword" ] ] ], "pages" : [ "Raw data array", [ [ "data", "Raw data" ] ] ] }, "defaultUnique" : [ "serp", "link" ], "flat" : [ [ "query", "Formatted query" ], [ "query.orig", "Original query" ], [ "query.first", "First query" ], [ "info.success", "Parsing success" ], [ "info.retries", "Used retries" ], [ "info.stats", "Statistics" ], [ "totalcount", "Total results count" ], [ "misspell", "Is query misspelling" ] ] } }, "success" : 1 }
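The response above nests the result names in two shapes: flat is a list of [name, description] pairs, and each entry of arrays is a [title, fields] pair. As an illustration in Python, the hypothetical helper result_variables below reduces the data field to just the names, assuming the layout shown in the example.

```python
def result_variables(info):
    """List flat and array result names from the "data" of getParserInfo."""
    results = info["results"]
    flat = [name for name, _description in results["flat"]]
    arrays = {
        array_name: [field for field, _description in fields]
        for array_name, (_title, fields) in results["arrays"].items()
    }
    return flat, arrays


# Trimmed copy of the response example above.
sample = {
    "results": {
        "arrays": {
            "serp": ["Main serp list", [["link", "Link"], ["anchor", "Anchor"],
                                        ["snippet", "Snippet"]]],
            "related": ["Related keywords", [["key", "Keyword"]]],
        },
        "flat": [["query", "Formatted query"],
                 ["totalcount", "Total results count"]],
    }
}
flat, arrays = result_variables(sample)
print(flat)
print(arrays)
```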
update
Added in version 1.1.350
Updates the parser executable to the latest version; after the command is sent, A-Parser restarts automatically. The API reports success once the executable has been downloaded and updated, which may take 1-3 minutes.
Request example:
Code:
{
    "action" : "update",
    "data" : {},
    "password" : "123"
}
Response example:
Code:
{
    "success" : 1
}
getAccountsCount
Added in version 1.1.697
Returns the number of active accounts (for Yandex).
Request example:
Code:
{
    "action" : "getAccountsCount",
    "data" : {},
    "password" : "123"
}
Response example:
Code:
{
    "success" : 1,
    "data" : {
        "SE::Yandex" : 3
    }
}
Getting the JSON request via the interface for the addTask method
Since version 1.1.974, the JSON request can be obtained through the interface using Show API query.