Skip to main content

Overview

The new JavaScript API v2 is designed to simplify development by moving away from generators in favor of async/await. This API also addresses several limitations of the previous version, adds TypeScript support, and improves performance. We recommend using this JavaScript API for creating all new parsers.

To use JavaScript API v2, simply inherit your parser class from the base BaseParser. Let's examine the structure of a parser class with an example:

files/parsers/v2-example/v2-example.ts
import { BaseParser } from 'a-parser-types';

export class JS_v2_example extends BaseParser {
static defaultConf: typeof BaseParser.defaultConf = {
version: '0.0.1',
results: {
flat: [
['title', 'Title'],
['h1', 'H1 Header']
],
arrays: {
h2: ['H2 Headers List', [
['header', 'Header'],
]],
}
},
max_size: 2 * 1024 * 1024,
parsecodes: {
200: 1,
},
results_format: "Title: $title\nH1: $h1\nH2 headers:\n$h2.format('$header\\n')\n",
limitH2Tags: 3,
};

static editableConf: typeof BaseParser.editableConf = [
['limitH2Tags', ['textfield', 'Limit H2 tags']],
];

async parse(set, results) {
const { success, data, headers } = await this.request('GET', set.query);

if (success && typeof data == 'string') {
let matches;
if (matches = data.match(/<title[^>]*>(.*?)<\/title>/))
results.title = matches[1];

if (matches = data.match(/<h1[^>]*>(.*?)<\/h1>/))
results.h1 = matches[1];

if (results.h2) {
let count = 0;
const re = /<h2[^>]*>(.*?)<\/h2>/g;
while(matches = re.exec(data)) {
results.h2.push(matches[1]);
if (++count == this.conf.limitH2Tags)
break;
}
}
}

return results;
}
}

TODO: (next) ## Inheritance