Pre-parsed Document to extract content from
Extraction options
Extraction result (success or failure)
Uses Mozilla Readability to extract clean article content from a pre-parsed Document. This function never throws; it always returns a ContentResult, and failures are reported through that result.
Error handling:
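import { parseHTML } from '../utils/html-parser.js';
import { extractSEO } from '../metadata/index.js';
// `html` is assumed to be a string of raw HTML obtained elsewhere;
// `extractContent` is assumed to be in scope (its import path is not shown here).
const doc = parseHTML(html);
// The parsed Document is shared between metadata and content extraction.
const metadata = extractSEO(doc);
const content = extractContent(doc, {
  baseUrl: 'https://example.com/article',
  charThreshold: 300,
  checkReadability: true,
});
// Branch on `success` rather than wrapping the call in try/catch.
if (content.success) {
  console.log(content.title);
  console.log(content.wordCount);
  console.log(`${content.readingTime} min read`);
} else {
  console.error(content.error);
}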
Extract article content from a pre-parsed HTML Document.
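The never-throws contract described above can be illustrated with a small sketch. This is not the actual implementation: `SketchResult`, `safeExtract`, the 200 words-per-minute reading speed, and the error messages are all hypothetical names and values chosen for illustration, and the real ContentResult may carry different fields. The idea is simply that any exception raised during extraction is caught and converted into a failure value, so callers only ever branch on `success`.

```typescript
// Hypothetical result shape: a discriminated union keyed on `success`.
// The real ContentResult may have more or different fields.
type SketchResult =
  | { success: true; title: string; wordCount: number; readingTime: number }
  | { success: false; error: string };

// Hypothetical helper: runs an extractor callback and converts a null
// result or any thrown value into a failure result instead of throwing.
function safeExtract(
  run: () => { title: string; text: string } | null,
  wordsPerMinute = 200, // assumed reading speed, not taken from the source
): SketchResult {
  try {
    const article = run();
    if (article === null) {
      return { success: false, error: 'No article content found' };
    }
    // Count whitespace-separated tokens, ignoring empty strings.
    const wordCount = article.text.trim().split(/\s+/).filter(Boolean).length;
    return {
      success: true,
      title: article.title,
      wordCount,
      // Round up and report at least one minute.
      readingTime: Math.max(1, Math.ceil(wordCount / wordsPerMinute)),
    };
  } catch (err) {
    // Normalise non-Error throwables to a string message.
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message };
  }
}

// A successful extraction and one whose extractor throws.
const ok = safeExtract(() => ({ title: 'Hello', text: 'one two three' }));
const failed = safeExtract(() => {
  throw new Error('parse failure');
});
```

Because the union is discriminated on `success`, TypeScript narrows the type inside each branch, so `title` and `wordCount` are only accessible after a success check and `error` only after a failure check.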