Website URL, as a string or a URL object
Gathered website data, including the final URL, title, description, image, icon, language, region, HTML, plain text, feeds, and internal and external links
This is a high-level convenience method that fetches a website and extracts all relevant data. It handles encoding detection, redirects, and provides a unified interface for all website data.
This method will be extended incrementally to include metadata extraction, content extraction, and more.
// Fetch a website and get its data
const site = await gatherWebsite('https://example.com');
console.log(site.url); // Final URL after redirects
console.log(site.title); // Page title (cleaned, from best source)
console.log(site.description); // Page description (from best source)
console.log(site.image); // Page image/keyvisual (from best source)
console.log(site.icon); // Best available icon/favicon
console.log(site.language); // Primary language code (ISO 639-1)
console.log(site.region); // Region code (ISO 3166-1 alpha-2)
console.log(site.html); // Raw HTML content (UTF-8)
console.log(site.text); // Plain text content (extracted from HTML)
console.log(site.feeds); // Array of feed URL objects
console.log(site.internalLinks); // Array of internal link URL objects
console.log(site.externalLinks); // Array of external link URL objects
Gather website data from a URL in one convenient call.
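The result separates links into `internalLinks` and `externalLinks`, both arrays of URL objects. How the library decides which is which is not stated here, so the following is only a minimal sketch of one plausible rule. It assumes that "internal" means the same hostname as the page. The helper `splitLinks` and the `SplitLinks` type are hypothetical names used for illustration, not part of the documented API.

```typescript
// Hypothetical helper, not part of the documented API.
interface SplitLinks {
  internalLinks: URL[];
  externalLinks: URL[];
}

function splitLinks(pageUrl: string | URL, hrefs: string[]): SplitLinks {
  const base = new URL(pageUrl);
  const result: SplitLinks = { internalLinks: [], externalLinks: [] };

  for (const href of hrefs) {
    let link: URL;
    try {
      // Resolve relative hrefs against the page URL.
      link = new URL(href, base);
    } catch {
      continue; // Skip hrefs that cannot be parsed as URLs.
    }

    // Ignore non-web schemes such as mailto: or javascript:.
    if (link.protocol !== 'http:' && link.protocol !== 'https:') continue;

    // Assumption: a link is internal when its hostname matches the page's.
    if (link.hostname === base.hostname) {
      result.internalLinks.push(link);
    } else {
      result.externalLinks.push(link);
    }
  }

  return result;
}

const links = splitLinks('https://example.com/blog/', [
  '/about',
  'post-1',
  'https://other.example.org/page',
  'mailto:hi@example.com',
]);

console.log(links.internalLinks.map((u) => u.href));
console.log(links.externalLinks.map((u) => u.href));
```

Under this rule a subdomain such as `blog.example.com` would count as external to `example.com`. A stricter or looser definition (same origin, or same registrable domain) would change only the comparison line.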