Interface Website

Gathered website data.

Remarks

This interface represents the complete gathered data from a website, including the authoritative URL and all extracted metadata. It will be extended incrementally with more properties.

interface Website {
    url: URL;
    feeds: URL[];
    title?: string;
    description?: string;
    image?: URL;
    icon?: URL;
    language?: string;
    region?: string;
    html: string;
    text: string;
    internalLinks: URL[];
    externalLinks: URL[];
}

Index

Properties

url feeds title? description? image? icon? language? region? html text internalLinks externalLinks

Properties

url

url: URL

Authoritative URL for the page.

Remarks

Uses canonical URL if present, otherwise the final URL after redirects.
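The fallback rule above can be sketched as follows. This is a minimal illustration, not the library's implementation: `pickAuthoritativeUrl` and its `canonicalHref` parameter are hypothetical names, and resolving a relative canonical link against the final URL is an assumption.

```typescript
// Hypothetical helper: `canonicalHref` stands for the href of a
// <link rel="canonical"> tag, if the page declares one.
function pickAuthoritativeUrl(finalUrl: URL, canonicalHref?: string): URL {
    if (canonicalHref) {
        try {
            // Resolve relative canonical links against the final URL (assumption).
            return new URL(canonicalHref, finalUrl);
        } catch {
            // Malformed canonical value: fall through to the final URL.
        }
    }
    return finalUrl;
}

const finalUrl = new URL("https://example.com/post?utm_source=feed");
const withCanonical = pickAuthoritativeUrl(finalUrl, "/post");
const withoutCanonical = pickAuthoritativeUrl(finalUrl);
```

Here `withCanonical` points at the canonical path, while `withoutCanonical` is the final URL unchanged.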

feeds

feeds: URL[]

Discovered feed URLs (RSS, Atom, JSON Feed) as URL objects.

`Optional` title

title?: string

Page title (cleaned, from best available source).

Remarks

Collects titles from multiple sources, cleans them, and picks the longest. Sources: OpenGraph, Twitter Card, the HTML title tag, and the first H1.
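A rough sketch of the "collect, clean, pick the longest" rule is shown here. The `TitleCandidates` shape, the `cleanTitle` step (whitespace collapsing only), and the tie-breaking behavior are all assumptions made for illustration; the actual cleaning rules are not documented on this page.

```typescript
// Hypothetical shape for the titles gathered from each source.
interface TitleCandidates {
    openGraph?: string;
    twitterCard?: string;
    htmlTitle?: string;
    firstH1?: string;
}

// Assumed cleaning step: collapse runs of whitespace and trim the ends.
function cleanTitle(raw: string): string {
    return raw.replace(/\s+/g, " ").trim();
}

function pickTitle(candidates: TitleCandidates): string | undefined {
    const cleaned = [
        candidates.openGraph,
        candidates.twitterCard,
        candidates.htmlTitle,
        candidates.firstH1,
    ]
        .filter((t): t is string => typeof t === "string")
        .map(cleanTitle)
        .filter((t) => t.length > 0);

    // Keep the longest title; on a tie the earlier source is kept.
    return cleaned.reduce<string | undefined>(
        (best, t) => (best === undefined || t.length > best.length ? t : best),
        undefined,
    );
}

const pickedTitle = pickTitle({
    openGraph: "Hello",
    htmlTitle: "  Hello   World | Site ",
});
```

Returning `undefined` when no source yields a non-empty title matches the optional `title?` property.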

`Optional` description

description?: string

Page description (from best available source).

Remarks

Collects descriptions from metadata and picks the longest. Sources: OpenGraph, Twitter Card, and the HTML meta description.

`Optional` image

image?: URL

Key visual (image) URL for the page (from best available source).

Remarks

Priority: OpenGraph > Twitter Card > Largest Apple Touch Icon > Favicon.

Returns the URL object of the best visual representation of the site.

`Optional` icon

icon?: URL

Best available icon/favicon for the site.

Remarks

Priority: Largest Apple Touch Icon > Safari mask icon > Favicon > Shortcut icon > MS tile > Fluid icon.

Returns the highest quality icon available, preferring modern, high-resolution formats.
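One way to read the priority chain is as a ranked lookup over discovered icon candidates. The sketch below assumes a hypothetical `IconCandidate` record and kind labels; it also applies the "largest" rule within every kind, although the documentation only states it for Apple Touch Icons.

```typescript
// Hypothetical icon kinds, listed in the documented priority order.
const ICON_PRIORITY = [
    "apple-touch-icon",
    "mask-icon",
    "favicon",
    "shortcut-icon",
    "ms-tile",
    "fluid-icon",
] as const;

type IconKind = (typeof ICON_PRIORITY)[number];

interface IconCandidate {
    kind: IconKind;
    url: URL;
    size?: number; // edge length in pixels, when declared
}

function pickIcon(candidates: IconCandidate[]): URL | undefined {
    for (const kind of ICON_PRIORITY) {
        const ofKind = candidates.filter((c) => c.kind === kind);
        if (ofKind.length === 0) continue;
        // Within one kind, prefer the largest declared size (assumption).
        const best = ofKind.reduce((a, b) => ((b.size ?? 0) > (a.size ?? 0) ? b : a));
        return best.url;
    }
    return undefined; // no icon found, so `icon` stays unset
}

const pickedIcon = pickIcon([
    { kind: "favicon", url: new URL("https://example.com/favicon.ico") },
    { kind: "apple-touch-icon", url: new URL("https://example.com/touch-120.png"), size: 120 },
    { kind: "apple-touch-icon", url: new URL("https://example.com/touch-180.png"), size: 180 },
]);
```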

`Optional` language

language?: string

Primary language code (ISO 639-1).

Remarks

Extracted from HTML lang attribute, content-language meta tag, or OpenGraph locale. Normalized to lowercase ISO 639-1 format (e.g., 'en', 'de', 'fr', 'ja').

`Optional` region

region?: string

Region code (ISO 3166-1 alpha-2).

Remarks

Only present if the language includes a region specifier. Normalized to uppercase ISO 3166-1 alpha-2 format (e.g., 'US', 'GB', 'DE').
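The normalization of `language` and `region` might look roughly like the sketch below. `splitLocale` is a hypothetical helper, and the parsing rules (accepting both `-` and `_` separators, skipping script subtags such as `Hant`) are assumptions rather than documented behavior.

```typescript
// Hypothetical helper: split a locale tag such as "en-US" or "pt_BR"
// into a lowercase language code and an uppercase region code.
function splitLocale(tag: string): { language?: string; region?: string } {
    const parts = tag.trim().split(/[-_]/);
    const lang = parts[0] ?? "";
    const result: { language?: string; region?: string } = {};

    // Only two-letter codes are treated as ISO 639-1 languages here.
    if (/^[A-Za-z]{2}$/.test(lang)) {
        result.language = lang.toLowerCase();
        // The first two-letter subtag after the language is taken as the region.
        const region = parts.slice(1).find((part) => /^[A-Za-z]{2}$/.test(part));
        if (region) {
            result.region = region.toUpperCase();
        }
    }
    return result;
}

const enUs = splitLocale("EN-us");
const de = splitLocale("de");
const ptBr = splitLocale("pt_BR");
```

With these rules, `de` yields a language but no region, which mirrors the "only present if the language includes a region specifier" remark.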

html

html: string

Raw HTML content of the page (UTF-8).

Remarks

The complete HTML source after fetching and decoding to UTF-8. Useful for custom processing or caching.

text

text: string

Plain text content extracted from the HTML.

Remarks

Automatically converted from HTML using the htmlToText function. Removes all tags, decodes entities, and preserves document structure with appropriate line breaks.

internalLinks

internalLinks: URL[]

Internal links found on the page (same domain, excluding current URL).

Remarks

All links are URL objects. The current page URL is excluded to avoid self-references. Useful for site crawling and navigation analysis.

externalLinks

externalLinks: URL[]

External links found on the page (different domains).

Remarks

All links are URL objects. Useful for analyzing outbound links, citations, and external resources.
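The split between `internalLinks` and `externalLinks` can be illustrated with a small partition function. This is a sketch under the assumption that "same domain" means an identical `hostname`; the library may compare domains differently (for example, treating subdomains as internal). `partitionLinks` is a hypothetical name.

```typescript
function partitionLinks(
    pageUrl: URL,
    links: URL[],
): { internalLinks: URL[]; externalLinks: URL[] } {
    const internalLinks: URL[] = [];
    const externalLinks: URL[] = [];

    for (const link of links) {
        if (link.hostname !== pageUrl.hostname) {
            // Different host: treated as external (assumption).
            externalLinks.push(link);
        } else if (link.href !== pageUrl.href) {
            // Same host, but not the current page itself.
            internalLinks.push(link);
        }
    }
    return { internalLinks, externalLinks };
}

const page = new URL("https://example.com/a");
const partitioned = partitionLinks(page, [
    new URL("https://example.com/a"), // self-reference, dropped
    new URL("https://example.com/b"),
    new URL("https://other.org/"),
]);
```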
