Magpie HTML - v0.1.3

      Magpie HTML - Universal web content scraper for Node.js and browsers

      A modern TypeScript library for parsing web feeds (RSS, Atom, JSON Feed), extracting metadata, and scraping article content from HTML. Designed to be isomorphic, type-safe, and resilient to malformed data.

      Key features:

      • Universal feed parser with automatic format detection
      • Comprehensive metadata extraction (SEO, OpenGraph, Schema.org, etc.)
      • Article content extraction with Mozilla Readability
      • Smart URL resolution (relative to absolute)
      • Content quality assessment
      • Full TypeScript support
      • Minimal runtime dependencies
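
As an illustration of the "relative to absolute" URL resolution mentioned in the feature list, here is a minimal sketch built only on the standard WHATWG `URL` constructor, which is available in both Node.js and browsers. The helper name `resolveUrl` is hypothetical and is not part of the Magpie HTML API; the library's own resolution logic may differ.

```typescript
// Hypothetical helper for illustration only; not a Magpie HTML export.
// Resolves a possibly relative href against the URL of the page it came from.
function resolveUrl(href: string, baseUrl: string): string | null {
  try {
    // The two-argument URL constructor applies standard resolution rules
    // (path-relative, root-relative, and protocol-relative references).
    return new URL(href, baseUrl).href;
  } catch {
    // The constructor throws a TypeError when the input cannot be parsed.
    return null;
  }
}

const base = "https://example.com/blog/post.html";
console.log(resolveUrl("../images/cover.png", base)); // https://example.com/images/cover.png
console.log(resolveUrl("/feed.xml", base)); // https://example.com/feed.xml
console.log(resolveUrl("//cdn.example.com/app.js", base)); // https://cdn.example.com/app.js
console.log(resolveUrl("cover.png", "not a url")); // null
```

Returning `null` instead of throwing is one way to stay resilient to malformed links in scraped HTML; callers can then skip entries that fail to resolve.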

      Classes

      PluckError
      PluckNetworkError
      PluckTimeoutError
      PluckHttpError
      PluckSizeError
      PluckEncodingError
      PluckRedirectError
      PluckContentTypeError
      SwoopError
      SwoopEnvironmentError
      SwoopTimeoutError
      SwoopExecutionError
      SwoopSecurityError

      Interfaces

      HtmlToTextOptions
      ContentExtractionOptions
      ExtractedContent
      ExtractionFailure
      ContentQuality
      FeedAuthor
      FeedEnclosure
      FeedItem
      Feed
      ParseResult
      Website
      Article
      AnalyticsMetadata
      AssetsMetadata
      PreloadResource
      ConnectionHint
      CanonicalMetadata
      CopyrightMetadata
      DublinCoreMetadata
      DiscoveredFeed
      FeedDiscoveryMetadata
      GeoPosition
      GeoMetadata
      AppleTouchIcon
      MaskIcon
      MSTile
      IconsMetadata
      LanguageMetadata
      LinksExtractionOptions
      LinksMetadata
      MonetizationMetadata
      NewsMetadata
      OpenGraphArticle
      OpenGraphVideo
      OpenGraphAudio
      OpenGraphImage
      OpenGraphBook
      OpenGraphProfile
      OpenGraphMetadata
      PaginationMetadata
      RobotDirectives
      RobotsMetadata
      JsonLdBlock
      SchemaOrgMetadata
      SecurityMetadata
      SEOMetadata
      SitemapDiscoveryMetadata
      SocialProfilesMetadata
      TwitterAppPlatform
      TwitterApp
      TwitterPlayer
      TwitterCardMetadata
      VerificationMetadata
      PluckInit
      PluckResponse
      SwoopInit
      SwoopResult

      Type Aliases

      ExtractionErrorType
      ContentResult
      FeedFormat
      SwoopWaitStrategy
      HTMLDocument

      Functions

      extractContent
      htmlToText
      countWords
      calculateReadingTime
      assessContentQuality
      isProbablyReaderable
      detectFormat
      isFeed
      isRSS
      isAtom
      isJSONFeed
      parseFeed
      gatherArticle
      gatherFeed
      gatherWebsite
      extractAnalytics
      extractAssets
      extractCanonical
      extractCopyright
      extractDublinCore
      extractFeedDiscovery
      extractGeo
      extractIcons
      extractLanguage
      extractMonetization
      extractNews
      extractOpenGraph
      extractPagination
      extractRobots
      extractSchemaOrg
      extractSecurity
      extractSEO
      extractSitemapDiscovery
      extractSocialProfiles
      extractTwitterCard
      extractVerification
      pluck
      swoop
      parseHTML
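
The index above lists `detectFormat`, `isRSS`, `isAtom`, and `isJSONFeed` without signatures. To give a rough idea of what automatic format detection can involve, the sketch below sniffs the start of a document for the markers of each feed format. Everything in it is an assumption made for illustration: `sniffFeedFormat` and `SniffedFormat` are hypothetical names, and the library's actual functions may take different arguments, return different values, and use a different detection strategy.

```typescript
// Hypothetical sketch; not the library's detectFormat implementation.
type SniffedFormat = "rss" | "atom" | "json" | "unknown";

function sniffFeedFormat(input: string): SniffedFormat {
  const text = input.trimStart();

  // JSON Feed documents are JSON objects whose "version" is a jsonfeed.org URL.
  if (text.startsWith("{")) {
    try {
      const doc = JSON.parse(text) as { version?: unknown };
      if (typeof doc.version === "string" && doc.version.includes("jsonfeed.org")) {
        return "json";
      }
    } catch {
      // Malformed JSON: treat as unknown rather than throwing.
    }
    return "unknown";
  }

  // For XML, inspect only the first chunk, where the root element appears.
  const head = text.slice(0, 2048);
  if (/<rss[\s>]/i.test(head)) return "rss";
  if (/<feed[\s>]/i.test(head)) return "atom";
  return "unknown";
}

console.log(sniffFeedFormat('<?xml version="1.0"?><rss version="2.0"><channel/></rss>')); // rss
console.log(sniffFeedFormat('<feed xmlns="http://www.w3.org/2005/Atom"></feed>')); // atom
console.log(sniffFeedFormat('{"version":"https://jsonfeed.org/version/1.1","items":[]}')); // json
console.log(sniffFeedFormat("<html><body>Not a feed</body></html>")); // unknown
```

A string-based check like this avoids a full XML parse, which keeps it usable on malformed input, at the cost of possible false positives when a non-feed document happens to contain one of these tags near the top.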