@extractus

Extractus

A set of extractor tools for devs

Welcome to Extractus

Here we develop and share a web extraction suite designed to transform the chaotic web into clean, structured data for AI, data analysis, and modern software development.

🌟 Featured Projects

  • article-extractor: The core engine for turning messy HTML into structured JSON.
  • feed-extractor: High-performance logic to parse RSS/Atom/JSON feeds with zero overhead.
  • oembed-extractor: Lightweight utility for social media metadata extraction.

Deploy them individually or in combination to power dynamic news platforms, automate content marketing pipelines, or curate high-quality datasets for NLP and AI research.
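
As a rough illustration of using the tools in combination, the sketch below reads a feed and then extracts each linked article. The extractor functions are passed in as parameters, so nothing here depends on a specific package API. The names `collectArticles`, `readFeed`, `readArticle`, `limit`, and `sourceLink`, and the assumed `entries`/`link` result shape, are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: combine a feed reader with an article reader.
// Assumption: `readFeed(url)` resolves to an object like { entries: [{ link }] }.
// Assumption: `readArticle(url)` resolves to an article object, or null if nothing was found.
async function collectArticles (feedUrl, { readFeed, readArticle, limit = 5 }) {
  const feed = await readFeed(feedUrl)

  // Keep only entries that carry a link, up to `limit` of them.
  const links = (feed?.entries ?? [])
    .map((entry) => entry.link)
    .filter(Boolean)
    .slice(0, limit)

  const articles = []
  for (const link of links) {
    try {
      const article = await readArticle(link)
      if (article) {
        articles.push({ ...article, sourceLink: link })
      }
    } catch {
      // Skip pages that fail to load or parse and keep the rest of the batch.
    }
  }
  return articles
}
```

In practice, `readFeed` and `readArticle` could be the extraction functions exported by feed-extractor and article-extractor; check each project's README for the exact signatures and result fields before wiring them in.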

Have a feature request or run into an issue? We welcome your feedback! Please open an issue to help us improve the ecosystem.

💎 Need to Scale? Meet Article Intelligence

If you are a content marketer, a news aggregator, or an enterprise team, managing your own extraction infrastructure can be a system administration headache.

We’ve built the Article Intelligence Suite, a managed API version of our core engine with advanced features:

  • ✅ Process millions of requests with 99.9% uptime
  • ✅ Implemented transformations for thousands of websites
  • ✅ Built-in translation, sentiment analysis, categorization, summarization, and more
  • ✅ Low Cost - Low Latency - Always On

Pinned

  1. article-extractor Public

    Extract the main article from a given URL with Node.js

    Language: JavaScript; Stars: 1.9k; Forks: 155

  2. oembed-extractor Public

    Extract oEmbed data from a given webpage

    Language: JavaScript; Stars: 122; Forks: 47

  3. feed-extractor Public

    The simplest way to read and normalize RSS/Atom/JSON feed data

    Language: JavaScript; Stars: 183; Forks: 36

Repositories

Showing 5 of 5 repositories
  • .github Public

    Organization metadata

    Stars: 1; License: MIT; Forks: 0; Issues: 0; Pull requests: 0; Updated Dec 27, 2025
  • oembed-extractor Public

    Extract oEmbed data from a given webpage

    Language: JavaScript; Stars: 122; License: MIT; Forks: 47; Issues: 0; Pull requests: 0; Updated Sep 4, 2025
  • feed-extractor Public

    The simplest way to read and normalize RSS/Atom/JSON feed data

    Language: JavaScript; Stars: 183; License: MIT; Forks: 36; Issues: 5; Pull requests: 0; Updated Sep 4, 2025
  • article-extractor Public

    Extract the main article from a given URL with Node.js

    Language: JavaScript; Stars: 1,854; License: MIT; Forks: 155; Issues: 6; Pull requests: 1; Updated Sep 4, 2025
  • extractus Public
    Language: HTML; Stars: 14; License: MIT; Forks: 0; Issues: 4 (1 issue needs help); Pull requests: 0; Updated Jul 25, 2024

Top languages

JavaScript, HTML