Parse and clean email content - removes quotes, auto-signatures, and mailing list footers while preserving human signatures.
Maintained by Pinenlime
Unlike other email parsing libraries that aggressively remove all signatures, email-body-parser follows a conservative philosophy: only remove things we're 100% sure are not content.
| Feature | Other Libraries | email-body-parser |
|---|---|---|
| Human signatures ("Best, John") | ❌ Removes | ✅ Keeps |
| Mobile auto-signatures | ✅ Removes | ✅ Removes |
| Quote headers | ✅ Removes | ✅ Removes |
| Mailing list footers | ❌ Not handled | ✅ Removes |
| Legal disclaimers | ❌ Not handled | ✅ Removes |
| Compressed Outlook headers | ❌ Basic | ✅ Comprehensive |
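
To make the conservative philosophy concrete, here is a minimal sketch of allowlist-based trimming. It is not the library's implementation: the `AUTO_SIGNATURES` list and the `stripAutoSignatures` helper are hypothetical names invented for illustration. The idea is that a trailing line is removed only when it matches a known machine-generated signature, so a human sign-off is never touched.

```typescript
// Hypothetical allowlist (an assumption, not the library's real pattern set):
// only trailing lines matching one of these are treated as removable.
const AUTO_SIGNATURES: RegExp[] = [
  /^Sent from my (iPhone|iPad|Android)\.?$/i,
  /^Get Outlook for (iOS|Android)\.?$/i,
];

// Pop trailing lines while they are blank or match a known auto-signature.
// The first unrecognized line (for example a human sign-off) stops the loop.
function stripAutoSignatures(body: string): string {
  const lines = body.split(/\r?\n/);
  while (lines.length > 0) {
    const last = lines[lines.length - 1].trim();
    if (last === "" || AUTO_SIGNATURES.some((re) => re.test(last))) {
      lines.pop();
    } else {
      break;
    }
  }
  return lines.join("\n");
}

const trimmed = stripAutoSignatures(
  "Thanks!\n\nBest,\nJohn\n\nSent from my iPhone"
);
console.log(trimmed);
```

Because the loop stops at the first line it does not recognize, the cost of an unknown pattern is a leftover footer rather than lost content.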
npm install email-body-parser

For most use cases, the cleanEmailContent() function is all you need:
import { cleanEmailContent } from 'email-body-parser';
const rawEmail = `Thanks for the update!
Best regards,
John Smith
Product Manager
On Mon, Mar 17, 2025 at 1:29 PM Jane Doe <jane@example.com> wrote:
> Here's the latest report...
Sent from my iPhone`;
const cleaned = cleanEmailContent(rawEmail);
console.log(cleaned);
// Output:
// Thanks for the update!
//
// Best regards,
// John Smith
// Product Manager

For more control, use the EmailBodyParser class:
import EmailBodyParser from 'email-body-parser';
const parser = new EmailBodyParser();
const email = parser.parse(rawEmail);
// Get visible content (excludes quotes, auto-signatures)
console.log(email.getVisibleText());
// Get just the quoted portions
console.log(email.getQuotedText());
// Iterate over all fragments
for (const fragment of email.getFragments()) {
  console.log({
    content: fragment.content,
    isHidden: fragment.isHidden,
    isQuoted: fragment.isQuoted,
    isSignature: fragment.isSignature,
  });
}

const parser = new EmailBodyParser();
// Get visible text directly
const visibleText = parser.parseReply(rawEmail);
// Get quoted text directly
const quotedText = parser.parseReplied(rawEmail);

The following patterns are removed:

- Gmail style: `On Mon, Mar 17, 2025 at 1:29 PM John <john@example.com> wrote:`
- Outlook style: `-----Original Message-----`
- Forward headers: `From: ... Sent: ... To: ... Subject: ...`
- Standard quote markers: `> quoted text`
- Mobile: `Sent from my iPhone`, `Sent from my Android`
- Apps: `Sent via Superhuman`, `Get Outlook for iOS`
- Meeting links: `BOOK A MEETING...`
- Google Groups: `You received this message because...`
- Unsubscribe links: `Click here to unsubscribe`
- Marketing footers: `This email was sent to...`
- Legal disclaimers: `CONFIDENTIAL: This message contains...`, `DISCLAIMER: This email and any files...`
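
As an illustration of how line-level patterns like these can be recognized, the sketch below classifies a single line against a few simplified regular expressions. The `QUOTE_HEADER_SKETCHES` table and `describeQuoteLine` function are hypothetical; the library's real patterns are more comprehensive and may be structured differently.

```typescript
// Simplified, hypothetical patterns for three of the quote styles listed above.
const QUOTE_HEADER_SKETCHES: { description: string; pattern: RegExp }[] = [
  { description: "Gmail style", pattern: /^On .+ wrote:$/ },
  { description: "Outlook style", pattern: /^-{2,}\s*Original Message\s*-{2,}$/i },
  { description: "Quote marker", pattern: /^>/ },
];

// Return the description of the first matching pattern, or null when the
// line does not look like a quote header or quoted text.
function describeQuoteLine(line: string): string | null {
  const trimmed = line.trim();
  const hit = QUOTE_HEADER_SKETCHES.find(({ pattern }) => pattern.test(trimmed));
  return hit ? hit.description : null;
}

console.log(describeQuoteLine("-----Original Message-----"));
console.log(describeQuoteLine("Best regards,"));
```

Anchoring each pattern to the start and end of the line keeps ordinary sentences that merely contain "wrote" or a dash from being classified as quote headers.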
Human signatures are kept because they provide valuable context:
- Contact information for follow-ups
- Job titles help understand urgency
- Keeping them avoids false positives, where real content is mistaken for a signature and removed
const email = `Please review the attached document.
Best regards,
Sarah Williams
Senior Financial Analyst
Direct: (555) 234-5678
s.williams@example.com`;
cleanEmailContent(email);
// Returns the ENTIRE email - signature is preserved!

`cleanEmailContent(content)`: Cleans email content by removing quotes, auto-signatures, and mailing list footers.
Parameters:
- `content`: The raw email content to clean
Returns: Cleaned email content with quotes and auto-signatures removed
Options:
- `keepSignatures` (default: `true`): Keep human signatures
- `removeDisclaimers` (default: `true`): Remove legal disclaimers
- `removeMailingListFooters` (default: `true`): Remove mailing list footers
- `parse()`: Parses email content into fragments.
- `parseReply()`: Convenience method that returns visible text directly.
- `parseReplied()`: Convenience method that returns quoted text directly.
- `getFragments()`: Returns all email fragments.
- `getVisibleText()`: Returns content that is not hidden (excludes quotes, auto-signatures).
- `getQuotedText()`: Returns only the quoted portions of the email.
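
One way to picture how the two text getters relate to the fragment list is as a filter over fragment flags. The sketch below is an assumption about that relationship, not the library's code: `FragmentSketch` mirrors the fields of the `EmailFragment` type documented in this README, while `visibleText`, `quotedText`, and the newline join are hypothetical.

```typescript
// Local stand-in with the same fields as the documented EmailFragment type.
interface FragmentSketch {
  content: string;
  isHidden: boolean;
  isSignature: boolean;
  isQuoted: boolean;
}

// Visible text: every fragment that is not flagged as hidden.
function visibleText(fragments: FragmentSketch[]): string {
  return fragments
    .filter((f) => !f.isHidden)
    .map((f) => f.content)
    .join("\n");
}

// Quoted text: only the fragments flagged as quoted.
function quotedText(fragments: FragmentSketch[]): string {
  return fragments
    .filter((f) => f.isQuoted)
    .map((f) => f.content)
    .join("\n");
}

const sample: FragmentSketch[] = [
  { content: "Thanks!", isHidden: false, isSignature: false, isQuoted: false },
  { content: "> Here's the report", isHidden: true, isSignature: false, isQuoted: true },
  { content: "Sent from my iPhone", isHidden: true, isSignature: true, isQuoted: false },
];

console.log(visibleText(sample));
console.log(quotedText(sample));
```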
interface EmailFragment {
  content: string;      // The fragment text
  isHidden: boolean;    // True if this should be hidden from display
  isSignature: boolean; // True if this is an auto-signature
  isQuoted: boolean;    // True if this is quoted content
}

For advanced users, the pattern arrays are exported:
import {
  QUOTE_PATTERNS,
  AUTO_SIGNATURE_PATTERNS,
  MAILING_LIST_PATTERNS,
} from 'email-body-parser';

// Each pattern has metadata for debugging
QUOTE_PATTERNS.forEach(({ pattern, description, example }) => {
  console.log(description, example);
});

Full TypeScript support with exported types:
import type {
  EmailFragment,
  ParsedEmail,
  PatternDefinition,
  ParserOptions,
} from 'email-body-parser';

For better performance and ReDoS protection, install RE2 as an optional peer dependency:
npm install re2

The library will automatically use RE2 when available.
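
The general "use it if installed, otherwise fall back" pattern can be sketched as follows. This is an assumption about how such detection might work, not the library's actual loading code: `pickRegexEngine`, `RegexCtor`, and the injected loader are hypothetical names. In a Node.js project the loader would typically wrap `require('re2')`, which throws when the package is not installed.

```typescript
// Minimal constructor shape shared by the built-in RegExp and RE2-like engines.
type RegexCtor = new (source: string, flags?: string) => {
  test(input: string): boolean;
};

// Try the optional engine first; fall back to the built-in RegExp on failure.
function pickRegexEngine(loadOptional: () => RegexCtor): RegexCtor {
  try {
    return loadOptional();
  } catch {
    return RegExp;
  }
}

// Simulate a machine where the optional dependency is missing.
const Engine = pickRegexEngine(() => {
  throw new Error("optional engine not installed");
});

const autoSignature = new Engine("^Sent from my", "i");
console.log(autoSignature.test("sent from my iPhone"));
```

Passing the loader in as a function keeps the fallback logic testable without the optional package being present.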
MIT License - see LICENSE for details.
Created and maintained by Pinenlime.
Contributions are welcome! Please feel free to submit issues and pull requests.