A smart and robust CSV parser designed to automatically extract banking/financial transactions from CSV files of any bank or institution, even when the format is unknown or non-standard.
The class name (HowIParsedYourDataAlgorithm) is a tribute to Ted Mosby's iconic line from How I Met Your Mother:
"Kids, I'm going to tell you an incredible story — the story of how I parsed your data."
Many users receive bank statements or transaction lists in CSV format with varying columns, names in different languages, and inconsistent date/amount formats. This parser uses advanced heuristics to:
- Automatically detect the start of the data table (skipping headers, notes, or legends)
- Map the correct columns for date, amount, currency, and description/title
- Handle multiple fallbacks and global search when primary fields are ambiguous or empty
- Support dozens of date formats, amount formats (with . or , separators and thousands), and currency symbols
- Automatically determine whether a transaction is an income or expense
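
To illustrate the first point, here is a minimal sketch of how the start of the data table could be located. It is not the library's actual code: `findHeaderRow`, its `minColumns` parameter, and the "text-only row followed by a row containing digits" rule are assumptions made for this example.

```typescript
// Hypothetical helper, not part of the documented API.
// Returns the index of the first row that looks like a table header:
// enough non-empty cells, none containing a digit, followed by a row
// with enough non-empty cells where at least one contains a digit.
function findHeaderRow(rows: string[][], minColumns = 3): number {
  const filled = (row: string[]) => row.filter((c) => c.trim() !== "");
  const hasDigit = (cell: string) => /\d/.test(cell);

  for (let i = 0; i < rows.length - 1; i++) {
    const header = filled(rows[i]);
    const next = filled(rows[i + 1]);
    const looksLikeHeader =
      header.length >= minColumns && !header.some(hasDigit);
    const looksLikeData = next.length >= minColumns && next.some(hasDigit);
    if (looksLikeHeader && looksLikeData) return i;
  }
  return -1; // no plausible header row found
}

// Sample export with a title, a note, and a blank line before the table.
const sampleRows: string[][] = [
  ["Account statement", "", "", ""],
  ["Generated on 01/02/2025", "", "", ""],
  ["", "", "", ""],
  ["Data", "Descrizione", "Importo", "Valuta"],
  ["15/01/2025", "Supermercato", "-42,50", "EUR"],
];

console.log(findHeaderRow(sampleRows)); // 3
```

A real implementation would likely weigh more signals (keywords, column counts across several rows); this sketch only shows the general idea of skipping preamble lines.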
- Automatic detection of the header row in the data table
- Intelligent scoring based on:
  - Keywords in five languages (Italian, English, German, French, Spanish)
  - Regex patterns for dates, amounts, and currencies
  - Analysis of sample values from the first rows
- Fallback management (up to 3 alternatives per field)
- Robust date parsing (ISO, DD/MM/YYYY, MM-DD-YYYY, DD.MM.YYYY, Italian textual months like "15 gen 2025")
- Amount normalization (removing currency symbols, handling comma/dot, thousands separators)
- Currency recognition (EUR, USD, GBP, JPY, etc.) from code or symbol
- Structured output with typed transactions and totals summary
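
The scoring idea from the list above can be sketched as follows. This is an illustration under stated assumptions, not the parser's real logic: the `Field` type, the keyword lists, the regex patterns, the 50/50 weighting, and the `scoreColumn` and `mapColumns` functions are all hypothetical.

```typescript
type Field = "date" | "amount" | "description";

// Illustrative keyword lists (assumed, far shorter than a real list would be).
const KEYWORDS: Record<Field, string[]> = {
  date: ["date", "data", "datum", "fecha"],
  amount: ["amount", "importo", "betrag", "montant", "importe"],
  description: ["description", "descrizione", "beschreibung", "libellé", "concepto"],
};

// Illustrative value patterns (assumed).
const PATTERNS: Record<Field, RegExp> = {
  date: /^\d{1,4}[./-]\d{1,2}[./-]\d{1,4}$/,
  amount: /^[+-]?\d{1,3}(?:[.,]?\d{3})*(?:[.,]\d{1,2})?$/,
  description: /[a-z]{3,}/i,
};

// Score in [0, 1]: half from the header keyword, half from the share of
// non-empty sample values that match the field's pattern.
function scoreColumn(header: string, samples: string[], field: Field): number {
  const name = header.trim().toLowerCase();
  const keywordScore = KEYWORDS[field].some((k) => name.includes(k)) ? 1 : 0;
  const nonEmpty = samples.map((s) => s.trim()).filter((s) => s !== "");
  const matches = nonEmpty.filter((s) => PATTERNS[field].test(s)).length;
  const sampleScore = nonEmpty.length > 0 ? matches / nonEmpty.length : 0;
  return 0.5 * keywordScore + 0.5 * sampleScore;
}

// Pick the best-scoring column index per field (-1 when nothing scores above 0).
function mapColumns(headers: string[], rows: string[][]): Record<Field, number> {
  const fields: Field[] = ["date", "amount", "description"];
  const mapping = {} as Record<Field, number>;
  for (const field of fields) {
    let best = -1;
    let bestScore = 0;
    for (let col = 0; col < headers.length; col++) {
      const samples = rows.map((r) => r[col] ?? "");
      const score = scoreColumn(headers[col], samples, field);
      if (score > bestScore) {
        best = col;
        bestScore = score;
      }
    }
    mapping[field] = best;
  }
  return mapping;
}

const headers = ["Data", "Descrizione", "Importo"];
const dataRows = [
  ["15/01/2025", "Supermercato", "-42,50"],
  ["16/01/2025", "Stipendio", "1.500,00"],
];
console.log(mapColumns(headers, dataRows)); // { date: 0, amount: 2, description: 1 }
```

Keeping the runner-up columns instead of only the best one would be a natural way to support the fallbacks mentioned above.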
npm install papaparse

import HowIParsedYourDataAlgorithm from "./src/TransactionParser";
// Read the CSV file content as a string
const csvContent = await fetch("/path/to/bank-statement.csv").then((r) =>
  r.text()
);
const parser = new HowIParsedYourDataAlgorithm();
const result = await parser.parseCSVFile(csvContent);
console.log("Parsed transactions:", result.transactions.length);
console.log("Column mapping:", result.mapping);
console.log("Confidence:", result.mapping.confidence);
console.log("Total income:", result.summary.totalIncome);
console.log("Total expenses:", result.summary.totalExpenses);
console.log("Errors:", result.errors);

The result returned by parseCSVFile has this shape:

{
  transactions: IReadedTransaction[];
  mapping: IFieldMapping;
  errors: string[];
  detectedStartRow?: number;
  summary: {
    totalTransactions: number;
    totalIncome: number;
    totalExpenses: number;
  }
}

Each entry in transactions has this shape:

{
  date: string; // ISO string
  amount: number; // absolute value
  action: TransactionAction;
  currency: Currencies;
  title: string; // description
  originalRow: Record<string, any>; // original row
}

Dates: ISO, DD/MM/YYYY, MM/DD/YYYY, DD.MM.YYYY, DD Mon YYYY (Italian: "15 gen 2025")
Amounts: with or without currency symbol, decimal separator "." or ",", optional thousands separator
Currencies: EUR, USD, GBP, JPY, CHF, CAD, AUD + symbols (€ $ £ ¥ ₹ ₽)
Descriptions: over 100 keywords in Italian, English, German, French, Spanish
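
As a rough illustration of the amount and date formats listed above, the following sketch normalizes a few of them. It is not the parser's implementation: `normalizeAmount`, `parseDate`, the `dayFirst` flag, and the rule that a lone separator followed by exactly three digits is a thousands separator are assumptions made for this example.

```typescript
// Hypothetical helper: turns "€ 1.234,56" or "$1,234.56" into a number.
function normalizeAmount(raw: string): number | null {
  // Keep digits, separators and sign; drops currency symbols, codes, spaces.
  let s = raw.replace(/[^\d.,+-]/g, "");
  if (s === "") return null;
  const lastDot = s.lastIndexOf(".");
  const lastComma = s.lastIndexOf(",");
  if (lastDot !== -1 && lastComma !== -1) {
    // Both present: the rightmost one is taken as the decimal separator.
    const decimal = lastDot > lastComma ? "." : ",";
    const thousands = decimal === "." ? "," : ".";
    s = s.split(thousands).join("").replace(decimal, ".");
  } else if (lastDot !== -1 || lastComma !== -1) {
    const parts = s.split(lastDot !== -1 ? "." : ",");
    // Assumption: repeated separators, or one separator followed by exactly
    // three digits, mean thousands grouping ("1.000.000", "1,234").
    const isThousands =
      parts.length > 2 || (parts.length === 2 && parts[1].length === 3);
    s = isThousands ? parts.join("") : parts.join(".");
  }
  const value = Number(s);
  return Number.isFinite(value) ? value : null;
}

const IT_MONTHS = ["gen", "feb", "mar", "apr", "mag", "giu",
                   "lug", "ago", "set", "ott", "nov", "dic"];

// Hypothetical helper: returns a date-only ISO string (YYYY-MM-DD) or null.
// Numeric dates are ambiguous, so the caller chooses day-first or month-first.
function parseDate(raw: string, dayFirst = true): string | null {
  const s = raw.trim().toLowerCase();
  let parts: [number, number, number] | null = null; // [year, month, day]

  const iso = s.match(/^(\d{4})-(\d{2})-(\d{2})/);
  const numeric = s.match(/^(\d{1,2})[./-](\d{1,2})[./-](\d{4})$/);
  const textual = s.match(/^(\d{1,2}) ([a-z]{3})[a-z]* (\d{4})$/);

  if (iso) {
    parts = [Number(iso[1]), Number(iso[2]), Number(iso[3])];
  } else if (numeric) {
    const a = Number(numeric[1]);
    const b = Number(numeric[2]);
    parts = dayFirst ? [Number(numeric[3]), b, a] : [Number(numeric[3]), a, b];
  } else if (textual) {
    const month = IT_MONTHS.indexOf(textual[2]) + 1;
    if (month > 0) parts = [Number(textual[3]), month, Number(textual[1])];
  }
  if (!parts) return null;

  // Reject impossible dates such as 31/02 by round-tripping through Date.
  const [y, m, d] = parts;
  const date = new Date(Date.UTC(y, m - 1, d));
  const valid =
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === m - 1 &&
    date.getUTCDate() === d;
  return valid ? date.toISOString().slice(0, 10) : null;
}

console.log(normalizeAmount("€ 1.234,56")); // 1234.56
console.log(normalizeAmount("-42,50")); // -42.5
console.log(parseDate("15 gen 2025")); // 2025-01-15
console.log(parseDate("01/15/2025", false)); // 2025-01-15
```

The sign of the normalized amount is what a caller could use to classify a row as income or expense before storing the absolute value.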
Contributions are welcome! You can:
- Add new date/amount patterns
- Expand keyword lists for additional languages
- Improve scoring or fallback logic
- Add tests with real (anonymized) CSV samples
License: MIT