Poveroh/transaction-parser

How I Parsed Your Data – CSV Transaction Parser

A smart and robust CSV parser designed to automatically extract banking/financial transactions from CSV files of any bank or institution, even when the format is unknown or non-standard.

The class name (HowIParsedYourDataAlgorithm) is a tribute to Ted Mosby's iconic line from How I Met Your Mother:
"Kids, I'm going to tell you an incredible story — the story of how I parsed your data."

Purpose

Many users receive bank statements or transaction lists in CSV format with varying columns, names in different languages, and inconsistent date/amount formats. This parser uses advanced heuristics to:

  • Automatically detect the start of the data table (skipping headers, notes, or legends)
  • Map the correct columns for date, amount, currency, and description/title
  • Handle multiple fallbacks and global search when primary fields are ambiguous or empty
  • Support dozens of date formats, amount formats (with . or , separators and thousands), and currency symbols
  • Automatically determine whether a transaction is an income or expense
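The first step above, finding where the data table starts, can be illustrated with a small sketch. This is not the library's actual logic: `detectHeaderRow` and its `minColumns` parameter are hypothetical names, and the rule used here (a row of text-only cells followed by a row of the same width that contains digits) is just one plausible heuristic.

```typescript
// Hypothetical sketch, not the parser's real implementation.
// Returns the index of the first row that looks like a table header, or -1.
function detectHeaderRow(rows: string[][], minColumns = 3): number {
  for (let i = 0; i < rows.length - 1; i++) {
    const row = rows[i] ?? [];
    const next = rows[i + 1] ?? [];
    const filled = row.filter((cell) => cell.trim() !== "");
    // Header cells are assumed to be labels, so they should contain no digits.
    const allText = filled.every((cell) => !/\d/.test(cell));
    // The row right after the header should look like data (dates, amounts).
    const nextHasNumbers = next.some((cell) => /\d/.test(cell));
    if (
      filled.length >= minColumns &&
      allText &&
      next.length === row.length &&
      nextHasNumbers
    ) {
      return i;
    }
  }
  return -1;
}
```

Rows such as a bank name, an account number, or a blank line fail the column-count check and are skipped, which is how notes and legends above the table would be ignored under this assumption.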

Key Features

  • Automatic detection of the header row in the data table
  • Intelligent scoring based on:
    • Keywords in 5 languages (Italian, English, German, French, Spanish)
    • Regex patterns for dates, amounts, and currencies
    • Analysis of sample values from the first rows
  • Fallback management (up to 3 alternatives per field)
  • Robust date parsing (ISO, DD/MM/YYYY, MM-DD-YYYY, DD.MM.YYYY, Italian textual months like "15 gen 2025")
  • Amount normalization (removing currency symbols, handling comma/dot, thousands separators)
  • Currency recognition (EUR, USD, GBP, JPY, etc.) from code or symbol
  • Structured output with typed transactions and totals summary
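To make the scoring idea more concrete, here is a minimal sketch that combines a header keyword match with the share of sample values matching a regex, then keeps up to three ranked candidates per field. The keyword lists, weights, patterns, and the names `scoreColumn` and `rankColumns` are all illustrative assumptions, not the ones used by HowIParsedYourDataAlgorithm.

```typescript
// Hypothetical sketch of keyword + sample-value scoring.
type Field = "date" | "amount" | "title";

// Tiny illustrative keyword lists; the real lists are much larger.
const KEYWORDS: Record<Field, string[]> = {
  date: ["date", "data", "datum", "fecha"],
  amount: ["amount", "importo", "betrag", "montant", "importe"],
  title: ["description", "descrizione", "beschreibung", "libellé", "concepto"],
};

// Simplified value patterns, assumed for this example only.
const VALUE_PATTERNS: Record<Field, RegExp> = {
  date: /^\d{1,4}[\/.\-]\d{1,2}[\/.\-]\d{1,4}$/,
  amount: /^[-+]?\d{1,3}(?:[.,]?\d{3})*(?:[.,]\d{1,2})?$/,
  title: /[A-Za-z]{3,}/,
};

function scoreColumn(header: string, samples: string[], field: Field): number {
  const name = header.trim().toLowerCase();
  // A keyword hit in the header is weighted more than the sample values.
  const keywordScore = KEYWORDS[field].some((k) => name.includes(k)) ? 2 : 0;
  const nonEmpty = samples.map((s) => s.trim()).filter((s) => s !== "");
  const matches = nonEmpty.filter((s) => VALUE_PATTERNS[field].test(s)).length;
  const valueScore = nonEmpty.length > 0 ? matches / nonEmpty.length : 0;
  return keywordScore + valueScore;
}

// Returns the best column first, followed by fallbacks (at most maxCandidates).
function rankColumns(
  headers: string[],
  rows: string[][],
  field: Field,
  maxCandidates = 3
): string[] {
  return headers
    .map((header, i) => ({
      header,
      score: scoreColumn(header, rows.map((r) => r[i] ?? ""), field),
    }))
    .filter((c) => c.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxCandidates)
    .map((c) => c.header);
}
```

With headers such as "Data", "Descrizione", and "Importo", each field would rank its matching column first. Keeping the runner-up columns gives a parser something to fall back on when the primary column turns out to be empty for a given row.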

Installation

Install the papaparse dependency:

npm install papaparse

Usage

import HowIParsedYourDataAlgorithm from "./src/TransactionParser";

// Read the CSV file content as a string
const csvContent = await fetch("/path/to/bank-statement.csv").then((r) =>
  r.text()
);

const parser = new HowIParsedYourDataAlgorithm();

const result = await parser.parseCSVFile(csvContent);

console.log("Parsed transactions:", result.transactions.length);
console.log("Column mapping:", result.mapping);
console.log("Confidence:", result.mapping.confidence);
console.log("Total income:", result.summary.totalIncome);
console.log("Total expenses:", result.summary.totalExpenses);
console.log("Errors:", result.errors);

Example Output (IValueReturned)

{
  transactions: IReadedTransaction[];
  mapping: IFieldMapping;
  errors: string[];
  detectedStartRow?: number;
  summary: {
    totalTransactions: number;
    totalIncome: number;
    totalExpenses: number;
  }
}
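As an illustration of how the `summary` block relates to the transaction list, the sketch below recomputes the three totals from an array of transactions. The `AssumedAction` values ("INCOME" and "EXPENSE") and the `summarize` helper are assumptions made for this example; the real `TransactionAction` type may use different members.

```typescript
// Hypothetical stand-ins for the library's types; only the fields needed here.
type AssumedAction = "INCOME" | "EXPENSE"; // assumed values, not confirmed by the source

interface SummarizableTransaction {
  amount: number; // absolute value, as in IReadedTransaction
  action: AssumedAction;
}

interface Summary {
  totalTransactions: number;
  totalIncome: number;
  totalExpenses: number;
}

// Adds each absolute amount to the income or expense total based on its action.
function summarize(transactions: SummarizableTransaction[]): Summary {
  let totalIncome = 0;
  let totalExpenses = 0;
  for (const t of transactions) {
    if (t.action === "INCOME") {
      totalIncome += t.amount;
    } else {
      totalExpenses += t.amount;
    }
  }
  return { totalTransactions: transactions.length, totalIncome, totalExpenses };
}
```

Because amounts are stored as absolute values, both totals stay non-negative and the sign information lives only in `action`.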

Transaction Structure (IReadedTransaction)

{
  date: string; // ISO string
  amount: number; // absolute value
  action: TransactionAction;
  currency: Currencies;
  title: string; // description
  originalRow: Record<string, any>; // original row
}
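Since `amount` is stored as an absolute value and the direction lives in `action`, a raw cell such as "-45,90 €" has to be normalized and then split into the two fields. The sketch below shows one way this could work. `parseSignedAmount` and `toAmountAndAction` are hypothetical helpers, the "INCOME"/"EXPENSE" values are assumed, and treating a lone separator followed by exactly three digits as a thousands separator is a simplification.

```typescript
// Hypothetical sketch: normalize a raw amount string into a signed number.
function parseSignedAmount(raw: string): number | null {
  // Drop currency symbols, spaces, and letters; keep digits, separators, signs.
  let s = raw.replace(/[^\d.,\-+()]/g, "");
  if (!/\d/.test(s)) return null;
  // A minus sign or accounting-style parentheses mark a negative value.
  const negative = s.includes("-") || (s.startsWith("(") && s.endsWith(")"));
  s = s.replace(/[\-+()]/g, "");

  const lastDot = s.lastIndexOf(".");
  const lastComma = s.lastIndexOf(",");
  let decimalSep = "";
  if (lastDot !== -1 && lastComma !== -1) {
    // Both present: whichever comes last is the decimal separator.
    decimalSep = lastDot > lastComma ? "." : ",";
  } else if (lastDot !== -1 || lastComma !== -1) {
    const sep = lastDot !== -1 ? "." : ",";
    const count = s.split(sep).length - 1;
    const digitsAfter = s.length - s.lastIndexOf(sep) - 1;
    // "1.000" is ambiguous; this sketch assumes it means one thousand.
    if (count === 1 && digitsAfter !== 3) decimalSep = sep;
  }

  let intPart = s;
  let fracPart = "";
  if (decimalSep !== "") {
    const idx = s.lastIndexOf(decimalSep);
    intPart = s.slice(0, idx);
    fracPart = s.slice(idx + 1);
  }
  intPart = intPart.replace(/[.,]/g, ""); // remove thousands separators
  const value = Number(fracPart !== "" ? `${intPart}.${fracPart}` : intPart);
  if (Number.isNaN(value)) return null;
  return negative ? -value : value;
}

// Splits the signed value into an absolute amount plus an assumed action label.
function toAmountAndAction(
  raw: string
): { amount: number; action: "INCOME" | "EXPENSE" } | null {
  const signed = parseSignedAmount(raw);
  if (signed === null) return null;
  return { amount: Math.abs(signed), action: signed < 0 ? "EXPENSE" : "INCOME" };
}
```

Some banks report direction in a separate debit/credit column rather than through the sign, so a sign-based rule like this one would only cover part of the real cases.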

Supported Languages & Formats

Dates: ISO, DD/MM/YYYY, MM/DD/YYYY, DD.MM.YYYY, DD Mon YYYY (Italian: "15 gen 2025")
Amounts: with or without currency symbol, "." or "," as decimal separator, optional thousands separators
Currencies: EUR, USD, GBP, JPY, CHF, CAD, AUD + symbols (€ $ £ ¥ ₹ ₽)
Descriptions: over 100 keywords in Italian, English, German, French, Spanish
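The date formats listed above could be handled along the following lines. This is a simplified sketch with hypothetical names (`parseDate`, `toIsoDate`, `ITALIAN_MONTHS`), not the parser's own code. It assumes day-first order for numeric dates and switches to month-first only when the second number cannot be a month, and it returns a date-only ISO string (YYYY-MM-DD).

```typescript
// Hypothetical sketch of multi-format date parsing.
const ITALIAN_MONTHS = [
  "gen", "feb", "mar", "apr", "mag", "giu",
  "lug", "ago", "set", "ott", "nov", "dic",
];

// Builds a YYYY-MM-DD string, rejecting impossible dates such as 31/02.
function toIsoDate(year: number, month: number, day: number): string | null {
  const date = new Date(Date.UTC(year, month - 1, day));
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date.toISOString().slice(0, 10) : null;
}

function parseDate(raw: string): string | null {
  const s = raw.trim().toLowerCase();

  // ISO: 2025-01-15 (an optional time part after the date is ignored)
  let m = /^(\d{4})-(\d{2})-(\d{2})/.exec(s);
  if (m) return toIsoDate(Number(m[1]), Number(m[2]), Number(m[3]));

  // Numeric: DD/MM/YYYY, DD.MM.YYYY, DD-MM-YYYY, or month-first variants
  m = /^(\d{1,2})[\/.\-](\d{1,2})[\/.\-](\d{4})$/.exec(s);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    const year = Number(m[3]);
    // Assumption: day-first unless the second number is clearly a day (> 12).
    return b > 12 && a <= 12 ? toIsoDate(year, a, b) : toIsoDate(year, b, a);
  }

  // Textual Italian month: "15 gen 2025" or "15 gennaio 2025"
  m = /^(\d{1,2})\s+([a-z]{3})[a-z]*\.?\s+(\d{4})$/.exec(s);
  if (m) {
    const month = ITALIAN_MONTHS.indexOf(m[2] ?? "") + 1;
    return month > 0 ? toIsoDate(Number(m[3]), month, Number(m[1])) : null;
  }

  return null;
}
```

A date like 03/04/2025 is inherently ambiguous between DD/MM and MM/DD. A per-row rule cannot resolve it, so a column-level check over all sample values would be needed to pick the order reliably.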

Contributing

Contributions are welcome! You can:

  • Add new date/amount patterns
  • Expand keyword lists for additional languages
  • Improve scoring or fallback logic
  • Add tests with real (anonymized) CSV samples

License

MIT
