Update design with real CSV format analysis and household details

Adds known source formats (Chase credit card with headers, Wells Fargo
checking headerless), description normalization strategy, cross-account
transfer detection, source category hints, household income sources,
and sample categorization rules based on real transaction data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 14:05:51 -05:00
parent a1d550d1e0
commit 0e1013c673


@@ -35,6 +35,16 @@ src/
- id, name, relationship
- Seeded with the primary user on first run
- Initial household: Andrew (self), Donna (wife), son
### Known income sources
Based on real data:
- **Andrew's salary** — CIM Techniques via QuickBooks direct deposit (biweekly, ~$2,314)
- **Donna's salary** — Oasis Batch payroll (biweekly, varies ~$998 to $1,185)
- **Other deposits** — Mobile deposits, Capital One transfers in
These are auto-detected as income and attributed to the corresponding household member via categorization rules.
### accounts
@@ -65,18 +75,60 @@ src/
## CSV Import & Mapping
### Known Source Formats
Based on real sample data:
**Chase credit card** (e.g., `Chase0372_Activity*.CSV`):
- Has headers: Transaction Date, Post Date, Description, Category, Type, Amount, Memo
- Amount is signed (negative=purchase, positive=payment/return)
- Type column: Sale, Payment, Return
- Includes Chase's own category assignments — import as initial suggestions
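
A minimal sketch of reading this format with Python's standard `csv` module. The `parse_chase` helper, the `MM/DD/YYYY` date format, and the sample rows are assumptions for illustration; only the column names and the sign convention come from the notes above.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

def parse_chase(text):
    """Parse Chase credit card CSV text into normalized dicts."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Assumption: dates are MM/DD/YYYY.
            "date": datetime.strptime(rec["Transaction Date"], "%m/%d/%Y").date(),
            "description": rec["Description"].strip(),
            # Amount is already signed: negative = purchase.
            "amount": Decimal(rec["Amount"]),
            "type": rec["Type"],
            # Chase's own category, kept only as a suggestion.
            "source_category": rec["Category"] or None,
        })
    return rows

# Made-up sample rows in the documented column order.
sample = (
    "Transaction Date,Post Date,Description,Category,Type,Amount,Memo\n"
    "01/15/2026,01/16/2026,EXAMPLE STORE,Shopping,Sale,-25.00,\n"
    "01/28/2026,01/28/2026,Payment Thank You - Web,,Payment,1461.35,\n"
)
chase_rows = parse_chase(sample)
```

Using `Decimal` rather than `float` keeps currency amounts exact, which matters later when amounts are compared for transfer matching.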
**Wells Fargo checking** (e.g., `Checking1.csv`):
- No headers — five positional columns: Date, Amount, Flag ("*"), Check Number, Description
- Amount is signed (negative=debit, positive=credit)
- Descriptions are very verbose with embedded authorization dates, card numbers, and reference IDs
- Requires description normalization to extract core merchant/payee
### Headerless CSV support
The mapping wizard handles files with no headers by showing column previews with positional indices. The user assigns meaning by column position rather than by header name.
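
As a rough sketch of positional mapping, the snippet below reads a headerless file using a saved field-to-index mapping. The `WELLS_FARGO_MAPPING` dictionary, the `parse_positional` helper, and the sample row (including its date format) are hypothetical; the column order follows the Wells Fargo layout described above.

```python
import csv
import io
from decimal import Decimal

# Hypothetical saved mapping: normalized field -> column position.
WELLS_FARGO_MAPPING = {"date": 0, "amount": 1, "check_number": 3, "description": 4}

def parse_positional(text, mapping):
    """Read a headerless CSV, picking fields by column index."""
    out = []
    for cols in csv.reader(io.StringIO(text)):
        if not cols:
            continue  # skip blank lines
        row = {field: cols[idx].strip() for field, idx in mapping.items()}
        row["amount"] = Decimal(row["amount"])
        out.append(row)
    return out

# Made-up row: Date, Amount, Flag, Check Number, Description.
sample = '"01/28/2026","-1461.35","*","","CHASE CREDIT CRD EPAY 260128"\n'
wf_rows = parse_positional(sample, WELLS_FARGO_MAPPING)
```

The flag column (index 2) is simply left out of the mapping, so it is ignored on import.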
### Description normalization
The service layer strips noise from transaction descriptions before storage:
- Authorization dates ("PURCHASE AUTHORIZED ON 02/06")
- Card numbers ("CARD 5360")
- Reference IDs ("S586033096695382", "REF #OP0WS99NKQ")
- Redundant location/phone details
The cleaned description is stored for display and rule matching. The raw description is preserved in a separate field for reference.
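
One way this could look, as a sketch only: the regular expressions below are assumptions inferred from the handful of examples listed above, and real data would likely need more patterns (location and phone details are not handled here). The sample string is a composite built from those examples, and `normalize_description` is a hypothetical name.

```python
import re

# Assumed patterns, derived only from the examples listed above.
NOISE_PATTERNS = [
    re.compile(r"PURCHASE AUTHORIZED ON \d{2}/\d{2}", re.IGNORECASE),
    re.compile(r"\bCARD \d{4}\b", re.IGNORECASE),
    re.compile(r"\bREF #\w+", re.IGNORECASE),
    re.compile(r"\bS\d{15}\b"),
]

def normalize_description(raw):
    """Strip known noise fragments and collapse leftover whitespace."""
    cleaned = raw
    for pattern in NOISE_PATTERNS:
        cleaned = pattern.sub(" ", cleaned)
    return " ".join(cleaned.split())

raw = "PURCHASE AUTHORIZED ON 02/06 PUBLIX S586033096695382 CARD 5360"
# Keep both forms: cleaned for display and rules, raw for reference.
record = {"raw_description": raw, "description": normalize_description(raw)}
```

Storing both fields means the patterns can be revised later and the cleaned text regenerated from the raw value.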
### Source category hints
When a CSV includes pre-assigned categories (like Chase does), the import process can use these as initial category suggestions. The user can accept, override, or ignore them.
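
A small sketch of the accept/override/ignore behavior. The `SOURCE_CATEGORY_MAP` contents and the `suggest_category` function are hypothetical; the actual source category names would come from the imported files.

```python
# Hypothetical mapping from source-provided categories to app categories.
SOURCE_CATEGORY_MAP = {"Groceries": "Groceries", "Food & Drink": "Dining Out"}

def suggest_category(source_category, user_choice=None):
    """Return (category, origin). A user choice always wins over a hint."""
    if user_choice is not None:
        return user_choice, "user"
    hint = SOURCE_CATEGORY_MAP.get(source_category or "")
    if hint is not None:
        return hint, "source_hint"
    # No usable hint: leave the transaction for rules or manual review.
    return None, "uncategorized"
```

Recording the origin alongside the category makes it possible to tell a confirmed choice from an unreviewed hint.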
### Cross-account transfer detection
The import service detects matching transfer pairs across accounts. For example:
- Checking: "CHASE CREDIT CRD EPAY 260128 ... -$1,461.35"
- Chase: "Payment Thank You - Web ... +$1,461.35"
These are linked as transfers so they don't double-count as spending or income.
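
The matching criteria are not spelled out above, so the following is only a sketch under assumed rules: two transactions form a pair when they belong to different accounts, their amounts sum to zero, and their dates fall within a few days of each other. The `find_transfer_pairs` name, the `max_days` window, and the sample dates are assumptions.

```python
from datetime import date, timedelta
from decimal import Decimal

def find_transfer_pairs(txns, max_days=3):
    """Pair transactions from different accounts whose amounts cancel out."""
    pairs, used = [], set()
    for i, a in enumerate(txns):
        if i in used:
            continue
        for j in range(i + 1, len(txns)):
            b = txns[j]
            if j in used or a["account"] == b["account"]:
                continue
            close = abs(a["date"] - b["date"]) <= timedelta(days=max_days)
            if close and a["amount"] != 0 and a["amount"] + b["amount"] == 0:
                pairs.append((i, j))
                used.update((i, j))
                break
    return pairs

txns = [
    {"account": "checking", "date": date(2026, 1, 28), "amount": Decimal("-1461.35")},
    {"account": "chase", "date": date(2026, 1, 29), "amount": Decimal("1461.35")},
    {"account": "chase", "date": date(2026, 1, 29), "amount": Decimal("-25.00")},
]
pairs = find_transfer_pairs(txns)  # index pairs into txns
```

Amount-and-date matching alone can produce false positives when two unrelated transactions share an amount, so a description check (for example, a rule such as `CHASE CREDIT CRD EPAY`) is a sensible additional filter.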
### First import from a new source
1. App reads CSV, detects delimiter, and checks for headers vs. headerless format
2. Mapping wizard shows a preview of the first few rows
3. User maps columns to normalized fields: date, amount, description, and optionally reference number, balance, source category
4. User specifies amount logic (single signed column, separate debit/credit columns, type column)
5. User selects or creates the account this CSV belongs to
6. Mapping saved to `csv_mappings` keyed on column header fingerprint (or column count + sample patterns for headerless files)
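
The fingerprint scheme is not defined in this document. As one possible sketch, a header fingerprint could be a hash of the normalized header names, and a headerless fingerprint could hash the column count plus a coarse "shape" for each cell in a sample row. Both function names and the shape categories are assumptions.

```python
import hashlib

def header_fingerprint(headers):
    """Hash of header names, ignoring case and surrounding whitespace."""
    normalized = "|".join(h.strip().lower() for h in headers)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def headerless_fingerprint(first_row):
    """Fingerprint from column count plus a coarse shape for each cell."""
    def shape(cell):
        cell = cell.strip()
        if not cell:
            return "empty"
        try:
            float(cell.replace(",", ""))
            return "number"
        except ValueError:
            pass
        # Assumption: slash-separated dates such as 01/28/2026.
        return "date" if cell.count("/") == 2 else "text"
    key = f"{len(first_row)}:" + ",".join(shape(c) for c in first_row)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

A single-row shape is fragile: a row with a check number would hash differently from one without. Sampling several rows, or treating empty cells as wildcards, would make the headerless match more robust.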
### Subsequent imports
1. App matches CSV headers (or structure) to a saved mapping
2. Confirmation screen shows: transaction count, date range, duplicates detected
3. User confirms before committing
@@ -96,12 +148,23 @@ Categorization engine runs on new transactions. Unmatched transactions flagged f
### Rule-based categorization
Rules match patterns against transaction descriptions and assign: category, optional tag override, optional person attribution. Examples from real data:
- `CIMTECHNIQUES` -> Income, attributed to Andrew
- `OASISBATCH PAYROLL` -> Income, attributed to Donna
- `CHASE CREDIT CRD EPAY` -> Transfer
- `CAPITAL ONE TRANSFER` -> Transfer
- `AMEX EPAYMENT` -> Transfer, attributed to Donna
- `FREEDOM MTG PYMTS` -> Housing, Needs (mortgage)
- `DOMINION ENERGY` -> Utilities, Needs
- `HELLOFRESH` -> Groceries, Needs
- `Netflix.com` -> Subscriptions, Wants
- `PUBLIX|ALDI|PIGGLY WIGGLY|WAL-MART` -> Groceries, Needs
- `CHICK-FIL-A|KFC|MCDONALD` -> Dining Out, Wants
- `VW CREDIT` -> Transportation, Needs (auto loan)
- `FARM BUREAU INS` -> Insurance, Needs
- `WSFS LOAN` -> Debt Payment, Needs
- `WAY2SAVE SAVINGS` -> Transfer (savings)
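
A minimal sketch of how such rules might be evaluated, assuming the patterns are treated as case-insensitive regular expressions (the `|` alternation in the examples suggests this, but it is not confirmed) and that the first matching rule wins. The `RULES` tuple layout and the `categorize` function are hypothetical, and only a subset of the examples is included.

```python
import re

# (pattern, category, tag, person): a subset of the examples above.
RULES = [
    (r"CIMTECHNIQUES", "Income", None, "Andrew"),
    (r"OASISBATCH PAYROLL", "Income", None, "Donna"),
    (r"CHASE CREDIT CRD EPAY", "Transfer", None, None),
    (r"PUBLIX|ALDI|PIGGLY WIGGLY|WAL-MART", "Groceries", "Needs", None),
    (r"CHICK-FIL-A|KFC|MCDONALD", "Dining Out", "Wants", None),
]
COMPILED = [(re.compile(p, re.IGNORECASE), c, t, who) for p, c, t, who in RULES]

def categorize(description):
    """Return the first matching rule's assignment, or None if unmatched."""
    for pattern, category, tag, person in COMPILED:
        if pattern.search(description):
            return {"category": category, "tag": tag, "person": person}
    return None
```

Because `search` matches substrings, a short pattern such as `ALDI` can also match inside unrelated words; anchoring with `\b` word boundaries is one way to tighten it. Transactions that return `None` would be the ones flagged for manual review.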
### Auto-rule creation