How I Cut Chargeback Fraud 73% on a Fintech Project
An anonymized case study with the real numbers.
Three months on a fintech engagement, three rule-engine refactors, and a small AI scoring layer. Here's how the numbers actually moved.
Anonymized case study. Numbers are real, identifying details are not.
The starting state
A mid-stage fintech doing $80M GMV/year. Chargeback rate had crept from 0.6% to 1.4% in 18 months. They'd hit Visa's monitoring program threshold. Penalties imminent.
Existing fraud system: a 200-rule hand-tuned engine, false-positive rate around 8%, slow to update.
My engagement
Three-month fixed-bid engagement. Goal: cut chargeback rate below 1% without sacrificing approval rate.
What I changed
Phase 1 (weeks 1-3): instrumentation. The team didn't know which rules were firing, which were catching real fraud, and which were rejecting good customers. I built a dashboard tracking every rule's true-positive and false-positive rates over time. It turned out 40 rules fired regularly but had near-zero true positives - pure noise.
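The bookkeeping behind that dashboard is simple: join each rule firing against the eventual chargeback outcome, then compute per-rule rates. A minimal sketch (class and method names are my illustration, not the client's actual system):

```python
from collections import defaultdict

class RuleMetrics:
    """Per-rule tallies: how often each rule fires, and how often
    a firing is later confirmed as real fraud (a chargeback)."""

    def __init__(self):
        self.fired = defaultdict(int)     # total firings per rule
        self.true_pos = defaultdict(int)  # firings later confirmed as fraud

    def record(self, rule_id: str, was_fraud: bool) -> None:
        self.fired[rule_id] += 1
        if was_fraud:
            self.true_pos[rule_id] += 1

    def true_positive_rate(self, rule_id: str) -> float:
        fired = self.fired[rule_id]
        return self.true_pos[rule_id] / fired if fired else 0.0

    def noise_rules(self, min_tpr: float = 0.01) -> list[str]:
        """Rules that fire but almost never catch real fraud."""
        return [r for r in self.fired
                if self.true_positive_rate(r) < min_tpr]
```

Running this over a few weeks of labeled outcomes is what surfaced the 40 noise rules.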
Phase 2 (weeks 4-7): rule cleanup. Disabled the noise rules, tightened the high-value ones with better thresholds. False-positive rate dropped from 8% to 5.5%. No effect yet on chargeback rate.
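Tightening a high-value rule is mostly a sweep over historical data: for each candidate threshold, measure the false-positive rate and keep the most aggressive setting that stays within a false-positive budget. A hedged sketch under assumed data shapes (the `(value, was_fraud)` pairs and the dollar-threshold framing are illustrative):

```python
def tune_threshold(transactions, candidates, max_fpr=0.02):
    """Pick the lowest (most aggressive) threshold whose false-positive
    rate on historical data stays under budget.

    transactions: list of (value, was_fraud) pairs from historical data.
    candidates: threshold values to try, e.g. amounts for a velocity rule.
    """
    legit_total = sum(1 for _, fraud in transactions if not fraud)
    loosest = None
    for t in sorted(candidates):  # ascending: tightest thresholds first
        false_pos = sum(1 for v, fraud in transactions
                        if v >= t and not fraud)
        fpr = false_pos / legit_total if legit_total else 0.0
        if fpr <= max_fpr:
            return t  # tightest threshold within the FP budget
        loosest = t
    return loosest  # nothing fit the budget: fall back to the loosest
```

The same loop, run per rule, is what moved the false-positive rate from 8% to 5.5% without touching catch rate on the rules that mattered.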
Phase 3 (weeks 8-12): scoring layer. Built a gradient-boosted model on 18 months of historical chargeback data. 47 features, mostly velocity/network/device. Output: a 0-1 risk score.
The score didn't replace rules - it augmented them. Rules still ran first; when a rule produced a borderline outcome, the model's score broke the tie.
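The decision flow can be sketched as a rules-first pipeline where the model only weighs in on borderline outcomes. The APPROVE/REVIEW/DECLINE labels and the 0.8 cutoff here are my illustration, not the production values:

```python
from enum import Enum

class Verdict(Enum):
    APPROVE = "approve"
    REVIEW = "review"    # borderline: rules couldn't decide cleanly
    DECLINE = "decline"

def decide(txn, rules, score_fn, block_above=0.8):
    """Rules run first; the 0-1 risk score only breaks borderline ties."""
    verdict = rules(txn)
    if verdict is not Verdict.REVIEW:
        return verdict  # a clear rule outcome stands, model never consulted
    # Borderline: let the model's risk score break the tie.
    return Verdict.DECLINE if score_fn(txn) >= block_above else Verdict.APPROVE
```

This split is what keeps the system auditable: every hard approve/decline has a named rule behind it, and the model's influence is confined to the cases the rules already flagged as ambiguous.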
Final numbers
- Chargeback rate: 1.4% → 0.38% (73% reduction)
- False positive rate: 8% → 4.2% (47% reduction - yes, both improved)
- Approval rate on legit traffic: +2.1%
What I'd do differently
I overinvested in feature engineering on the model. The simpler version (with 12 features instead of 47) was within 4% of the final model. Diminishing returns.
I'd also have shipped a "shadow mode" earlier - the model running but not blocking - to validate it against production traffic before flipping the switch.
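Shadow mode is cheap to build: the model scores every transaction and the score is logged next to the live decision, but the decision itself ignores it. A sketch, with the logger name and 0.8 cutoff as assumptions:

```python
import logging

logger = logging.getLogger("fraud.shadow")

def decide_with_shadow(txn, rules_decision, score_fn, shadow=True):
    """Score every transaction; only act on the score once shadow is off."""
    risk = score_fn(txn)
    # Log score alongside the live decision so the model can be
    # validated against real production traffic before it blocks anything.
    logger.info("txn=%s risk=%.3f live=%s", txn.get("id"), risk, rules_decision)
    if shadow:
        return rules_decision  # model observed, not enforced
    return "decline" if risk >= 0.8 else rules_decision
```

A few weeks of these logs tell you exactly what the model would have blocked, and how many of those blocks would have been good customers, before any real traffic is affected.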
Lessons for fintech teams
- Instrumentation first. You can't improve what you can't measure. 80% of "we need ML" problems are actually "we don't measure our existing rules" problems.
- Don't replace rules - augment them. Rules are auditable in a way ML models aren't, and regulators love rules. Use the model to break ties.
- Watch approval rate. It's easy to cut chargebacks by being overly restrictive. The right metric is profit-per-attempted-transaction, not chargeback rate alone.
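Profit-per-attempted-transaction can be computed from numbers most teams already track. A hedged sketch - the average order value, margin, and per-chargeback cost multiple below are illustrative assumptions, not the client's figures:

```python
def profit_per_attempt(attempts, approved, chargebacks,
                       avg_order=120.0, margin=0.05, chargeback_cost=1.5):
    """Profit per attempted transaction, not chargeback rate alone.

    chargeback_cost: total loss per chargeback as a multiple of order
    value (lost goods plus fees) - an illustrative assumption.
    """
    revenue = approved * avg_order * margin
    losses = chargebacks * avg_order * chargeback_cost
    return (revenue - losses) / attempts if attempts else 0.0
```

The useful property: declining everything drives chargebacks to zero but also drives this metric to zero, so it punishes over-restrictive tuning that a bare chargeback rate would reward.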
This is the kind of work I do on engagements. If your fraud numbers look like the starting state above, let's talk.