How to show 2 years of revenue when Stripe only remembers 3 months

The problem, in plain terms

I built a product called Indiecator. You connect it to your Stripe account (Stripe is the service most online businesses use to charge customers), and it shows you your revenue over time. How much you make each month, how fast you’re growing, how many customers are leaving. The kind of dashboard every subscription business wants but few have.

The promise was simple: connect your account, see the last 2 years. That promise turned out to be hard to keep, for a reason almost nobody expects.

The data source I was pulling from only remembers the last 3 months. I was promising 2 years of history while reading from something with a 3-month memory. Everything that follows is how I closed that gap.

One word you need: MRR

MRR means monthly recurring revenue. It’s the predictable income a subscription business earns every month. If 100 people pay you $10 a month, your MRR is $1,000. Someone upgrades, it goes up. Someone cancels, it goes down. The entire job of the dashboard is to track that number accurately, month by month, going back 2 years. Hold onto that. The rest is about where the numbers to compute it come from.
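
A minimal sketch of that arithmetic, for readers who prefer code. The `Subscription` type and its fields are hypothetical names made up for illustration, not Stripe's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    # Hypothetical record; field names are illustrative only.
    customer_id: str
    monthly_amount_cents: int  # price normalized to one month, in cents
    active: bool = True


def mrr_cents(subscriptions):
    """Sum the monthly amounts of every active subscription."""
    return sum(s.monthly_amount_cents for s in subscriptions if s.active)


# 100 customers paying $10 a month.
subs = [Subscription(f"cus_{i}", 1_000) for i in range(100)]
print(mrr_cents(subs) / 100)  # MRR in dollars
```

Amounts are kept in integer cents so that repeated additions and subtractions cannot accumulate floating-point rounding errors.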

First attempt: follow the play-by-play

Stripe can notify you the instant anything happens. Customer subscribes, you get a ping. Upgrade, ping. Cancellation, ping. Each ping is called an "event." Think of it as a live play-by-play of every money change, delivered as it happens.

So the obvious design: listen to the play-by-play, and every time a change comes in, adjust the running total. Add for a new subscription, subtract for a cancellation. It’s clean and it’s honest. Every number on the dashboard traces back to a specific moment something happened.
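
Roughly, that design looks like the sketch below. The event shapes and type names (`subscribed`, `plan_changed`, `canceled`) are simplified stand-ins invented for this example, not Stripe's real event payloads.

```python
def apply_event(mrr_cents, event):
    """Return the running MRR after applying one (hypothetical) event."""
    kind = event["type"]
    if kind == "subscribed":
        return mrr_cents + event["amount_cents"]
    if kind == "plan_changed":
        return mrr_cents + event["new_amount_cents"] - event["old_amount_cents"]
    if kind == "canceled":
        return mrr_cents - event["amount_cents"]
    return mrr_cents  # anything unrelated leaves the total untouched


def replay(events, start_cents=0):
    """Fold a stream of events into a single running total."""
    total = start_cents
    for event in events:
        total = apply_event(total, event)
    return total


events = [
    {"type": "subscribed", "customer": "cus_a", "amount_cents": 1_000},
    {"type": "subscribed", "customer": "cus_b", "amount_cents": 1_000},
    {"type": "plan_changed", "customer": "cus_a",
     "old_amount_cents": 1_000, "new_amount_cents": 2_500},
    {"type": "canceled", "customer": "cus_b", "amount_cents": 1_000},
]
print(replay(events))  # running MRR in cents
```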

I built that first. It worked in the demo. New changes came in, the chart moved. Ship it.

Week 4: the history wasn’t there

Then a real customer connected an account with 2 years of activity. The system tried to walk backward through the play-by-play to reconstruct their history. Around the 3-month mark, it ran out. The older plays simply weren’t there anymore. Stripe keeps the play-by-play for about 30 to 90 days, then drops it. Not hidden, not archived. Gone.

This is fine if you only care about right now. It is fatal if your whole product is "see your last 2 years." A play-by-play can tell you what happened this quarter. It can’t tell you what happened last year, because last year’s plays were thrown away.

The other source: the receipts

There’s a second place the data lives. Every time Stripe charges a customer, it keeps a receipt (an "invoice"). Unlike the play-by-play, receipts are kept forever. So the second idea: ignore the play-by-play, just read every receipt from day one, and rebuild the revenue history from those. This solves the memory problem completely. The receipts go back as far as you need.

But receipts have their own blind spot. They’re only created at billing time, once per billing cycle (monthly, in this example). Imagine a customer upgrades on the 3rd, then cancels on the 20th. The receipt at the end of the month shows the final state and quietly erases the two changes that happened in between. If you want to understand why revenue moved (how much came from upgrades, how much you lost to churn), that in-between detail is exactly what you need, and the receipts hide it.
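
To make the blind spot concrete, here is a small sketch of the upgrade-then-cancel example. It assumes a customer who started the month at $10 and upgraded to $25 (both amounts are made up), and it models the receipt view as a simple month-over-month difference, which is a simplification rather than how invoices are actually structured.

```python
def movements_from_events(events):
    """Classify each (old, new) monthly amount change as it happened."""
    totals = {"new": 0, "expansion": 0, "contraction": 0, "churn": 0}
    for old, new in events:
        delta = new - old
        if old == 0:
            totals["new"] += delta
        elif new == 0:
            totals["churn"] += delta
        elif delta > 0:
            totals["expansion"] += delta
        else:
            totals["contraction"] += delta
    return totals


def movement_from_receipts(previous_month_cents, this_month_cents):
    """A month-end view only sees the net difference between two months."""
    return this_month_cents - previous_month_cents


# A $10/month customer upgrades to $25 on the 3rd, then cancels on the 20th.
events = [(1_000, 2_500), (2_500, 0)]
by_event = movements_from_events(events)
by_receipt = movement_from_receipts(1_000, 0)
print(by_event, by_receipt)
```

Both views agree on the net change, but only the event view can say that the month contained an expansion followed by a larger churn.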

So now I had two incomplete sources. The play-by-play has every detail but only remembers 3 months. The receipts remember forever but smooth over the details.

The fix: use both, split by time

Neither source is complete alone, but their weaknesses don’t overlap. So I used both, divided by time. For everything older than the last few months, use the receipts (they’re all that survive back there anyway). For the recent window, use the detailed play-by-play (it still exists, and that’s where the rich detail matters most). The dividing line sits comfortably inside the 3-month window, so the two sources overlap a little rather than leaving a gap.
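
A sketch of the split, assuming two made-up thresholds: receipts cover anything at least 45 days old, and the play-by-play covers anything up to 60 days old. The exact numbers are illustrative; the only property that matters is that both sit inside the retention window and that they overlap.

```python
from datetime import datetime, timedelta, timezone

# Assumed values for illustration only.
EVENTS_FROM_AGE = timedelta(days=60)     # events are used for changes this recent
RECEIPTS_UNTIL_AGE = timedelta(days=45)  # receipts are used for changes this old


def sources_for(change_at, now):
    """Return which sources are responsible for a change at this timestamp."""
    age = now - change_at
    sources = set()
    if age >= RECEIPTS_UNTIL_AGE:
        sources.add("receipts")
    if age <= EVENTS_FROM_AGE:
        sources.add("events")
    return sources


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for days in (400, 50, 5):
    print(days, sorted(sources_for(now - timedelta(days=days), now)))
```

Changes between 45 and 60 days old are claimed by both sources, which is the deliberate overlap described above.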

That overlap creates an obvious risk: the same change counted twice, once from a receipt and once from a play. Double-counting would make every number wrong.

Here’s the rule that solved it, and it’s the part I’m actually proud of. Every change gets written into one shared list, and each entry is stamped with a fingerprint built from who it happened to, what changed, and when. The fingerprint is calculated the same way no matter which source it came from. So a change described by a receipt and the same change described by a play produce the identical fingerprint. When they overlap, they land on the same slot and merge automatically instead of counting twice.

That single decision (one shared list, identical fingerprints) is what turned "two messy data sources" into "one clean record with two contributors." Without it, every overlap is a judgment call and the totals slowly drift away from the truth.
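
Here is a minimal sketch of the fingerprint idea, not the production code. It assumes the "when" is rounded to a calendar day so that a receipt and a play describing the same change agree even if their timestamps differ slightly; the field names and the `upgrade:1000->2500` label are hypothetical.

```python
import hashlib
from datetime import date


def fingerprint(customer_id, change_type, day):
    """Build a source-independent key from who, what, and when."""
    raw = "|".join([customer_id, change_type, day.isoformat()])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def record(ledger, change, source):
    """Write a change into the shared list; identical changes merge."""
    key = fingerprint(change["customer_id"], change["change_type"], change["day"])
    entry = ledger.setdefault(key, {**change, "sources": set()})
    entry["sources"].add(source)  # remember every contributor
    return key


ledger = {}
upgrade = {
    "customer_id": "cus_a",
    "change_type": "upgrade:1000->2500",
    "day": date(2024, 3, 3),
}
record(ledger, upgrade, "receipt")
record(ledger, dict(upgrade), "event")  # same change, seen by the other source
print(len(ledger))  # number of distinct changes recorded
```

In a database, the same effect would typically come from a unique constraint on the fingerprint column combined with an upsert.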

How it runs day to day

In production, three jobs all feed that one shared list:

The live feed. As changes happen, they get recorded immediately. This is the real-time edge.

A daily catch-up. Live feeds miss things. Connections drop, notifications fail, things arrive out of order. So once a day a job re-checks the recent window and fills in anything the live feed missed. Because of the fingerprints, running it again never creates duplicates. It only fills gaps.

The one-time history rebuild. When you first connect your account, a job reads your receipts all the way back and builds your 2-year history. This is the part the play-by-play could never have done.

All three write to the same list, all three are safe to re-run, and none of them can double-count. That safety is the entire point.
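
The re-run safety can be sketched in a few lines. This example uses a plain tuple of (who, what, when) as a stand-in for the fingerprint, and the three jobs are reduced to lists of hypothetical changes; real jobs would fetch from Stripe instead.

```python
def write(ledger, change):
    """Upsert one change; return True only if it filled a gap."""
    key = (change["customer"], change["kind"], change["day"])
    is_new = key not in ledger
    ledger[key] = change
    return is_new


def run_job(ledger, changes):
    """Write a batch and report how many entries were actually new."""
    return sum(write(ledger, change) for change in changes)


ledger = {}
old = {"customer": "cus_a", "kind": "subscribed", "day": "2022-07-01"}
missed = {"customer": "cus_a", "kind": "upgraded", "day": "2024-05-03"}
seen = {"customer": "cus_b", "kind": "subscribed", "day": "2024-05-10"}

backfilled = run_job(ledger, [old])          # one-time history rebuild
live = run_job(ledger, [seen])               # live feed, which missed one change
caught_up = run_job(ledger, [missed, seen])  # daily catch-up over the recent window
rerun = run_job(ledger, [missed, seen])      # the same catch-up, run again
print(backfilled, live, caught_up, rerun, len(ledger))
```

The catch-up adds only the change the live feed missed, and running it a second time adds nothing.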

What I’d do differently, and what’s still open

I’d have found the 3-month limit before building, not in week 4 when a real account’s history turned out to be missing. It’s a single fact that quietly invalidated my first design. The takeaway isn’t "the play-by-play was a bad idea." It was right for the recent window. The takeaway is: before you build on top of any data source, find out how long it actually keeps its data. That one property decides your whole architecture.

Still unsolved: the dividing line between "use receipts" and "use the play-by-play" is a fixed point right now, parked safely inside the 3-month window. It would be smarter to set it per account, based on how far back each account’s play-by-play actually still reaches. I haven’t needed that yet. I will, the first time an account’s play-by-play reaches back less far than my fixed line assumes.
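
One possible shape for a per-account line, purely as a sketch of something that has not been built. It assumes the oldest surviving play for an account can be looked up and passed in as `oldest_event_at`; the 60-day default and 3-day safety margin are made-up values.

```python
from datetime import datetime, timedelta, timezone

# Assumed values for illustration only.
DEFAULT_CUTOFF_AGE = timedelta(days=60)
SAFETY_MARGIN = timedelta(days=3)


def cutoff_for_account(now, oldest_event_at):
    """Pick the receipts/play-by-play dividing line for one account.

    Receipts cover everything before the returned timestamp,
    the play-by-play covers everything after it.
    """
    if oldest_event_at is None:
        return now  # no plays survive, so receipts cover everything
    default_cutoff = now - DEFAULT_CUTOFF_AGE
    # Never trust the play-by-play further back than it actually reaches.
    return min(now, max(default_cutoff, oldest_event_at + SAFETY_MARGIN))


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(cutoff_for_account(now, now - timedelta(days=80)))  # long event history
print(cutoff_for_account(now, now - timedelta(days=20)))  # short event history
```

An account whose plays reach back further than the default keeps the fixed line; an account with a shorter play-by-play gets a later line, so receipts fill the part the plays cannot.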