Instrumentation Before Optimization
If you can't measure it, you can't fix it. Most teams skip this step.
Half the projects I get pulled into start with "we need to optimize X": the database, the front-end, the AI pipeline. My first question is always the same: what does your data say?
About half the time, they don't have data. They have opinions. Opinions are how you optimize the wrong thing. Instrumentation is the first move.
The instrumentation-before-optimization rule
Before any optimization work, I install:
- Distributed tracing (OTel everywhere)
- RED metrics on every endpoint (rate, errors, duration; a minimal middleware sketch follows this list)
- Database query logging at the slow-query threshold, plus aggregate statement stats (Postgres's pg_stat_statements is gold; queried below)
- Browser performance metrics (Web Vitals + custom marks)
- A 7-day baseline of all of the above
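To make the RED bullet concrete, here is a minimal sketch for an Express app using prom-client. One duration histogram, labeled by method, route, and status, gives you all three signals: rate is the sample count, errors are the non-2xx statuses, duration is the histogram itself. The bucket boundaries and the Express/prom-client pairing are assumptions; swap in whatever your stack uses.

```typescript
// red-metrics.ts -- minimal RED-metrics sketch for Express using prom-client.
// Bucket boundaries and labels are assumptions; tune them for your traffic.
import express from "express";
import client from "prom-client";

const httpDuration = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "Request duration; rate and errors fall out of the count and status label",
  labelNames: ["method", "route", "status"],
  buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});

export const app = express();

app.use((req, res, next) => {
  const stop = httpDuration.startTimer();
  res.on("finish", () => {
    // req.route is set once a handler matched; fall back to the raw path.
    stop({
      method: req.method,
      route: req.route?.path ?? req.path,
      status: res.statusCode,
    });
  });
  next();
});

// Expose everything for Prometheus to scrape.
app.get("/metrics", async (_req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.send(await client.register.metrics());
});
```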
THEN we look at where the time actually goes.
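On the database side, the first "where does the time go" question comes straight out of pg_stat_statements. A sketch using node-postgres; the column names assume Postgres 13 or later (older versions call it total_time), and the extension has to be enabled first:

```typescript
// top-queries.ts -- where does database time actually go?
// Assumes pg_stat_statements is in shared_preload_libraries and the
// extension is created; column names are Postgres 13+.
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the PG* env vars

async function topQueriesByTotalTime(): Promise<void> {
  const { rows } = await pool.query(`
    SELECT query,
           calls,
           round(total_exec_time::numeric, 1) AS total_ms,
           round(mean_exec_time::numeric, 2)  AS mean_ms
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10`);
  console.table(rows);
  await pool.end();
}

topQueriesByTotalTime().catch(console.error);
```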
What I usually find
In rough order of frequency:
- N+1 queries. A page makes 1 query, then 50 more inside a loop. Fixing this often delivers a 10x improvement without any architectural change (see the before/after sketch following this list).
- External API calls made in series. A page calls 5 services one after another when the calls are independent and could run in parallel (also sketched below).
- Unnecessary data being fetched. Pulling 500 columns when the page uses 12.
- Front-end JS bundle size. Bundle analyzer shows 600K minified for a page that needs maybe 80K.
- An inefficient algorithm inside a hot path. Less common, but high-impact when found.
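The top two fixes are mechanical enough to sketch. First the N+1, using node-postgres against hypothetical orders and users tables; the names are illustrative, the shape is what matters:

```typescript
import { Pool } from "pg";
const pool = new Pool();

// BEFORE (N+1): one query for the page, then one more per row inside the
// loop. 50 rows means 51 round trips to the database.
async function ordersWithNamesSlow() {
  const { rows: orders } = await pool.query(
    "SELECT id, user_id FROM orders ORDER BY id LIMIT 50");
  for (const order of orders) {
    const { rows } = await pool.query(
      "SELECT name FROM users WHERE id = $1", [order.user_id]);
    order.user_name = rows[0]?.name;
  }
  return orders;
}

// AFTER: one round trip; the join happens where the data lives.
async function ordersWithNamesFast() {
  const { rows } = await pool.query(`
    SELECT o.id, o.user_id, u.name AS user_name
    FROM orders o
    JOIN users u ON u.id = o.user_id
    ORDER BY o.id
    LIMIT 50`);
  return rows;
}
```

And the serial external calls, assuming Node 18+ (built-in fetch) and that the calls are independent; the service URLs are made up:

```typescript
// BEFORE: sequential awaits; total latency is the SUM of the three calls.
async function loadPageSlow() {
  const a = await fetch("https://svc-a.internal/data").then(r => r.json());
  const b = await fetch("https://svc-b.internal/data").then(r => r.json());
  const c = await fetch("https://svc-c.internal/data").then(r => r.json());
  return { a, b, c };
}

// AFTER: issue the independent calls together; total latency is the MAX.
async function loadPageFast() {
  const [a, b, c] = await Promise.all([
    fetch("https://svc-a.internal/data").then(r => r.json()),
    fetch("https://svc-b.internal/data").then(r => r.json()),
    fetch("https://svc-c.internal/data").then(r => r.json()),
  ]);
  return { a, b, c };
}
```

If one call depends on another's result, only the independent subset can run concurrently, but in practice that's usually most of them.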
Notice "we need a more powerful database" or "we should rewrite in Rust" doesn't appear. They almost never do.
A real example
A client told me "Postgres is the bottleneck, we need to migrate to ScyllaDB." I asked for the data. They didn't have any.
We instrumented for two weeks. Postgres CPU sat at 30%. The bottleneck was the application layer doing N+1 queries against a small table. We fixed the N+1 in a day. Latency dropped 60%. The Postgres "migration" project never happened.
Two weeks of instrumentation saved six months of engineering time and a six-figure migration risk.
What instrumentation costs
For most teams: 2-3 weeks of focused work to get good observability across the stack. After that, the ongoing cost is marginal.
For teams that already have observability but treat it as alarm-only: a week of building dashboards that answer the questions you actually have.
The discipline
Teams that succeed at performance work treat observability as a deliverable. They have a "before" snapshot and an "after" snapshot for every optimization. They share the comparison.
Teams that don't succeed at performance work treat observability as a thing they'll add later. They never do.