Serverless at Scale: Patterns and Pitfalls

When functions aren't enough

Sri Vardhan, January 10, 2024

Serverless is powerful but has sharp edges at scale. Learn the patterns that work, the antipatterns to avoid, and when to choose different architectures.

What "at scale" actually means

Serverless gets sold as a magic switch: write a function, deploy, scale to infinity. That story is true for the first ten thousand requests a day. It starts cracking somewhere between five hundred and a few thousand requests per second, or when a single workflow needs more than a few seconds of compute, or when your traffic shape stops looking like neat little independent events.

I still reach for serverless first. It is the right default for most APIs, async jobs, and event handlers. But "at scale" is where the platform's defaults stop helping and start hurting. This is a tour of the patterns I keep using and the traps I keep watching out for.

Cold starts are a product decision

Cold starts are not a bug. They are a knob. Every choice I make about runtime, package size, memory allocation, and provisioned concurrency moves that knob.

What I actually do:

  • Pick a fast runtime. In my experience Node and Go cold-start meaningfully faster than heavier runtimes such as the JVM
  • Trim the deployment artifact. Tree-shake, ship only what runs, prefer the AWS SDK v3 modular packages
  • Raise memory deliberately. More memory means more CPU, often a faster cold start, and less wall-clock time per invocation, which can offset the higher per-millisecond price
  • Use provisioned concurrency only on hot paths. It is real money, so I scope it to the handful of functions facing users

For predictable user-facing latency I tend to keep the truly latency-critical surface on a long-running container behind a load balancer and use functions for everything async around it.

Concurrency is your real budget

People talk about request rate. The platform talks about concurrency. They are not the same. A 200 ms function at 500 RPS uses about 100 concurrent executions. A 5 second function at the same rate uses 2,500. The second one will hit account limits long before the first.
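The arithmetic above is Little's law: concurrent executions equal request rate multiplied by average duration. A minimal Python sketch of that calculation follows. The function names and the limit of 1,000 are illustrative assumptions, not platform values to rely on, so check the quota on your own account.

```python
def required_concurrency(requests_per_second: float, avg_duration_ms: float) -> float:
    """Little's law: in-flight executions = arrival rate x average duration."""
    return requests_per_second * avg_duration_ms / 1000


def headroom(limit: float, *workloads: tuple) -> float:
    """Concurrency left under a shared limit; negative means over budget.

    Each workload is a (requests_per_second, avg_duration_ms) pair.
    """
    used = sum(required_concurrency(rps, ms) for rps, ms in workloads)
    return limit - used


# Assumed limit for illustration only; real quotas vary per account and region.
ASSUMED_REGIONAL_LIMIT = 1000

fast = required_concurrency(500, 200)    # 200 ms function at 500 RPS
slow = required_concurrency(500, 5000)   # 5 second function at 500 RPS
remaining = headroom(ASSUMED_REGIONAL_LIMIT, (500, 200), (500, 5000))
print(fast, slow, remaining)
```

A negative result from `headroom` is the signal to shorten the function, move work to a queue, or request a higher limit before launch rather than after the first throttle.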

I plan capacity in concurrency, not RPS. I set per-function reserved concurrency on anything that talks to a fragile downstream so a runaway feature cannot starve the rest of the platform. I treat the regional concurrency limit as a shared resource and I monitor it.

The downstream is the bottleneck

A serverless function in front of a relational database is the classic foot-gun. The function will scale. The database will not. I have watched a single deploy spin up thousands of concurrent connections in seconds and brick a Postgres instance.

The patterns that work:

  • A connection proxy that pools and reuses connections. RDS Proxy, PgBouncer, or a similar layer
  • Aggressive read caching at the edge or in a fast key-value store
  • Queue-based smoothing for any write pattern that can tolerate seconds of delay
  • Idempotency keys on every write so retries do not double-charge or double-create
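The idempotency-key bullet can be sketched as follows. This is a simplified illustration with an in-memory dictionary standing in for a durable table. `PaymentStore` and `charge_once` are hypothetical names, and a real system would need an atomic conditional write (for example a unique constraint on the key) rather than a read followed by a write.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PaymentStore:
    """In-memory stand-in for a table keyed by idempotency key (assumption)."""
    charges: Dict[str, dict] = field(default_factory=dict)

    def charge_once(self, idempotency_key: str, customer_id: str, amount_cents: int) -> dict:
        # A retry with the same key returns the original result instead of
        # creating a second charge.
        existing = self.charges.get(idempotency_key)
        if existing is not None:
            return existing
        record = {
            "charge_id": str(uuid.uuid4()),
            "customer_id": customer_id,
            "amount_cents": amount_cents,
        }
        # In production this must be one atomic "insert if absent" operation.
        self.charges[idempotency_key] = record
        return record


store = PaymentStore()
first = store.charge_once("order-42-attempt", "cust-1", 1999)
retry = store.charge_once("order-42-attempt", "cust-1", 1999)
print(first["charge_id"] == retry["charge_id"], len(store.charges))
```

The caller generates the key once per logical operation and reuses it on every retry, so the platform's automatic retries become harmless.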

I cover the database side of this in more depth in "PostgreSQL for everything."

Async is the secret weapon

The biggest wins I have seen on serverless platforms come from removing things from the request path. Email sends, search index updates, audit logs, analytics, webhooks to third parties: none of these need to block the user.

A queue plus a worker function is almost always the right shape. SQS, SNS, EventBridge, Kafka, whatever your platform offers. I make the queue the contract. Producers do one thing: enqueue. Consumers do one thing: process and ack. Failures land in a dead-letter queue with enough context to replay.

This pattern is half the reason serverless feels cheap. You stop paying for compute that is just waiting on someone else's network.
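The producer and consumer contract described in this section can be sketched with an in-memory queue. This is an illustration of the shape, not of any particular queue service. `enqueue`, `drain`, and `MAX_ATTEMPTS` are hypothetical names, and a managed queue would handle redelivery and dead-lettering for you.

```python
from collections import deque
from typing import Callable, Deque, List

MAX_ATTEMPTS = 3  # assumed retry budget before a message is dead-lettered


def enqueue(queue: Deque[dict], payload: dict) -> None:
    """Producer side: do one thing, which is put the message on the queue."""
    queue.append({"payload": payload, "attempts": 0})


def drain(queue: Deque[dict], dead_letters: List[dict], handler: Callable[[dict], None]) -> int:
    """Consumer side: process and ack, or retry, or dead-letter with context."""
    processed = 0
    while queue:
        message = queue.popleft()
        message["attempts"] += 1
        try:
            handler(message["payload"])
            processed += 1  # ack: the message is simply not re-queued
        except Exception as exc:
            if message["attempts"] >= MAX_ATTEMPTS:
                # Keep the payload and the error so the message can be replayed.
                dead_letters.append({**message, "error": repr(exc)})
            else:
                queue.append(message)
    return processed


def send_email(payload: dict) -> None:
    """Hypothetical worker; fails when the recipient is missing."""
    if "to" not in payload:
        raise ValueError("missing recipient")


queue: Deque[dict] = deque()
dead_letters: List[dict] = []
enqueue(queue, {"to": "user@example.com"})
enqueue(queue, {"subject": "no recipient"})
print(drain(queue, dead_letters, send_email), len(dead_letters))
```

Storing the payload, attempt count, and error on the dead-lettered message is what makes replay possible without digging through logs.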

Observability is not optional

Stateless functions with no shared memory are a debugging nightmare without proper tooling. I will not ship a serverless system without:

  • Structured JSON logs with a correlation ID propagated across every hop
  • Distributed tracing on the critical paths
  • Custom metrics for business events, not just infrastructure
  • Alarms tied to user-visible symptoms, not internal noise

Earlier in my career working on regulated systems, I learned the hard way that "the function ran" is not the same as "the work succeeded." I instrument outcomes, not invocations.
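A minimal sketch of structured logs with a propagated correlation ID, using only the Python standard library. The header name `x-correlation-id`, the event shape, and the field names are assumptions for illustration. The point is that every log line carries the same ID and records an outcome, not just an invocation.

```python
import json
import time
import uuid


def get_correlation_id(event: dict) -> str:
    """Reuse the caller's ID when present, otherwise start a new one."""
    headers = event.get("headers") or {}
    return headers.get("x-correlation-id") or str(uuid.uuid4())


def log_event(correlation_id: str, message: str, **fields) -> str:
    """Emit one JSON object per line so log tooling can index the fields."""
    record = {
        "timestamp": time.time(),
        "correlation_id": correlation_id,
        "message": message,
        **fields,
    }
    line = json.dumps(record, default=str)
    print(line)
    return line


def handler(event: dict, context=None) -> dict:
    correlation_id = get_correlation_id(event)
    log_event(correlation_id, "order.received", order_id=event.get("order_id"))
    # Real work would happen here; the outcome is logged explicitly.
    outcome = "succeeded"
    log_event(correlation_id, "order.processed", outcome=outcome)
    # Return the ID so the next hop (queue message, HTTP call) can carry it on.
    return {"correlation_id": correlation_id, "outcome": outcome}
```

Passing the returned `correlation_id` into every downstream message or request header is what lets a single search reconstruct the whole path of one user action.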

When functions stop being enough

There is a point where the right answer is to leave functions behind for that specific workload. Signs I watch for:

  • Steady traffic with no real spikes (a container is cheaper)
  • Long-running streaming or websocket connections
  • Heavy startup work that cannot be amortized
  • Workloads needing GPU or large memory footprints
  • A handful of "hot" functions consuming most of the cost

The honest answer is usually a hybrid. Containers behind a load balancer for the steady core, functions for the spiky edges and async work. I help teams make this call as part of architecture work.
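To make the "a container is cheaper" sign concrete, here is a rough break-even sketch. It assumes the common function pricing shape of a per-request fee plus a charge per GB-second of compute. Every price below is a hypothetical placeholder, not a real rate, so substitute the numbers from your provider's pricing page.

```python
def function_monthly_cost(
    requests_per_month: float,
    avg_duration_ms: float,
    memory_gb: float,
    price_per_million_requests: float,
    price_per_gb_second: float,
) -> float:
    """Per-request fee plus compute billed in GB-seconds (assumed pricing model)."""
    request_cost = requests_per_month / 1_000_000 * price_per_million_requests
    gb_seconds = requests_per_month * (avg_duration_ms / 1000) * memory_gb
    return request_cost + gb_seconds * price_per_gb_second


def cheaper_option(function_cost: float, container_cost: float) -> str:
    return "container" if container_cost < function_cost else "functions"


# Hypothetical inputs for illustration only.
steady = function_monthly_cost(
    requests_per_month=300_000_000,
    avg_duration_ms=100,
    memory_gb=0.5,
    price_per_million_requests=0.25,   # placeholder price
    price_per_gb_second=0.00002,       # placeholder price
)
flat_container_cost = 150.0            # placeholder monthly cost
print(round(steady, 2), cheaper_option(steady, flat_container_cost))
```

The comparison ignores operational cost, which is real: a container fleet needs patching, scaling rules, and on-call attention that the function platform absorbs for you.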

A short antipattern list

  • Calling Lambda from Lambda from Lambda. Use Step Functions or a queue
  • Storing state in /tmp and praying for warm starts
  • Hammering a single Postgres without a proxy
  • Treating timeouts as the only failure mode (think about throttling, partial failures, retries)
  • Skipping local emulation. Test the function logic the same way you test any other unit
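Two of the items above, partial failures and testing function logic as an ordinary unit, can be illustrated together. The sketch assumes an SQS-triggered Lambda with partial batch responses (`ReportBatchItemFailures`) enabled on the event source mapping. `process_order` and the order fields are hypothetical.

```python
import json


def process_order(order: dict) -> None:
    """Pure business logic with no platform imports, so it tests like any unit."""
    if "order_id" not in order:
        raise ValueError("order_id is required")


def handler(event: dict, context=None) -> dict:
    """Report only the failed messages so the rest of the batch is not retried."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_order(json.loads(record["body"]))
        except Exception:
            # Malformed JSON and business errors both land here.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# A local "unit test" needs nothing more than a dictionary shaped like the event.
sample_event = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"order_id": 42})},
        {"messageId": "2", "body": json.dumps({"note": "missing id"})},
    ]
}
print(handler(sample_event))
```

Without the partial batch response, one bad message fails the whole batch and the good messages get reprocessed, which is another reason the idempotency keys matter.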

The takeaway

Serverless at scale is mostly about respecting boundaries: concurrency budgets, downstream limits, observability, and knowing when a different shape fits the workload. Done well, you get an architecture that absorbs spikes, costs less than you'd expect, and lets a small team operate a system that used to need a platform crew. Done poorly, you get the same incidents as before, just in a more expensive package.
