
Kafka vs. EventBridge vs. Pub/Sub: A Practical Comparison

The cloud event-bus market is fragmented. I've shipped on all three. Here's how I'd choose today.

March 26, 2026 · 10 min read

Three years ago every event-driven system used Kafka. Today the answer is more nuanced. Cloud-native event buses have closed the gap on most workloads, but Kafka still wins where replay, strict ordering, or extreme throughput matter.

When a client asks me "do we need Kafka?" the honest answer is "probably not, but here's when you do."

I've shipped event-driven systems on all three: Kafka (TD Bank, DBS, several startup clients), AWS EventBridge (two SaaS clients), Google Pub/Sub (one media client). Each one is the right answer for a different problem.

What each one is actually good at

Kafka is a log. Messages are persisted in order, partitioned, replayable. You can have 50 consumer groups reading the same topic, each at their own pace. The replay-history capability is the killer feature.
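The log-plus-independent-offsets model is worth making concrete. A minimal sketch (illustrative names only, not any client library's API): one append-only list, and each consumer group tracks its own read position, so replay is just resetting an offset.

```python
# Sketch of Kafka's log model: one append-only log, many consumer
# groups, each tracking its own offset independently.
class Log:
    def __init__(self):
        self.records = []   # append-only, ordered
        self.offsets = {}   # consumer group -> next offset to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group, max_records=10):
        start = self.offsets.get(group, 0)   # new groups start at 0
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch

log = Log()
for i in range(5):
    log.append(f"event-{i}")

fast = log.poll("analytics", max_records=5)  # reads all 5
slow = log.poll("billing", max_records=2)    # reads only the first 2

# "Replay" is nothing more than rewinding one group's offset:
log.offsets["analytics"] = 0
```

Fifty consumer groups are just fifty entries in that offset map; none of them affects the others' pace.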

EventBridge is a router. You publish events, you write declarative rules, AWS routes them to consumers. The killer feature is integration breadth (it's pre-wired to ~200 AWS services and dozens of SaaS providers).
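The routing model is easy to picture. A simplified sketch of EventBridge-style pattern matching, where a rule is a pattern dict and an event matches if every pattern field lists the event's value (real patterns also support nesting, prefixes, and numeric comparisons):

```python
# Simplified EventBridge-style rule matching: every field in the
# pattern must list the event's value for that field.
def matches(pattern: dict, event: dict) -> bool:
    return all(event.get(field) in allowed
               for field, allowed in pattern.items())

# Hypothetical rule: route only OrderPlaced events from one service.
rule = {"source": ["orders.service"], "detail-type": ["OrderPlaced"]}

placed = {"source": "orders.service", "detail-type": "OrderPlaced"}
shipped = {"source": "orders.service", "detail-type": "OrderShipped"}
```

You write rules like `rule` declaratively; AWS evaluates them against every event on the bus and invokes the matching targets, so there is no consumer loop to run.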

Pub/Sub is a messaging service. Push or pull semantics, very high throughput, excellent integration with the GCP ecosystem.

When I reach for Kafka

  • Need to replay events for analytics or migration. Kafka's log retention is unmatched.
  • Need strict ordering within a partition.
  • Need >100K messages/second sustained (Kafka can do millions).
  • Have an on-prem or hybrid deployment where managed cloud services aren't an option.
  • Need stream processing (Kafka Streams or Flink against Kafka).
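On the ordering point: Kafka guarantees order only within a partition, so the technique is to key every event for one entity with the same value, which routes them all to one partition. A sketch of key-based partitioning (Kafka's default partitioner hashes with murmur2; `crc32` stands in here):

```python
import zlib

NUM_PARTITIONS = 6

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    # Same key -> same hash -> same partition, every time.
    return zlib.crc32(key) % num_partitions

# Every event keyed by one account id lands on one partition, so a
# consumer sees that account's events in the order they were produced.
home = partition_for(b"account-42")
```

Events for *different* keys can interleave freely across partitions; only per-key order is promised.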

When I reach for EventBridge

  • AWS-native architecture, and the breadth of integrations is doing real work.
  • Event volumes are moderate (<10K/sec).
  • Don't need replay (or can get away with re-running a job).
  • Want to declaratively route events without writing consumer code.

When I reach for Pub/Sub

  • GCP-native architecture.
  • Need very high throughput with low operational overhead.
  • Don't need replay.
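Pub/Sub's delivery contract is at-least-once: a pulled message stays outstanding until acked, and if the ack deadline lapses it is redelivered. A toy sketch of that lifecycle (illustrative only; the real client is `google-cloud-pubsub`):

```python
from collections import deque

class Subscription:
    """Toy at-least-once pull subscription: pull, ack, redeliver."""

    def __init__(self):
        self.queue = deque()
        self.outstanding = {}   # ack_id -> unacked message

    def publish(self, msg):
        self.queue.append(msg)

    def pull(self):
        if not self.queue:
            return None
        msg = self.queue.popleft()
        ack_id = f"ack-{len(self.outstanding)}-{msg}"
        self.outstanding[ack_id] = msg
        return ack_id, msg

    def ack(self, ack_id):
        self.outstanding.pop(ack_id, None)

    def expire_deadlines(self):
        # Anything never acked goes back on the queue for redelivery.
        for msg in self.outstanding.values():
            self.queue.append(msg)
        self.outstanding.clear()
```

The consequence for application code is the same on every at-least-once system: handlers must be idempotent, because a crash between processing and acking means the message comes back.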

The cost lens

Kafka self-hosted is expensive operationally (you're running brokers, ZooKeeper or KRaft, maintaining schemas). MSK (managed Kafka) flips that - you pay for it in dollars instead of engineer-hours.

EventBridge and Pub/Sub charge per event. Cheap at low volumes, but the bill scales linearly with traffic and gets expensive fast.

A back-of-the-envelope rule I use: at <1M events/day, EventBridge or Pub/Sub will be cheaper than Kafka end-to-end. At >100M/day, self-hosted or MSK Kafka pulls ahead. The grey zone is 1M-100M; the deciding factor is usually replay/ordering needs.
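That rule of thumb is easy to reproduce. A sketch of the arithmetic, where both prices are assumptions for illustration (not current list prices): a per-event bus at $1 per million events versus a small fixed-cost managed Kafka cluster at roughly $600/month.

```python
# ASSUMED prices for illustration only, not quotes:
PER_MILLION = 1.00            # $ per 1M events on a per-event bus
KAFKA_FIXED_MONTHLY = 600.0   # $ per month, small managed cluster

def per_event_monthly(events_per_day: int) -> float:
    """Monthly bill on a per-event-priced bus."""
    return events_per_day * 30 / 1_000_000 * PER_MILLION

for volume in (1_000_000, 10_000_000, 100_000_000):
    bus = per_event_monthly(volume)
    winner = "per-event bus" if bus < KAFKA_FIXED_MONTHLY else "Kafka"
    print(f"{volume:>11,}/day: bus ${bus:,.0f}/mo "
          f"vs Kafka ${KAFKA_FIXED_MONTHLY:,.0f}/mo -> {winner}")
```

At 1M/day the per-event bus is ~$30/month and Kafka's fixed cost can't compete; at 100M/day the bus is ~$3,000/month and the fixed cluster wins. The crossover sits in that 1M-100M grey zone, which is why the non-cost factors (replay, ordering) usually decide it.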

The architecture I default to in 2026

Greenfield startup, AWS, expecting moderate scale: EventBridge for cross-service events, SQS for fan-out, Step Functions for workflows. Kafka comes in only if we hit a constraint that the AWS native stack can't solve.

Greenfield enterprise with strict ordering needs: MSK or Confluent Cloud Kafka. The investment is justified.

Existing system on Kafka: don't rip it out unless you have a specific pain. Migration costs are always higher than expected.

Tags: kafka, aws, gcp, event-driven, architecture
