
What Anthropic Got Right About AI Safety That OpenAI Got Wrong

I've shipped on both APIs for two years. The cultural difference shows up in the product.

Sri Vardhan · Sep 10, 2024 · 10 min read

Safety isn't a marketing position. It's a series of small product decisions that compound. After two years of building on both Claude and GPT, the difference in how each lab thinks shows up in the failure modes I see in production.

I build with both labs' APIs every week. I'm not a safety researcher and I'm not a true believer. I'm an engineer shipping production systems. From that vantage point, I can tell you that Anthropic and OpenAI have made meaningfully different bets, and the differences show up in my code.

Bet 1: Constitutional AI vs RLHF on vibes

Anthropic's Constitutional AI paper is, in my reading, the most important practical document of the last three years. The premise is that you train the model against an explicit, written set of principles, so you can reason about why it refuses or accepts requests.

OpenAI's training process is much less legible. The system prompt and refusal behavior shift between model versions in ways that have repeatedly broken my code without warning.

In production, that legibility matters. When Claude refuses something, I can usually predict why. When GPT refuses, I'm sometimes guessing.

Bet 2: Defaults that bias toward being useful

This is the bet that surprised me most. OpenAI's older defaults bias toward refusal in ambiguous cases. Anthropic's defaults bias toward asking a clarifying question or providing a hedged answer.

For a customer-facing assistant in a regulated industry, "asks a clarifying question" is enormously valuable. "Refuses with a generic safety message" creates a support ticket.
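
To make that concrete, here's the rough shape of the routing this enables, sketched in Python. Everything in it is illustrative: the marker strings, the helper names, and the heuristic itself stand in for what, in my real systems, is a cheap classifier call tuned per deployment.

```python
from enum import Enum

class Outcome(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    REFUSE = "refuse"

# Illustrative markers only; real detection is a classifier, not substrings.
REFUSAL_MARKERS = ("i can't help with", "i cannot assist with")

def classify_response(text: str) -> Outcome:
    """Crude stand-in: decide what kind of turn the model produced."""
    lowered = text.strip().lower()
    if lowered.endswith("?"):
        return Outcome.CLARIFY
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        return Outcome.REFUSE
    return Outcome.ANSWER

def route(model_text: str) -> str:
    outcome = classify_response(model_text)
    if outcome is Outcome.CLARIFY:
        # A clarifying question keeps the conversation alive: relay it as-is.
        return model_text
    if outcome is Outcome.REFUSE:
        # A generic refusal dead-ends the user, so hand off to a human queue
        # instead of letting it turn into a support ticket.
        return "Let me connect you with a specialist who can help."
    return model_text

if __name__ == "__main__":
    print(route("Which account do you mean: checking or savings?"))
    print(route("I can't help with that request."))
```

The difference between those two branches is the whole argument: one path continues the conversation, the other ends it.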

Bet 3: Tool use as a first-class primitive

Anthropic's tool use API is more boring than OpenAI's function calling, and that's the point. The contracts are stable. The error modes are predictable. I've shipped agent loops on Claude that have run for six months without behavioral drift across model updates.

I cannot say the same about GPT-based agents I've shipped. Every model version has shifted tool calling in subtle ways, sometimes silently.
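
To show what "boring" buys you, here's the skeleton of the loop I keep reusing, sketched against Anthropic's Messages API. The weather tool, the dispatch stub, and the turn cap are placeholders made up for this post, and the model snapshot is just one dated example; check the docs for current names.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder tool; the input_schema shape is the actual API contract.
TOOLS = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def run_tool(name: str, tool_input: dict) -> str:
    """Stand-in for real tool dispatch."""
    if name == "get_weather":
        return f"Weather in {tool_input['city']}: 22°C, clear."
    raise ValueError(f"unknown tool: {name}")

def agent_loop(user_message: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",  # pin a dated snapshot
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            # No more tool calls: concatenate the text blocks and finish.
            return "".join(b.text for b in response.content if b.type == "text")
        # Echo the assistant turn back, then answer every tool call in it.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]})
    raise RuntimeError("agent loop exceeded its turn budget")

print(agent_loop("What's the weather in Lisbon?"))
```

The stable parts are the stop_reason check and the tool_result echo. Those are the contracts that haven't drifted on me.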

Bet 4: The product surface

Look at where each company spends product effort.

Anthropic ships Claude Code, the Files API, MCP, computer use. All of these are developer-facing. All of them ship with documented limitations and security considerations.

OpenAI ships consumer features: ChatGPT, voice mode, the Sora app. All exciting, none of them aimed at the engineer building production systems on a five-year horizon.

That's a strategic call, not a flaw. But if I'm betting on which lab I want to build on for the next decade, the developer focus matters.

What OpenAI got right

I want to be fair. OpenAI is faster to ship raw capability. GPT-4 came first, GPT-4 Turbo came first, the multimodal demos came first. If you need the absolute frontier of capability today, OpenAI has often been there first.

The trade-off is that OpenAI's frontier models tend to be less stable in production. And for high-volume backend workloads, the Sonnet/Haiku pricing has typically penciled out better anyway.
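
When I say it pencils out, I mean arithmetic this simple. The rates below are placeholders, not quoted prices (check the current pricing pages); the point is that a forecast is only as stable as the rates you plug into it.

```python
# Placeholder rates in USD per million tokens; look up current pricing.
ASSUMED_PRICES = {"input": 3.00, "output": 15.00}

def monthly_spend(requests_per_day: int, in_tokens: int, out_tokens: int,
                  prices: dict = ASSUMED_PRICES, days: int = 30) -> float:
    """Forecast monthly spend for a high-volume backend workload."""
    per_request = (in_tokens * prices["input"]
                   + out_tokens * prices["output"]) / 1_000_000
    return requests_per_day * per_request * days

# e.g. 50k requests/day at 1,500 input and 300 output tokens each
print(f"${monthly_spend(50_000, 1_500, 300):,.2f}/month")
```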

Where the cultural difference shows up in code

Three real examples from my own codebase:

  1. Refusal handling. With Claude I have one structured refusal handler. With OpenAI I have four, because the refusal shapes have changed across versions. (A sketch of this contrast follows the list.)
  2. System prompt portability. I can usually port a Claude system prompt to a new model with minor edits. GPT prompts I rewrite from scratch each major version.
  3. Cost predictability. Claude's pricing has been stable enough that I forecast spend monthly. OpenAI's has shifted often enough that I rebuild the spreadsheet quarterly.
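
Here's the shape of that first contrast, abbreviated to two GPT versions rather than four. The marker strings are illustrative stand-ins (the real lists live in config and get reviewed on every model bump); the structure is the point: one handler versus a per-snapshot dispatch table.

```python
def handle_claude_refusal(text: str) -> bool:
    # One detection pass; it has held across Claude versions so far.
    return text.lower().startswith(("i can't help", "i cannot assist"))

# For GPT the refusal shape has shifted, so detection is dispatched
# per pinned model snapshot. Markers here are made up for illustration.
def _gpt4_refusal(text: str) -> bool:
    return "i'm sorry, but i can't" in text.lower()

def _gpt4_turbo_refusal(text: str) -> bool:
    return text.lower().startswith("i can't assist with")

GPT_REFUSAL_HANDLERS = {
    "gpt-4-0613": _gpt4_refusal,
    "gpt-4-turbo-2024-04-09": _gpt4_turbo_refusal,
}

def handle_gpt_refusal(model: str, text: str) -> bool:
    try:
        return GPT_REFUSAL_HANDLERS[model](text)
    except KeyError:
        # Every new snapshot forces a decision here. That is the tax.
        raise ValueError(f"no refusal handler pinned for {model}") from None
```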

The sharper insight

Safety, at the product layer, is the same thing as predictability. The model that does what you expect, refuses what you expect, and costs what you expect is the model you can build a business on. It is no coincidence that the lab that talks most about safety also ships the most predictable production behavior.

I'm not saying don't use OpenAI. I use both. I'm saying the cultural bet a lab makes shows up in how much your team has to clean up after each model release. Plan accordingly.

For more on how I think about this in production, see my agents in production piece.


