OptimalARC · Field notes

The Hard 70%

Field notes on AI reliability, pattern discovery, and shipping agents that survive real production, from the team building the Pattern Intelligence Layer.

Why do customer-support AI agents fail in production, and how do you make them reliable?
Customer Support

A support agent that confidently quotes a policy that does not exist is not a small bug. One Canadian tribunal made an airline honor a refund its chatbot invented. Here is why support agents fail, and the controls that make them safe to deploy.

Balagei G Nagarajan · 7 min read
What is a retry death spiral, and how do I stop it?
Cost & ROI

Balagei G Nagarajan · 7 min read
Why did my agent's cost explode when it moved from pilot to production?
Cost & ROI

Balagei G Nagarajan · 7 min read
Why do over-broad tool permissions turn one injection into a full breach?
Security

Balagei G Nagarajan · 7 min read
What is indirect prompt injection, and why can't the model just ignore it?
Security

Balagei G Nagarajan · 7 min read
Why does my agent repeat the same step in a loop?
AI Engineering

Balagei G Nagarajan · 7 min read
Why do multi-agent systems fail more often than a single agent?
AI Reliability

Balagei G Nagarajan · 7 min read
Why do small per-step error rates cause large multi-step agent failures?
AI Reliability

Balagei G Nagarajan · 7 min read
Why does my agent report success when the action never actually happened?
AI Reliability

Balagei G Nagarajan · 7 min read
Why does switching embedding models silently break my agent's retrieval?
AI Reliability

Balagei G Nagarajan · 7 min read
Why Does My AI Agent Hallucinate When the Data Exists in My Database?
AI Reliability

Balagei G Nagarajan · 7 min read
My Calendar Bot Was Supposed To Take One Evening
AI Engineering

Balagei G Nagarajan · 8 min read
AI Drift Detection: How to Catch Behavioral Drift Before Users Do
AI Reliability

Balagei G Nagarajan · 7 min read
Edge Case Discovery: Finding the Production Scenarios Your Tests Miss
AI Engineering

Balagei G Nagarajan · 7 min read
OptimalARC vs DataRobot, LangChain, and Arize: Production Reliability Compared
Comparison

Balagei G Nagarajan · 9 min read
The Third Question Nobody Asks About Your AI Team
AI Reliability

Balagei G Nagarajan · 6 min read
What 150 CXO Conversations Taught Me About the Patterns Under the Patterns
AI Reliability

Balagei G Nagarajan · 8 min read
How to Get AI From Prototype to Production
AI Reliability

Balagei G Nagarajan · 8 min read
The 7 Layers of AI Reliability: A Complete Framework
Framework

Balagei G Nagarajan · 12 min read
Pattern Discovery vs Model Training: Why Most AI Teams Start Wrong
AI Engineering

Balagei G Nagarajan · 6 min read
Zero Data Exposure AI: Why On-Premise Matters for Enterprise
Enterprise

Balagei G Nagarajan · 7 min read
The Hard 70%: What 150+ CXO Interviews Taught Me About Why AI Teams Get Stuck
AI Reliability

Balagei G Nagarajan · 5 min read
