Blog

Insights on AI agent governance, maturity, and production readiness

Same Agents, Two Languages: Closing the AI Engineering / Compliance Gap

May 17, 2026

A pattern shows up in almost every conversation we have with teams shipping AI agents at any meaningful scale. It isn’t about a specific framework, and it isn’t about whether the agents are “production-ready.” It’s about something quieter — and harder to fix with another framework or another platform.

Two functions inside the same organization are working from different maps of the same agents.

The Map AI Engineering Holds

The AI engineering team is the closest to the agents. They wrote the system prompt. They picked the tools and decided how many to register. They know which agents talk to other agents, which call the production database, which run autonomously, and which require human approval before they act.
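One way to make that map explicit is to write it down as data. The sketch below is a minimal illustration, assuming a hypothetical `AgentRecord` structure; the field names and the `needs_review` rule are not from any particular framework and simply mirror the facts listed above.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """Hypothetical record of what engineering knows about one agent."""
    name: str
    system_prompt: str
    tools: list[str] = field(default_factory=list)
    talks_to: list[str] = field(default_factory=list)  # other agents it calls
    touches_production_db: bool = False
    autonomous: bool = False
    requires_human_approval: bool = True


def needs_review(record: AgentRecord) -> bool:
    """Flag agents that reach production data with no human approval step."""
    return record.touches_production_db and not record.requires_human_approval


# Illustrative entries only; the agent names are made up.
inventory = [
    AgentRecord("support-triage", "You triage tickets.", tools=["search"]),
    AgentRecord(
        "billing-fixer",
        "You fix invoices.",
        tools=["sql"],
        touches_production_db=True,
        autonomous=True,
        requires_human_approval=False,
    ),
]

flagged = [r.name for r in inventory if needs_review(r)]
```

Keeping the record as plain data means a second team can read the same fields without reading the agent's code.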

When the Spec Becomes the Source: What Spec-Driven Development Asks of Your Specs

May 7, 2026

Something quiet has shifted in how engineering teams build with AI agents. A year ago, a feature began as a Jira ticket, became a design doc, became code, and the documentation lived a short life. Today, more teams are writing the specification first — in a structured form a coding agent can consume — and treating the spec, plan, tasks, and constitution as the primary artifacts. The code is generated downstream.

Nine Seconds to Wipe a Database: What That AI Agent Incident Tells You About Your Own Agents

April 28, 2026

The agent’s own post-incident explanation included the line “I guessed… I didn’t verify… I didn’t check.” The full account is on Yahoo Tech.

This post is the engineering breakdown of that set of design gaps in agentic AI.

The Design Conditions That Make a Nine-Second Deletion Possible

For an incident of this shape to occur, some combination of the following has to be true at design time — before the agent ever runs a token of inference:

Your Test Suite Is Misleading You About Your AI Agents

April 19, 2026

The Moment You Realized Something Was Off

You’re on the QA team. Someone ships an AI agent into the codebase you’ve been testing for three years. You write a test. It looks like every test you’ve ever written:

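# `agent` is the refund agent under test, assumed to be constructed elsewhere in the suite.
def test_refund_agent_handles_valid_request():
    response = agent.run("I'd like a refund for order #12345")
    # Check that the reply mentions a refund and is not empty.
    assert "refund" in response.lower()
    assert response != ""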

It passes. It passes again. Then, on a Tuesday morning in CI, it fails. The agent replied “I’ll process that return for you right away.” No “refund” substring. Test red. Agent behavior? Fine. Arguably better.
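One possible way to make a test like this less sensitive to wording is to assert on what the agent did rather than on the words it chose. The sketch below assumes a hypothetical agent whose result exposes the action it took alongside the reply text. `AgentResult`, `FakeRefundAgent`, and the `issue_refund` action name are illustrative stand-ins, not part of any real framework or of the codebase described here.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    reply: str      # free-form text shown to the customer
    action: str     # hypothetical name of the tool the agent invoked
    order_id: str   # hypothetical argument passed to that tool


class FakeRefundAgent:
    """Stand-in for a real agent: the wording varies, the action does not."""

    def __init__(self, reply: str):
        self._reply = reply

    def run(self, message: str) -> AgentResult:
        # Take everything after the last '#' as the order number.
        order_id = message.split("#")[-1].strip()
        return AgentResult(reply=self._reply, action="issue_refund", order_id=order_id)


def check_refund_request(agent) -> None:
    result = agent.run("I'd like a refund for order #12345")
    # Assert on the structured outcome, not on a substring of the reply.
    assert result.action == "issue_refund"
    assert result.order_id == "12345"
    assert result.reply.strip() != ""


# Both phrasings pass, including the one that broke the substring check.
check_refund_request(FakeRefundAgent("Your refund is on its way."))
check_refund_request(FakeRefundAgent("I'll process that return for you right away."))
```

This only works if the agent under test exposes its actions in some structured form; where it does not, the reply text is all a test can see.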

You Don't Know How Many AI Agents You Have. Here's Why That's a Problem.

April 7, 2026

The Question Nobody Can Answer

Ask any engineering leader how many microservices their team runs. They’ll give you a number. Ask how many databases. They’ll know. Ask how many AI agents are deployed across the organization — and you’ll get silence.

This isn’t a failure of documentation or process. It’s a fundamental gap in how agentic AI systems are built and deployed today. Agents don’t look like traditional software, and the tools we built for tracking traditional software don’t work for agents.

The 6 Agentic AI Architecture Patterns — and What Can Go Wrong With Each

April 6, 2026

Not All Agents Are Created Equal

The term “AI agent” covers everything from a simple LLM call with a search tool to a fully autonomous swarm of specialized agents coordinating across systems. These aren’t just different scales — they’re fundamentally different architectures with different risk profiles, failure modes, and governance needs.

Understanding which pattern you’re building — and what can go wrong — is the first step toward building agents that are production-ready, not just demo-ready.

Your Agent Changed. You Didn't Know. Here's What Happened Next.

April 5, 2026

The Change Nobody Noticed

It started with a small commit. A senior engineer updated the system prompt for the customer support agent — adding a line about the new return policy. The change went through code review. Tests passed. The agent still answered questions correctly in staging.

Two weeks later, support tickets spiked. Customers reported the agent was offering refunds for products outside the return window. The agent wasn’t broken — it was behaving differently. The prompt change had subtly shifted the agent’s interpretation of “eligible for return” in edge cases that no test case covered.
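A change like this is easier to notice when the prompt itself is treated as a versioned artifact. The sketch below is one minimal approach, assuming a hypothetical stored fingerprint for the last reviewed prompt. It uses a SHA-256 hash to detect that the prompt text changed. It says nothing about how behavior shifted, only that a behavioral re-evaluation is due.

```python
import hashlib


def prompt_fingerprint(prompt: str) -> str:
    """Return a SHA-256 fingerprint of the prompt, ignoring trailing whitespace."""
    normalized = "\n".join(line.rstrip() for line in prompt.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def prompt_changed(current_prompt: str, approved_fingerprint: str) -> bool:
    """True when the current prompt no longer matches the approved fingerprint."""
    return prompt_fingerprint(current_prompt) != approved_fingerprint


# Illustrative prompts only; the wording is made up for this example.
approved = "You are a support agent.\nFollow the return policy."
updated = approved + "\nMention the new return policy when relevant."

approved_fp = prompt_fingerprint(approved)

# A substantive edit changes the fingerprint.
needs_reeval = prompt_changed(updated, approved_fp)

# A trailing-whitespace-only edit does not.
whitespace_only = prompt_changed(approved + "  \n", approved_fp)
```

A check like this could run in CI next to the existing tests, so that a prompt edit triggers a broader evaluation pass instead of relying on the same cases that already passed.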

Why Your AI Agent Security Tools Are Missing Half the Picture

April 4, 2026

The Layer Nobody’s Watching

The agentic AI security market is booming. Runtime guardrails that filter prompt injections. Firewalls that block malicious outputs. Shadow AI discovery tools that find unauthorized LLM usage. Red-teaming platforms that stress-test models.

These tools protect agents after they’re deployed. They sit in front of your agent at inference time and intercept bad inputs or outputs. They’re valuable — and they’re necessary.

But they’re only half the picture.