Agentic AI is everywhere right now. Every startup is claiming their product is "agentic" and every enterprise is asking how to build an agentic system of its own. After spending the last two years building these systems in production (most recently as Head of AI at DVx Ventures), I have strong opinions.
The core problem with most agentic systems
Most demo-grade agentic systems are brittle. They work beautifully in a controlled environment and fall apart the moment real-world messiness enters the picture: ambiguous inputs, API rate limits, partial failures, or context that drifts away from the original goal over long-running tasks.
The core mistake is treating agentic AI as a prompt engineering problem rather than a systems architecture problem.
Pattern 1: Bounded agency over unbounded autonomy
The first pattern I reach for is bounded agency — defining the exact scope of what an agent can and cannot do before it touches production.
This means:
Unbounded agents that can "do anything" are a liability. Bounded agents with clear contracts are assets.
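To make "clear contracts" a little more concrete, here is a minimal sketch of one way to bound an agent: an explicit tool allowlist plus a call budget. The names (`BoundedToolbox`, `ScopeViolation`, `lookup_order`) and the choice of these two limits are my own illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


class ScopeViolation(Exception):
    """Raised when the agent requests something outside its contract."""


@dataclass
class BoundedToolbox:
    # Hypothetical contract: an explicit allowlist of tools plus a call budget.
    tools: Dict[str, Callable[..., Any]]
    max_calls: int = 10
    calls_made: int = field(default=0, init=False)

    def call(self, name: str, **kwargs: Any) -> Any:
        # Anything not registered up front is rejected, not improvised.
        if name not in self.tools:
            raise ScopeViolation(f"tool {name!r} is not in the allowlist")
        if self.calls_made >= self.max_calls:
            raise ScopeViolation("call budget exhausted")
        self.calls_made += 1
        return self.tools[name](**kwargs)


def lookup_order(order_id: str) -> dict:
    # Illustrative read-only tool; a real one would query a system of record.
    return {"order_id": order_id, "status": "shipped"}


toolbox = BoundedToolbox(tools={"lookup_order": lookup_order}, max_calls=2)
```

The agent loop only ever talks to `toolbox.call(...)`, so a model that hallucinates a `delete_order` tool gets a `ScopeViolation` instead of a side effect.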
Pattern 2: Event-sourced agent state
Stateful agents need durable, queryable state. I've moved away from in-memory state for anything beyond simple Q&A workflows.
My preferred pattern: treat every agent action as an event that gets appended to an immutable log. This gives you:
Kafka or a lightweight event store like EventStoreDB works well here. For simpler cases, even Postgres with an append-only events table does the job.
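As a rough sketch of the append-only table approach, the example below uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs anywhere. The table name `agent_events`, the event types, and the shape of the replayed state are all assumptions made for illustration.

```python
import json
import sqlite3
from typing import Any, Dict

# In-memory SQLite stands in for Postgres here (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE agent_events (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        run_id TEXT NOT NULL,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    )
    """
)


def append_event(run_id: str, event_type: str, payload: Dict[str, Any]) -> int:
    # Only INSERTs are issued; existing rows are never updated or deleted.
    cur = conn.execute(
        "INSERT INTO agent_events (run_id, event_type, payload) VALUES (?, ?, ?)",
        (run_id, event_type, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def replay(run_id: str) -> Dict[str, Any]:
    # Rebuild current state by folding over the run's events in order.
    state: Dict[str, Any] = {"steps": [], "done": False}
    rows = conn.execute(
        "SELECT event_type, payload FROM agent_events WHERE run_id = ? ORDER BY seq",
        (run_id,),
    )
    for event_type, payload in rows:
        data = json.loads(payload)
        if event_type == "tool_called":
            state["steps"].append(data["tool"])
        elif event_type == "run_completed":
            state["done"] = True
    return state


append_event("run-1", "tool_called", {"tool": "lookup_order"})
append_event("run-1", "tool_called", {"tool": "draft_reply"})
append_event("run-1", "run_completed", {})
```

Immutability here is only a convention of the code path. In a real database you would likely also enforce it with permissions or triggers.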
Pattern 3: Structured outputs as the contract layer
LLMs are probabilistic. Your downstream systems are deterministic. The bridge between them is structured output validation.
I use Pydantic AI or LangChain's output parsers with retry logic. Every agent output gets validated against a schema before it touches any downstream system. A failed validation triggers a retry with the validation error injected back into the prompt. In my experience, this self-correction loop catches ~85% of output failures.
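The validate-and-retry loop can be sketched without any framework. The example below deliberately uses a hand-rolled validator and a stub model instead of Pydantic AI or LangChain, so it stays self-contained; `validate_ticket`, `generate_validated`, `fake_model`, and the ticket schema are all hypothetical names chosen for illustration.

```python
import json
from typing import Any, Callable, Dict


def validate_ticket(raw: str) -> Dict[str, Any]:
    # Hand-rolled stand-in for a schema library: parse JSON, then check fields.
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("summary"), str):
        raise ValueError("field 'summary' must be a string")
    if data.get("priority") not in {"low", "medium", "high"}:
        raise ValueError("field 'priority' must be one of low, medium, high")
    return data


def generate_validated(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Dict[str, Any]:
    current_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            return validate_ticket(raw)
        except ValueError as exc:
            # Feed the validation error back so the model can self-correct.
            last_error = exc
            current_prompt = (
                f"{prompt}\n\nYour previous output was invalid: {exc}. "
                "Return corrected JSON only."
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")


def fake_model(prompt: str) -> str:
    # Stub LLM: returns a bad priority first, then corrects once it sees the error.
    if "previous output was invalid" in prompt:
        return '{"summary": "Refund request", "priority": "high"}'
    return '{"summary": "Refund request", "priority": "urgent"}'


result = generate_validated(fake_model, "Classify this ticket as JSON.")
```

Note the hard cap on attempts: when the loop gives up it raises, so nothing unvalidated ever reaches the downstream system.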
Pattern 4: Human-in-the-loop checkpoints
For high-stakes actions (sending emails, modifying records, financial operations), I build explicit human-in-the-loop checkpoints. The agent pauses, surfaces its proposed action with reasoning, and requires approval before proceeding.
This isn't a limitation — it's a feature. Enterprise stakeholders are far more comfortable deploying AI systems that they can audit and intervene in. Start with more checkpoints than you think you need and remove them as trust is established.
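A checkpoint like this can be as simple as a gate in front of the executor. The sketch below assumes a hypothetical `HIGH_STAKES` set and an `approve` callback. In production that callback would more likely be a review queue or UI than a function call, and the handlers here are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical list of action names that always require sign-off.
HIGH_STAKES = {"send_email", "modify_record", "transfer_funds"}


@dataclass
class ProposedAction:
    name: str
    args: Dict[str, Any]
    reasoning: str  # surfaced to the reviewer alongside the action


def execute_with_checkpoint(
    action: ProposedAction,
    handlers: Dict[str, Callable[..., Any]],
    approve: Callable[[ProposedAction], bool],
) -> Dict[str, Any]:
    # High-stakes actions pause here until a human approves or rejects them.
    if action.name in HIGH_STAKES and not approve(action):
        return {"status": "rejected", "action": action.name}
    result = handlers[action.name](**action.args)
    return {"status": "executed", "action": action.name, "result": result}


# Placeholder handlers; real ones would call an email API or a database.
handlers = {
    "send_email": lambda to, body: f"sent to {to}",
    "lookup_order": lambda order_id: "shipped",
}

proposal = ProposedAction(
    name="send_email",
    args={"to": "customer@example.com", "body": "Your refund is approved."},
    reasoning="Customer asked for refund confirmation.",
)

rejected = execute_with_checkpoint(proposal, handlers, approve=lambda a: False)
approved = execute_with_checkpoint(proposal, handlers, approve=lambda a: True)
```

Removing a checkpoint later is then a one-line change: drop the action name from `HIGH_STAKES` once trust is established.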
The stack I reach for
The goal is always the same: AI that's reliable enough to trust with real business processes. That's a higher bar than most teams realize.