Ryan Watts — Principal AI & Data Engineer

Agentic AI is everywhere right now. Every startup is claiming their product is "agentic" and every enterprise is asking how to build one. After spending the last two years building these systems in production — most recently as Head of AI at DVx Ventures — I have strong opinions.

The core problem with most agentic systems

Most demo-grade agentic systems are brittle. They work beautifully in a controlled environment and fall apart the moment real-world messiness enters the picture: ambiguous inputs, API rate limits, partial failures, or context that drifts over long-running tasks.

The core mistake is treating agentic AI as a prompt engineering problem rather than a systems architecture problem.

Pattern 1: Bounded agency over unbounded autonomy

The first pattern I reach for is bounded agency — defining the exact scope of what an agent can and cannot do before it touches production.

This means:

· Explicit tool inventories with typed schemas (Pydantic is non-negotiable here)

· Hard limits on recursion depth and API call budgets

· Deterministic fallback paths for every uncertain state

Unbounded agents that can "do anything" are a liability. Bounded agents with clear contracts are assets.
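
To make the three constraints above concrete, here is a minimal sketch of a bounded agent. It uses only the standard library so it runs anywhere; in a real system a Pydantic model would likely replace the hand-rolled argument check. The names ToolSpec, BoundedAgent, BudgetExceeded, and lookup_order are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class BudgetExceeded(Exception):
    """Raised when the agent hits its call budget or depth limit."""


@dataclass
class ToolSpec:
    # Hypothetical tool contract: a name, the argument types it accepts, a handler.
    name: str
    arg_types: Dict[str, type]
    handler: Callable[..., str]


@dataclass
class BoundedAgent:
    tools: Dict[str, ToolSpec]          # the explicit tool inventory
    max_calls: int = 5                  # API call budget
    max_depth: int = 3                  # recursion depth limit
    calls_made: int = field(default=0, init=False)

    def call_tool(self, name: str, args: dict, depth: int = 0) -> str:
        # Hard limits are checked before any work happens.
        if depth > self.max_depth:
            raise BudgetExceeded(f"depth {depth} exceeds limit {self.max_depth}")
        if self.calls_made >= self.max_calls:
            raise BudgetExceeded(f"call budget of {self.max_calls} exhausted")

        spec = self.tools.get(name)
        if spec is None:
            # Deterministic fallback: tools outside the inventory are never improvised.
            return "FALLBACK: tool not in inventory"

        names_match = set(args) == set(spec.arg_types)
        if not names_match or not all(
            isinstance(args[key], expected) for key, expected in spec.arg_types.items()
        ):
            # Deterministic fallback: malformed arguments never reach the handler.
            return "FALLBACK: arguments do not match schema"

        self.calls_made += 1
        return spec.handler(**args)


def lookup_order(order_id: int) -> str:
    # Stand-in for a real tool implementation.
    return f"order {order_id}: shipped"


agent = BoundedAgent(
    tools={"lookup_order": ToolSpec("lookup_order", {"order_id": int}, lookup_order)},
    max_calls=2,
)
print(agent.call_tool("lookup_order", {"order_id": 42}))
print(agent.call_tool("delete_database", {}))
```

In this sketch only successful tool invocations count against the budget. Whether rejected calls should also consume budget is a design choice that depends on how expensive the rejection path is.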

Pattern 2: Event-sourced agent state

Stateful agents need durable, queryable state. I've moved away from in-memory state for anything beyond simple Q&A workflows.

My preferred pattern: treat every agent action as an event that gets appended to an immutable log. This gives you:

· Full replay capability for debugging

· Auditability (critical for enterprise compliance)

· Resume-from-failure for long-running workflows

Kafka or a lightweight event store like EventStoreDB works well here. For simpler cases, even Postgres with an append-only events table does the job.
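
As a rough illustration of the append-only events table, the sketch below uses SQLite from the standard library as a stand-in for Postgres. The table name agent_events, its columns, and the event types step_completed and run_finished are assumptions made for the example, not a prescribed schema.

```python
import json
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the event log. SQLite stands in for Postgres here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " run_id TEXT NOT NULL,"
        " event_type TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def append_event(conn: sqlite3.Connection, run_id: str, event_type: str, payload: dict) -> int:
    # Events are only ever inserted; nothing in this module updates or deletes rows.
    cur = conn.execute(
        "INSERT INTO agent_events (run_id, event_type, payload) VALUES (?, ?, ?)",
        (run_id, event_type, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def replay(conn: sqlite3.Connection, run_id: str) -> dict:
    """Rebuild the current state of a run by folding over its events in order."""
    rows = conn.execute(
        "SELECT event_type, payload FROM agent_events WHERE run_id = ? ORDER BY seq",
        (run_id,),
    )
    state = {"completed_steps": [], "status": "new"}
    for event_type, payload in rows:
        data = json.loads(payload)
        if event_type == "step_completed":
            state["completed_steps"].append(data["step"])
            state["status"] = "running"
        elif event_type == "run_finished":
            state["status"] = "finished"
    return state


conn = open_log()
append_event(conn, "run-1", "step_completed", {"step": "fetch_invoice"})
append_event(conn, "run-1", "step_completed", {"step": "extract_totals"})

# After a crash, a worker can replay the log and skip the steps already done.
print(replay(conn, "run-1"))
```

The same replay function serves debugging, auditing, and resume-from-failure: all three are just different consumers of the ordered log.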

Pattern 3: Structured outputs as the contract layer

LLMs are probabilistic. Your downstream systems are deterministic. The bridge between them is structured output validation.

I use Pydantic AI or LangChain's output parsers with retry logic. Every agent output gets validated against a schema before it touches any downstream system. A failed validation triggers a retry with the validation error injected back into the prompt. In my experience, this self-correction loop catches roughly 85% of output failures.
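
The validate-and-retry loop can be sketched without any framework. The example below is a standard-library stand-in, not the Pydantic AI or LangChain API: RefundDecision, parse_decision, generate_validated, and fake_model are hypothetical names, and fake_model simulates an LLM that fixes its output once it sees the error message.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class RefundDecision:
    # Hypothetical output schema, used only for illustration.
    order_id: int
    approve: bool


def parse_decision(raw: str) -> RefundDecision:
    """Validate raw model text against the schema; raise ValueError with a readable reason."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    order_id = data.get("order_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(order_id, int) or isinstance(order_id, bool):
        raise ValueError("order_id must be an integer")
    if not isinstance(data.get("approve"), bool):
        raise ValueError("approve must be a boolean")
    return RefundDecision(order_id=order_id, approve=data["approve"])


def generate_validated(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> RefundDecision:
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            return parse_decision(raw)
        except ValueError as err:
            # Inject the validation error so the next attempt can self-correct.
            current_prompt = (
                f"{prompt}\n\nYour previous output was rejected: {err}\n"
                "Return corrected JSON only."
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call: wrong type first, correct after seeing the error.
    if "rejected" in prompt:
        return '{"order_id": 42, "approve": true}'
    return '{"order_id": "forty-two", "approve": true}'


print(generate_validated(fake_model, "Decide on the refund for order 42."))
```

Capping the attempts matters as much as the retry itself: an output that still fails after the budget is spent should surface as an error rather than reach a downstream system.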

Pattern 4: Human-in-the-loop checkpoints

For high-stakes actions (sending emails, modifying records, financial operations), I build explicit human-in-the-loop checkpoints. The agent pauses, surfaces its proposed action with reasoning, and requires approval before proceeding.

This isn't a limitation — it's a feature. Enterprise stakeholders are far more comfortable deploying AI systems that they can audit and intervene in. Start with more checkpoints than you think you need and remove them as trust is established.
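
A checkpoint of this kind can be as small as a queue of proposed actions that run only after an explicit decision. The sketch below keeps the queue in memory for brevity; in production it would presumably be backed by a database table. ApprovalQueue, ProposedAction, and send_refund_email are hypothetical names.

```python
import itertools
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ProposedAction:
    action_id: int
    description: str             # what the agent wants to do
    reasoning: str               # why, surfaced to the human reviewer
    execute: Callable[[], str]   # deferred side effect, not run until approved
    status: Status = Status.PENDING


class ApprovalQueue:
    def __init__(self) -> None:
        self._actions: Dict[int, ProposedAction] = {}
        self._ids = itertools.count(1)

    def propose(self, description: str, reasoning: str, execute: Callable[[], str]) -> ProposedAction:
        # The agent pauses here: the action is recorded but not executed.
        action = ProposedAction(next(self._ids), description, reasoning, execute)
        self._actions[action.action_id] = action
        return action

    def pending(self) -> List[ProposedAction]:
        return [a for a in self._actions.values() if a.status is Status.PENDING]

    def decide(self, action_id: int, approved: bool) -> Optional[str]:
        action = self._actions[action_id]
        if action.status is not Status.PENDING:
            raise ValueError("action already decided")
        if not approved:
            action.status = Status.REJECTED
            return None
        action.status = Status.APPROVED
        return action.execute()  # the side effect happens only after approval


sent: List[str] = []


def send_refund_email() -> str:
    # Stand-in for a real high-stakes side effect.
    sent.append("refund email to customer 42")
    return "sent"


queue = ApprovalQueue()
proposal = queue.propose(
    "Send refund email to customer 42",
    "Order arrived damaged; policy allows a refund.",
    send_refund_email,
)
print(len(sent))                                        # nothing sent while pending
print(queue.decide(proposal.action_id, approved=True))  # runs the action
```

Because the action is stored as a deferred callable with its reasoning attached, removing a checkpoint later is a policy change (auto-approve certain action types) rather than a rewrite.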

The stack I reach for

· LangChain or LangGraph for orchestration

· Pydantic AI for output validation and tool schemas

· Kafka or Redis Streams for durable agent state

· Postgres for human-in-the-loop approval queues

· Kubernetes for deploying agent workers that scale independently

The goal is always the same: AI that's reliable enough to trust with real business processes. That's a higher bar than most teams realize.