- Draft Title: The Inference Problem Nobody Talks About in Agentic AI: Control
- Length: 15min full presentation + 5min quick demo
- Links: Will share the presentation deck with organizers closer to the date
Agentic AI is moving fast from demos to real workflows that plan, decide, and act autonomously. Many developers are already experimenting with multi-step agents, tool-calling, and orchestration frameworks.
But as these systems move closer to production, a pattern is emerging: they don’t fail because models aren’t smart enough. They fail because control breaks down.
In this talk, we’ll explore why agentic systems amplify problems that traditional LLM applications could ignore: unpredictable costs, unstable latency, unsafe behavior, and brittle governance. Using recent real-world examples from the agent ecosystem, we’ll show how treating inference as a simple, stateless API call quietly becomes the weakest link in agentic workflows.
Rather than diving deeper into model architectures, this talk introduces a new way of thinking: control-first AI inference, where cost, speed, routing, safety, and behavior are treated as runtime decisions built into the system, not afterthoughts.
The session is narrative-driven and beginner-friendly, using diagrams and real scenarios to help developers and product leaders reason about why agentic AI feels exciting in demos but fragile in practice, and what architectural shift is needed to make these systems reliable.
We’ll close with a short, non-promotional demo and an invitation for the VanJ community to experiment further.
Speaker Bio
