Infrastructure
Pattern 14 of 26
Structured Output
The part your downstream code actually reads
Agents do not just talk. They call functions, fill schemas, and return JSON that other systems consume. Structured output is the pattern of making that reliable. When the model returns malformed JSON, everything downstream breaks. The two main approaches are constrained decoding, which makes invalid output impossible at the token level, and validation with retry, which catches failures and tries again with the error attached.
Why it matters
A tool call that returns malformed JSON two percent of the time is not a minor inconvenience. It is a production incident that happens, on average, once every fifty tool calls, and the odds compound when a single agent run chains many calls together. At scale, that failure rate is unacceptable. Structured output reliability is what separates a demo from a system you can actually trust.
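To make the arithmetic concrete, here is a minimal sketch of how a per-call failure rate adds up over a multi-step run. It assumes failures are independent between calls, which is a simplification, and the function name is illustrative only.

```python
def p_at_least_one_failure(per_call_failure_rate: float, calls: int) -> float:
    """Probability that at least one of `calls` tool calls fails.

    Assumes each call fails independently with the same probability.
    """
    return 1.0 - (1.0 - per_call_failure_rate) ** calls


if __name__ == "__main__":
    # A 2% per-call failure rate over runs of increasing length.
    for calls in (1, 10, 50):
        print(calls, round(p_at_least_one_failure(0.02, calls), 3))
```

Under that independence assumption, a run of ten tool calls at a two percent failure rate hits at least one malformed response roughly 18 percent of the time.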
Deep Dive
Every tool call is a structured output. The agent calls a function, and the model must return a valid JSON object that matches the expected schema. When this works reliably it is invisible. When it fails, every downstream system depending on that JSON fails with it. Structured output is not glamorous work. It is one of those load-bearing problems that does not feel important until something downstream crashes in production. An agent that reliably produces valid structured data is not the same thing as an agent that usually does.
The Outlines paper (2023) from dottxt-ai presented constrained decoding as a principled solution. Instead of sampling tokens freely and hoping the result is valid JSON, the library restricts which tokens are legal at each step based on the target schema. Invalid output becomes impossible at the generation level, not just caught after the fact. Instructor takes a different approach. It wraps the model API with Pydantic validation and retry-with-feedback loops, resubmitting failed outputs with the validation error appended so the model can correct its mistake. With over three million monthly downloads, it is one of the most widely used structured output libraries in the ecosystem.
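The token-masking idea can be illustrated with a toy sketch. This is not the Outlines API or its algorithm (real implementations derive the legal tokens from the schema itself); the vocabulary, the two valid outputs, and the random stand-in for model scores are all hypothetical.

```python
import random

# Hypothetical toy schema: the only valid outputs are these two JSON strings.
VALID_OUTPUTS = ['{"status": "ok"}', '{"status": "error"}']

# Hypothetical token vocabulary; "<eos>" ends generation.
VOCAB = ['{"status": ', '"ok"', '"error"', '"maybe"', '}', 'Sure! ', '<eos>']


def allowed_tokens(prefix: str) -> list[str]:
    """Return the tokens that keep `prefix` on a path to some valid output."""
    allowed = []
    for token in VOCAB:
        if token == "<eos>":
            # Stopping is legal only once the text is a complete valid output.
            if prefix in VALID_OUTPUTS:
                allowed.append(token)
        elif any(valid.startswith(prefix + token) for valid in VALID_OUTPUTS):
            allowed.append(token)
    return allowed


def fake_model_scores(prefix: str, rng: random.Random) -> dict[str, float]:
    """Stand-in for model logits: random scores over the whole vocabulary."""
    return {token: rng.random() for token in VOCAB}


def constrained_generate(seed: int) -> str:
    """Greedy decoding where illegal tokens are masked out at every step."""
    rng = random.Random(seed)
    text = ""
    while True:
        scores = fake_model_scores(text, rng)
        legal = allowed_tokens(text)
        # Only legal tokens compete, however highly the model scored the rest.
        token = max(legal, key=scores.__getitem__)
        if token == "<eos>":
            return text
        text += token


if __name__ == "__main__":
    print(constrained_generate(seed=0))
```

Even though the stand-in model scores tokens at random, including `Sure! ` and `"maybe"`, every run ends in one of the two valid strings, because invalid tokens are never eligible to be chosen.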
OpenAI shipped native structured outputs in August 2024, building constrained decoding into the API itself. Anthropic emphasizes well-designed prompts and schemas over constrained generation, and their models tend to follow JSON schemas reliably with good prompting. PydanticAI from the Pydantic team integrates structured output natively into an agent framework. The practical choice between approaches comes down to what you are willing to trade. Constrained decoding guarantees that the output conforms to the schema, though not that the values in it are correct, and it limits which models and providers you can use. Validation-with-retry is more flexible but adds latency and cost whenever a failure triggers another call. Most teams that build for production end up using both.
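The validation-with-retry side of that trade can be sketched with the standard library alone. This is not Instructor's API, which builds on Pydantic models; `call_model`, the ticket schema, and the wording of the feedback prompt are assumptions made for illustration.

```python
import json
from typing import Callable, Optional


def validate_ticket(raw: str) -> dict:
    """Check a hypothetical schema: {"title": str, "priority": int from 1 to 5}."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    if not isinstance(data.get("title"), str):
        raise ValueError("'title' must be a string")
    priority = data.get("priority")
    if isinstance(priority, bool) or not isinstance(priority, int):
        raise ValueError("'priority' must be an integer")
    if not 1 <= priority <= 5:
        raise ValueError("'priority' must be between 1 and 5")
    return data


def generate_with_retry(
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw text out
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model, validate, and on failure retry with the error attached."""
    current_prompt = prompt
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            return validate_ticket(raw)
        except ValueError as exc:
            last_error = exc
            # Feed the bad output and the validation error back to the model.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was:\n{raw}\n"
                f"It failed validation: {exc}\nReturn corrected JSON only."
            )
    raise RuntimeError(
        f"no valid output after {max_attempts} attempts: {last_error}"
    )
```

Each retry is a full extra model call, which is where the added latency comes from. Capping `max_attempts` keeps a persistently failing request from looping indefinitely, and the same validation step is still worth running behind constrained decoding, since schema-valid output can still carry wrong values.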