Structured LLM outputs
LLMs usually produce syntactically valid output when asked for JSON, XML, or code, but because generation is probabilistic, they occasionally emit malformed output. That is a problem for developers who consume LLM output programmatically, for tasks such as data extraction, code generation, and tool calling.
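As a minimal sketch of why this matters, the snippet below parses two model outputs with Python's standard `json` module. The two strings and the `parse_or_none` helper are hypothetical and stand in for real model responses. The second string is cut off mid-object, which is one way an output can be malformed.

```python
import json


def parse_or_none(raw: str):
    """Return the parsed JSON value, or None if raw is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


# Hypothetical model outputs: one complete, one cut off mid-object.
complete = '{"invoice_id": "A-17", "total": 42.5}'
truncated = '{"invoice_id": "A-17", "total": '

print(parse_or_none(complete))   # parsed dict
print(parse_or_none(truncated))  # None, because parsing failed
```

Without a check like this, the truncated output would raise an exception somewhere downstream in the calling code.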

There are many deterministic ways to ensure structured LLM outputs. This handbook is written for developers and answers the following questions:
- What happens under the hood?
- What are the best tools & techniques?
- How to pick the right tools & techniques?
- How to build, deploy, and scale systems?
- How to optimize for latency and cost?
- How to improve the quality of output?
Motivation
Structured generation is moving too fast. Most resources you find today are already outdated. You have to dig through multiple academic papers, blogs, GitHub repos, and other resources.
This handbook brings it all together in a living document that updates regularly.
How to use this
You can read it start-to-finish, or treat it like a lookup table.
Who are we?
We're the maintainers of Nanonets-OCR models (VLMs to convert documents into clean, structured Markdown) and docstrange (open-source document processing library).