Structured Output

Constraining an LLM to produce output in a specific format (JSON, XML, a defined schema) rather than free-form text.

LLMs generate free-form text by default. Structured output techniques force the model to produce output that conforms to a schema, making it directly parseable by downstream code without regex extraction or post-processing.

Approaches

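  • JSON mode: Ask the API to return syntactically valid JSON. No schema is enforced, so the keys and nesting depend on the prompt and can vary between calls.
  • Function calling with JSON Schema: Supply a JSON Schema for the function arguments. With strict schema enforcement enabled, the API guarantees that the output matches the schema's field types and required fields; without it, the schema is followed on a best-effort basis.
  • Constrained decoding: At each decoding step, mask out every token that would make the output invalid, so only schema-conforming continuations can be sampled. Libraries like Outlines and Guidance implement this inside the generation loop, which requires access to the model's token scores and so typically applies to models you host yourself.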
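The token-masking idea behind constrained decoding can be sketched with a toy example. Everything here is an illustrative assumption: `VALID_OUTPUTS`, `VOCAB`, `fake_logits`, and `constrained_decode` are hypothetical names, the "model" is a stand-in that prefers chatty tokens, and the constraint is a small enumerated set of strings. Real libraries compile a schema or grammar into a state machine rather than enumerating outputs.

```python
import json

# Hypothetical constraint: the only outputs considered valid.
VALID_OUTPUTS = [
    '{"sentiment": "positive"}',
    '{"sentiment": "negative"}',
    '{"sentiment": "neutral"}',
]

# Toy vocabulary of multi-character tokens, including chatty ones.
VOCAB = ['{"', 'sentiment', '": "', 'positive', 'negative',
         'neutral', '"}', 'Sure', '! ', 'Here']


def is_valid_prefix(text):
    """True if text can still be extended into one of the valid outputs."""
    return any(valid.startswith(text) for valid in VALID_OUTPUTS)


def fake_logits(prefix):
    """Stand-in for a model: later vocabulary entries always score higher."""
    return {token: float(index) for index, token in enumerate(VOCAB)}


def constrained_decode(max_steps=16):
    """Greedy decoding restricted to tokens that keep the output valid."""
    text = ""
    for _ in range(max_steps):
        if text in VALID_OUTPUTS:
            break
        scores = fake_logits(text)
        # The mask: drop every token that would make the prefix invalid.
        allowed = {t: s for t, s in scores.items() if is_valid_prefix(text + t)}
        if not allowed:
            raise ValueError("no token keeps the output valid")
        text += max(allowed, key=allowed.get)
    return text


result = constrained_decode()
print(result)              # the constrained output
print(json.loads(result))  # parses without any cleanup
```

Without the mask, greedy decoding over the same scores would start with a chatty token such as "Here". With the mask, the stand-in model still chooses among the valid values, but it cannot leave the allowed structure.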

Why schema-constrained output matters

Without structured output, you're parsing free text in production. Models sometimes vary their output format between runs, especially at higher temperatures, and a parser written against one phrasing breaks on the next. Schema constraints eliminate this class of bug. The model still decides the values; the schema only constrains the structure, so checks on whether the values are correct remain your responsibility.
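A small sketch of the difference, using made-up replies rather than real model output: the two free-text strings say the same thing in different formats, and a regex written for the first misses the second, while the two JSON replies parse identically regardless of key order. The reply strings and variable names are hypothetical.

```python
import json
import re

# Two hypothetical free-text replies to the same prompt.
free_text_runs = [
    "The total is $42.50.",
    "Total: 42.50 USD",
]

# A regex written against the first phrasing only.
pattern = re.compile(r"The total is \$(\d+\.\d+)")


def extract_total(text):
    """Return the matched amount as a string, or None if the format drifted."""
    match = pattern.search(text)
    return match.group(1) if match else None


regex_results = [extract_total(text) for text in free_text_runs]

# Two hypothetical schema-constrained replies; key order differs, structure does not.
structured_runs = [
    '{"total": 42.5, "currency": "USD"}',
    '{"currency": "USD", "total": 42.5}',
]
structured_totals = [json.loads(text)["total"] for text in structured_runs]

print(regex_results)      # second entry is None: the regex missed it
print(structured_totals)  # both entries parse to the same number
```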

Practical usage

Two widely used APIs are OpenAI's response_format: { type: "json_schema", json_schema: {...} }, which takes the schema directly, and Anthropic's tool calls, where the tool's input schema defines the shape of the output. For complex nested schemas, Pydantic models with the Instructor library provide a Python-native interface that works across providers.
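As a rough illustration, the same JSON Schema can be wrapped into both request shapes. This sketch builds plain dictionaries only and makes no network calls; it uses the standard library rather than Pydantic or Instructor. The schema, the name "person", and the helper functions are hypothetical, and the field layout reflects the providers' documented request formats as best understood here, so check it against the current API references before relying on it.

```python
import json

# Hypothetical schema for illustration.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}


def openai_response_format(name, schema):
    """Build the response_format value for an OpenAI-style request (assumed shape)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def anthropic_tool_request(name, schema):
    """Build tools/tool_choice fields that force a single tool call (assumed shape)."""
    return {
        "tools": [{
            "name": name,
            "description": "Record the extracted fields.",
            "input_schema": schema,
        }],
        "tool_choice": {"type": "tool", "name": name},
    }


def has_required_fields(reply_text, schema):
    """Minimal structural check: the reply parses and contains every required key."""
    data = json.loads(reply_text)
    return all(key in data for key in schema["required"])


print(json.dumps(openai_response_format("person", PERSON_SCHEMA), indent=2))
print(json.dumps(anthropic_tool_request("person", PERSON_SCHEMA), indent=2))
print(has_required_fields('{"name": "Ada", "age": 36}', PERSON_SCHEMA))
```

The `has_required_fields` helper is deliberately minimal; a full JSON Schema validator or a Pydantic model would also check types and nested structure.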
