Fine-tuning
Continuing a pre-trained model's training on a smaller, task-specific dataset to adapt it for a particular domain or behavior.
A pre-trained LLM has broad knowledge but generic behavior. Fine-tuning updates the model's weights on a curated dataset to shift it toward a specific format, style, domain, or task - without starting training from scratch.
When to fine-tune vs prompt-engineer
Fine-tuning makes sense when:
- The task requires a consistently specific output format that prompting alone struggles to maintain
- Domain-specific language or jargon needs to be baked in (medical notes, legal contracts, internal terminology)
- Latency or cost is critical - fine-tuned smaller models often outperform large models on narrow tasks
- You have hundreds or thousands of labeled examples
If you have fewer than 50 examples, start with few-shot prompting and retrieval first.
Fine-tuning approaches
- Full fine-tuning: Updates all weights. Most expensive but most flexible. Requires significant GPU memory.
- LoRA / QLoRA: Adds small adapter layers; only those are updated. 10-100x cheaper than full fine-tuning. The dominant approach for open-source models.
- RLHF / DPO: Trains the model to prefer outputs that match human preferences. Used by Anthropic, OpenAI, and others to improve chat behavior.
Fine-tuning services
OpenAI, Anthropic (via partners), Mistral, and Together AI all offer fine-tuning APIs. For open-source models, Hugging Face's PEFT library with LoRA is the standard toolkit.
Common mistakes
Overfitting on too small a dataset causes catastrophic forgetting (the model loses general capabilities). Data quality matters far more than quantity - 200 high-quality examples beat 10,000 noisy ones.
Related terms
Models relevant to Fine-tuning
DeepSeek V3
State-of-the-art open-weights model that shocked the industry with frontier performance at minimal cost
View model →Llama 4
Meta's multimodal open-weights model family with a 10M context window variant
View model →Qwen 3
Alibaba's highly capable open-weights model with top-tier multilingual performance
View model →Gemma 3
Google's open-weights model family optimized for on-device and edge deployment
View model →