Instruction Tuning
Instruction tuning trains a model to follow natural language instructions.
Dataset Creation
High-quality instruction datasets include:
- Alpaca (52k instructions)
- ShareGPT (conversations)
- Dolly (human-generated)
- FLAN (task mixtures)
Training Tips
- Use small learning rate (1e-5 to 5e-5)
- Train for 1-3 epochs (don't overfit)
- Balance different task types
- Include examples of desired behavior
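The tips above can be captured in a small configuration plus a balancing step. This is a minimal sketch, not a prescribed recipe: `TuningConfig`, `balance_by_task`, and the `task` field are hypothetical names introduced for illustration, and the defaults are simply picked from the ranges listed above.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TuningConfig:
    """Hypothetical hyperparameter container; defaults follow the tips above."""
    learning_rate: float = 2e-5  # within the 1e-5 to 5e-5 range
    num_epochs: int = 3          # 1-3 epochs to limit overfitting

    def validate(self) -> None:
        # Checks against the lesson's recommended ranges only.
        if not 1e-5 <= self.learning_rate <= 5e-5:
            raise ValueError("learning_rate outside the 1e-5 to 5e-5 range")
        if not 1 <= self.num_epochs <= 3:
            raise ValueError("num_epochs outside the 1-3 range")


def balance_by_task(examples, seed=0):
    """Downsample every task type to the size of the smallest one."""
    by_task = defaultdict(list)
    for example in examples:
        by_task[example["task"]].append(example)
    target = min(len(group) for group in by_task.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_task.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


config = TuningConfig()
config.validate()

# Toy dataset: 5 summarization examples but only 2 code examples.
examples = (
    [{"task": "summarization", "id": i} for i in range(5)]
    + [{"task": "code", "id": i} for i in range(2)]
)
balanced = balance_by_task(examples)
print(len(balanced))  # 4: two examples per task type
```

Downsampling is only one way to balance a mixture. When data is scarce, upsampling rare task types or weighting the sampler may be a better fit.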
Practical Examples
Example 1: Creating Instruction Data
Instruction-following data for fine-tuning is usually structured as records with three fields: an instruction, an optional input, and the expected output.
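A minimal sketch of one common layout, assuming each record carries `instruction`, `input` (possibly empty), and `output` fields. The prompt templates below are Alpaca-style but their exact wording is an assumption, and `format_prompt` and `to_jsonl` are hypothetical helper names.

```python
import json

# Alpaca-style prompt templates (wording is an assumption; adapt to your setup).
PROMPT_WITH_INPUT = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = "### Instruction:\n{instruction}\n\n### Response:\n"

records = [
    {
        "instruction": "Summarize the text in one sentence.",
        "input": "Instruction tuning trains a model on instruction-response "
                 "pairs so that it follows natural language instructions.",
        "output": "Instruction tuning teaches models to follow instructions.",
    },
    {
        "instruction": "Write a Python function that returns the square of a number.",
        "input": "",
        "output": "def square(x):\n    return x * x",
    },
]


def format_prompt(record):
    """Render the prompt the model sees; the output field is the training target."""
    if record["input"]:
        return PROMPT_WITH_INPUT.format(**record)
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


def to_jsonl(records):
    """Serialize records as JSON Lines, one example per line."""
    return "\n".join(json.dumps(r) for r in records)


print(format_prompt(records[1]))
print(len(to_jsonl(records).splitlines()))  # 2 lines, one per record
```

Keeping the raw records separate from the rendered prompt makes it easy to swap templates later without touching the dataset.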
Example 2: Using Open-Source Tools
An existing model can generate synthetic instruction data, for example by prompting it with a few seed examples and filtering what it returns.
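The sketch below shows one possible shape for such a pipeline: build a few-shot prompt from seed examples, parse the model's reply, and drop malformed or duplicate entries. It does not call any real library. `generate` is a hypothetical callable standing in for whatever model or API you use, and `fake_generate` is a stub with made-up output so the example runs without a model.

```python
import json


def build_prompt(seed_examples, n_new):
    """Ask the model for new examples in the same JSON shape as the seeds."""
    shots = "\n".join(json.dumps(e) for e in seed_examples)
    return (
        f"Here are example tasks, one JSON object per line:\n{shots}\n"
        f"Write {n_new} new, different tasks in the same format, one per line."
    )


def parse_examples(raw_text):
    """Keep only lines that parse as JSON objects with the required keys."""
    required = {"instruction", "input", "output"}
    parsed = []
    for line in raw_text.splitlines():
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip chatter or malformed lines
        if isinstance(obj, dict) and required <= set(obj):
            parsed.append(obj)
    return parsed


def synthesize(seed_examples, generate, n_new=2):
    """Generate candidates and drop instructions that are already known."""
    seen = {e["instruction"].strip().lower() for e in seed_examples}
    fresh = []
    for example in parse_examples(generate(build_prompt(seed_examples, n_new))):
        key = example["instruction"].strip().lower()
        if key not in seen:
            seen.add(key)
            fresh.append(example)
    return fresh


def fake_generate(prompt):
    """Stub standing in for a real model call (hypothetical output)."""
    return "\n".join([
        "Sure, here are some tasks:",
        json.dumps({"instruction": "Translate to French.",
                    "input": "Good morning", "output": "Bonjour"}),
        json.dumps({"instruction": "Classify the sentiment.",
                    "input": "I love it", "output": "positive"}),
        "{not valid json}",
    ])


seeds = [{"instruction": "Classify the sentiment.",
          "input": "Too slow", "output": "negative"}]
new_examples = synthesize(seeds, fake_generate)
print([e["instruction"] for e in new_examples])  # ['Translate to French.']
```

Exact-match deduplication is the simplest filter. Real pipelines often add similarity-based filtering and a manual quality review on top.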
Example 3: Evaluating Instruction Following
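Test whether your model follows instructions correctly by checking its responses against explicit, verifiable constraints such as a required output format or an exact word count.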
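One way to do this is to pair each test prompt with a programmatic check and report the pass rate. The following is a minimal sketch under that assumption: `is_valid_json`, `has_word_count`, and `evaluate` are hypothetical helpers, and `fake_model` is a stub with made-up responses in place of a real model.

```python
import json


def is_valid_json(response):
    """Format check: the whole response must parse as JSON."""
    try:
        json.loads(response)
    except json.JSONDecodeError:
        return False
    return True


def has_word_count(n):
    """Constraint check: the response must contain exactly n words."""
    return lambda response: len(response.split()) == n


# Each test case pairs a prompt with a programmatic check on the response.
test_cases = [
    {"prompt": "List two colors. Answer in JSON format.", "check": is_valid_json},
    {"prompt": "Describe the sea using exactly 5 words.", "check": has_word_count(5)},
]


def evaluate(model, cases):
    """Return the fraction of cases whose response passes its check."""
    passed = sum(1 for case in cases if case["check"](model(case["prompt"])))
    return passed / len(cases)


def fake_model(prompt):
    """Stub standing in for a real model (hypothetical responses)."""
    if "JSON" in prompt:
        return '{"colors": ["red", "blue"]}'
    return "The sea is vast and deep."  # 6 words, violates the constraint


print(evaluate(fake_model, test_cases))  # 0.5
```

Checks like these only cover constraints that can be verified mechanically. Qualities such as helpfulness or tone still need human or model-based review.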
Knowledge Check
Question 1
What is the recommended learning rate range for instruction tuning?
Answer: 1e-5 to 5e-5. This is much smaller than pre-training rates, which helps preserve the model's existing knowledge.
Question 2
Why should instruction tuning typically run for only 1-3 epochs?
Answer: To prevent overfitting. Too many epochs can cause the model to memorize training examples and lose generalization ability.
Question 3
Name two popular open-source instruction datasets.
Answer: Any two of Alpaca (52k instructions), ShareGPT (conversations), Dolly (human-generated), and FLAN (task mixtures).
Question 4
What is the key difference between fine-tuning and instruction tuning?
Answer: Instruction tuning is itself a form of fine-tuning. Task-specific fine-tuning adapts a model to a single task or domain, while instruction tuning teaches the model to follow natural language instructions across many different tasks.
Question 5
Why is it important to balance different task types in instruction datasets?
Answer: Balancing prevents the model from becoming biased toward certain instruction formats and ensures it generalizes across diverse tasks.
Practice Exercises
Exercise 1: Create Instruction Data
Write 3 instruction-following examples in JSON format. Each should include: instruction, input (can be empty), and expected output.
Tasks to cover: 1) Text summarization, 2) Code generation, 3) Sentiment analysis
Check your answers against the examples in the lesson above.
Exercise 2: Design Evaluation Prompts
Create 2 test prompts to evaluate if a model follows instructions precisely:
- One that tests format adherence (e.g., "Answer in JSON format")
- One that tests constraint following (e.g., "Use exactly 50 words")
Bonus: Write what a "good" vs "bad" response would look like for each.