🚧 Lesson 4 of 35 in Level 04

Instruction Tuning

Teaching models to follow instructions. Dataset creation and training.

Train models to follow natural language instructions:

Input: "Translate to French: The cat sat on the mat"
Output: "Le chat s'est assis sur le tapis"

Input: "Summarize: {long article}"
Output: "{concise summary}"

Dataset Creation

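High-quality instruction datasets include:

  • Alpaca (52k instructions)
  • ShareGPT (conversations)
  • Dolly (human-generated)
  • FLAN (task mixtures)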

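Training Tips

  • Use a small learning rate, roughly 1e-5 to 5e-5, to preserve the knowledge gained in pre-training.
  • Train for only 1-3 epochs; more can cause the model to memorize training examples and generalize worse.
  • Balance task types in the dataset so the model does not become biased toward particular instruction formats.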

Practical Examples

Example 1: Creating Instruction Data

Here's how to structure instruction-following data for fine-tuning:

{
  "instruction": "Write a Python function to calculate factorial",
  "input": "",
  "output": "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n    return n * factorial(n - 1)"
}
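
Before training, each record is usually rendered into a single prompt string plus a target string. The sketch below assumes the Alpaca-style prompt template; the lesson does not prescribe a template, and `render_example` is a hypothetical helper name chosen for illustration.

```python
# Assumption: Alpaca-style prompt wording. Any consistent template would do,
# as long as training and inference use the same one.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def render_example(record):
    """Return (prompt, target) for one instruction record (hypothetical helper)."""
    if record.get("input", "").strip():
        prompt = PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    else:
        # Records with an empty "input" use the shorter template.
        prompt = PROMPT_NO_INPUT.format(instruction=record["instruction"])
    return prompt, record["output"]


record = {
    "instruction": "Write a Python function to calculate factorial",
    "input": "",
    "output": (
        "def factorial(n):\n    if n == 0 or n == 1:\n        return 1\n"
        "    return n * factorial(n - 1)"
    ),
}
prompt, target = render_example(record)
print(prompt + target)
```

The model is trained to produce `target` given `prompt`; many training setups compute the loss only on the target tokens, though that is a configuration choice rather than a requirement.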

Example 2: Using Open-Source Tools

Fine-tune a base model on an existing instruction dataset with an open-source trainer such as axolotl:

# Using axolotl for instruction tuning
base_model: meta-llama/Llama-2-7b-hf
datasets:
  - path: yahma/alpaca-cleaned
    type: alpaca
num_epochs: 3
learning_rate: 2e-5

Example 3: Evaluating Instruction Following

Test if your model follows instructions correctly:

Test: "List three benefits of renewable energy"
✓ Good response: Numbered list with clear benefits
✗ Bad response: "Renewable energy is important" (too vague)

Test: "Answer with only YES or NO"
✓ Good response: "YES"
✗ Bad response: "Yes, I think that's correct"
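
Checks like the two above can be partly automated. The following is a minimal sketch with hypothetical helper names; it tests only the format of a response, not whether the content is correct or useful, and it assumes that "YES or NO" is meant case-sensitively.

```python
import re


def is_numbered_list(text, expected_items):
    """True if exactly `expected_items` lines start like '1.' or '1)'."""
    items = [
        line for line in text.splitlines()
        if re.match(r"\s*\d+[.)]\s+\S", line)
    ]
    return len(items) == expected_items


def is_yes_or_no_only(text):
    """True if the whole response is exactly YES or NO (case-sensitive)."""
    return text.strip() in {"YES", "NO"}


good_list = "1. Lower emissions\n2. Lower running costs\n3. Energy independence"
print(is_numbered_list(good_list, 3))                        # True
print(is_numbered_list("Renewable energy is important", 3))  # False
print(is_yes_or_no_only("YES"))                              # True
print(is_yes_or_no_only("Yes, I think that's correct"))      # False
```

Rule-based checks are cheap to run over a whole test set, but vague or wrong content that happens to be well formatted still needs human or model-based review.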

Knowledge Check

Question 1

What is the recommended learning rate range for instruction tuning?

Answer: 1e-5 to 5e-5. This is much smaller than typical pre-training rates, so the updates are less likely to overwrite existing knowledge.

Question 2

Why should instruction tuning typically run for only 1-3 epochs?

Answer: To prevent overfitting. Too many epochs can cause the model to memorize training examples and lose generalization ability.

Question 3

Name two popular open-source instruction datasets.

Answer: Any two of Alpaca (52k instructions), ShareGPT (conversations), Dolly (human-generated), and FLAN (task mixtures).

Question 4

What is the key difference between fine-tuning and instruction tuning?

Answer: Instruction tuning is itself a form of fine-tuning. Task-specific fine-tuning adapts a model to a single task or domain, while instruction tuning teaches the model to follow natural language instructions across many different tasks.

Question 5

Why is it important to balance different task types in instruction datasets?

Answer: Balancing prevents the model from becoming biased toward certain instruction formats and ensures it generalizes across diverse tasks.
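
One simple way to balance task types is to cap how many examples each task contributes. The sketch below assumes every record carries a `task` label, which is a hypothetical field (the instruction/input/output format shown in this lesson does not include one), and `balance_by_task` is an illustrative name.

```python
import random
from collections import Counter, defaultdict


def balance_by_task(records, cap_per_task, seed=0):
    """Keep at most `cap_per_task` randomly chosen records per task type."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_task = defaultdict(list)
    for record in records:
        by_task[record["task"]].append(record)  # "task" is an assumed field

    balanced = []
    for group in by_task.values():
        rng.shuffle(group)
        balanced.extend(group[:cap_per_task])
    rng.shuffle(balanced)  # avoid long runs of a single task in the output
    return balanced


records = [{"task": "summarize"}] * 10 + [{"task": "code"}] * 2
balanced = balance_by_task(records, cap_per_task=3)
print(Counter(r["task"] for r in balanced))
```

Capping discards data from over-represented tasks; upsampling rare tasks or weighting the sampler are alternatives with different trade-offs.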

Practice Exercises

Exercise 1: Create Instruction Data

Write 3 instruction-following examples in JSON format. Each should include: instruction, input (can be empty), and expected output.

# Example structure:
{
  "instruction": "Your task description here",
  "input": "optional context",
  "output": "expected model response"
}

Tasks to cover: 1) Text summarization, 2) Code generation, 3) Sentiment analysis

Check your answers against the examples in the lesson above.
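
To check that your records are well formed before comparing them with the lesson examples, a small validator can help. This is a sketch under the assumption that all three fields must be strings and that only `input` may be empty; `validate_record` is a hypothetical helper, not part of any library.

```python
import json

REQUIRED_KEYS = {"instruction", "input", "output"}


def validate_record(record):
    """Return a list of problems; an empty list means the record looks fine."""
    problems = []
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    for key in sorted(REQUIRED_KEYS & record.keys()):
        if not isinstance(record[key], str):
            problems.append(f"{key} must be a string")
    # "input" may be empty, but the instruction and output may not.
    for key in ("instruction", "output"):
        if isinstance(record.get(key), str) and not record[key].strip():
            problems.append(f"{key} must not be empty")
    return problems


raw = (
    '{"instruction": "Translate to French", '
    '"input": "The cat sat on the mat", '
    '"output": "Le chat s\'est assis sur le tapis"}'
)
print(validate_record(json.loads(raw)))  # prints []
```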

Exercise 2: Design Evaluation Prompts

Create 2 test prompts to evaluate if a model follows instructions precisely:

  • One that tests format adherence (e.g., "Answer in JSON format")
  • One that tests constraint following (e.g., "Use exactly 50 words")

Bonus: Write what a "good" vs "bad" response would look like for each.
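
Both kinds of test can be scored mechanically. The sketch below uses hypothetical helper names and makes two assumptions: "JSON format" means the entire response parses as a JSON object, and words are counted by splitting on whitespace.

```python
import json


def is_json_object(text):
    """True if the whole response parses as a JSON object, with no extra prose."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict)


def has_exact_word_count(text, expected):
    """True if the response has exactly `expected` whitespace-separated words."""
    return len(text.split()) == expected


print(is_json_object('{"answer": "yes"}'))                          # True
print(is_json_object('Sure! Here is the JSON: {"answer": "yes"}'))  # False
print(has_exact_word_count("one two three", 3))                     # True
```

A response that wraps valid JSON in a friendly sentence fails the first check, which is exactly the kind of near miss a format-adherence prompt is meant to expose.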