What is Fine-tuning?
Take a pre-trained model and continue training on a specific task:
Types of Fine-tuning
- Full fine-tuning: Update all parameters
- LoRA: Train low-rank adapters
- Prompt tuning: Learn soft prompts
- Layer freezing: Only train top layers
Why Fine-tuning Works
📝 Practice Exercises
Exercise 1: Implement LoRA Fine-tuning
Use the PEFT library to fine-tune a model with LoRA. Your task:
- Load GPT-2 from Hugging Face
- Configure LoRA with r=8, alpha=32, targeting query/value layers
- Set up training for a text classification task
Key insight: LoRA reduces trainable parameters to ~0.2% while maintaining performance!
Exercise 2: Layer Freezing Strategy
Implement layer-wise freezing for efficient fine-tuning:
Expected output: Trainable: ~20M / 124M parameters (16%)
🧠 Knowledge Check
Test your understanding of fine-tuning concepts:
Question 1
What is the main advantage of using LoRA (Low-Rank Adaptation) for fine-tuning?
- A) It trains all model parameters for maximum accuracy
- B) It reduces trainable parameters to ~0.2% while maintaining performance
- C) It requires no pre-trained model
- D) It only works with GPT-2
Answer: B — LoRA adds small trainable adapter matrices instead of updating all weights, making fine-tuning efficient and fast.
Question 2
In layer freezing, why would you freeze early transformer layers and only train the top layers?
- A) Early layers contain low-level features (syntax, grammar) that transfer well
- B) Early layers are always corrupted
- C) It's required by the PEFT library
- D) Top layers learn nothing during pre-training
Answer: A — Early layers capture general language patterns that are useful across tasks, while top layers capture task-specific patterns that need adaptation.
Question 3
What does 'transfer learning' mean in the context of fine-tuning LLMs?
- A) Transferring model weights between different cloud providers
- B) Transferring knowledge learned from one task to a different but related task
- C) Copying the training dataset to another location
- D) Moving the model from GPU to CPU
Answer: B — Transfer learning leverages knowledge gained during pre-training (general language understanding) and applies it to a specific downstream task through fine-tuning.
Question 4
Which fine-tuning method would be most appropriate if you have limited GPU memory (8GB) but need to adapt a 7B parameter model?
- A) Full fine-tuning of all parameters
- B) LoRA or QLoRA
- C) Training from scratch
- D) No fine-tuning, use prompt engineering only
Answer: B — LoRA/QLoRA dramatically reduces memory requirements by training only small adapter layers, making it feasible to fine-tune large models on consumer GPUs.