Level 05 • Lesson 1 of 25

Vectors & Matrices

Linear algebra foundations: vectors, matrices, and the operations that combine them.

Vectors

A vector is an ordered list of numbers:

    v = [1, 2, 3]  # 3-dimensional vector

    # In ML: word embeddings are vectors
    # "cat" might be represented as [0.2, -0.5, 0.8, ...]

Vector Operations

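    # Math notation, not Python (in Python, + on two lists concatenates them)

    # Addition: element-wise
    [1, 2] + [3, 4] = [4, 6]

    # Scalar multiplication
    2 * [1, 2, 3] = [2, 4, 6]

    # Dot product (measures similarity: larger when vectors point the same way)
    [1, 2] · [3, 4] = 1*3 + 2*4 = 11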

Matrices

A matrix is a 2D array of numbers:

    A = [[1, 2],
         [3, 4],
         [5, 6]]  # 3×2 matrix (3 rows, 2 columns)

    # In ML: weight matrices transform vectors
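To make "weight matrices transform vectors" concrete, here is a minimal pure-Python sketch that multiplies the 3×2 matrix A by a 2-dimensional vector. The helper name `mat_vec` and the input vector `x` are illustrative assumptions, not part of the lesson.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one dot product per row."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


A = [[1, 2],
     [3, 4],
     [5, 6]]  # 3×2 matrix from the lesson

x = [1, 1]    # hypothetical 2-dimensional input vector

y = mat_vec(A, x)
print(y)      # [3, 7, 11]
```

The 2-dimensional input comes out as a 3-dimensional vector: the output has one entry per row of A.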

Knowledge Check

Question 1: What is the result of the dot product [2, 3] · [4, 5]?

Answer: 2×4 + 3×5 = 8 + 15 = 23

Question 2: If matrix A is 3×2 and matrix B is 2×4, what are the dimensions of A × B?

Answer: The result is 3×4 (rows from A, columns from B)

Question 3: In ML, what do weight matrices do to input vectors?

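Answer: They apply linear transformations. Multiplying an input vector by a weight matrix produces a new vector, possibly of a different dimension, whose entries are weighted combinations of the inputs. Training adjusts the weights so that these combinations extract useful features.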

Question 4: What is the result of 3 * [1, 2, 3]?

Answer: [3, 6, 9] — each element is multiplied by the scalar
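The shape rule behind Question 2 can be sketched as a small helper. The function name `matmul_shape` is a hypothetical choice for illustration; it only reasons about shapes and does not multiply anything.

```python
def matmul_shape(shape_a, shape_b):
    """Return the shape of A × B, or raise if the inner dimensions differ."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply {rows_a}×{cols_a} by {rows_b}×{cols_b}: "
            "columns of A must equal rows of B"
        )
    # Rows come from A, columns come from B
    return (rows_a, cols_b)


print(matmul_shape((3, 2), (2, 4)))  # (3, 4)
```

Swapping the operands, `matmul_shape((2, 4), (3, 2))`, raises an error, which is a reminder that matrix multiplication is order-sensitive.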

Practical Examples

Example 1: Word Embeddings in Practice

In LLMs, words (more precisely, tokens) are converted to vectors. Words with similar meanings tend to have vectors that point in similar directions:

    # Word embedding vectors (simplified, 3D for visualization)
    cat = [0.8, 0.2, 0.5]
    dog = [0.7, 0.3, 0.6]      # Similar to "cat" (both animals)
    king = [0.2, 0.9, 0.1]     # Different direction (royalty)
    queen = [0.3, 0.85, 0.15]  # Similar to "king"

    # Dot product measures similarity
    dot(cat, dog)   # High value (0.92): similar words
    dot(cat, king)  # Lower value (0.39): different meaning
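The comparison can be run directly. The sketch below defines `dot` in plain Python and, as an addition not in the lesson, a `cosine_similarity` helper. The dot product also grows with vector length, so dividing by the two lengths isolates the direction, which is what "similar vector directions" refers to.

```python
import math


def dot(v1, v2):
    """Sum of element-wise products."""
    return sum(a * b for a, b in zip(v1, v2))


def cosine_similarity(v1, v2):
    """Dot product divided by both vector lengths; 1.0 means same direction."""
    return dot(v1, v2) / (math.sqrt(dot(v1, v1)) * math.sqrt(dot(v2, v2)))


cat = [0.8, 0.2, 0.5]
dog = [0.7, 0.3, 0.6]
king = [0.2, 0.9, 0.1]
queen = [0.3, 0.85, 0.15]

print(round(dot(cat, dog), 2))                 # 0.92
print(round(dot(cat, king), 2))                # 0.39
print(round(cosine_similarity(cat, dog), 2))   # 0.98
print(round(cosine_similarity(cat, king), 2))  # 0.44
```

With these toy vectors both measures rank "dog" as closer to "cat" than "king" is.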

Example 2: Neural Network Layer Computation

A simple neural network layer uses matrix multiplication:

    # Input: one vector with 4 features (e.g., a 4-dimensional token embedding)
    input_vector = [1.0, 0.5, -0.3, 0.8]  # Shape: (1, 4)

    # Weight matrix: 4 inputs → 3 outputs (hidden layer)
    # Stored with one row per output neuron
    weights = [[ 0.2,  0.5, -0.1,  0.3],
               [-0.3,  0.1,  0.4, -0.2],
               [ 0.1, -0.4,  0.2,  0.5]]  # Shape: (3, 4)

    # Matrix multiplication: output = input × weights^T
    # Shapes: (1, 4) × (4, 3) = (1, 3), one value per hidden neuron
    output = [0.72, -0.53, 0.24]

    # This transforms the 4D input into a 3D representation
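A minimal runnable sketch of this layer follows. It treats the weight literal as three rows of four values (one row per output neuron), so each output is the dot product of one row with the input; with these numbers that works out to [0.72, -0.53, 0.24]. The helper names `linear_layer` and `relu` are hypothetical, and the ReLU activation is an assumption added for illustration, since the lesson does not name an activation function.

```python
def linear_layer(weights, inputs):
    """Return one value per weight row: dot(row, inputs)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(values):
    """Assumed activation: clamp negative values to zero."""
    return [max(0.0, v) for v in values]


input_vector = [1.0, 0.5, -0.3, 0.8]

# 3 rows × 4 columns: one row of weights per hidden neuron
weights = [[0.2, 0.5, -0.1, 0.3],
           [-0.3, 0.1, 0.4, -0.2],
           [0.1, -0.4, 0.2, 0.5]]

pre_activation = linear_layer(weights, input_vector)
print([round(v, 2) for v in pre_activation])        # [0.72, -0.53, 0.24]
print([round(v, 2) for v in relu(pre_activation)])  # [0.72, 0.0, 0.24]
```

A real layer would usually also add a bias vector before the activation; it is left out here to keep the focus on the matrix product.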

Example 3: Attention Mechanism (Simplified)

Self-attention uses matrices to compute relationships between tokens:

    # Three tokens: "The", "cat", "sat"
    # Each represented as a 4-dimensional embedding
    embeddings = [[0.5, 0.2, 0.1, 0.8],  # "The"
                  [0.8, 0.3, 0.5, 0.2],  # "cat"
                  [0.3, 0.9, 0.2, 0.1]]  # "sat"

    # Query matrix projects embeddings to query space
    Q = embeddings × W_q  # Shape: (3, 4) × (4, 4) = (3, 4)

    # Key matrix K is built the same way from its own weights
    K = embeddings × W_k  # Shape: (3, 4) × (4, 4) = (3, 4)

    # Attention scores: how much each token attends to others
    # scores = Q × K^T (matrix multiply with key transpose)
    # Result: 3×3 matrix showing token-to-token relationships

    # Example attention scores (illustrative values):
    attn_scores = [[2.1, 1.5, 0.8],
                   [1.4, 2.3, 1.1],
                   [0.9, 1.2, 2.0]]

    # "cat" (row 2) has its highest score with itself (2.3),
    # followed by "The" (1.4) and then "sat" (1.1)
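The score computation can be sketched end to end in plain Python. Several things here are assumptions rather than part of the lesson: identity matrices stand in for the learned projections `W_q` and `W_k` (real models learn these), the scores are divided by the square root of the key dimension as in scaled dot-product attention, and a softmax turns each row of scores into weights that sum to 1. The resulting numbers therefore differ from the example scores in the lesson.

```python
import math


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]


def softmax(row):
    """Exponentiate and normalize so the row sums to 1."""
    largest = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


embeddings = [[0.5, 0.2, 0.1, 0.8],  # "The"
              [0.8, 0.3, 0.5, 0.2],  # "cat"
              [0.3, 0.9, 0.2, 0.1]]  # "sat"

# Assumption: identity projections stand in for learned W_q and W_k
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
W_q = identity
W_k = identity

Q = matmul(embeddings, W_q)  # (3, 4) × (4, 4) = (3, 4)
K = matmul(embeddings, W_k)  # (3, 4) × (4, 4) = (3, 4)

d_k = len(K[0])
raw_scores = matmul(Q, transpose(K))  # (3, 4) × (4, 3) = (3, 3)
scores = [[s / math.sqrt(d_k) for s in row] for row in raw_scores]
attn_weights = [softmax(row) for row in scores]

for token, row in zip(["The", "cat", "sat"], attn_weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how one token distributes its attention over all three tokens.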

Practice Exercises

Exercise 1: Vector Operations

Implement basic vector operations in Python:

    # Task: Complete the following functions
    def vector_add(v1, v2):
        """Add two vectors element-wise"""
        # Your code here
        pass

    def dot_product(v1, v2):
        """Calculate dot product of two vectors"""
        # Your code here
        pass

    def scalar_multiply(scalar, vector):
        """Multiply vector by scalar"""
        # Your code here
        pass

    # Test cases
    print(vector_add([1, 2, 3], [4, 5, 6]))  # Expected: [5, 7, 9]
    print(dot_product([1, 2], [3, 4]))       # Expected: 11
    print(scalar_multiply(3, [1, 2, 3]))     # Expected: [3, 6, 9]

Solution:

    def vector_add(v1, v2):
        return [a + b for a, b in zip(v1, v2)]

    def dot_product(v1, v2):
        return sum(a * b for a, b in zip(v1, v2))

    def scalar_multiply(scalar, vector):
        return [scalar * x for x in vector]
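One caveat with a `zip`-based solution: `zip` stops at the shorter input, so vectors of different lengths are silently truncated instead of rejected. A possible variant with an explicit length check is sketched below; the names `_check_same_length`, `vector_add_checked`, and `dot_product_checked` are hypothetical and only for illustration.

```python
def _check_same_length(v1, v2):
    """Raise if the two vectors do not have the same number of elements."""
    if len(v1) != len(v2):
        raise ValueError(f"length mismatch: {len(v1)} vs {len(v2)}")


def vector_add_checked(v1, v2):
    _check_same_length(v1, v2)
    return [a + b for a, b in zip(v1, v2)]


def dot_product_checked(v1, v2):
    _check_same_length(v1, v2)
    return sum(a * b for a, b in zip(v1, v2))


print(vector_add_checked([1, 2, 3], [4, 5, 6]))  # [5, 7, 9]
print(dot_product_checked([1, 2], [3, 4]))       # 11
```

Calling `dot_product_checked([1, 2, 3], [4, 5])` raises a `ValueError`, whereas the unchecked version would return 14 without complaint.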

Exercise 2: Matrix Multiplication

Implement matrix multiplication from scratch:

    # Task: Implement matrix multiplication
    def matrix_multiply(A, B):
        """
        Multiply matrix A (m×n) by matrix B (n×p)
        Returns matrix of shape (m×p)
        """
        # Your code here
        pass

    # Test case
    A = [[1, 2], [3, 4]]  # 2×2
    B = [[5, 6], [7, 8]]  # 2×2
    result = matrix_multiply(A, B)
    print(result)  # Expected: [[19, 22], [43, 50]]

    # Explanation: [1*5+2*7, 1*6+2*8] = [19, 22]
    #              [3*5+4*7, 3*6+4*8] = [43, 50]

Solution:

    def matrix_multiply(A, B):
        rows_A = len(A)
        cols_A = len(A[0])
        cols_B = len(B[0])

        # Initialize result matrix with zeros
        result = [[0 for _ in range(cols_B)] for _ in range(rows_A)]

        # Multiply
        for i in range(rows_A):
            for j in range(cols_B):
                for k in range(cols_A):
                    result[i][j] += A[i][k] * B[k][j]

        return result
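As an alternative to the triple loop, the same product can be written as "dot product of each row of A with each column of B". The sketch below does that with `zip(*B)` to obtain the columns and adds a check on the inner dimensions; the name `matrix_multiply_checked` is a hypothetical choice, and the code assumes both inputs are non-empty rectangular lists of rows.

```python
def matrix_multiply_checked(A, B):
    """Multiply A (m×n) by B (n×p), validating that the inner dimensions agree."""
    if not A or not B or len(A[0]) != len(B):
        raise ValueError("inner dimensions must match: A is m×n, B must be n×p")
    columns_of_B = list(zip(*B))  # each tuple is one column of B
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]


A = [[1, 2], [3, 4]]  # 2×2
B = [[5, 6], [7, 8]]  # 2×2
print(matrix_multiply_checked(A, B))  # [[19, 22], [43, 50]]
```

Both versions perform the same m×n×p multiplications; the difference is readability and the early error on mismatched shapes.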