Matrix Multiplication
The fundamental operation in neural networks:
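A minimal NumPy sketch of the operation follows. The matrices and their values are illustrative assumptions, not taken from the original lesson.

```python
import numpy as np

# Illustrative matrices (values are assumptions for this sketch).
A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2)

# Entry (i, j) of C is the dot product of row i of A and column j of B.
C = A @ B                        # shape (2, 2)
print(C)
# [[ 58  64]
#  [139 154]]
```

The inner dimensions (3 and 3) must match; the result takes the outer dimensions (2 and 2).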
Transpose
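As a small illustration (the matrix values are assumed for this sketch), transposing swaps rows and columns, which is often what makes two shapes compatible for multiplication:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)

A_T = A.T                   # shape (3, 2): rows become columns
print(A_T.shape)            # (3, 2)

# A (2x3) cannot be multiplied by itself, but A @ A.T is valid:
# (2x3) times (3x2) gives a 2x2 result.
G = A @ A.T
print(G)
# [[14 32]
#  [32 77]]
```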
Identity and Inverse
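The sketch below checks both properties numerically. The matrix A is an arbitrary invertible example chosen for illustration, and comparisons use np.allclose because floating-point results are only approximately exact.

```python
import numpy as np

# Illustrative invertible matrix: det = 4*6 - 7*2 = 10, which is nonzero.
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

I = np.eye(2)                 # 2x2 identity matrix
A_inv = np.linalg.inv(A)      # inverse of A

print(np.allclose(A @ I, A))        # True: multiplying by I leaves A unchanged
print(np.allclose(A @ A_inv, I))    # True
print(np.allclose(A_inv @ A, I))    # True: the inverse works on both sides
```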
Key Takeaways
- Matrix Multiplication: Each entry of the product is the dot product of a row of A with a column of B. For A (m×n) and B (n×p), the result is m×p. Essential for neural network forward passes.
- Transpose: Flips rows to columns (A^T). Useful for shape compatibility and attention mechanisms in transformers.
- Identity Matrix: Acts as "1" for matrices — multiplying by I leaves the matrix unchanged.
- Matrix Inverse: For a square, non-singular A, A^-1 satisfies A @ A^-1 = A^-1 @ A = I. Used for solving linear systems, though computing it explicitly is often expensive for large matrices.
- Geometric View: Matrices represent linear transformations (scaling, rotation, shearing) — composition of transformations equals matrix multiplication.
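The geometric view in the last takeaway can be sketched as follows. The rotation angle, scale factor, and test vector are assumptions chosen for illustration.

```python
import numpy as np

theta = np.pi / 2  # 90 degrees counterclockwise (illustrative choice)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation matrix
S = np.array([[2.0, 0.0],
              [0.0, 2.0]])                        # uniform scaling by 2

v = np.array([1.0, 0.0])

# Apply the transformations one at a time: rotate first, then scale.
step_by_step = S @ (R @ v)

# Compose them into one matrix first, then apply it once.
combined = (S @ R) @ v

print(np.allclose(step_by_step, combined))        # True
print(np.allclose(combined, [0.0, 2.0]))          # True: (1, 0) maps to (0, 2)
```

Note the order: in S @ R, the rightmost matrix R acts on the vector first.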
Quick Quiz
1. Matrix Multiplication Dimensions: If matrix A is 3×4 and matrix B is 4×2, what are the dimensions of AB?
Answer: 3×2 (the inner dimensions must match, result has outer dimensions)
2. Transpose Property: What is (AB)^T equal to?
Answer: B^T × A^T (the order reverses when transposing a product)
3. Identity Matrix: If A is a 3×3 matrix, what is A × I?
Answer: A (the identity matrix acts like 1 for matrix multiplication)
4. Inverse Application: If Ax = b, how do you solve for x using the inverse?
Answer: x = A^-1 × b, assuming A is invertible (multiply both sides by A^-1 on the left)
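The answer to question 4 can be checked numerically. The system below is an illustrative assumption. The sketch also shows np.linalg.solve, which solves Ax = b without forming the inverse explicitly and is generally the preferred route in practice.

```python
import numpy as np

# Illustrative system: 2x + y = 3 and x + 3y = 5.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x_inv = np.linalg.inv(A) @ b      # x = A^-1 b, as in the quiz answer
x_solve = np.linalg.solve(A, b)   # same solution without an explicit inverse

print(np.allclose(x_inv, x_solve))   # True
print(np.allclose(A @ x_solve, b))   # True: the solution satisfies Ax = b
```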
Practice Exercises
Exercise 1: Matrix Multiplication Implementation
Implement matrix multiplication from scratch without using numpy's @ operator:
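One possible solution sketch, using plain nested lists. The function name matmul and the error message are choices made here, not part of the original exercise.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    m, n = len(A), len(A[0])
    n_b, p = len(B), len(B[0])
    if n != n_b:
        raise ValueError(f"shape mismatch: {m}x{n} cannot multiply {n_b}x{p}")

    # Start with an m x p matrix of zeros.
    C = [[0] * p for _ in range(m)]
    for i in range(m):            # each row of A
        for j in range(p):        # each column of B
            for k in range(n):    # dot product of length n
                C[i][j] += A[i][k] * B[k][j]
    return C


result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)  # [[19, 22], [43, 50]]
```

The three nested loops run m × p × n multiply-add steps in total.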
Exercise 2: Batch Matrix Operations for Neural Networks
In neural networks, we process multiple inputs at once (batch processing). Given a batch of 3 inputs (each with 4 features) and weights (4×2), compute the output:
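A minimal sketch of the computation, assuming illustrative values for the inputs X, the weights W, and a bias vector b with one entry per output:

```python
import numpy as np

# Batch of 3 inputs, each with 4 features (values are illustrative).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.0, 1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0, 0.0]])   # shape (3, 4)

# Weights mapping 4 features to 2 outputs.
W = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6],
              [0.7, 0.8]])             # shape (4, 2)

# One bias per output; broadcasting adds it to every row of X @ W.
b = np.array([0.5, -0.5])              # shape (2,)

Y = X @ W + b
print(Y.shape)  # (3, 2): one row of outputs per input in the batch
```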
Challenge: Why is the bias added after the matrix multiplication, not before? What would happen if you added it before?
Additional Exercises
Exercise 3: Matrix Transpose Implementation
Write a function to compute the transpose of a matrix without using numpy:
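One possible solution sketch using nested lists. The function name transpose is a choice made here, and the input is assumed to be rectangular (every row has the same length).

```python
def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    if not M:
        return []
    rows, cols = len(M), len(M[0])
    # Column j of M becomes row j of the result.
    return [[M[i][j] for i in range(rows)] for j in range(cols)]


T = transpose([[1, 2, 3],
               [4, 5, 6]])
print(T)  # [[1, 4], [2, 5], [3, 6]]
```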
Exercise 4: Verify Transpose Property
Verify that (AB)^T = B^T × A^T with a concrete example:
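A small numerical check, with matrices chosen here for illustration. A single example does not prove the identity in general, but non-square shapes make it clear why the order must reverse.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])        # shape (3, 2)
B = np.array([[7, 8, 9],
              [10, 11, 12]])  # shape (2, 3)

lhs = (A @ B).T               # transpose of the 3x3 product
rhs = B.T @ A.T               # (3x2) times (2x3) gives 3x3

print(np.array_equal(lhs, rhs))   # True

# Without reversing the order, even the shape is wrong:
wrong = A.T @ B.T             # (2x3) times (3x2) gives 2x2
print(wrong.shape)            # (2, 2)
```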
Insight: This property is crucial in backpropagation where we need to compute gradients through matrix operations.
Knowledge Check Quiz
1. Matrix Multiplication Non-Commutativity: Is AB always equal to BA? Provide a counterexample or explain why.
Answer: No, matrix multiplication is not commutative. For A = [[1, 2], [3, 4]] and B = [[0, 1], [0, 0]], AB = [[0, 1], [0, 3]] but BA = [[3, 4], [0, 0]].
2. Associative Property: For matrices A (2×3), B (3×4), C (4×5), is (AB)C equal to A(BC)?
Answer: Yes, matrix multiplication is associative: (AB)C = A(BC). The result will be 2×5 in both cases.
3. Singular Matrix: What does it mean if a matrix has no inverse? Give an example of a 2×2 singular matrix.
Answer: A matrix with no inverse is called singular, and its determinant is 0. Example: [[1, 2], [2, 4]], whose rows are linearly dependent (the second row is 2× the first).
4. Neural Network Application: In a layer with 100 inputs and 50 outputs, what are the dimensions of the weight matrix W?
Answer: With inputs stored as rows (output = input × W), W is 100×50. When we multiply input (batch_size × 100) by W (100×50), we get output (batch_size × 50).
5. Computational Complexity: What is the time complexity of multiplying an m×n matrix by an n×p matrix?
Answer: O(m×n×p) with the standard algorithm. Each of the m×p output elements requires computing a dot product of length n.
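Several of the answers above can be checked numerically. In this sketch, the random matrices, the seed, and the batch size of 8 are arbitrary assumptions. The matrices for questions 1 and 3 are the ones given in the answers.

```python
import numpy as np

# Question 1: AB and BA differ for the matrices in the answer.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [0, 0]])
AB = A @ B
BA = B @ A
print(np.array_equal(AB, BA))          # False

# Question 2: associativity, checked on random matrices of the given shapes.
rng = np.random.default_rng(0)
P = rng.standard_normal((2, 3))
Q = rng.standard_normal((3, 4))
R = rng.standard_normal((4, 5))
left = (P @ Q) @ R
right = P @ (Q @ R)
print(np.allclose(left, right))        # True, up to floating-point rounding

# Question 3: the example singular matrix has determinant 0.
S = np.array([[1.0, 2.0], [2.0, 4.0]])
det_S = np.linalg.det(S)
print(np.isclose(det_S, 0.0))          # True

# Question 4: a (batch_size x 100) input times a (100 x 50) weight matrix.
batch = np.ones((8, 100))              # batch_size = 8 is an assumption
W = np.zeros((100, 50))
out = batch @ W
print(out.shape)                       # (8, 50)
```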