What Is Broadcasting?
Broadcasting is NumPy's rule for performing arithmetic between arrays of different shapes. Instead of requiring both arrays to be identical in shape, NumPy stretches the smaller array along its size-1 dimensions to match the larger one. This lets you write clean, loop-free code for operations like adding a bias term to every row or subtracting the column mean from every value.
Broadcasting does not copy data. NumPy stretches the array conceptually during the operation, so memory usage stays low. The result is as if you had manually tiled the smaller array — without the performance cost.
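The no-copy claim can be observed directly with np.broadcast_to, which returns a read-only view whose stretched axis has stride 0 (a minimal sketch; the variable names are illustrative):

```python
import numpy as np

b = np.array([1.0, 2.0, 3.0, 4.0])   # shape (4,)
view = np.broadcast_to(b, (3, 4))    # shape (3, 4), but no data is copied

# The stretched axis has stride 0: all three "rows" read the same memory.
print(view.strides[0])               # 0
print(np.shares_memory(view, b))     # True
```

Because every row aliases the same buffer, the view is marked read-only; writing through it would silently update all rows at once.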
The Broadcasting Rules
NumPy compares shapes from right to left. Two dimensions are compatible if they are equal or if one of them is 1. If shapes are compatible, the size-1 dimension expands to match the other. If they are not compatible, NumPy raises a ValueError.
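These rules can be checked without building any arrays using np.broadcast_shapes (available in NumPy 1.20+), which applies the same right-to-left comparison and either returns the result shape or raises ValueError:

```python
import numpy as np

# Compatible shapes: trailing dimensions are equal, or one of them is 1
print(np.broadcast_shapes((3, 4), (4,)))       # (3, 4)
print(np.broadcast_shapes((3, 4), (3, 1)))     # (3, 4)
print(np.broadcast_shapes((8, 1, 6), (7, 1)))  # (8, 7, 6)

# Incompatible: 3 vs 4, and neither is 1
try:
    np.broadcast_shapes((3,), (4,))
except ValueError as err:
    print("ValueError:", err)
```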
# Example 1: (3, 4) + (4,)
import numpy as np

A = np.ones((3, 4))
b = np.array([1, 2, 3, 4]) # shape (4,)
# Step 1: pad left → (4,) becomes (1, 4)
# Step 2: compare right to left
# dim 1: 4 vs 4 → equal ✓ keep 4
# dim 0: 3 vs 1 → one is 1 ✓ stretch to 3
# Step 3: b repeated 3x along axis 0
print(np.broadcast_to(b, (3, 4)))
# [[1 2 3 4]
#  [1 2 3 4]
#  [1 2 3 4]] ← b stretched to match A
C = A + b
print(C.shape) # (3, 4)
# Example 2: (3, 4) + (3, 1)
bias = np.array([[10], [20], [30]]) # shape (3, 1)
# Step 1: both 2D, no padding needed
# Step 2: compare right to left
# dim 1: 4 vs 1 → one is 1 ✓ stretch to 4
# dim 0: 3 vs 3 → equal ✓ keep 3
# Step 3: bias repeated 4x along axis 1
print(np.broadcast_to(bias, (3, 4)))
# [[10 10 10 10]
#  [20 20 20 20]
#  [30 30 30 30]] ← bias stretched to match A
D = A + bias
print(D.shape) # (3, 4)
Broadcasting in Practice: Adding a Bias
In logistic regression, you compute Z = Wᵀ·X + b, where b is a scalar. NumPy broadcasts b to match the shape of Wᵀ·X automatically, with no manual tiling needed.
W = np.random.randn(5, 1) # (5, 1)
X = np.random.randn(5, 100) # (5, 100) — 100 examples
b = 0.5 # scalar
Z = np.dot(W.T, X) + b # W.T is (1,5), X is (5,100), result (1,100)
# b scalar broadcasts to (1, 100)
print(Z.shape) # (1, 100)
When Broadcasting Fails
Broadcasting fails when neither dimension is 1 and the sizes differ. For example, adding a (3,) vector to a (4,) vector raises an error.
a = np.array([1, 2, 3]) # shape (3,)
b = np.array([10, 20, 30, 40]) # shape (4,)
# a + b → ValueError: operands could not be broadcast together with shapes (3,) (4,)
# Fix: reshape so the mismatched sizes land in different dimensions
a_col = a.reshape(-1, 1) # shape (3, 1)
S = a_col + b.reshape(1, -1) # (3, 1) + (1, 4) → (3, 4), every pairwise sum (an "outer sum")
print(S.shape) # (3, 4)
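The same mismatch bites when centering a matrix by its row means, the mean-subtraction use case from the introduction. Reducing along axis 1 of a (3, 4) matrix yields shape (3,), which does not broadcast against (3, 4); passing keepdims=True keeps the reduced axis as size 1 so the rules line up (a small sketch with an illustrative matrix M):

```python
import numpy as np

M = np.arange(12.0).reshape(3, 4)

# Column means: shape (4,) broadcasts cleanly against (3, 4)
col_centered = M - M.mean(axis=0)

# Row means: shape (3,) does NOT broadcast against (3, 4) — 4 vs 3 fails.
# M - M.mean(axis=1)  → ValueError
# keepdims=True preserves the reduced axis: shape (3, 1) broadcasts fine.
row_centered = M - M.mean(axis=1, keepdims=True)

print(col_centered.shape, row_centered.shape)  # (3, 4) (3, 4)
```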