
Vectorization in Python


What Is Vectorization?

Vectorization means replacing explicit Python loops with array operations that execute in compiled C code under the hood. Instead of iterating over 1,000 examples one at a time, you process all 1,000 in a single operation. This is not a minor optimisation — vectorized code runs 100× to 1,000× faster than equivalent loop-based code.

The rule in ML engineering: never use an explicit for-loop over training examples if a vectorized alternative exists. Your models will train in seconds instead of hours.

The Cost of Python Loops

Python loops are slow because Python is an interpreted language — every iteration has interpreter overhead. NumPy avoids this by delegating array operations to BLAS (Basic Linear Algebra Subprograms), a highly optimised library written in Fortran and C. The speed difference is dramatic even for small arrays.

python
import numpy as np
import time

a = np.random.randn(1_000_000)
b = np.random.randn(1_000_000)

# Loop version
start = time.perf_counter()
c = 0.0
for i in range(len(a)):
    c += a[i] * b[i]
print(f"Loop: {(time.perf_counter() - start) * 1000:.1f} ms")

# Vectorized version
start = time.perf_counter()
c = np.dot(a, b)
print(f"Vectorized: {(time.perf_counter() - start) * 1000:.1f} ms")
# Typical output: Loop: ~400 ms   Vectorized: ~1.5 ms
[Diagram: Vectorization vs For-Loop. NumPy processes all elements at once; a Python loop handles one element at a time.]

The for-loop processes one array element per iteration, stepping through the array index by index. NumPy's np.dot operates on all elements in a single call and finishes over 100× faster.
[Diagram: Under the Hood: For-Loop vs NumPy. NumPy bypasses Python and calls optimised C / BLAS libraries directly.]

The for-loop travels through the Python interpreter on every iteration: bytecode dispatch, the GIL, dynamic type checks. NumPy skips all of that, drops straight into compiled C, hands the work to BLAS, and exploits CPU-level parallelism (SIMD instructions and multiple cores).
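To make the interpreter overhead concrete, you can disassemble a loop-based dot product with Python's built-in dis module. Every line of the loop body compiles to several bytecode instructions, and the interpreter dispatches all of them again on every iteration (a minimal illustration; the function name loop_dot is just a placeholder):

```python
import dis

def loop_dot(a, b):
    """Dot product with an explicit Python loop."""
    c = 0.0
    for i in range(len(a)):
        c += a[i] * b[i]   # each pass re-dispatches all of this bytecode
    return c

# Show the bytecode executed on every trip through the loop
dis.dis(loop_dot)
```

The single np.dot call pays this dispatch cost once, not a million times.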

Vectorizing Matrix Multiplication

The most common operation in ML is matrix multiplication: X · W + b. With matrices, this computes predictions for all m examples at once. The non-vectorized version would loop over every example; the vectorized version does it in one line.

python
# Parameters
m, n = 1000, 5
X = np.random.randn(m, n)    # (1000, 5) — 1000 examples, 5 features each
W = np.random.randn(n, 1)    # (5, 1)    — weight column vector, one weight per feature
b = 0.5

# Non-vectorized (avoid this)
Z_loop = np.zeros(m)
for i in range(m):
    total = 0.0
    for j in range(n):
        total += W[j, 0] * X[i, j]
    Z_loop[i] = total + b

# Vectorized (use this)
# X comes first because X is (m, n) and W is (n, 1) — inner dimensions must match
Z = np.dot(X, W) + b   # matrix multiply — shape (1000, 1)
Z = X @ W + b          # identical — @ and np.dot are equivalent for 2D arrays

print(np.allclose(Z_loop, Z.flatten()))  # True — same result

Element-wise Operations

NumPy applies arithmetic operators element-wise on arrays of the same shape. For example, applying the sigmoid function to every element of Z is a single expression, not a loop.

python
def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))   # applied to every element at once

Z = np.array([[2.0, -1.0, 0.5, -3.0, 1.2]])
A = sigmoid(Z)
print(A)   # ≈ [[0.88, 0.27, 0.62, 0.047, 0.77]]

Element-wise operations require the arrays to have the same shape — or shapes compatible with broadcasting rules. Shape mismatches produce a ValueError. Always check .shape before operating on two arrays.
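A minimal sketch of what that looks like in practice: adding two (2, 3) arrays works element-wise, while adding a (2, 3) array to a (3, 2) array raises a ValueError because the shapes are incompatible:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
B = np.ones((2, 3))              # shape (2, 3) — matches A
C = np.ones((3, 2))              # shape (3, 2) — incompatible with A

print((A + B).shape)             # (2, 3): element-wise addition works

try:
    A + C                        # (2, 3) and (3, 2) cannot be combined
except ValueError as err:
    print("ValueError:", err)
```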

Common Vectorized Operations

Sum all elements

python
# Loop
total = 0.0
for i in range(len(X)):
    total += X[i]

# Vectorized
total = np.sum(X)

Apply a function to every element

python
# Loop
result = np.zeros(len(X))
for i in range(len(X)):
    result[i] = f(X[i])

# Vectorized
result = f(X)

Dot product of two 1D arrays

python
# Loop
c = 0.0
for i in range(len(a)):
    c += a[i] * b[i]

# Vectorized
c = np.dot(a, b)

Matrix multiplication

python
# Loop
Z = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        for k in range(A.shape[1]):
            Z[i, j] += A[i, k] * B[k, j]

# Vectorized
Z = A @ B          # or np.dot(A, B)

Column-wise mean

python
# Loop
col_mean = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    total = 0.0
    for i in range(X.shape[0]):
        total += X[i, j]
    col_mean[j] = total / X.shape[0]

# Vectorized
col_mean = np.mean(X, axis=0)
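Each vectorized form above can be checked against its loop version with np.allclose. A minimal sketch using a small random matrix (the seeded generator is just for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))

# Sum all elements: loop vs np.sum
total = 0.0
for row in X:
    for v in row:
        total += v
assert np.isclose(total, np.sum(X))

# Column-wise mean: loop vs np.mean(X, axis=0)
col_mean = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    for i in range(X.shape[0]):
        col_mean[j] += X[i, j]
col_mean /= X.shape[0]
assert np.allclose(col_mean, np.mean(X, axis=0))

print("loop and vectorized results match")
```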

Up next

Broadcasting in Python

Next, we look at broadcasting — NumPy's rule for operating on arrays of different shapes.
