Why NumPy?
NumPy is the foundation of the entire Python ML ecosystem. It provides the ndarray — a fast, memory-efficient N-dimensional array — and a library of mathematical operations that run in optimised compiled code. Every other library in this section (scikit-learn, TensorFlow, PyTorch) uses NumPy arrays as the standard data format.
Creating Arrays
You create NumPy arrays from Python lists, from shape specifications, or by generating values. The most common creation functions in ML code are np.zeros, np.ones, np.random.randn, and np.array.
import numpy as np
# From a list
a = np.array([1.0, 2.0, 3.0]) # 1D array, shape (3,)
M = np.array([[1, 2], [3, 4]]) # 2D array, shape (2, 2)
# By shape
zeros = np.zeros((3, 4)) # all zeros, shape (3, 4)
ones = np.ones((2, 5)) # all ones, shape (2, 5)
rand = np.random.randn(100, 10) # random floats, normal distribution (mean 0, std 1)
ints = np.random.randint(0, 2, 1000) # random integers, each either 0 or 1, shape (1000,)
# Ranges
x = np.arange(0, 10, 0.5) # evenly spaced values from 0 to 9.5, step 0.5
x = np.linspace(0, 1, 100) # 100 evenly spaced values from 0 to 1
Use np.random.randn (standard normal) for initialising weights and np.zeros for bias terms. Never initialise all weights to zero: a network whose weights are all identical cannot learn, because every neuron computes the same output and receives the same gradient, so the neurons never become different from one another.
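That advice can be sketched as a minimal initialiser for a single dense layer (the layer sizes here are hypothetical, not from the text above):

```python
import numpy as np

n_in, n_out = 5, 3  # hypothetical layer sizes

# Small random weights break the symmetry between neurons;
# scaling by 0.01 keeps initial activations small.
W = np.random.randn(n_in, n_out) * 0.01

# Biases can safely start at zero: the random W already
# makes each neuron's output different.
b = np.zeros(n_out)

print(W.shape)  # (5, 3)
print(b.shape)  # (3,)
```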
Key Array Operations
A = np.random.randn(4, 3)
# Shape and size
print(A.shape) # (4, 3) — tuple of dimensions (rows, columns)
print(A.ndim) # 2 — number of dimensions
print(A.size) # 12 — total elements
# Transpose
print(A.T.shape) # (3, 4) — rows and columns swapped
# Axis-wise reductions
print(np.sum(A)) # sum of all elements
print(np.sum(A, axis=0)) # sum down each column → shape (3,)
print(np.sum(A, axis=1)) # sum along each row → shape (4,)
print(np.mean(A, axis=0)) # mean per column → shape (3,)
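Axis-wise reductions are exactly what feature standardisation needs: subtract each column's mean and divide by its standard deviation. A short sketch on a hypothetical data matrix:

```python
import numpy as np

X = np.random.randn(100, 5) * 3.0 + 7.0  # hypothetical raw data: 100 examples, 5 features

mu = X.mean(axis=0)       # per-feature mean, shape (5,)
sigma = X.std(axis=0)     # per-feature std, shape (5,)
X_std = (X - mu) / sigma  # broadcasting stretches the (5,) vectors across all 100 rows

print(X_std.mean(axis=0))  # approximately 0 for every feature
print(X_std.std(axis=0))   # approximately 1 for every feature
```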
# Math functions
print(np.exp(A)) # e^x element-wise
print(np.log(A)) # ln(x) element-wise — nan for negative inputs, so in practice apply it only to positive values
print(np.abs(A)) # |x| element-wise
print(np.maximum(0, A)) # ReLU — clamps negatives to 0
Indexing and Slicing
X = np.random.randn(100, 5) # 100 examples, 5 features
# Single element
print(X[0, 0]) # row 0, column 0
# Entire row / column
print(X[0, :]) # first example — shape (5,)
print(X[:, 0]) # first feature for all examples — shape (100,)
# Slices
print(X[:10, :]) # first 10 examples — shape (10, 5)
print(X[:, 1:3]) # features 1 and 2 — shape (100, 2)
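Column slicing like this is also how a data matrix is split into features and labels, assuming (as a common convention, not something in the snippet above) that the label sits in the last column:

```python
import numpy as np

data = np.random.randn(100, 6)  # hypothetical: 5 feature columns plus a label column

features = data[:, :-1]  # everything except the last column — shape (100, 5)
labels = data[:, -1]     # the last column only — shape (100,)

print(features.shape, labels.shape)  # (100, 5) (100,)
```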
# Boolean indexing
mask = X[:, 0] > 0
print(X[mask, :]) # examples where first feature is positive
Matrix Operations for ML
| Operation | Code | Description |
|---|---|---|
| Dot product (vectors) | np.dot(a, b) | Sum of element-wise products |
| Matrix multiply | A @ B | Standard matrix multiplication |
| Element-wise multiply | A * B | Each element multiplied with the element at the same position — arrays must be the same shape |
| Transpose | A.T | Flip rows and columns |
| Inverse | np.linalg.inv(A) | Square, invertible matrices only |
| Norm | np.linalg.norm(v) | Euclidean length of a vector |
Use @ for matrix multiplication and * for element-wise. Mixing them up is one of the most common dimension bugs in ML code.
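The distinction matters most in a linear layer's forward pass, where @ contracts the feature dimension while * works element by element. A minimal sketch (the shapes here are hypothetical):

```python
import numpy as np

X = np.random.randn(32, 10)  # batch of 32 examples, 10 features
W = np.random.randn(10, 4)   # weights mapping 10 features to 4 outputs
b = np.zeros(4)

Y = X @ W + b                # matrix multiply: (32, 10) @ (10, 4) gives (32, 4)
print(Y.shape)               # (32, 4)

# Element-wise multiply needs matching (or broadcastable) shapes:
mask = (X > 0).astype(float)  # same shape as X
print((X * mask).shape)       # (32, 10): each element scaled individually
```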
Stacking and Splitting Arrays
a = np.array([[1, 2, 3]]) # shape (1, 3)
b = np.array([[4, 5, 6]]) # shape (1, 3)
# Stack vertically (add rows)
C = np.vstack([a, b]) # shape (2, 3)
# Stack horizontally (add columns)
D = np.hstack([a, b]) # shape (1, 6) — but this only works if row counts match
# Split
X_train, X_test = X[:80], X[80:] # first 80 rows / last 20 rows of the (100, 5) X above
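Slicing in order only gives a valid train/test split if the rows are already in random order. A safer sketch shuffles the row indices first with np.random.permutation:

```python
import numpy as np

X = np.random.randn(100, 5)          # hypothetical data matrix

idx = np.random.permutation(len(X))  # shuffled row indices 0..99
X_train = X[idx[:80]]                # a random 80% for training
X_test = X[idx[80:]]                 # the remaining 20% for testing

print(X_train.shape, X_test.shape)   # (80, 5) (20, 5)
```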