Why Matrices?
In machine learning, you rarely process one example at a time. You process batches of thousands of examples simultaneously — and the way you organise them is as a matrix. A matrix is a 2D grid of numbers: rows represent examples, columns represent features. Every dataset, every set of predictions, every layer of a neural network is a matrix.
If you understand how data is shaped as a matrix, you will understand why neural network code is written the way it is. Shape errors are the most common bug in ML code — mastering X.shape prevents them.
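To make that concrete, here is a minimal sketch (shapes chosen arbitrarily for illustration) of the kind of dimension mismatch that checking .shape catches early:

```python
import numpy as np

X = np.random.randn(1000, 3)   # 1000 examples, 3 features
W = np.zeros((4, 1))           # wrong: 4 weights for only 3 features

try:
    Z = X @ W                  # inner dimensions 3 and 4 do not match
except ValueError as e:
    print("shape error:", e)

# Checking shapes up front turns this runtime failure into an obvious fix
print(X.shape, W.shape)        # (1000, 3) (4, 1)
```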
The Input Matrix X
The input matrix X holds all your training examples. Each row is one example. Each column is one feature. For example, a dataset of 1,000 house prices with 3 features (size, bedrooms, location score) would be shaped (1000, 3) — 1,000 rows, 3 columns.
import numpy as np
# 1000 training examples, 3 features each
X = np.random.randn(1000, 3)
print(X.shape) # (1000, 3)
print(X.shape[0]) # 1000 — number of examples (m)
print(X.shape[1]) # 3 — number of features (n)

This module uses NumPy throughout — np.array, np.zeros, np.random.randn, .reshape(), and more. We will take a detailed look at all of these in the NumPy Fundamentals module later in this track. For now, focus on the shapes and the data layout.
X.shape — Reading the Dimensions
X.shape returns a tuple (rows, columns). In ML convention, X.shape[0] is m — the number of training examples. X.shape[1] is n — the number of input features. You will read and check .shape constantly to catch dimension mismatches before they cause bugs downstream.
X = np.array([[2.1, 0.5, 1.2],
[3.4, 1.1, 0.8],
[1.7, 2.3, 0.4]])
print(X.shape) # (3, 3) → 3 examples, 3 features
m = X.shape[0] # m = 3
n = X.shape[1] # n = 3

The Output Matrix Y
Y holds the labels — the correct answers for each training example. For a binary classification problem (yes/no, cat/not-cat), Y is a row vector of shape (1, m) — one label per training example, stored as 0 or 1.
# Y for 5 training examples — binary labels
Y = np.array([[1, 0, 1, 1, 0]]) # shape (1, 5)
print(Y.shape) # (1, 5)

Note that Y is shaped (1, m) — a row vector — not (m,) or (m, 1). The (1, m) convention keeps matrix operations consistent when you multiply W · X and add b. Always check Y.shape when debugging.
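To see why the distinction matters, here is a short sketch (labels are hypothetical) converting a flat (m,) array into the (1, m) convention:

```python
import numpy as np

labels = np.array([1, 0, 1, 1, 0])    # shape (5,) — a flat 1D array
print(labels.shape)                    # (5,)

Y = labels.reshape(1, -1)              # shape (1, 5) — an explicit row vector
print(Y.shape)                         # (1, 5)

# A (5, 1) column vector is a different layout again
print(labels.reshape(-1, 1).shape)     # (5, 1)
```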
The Full Dataset Layout
For a supervised learning problem with m training examples and n input features:
| Matrix | Shape | Description |
|---|---|---|
| X | (m, n) | Input matrix — m examples, n features each |
| Y | (1, m) | Output row vector — one label per example |
| W | (n, 1) | Weight vector — one weight per feature |
| b | scalar | Bias term |
m = 1000 # training examples
n = 5 # features per example
X = np.random.randn(m, n) # (1000, 5)
Y = np.random.randint(0, 2, (1, m)) # (1, 1000)
W = np.zeros((n, 1)) # (5, 1)
b = 0.0 # scalar
print(X.shape, Y.shape, W.shape) # (1000, 5) (1, 1000) (5, 1)

Reshaping Arrays
You will often need to reshape arrays to match the expected dimensions. For example, when loading images, each image is a 2D pixel grid but needs to be flattened into a 1D vector to form a row of X. The .reshape() method usually does this without copying data — it returns a view of the original array whenever the memory layout allows it.
# Flatten a 64×64 image into a 4096-element vector
img = np.random.randn(64, 64)
img_flat = img.reshape(-1) # shape (4096,)
img_col = img.reshape(-1, 1) # shape (4096, 1)
# Flatten m images into matrix X
images = np.random.randn(100, 64, 64) # 100 images of 64×64
X = images.reshape(100, -1) # (100, 4096)
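Putting the pieces together, here is a sketch (all values random, purely illustrative) that flattens a batch of images into X and verifies that every shape in the dataset layout lines up before any training happens:

```python
import numpy as np

m, h, w = 100, 64, 64                  # 100 images of 64×64 pixels
n = h * w                              # 4096 features per example

images = np.random.randn(m, h, w)
X = images.reshape(m, -1)              # (100, 4096) — one flattened image per row
Y = np.random.randint(0, 2, (1, m))    # (1, 100) — one binary label per example
W = np.zeros((n, 1))                   # (4096, 1) — one weight per feature
b = 0.0                                # scalar bias

Z = (X @ W).T + b                      # (100, 1) transposed → (1, 100)
assert Z.shape == Y.shape              # predictions line up with labels
print(X.shape, W.shape, Z.shape)       # (100, 4096) (4096, 1) (1, 100)
```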