This module covers the three learning paradigms at a high level. For a hands-on treatment of supervised learning algorithms — regression, gradient descent, and cost functions — see the Machine Learning course.
Why This Matters for Deep Learning
Neural networks — and deep learning in general — are at their most powerful when applied to one specific type of machine learning: supervised learning. That is not a coincidence. The entire training process for a neural network is built around labelled data. You feed the network input-output pairs, it makes predictions, and it adjusts its weights based on how wrong those predictions were. Repeat that millions of times and the network learns to generalise.
Before we go deeper into how neural networks work in practice, it is worth taking a step back and understanding the landscape they sit in. Machine learning has three main learning paradigms. Knowing where deep learning lives inside that landscape — and where it does not — will make every module that follows much clearer.
Deep learning is a subset of machine learning, and neural networks are its core algorithm. The learning paradigm that drives neural network training is almost always supervised learning. We briefly cover all three paradigms here so you have the full picture.
Supervised Learning
In supervised learning, the model is trained on labelled examples — every training record is an input paired with a known correct output. The model learns to map inputs to outputs by minimising the error between its predictions and the true labels.
- Classification: predict a category. For example, spam or not spam, cat or dog, tumour or no tumour.
- Regression: predict a continuous value. For example, house price, tomorrow's temperature, or crop yield per hectare.
- Common algorithms: Linear Regression, Decision Trees, Support Vector Machines, and Neural Networks.
Think back to our crop yield predictor. You have thousands of historical farm records — each with soil quality, rainfall, fertiliser amount, and temperature — paired with the actual yield measured that season. That measured yield is the label. The model sees thousands of these input-label pairs, figures out the relationship between the inputs and the output, and learns to predict yield for farms it has never encountered. The supervision comes entirely from the labels — without them, there is nothing to learn from.
Supervised learning requires labelled data. Every training example must have a known correct output. The model's job is to learn the mapping from inputs to outputs.
The same structure applies across every domain. In spam detection the label is "spam" or "not spam." In medical imaging it is "tumour present" or "no tumour." In credit scoring it is "defaulted" or "repaid." Inputs plus a known output, repeated at scale — that is supervised learning.
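That input-to-output mapping can be sketched in a few lines. The snippet below fits a straight line to made-up (rainfall, yield) pairs by gradient descent on the mean squared error — the data and hyperparameters are illustrative assumptions, not real farm records.

```python
# A minimal sketch of supervised learning: fit y = w*x + b to labelled
# (input, yield) pairs by gradient descent. All numbers are made up.

# Synthetic labelled data: rainfall (x) paired with measured yield (y),
# lying roughly along y = 2x + 1.
data = [(1.0, 3.1), (2.0, 5.0), (3.0, 6.9), (4.0, 9.1)]

w, b = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # close to the true slope (about 2) and intercept (about 1)
```

A neural network runs this exact loop — predict, measure error against the label, nudge the weights — just with millions of parameters instead of two.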
A dataset contains 50,000 farm records with soil quality, rainfall, and fertiliser data. Half have yield labels, half do not. Which half can be used for supervised learning?
Unsupervised Learning
Unsupervised learning discovers structure in unlabelled data. There are no correct answers provided — the algorithm finds patterns on its own.
- Clustering: group similar data points together. For example, K-Means grouping customers by purchase behaviour.
- Dimensionality reduction: compress many features into fewer while preserving structure. For example, PCA or t-SNE.
- Anomaly detection: identify data points that do not fit any discovered group.
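The dimensionality-reduction idea can be sketched without any library: the snippet below finds the first principal component of some made-up 2-D points by power iteration on their covariance matrix, then collapses each point to a single coordinate along it. The points and numbers are illustrative assumptions.

```python
# A minimal PCA sketch: project 2-D points onto their direction of
# greatest variance (the first principal component). The invented data
# lies roughly along the line y = x.
pts = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]

n = len(pts)
mx = sum(p[0] for p in pts) / n
my = sum(p[1] for p in pts) / n
centered = [(x - mx, y - my) for x, y in pts]

# Entries of the 2x2 covariance matrix.
cxx = sum(x * x for x, _ in centered) / n
cyy = sum(y * y for _, y in centered) / n
cxy = sum(x * y for x, y in centered) / n

# Power iteration: repeatedly applying the covariance matrix converges
# on its dominant eigenvector, i.e. the first principal component.
vx, vy = 1.0, 0.0
for _ in range(50):
    vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
    norm = (vx * vx + vy * vy) ** 0.5
    vx, vy = vx / norm, vy / norm

# Each 2-D point collapses to one coordinate along that direction.
projected = [x * vx + y * vy for x, y in centered]
print((vx, vy))  # close to the y = x diagonal, roughly (0.7, 0.7)
```

No labels were involved: the "compressed" axis emerged purely from the spread of the data itself.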
Using the same farm dataset, imagine you have soil sensor readings from ten thousand plots but no yield records at all. You have no labels. An unsupervised model might cluster those plots into groups — one with high moisture and clay-heavy soil, another with sandy, well-drained soil, a third with nitrogen-rich topsoil. The model did not know those categories existed in advance. It found them purely by identifying which plots were similar to each other. Those clusters might then guide targeted irrigation strategies or fertiliser recommendations, even without a single yield label in the dataset.
Unsupervised learning finds hidden structure in unlabelled data. The model is not told what to look for — it discovers patterns on its own.
Unlike supervised learning, there is no objective score to optimise directly. The quality of the result depends on whether the discovered structure turns out to be useful — which means human judgement is always part of the evaluation.
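The plot-grouping described above can be sketched with a bare-bones K-Means loop. The soil readings below are invented for illustration; a real pipeline would reach for a library implementation such as scikit-learn's KMeans.

```python
# A minimal k-means sketch (k = 2) on made-up 2-D soil readings
# (moisture, clay fraction). No labels anywhere: the grouping emerges
# purely from which points lie close together.
points = [(0.9, 0.8), (1.0, 0.9), (1.1, 0.85),   # wet, clay-heavy plots
          (0.2, 0.1), (0.1, 0.2), (0.15, 0.15)]  # dry, sandy plots

def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

centroids = [points[0], points[3]]  # naive initialisation
for _ in range(10):
    # Assignment step: each point joins its nearest centroid's cluster.
    clusters = [[], []]
    for p in points:
        k = min(range(2), key=lambda i: dist2(p, centroids[i]))
        clusters[k].append(p)
    # Update step: each centroid moves to the mean of its cluster.
    centroids = [
        (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
        for c in clusters
    ]

print(centroids)  # one centroid per discovered soil group
```

Notice there is no error term against a correct answer — the loop only alternates "assign to nearest centroid" and "recompute centroid", and the two soil groups fall out on their own.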
Reinforcement Learning
Reinforcement learning trains an agent to make sequential decisions by rewarding good outcomes and penalising bad ones. There are no labels — instead, the agent explores an environment, takes actions, and receives a reward signal that tells it how well it did.
- Key components: Agent, Environment, State, Action, Reward, Policy.
- The agent's goal is to learn a policy — a strategy for choosing actions — that maximises long-term cumulative reward.
- Famous applications: AlphaZero (self-play to master chess and Go), RLHF (reinforcement learning from human feedback, the fine-tuning stage behind ChatGPT), and robot locomotion.
- Core algorithms: Q-Learning, PPO, A3C.
Reinforcement learning is powerful but expensive — it requires simulating or interacting with an environment millions of times. For most practical neural network applications, supervised learning remains the dominant paradigm.
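The agent-environment loop can be sketched with tabular Q-Learning, the simplest of the algorithms listed above. The corridor environment, reward, and hyperparameters below are toy assumptions chosen so the whole thing fits in a few lines.

```python
# A toy Q-Learning sketch: an agent in a 5-cell corridor learns to walk
# right to reach a reward at the far end. States, actions, rewards, and
# hyperparameters are all illustrative assumptions.
import random

random.seed(0)
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                      # step left, step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for _ in range(500):                    # episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy choice: mostly exploit, occasionally explore.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max Q(s', ·).
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [max(range(2), key=lambda i: Q[s][i]) for s in range(N_STATES)]
print(policy)  # "right" (index 1) becomes greedy in every non-goal state
```

All six components from the list appear here: the agent (the loop), the environment (the corridor), state `s`, action `a`, reward `r`, and the policy read out of the Q-table at the end. No example was ever labelled "correct" — the behaviour emerges from the reward signal alone.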
A company has two years of customer purchase records but no category labels. They want to find natural buying patterns to personalise offers. Which learning paradigm fits?
Putting It Together
| | Supervised | Unsupervised | Reinforcement |
|---|---|---|---|
| Data needed | Labelled examples | Raw data, no labels | Environment + reward signal |
| Goal | Predict a known output | Find hidden structure | Maximise long-term reward |
| Crop yield example | Predict yield from farm records | Cluster farms by soil type | Agent optimises irrigation decisions |
| Deep learning fit | ✓ Primary use case | Partial (autoencoders, GANs) | ✓ Used in RLHF and robotics |
| Common methods | Neural nets, decision trees | K-means, PCA, autoencoders | Q-Learning, PPO, A3C |
Neural networks appear in all three columns, but supervised learning is where they are most widely used and where they deliver the clearest gains. In the next module, we put that into practice — connecting the neural network architecture directly to supervised learning and walking through five real-world applications that run on billions of devices today.
