
Regression


What Is Regression?

Regression is a type of supervised learning where the output y is a continuous numerical value — a real number rather than a category. The algorithm learns a function f(x) that maps inputs to a number anywhere on a scale.

The defining question for regression is: are we predicting a numerical quantity rather than a category? If yes, it's regression.

Regression predicts a number. The output y can be any value — 2.3 tonnes, 340,000 dollars, 21.7°C. There are no fixed categories, just a continuous range.

The Crop Yield Problem

Suppose a farmer wants to predict how many tonnes of wheat a field will produce before the harvest. Several factors are available as inputs.

Diagram: an animated day and night scene in which sunlight (hrs), rainfall (mm), temperature (°C), and soil quality (score out of 10) feed into the crop yield model.

The inputs — also called features — are the variables the algorithm uses to make its prediction:

Feature (x) | What it measures
Soil quality | Nutrient content, pH level
Rainfall | Total mm of rain in the growing season
Temperature | Average °C during crop growth
Fertilizer | Amount applied in kg/hectare

The output y is the crop yield in tonnes per hectare, which is a number. It could be 3.2, 5.7, or 8.1 depending on the conditions. This is why the problem is regression, not classification: we are not predicting a category like "good crop / bad crop"; we are predicting a numerical value on a continuous scale.
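
To make the inputs and output concrete, here is a minimal sketch in Python. The `Field` class, the weights, the bias, and the 4.0 t/ha threshold are all hypothetical values chosen for illustration, not fitted to real farm data, and the soil quality score out of 10 is an assumed scale. It shows a regression-style function returning a float next to a classification-style function returning one of two labels.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One field described by the four features from the table above."""
    soil_quality: float       # assumed score out of 10
    rainfall_mm: float        # total rain in the growing season
    temperature_c: float      # average temperature during crop growth
    fertilizer_kg_ha: float   # fertilizer applied per hectare

    def features(self) -> list[float]:
        return [self.soil_quality, self.rainfall_mm,
                self.temperature_c, self.fertilizer_kg_ha]


# Hypothetical weights and bias, made up for this sketch (not learned from data).
WEIGHTS = [0.30, 0.01, 0.05, 0.004]
BIAS = 0.5


def predict_yield(field: Field) -> float:
    """Regression-style output: a continuous value in tonnes per hectare."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, field.features()))


def label_crop(yield_t_ha: float, threshold: float = 4.0) -> str:
    """Classification-style output, for contrast: one label from a fixed set."""
    return "good crop" if yield_t_ha >= threshold else "bad crop"


field = Field(soil_quality=7.8, rainfall_mm=180.0,
              temperature_c=18.0, fertilizer_kg_ha=120.0)
y = predict_yield(field)
print(f"regression output: {y:.2f} t/ha")
print(f"classification output: {label_crop(y)}")
```

Changing any input slightly changes the regression output slightly, while the label only changes when the yield crosses the threshold.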

Diagram: a small network with an input layer of 4 features (x₁ soil quality, x₂ temperature, x₃ rainfall, x₄ fertilizer), a hidden layer of 5 neurons (fixed), and 1 output y (crop yield).
The crop yield model takes four input features and produces one continuous numerical output — tonnes per hectare.
Quick Check

Why is crop yield prediction a regression problem and not a classification problem?

Which Function Fits the Data?

We have the inputs and the correct outputs from historical farm records. The algorithm's job is to find a function f(x) that maps those inputs to the yield.
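
As a rough sketch of what "inputs and correct outputs" look like as data, the snippet below arranges a few records into inputs `X` and outputs `y` and checks one candidate function against them. The feature values and `candidate_f` are invented for illustration; only the yields 3.2, 5.7, and 8.1 echo the example numbers used earlier.

```python
# Made-up historical records:
# (soil quality, rainfall mm, temperature °C, fertilizer kg/ha) -> yield t/ha
records = [
    ((6.5, 120.0, 16.0, 90.0), 3.2),
    ((7.8, 180.0, 18.0, 120.0), 5.7),
    ((8.4, 230.0, 19.5, 150.0), 8.1),
]

X = [list(features) for features, _ in records]  # inputs, one row per field
y = [target for _, target in records]            # correct outputs


def candidate_f(x: list[float]) -> float:
    """One hypothetical candidate function; it only looks at rainfall."""
    _soil, rainfall, _temperature, _fertilizer = x
    return 0.04 * rainfall - 1.5


# Residual = prediction minus recorded yield; closer to zero is better.
residuals = [candidate_f(x_i) - y_i for x_i, y_i in zip(X, y)]
for x_i, y_i, r in zip(X, y, residuals):
    print(f"rainfall={x_i[1]:.0f} mm  recorded={y_i:.1f}  residual={r:+.2f}")
```

A learning algorithm automates this check: it searches for the function whose predictions stay closest to the recorded yields.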

But which kind of function? A straight line through the data? A curve? Something more complex?

Diagram: "Crop Yield vs Rainfall", a scatter plot with rainfall (mm, 100 to 250) on the x-axis and crop yield (t/ha, 2 to 6) on the y-axis, overlaid with two candidate fits labelled "Linear?" and "Curve?".
Scatter plot of rainfall vs crop yield. Multiple functions could fit this data — a straight line, a curve, or something else entirely. Which is best?

This is the central question of regression: choosing the right function family, fitting it to the data, and evaluating how well it generalises to new fields the model has never seen. We will answer this in detail later in the track — covering linear regression, polynomial regression, and how to measure which fit is actually best.
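
The three steps (choose a family, fit it, evaluate on unseen fields) can be sketched in plain Python as follows. This is an illustrative sketch only: the rainfall and yield numbers are made up, rainfall is rescaled to units of 100 mm to keep the arithmetic well behaved, and the helper names `fit_polynomial`, `predict`, and `mean_squared_error` are hypothetical. It fits a straight line and a quadratic curve by least squares and reports the error on fields held out from fitting.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the polynomial design matrix A.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def mean_squared_error(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data: rainfall in units of 100 mm, yield in t/ha.
train_rain = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]
train_yield = [2.1, 2.9, 3.6, 4.3, 4.8, 5.2, 5.4, 5.5]
# Held-out fields the model never sees during fitting.
test_rain = [1.1, 1.5, 1.9, 2.3]
test_yield = [2.5, 4.0, 5.0, 5.5]

results = {}
for name, degree in (("linear", 1), ("quadratic", 2)):
    coeffs = fit_polynomial(train_rain, train_yield, degree)
    results[name] = {
        "train_mse": mean_squared_error(coeffs, train_rain, train_yield),
        "test_mse": mean_squared_error(coeffs, test_rain, test_yield),
    }
    print(f"{name:9s} train MSE={results[name]['train_mse']:.4f} "
          f"held-out MSE={results[name]['test_mse']:.4f}")
```

Adding a term to the polynomial can only lower the training error or leave it unchanged, so the held-out error is the more informative of the two numbers when comparing families.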

Quick Check

A model is trained to predict house prices. The output is a dollar amount. Which type of supervised learning is this?


Up next

In the next module, we cover classification — the other type of supervised learning, where the output is a label from a fixed set of categories rather than a number.
