What Is Machine Learning?
machine learning
/məˈʃiːn ˈlɜːnɪŋ/
noun
"the process by which a computer teaches itself to do something, usually by finding patterns in data."
— Cambridge Dictionary
The key phrase in that definition is "teaches itself": no programmer writes the rules explicitly; the algorithm finds them in the data.
Where the Term Came From
The term machine learning was coined by Arthur Samuel in 1959 while he was a researcher at IBM. Samuel built a checkers-playing program and noticed something remarkable: the more games it played, the better it got — without him ever telling it how to improve. That observation gave him the idea for a new field of computer science.
His definition has stayed with us ever since: "Field of study that gives computers the ability to learn without being explicitly programmed." A more formal version came from Carnegie Mellon professor Tom Mitchell in his 1997 textbook: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Mitchell's version is useful because it turns the vague word "learn" into something measurable.
Both definitions say the same thing in different registers. Samuel's is intuitive: the machine improves on its own. Mitchell's is rigorous: improvement must be measurable on a specific task against a defined metric. For a spam filter, the task T is classifying emails, the performance measure P is the fraction classified correctly, and the experience E is a set of hand-labelled emails.
In machine learning, the developer provides data and the algorithm discovers the rules — unlike traditional programming where the developer writes the rules explicitly.
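To make that contrast concrete, here is a minimal Python sketch. The rule-based function and the toy emails are invented for illustration, and scikit-learn is assumed as the library since the article names no tools; treat this as a sketch, not a reference implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Traditional programming: the developer writes the rules by hand.
def is_spam_by_rules(email: str) -> bool:
    blocked_phrases = ["free money", "winner", "click here"]
    return any(phrase in email.lower() for phrase in blocked_phrases)

# Machine learning: the developer supplies labelled data,
# and the algorithm discovers the rules itself.
emails = [
    "Free money!!! Click here to claim your prize",   # spam
    "Meeting moved to 3pm, see agenda attached",      # not spam
    "You are a winner, claim your free gift now",     # spam
    "Quarterly report draft for your review",         # not spam
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()           # turn each email into word counts
X = vectorizer.fit_transform(emails)     # the data
model = MultinomialNB().fit(X, labels)   # the algorithm finds the rules

test = vectorizer.transform(["Claim your free prize now"])
print(model.predict(test))  # -> [1], i.e. spam
```

The point is the division of labour: in the first function the developer encodes the rules; in the second the developer supplies only examples and labels, and the model derives its own internal rules from them.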
The Three Learning Paradigms
Machine learning is commonly divided into three paradigms: supervised, unsupervised, and reinforcement learning. The next few modules in this track go deep on each paradigm. What follows here is a brief overview, just enough to understand what each one is before we dive in.
In supervised learning, the model is trained on labelled examples: input-output pairs where the correct answer is known. It learns to map inputs to outputs by minimising the error between its predictions and the true labels. Supervised problems come in two broad types (a toy sketch of both follows this list):
- Classification: predict a category, e.g. spam or not spam, cat or dog.
- Regression: predict a continuous value, e.g. a house price or a temperature forecast.
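Here is a small sketch of both types, with toy data invented for illustration (scikit-learn assumed again, as above):

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

# Classification: predict a category.
# Toy features: [weight_kg, ear_length_cm] -> "cat" or "dog".
X_cls = [[4.0, 7.5], [30.0, 12.0], [3.5, 8.0], [25.0, 11.0]]
y_cls = ["cat", "dog", "cat", "dog"]
clf = DecisionTreeClassifier().fit(X_cls, y_cls)
print(clf.predict([[5.0, 7.0]]))   # -> ['cat']

# Regression: predict a continuous value.
# Toy features: [floor_area_m2] -> price.
X_reg = [[50], [80], [120], [200]]
y_reg = [150_000, 240_000, 360_000, 600_000]
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[100]]))        # -> roughly 300000
```

Note the output types: the classifier returns a category label, while the regressor returns a number on a continuous scale. That difference is the whole distinction between the two.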
A spam filter is trained on 10,000 emails labelled 'spam' or 'not spam'. Which learning paradigm is this?
