Machine Learning Landscape 101: Machine Learning Fundamentals & Supervised Learning

This post, “Machine Learning Landscape 101”, will answer the following questions:

  • Where does Machine Learning start and where does it end?
  • What exactly does it mean for a machine to learn something?
  • What are the types of Machine Learning?
  • What is supervised learning?

The next post, “Machine Learning Landscape 102”, will answer the following questions:

  • What is unsupervised learning?
  • What is semi-supervised learning? 
  • What is reinforcement learning?

Together, these two posts offer a solid foundation for anyone looking to gain a deeper understanding of machine learning and its various applications.

Let’s start.

Machine Learning

A widely cited, engineering-oriented definition:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Tom Mitchell, 1997

For example:

Your spam filter is a Machine Learning program that can learn to flag spam given examples of spam emails (e.g., flagged by users) and examples of regular (non-spam, also called “ham”) emails. The examples that the system uses to learn are called the training set.

In this case:

  • The task T is to flag spam in new emails.
  • The experience E is the training data.
  • The performance measure P needs to be defined. For example, you can use the ratio of correctly classified emails. This particular performance measure is called accuracy, and it is often used in classification tasks.
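As a minimal sketch of the accuracy measure described above, the snippet below counts how many predictions match the true labels and divides by the total. The function name `accuracy` and the five example labels are made up for illustration; they do not come from a real spam filter.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical data: what five emails really are vs. what the filter said.
actual = ["spam", "ham", "ham", "spam", "ham"]
flagged = ["spam", "ham", "spam", "spam", "ham"]

print(accuracy(actual, flagged))  # 4 of 5 correct -> 0.8
```

If the filter is learning in Mitchell's sense, this number should go up as the training set grows.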

If you just download a copy of Wikipedia, your computer has a lot more data, but it is not suddenly better at any task. Thus, simply acquiring data is not Machine Learning.

Another example: classifying images of dogs and cats.

  • E = Watching you classify images as dogs or cats.
  • T = Classifying images as dogs or cats.
  • P = The % of images correctly classified.

Types of Machine Learning Systems

Machine Learning systems can be classified according to the amount and type of supervision they get during training.

There are four major categories: 

  • Supervised learning
  • Unsupervised learning 
  • Semi-supervised learning 
  • Reinforcement learning

Supervised Learning

In supervised learning, the training data you feed to the algorithm includes the desired solutions, called labels.

“In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.”

Andrew Ng

Consider the task of facial recognition:

  • The “teacher” shows the learning system a set of faces (the teacher already knows the name associated with each face).
  • The learning system makes its guesses, and the teacher provides the answers.
  • The learning system then compares its guesses to the known “correct” answers and makes adjustments according to its errors.
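The guess, compare, adjust loop above can be sketched with a perceptron, one of the simplest supervised learners. This is an illustration only, not how real facial recognition systems work: each "face" is reduced to two made-up numeric features, label 1 stands for "is the person we are looking for" and 0 for "is not", and the names `predict` and `train_perceptron` are hypothetical.

```python
def predict(weights, bias, features):
    """The system's guess: 1 if the weighted sum is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, labels, learning_rate=1.0, max_epochs=1000):
    """Repeat guess -> compare with the teacher's label -> adjust."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in zip(examples, labels):
            guess = predict(weights, bias, features)  # the system guesses
            error = label - guess                     # compare with the answer
            if error != 0:                            # adjust only on a mistake
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, bias


# Made-up, linearly separable training data (two features per "face").
faces = [(0, 0), (2, 3), (1, 0), (3, 3), (0, 1), (3, 4), (1, 1), (4, 2)]
labels = [0, 1, 0, 1, 0, 1, 0, 1]

weights, bias = train_perceptron(faces, labels)
predictions = [predict(weights, bias, face) for face in faces]
print(predictions == labels)  # True
```

The labels play the role of the teacher: without them, the system would have nothing to compare its guesses against.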

Classification and Regression

Classification and regression both involve making a “prediction” based on input data. Classification refers to predicting an output with a discrete set of possibilities like a set of categories or labels. For example: “Given an input image, is it a dog or cat?”
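To make the discrete output concrete, here is a small sketch of a classifier that uses the nearest-neighbour rule: a new animal gets the label of the most similar training example. Real image classifiers work on pixels; as a simplifying assumption, each animal here is described by two made-up measurements (weight in kg, ear length in cm), and the function name is hypothetical.

```python
import math


def nearest_neighbor_label(training_points, training_labels, query):
    """Return the label of the training point closest to the query."""
    distances = [math.dist(point, query) for point in training_points]
    closest = distances.index(min(distances))
    return training_labels[closest]


# Made-up training data: (weight in kg, ear length in cm).
animals = [(4.0, 6.5), (5.0, 7.0), (20.0, 12.0), (30.0, 14.0)]
species = ["cat", "cat", "dog", "dog"]

print(nearest_neighbor_label(animals, species, (4.5, 6.0)))   # cat
print(nearest_neighbor_label(animals, species, (28.0, 13.0)))  # dog
```

Whatever the input, the output is always one of the known categories, which is what makes this a classification task.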

Regression refers to predicting a “continuous” output (a fancy way of saying “a number”). For example: “Given the number of bedrooms, what is the price of a house?” or “Given an input image of a cat, how much does the cat weigh?”
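The house-price question can be sketched with a straight-line fit, using ordinary least squares for a single input. The bedroom counts and prices below are invented so that the arithmetic is easy to follow, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: number of bedrooms -> price in thousands.
bedrooms = [1, 2, 3, 4]
prices = [100.0, 150.0, 200.0, 250.0]

slope, intercept = fit_line(bedrooms, prices)
predicted_price = slope * 5 + intercept  # price for an unseen 5-bedroom house

print(slope, intercept, predicted_price)  # 50.0 50.0 300.0
```

Unlike the classifier, the prediction here can be any number, including values that never appeared in the training data.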

As a starting point, we divide the data into two sets: the training set and the test set.

Training Dataset: The sample of data used to fit/train the model.

Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
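A minimal sketch of such a split, using only Python's standard library, is shown below. The 80/20 ratio, the fixed seed, and the function name `split_train_test` are assumptions made for illustration; the right ratio depends on how much data you have.

```python
import random


def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples, then hold out a fraction of them as the test set."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)               # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


data = list(range(10))  # stand-in for ten labeled examples
train_set, test_set = split_train_test(data)

print(len(train_set), len(test_set))  # 8 2
```

The model is fit only on `train_set`; `test_set` is kept aside so that the evaluation reflects examples the model has never seen.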

There are other methods of splitting the dataset as well; we will explore them later.

Conclusion

This post, “Machine Learning Landscape 101”, delved into the starting and ending points of machine learning, the meaning of machine learning, and the types of machine learning, specifically focusing on supervised learning.

The next post, “Machine Learning Landscape 102”, covers the remaining types of machine learning, including unsupervised learning, semi-supervised learning, and reinforcement learning, providing a comprehensive overview of the machine learning landscape.

