Machine Learning is one of the most exciting recent technologies. We have probably used dozens of its applications without noticing. Machine Learning is a large part of the reason why Google, Facebook, YouTube, Amazon and many other sites work so well. They learn from our habits, examine the things we are interested in, and then use the results to improve our user experience. YouTube recommends videos that we may like, Amazon suggests deals that we may care about, Facebook displays things related to our friends, and so on.
One of the reasons why Machine Learning has become so popular is the growth of Artificial Intelligence (AI). AI has been integrated into, and is the future direction of, development at most technology companies. We want to build intelligent machines that can do all sorts of things, but we do not know how to program an AI explicitly. It turns out that the only known way to achieve this is to have the machine learn by itself. Furthermore, learning algorithms are being used today to study human learning and to understand the brain. Only then can we create a human-level AI to assist our work.
So what is Machine Learning? Two definitions are commonly cited. The first is:
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Tom M. Mitchell, Machine Learning, 1997
Another definition is:
Machine learning is programming computers to optimize a performance criterion using example data or past experience.
Ethem Alpaydın, Introduction to Machine Learning, Second Edition, MIT Press, 2010
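To make Mitchell's definition concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the task T is deciding whether a number is "large" (7 or more), the performance measure P is accuracy on a fixed test set, and the experience E is a hand-picked list of labelled examples. The helper names `learn_threshold` and `accuracy` are hypothetical, not part of any library.

```python
# Toy illustration of "performance P at task T improves with experience E".
# All data and function names here are made up for this example.

def learn_threshold(examples):
    """Put the decision threshold midway between the largest negative
    example and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if label == 0)
    smallest_positive = min(x for x, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_set):
    """Performance measure P: fraction of test examples classified correctly."""
    correct = sum(
        1 for x, label in test_set
        if (1 if x > threshold else 0) == label
    )
    return correct / len(test_set)

# Experience E: labelled examples, revealed to the learner a few at a time.
experience = [(0, 0), (9, 1), (3, 0), (8, 1), (6, 0), (7, 1)]

# Task T: label the integers 0..9, where the hidden rule is "x >= 7 is positive".
test_set = [(x, 1 if x >= 7 else 0) for x in range(10)]

# Measure P after the learner has seen 2, 4 and 6 examples.
scores = [accuracy(learn_threshold(experience[:n]), test_set) for n in (2, 4, 6)]
print(scores)
```

With this particular data the accuracy rises each time more examples are seen, which is what the definition calls learning. The examples were chosen to show that effect; real data will not always improve so neatly.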
There are many kinds of Machine Learning, but in general most machine learning problems fall into one of two broad categories: “Supervised Learning” and “Unsupervised Learning”. In Supervised Learning, the machine is given examples that already include the “right answers”, and it learns from them to produce right answers for new inputs. Supervised Learning problems are further divided into two subgroups: “Regression” and “Classification”. The difference between them is the type of output: Regression produces continuous output values, while Classification produces discrete ones. For instance:
(a) Regression – Given the sizes of houses, we have to predict their prices based on a given data set.
(b) Classification – Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
Figure 1. Supervised Learning – Classification vs. Regression
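The two examples above can be sketched in a few lines of Python. This is only a toy sketch under stated assumptions: the house and tumor numbers are invented, the regression uses an ordinary least-squares line, and the classifier is a simple nearest-average rule chosen for brevity. The function names `fit_line`, `fit_centroids` and `classify` are hypothetical.

```python
# Toy supervised learning: the "right answers" (prices, labels) are given.
# All numbers below are invented for illustration.

# --- Regression: continuous output (a price) ---
house_sizes = [50.0, 80.0, 100.0, 120.0, 150.0]      # square metres
house_prices = [150.0, 240.0, 300.0, 360.0, 450.0]   # thousands

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(house_sizes, house_prices)
predicted_price = slope * 110.0 + intercept  # price estimate for a 110 m² house

# --- Classification: discrete output (a label) ---
tumor_sizes = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]         # centimetres
tumor_labels = ["benign", "benign", "benign",
                "malignant", "malignant", "malignant"]

def fit_centroids(xs, labels):
    """Compute the average tumor size for each label."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(values) / len(values) for label, values in groups.items()}

def classify(x, centroids):
    """Return the label whose average size is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

centroids = fit_centroids(tumor_sizes, tumor_labels)
print(predicted_price, classify(1.8, centroids), classify(4.2, centroids))
```

The regression part returns a number that can take any value, while the classification part can only return one of the labels it has seen. That is the distinction between the two subgroups.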
In contrast, in Unsupervised Learning the machine is given only a data set, with no right answers, and it has to find the structure or patterns in the data by itself. Clustering is a representative technique of this kind of Machine Learning. For example, take a collection of 1,000,000 different genes and find a way to automatically group them into clusters that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
Figure 2. Supervised Learning (Classification) vs. Unsupervised Learning (Clustering)
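As a rough sketch of what clustering looks like in code, the snippet below runs a basic k-means loop on six invented two-dimensional points. The points stand in for items described by two numeric variables; no labels are given. Using k-means, starting from the first two points as initial centers, and the names `squared_distance` and `k_means` are all assumptions made for this illustration.

```python
# Toy unsupervised learning: only the data is given, no "right answers".
# The points and the choice of k-means are assumptions for illustration.

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to their nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # leave an empty cluster's centroid where it is
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

# Start from the first two points as centers (a simple, deterministic choice).
assignments, centroids = k_means(points, points[:2])
print(assignments)
```

On this data the algorithm separates the three points near (1, 1) from the three near (8.5, 9), even though it was never told which point belongs to which group.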
There are other approaches as well, such as Reinforcement Learning and Deep Learning, but we may talk about them later.