An introduction to machine learning
By Corbin Child · 3 min read
At its core, machine learning is an algorithm (a set of instructions) for updating parameters.
For example, take the following function:
def housePriceModel(x0, x1, squareMetres):
    # Guessed price: a base amount x0 plus x1 for every square metre.
    return x0 + squareMetres * x1
The purpose of this function is to guess the price of a house from its area (in square metres), and to do so as accurately as possible. It takes x0 and x1 as parameters, but we do not know in advance what their values should be. This is where machine learning comes to the rescue. Without machine learning, we would pick x0 and x1 by trial and error: choose arbitrary values, then check how well they work on a set of data. The goal is to tweak x0 and x1 so that the margin of error is as small as possible (we will get into how this is calculated later).
Most people will begin by adjusting each parameter in turn and observing how much the error changes. If the error goes down, we keep changing the same parameter in the same direction; if it goes up, we change it in the other direction. This process is not machine learning, but it is very close. Instead of you updating the parameters by intuition, machine learning has the parameters adjust themselves according to mathematical principles.
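The manual approach described above might look something like the sketch below. The house data is invented for illustration, and averageMiss is a hypothetical helper that uses the average absolute difference as a simple stand-in for the margin of error.

```python
def housePriceModel(x0, x1, squareMetres):
    return x0 + squareMetres * x1

# Hypothetical (area in square metres, real price) pairs, made up for this sketch.
houses = [(50, 150_000), (80, 240_000), (120, 360_000)]

def averageMiss(x0, x1):
    """Average absolute difference between guessed and real prices."""
    misses = [abs(housePriceModel(x0, x1, area) - price) for area, price in houses]
    return sum(misses) / len(misses)

# Try a few hand-picked values of x1 and see which one misses by the least.
for x0, x1 in [(0, 1_000), (0, 2_000), (0, 3_000), (0, 4_000)]:
    print(x0, x1, averageMiss(x0, x1))
```

On this made-up data the miss shrinks as x1 rises towards 3,000 and grows again past it, which is exactly the signal a person would use to decide which way to tweak next.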
These mathematical principles often trip people up and can be a barrier to entry for those studying machine learning. Let's go over some of them to gain further insight into how machine learning works.
The machine learning model
A machine learning model is an algorithm that has been trained to recognise certain types of patterns. In the example we just looked at, the model is housePriceModel. This function has parameters that are learned; in our case, they are x0 and x1.
The cost function
A cost function measures how well our machine learning model is performing on a set of examples. It takes a model as input and outputs a number: if the model is doing well on the examples, the cost is low; if it is doing poorly, the cost is high. The cost function is also known as the "loss" or "error" function. In our example, the cost for a single house is Cost(x, h) = (h(x) - RealPrice)^2, where h is our model, x is the area of the house in square metres, and RealPrice is its actual price.
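As a rough sketch, the squared-error cost could be written in Python as follows. The article only gives the cost for one house; averaging the squared errors over all examples (the mean squared error) is an assumption made here, and the example data is invented.

```python
def housePriceModel(x0, x1, squareMetres):
    return x0 + squareMetres * x1

def cost(x0, x1, examples):
    """Mean of the squared errors over (squareMetres, realPrice) pairs."""
    total = 0.0
    for squareMetres, realPrice in examples:
        error = housePriceModel(x0, x1, squareMetres) - realPrice
        total += error ** 2
    return total / len(examples)

# Hypothetical data, made up for this sketch.
examples = [(50, 150_000), (80, 240_000), (120, 360_000)]

print(cost(0, 3_000, examples))  # 0.0, since these parameters fit the made-up data exactly
print(cost(0, 2_000, examples))  # a much larger cost for a worse guess
```

Squaring the error means that guessing too high and guessing too low are both penalised, and large misses count for far more than small ones.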
How does the model learn?
The model learns by calculating the gradient of the cost function. In basic terms, the gradient tells us how the cost changes when each parameter (x0 and x1) changes by a small amount. A large gradient for a parameter means that changing it affects the cost a lot; a small gradient means it affects the cost only a little. This is similar to how you might have changed x0 and x1 by hand to observe the impact on the result. Gradient analysis is the same idea, expressed in mathematical terms (namely calculus).
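For the squared-error cost, the gradient can be worked out with a little calculus: the derivative of (h(x) - RealPrice)^2 is 2 * error with respect to x0, and 2 * error * squareMetres with respect to x1. The sketch below assumes the cost is averaged over the examples; the gradient function and the data are illustrative, not part of the original article.

```python
def housePriceModel(x0, x1, squareMetres):
    return x0 + squareMetres * x1

def gradient(x0, x1, examples):
    """Gradient of the mean squared error with respect to x0 and x1."""
    gradX0 = 0.0
    gradX1 = 0.0
    for squareMetres, realPrice in examples:
        error = housePriceModel(x0, x1, squareMetres) - realPrice
        gradX0 += 2 * error                 # effect of nudging x0
        gradX1 += 2 * error * squareMetres  # effect of nudging x1
    n = len(examples)
    return gradX0 / n, gradX1 / n

# Hypothetical data, made up for this sketch.
examples = [(50, 150_000), (80, 240_000), (120, 360_000)]

print(gradient(0, 2_000, examples))  # both values negative: raising either parameter lowers the cost
print(gradient(0, 3_000, examples))  # (0.0, 0.0): nothing left to improve on this data
```

The sign of each value says which direction to move the parameter, and its size says how strongly the cost responds.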
Now that you understand the underlying principles of machine learning, you may have noticed that the machine is simply repeating a series of steps (and doing so extremely quickly). These steps are:
1. Calculate the cost of your model.
2. Is it good? If it is, we can stop (as we have a good model).
3. Not good? Then calculate the gradient and make a small change to our model based on it.
4. Go back to step one and repeat the process.
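The steps above can be sketched as a short training loop, a technique usually called gradient descent. Everything beyond housePriceModel is an assumption made for illustration: the train function, the learningRate, the goodEnough threshold, the maxSteps cap, and the made-up data.

```python
def housePriceModel(x0, x1, squareMetres):
    return x0 + squareMetres * x1

def cost(x0, x1, examples):
    """Mean squared error over (squareMetres, realPrice) pairs."""
    return sum((housePriceModel(x0, x1, a) - p) ** 2 for a, p in examples) / len(examples)

def gradient(x0, x1, examples):
    """Gradient of the mean squared error with respect to x0 and x1."""
    n = len(examples)
    errors = [(housePriceModel(x0, x1, a) - p, a) for a, p in examples]
    gradX0 = sum(2 * e for e, _ in errors) / n
    gradX1 = sum(2 * e * a for e, a in errors) / n
    return gradX0, gradX1

def train(examples, learningRate=0.0001, goodEnough=1_000.0, maxSteps=10_000):
    x0, x1 = 0.0, 0.0  # arbitrary starting guess
    for step in range(maxSteps):
        # Steps 1 and 2: calculate the cost and stop if it is good enough.
        if cost(x0, x1, examples) < goodEnough:
            return x0, x1, step
        # Step 3: make a small change to each parameter, against its gradient.
        gradX0, gradX1 = gradient(x0, x1, examples)
        x0 -= learningRate * gradX0
        x1 -= learningRate * gradX1
        # Step 4: the loop goes back to step one.
    return x0, x1, maxSteps

# Hypothetical data, made up for this sketch.
examples = [(50, 150_000), (80, 240_000), (120, 360_000)]

x0, x1, steps = train(examples)
print(x0, x1, steps)
```

The learning rate controls how small each "small change" is. It is kept tiny here because the areas are large numbers; with a noticeably larger value the updates overshoot and the cost grows instead of shrinking.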
Closing thoughts
There is a lot we have glossed over, but hopefully this article provides you with some basic insights into how machine learning works and demystifies some of the magic.