A Quick Introduction to Machine Learning

In this article, I’ll give you a quick introduction to machine learning.

Of course, if you’re really new to data science generally, and machine learning in particular, you’ll probably want to read the whole article. You’ll get a much better overview if you read the whole thing, start to finish.

Ok … let’s get to it.

So what is machine learning, anyway?

WTF Is Machine Learning?

If you’ve done any reading about data science in the last few years, you’ve probably heard the term “machine learning.” Machine learning has become very popular in the tech community generally and in data science specifically.

You’ve probably already heard that machine learning is a form of artificial intelligence. This is true.

But machine learning is a particular type of artificial intelligence.

As opposed to older forms of AI, like rule-based systems where a programmer hard-coded IF/THEN statements to instruct a computer how to behave, machine learning takes a more data-driven approach.
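To make that contrast concrete, here is a minimal sketch in Python. The first function is a hand-written rule where a person picks the threshold. The second picks the threshold from labeled examples instead. The feature (number of exclamation marks), the data, and the function names are all made up for illustration.

```python
# Rule-based approach: a programmer hard-codes the threshold.
def rule_based_is_spam(num_exclamation_marks):
    return num_exclamation_marks > 3  # threshold chosen by a human


# Data-driven approach: choose the threshold that makes the fewest
# mistakes on labeled examples.
def learn_threshold(examples):
    """examples: list of (num_exclamation_marks, is_spam) pairs."""
    candidates = sorted({count for count, _ in examples})
    best_threshold, best_errors = None, len(examples) + 1
    for threshold in candidates:
        # Count examples where the rule "count > threshold" disagrees with the label.
        errors = sum((count > threshold) != is_spam for count, is_spam in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical labeled data: (exclamation marks in the message, is it spam?)
training_examples = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]
learned_threshold = learn_threshold(training_examples)


def learned_is_spam(num_exclamation_marks):
    return num_exclamation_marks > learned_threshold
```

In the second version, nobody typed the threshold in by hand. It came from the data, and it would change if the data changed.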

Machine Learning: Training Computers to Learn From Data

Machine learning is a set of techniques that enable computers to learn from data.

As I suggested previously, most people consider machine learning to be a sub-discipline of AI. But given the data-driven approach, machine learning also has deep roots in statistics. As such, machine learning sits at the intersection of data science, artificial intelligence, statistics, and computer science.

A Venn diagram showing how machine learning sits at the intersection of statistics, data science, artificial intelligence, and computer science

Having said that, saying that machine learning enables computers to “learn from data” might seem a little abstract. It might be easier to understand what machine learning is by looking at how it’s used today.

Examples of How Machine Learning is Used Today

You can start to understand what machine learning is by looking at where and how it’s used. It’s actually being used increasingly in a wide range of technology products.

Here are a few examples of machine learning in our everyday lives:

  • recommendation systems (on sites like Amazon and Netflix)
  • self-driving cars
  • spam classification

Let’s take a quick look at these.

Example: Recommendation Systems

Have you ever considered purchasing something on Amazon, and noticed a small section on the page titled “Books you may like” or “Other products you may like”?

An image that shows a "books you might like" recommendation system like you might see on Amazon.

You might have seen a similar section on the Netflix home-screen titled “Because you watched …”. So if you watched The Avengers, this section of Netflix recommends other movies similar to or related to The Avengers.

Broadly, these types of systems are called “recommendation systems,” and they are typically built with various forms of machine learning tools.

At a high level, these systems “learn” from your past purchases and use history. Amazon has data on your past purchases, and they use that data to build a machine learning system which “learns” what you like. Once it knows what you like, the ML system can make suggestions (i.e., predictions) about what other items you might like.
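Here is a very rough sketch of that idea in Python. It is not how Amazon or Netflix actually build their systems. It's a toy example that suggests items bought by people whose purchases overlap with yours. The users, items, and the `recommend` function are all invented for illustration.

```python
from collections import Counter

# Hypothetical purchase histories (invented for illustration).
purchase_history = {
    "alice": {"dune", "neuromancer", "foundation"},
    "bob": {"dune", "foundation", "hyperion"},
    "carol": {"dune", "hyperion"},
    "dave": {"cookbook", "gardening"},
}


def recommend(user, history, top_n=2):
    """Suggest items bought by users whose purchases overlap with `user`'s."""
    owned = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(owned & items)  # shared purchases act as a similarity score
        if overlap == 0:
            continue
        # Items the similar user bought that `user` does not own yet.
        for item in items - owned:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


suggestions = recommend("carol", purchase_history)
```

Carol shares two purchases with Bob and one with Alice, so the item that both of them bought ranks first in her suggestions.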

Example: Self-Driving Cars

Another cool application of machine learning is self-driving cars.

Even a decade ago, it would have seemed impossibly futuristic to have cars that could drive themselves, even a little bit.

But today, Teslas are sold with an “autopilot” feature that enables the car “to steer, accelerate and brake” automatically under certain conditions.

This self-driving feature still has limited capabilities at this stage, but it does work in some circumstances.

So, how did they use machine learning to enable this?

Teslas and similar cars are equipped with sensors, radar systems, and cameras. Those cameras and sensors produce a stream of data, which Tesla’s engineers have fed into a machine learning system (specifically, a “deep neural network” system). That machine learning system has “learned” about different road features like cars, road signs, road markings, etc., and learned about appropriate responses to different road features.

Data feeds into the system, and the system has learned to evaluate and respond to different driving events.

Example: Spam Classification

Perhaps the most canonical example of how machine learning is used in everyday life is the spam filter for your email.

Almost all modern email clients, such as Google’s Gmail, have a spam filter.

The spam filter automatically evaluates incoming email messages and attempts to categorize any “junk” mail as “spam,” after which it’s sent to the spam folder so you don’t have to see it.

This too is built with machine learning.

The contents of an email – things like words, grammar, titles, senders – can all be considered forms of data. Email companies have used historical email data to “train” machine learning systems. These systems have “learned” to categorize and identify “spam” messages based on the email contents.

Now, after training these spam classification systems, email services can use them to classify new incoming messages so your inbox stays relatively free of junk email.
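To give you a feel for the “train, then classify” idea, here is a minimal sketch of a spam classifier in plain Python. It uses a technique called naive Bayes, which is my choice for the example and not necessarily what any real email service uses. The four training messages are made up, and a real system would train on far more data.

```python
import math
from collections import Counter

# Tiny made-up training set: (message text, label).
training_messages = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(messages):
    """Count word frequencies per label (this is the "learning" step)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the higher naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_messages)
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(training_messages)
print(classify("claim your free money", model))
print(classify("agenda for the team lunch", model))
```

The `train` step only looks at historical messages. The `classify` step is what runs on each new incoming message.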

What These Systems Have in Common

What you’ll notice about all of these examples – recommendation systems, self-driving cars, and spam filters – is that there is a data stream.

The data stream is used to “train” the machine learning system. The machine learning system “learns” from the data. And then the system produces some output like a prediction, classification, or recommendation.

Although they all use different machine learning techniques, they all “learn” from some data stream.

Essentially, if a software system today appears to perform some type of prediction or classification, there’s a good chance that it’s using some machine learning.

Now that we’ve looked at a few examples of how machine learning is used today, let’s look at how it works at a high level.

How Machine Learning Works

There are a variety of different types of machine learning tools which have different strengths and different applications.

But although there are differences from one system to another, there are some commonalities with regard to how these machine learning systems are created.

That being the case, we can examine how the machine learning model building process works at a high level. This will give you a rough overview of what actually happens when we use machine learning on a dataset.

Essentially, when we build a machine learning model, we have a dataset called a training dataset.

We then use an algorithm to extract some knowledge from that dataset. The machine learning algorithm “learns” from the training data.

Once this process is complete, we can deploy the model as a system that will accept new data, and will produce some output when it sees the new data. Speaking very generally, this output is frequently a prediction or classification of some type.

An image that shows the high-level process of building and using a machine learning model.
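The train-then-predict flow described above can be sketched in a few lines of Python. This example uses simple linear regression fitted by least squares, which is just one possible algorithm. The house sizes and prices are made-up numbers, and `fit_line` and `predict` are names I chose for the sketch.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from training data via least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training dataset: house size (square meters) and price (thousands).
train_sizes = [50, 70, 90, 110]
train_prices = [150, 210, 270, 330]

# Step 1: the algorithm "learns" from the training data.
model = fit_line(train_sizes, train_prices)

# Step 2: the trained model produces an output for new data.
new_prediction = predict(model, 100)
print(new_prediction)
```

Notice the two phases: `fit_line` sees only the training dataset, and `predict` is what you would deploy to handle new inputs.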

But the details about how we build, select, and deploy a model are, of course, a little more involved.

Let’s quickly take a closer look at the process of how machine learning models are built.

The Model Building Process

For the most part, the model building process is a step-by-step process that follows a similar general path for each project.

Of course, there are always little differences from one project to another, but at a high level, there are a few typical steps when you build a machine learning system.

This image shows the model building process. It simplifies things quite a bit, but it gives you a rough idea of what happens when you build a machine learning system.

An image that shows the high-level process for building a machine learning model.

Clarify Objectives

Typically, you start by clarifying the objectives of the machine learning system. In a business setting, this frequently involves talking to team members and business partners to generate system requirements and outcomes.

Get and Clean Data

Then you get the data and clean the data. If you have any experience with data science already, you’ve probably heard that “data wrangling is 80% of the job” in data science. It’s a little more complicated than that, but it’s true that data preparation and exploration is a big part of the task when you create a new machine learning model.
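As a small illustration of what “cleaning” can mean, here is a sketch in plain Python. It assumes a hypothetical set of raw records with the kinds of problems you often run into: missing values, unparseable values, stray whitespace, and duplicates. Real projects usually use a library for this, and the right cleaning rules depend on the project.

```python
# Hypothetical raw records, as they might arrive from a CSV export.
raw_records = [
    {"age": "34", "income": "52000"},
    {"age": "", "income": "61000"},      # missing age
    {"age": "29", "income": "n/a"},      # unparseable income
    {"age": " 41 ", "income": "48000"},  # stray whitespace
    {"age": "34", "income": "52000"},    # exact duplicate of the first row
]


def clean(records):
    """Drop rows with missing or invalid values, convert types, remove duplicates."""
    cleaned, seen = [], set()
    for record in records:
        try:
            row = (int(record["age"].strip()), int(record["income"].strip()))
        except ValueError:
            continue  # skip rows that cannot be parsed as integers
        if row in seen:
            continue  # skip rows we have already kept
        seen.add(row)
        cleaned.append({"age": row[0], "income": row[1]})
    return cleaned


clean_records = clean(raw_records)
print(clean_records)
```

Here, dropping bad rows is the simplest option. Depending on the project, you might instead fill in missing values or go back and get better data.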

Build Models

Next you build several models. Notice that I said models, plural. In almost all cases, you’ll need to build several different models. Sometimes this means building models that are very similar but with slight differences that change or modify the performance. But this can also mean using multiple different machine learning techniques to build different types of machine learning systems. For example, for a project, you might build a decision tree model, a support vector machine, and a logistic regression model, just to see how each different model type performs.

Evaluate Models and Select

And finally, once you’ve built several models, you need to evaluate them and select the “best” model relative to your project requirements. Once you select one of the models, you finalize it and then deploy the model.
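Here is a minimal sketch of the “build several models, then evaluate and select” steps in plain Python. The three candidate models (a mean baseline, a nearest-neighbor lookup, and a least-squares line) and the tiny dataset are assumptions I picked for illustration. The error metric, mean absolute error on held-out data, is one common choice among many.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the average training value."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_nearest_neighbor(xs, ys):
    """Predict the value of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(xs, ys):
    """Least-squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_absolute_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data, split into a training set and a held-out validation set.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
valid_x = [2.5, 4.5]
valid_y = [5.0, 9.0]

candidates = {
    "mean baseline": fit_mean,
    "nearest neighbor": fit_nearest_neighbor,
    "linear regression": fit_line,
}

# Train every candidate on the same data, then score each on the held-out data.
scores = {
    name: mean_absolute_error(fit(train_x, train_y), valid_x, valid_y)
    for name, fit in candidates.items()
}
best_name = min(scores, key=scores.get)
print(best_name)
```

The key design choice is that the models are scored on data they did not see during training, so the comparison reflects how they handle new inputs.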

Notice also the backwards arrows that sometimes move backwards to a previous step in the process. This is important. Building a machine learning model is highly iterative, and sometimes you need to go backwards and re-do a previous step of the process.

For example, you might start building your models, and realize that you need more data or different data. In that case, you’d need to go backwards, get new data, clean it, and then start moving through the steps of the model building process again.

Obviously, the process is a lot more complicated once you actually start doing the work. There are a lot of fine-grained details that I’m passing over for the sake of simplicity, clarity, and brevity.

But this should give you a rough idea of how the model building process works in machine learning.

Different Machine Learning Techniques

As I mentioned previously, there are many different techniques for doing machine learning.

And when we build a machine learning system, we often try many different techniques, and then evaluate the different models and compare them against each other.

The reason that we do this is that there are many different algorithms that can operate on datasets in order to produce useful outputs.

These different algorithms – these different machine learning techniques – have strengths and weaknesses.

Some work best when you have small datasets.

Others work comparatively better when you have large datasets.

Some work best on highly structured data …

Others work relatively better when you’re inputting unstructured data.

And so on …

There are dozens (even hundreds) of different machine learning techniques, depending on how you want to categorize them.

For example, a few of the common machine learning techniques are:

  • linear regression
  • logistic regression
  • decision trees
  • random forests
  • boosted trees
  • support vector machines
  • deep neural networks

These are a few broad classes of machine learning techniques, and there are many variants of almost all of these tools.

Whenever you build a machine learning system, you’ll literally have dozens of possible techniques to use, and many ways you can customize or optimize each technique.

Moreover, choosing the right technique for a particular problem is both an art and a science. There are some general rules of thumb for choosing the right technique, but there’s also a bit of “art” involved, in the sense that it takes experience and intuition gained over months and years of practice.

There’s More to Learn

This article should have given you a quick overview of what machine learning is and how it works.

But there’s still more to learn.

We still need to cover topics like:

  • regression vs classification
  • supervised vs unsupervised learning
  • the bias/variance problem
  • the problem of overfitting

… and quite a bit more.

Leave your questions in the comments section

If you have specific questions about machine learning that I haven’t addressed in this article, please leave your question in the comments section at the bottom of the page.

Sign Up for FREE Machine Learning Tutorials

If you’re interested in learning more about machine learning, then sign up for our email list. Through this year and into the foreseeable future, we’ll be posting detailed tutorials about different parts of the machine learning workflow.

We’ll be addressing some of the other topics I just mentioned. We also plan to publish detailed tutorials about the different machine learning techniques, like linear regression, logistic regression, decision trees, neural networks, and more.

So if you want to learn and master machine learning, then sign up for our email list. When you sign up, we’ll send our new tutorials directly to your inbox as soon as they’re published.

Joshua Ebner

Joshua Ebner is the founder, CEO, and Chief Data Scientist of Sharp Sight. Prior to founding the company, Josh worked as a Data Scientist at Apple. He has a degree in Physics from Cornell University. For more daily data science advice, follow Josh on LinkedIn.
