How to use data analysis for machine learning (example, part 1)

In my last article, I stated that for practitioners (as opposed to theorists), the real prerequisite for machine learning is data analysis, not math. One of the main reasons for making this statement is that data scientists spend an inordinate amount of time on data analysis. The traditional statement is that data scientists “spend 80% … Read more

The real prerequisite for machine learning isn’t math, it’s data analysis

When beginners get started with machine learning, the inevitable question is “What are the prerequisites? What do I need to know to get started?” And once they start researching, beginners frequently find well-intentioned but disheartening advice, like the following: You need to master math. You need all of the following: – Calculus – Differential equations … Read more

What’s the difference between machine learning, statistics, and data mining?

Over the last few blog posts, I’ve discussed some of the basics of what machine learning is and why it’s important: – Why machine learning will reshape software engineering – What is the core task of machine learning – How to get started in machine learning in R Throughout those posts, I’ve been using the … Read more

How to make a small multiples chart in R

An important principle in analyzing data is “overview first, zoom and filter, then details on demand” (quote: Ben Shneiderman). In practice, this typically means starting at a high level with a single chart, and then “zooming into” the data by replicating that chart for specific subsets of the dataset. And, even more valuable is being … Read more

How to build an R line chart, step by step (and the importance of process)

Last week, I was talking to a guy who’s learning analytics, coaching him on what skills to learn next and helping him plan a career path. He’s a smart guy with an analytical background and minor coding experience, but he’s new to R.

Towards the end of the conversation, I asked him, “What’s the biggest challenge you have right now in learning analytics?”

His response? “The code is intimidating.”