A few weeks ago, an acquaintance told me that he was interested in getting started with machine learning.
He’s a web developer who primarily works in Ruby and Python, but also has a small amount of experience with R. Day-to-day, his work is run-of-the-mill web development, and he’s confessed to me that he’s a bit bored and looking for something new and exciting.
When he told me he wanted to get into machine learning, we started talking. The conversation went like this:
Sharp Sight: “Do you know data visualization? Do you know data wrangling techniques for R?”
Acquaintance: “I don’t want to do data visualization.”
Sharp Sight: “You don’t understand. Data visualization is a prerequisite for machine learning. You need to learn how to dive into a dataset and analyze it before you can make machine learning algorithms work. You need to be able to analyze data first.”
We proceeded to talk for about 10 minutes. He asked a few questions, and I gave him solid advice. It was the same advice I’ve given you here at Sharp Sight: learn basic plots, master basic syntax. Focus on foundations. Be really systematic about learning data visualization and data wrangling, and your progress with ML will be much, much faster.
Acquaintance: “Well, I’ll just figure it out.”
“Jump in and figure it out” is a losing strategy
You have to know this guy. He’s young. He’s cocky. He doesn’t know what he doesn’t know yet.
“I’ll just figure it out” is code for, “I don’t want to do all that stuff you recommended, so I’m just going to jump in, without the prerequisites, and see how far I get.”
His plan is to have no plan, and to overcome the challenges in front of him with feigned confidence and a bold attitude.
You need to understand: this is a losing strategy.
I’m fairly certain that if I grill him in 6 months, he won’t know many (if any) of the core ML concepts that he needs to understand. I’m confident that if I gave him a dataset and asked him to write me some code (from scratch and from memory) to implement a logistic regression, he wouldn’t be able to do it.
It’s not because he’s unintelligent (he’s a fairly smart guy). It’s because his approach is wrong and likely to lead him to failure.
To be clear, there are absolutely people who will be able to “figure it out.” People who use the jump-in-and-figure-it-out strategy sometimes get to their goal. But the odds aren’t good, and it’s terribly inefficient (you’ll work harder, for fewer gains).
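To give you a sense of what that logistic regression exercise involves, here is a minimal sketch in plain Python. It is only one way to do it, not a reference implementation. It assumes batch gradient descent on the log loss, and the function names (sigmoid, fit_logistic, predict), the learning rate, the epoch count, and the tiny dataset are all made up for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    # Classify as 1 when the predicted probability is at least 0.5.
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]

# Toy data, invented for illustration: one already-centered feature,
# with label 1 for positive values and 0 for negative values.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(X, y)
print(predict(X, w, b))
```

Notice how little of this is "machine learning" in the glamorous sense. Most of it is basic syntax and careful handling of the data, and on a real dataset you would still need to explore, clean, and plot the data before a model like this is worth fitting.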
Average performers demand to learn the “sexy” stuff first
It’s understandable … everyone wants to learn the sexiest stuff first. This isn’t limited to data science.
People who begin learning a musical instrument do the same thing. They say, “I want to play guitar” but they want to jump right into playing advanced guitar solos, instead of meticulously and intensely mastering the basics. And because they don’t want to master the basics, they fail to learn critical skills and ultimately miss their target. They never learn the basics, and they never learn to shred.
Don’t be that guy.
Top performers are disciplined and systematic
Top performers are different.
If you look at top performers of all stripes, they are extremely methodical in how they approach learning and skill acquisition.
SEALs, Olympians, top-performing students, elite musicians (violinists, cellists, guitarists) … the best people are patient, disciplined, and strategic.
They don’t demand to start with the cool stuff. Top performers diligently learn and master the foundations.
To be the best, you need to learn the right way
I get it. The cool stuff is why you want to get into data science in the first place. For example, machine learning is really exciting right now. It’s powering self-driving cars, intelligent IoT objects, and a variety of other cutting-edge technology. Of course you want to do machine learning.
But ask yourself: do you want to be in the bottom 95% who fail to really learn? The bottom 95% who underperform? The bottom 95% who make less money? The bottom 95% who say “I’ll just figure it out,” then fail and complain about how hard data science is?
Or do you want to be in the top 5%? … the top 5% who earn most of the money, get the best perks, and work on the coolest projects.
You can choose which group you fall into – the top 5% or the bottom 95%.
You choose by how you learn.
4 thoughts on “Stop trying to jump to the sexy stuff first”
I am highly suspicious of people who produce an ML model without a solid understanding of feature engineering, which requires a good deal of data analysis: data manipulation, visualization, statistical summaries, etc. Plus, I find these “basic” things to be highly important in communicating the model’s output.
And where do I find those very first steps? What happens if I don’t develop anymore? Can I still get into data visualization?
Your question has been asked and answered before. Google it and you’ll find a ton of training resources: blogs, exercises, and free or inexpensive courses.