Data science is, and will continue to be, a hot job.
Right now, the average data science salary is somewhere around $150,000.
Although many people are claiming that AI will destroy data science jobs, I’m fairly confident that AI will only increase the need for data scientists.
First of all, AI makes it much easier to build software applications, and almost all software throws off data.
We’re going to see another explosion of data.
Second, AI is built on machine learning. And in practice, somewhere around 75% of machine learning is working with the data … prepping the data to train the model.
Data science will become even more important.
So you should be working hard to get a data science job.
The question is, how?
Well, I have a good answer for you.
You Need to Master Foundations First
If you’re serious about getting a data science job, you should start with the foundations.
What infuriates me is that everyone wants to jump to the advanced topics without first mastering the foundational techniques.
Then they complain about how hard it is to get a data science job.
For example, just a few days ago, I got an email from a Sharp Sight reader asking for courses on advanced data science topics like natural language processing, computer vision, large language models, etc.:
(Note: I lightly edited his email for clarity.)
Aaaand, here’s my reply:
Listen to me.
I’m trying to help you …
No one is going to hire you as an NLP expert your first year.
In your first year or two in the data industry, no one (except an absolute fool) will ask you to do time series analysis (I’ll explain why some other time).
These are intermediate to advanced topics that require you to be at an intermediate to advanced stage in your career.
More importantly, they presuppose that you’ve already mastered foundational data science skills.
If you’ve been reading here at the Sharp Sight blog for any length of time, you ought to know what those foundational skills are.
The foundational skills you need to master first (before NLP, before time series analysis, before ML, before LLMs) are data wrangling, data visualization, and data analysis.
Collectively, we can call these skills “data analytics.”
Data analytics = data wrangling + data visualization + data analysis.
These are the foundations.
They are the foundations for all of the advanced topics that you’re interested in.
And more importantly, they are the key to getting a data science job (which I’m going to get to soon).
Analytics is the Foundation of Data Science
Almost everything that you’ll do as a data scientist will require these skills:
- data wrangling
- data visualization
- data analysis
Want to “find insights in data”?
You need to use data wrangling, data visualization, and data analysis.
Want to create valuable analyses and reports?
You need to use data wrangling, data visualization, and data analysis.
Want to build a machine learning model?
Data wrangling, data visualization, and data analysis.
NLP, LLMs, time series analysis?
You need data wrangling, data visualization and data analysis.
Are you getting it, friend?
Data wrangling, data visualization and data analysis are critical for almost every major task in data science.
The path to almost every interesting skill or task in data science is through these skills.
And maybe more importantly, the path to a data science job is through these skills.
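To make those three skills concrete, here’s a minimal sketch of how they fit together on a tiny dataset. It’s written in plain Python with only the standard library, so it’s a toy illustration rather than how you’d do it at work (there you’d use dplyr and ggplot2, or pandas and a plotting library). The data, the column names, and the function names `wrangle`, `analyze`, and `visualize` are all made up for this example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw data, invented for this illustration:
# inconsistent region labels and one missing sales value.
RAW_CSV = """region,sales
north,120
North ,80
south,200
south,
east,50
"""

def wrangle(raw_text):
    """Data wrangling: parse the CSV, normalize labels, drop rows with missing sales."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        sales = record["sales"].strip()
        if not sales:
            continue  # skip rows where the sales value is missing
        rows.append({"region": record["region"].strip().lower(), "sales": float(sales)})
    return rows

def analyze(rows):
    """Data analysis: compute average sales per region."""
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["region"]].append(row["sales"])
    return {region: mean(values) for region, values in by_region.items()}

def visualize(summary, scale=10):
    """Data visualization: a crude text bar chart, one '#' per `scale` units."""
    lines = []
    for region, value in sorted(summary.items()):
        lines.append(f"{region:<6}{'#' * int(value // scale)} {value:.0f}")
    return "\n".join(lines)

clean_rows = wrangle(RAW_CSV)
summary = analyze(clean_rows)
print(visualize(summary))
```

Notice the order: you can’t analyze the data until it’s clean, and the chart is only as good as the summary behind it. That’s why these three skills travel together.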
Analytics is the Key to Getting a Data Science Job (for most people)
For most people, an analytics job is the key to getting a proper data science job.
Let me explain.
Many “data science” jobs today ask for all of the intermediate and advanced skills, on top of the foundations. Machine learning, clustering, time series analysis, natural language processing. I could go on.
This is largely a wish list of things that they’ll only rarely find in a single person (most people know a subset of these skills, and those people that claim to “know” all of them are commonly weak in at least a few of those areas).
Whether you like it or not, no one will believe that you, as a new data science job applicant, can competently perform all of these tasks.
Trying to tell a hiring manager that you can do all of them for your first role, even though you don’t have any data science work experience, is like trying to convince the talent team of the LA Lakers that you’re good enough to play in the NBA at 15 years old. Even if you have some experience, and some potential, no one will take you seriously.
So, you need to start with a less advanced role that will lead to the full data science role that you want.
What job might that be?
An analytics job.
Analytics is Data Science, Level 1
In most businesses, most of the time, the entry-level data science job is actually an analytics job.
As noted above, most places won’t take you seriously as a machine learning or NLP developer for your first job (seriously, unless you have a PhD from Stanford, CMU, or another big-name program, it won’t happen).
But many more places will take a chance on you for a data analysis job.
Many places have large backlogs of work for reports and analyses.
Companies need people who can get data, wrangle it, and analyze it.
Technically, these jobs may even be called “data analyst” instead of “data scientist.”
You just need to get your first job, start doing some actual work, and start networking.
The only important thing is that the job uses real data science tools, like SQL, R, Python, and maybe Tableau.
If it’s a department that uses only Excel for its analyses, you should avoid it.
As long as you can get a data analysis or analytics job that uses data science tools, you’ll be golden.
What to Look For in a Potential First Analytics Job
In terms of tools and techniques, what should you look for in your first job?
SQL and at least one data science programming language. That’s really it.
If they’re asking you for machine learning, NLP, web scraping, LLMs, clustering and segmentation …
Sorry. You’re probably not qualified, and it’ll be very hard to get the job.
Instead just look for an analytics or data analysis job that asks for SQL and R or Python.
They should want you to be able to clean data, visualize data, analyze data, find insights, and not all that much else.
That’s the kind of entry-level data job that you’re likely to be competitive for.
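If you’re wondering what “SQL plus one programming language” looks like in practice, here’s a rough sketch. It uses Python’s built-in `sqlite3` module with an in-memory database so that it runs anywhere. The `orders` table, its columns, and the customer names are hypothetical, invented just for this example. A real job would point at the company’s database instead.

```python
import sqlite3

# Hypothetical table and values, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("acme", 120.0), ("acme", 80.0), ("globex", 200.0), ("initech", 50.0)],
)

# SQL handles retrieval and aggregation: total spend per customer, largest first.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC, customer
"""
totals = conn.execute(query).fetchall()
conn.close()

# The programming language handles the follow-up analysis:
# each customer's share of total revenue.
grand_total = sum(total for _, total in totals)
shares = {customer: round(total / grand_total, 2) for customer, total in totals}
print(shares)
```

The split here is typical: SQL gets the data and does the heavy aggregation, and the programming language takes over for whatever comes next.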
My Favorite Language for Analytics
As noted, you should be looking for a job that uses SQL and one data science programming language.
SQL is very important, and I’ll write more about that later.
But which data science language?
Well, it depends on your goals.
Python is much better for the most advanced topics beyond analytics, like machine learning, deep learning, NLP, LLMs, etcetera.
But for pure data analysis and analytics, R is still my favorite.
R – specifically, the Tidyverse – is easy to learn and very, very easy to use.
Both dplyr for data wrangling and ggplot2 for data visualization are extremely well designed.
Both of those packages have one function for every little task. Like little LEGO building blocks.
So writing analytics code just becomes like putting the building blocks together in the right order.
R’s Tidyverse just works, and it’s a joy to use.
You Might Want to Learn R and Python
If you strictly want an analytics job, and you don’t want to move on to more hardcore data science, R/Tidyverse might be enough.
If you do eventually want to move on to more advanced data science topics (like ML, AI, NLP, etc.), then you’ll probably need to learn Python at some point.
In which case, you might want to learn R first for analytics, and then move to Python over 1-3 years.
Your Path from Analytics to Proper Data Science
So briefly, let me outline a rough path to a full data science job:
- Learn SQL and at least one data science language (probably R, but possibly Python)
- Master data analytics: data wrangling, data visualization, data analysis
- Build a portfolio and a network
- Apply for and get an analytics job
- Acquire on-the-job analytics and data science skills
- Continue to up-skill on nights and weekends, learning machine learning, deep learning, NLP and more advanced topics
- Continue to build your portfolio and your network
- Apply for and get an intermediate data science job
Notice that this starts with learning SQL and data analytics.
If you’re immediately jumping to advanced topics, you’re doing it wrong.
Leave Your Questions in the Comments Below
Do you have questions about this blog post?
Questions about R, or data analytics?
Something you think I missed?
I want to hear from you.
Leave your comments and questions in the comments section below.