Why Fluency Matters in Data Science

Here at Sharp Sight, I frequently use the word “fluent” to describe an ideal skill level with data science.

When you join our courses, our goal is to help you become “fluent” in data science, or “fluent” in a particular skill.

We want to help you become fluent in data science as fast as possible.

I think most people intuitively understand this idea, but I want to explain a little bit about what I mean when I use that word in the context of programming and data science.

Fluent: a traditional definition

Let’s start by taking a look at a traditional definition of the word.

Here is a definition taken from Wiktionary.org:

An image showing the definition of "fluent," from wiktionary.org.

Take a look at that second entry.

Fluency indicates an ability to perform “accurately, rapidly, and confidently.”

That is exactly what you’re looking for as someone using data science.

You want to be able to use data science techniques “accurately, rapidly, and confidently.”

Moreover, you want to be able to perform “in a flowing way.”

This is in contrast to many people who lack fluency. Many wannabe programmers and data scientists write code in a way that’s slow, strained, and full of breaks to look up syntax.

Having said that, two of the concepts in the definition above are important, and I want to drill in to them a little bit:

accuracy
speed

If you want to do data-related work, you need accuracy and speed. Let’s break this down.

Fluency implies accuracy

Importantly, fluency implies accuracy.

Just like a spoken language, when you’re fluent in a programming language, you have the ability to recall and use the correct commands and syntax.

Again, a definition from Wiktionary is instructive:

An image of the definition of the word accuracy

One particular part of this is helpful.

Accuracy is the state of being “free from mistakes.”

Do you want to use data science techniques? Do you want a data science job or an analytics job? Do you want to get paid money for these skills?

Then you had better be able to write the code “accurately” …. which is to say, relatively free from mistakes.

To be clear: I’m not talking about absolute perfection. Everyone makes a mistake from time to time. But if you want to get paid to do data work, you need to actually be able to write the code.

You’d think this would be obvious, but many wannabe data scientists actually can’t write data science code.

For example, an acquaintance recently told me that he had “taught himself” Pandas and machine learning in Python. But he added that he couldn’t remember the difference between the .loc[] and the .iloc[] functions. So instead, he just found workarounds for everything using for-loops and other techniques from Base Python.

This guy wanted to use data science techniques (in fact, he was the founder of a defunct AI startup). But instead of learning, memorizing, and mastering the essential syntax, he resorted to work-arounds and cobbled-together code.

This is sort of like a person who wants to learn Spanish for a trip to Mexico, but never learns essential Spanish vocabulary and grammar.

If you want to be a data scientist, or you want to use data science tools in your work, you cannot be like this.

You need to commit to learning the tools and syntax, so that you can recall the syntax accurately, and ultimately apply it.

Fluency implies speed

The other main part of fluency is speed.

Let’s go back to the example of a spoken language, like Spanish.

If you want to have a conversation in Spanish, you need to be able to speak the language with some amount of speed. The words need to flow out quickly, one after another. If you need to take a minute to look up every other word in a dictionary, any conversation will be slow, strained, and difficult.

It’s very similar with data science.

If you have to look up syntax every 4 or 5 minutes, progress on any project you work on will be extremely slow.

That won’t cut it in a real job.

I used to work at Apple. Before Apple, I worked at a large American bank.

I can tell you with some confidence that doing data science work in a real job often has serious time constraints.

On more than one occasion, I was asked to produce a polished, informative, accurate deliverable with only a few hours notice (a deliverable going to high-level executive teams).

Within only a few hours, I needed to get the correct data, clean it, draft the deliverable, iterate a few times to make sure that it met the needs of my team, and then polish it up for presentation.

In situations like these, the deadlines were extremely tight.

I didn’t have time to look up syntax on Google. I didn’t have time to try to remember how to use a particular technique. There was only time to execute, and I had to execute fast.

I got the work done … beautiful deliverables on deadline.

I was able to do the work, because I knew the techniques backwards and forwards, and I was able to use those techniques quickly and accurately.

That’s why they paid me 6 figures.

If you want to get a similar, highly paid data job, that’s what you need.

You need to be able to do the work.

Most of the time, you need to be able to do it fast.

You need to be “fluent” with the important techniques.

Fluency Enables High Productivity

In the end though, “fluency” in a programming language is about productivity.

If you’re working in a data-related job, you’ll be given projects and tasks that will require you to use data science syntax to get things done. The expectation is that you can actually produce deliverables that create value for the business.

You need to be productive.

As I already mentioned, this means being able to produce valuable deliverables, created with accurate, well structured code, within reasonable timeframes.

To do this, you have to know the skillset well enough to get things done, which is to say, a sort of “fluency” with the required skills.

Can you actually do the work, quickly and accurately?

You have to ask yourself a question:

Can you actually do the work? Are you actually productive?

If you’re still looking up code snippets every 5 minutes, you have to ask yourself if you could actually cut it in a real job.

To be clear: it’s normal to look something up every now and again. Here, the analogy with writing is instructive … great writers still use dictionaries and thesauruses. But a great writer will not need to look up a word every 5 minutes. Great writers know the essential vocabulary of a language. They know most of the grammar. And they can produce good, clear sentences in the language in which they are writing.

It’s the same thing in programming and data science.

You need to be able to actually do the work.

You need to be productive.

You need to know the essential syntax, and you need to know how to apply that syntax to produce deliverables.

You need to be “fluent” in your programming language of choice.

How to become “fluent” in Python

The subject of becoming “fluent” is somewhat more complicated, and I’ll leave a full explanation for a different blog post.

But, at a high level, becoming fluent in Python can be divided into a few major steps:

master the essential “vocabulary”
learn how to put the “vocabulary” together
work on projects to integrate it all into a coherent whole

When I say “essential vocabulary,” I mean foundational data techniques like adding new variables, subsetting rows, retrieving columns, as well as creating essential charts and graphs like scatterplots and bar charts.

Ultimately, if you want to master these techniques fast, you need a system.

You need a system for practicing individual commands and functions.

And you need a system for learning how to put individual pieces together.

And finally, you need to know how to work on projects (there’s a bit of an art to selecting good projects when you’re trying to learn).

Here at Sharp Sight, we have courses that cover all of these things. We can show you how to memorize syntax within only a few weeks. We can show you how to put syntax together to do data science work. And we can instruct you on how to select and work on good projects.

Sign up for our email list

If you’re interested in becoming “fluent” in data science, sign up for our email list.

When you sign up, you’ll get free data science tutorials delivered directly to your inbox. Our free tutorials explain specific data science techniques, and also some high-level strategies for becoming fluent in Python.

You’ll also be the first to know when we open our premium data science courses for enrollment.

If you want to become “fluent” in data science, sign up for our free newsletter now:

Why fluency matters

Fluent: a traditional definition

Fluency implies accuracy

Fluency implies speed

Fluency Enables High Productivity

Can you actually do the work, quickly and accurately?

How to become “fluent” in Python

Sign up for our email list

Sign up for FREE data science tutorials

Check your email inbox to confirm your subscription ...

Leave a Comment Cancel reply

Get our FREE
Python Data Science Crash Course