Pandas isna, Explained

In this blog post, I’ll explain how to use the Pandas isna technique.

I’ll describe what the technique does, explain the syntax, and show you clear examples of how to use it.

Let’s get started with a quick introduction to the isna() technique.

A quick introduction to Pandas isna

The Pandas isna method detects missing values in a Pandas dataframe or a Pandas Series.

[Image: a simple example of Pandas isna detecting missing values in Python data.]

As suggested above, we can use Pandas isna on several different data structures, including:

  • Pandas Series
  • Pandas dataframes
  • individual columns in a dataframe

So in that sense, the method is flexible in terms of how we use it.

Pandas isna is important for Python data wrangling

The isna method is important for data wrangling in Python.

Dealing with missing values is a very common problem when we wrangle data, but also when we analyze data or create machine learning models.

In fact, finding and dealing with missing values is one of the first things you will do when you wrangle or analyze a dataset.

That being the case, you need a way to identify missing values when you’re working with your Python data.

Enter Pandas isna.

The syntax of isna

Let’s look at the syntax of the isna() technique.

Here, we’ll look at the syntax separately for the following Python data structures:

  • dataframes
  • Series
  • dataframe columns

The reason is that the syntax for Pandas isna is slightly different for each object type.

A quick note

Before looking at the syntax, I want to remind you of a couple things.

First, the syntax explanations below assume that you’ve already installed Pandas and imported it into your environment.

Assuming that you have it installed already on your computer, you can import Pandas with this code:

import pandas as pd

Second, the syntax explanations below assume that you have either a Pandas dataframe or a Pandas Series object available.

To learn more about Pandas dataframes, read our Pandas dataframe tutorial.

With all that said, let’s look at the syntax.

Series syntax

First, we’ll look at the syntax for how to use isna() on a lone Pandas Series.

When you use isna on a Series, you first just type the name of the Series object (i.e., the name that you’ve assigned to it).

[Image: the syntax for using Pandas isna on a Pandas Series object.]

Then, you just type .isna() to call the method, just like you would call any other method in Python.

That’s all there is to it.

When you do this, the method will produce a new Series of boolean True/False values that shows which values were missing in the original Series.
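
Here’s a minimal sketch of the Series syntax described above. The Series name scores and its values are made up for illustration; they aren’t part of the examples later in this tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value in the middle
scores = pd.Series([88.0, np.nan, 92.5])

# .isna() returns a new boolean Series: True where the value is missing
missing_mask = scores.isna()
print(missing_mask)
```

The original Series is left unchanged; the True/False values live in the new object.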

dataframe syntax

Next, let’s look at how to use isna on a dataframe.

The syntax for dataframes is very similar to the syntax above for Pandas Series.

First, you just type the name of the dataframe you want to operate on.

[Image: the syntax for using isna on a Pandas dataframe.]

Then you type .isna() to call the method.

So if your dataframe is named your_dataframe, you’ll type the code your_dataframe.isna().

The output of this operation will be a new dataframe with the same shape as your input dataframe. This output will contain True/False values that indicate which values were missing in the original.
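
As a quick sketch of the dataframe syntax, the code below builds a small dataframe and calls .isna() on it. The name your_dataframe follows the text above, but the columns and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with one missing value in each column
your_dataframe = pd.DataFrame({
    "city": ["Austin", None, "Denver"],
    "temp": [31.0, 28.5, np.nan],
})

# .isna() returns a new dataframe of True/False values
missing_mask = your_dataframe.isna()
print(missing_mask)

# The output has the same number of rows and columns as the input
print(missing_mask.shape == your_dataframe.shape)
```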

column syntax

Finally, we’ll look at the syntax for how to use Pandas isna on a single column of a dataframe.

It’s important to remember here that individual columns inside of a dataframe are actually Pandas Series objects. So if we retrieve a column using “dot syntax,” then we can use the syntax above for Pandas Series.

Let’s take a look at how this works.

First, you can type the name of the dataframe.

Then, you use “dot syntax” to specify the individual column inside the dataframe that you want to operate on.

[Image: the syntax for using Pandas isna on a dataframe column.]

So applying Pandas .isna() to a dataframe column involves two steps:

  • get the column from the dataframe with “dot syntax”
  • use the .isna() method

So for example, if you have a dataframe called your_dataframe that contains a column called column, then you’ll use the syntax your_dataframe.column.isna() to find missing values in that particular column.
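
To make the column syntax concrete, here’s a small sketch. The dataframe is hypothetical: the column named column matches the text above, and the second column, "unit price", is an assumption added to show one limitation. Dot syntax only works when the column name is a valid Python identifier that doesn’t clash with an existing dataframe attribute; bracket syntax works for any column name.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe; "unit price" is an invented column name
your_dataframe = pd.DataFrame({
    "column": [1.0, np.nan, 3.0],
    "unit price": [9.99, 4.50, np.nan],
})

# Dot syntax and bracket syntax retrieve the same Series
dot_result = your_dataframe.column.isna()
bracket_result = your_dataframe["column"].isna()
print(dot_result.equals(bracket_result))

# A name containing a space can only be retrieved with brackets
price_missing = your_dataframe["unit price"].isna()
print(price_missing)
```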

Output (additional notes)

Very quickly, let’s talk about the structure and contents of the output.

As I mentioned above, the output of .isna() is a new Pandas object that’s the same size as the input object.

This new object will contain True/False values that show which values are missing (True means missing).

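The value types that .isna() will consider as “missing” are:

  • None
  • numpy.nan (the older alias numpy.NaN was removed in NumPy 2.0)
  • pandas.NaT and pandas.NA, the Pandas markers for missing datetime and nullable values

Empty strings (i.e., '') and numpy.inf do not count as missing values; .isna() will return False for them.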
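
You can check these rules yourself with a short sketch like the one below. The Series name values is made up for illustration, and dtype="object" is used only so that the mixed values are stored exactly as written.

```python
import numpy as np
import pandas as pd

# Hypothetical Series mixing missing and non-missing values
values = pd.Series([None, np.nan, "", np.inf], dtype="object")

# None and np.nan are flagged as missing; '' and np.inf are not
result = values.isna()
print(result)
```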

Examples: how to detect missing values in Python

Now that we’ve finished looking at the syntax, let’s look at some examples of Pandas isna().

Run this code first

Before we run these examples, there’s a little preliminary setup that you’ll need to run.

Specifically, you’ll need to:

  • import Pandas and Numpy
  • create a dataframe

Let’s do each of those.

Load Pandas

First, we need to import Pandas and Numpy:

import pandas as pd
import numpy as np

We’ll use Pandas to create a dataframe, and we’ll use Numpy to create missing values inside that dataframe using np.nan.

Create a dataframe

Next, we need to create a dataframe that we can work with.

Here, we’re going to create a dataframe that contains mock sales data:

# Mock sales data; np.nan marks the missing values
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas", "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West", "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000, 49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000, 42000, np.nan, 39000, 44000, 45000],
})

Let’s print the dataframe to see its contents:

print(sales_data)

OUT:

       name region    sales  expenses
0   William   East  50000.0   42000.0
1      Emma    NaN  52000.0   43000.0
2     Sofia   East  90000.0       NaN
3    Markus  South      NaN   44000.0
4    Edward   West  42000.0   38000.0
5    Thomas   West  72000.0   39000.0
6     Ethan  South  49000.0   42000.0
7    Olivia   West      NaN       NaN
8      Arun   West  67000.0   39000.0
9     Anika   East  65000.0   44000.0
10    Paulo  South  67000.0   45000.0

This dataframe, sales_data, has four variables. Two of the variables contain character data, and two of the variables contain numeric data.

Critically, you’ll notice that some of the values are missing (i.e., NaN).

We’ll use .isna() to detect those missing values.

EXAMPLE 1: Identify missing values in a dataframe column

First, we’ll identify the missing values in one specific column.

We’re going to identify the missing values in the sales column of the dataframe.

sales_data.sales.isna()

OUT:

0     False
1     False
2     False
3      True
4     False
5     False
6     False
7      True
8     False
9     False
10    False
Name: sales, dtype: bool

Explanation

Here, we’ve identified the missing values in the sales column of the sales_data dataframe.

This involved 2 steps:

  • we retrieved the sales column using “dot syntax”
  • then, we called .isna() to identify the missing values in that column

So sales_data.sales retrieved the sales column from the dataframe.

And, the syntax .isna() identified the missing values.

Notice that the output of this code is a new object that has the same shape as the sales column. Also notice that where the value was missing in the sales column, the output shows True. Otherwise, the output shows False.
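
One common use for this True/False output is as a filter. The sketch below is an extension of Example 1, not part of the original example: it rebuilds only the name and sales columns of sales_data (the other columns are omitted to keep it short) and uses the output of .isna() to retrieve the rows where sales is missing. The variable names sales_missing and rows_missing_sales are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same name and sales values as sales_data above (other columns omitted)
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
})

# Boolean Series: True where sales is missing
sales_missing = sales_data.sales.isna()

# Use the boolean Series to keep only the rows with missing sales
rows_missing_sales = sales_data[sales_missing]
print(rows_missing_sales)
```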

EXAMPLE 2: Identify missing values in an entire dataframe

Next, we’re going to find the missing values in an entire dataframe.

In order to do this, we’ll type the name of the dataframe, and then call .isna().

sales_data.isna()

OUT:

     name  region  sales  expenses
0   False   False  False     False
1   False    True  False     False
2   False   False  False      True
3   False   False   True     False
4   False   False  False     False
5   False   False  False     False
6   False   False  False     False
7   False   False   True      True
8   False   False  False     False
9   False   False  False     False
10  False   False  False     False

Explanation

This should be easy to understand.

Here, we’ve called the .isna() method on the entire sales_data dataframe.

To do this, we simply typed the name of the dataframe, and then typed .isna() to call the method.

In the output, you’ll notice boolean True/False values for every value of the input. The output shows True where the value was missing in the sales_data dataframe, and the output shows False otherwise.
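
If you want to go one step further, you can combine this output with the Pandas any method to find the rows that have at least one missing value. This is a sketch that goes beyond the original example; it rebuilds sales_data so that it can run on its own, and the name rows_with_missing is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same mock sales data as in the setup section
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West",
               "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000,
                 42000, np.nan, 39000, 44000, 45000],
})

# any(axis=1) checks across the columns of each row:
# True if at least one value in that row is missing
row_has_missing = sales_data.isna().any(axis=1)

# Keep only the rows with at least one missing value
rows_with_missing = sales_data[row_has_missing]
print(rows_with_missing)
```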

EXAMPLE 3: Count the missing values in each column of the dataframe

Finally, let’s count the missing values in each column of our dataframe.

To accomplish this, we’re going to use two Pandas methods:

  • Pandas isna
  • Pandas sum

We’ll use isna to identify the missing values, and we’ll use Pandas sum to count them.

(sales_data
 .isna()
 .sum()
)

OUT:

name        0
region      1
sales       2
expenses    2
dtype: int64

Explanation

Look carefully at the output. The output shows the count of the missing values for each column of the input dataframe.

To accomplish this, we needed to call two Pandas methods, one after the other.

First, we called the .isna() method, which identified the missing values.

Then, we called .sum() to count them.

Additionally, notice that we used a special syntax trick. We enclosed the whole chain of methods inside of parentheses. And, we put the different methods on different lines.

I sometimes refer to this as Pandas method chaining, although keep in mind that you can use this for almost any type of Python method.

This is a somewhat unconventional technique, but is extremely powerful when you’re doing data wrangling or data analysis. If you know how to use this technique properly, you can chain together multiple methods (many more than 2) to perform complex data manipulations. It also makes it easier to read and debug your code.

This is one of the secrets to mastering Pandas, and you really should learn it.
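
To illustrate a chain with more than two methods, here’s a sketch that computes the percent of missing values in each column. It isn’t part of the original examples: it rebuilds sales_data so that it runs on its own, and the name percent_missing is made up for illustration. It works because the mean of True/False values is the share of True values.

```python
import numpy as np
import pandas as pd

# Same mock sales data as in the setup section
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West",
               "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000,
                 42000, np.nan, 39000, 44000, 45000],
})

percent_missing = (sales_data
 .isna()       # True/False for every value
 .mean()       # share of True values in each column
 .mul(100)     # convert the share to a percent
 .round(1)     # round to one decimal place
)
print(percent_missing)
```

Because each method sits on its own line, you can comment out a single step to inspect the intermediate result.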

Frequently asked questions about Pandas isna

Now that you’ve learned about Pandas isna and seen some examples, let’s review some frequently asked questions about the method.

Question 1: What’s the difference between Pandas isna and isnull?

Essentially, there is no difference.

Pandas isna and Pandas isnull do the same thing, and operate the same way.

Pandas .isnull() is really just an alias of Pandas .isna().

I suggest that you just pick one of the two versions, and use it consistently in your code.
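
If you’d like to confirm this for yourself, a quick sketch like the following compares the two outputs. The dataframe df and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with a missing value in each column
df = pd.DataFrame({"a": [1.0, np.nan], "b": [None, "x"]})

# .equals() returns True when two Pandas objects have identical contents
same = df.isna().equals(df.isnull())
print(same)
```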

Leave your other questions in the comments below

Do you have other questions about the Pandas isna technique?

Is there something that I didn’t cover here that you need help with?

If so, leave your question in the comments section below.

To learn more about Pandas, sign up for our email list

This tutorial should have given you a good introduction to the Pandas isna technique, but if you really want to master Pandas data wrangling, then you’ll need to learn a lot more.

So if you want to learn more about Python data wrangling and learn more about Python data science generally, then sign up for our email newsletter.

We publish FREE tutorials almost every week on:

  • Base Python
  • NumPy
  • Pandas
  • Scikit learn
  • Machine learning
  • Deep learning
  • … and more.

When you sign up for our email list, we’ll deliver these free tutorials directly to your inbox.

Joshua Ebner

Joshua Ebner is the founder, CEO, and Chief Data Scientist of Sharp Sight. Prior to founding the company, Josh worked as a Data Scientist at Apple. He has a degree in Physics from Cornell University. For more daily data science advice, follow Josh on LinkedIn.