Pandas isnull, Explained

In this tutorial, I’ll explain how to use the Pandas isnull technique to detect missing values.

I’ll explain exactly what the technique does, how the syntax works, and I’ll show you step-by-step examples of how to use isnull.

First, let’s start with an introduction to isnull, and what it does.

A quick introduction to Pandas isnull

The Pandas isnull technique detects missing values in Pandas objects, such as Series and dataframes.

An image that shows Pandas isnull detecting missing values in a Pandas dataframe.

We can use the isnull technique on several types of Pandas objects, including:

  • Pandas Series
  • whole Pandas dataframes
  • individual columns in a dataframe

So it’s somewhat flexible in terms of what types of objects we can use it on.

Pandas isnull is very useful for data wrangling

The isnull technique is very useful for data wrangling, data cleaning, and data analysis.

Missing values often cause problems when we analyze data and create machine learning models, because many calculations either skip them silently or fail when they encounter them.

That being the case, we often need to identify missing values when we clean up our data, analyze it, or before we build a machine learning model.

So this is a simple technique, but often a necessary technique when you’re doing data science in Python.

The syntax of isnull

Now that you’ve learned a little bit about what the Pandas isnull technique does, let’s take a look at the syntax.

As I mentioned earlier, we can use the isnull() technique on:

  • dataframes
  • Series
  • dataframe columns

The syntax for each of these use cases will be slightly different, so we’ll review the syntax for each of those separately.

A quick note

Before we look at the syntax, I need to mention a couple of things.

First, all of the syntax explanations I’m about to show you assume that you’ve already imported Pandas.

If you haven’t done so yet, you can import Pandas with the following code:

import pandas as pd

Second, these syntax explanations assume that you already have a Pandas Series or a Pandas dataframe available.

If you need a refresher on dataframes, you can read our quick introduction to Pandas dataframes.

Series syntax

Let’s start by looking at how we can use isnull() on an individual Pandas Series.

An image that explains how to use the Pandas isnull technique on a Pandas Series.

First, you simply type the name of your Series, followed by .isnull() to call the method.

That’s really it.

The output will be a Series of True/False boolean values that indicate which values are missing, and which are not missing.
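To make that concrete, here is a minimal sketch. The Series name example_series and its values are made up for illustration; they are not part of the examples later in this tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value (np.nan) in the middle.
example_series = pd.Series([1.5, np.nan, 3.0])

# isnull() returns a boolean Series with the same index as the input:
# True where the value is missing, False everywhere else.
series_mask = example_series.isnull()
print(series_mask)
```

The second position should come back as True, because that is the only missing value in the Series.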

Dataframe syntax

The syntax for a dataframe is really very similar to the syntax for a Series.

An image that explains the syntax for using isnull on a Pandas dataframe.

You simply type the name of the dataframe, and then .isnull() to call the method.

So if your dataframe is named your_dataframe, you’ll type the code your_dataframe.isnull().

The output will be an object of the same size as your dataframe that contains boolean True/False values. These boolean values indicate which dataframe values were missing.
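Here is a small sketch of the dataframe version. The dataframe contents (the city and temp columns) are invented for illustration, and your_dataframe is only a placeholder name.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with one missing value in each column.
your_dataframe = pd.DataFrame({
    "city": ["Austin", None, "Boston"],
    "temp": [31.0, 28.5, np.nan],
})

# The result has the same shape, index, and column names as the input.
frame_mask = your_dataframe.isnull()
print(frame_mask.shape == your_dataframe.shape)
print(frame_mask)
```

Because the output has the same shape as the input, you can line up every True value with the exact cell that was missing.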

Column syntax

Finally, let’s look at the syntax for using isnull on a dataframe column.

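Remember that a dataframe column is actually a Pandas Series object. And to retrieve a column from a dataframe, we can use “dot syntax”. Keep in mind that dot syntax only works when the column name is a valid Python identifier that does not clash with an existing dataframe attribute or method; otherwise, retrieve the column with bracket syntax, such as your_dataframe['column']. Let’s take a look, and I’ll explain further.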

An image that shows how to use the isnull technique on a dataframe column.

So detecting missing values in a column is a two-step process:

  • retrieve the column from the dataframe using “dot syntax”
  • call the .isnull() method

So if you have a dataframe named your_dataframe, and there’s a column named column, you’ll use the code your_dataframe.column.isnull() to detect missing values in that column of the dataframe.
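As a rough sketch, the code below compares dot syntax with bracket syntax. The dataframe and its column names (including "unit price") are hypothetical and exist only to illustrate the point.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe; "unit price" contains a space on purpose.
your_dataframe = pd.DataFrame({
    "column": [10.0, np.nan, 30.0],
    "unit price": [1.0, 2.0, np.nan],
})

# Dot syntax and bracket syntax retrieve the same column,
# so isnull() gives the same result either way.
dot_result = your_dataframe.column.isnull()
bracket_result = your_dataframe["column"].isnull()
print(dot_result.equals(bracket_result))

# A column name with a space cannot be reached with dot syntax,
# so bracket syntax is the only option here.
spaced_result = your_dataframe["unit price"].isnull()
print(spaced_result)
```

Dot syntax is shorter to type, but bracket syntax works for any column name, so it is the safer choice when you don't control the column names.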

Output (additional notes)

Let’s quickly discuss the output.

As I mentioned earlier, the output is a new object of the same size as the input object.

The output object will contain boolean True/False values that indicate which values are missing.

Values that count as “missing” are:

  • None
  • numpy.nan (NaN)
  • pandas.NaT (the missing-value marker for dates and times)
  • pandas.NA

Values like an empty string (i.e., '') or numpy.inf will not count as missing values when you use the isnull() method.
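If you want to check this yourself, here is a quick sketch. The Series below is made up for illustration, and it assumes default Pandas settings. It also includes pd.NaT, the missing-value marker that Pandas uses for dates and times, which isnull() treats as missing as well.

```python
import numpy as np
import pandas as pd

# Hypothetical Series mixing values that count as missing with values
# that merely look "empty". dtype="object" keeps each value as-is.
values = pd.Series([None, np.nan, pd.NaT, "", np.inf, 0], dtype="object")

missing_flags = values.isnull()
print(missing_flags)
# None, np.nan, and pd.NaT are flagged as missing.
# The empty string, np.inf, and 0 are not.
```

This matters in practice: if your data uses empty strings or placeholder numbers to mark missing entries, isnull() will not find them until you convert them to a real missing value.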

Examples: how to detect missing values in Python

Now that we’ve looked at the syntax, let’s look at some examples of how to use the Pandas isnull() technique.

Run this code first

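Before we actually run the examples, you’ll need to run some preliminary code in order to:

  • import Pandas and NumPy
  • create a dataframe

Let’s do those one at a time.

Load Pandas and NumPy

First, you need to import Pandas. You also need to import NumPy, because the dataframe that we’re going to create uses np.nan to represent missing values.

You can do that with the following code:

import pandas as pd
import numpy as np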

Create a dataframe

Next, we’ll create a dataframe with some mock sales data:

# np.nan marks the missing values
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas", "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West", "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000, 49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000, 42000, np.nan, 39000, 44000, 45000],
})

And let’s print it out to see the contents:

print(sales_data)

OUT:

       name region    sales  expenses
0   William   East  50000.0   42000.0
1      Emma    NaN  52000.0   43000.0
2     Sofia   East  90000.0       NaN
3    Markus  South      NaN   44000.0
4    Edward   West  42000.0   38000.0
5    Thomas   West  72000.0   39000.0
6     Ethan  South  49000.0   42000.0
7    Olivia   West      NaN       NaN
8      Arun   West  67000.0   39000.0
9     Anika   East  65000.0   44000.0
10    Paulo  South  67000.0   45000.0

As you can see, this dataframe has four variables, with a mixture of character data and numeric data.

Importantly, you can see that several rows have missing values (i.e., NaN). We’ll be able to use isnull() to identify those in a programmatic way.

EXAMPLE 1: Find missing values in a Pandas dataframe column

First, let’s identify the missing values in a single column.

Here, we’ll identify the missing values in the sales column of the sales_data dataframe:

sales_data.sales.isnull()

OUT:

0     False
1     False
2     False
3      True
4     False
5     False
6     False
7      True
8     False
9     False
10    False
Name: sales, dtype: bool

Explanation

Identifying the missing values in the sales variable is a two step process:

  • first we need to retrieve the column using “dot syntax”
  • then, we need to call .isnull()

The code sales_data.sales retrieves the sales variable from the dataframe.

Then, the code .isnull() identifies the missing values.

Notice that the output is an object with the same shape as the sales variable. The value of the output is True if the input value is missing, and False otherwise.
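One common next step, which is not part of the original example, is to use that True/False output to pull out the rows where sales is missing. Here is a minimal sketch. It rebuilds the same sales_data dataframe so that it can run on its own, and the name missing_sales_rows is just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Same mock sales data as in the setup code above.
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West",
               "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000,
                 42000, np.nan, 39000, 44000, 45000],
})

# Use the boolean Series as a filter: only rows where it is True are kept.
missing_sales_rows = sales_data[sales_data.sales.isnull()]
print(missing_sales_rows)
```

This should return the rows for Markus and Olivia, which are the two rows marked True in the output above.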

EXAMPLE 2: Identify the missing values in an entire dataframe

Next, we’ll identify the missing values in a whole dataframe.

To do this, we simply type the name of the dataframe, and then type .isnull() to call the method:

sales_data.isnull()

OUT:

     name  region  sales  expenses
0   False   False  False     False
1   False    True  False     False
2   False   False  False      True
3   False   False   True     False
4   False   False  False     False
5   False   False  False     False
6   False   False  False     False
7   False   False   True      True
8   False   False  False     False
9   False   False  False     False
10  False   False  False     False

Explanation

I think this is easy to understand.

To use the Pandas isnull method on a whole dataframe, just type the name of the dataframe, and then .isnull().

In the output, you can see True/False values for every value of every column. The output value is True when the input value was missing, and False otherwise.
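A grid of True/False values can be hard to scan on a larger dataframe. As one possible follow-up (an addition to the original example), you can combine isnull() with the Pandas any method to find the rows that have at least one missing value. The sketch below rebuilds sales_data so that it runs on its own, and rows_with_gaps is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Same mock sales data as in the setup code above.
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West",
               "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000,
                 42000, np.nan, 39000, 44000, 45000],
})

# any(axis=1) collapses each row to a single True/False:
# True if any column in that row is missing.
row_has_missing = sales_data.isnull().any(axis=1)

# Keep only the rows with at least one missing value.
rows_with_gaps = sales_data[row_has_missing]
print(rows_with_gaps)
```

For this dataframe, that should leave the rows for Emma, Sofia, Markus, and Olivia.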

EXAMPLE 3: Count the missing values in every column of a dataframe

Finally, let’s do a slightly more difficult, but more useful example.

Here, we’ll count the number of missing values in every column of a dataframe.

To do this, we actually need to use multiple tools.

We need to use isnull() to identify the missing values, and then we need to use the Pandas sum method to count them up.

Let’s take a look:

(sales_data
 .isnull()
 .sum()
)

OUT:

name        0
region      1
sales       2
expenses    2
dtype: int64

Explanation

In the output, you can see a count of the number of missing values, by column.

When you’re doing data cleaning or data analysis, a technique like this can be extremely useful.

Notice that to do it, we needed to call two Pandas methods in series. We typed the name of the dataframe, then .isnull() to identify the missing values, and .sum() to count the missing values.

Furthermore, notice that we used a special syntax to do this. We enclosed the whole expression inside parentheses, and put the different Pandas methods on different lines. This style of Pandas coding is unorthodox, but extremely powerful once you know how to use it properly. It enables you to combine multiple Pandas methods in series to perform complex data manipulations. Additionally, it can make your code easier to read and debug, because each step sits on its own line.
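The same chaining style extends to a couple of related summaries. The sketch below is an addition to the original example: it swaps sum for the Pandas mean method to get the share of missing values per column, and it calls sum twice to get one total for the whole dataframe. It rebuilds sales_data so that it runs on its own, and the names missing_share and total_missing are illustrative.

```python
import numpy as np
import pandas as pd

# Same mock sales data as in the setup code above.
sales_data = pd.DataFrame({
    "name": ["William", "Emma", "Sofia", "Markus", "Edward", "Thomas",
             "Ethan", "Olivia", "Arun", "Anika", "Paulo"],
    "region": ["East", np.nan, "East", "South", "West", "West",
               "South", "West", "West", "East", "South"],
    "sales": [50000, 52000, 90000, np.nan, 42000, 72000,
              49000, np.nan, 67000, 65000, 67000],
    "expenses": [42000, 43000, np.nan, 44000, 38000, 39000,
                 42000, np.nan, 39000, 44000, 45000],
})

# True counts as 1 and False as 0, so the mean of each column
# is the fraction of values in that column that are missing.
missing_share = (sales_data
 .isnull()
 .mean()
)
print(missing_share)

# The first sum() counts per column; the second adds those counts together.
total_missing = int(sales_data
 .isnull()
 .sum()
 .sum()
)
print(total_missing)
```

The per-column share is often easier to interpret than a raw count, since it tells you how much of each column you would lose if you dropped the missing values.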

If you want to learn more about this style of Pandas data wrangling, sign up for our email newsletter.

Leave your other questions in the comments below

Do you have other questions about the Pandas isnull technique?

Is there something that I didn’t cover here that you need help with?

If so, leave your question in the comments section below.

To learn more about Pandas, sign up for our email list

This tutorial should have given you a good introduction to the Pandas isnull technique, but if you really want to master data wrangling and data science in Python, there’s a lot more to learn.

So if you’re ready to learn more about Pandas and more about data science, then sign up for our email newsletter.

We publish FREE tutorials almost every week on:

  • Base Python
  • NumPy
  • Pandas
  • Scikit learn
  • Machine learning
  • Deep learning
  • … and more.

When you sign up for our email list, we’ll deliver these free tutorials directly to your inbox.

Joshua Ebner

Joshua Ebner is the founder, CEO, and Chief Data Scientist of Sharp Sight. Prior to founding the company, Josh worked as a Data Scientist at Apple. He has a degree in Physics from Cornell University. For more daily data science advice, follow Josh on LinkedIn.
