Numpy where explained

In this tutorial, I’ll show you how to use the Numpy where function.

I’ll explain what np.where is and also how the syntax of np.where works.

Later in the tutorial, I’ll show you clear, step-by-step examples of how the function works, so you can see it in action.

If you really want to understand how Numpy where works, I recommend that you read the whole tutorial.

A quick introduction to Numpy where

Let’s start off by quickly reviewing what Numpy where does.

Numpy where returns elements based on a condition

According to the official documentation, the “Numpy where” function returns elements based on some logical condition.

Does that make sense to you?

Me neither.

Unfortunately, the Numpy where function is a little confusing, and many of the online tutorials and explanations do very little to clear things up. (In fact, a lot of online documentation about Numpy is very confusing.)

Let’s fix that.

I’m going to clarify what Numpy where actually does.

The syntax of Numpy where

To really understand how Numpy where works, you need to understand the syntax first.

Once you understand the syntax, you’ll be able to look at simple examples and the examples will begin to make sense.

A high level explanation of np.where syntax

The syntax of the np.where() function has a few parts.

First is just the name of the function. Typically, when we call the function, we’ll call it as np.where().

Keep in mind that exactly how we call the function depends on how we’ve imported Numpy. The common convention for importing Numpy is to run the code import numpy as np. If we import Numpy like that, then we can use the nickname “np” as an alias for Numpy when we call the Numpy functions. Thus, if we import Numpy that way, we’ll call the function as np.where().

An image that explains the syntax of Numpy where.

Inside of the parentheses, there are three inputs:

  • condition
  • output-if-true
  • output-if-false

Let’s break down those inputs. Understanding those inputs is critical for understanding what the function does.

The parameters of np.where

The parameters of np.where (i.e., the inputs to the function) are fairly easy to understand.

Let’s talk about them one at a time.

condition (required)

The condition is some statement or object that evaluates as True or False.

For example, condition could simply be a Numpy array with boolean values.

More often though, condition is some comparison operation or logical test that operates on a Numpy array.

For example, if we have an array b with several elements, our condition could be the comparison operation b > 0. In this case, the condition b > 0 would evaluate as True or False for every element of the array. These True/False values from condition then influence the output of np.where.
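To make this concrete, here is a minimal sketch of what such a comparison produces on its own, before np.where is involved. The array b and its values are made up for illustration.

```python
import numpy as np

# b is a hypothetical array, used only for illustration
b = np.array([-2, 0, 3, 5])

# The comparison is evaluated element by element
condition = b > 0

print(condition)        # [False False  True  True]
print(condition.dtype)  # bool
```

The result is a boolean array with the same shape as b, and that array is what np.where uses to choose its outputs.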

output-if-true

This is the output of np.where if the condition is True.

This could be a single value, in which case, that value will be the output whenever condition is True.

But this can also be an array or array-like object, such as a list. If it’s an array-like object, then at each position where condition is True, np.where outputs the item at the same position in the output-if-true array.

If that sounds confusing, then just sit tight. I’ll show concrete examples in the examples section.

output-if-false

This is the output of np.where if the condition is False.

Again, this could be a single value, in which case, that value will be the output whenever condition is False.

But this can also be an array or array-like object, such as a list. If it’s an array-like object, then at each position where condition is False, np.where outputs the item at the same position in the output-if-false array.
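One way to picture the three inputs is as an element-wise if/else. The sketch below compares np.where with a plain Python list comprehension. The names cond, x, and y are hypothetical, chosen only for illustration, and the comprehension is a mental model rather than a description of how Numpy implements the function.

```python
import numpy as np

cond = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])       # plays the role of output-if-true
y = np.array([10, 20, 30, 40])   # plays the role of output-if-false

result = np.where(cond, x, y)

# Mental model: take from x where cond is True, otherwise take from y
by_hand = [xv if c else yv
           for c, xv, yv in zip(cond.tolist(), x.tolist(), y.tolist())]

print(result)   # [ 1 20  3 40]
print(by_hand)  # [1, 20, 3, 40]
```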

I realize that this syntax explanation might still be a little confusing.

In my opinion, the best way to really understand the syntax of np.where and how it works, is to look carefully at some examples.

Examples of how to use Numpy where

Here, we’re going to look at several examples of the Numpy where function.

To help you understand, we’re going to start very, very simple, and then increase the complexity.

If you really want to understand how numpy.where works, you should start with the first example and work through them all. (You can obviously read the explanation, but it’s probably good to run the code too).

Run this code first

Before you run any of the following examples, you’ll need to import Numpy. So run this code first!

import numpy as np

This code will enable us to call Numpy functions with the prefix np.

Ok. Let’s get started with the examples.

EXAMPLE 1: A simple example of numpy.where

Ok. In this example, we’re going to start very simple.

We’re going to create a simple 1D Numpy array, and use a simple comparison as our condition.

Create Numpy array

Let’s first create a simple 1-dimensional Numpy array.

range_1d = np.arange(start = 1, stop = 5)

And let’s print it out, so you can see it:

print(range_1d)

OUT:

[1 2 3 4]

This is really simple. The range_1d array is just a Numpy array with the values 1 to 4.

Use np.where to find values greater than 2

Now, we’re going to use np.where to find the values greater than 2.

To do this, we’ll call np.where().

Inside of the function, we’ll have a condition that will test if the elements are greater than 2. Then we’ll output “True” if the value is greater than 2, and “False” if the value is not greater than 2.

Here’s the code:

np.where(range_1d > 2, True, False)

And here is the output:

array([False, False,  True,  True])

Explanation

So what happened here?

Let’s go back to the structure of the input array, range_1d.

The array range_1d contains the values [1,2,3,4].

Inside of the np.where function, we have a condition that tests every element of range_1d to evaluate if the element is greater than 2.

An image that explains the syntax of numpy.where for example 1.

Evaluating that condition for every element of range_1d will produce a boolean array with values True or False.

Those true or false values dictate which output np.where will produce.

A visual explanation of example 1 for Numpy where.

Here, we’ve kept it simple.

In this case, the np.where function outputs True if the condition evaluates as True, and it outputs False if the condition evaluates as False.
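As a side note, outputting True and False this way reproduces the boolean array that the condition already generates. The short check below recreates range_1d from this example and illustrates that; the names condition and result are just illustrative.

```python
import numpy as np

range_1d = np.arange(start=1, stop=5)

condition = range_1d > 2
result = np.where(condition, True, False)

# Both arrays hold the same boolean values
print(np.array_equal(result, condition))  # True
```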

But, we still could have more control over the exact outputs.

Let’s take a look at how to output something different in the next example.

EXAMPLE 2: Output 'Yes' or 'No' from np.where

Next, we’re going to create a minor modification to example 1.

Remember that in example 1, we tested a simple condition and then outputted True if the condition evaluated as true and False if the condition evaluated as false.

Let’s change that very slightly.

Here, we’re going to output 'Yes' if the condition evaluates as true and output 'No' if the condition evaluates as false.

(Note that we’re going to use the dataset we created in example 1, so if you didn’t run that example, go back and create the range_1d dataset.)

Ok. Here’s the code for example 2:

np.where(range_1d > 2, 'Yes', 'No')

And here’s the output:

array(['No', 'No', 'Yes', 'Yes'])

(Note that the output is a Numpy array of strings. Its dtype is '<U3', which means that each element is a Unicode string of at most 3 characters.)
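To check this yourself, you can inspect the dtype attribute of the result. This is a small sketch; the variable name labels is just an illustrative choice, and the leading '<' reflects the byte order of the machine, so it may differ on some systems.

```python
import numpy as np

range_1d = np.arange(start=1, stop=5)
labels = np.where(range_1d > 2, 'Yes', 'No')

# The longest output string, 'Yes', has 3 characters
print(labels.dtype)     # <U3 on most machines
print(labels.tolist())  # ['No', 'No', 'Yes', 'Yes']
```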

Explanation

Ok. What happened?

This example is almost exactly the same as example 1.

Just like in example 1, we’re testing the condition range_1d > 2.

Remember: the dataset range_1d has the values [1,2,3,4].

The major difference in this example is the output.

A visual explanation of the syntax for example 2.

If the condition range_1d > 2 is True, then np.where outputs 'Yes'.

If the condition range_1d > 2 is False, then np.where outputs 'No'.

The way that numpy.where is working in this example looks something like this.

An explanation of how Numpy where works for example 2.

Do you see what’s going on here?

Numpy where simply tests a condition … in this case, a comparison operation on the elements of a Numpy array.

If the condition is True, we output one thing, and if the condition is False, we output another thing.

EXAMPLE 3: Take output from an array, else zero

In this example, we’re going to build on examples 1 and 2.

(That means that we’ll still be using the range_1d dataset that we made in example 1. If you haven’t created that dataset, go back and do that now.)

The array range_1d contains the values [1,2,3,4].

Just like in examples 1 and 2, our condition will test if range_1d > 2. That test will operate on every element of range_1d.

But in this example, the output will be a little different.

If the condition range_1d > 2 is False, np.where will output the value 0.

But if the condition range_1d > 2 is True, numpy.where will pull the output value from the values in range_1d.

Let’s run the code and take a look.

Here’s the code:

np.where(range_1d > 2, range_1d, 0)

And here’s the output:

array([0, 0, 3, 4])

Explanation

This is really simple, once you get it (although I strongly recommend that you read example 1 and example 2 first).

Once again, as in all cases of np.where, the behavior of this code hinges on the condition.

Here, the Numpy where function tests range_1d > 2.

If that condition is true for a particular element, np.where outputs the corresponding value from range_1d. It does this element-wise, so if the condition is true for the element at index 3, it outputs the element at index 3 of range_1d.

But if the condition is false, it outputs 0.

An explanation of the syntax of example 3.

So for any value in range_1d that’s less than or equal to 2, np.where outputs 0. Otherwise, it outputs the value in range_1d.

A visual explanation of how np.where works for example 3.

This is almost the same as examples 1 and 2.

There’s a test condition, and then one output if true, and a different output if false. That’s what Numpy where does!
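For comparison, the result of example 3 can also be produced with boolean-mask assignment on a copy of the array. This is an alternative sketch, not something np.where does internally, and the names with_where and masked are hypothetical.

```python
import numpy as np

range_1d = np.arange(start=1, stop=5)

# Using np.where, as in example 3
with_where = np.where(range_1d > 2, range_1d, 0)

# Alternative: copy the array, then overwrite the elements that fail the test
masked = range_1d.copy()
masked[range_1d <= 2] = 0

print(with_where)  # [0 0 3 4]
print(masked)      # [0 0 3 4]
```

The np.where version builds a new array in one expression, while the mask version needs an explicit copy to avoid changing range_1d.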

EXAMPLE 4: Take output from a list if true, else take output from a different list

Ok.

One last example to drive this home.

Here, we’re going to use the exact same condition. We’ll test if range_1d > 2.

But if the condition is true, we’ll take the output (element-wise) from one list of numbers. If the condition is false, we’ll take the output from a different list of numbers.

Let’s run the code and look at the output.

np.where(range_1d > 2, [10,20,30,40], [-10,-20,-30,-40])

OUT:

array([-10, -20,  30,  40])

Explanation

So what happened in this example?

It’s almost exactly the same as the previous examples!

We’re testing a condition, and then taking the output from one group of numbers if true, and taking the output from a different set of numbers if false.

An explanation of the syntax of example 4.

Note that this happens element wise, meaning that if the condition is false at position 0 of range_1d > 2, it will output the 0th element from the second list. If the condition is true for the test at position 3, it will output the value at position 3 from the first list.

An explanation of how numpy.where works for example 4.

Here in example 4, we’re just testing a condition, and then outputting values element wise from different groups of numbers depending on whether the condition is true or false.
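Because the outputs are matched up position by position, the lists need to line up with the condition. The sketch below repeats example 4 and then tries a list that is one item short, which Numpy rejects with a ValueError. The names ok and raised are illustrative only.

```python
import numpy as np

range_1d = np.arange(start=1, stop=5)

# Lists with the same length as the condition work element-wise
ok = np.where(range_1d > 2, [10, 20, 30, 40], [-10, -20, -30, -40])
print(ok)  # [-10 -20  30  40]

# A list with only 3 items cannot be lined up with 4 condition values
try:
    np.where(range_1d > 2, [10, 20, 30], [-10, -20, -30, -40])
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```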

Examples of Numpy where can get much more complicated

All of the examples shown so far use 1-dimensional Numpy arrays. That’s intentional. I wanted to use a simple array as an input to make the examples extremely easy to understand.

However, everything that I’ve shown here extends to 2D and 3D Numpy arrays (and beyond).

Moreover, the conditions in these examples were very simple, but you can use much more complicated test conditions in Numpy where.

As always, I recommend that you learn how this works by using simple examples, and then increase the complexity to improve your understanding.
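As one small step up in complexity, here is a sketch that uses a 2D array and a condition built from two comparisons. The array range_2d is hypothetical and is not used elsewhere in this tutorial.

```python
import numpy as np

# A hypothetical 2D array with the values 1 through 6
range_2d = np.arange(start=1, stop=7).reshape(2, 3)

# Compound condition: greater than 2 AND even.
# Each comparison needs its own parentheses when combined with &.
result = np.where((range_2d > 2) & (range_2d % 2 == 0), range_2d, 0)

print(result)
# [[0 0 0]
#  [4 0 6]]
```

The output has the same 2D shape as the input, and the choice between range_2d and 0 is still made one element at a time.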

Frequently asked questions about Numpy where

Now that you’ve learned about Numpy where and seen some examples, let’s review some frequently asked questions about the function.

Can you run Numpy where only with a condition?

The short answer is “yes.”

In all of the examples in the examples section, we use all three parameters: condition, output-if-true, output-if-false.

But it’s possible to run np.where only with condition, and remove output-if-true and output-if-false.

If you do this, Numpy where will simply output the index positions of the elements for which condition is True.

Example:

np.where(range_1d > 2)

OUT:

(array([2, 3]),)

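Note that the output in this case is a tuple of arrays, with one array of index positions for each dimension of the input. Because range_1d is 1-dimensional, the tuple contains a single array.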
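Here is a short sketch of how that tuple can be used. It indexes the original array with the result, and then repeats the call on a hypothetical 2D array (range_2d, which does not appear elsewhere in this tutorial) to show how the tuple is laid out when there is more than one dimension.

```python
import numpy as np

range_1d = np.arange(start=1, stop=5)

indices = np.where(range_1d > 2)
print(indices)            # (array([2, 3]),)

# The tuple can be used directly to index the original array
print(range_1d[indices])  # [3 4]

# For a 2D array, the tuple has one index array per dimension
range_2d = np.arange(start=1, stop=7).reshape(2, 3)
rows, cols = np.where(range_2d > 4)
print(rows, cols)         # [1 1] [1 2]
```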

Leave your other questions in the comments below

Do you have other questions about Numpy where?

I know that it’s a little confusing for beginners …

So if you have a question, leave your question in the comments section at the bottom of the page.

Join our course to learn more about Numpy

The examples you’ve seen in this tutorial should be enough to get you started, but if you’re serious about learning Numpy, you should enroll in our premium course called Numpy Mastery.

There’s a lot more to learn about Numpy, and Numpy Mastery will teach you everything, including:

  • How to create Numpy arrays
  • How to use the Numpy random functions
  • What the “Numpy random seed” function does
  • How to reshape, split, and combine your Numpy arrays
  • and more …

Moreover, it will help you completely master the syntax within a few weeks. You’ll discover how to become “fluent” in writing Numpy code.

Find out more here:

Learn More About Numpy Mastery
