The post A quick introduction to the numpy power function appeared first on Sharp Sight.

**Contents**:

- A review of NumPy
- Introduction to numpy.power
- The syntax of numpy.power
- Examples of NumPy power
- Frequently asked questions about NumPy power

You can click on any of the above links and it will take you to the appropriate part of the tutorial.

So if you’re just looking for a quick answer to a question about the np.power function, just click on a link!

Having said that, if you’re new to NumPy, I recommend that you read the whole tutorial. NumPy can be a little complicated for beginners, and this tutorial is designed to give you a good overview of everything you need to know to use the np power function.

Very quickly, let’s review NumPy.

If you’re new to data science in Python, you might be a little confused about what exactly NumPy is.

It’s pretty simple. NumPy is a toolkit for working with arrays of numbers in Python.

Python (as I hope you know) is a very common programming language. And increasingly, Python has become one of the most important programming languages for doing data science.

One of the reasons for the popularity of Python in the data science community is that it has a set of modules that are excellent for doing data science tasks:

- Pandas enables you to manipulate dataframes
- Matplotlib gives you tools for data visualization
- NumPy enables you to work with numeric data

Essentially, Python gives you a set of tools for doing almost every part of the data science workflow.

As I’ve mentioned, NumPy focuses on working with numbers.

Specifically, NumPy has tools that enable you to:

- calculate the median of a NumPy array
- calculate the mean of a NumPy array
- sum the values of a NumPy array
- concatenate two separate NumPy arrays
- generate a random sample from a NumPy array

… and more.

There are hundreds of functions in the NumPy package … it’s a little hard to list them all.

Essentially, NumPy provides a toolkit for analyzing, reshaping, and working with NumPy arrays, which are arrays of numeric data in Python.

As such, NumPy has tools that enable you to perform a variety of mathematical computations on numbers and arrays of numbers.

One of those tools is the NumPy power function.

So what is the NumPy power function?

It’s pretty simple.

In its simplest form, numpy.power is just a tool for performing exponentiation in Python.

You almost certainly learned about exponentiation early in your school career. When we perform exponentiation, we raise one number b, called the base, to another number n, called the exponent. The exponent n is also sometimes called the “power.”

As explained by Wikipedia, this operation amounts to multiplying the base b by itself n times:

\[\large b^n = \underbrace{b \times \cdots \times b}_{n \text{ times}}\]

When we do this, we say that we raise b to the power of n.

The np.power function enables you to do this.
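To make the definition concrete, here is a minimal sketch that compares repeated multiplication with Python's built-in `**` operator and with `np.power`. The values `b = 2` and `n = 3` are just illustrative choices of mine.

```python
import numpy as np

b, n = 2, 3  # illustrative base and exponent

# Repeated multiplication: multiply n copies of b together.
product = 1
for _ in range(n):
    product *= b

print(product)         # 8
print(b ** n)          # 8, Python's built-in operator
print(np.power(b, n))  # 8, the NumPy equivalent
```

All three approaches agree for a single pair of numbers. The advantage of `np.power` shows up once the inputs are arrays.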

But in addition to allowing you to do this with single numbers, you can also do this with *arrays of numbers*.

How this works in practice can be a little complicated, but in essence, NumPy power is just a tool for performing mathematical exponentiation with arrays of numbers.

So while the numpy.power function enables you to perform simple exponentiation like b to the power of n, it also allows you to do this with large NumPy arrays.

I want to show you some examples of how this works (to develop your intuition), but before I do that, I want to explain the syntax.

Once you understand the syntax, the examples themselves will be much easier to understand.

Here, let’s just take a look at the syntax at a high level.

The syntax is really very simple, but to really “get it” you should understand exactly what the parameters are.

There are two parameters that you need to know about:

- `array-of-bases`
- `array-of-exponents`

In the official documentation for the np.power function, these are called `x1` and `x2`.

But as is often the case with the official NumPy documentation, I think those names are unintuitive. That being the case, I’m referring to `x1` and `x2` as `array-of-bases` and `array-of-exponents` respectively.

`array-of-bases` (required)

The first parameter of the np.power function is `array-of-bases`.

As this implies, the argument to this parameter should be an array of numbers. These numbers will be used as the “bases” of our exponents.

Note that this is required. You must provide an input here.

Also, the item that you supply can take a variety of forms. You can supply a NumPy array, but you can also supply an array-like input. The array-like inputs that will work here are things like a Python list, a Python tuple or one of the other Python objects that have array-like properties.

Keep in mind that you can also just supply a single integer!

`array-of-exponents` (required)

The second parameter is `array-of-exponents`, which enables you to specify the exponents that you will apply to the bases, `array-of-bases`.

Note that just like the `array-of-bases` input, this input must be a NumPy array or an array-like object. So here you can supply a NumPy array, a Python list, a tuple, or another Python object with array-like properties. You can even provide a single integer!
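As a quick sketch of that flexibility, the following passes the same bases as a list, a tuple, and a NumPy array. The variable names are mine, chosen only for illustration.

```python
import numpy as np

exponent = 2  # a single integer works as the exponent

from_list = np.power([1, 2, 3], exponent)             # Python list
from_tuple = np.power((1, 2, 3), exponent)            # Python tuple
from_array = np.power(np.array([1, 2, 3]), exponent)  # NumPy array

print(from_list)   # [1 4 9]
print(from_tuple)  # [1 4 9]
print(from_array)  # [1 4 9]
```

Whatever form the input takes, the result comes back as a NumPy array.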

Note that both arguments, `array-of-bases` and `array-of-exponents`, are *positional* arguments.

Positional arguments are a little confusing to beginners, but here’s a quick explanation.

How each input to np.power is used depends on where you put it inside of `np.power()`.

Inside of `np.power()`, you must put the bases first and you must put the exponents second. The *position* of the argument inside of `np.power()` determines how it is used by the function.

Note that this is in contrast to so-called “keyword arguments.” With keyword arguments, you use a particular keyword to designate an input to a function. And as long as you use the correct keywords, they can be in any order you’d like.

So essentially, if the argument is a “positional argument,” the order matters.

And because `array-of-bases` and `array-of-exponents` are positional arguments, the bases must be specified first and the exponents must be specified second inside of the function.

Again, this is a little confusing, so be on the lookout for a future tutorial about positional arguments.
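To see why the order matters, here is a small sketch that swaps the two inputs. The numbers 2 and 3 are just illustrative.

```python
import numpy as np

print(np.power(2, 3))  # 8: base 2, exponent 3
print(np.power(3, 2))  # 9: base 3, exponent 2
```

Same two numbers, different positions, different result.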

Ok. Now that you’ve learned a little about the syntax, let’s look at some real examples of how to use numpy.power.

We’re going to work with a few examples, starting with the simplest and then progressing through examples that are more complicated.

**Examples**:

- Raise an integer to a power
- Calculate the exponent of an array
- Use two 1-d arrays in numpy.power
- An example of “broadcasting”

Before you run the code in the following examples, you’ll need to run a small piece of code first.

You essentially need to import NumPy and give it an “alias.”

import numpy as np

When we do this, we essentially designate the code `np` as the “nickname” or alias of the `numpy` module.

This is a very common convention in Python, and it allows you to call a function starting with `np`. You’ll see what I mean in a minute.

First, we’re going to work with a *very* simple example.

Here, we’re just going to raise an integer to a power.

To do this, we’ll call the NumPy power function with the code `np.power()`. Then inside of the parentheses, we’ll provide two arguments … the base and the exponent.

np.power(2,3)

OUT:

8

This is very simple. It just calculates 2 to the 3rd power which equals 8.

Notice how the inputs work. The first input (2) is the base and the second input (3) is the exponent.

This is exactly how the remaining examples will work.

Let’s take a look at a more complicated example.

Here, we’re going to change things up a little bit.

Instead of the base being a single integer, the base will be a group of numbers organized into an array-like object (i.e., a Python list).

To be clear, we typically use np.power on *NumPy arrays*. However, I think that everything is easier to understand if we just use a Python list instead. By using a Python list, you’ll actually be able to see all of the numbers inside of the syntax.

In this example, the bases will be a simple list of numbers from 0 to 4: `[0,1,2,3,4]`.

This list will be the first input to the np.power function. We’re going to raise each element of this list to the 2nd power.

np.power([0,1,2,3,4], 2)

Which produces the following output:

array([ 0, 1, 4, 9, 16])

This is pretty straightforward once you understand what’s going on here.

Essentially, the np.power function takes the elements of the list and uses those as the *bases*. Then, it applies the exponent (the second argument to the function) to every base.

So np.power simply applies the exponent 2 to every single base in the first input array.

And the output of the function is a new NumPy array with the computed exponential values.

Now, let’s just re-do this example with a NumPy array instead of a Python list.

As I mentioned above … everything is easier to understand and see when we use a Python list, but we’ll often need to use the np.power function on actual NumPy arrays.

So, I’ll just show you the same example with a NumPy array in place of the Python list.

First, we’ll just create a NumPy array using np.arange:

array_1d = np.arange(5)

This array just contains the numbers from 0 to 4.

Then, we’ll use np.power to calculate the exponent:

np.power(array_1d, 2)

Which produces the following output:

array([ 0, 1, 4, 9, 16])

In terms of outputs, this example creates the same output as `np.power([0,1,2,3,4], 2)`. So just remember that NumPy power will work with NumPy arrays, Python lists, or any array-like object.

Ok. Now that we’ve taken a look at some simple examples, let’s move on to something more complicated.

Here, we’re going to use *two* input arrays instead of one array and one number.

But keep in mind, we’re actually going to use Python lists instead of proper NumPy arrays, just for the sake of clarity.

The first list is going to be the list of bases, and the second list will be the list of exponents.

np.power([2,2,2,2,2], [0,1,2,3,4])

Which produces the following output:

array([ 1, 2, 4, 8, 16])

So what happened here?

The first input list – `[2,2,2,2,2]` – is the list of *bases*.

The second input list – `[0,1,2,3,4]` – is the list of *exponents*.

Recall the previous examples.

In an example like `np.power(2,3)`, the first argument is the base and the second argument is the exponent.

The code `np.power([2,2,2,2,2], [0,1,2,3,4])` is essentially the same! The first list is the bases, and the second list is the exponents.

The only major difference is how these inputs are put together to produce the output.

NumPy applies the exponents to the bases *element wise*. This means that it applies the first exponent to the first base, the second exponent to the second base, and so on.

Considering that our input lists are fairly simple, this is still a fairly simple example. But it shows you how powerful (heh heh) the np.power function is when you start working with arrays and array-like objects.
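If "element wise" still feels abstract, here is a rough sketch that does the same pairing by hand with plain Python and compares it with np.power. The names `bases`, `exponents`, `by_hand`, and `with_numpy` are mine.

```python
import numpy as np

bases = [2, 2, 2, 2, 2]
exponents = [0, 1, 2, 3, 4]

# Pair each base with the exponent in the same position.
by_hand = [b ** n for b, n in zip(bases, exponents)]
with_numpy = np.power(bases, exponents)

print(by_hand)     # [1, 2, 4, 8, 16]
print(with_numpy)  # [ 1  2  4  8 16]
```

The list comprehension is only there to show the pairing. In practice you would just call np.power.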

Ok. Now let’s move on to a more complicated example where we have a multi-dimensional array of bases.

Here, we’re going to use the NumPy power function to compute exponents for a 2-dimensional array of bases.

Moreover, the *exponents* will also be in an array-like structure.

Let’s take a look at the example and then I’ll explain exactly how NumPy handles this.

First, let’s create the input array. This is actually going to be a NumPy array, instead of a Python list.

To create this array, we’ll simply use the np.array function.

array_2d = np.array([[0,0,0,0,0],[1,1,1,1,1],[2,2,2,2,2],[3,3,3,3,3]])

Let’s print it out, so you can see it:

print(array_2d)

OUT:

[[0 0 0 0 0]
 [1 1 1 1 1]
 [2 2 2 2 2]
 [3 3 3 3 3]]

This is a fairly simple 2-dimensional NumPy array. As you can see, the first row has all 0s, the second row has all 1s, and so on.

Now, let’s run the np.power function on this and apply some exponents to these bases.

np.power(array_2d, [0,1,2,3,4])

Which produces the following output:

array([[ 1,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1],
       [ 1,  2,  4,  8, 16],
       [ 1,  3,  9, 27, 81]])

What happened here?

You have to think about the structure of the inputs.

The first argument – the array of bases – is a 2-dimensional array. The second argument – the exponents – is a 1-dimensional array. And notice that the number of exponents (5) matches the number of columns in the array of bases.

In a case like this, the NumPy power function is sort of smart. It applies the exponents to *every row*.

It still does this element-wise, meaning that the exponent of column 0 is applied to the base of column 0 …. the exponent of column 1 is applied to the base of column 1, and so on.

But essentially, it applies the exponents to every row.

This behavior is called *broadcasting*, and it appears quite a bit in NumPy when you’re working with arrays that have different dimensions. A full explanation of broadcasting is beyond the scope of this post. Just understand that this is common in NumPy, and you’ll probably need to understand how it works.
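One way to picture what broadcasting does in this example is to stretch the exponents into a full 2-dimensional array yourself and compare the results. This is only a sketch: `np.broadcast_to` is a real NumPy function, but using it here is my own illustration, and np.power does not need you to call it.

```python
import numpy as np

array_2d = np.array([[0, 0, 0, 0, 0],
                     [1, 1, 1, 1, 1],
                     [2, 2, 2, 2, 2],
                     [3, 3, 3, 3, 3]])
exponents = np.array([0, 1, 2, 3, 4])

# Stretch the 1-d exponents to the shape of the bases: one copy per row.
stretched = np.broadcast_to(exponents, array_2d.shape)
print(stretched.shape)  # (4, 5)

# Both calls give the same result.
implicit = np.power(array_2d, exponents)
explicit = np.power(array_2d, stretched)
print(np.array_equal(implicit, explicit))  # True
```

So the 1-dimensional exponents behave as if they had been copied once for every row of bases.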

Before I end the tutorial, let’s quickly discuss any frequently asked questions.

**Frequently asked questions**:

Does NumPy power work with negative exponents?

Not with integer inputs. If you raise an integer base to a negative integer exponent, you’ll get an error:

np.power(2, -1)

ValueError: Integers to negative integer powers are not allowed.

If you want to use negative exponents, make the base a float (for example, `np.power(2.0, -1)`) or use `numpy.float_power()` instead.
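The error is specific to integer inputs. Here is a short sketch of the failing case next to two alternatives, a float base and `np.float_power`. The `try`/`except` wrapper is only there so the script keeps running after the error.

```python
import numpy as np

# Integer base with a negative integer exponent raises an error.
try:
    np.power(2, -1)
except ValueError as err:
    print("ValueError:", err)

# A float base works.
print(np.power(2.0, -1))      # 0.5

# np.float_power converts the inputs to floats first.
print(np.float_power(2, -1))  # 0.5
```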

Do you have any other questions about the NumPy power function?

Leave your question in the comments below.

NumPy is extremely powerful, and very important for data science in Python.

That being the case, if you’re interested in data science in Python, you should really learn more about NumPy.

You can learn more by checking out our other tutorials:

- How to use the NumPy max function
- Numpy axes explained
- A quick introduction to the NumPy array
- How to use Numpy reshape
- How to use the NumPy zeros function

… and more.

I also suggest that you check out our tutorials about matplotlib and Pandas. For example, we have tutorials about the Pandas dataframe, the Pandas iloc method, and more.

Ultimately, if you’re interested in data science in Python, there’s a lot to learn.



The post How to use NumPy hstack appeared first on Sharp Sight.

NumPy hstack is a very simple tool that we use to manipulate NumPy arrays. Specifically, we use np.hstack to combine NumPy arrays horizontally. The function “stacks” arrays in the horizontal direction.

Again, this is a fairly simple function, but to use it properly, you need to know a few things. You first need to have a basic understanding of NumPy. You also need to know how the syntax works for this particular function. Once you have those things, you can start playing around with some code.

That being the case, this tutorial will give you a quick review of NumPy and then explain the syntax. Later in the tutorial, I’ll show you clear examples of how to use np.hstack.

**Contents:**

- A quick review of NumPy
- Introduction to NumPy hstack
- The syntax of np.hstack
- NumPy hstack examples
- Frequently asked questions about NumPy hstack

You can click on any of the above links to go to the appropriate section of the tutorial. But if you’re relatively new to NumPy, you should probably read everything.

Let’s get started.

If you’ve done data science in Python for a while, you might know a little bit about NumPy.

But if you’re a beginner, you might still be a little confused about what NumPy is.

To put it simply, NumPy is just a toolkit for working with numeric data in Python.

A little more specifically, NumPy provides tools for working with arrays of numbers.

You can visualize these arrays as grids of numbers, arranged in rows and columns.

NumPy arrays can be 2-dimensional (like a table with rows and columns), but also 1-dimensional (like a vector), or multi-dimensional.

The NumPy package provides tools for creating these arrays. For example, there are tools for creating arrays of all zeros or creating arrays of all ones. There are also tools for transforming other Python objects into NumPy arrays. For example, the np.array() function can transform a list or tuple into a NumPy array.
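Here is a small sketch of those creation tools. The shapes and values are arbitrary choices of mine, just to show what each call produces.

```python
import numpy as np

zeros = np.zeros(3)                      # array of all zeros
ones = np.ones((2, 3))                   # 2 rows, 3 columns of ones
from_list = np.array([1, 2, 3])          # list -> 1-dimensional array
from_tuple = np.array(((1, 2), (3, 4)))  # nested tuple -> 2-dimensional array

print(zeros.shape)      # (3,)
print(ones.shape)       # (2, 3)
print(from_list.ndim)   # 1
print(from_tuple.ndim)  # 2
```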

The NumPy package also provides tools for *manipulating* and organizing these arrays.

So there are tools to change the shape of a NumPy array or to summarize a NumPy array.

And there are tools for *combining* NumPy arrays together.

That’s exactly what NumPy hstack does. It’s one of several tools that help you combine together other NumPy arrays.

Let’s take a closer look at numpy.hstack and how it works.

As I just mentioned, NumPy hstack is just a function for combining together other NumPy arrays.

Specifically, it combines together NumPy arrays in the “horizontal” direction. Another way of saying this is that it combines together NumPy arrays “column wise.”

This might not make sense, so let’s take a look at a visual example.

Let’s say that you have two NumPy arrays. One of the arrays is filled with 0’s and the other is filled with 1’s.

You want to combine them together horizontally.

To do this, you can use the NumPy hstack function:
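Here is a minimal sketch of that scenario in code. The 2-by-2 shapes and the variable names are my own choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 2), dtype=int)
ones = np.ones((2, 2), dtype=int)

combined = np.hstack((zeros, ones))
print(combined)
# [[0 0 1 1]
#  [0 0 1 1]]
```

The zeros end up on the left and the ones on the right, in the order they were passed in.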

There are other ways to combine together NumPy arrays, but np.hstack is simpler than the other options. It’s easier to use than np.concatenate (although np.concatenate is more flexible).

That’s essentially it ….

NumPy hstack is just a function for combining together NumPy arrays.

Having said that, let’s start to examine the specific details of how it works.

Let’s take a look at the syntax.

The syntax is fairly simple.

If you’ve imported NumPy as `np`, then you can call the NumPy hstack function with the code `np.hstack()`. Then, inside of the parentheses, you provide the NumPy arrays that you want to combine.

There are a few important details to consider though, so let’s talk about some of them.

One thing that might confuse a beginner is the prefix `np`. This is sort of a nickname that we give to the NumPy module in our code. It’s a common convention.

Having said that, in order to make that nickname work properly, you need to import NumPy a certain way. You need to import NumPy with the code `import numpy as np`.

It’s also possible to import numpy with the code `import numpy`. If you do it that way, you actually would need to call the function with the code `numpy.hstack()`.

The point that I’m trying to emphasize is that how exactly you call the function depends on how you’ve imported NumPy.

Going forward though, we’ll be referring to the function as np.hstack.
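As a quick sketch of the two import styles, the following imports NumPy both ways and calls the function under each name. Importing it twice is only for demonstration; in real code you would pick one.

```python
import numpy
import numpy as np

# Both names refer to the same module object.
print(np is numpy)  # True

a = numpy.hstack(([0, 0], [1, 1]))
b = np.hstack(([0, 0], [1, 1]))
print(a)  # [0 0 1 1]
print(b)  # [0 0 1 1]
```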

Ok … having explained that, let’s now talk about the inputs to the np.hstack function.

There’s really only one input to the function, and that’s a tuple of input arrays.

For the sake of clarity, I’ll call this `tuple-of-input-arrays`.

`tuple-of-input-arrays` (required)

The only input to np.hstack is a group of arrays, organized into a Python tuple.

Having said that, there is a lot of flexibility in the types of inputs and how you specify the inputs.

Although the standard documentation for the function says that you should provide multiple NumPy arrays organized inside of a tuple, it will actually accept any group of array-like structures of numbers organized inside of a tuple or list.

So you can provide NumPy arrays organized inside of a tuple.

Or you can provide NumPy arrays organized inside of a list.

Or, you could even provide lists of numbers organized inside of a tuple or list.

My point here is that the np.hstack function is fairly flexible in terms of the inputs it will accept.

I’ll show you examples of this in the examples section.

Speaking of examples, let’s look at some.

- Use np.hstack on two lists of numbers
- Combine two 1-dimensional NumPy arrays
- Combine two 2-dimensional arrays

Before you run these examples, make sure you import NumPy properly.

Make sure you run the following code.

import numpy as np

This will import NumPy with the alias “`np`” which will enable us to refer to the function as np.hstack.

Ok, on to the first example.

First, instead of operating on proper NumPy arrays, we’re actually going to combine two lists of numbers.

The reason for this is that by using lists in our syntax, it’s a little more obvious what’s going on.

This will also demonstrate that np.hstack operates on lists in addition to arrays.

Essentially, we’re going to give np.hstack two Python lists, organized inside of a Python tuple.

The first list is a list of two 0’s: `[0,0]`.

The second list is a list of two 1’s: `[1,1]`.

And they are organized inside of a tuple: `([0,0],[1,1])`.

Collectively, that tuple of lists is the input to np.hstack:

np.hstack(([0,0],[1,1]))

And here is the output:

array([0, 0, 1, 1])

As you can see, the output is a NumPy array.

So what happened here?

NumPy hstack just combined the inputs horizontally.

As I mentioned earlier, numpy.hstack is fairly flexible in terms of the inputs that it will allow.

Above, we took our two Python lists and organized them inside of a tuple: `([0,0],[1,1])`.

But you can actually organize those two Python lists inside of a list instead: `[[0,0],[1,1]]`.

If you did that, the code would look like this:

np.hstack([[0,0],[1,1]])

Which produces the following output:

array([0, 0, 1, 1])

This is the same output as the original syntax where we organized the lists inside of a tuple.

I’m showing you this to reinforce the idea that np.hstack is fairly flexible in terms of the types of inputs it will accept and the structure of those inputs.

Ok. Speaking of flexibility of inputs, let’s re-do the above example with proper NumPy arrays.

In the previous example, we were combining two 1D *lists*.

Now, we’re going to redo the same example with arrays.

First, let’s just create the NumPy arrays.

np_array_zeros_1d = np.array([0,0])
np_array_ones_1d = np.array([1,1])

We can print them out so you can see the contents:

print(np_array_zeros_1d)
print(np_array_ones_1d)

OUT:

[0 0]

[1 1]

Essentially, `np_array_zeros_1d` is a 1-dimensional NumPy array of zeros and `np_array_ones_1d` is a 1-dimensional array of ones.

Now, let’s combine those two NumPy arrays with np.hstack:

np.hstack((np_array_zeros_1d,np_array_ones_1d))

OUT:

array([0, 0, 1, 1])

This example is almost the same as the example in the previous section.

The output is identical.

The only difference is that here we used two NumPy arrays instead of two lists.

Essentially, np.hstack took the two 1-dimensional NumPy arrays and combined them together in the horizontal direction.

Finally, let’s combine two 2-dimensional NumPy arrays.

This is very similar to the other examples, so it helps if you’ve already reviewed the other two simpler examples.

First, we’re just going to create two 2-dimensional numpy arrays.

We’re going to create an array of zeros with 2 rows and 2 columns:

np_array_zeros_2d = np.zeros(shape = (2,2), dtype = 'int')

And we’re going to create an array of ones with 2 rows and 3 columns:

np_array_ones_2d = np.ones(shape = (2,3), dtype = 'int')

And we can print them out quickly to see the contents:

print(np_array_zeros_2d)
print(np_array_ones_2d)

OUT:

[[0 0]
 [0 0]]

[[1 1 1]
 [1 1 1]]

So we just have two 2-dimensional NumPy arrays.

Notice also that these arrays have *the same number of rows*. This is important when we use np.hstack on multi-dimensional arrays.

Now, we’re going to combine the arrays with the np.hstack function.

This is extremely straightforward, and we’re going to do it almost exactly the same as how we did it in the previous examples.

We’re going to call the function, and provide the two NumPy arrays as inputs, inside of a tuple.

np.hstack((np_array_zeros_2d, np_array_ones_2d))

OUT:

array([[0, 0, 1, 1, 1],
       [0, 0, 1, 1, 1]])

Again, this is really simple to understand if you’ve understood the previous examples.

The np.hstack function just combined together the two arrays in the horizontal direction. It essentially combined the arrays together column-wise.

Let’s quickly cover some of the frequently asked questions about NumPy hstack.

**Frequently asked questions:**

- What’s the difference between np.hstack and np.vstack?
- What’s the difference between np.hstack and np.concatenate?
- Does the number of rows need to be the same?

If you’ve been paying attention to the earlier parts of this tutorial, you should already know that np.hstack combines arrays horizontally.

On the other hand, np.vstack combines arrays *vertically*.

They’re very similar in how they work … the major difference is that one combines horizontally and the other combines vertically.

For more information about np.vstack, check out our tutorial about NumPy vstack.
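To see the difference side by side, here is a rough sketch that runs both functions on the same pair of arrays. The 2-by-2 shapes and the names are my own choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 2), dtype=int)
ones = np.ones((2, 2), dtype=int)

side_by_side = np.hstack((zeros, ones))
stacked = np.vstack((zeros, ones))

print(side_by_side.shape)  # (2, 4): same rows, more columns
print(stacked.shape)       # (4, 2): more rows, same columns
```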

Essentially, np.hstack is like a special case of np.concatenate.

Whereas np.hstack only combines horizontally, and np.vstack combines vertically, np.concatenate can combine arrays in any direction.

So again, np.hstack is like a special case of np.concatenate.

Another way of saying this is that np.concatenate is like a more general and more flexible version of np.vstack or np.hstack.

To give you an example though, take a look at the following:

np_array_zeros_2d = np.zeros(shape = (2,2), dtype = 'int')
np_array_ones_2d = np.ones(shape = (2,3), dtype = 'int')
np.hstack((np_array_zeros_2d, np_array_ones_2d))

Versus …

np_array_zeros_2d = np.zeros(shape = (2,2), dtype = 'int')
np_array_ones_2d = np.ones(shape = (2,3), dtype = 'int')
np.concatenate((np_array_zeros_2d, np_array_ones_2d), axis = 1)

In both cases, the output will be the same. The syntax for np.concatenate is a little more complicated, because you have to reference the appropriate array axis. But in either case, we’re combining NumPy arrays horizontally.
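If you want to check that claim rather than compare the printed arrays by eye, something like the following should work. It uses `np.array_equal`, which is a real NumPy function, though the shorter variable names here are mine.

```python
import numpy as np

zeros = np.zeros((2, 2), dtype=int)
ones = np.ones((2, 3), dtype=int)

via_hstack = np.hstack((zeros, ones))
via_concatenate = np.concatenate((zeros, ones), axis=1)

print(np.array_equal(via_hstack, via_concatenate))  # True
print(via_hstack.shape)                              # (2, 5)
```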

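Yes, the number of rows needs to be the same.

Speaking generally, when you use np.hstack on 2-dimensional arrays, the number of rows needs to be the same for all of the arrays that you’re combining together. (1-dimensional arrays are the exception: they are simply joined end to end, so they can have different lengths.)

If the number of rows is not the same, you’ll get an error.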
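Here is a short sketch of what that failure looks like, using one array with 2 rows and another with 3. The array names are mine, and the `try`/`except` is only there so the script keeps running after the error.

```python
import numpy as np

two_rows = np.zeros((2, 2), dtype=int)
three_rows = np.ones((3, 2), dtype=int)

try:
    np.hstack((two_rows, three_rows))
except ValueError as err:
    print("ValueError:", err)

# 1-dimensional inputs are joined end to end, so lengths may differ.
print(np.hstack(([0, 0], [1, 1, 1])))  # [0 0 1 1 1]
```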

NumPy hstack is just one NumPy function among many functions.

To really master NumPy (and data science in Python), you’ll have to learn a lot more.

I recommend that you check out some of our other tutorials:

- How to use numpy random normal in Python
- An explanation of NumPy random seed
- How to use the NumPy min function
- How to use the NumPy append function

Those are just a few tutorials to help you get started.



The post How to use NumPy random choice appeared first on Sharp Sight.

I recommend that you read the whole blog post, but if you want, you can skip ahead. Here are the contents of the tutorial …

**Contents:**

- a quick review of NumPy
- why we use np.random.choice
- the syntax of NumPy random choice
- examples of np.random.choice

Again, if you have the time, I strongly recommend that you read the whole tutorial. Everything will make more sense if you read everything carefully and follow the examples.

Ok … let’s get into it.

First of all, what is np.random.choice?

NumPy random choice is a function from the NumPy package in Python.

You might know a little bit about NumPy already, but I want to quickly explain what it is, just to make sure that we’re all on the same page.

NumPy is a data manipulation module for Python.

Specifically, the tools from NumPy operate on arrays of numbers … i.e., numeric data.

Because NumPy functions operate on numbers, they are especially useful for data science, statistics, and machine learning.

For example, if you want to do some data analysis, you’ll often be working with tables of numbers. Frequently, when you work with data, you’ll need to organize it, reshape it, clean it and transform it. We call these data cleaning and reshaping tasks “data manipulation.”

In recent years, NumPy has become particularly important for “machine learning” and “deep learning,” since these often involve large datasets of numeric data. When you’re doing machine learning and deep learning, numeric data manipulation is a very big part of the workflow.

In any case, whether you’re doing statistics or analysis or deep learning, NumPy provides an excellent toolkit to help you clean up your data.

One common task in data analysis, statistics, and related fields is taking random samples of data.

You’ll see random samples in probability, Bayesian statistics, machine learning, and other subjects. Random samples are very common in data-related fields.

NumPy random choice provides a way of *creating* random samples with the NumPy system.

If you’re working in Python and doing any sort of data work, chances are (heh, heh), you’ll have to create a random sample at some point.

NumPy random choice can help you do just that.

To explain it though, let’s take a look at an example.

Think of a die … the kind of die that you would see in a game.

A typical die has six sides. Each side has some dots on it, corresponding to a number 1 through 6. Essentially, a die has the numbers 1 to 6 on its six different faces.

If you roll the die, when the die lands, one face will emerge pointing upwards, so rolling the die is exactly like selecting a number between 1 and 6. The numbers 1 to 6 on the die are the possible outcomes that can appear, and rolling a die is like randomly *choosing* a number between 1 and 6.

So essentially, in the example of rolling a die, we have possible outcomes (i.e., the faces), and a random process that chooses one of them.

The NumPy random choice function is a lot like this. Given an input array of numbers, numpy.random.choice will *choose* one of those numbers randomly.

So let’s say that we have a NumPy array of 6 integers … the numbers 1 to 6.

If we apply np.random.choice to this array, it will select one. It will *choose* one randomly…. it’s essentially the same as rolling a die.

That’s how np.random.choice works. You input some items, and the function will randomly choose one or more of them as the output.
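The die analogy translates into code fairly directly. Here is a minimal sketch: the name `die_faces` is mine, and the call to `np.random.seed` is an optional extra that I've added only so the result is repeatable.

```python
import numpy as np

np.random.seed(0)  # optional: makes the "roll" repeatable

die_faces = np.arange(1, 7)  # array([1, 2, 3, 4, 5, 6])
roll = np.random.choice(die_faces)
print(roll)  # one of the values 1 through 6
```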

Conceptually, this function is easy to understand, but using it properly can be a little tricky.

Ultimately, to use NumPy random choice properly, you need to know the syntax and how the syntax works.

That being the case, let’s look at the syntax of np.random.choice.

One quick note …

In this tutorial, you’ll see me refer to the function as np.random.choice.

The term “`np`” refers to NumPy. But, to get the syntax to work properly, you need to tell your Python system that you’re referring to NumPy as “np”. You need to run the code `import numpy as np`. This code essentially tells Python that we’re giving the NumPy package the nickname “`np`”.

I’ll show you exactly how to do that again in the examples section of this tutorial, but I want to briefly explain it before we look at the syntax.

Ok, let’s take a look at the syntax.

The `np.random.choice()` function is fairly simple. When you use it, there is the name of the function, and then some parameters that will be enclosed inside of parentheses.

Because the parameters of the function are important to how it works, let’s take a closer look at the parameters of NumPy random choice.

There are four parameters for the NumPy random choice function:

- `a`
- `size`
- `replace`
- `p`

Let’s discuss each of these individually.

`a` (required)

The `a` parameter enables us to specify the array of input values … typically a NumPy array.

This is essentially the set of input elements from which we will generate the random sample.

Note that the `a` parameter is *required* … you need to provide some array-like structure that contains the inputs to the random selection process.

Also note that the `a` parameter is flexible in terms of the inputs that it will accept. Typically, we’ll supply a NumPy array of numbers to the `a` parameter. However, because it is flexible, it will also accept things like Python lists, tuples, and other Python sequences.

Moreover, instead of supplying a sequence like a NumPy array, you can also just provide a *number* (i.e., an integer). If you provide an integer `n`, the function will behave as if you had supplied the integers from 0 up to but excluding `n`. In other words, it’s as if you supplied a NumPy array with the code `np.arange(n)`. I’ll show you an example of this in the examples section of this tutorial.
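Here’s a minimal sketch of that flexibility. The variable names are my own and are only for illustration; each call draws a single value from a different kind of input.

```python
import numpy as np

np.random.seed(0)

# Each call draws one element from a different kind of input.
from_array = np.random.choice(np.array([10, 20, 30]))
from_list = np.random.choice([10, 20, 30])
from_tuple = np.random.choice((10, 20, 30))

# An integer n is treated like np.arange(n), i.e. 0, 1, ..., n-1.
from_int = np.random.choice(3)

print(from_array, from_list, from_tuple, from_int)
```

Whatever the input type, the selected value always comes from the supplied elements.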

`size`

The `size` parameter describes (…. wait for it ….)

… the *size* of the output.

Remember that the NumPy random choice function accepts an input of elements, chooses randomly from those elements, and outputs the random selections as a NumPy array.

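Because the output of numpy.random.choice is a NumPy array, the array will have a *size* (i.e., a number of elements). If you supply an integer to `size`, the output will be a 1-dimensional array with that many elements. If you don’t supply anything, the function returns a single value. If you know about NumPy arrays, this will make sense, but if you’re new to NumPy this may be confusing.

Therefore, if you don’t know what the `size` attribute is, I suggest that you read our tutorial about NumPy arrays. Specifically, you should read the section about the attributes of NumPy arrays.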
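To make this concrete, here’s a small sketch (the names `die`, `single`, `five`, and `grid` are just illustrative). It shows `size` left out, set to an integer, and set to a tuple, which NumPy also accepts when you want a multi-dimensional output.

```python
import numpy as np

np.random.seed(0)
die = np.arange(1, 7)

single = np.random.choice(die)               # no size: one value, not an array
five = np.random.choice(die, size=5)         # 1-D array with 5 values
grid = np.random.choice(die, size=(2, 3))    # 2-D array with 2 rows and 3 columns

print(np.ndim(single), five.shape, grid.shape)
```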

`replace`

The `replace` parameter specifies whether or not you want to sample with replacement. By default, it is set to `replace = True`.

If you’ve taken a statistics class, you’ll probably be familiar with this.

… but if you *haven’t* taken a stats class, the idea of sampling with and without replacement might be foreign.

That being the case, let me quickly explain.

Let’s say that you have 4 simple cards on a table: a diamond, a spade, a heart, and a club. (This is an extremely simple example, so we’re working with simplified playing cards.)

I turn them over and mix them up on the table. Then I ask you to close your eyes and select a card.

You make your selection … it’s the heart card.

Next, I ask you to select another card.

… now, this is the critical point.

Do you put your first card back or not? Do you “replace” your initial selection?

If you *do* put your card back, then it will be possible to re-select the heart card, or any of the other three cards. But if you *do not* replace your initial card, then it will only be possible to select a spade, diamond, or club.

Essentially, *replacement* makes a difference when you choose multiple times.

And this is what the `replace` parameter controls. It controls whether or not an element that is chosen by numpy.random.choice gets *replaced* back into the pool of possible choices.

I’ll explain this again in the examples section, so you can see it in action.
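In the meantime, here’s a rough sketch of the difference, using a hypothetical list called `cards`. It also shows what I’d expect when you ask for more unique items than the input contains: NumPy raises a `ValueError`.

```python
import numpy as np

np.random.seed(0)
cards = ['Diamond', 'Spade', 'Heart', 'Club']

# Without replacement: the three draws must all be different cards.
no_repeat = np.random.choice(cards, size=3, replace=False)

# With replacement: repeats are allowed, so we can draw more than 4 times.
with_repeat = np.random.choice(cards, size=10, replace=True)

# Asking for 5 different cards from a set of 4 is an error.
try:
    np.random.choice(cards, size=5, replace=False)
    too_many_failed = False
except ValueError:
    too_many_failed = True

print(no_repeat, len(with_repeat), too_many_failed)
```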

`p`

Finally, the `p` parameter controls the probability of selecting a given item.

By default, each item in the input array has an equal probability of being selected.

It’s like rolling a fair die.

A fair die has 6 sides, and each side is equally likely to come up. So the probability of rolling a 1 is .1667 (i.e., 1/6th). The probability of rolling a 2 is also .1667, etc.

Similarly, if we set up NumPy random choice with the input values 1 through 6, then each of those values will have an equal probability of being selected, by default.

But we can change that. We can manually specify the probabilities of the different outcomes. For example, we could give the outcome ‘`1`’ a probability of .5, and give each of the other outcomes a probability of .1. (This is akin to rolling an unfair, weighted die.)

Essentially, this is what the `p` parameter controls: the probabilities of selecting the different input elements.

Note that the `p` parameter is optional, and if we don’t provide anything, NumPy just treats each outcome as equally likely.

If we *do* provide something to the `p` parameter, then we need to provide it in the form of an “array like” object, such as a NumPy array, list, or tuple. It needs one probability for each input element, and the probabilities need to sum to 1.
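As a quick sketch (with illustrative names `faces` and `weights`), here is a weighted die, followed by a set of probabilities that NumPy should reject because they don’t sum to 1.

```python
import numpy as np

np.random.seed(0)
faces = np.arange(1, 7)
weights = [.5, .1, .1, .1, .1, .1]   # one probability per face, summing to 1

sample = np.random.choice(faces, size=10, p=weights)

# Probabilities that do not sum to 1 are rejected with a ValueError.
try:
    np.random.choice(faces, p=[.5, .5, .5, .5, .5, .5])
    bad_p_rejected = False
except ValueError:
    bad_p_rejected = True

print(sample, bad_p_rejected)
```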

Now that we’ve looked at the syntax of numpy.random.choice, and we’ve taken a closer look at the parameters, let’s look at some examples.

**Examples:**

- select a random number from a numpy array
- generate a random sample from a numpy array
- perform random sampling with replacement
- change the probabilities of different outcomes
- select a sample from a list of items

Before you run any of these examples, you’ll need to run some code as a preliminary setup step.

Specifically, you’ll need to properly import the NumPy module.

Keep in mind that to import the NumPy module into your code environment, you’ll need to have NumPy installed on your computer first. Installing NumPy can be complicated, and it’s beyond the scope of this blog post. Having said that, I recommend that you just use Anaconda to get the modules properly installed.

But assuming that you have NumPy installed on your computer, you can import it into your working environment with the following code:

import numpy as np

This will import NumPy with the nickname `np`. Going forward, we will syntactically refer to NumPy as `np` in our code.

In this first example, we’re going to select a single integer from a range of possible integers.

More specifically, we’re going to select a single integer between 0 and 9.

First, before we use np random choice to randomly select an integer from an array, we actually need to *create* the NumPy array.

Let’s do that now.

Here, we’re going to create a simple NumPy array with the numpy.arange function.

array_0_to_9 = np.arange(start = 0, stop = 10)

This is fairly straightforward, as long as you understand how to use np.arange. If you don’t, make sure to read our numpy.arange tutorial.

Using NumPy arange this way has created a new array, called array_0_to_9. This array contains the integers from 0 to 9.

You can print it out with the print function:

print(array_0_to_9)

OUTPUT:

[0 1 2 3 4 5 6 7 8 9]

Visually, we can represent the array as follows:

This is really straightforward … this array contains the integers from 0 to 9.

Next, we’re going to randomly select one of those integers from the array.

To select a random number from `array_0_to_9` we’re now going to use numpy.random.choice.

np.random.seed(0)
np.random.choice(a = array_0_to_9)

OUTPUT:

5

If you read and understood the syntax section of this tutorial, this is somewhat easy to understand. But there are a few potentially confusing points, so let me explain it.

Essentially, we’re using np.random.choice with the ‘`a`’ parameter. You’ll remember from the syntax section earlier in this tutorial that the `a` parameter enables us to set the input array (i.e., the NumPy array that contains our input values). In other words, the code `a = array_0_to_9` indicates that the input values are contained in the array `array_0_to_9`.

Remember, the input array `array_0_to_9` simply contains the numbers from 0 to 9.

When we use np.random.choice to operate on that array, it simply randomly selects one of those numbers.

In this case, it randomly selects the number 5.

Visually, we can represent the operation like this:

The input array has 10 values, and NumPy random choice randomly chooses one of them.

There’s one part of this code that confuses many beginners, so I want to address it.

Before we ran the line of code `np.random.choice(a = array_0_to_9)`, we ran the code `np.random.seed(0)`.

We need np.random.seed because it “seeds” the random number generator for numpy.random.choice.

But WTF is a “seed” anyway?

This is a little complicated, but I’ll briefly explain here.

The NumPy random choice function operates on the principle of pseudorandom number generation.

When we use a pseudorandom number generator, the numbers in the output *approximate* random numbers, but are not exactly “random.” In fact, when we use pseudorandom numbers, the output is actually *deterministic*; the output is actually determined by an initializing value called a “seed.”

Let me say that again: when we set a seed for a pseudorandom number generator, the output is completely determined by the seed.

What that means is that if we use the same seed, a pseudorandom number generator will produce the same output.

Let me show you:

np.random.seed(0)
np.random.choice(a = np.arange(10))

This produces the output 5.

Now run it again with the same seed.

np.random.seed(0)
np.random.choice(a = np.arange(10))

It produces the output 5 again.

You can run this code as many times as you like. If you use the same seed, it will produce the exact same output.

What this means is that np.random.choice is random-ish. It’s sort of random, in the sense that there will be no discernible relationship between the seed and the output. But you have to remember that using the same seed will produce the same output.

This is actually good, because it makes the results of a pseudorandom function reproducible. If I share my code with you, and you run it with the same seed, you will get the exact same result. This is good for code testing, among other things.

If this is still confusing, you should read our tutorial about numpy.random.seed, which explains random number generation with NumPy.
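Here’s a short sketch of that reproducibility with a sample of several values instead of one. The names `first`, `second`, and `third` are only illustrative.

```python
import numpy as np

np.random.seed(0)
first = np.random.choice(10, size=5)

np.random.seed(0)              # reset to the same seed
second = np.random.choice(10, size=5)

np.random.seed(1)              # a different seed
third = np.random.choice(10, size=5)

print(np.array_equal(first, second))   # same seed, same draws
print(first, third)
```

The first two samples match element for element, because the generator was reset to the same seed before each call.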

Ok.

Now that I’ve shown you how to select a single random number from a specific NumPy array, let’s take a look at another way to select a number from a sequence of values.

Here, we’re going to select a number from the numbers 0 to 9. It’s essentially just like the prior example.

The one major difference is that we’re not going to supply a specific input array. Instead, we’re just going to provide a number inside of the parentheses when we call np.random.choice. Here, we’re going to run the code `np.random.choice(10)`.

np.random.seed(0)
np.random.choice(10)

Which produces the exact same output as in the previous example.

OUTPUT:

5

What’s going on here?

In this example, we ran the code `np.random.choice(10)`. We did not provide a specific NumPy array as an input. Instead, we just provided the number `10`.

When we provide a number to np random choice this way, it will automatically *create* a NumPy array using NumPy arange. Effectively, the code `np.random.choice(10)` is identical to the code `np.random.choice(a = np.arange(10))`. So by running np.random.choice this way, it will create a new numpy array of values from 0 to 9 and pass that as the input to numpy.random.choice.

This is essentially a shorthand way to both create an array of input values and then select from those values using the NumPy random choice function.
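You can check the equivalence yourself with a sketch like the following. The seed value and the names `shorthand` and `explicit` are arbitrary choices of mine.

```python
import numpy as np

np.random.seed(3)
shorthand = np.random.choice(10, size=5)

np.random.seed(3)
explicit = np.random.choice(a=np.arange(10), size=5)

print(shorthand, explicit)
print(np.array_equal(shorthand, explicit))
```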

Now that you’ve learned how to select a *single* number from a NumPy array, let’s take a look at how to create a random sample with NumPy random choice. That is, we’re going to select *multiple* elements from an input range.

First, let’s just create a NumPy array.

Here, we’ll create a NumPy array of values from 0 to 99.

array_0_to_99 = np.arange(100)

Now that we have our input array, let’s select a sample of 5 numbers from it.

To do this, we’ll use the `size` parameter.

np.random.seed(1)
np.random.choice(array_0_to_99, size = 5)

OUTPUT:

array([37, 12, 72, 9, 75])

What happened here?

The NumPy random choice function randomly selected 5 numbers from the input array, which contains the numbers from 0 to 99.

The output is basically a random sample of the numbers from 0 to 99.

Next, let’s create a random sample with replacement using NumPy random choice.

Here, we’re going to create a random sample with replacement from the numbers 1 to 6.

First, we’ll just create a NumPy array of the values from 1 to 6.

array_1_to_6 = np.arange(start = 1, stop =7)

If we print it out, we can see the contents.

print(array_1_to_6)

OUT:

[1 2 3 4 5 6]

This is really straightforward. It’s just the numbers from 1 to 6.

Now, we’ll generate a random sample from those inputs.

Specifically, we’re going to create a sample of 3 values.

Additionally, we will set the `replace` parameter to `replace = True`. This will cause np.random.choice to perform random sampling with replacement. That is, even if a value is selected once, it will be “replaced” back into the pool of possible input values, so it can be selected again.

Let’s run the code.

np.random.seed(77)
np.random.choice(a = array_1_to_6, size = 3, replace = True)

OUTPUT:

array([5, 5, 4])

Notice what’s in the output. We have an output of 3 values. This is because we set the `size` parameter to `size = 3`. That means that the output must have 3 values.

Also, notice the values that are in the output. The value `5` appears *twice*.

Why?

This is possible because we set the `replace` parameter to `replace = True`.

When we do this, it means that an item in the input can be selected (i.e., included in the sample) and will then be “replaced” back into the pool of possible input values. Setting `replace = True` essentially means that a given input value can be selected multiple times!

Remember earlier in this tutorial that I explained NumPy random choice in terms of rolling a die?

That’s essentially what we’ve done in this example. The code `np.random.choice(a = array_1_to_6, size = 3, replace = True)` is essentially like rolling a die multiple times!

That’s what’s great about Python and NumPy … if you know how to use the tools right, you can begin to create little models of real-world processes.
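As one small model of that kind, here’s a sketch that rolls a fair die 6000 times and counts how often each face comes up. The number of rolls and the variable names are my own choices, and `np.unique` with `return_counts = True` is just one convenient way to tally the results.

```python
import numpy as np

np.random.seed(0)
die = np.arange(1, 7)

# Roll the die 6000 times; replace=True lets every face come up again and again.
rolls = np.random.choice(die, size=6000, replace=True)

# Count how often each face appeared.
faces, counts = np.unique(rolls, return_counts=True)
for face, count in zip(faces, counts):
    print(face, count)
```

With a fair die, each count should land somewhere near 1000.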

Next, we’re going to work with the `p` parameter to change the probabilities associated with the different possible outcomes.

So for example, let’s reuse our array `array_1_to_6`.

Here’s the code to create the array again:

array_1_to_6 = np.arange(start = 1, stop = 7)

Essentially, the array `array_1_to_6` has the values from 1 to 6.

Now, we’re going to randomly select from those values (1 to 6) but the probability of each value will not be the same.

Remember that by default, np.random.choice gives each input value an equal probability of being selected.

… but if we use the `p` parameter, we can change this.

np.random.choice(a = array_1_to_6, p = [.5,.1,.1,.1,.1,.1])

What are we doing here?

We’re using the `p` parameter to give the input values (1 to 6) different probabilities.

We can visualize the new setup like this:

So essentially, the value “`1`” will have a probability of being selected of .5 (a 50% chance). And the other values from `2` to `6` will each have a probability of .1.

Now let’s run the code:

np.random.seed(42)
np.random.choice(a = array_1_to_6, p = [.5,.1,.1,.1,.1,.1])

OUT:

1

Now let’s run the code again, but instead of generating a single value, we’ll generate a random sample of 20 values.

np.random.seed(42)
np.random.choice(a = array_1_to_6, p = [.5,.1,.1,.1,.1,.1], size = 20)

OUT:

array([1, 6, 4, 2, 1, 1, 1, 5, 3, 4, 1, 6, 5, 1, 1, 1, 1, 2, 1, 1])

Look closely at the numbers in the output array. LOOK AT ALL THOSE `1`’s.

Just by glancing at the output, you can see that `1` is coming up a lot more than the other values. That’s exactly how we designed it! There’s a 50% chance of generating a `1`.
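If you want more than a glance, a larger sample makes the pattern clearer. The following is a rough sketch (the sample size of 100,000 and the names are my own) that measures what fraction of the draws landed on each face.

```python
import numpy as np

np.random.seed(42)
faces = np.arange(1, 7)
weights = [.5, .1, .1, .1, .1, .1]

big_sample = np.random.choice(faces, size=100_000, p=weights)

# Fraction of draws equal to each face: the first should be near .5, the rest near .1.
fractions = np.array([(big_sample == face).mean() for face in faces])
print(fractions.round(3))
```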

Next, let’s move on from using *numbers* as possible outcomes.

…. let’s start using non-numeric inputs in the input array.

Here, we’re going to use a simple example.

For our input array, we’re going to create a Python list of 4 simplified playing cards: a ‘Diamond’ card, a ‘Spade’ card, a ‘Heart’, and a ‘Club’.

simple_cards = ['Diamond','Spade','Heart','Club']

You can think of the list `simple_cards` like this:

`simple_cards` represents a simplified set of 4 cards.

This is obviously not like a real set of 52 playing cards. As always, I really want to simplify this as much as possible just so you can see how this works.

Technically though, what is `simple_cards`? It’s a Python list that contains 4 strings.

Now that we have our Python list, we’re first just going to select a single item randomly from that list.

This is really easy. It’s almost exactly the same as some of the previous examples above where we were selecting a single item from a NumPy array of numbers. The only difference is that we’re supplying a *list of strings* to numpy.random.choice instead of a NumPy array.

Let’s take a look.

np.random.seed(0)
np.random.choice(simple_cards)

OUTPUT:

'Diamond'

You can think of this code like selecting a single card from our simplified deck of four cards. There are four possible cards, and we selected the diamond.

From a technical perspective, if you read the earlier examples in this blog post, this should make sense.

All we did is randomly select a single item from our Python list.

Keep in mind though that the code is a little simplified syntactically, because I did not explicitly reference the parameters. If we were a little more explicit in how we wrote this, we could write the code as `np.random.choice(a = simple_cards, replace = True)`. That’s effectively the same thing.

Now, let’s move on to a slightly more complicated example. We’re going to generate a random sample from our Python list.

Random sampling from a Python list is easy with NumPy random choice.

Once again, it’s almost exactly the same as some of the previous examples in this blog post.

Here, we’re going to select *two* cards from the list.

Essentially, we’re just going to pass the Python list to NumPy random choice and set the `size` parameter to 2. We’ll also set `replace = False` to make it so we can’t select the same card twice. I really want this to be like selecting two different cards from a deck of cards.

Let’s take a look at the code.

np.random.seed(55)
np.random.choice(a = simple_cards, size = 2, replace = False)

OUT:

array(['Diamond', 'Club'], dtype='<U7')

So in this example, we randomly selected two cards from the ‘deck’ (i.e., we randomly selected 2 items from the list).

We selected the ‘Diamond’ and the ‘Club.’
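One extension of this idea, sketched below with my own variable names: if you draw *every* card without replacement, you effectively shuffle the deck.

```python
import numpy as np

simple_cards = ['Diamond', 'Spade', 'Heart', 'Club']

np.random.seed(0)
hand = np.random.choice(simple_cards, size=2, replace=False)

# Drawing every card without replacement amounts to shuffling the deck.
shuffled = np.random.choice(simple_cards, size=len(simple_cards), replace=False)

print(hand, shuffled)
```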

Again, this example is pretty straightforward if you’ve read and understood the previous examples.

If this does *not* make sense, I recommend that you start at the top and review a few of the simpler examples more carefully.

Random sampling is really important for data science, speaking broadly.

The reason is that random sampling is a key concept and technique in probability. It’s also very important in statistics. Moreover, sampling is also applicable to machine learning and deep learning.

Essentially, random sampling is really important for a variety of sub-disciplines of data science.

You really need to know how to do this!

I’ve written this tutorial to help you get started with random sampling in Python and NumPy.

Having said that, I realize that random sampling can be confusing to beginners.

With that in mind, if you have specific questions about random sampling with NumPy or about the NumPy random choice function, please post your question in the comments section at the bottom of this page.

Not only is the numpy.random.choice function important for data science and probability, the broader NumPy toolkit is important for data science in Python.

NumPy gives you a set of tools for working with numeric data in Python. To really get the most out of the NumPy package, you’ll need to learn *many* functions and tools … not just numpy.random.choice. For example, you’ll need to learn

I recommend that you read our free tutorials …. they will teach you a lot about NumPy.

I also recommend that you sign up for our email list.

We regularly post tutorials about NumPy and data science in Python.

If you sign up for our email list, you’ll get our tutorials delivered directly to your inbox …

You’ll get free tutorials on:

- NumPy
- Pandas
- Base Python
- Scikit learn
- Machine learning
- Deep learning
- … and more.

If you want to learn more about NumPy and data science, sign up now.

The post How to use NumPy random choice appeared first on Sharp Sight.

The post A quick guide to NumPy sort appeared first on Sharp Sight.

As the name implies, the NumPy sort technique enables you to *sort* NumPy arrays.

So, this blog post will show you exactly how to use the technique to sort different kinds of arrays in Python.

The blog post has two primary sections, a syntax explanation section and an examples section.

**Contents:**

You can click on either of those links and it will take you to the appropriate section in the tutorial.

But if you’re new to Python and NumPy, I suggest that you read the whole blog post.

Ok. Let’s just start out by talking about the sort function and where it fits into the NumPy data manipulation system.

If you’re reading this blog post, you probably know what NumPy is.

But, just in case you don’t, I want to quickly review NumPy.

NumPy is a toolkit for doing data manipulation in Python.

More specifically, NumPy provides a set of tools and functions for working with arrays of numbers. That’s actually where the name comes from:

“**Num**erical **Py**thon” ….

NumPy.

Although the tools from NumPy can work on a variety of data structures, they are primarily designed to operate on NumPy arrays.

NumPy arrays are essentially arrays of numbers. We’ll create some NumPy arrays later in this tutorial, but you can think of them as row-and-column grids of numbers.

And again, the tools of NumPy can perform manipulations on these arrays. For example, you can do things like calculate the mean of an array, calculate the median of an array, calculate the maximum, etc.

Essentially, NumPy is a broad toolkit for working with arrays of numbers.

And one of the things you can do with NumPy is *sort* an array.

That’s basically what NumPy sort does … it sorts NumPy arrays.

Let me give you a quick example.

Imagine that you have a 1-dimensional NumPy array with five values that are in random order:

You can use NumPy sort to *sort* those values in ascending order. Essentially, numpy.sort will take an input array, and output a new array in sorted order.

Take a look at that image and notice what np.sort did.

It sorted the array in ascending order, from low to high. That’s it.

To be clear, the NumPy sort function can actually sort arrays in more complex ways, but at a basic level, that’s all the function does. It sorts data.
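Here’s a minimal sketch of that behavior, with illustrative variable names. Note that np.sort returns a new array and leaves the input alone, whereas the array’s own `sort()` method sorts the array in place.

```python
import numpy as np

original = np.array([5, 3, 1, 2, 4])

sorted_copy = np.sort(original)   # returns a new, sorted array
print(original)                   # the original is unchanged
print(sorted_copy)

in_place = original.copy()
in_place.sort()                   # the array method sorts in place instead
print(in_place)
```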

Ok … so now that I’ve explained the NumPy sort technique at a high level, let’s take a look at the details of the syntax.

In this section, I’ll break down the syntax of np.sort.

Before I do that though, you need to be aware of some syntax conventions.

When we write NumPy code, it’s very common to refer to NumPy as `np`.

Syntactically, `np` frequently operates as a “nickname” or alias of the NumPy package. So if you see the term `np.sort()`, that’s sort of a shorthand for `numpy.sort()`.

Having said that, this sort of aliasing only works if you set it up properly.

To set up that alias, you’ll need to “import” NumPy with the appropriate nickname by using the code `import numpy as np`.

We’ll talk more about this in the examples section, but I want you to understand this before I start explaining the syntax.

Ok. Let’s take a close look at the syntax.

To call the function (assuming you’ve imported NumPy as I explained above), you can write `np.sort()`. Again though, if you imported the package with `import numpy` instead, you can refer to the function as `numpy.sort()` and it will work in the same way.

Then inside of the function, there are a set of parameters that enable you to control exactly how the function works.

The function is fairly simple, but to really understand it, you need to understand the parameters.

With that in mind, let’s talk about the parameters of numpy.sort.

The np.sort function has 3 primary parameters:

- `a`
- `axis`
- `kind`

There’s also a 4th parameter called `order`. Since `order` is not used very often and it’s a little more complicated to understand, I am leaving it out of this tutorial.

However, the parameters `a`, `axis`, and `kind` are much more common. That being the case, I’ll explain only those three in a little more detail.

`a` (required)

The `a` parameter simply refers to the NumPy array that you want to operate on.

Typically, this will be a NumPy array object. However, np.sort (like almost all of the NumPy functions) will also operate on “array-like” objects. So for example, numpy.sort will sort Python lists, tuples, and many other iterable types.

Keep in mind that this parameter is *required*. So you need to provide a NumPy array here, or an array-like object.
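As a quick sketch of the “array-like” point (the names `from_list` and `from_tuple` are mine), notice that the output is a NumPy array even when the input is a list or a tuple.

```python
import numpy as np

from_list = np.sort([5, 3, 1, 2, 4])
from_tuple = np.sort((2.5, -1.0, 0.0))

# The result is a NumPy array in both cases, not a list or tuple.
print(type(from_list), from_list)
print(type(from_tuple), from_tuple)
```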

`axis`

The `axis` parameter describes the axis along which you will sort the data.

This parameter is *optional*.

By default, `axis` is set to `axis = -1`. This means that if you don’t use the axis parameter, then by default, the np.sort function will sort the data on the last axis.

If you’re not sure what an “axis” is, I recommend that you read our tutorial about NumPy axes. You’ll also learn more about how this parameter works in the examples section of this tutorial.
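Here’s a small sketch of the default, using a hand-written 3 by 3 array that I’ve called `grid`. For a 2D array the last axis is axis 1, so the default matches `axis = 1`. The sketch also uses `axis = None`, which flattens the array before sorting.

```python
import numpy as np

grid = np.array([[3, 6, 1],
                 [2, 4, 7],
                 [5, 9, 8]])

default_sort = np.sort(grid)            # axis = -1, the last axis
last_axis = np.sort(grid, axis=1)       # for a 2D array, the last axis is axis 1
flattened = np.sort(grid, axis=None)    # axis = None flattens first, then sorts

print(np.array_equal(default_sort, last_axis))
print(flattened)
```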

`kind`

The `kind` parameter specifies the sorting algorithm you want to use to sort the data.

If you’re not well-trained with computer science and algorithms, you might not realize this ….

… but there are many different algorithms that can be used to sort data. Moreover, these different sorting techniques have different pros and cons. For example, some algorithms are faster than others.

So, there are several different options for this parameter: `quicksort`, `heapsort`, and `mergesort`.

By default, the `kind` parameter is set to `kind = 'quicksort'`.

The `quicksort` algorithm is typically sufficient for most applications, so we’re not really going to change this parameter in any of our examples. (If you have a question about sorting algorithms, just leave your question in the comments section below.)
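If you’re curious, the sketch below runs the same data through each option. It assumes a reasonably recent NumPy, since it also tries `'stable'`, an option that isn’t mentioned above and that older versions don’t have. The sorted values come out the same every time; the algorithms differ in things like speed and memory use.

```python
import numpy as np

np.random.seed(0)
values = np.random.choice(1000, size=20)

# Every algorithm returns the same sorted values.
results = {kind: np.sort(values, kind=kind)
           for kind in ['quicksort', 'mergesort', 'heapsort', 'stable']}

for kind, result in results.items():
    print(kind, np.array_equal(result, results['quicksort']))
```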

Ok … now that you’ve learned more about the parameters of numpy.sort, let’s take a look at some working examples.

To learn and master a new technique, it’s almost always best to start with very, very simple examples.

This, by the way, is one of the mistakes that beginners make when learning new syntax; they work on examples that are simply too complicated.

Because simple examples are so important, I want to show you simple examples of how the np.sort function works.

I’ll show you how it works with NumPy arrays of different sizes …

And I’ll also show you how to use the parameters.

Here’s a list of the examples we’ll cover:

- Sort a 1D numpy array
- How to sort the *columns* of a 2D array
- How to sort the *rows* of a 2D array
- Sort a NumPy array in reverse order

But before you run the code in the following examples, you’ll need to make sure that everything is set up properly.

Before you run the code below, you’ll need to have NumPy installed and you’ll need to “import” the NumPy module into your environment.

Installing NumPy can be very complex, and it’s beyond the scope of this tutorial. If you don’t have it installed, you can search online for how to install it. My recommendation is to simply start using Anaconda.

Assuming that you have NumPy *installed* though, you’ll still need to run some code to import it.

To import NumPy, you can run this:

import numpy as np

This will make the NumPy functions available in your code.

Also, after running this code, you’ll be able to refer to NumPy in your code with the nickname ‘`np`’.

Ok … now we’re ready to go.

First, we’ll start very simple.

We’re going to sort a simple, 1-dimensional numpy array.

Before we sort the array, we’ll first need to create the array. To do this, we’re going to use the np.array function. The np.array function will enable us to create a NumPy array object from a Python list of 5 numbers:

simple_array_1d = np.array([5,3,1,2,4])

And we can print out the array with a simple print statement:

print(simple_array_1d)

Which shows the following output:

[5 3 1 2 4]

This is really simple. We just have a NumPy array of 5 numbers. As you can see, the numbers are arranged in a random order.

Next, we can sort the array with np.sort:

np.sort(simple_array_1d)

When we run this, np.sort will produce the following output array:

array([1, 2, 3, 4, 5])

As you can see, the output of np.sort is the same group of numbers, but now they are sorted in ascending order.

Next, we’re going to sort the columns of a 2-dimensional NumPy array.

To do this, we’ll first need to *create* a 2D NumPy array.

Ultimately here, we’re going to create a 3 by 3 array of 9 integers, randomly arranged.

To do this, we’re going to use the numpy.arange function to create an array of integers from 1 to 9, then randomly arrange them with numpy random choice, and finally reshape the array into a 3 by 3 array with the reshape method.

np.random.seed(77)
array_2d = np.random.choice(a = np.arange(start = 1, stop = 10), size = 9, replace = False).reshape([3,3])

And now let’s print out `array_2d` to see what’s in it.

print(array_2d)

Which produces the following output:

[[3 6 1]
 [2 4 7]
 [5 9 8]]

As you can see, we have a 2D array of the integers 1 to 9, arranged in a random order.

To be honest, the process for creating this array is a little complicated, so if you don’t understand it, you should review our tutorial on NumPy arange and our tutorial on NumPy reshape.

Ok. Now let’s sort the columns of the array.

To do this, we’re going to use numpy.sort with the `axis` parameter set to `axis = 0`.

np.sort(array_2d, axis = 0)

Which produces the following NumPy array:

array([[2, 4, 1], [3, 6, 7], [5, 9, 8]])

Take a close look at the output. The columns are sorted from low to high.

Why though? Why does the `axis` parameter do this?

To understand this example, you really need to understand NumPy axes. If you don’t understand axes, you really should read our NumPy axes tutorial.

However, I will explain axes here, briefly.

You can think of axes like *directions*.

In a 2D NumPy array, axis-0 is the direction that runs downwards, down the rows, and axis-1 is the direction that runs horizontally, across the columns.

Once you understand this, you can understand the code `np.sort(array_2d, axis = 0)`.

What we’re really saying here is that we want to sort the array `array_2d` along axis 0. Remember, axis 0 is the axis that points downwards.

When we run this code, we’re basically saying that we want to sort the data in the axis-0 direction.

… effectively, this sorts the columns!
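To check this, here’s a sketch that types in the values shown above by hand (instead of regenerating them with a seed) and confirms that values never decrease as you move down any column. The name `by_column` is mine.

```python
import numpy as np

array_2d = np.array([[3, 6, 1],
                     [2, 4, 7],
                     [5, 9, 8]])

by_column = np.sort(array_2d, axis=0)
print(by_column)

# Moving down axis 0, values never decrease within any column.
print((np.diff(by_column, axis=0) >= 0).all())

# Each column is sorted on its own, so the original rows are not kept together.
print(by_column[0], array_2d[0])
```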

Next, let’s sort the *rows*.

Sorting the rows is very similar to sorting the columns.

To do this, we’ll need to use the `axis` parameter again.

Quickly though, we’ll need a NumPy array to sort.

The following code is exactly the same as the previous example (sorting the columns), so if you already ran that code, you don’t need to run it again.

np.random.seed(77)
array_2d = np.random.choice(a = np.arange(start = 1, stop = 10), size = 9, replace = False).reshape([3,3])

Just so we’re clear on the contents of the array, let’s print it out again:

print(array_2d)

OUT:

[[3 6 1]
 [2 4 7]
 [5 9 8]]

Now let’s sort the rows.

To do this, we’ll use NumPy sort with `axis = 1`.

np.sort(array_2d, axis = 1)

Which produces the following output array, with sorted rows:

array([[1, 3, 6], [2, 4, 7], [5, 8, 9]])

Take a close look. The rows are sorted from low to high.

Once again, to understand this, you really need to understand what NumPy axes are.

As I mentioned previously in this tutorial, in a 2D array, axis 1 is the direction that runs horizontally:

So when we use the code `np.sort(array_2d, axis = 1)`, we’re telling NumPy that we want to sort the data along that axis-1 direction.

This basically means, sort the rows!
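Here’s a small sketch of the row sort, again with the example values hard-coded (my assumption, so that it runs without the seeded setup code). It also shows that, for a 2D array, leaving out `axis` gives the same result, because np.sort sorts along the last axis by default.

```python
import numpy as np

# Hard-coded copy of the example array.
array_2d = np.array([[3, 6, 1],
                     [2, 4, 7],
                     [5, 9, 8]])

# axis=1 runs across the columns, so each row is sorted independently.
sorted_rows = np.sort(array_2d, axis=1)
print(sorted_rows)
# [[1 3 6]
#  [2 4 7]
#  [5 8 9]]

# The default is axis=-1 (the last axis), which is axis 1 for a 2D array.
default_sorted = np.sort(array_2d)
print(np.array_equal(sorted_rows, default_sorted))
# True
```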

A common question that people ask when they dive further into NumPy is “how can I sort the data in reverse order?”

Unfortunately, this is not so easy to do.

I think that there should be a way to do this directly with NumPy, but at the moment, there isn’t.

That being the case, I’ll show you a quick-and-dirty workaround.

(But note: this is not necessarily an *efficient* workaround.)

We’re going to sort our 1D array `simple_array_1d` that we created above.

Let’s print out `simple_array_1d` to see what’s in it.

print(simple_array_1d)

OUT:

[5 3 1 2 4]

You can see that this is a NumPy array with 5 elements that are arranged in random order.

Now, we’re going to sort these values in *reverse* order.

To do this, we’re going to use np.sort on the negative of the values in `simple_array_1d` (i.e., `-simple_array_1d`), and we’ll take the negative of that output:

-np.sort(-simple_array_1d)

Which gives us the following result:

array([5, 4, 3, 2, 1])

You can see that the code `-np.sort(-simple_array_1d)` sorted the numbers in reverse (i.e., descending) order.
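For comparison, here’s a sketch of the negation trick next to another common idiom: sort ascending, then reverse the result with a slice. The slice-based version is my addition rather than something from the example above. One practical difference is that it also works for data that can’t be negated, like strings.

```python
import numpy as np

# The 1D array from the example, hard-coded for illustration.
simple_array_1d = np.array([5, 3, 1, 2, 4])

# Workaround 1: negate, sort ascending, negate again.
negated = -np.sort(-simple_array_1d)
print(negated)
# [5 4 3 2 1]

# Workaround 2: sort ascending, then reverse with a slice.
reversed_sort = np.sort(simple_array_1d)[::-1]
print(reversed_sort)
# [5 4 3 2 1]

# The slice-based version does not rely on negation, so it also
# works for arrays of strings.
words = np.array(["b", "c", "a"])
words_desc = np.sort(words)[::-1]
print(words_desc)
# ['c' 'b' 'a']
```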

You can use this technique in a similar way to sort the columns and rows in descending order.

To do this, we need to use the axis parameter in conjunction with the technique we used in the previous section.

To sort the columns, we’ll need to set `axis = 0`. And we’ll use the negative sign to sort our 2D array in reverse order.

-np.sort(-array_2d, axis = 0)

Which produces the following output:

array([[5, 9, 8], [3, 6, 7], [2, 4, 1]])

As you can see, the code `-np.sort(-array_2d, axis = 0)` produces an output array where the columns have been sorted in descending order, from the top of the column to the bottom.

You can do the same thing to sort the rows by using `axis = 1`.

Again, we’ll be working with `array_2d`.

-np.sort(-array_2d, axis = 1)

The code `axis = 1` indicates that we’ll be sorting the data in the axis-1 direction, and by using the negative sign in front of the array name and the function name, the code will sort the rows in descending order.
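To see what the descending row sort produces, here’s a short sketch with the example values hard-coded (an assumption on my part, so it runs on its own). It also shows an equivalent approach using np.flip, which is my addition and not part of the technique above.

```python
import numpy as np

# Hard-coded copy of the example array.
array_2d = np.array([[3, 6, 1],
                     [2, 4, 7],
                     [5, 9, 8]])

# Negate, sort each row ascending, then negate again.
rows_descending = -np.sort(-array_2d, axis=1)
print(rows_descending)
# [[6 3 1]
#  [7 4 2]
#  [9 8 5]]

# Equivalent: sort each row ascending, then reverse along the same axis.
rows_flipped = np.flip(np.sort(array_2d, axis=1), axis=1)
print(np.array_equal(rows_descending, rows_flipped))
# True
```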

Here in this tutorial, I’ve explained how to sort numpy arrays by using the np.sort function.

But the NumPy toolkit is much bigger than one function.

If you’re serious about data science and scientific computing in Python, you’ll have to learn quite a bit more about NumPy.

In fact, if you want to master data science in Python, you’ll need to learn quite a few Python packages. You’ll need to learn NumPy, Pandas, matplotlib, scikit learn, and more.

There’s a lot to learn!

If you’re ready to learn data science though, we can help.

Here at Sharp Sight, we teach data science.

We offer premium data science courses to help you master data science *fast* …

And we also offer FREE tutorials.

If you sign up for our email list, you’ll get our free tutorials, and you’ll find out when our courses open for registration.

When you sign up, you’ll get free tutorials on:

- NumPy
- Pandas
- Base Python
- Scikit learn
- Machine learning
- Deep learning
- Data science in Python
- Data science in R
- … and more.

If you want access to our free tutorials every week, enter your email address and sign up now.

The post A quick guide to NumPy sort appeared first on Sharp Sight.

The post NumPy random seed explained appeared first on Sharp Sight.

The function itself is extremely easy to use.

However, the *reason* that we need to use it is a little complicated. To understand *why* we need to use NumPy random seed, you actually need to know a little bit about pseudo-random numbers.

That being the case, this tutorial will first explain the basics of pseudo-random numbers, and will then move on to the syntax of numpy.random.seed itself.

The tutorial is divided up into several different sections.

- A quick introduction to pseudo-random numbers
- How and why we use NumPy random seed
- The syntax of NumPy random seed
- Examples of how to use numpy random seed
- Frequently asked questions about numpy.random.seed
- Applications of pseudo-random numbers

You can click on any of the above links, and it will take you directly to that section.

However, I strongly recommend that you read the whole tutorial.

As I said earlier, numpy.random.seed is very easy to use, but it’s not that easy to understand. Understanding *why* we use it requires some background. That being the case, it’s much better if you actually read the tutorial.

Ok … let’s get to it.

So what exactly is NumPy random seed?

NumPy random seed is simply a function that sets the random seed of the NumPy pseudo-random number generator. It provides an essential input that enables NumPy to generate pseudo-random numbers for random processes.

Does that make sense? Probably not.

Unless you have a background in computing and probability, what I just wrote is probably a little confusing.

Honestly, in order to understand “seeding a random number generator” you need to know a little bit about pseudo-random numbers.

That being the case, let me give you a quick introduction to them …

Here, I want to give you a very quick overview of pseudo-random numbers and why we need them.

Once you understand pseudo-random numbers, numpy.random.seed will make more sense.

At the risk of being a bit of a smart-ass, I think the name “pseudo-random number” is fairly self explanatory, and it gives us some insight into what pseudo-random numbers actually are.

Let’s just break down the name a little.

A pseudo-random number is a *number*. A number that’s sort-of random. *Pseudo*-random.

So essentially, a pseudo-random number is a number that’s almost random, __but not really random__.

It might sound like I’m being a bit sarcastic here, but that’s essentially what they are. Pseudo-random numbers are numbers that appear to be random, but are not actually random.

In the interest of clarity though, let’s see if we can get a definition that’s a little more precise.

According to the encyclopedia at Wolfram Mathworld, a pseudo-random number is:

… a computer-generated random number.

The definition goes on to explain that ….

The prefix pseudo- is used to distinguish this type of number from a “truly” random number generated by a random physical process such as radioactive decay.

A separate article at random.org notes that pseudo-random numbers “appear random, but they are really predetermined”.

Got that? Pseudo-random numbers are computer generated numbers that appear random, but are actually predetermined.

I think that these definitions help quite a bit, and they are a great starting point for understanding why we need them.

I swear to god, I’m going to bring this back to NumPy soon.

But, we still need to understand why pseudo-random numbers are required.

Really. Just bear with me. This will make sense soon.

There’s a fundamental problem when using computers to simulate or work with random processes.

Setting aside some rare exceptions, computers are deterministic by their very design. To quote an article at MIT’s School of Engineering “if you ask the same question you’ll get the same answer every time.”

Another way of saying this is that if you give a computer a certain input, it will precisely follow instructions to produce an output.

… And if you later give a computer the *same* input, it will produce the *same* output.

If the input is the same, then the output will be the same.

THAT’S HOW COMPUTERS WORK.

The behavior of computers is *deterministic* …

Essentially, the behavior of computers is NOT random.

This introduces a problem: how can you use a non-random machine to produce random numbers?

Computers solve the problem of generating “random” numbers the same way that they solve essentially everything: with an algorithm.

Computer scientists have created a set of algorithms for creating pseudo-random numbers, called “pseudo-random number generators.”

These algorithms can be executed on a computer.

As such, they are completely deterministic. However, the numbers that they produce have properties that *approximate* the properties of random numbers.

That is to say, the numbers generated by pseudo-random number generators *appear* to be random.

Even though the numbers are completely determined by the algorithm, when you examine them, there is typically no discernible pattern.

For example, here we’ll create some pseudo-random numbers with the NumPy randint function:

np.random.seed(1)
np.random.randint(low = 1, high = 10, size = 50)

OUT:

[6, 9, 6, 1, 1, 2, 8, 7, 3, 5, 6, 3, 5, 3, 5, 8, 8, 2, 8, 1, 7, 8, 7, 2, 1, 2, 9, 9, 4, 9, 8, 4, 7, 6, 2, 4, 5, 9, 2, 5, 1, 4, 3, 1, 5, 3, 8, 8, 9, 7]

See any pattern here? Me neither.

I can assure you though, that these numbers are not random, and are in fact completely determined by the algorithm. If you run the same code again, you’ll get the exact same numbers.
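To make “completely determined by the algorithm” a little more concrete, here’s a toy pseudo-random number generator written in plain Python. It’s a linear congruential generator, which is one of the simplest designs. To be clear, this is NOT the algorithm that NumPy uses (NumPy’s generators are much more sophisticated). The function name and the constants are just illustrative choices on my part.

```python
def toy_generator(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudo-random digits (0-9) from a toy linear
    congruential generator. Illustrative only; not NumPy's algorithm."""
    state = seed
    digits = []
    for _ in range(count):
        # The next state depends only on the previous state.
        state = (a * state + c) % m
        # Use the higher-order bits, which are better mixed than the low bits.
        digits.append((state >> 16) % 10)
    return digits


first_run = toy_generator(seed=1, count=10)
second_run = toy_generator(seed=1, count=10)
other_seed = toy_generator(seed=2, count=10)

# Same starting value, same "random" digits.
print(first_run == second_run)
# True

# Different starting value, different digits.
print(first_run == other_seed)
# False
```

The digits look patternless, but they come from nothing more than repeated multiplication, addition, and a remainder. Give the function the same starting value and it has no choice but to produce the same sequence.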

Importantly, because pseudo-random number generators are deterministic, they are also repeatable.

What I mean is that if you run the algorithm with the same input, it will produce the same output.

So you can use pseudo-random number generators to create and then re-create the exact same set of pseudo-random numbers.

Let me show you.

Here, we’ll create an array of 5 pseudo-random integers between 0 and 9 using numpy.random.randint.

(And notice that we’re using np.random.seed here)

np.random.seed(0)
np.random.randint(10, size = 5)

This produces the following output:

array([5, 0, 3, 3, 7])

Simple. The algorithm produced an array with the values `[5, 0, 3, 3, 7]`.

Ok.

Now, let’s run the same code again.

… and notice that we’re using np.random.seed in exactly the same way …

np.random.seed(0)
np.random.randint(10, size = 5)

OUTPUT:

array([5, 0, 3, 3, 7])

Well take a look at that …

The. numbers. are. the. same.

We ran the exact same code, and it produced the exact same output.

I will repeat what I said earlier: pseudo random number generators produce numbers that look random, but are 100% determined.

Determined how though?

Remember what I wrote earlier: computers and algorithms process inputs into outputs. The outputs of computers depend on the inputs.

So just like any output produced by a computer, pseudo-random numbers are dependent on the *input*.

*THIS* is where numpy.random.seed comes in …

The numpy.random.seed function provides the input (i.e., the seed) to the algorithm that generates pseudo-random numbers in NumPy.

Ok, you got this far.

You’re ready now.

Now you can learn about NumPy random seed.

NumPy random seed provides an input to the pseudo-random number generator

What I wrote in the previous section is critical.

The “random” numbers generated by NumPy are not exactly random. They are pseudo-random … they approximate random numbers, but are 100% determined by the input and the pseudo-random number algorithm.

The np.random.seed function provides an input for the pseudo-random number generator in NumPy.

That’s all the function does!

It allows you to provide a “seed” value to NumPy’s random number generator.

Importantly, numpy.random.seed doesn’t exactly work all on its own.

The numpy.random.seed function works in *conjunction* with other functions from NumPy.

Specifically, numpy.random.seed works with other functions from the `numpy.random` namespace.

So for example, you might use numpy.random.seed along with numpy.random.randint. This will enable you to create random integers with NumPy.

You can also use numpy.random.seed with numpy.random.normal to create normally distributed numbers.

… or you can use it with numpy.random.choice to generate a random sample from an input.

In fact, there are several dozen NumPy random functions that enable you to generate random numbers, random samples, and samples from specific probability distributions.

I’ll show you a few examples of some of these functions in the examples section of this tutorial.

Remember what I said earlier in this tutorial …. pseudo-random number generators are completely deterministic. They operate by algorithm.

What this means is that if you provide the same seed, you will get the same output.

And if you change the seed, you will get a different output.

The output that you get depends on the input that you give it.
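Here’s a quick sketch of that behavior. The helper function `draw` is a hypothetical name that I’m introducing just for this illustration. It seeds the generator and then immediately draws 5 integers.

```python
import numpy as np


def draw(seed):
    """Seed NumPy's global generator, then draw 5 integers.
    (Hypothetical helper, defined only for this illustration.)"""
    np.random.seed(seed)
    return np.random.randint(99, size=5)


same_seed_a = draw(0)
same_seed_b = draw(0)
different_seed = draw(1)

# Same seed, same output.
print(np.array_equal(same_seed_a, same_seed_b))
# True

# Different seed, different output.
print(np.array_equal(same_seed_a, different_seed))
# False
```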

I’ll show you examples of this behavior in the examples section.

The important thing about using a seed for a pseudo-random number generator is that it makes the code *repeatable*.

Remember what I said earlier?

… pseudo-random number generators operate by a deterministic process.

If you give a pseudo-random number generator the same input, you’ll get the same output.

This can actually be a good thing!

There are times when you really want your “random” processes to be repeatable.

Code that has well defined, repeatable outputs is good for testing.

Essentially, we use NumPy random seed when we need to generate pseudo-random numbers in a repeatable way.

The fact that np.random.seed makes your code repeatable also makes it easier to *share*.

Take for example the tutorials that I post here at Sharp Sight.

I post detailed tutorials about how to perform various data science tasks, and I show how code works, step by step.

When I do this, it’s important that people who read the tutorials and run the code get the same result. If a student reads the tutorial, and copies and pastes the code exactly, I want them to get the exact same result. This just helps them check their work! If they type in the code exactly as I show it in a tutorial, getting the exact same result gives them confidence that they ran the code properly.

Again, in order to get repeatable results when we are using “random” functions in NumPy, we need to use numpy.random.seed.

Ok … now that you understand what NumPy random seed is (and why we use it), let’s take a look at the actual syntax.

The syntax of NumPy random seed is extremely simple.

There’s essentially only one parameter, and that is the seed value.

So to use the function, you just call the function by name and then pass in a “seed” value inside the parentheses.
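As a minimal sketch, a call looks like the following. The value 42 is an arbitrary choice for illustration. You can pass the seed by position or by using the parameter name `seed`.

```python
import numpy as np

# Positional form: the seed value goes inside the parentheses.
np.random.seed(42)
first = np.random.random()

# Keyword form: the parameter is named `seed`.
np.random.seed(seed=42)
second = np.random.random()

# Both calls set the same seed, so the next draw is identical.
print(first == second)
# True
```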

Note that in this syntax explanation, I’m using the abbreviation “`np`” to refer to NumPy. This is a common convention, but it requires you to import NumPy with the code “`import numpy as np`.” I’ll explain more about this soon in the examples section.

Let’s take a look at some examples of how and when we use numpy.random.seed.

Before we look at the examples though, you’ll have to run some code.

To get the following examples to run properly, you’ll need to import NumPy with the appropriate “nickname.”

You can do that by executing the following code:

import numpy as np

Running this code will enable us to use the alias `np` in our syntax to refer to `numpy`.

This is a common convention in NumPy. When you read NumPy code, it is extremely common to see NumPy referred to as `np`. If you’re a beginner you might not realize that you need to import NumPy with the code `import numpy as np`, otherwise the examples won’t work properly!

Now that we’ve imported NumPy properly, let’s start with a simple example. We’ll generate a single random number between 0 and 1 using NumPy random random.

Here, we’re going to use NumPy to generate a random number between zero and one. To do this, we’re going to use the NumPy random random function (AKA, np.random.random).

Ok, here’s the code:

np.random.seed(0)
np.random.random()

OUTPUT:

0.5488135039273248

Note that the output is a float. It’s a decimal number between 0 and 1.

For the record, we can essentially treat this number as a probability. We can think of the np.random.random function as a tool for generating probabilities.
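Here’s one way that idea can be used, as a rough sketch. We simulate a biased coin by comparing each draw from np.random.random to a probability. The probability 0.3 and the variable names are my own assumptions for illustration.

```python
import numpy as np

np.random.seed(0)

# Hypothetical probability of "heads", chosen for illustration.
p_heads = 0.3

# Draw 10,000 floats between 0 and 1.
draws = np.random.random(size=10_000)

# A draw counts as heads when it falls below the probability.
heads = draws < p_heads

# The fraction of heads should land close to 0.3.
print(heads.mean())
```

Because the seed is set at the top, the printed fraction will be the same every time you run the code.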

Now that I’ve shown you how to use np.random.random, let’s just run it again with the same seed.

Here, I just want to show you what happens when you use np.random.seed before running np.random.random.

np.random.seed(0)
np.random.random()

OUTPUT:

0.5488135039273248

Notice that the number is exactly the same as the first time we ran the code.

Essentially, if you execute a NumPy function with the same seed, you’ll get the same result.
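One detail that trips people up: the seed fixes the whole *sequence* of numbers, not the result of each individual call. The sketch below (my own illustration, using np.random.randint) shows that a second call without re-seeding continues the sequence, while re-seeding starts it over.

```python
import numpy as np

np.random.seed(0)
first = np.random.randint(10, size=5)

# No re-seeding here, so this call continues the same sequence.
second = np.random.randint(10, size=5)

# Re-seeding with the same value restarts the sequence.
np.random.seed(0)
replay = np.random.randint(10, size=5)

print(np.array_equal(first, second))
# False

print(np.array_equal(first, replay))
# True
```

So if you want to reproduce a specific result, you need to set the seed again right before the call (or re-run the whole script from the top).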

For more information on the np.random.random function, check out our tutorial on NumPy random random.

Next, we’re going to use np.random.seed to seed the number generator before using NumPy random randint.

Essentially, we’re going to use NumPy to generate 5 random integers between 0 and 99.

np.random.seed(74)
np.random.randint(low = 0, high = 100, size = 5)

OUTPUT:

array([30, 91, 9, 73, 62])

This is pretty simple.

NumPy random seed sets the seed for the pseudo-random number generator, and then NumPy random randint selects 5 numbers between 0 and 99.

Let’s just run the code again so you can see that it reproduces the same output if you have the same seed.

np.random.seed(74)
np.random.randint(low = 0, high = 100, size = 5)

OUTPUT:

array([30, 91, 9, 73, 62])

Once again, as you can see, the code produces the same integers if we use the same seed. As noted previously in the tutorial, NumPy random randint doesn’t exactly produce “random” integers. It produces pseudo-random integers that are completely determined by numpy.random.seed.

It’s also common to use the NumPy random seed function when you’re doing random sampling.

Specifically, if you need to generate a reproducible random sample from an input array, you’ll need to use numpy.random.seed.

Let’s take a look.

Here, we’re going to use numpy.random.seed before we use numpy.random.choice. The NumPy random choice function will then create a random sample from a list of elements.

np.random.seed(0)
np.random.choice(a = [1,2,3,4,5,6], size = 5)

OUTPUT:

array([5, 6, 1, 4, 4])

As you can see, we’ve basically generated a random sample from the list of input elements … the numbers 1 to 6.

In the output, you can see that some of the numbers are repeated. This is because np.random.choice is using random sampling with replacement. For more information about how to create random samples, you should read our tutorial about np.random.choice.
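If you don’t want repeated values, np.random.choice has a `replace` parameter. Here’s a brief sketch that samples from the same list without replacement, so every selected element is unique.

```python
import numpy as np

np.random.seed(0)

# replace=False means an element cannot be selected more than once.
sample_no_repeats = np.random.choice(a=[1, 2, 3, 4, 5, 6], size=5, replace=False)

# All 5 selected values are distinct.
print(len(set(sample_no_repeats.tolist())))
# 5
```

Keep in mind that when you sample without replacement, `size` can’t be larger than the number of input elements.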

Let’s quickly re-run the code.

I want to re-run the code just so you can see, once again, that the primary reason we use NumPy random seed is to create results that are completely repeatable.

Ok, here is the exact same code that we just ran (with the same seed).

np.random.seed(0)
np.random.choice(a = [1,2,3,4,5,6], size = 5)

OUTPUT:

array([5, 6, 1, 4, 4])

Once again, we used the same seed, and this produced the same output.

Now that we’ve taken a look at some examples of using NumPy random seed to set a random seed in Python, I want to address some frequently asked questions.

Dude. I just wrote 2000 words explaining what the np.random.seed function does … which basically explains what np.random.seed(0) does.

Ok, ok … I get it. You’re probably in a hurry and just want a quick answer.

I’ll summarize.

We use np.random.seed when we need to generate random numbers or mimic random processes in NumPy.

Computers are generally deterministic, so it’s very difficult to create truly “random” numbers on a computer. Computers get around this by using pseudo-random number generators.

These pseudo-random number generators are algorithms that produce numbers that appear random, but are not really random.

In order to work properly, pseudo-random number generators require a starting input. We call this starting input a “seed.”

The code `np.random.seed(0)` enables you to provide a seed (i.e., the starting input) for NumPy’s pseudo-random number generator.

NumPy then uses the seed and the pseudo-random number generator in conjunction with other functions from the numpy.random namespace to produce certain types of random outputs.

Ultimately, creating pseudo-random numbers this way leads to repeatable output, which is good for testing and code sharing.

Having said all of that, to really understand numpy.random.seed, you need to have some understanding of pseudo-random number generators.

… so if what I just wrote doesn’t make sense, please return to the top of the page and read the f*#^ing tutorial.

Basically, it doesn’t matter.

You can use `numpy.random.seed(0)`, or `numpy.random.seed(42)`, or any other number.

For the most part, the number that you use inside of the function doesn’t really make a difference.

You just need to understand that using different seeds will cause NumPy to produce different pseudo-random numbers. The output of a `numpy.random` function will depend on the seed that you use.

Here’s a quick example. We’re going to use NumPy random seed in conjunction with NumPy random randint to create a set of integers between 0 and 98 (the upper bound that we pass in, 99, is excluded).

In the first example, we’ll set the seed value to 0.

np.random.seed(0)
np.random.randint(99, size = 5)

Which produces the following output:

array([44, 47, 64, 67, 67])

Basically, np.random.randint generated an array of 5 integers between 0 and 98 (the upper bound, 99, is excluded). Note that if you run this code again with the exact same seed (i.e. 0), you’ll get the same integers from np.random.randint.

Next, let’s run the code with a *different* seed.

np.random.seed(1)
np.random.randint(99, size = 5)

OUTPUT:

array([37, 12, 72, 9, 75])

Here, the code for np.random.randint is exactly the same … we only changed the seed value. Here, the seed is `1`.

With a *different* seed, NumPy random randint created a *different* set of integers. Everything else is the same. The code for np.random.randint is the same. But with a different seed, it produces a different output.

Ultimately, I want you to understand that the output of a numpy.random function depends on the value passed to np.random.seed, but the choice of seed value is more or less arbitrary.

The short answer is, no.

If you use a function from the `numpy.random` namespace (like np.random.randint, np.random.normal, etc.) *without* using NumPy random seed first, NumPy will still seed the generator in the background. It will generate a seed value from a source of randomness on your computer system (like `/dev/urandom` on a Unix or Linux machine).

So essentially, if you don’t set a seed with numpy.random.seed, NumPy will set one for you.

However, this has a disadvantage!

If you don’t explicitly set a seed, your code will not have repeatable outputs. NumPy will generate a seed on its own, but that seed might change moment to moment. This will make your outputs different every time you run it.

So to summarize: you don’t absolutely have to use numpy.random.seed, but you *should* use it if you want your code to have repeatable outputs.

Ok.

We’re really getting into the weeds here.

Essentially, numpy.random.seed sets the seed for the global RandomState instance that the functions in the numpy.random namespace share.

On the other hand, np.random.RandomState creates a separate RandomState instance and does not affect the global one.

Confused?

That’s okay …. this answer is a little technical and it requires you to know a little about how NumPy is structured on the back end. It also requires you to know a little bit about programming concepts like “global variables.” If you’re a relative data science beginner, the details that you need to know might be over your head.

The important thing is that NumPy random seed is probably sufficient if you’re just using NumPy for some data science or scientific computing.

However, if you’re building software systems that need to be secure, NumPy random seed is probably not the right tool.

To summarize, np.random.seed is probably fine if you’re just doing simple analytics, data science, and scientific computing, but you need to learn more about RandomState if you want to use the NumPy pseudo-random number generator in systems where security is a consideration.
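If the global-versus-separate-instance distinction feels abstract, here’s a small sketch that may help. It creates its own RandomState instance (the seed 123 and the variable names are arbitrary choices of mine) and shows that drawing from it leaves the global generator untouched.

```python
import numpy as np

# Seed the global generator, then create a separate, local generator.
np.random.seed(0)
local_rng = np.random.RandomState(123)  # 123 is an arbitrary illustrative seed

# Drawing from the local instance does not advance the global sequence ...
local_draw = local_rng.randint(10, size=5)
global_draw = np.random.randint(10, size=5)

# ... so the global draw matches a fresh run seeded with 0.
np.random.seed(0)
fresh_global_draw = np.random.randint(10, size=5)
print(np.array_equal(global_draw, fresh_global_draw))
# True

# Two local instances with the same seed produce the same numbers.
twin_rng = np.random.RandomState(123)
twin_draw = twin_rng.randint(10, size=5)
print(np.array_equal(local_draw, twin_draw))
# True
```

A separate instance like this is handy when you want one piece of code to have its own reproducible stream of numbers, regardless of what other code does with the global seed.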

Now that I’ve explained the basics of NumPy random seed, I want to tell you a few applications …

Here’s where you might see the np.random.seed function.

It’s possible to do probability and statistics using NumPy.

Almost by definition, probability involves uncertainty and randomness. As such, if you use Python and NumPy to model probabilistic processes, you’ll need to use np.random.seed to generate pseudo-random numbers (or a similar tool in Python).

More specifically, if you’re doing random sampling with NumPy, you’ll need to use numpy.random.seed.

NumPy has a variety of functions for performing random sampling, including numpy random random, numpy random normal, and numpy random choice.

In almost every case, when you use one of these functions, you’ll need to use it in conjunction with numpy random seed if you want to create reproducible outputs.

Monte Carlo methods are a class of computational methods that rely on repeatedly drawing random samples.

I won’t go into the details here, since Monte Carlo methods are a little complicated, and beyond the scope of this post.

Essentially though, Monte Carlo methods are a powerful computational tool used in science and engineering. In fact, Monte Carlo methods were initially used at the Manhattan Project!

Monte Carlo methods require random numbers. In most cases, when these methods are used, they actually use *pseudo-random* numbers instead of true random numbers.
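To give you a feel for what this looks like in practice, here’s a classic toy example: estimating pi by drawing random points in a square and counting how many land inside a quarter circle. This is only a sketch, and the number of points is an arbitrary choice of mine.

```python
import numpy as np

np.random.seed(0)

# Arbitrary sample size, chosen for illustration.
n_points = 100_000

# Random (x, y) points in the unit square.
x = np.random.random(n_points)
y = np.random.random(n_points)

# A point is inside the quarter circle when x^2 + y^2 <= 1.
inside = (x**2 + y**2) <= 1.0

# The quarter circle covers pi/4 of the square, so scale by 4.
pi_estimate = 4 * inside.mean()
print(pi_estimate)
```

Because we set the seed first, the estimate is the same on every run. If you remove the call to np.random.seed, the estimate will wobble slightly from run to run.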

Interested in machine learning?

Great … it’s a powerful toolset, and it will be extremely important in the 21st century.

Broadly speaking, pseudo-random numbers are important in machine learning.

Performing simple tasks like splitting datasets into training and test sets requires random sampling. In turn, random sampling almost always requires pseudo-random numbers.

So if you’re doing machine learning in Python, you’ll almost certainly need to use NumPy random seed …
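As a rough sketch of what a seeded split can look like with plain NumPy, the code below shuffles the row indices and slices them into two groups. The toy dataset and the 80/20 split size are assumptions I’m making for illustration. In real projects, people often use a dedicated helper from a machine learning library instead.

```python
import numpy as np

np.random.seed(0)

# Toy "dataset" of 10 observations (hypothetical, for illustration).
data = np.arange(10)

# Shuffle the row indices in a repeatable way.
shuffled_indices = np.random.permutation(len(data))

# Use the first 80% of the shuffled indices for training, the rest for testing.
n_train = 8
train_indices = shuffled_indices[:n_train]
test_indices = shuffled_indices[n_train:]

train_set = data[train_indices]
test_set = data[test_indices]

print(len(train_set), len(test_set))
# 8 2
```

With the seed set, the same observations end up in the training set every time, which makes model results much easier to compare and debug.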

More specifically, you’ll also probably use pseudo-random numbers if you want to do deep learning.

For example, if you want to do deep learning in Python, you’ll often need to split datasets into training and test sets (just like with other machine learning techniques). Again, this requires pseudo-random numbers.

… so when people do deep learning in Python, you’ll frequently see at least a few uses of numpy.random.seed.

I’ve really only touched on a few applications of numpy.random.seed in Python. There are many more.

Speaking generally, if you want to use NumPy, you really need to know this little function.

But even though we focused on NumPy random seed in this tutorial, there are many other NumPy functions that you probably need to learn …

If you want to learn how to do data science in Python, NumPy is very important …

If you want to learn NumPy and data science in Python, then sign up for our email list.

Here at Sharp Sight, we teach data science.

… and we regularly post FREE data science tutorials just like this one.

If you want to get our free tutorials delivered directly to your email inbox, then sign up now.

If you sign up for our email list, you’ll get tutorials about:

- NumPy
- Pandas
- Matplotlib
- Seaborn
- Scikit-learn
- Machine learning
- Deep learning
- … and more

We also teach data science in R, so if you sign up, you’ll get tutorials for both languages.

So if you want to learn more data science for FREE, sign up now.
