How to Rename Dataframe Columns with Pandas Rename

In this tutorial, I’ll explain how to use the Pandas rename method to rename columns in a Python dataframe.

I’ll explain what the technique does, how the syntax works, and I’ll show you clear examples of how to use it.

If you need something specific, you can click on any of the following links.
Table of Contents:

Ok. Let’s start with a quick introduction to the rename method.

A quick introduction to Pandas rename

The Pandas rename method is fairly straight-forward: it enables you to rename the columns or rename the row labels of a Python dataframe.

This technique is most often used to rename the columns of a dataframe (i.e., the variable names).

An image that shows a simple example of the Pandas rename method, renaming a column in a Pandas dataframe.

But again, it can also rename the row labels (i.e., the labels in the dataframe index).

I’ll show you examples of both of these in the examples section.

But first, let’s take a look at the syntax.

The syntax of Pandas rename

Ok, now that I’ve explained what the Pandas rename method does, let’s look at the syntax.

Here, I’ll show you the syntax for how to rename Pandas columns, and also how to rename Pandas row labels.

A quick note

Everything that I’m about to describe assumes that you’ve imported Pandas and that you already have a Pandas dataframe created.

You can import pandas with the following code:

import pandas as pd

And if you need a refresher on Pandas dataframes and how to create them, you can read our tutorial on Pandas dataframes.

Syntax to rename Pandas columns

Ok, let’s start with the syntax to rename columns. (The syntax for renaming columns and renaming rows labels is almost identical, but let’s just take it one step at a time.)

When we use the rename method, we actually start with our dataframe. You type the name of the dataframe, and then .rename() to call the method.

An image that explains the syntax for how to rename columns in a Pandas dataframe.

Inside the parenthesis, you’ll use the columns parameter, which enables you to specify the columns that you want to change.

How to use the columns parameter

Let’s look carefully at how to use the columns parameter.

When you change column names using the rename method, you need to present the old column name and new column name inside of a Python dictionary.

An image that explains that the argument to the 'columns' parameter must be a dictionary of 'old name':'new name' pairs.

So if the old variable name is old_var and the new variable name is new_var, you would present to the columns parameter as key/value pairs, inside of a dictionary: columns = {'old_var':'new_var'}.

If you have multiple columns that you want to change, you simply provide multiple old name/new name dictionary pairs, separated by commas.

To do this properly, you really need to understand Python dictionaries, so if you need a refresher, then read about how dictionaries are structured.

Syntax to rename Pandas row labels

Now, let’s look at the syntax for renaming row labels (i.e., the labels in the dataframe index).

An image that shows the syntax for how to rename the row labels of a Pandas dataframe.

You’ll notice that the syntax is almost exactly the same as the syntax for changing the column names.

The major difference is that instead of using the columns parameter, when you want to change the row labels, you need to use the index parameter instead.

Beyond that you’ll still use a dictionary with old name/new name pairs as the argument to the index parameter.

The parameters of Pandas rename

Now, let’s take a look at the parameters of the rename method.

The most important parameters that you should know are:

  • columns
  • index
  • inplace

The rename method also has additional parameters – axis, copy, levels, and errors – but these are more advanced and less commonly used, so I won’t cover them here.

columns

The columns parameter enables you to specify the column names you want to change, and what to change them to.

As described above, the argument to this parameter can be a dictionary, but it can also be a function.

When you provide a dictionary, it the values should be structured as old name/new name pairs, like this: {'old_var':'new_var'}.

index

The index parameter is very similar to the columns parameter, except it operates on row index labels instead of column names.

So the index parameter enables you to specify the row labels you want to change, and what to change them to.

As described above, the argument to this parameter can be a dictionary, but it can also be a function.

When you provide a dictionary, it the values should be structured as old name/new name pairs, like this: {'old_var':'new_var'}.

inplace

The inplace parameter enables you to force the rename method to directly modify the dataframe that’s being operated on.

By default, inplace is set to inplace = False. This causes the rename method to produce a new dataframe. In this case, the original dataframe is left unchanged.

If you set inplace = True, the rename method will directly alter the original dataframe, and overwrite the data directly. Be careful with this, and make sure that your code is doing exactly what you want it to.

The output of Pandas rename

By default, the rename method will output a new Python dataframe, with new column names or row labels. As noted above, this means that by default, rename will leave the original dataframe unchanged.

If you set inplace = True, rename won’t produce any new output. In this case, rename will instead directly modify and overwrite the dataframe that’s being operated on.

Examples: How to rename Pandas columns and Pandas row labels

Now that we’ve looked at the syntax, let’s look at some examples.

Examples:

However, before you run the examples, you’ll need to run some preliminary code to import the modules we need, and to create the dataset we’ll be operating on.

Import modules

First, let’s import Pandas.

You can do that with the following code:

#===============
# IMPORT MODULES
#===============
import pandas as pd

Notice that we’re importing Pandas with the alias pd. This makes it possible to refer to Pandas as pd in our code, which is the common convention among Python data scientists.

Create DataFrame

Next, we’ll create a dataframe that we can operate on.

We’ll create our dataframe in three steps:

  • create a Python dictionary
  • create a Pandas dataframe from the dictionary
  • specify the columns and row labels (i.e., the index)

Let’s start by creating a Python dictionary. As you can see, this dictionary contains economic data for several countries.

#==========================
# CREATE DICTIONARY OF DATA
#==========================
country_data_dict = {
    'country_code':['USA', 'CHN', 'JPN', 'GER', 'UK', 'IND']
    ,'country':['USA', 'China', 'Japan', 'Germany', 'UK', 'India']
    ,'continent':['North America','Asia','Asia','Europe','Europe','Asia']
    ,'gross_domestic_product':[19390604, 12237700, 4872137, 3677439, 2622434, 2597491]
    ,'pop':[322179605, 1403500365, 127748513, 81914672, 65788574, 1324171354]
}

Next, we’ll create a DataFrame from the dictionary:

#=================================
# CREATE DATAFRAME FROM DICTIONARY
#=================================
country_data = pd.DataFrame(country_data_dict, columns = ['country_code','country', 'continent', 'gross_domestic_product', 'pop'])

Note that in this step, we’re setting the column names using the columns parameter inside of pd.DataFrame().

Finally, we’ll set the row labels (i.e., the index). By default, Pandas uses a numeric integer index that starts at 0.

We’re going to change that default index to character data.

Specifically, we’re going to use the values in the country_code variable as our new row labels.

To do this, we’ll use the Pandas set_index method:

country_data = country_data.set_index('country_code')

Notice that we’re assigning the output of set_index() to the original dataframe name, country_data, using the equal sign. This is because by default, the output of set_index() is a new dataframe object by default. By default, set_index() does not modify the DataFrame in place.

Ok. Now that we have our dataframe, let’s print it out to look at the contents:

print(country_data)

OUT:

              country      continent  gross_domestic_product         pop
country_code                                                            
USA               USA  North America                19390604   322179605
CHN             China           Asia                12237700  1403500365
JPN             Japan           Asia                 4872137   127748513
GER           Germany         Europe                 3677439    81914672
UK                 UK         Europe                 2622434    65788574
IND             India           Asia                 2597491  1324171354

This dataframe has economic data for six countries, organized in a row-and-column structure.

The dataframe has four columns: country, continent, gross_domestic_product , and pop.

Additionally, notice that the “country_code” variable is set aside off to the left. That’s because we’ve set the country_code column as the index. The values of country_code will now function as the row labels of the dataframe.

Now that we have this dataframe, we’ll be able to use the rename() method to rename the columns and the row labels.

EXAMPLE 1: Rename one dataframe column

First, let’s start simple.

Here, we’re going to rename a single column name.

Specifically, we’re going to rename the gross_domestic_product variable to GDP.

Let’s run the code, and then I’ll explain.

country_data.rename(columns = {'gross_domestic_product':'GDP'})

OUT:

              country      continent       GDP         pop
country_code                                              
USA               USA  North America  19390604   322179605
CHN             China           Asia  12237700  1403500365
JPN             Japan           Asia   4872137   127748513
GER           Germany         Europe   3677439    81914672
UK                 UK         Europe   2622434    65788574
IND             India           Asia   2597491  1324171354
Explanation

Notice in the output that gross_domestic_product has been renamed to GDP.

How did we do it?

We typed the name of the dataframe, country_data, and then used so-called “dot syntax” to call the rename() method.

Inside the parenthesis, we’re using the columns parameter to specify the columns we want to rename, and the new name that we want to use. This ‘old name’/’new name’ pair is presented as a dictionary in the form {'old name':'new name'}.

So we have the synax columns = {'gross_domestic_product':'GDP'}, which is basically saying change the column name 'gross_domestic_product' to 'GDP'.

Remember: the original dataframe is unchanged

One more thing to point out here: when we run this code, the original dataframe will remain unchanged.

That’s because by default, the Pandas rename method produces a new dataframe as an output and leaves the original unchanged. And by default, this output will be sent directly to the console. So when we run our code like this, we’ll see the new dataframe with the new name in the console, but the original dataframe will be left the same.

If you want to save the output, you can use the assignment operator like this:

country_data_new = country_data.rename(columns = {'gross_domestic_product':'GDP'})

Here, I’ve given the output dataframe the new name country_data_new.

We can call it anything we like. We could even call it country_data. Just be careful if you do country_data_new = country_data.rename(...), it will overwrite your original dataset. Make sure your code works perfectly before you do this!

EXAMPLE 2: Rename multiple columns

Next, let’s make things a little more complicated.

Here, we’ll rename multiple columns at the same time.

The way to do this is very similar to the code in example 1, except here, we’ll provide more old name/new name pairs in our dictionary.

Specifically, we’ll rename gross_domestic_product to GDP, and we’ll rename pop to population.

Let’s take a look.

country_data.rename(columns = {'gross_domestic_product':'GDP', 'pop': 'population'})

OUT:

              country      continent       GDP  population
country_code                                              
USA               USA  North America  19390604   322179605
CHN             China           Asia  12237700  1403500365
JPN             Japan           Asia   4872137   127748513
GER           Germany         Europe   3677439    81914672
UK                 UK         Europe   2622434    65788574
IND             India           Asia   2597491  1324171354
Explanation

This should make sense if you understood example 1.

Here, we’re calling the rename() method using dot syntax.

Inside the parenthesis, we have the code columns = {'gross_domestic_product':'GDP', 'pop': 'population'}.

Look inside the dictionary (i.e., inside the curly brackets). Here, we have two old name/new name pairs. These are organized as key/value pairs, just as we normally have inside of a dictionary.

This should be simple to understand, as long as you understand how dictionaries are structured. If you don’t make sure to review Python dictionaries.

EXAMPLE 3: Rename row labels

Now, let’s rename some of the row labels.

Specifically, we’re going to rename the labels GER and UK.

To do this, we’re going to use the index parameter.

Let’s take a look.

country_data.rename(index = {'GER':'DEU','UK':'GBR'})

OUT:

              country      continent  gross_domestic_product         pop
country_code                                                            
USA               USA  North America                19390604   322179605
CHN             China           Asia                12237700  1403500365
JPN             Japan           Asia                 4872137   127748513
DEU           Germany         Europe                 3677439    81914672
GBR                UK         Europe                 2622434    65788574
IND             India           Asia                 2597491  1324171354
Explanation

Here, we’re renaming GER to DEU and we’re renaming UK to GBR.

To do this, we called the rename method and used the code index = {'GER':'DEU','UK':'GBR'} inside the parenthesis.

The index parameter enables us to specify the row labels that we want to change. And we’re using the dictionary as the argument, which contains the old value/new value pairs.

EXAMPLE 4: Change the column names and row labels ‘in place’

Finally, let’s change some of the columns and row labels ‘in place’.

As I mentioned in example 1, and in the syntax section, by default, the rename method leaves the original dataframe unchanged. That’s because by default, the inplace parameter is set to inplace = False. This causes the rename method to produce a new dataframe as the output, while leaving the original dataframe unchanged.

But sometimes, we actually want to modify the original dataframe directly.

To do this we can set inplace = True.

Create copy of dataframe

Before we run our code, we’re actually going to make a copy of our data.

The reason is that we’re going to directly overwrite a dataframe. This can be dangerous if you get it wrong, so we’re actually going to work with a copy of the original.

country_data_copy = country_data.copy()

Now, we have a dataframe, country_data_copy, which contains the same data as the original.

Rename columns and labels in place

Now, we’re going to directly rename the columns and row labels of country_data_copy.

To do this, we’ll use rename with the inplace parameter as follows:

country_data_copy.rename(index = {'GER':'DEU','UK':'GBR'}
                        ,columns = {'gross_domestic_product':'GDP', 'pop': 'population'}
                        ,inplace = True
                        )

And now, let’s print out the data, so we can see it.

print(country_data_copy)

OUT:

              country      continent       GDP  population
country_code                                              
USA               USA  North America  19390604   322179605
CHN             China           Asia  12237700  1403500365
JPN             Japan           Asia   4872137   127748513
DEU           Germany         Europe   3677439    81914672
GBR                UK         Europe   2622434    65788574
IND             India           Asia   2597491  1324171354
Explanation

As you can see in the output the row labels and column names have been changed directly in country_data_copy.

We simply did this by setting inplace = True.

Again: be careful with this. When you do this, you’ll directly modify and overwrite your data. Check your code and double check again to make sure that your code works correctly before using inplace = True.

Frequently asked questions about Pandas rename

Let’s quickly cover a common question about the Pandas rename technique.

Frequently asked questions:

Question 1: Why is my dataframe unchanged after I use the rename function?

Remember: by default, the Pandas rename function creates a new dataframe as an output, but leaves the original dataframe unchanged. (I mentioned this in the syntax section.)

The reason is that by default, the inplace parameter is set to inplace = False. Again, this leaves the original dataframe unchanged, and simply produces a new dataframe as the output.

If you want to directly modify your original dataframe, you need to set inplace = False. I show how to do this in example 4.

Leave your other questions in the comments below

Do you have any other questions about the Pandas rename method?

Is there something that you’re struggling with that I haven’t covered here?

If so, leave your questions in the comments section near the bottom of the page.

For more Pandas tutorials, sign up for our email list

This tutorial should have given you a good idea of how to rename columns in Python using the Pandas rename method.

But to really understand data manipulation in Python, you’ll need to know quite a few more techniques.

Moreover, to understand data science more broadly, there’s really a lot more to learn.

Having said that, if you want to learn data manipulation with Pandas, and data science in Python, then sign up for our email newsletter.

When you sign up, you’ll get free tutorials on:

  • NumPy
  • Pandas
  • Base Python
  • Scikit learn
  • Machine learning
  • Deep learning
  • … and more.

We publish data science tutorials every week, and when you sign up for our email list, we’ll deliver those tutorials directly to your inbox.

Joshua Ebner

Joshua Ebner is the founder, CEO, and Chief Data Scientist of Sharp Sight.   Prior to founding the company, Josh worked as a Data Scientist at Apple.   He has a degree in Physics from Cornell University.   For more daily data science advice, follow Josh on LinkedIn.

2 thoughts on “How to Rename Dataframe Columns with Pandas Rename”

  1. When I want to explore the ‘pop’ column from DataFrame, I can not use the attribute syntax, that is it returns the whole DataFrame. Why?

    Reply

Leave a Comment