How to rename columns in Pandas Dataframe

By mullaned2002

June 4, 2021

In this tutorial, we cover several ways to rename columns in a pandas DataFrame in Python. Renaming columns is one of the most common data wrangling tasks. If you do not have a programming background and have only worked in Excel spreadsheets, it may feel harder in Python, since in MS Excel you rename a column by simply typing the new name into the header cell. If you come from a database background, it is similar to a column alias in SQL. pandas is a popular Python package for data manipulation that simplifies these kinds of operations.

2 Methods to rename columns in Pandas

There are two simple ways to rename columns in pandas.

The first step is to install the pandas package if it is not already installed. You can check whether it is installed on your machine by running !pip show pandas in an IPython console. If it is not, install it with !pip install pandas.
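As a complement to the pip commands above, the installation can also be checked from Python itself. This is only a minimal sketch; the variable name installed_version is an illustrative choice, not part of the original tutorial.

```python
import pandas as pd

# If this import succeeds, pandas is installed; the version string
# tells you which release you are running.
installed_version = pd.__version__
print("pandas version:", installed_version)
```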

Import Dataset for practice

To import the dataset, we use the read_csv() function from the pandas package.

import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/JackyP/testing/master/datasets/nycflights.csv", usecols=range(1,17))

To see the column names of a DataFrame, run the command below:

df.columns

Index(['year', 'month', 'day', 'dep_time', 'dep_delay', 'arr_time',
       'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
       'air_time', 'distance', 'hour', 'minute'],
      dtype='object')

Method I : rename() function

Suppose you want to rename the column year to years. The code below creates a new DataFrame named df2 with the new column name and the same values.

df2 = df.rename(columns={'year': 'years'})

To change the existing DataFrame df instead, pass inplace=True.

df.rename(columns={'year': 'years'}, inplace=True)

inplace defaults to False, so you need to set it to True explicitly. To rename multiple columns, add more old-name/new-name pairs to the dictionary, separated by commas.

df.rename(columns={'year': 'years', 'month': 'months'}, inplace=True)
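The behaviour described above can be checked on a small example. The sketch below uses a toy DataFrame with made-up values as a stand-in for the flights data (an assumption, so it runs without downloading anything). It also illustrates one detail not covered above: by default rename() silently ignores keys that match no column, and the errors="raise" option turns such a typo into a KeyError.

```python
import pandas as pd

# Toy stand-in for the flights data (hypothetical values).
df = pd.DataFrame({"year": [2013, 2013], "month": [1, 2], "day": [1, 15]})

# Without inplace=True, rename() returns a new DataFrame and leaves df untouched.
df2 = df.rename(columns={"year": "years"})
print(list(df.columns))   # ['year', 'month', 'day']
print(list(df2.columns))  # ['years', 'month', 'day']

# A key that matches no column (here a deliberate typo) is ignored by default.
df3 = df.rename(columns={"yaer": "years"})
print(list(df3.columns))  # ['year', 'month', 'day']

# With errors="raise", the same typo raises a KeyError instead.
caught = None
try:
    df.rename(columns={"yaer": "years"}, errors="raise")
except KeyError as exc:
    caught = exc
    print("KeyError:", exc)
```

Passing errors="raise" is a reasonable habit in scripts where a misspelled column name should stop the run rather than pass unnoticed.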

Method II : dataframe.columns = [list]

You can also assign a list of new column names to df.columns. The list must contain one name for every column, in order. The example below renames the year and month columns.

df.columns = ['years', 'months', 'day', 'dep_time', 'dep_delay', 'arr_time',
              'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
              'air_time', 'distance', 'hour', 'minute']
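Because this method replaces every name at once, the list has to match the number of columns. The sketch below, again on a hypothetical three-column toy DataFrame, shows a successful assignment and the ValueError pandas raises when the list is too short.

```python
import pandas as pd

# Toy three-column frame (hypothetical values).
df = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# One new name per column, in the existing column order.
df.columns = ["years", "months", "day"]
print(list(df.columns))  # ['years', 'months', 'day']

# A list with the wrong number of names is rejected.
caught = None
try:
    df.columns = ["years", "months"]
except ValueError as exc:
    caught = exc
    print("ValueError:", exc)
```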

Rename columns having pattern

Suppose you want to rename every column that has an underscore '_' in its name by removing the underscore.

df.columns = df.columns.str.replace('_', '')

The new column names are as follows. Note that none of them contains an underscore.

Index(['year', 'month', 'day', 'deptime', 'depdelay', 'arrtime', 'arrdelay',
       'carrier', 'tailnum', 'flight', 'origin', 'dest', 'airtime', 'distance',
       'hour', 'minute'],
      dtype='object')
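Pattern-based renaming can also be written by passing a function to rename(), which applies it to each column name. The sketch below shows both styles on an empty toy DataFrame. The column names are a subset of the flights columns, and regex=False is an added assumption that makes explicit that '_' is treated as literal text.

```python
import pandas as pd

# Empty toy frame: only the column names matter here.
df = pd.DataFrame(columns=["year", "dep_time", "arr_delay"])

# Literal replacement on the column Index.
no_underscore = df.columns.str.replace("_", "", regex=False)
print(list(no_underscore))  # ['year', 'deptime', 'arrdelay']

# rename() also accepts a function that is applied to every column name.
stripped = df.rename(columns=lambda name: name.replace("_", ""))
upper = df.rename(columns=str.upper)
print(list(stripped.columns))  # ['year', 'deptime', 'arrdelay']
print(list(upper.columns))     # ['YEAR', 'DEP_TIME', 'ARR_DELAY']
```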

Rename columns by Position

If you want to rename a column by position (for example, the first column), you can use the code below. df.columns[0] refers to the name of the first column.

df.rename(columns={df.columns[0]: "Col1"}, inplace=True)
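The reason for going through rename() is that the column Index is immutable, so a single position cannot be assigned directly. The sketch below demonstrates this on a hypothetical toy DataFrame and then renames the first and last columns by position. The name "ColLast" is illustrative only.

```python
import pandas as pd

# Toy three-column frame (hypothetical values).
df = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# Assigning to one position of the Index fails because an Index is immutable.
caught = None
try:
    df.columns[0] = "Col1"
except TypeError as exc:
    caught = exc
    print("TypeError:", exc)

# Instead, look up the current names by position and rename by name.
df = df.rename(columns={df.columns[0]: "Col1", df.columns[-1]: "ColLast"})
print(list(df.columns))  # ['Col1', 'month', 'ColLast']
```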

Rename columns in sequence

If you want to rename the columns to a numbered sequence, you can build the names with a list comprehension.

df.columns = ["Col" + str(i) for i in range(1, 17)]

Rather than hard-coding 17, you can use df.shape[1], which returns the number of columns in the DataFrame. We add 1 because range excludes its end value: range(1, 17) yields 1, 2, 3 through 16.

df.columns = ["Col" + str(i) for i in range(1, df.shape[1] + 1)]
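The off-by-one reasoning above is easy to verify on a small case. This sketch assumes a toy three-column DataFrame and uses an f-string in place of "Col" + str(i), which is purely a stylistic choice.

```python
import pandas as pd

# Toy frame with three columns (hypothetical values).
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])

n_cols = df.shape[1]  # number of columns, 3 here

# range(1, n_cols + 1) yields 1..n_cols, one number per column.
df.columns = [f"Col{i}" for i in range(1, n_cols + 1)]
print(list(df.columns))  # ['Col1', 'Col2', 'Col3']
```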

Add prefix / suffix in column names

If you want to add some text before or after the existing column names, you can use the add_prefix() and add_suffix() methods.

df = df.add_prefix('V_')
df = df.add_suffix('_V')
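Both methods return a new DataFrame, which is why the result is assigned back to df above. The sketch below, on a hypothetical two-column toy DataFrame, shows each method separately and then chained. Running both lines in sequence, as written above, gives names such as V_year_V.

```python
import pandas as pd

# Toy two-column frame (hypothetical values).
df = pd.DataFrame({"year": [2013], "month": [1]})

prefixed = df.add_prefix("V_")
suffixed = df.add_suffix("_V")
both = df.add_prefix("V_").add_suffix("_V")

print(list(prefixed.columns))  # ['V_year', 'V_month']
print(list(suffixed.columns))  # ['year_V', 'month_V']
print(list(both.columns))      # ['V_year_V', 'V_month_V']
print(list(df.columns))        # ['year', 'month'] (original unchanged)
```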

How to access columns having space in names

For demonstration purposes, we can put spaces in some column names with the code below.

df.columns = df.columns.str.replace('_', ' ')

You can then access a column using the syntax df["column name"].

df["arr delay"]
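The sketch below works through the same idea on a toy DataFrame with made-up delay values. Bracket access is the form described above. The backtick quoting inside query() is an additional technique not covered in the original text, included on the assumption that you are on a reasonably recent pandas release.

```python
import pandas as pd

# Toy frame with made-up delay values.
df = pd.DataFrame({"arr_delay": [11.0, -5.0, 20.0],
                   "dep_delay": [2.0, 4.0, -1.0]})

# Introduce spaces into the column names for demonstration.
df.columns = df.columns.str.replace("_", " ", regex=False)
print(list(df.columns))  # ['arr delay', 'dep delay']

# Bracket access works with any column name, spaces included.
delays = df["arr delay"]
print(delays.tolist())  # [11.0, -5.0, 20.0]

# Inside query(), names containing spaces are wrapped in backticks.
late = df.query("`arr delay` > 0")
print(len(late))  # 2
```

Attribute-style access (df.some_name) is not available for such columns, since a name with a space is not a valid Python identifier.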

How to change row names

With the index option, you can rename rows (the index). The code below changes row labels 0 and 1 to 'First' and 'Second' in the DataFrame df by passing a dictionary whose keys are the existing row labels and whose values are the new ones.

df.rename(index={0: 'First', 1: 'Second'}, inplace=True)
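As a quick check of the row renaming described above, the sketch below applies the same dictionary to a hypothetical three-row toy DataFrame. Labels that are not listed in the dictionary (here, 2) are left as they are.

```python
import pandas as pd

# Toy frame with the default integer index 0, 1, 2 (hypothetical values).
df = pd.DataFrame({"year": [2013, 2014, 2015]})

df = df.rename(index={0: "First", 1: "Second"})
print(list(df.index))           # ['First', 'Second', 2]

# The new labels can be used with .loc.
first_year = df.loc["First", "year"]
print(first_year)               # 2013
```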
