In pandas, there are two simple ways to rename columns.
The first step is to install the pandas package if it is not already installed. You can check whether the package is installed on your machine by running !pip show pandas in an IPython console. If it is not installed, you can install it with the command !pip install pandas.
To import the dataset, we use the read_csv() function from the pandas package.
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/JackyP/testing/master/datasets/nycflights.csv", usecols=range(1,17))
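The usecols=range(1,17) argument above keeps the columns at positions 1 through 16 and skips position 0. A minimal offline sketch of the same idea, assuming a hypothetical three-row CSV whose first, unnamed column is a saved row index (the layout of the real file is not checked here):

```python
import io

import pandas as pd

# Hypothetical sample data; the leading unnamed column mimics a saved row index.
csv_text = """,year,month,day
1,2013,1,1
2,2013,1,1
3,2013,1,2
"""

# usecols=range(1, 4) keeps positions 1, 2 and 3 and drops position 0.
sample = pd.read_csv(io.StringIO(csv_text), usecols=range(1, 4))
print(sample.columns.tolist())
```

Selecting by position this way avoids having to name the unnamed first column.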
To see the names of the columns in a data frame, run the command below:
df.columns
Index(['year', 'month', 'day', 'dep_time', 'dep_delay', 'arr_time',
'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
'air_time', 'distance', 'hour', 'minute'],
dtype='object')
Suppose you want to rename the column year to years. The code below creates a new dataframe named df2 with the new column name and the same values.
df2 = df.rename(columns={'year':'years'})
If you want to make the change in the same dataframe df, use the option inplace = True.
df.rename(columns={'year':'years'}, inplace = True)
By default inplace = False, so you need to specify this option and set it to True.
If you want to rename multiple columns, add the other old-name/new-name pairs to the dictionary, separated by commas.
df.rename(columns={'year':'years', 'month':'months'}, inplace = True)
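The difference between the copy and the in-place form can be checked on a small dataframe. This is a sketch that assumes a hypothetical three-column miniature of the flights data. It also uses the errors="raise" option of rename, which is not mentioned above, to show that a misspelled key is otherwise ignored without any warning.

```python
import pandas as pd

# Hypothetical miniature of the flights data, so the example runs offline.
df = pd.DataFrame({"year": [2013, 2013], "month": [1, 2], "day": [1, 15]})

# Without inplace, rename returns a new dataframe and leaves df untouched.
df2 = df.rename(columns={"year": "years"})
original_after_copy = df.columns.tolist()

# With inplace=True, df itself is modified and the call returns None.
result = df.rename(columns={"year": "years", "month": "months"}, inplace=True)

# A misspelled key is silently ignored by default; errors="raise" surfaces it.
try:
    df.rename(columns={"yaer": "years"}, errors="raise")
    typo_caught = False
except KeyError:
    typo_caught = True
```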
You can also assign a list of new column names to df.columns. The list must contain one name for every column, in order. In the example below, we rename the year and month columns.
df.columns = ['years', 'months', 'day', 'dep_time', 'dep_delay', 'arr_time',
'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
'air_time', 'distance', 'hour', 'minute']
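Because the assigned list replaces every name at once, its length has to match the number of columns. A small sketch, using a hypothetical three-column dataframe, of what happens when it does not:

```python
import pandas as pd

# Hypothetical three-column dataframe standing in for the flights data.
df = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# A list with one name per column works.
df.columns = ["years", "months", "day"]

# A list that is too short raises ValueError and leaves the names unchanged.
try:
    df.columns = ["years", "months"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
```

This is one reason rename with a dictionary is often the safer choice when only a few names change.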
Suppose you want to rename the columns that have an underscore '_' in their names by removing the underscore.
df.columns = df.columns.str.replace('_' , '')
The new column names are as follows. You can see that they no longer contain underscores.
Index(['year', 'month', 'day', 'deptime', 'depdelay', 'arrtime', 'arrdelay',
'carrier', 'tailnum', 'flight', 'origin', 'dest', 'airtime', 'distance',
'hour', 'minute'],
dtype='object')
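The underscore removal can be tried on a few names without loading the full dataset. In this sketch the three column names are a hypothetical subset, and regex=False is passed explicitly (an addition to the call above) so the pattern is treated as a literal string whichever pandas version is installed.

```python
import pandas as pd

# Empty dataframe with a hypothetical subset of the flight column names.
df = pd.DataFrame(columns=["dep_time", "dep_delay", "carrier"])

# regex=False makes '_' a literal string rather than a regular expression.
df.columns = df.columns.str.replace("_", "", regex=False)
print(df.columns.tolist())
```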
If you want to change the name of a column by position (for example, renaming the first column), you can use the code below. df.columns[0] refers to the first column.
df.rename(columns={ df.columns[0]: "Col1" }, inplace = True)
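A short sketch of renaming by position on a hypothetical three-column dataframe. It also shows why the lookup through rename is needed: the column index is immutable, so assigning to df.columns[1] directly fails. Editing a list copy is an alternative added here for comparison.

```python
import pandas as pd

# Hypothetical three-column dataframe.
df = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# Rename the first column by looking up its current label.
df.rename(columns={df.columns[0]: "Col1"}, inplace=True)

# An Index is immutable, so item assignment on df.columns raises TypeError.
try:
    df.columns[1] = "Col2"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

# Alternative: edit a plain list copy and assign it back.
names = df.columns.tolist()
names[1] = "Col2"
df.columns = names
```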
If you want to rename the columns as a sequence of numbers, you can build the names with a list comprehension.
df.columns=["Col"+str(i) for i in range(1, 17)]
The version below avoids hard-coding the number of columns: df.shape[1] returns the number of columns in the dataframe. We need to add 1 because range excludes its upper bound, just as range(1,17) returns 1, 2, 3 through 16 (excluding 17).
df.columns=["Col"+str(i) for i in range(1, df.shape[1] + 1)]
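The same numbering can be checked on a small dataframe. This sketch assumes a hypothetical three-column dataframe and uses an f-string in place of string concatenation, which is purely a style choice.

```python
import pandas as pd

# Hypothetical three-column dataframe.
df = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# df.shape is (rows, columns), so df.shape[1] is the column count.
n_cols = df.shape[1]

# range(1, n_cols + 1) yields 1..n_cols because the upper bound is excluded.
df.columns = [f"Col{i}" for i in range(1, n_cols + 1)]
print(df.columns.tolist())
```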
If you want to add some text before or after the existing column names, you can use the add_prefix() and add_suffix() methods.
df = df.add_prefix('V_')
df = df.add_suffix('_V')
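Both methods return a new dataframe instead of modifying the original, which is why the result is assigned back to df above. A minimal sketch on a hypothetical two-column dataframe:

```python
import pandas as pd

# Hypothetical two-column dataframe.
df = pd.DataFrame({"year": [2013], "month": [1]})

# Each call returns a new dataframe; df keeps its original names.
prefixed = df.add_prefix("V_")
suffixed = df.add_suffix("_V")

# Chaining the two calls applies both, as the two assignments above do.
both = df.add_prefix("V_").add_suffix("_V")
```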
For demonstration purposes, we can put spaces in some column names by using df.columns = df.columns.str.replace('_' , ' '). You can then access such a column using the syntax df["columnname"].
df["arr delay"]
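A runnable sketch of the bracket access, assuming a hypothetical dataframe with two delay columns and made-up values. Bracket syntax is needed here because a name containing a space cannot be written as an attribute after the dot.

```python
import pandas as pd

# Hypothetical delay values for illustration only.
df = pd.DataFrame({"arr_delay": [11.0, 20.0], "dep_delay": [2.0, 4.0]})

# Replace underscores with spaces in every column name.
df.columns = df.columns.str.replace("_", " ", regex=False)

# Bracket access works for any column label, including ones with spaces.
arr = df["arr delay"]
print(arr.tolist())
```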
With the index option, you can rename rows (the index). In the code below, we change the row names 0 and 1 to 'First' and 'Second' in dataframe df. This is done by passing a dictionary with the previous row names as keys and the new row names as values.
df.rename(index={0:'First',1:'Second'}, inplace=True)
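The row renaming can be verified on a small dataframe with the default integer index. This sketch assumes a hypothetical three-row dataframe. Rows that are not in the dictionary keep their existing label.

```python
import pandas as pd

# Hypothetical three-row dataframe with the default index 0, 1, 2.
df = pd.DataFrame({"year": [2013, 2014, 2015]})

# Keys are the current row labels, values are the new ones.
df.rename(index={0: "First", 1: "Second"}, inplace=True)

# Renamed rows can now be selected by their new label.
first_year = df.loc["First", "year"]
print(df.index.tolist())
```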