The pandas DataFrame has become a mainstream tool, used for data analytics, machine learning, data engineering, and even software development. Renaming columns is often one of the first steps in data cleaning, which is a core part of data analytics.
In this mini tutorial, we will review four methods for renaming one or more columns.
- Method 1: using rename() function.
- Method 2: assigning list of new column names.
- Method 3: replacing strings in the column names.
- Method 4: using set_axis() function.
Creating Pandas DataFrame
We will first create a simple dictionary of student class performance. It consists of three columns, "id", "name", and "grade", and five rows.
To convert a Python dictionary to a pandas dataframe, we will use the pandas DataFrame() function and display the results using Deepnote (Cloud Jupyter Notebook).
Note: we will use the `student_dict` dictionary multiple times to create a dataframe for every method.
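The dictionary itself is not reproduced in the text, so the following is only a minimal sketch of the setup. The values are made up for illustration; only the three column names and the five-row shape come from the description above.

```python
import pandas as pd

# Hypothetical sample data: the values are placeholders, not the author's originals.
student_dict = {
    "id": [1, 2, 3, 4, 5],
    "name": ["Timothy", "Nisha", "Saad", "Matt", "Ana"],
    "grade": ["A", "B", "A", "C", "B"],
}

# Build the dataframe that Method 1 works on.
student_df_1 = pd.DataFrame(student_dict)
print(student_df_1)
```

Each dictionary key becomes a column name, and each list becomes that column's values.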
Method 1
The first method is quite simple. We will use the pandas `rename()` function to relabel the column names.
Rename a Single Column
In this example, we will rename a single column using `.rename()`. You just have to provide a dictionary of old and new column names to the `columns` argument.
For example: `{"old_column_name": "new_column_name"}`
As we can observe, we have replaced “id” with “ID” successfully.
student_df_1.rename(columns={"id": "ID"}, inplace=True)
student_df_1
Note: `inplace=True` modifies the existing dataframe directly instead of returning a new one. The end result is the same as writing `df = df.rename(...)`.
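To make the difference concrete, here is a small sketch that compares the two styles. It uses a made-up two-row dataframe, and the variable names are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical two-row dataframe, used only for this comparison.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Without inplace, rename() returns a new dataframe and leaves df alone.
renamed = df.rename(columns={"id": "ID"})
columns_before = list(df.columns)

# With inplace=True, rename() changes df itself and returns None.
result = df.rename(columns={"id": "ID"}, inplace=True)
columns_after = list(df.columns)

print(columns_before, columns_after, result)
```

Because the in-place call returns `None`, writing `df = df.rename(..., inplace=True)` would overwrite the dataframe with `None`, so the two styles should not be mixed.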
Rename Multiple Columns
For multiple columns, we add more old-name and new-name pairs to the same dictionary, separated by commas, and all of them are renamed in one call. Note that the first key is "ID", not "id", because we already renamed that column in the previous step.
The new column names are "Student_ID", "First_Name", and "Average_Grade".
student_df_1.rename(
    columns={"ID": "Student_ID", "name": "First_Name", "grade": "Average_Grade"},
    inplace=True,
)
student_df_1
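One detail that can cause confusion here: by default, `rename()` silently skips dictionary keys that do not match any column. The sketch below, on a made-up dataframe, shows that behavior and how the `errors="raise"` argument turns a mismatched key into a `KeyError`.

```python
import pandas as pd

# Hypothetical dataframe whose first column is already called "ID".
df = pd.DataFrame({"ID": [1, 2], "name": ["a", "b"], "grade": ["A", "B"]})

# "id" does not exist any more, so this call changes nothing and raises no error.
unchanged = df.rename(columns={"id": "Student_ID"})

# With errors="raise", the same mismatch is reported as a KeyError.
try:
    df.rename(columns={"id": "Student_ID"}, errors="raise")
    raised = False
except KeyError:
    raised = True

print(list(unchanged.columns), raised)
```

Using `errors="raise"` can be a cheap safeguard when the old column names come from a file whose header might change.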
Method 2
The second method is straightforward. We will rename the columns by assigning the list of new names to the `columns` attribute of the DataFrame object.
For example, we create a new dataframe from the dictionary and rename the columns by assigning a list of strings to the `columns` attribute. The list must contain one name per column, in the existing column order.
student_df_2 = pd.DataFrame(student_dict)
student_df_2.columns = ["Student_ID", "First_Name", "Average_Grade"]
student_df_2
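Since this method replaces every name at once, the list length has to match the number of columns. As a rough illustration on a made-up dataframe, assigning too few names raises a `ValueError`:

```python
import pandas as pd

# Hypothetical three-column dataframe.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "grade": ["A", "B"]})

try:
    # Only two names for three columns: pandas rejects the assignment.
    df.columns = ["Student_ID", "First_Name"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

This makes the method convenient for small tables, but `rename()` is usually the safer choice when only a few columns of a wide dataframe need new names.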
Method 3
The third method uses the pandas string methods: we call `.str.replace()` on the `columns` attribute and assign the result back to it.
For example: `df.columns = df.columns.str.replace("old_name", "new_name")`
We have successfully changed the column names to “ID”, “Name”, and “Grades”.
student_df_3 = pd.DataFrame(student_dict)
student_df_3.columns = student_df_3.columns.str.replace("id", "ID")
student_df_3.columns = student_df_3.columns.str.replace("name", "Name")
student_df_3.columns = student_df_3.columns.str.replace("grade", "Grades")
student_df_3
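A caveat worth illustrating: `.str.replace()` works on substrings, so it changes every column name that contains the old text, not only exact matches. The sketch below adds a hypothetical "nickname" column to show the effect; `regex=False` is passed so the pattern is treated as plain text.

```python
import pandas as pd

# Hypothetical dataframe with an extra "nickname" column that also contains "name".
df = pd.DataFrame({"id": [1], "name": ["a"], "grade": ["A"], "nickname": ["x"]})

# Every occurrence of "name" is replaced, including the one inside "nickname".
df.columns = df.columns.str.replace("name", "Name", regex=False)

print(list(df.columns))  # ['id', 'Name', 'grade', 'nickName']
```

When only exact matches should change, a dictionary passed to `rename()` avoids this side effect.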
Method 4
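In the fourth method, we will rename the columns using the `set_axis()` method. You need to provide a list of new names and set `axis="columns"` to rename the columns instead of the index. The method returns a new dataframe, so we assign the result back; the `inplace` argument was deprecated in pandas 1.5 and removed in pandas 2.0.
student_df_4 = pd.DataFrame(student_dict)
student_df_4 = student_df_4.set_axis(["A", "B", "C"], axis="columns")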
student_df_4
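As a quick check of how `set_axis()` behaves, this sketch on a made-up dataframe keeps the result in a separate variable so the original and the relabeled dataframe can be compared side by side.

```python
import pandas as pd

# Hypothetical dataframe with the same three column names as the tutorial.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "grade": ["A", "B"]})

# set_axis() returns a relabeled dataframe; df keeps its old labels.
relabeled = df.set_axis(["A", "B", "C"], axis="columns")

print(list(df.columns))
print(list(relabeled.columns))
```

Because it returns a new object, `set_axis()` fits naturally into method chains such as `pd.DataFrame(...).set_axis([...], axis="columns")`.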
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.