Rename columns in pandas
In this article, we will rename the columns of a pandas dataframe in several ways. First, let's import pandas and create an example dataframe:
# Import modules
import pandas as pd
# Example dataframe
raw_data = {
    'fruit': ['Banana', 'Orange', 'Apple', 'lemon', 'lime', 'plum'],
    'color': ['yellow', 'orange', 'red', 'yellow', 'green', 'purple'],
    'kcal': [89, 47, 52, 15, 30, 28],
}
df = pd.DataFrame(raw_data, columns = ['fruit', 'color', 'kcal'])
df
| | fruit | color | kcal |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
The most flexible way to rename columns in pandas is the rename method. It takes a dictionary as an argument, where:
* the keys are the old names
* the values are the new names

You also need to specify the axis, because by default rename applies the mapping to the index labels rather than to the columns.
This method can be used to rename one or several columns:
df = df.rename({"fruit" : "produce", "kcal": "energy"}, axis="columns")
df
| | produce | color | energy |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
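As a small sketch of the same idea, rename also accepts the mapping through its columns keyword, which makes the axis argument unnecessary. The dataframe `demo` below is a made-up example used only for this illustration. It also shows that keys matching no column are ignored by default, and that passing errors="raise" turns such a key into a KeyError.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({"fruit": ["Banana", "Orange"], "kcal": [89, 47]})

# Same effect as demo.rename({...}, axis="columns")
renamed = demo.rename(columns={"fruit": "produce", "kcal": "energy"})
print(list(renamed.columns))    # ['produce', 'energy']

# A key that matches no column is silently ignored by default
unchanged = demo.rename(columns={"colour": "color"})
print(list(unchanged.columns))  # ['fruit', 'kcal']

# errors="raise" makes the missing key fail loudly instead
try:
    demo.rename(columns={"colour": "color"}, errors="raise")
    raised = False
except KeyError:
    raised = True
print(raised)                   # True
```

The errors="raise" option can be handy for catching typos in the old names, since the default behaviour returns the dataframe unchanged without any warning.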
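If you want to rename all the columns at once, a common method is to assign a new list of names to the columns attribute of the dataframe. The list must contain exactly one name per column, in the same order as the existing columns: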
df.columns = ["nice fruit", "bright color", "light kcal"]
df
| | nice fruit | bright color | light kcal |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
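The sketch below illustrates two details of replacing every name at once, using a made-up dataframe `demo` rather than the one from this article. Assigning a list of the wrong length raises a ValueError. If you would rather get a new dataframe than modify the existing one, set_axis is one possible alternative.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({"fruit": ["Banana"], "color": ["yellow"], "kcal": [89]})

# set_axis returns a relabeled dataframe instead of modifying demo
relabeled = demo.set_axis(
    ["nice fruit", "bright color", "light kcal"], axis="columns"
)
print(list(relabeled.columns))

# Two names for three columns: pandas refuses the assignment
try:
    demo.columns = ["nice fruit", "bright color"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)
```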
If all you need is to replace spaces with underscores, a more convenient method is str.replace, since you don't have to type out every column name:
df.columns = df.columns.str.replace(" ", "_")
df
| | nice_fruit | bright_color | light_kcal |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
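Real column names are often messier than a single space between words. As a hedged sketch, the example below combines str.strip with a regular expression to handle leading, trailing and repeated whitespace. The column names are invented for the illustration.

```python
import pandas as pd

# Hypothetical empty dataframe with untidy column names
demo = pd.DataFrame(columns=[" nice  fruit", "bright color ", "light\tkcal"])

# Trim the ends, then collapse each run of whitespace into one underscore.
# regex=True is passed explicitly because the default differs across pandas versions.
cleaned = demo.columns.str.strip().str.replace(r"\s+", "_", regex=True)
demo.columns = cleaned
print(list(demo.columns))  # ['nice_fruit', 'bright_color', 'light_kcal']
```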
Similarly, you can use other str methods, such as:
* capitalize: converts the first character to uppercase and the remaining characters to lowercase
* lower: converts the column names to lowercase
* upper: converts the column names to uppercase
* etc.
df.columns = df.columns.str.capitalize()
df
| | Nice_fruit | Bright_color | Light_kcal |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
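To make the difference between these methods concrete, here is a minimal sketch on invented, mixed-case column names. It also shows that rename accepts a function such as str.upper in place of a dictionary, which is one way to apply the same transformation without touching the columns attribute.

```python
import pandas as pd

# Hypothetical empty dataframe with mixed-case column names
demo = pd.DataFrame(columns=["nice_FRUIT", "bright_Color"])

# capitalize uppercases the first character and lowercases the rest
capitalized = list(demo.columns.str.capitalize())
print(capitalized)            # ['Nice_fruit', 'Bright_color']

lowered = list(demo.columns.str.lower())
print(lowered)                # ['nice_fruit', 'bright_color']

# rename can take a function that is applied to every column name
uppered = demo.rename(columns=str.upper)
print(list(uppered.columns))  # ['NICE_FRUIT', 'BRIGHT_COLOR']
```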
Finally, if you only need to add a prefix or a suffix to the column names, you can use the add_prefix method. It returns a new dataframe and leaves df itself unchanged:
df.add_prefix("pre_")
| | pre_Nice_fruit | pre_Bright_color | pre_Light_kcal |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
Or you can use the add_suffix method, which works the same way:
df.add_suffix("_post")
| | Nice_fruit_post | Bright_color_post | Light_kcal_post |
---|---|---|---|
0 | Banana | yellow | 89 |
1 | Orange | orange | 47 |
2 | Apple | red | 52 |
3 | lemon | yellow | 15 |
4 | lime | green | 30 |
5 | plum | purple | 28 |
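Because add_prefix and add_suffix return a new dataframe, the result has to be assigned if you want to keep it. The short sketch below shows this on a made-up dataframe `demo`.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({"fruit": ["Banana"], "kcal": [89]})

# The call alone does not change demo
prefixed = demo.add_prefix("pre_")
print(list(prefixed.columns))  # ['pre_fruit', 'pre_kcal']
print(list(demo.columns))      # ['fruit', 'kcal']

# Assign the result back to keep the new names
demo = demo.add_suffix("_post")
print(list(demo.columns))      # ['fruit_post', 'kcal_post']
```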