Rename columns in pandas
In this article, we will rename columns in a pandas dataframe. First, let's import pandas and create an example dataframe:
# Import modules
import pandas as pd
# Example dataframe
raw_data = {'fruit': ['Banana', 'Orange', 'Apple', 'lemon', 'lime', 'plum'],
            'color': ['yellow', 'orange', 'red', 'yellow', 'green', 'purple'],
            'kcal': [89, 47, 52, 15, 30, 28]
            }
df = pd.DataFrame(raw_data, columns = ['fruit', 'color', 'kcal'])
df
| | fruit | color | kcal |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
The most flexible way to rename columns in pandas is the rename method. It takes a dictionary as an argument, where:
* the keys are the old names
* the values are the new names

You also need to specify the axis (alternatively, pass the dictionary through the columns keyword).
This method can be used to rename either one or multiple columns:
df = df.rename({"fruit" : "produce", "kcal": "energy"}, axis="columns")
df
| | produce | color | energy |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
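As a complement to the call above, here is a minimal sketch of a few other ways to call rename. The small dataframe and the "weight" key are made up for illustration. The sketch shows the columns keyword, a function used as the mapper, and the errors="raise" option, which is worth knowing because keys that match no column are ignored by default.

```python
import pandas as pd

# Small illustrative frame reusing the article's column names
df = pd.DataFrame({"fruit": ["Banana", "Orange"],
                   "color": ["yellow", "orange"],
                   "kcal": [89, 47]})

# columns= is equivalent to passing the mapping with axis="columns"
renamed = df.rename(columns={"fruit": "produce", "kcal": "energy"})

# A function is applied to every column name
shouted = df.rename(columns=str.upper)

# By default, keys that match no column are silently ignored
# ("weight" is a hypothetical name that does not exist in df)
unchanged = df.rename(columns={"weight": "mass"})

# With errors="raise", the same call raises a KeyError instead
try:
    df.rename(columns={"weight": "mass"}, errors="raise")
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
```

In every case rename returns a new dataframe, which is why the article assigns the result back to df.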
If you want to rename all the columns at once, a common method is to assign a new list to the columns attribute of the dataframe. The list must contain exactly one name per column, in the existing column order:
df.columns = ["nice fruit", "bright color", "light kcal"]
df
| | nice fruit | bright color | light kcal |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
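One thing to watch with this approach is the length of the list. The sketch below uses a one-row illustrative dataframe to show that pandas rejects a list with the wrong number of names. It also shows set_axis, which in recent pandas versions returns a renamed copy instead of modifying the dataframe in place.

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["Banana"], "color": ["yellow"], "kcal": [89]})

# Two names for three columns: pandas refuses the assignment
try:
    df.columns = ["nice fruit", "bright color"]
    length_error = False
except ValueError:
    length_error = True

# set_axis returns a new dataframe with the given labels; df keeps its names
new_df = df.set_axis(["nice fruit", "bright color", "light kcal"], axis="columns")
```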
If all you need is to replace spaces with underscores, the str.replace method is more convenient, since you don't have to retype every column name:
df.columns = df.columns.str.replace(" ", "_")
df
| | nice_fruit | bright_color | light_kcal |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
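str.replace can treat its pattern either as literal text or as a regular expression, and the default for the regex argument has changed between pandas versions, so it may be safer to pass it explicitly. The column names below are invented for illustration.

```python
import pandas as pd

# Empty frame with hypothetical, messy column names
df = pd.DataFrame(columns=["nice  fruit", "bright color", "kcal.per.100g"])

# Literal replacement: "." is treated as an ordinary character
literal = df.columns.str.replace(".", "_", regex=False)

# Regex replacement: any run of whitespace becomes a single underscore
pattern = df.columns.str.replace(r"\s+", "_", regex=True)
```

With regex=True, a bare "." would match every character, which is why the literal form is used for the dotted name.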
Similarly, you can use other str methods, such as:
* capitalize: converts the first character to uppercase and the rest to lowercase
* lower: converts the column names to lowercase
* upper: converts the column names to uppercase
* etc.
df.columns = df.columns.str.capitalize()
df
| | Nice_fruit | Bright_color | Light_kcal |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
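These str methods can also be chained. The following sketch, using made-up column names with stray spaces and mixed case, cleans them up in one assignment:

```python
import pandas as pd

# Hypothetical column names with inconsistent spacing and case
df = pd.DataFrame(columns=[" Nice Fruit ", "Bright Color", "LIGHT kcal"])

df.columns = (
    df.columns
    .str.strip()                         # drop leading/trailing whitespace
    .str.lower()                         # lowercase everything
    .str.replace(" ", "_", regex=False)  # spaces to underscores
)
```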
Finally, if you only need to add a prefix or a suffix to the columns, you can use the add_prefix method, which returns a new dataframe and leaves df unchanged:
df.add_prefix("pre_")
| | pre_Nice_fruit | pre_Bright_color | pre_Light_kcal |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
or the add_suffix method:
df.add_suffix("_post")
| | Nice_fruit_post | Bright_color_post | Light_kcal_post |
|---|---|---|---|
| 0 | Banana | yellow | 89 |
| 1 | Orange | orange | 47 |
| 2 | Apple | red | 52 |
| 3 | lemon | yellow | 15 |
| 4 | lime | green | 30 |
| 5 | plum | purple | 28 |
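To keep the new names, the result has to be stored somewhere. Here is a minimal sketch on a small illustrative dataframe that stores the results, chains the two methods, and writes the same prefix with rename and a function:

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["Banana"], "kcal": [89]})

# Store the result to keep the new names; df itself is not modified
prefixed = df.add_prefix("pre_")

# The two methods can be chained
both = df.add_prefix("pre_").add_suffix("_post")

# Same effect as add_prefix, written with rename and a function
via_rename = df.rename(columns=lambda name: "pre_" + name)
```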