Use the pipe method for a fluent pandas API

pipe is a DataFrame method that accepts a function.

By default, pipe assumes the first argument of this function is a dataframe and passes the current dataframe to it, so each step hands its result down the pipeline.

The function should also return a dataframe if you want to continue chaining.

It can return any other value, however, if it is the last step in the chain.
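As a minimal sketch of these two cases, the snippet below chains one step that returns a dataframe with a final step that returns a plain number. The helper names keep_low_kcal and mean_kcal, the limit parameter, and the tiny fruits dataframe are made up for illustration only.

```python
import pandas as pd

# Hypothetical data, used only for this illustration
fruits = pd.DataFrame({"fruit": ["Banana", "Orange", "Apple"],
                       "kcal": [89, 47, 52]})

def keep_low_kcal(data, limit=60):
    # Returns a dataframe, so the chain can continue after this step
    return data[data["kcal"] < limit]

def mean_kcal(data):
    # Returns a float, not a dataframe, so it has to be the last step
    return float(data["kcal"].mean())

result = fruits.pipe(keep_low_kcal).pipe(mean_kcal)
print(result)
```

Putting mean_kcal anywhere but last would break the chain, because the next pipe call would be made on a float rather than on a dataframe.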

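This is valuable because the chain reads top to bottom in the order the steps run, which takes you one step further from SQL, where a query is written in a different order from the one in which it is evaluated.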

Create a sample dataframe

# Import modules
import pandas as pd

# Example dataframe
raw_data = {
    'fruit': ['Banana', 'Orange', 'Apple', 'lemon', 'lime', 'plum'],
    'color': ['yellow', 'orange', 'red', 'yellow', 'green', 'purple'],
    'kcal': [89, 47, 52, 15, 30, 28],
    'size_cm': [20, 10, 9, 7, 5, 4],
}

df = pd.DataFrame(raw_data, columns=['fruit', 'color', 'kcal', 'size_cm'])
df

	fruit	color	kcal	size_cm
0	Banana	yellow	89	20
1	Orange	orange	47	10
2	Apple	red	52	9
3	lemon	yellow	15	7
4	lime	green	30	5
5	plum	purple	28	4

def add_to_col(df, col='kcal', n=200):
    # A dataframe is mutable, so work on a copy of the dataframe passed in
    # to avoid modifying the caller's data
    ret = df.copy()
    ret[col] = ret[col] + n
    return ret


(df
 .pipe(add_to_col)
 .pipe(add_to_col, col='size_cm', n=10)
 .head(5)
)

	fruit	color	kcal	size_cm
0	Banana	yellow	289	30
1	Orange	orange	247	20
2	Apple	red	252	19
3	lemon	yellow	215	17
4	lime	green	230	15
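The introduction notes that pipe passes the dataframe as the first argument by default. When a function expects the dataframe somewhere else, pipe also accepts a (callable, keyword) tuple naming the parameter that should receive it. A small sketch, assuming a hypothetical helper scale_col and a made-up two-row dataframe:

```python
import pandas as pd

# Hypothetical data, used only for this illustration
small = pd.DataFrame({"fruit": ["Banana", "Orange"], "kcal": [89, 47]})

def scale_col(factor, data, col="kcal"):
    # Hypothetical helper whose dataframe is NOT the first argument
    ret = data.copy()  # copy so the input dataframe is left untouched
    ret[col] = ret[col] * factor
    return ret

# The tuple tells pipe to pass the dataframe as the keyword argument "data"
scaled = small.pipe((scale_col, "data"), factor=2)
print(scaled)
```

This keeps existing functions usable in a chain without wrapping them in a lambda just to reorder their arguments.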