Use the pipe method for a fluent pandas API

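pipe is a dataframe method that accepts a function and applies it to the dataframe it is called on.

By default, pipe passes the current dataframe as the first argument of that function and forwards any extra positional or keyword arguments to it. The result is then handed down the pipeline.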
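
As a minimal sketch of that calling convention: the frame and the helpers scale and scale_data_last below are made up for illustration. The first call shows that df.pipe(func, ...) is the same as func(df, ...). The second uses the tuple form of pipe, which names the keyword that should receive the dataframe when it is not the first argument.

```python
import pandas as pd

# Small illustrative frame (values chosen only for this sketch)
df = pd.DataFrame({"kcal": [89, 47, 52]})


def scale(frame, factor=2):
    # Hypothetical helper: returns a new frame, the input is left untouched
    return frame * factor


# df.pipe(scale, factor=10) is the same call as scale(df, factor=10)
piped = df.pipe(scale, factor=10)
direct = scale(df, factor=10)


def scale_data_last(factor, frame):
    # Hypothetical helper that takes the dataframe as its second argument
    return frame * factor


# The tuple (function, keyword) tells pipe which argument gets the dataframe
via_keyword = df.pipe((scale_data_last, "frame"), factor=10)
```

Here piped, direct and via_keyword all hold the same values; only the way the dataframe reaches the function differs.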

The function should also return a dataframe if you want to keep chaining.

It can return any other value, however, if it is the last step of the chain.
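
A short sketch of a chain whose last step returns something other than a dataframe. The helpers keep_below and mean_of are hypothetical names used only here: the first returns a dataframe so the chain can go on, the second returns a plain float and therefore has to come last.

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["Banana", "Orange", "Apple"],
                   "kcal": [89, 47, 52]})


def keep_below(frame, col, limit):
    # Intermediate step: returns a dataframe, so the chain can continue
    return frame[frame[col] < limit]


def mean_of(frame, col):
    # Final step: returns a plain float, which ends the dataframe chain
    return float(frame[col].mean())


avg_kcal = (df
    .pipe(keep_below, col="kcal", limit=80)
    .pipe(mean_of, col="kcal")
)
```

After the filter only Orange and Apple remain, so avg_kcal is 49.5.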

This is valuable because the chain reads top to bottom in the order the steps run. It takes you one step further from SQL, where a query is written in a different order than it is evaluated.
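
To make the reading-order point concrete, here is a sketch with two hypothetical steps, drop_heavy and sort_by_kcal. Written as nested calls, the step that runs first sits innermost, so the expression reads inside out. Written with pipe, the same steps appear in the order they run.

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["Banana", "lemon", "lime"],
                   "kcal": [89, 15, 30]})


def drop_heavy(frame, limit=50):
    # Hypothetical step 1: keep rows at or below the kcal limit
    return frame[frame["kcal"] <= limit]


def sort_by_kcal(frame):
    # Hypothetical step 2: highest kcal first
    return frame.sort_values("kcal", ascending=False)


# Nested calls: the first step to run is written last (innermost)
nested = sort_by_kcal(drop_heavy(df, limit=50))

# pipe: steps are written in the order they run
chained = (df
    .pipe(drop_heavy, limit=50)
    .pipe(sort_by_kcal)
)
```

Both expressions produce the same dataframe; the pipe version just lists the steps in execution order.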

Create a sample dataframe

# Import modules
import pandas as pd

# Example dataframe
raw_data = {
    'fruit': ['Banana', 'Orange', 'Apple', 'lemon', 'lime', 'plum'],
    'color': ['yellow', 'orange', 'red', 'yellow', 'green', 'purple'],
    'kcal': [89, 47, 52, 15, 30, 28],
    'size_cm': [20, 10, 9, 7, 5, 4],
}

df = pd.DataFrame(raw_data, columns=['fruit', 'color', 'kcal', 'size_cm'])
df
    fruit   color  kcal  size_cm
0  Banana  yellow    89       20
1  Orange  orange    47       10
2   Apple     red    52        9
3   lemon  yellow    15        7
4    lime   green    30        5
5    plum  purple    28        4
def add_to_col(df, col='kcal', n=200):
    # A dataframe is mutable, so work on a copy to avoid modifying the caller's data
    ret = df.copy()
    ret[col] = ret[col] + n
    return ret


(df
 .pipe(add_to_col)                       # adds 200 to kcal (the defaults)
 .pipe(add_to_col, col='size_cm', n=10)  # adds 10 to size_cm
 .head(5)
)
    fruit   color  kcal  size_cm
0  Banana  yellow   289       30
1  Orange  orange   247       20
2   Apple     red   252       19
3   lemon  yellow   215       17
4    lime   green   230       15