Last updated: Apr 13, 2024
Use the DataFrame.applymap() method to apply a function to each cell of a Pandas DataFrame.
The method applies a function to a DataFrame element-wise.
import math

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, 3, 4, 5],
    'C': [0, 0, 3, 4, 5],
})

print(df)
print('-' * 50)

print(df.applymap(math.sqrt))
   A  B  C
0  1  1  0
1  1  2  0
2  1  3  3
3  2  4  4
4  3  5  5
--------------------------------------------------
/home/borislav/Desktop/bobbyhadz_python/main.py:14: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
          A         B         C
0  1.000000  1.000000  0.000000
1  1.000000  1.414214  0.000000
2  1.000000  1.732051  1.732051
3  1.414214  2.000000  2.000000
4  1.732051  2.236068  2.236068
The DataFrame.applymap() method applies a function that accepts and returns a scalar to every element of a DataFrame.
The method returns the transformed DataFrame.
However, notice that the method has been deprecated in Pandas version 2.1.0.
/home/borislav/Desktop/bobbyhadz_python/main.py:14: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
DataFrame.map()
As the message suggests, if you use Pandas version 2.1.0 or later, you should use the DataFrame.map() method instead.
import math

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, 3, 4, 5],
    'C': [0, 0, 3, 4, 5],
})

print(df)
print('-' * 50)

print(df.map(math.sqrt))
Running the code sample produces the following output.
   A  B  C
0  1  1  0
1  1  2  0
2  1  3  3
3  2  4  4
4  3  5  5
--------------------------------------------------
          A         B         C
0  1.000000  1.000000  0.000000
1  1.000000  1.414214  0.000000
2  1.000000  1.732051  1.732051
3  1.414214  2.000000  2.000000
4  1.732051  2.236068  2.236068
Starting with Pandas version 2.1.0, the DataFrame.applymap() method has been deprecated and renamed to DataFrame.map().
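If the same script has to run on Pandas versions both before and after the rename, one option is to pick the method at runtime. The following is a minimal sketch of that idea; `map_cells` is a hypothetical helper name used only for illustration and is not part of Pandas.

```python
import math

import pandas as pd


def map_cells(df, func, **kwargs):
    """Apply func to every cell of df.

    map_cells is a hypothetical helper, not a Pandas API.
    """
    if hasattr(pd.DataFrame, "map"):
        # Pandas 2.1.0 and later provide DataFrame.map()
        return df.map(func, **kwargs)
    # Older versions only provide DataFrame.applymap()
    return df.applymap(func, **kwargs)


df = pd.DataFrame({"A": [1, 4], "B": [9, 16]})

result = map_cells(df, math.sqrt)
print(result)
```

The check is done on the `pd.DataFrame` class rather than on the instance, so a column that happens to be named `map` cannot be mistaken for the method.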
The function you pass to the method needs to take a single value and return a single value.
I used the math.sqrt() function in the example, but you can also use a custom function.
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, 3, 4, 5],
    'C': [0, 0, 3, 4, 5],
})

print(df)
print('-' * 50)


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


print(df.map(custom_function))
Running the code sample produces the following output.
   A  B  C
0  1  1  0
1  1  2  0
2  1  3  3
3  2  4  4
4  3  5  5
--------------------------------------------------
      A     B   C
0  1000  1000 -10
1  1000    12 -10
2  1000    13  13
3    12    14  14
4    13    15  15
The function takes a single value and returns a single value.
def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000
The custom function gets called with the value of each cell in the DataFrame.
The new, transformed DataFrame contains the returned values.
The DataFrame.map() method also takes a na_action argument that enables you to handle NaN values.
For example, if the argument is set to 'ignore', the NaN values are propagated without passing them to the supplied function.
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, None, 4, 5],
    'C': [0, 0, None, None, 5],
})

print(df)
print('-' * 50)


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


print(df.map(custom_function, na_action='ignore'))
Running the code sample produces the following output.
   A    B    C
0  1  1.0  0.0
1  1  2.0  0.0
2  1  NaN  NaN
3  2  4.0  NaN
4  3  5.0  5.0
--------------------------------------------------
      A       B     C
0  1000  1000.0 -10.0
1  1000    12.0 -10.0
2  1000     NaN   NaN
3    12    14.0   NaN
4    13    15.0  15.0
Notice that the NaN values were left as they are and were not passed to the custom function.
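To see what na_action='ignore' changes, it can help to run the same function with and without the argument. The sketch below does that on a single column; the `map_method` variable is an assumption added here so the example also runs on Pandas versions older than 2.1.0.

```python
import pandas as pd


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        # Also reached for NaN, because both comparisons above are False
        return 1000


df = pd.DataFrame({"B": [1, 2, None, 4]})

# Use DataFrame.map() when it exists, otherwise fall back to applymap()
map_method = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# NaN is passed to the function and ends up in the else branch
without_ignore = map_method(custom_function)

# NaN is skipped and stays NaN in the result
with_ignore = map_method(custom_function, na_action="ignore")

print(without_ignore)
print(with_ignore)
```

Without the argument, the NaN cell reaches the function and is silently turned into 1000, which is easy to miss in a larger DataFrame.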
numpy.vectorize()
You can also use the numpy.vectorize class with DataFrame.apply to apply a function to each cell of a DataFrame.
First, make sure you have the numpy module installed.
pip install numpy

# or with pip3
pip3 install numpy
Now, import the module and wrap your function with numpy.vectorize().
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, 3, 4, 5],
    'C': [0, 0, 3, 4, 5],
})

print(df)
print('-' * 50)


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


print(df.apply(np.vectorize(custom_function)))
Running the code sample produces the following output.
   A  B  C
0  1  1  0
1  1  2  0
2  1  3  3
3  2  4  4
4  3  5  5
--------------------------------------------------
      A     B   C
0  1000  1000 -10
1  1000    12 -10
2  1000    13  13
3    12    14  14
4    13    15  15
The numpy.vectorize() class returns a callable object that acts like the supplied function but takes arrays as input.
# <numpy.vectorize object at 0x7fe31020e110>
print(np.vectorize(custom_function))
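As a small illustration of "takes arrays as input", the sketch below calls the vectorized object on a plain NumPy array, without Pandas. The input values are made up for the example.

```python
import numpy as np


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


vectorized = np.vectorize(custom_function)

# The scalar function is called once per element of the array
result = vectorized(np.array([0, 1, 2, 5]))

print(result)
```

The NumPy documentation notes that vectorize is provided mainly for convenience rather than performance, since it is essentially a loop over the elements.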
The DataFrame.apply() method applies the supplied function along an axis of the DataFrame.
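This is why a function that expects a scalar has to be wrapped first: with the default axis, DataFrame.apply() hands the function a whole column at a time. The sketch below shows this with a hypothetical `describe_argument` function that only reports what it received.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})


def describe_argument(column):
    # Hypothetical helper: report the type and length of the argument
    return f"{type(column).__name__} of length {len(column)}"


# Called once per column, so the result has one entry per column
summary = df.apply(describe_argument)

print(summary)
```

Each entry in the result describes a Series, not a single cell value.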
You can also call the object returned by numpy.vectorize() directly on the DataFrame to achieve the same result.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 3],
    'B': [1, 2, 3, 4, 5],
    'C': [0, 0, 3, 4, 5],
})

print(df)
print('-' * 50)


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


df[:] = np.vectorize(custom_function)(df)

print(df)
Running the code sample produces the following output.
   A  B  C
0  1  1  0
1  1  2  0
2  1  3  3
3  2  4  4
4  3  5  5
--------------------------------------------------
      A     B   C
0  1000  1000 -10
1  1000    12 -10
2  1000    13  13
3    12    14  14
4    13    15  15
We called the object that is returned from numpy.vectorize() with the DataFrame and assigned the result to df[:], which overwrites the values of the existing DataFrame.
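If you would rather keep the original DataFrame unchanged, one possible variation is to build a new DataFrame from the returned array. This is a sketch of that variation, not something the example above requires; the `transformed` name and the smaller input data are assumptions made for brevity.

```python
import numpy as np
import pandas as pd


def custom_function(num):
    if num > 1:
        return num + 10
    elif num < 1:
        return num - 10
    else:
        return 1000


df = pd.DataFrame({"A": [1, 2, 3], "C": [0, 3, 5]})

# Apply the function to the underlying array of values
values = np.vectorize(custom_function)(df.to_numpy())

# Reuse the original index and column labels for the new DataFrame
transformed = pd.DataFrame(values, index=df.index, columns=df.columns)

print(transformed)
print(df)
```

Here `df` still holds the original values after the call, and `transformed` holds the results.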