Last updated: Apr 11, 2024
Reading time · 6 min
You can use the pandas.DataFrame.fillna() method to replace None with NaN in a pandas DataFrame.
The method takes a value argument that is used to fill the missing values.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

df = df.fillna(value=np.nan)

print('-' * 50)
print(df)
We passed numpy.nan for the value argument.
numpy.nan is a constant, not a function, so it is used without parentheses. It holds the IEEE 754 floating-point representation of Not a Number (NaN).
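In the output, the None value in the Name column is replaced with NaN after calling DataFrame.fillna().
The None in the numeric Age column already shows up as NaN in the first printout, because pandas converts it when the DataFrame is created.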
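To check the result without relying on the printed output, a minimal sketch along the following lines can be used. The variable names (filled, missing_name) are only for illustration and are not part of the original example.

```python
import math

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

filled = df.fillna(value=np.nan)

# The missing entry in the Name column is now a float NaN
missing_name = filled.loc[3, "Name"]
print(type(missing_name), math.isnan(missing_name))

# isna() counts the missing values in each column
print(filled.isna().sum())
```

Checking with isna() is usually more convenient than comparing against np.nan directly, because NaN never compares equal to itself.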
If you want to replace None values with NaN for a column or a Series, call the fillna() method on the column.
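import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Calling fillna() on the `Name` column
# Note: recent pandas versions may warn about this chained inplace call;
# df['Name'] = df['Name'].fillna(np.nan) is the assignment form.
df.Name.fillna(value=np.nan, inplace=True)

print('-' * 50)
print(df)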
The example calls the fillna() method on the Name column.
We could've also used bracket notation instead of dot notation.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Using bracket notation
df['Name'].fillna(value=np.nan, inplace=True)

print('-' * 50)
print(df)
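We set the value argument to np.nan just like in the previous example, but we also set inplace to True.
When inplace is set to True, the fillna() method modifies the object it is called on instead of returning a new one.
Setting the argument to True will also modify any other views of that object.
Note that recent pandas versions warn when an inplace method is called on a column selected from a DataFrame, and with Copy-on-Write enabled the original DataFrame is not updated. Assigning the result back to the column is the safer pattern.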
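Because an inplace call on a selected column may act on a temporary object, a more defensive pattern is to assign the filled column back to the DataFrame. The following is a minimal sketch of that alternative; it is not the approach used in the examples above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

# Assign the result back instead of calling fillna(..., inplace=True)
# on df["Name"], so the update does not depend on view/copy semantics.
df["Name"] = df["Name"].fillna(value=np.nan)

print(df)
```

The assignment form makes it explicit which object is updated, so it should behave the same whether or not Copy-on-Write is enabled.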
replace()
You can also use the DataFrame.replace() method to replace None values with NaN.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

df.replace(to_replace=[None], value=np.nan, inplace=True)

print('-' * 50)
print(df)
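If you would rather keep the original DataFrame unchanged, replace() can be called without inplace, in which case it returns a new DataFrame. The sketch below illustrates this. The dtype=object on the Name column is an assumption added here so that None is stored as-is regardless of the pandas version, and cleaned is just an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        # dtype=object is an assumption for this sketch; it keeps None as None
        "Name": pd.Series(["Alice", "Bobby Hadz", "Carl", None], dtype=object),
        "Age": [29, 30, None, 32],
    }
)

# Without inplace=True, replace() returns a new DataFrame
cleaned = df.replace(to_replace=[None], value=np.nan)

print(df.loc[3, "Name"] is None)  # checks that the original still holds None
print(cleaned.loc[3, "Name"])     # value stored in the new DataFrame
```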
If you use the DataFrame.replace() method on the whole DataFrame as shown above, datetime columns that contain missing data can end up with the object dtype.
To avoid having to convert them back to datetime, you can first select the columns that have the object dtype and only run the replacement on those columns.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Get a subset of the DataFrame's columns based on the column dtypes
obj_columns = list(df.select_dtypes(include=['object']).columns.values)

df[obj_columns] = df[obj_columns].replace([None], np.nan)

print('-' * 50)
print(df)
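To see the effect on dtypes, the sketch below adds a hypothetical Joined datetime column (not part of the original example) that contains a missing value, and restricts the replacement to the object columns. The dtype=object on Name is likewise an assumption, made so that the column is selected on any pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        # dtype=object is an assumption so None is kept and the column is selected
        "Name": pd.Series(["Alice", "Bobby Hadz", "Carl", None], dtype=object),
        # "Joined" is a hypothetical column used only for this illustration
        "Joined": pd.Series(
            pd.to_datetime(["2024-01-01", None, "2024-03-01", "2024-04-01"])
        ),
    }
)

# Only the object-dtype columns are passed to replace()
obj_columns = list(df.select_dtypes(include=["object"]).columns)
df[obj_columns] = df[obj_columns].replace([None], np.nan)

# The datetime column was never touched, so its dtype is unchanged
print(df.dtypes)
```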
The replace() method replaces the values in the to_replace list with the supplied value argument.
We passed the following arguments to the method:
to_replace - a list containing the values we want to replace.
value - the value that is used to replace any values that match the to_replace argument.
inplace - when set to True, we modify the DataFrame in place rather than creating a new one.
You can also call the replace() method on a specific column.
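import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Calling replace() on the Name column
# Note: recent pandas versions may warn about this chained inplace call;
# df['Name'] = df['Name'].replace([None], np.nan) is the assignment form.
df['Name'].replace(to_replace=[None], value=np.nan, inplace=True)

print('-' * 50)
print(df)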
The example calls the replace() method on the Name column.
You can also use the DataFrame.replace() method if you need to replace "None" strings with NaN in a DataFrame.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", 'None'],
        "Age": [29, 30, 'None', 32],
    }
)

print(df)

df.replace(to_replace=['None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
The example replaces "None" strings with NaN.
We set the to_replace argument to a list containing the "None" string.
In the output, both "None" strings are replaced with NaN.
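One way to confirm the difference is to count missing values before and after the replacement, since "None" strings are ordinary values as far as pandas is concerned. A minimal sketch, using the non-inplace form and illustrative variable names (cleaned, missing_before, missing_after):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", "None"],
        "Age": [29, 30, "None", 32],
    }
)

# "None" strings are regular values, so nothing is reported as missing yet
missing_before = int(df.isna().sum().sum())

cleaned = df.replace(to_replace=["None"], value=np.nan)

# After the replacement, each column has one missing value
missing_after = int(cleaned.isna().sum().sum())

print(missing_before, missing_after)
```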
You can also call the replace() method on a specific column.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", 'None'],
        "Age": [29, 30, 'None', 32],
    }
)

print(df)

df['Name'].replace(to_replace=['None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
We called the replace() method on the Name column.
In the output, the "None" string is replaced with NaN only in the Name column. The Age column still contains its "None" string.
If you want to replace both "None" strings and None values with NaN in a DataFrame, set the to_replace argument to a list that contains both values.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", None, 'None'],
        "Age": [29, 30, 'None', None],
    }
)

print(df)

df.replace(to_replace=[None, 'None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
The Name and Age columns contain both "None" strings and None values.
We set the to_replace argument to a list containing both values so that both are replaced with NaN.
You can also do this for a specific column only.
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", None, 'None'],
        "Age": [29, 30, 'None', None],
    }
)

print(df)

# Only for the Name column
df['Name'].replace(to_replace=[None, 'None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
The example only replaces the "None" strings and None values with NaN in the Name column.