How to replace None with NaN in Pandas DataFrame

Borislav Hadzhiev

Last updated: Apr 11, 2024
6 min

# Table of Contents

  1. How to replace None with NaN in Pandas DataFrame
  2. Replace None with NaN in a Pandas DataFrame using replace()
  3. Replacing "None" strings with NaN in a Pandas DataFrame
  4. Replacing "None" strings and None values with NaN in a Pandas DataFrame

# How to replace None with NaN in Pandas DataFrame

You can use the pandas.DataFrame.fillna() method to replace None with NaN in a pandas DataFrame.

The method takes a value argument that is used to fill the holes.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

df = df.fillna(value=np.nan)

print('-' * 50)
print(df)
```

The code for this article is available on GitHub

We passed numpy.nan as the value argument.

numpy.nan is a constant (not a function) that holds the IEEE 754 floating-point representation of Not a Number (NaN).
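To make the nature of NaN concrete, here is a minimal sketch that inspects `np.nan` directly. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# np.nan is a plain float constant, so it is used without parentheses
nan_is_float = isinstance(np.nan, float)

# NaN never compares equal to anything, including itself
nan_equals_itself = np.nan == np.nan

# pd.isna() is the reliable way to detect missing values,
# and it treats both None and NaN as missing
none_is_missing = pd.isna(None)
nan_is_missing = pd.isna(np.nan)

print(nan_is_float)       # True
print(nan_equals_itself)  # False
print(none_is_missing)    # True
print(nan_is_missing)     # True
```

Because NaN is not equal to itself, checks such as `value == np.nan` always return False, which is why `pd.isna()` is used instead.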

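In the output, the None value in the Name column is replaced with NaN after calling DataFrame.fillna(). The None in the Age column already shows up as NaN in the first printout because pandas converts None to NaN when it builds a numeric column.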

If you want to replace None values with NaN for a column or a Series, call the fillna() method on the column.

main.py
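```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Call fillna() on the `Name` column and assign the result back.
# (Chained `df.Name.fillna(..., inplace=True)` warns in pandas 2.2
# and no longer updates `df` in pandas 3.0.)
df["Name"] = df.Name.fillna(value=np.nan)

print('-' * 50)
print(df)
```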

The example calls the fillna() method on the Name column.

We could've also used bracket notation instead of dot notation.

main.py
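```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Using bracket notation and assigning the result back to the column
df['Name'] = df['Name'].fillna(value=np.nan)

print('-' * 50)
print(df)
```

We set the value argument to np.nan just like in the first example and assigned the result back to the Name column.

The fillna() method also accepts an inplace argument. When inplace is set to True, the method modifies the object it is called on instead of returning a new one.

However, calling fillna() with inplace=True on a column selected from a DataFrame, e.g. df['Name'].fillna(value=np.nan, inplace=True), is a chained operation. pandas 2.2 emits a FutureWarning for it, and in pandas 3.0 the parent DataFrame is no longer updated, so assigning the result back is the safer option.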
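If only some columns should be filled, another option is to pass a dict as the value argument, which maps column names to fill values. The sketch below reuses the sample data from the examples above. The `filled` and `last_name` names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

# Keys are column names, values are the fill value for that column.
# Columns that are not listed are left untouched.
filled = df.fillna(value={"Name": np.nan})

print(filled)

# Check the entry that was missing in the "Name" column
last_name = filled["Name"].iloc[3]
print(last_name is None)
print(pd.isna(last_name))
```

This returns a new DataFrame, so it avoids calling an in-place method on a selected column.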

# Replace None with NaN in a Pandas DataFrame using replace()

You can also use the DataFrame.replace() method to replace None values with NaN.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

df.replace(to_replace=[None], value=np.nan, inplace=True)

print('-' * 50)
print(df)
```

Calling DataFrame.replace() on the whole DataFrame, as shown above, can change datetime columns that contain missing values to object dtype.

To avoid having to convert them back to datetime, you can select only the object-dtype columns and replace the values in those columns.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Get a subset of the DataFrame's columns based on the column dtypes
obj_columns = list(df.select_dtypes(include=['object']).columns.values)

df[obj_columns] = df[obj_columns].replace([None], np.nan)

print('-' * 50)
print(df)
```
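The sample data above has no datetime column, so here is a minimal sketch that adds one to show the dtype being preserved. The Joined column and its dates are made up for illustration, and the Name column is given an explicit object dtype (an assumption, so that it holds None regardless of how the installed pandas version infers string columns).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        # Explicit object dtype so the column stores None as-is
        "Name": pd.Series(["Alice", "Bobby Hadz", None], dtype=object),
        # Hypothetical datetime column with one missing value (NaT)
        "Joined": pd.Series(pd.to_datetime(["2024-01-05", None, "2024-03-17"])),
    }
)

print(df.dtypes)

# Only the object-dtype columns are selected, so "Joined" is not touched
obj_columns = list(df.select_dtypes(include=["object"]).columns)
df[obj_columns] = df[obj_columns].replace(to_replace=[None], value=np.nan)

print("-" * 50)
print(df.dtypes)
print(df)
```

Since the datetime column is never passed to replace(), its dtype stays the same before and after the call.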

The replace() method replaces the values in the to_replace list with the supplied value argument.

In the first replace() example, we passed the following arguments to the method:

  1. to_replace - a list containing the values we want to replace.
  2. value - the value that is used to replace any values that match the to_replace argument.
  3. inplace - when set to True, we modify the DataFrame in place rather than creating a new one.

You can also call the replace() method on a specific column.

main.py
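```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", None],
        "Age": [29, 30, None, 32],
    }
)

print(df)

# Call replace() on the Name column and assign the result back.
# (Chained `df['Name'].replace(..., inplace=True)` warns in pandas 2.2
# and no longer updates `df` in pandas 3.0.)
df['Name'] = df['Name'].replace(to_replace=[None], value=np.nan)

print('-' * 50)
print(df)
```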

The example calls the replace() method on the Name column.

# Replacing "None" strings with NaN in a Pandas DataFrame

You can also use the DataFrame.replace() method if you need to replace "None" strings with NaN in a DataFrame.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", 'None'],
        "Age": [29, 30, 'None', 32],
    }
)

print(df)

df.replace(to_replace=['None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
```

The example replaces "None" strings with NaN.

We set the to_replace argument to a list containing the string "None".

In the output, both "None" strings (one in the Name column and one in the Age column) are replaced with NaN.
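One way to confirm the replacement is to count the "None" strings per column before the call and the missing values per column after it. The following is a minimal sketch on the same sample data. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", "None"],
        "Age": [29, 30, "None", 32],
    }
)

# Count the "None" strings in each column before replacing them
none_strings_before = df.eq("None").sum()
print(none_strings_before)

# replace() returns a new DataFrame when inplace is not set
cleaned = df.replace(to_replace=["None"], value=np.nan)

# Count the missing values in each column afterwards
missing_after = cleaned.isna().sum()
print(missing_after)
```

Each column should report one "None" string before the call and one missing value after it.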

You can also call the replace() method on a specific column.

main.py
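```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", "Carl", 'None'],
        "Age": [29, 30, 'None', 32],
    }
)

print(df)

# Assign the result back instead of using a chained inplace=True call,
# which no longer updates `df` in pandas 3.0
df['Name'] = df['Name'].replace(to_replace=['None'], value=np.nan)

print('-' * 50)
print(df)
```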

We called the replace() method on the Name column.

In the output, the "None" string is replaced with NaN only in the Name column. The "None" string in the Age column is left unchanged.

# Replacing "None" strings and None values with NaN in a Pandas DataFrame

If you want to replace both "None" strings and None values with NaN in a DataFrame, set the to_replace argument to a list that contains both values.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", None, 'None'],
        "Age": [29, 30, 'None', None],
    }
)

print(df)

df.replace(to_replace=[None, 'None'], value=np.nan, inplace=True)

print('-' * 50)
print(df)
```

The Name and Age columns contain both "None" strings and None values.

We set the to_replace argument to a list containing both values to be able to replace both with NaN.
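If this cleanup is needed in several places, the call can be wrapped in a small helper. The sketch below is only one possible approach: `none_to_nan` and its `columns` parameter are hypothetical names and not part of pandas.

```python
import numpy as np
import pandas as pd


def none_to_nan(frame, columns=None):
    """Return a copy with None values and "None" strings replaced by NaN.

    Hypothetical helper: `columns` limits the replacement to the given
    column names; by default every column is processed.
    """
    cols = list(frame.columns) if columns is None else list(columns)
    result = frame.copy()
    result[cols] = result[cols].replace(to_replace=[None, "None"], value=np.nan)
    return result


df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", None, "None"],
        "Age": [29, 30, "None", None],
    }
)

# Replace in every column
everything = none_to_nan(df)
print(everything)

# Replace in the Name column only
only_name = none_to_nan(df, columns=["Name"])
print(only_name)
```

The helper works on a copy, so the original DataFrame is left unchanged and the caller decides whether to reassign it.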

You can also do this for a specific column only.

main.py
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bobby Hadz", None, 'None'],
        "Age": [29, 30, 'None', None],
    }
)

print(df)

# Only for the Name column; the result is assigned back because a
# chained inplace=True call no longer updates `df` in pandas 3.0
df['Name'] = df['Name'].replace(to_replace=[None, 'None'], value=np.nan)

print('-' * 50)
print(df)
```

The example only replaces the "None" strings and None values with NaN in the Name column.

Copyright © 2024 Borislav Hadzhiev