Cannot convert non-finite values (NA or inf) to integer

avatar
Borislav Hadzhiev

Last updated: Apr 13, 2024
5 min

banner

# Table of Contents

  1. Cannot convert non-finite values (NA or inf) to integer
  2. Use the DataFrame.fillna() method to solve the error
  3. Only calling fillna() on the specific column
  4. Setting the errors argument to ignore when calling astype()
  5. Using the Nullable integer data type to solve the error
  6. Using the pandas.to_numeric() method to solve the error

# Cannot convert non-finite values (NA or inf) to integer

The Pandas error "Cannot convert non-finite values (NA or inf) to integer" occurs when you try to convert the values in a column with missing or non-finite values to integers.

Use the DataFrame.fillna() method to fill the missing values in the DataFrame column before converting them to integers to solve the error.

Here is an example of how the error occurs.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].astype(int) # ⛔️ pandas.errors.IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer print(df)

cannot convert non finite values na or inf to integer

Notice that the year_joined column contains missing values (e.g. None, NaN, etc).

Trying to convert missing values to integers with DataFrame.astype causes the error.

# Use the DataFrame.fillna() method to solve the error

One way to solve the error is to use the DataFrame.fillna() method.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df = df.fillna(value=0) df['year_joined'] = df['year_joined'].astype(int) # name salary year_joined # 0 Alice 175.1 2023 # 1 Bobby 180.2 2022 # 2 Carl 190.3 0 # 3 Dan 205.4 2020 # 4 Ethan 210.5 0 print(df)

using dataframe fillna method to solve the error

The code for this article is available on GitHub

The DataFrame.fillna() method fills the NA/NaN values in the DataFrame.

main.py
df = df.fillna(value=0)

The only argument we passed to the method is the replacement value (the value that is used to fill the holes).

I used 0 in the example but you can use any other value.

# Only calling fillna() on the specific column

You can also only call the fillna() method on the specific DataFrame column.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].fillna(value=0) df['year_joined'] = df['year_joined'].astype(int) # name salary year_joined # 0 Alice 175.1 2023 # 1 Bobby 180.2 2022 # 2 Carl 190.3 0 # 3 Dan 205.4 2020 # 4 Ethan 210.5 0 print(df)

only calling fillna on the specific dataframe column

The code for this article is available on GitHub

We used bracket notation to select the year_joined column and called the fillna() method on the specific column.

# Setting the errors argument to ignore when calling astype()

If you'd rather keep the missing values, set the errors argument to ignore in the call to astype().

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].astype( int, errors='ignore' ) # name salary year_joined # 0 Alice 175.1 2023.0 # 1 Bobby 180.2 2022.0 # 2 Carl 190.3 NaN # 3 Dan 205.4 2020.0 # 4 Ethan 210.5 NaN print(df)

setting errors argument to ignore to keep missing values

The code for this article is available on GitHub

The DataFrame.astype() method takes an optional errors argument.

By default, the argument is set to "raise" which means that exceptions that occur are raised.

However, you can also set the errors argument to "ignore" to suppress exceptions.

main.py
df['year_joined'] = df['year_joined'].astype( int, errors='ignore' ) # name salary year_joined # 0 Alice 175.1 2023.0 # 1 Bobby 180.2 2022.0 # 2 Carl 190.3 NaN # 3 Dan 205.4 2020.0 # 4 Ethan 210.5 NaN print(df)

When an error occurs, the original object is returned.

However, notice that when using this approach, the not-NaN values don't get converted to integers.

# Using the Nullable integer data type to solve the error

You can also use the nullable integer data type to solve the error.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].astype('Int64') # name salary year_joined # 0 Alice 175.1 2023 # 1 Bobby 180.2 2022 # 2 Carl 190.3 <NA> # 3 Dan 205.4 2020 # 4 Ethan 210.5 <NA> print(df)

using nullable integer data type to solve the error

The code for this article is available on GitHub

The "Int64" type is nullable, so we didn't get an error when astype() encountered missing values.

Notice the capital I in "Int64". Not to be confused with the NumPy int64 type.

The "Int64" type uses the pandas.NA value for the missing values (and not numpy.nan).

If you want to round the values before calling the astype() method, use the round() method.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.5, 2022.4, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].round().astype('Int64') # name salary year_joined # 0 Alice 175.1 2024 # 1 Bobby 180.2 2022 # 2 Carl 190.3 <NA> # 3 Dan 205.4 2020 # 4 Ethan 210.5 <NA> print(df)

use round method to round float before conversion to int

The round() method rounds the DataFrame to a variable number of decimal places (0 by default).

If you want to round each float down or up, use the DataFrame.apply() method.

Here is an example that rounds down using numpy.floor().

main.py
import numpy as np import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.5, 2022.4, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].apply(np.floor).astype('Int64') # name salary year_joined # 0 Alice 175.1 2023 # 1 Bobby 180.2 2022 # 2 Carl 190.3 <NA> # 3 Dan 205.4 2020 # 4 Ethan 210.5 <NA> print(df)
The code for this article is available on GitHub

Make sure you have the NumPy module installed.

shell
pip install numpy # or with pip3 pip3 install numpy

using numpy floor method

As the name suggests, the numpy.floor() method returns the floor of the supplied value, element-wise.

If you only want to round up, use the numpy.ceil() method.

main.py
import numpy as np import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.5, 2022.4, None, 2020.0, None], }) df['year_joined'] = df['year_joined'].apply( np.ceil).astype('Int64') # name salary year_joined # 0 Alice 175.1 2024 # 1 Bobby 180.2 2023 # 2 Carl 190.3 <NA> # 3 Dan 205.4 2020 # 4 Ethan 210.5 <NA> print(df)

using numpy ceil to only round up

The code for this article is available on GitHub

The numpy.ceil() method returns the ceiling of the supplied value, element-wise.

# Using the pandas.to_numeric() method to solve the error

You can also use the pandas.to_numeric() method to solve the error.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'], 'salary': [175.1, 180.2, 190.3, 205.4, 210.5], 'year_joined': [2023.0, 2022.0, None, 2020.0, None], }) df['year_joined'] = pd.to_numeric( df['year_joined'], errors='coerce' ).fillna(0) df['year_joined'] = df['year_joined'].astype(int) # name salary year_joined # 0 Alice 175.1 2023 # 1 Bobby 180.2 2022 # 2 Carl 190.3 0 # 3 Dan 205.4 2020 # 4 Ethan 210.5 0 print(df)

using pandas to numeric method to solve the error

The code for this article is available on GitHub

The pandas.to_numeric() method converts the supplied argument to a numeric type.

main.py
df['year_joined'] = pd.to_numeric( df['year_joined'], errors='coerce' ).fillna(0)

By default, the returned dtype is float64 or int64 depending on the supplied data.

We also set the errors argument to "coerce".

When the errors argument is set to "coerce", then values that cannot be parsed are set to NaN.

We called the fillna() method on the result to replace the missing values with zeros.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.
book cover
You can use the search field on my Home Page to filter through all of my articles.