Pandas: Convert entire DataFrame to numeric (int or float)

Borislav Hadzhiev

Last updated: Apr 12, 2024


# Table of Contents

  1. Pandas: Convert entire DataFrame to numeric (int or float)
  2. Setting the errors argument if not all columns are convertible to numeric
  3. Setting the errors argument to coerce

# Pandas: Convert entire DataFrame to numeric (int or float)

Use the DataFrame.apply() method together with the pandas.to_numeric() function to convert every column of a DataFrame to a numeric dtype.

The to_numeric() function converts the values in each column to int or float, depending on the supplied values.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    print(df.dtypes)

    df = df.apply(pd.to_numeric)

    print('-' * 50)

    print(df.dtypes)

The code for this article is available on GitHub

Running the code sample produces the following output.

shell

    id            object
    experience    object
    salary        object
    dtype: object
    --------------------------------------------------
    id              int64
    experience      int64
    salary        float64
    dtype: object


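The DataFrame.apply() method applies a function along an axis of the DataFrame. With the default axis=0, the function is called once per column and receives that column as a Series.

We passed the pandas.to_numeric() function to the apply() method.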

main.py

    df = df.apply(pd.to_numeric)

    # id              int64
    # experience      int64
    # salary        float64
    # dtype: object
    print(df.dtypes)
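
To make the per-column behavior of apply() concrete, here is a minimal sketch that performs the same conversion with an explicit loop and compares the two results. The names via_apply and via_loop are illustrative only, and the smaller DataFrame is an assumption made to keep the example short.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'salary': ['175.1', '180.2', '190.3', '205.4'],
})

# With the default axis=0, apply() calls pd.to_numeric once per column.
via_apply = df.apply(pd.to_numeric)

# The same conversion written as an explicit loop over the columns.
via_loop = df.copy()
for column in via_loop.columns:
    via_loop[column] = pd.to_numeric(via_loop[column])

# Both approaches should produce identical values and dtypes.
print(via_apply.equals(via_loop))
```

The apply() form is shorter, but the loop can be easier to adapt when different columns need different handling.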

The to_numeric() function converts the supplied argument to a numeric type.

The default return dtype is float64 or int64 depending on the supplied data.

Notice that the columns containing whole numbers were converted to int64 and the column containing decimal values was converted to float64.
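
The way to_numeric() picks between integer and float can also be seen on a single Series. The following is a small sketch with made-up values; the variable names ints and floats are illustrative only.

```python
import pandas as pd

# Every value is a whole number, so the result is an integer Series.
ints = pd.to_numeric(pd.Series(['1', '2', '3']))
print(ints.dtype)

# One decimal value is enough to make the whole Series a float Series.
floats = pd.to_numeric(pd.Series(['1', '2.5', '3']))
print(floats.dtype)
```

Because the dtype is chosen per column, a DataFrame passed through apply() can end up with a mix of integer and float columns, as in the output above.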

You can also use the DataFrame.info() method to verify that the columns have been converted to numeric dtypes.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    df = df.apply(pd.to_numeric)

    # <class 'pandas.core.frame.DataFrame'>
    # RangeIndex: 4 entries, 0 to 3
    # Data columns (total 3 columns):
    #  #   Column      Non-Null Count  Dtype
    # ---  ------      --------------  -----
    #  0   id          4 non-null      int64
    #  1   experience  4 non-null      int64
    #  2   salary      4 non-null      float64
    # dtypes: float64(1), int64(2)
    # memory usage: 224.0 bytes
    # info() prints the summary itself and returns None,
    # so it does not need to be wrapped in print().
    df.info()
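
If you would rather check the result in code than read the printed summary, one option is the is_numeric_dtype() helper from pandas.api.types. This is a minimal sketch; the variable name all_numeric is illustrative only.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'experience': ['1', '1', '5', '7'],
    'salary': ['175.1', '180.2', '190.3', '205.4'],
})

df = df.apply(pd.to_numeric)

# True only if every column ended up with a numeric dtype.
all_numeric = all(is_numeric_dtype(dtype) for dtype in df.dtypes)
print(all_numeric)  # True
```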


# Setting the errors argument if not all columns are convertible to numeric

If some columns in the DataFrame cannot be converted to numeric, pandas.to_numeric() raises an error when it is called through DataFrame.apply():

  • ValueError: Unable to parse string "X" at position 0

main.py

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    # ⛔️ ValueError: Unable to parse string "Alice" at position 0
    df = df.apply(pd.to_numeric)
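
Before deciding how to handle the error, it can help to find out which columns are responsible. One way is to try each column separately and record the ones that fail. This is a minimal sketch; the helper name non_numeric_columns is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': ['1', '1', '5', '7'],
    'salary': ['175.1', '180.2', '190.3', '205.4'],
})


def non_numeric_columns(frame):
    """Return the names of columns that pd.to_numeric cannot parse."""
    failed = []
    for column in frame.columns:
        try:
            pd.to_numeric(frame[column])
        except (ValueError, TypeError):
            failed.append(column)
    return failed


print(non_numeric_columns(df))  # ['name']
```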


The pandas.to_numeric() function takes an errors argument.

By default, the argument is set to "raise", which means that invalid parsing raises an exception.

You can set the errors argument to "ignore" to return the values as is if an error is raised when parsing. Note that errors="ignore" is deprecated as of pandas 2.2, where it emits a FutureWarning.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    df = df.apply(pd.to_numeric, errors='ignore')

    # <class 'pandas.core.frame.DataFrame'>
    # RangeIndex: 4 entries, 0 to 3
    # Data columns (total 4 columns):
    #  #   Column      Non-Null Count  Dtype
    # ---  ------      --------------  -----
    #  0   id          4 non-null      int64
    #  1   name        4 non-null      object
    #  2   experience  4 non-null      int64
    #  3   salary      4 non-null      float64
    # dtypes: float64(1), int64(2), object(1)
    # memory usage: 256.0+ bytes
    df.info()


When the errors argument is set to "ignore", invalid parsing returns the input.

The code sample passes the errors argument to DataFrame.apply(), which forwards extra keyword arguments to the function it calls. You can also bind the argument up front with the partial class from the built-in functools module.

main.py

    from functools import partial

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    df = df.apply(partial(pd.to_numeric, errors='ignore'))

    # <class 'pandas.core.frame.DataFrame'>
    # RangeIndex: 4 entries, 0 to 3
    # Data columns (total 4 columns):
    #  #   Column      Non-Null Count  Dtype
    # ---  ------      --------------  -----
    #  0   id          4 non-null      int64
    #  1   name        4 non-null      object
    #  2   experience  4 non-null      int64
    #  3   salary      4 non-null      float64
    # dtypes: float64(1), int64(2), object(1)
    # memory usage: 256.0+ bytes
    df.info()
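
One way to get the same result without errors="ignore", which pandas 2.2 and later warn about, is to catch the exception yourself and return the column unchanged. This is a sketch under that assumption; the helper name to_numeric_or_original is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': ['1', '1', '5', '7'],
    'salary': ['175.1', '180.2', '190.3', '205.4'],
})


def to_numeric_or_original(column):
    """Convert a column to numeric, or return it unchanged if parsing fails."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        return column


df = df.apply(to_numeric_or_original)

# The name column keeps its original values; the others become numeric.
print(df.dtypes)
```

Catching the exception per column keeps the behavior explicit: a column is either converted in full or left exactly as it was.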


# Setting the errors argument to coerce

If you'd rather set values that cannot be converted to numeric to NaN, set the errors argument to "coerce" when calling DataFrame.apply().

main.py

    import pandas as pd

    df = pd.DataFrame({
        'id': ['1', '2', '3', '4'],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': ['1', '1', '5', '7'],
        'salary': ['175.1', '180.2', '190.3', '205.4'],
    })

    df = df.apply(pd.to_numeric, errors='coerce')

    #    id  name  experience  salary
    # 0   1   NaN           1   175.1
    # 1   2   NaN           1   180.2
    # 2   3   NaN           5   190.3
    # 3   4   NaN           7   205.4
    print(df)

    print('-' * 50)

    # id              int64
    # name          float64
    # experience      int64
    # salary        float64
    # dtype: object
    print(df.dtypes)


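When the errors argument is set to "coerce", values that cannot be parsed are set to NaN. In the example, none of the values in the name column can be parsed, so the whole column becomes NaN and its dtype is float64.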
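
Coercing every column wipes out text columns such as name, so you may prefer to coerce only the columns you expect to be numeric. The following is a minimal sketch; the 'n/a' value and the numeric_columns list are assumptions added for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    # 'n/a' is a made-up bad value to show what coercion does.
    'experience': ['1', 'n/a', '5', '7'],
})

# Hypothetical list of the columns that should hold numbers.
numeric_columns = ['id', 'experience']

df[numeric_columns] = df[numeric_columns].apply(pd.to_numeric, errors='coerce')

# Count how many values were replaced with NaN in the experience column.
print(df['experience'].isna().sum())  # 1

# The name column is untouched.
print(df['name'].tolist())  # ['Alice', 'Bobby', 'Carl', 'Dan']
```

Counting the NaN values after coercion is a quick way to see how much data failed to parse.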
