Last updated: Apr 12, 2024
Reading timeยท4 min

To remove the non-numeric rows in a column in a Pandas DataFrame:
pandas.to_numeric() method to convert the values in the column to
numeric.errors argument to "coerce", so non-numeric values get set to
NaN.NaN values using DataFrame.notnull().import pandas as pd df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 'b', 190.3, 'd'], 'experience': [10, 15, 20, 25] }) print(df) print('-' * 50) only_numeric = df[ pd.to_numeric(df['salary'], errors='coerce').notnull() ] print(only_numeric)
Running the code sample produces the following output.
first_name salary experience 0 Alice 175.1 10 1 Bobby b 15 2 Carl 190.3 20 3 Dan d 25 -------------------------------------------------- first_name salary experience 0 Alice 175.1 10 2 Carl 190.3 20

The pandas.to_numeric() method converts the given argument to a numeric type.
We set the errors keyword argument to "coerce" so that values that cannot be
parsed, get set as NaN.
import pandas as pd df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 'b', 190.3, 'd'], 'experience': [10, 15, 20, 25] }) # 0 175.1 # 1 NaN # 2 190.3 # 3 NaN # Name: salary, dtype: float64 print(pd.to_numeric(df['salary'], errors='coerce'))
Notice that the 2 values in the "salary" column that cannot get converted to
numeric values are set as NaN after calling to_numeric() are set to NaN.
The last step is to use the DataFrame.notnull() method to get a mask of boolean values that indicates whether an element is not an NA value.
import pandas as pd df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 'b', 190.3, 'd'], 'experience': [10, 15, 20, 25] }) # 0 True # 1 False # 2 True # 3 False # Name: salary, dtype: bool print(pd.to_numeric(df['salary'], errors='coerce').notnull())
The method returns True if the value is not NA and False otherwise.
You can then use the boolean mask to select the numeric rows.
import pandas as pd df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 'b', 190.3, 'd'], 'experience': [10, 15, 20, 25] }) only_numeric = df[ pd.to_numeric(df['salary'], errors='coerce').notnull() ] # first_name salary experience # 0 Alice 175.1 10 # 2 Carl 190.3 20 print(only_numeric)
apply()You can also use the DataFrame.apply() method to remove the non-numeric rows in a column.
First, make sure that you have the numpy module installed.
pip install numpy # or with pip3 pip3 install numpy
Now, import the module and use the
isinstance function with
DataFrame.apply().
import pandas as pd import numpy as np df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 'b', 190.3, 'd'], 'experience': [10, 15, 20, 25] }) only_numeric = df[df['salary'].apply( lambda x: isinstance( x, (int, np.int64, float, np.float64))) ] # first_name salary experience # 0 Alice 175.1 10 # 2 Carl 190.3 20 print(only_numeric)

The DataFrame.apply() method applies a function along an axis of the
DataFrame.
We used the isinstance() function to check if each value is numeric.
The resulting DataFrame only contains the "salary" rows that store numeric
values.
The code sample considers numeric values ones that have a type of:
intfloatnp.float64You can adjust this in the call to isinstance() depending on your use case.
isnumeric()If you only consider integer values numeric, you can also use the str.isnumeric method.
import pandas as pd df = pd.DataFrame({ 'first_name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'salary': [175.1, 180.5, 190.3, 203.3], 'experience': [10, 'b', 20, 'd'] }) only_numeric = df[df['experience'].astype('str').str.isnumeric()] # first_name salary experience # 0 Alice 175.1 10 # 2 Carl 190.3 20 print(only_numeric)

We used the
DataFrame.astype
method to convert the values in the "experience" column to strings.
The str.isnumeric method returns True if all characters in the string are
numeric, and there is at least one character, otherwise False is returned.
Note that the str.isnumeric() method returns False for negative numbers
(they contain a minus) and for floats (they contain a period).
print('10'.isnumeric()) # ๐๏ธ True print('50'.isnumeric()) # ๐๏ธ True print('-100'.isnumeric()) # ๐๏ธ False print('3.14'.isnumeric()) # ๐๏ธ False print('A'.isnumeric()) # ๐๏ธ False
This approach should only be used when you only consider integers to be numeric values.
You can learn more about the related topics by checking out the following tutorials: