Last updated: Apr 12, 2024
Reading timeยท3 min
To find the closest value to a Number in a DataFrame
column:
argsort()
method to get the integer indices that would sort the
Series
.import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # first name age net salary # 0 Alice 20 75 print(result)
The code sample selects the closest value to the number 21
in the age
column
of the DataFrame
.
We first subtract the given number from each value and get the absolute (non-negative) result.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 # 0 1 # 1 4 # 2 29 # 3 44 # Name: age, dtype: int64 print((df['age'] - num).abs())
The next step is to use the
Series.argsort()
method to get the integer indices that would sort the Series
values.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 # 0 0 # 1 1 # 2 2 # 3 3 # Name: age, dtype: int64 print((df['age'] - num).abs().argsort())
In order to get the single closest value of the given number, we select element
at the first index (0
).
result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # # first name age net salary # # 0 Alice 20 75 # print(result)
If you want to get the closest value as a literal, instead of in a DataFrame
,
access the column and use the Series.tolist()
method.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] print(result['age'].tolist()) # ๐๏ธ [20] print(result['age'].tolist()[0]) # ๐๏ธ 20
The
Series.tolist()
method returns a list of the values stored in the Series
.
If you want to get the first N elements, use a different slice.
For example, the following slice selects the closest 2 values of the given number.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:2]] # first name age net salary # 0 Alice 20 75 # 1 Bobby 25 60 print(result)
If you need to find the index of the closest value in a Pandas DataFrame
column, access the index
attribute on the DataFrame
and call the tolist()
method.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # first name age net salary # 0 Alice 20 75 print(result) closest_value = result['age'].tolist() print(closest_value) # ๐๏ธ [20] closest_value_index = result.index.tolist() print(closest_value_index) # ๐๏ธ [0]
We first got a DataFrame
that contains the row with the closest value.
You can get the index of the closest value by accessing the index
attribute
and calling the tolist()
method.
The tolist()
method will return a list containing the indices of the rows in
the DataFrame
.
idxmin()
You can also use the
DataFrame.idxmin
method instead of argsort()
.
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 series = df.iloc[(df['age'] - num).abs().idxmin()] # first name Alice # age 20 # net salary 75 # Name: 0, dtype: object print(series) closest_value = series['age'] print(closest_value) # ๐๏ธ 20
The DataFrame.idxmin()
method returns the index of the first occurrence of the
minimum value.
You can learn more about the related topics by checking out the following tutorials: