Pandas: Find the closest value to a Number in a Column

avatar
Borislav Hadzhiev

Last updated: Apr 12, 2024
3 min

banner

# Table of Contents

  1. Pandas: Find the closest value to a Number in a Column
  2. Find the index of the closest value in a Pandas DataFrame column
  3. Find the closest value in a DataFrame column using idxmin()

# Pandas: Find the closest value to a Number in a Column

To find the closest value to a Number in a DataFrame column:

  1. Subtract the number from each value in the given column.
  2. Use the argsort() method to get the integer indices that would sort the Series.
  3. Select the element at the first index.
main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # first name age net salary # 0 Alice 20 75 print(result)

find the closest value to a number in a dataframe

The code for this article is available on GitHub

The code sample selects the closest value to the number 21 in the age column of the DataFrame.

We first subtract the given number from each value and get the absolute (non-negative) result.

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 # 0 1 # 1 4 # 2 29 # 3 44 # Name: age, dtype: int64 print((df['age'] - num).abs())
The code for this article is available on GitHub

The next step is to use the Series.argsort() method to get the integer indices that would sort the Series values.

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 # 0 0 # 1 1 # 2 2 # 3 3 # Name: age, dtype: int64 print((df['age'] - num).abs().argsort())

In order to get the single closest value of the given number, we select element at the first index (0).

main.py
result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # # first name age net salary # # 0 Alice 20 75 # print(result)

If you want to get the closest value as a literal, instead of in a DataFrame, access the column and use the Series.tolist() method.

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] print(result['age'].tolist()) # ๐Ÿ‘‰๏ธ [20] print(result['age'].tolist()[0]) # ๐Ÿ‘‰๏ธ 20
The code for this article is available on GitHub

The Series.tolist() method returns a list of the values stored in the Series.

If you want to get the first N elements, use a different slice.

For example, the following slice selects the closest 2 values of the given number.

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:2]] # first name age net salary # 0 Alice 20 75 # 1 Bobby 25 60 print(result)

select closest n values of number in dataframe column

# Find the index of the closest value in a Pandas DataFrame column

If you need to find the index of the closest value in a Pandas DataFrame column, access the index attribute on the DataFrame and call the tolist() method.

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 result = df.iloc[(df['age'] - num).abs().argsort()[:1]] # first name age net salary # 0 Alice 20 75 print(result) closest_value = result['age'].tolist() print(closest_value) # ๐Ÿ‘‰๏ธ [20] closest_value_index = result.index.tolist() print(closest_value_index) # ๐Ÿ‘‰๏ธ [0]

find index of closest value in pandas dataframe column

The code for this article is available on GitHub

We first got a DataFrame that contains the row with the closest value.

You can get the index of the closest value by accessing the index attribute and calling the tolist() method.

The tolist() method will return a list containing the indices of the rows in the DataFrame.

# Find the closest value in a DataFrame column using idxmin()

You can also use the DataFrame.idxmin method instead of argsort().

main.py
import pandas df = pandas.DataFrame({ 'first name': ['Alice', 'Bobby', 'Carl', 'Alice'], 'age': [20, 25, 50, 65], 'net salary': [75, 60, 100, 70] }) num = 21 series = df.iloc[(df['age'] - num).abs().idxmin()] # first name Alice # age 20 # net salary 75 # Name: 0, dtype: object print(series) closest_value = series['age'] print(closest_value) # ๐Ÿ‘‰๏ธ 20

find closest value in column using idxmin

The code for this article is available on GitHub

The DataFrame.idxmin() method returns the index of the first occurrence of the minimum value.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.
book cover
You can use the search field on my Home Page to filter through all of my articles.

Copyright ยฉ 2024 Borislav Hadzhiev