Last updated: Apr 12, 2024
Reading time·4 min
To select rows between two values in a Pandas DataFrame
:
DataFrame.between()
method to select the rows between the two
values.DataFrame
containing only the matching
rows.import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) output = df[df['salary'].between(180, 200)] print(output)
Running the code sample produces the following output.
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 5 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 1 Bobby 3 180.2 2 Carl 5 190.3
We used bracket notation to select the salary
column and called the
between()
method on the resulting DataFrame
.
output = df[df['salary'].between(180, 200)]
The between()
method takes a left
and a right
boundary as parameters and
returns a series representing whether each element is between left
and right
(inclusive).
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) # 0 False # 1 True # 2 True # 3 False # Name: salary, dtype: bool print(df['salary'].between(180, 200))
The last step is to filter the original DataFrame
using the Series
.
By default, the between()
method is inclusive for both the left
and right
boundaries.
You can pass an "inclusive"
keyword argument to between()
if you need to
adjust the behavior.
The available values for the inclusive
argument are "both"
, "neither"
,
"left"
and "right"
.
If you need to select the rows that are NOT between two values in a DataFrame
,
use
the tilde ~
operator.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) output = df[~df['salary'].between(180, 200)] print(output)
Running the code sample produces the following output.
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 5 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1 175.1 3 Dan 7 205.4
We used the tilde ~
operator to invert the call to the between()
method.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) # 0 True # 1 False # 2 False # 3 True # Name: salary, dtype: bool print(~df['salary'].between(180, 200))
The last step is to filter the original DataFrame
with the boolean Series
.
output = df[~df['salary'].between(180, 200)]
You can also use the boolean AND &
operator to select the rows between two
values in a DataFrame
.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) output = df[(df['salary'] >= 180) & (df['salary'] <= 200)] print(output)
The code sample produces the following output.
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 5 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 1 Bobby 3 180.2 2 Carl 5 190.3
Notice that we wrapped each condition in parentheses ()
.
output = df[(df['salary'] >= 180) & (df['salary'] <= 200)]
This makes the code quite easy to read and makes the precedence of operations obvious.
We first check if the salary
value for a specific row is greater than or equal
to 180.
Then we check if the same salary
value is less than or equal to 200.
If both conditions are met, the expression returns True
and the corresponding
row gets included in the results.
You can use the boolean AND &
operator to chain as many conditions as
necessary.
query()
You can also use the
DataFrame.query()
method to select the rows between two values in a DataFrame
.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) output = df.query('180 <= salary <= 200') print(output)
Running the code sample produces the following output.
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 5 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 1 Bobby 3 180.2 2 Carl 5 190.3
The DataFrame.query()
method enables you to query the columns of a DataFrame
with a boolean expression.
You can learn more about the related topics by checking out the following tutorials: