Last updated: Apr 12, 2024
The Pandas "TypeError: Cannot perform 'rand_' with a dtyped [int64] array and scalar of type [bool]" occurs when you try to filter a DataFrame based on multiple conditions without wrapping each condition in parentheses.
To solve the error, wrap each condition in parentheses so that the order of precedence is correct.
Here is an example of how the error occurs.
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

# ⛔️ TypeError: Cannot perform 'rand_' with a dtyped [int64] array and scalar of type [bool]
print(df.loc[df.A == 'Bobby' & df.C > 10])
We tried to filter the DataFrame based on multiple conditions but forgot to wrap each condition in parentheses ().
When using the logical AND (&) operator to chain multiple conditions, make sure each condition is wrapped in parentheses so that the order of precedence in the expression is correct.
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

#        A      B   C
# 1  Bobby  180.2  15
print(df.loc[(df.A == 'Bobby') & (df.C > 10)])
The code sample returns a subset of the DataFrame where the A column is equal to the string "Bobby" and the C column is greater than 10.
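As a variation on the sample above, each condition can be stored in its own variable before the two are combined. This is only a sketch: the names `is_bobby` and `c_above_10` are made up for illustration. Because each comparison is evaluated on its own line, the `&` operator only ever sees two finished boolean Series.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

# Hypothetical variable names; each one holds a boolean Series (a mask).
is_bobby = df['A'] == 'Bobby'
c_above_10 = df['C'] > 10

# Combining two ready-made masks leaves no room for a precedence mix-up.
result = df.loc[is_bobby & c_above_10]

#        A      B   C
# 1  Bobby  180.2  15
print(result)
```

Parentheses around each condition remain the more compact option; named masks mainly help when the conditions get long or are reused.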
If you try to filter a DataFrame based on multiple "OR" conditions with a pipe | without wrapping each condition in parentheses, you would get an error.
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

# ⛔️ TypeError: Cannot perform 'ror_' with a dtyped [int64] array and scalar of type [bool]
df2 = df.loc[df.A == 'Bobby' | df.C > 10]
As shown in the previous example, we have to wrap each condition in parentheses, so that the order of precedence is correct.
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

df2 = df.loc[(df.A == 'Bobby') | (df.C > 10)]

#        A      B   C
# 0  Bobby  175.1  10
# 1  Bobby  180.2  15
# 2   Carl  190.3  20
# 3    Dan  205.4  25
print(df2)
We used the logical OR | operator, so only one of the conditions has to be met.
For a row to get added to the resulting DataFrame, either the A column has to contain the string "Bobby" or the C column has to contain a value greater than 10.
You also have to wrap each condition in parentheses when filtering a DataFrame in an assignment.
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

df.loc[(df.A == 'Bobby') & (df.C < 14), 'C'] = 100000

print(df)
Running the code sample produces the following output.
       A      B       C
0  Bobby  175.1  100000
1  Bobby  180.2      15
2   Carl  190.3      20
3    Dan  205.4      25
The A column has to have a value of "Bobby" and the C column has to have a value of less than 14 for the value of the C column to get updated to 100000.
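To see why only the first row is updated, it can help to look at the boolean mask before using it in the assignment. The sketch below stores the combined condition in a variable (the name `mask` is just an illustration) and prints it.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Bobby', 'Bobby', 'Carl', 'Dan'],
    'B': [175.1, 180.2, 190.3, 205.4],
    'C': [10, 15, 20, 25],
})

# One boolean per row: True only where both conditions hold.
mask = (df['A'] == 'Bobby') & (df['C'] < 14)
print(mask.tolist())  # [True, False, False, False]

# Only the rows where the mask is True get the new value.
df.loc[mask, 'C'] = 100000
print(df['C'].tolist())  # [100000, 15, 20, 25]
```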
Wrapping each condition in parentheses is documented in the "Boolean indexing" section of the Pandas docs.
Wrapping the conditions in parentheses is necessary because, by default, Python evaluates an expression such as df['A'] > 5 & df['B'] < 10 as df['A'] > (5 & df['B']) < 10.
Whereas, the intended evaluation order is (df['A'] > 5) & (df['B'] < 10).
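The grouping described above can be checked without Pandas by asking Python how it parses the expression. The following is a rough sketch that uses the standard library `ast` module (`ast.unparse` assumes Python 3.9 or newer); the expression is only parsed, never executed, so no DataFrame is needed.

```python
import ast

expr = "df['A'] > 5 & df['B'] < 10"

# Parse the expression without running it.
tree = ast.parse(expr, mode='eval').body

# The top-level node is a chained comparison: left > middle < right.
print(type(tree).__name__)  # Compare

# The middle operand is the bitwise AND that Python grouped first.
middle = tree.comparators[0]
print(type(middle).__name__, type(middle.op).__name__)  # BinOp BitAnd
print(ast.unparse(middle))  # 5 & df['B']
```

The middle operand of the comparison chain is `5 & df['B']`, which is the operation that fails when it reaches Pandas.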
The order of precedence of the & operator is set by Python and not by the Pandas library.
However, wrapping each condition in parentheses indicates to the language that the expressions between the parentheses have to be evaluated before the & operator is applied.