Last updated: Apr 12, 2024
Reading timeยท6 min

You can use boolean indexing to sum the values in a column in a Pandas
DataFrame that match a condition.
Once you select the matching values, call the DataFrame.sum() method.
import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 2, 4, 9, 15, 30, 4] }) print(df) result = df.loc[df['A'] == 5, 'B'].sum() print('-' * 50) print(result) # ๐๏ธ 21
Running the code sample produces the following output.
A B 0 3 1 1 5 2 2 7 4 3 10 9 4 5 15 5 19 30 6 5 4 -------------------------------------------------- 21

The code sample sums the values in the B column where the corresponding value
in the A column is equal to 5.
df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 2, 4, 9, 15, 30, 4] }) result = df.loc[df['A'] == 5, 'B'].sum() print(result) # ๐๏ธ 21
The expression df['A'] == 5 returns a boolean array that contains True for
the values in the A column that are equal to 5 and False for the other
values.
import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 2, 4, 9, 15, 30, 4] }) # 0 False # 1 True # 2 False # 3 False # 4 True # 5 False # 6 True # Name: A, dtype: bool print(df['A'] == 5)
The example then uses boolean indexing to only sum the matching values from the
B column.
The same approach can be used to sum the values in a column based on multiple conditions.
The following example sums the values in column B where:
A is equal to 5.C is equal to 6.import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 9, 4, 9, 15, 30, 4], 'C': [4, 6, 8, 2, 10, 11, 6] }) result = df.loc[(df['A'] == 5) & (df['C'] == 6), 'B'].sum() print(result) # ๐๏ธ 13

The two conditions are wrapped in parentheses ().
The first condition returns a boolean array that contains True values for the
elements that are equal to 5 in column A.
The second condition returns a boolean array that contains True values for the
elements in column B that are equal to 6.
& to produce an array where both conditions have to be met for a True value to be returned.The sum of the matching numbers in the B column is returned.
The previous example showed how to use the & operator to sum the values in a
column if 2 conditions are met.
In some cases, you might want to sum the values in a column if at least one condition is met.
You can do this with the pipe | operator.
The following example sums the values in the B column where:
A column is greater than 9.C column is greater than 10.import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [4, 1, 8, 2, 11, 1, 2] }) result = df.loc[(df['A'] > 9) | (df['C'] > 10), 'B'].sum() print(result) # ๐๏ธ 29

Either of the 2 conditions has to be met for the boolean array to contain a
True value.
The expression calculates the sum of 9 + 15 + 5 and returns 29.
query()You can also use the DataFrame.query() method to sum the values in a column based on a condition.
import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 3], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [4, 1, 8, 2, 11, 1, 2] }) result = df.query('A == 5')['B'].sum() print(result) # ๐๏ธ 24
The DataFrame.query() method takes an expression as a parameter and returns a
DataFrame resulting from the supplied expression.
import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 3], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [4, 1, 8, 2, 11, 1, 2] }) # A B C # 1 5 9 1 # 4 5 15 11 print(df.query('A == 5')) result = df.query('A == 5')['B'].sum() print(result) # ๐๏ธ 24
The result variable stores the sum of the matching numbers from column B
(9 and 15).
You can also use the and keyword to check if multiple conditions are met.
import pandas as pd df = pd.DataFrame({ 'A': [3, 5, 7, 10, 5, 19, 5], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [4, 1, 8, 2, 11, 1, 1] }) result = df.query('A == 5 and C == 1')['B'].sum() print(result) # ๐๏ธ 13
The code sample sums the values in B where:
A is equal to 5.C is equal to 1.If you need to check if at least one condition is met before calculating the
sum, use the or keyword.
import pandas as pd df = pd.DataFrame({ 'A': [3, 4, 7, 10, 5, 19, 3], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [1, 4, 8, 2, 11, 0, 3] }) result = df.query('A == 5 or C == 1')['B'].sum() print(result) # ๐๏ธ 16
The code sample sums the values in B where:
A is equal to 5.C is equal to 1.locYou can also sum the values in a column based on a condition without using the
loc indexer.
import pandas as pd df = pd.DataFrame({ 'A': [3, 4, 7, 10, 5, 19, 5], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [1, 4, 8, 2, 11, 0, 3] }) result = df[df['A'] == 5]['B'].sum() print(result) # ๐๏ธ 19
We used bracket notation [] to get a boolean DataFrame that stores True
values if the corresponding number in A is equal to 5.
Then, bracket notation is used to select the matching indices in column B
before calling sum().
import pandas as pd df = pd.DataFrame({ 'A': [3, 4, 7, 10, 5, 19, 5], 'B': [1, 9, 4, 9, 15, 5, 4], 'C': [1, 4, 8, 2, 11, 0, 3] }) result = sum(df[df['A'] == 5]['B']) print(result) # ๐๏ธ 19
Using the DataFrame.sum() method is equivalent to passing the resulting
DataFrame to the sum() function.
You can learn more about the related topics by checking out the following tutorials: