How to add a Filter to Pivot Table in Pandas

# Table of Contents

# How to add a Filter to Pivot Table in Pandas

To add a filter to a pivot table in Pandas:

Use bracket notation to filter the DataFrame based on a condition.
Call the pivot_table() method on the filtered DataFrame.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'name': ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 2, 2, 3, 3, 8],
})


table = df[df['id'] > 1].pivot_table(
    index='id',
    columns=['name'],
    values='experience',
    aggfunc='mean'
)

# name  Bobby  Carl  Dan
# id
# 2       2.5   NaN  NaN
# 3       NaN   3.0  8.0
print(table)

add filter to pivot table in pandas

The code for this article is available on GitHub

The first step is to use bracket notation to filter the DataFrame based on a condition.

For example, the following condition only selects rows that have an id value that is greater than 1.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'name': ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 2, 2, 3, 3, 8],
})

#    id   name  experience
# 2   2  Bobby           2
# 3   2  Bobby           3
# 4   3   Carl           3
# 5   3    Dan           8
print(df[df['id'] > 1])

first filter the dataframe

You can then call the DataFrame.pivot_table() method on the filtered DataFrame.

main.py

Copied!
table = df[df['id'] > 1].pivot_table(
    index='id',
    columns=['name'],
    values='experience',
    aggfunc='mean'
)

# Adding multiple filters with the logical AND & operator

If you need to check for multiple conditions before calling DataFrame.pivot_table(), use the logical AND & operator.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'name': ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 2, 2, 3, 3, 8],
})

table = df[(df['id'] > 1) & (df['name'] != 'Carl')].pivot_table(
    index='id',
    columns=['name'],
    values='experience',
    aggfunc='mean'
)

# name  Bobby  Dan
# id
# 2       2.5  NaN
# 3       NaN  8.0
print(table)

using logical and operator to check for multiple conditions

The code for this article is available on GitHub

The code sample returns the rows that have:

an id that is greater than 1
a name value that is not equal to the string "Carl"

Both conditions have to be met for the row to be included in the resulting DataFrame.

The last step is to call the pivot_table on the filtered DataFrame.

# Adding multiple filters with the logical OR | operator

If you need to check if at least one of multiple conditions is met before calling pivot_table(), use the logical OR operator.

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'name': ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 2, 2, 3, 3, 8],
})

table = df[(df['id'] > 1) | (df['experience'] == 2)].pivot_table(
    index='id',
    columns=['name'],
    values='experience',
    aggfunc='mean'
)

# name  Alice  Bobby  Carl  Dan
# id
# 1       2.0    NaN   NaN  NaN
# 2       NaN    2.5   NaN  NaN
# 3       NaN    NaN   3.0  8.0
print(table)

use logical or before calling pivot table

The code for this article is available on GitHub

The code sample uses the logical OR | operator before calling pivot_table().

The example checks that either of these 2 conditions is met:

The id value of the row is greater than 1.
The experience value of the row is equal to 2.

If either condition is met, the row gets added to the resulting DataFrame on which we call pivot_table().

# Pass the filtered `DataFrame` in the call to `pandas.pivot_table`

You can also pass the filtered DataFrame in the call to pandas.pivot_table().

main.py

Copied!
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2, 2, 3, 3],
    'name': ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 2, 2, 3, 3, 8],
})


table = pd.pivot_table(
    df[df['id'] > 1],
    index='id',
    columns=['name'],
    values='experience',
    aggfunc='mean'
)

# name  Bobby  Carl  Dan
# id
# 2       2.5   NaN  NaN
# 3       NaN   3.0  8.0
print(table)

The code for this article is available on GitHub

The first argument the pandas.pivot_table() method takes is the DataFrame.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.

You can use the search field on my Home Page to filter through all of my articles.

How to add a Filter to Pivot Table in Pandas

# Table of Contents