Check if all values in a Column are Equal in Pandas

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Table of Contents

  1. Check if all values in a Column are Equal in Pandas
  2. Check if all values in a Column are Equal for an entire DataFrame
  3. Checking if all columns of a DataFrame are equal to a given value
  4. Find the Rows where all Columns are Equal in Pandas
  5. Checking if specific columns in a Pandas DataFrame are equal

# Check if all values in a Column are Equal in Pandas

To check if all values in a column are equal in Pandas:

  1. Use the to_numpy() method to convert the column to an array.
  2. Check if the first value in the array is equal to every other value.
  3. If the condition is met, all values in the column are equal.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': [3, 3, 3, 3],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })


    def values_in_column_equal(col):
        arr = col.to_numpy()
        return (arr[0] == arr).all()


    # 👇️ True
    print(values_in_column_equal(df['experience']))

    # 👇️ False
    print(values_in_column_equal(df['name']))

The code for this article is available on GitHub

The Series.to_numpy() method converts the column to a NumPy array.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': [3, 3, 3, 3],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    # 👇️ [3 3 3 3]
    print(df['experience'].to_numpy())

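We selected the first element in the array (index 0) and compared it to every element in the array. The comparison returns a boolean array, and the all() method checks whether every entry in it is True.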

main.py

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': [3, 3, 3, 3],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    # 👇️ [3 3 3 3]
    arr = df['experience'].to_numpy()

    # 👇️ [ True  True  True  True]
    print(arr[0] == arr)

    # 👇️ True
    print((arr[0] == arr).all())

If the condition returns True for all array elements, then all values in the column are equal.
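
The comparison has two edge cases worth knowing about: an empty column and a column that contains NaN. The following is a minimal sketch rather than part of the original function. The empty-column guard and the nunique() alternative are assumptions about how you might want to handle those cases.

```python
import numpy as np
import pandas as pd


def values_in_column_equal(col):
    arr = col.to_numpy()
    # Assumption: an empty column counts as "all equal".
    # Without this guard, arr[0] raises an IndexError.
    if arr.size == 0:
        return True
    return bool((arr[0] == arr).all())


equal = pd.Series([3, 3, 3, 3])
empty = pd.Series([], dtype='float64')
all_nan = pd.Series([np.nan, np.nan])

print(values_in_column_equal(equal))
print(values_in_column_equal(empty))

# NaN never compares equal to itself, so this prints False
print(values_in_column_equal(all_nan))

# Series.nunique() counts NaN as a value when dropna=False
print(all_nan.nunique(dropna=False) == 1)
```

Whether a column of only NaN values should count as "all equal" depends on your data, so pick the check that matches your intent.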

When using this approach, make sure to call the function with a DataFrame column and not with an entire DataFrame.

# Check if all values in a Column are Equal for an entire DataFrame

To run the check for every column of a DataFrame at once, convert the entire DataFrame to an array and set the axis to 0 when calling the all() method.

main.py

    import pandas as pd


    def values_in_column_equal(df_):
        arr = df_.to_numpy()
        return (arr[0] == arr).all(0)


    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'experience': [3, 3, 3, 3],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    # 👇️ [False  True False]
    print(values_in_column_equal(df))

As shown in the code sample, only the values in the experience column are equal.
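
The NumPy result is an unlabeled array, so you have to match its positions to column names yourself. As a hedged alternative that is not from the code above, the sketch below stays in pandas and returns a labeled Series. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# Compare every row to the first row, then reduce each column
by_first_row = df.eq(df.iloc[0]).all(axis=0)
print(by_first_row)

# Alternative: a column with exactly one distinct value is constant
by_nunique = df.nunique() == 1
print(by_nunique)

# Names of the columns whose values are all equal
constant_columns = by_first_row[by_first_row].index.tolist()
print(constant_columns)
```

Both checks flag only the experience column for this data. Note that nunique() ignores NaN by default, so the two can disagree on columns with missing values.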

# Checking if all columns of a DataFrame are equal to a given value

If you need to check if all columns of a DataFrame are equal to a given value, use the DataFrame.eq() method.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
    })

    value = 1

    # a    True
    # b    True
    # dtype: bool
    print(df.eq(value).all(axis=0))

The DataFrame.eq() method returns a DataFrame of boolean values with the results of the element-wise comparison.

We compared each value to 1 and got True values for columns a and b.
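
If you need a single True or False for the whole DataFrame instead of one result per column, you can reduce the per-column result once more. This is a small sketch with made-up data in which column b contains a value other than 1.

```python
import pandas as pd

# Hypothetical data: column b is not entirely equal to 1
df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 2, 1],
})

value = 1

# One boolean per column
per_column = df.eq(value).all(axis=0)
print(per_column)

# Reduce once more to get a single bool for the whole DataFrame
whole_frame = bool(per_column.all())
print(whole_frame)
```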

# Find the Rows where all Columns are Equal in Pandas

You can use a similar approach to find the rows where all columns are equal.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    # 👇️ check all columns against the first column
    print(df.eq(df.iloc[:, 0], axis=0))

    print('-' * 50)

    print(df.eq(df.iloc[:, 0], axis=0).all(1))

Running the code sample produces the following output.

shell

          a     b      c      d
    0  True  True   True   True
    1  True  True  False  False
    2  True  True  False  False
    --------------------------------------------------
    0     True
    1    False
    2    False
    dtype: bool

Once we check all columns against the first column, we can use the all() method to see if all columns are equal for the specific row.
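
The result of all() is a boolean Series, so it can be used as a mask to select the matching rows. The sketch below assumes you want the rows themselves rather than the mask. The nunique() variant is an alternative that is not used in the code above.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

mask = df.eq(df.iloc[:, 0], axis=0).all(axis=1)

# Boolean indexing keeps only the rows where the mask is True
matching_rows = df[mask]
print(matching_rows)

# Alternative: a row with one distinct value has all columns equal
mask_nunique = df.nunique(axis=1) == 1
print(mask_nunique.equals(mask))
```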

You could achieve the same result by using the DataFrame.values attribute.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    values = df.values
    print(values)

    print('-' * 50)

    result = (values == values[:, [0]]).all(axis=1)
    print(result)

Running the code sample produces the following output.

shell

    [[1 1 1 1]
     [1 1 2 2]
     [1 1 3 3]]
    --------------------------------------------------
    [ True False False]
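
The list inside values[:, [0]] matters because it keeps the first column two-dimensional, which lets NumPy broadcast it across every column. The sketch below illustrates the difference in shape. It uses to_numpy(), which the pandas documentation recommends over the values attribute, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

values = df.to_numpy()

# Indexing with a list keeps the column axis: shape (3, 1)
first_col_2d = values[:, [0]]
print(first_col_2d.shape)

# Indexing with an integer drops the column axis: shape (3,)
first_col_1d = values[:, 0]
print(first_col_1d.shape)

# The (3, 1) column is broadcast against the (3, 4) array row by row
result = (values == first_col_2d).all(axis=1)
print(result)
```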

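If the rows you are looking for must all contain a known value (1 in this example), you can also select the columns with the DataFrame.iloc indexer and compare them to that value. Note that the :-1 slice skips the last column, so only the a, b and c columns are checked.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    # compare every column except the last one to the value 1
    df['result'] = (df.iloc[:, :-1] == 1).all(1)

    #    a  b  c  d  result
    # 0  1  1  1  1    True
    # 1  1  1  2  2   False
    # 2  1  1  3  3   False
    print(df)

Only the first row contains the value 1 in every checked column.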

The code sample outputs the results as booleans (True and False). If you would rather get integers (1 for True and 0 for False), call the astype() method on the result.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    df['result'] = (df.iloc[:, :-1] == 1).all(1).astype(int)

    #    a  b  c  d  result
    # 0  1  1  1  1       1
    # 1  1  1  2  2       0
    # 2  1  1  3  3       0
    print(df)

# Checking if specific columns in a Pandas DataFrame are equal

Use the DataFrame.apply() method if you need to check if specific columns in a Pandas DataFrame are equal.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    df['result'] = df.apply(
        lambda x: x['a'] == x['b'],
        axis=1
    )

    #    a  b  c  d  result
    # 0  1  1  1  1    True
    # 1  1  1  2  2    True
    # 2  1  1  3  3    True
    print(df)

The code sample shows that the a and b columns are equal for all 3 rows.

The same approach can be used to check if more than 2 specific columns are equal.

main.py

    import pandas as pd

    df = pd.DataFrame({
        'a': [1, 1, 1],
        'b': [1, 1, 1],
        'c': [1, 2, 3],
        'd': [1, 2, 3],
    })

    df['result'] = df.apply(
        lambda x: x['a'] == x['b'] == x['c'],
        axis=1
    )

    #    a  b  c  d  result
    # 0  1  1  1  1    True
    # 1  1  1  2  2   False
    # 2  1  1  3  3   False
    print(df)

As shown in the output, the columns a, b and c are equal for the first row only.
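
Calling apply() with axis=1 runs a Python function once per row, which can be slow on large DataFrames. As a hedged alternative that is not from the examples above, the same checks can be written as column-wise comparisons. The a_eq_b and abc_equal column names and the cols list are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# Two columns: compare the Series directly
df['a_eq_b'] = df['a'] == df['b']

# Several columns: compare each one to the first, then reduce per row
cols = ['a', 'b', 'c']
df['abc_equal'] = df[cols].eq(df[cols[0]], axis=0).all(axis=1)

print(df)
```

For this data, a_eq_b is True for every row and abc_equal is True for the first row only, which matches the apply() results.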

Copyright © 2024 Borislav Hadzhiev