Check if all values in a Column are Equal in Pandas

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Check if all values in a Column are Equal in Pandas

To check if all values in a column are equal in Pandas:

1. Use the `to_numpy()` method to convert the column to an array.
2. Check if the first value in the array is equal to every other value.
3. If the condition is met, all values in the column are equal.
main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

def values_in_column_equal(col):
    arr = col.to_numpy()

    return (arr[0] == arr).all()

# 👇️ True
print(values_in_column_equal(df['experience']))

# 👇️ False
print(values_in_column_equal(df['name']))
```

The code for this article is available on GitHub

The `Series.to_numpy()` method converts the column to a NumPy array.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# 👇️ [3 3 3 3]
print(df['experience'].to_numpy())
```

We selected the first element in the array (index `0`) and compared it to every element in the array, which produces an array of booleans.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# 👇️ [3 3 3 3]
arr = df['experience'].to_numpy()

# 👇️ [ True  True  True  True]
print(arr[0] == arr)

# 👇️ True
print((arr[0] == arr).all())
```

If the condition returns `True` for all array elements, then all values in the column are equal.

When using this approach, make sure to call the function with a `DataFrame` column and not with an entire `DataFrame`.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

def values_in_column_equal(col):
    arr = col.to_numpy()

    return (arr[0] == arr).all()

# 👇️ True
print(values_in_column_equal(df['experience']))

# 👇️ False
print(values_in_column_equal(df['name']))
```
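
The helper above reads `arr[0]`, so it raises an `IndexError` for an empty column. Because `NaN` never compares equal to itself, it also returns `False` for a column that contains only `NaN` values. If those cases matter, one possible variant counts distinct values with `Series.nunique()` instead. This is only a sketch: the name `column_has_single_value` and the `bonus` column are hypothetical, and treating an empty column as "all equal" is an assumption you may want to change.

```python
import numpy as np
import pandas as pd


def column_has_single_value(col):
    # Hypothetical helper: dropna=False counts NaN as a distinct value,
    # so an all-NaN column is reported as having a single value.
    # Assumption: an empty column (0 distinct values) counts as "all equal".
    return col.nunique(dropna=False) <= 1


df = pd.DataFrame({
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
    'bonus': [np.nan, np.nan, np.nan, np.nan],
})

# 👇️ True
print(column_has_single_value(df['experience']))

# 👇️ False
print(column_has_single_value(df['salary']))

# 👇️ True (all values are NaN)
print(column_has_single_value(df['bonus']))

# 👇️ True (empty column, by assumption)
print(column_has_single_value(pd.Series([], dtype='float64')))
```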

# Check if all values in a Column are Equal for an entire `DataFrame`

If you need to run the check for every column of a `DataFrame` at once, pass `0` as the `axis` argument when calling the `all()` method. The result then contains one boolean per column.

main.py
```python
import pandas as pd

def values_in_column_equal(df_):
    arr = df_.to_numpy()

    return (arr[0] == arr).all(0)

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# 👇️ [False  True False]
print(values_in_column_equal(df))
```

As shown in the code sample, only the values in the `experience` column are equal.
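
The function above returns a plain NumPy array, so the column names are lost. A minimal sketch that keeps the labels compares every row to the first row with `DataFrame.eq()` and then reduces each column with `all(axis=0)`. The variable names `equal_per_column` and `constant_columns` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [3, 3, 3, 3],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# Compare every row to the first row, column by column,
# then check that each column contains only True values.
# The result is a boolean Series indexed by column name.
equal_per_column = df.eq(df.iloc[0], axis='columns').all(axis=0)
print(equal_per_column)

# Keep only the labels of the columns whose values are all equal.
constant_columns = equal_per_column[equal_per_column].index.tolist()

# 👇️ ['experience']
print(constant_columns)
```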

# Checking if all columns of a DataFrame are equal to a given value

If you need to check if all columns of a `DataFrame` are equal to a given value, use the `DataFrame.eq()` method.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
})

value = 1

# a    True
# b    True
# dtype: bool
print(df.eq(value).all(axis=0))
```

The `DataFrame.eq()` method returns a `DataFrame` of boolean values with the results of the comparison.

We compared each value to `1` and got `True` values for columns `a` and `b`.
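
Calling `all(axis=0)` reports one result per column. If a single yes/no answer for the whole `DataFrame` is needed, the per-column results can be reduced once more. A small sketch, where the second `DataFrame` (`df2`) and the variable names are made up for illustration:

```python
import pandas as pd

value = 1

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
})

# Hypothetical second DataFrame with one value that differs.
df2 = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 2, 1],
})

# The first all() reduces each column,
# the second all() reduces the per-column results.
all_equal = bool(df.eq(value).all().all())
all_equal_2 = bool(df2.eq(value).all().all())

# 👇️ True
print(all_equal)

# 👇️ False
print(all_equal_2)
```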

# Find the Rows where all Columns are Equal in Pandas

You can use a similar approach to find the rows where all columns are equal.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# 👇️ check all columns against the first column
print(df.eq(df.iloc[:, 0], axis=0))

print('-' * 50)

print(df.eq(df.iloc[:, 0], axis=0).all(1))
```

Running the code sample produces the following output.

shell
```
      a     b      c      d
0  True  True   True   True
1  True  True  False  False
2  True  True  False  False
--------------------------------------------------
0     True
1    False
2    False
dtype: bool
```

Once we check all columns against the first column, we can use the `all()` method to see if all columns are equal for the specific row.

You could achieve the same result by using the `DataFrame.values` attribute.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

values = df.values
print(values)

print('-' * 50)

result = (values == values[:, [0]]).all(axis=1)
print(result)
```

Running the code sample produces the following output.

shell
```
[[1 1 1 1]
 [1 1 2 2]
 [1 1 3 3]]
--------------------------------------------------
[ True False False]
```
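
The list index in `values[:, [0]]` is what makes the comparison line up. It keeps the first column as a 2-D array of shape `(3, 1)`, which NumPy broadcasts across the columns of the `(3, 4)` array, whereas `values[:, 0]` would produce a 1-D array of shape `(3,)`. The sketch below prints the shapes to make this visible. It uses `DataFrame.to_numpy()`, which the pandas documentation recommends over `.values`, and the names `first_col_2d` and `first_col_1d` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

values = df.to_numpy()

# A list index keeps the column dimension, a scalar index drops it.
first_col_2d = values[:, [0]]
first_col_1d = values[:, 0]

# 👇️ (3, 4)
print(values.shape)

# 👇️ (3, 1)
print(first_col_2d.shape)

# 👇️ (3,)
print(first_col_1d.shape)

# Each row of `values` is compared to that row's first element.
result = (values == first_col_2d).all(axis=1)

# 👇️ [ True False False]
print(result)
```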

You can achieve the same result by using the DataFrame.iloc indexer.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# 👇️ compare columns b, c and d against the first column (a)
df['result'] = df.iloc[:, 1:].eq(df.iloc[:, 0], axis=0).all(axis=1)

#    a  b  c  d  result
# 0  1  1  1  1    True
# 1  1  1  2  2   False
# 2  1  1  3  3   False
print(df)
```

Only the values in the first row are equal for all columns.

The code sample outputs the results as booleans (`True` and `False`). You might instead want to output them as the integers `1` (for `True`) and `0` (for `False`).

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# 👇️ compare columns b, c and d against the first column (a),
# then convert the booleans to integers
df['result'] = df.iloc[:, 1:].eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)

#    a  b  c  d  result
# 0  1  1  1  1       1
# 1  1  1  2  2       0
# 2  1  1  3  3       0
print(df)
```
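
The examples in this section produce one boolean (or integer) per row. To get the matching rows themselves, the boolean result can be used as a mask for boolean indexing. A short sketch, with `mask` and `matching_rows` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# True for the rows where every column equals the first column.
mask = df.eq(df.iloc[:, 0], axis=0).all(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
matching_rows = df[mask]

#    a  b  c  d
# 0  1  1  1  1
print(matching_rows)
```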

# Checking if specific columns in a Pandas DataFrame are equal

Use the DataFrame.apply() method if you need to check if specific columns in a Pandas `DataFrame` are equal.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

df['result'] = df.apply(
    lambda x: x['a'] == x['b'],
    axis=1
)

#    a  b  c  d  result
# 0  1  1  1  1    True
# 1  1  1  2  2    True
# 2  1  1  3  3    True
print(df)
```

The code sample shows that the `a` and `b` columns are equal for all 3 rows.

The same approach can be used to check if more than 2 specific columns are equal.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

df['result'] = df.apply(
    lambda x: x['a'] == x['b'] == x['c'],
    axis=1
)

#    a  b  c  d  result
# 0  1  1  1  1    True
# 1  1  1  2  2   False
# 2  1  1  3  3   False
print(df)
```

As shown in the output, the columns `a`, `b` and `c` are equal for the first row only.
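
Calling `DataFrame.apply()` with `axis=1` runs the lambda once per row, which can become slow on a large `DataFrame`. A vectorized sketch of the same check compares the selected columns to the first of them with `DataFrame.eq()`. The `cols` variable is a hypothetical name for the list of columns you want to compare.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 1],
    'b': [1, 1, 1],
    'c': [1, 2, 3],
    'd': [1, 2, 3],
})

# Hypothetical list of the columns to compare.
cols = ['a', 'b', 'c']

# Compare each selected column to the first selected column, row by row,
# then require that every comparison in the row is True.
df['result'] = df[cols].eq(df[cols[0]], axis=0).all(axis=1)

#    a  b  c  d  result
# 0  1  1  1  1    True
# 1  1  1  2  2   False
# 2  1  1  3  3   False
print(df)
```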

Copyright © 2024 Borislav Hadzhiev