Pandas ValueError: ('Lengths must match to compare') [Fix]

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Table of Contents

  1. Pandas ValueError: ('Lengths must match to compare')
  2. Selecting a specific value before the comparison
  3. Checking if a DataFrame column contains a specific value
  4. Comparing each row in the DataFrame column with a specific value

# Pandas ValueError: ('Lengths must match to compare')

The Pandas "ValueError: ('Lengths must match to compare')" occurs when you use a comparison operator such as `==` between a Series and a list or array of a different length.

To solve the error, select a specific value before the comparison, check whether the column contains the value, or compare each row in the column to a single value.

Here is an example of how the error occurs.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

arr = ['2023-01-05']

# ⛔️ ValueError: ('Lengths must match to compare', (3,), (1,))
if df['date'] == arr:
    print('success')
```

We selected the date column and tried to compare it to a list containing a single date string.

We are trying to compare a Series that contains 3 elements to a list that contains 1 element.
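
As a rough illustration of the length rule, the sketch below catches the error and then repeats the comparison with operands that pandas can line up. The `same_length` list and its middle date are made up for the example and are not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

# A 1-element list cannot be lined up with a 3-element Series.
error_message = None
try:
    df['date'] == ['2023-01-05']
except ValueError as err:
    error_message = str(err)

print(error_message)

# Hypothetical list with one entry per row: the comparison is element-wise.
same_length = ['2023-01-05', '2023-01-01', '2023-01-24']
elementwise = df['date'] == same_length
print(elementwise.tolist())  # [True, False, True]

# A single string is broadcast to every row, so no lengths have to match.
against_scalar = df['date'] == '2023-01-05'
print(against_scalar.tolist())  # [True, False, False]
```

Note that a successful comparison still returns a boolean Series with one entry per row, not a single `True` or `False`.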

# Selecting a specific value before the comparison

If you need to select a specific value before the comparison, use the DataFrame.iloc indexer.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

arr = ['2023-01-05']

print(df.iloc[0]['date'])  # 👉️ "2023-01-05"

if df.iloc[0]['date'] == arr[0]:
    # 👇️ this runs
    print('success')
else:
    print('failure')
```

The code for this article is available on GitHub

The DataFrame.iloc indexer selects rows and columns by integer position.

Positions run from 0 to length - 1 because Python indices are zero-based.

We end up comparing two strings in the example, so everything works as expected.

We compared the first row of the date column to the first element of the arr list.
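
A few other ways to pull out the same single value are sketched below. The variable names are only illustrative, and the label-based lookups assume the default integer index, where row label 0 is also position 0.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

arr = ['2023-01-05']

# Select the column first, then the value at position 0.
first_by_position = df['date'].iloc[0]

# Label-based lookups: row label 0, column 'date'.
first_by_label = df.loc[0, 'date']
first_scalar = df.at[0, 'date']  # .at is intended for single values

print(first_by_position)  # 2023-01-05
print(first_by_position == arr[0])  # True
print(first_by_label == first_scalar == arr[0])  # True
```

In every case the comparison is between two plain strings, so no length check is involved.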

# Checking if a DataFrame column contains a specific value

If you need to check if a DataFrame column contains a specific value, use the in operator on the column's values.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

if date in df['date'].values:
    # 👇️ this runs
    print('success')
else:
    print('failure')
```

The Series.values attribute returns an ndarray containing the values of the Series.

We used the in operator to check if the given string is contained in the Series.

If you forget to access the values attribute, the in operator checks whether the value is in the index of the Series rather than in its values.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

# 0    2023-01-05
# 1    2023-03-25
# 2    2023-01-24
# Name: date, dtype: object
print(df['date'])

print('-' * 50)

# The string is not an index label
print(date in df['date'])  # 👉️ False

print('-' * 50)

# 0 is an index label
print(0 in df['date'])  # 👉️ True
```

Running the code sample returns the following output.

shell
```shell
0    2023-01-05
1    2023-03-25
2    2023-01-24
Name: date, dtype: object
--------------------------------------------------
False
--------------------------------------------------
True
```

You can also use the tolist() method to convert the Series to a list before the membership check.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

if date in df['date'].tolist():
    # 👇️ this runs
    print('success')
else:
    print('failure')
```

You can also compare the column to the value and call the Series.any() method on the result to check if the Series contains a specific value.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

if (df['date'] == date).any():
    # 👇️ this runs
    print('success')
else:
    print('failure')
```

The any() method returns False unless there is at least one element in the Series (or along a DataFrame axis) that is True or equivalent.
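
The sketch below shows why the reduction step matters, under the assumption that the goal is a single `True` or `False` for an `if` statement. The names `mask` and `ambiguous` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

# Comparing with a single string gives one boolean per row.
mask = df['date'] == date
print(mask.tolist())  # [True, False, False]

# A multi-element boolean Series has no single truth value,
# so using it directly in an if statement raises a ValueError.
ambiguous = False
try:
    if mask:
        pass
except ValueError:
    ambiguous = True

print(ambiguous)  # True

# any() and all() reduce the mask to one boolean.
print(mask.any())  # True, at least one row matches
print(mask.all())  # False, not every row matches
```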

You can also use the Series.isin() method along with any().

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

if df['date'].isin([date]).any():
    # 👇️ this runs
    print('success')
else:
    print('failure')
```

The isin() method takes a list-like object of values and checks if the elements in the Series are contained in the supplied list.

The method returns a boolean Series that shows whether each element in the original Series matches an element in the supplied sequence.
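
To make the boolean Series visible, here is a small sketch that passes several candidate values to isin(). The `wanted` list, including the date that does not appear in the column, is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

# Hypothetical list of dates to look for; its length does not need
# to match the length of the column.
wanted = ['2023-01-05', '2023-01-24', '2024-12-31']

mask = df['date'].isin(wanted)
print(mask.tolist())  # [True, False, True]

# The mask can also be used to select the matching rows.
matching_numbers = df.loc[mask, 'numbers'].tolist()
print(matching_numbers)  # [1, 3]
```

Unlike `==`, isin() treats the list as a set of allowed values, which is why a list of any length works.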

# Comparing each row in the DataFrame column with a specific value

If you need to compare each row in the DataFrame column with a specific value, use the Series.apply() method.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

# 0     True
# 1    False
# 2    False
# Name: date, dtype: bool
print(df['date'].apply(lambda x: x == date))
```

The function we passed to the apply() method gets called with each value in the column and compares it to the given date string.
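
For a plain equality check, comparing the column to the string directly should give the same result without a Python-level function call per row. The following is a minimal sketch of that alternative, not a replacement for apply() when the per-row logic is more involved.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

with_apply = df['date'].apply(lambda x: x == date)

# The == operator and Series.eq() broadcast the string to every row.
vectorized = df['date'] == date
with_eq = df['date'].eq(date)

print(with_apply.tolist())  # [True, False, False]
print(vectorized.tolist())  # [True, False, False]
print(with_eq.tolist())     # [True, False, False]
```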

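If you need to get the matching rows, use boolean indexing with the str.contains() method. Note that str.contains() matches substrings and treats the pattern as a regular expression by default.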

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

date = '2023-01-05'

match = df[df['date'].str.contains(date)]

#    numbers        date
# 0        1  2023-01-05
print(match)

print('-' * 50)

# numbers    1
# date       1
# dtype: int64
print(match.count())
```

The DataFrame.count() method returns a Series with the number of non-NA cells in each column (or in each row when axis=1 is passed).
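
Because str.contains() matches substrings and interprets its pattern as a regular expression by default, it can select more rows than an exact comparison. The sketch below contrasts the two; the patterns `'2023-01'` and `'2023.01'` are made-up inputs chosen to show the difference.

```python
import pandas as pd

df = pd.DataFrame({
    'numbers': [1, 2, 3],
    'date': ['2023-01-05', '2023-03-25', '2023-01-24']
})

# Substring match: every date that contains '2023-01' is selected.
substring_rows = df[df['date'].str.contains('2023-01', regex=False)]
print(substring_rows['numbers'].tolist())  # [1, 3]

# With the default regex=True, '.' matches any character.
regex_hits = df['date'].str.contains('2023.01').tolist()
literal_hits = df['date'].str.contains('2023.01', regex=False).tolist()
print(regex_hits)    # [True, False, True]
print(literal_hits)  # [False, False, False]

# Exact match: only rows whose date equals the full string.
exact_rows = df[df['date'] == '2023-01-05']
print(exact_rows['numbers'].tolist())  # [1]
print(len(exact_rows))  # 1, the number of matching rows
```

When only exact matches are wanted, the `==` comparison is the safer choice of the two.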


Copyright © 2024 Borislav Hadzhiev