Last updated: Apr 11, 2024

The Pandas "ValueError: Length mismatch: Expected axis has X elements, new
values have Y elements" occurs when the number of column names you assign to
a DataFrame doesn't match the number of columns in the DataFrame.
To solve the error, make sure the list of column names you assign has exactly
as many items as the DataFrame has columns.

Here is an example of how the error occurs.
```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

# ⛔️ ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements
df.columns = ['id', 'first_name', 'salary']
```
The DataFrame in the example has 2 columns, id and name, but we assigned a
list of 3 column names to its columns attribute.
The expected axis has 2 elements (id and name) and the new values have 3
elements (id, first_name and salary).
One way to solve the error is to ensure the number of names in the list on the
right matches the number of columns in the DataFrame.
```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

df.columns = ['id', 'first_name']

#     id first_name
# 0  112      Alice
# 1  113      Bobby
# 2  114       Carl
# 3  115        Dan
print(df)
```

The DataFrame has 2 columns and the list on the right-hand side also has 2
names, so the error is no longer raised.
You can use the len() function to check that the number of columns in the
DataFrame matches the number of column names.
```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

# ✅ Print the length of columns in DataFrame
print(len(df.columns))  # 👉️ 2

column_names = ['id', 'first_name']

# ✅ Print the length of the column names
print(len(column_names))  # 👉️ 2

df.columns = ['id', 'first_name']

#     id first_name
# 0  112      Alice
# 1  113      Bobby
# 2  114       Carl
# 3  115        Dan
print(df)
```
There has to be a match between the following 2 calls to len().
```python
# ✅ Print the length of columns in DataFrame
print(len(df.columns))  # 👉️ 2

# ✅ Print the length of the column names
print(len(column_names))  # 👉️ 2
```
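The length check can also be wrapped in a small function that fails early with a more descriptive message. The following is only a sketch: `set_column_names` is a hypothetical helper written for this article, not part of the Pandas API.

```python
import pandas as pd


def set_column_names(df, column_names):
    """Assign column_names to df only if the counts match (hypothetical helper)."""
    expected = len(df.columns)
    received = len(column_names)

    if expected != received:
        # Raise our own error before Pandas does, listing both sides
        raise ValueError(
            f'DataFrame has {expected} columns {list(df.columns)}, '
            f'but {received} names were given: {list(column_names)}'
        )

    df.columns = column_names
    return df


df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

df = set_column_names(df, ['id', 'first_name'])

print(df.columns.tolist())  # 👉️ ['id', 'first_name']
```

Passing a list with a different number of names to the helper raises a ValueError that shows the existing columns next to the names you supplied, which makes the mismatch easier to spot.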
You can use the df.columns attribute if you need to print the columns in the
DataFrame.
```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

# Index(['id', 'name'], dtype='object')
print(df.columns)
```
The DataFrame has id and name columns.
You might have excluded one or more of your DataFrame columns somewhere in
your code.
```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

df = df.drop(['id'], axis=1)

#     name
# 0  Alice
# 1  Bobby
# 2   Carl
# 3    Dan
print(df)

# 👇️ Index(['name'], dtype='object')
print(df.columns)

print(len(df.columns))  # 👉️ 1

df.columns = ['first_name']

#   first_name
# 0      Alice
# 1      Bobby
# 2       Carl
# 3        Dan
print(df)
```
The code sample uses the
DataFrame.drop() method to remove the
id column from the DataFrame.
After the column is removed, the DataFrame has only 1 column remaining.
Make sure the number of columns in the DataFrame matches the number of column
names in the list on the right.
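If you only need to rename some of the columns, one alternative is to rename them by label with DataFrame.rename() rather than assigning a full list. This is a minimal sketch using the same sample data as above. Because the mapping is keyed by the existing column name, there is no list length to keep in sync.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'id': [112, 113, 114, 115],
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    }
)

# Rename by label instead of by position; unlisted columns keep their names
df = df.rename(columns={'name': 'first_name'})

print(df.columns.tolist())  # 👉️ ['id', 'first_name']
```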
The error is also raised when you try to assign column names to an empty DataFrame.
```python
import pandas as pd

df = pd.DataFrame()

print(len(df.columns))  # 👉️ 0

# ⛔️ ValueError: Length mismatch: Expected axis has 0 elements, new values have 2 elements
df.columns = ['id', 'name']
```
We initialized the DataFrame as empty, so it has 0 columns.
Trying to overwrite its column names with a list containing 2 items causes the error.
One way to solve the error is to initialize the DataFrame with 2 columns using
numpy.empty().
```python
import pandas as pd
import numpy as np

df = pd.DataFrame(np.empty((0, 2)))

print(len(df.columns))  # 👉️ 2

df.columns = ['id', 'name']

print(df.columns)  # 👉️ Index(['id', 'name'], dtype='object')
print(len(df.columns))  # 👉️ 2
```
Make sure to initialize the DataFrame with the correct number of columns (the
example uses 2 columns).
The numpy.empty() function returns a new array of the specified shape and type, without initializing its entries. A shape of (0, 2) means 0 rows and 2 columns.
If you don't want to use NumPy, use the range class instead.
```python
import pandas as pd

df = pd.DataFrame(columns=range(2))

print(len(df.columns))  # 👉️ 2

df.columns = ['id', 'name']

print(df.columns)  # 👉️ Index(['id', 'name'], dtype='object')
print(len(df.columns))  # 👉️ 2
```
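If the final column names are already known when the DataFrame is created, you can likely skip the reassignment altogether by passing them to the columns argument of the constructor. A minimal sketch, assuming the same id and name columns:

```python
import pandas as pd

# Create an empty DataFrame that already has the final column names
df = pd.DataFrame(columns=['id', 'name'])

print(len(df.columns))  # 👉️ 2
print(df.columns.tolist())  # 👉️ ['id', 'name']
print(df.empty)  # 👉️ True
```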
You might also get the error when setting the index_col argument to 0 when
reading CSV files with
pandas.read_csv.
```python
import pandas as pd

df = pd.read_csv(
    'data.csv',
    index_col=0,  # 👈️
    header=None,
    skiprows=[0, 1, 2]
)
```
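The index_col argument determines the column(s) to use as the row labels of
the DataFrame. With index_col=0, the first column of the file becomes the
index and no longer counts as a regular column, so the DataFrame has one fewer
column than the file.
Try setting the index_col argument to None instead and see if the error
resolves.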
```python
import pandas as pd

df = pd.read_csv(
    'data.csv',
    index_col=None,  # 👈️
    header=None,
    skiprows=[0, 1, 2]
)
```
When index_col is set to None, a separate numerical index is assigned.
None is the default value for the index_col keyword argument.
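To see how index_col changes the column count, here is a self-contained sketch that reads the same CSV text twice. The contents of csv_text are made up for illustration and stand in for data.csv, and io.StringIO is used so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical file contents: 3 rows to skip, followed by 2 data rows
csv_text = (
    'report,2024,v1\n'
    'generated,by,tool\n'
    'id,name,salary\n'
    '112,Alice,100\n'
    '113,Bobby,200\n'
)

# With index_col=0 the first column becomes the index
with_index = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,
    header=None,
    skiprows=[0, 1, 2],
)
print(len(with_index.columns))  # 👉️ 2

# With index_col=None all 3 columns remain regular columns
without_index = pd.read_csv(
    io.StringIO(csv_text),
    index_col=None,
    header=None,
    skiprows=[0, 1, 2],
)
print(len(without_index.columns))  # 👉️ 3

# Assigning 3 names now matches the 3 columns
without_index.columns = ['id', 'name', 'salary']
print(without_index.columns.tolist())  # 👉️ ['id', 'name', 'salary']
```

Assigning the same 3 names to with_index would raise the length mismatch error, because that DataFrame only has 2 columns.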