Last updated: Apr 11, 2024
You will most commonly get Unnamed: 0 columns in a Pandas DataFrame when it is saved to a CSV file with its index and then read back.
To resolve the issue, set the index argument to False when saving the DataFrame to a CSV file.
For example, running the following code sample produces an Unnamed: 0 column.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    df.to_csv('data.csv', encoding='utf-8')

    df = pd.read_csv('data.csv', sep=',', encoding='utf-8')

    print('-' * 50)
    print(df)
Here is the output of running the code sample:
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       Unnamed: 0  bobby  hadz  com
    0           0      1     2    3
    1           1      4     5    6
    2           2      7     8    8
Notice that we have an Unnamed: 0 column.
This happens because to_csv() writes the unnamed index as the first column with an empty header, and read_csv() then names that headerless column Unnamed: 0.
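To see where the extra column comes from, it can help to look at the raw CSV text. The following is a minimal sketch that relies on the fact that calling to_csv() without a path returns the CSV as a string instead of writing a file. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
    columns=['bobby', 'hadz', 'com']
)

# With no path argument, to_csv() returns the CSV text as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# The header row starts with an empty field when the index is written.
header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]

print(header_with_index)     # ,bobby,hadz,com
print(header_without_index)  # bobby,hadz,com
```

The leading comma in the first header is the empty name of the index column, which is what read_csv() later labels Unnamed: 0.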
You can resolve the issue by setting the index argument to False when calling DataFrame.to_csv.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    # ✅ Set index to False
    df.to_csv('data.csv', encoding='utf-8', index=False)

    df = pd.read_csv('data.csv', sep=',', encoding='utf-8')

    print('-' * 50)
    print(df)
Running the code sample produces the following output.
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
Notice that we don't have an Unnamed: 0 column anymore.
The index argument defaults to True.
When the argument is set to False, the row names (indices) are not written.
Note: the issue also occurs when you end each row with a comma when writing your data to a CSV file. Make sure you don't have any trailing commas.
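The trailing-comma case can be reproduced without touching the file system. The sketch below uses a made-up CSV string in which every row, including the header, ends with a comma, and reads it through io.StringIO.

```python
import io

import pandas as pd

# Hypothetical CSV text in which every row, including the header,
# ends with a trailing comma.
csv_text = "bobby,hadz,com,\n1,2,3,\n4,5,6,\n"

df = pd.read_csv(io.StringIO(csv_text))

# The empty fourth field is read as an extra, unnamed column.
print(list(df.columns))  # ['bobby', 'hadz', 'com', 'Unnamed: 3']

# The extra column holds only missing values.
print(df['Unnamed: 3'].isna().all())  # True
```

Here the unnamed column appears at position 3 rather than 0, because the number in the name is the column's position in the file.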
You can also explicitly set the index_col argument in the call to pandas.read_csv to resolve the issue.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    df.to_csv('data.csv', encoding='utf-8')

    # ✅ Explicitly set index_col to 0
    df = pd.read_csv(
        'data.csv',
        sep=',',
        encoding='utf-8',
        index_col=[0]
    )

    print('-' * 50)
    print(df)
Running the code sample produces the following output.
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
The index_col argument determines the column that is used as the row label of the DataFrame.
We used a column index of 0 to set the first column as the index.
    # ✅ explicitly set index_col to 0
    df = pd.read_csv(
        'data.csv',
        sep=',',
        encoding='utf-8',
        index_col=[0]
    )
This should be your preferred approach when you don't have access to the code that saves the DataFrame to a CSV file.
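As a rough illustration of that situation, the sketch below treats a hard-coded string as a CSV file produced by code you can't change. The csv_text value is made up for the example and mirrors what to_csv() writes by default.

```python
import io

import pandas as pd

# Hypothetical CSV text written elsewhere with the index included.
csv_text = ",bobby,hadz,com\n0,1,2,3\n1,4,5,6\n2,7,8,8\n"

# Default read: the headerless first column becomes 'Unnamed: 0'.
default_df = pd.read_csv(io.StringIO(csv_text))

# Reading with index_col=0 uses the first column as the row labels instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.columns))  # ['Unnamed: 0', 'bobby', 'hadz', 'com']
print(list(indexed_df.columns))  # ['bobby', 'hadz', 'com']
print(list(indexed_df.index))    # [0, 1, 2]
```

Passing index_col=0 and index_col=[0] gives the same result for a single index column.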
Using str.match() to drop the Unnamed columns
You can also use the str.match() method to drop the Unnamed columns from your DataFrame.
The method enables us to determine whether each string starts with a match of a regular expression.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    df.to_csv('data.csv', encoding='utf-8')

    df = pd.read_csv(
        'data.csv',
        sep=',',
        encoding='utf-8',
    )

    print('-' * 50)
    print(df)

    # 👇️ Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)

    print('-' * 50)

    df = df.loc[:, ~df.columns.str.match('Unnamed')]

    print(df)

    # 👇️ Index(['bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)
Running the code sample produces the following output.
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       Unnamed: 0  bobby  hadz  com
    0           0      1     2    3
    1           1      4     5    6
    2           2      7     8    8
    Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    --------------------------------------------------
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    Index(['bobby', 'hadz', 'com'], dtype='object')
We used the str.match() method to find all columns whose names start with the string Unnamed.
The ~ operator inverts the resulting boolean mask, so the DataFrame.loc label indexer keeps only the columns that don't match.
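Because str.match() only matches at the start of each string, it behaves differently from str.contains(). The sketch below compares the two on a small set of column names. The 'Total Unnamed' column is a made-up name used only to show the difference.

```python
import pandas as pd

# 'Total Unnamed' is a hypothetical column name used for illustration.
columns = pd.Index(['Unnamed: 0', 'bobby', 'Total Unnamed'])

# str.match() anchors the pattern at the start of each name.
starts_with = columns.str.match('Unnamed')

# str.contains() finds the pattern anywhere in the name.
contains = columns.str.contains('Unnamed')

print(starts_with.tolist())  # [True, False, False]
print(contains.tolist())     # [True, False, True]

# Inverting the mask keeps the names that do NOT start with 'Unnamed'.
kept = columns[~starts_with]
print(list(kept))  # ['bobby', 'Total Unnamed']
```

Using str.match() is the safer choice here, since it leaves alone any legitimate column that merely contains the word.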
Using df.drop() to drop the Unnamed columns
You can also use the DataFrame.drop and DataFrame.filter methods to drop the Unnamed columns from your DataFrame.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    df.to_csv('data.csv', encoding='utf-8')

    df = pd.read_csv(
        'data.csv',
        sep=',',
        encoding='utf-8',
    )

    print('-' * 50)
    print(df)

    # 👇️ Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)

    print('-' * 50)

    df.drop(df.filter(regex="Unname"), axis=1, inplace=True)

    print(df)

    # 👇️ Index(['bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)
Running the code sample produces the following output.
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       Unnamed: 0  bobby  hadz  com
    0           0      1     2    3
    1           1      4     5    6
    2           2      7     8    8
    Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    --------------------------------------------------
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    Index(['bobby', 'hadz', 'com'], dtype='object')
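We used the DataFrame.filter() method to get the DataFrame columns whose names contain Unname.
The regex is applied with re.search, so it matches anywhere in the column name unless you anchor it, e.g. with ^Unnamed.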
The next step is to use the DataFrame.drop() method to drop the matching columns in place.
When the inplace argument is set to True, the original DataFrame is updated, so no reassignment is necessary.
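If you would rather not modify the original DataFrame, a variation on the same idea is to leave out inplace and assign the result to a new name. The sketch below builds the DataFrame directly (instead of reading data.csv) and anchors the pattern with ^ so that only names starting with Unnamed are dropped. The variable names are illustrative.

```python
import pandas as pd

# A DataFrame shaped like the one read back from the CSV file.
df = pd.DataFrame(
    [[0, 1, 2, 3], [1, 4, 5, 6]],
    columns=['Unnamed: 0', 'bobby', 'hadz', 'com']
)

# ^ anchors the pattern to the start of the column name.
unnamed_columns = df.filter(regex=r'^Unnamed').columns

# Without inplace=True, drop() returns a new DataFrame.
cleaned = df.drop(columns=unnamed_columns)

print(list(cleaned.columns))  # ['bobby', 'hadz', 'com']
print(list(df.columns))       # ['Unnamed: 0', 'bobby', 'hadz', 'com']
```

The original df keeps its Unnamed: 0 column, which can be handy when you still need the unmodified data later.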
Alternatively, you can rename the Unnamed: 0 column by using the DataFrame.rename method.
    import pandas as pd

    df = pd.DataFrame(
        [[1, 2, 3], [4, 5, 6], [7, 8, 8]],
        columns=['bobby', 'hadz', 'com']
    )

    print(df)

    df.to_csv('data.csv', encoding='utf-8')

    df = pd.read_csv(
        'data.csv',
        sep=',',
        encoding='utf-8',
    )

    print('-' * 50)
    print(df)

    # 👇️ Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)

    print('-' * 50)

    df.rename(columns={'Unnamed: 0': 'Example_Name'}, inplace=True)

    print(df)

    # 👇️ Index(['Example_Name', 'bobby', 'hadz', 'com'], dtype='object')
    print(df.columns)
Running the code sample produces the following output.
       bobby  hadz  com
    0      1     2    3
    1      4     5    6
    2      7     8    8
    --------------------------------------------------
       Unnamed: 0  bobby  hadz  com
    0           0      1     2    3
    1           1      4     5    6
    2           2      7     8    8
    Index(['Unnamed: 0', 'bobby', 'hadz', 'com'], dtype='object')
    --------------------------------------------------
       Example_Name  bobby  hadz  com
    0             0      1     2    3
    1             1      4     5    6
    2             2      7     8    8
    Index(['Example_Name', 'bobby', 'hadz', 'com'], dtype='object')
The DataFrame.rename() method renames columns or index labels.
    df.rename(
        columns={'Unnamed: 0': 'Example_Name'},
        inplace=True
    )
When the inplace argument is set to True, the column is renamed in the original DataFrame.
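One thing to watch for when renaming: by default, rename() ignores keys that don't match any column, so a mistyped name fails silently. The sketch below shows both cases using the reassignment form instead of inplace. The 'Unnamed: 5' key is a deliberately wrong, made-up name.

```python
import pandas as pd

# A DataFrame shaped like the one read back from the CSV file.
df = pd.DataFrame(
    [[0, 1, 2, 3], [1, 4, 5, 6]],
    columns=['Unnamed: 0', 'bobby', 'hadz', 'com']
)

# Without inplace=True, rename() returns a new DataFrame.
renamed = df.rename(columns={'Unnamed: 0': 'Example_Name'})

# 'Unnamed: 5' does not exist, so nothing is renamed and no error is raised.
unchanged = df.rename(columns={'Unnamed: 5': 'Example_Name'})

print(list(renamed.columns))    # ['Example_Name', 'bobby', 'hadz', 'com']
print(list(unchanged.columns))  # ['Unnamed: 0', 'bobby', 'hadz', 'com']
```

Printing df.columns after the rename, as in the example above, is a simple way to confirm that the rename took effect.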