Last updated: Apr 10, 2024
Reading time·3 min

The "ValueError: Excel file format cannot be determined, you must specify an engine manually" error occurs for multiple reasons:
~$file_name.xlsx.pd.read_excel() method to read a csv file or the
pd.read_csv() method to read an xlsx file.
File "/home/borislav/anaconda3/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 1656, in __init__ raise ValueError( ValueError: Excel file format cannot be determined, you must specify an engine manually.
The most common cause of the error is that your operating system creates
temporary lock files named ~$your_file_name.xlsx.
If you have the Excel files open in an application, close the application and adjust your code to ignore these files.
import glob import pandas as pd for file in glob.glob('./*.xlsx'): if '~$' in file: continue else: df = pd.read_excel( file, engine='openpyxl' ) print(df)
The code sample assumes that you have an xlsx file located in the same
directory as your Python script, e.g. example.xlsx.

Here is the output of running the python main.py command.

We used the in operator to ignore files that contain ~$ and read the other
files with the xlsx extension.
openpyxl module installedIn order to read .xlsx files, you have to set the engine to
openpyxl, so make sure you have the module
installed.
pip install openpyxl pandas pip3 install openpyxl pandas python -m pip install openpyxl pandas python3 -m pip install openpyxl pandas py -m pip install openpyxl pandas # 👇️ For Jupyter Notebook !pip install openpyxl pandas
The openpyxl library is used to read and write Excel 2010 xlsx, xlsm,
xltx, xltm files.
pandas.read_excel vs pandas.read_csvAnother common cause of the error is trying to read a .csv file using the
pandas.read_excel
method.
You should use the
pandas.read_csv method
when reading .csv files and the pandas.read_excel method when reading
.xlsx and .xls files.
If you try to read a .csv file with the pandas.read_excel() method, the
error is raised.
Similarly, if you try to read a .xlsx or .xls file with the
pandas.read_csv method, the error is raised.
Here is an example of how to check for a file's extension before reading it.
import pandas as pd file_name = 'example.xlsx' if file_name.endswith('.xlsx'): df = pd.read_excel( file_name, engine='openpyxl' ) print(df) elif file_name.endswith('.xls'): df = pd.read_excel( file_name, engine='xlrd' ) print(df) elif file_name.endswith('.csv'): df = pd.read_csv(file_name) print(df)
The file in the example has a .xlsx extension, so the openpyxl engine is
used.

.xlsx extension, we use the pandas.read_excel method and set the engine to openpyxl.If the file has a .xls extension, we use the pandas.read_excel method and
set the engine to xlrd.
If the file has a .csv extension, we use the pandas.read_csv method.
xlrd module installedMake sure to install the xlrd module if you don't have it installed.
pip install xlrd pip3 install xlrd python -m pip install xlrd python3 -m pip install xlrd py -m pip install xlrd # 👇️ For Jupyter Notebook !pip install xlrd
The xlrd module is used for reading and
formatting data from Excel files in the historical .xls format.
Another common cause of the error is mislabeling the file's extension.
Make sure you haven't set a .xls extension to a .csv file and vice versa.
To solve the "ValueError: Excel file format cannot be determined, you must specify an engine manually" error, make sure:
~$, e.g. ~$file_name.xlsx.pandas method when reading .xlsx and .csv files.