Last updated: Apr 13, 2024
The spaCy error "OSError: [E050] Can't find model 'en_core_web_sm'" occurs when you forget to download the model for your spaCy installation before loading it.
To solve the error, download the model by issuing the python -m spacy download en_core_web_sm command.
Here is the complete error message.
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
The first thing you should try is to issue the following command.
python -m spacy download en_core_web_sm

# Or prefix with ! in a Jupyter notebook
!python -m spacy download en_core_web_sm
Depending on your Python installation, you might have to use the python3 or py commands.
# macOS and Linux
python3 -m spacy download en_core_web_sm

# Or py alias (Windows)
py -m spacy download en_core_web_sm
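A common reason the error survives a successful download is that python, python3 and py point at different interpreters, so the model lands in an environment other than the one running your script. The sketch below is one way to sidestep that, assuming spaCy is installed for the interpreter that runs the script. The helper names build_download_command and download_model are made up for illustration.

```python
import subprocess
import sys


def build_download_command(model_name):
    """Return the download command bound to the running interpreter."""
    # sys.executable is the absolute path of the Python running this script,
    # so the model is installed for that interpreter rather than for whatever
    # python, python3 or py happens to resolve to on PATH.
    return [sys.executable, "-m", "spacy", "download", model_name]


def download_model(model_name):
    """Run the download and raise CalledProcessError if it fails."""
    subprocess.run(build_download_command(model_name), check=True)


if __name__ == "__main__":
    # Only prints the command; call download_model() to actually run it.
    print(build_download_command("en_core_web_sm"))
```

Calling download_model("en_core_web_sm") from the same script or notebook that later calls spacy.load() keeps both steps in one environment.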
If you run into issues when downloading en_core_web_sm, paste the following code into your Python script and run it with python my_script.py.
import spacy.cli

spacy.cli.download("en_core_web_sm")
Now load the spaCy model from the installed package.
import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp('bobbyhadz.com')
print(doc)
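The download and load steps can also be combined so that the model is fetched only when loading fails. This is a minimal sketch, not an official spaCy recipe: load_or_download is a hypothetical helper, and its load and download parameters exist only so the logic can be exercised without spaCy installed. By default it falls back to spacy.load and spacy.cli.download.

```python
def load_or_download(name, load=None, download=None):
    """Load a spaCy pipeline, downloading it first if it is missing."""
    if load is None or download is None:
        # Imported lazily so the helper can be defined without spaCy present.
        import spacy
        import spacy.cli

        load = load or spacy.load
        download = download or spacy.cli.download
    try:
        return load(name)
    except OSError:
        # spaCy raises OSError (E050) when it cannot find the model.
        download(name)
        # Retry once; a second failure propagates to the caller.
        return load(name)


if __name__ == "__main__":
    nlp = load_or_download("en_core_web_sm")
    print(nlp("bobbyhadz.com"))
```

In some environments, notably notebooks, a package installed mid-session may only become loadable after the interpreter or kernel restarts, so the retry can still fail on the first run.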
If the error persists, try to issue the following command from your terminal.
python -m spacy download en
In more recent versions of spaCy, shortcuts such as en are deprecated. However, the shortcut may be what you need if you use an older version (spaCy < v3.0).
If the error persists, try to import the en_core_web_sm package.
import en_core_web_sm

nlp = en_core_web_sm.load()

doc = nlp('bobbyhadz.com')
print(doc)
The load() method loads a spaCy model from an installed package. The method returns the loaded nlp object.
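Before importing the package directly, it can help to confirm that the current interpreter can see it at all. The sketch below uses only the standard library. The helper name is_importable is made up for illustration.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("spacy", "en_core_web_sm"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

If spacy is found but en_core_web_sm is missing, the model was most likely downloaded into a different environment than the one running the script.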
If the error persists and you use Jupyter, try to restart the kernel.
You might also need to restart the runtime in Google Colab (or press Ctrl + M).
When you issue the python -m spacy download en_core_web_sm command, spaCy automatically downloads the best-matching version of the model for your spaCy installation.
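If you suspect a version mismatch, you can compare the installed versions yourself. The sketch below reads them with importlib.metadata. It assumes the model was installed as a regular package under the name en_core_web_sm, and that a model's major and minor version numbers are meant to line up with those of spaCy. The helpers installed_version and same_minor are hypothetical. spaCy also ships a python -m spacy validate command that performs a compatibility check for you.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def same_minor(version_a, version_b):
    """Compare only the major and minor parts of two version strings."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


if __name__ == "__main__":
    spacy_version = installed_version("spacy")
    model_version = installed_version("en_core_web_sm")
    print("spaCy:", spacy_version)
    print("model:", model_version)
    if spacy_version and model_version:
        print("major.minor match:", same_minor(spacy_version, model_version))
```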
Alternatively, you can try to pass an absolute path to spacy.load(). To do that, find where the en_core_web_sm module is installed and pass the absolute path to the spacy.load() method.
For example, you could use the __file__ attribute to find where the module is installed.
import en_core_web_sm

# /home/borislav/Desktop/bobbyhadz_python/venv2/lib/python3.11/site-packages/en_core_web_sm/__init__.py
print(en_core_web_sm.__file__)
Notice that the path points to an __init__.py file.
We don't need the __init__.py file. Instead, we need the en_core_web_sm-X.Y.Z directory inside the en_core_web_sm directory, e.g. en_core_web_sm-3.6.0.
Note: my spaCy version is 3.6.0. Your version will likely be different.
For example, for me, the path looks as follows.
import spacy

nlp = spacy.load(
    "/home/borislav/Desktop/bobbyhadz_python/venv2/lib/python3.11/site-packages/en_core_web_sm/en_core_web_sm-3.6.0"
)

doc = nlp('bobbyhadz.com')
print(doc)
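Rather than hard-coding the version in the path, you could look up the versioned directory at runtime. This is a rough sketch that assumes the layout described above, where the data directory sits next to __init__.py and is named after the package plus a version suffix. The helper find_model_data_dir is hypothetical.

```python
from pathlib import Path


def find_model_data_dir(init_file):
    """Return the versioned data directory next to __init__.py, or None."""
    package_dir = Path(init_file).resolve().parent
    # Look for sub-directories named <package>-<version>,
    # e.g. en_core_web_sm-3.6.0 inside en_core_web_sm.
    candidates = sorted(
        path
        for path in package_dir.glob(f"{package_dir.name}-*")
        if path.is_dir()
    )
    # Normally there is a single match; pick the last one if there are more.
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    import en_core_web_sm
    import spacy

    model_dir = find_model_data_dir(en_core_web_sm.__file__)
    nlp = spacy.load(model_dir)
    print(nlp("bobbyhadz.com"))
```

This only helps when import en_core_web_sm itself succeeds. If the import fails, the package is not installed for the current interpreter and there is no path to find.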
If you are on Windows, make sure to prefix the path with r to mark it as a raw string.
import spacy

nlp = spacy.load(
    r'C:\Users\YourUser\Desktop\project\en_core_web_sm\en_core_web_sm-3.6.0'
)

doc = nlp('bobbyhadz.com')
print(doc)
Strings that are prefixed with r are called raw strings and treat backslashes as literal characters.
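To see why the prefix matters, compare the same Windows-style path written with and without it. The path below is made up for illustration. It contains \n and \t, which a regular string literal turns into a newline and a tab.

```python
# Regular string: \n and \t are interpreted as escape sequences.
plain = "C:\new_project\table"

# Raw string: every backslash is kept as a literal character.
raw = r"C:\new_project\table"

# repr() makes the difference visible.
print(repr(plain))
print(repr(raw))

# The raw string keeps both backslashes, the regular string keeps none.
print(raw.count("\\"), plain.count("\\"))
```

Forward slashes, or a pathlib.Path built from a raw string, are alternatives that avoid the escaping problem.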