OSError: [E050] Can't find model 'en_core_web_sm'

Borislav Hadzhiev

Last updated: Apr 13, 2024


# OSError: [E050] Can't find model 'en_core_web_sm'

The spaCy error "OSError: [E050] Can't find model 'en_core_web_sm'" occurs when you try to load the en_core_web_sm model before it has been downloaded into the Python environment you are running.

To solve the error, download the model by issuing the python -m spacy download en_core_web_sm command.

Here is the complete error message.

shell
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

The first thing you should try is to issue the following command.

shell
python -m spacy download en_core_web_sm

# Or prefix with ! in Jupyter notebook
!python -m spacy download en_core_web_sm


Depending on your Python installation, you might have to use the python3 or py commands.

shell
# macOS and Linux
python3 -m spacy download en_core_web_sm

# Or py alias (Windows)
py -m spacy download en_core_web_sm
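If you are not sure which of python, python3 or py points at the environment your script uses, one option is to start the download from Python itself. The sketch below relies on sys.executable, which is the path of the interpreter that is currently running. The helper names build_download_command and download_model are made up for illustration and are not part of spaCy.

```python
import subprocess
import sys


def build_download_command(model_name):
    """Return the spaCy download command for the running interpreter."""
    # sys.executable is the interpreter running this script, so the model
    # is installed into the same environment the script uses.
    return [sys.executable, "-m", "spacy", "download", model_name]


def download_model(model_name):
    """Run the download command and raise CalledProcessError on failure."""
    subprocess.run(build_download_command(model_name), check=True)


print(build_download_command("en_core_web_sm"))
```

Calling download_model("en_core_web_sm") then runs the same command you would otherwise type in the terminal.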


If you run into issues when downloading en_core_web_sm from the command line, paste the following code into a Python script and run it, e.g. with python main.py.

main.py
import spacy.cli

spacy.cli.download("en_core_web_sm")
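If you don't want to download the model on every run, you could first check whether the package is already importable. This is a minimal sketch, assuming the model is installed as a regular Python package with the same name. The helpers is_model_installed and ensure_model are hypothetical names, not spaCy functions.

```python
import importlib.util


def is_model_installed(model_name):
    """Return True if a package with this name can be imported."""
    return importlib.util.find_spec(model_name) is not None


def ensure_model(model_name):
    """Download the model only when it is not installed yet."""
    if not is_model_installed(model_name):
        # Imported lazily so the check above works even without spaCy.
        import spacy.cli

        spacy.cli.download(model_name)
```

You would call ensure_model("en_core_web_sm") before loading the model. After a fresh download, you may still have to restart the interpreter before the package can be loaded.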

Now load the spaCy model from the installed package.

main.py
import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp('bobbyhadz.com')

print(doc)


If the error persists, try to issue the following command from your terminal.

shell
python -m spacy download en

In more recent versions of spaCy, shortcuts such as en are deprecated. However, the shortcut may be what you need if you use an older version (spaCy < v3.0).

If the error persists, try to import the en_core_web_sm package.

main.py
import en_core_web_sm

nlp = en_core_web_sm.load()

doc = nlp('bobbyhadz.com')

print(doc)


The load() method loads a spaCy model from an installed package.

The method returns the loaded nlp object.
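The two approaches above (spacy.load() and the package's own load()) can also be tried one after the other. The following is only a sketch: load_model is a hypothetical helper, not part of spaCy, and it assumes the error is raised as an OSError, as in the message at the top of the article.

```python
import importlib


def load_model(model_name):
    """Try spacy.load() first, then fall back to importing the package."""
    import spacy  # imported here so the module itself has no hard dependency

    try:
        return spacy.load(model_name)
    except OSError:
        # E050 is an OSError; fall back to the model package's own load().
        try:
            package = importlib.import_module(model_name)
        except ImportError as exc:
            raise RuntimeError(
                f"Model {model_name!r} is not installed. "
                f"Run: python -m spacy download {model_name}"
            ) from exc
        return package.load()
```

If both attempts fail, the RuntimeError names the command that installs the missing model.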

If the error persists and you use Jupyter, try to restart the Kernel.

You might also need to restart the runtime in Google Colab (or press Ctrl + M and then the period key).

When you issue the python -m spacy download en_core_web_sm command, spaCy automatically downloads the best-matching version of the model for your spaCy installation.
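If you want to check which versions ended up installed, you can compare the version of spaCy with the version of the model package. This is a rough sketch, assuming the first two components of the model version are meant to match spaCy's (e.g. a 3.6.x model for spaCy 3.6.x). The helper names are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def same_major_minor(spacy_version, model_version):
    """Compare only the first two version components, e.g. 3.6 vs 3.6."""
    return spacy_version.split(".")[:2] == model_version.split(".")[:2]


def check_installed_versions(model_name="en_core_web_sm"):
    """Return (spacy_version, model_version, compatible) or None if missing."""
    try:
        spacy_version = version("spacy")
        model_version = version(model_name)
    except PackageNotFoundError:
        # Either spaCy or the model is not installed in this environment.
        return None
    compatible = same_major_minor(spacy_version, model_version)
    return spacy_version, model_version, compatible
```

spaCy also ships a python -m spacy validate command that reports whether the installed models are compatible with your spaCy version.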

# Passing an absolute path to spacy.load

Alternatively, you can try to pass an absolute path to spacy.load().

  1. Find the directory where the en_core_web_sm module is installed.
  2. Copy the path.
  3. Pass it to the spacy.load() method.

For example, you could use the __file__ attribute to find where the module is installed.

main.py
import en_core_web_sm

# /home/borislav/Desktop/bobbyhadz_python/venv2/lib/python3.11/site-packages/en_core_web_sm/__init__.py
print(en_core_web_sm.__file__)

Notice that the path points to an __init__.py file.

We don't need the __init__.py file. Instead, we need the en_core_web_sm-X.Y.Z directory inside the en_core_web_sm directory, e.g. en_core_web_sm-3.6.0.

Note: my installed version is 3.6.0, but yours will likely be different.

For example, for me, the path looks as follows.

main.py
import spacy

nlp = spacy.load(
    "/home/borislav/Desktop/bobbyhadz_python/venv2/lib/python3.11/site-packages/en_core_web_sm/en_core_web_sm-3.6.0"
)

doc = nlp('bobbyhadz.com')

print(doc)
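Instead of copying the path by hand, you could build it from the __file__ attribute. The sketch below assumes the layout described above, i.e. a versioned en_core_web_sm-X.Y.Z directory next to __init__.py. The find_model_data_dir helper is a hypothetical name used only for this example.

```python
from pathlib import Path


def find_model_data_dir(init_file, model_name="en_core_web_sm"):
    """Return the <model_name>-X.Y.Z directory next to __init__.py, or None."""
    package_dir = Path(init_file).parent
    # Keep only directories whose name starts with "<model_name>-".
    candidates = sorted(
        p for p in package_dir.glob(f"{model_name}-*") if p.is_dir()
    )
    return candidates[-1] if candidates else None


# Possible usage, once the model package is installed:
#   import en_core_web_sm, spacy
#   nlp = spacy.load(find_model_data_dir(en_core_web_sm.__file__))
```

The function returns None when no matching directory exists, so check the result before passing it to spacy.load().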


If you are on Windows, make sure to prefix the path with r to mark it as a raw string.

main.py
import spacy

nlp = spacy.load(
    r'C:\Users\YourUser\Desktop\project\en_core_web_sm\en_core_web_sm-3.6.0'
)

doc = nlp('bobbyhadz.com')

print(doc)

Strings that are prefixed with r are called raw strings and treat backslashes as literal characters.
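Here is a short illustration of the difference. The C:\new_folder path is made up for the example. It was chosen because \n is an escape sequence in a normal string.

```python
# In a normal string, "\n" is an escape sequence for a newline.
normal = "C:\new_folder"

# In a raw string, the backslash and the "n" stay two separate characters.
raw = r"C:\new_folder"

print("\n" in normal)  # True
print("\n" in raw)  # False
print(len(normal), len(raw))  # 12 13
```

The raw string is one character longer because the backslash is kept as its own character.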
