UnicodeEncodeError: 'ascii' codec can't encode character in position

Borislav Hadzhiev

Last updated: May 2, 2022

The Python "UnicodeEncodeError: 'ascii' codec can't encode character in position" occurs when we use the ascii codec to encode a string that contains non-ASCII characters. To solve the error, specify an encoding that can represent those characters, e.g. utf-8.

Here is an example of how the error occurs.

main.py
my_str = 'one ф'

# ⛔️ UnicodeEncodeError: 'ascii' codec can't encode character '\u0444' in position 4: ordinal not in range(128)
my_bytes = my_str.encode('ascii')

The error is caused because the string contains non-ASCII characters.
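
To find out which character triggers the error, one option is to catch the exception and inspect its attributes. The snippet below is only a sketch: the `details` variable is a name chosen for illustration, and `str.isascii()` assumes Python 3.7 or newer.

```python
my_str = 'one ф'

# str.isascii() returns True only if every character is in the ASCII range
print(my_str.isascii())  # 👉️ False

details = None
try:
    my_str.encode('ascii')
except UnicodeEncodeError as err:
    # err.start and err.end are indexes into the original string (err.object)
    details = (err.encoding, err.start, err.object[err.start:err.end])

print(details)  # 👉️ ('ascii', 4, 'ф')
```

The position reported in the error message is the same index that `err.start` holds.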

To solve the error, use the correct encoding to encode the string, e.g. utf-8.

main.py
my_str = 'one ф'

my_bytes = my_str.encode('utf-8')

print(my_bytes)  # 👉️ b'one \xd1\x84'

The utf-8 encoding is capable of encoding over a million valid character code points in Unicode.
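
utf-8 covers that range by using between one and four bytes per character. As a rough illustration (the sample characters are arbitrary picks, not from any particular dataset):

```python
samples = ['a', 'ф', '€', '😀']

# Map each character to the number of bytes utf-8 needs for it
lengths = {ch: len(ch.encode('utf-8')) for ch in samples}

print(lengths)  # 👉️ {'a': 1, 'ф': 2, '€': 3, '😀': 4}
```

ASCII characters still take a single byte, which is why the `one ` part of the earlier output is unchanged.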

If you got the error when opening a file, set the encoding keyword argument to utf-8 in the call to the open() function, e.g. with open('example.txt', 'w', encoding='utf-8').

You can view all of the standard encodings in the "Standard Encodings" table of the official docs for the codecs module.

Encoding is the process of converting a string to a bytes object and decoding is the process of converting a bytes object to a string.

Here is what the complete process looks like.

main.py
my_str = 'one ф'

# 👇️ encode str to bytes
my_bytes = my_str.encode('utf-8')
print(my_bytes)  # 👉️ b'one \xd1\x84'

# 👇️ decode bytes to str
my_str_again = my_bytes.decode('utf-8')
print(my_str_again)  # 👉️ "one ф"

When decoding a bytes object, we have to use the same encoding that was used to encode the string to a bytes object.
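
Here is a small sketch of what can go wrong when the encodings don't match. The latin-1 codec is used purely as an example of a wrong choice; the variable names are illustrative.

```python
my_str = 'one ф'
my_bytes = my_str.encode('utf-8')

# Decoding with a different codec may not raise, but the text comes out garbled
garbled = my_bytes.decode('latin-1')
print(garbled == my_str)  # 👉️ False

# Decoding with ascii raises, because the bytes 0xd1 and 0x84 are above 127
try:
    my_bytes.decode('ascii')
except UnicodeDecodeError as err:
    print(err.reason)  # 👉️ ordinal not in range(128)

# Decoding with the original encoding restores the string
restored = my_bytes.decode('utf-8')
print(restored == my_str)  # 👉️ True
```

The silent case is the more dangerous one, since no exception points at the problem.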

If the error persists when using the utf-8 encoding (in practice this only happens when the string contains lone surrogate code points), try setting the errors keyword argument to ignore so that characters that cannot be encoded are skipped.

main.py
my_str = 'one ф'

# 👇️ encode str to bytes
my_bytes = my_str.encode('utf-8', errors='ignore')
print(my_bytes)  # 👉️ b'one \xd1\x84'

# 👇️ decode bytes to str
my_str_again = my_bytes.decode('utf-8', errors='ignore')
print(my_str_again)  # 👉️ "one ф"

Note that ignoring characters that cannot be encoded can lead to data loss.
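
For reference, the sketch below shows a string that utf-8 refuses to encode and what `ignore` does with it. The string with a lone surrogate (`\ud800`) is a contrived example; strings like this typically come from badly decoded input rather than from normal text.

```python
lone_surrogate = 'one \ud800'

reason = None
try:
    lone_surrogate.encode('utf-8')
except UnicodeEncodeError as err:
    reason = err.reason

print(reason)  # 👉️ surrogates not allowed

# With errors='ignore' the offending character is silently dropped
lossy = lone_surrogate.encode('utf-8', errors='ignore')
print(lossy)  # 👉️ b'one '
```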

You can also try using the ascii encoding with errors set to ignore, which drops any non-ASCII characters.

main.py
my_str = 'one ф'

# 👇️ encode str to bytes
my_bytes = my_str.encode('ascii', errors='ignore')
print(my_bytes)  # 👉️ b'one '

# 👇️ decode bytes to str
my_str_again = my_bytes.decode('ascii', errors='ignore')
print(my_str_again)  # 👉️ "one " (the trailing space is kept)

Notice that the last character (which is a non-ASCII character) got dropped when we encoded the string into bytes.
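
If dropping characters silently is not acceptable, the errors argument accepts other standard handlers that leave a visible trace instead. A quick comparison, offered as a sketch rather than a recommendation for any particular handler:

```python
my_str = 'one ф'

# 'replace' substitutes a question mark for each unencodable character
replaced = my_str.encode('ascii', errors='replace')
print(replaced)  # 👉️ b'one ?'

# 'backslashreplace' keeps the code point as a Python-style escape
escaped = my_str.encode('ascii', errors='backslashreplace')
print(escaped)  # 👉️ b'one \\u0444'

# 'xmlcharrefreplace' writes an XML/HTML numeric character reference
xml_refs = my_str.encode('ascii', errors='xmlcharrefreplace')
print(xml_refs)  # 👉️ b'one &#1092;'
```

The last two handlers keep enough information to recover the original character later, whereas ignore and replace do not.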

If you got the error when opening a file, open the file with encoding set to utf-8.

main.py
my_str = 'one ф'

# 👇️ set encoding to utf-8
with open('example.txt', 'w', encoding='utf-8') as f:
    f.write(my_str)
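
When the encoding argument is omitted, open() falls back to a platform-dependent default, which is how the ascii codec can end up being used without being requested. The same encoding should also be passed when reading the file back. The sketch below writes to a temporary directory so it leaves no files behind; the file name and the `read_back` variable are illustrative.

```python
import os
import tempfile

my_str = 'one ф'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')

    # Write with an explicit encoding
    with open(path, 'w', encoding='utf-8') as f:
        f.write(my_str)

    # Read with the same explicit encoding
    with open(path, 'r', encoding='utf-8') as f:
        read_back = f.read()

print(read_back == my_str)  # 👉️ True
```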

You can also set the errors keyword argument to ignore when opening a file, so that characters that cannot be encoded are skipped when writing.

main.py
my_str = 'one ф'

with open('example.txt', 'w', encoding='utf-8', errors='ignore') as f:
    f.write(my_str)

Conclusion #

The Python "UnicodeEncodeError: 'ascii' codec can't encode character in position" occurs when we use the ascii codec to encode a string that contains non-ASCII characters. To solve the error, specify an encoding that can represent those characters, e.g. utf-8.