UnicodeEncodeError: 'charmap' codec can't encode characters in position

Borislav Hadzhiev

Last updated: May 2, 2022


The Python "UnicodeEncodeError: 'charmap' codec can't encode characters in position" occurs when we encode a string to bytes with a codec that cannot represent one or more of its characters. To solve the error, specify an encoding that can represent them, e.g. utf-8, when opening the file or encoding the string.

Here is an example of how the error occurs.

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

# ⛔️ UnicodeEncodeError: 'charmap' codec can't encode characters in position 6-14: character maps to <undefined>
my_bytes = my_str.encode('cp856')

The error occurs because the specified encoding, cp856, has no mapping for the non-ASCII characters in the string.
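When the message only gives a position range, it can help to see exactly which characters are the problem. The following is a minimal sketch of one way to do that; `find_unencodable` is a hypothetical helper written for this illustration, not a built-in function.

```python
def find_unencodable(text: str, encoding: str) -> list[tuple[int, str]]:
    """Return (index, character) pairs that `encoding` cannot represent."""
    problems = []
    for index, char in enumerate(text):
        try:
            # Encode one character at a time to isolate the failures.
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char))
    return problems


my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

bad = find_unencodable(my_str, 'cp856')
print(bad)  # the characters after 'hello ' and their indexes

print(find_unencodable(my_str, 'utf-8'))  # [] because utf-8 can encode them all
```

The indexes in the result line up with the position range reported in the error message.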

To solve the error, use the correct encoding to encode the string, e.g. utf-8.

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

my_bytes = my_str.encode('utf-8')

# 👇️ b'hello \xf0\x9d\x98\x88\xe1\xb8\x86\xf0\x9d\x96\xa2\xf0\x9d\x95\xaf\xd9\xa4\xe1\xb8\x9e\xd4\x8d\xd0\x9d\xc7\x8f'
print(my_bytes)

The utf-8 encoding is capable of encoding over a million valid character code points in Unicode.

You can view all of the standard encodings in this table of the official docs.
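If you are unsure whether an encoding name is one Python recognizes, you can ask the codec registry. This is a small sketch using `codecs.lookup()` from the standard library; `is_known_encoding` is a hypothetical helper name chosen for this example.

```python
import codecs


def is_known_encoding(name: str) -> bool:
    """Return True if Python can find a codec registered under `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Raised when no codec is registered under that name.
        return False
    return True


print(is_known_encoding('utf-8'))             # True
print(is_known_encoding('cp856'))             # True
print(is_known_encoding('not-a-real-codec'))  # False
```

Note that a name being recognized only means the codec exists; it says nothing about whether that codec can represent the characters in your string.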

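If you got the error when writing to a file, set the encoding keyword argument to utf-8 in the call to the open() function. When the encoding argument is omitted, open() uses a platform-dependent default, which on Windows is often a legacy code page such as cp1252.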

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

with open('example.txt', 'w', encoding='utf-8') as f:
    f.write(my_str)
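
To check that the file really contains what was written, you can read it back with the same encoding. The sketch below assumes a temporary directory so that it does not touch any real files; the `path` and `read_back` names are only for illustration.

```python
import os
import tempfile

my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')

    with open(path, 'w', encoding='utf-8') as f:
        f.write(my_str)

    # Read with the same encoding that was used for writing.
    with open(path, 'r', encoding='utf-8') as f:
        read_back = f.read()

print(read_back == my_str)  # True
```
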
Encoding is the process of converting a string to a bytes object and decoding is the process of converting a bytes object to a string.

Here is what the complete process looks like.

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

# 👇️ encode str to bytes
my_bytes = my_str.encode('utf-8')
print(my_bytes)

# 👇️ decode bytes to str
my_str_again = my_bytes.decode('utf-8')
print(my_str_again)  # 👉️ "hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ"

When decoding a bytes object, we have to use the same encoding that was used to encode the string to a bytes object.
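To see why the encodings have to match, here is a short sketch that decodes the same bytes twice, once with the encoding that produced them and once with a different one. The latin-1 codec is picked only as an example of a mismatch; it never raises on decode, so the mistake shows up as garbled text instead of an exception.

```python
original = 'é'
my_bytes = original.encode('utf-8')  # two bytes: b'\xc3\xa9'

right = my_bytes.decode('utf-8')
wrong = my_bytes.decode('latin-1')  # treats each byte as its own character

print(right)  # é
print(wrong)  # Ã©
```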

If the error persists when using the utf-8 encoding, try setting the errors keyword argument to ignore, which drops any characters that cannot be encoded.

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

# 👇️ encode str to bytes
my_bytes = my_str.encode('utf-8', errors='ignore')
print(my_bytes)

# 👇️ decode bytes to str
my_str_again = my_bytes.decode('utf-8', errors='ignore')
print(my_str_again)  # 👉️ "hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ"

Note that ignoring characters that cannot be encoded can lead to data loss.
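The data loss is easier to see with an encoding that cannot represent one of the characters. The sketch below uses ascii as the target only to make the effect visible, and compares ignore with two other built-in error handlers, replace and backslashreplace, which leave a visible trace of the character that could not be encoded.

```python
text = 'héllo'

ignored = text.encode('ascii', errors='ignore')
replaced = text.encode('ascii', errors='replace')
escaped = text.encode('ascii', errors='backslashreplace')

print(ignored)   # b'hllo'  (the character is silently dropped)
print(replaced)  # b'h?llo'  (a question mark takes its place)
print(escaped)   # b'h\\xe9llo'  (a backslash escape takes its place)
```

If you need to know later that something was lost, replace or backslashreplace may be a safer choice than ignore.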

You can also set the errors keyword argument to ignore when opening a file to skip any characters that cannot be encoded.

main.py
my_str = 'hello 𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

with open('example.txt', 'w', encoding='utf-8', errors='ignore') as f:
    f.write(my_str)