How to Remove \xa0 from a String in Python

avatar
Borislav Hadzhiev

Last updated: Apr 9, 2024
4 min

banner

# Table of Contents

  1. Remove \xa0 from a string in Python
  2. Remove \xa0 from a string using str.replace()
  3. Remove \xa0 from a string using split() and join()
  4. Remove \xa0 from a string using BeautifulSoup
  5. Remove \xa0 from a List of Strings in Python

# Remove \xa0 from a string in Python

Use the unicodedata.normalize() method to remove \xa0 from a string.

The unicodedata.normalize method returns the normal form for the provided Unicode string by replacing all compatibility characters with their equivalents.

main.py
import unicodedata my_str = 'bobby\xa0hadz' result = unicodedata.normalize('NFKD', my_str) print(result) # ๐Ÿ‘‰๏ธ 'bobby hadz'

remove xa0 from string

The code for this article is available on GitHub
The \xa0 character represents a non-breaking space, so the way to remove it from a string is to replace it with a space.

The unicodedata.normalize() method returns the normal form for the provided Unicode string.

The first argument is the form - NFKD in our case. The normal form NFDK replaces all compatibility characters with their equivalents.

Since the equivalent of the \xa0 character is a space, it gets replaced by a space.

If you get unexpected results when using the NFKD form, try using one of NFC, NFKC and NFD.

The NFKC form first applies compatibility decomposition, then canonical decomposition.

main.py
import unicodedata my_str = 'bobby\xa0hadz' result = unicodedata.normalize('NFKC', my_str) print(result) # ๐Ÿ‘‰๏ธ 'bobby hadz'

# Remove \xa0 from a string using str.replace()

Alternatively, you can use the str.replace() method.

main.py
my_str = 'bobby\xa0hadz' result = my_str.replace('\xa0', ' ') print(result) # ๐Ÿ‘‰๏ธ 'bobby hadz'

remove xa0 from string using str replace

The code for this article is available on GitHub

Since the \xa0 character represents a non-breaking space, we can simply replace it with a space.

The str.replace() method returns a copy of the string with all occurrences of a substring replaced by the provided replacement.

The method takes the following parameters:

NameDescription
oldThe substring we want to replace in the string
newThe replacement for each occurrence of old
countOnly the first count occurrences are replaced (optional)

Note that the method doesn't change the original string. Strings are immutable in Python.

# Remove \xa0 from a string using split() and join()

You can also use the str.split() and str.join() methods to remove the \xa0 characters from a string.

main.py
my_str = 'bobby\xa0hadz' result = ' '.join(my_str.split()) print(result) # ๐Ÿ‘‰๏ธ bobby hadz

remove xa0 using split and join

The code for this article is available on GitHub

We used the str.split() method to split the string on each whitespace character.

main.py
my_str = 'bobby\xa0hadz' # ๐Ÿ‘‡๏ธ ['bobby', 'hadz'] print(my_str.split())
When no separator is passed to the str.split() method, it splits the input string on one or more whitespace characters.

We could have also explicitly passed the \xa0 character in the call to the split() method.

main.py
my_str = 'bobby\xa0hadz' result = ' '.join(my_str.split('\xa0')) print(result) # ๐Ÿ‘‰๏ธ bobby hadz

The last step is to join the list of strings with a space separator.

main.py
my_str = 'bobby\xa0hadz' # ๐Ÿ‘‡๏ธ bobby hadz print(' '.join(my_str.split('\xa0')))

The str.join() method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.

# Remove \xa0 from a string using BeautifulSoup

If you use the BeautifulSoup4 module, use the get_text() method to remove the \xa0 characters from a string.

main.py
from bs4 import BeautifulSoup my_html = 'bobby\xa0hadz' result = BeautifulSoup(my_html, 'lxml').get_text(strip=True) print(result) # ๐Ÿ‘‰๏ธ bobby hadz
The code for this article is available on GitHub

Make sure you have beautifulsoup4 and lxml installed in order to run the code snippet.

shell
pip install lxml pip install beautifulsoup4 # ๐Ÿ‘‡๏ธ or with pip3 pip3 install lxml pip3 install beautifulsoup4

# Remove \xa0 from a List of Strings in Python

To remove \xa0 characters from a list of strings:

  1. Use a list comprehension to iterate over the list.
  2. On each iteration, use the str.replace() method to replace occurrences of \xa0 with a space.
  3. The strings in the new list won't contain any \xa0 characters.
main.py
my_list = ['bobby\xa0', '\xa0hadz'] result = [string.replace('\xa0', ' ') for string in my_list] print(result) # ๐Ÿ‘‰๏ธ ['bobby ', ' hadz']
The code for this article is available on GitHub

We used a list comprehension to iterate over the list.

List comprehensions are used to perform some operation for every element, or select a subset of elements that meet a condition.

On each iteration, we replace occurrences of the \xa0 character with a space and return the result.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.
book cover
You can use the search field on my Home Page to filter through all of my articles.

Copyright ยฉ 2024 Borislav Hadzhiev