Split a string on all special characters in Python

Borislav Hadzhiev

Thu Jun 23 2022


Use the re.split() function to split a string on all special characters. It takes a pattern and a string and splits the string on each occurrence of the pattern.

main.py
import re

my_str = "hello<one!two>three.four!five'six"

my_list = re.split(r'[`!@#$%^&*()_+\-=\[\]{};\':"\\|,.<>\/?~]', my_str)

# 👇️ ['hello', 'one', 'two', 'three', 'four', 'five', 'six']
print(my_list)

We used re.split() to split the string on each occurrence of a special character.
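One detail worth knowing: when two special characters appear back to back, or the string starts or ends with one, re.split() returns empty strings in the result. Here is a minimal sketch of one way to handle that, assuming you want to drop the empty items. The sample string and the variable names are made up for illustration.

```python
import re

# Hypothetical sample with adjacent, leading and trailing delimiters
my_str = "!hello<<one!>two."

# Each delimiter splits on its own, so empty strings appear
raw_parts = re.split(r'[!<>.]', my_str)
print(raw_parts)  # ['', 'hello', '', 'one', '', 'two', '']

# A + after the set treats a run of delimiters as one split point;
# the list comprehension drops the empty strings left at the edges
parts = [part for part in re.split(r'[!<>.]+', my_str) if part]
print(parts)  # ['hello', 'one', 'two']
```

The + quantifier alone is not enough when the string begins or ends with a delimiter, which is why the sketch also filters out empty strings.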

The square brackets are used to indicate a set of characters.

Make sure that all characters you consider special characters are in the set.

You can add or remove characters according to your use case.
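If typing out and escaping every character by hand feels error-prone, one possible alternative is to build the set from a plain string. The sketch below assumes that the ASCII punctuation in string.punctuation is a good enough definition of "special characters" for your use case; swap in your own string if it is not.

```python
import re
import string

my_str = "hello<one!two>three.four!five'six"

# string.punctuation holds the ASCII punctuation characters;
# re.escape() backslash-escapes the ones that are special in a regex
pattern = '[' + re.escape(string.punctuation) + ']'

my_list = re.split(pattern, my_str)
print(my_list)  # ['hello', 'one', 'two', 'three', 'four', 'five', 'six']
```

Because re.escape() handles characters such as ], \, ^ and -, you can change the source string without worrying about breaking the set.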

Alternatively, you can use a regular expression that matches any character that is not an ASCII letter, a digit or a whitespace character.

main.py
import re

my_str = "hello<one!two>three.four!five'six"

my_list = re.split(r'[^a-zA-Z0-9\s]', my_str)

# 👇️ ['hello', 'one', 'two', 'three', 'four', 'five', 'six']
print(my_list)

The caret ^ at the beginning of the set means "NOT". In other words, match all characters that are NOT lowercase letters a-z, uppercase letters A-Z, digits 0-9 or whitespace \s characters.
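Note that the ranges a-z and A-Z only cover ASCII letters, so accented letters and the underscore are also treated as special characters. The sketch below compares the set with a variant built on \w, assuming you would rather keep Unicode letters and underscores. The sample string is made up for illustration.

```python
import re

# "\u00e9" is the letter é, written as an escape to keep the example unambiguous
my_str = "caf\u00e9!bar_baz"

# ASCII-only set: 'é' and '_' count as special characters
ascii_parts = re.split(r'[^a-zA-Z0-9\s]', my_str)
print(ascii_parts)  # ['caf', '', 'bar', 'baz']

# \w matches Unicode letters, digits and the underscore in str patterns
unicode_parts = re.split(r'[^\w\s]', my_str)
print(unicode_parts)  # ['café', 'bar_baz']
```

Which variant fits depends on your data: the ASCII set is stricter, while the \w version is friendlier to non-English text.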

You can add any characters that you don't want to match between the square brackets of the regular expression.
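For example, here is a small sketch that keeps apostrophes and hyphens inside words by adding them to the negated set. The sample string is hypothetical.

```python
import re

my_str = "it's well-known!right?yes"

# The apostrophe and hyphen are added to the negated set, so they
# are no longer treated as delimiters (the hyphen goes last so it
# is read literally rather than as a range)
kept = re.split(r"[^a-zA-Z0-9\s'-]", my_str)
print(kept)  # ["it's well-known", 'right', 'yes']
```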

You can tweak the regular expression according to your use case. The "Regular Expression Syntax" section of the official docs for the re module explains what each special character does.
