Last updated: Apr 10, 2024
Reading time · 4 min
To split a string with multiple delimiters:

1. Use the re.split() method, e.g. re.split(r',|-', my_str).
2. The re.split() method will split the string on all occurrences of one of the delimiters.

    import re

    # split string with 2 delimiters
    my_str = 'bobby,hadz-dot,com'

    my_list = re.split(r',|-', my_str)  # split on comma or hyphen
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
The re.split() method takes a pattern and a string and splits the string on each occurrence of the pattern.
The pipe | character is an OR. Either match A or B.
The example splits a string using 2 delimiters - a comma and a hyphen.
Here is an example that splits the string using 3 delimiters - a comma, a hyphen and a colon.

    import re

    # split string with 3 delimiters
    my_str = 'bobby,hadz-dot:com'

    my_list = re.split(r',|-|:', my_str)  # comma, hyphen or colon
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
You can use as many | characters as necessary in your regular expression.
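The alternatives separated by | do not have to be single characters. The sketch below is an illustration only (the sample string and the chosen delimiters are assumptions, not taken from the examples above) and splits on either a comma followed by a space or the word "and" surrounded by spaces.

```python
import re

# Hypothetical sample string with two multi-character delimiters.
my_str = 'bobby, hadz and com'

# Split on ', ' or on ' and '.
my_list = re.split(r', | and ', my_str)

print(my_list)  # ['bobby', 'hadz', 'com']
```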
Alternatively, you can use square brackets [] to indicate a set of characters.

    import re

    my_str = 'bobby,hadz-dot,com'

    my_list = re.split(r'[,-]', my_str)
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
Make sure to add all of the delimiters between the square brackets.
    import re

    # split string with 3 delimiters
    my_str = 'bobby,hadz-dot:com'

    # the hyphen is placed last so it is matched literally
    my_list = re.split(r'[,:-]', my_str)
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
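The position of the hyphen inside the square brackets matters. Between two characters it denotes a range, so r'[,-:]' matches every character from ',' to ':' in code-point order, which includes '.', '/' and the digits. Placing the hyphen first or last (or escaping it as \-) makes it a literal. The sample string below is a made-up illustration of the difference.

```python
import re

# Hypothetical sample string that contains a digit.
sample = 'bobby2,hadz-com'

# Hyphen between ',' and ':' forms a range, so the digit 2 also acts as a delimiter.
as_range = re.split(r'[,-:]', sample)

# Hyphen at the end of the set is a literal hyphen.
as_literal = re.split(r'[,:-]', sample)

print(as_range)    # ['bobby', '', 'hadz', 'com']
print(as_literal)  # ['bobby2', 'hadz', 'com']
```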
You might get empty string values in the output list if the string starts with or ends with one of the delimiters.
You can use a list comprehension to remove any empty strings from the list.
    import re

    # split string with 3 delimiters
    my_str = ',bobby,hadz-dot:com:'

    # the hyphen is placed last so it is matched literally
    my_list = [
        item for item in re.split(r'[,:-]', my_str)
        if item
    ]
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
The list comprehension takes care of removing the empty strings from the list.
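Empty strings also appear when two delimiters sit next to each other. One possible way to handle that case, shown here as a sketch with an assumed sample string, is to add the + quantifier so that a run of delimiters counts as a single split point. A delimiter at the very start or end of the string still produces an empty string, so the list comprehension remains useful there.

```python
import re

# Hypothetical sample string with doubled delimiters.
my_str = 'bobby,,hadz--dot::com'

# Each delimiter splits on its own, leaving empty strings in between.
separate = re.split(r'[,:-]', my_str)
print(separate)  # ['bobby', '', 'hadz', '', 'dot', '', 'com']

# The + quantifier treats one or more consecutive delimiters as one.
collapsed = re.split(r'[,:-]+', my_str)
print(collapsed)  # ['bobby', 'hadz', 'dot', 'com']
```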
An alternative approach is to use the str.replace() method.

This is a two-step process:

1. Use the str.replace() method to replace the first delimiter with the second.
2. Use the str.split() method to split the string by the second delimiter.

    my_str = 'bobby_hadz!dot_com'

    my_list = my_str.replace('_', '!').split('!')
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com']
First, we replace every occurrence of the first delimiter with the second, and then we split on the second delimiter.
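The same two steps extend to any number of delimiters. The helper below, split_by_replacing, is a hypothetical name introduced for illustration: it rewrites every delimiter to the first one in the list and then splits once. It assumes the list contains at least one delimiter.

```python
def split_by_replacing(string, delimiters):
    """Hypothetical helper: normalise all delimiters to the first one, then split."""
    first, *rest = delimiters  # raises ValueError if delimiters is empty
    for delimiter in rest:
        # Step 1: replace each remaining delimiter with the first one.
        string = string.replace(delimiter, first)
    # Step 2: split on the single delimiter that is left.
    return string.split(first)


result = split_by_replacing('bobby_hadz!dot_com', ['!', '_'])
print(result)  # ['bobby', 'hadz', 'dot', 'com']
```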
The str.replace() method returns a copy of the string with all occurrences of a substring replaced by the provided replacement.
The method takes the following parameters:
Name | Description |
---|---|
old | The substring we want to replace in the string |
new | The replacement for each occurrence of old |
count | Only the first count occurrences are replaced (optional) |
Note that the method doesn't change the original string. Strings are immutable in Python.
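A short sketch of the optional count parameter and of the fact that the original string is left untouched. The sample string is an assumption chosen for illustration.

```python
my_str = 'bobby_hadz_dot_com'

# Only the first occurrence is replaced because count is 1.
first_only = my_str.replace('_', '!', 1)
print(first_only)  # bobby!hadz_dot_com

# str.replace() returned a new string; the original is unchanged.
print(my_str)  # bobby_hadz_dot_com
```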
Here is another example.
    my_str = 'bobby hadz, dot # com. abc'

    my_list = my_str.replace(',', '').replace('#', '').replace('.', '').split()
    print(my_list)  # ['bobby', 'hadz', 'dot', 'com', 'abc']

We used the str.replace() method to remove the punctuation before splitting the string on whitespace characters.

You can chain as many calls to the str.replace() method as necessary.

The last step is to use the str.split() method to split the string into a list of words.
The str.split() method splits the string into a list of substrings using a delimiter.
The method takes the following 2 parameters:
Name | Description |
---|---|
separator | Split the string into substrings on each occurrence of the separator |
maxsplit | At most maxsplit splits are done (optional) |
When no separator is passed to the str.split() method, it splits the input string on one or more whitespace characters.

    my_str = 'bobby hadz com'

    print(my_str.split())  # ['bobby', 'hadz', 'com']
If the separator is not found in the string, a list containing only 1 element is returned.
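Both behaviours can be checked quickly. The sample string below is an assumption used only to illustrate the maxsplit parameter and the case where the separator is missing.

```python
my_str = 'bobby,hadz,dot,com'

# maxsplit=1 performs at most one split, so the rest stays in one piece.
limited = my_str.split(',', 1)
print(limited)  # ['bobby', 'hadz,dot,com']

# The separator does not occur, so the whole string is the only element.
not_found = my_str.split('!')
print(not_found)  # ['bobby,hadz,dot,com']
```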
If you need to split a string based on multiple delimiters often, define a reusable function.
    import re

    def split_multiple(string, delimiters):
        # escape each delimiter so it is matched literally, then join with |
        pattern = '|'.join(map(re.escape, delimiters))
        return re.split(pattern, string)

    my_str = 'bobby,hadz-dot:com'

    print(split_multiple(my_str, [',', '-', ':']))  # ['bobby', 'hadz', 'dot', 'com']

The split_multiple function takes a string and a list of delimiters and splits the string on the delimiters.

The str.join() method is used to join the delimiters with a pipe | separator.

    # ,|-|:
    print('|'.join([',', '-', ':']))
This creates a regex pattern that we can use to split the string based on the specified delimiters.
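The re.escape() call matters when a delimiter is itself a regex metacharacter. The sketch below uses an assumed set of delimiters ('.', '|' and '*') to show the escaped pattern and the resulting split. Without escaping, these characters would be read as regex syntax rather than as literal delimiters.

```python
import re

# Hypothetical delimiters that all have a special meaning in a regex.
delimiters = ['.', '|', '*']
my_str = 'bobby.hadz|dot*com'

# re.escape() puts a backslash in front of each special character.
pattern = '|'.join(map(re.escape, delimiters))
print(pattern)  # \.|\||\*

parts = re.split(pattern, my_str)
print(parts)  # ['bobby', 'hadz', 'dot', 'com']
```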
If you need to split a string into a list of words with multiple delimiters, you can also use the re.findall() method.
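A minimal sketch of that idea, assuming a "word" means a run of letters, digits or underscores. The \w+ pattern is an assumption made for this illustration; instead of naming the delimiters, it collects everything that is not one.

```python
import re

my_str = 'bobby hadz, dot # com. abc'

# \w+ matches one or more word characters (letters, digits, underscore).
my_list = re.findall(r'\w+', my_str)

print(my_list)  # ['bobby', 'hadz', 'dot', 'com', 'abc']
```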