Last updated: Apr 8, 2024
Reading timeยท3 min
map()
To split a string and remove whitespace:
str.split()
method to split the string into a list.str.strip()
method to remove the leading and
trailing whitespace.my_str = 'bobby, hadz, com' my_list = [word.strip() for word in my_str.split(',')] print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
We used the str.split()
method to split the string on each occurrence of a
comma.
The str.split() method splits the string into a list of substrings using a delimiter.
The method takes the following 2 parameters:
Name | Description |
---|---|
separator | Split the string into substrings on each occurrence of the separator |
maxsplit | At most maxsplit splits are done (optional) |
The only argument we passed to split()
is the separator we want to split on.
my_str = 'bobby, hadz, com' l = my_str.split(', ') print(l) # ๐๏ธ ['one', ' two', ' three', ' four']
The next step is to use a list comprehension to iterate over the list of strings.
my_str = 'bobby, hadz, com' my_list = [word.strip() for word in my_str.split(',')] print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
On each iteration, we call the str.strip()
method to remove any leading or
trailing whitespace from the string.
example = ' bobbyhadz.com ' print(repr(example.strip())) # ๐๏ธ 'bobbyhadz.com'
The str.strip() method returns a copy of the string with the leading and trailing whitespace removed.
map()
This is a three-step process:
str.split()
method on the string to get a list of strings.str.strip
method and the list to the map()
function.map
function will call the str.strip
method on each string in the
list.my_str = 'bobby, hadz, com' my_list = list(map(str.strip, my_str.split(','))) print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
The map() function takes a function and an iterable as arguments and calls the function with each item of the iterable.
str.strip
method as the function, so the map
function is going to call the str.strip()
method on each item in the list.The map
function returns a map
object (not a list). If you need to convert
the value to a list, pass it to the list()
class.
You can also use a regular expression to split a string and remove the whitespace.
import re my_str = 'bobby, hadz, com' pattern = re.compile(r'^\s+|\s*,\s*|\s+$') my_list = [word for word in pattern.split(my_str) if word] print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
The pattern in the example is formatted as
'^\s+|\s*{your_split_char}\s*|\s+$'
.
Here is an example that uses an underscore as the split character.
import re my_str = 'bobby_ hadz_ com' pattern = re.compile(r'^\s+|\s*_\s*|\s+$') my_list = [word for word in pattern.split(my_str) if word] print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
The re.compile() method compiles a regular expression pattern into an object.
We used the re.split() method to split the string based on the provided regular expression.
The caret ^
matches the start of the string and the dollar sign $
matches
the end of the string.
import re my_str = 'bobby, hadz, com' pattern = re.compile(r'^\s+|\s*,\s*|\s+$') my_list = [word for word in pattern.split(my_str) if word] print(my_list) # ๐๏ธ ['bobby', 'hadz', 'com']
The \s
character matches Unicode whitespace characters like [ \t\n\r\f\v]
.
The plus +
matches the preceding character (whitespace) 1 or more times.
The pipe |
special character means OR, e.g. X|Y
matches X
or Y
.
The asterisk *
matches the preceding regular expression (whitespace) zero or
more times.
In its entirety, the regular expression splits on leading or trailing whitespace characters or commas.
Make sure to replace the comma in the regex with the character you need to split on.
The pattern in the example is formatted as
'^\s+|\s*{your_split_char}\s*|\s+$'
.
You can learn more about the related topics by checking out the following tutorials: