Last updated: Apr 10, 2024
To remove the punctuation from a list of strings:

1. Use a list comprehension to iterate over the list.
2. Use a nested for loop to iterate over each string.
3. Use the str.join() method to join the characters that are not punctuation.

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    new_list = [
        ''.join(char for char in item if char not in string.punctuation)
        for item in a_list
    ]

    print(new_list)  # 👇️ ['bobby', 'hadz', '', 'com']

We used a list comprehension to iterate over the list.
On each iteration, we use a nested for loop (written as a generator expression) to iterate over the characters of the current string.
The string.punctuation constant is a string that contains the ASCII punctuation characters.

    import string

    print(string.punctuation)  # 👇️ !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

On each iteration, we use the not in operator to exclude any punctuation characters from the current string.
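The last step is to join the remaining characters with the str.join() method.
The str.join() method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.
The string the method is called on is used as the separator between the elements. We call the method on an empty string, so the characters are joined without a separator.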
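As a quick illustration of how the separator works, here is a minimal sketch. The `chars` and `no_dots` names and the sample values are made up for the example.

```python
# str.join() concatenates the strings in an iterable and places
# the separator string between the elements.
chars = ['b', 'o', 'b', 'b', 'y']

print(''.join(chars))   # bobby
print('-'.join(chars))  # b-o-b-b-y

# A generator expression works too, because join() accepts
# any iterable of strings.
no_dots = ''.join(char for char in 'b.o.b.by' if char != '.')
print(no_dots)  # bobby
```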
If you need to remove empty string elements from the list, add an if clause to the list comprehension.

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    new_list = [
        ''.join(char for char in item if char not in string.punctuation)
        for item in a_list
        if item != ''
    ]

    print(new_list)  # 👇️ ['bobby', 'hadz', 'com']

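Note that a check like `item != ''` only skips strings that were empty to begin with. A string made up entirely of punctuation becomes empty after cleaning and would still end up in the result. One possible way to handle that case, sketched with a made-up sample list and a hypothetical `cleaned` variable, is to filter after the punctuation has been removed:

```python
import string

a_list = ['b.o.b.by', '...', '', 'c:om']

# First strip the punctuation from every string.
cleaned = [
    ''.join(char for char in item if char not in string.punctuation)
    for item in a_list
]

# Then drop the strings that ended up empty.
new_list = [item for item in cleaned if item]

print(new_list)  # ['bobby', 'com']
```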
Alternatively, you can use the re.sub() method.
This is a three-step process:

1. Use a list comprehension to iterate over the list.
2. Use the re.sub() method to remove the punctuation from each string in the list.
3. The strings in the new list won't contain any punctuation.

    import re

    a_list = ['b.o.b.by', 'h,adz', 'c:om']

    new_list = [re.sub(r'[^\w\s]', '', item) for item in a_list]

    print(new_list)  # 👇️ ['bobby', 'hadz', 'com']

The re.sub() method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.

    import re

    a_str = 'bobby,hadz.com;'

    result = re.sub(r'[^\w\s]', '', a_str)

    print(result)  # 👇️ bobbyhadzcom

If the pattern isn't found, the string is returned as is.
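The first argument we passed to the re.sub() method is a regular expression.
The square brackets [] are used to indicate a set of characters.
The \w character matches:

- characters that can be part of a word in any language
- digits
- the underscore character

The \s character matches Unicode whitespace characters like [ \t\n\r\f\v].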
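The caret ^ at the beginning of the set means "NOT". In other words, the pattern matches every character that is not a word character or a whitespace character.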
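One consequence worth knowing: the underscore counts as a word character for \w, so the `[^\w\s]` pattern leaves underscores in place even though `_` is part of string.punctuation. The sketch below uses a made-up sample string and shows one possible way to remove underscores as well, by adding an alternative to the pattern.

```python
import re
import string

a_str = 'bobby_hadz.com!'

# The underscore is part of string.punctuation ...
print('_' in string.punctuation)  # True

# ... but \w matches it, so the negated set does not remove it.
print(re.sub(r'[^\w\s]', '', a_str))  # bobby_hadzcom

# Adding an alternative for the underscore removes it as well.
print(re.sub(r'[^\w\s]|_', '', a_str))  # bobbyhadzcom
```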
Because the set is negated, any character you add between the square brackets is excluded from the match and is therefore kept in the string.
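For example, here is a minimal sketch that keeps apostrophes and hyphens while removing other punctuation. The sample list is made up for the illustration.

```python
import re

a_list = ["don't", 'well-known', 'hello, world!']

# Apostrophes and hyphens are listed inside the negated set, so they are kept.
# The hyphen is placed last so that it is treated as a literal character.
new_list = [re.sub(r"[^\w\s'-]", '', item) for item in a_list]

print(new_list)  # ["don't", 'well-known', 'hello world']
```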
If you ever need help reading or writing a regular expression, consult the regular expression syntax subheading in the official docs.
The page contains a list of all of the special characters with many useful examples.
Alternatively, you can use a for loop.
This is a four-step process:

1. Declare a new variable that stores an empty list.
2. Use a for loop to iterate over the list of strings.
3. Use the re.sub() method to remove the punctuation from each string.
4. Append the result to the new list.

    import re

    a_list = ['b.o.b.by', 'h,adz', 'c:om']

    new_list = []

    for item in a_list:
        new_list.append(re.sub(r'[^\w\s]', '', item))

    print(new_list)  # 👇️ ['bobby', 'hadz', 'com']

We used a for loop to iterate over the list of strings.
On each iteration, we use the re.sub() method to replace all punctuation marks in the string with empty strings.
The last step is to use the list.append() method to add the results to the list.
The list.append() method adds an item to the end of the list.
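If the loop is needed in more than one place, it could be wrapped in a small function. The following is only a sketch: `remove_punctuation` and `PUNCTUATION_PATTERN` are hypothetical names, and compiling the pattern with re.compile() is an optional variation on the re.sub() call.

```python
import re

# Compile the pattern once so it can be reused on every iteration.
PUNCTUATION_PATTERN = re.compile(r'[^\w\s]')


def remove_punctuation(strings):
    """Return a new list with punctuation removed from each string."""
    result = []

    for item in strings:
        result.append(PUNCTUATION_PATTERN.sub('', item))

    return result


print(remove_punctuation(['b.o.b.by', 'h,adz', 'c:om']))  # ['bobby', 'hadz', 'com']
```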
Which approach you pick is a matter of personal preference. I'd use a list comprehension with the re.sub() method as I find it more readable than using a nested for loop.
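You can also use the str.translate() method to remove the punctuation from a list of strings.

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    # Build a table that maps each punctuation character to None.
    table = str.maketrans('', '', string.punctuation)

    new_list = [item.translate(table) for item in a_list]

    print(new_list)  # 👇️ ['bobby', 'hadz', '', 'com']

The str.translate() method returns a copy of the string in which each character is mapped through a translation table. The table returned by str.maketrans('', '', string.punctuation) maps each punctuation character to None, so those characters are removed. Passing string.punctuation directly to str.translate() would leave the strings unchanged.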
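str.translate() expects a translation table rather than a plain string of characters. The sketch below shows what a table built with str.maketrans() looks like; the `table` name is only used for the illustration.

```python
import string

# With three arguments, str.maketrans() maps the code point of each
# character in the third argument to None.
table = str.maketrans('', '', string.punctuation)

print(table[ord('.')])  # None
print(len(table) == len(string.punctuation))  # True

# str.translate() drops every character that the table maps to None.
print('b.o.b.by'.translate(table))  # bobby
```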
If you need to remove the empty strings from the list, add an if clause to the list comprehension.

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    # Build a table that maps each punctuation character to None.
    table = str.maketrans('', '', string.punctuation)

    new_list = [item.translate(table) for item in a_list if item != '']

    print(new_list)  # 👇️ ['bobby', 'hadz', 'com']
