Remove punctuation from a List of strings in Python

Borislav Hadzhiev

Last updated: Apr 10, 2024

# Table of Contents

  1. Remove punctuation from a List of strings in Python
  2. Remove punctuation from a List of strings using re.sub()
  3. Remove punctuation from a List of strings using a for loop
  4. Remove punctuation from a List of strings using str.translate

# Remove punctuation from a List of strings in Python

To remove the punctuation from a list of strings:

  1. Use a list comprehension to iterate over the list.
  2. Use a nested generator expression to iterate over the characters of each string.
  3. Remove the punctuation marks from each string and return the result.

main.py

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    # Keep only the characters that are not ASCII punctuation
    new_list = [
        ''.join(char for char in item if char not in string.punctuation)
        for item in a_list
    ]

    print(new_list)  # 👉️ ['bobby', 'hadz', '', 'com']

The code for this article is available on GitHub

We used a list comprehension to iterate over the list.

List comprehensions are used to perform some operation for every element or select a subset of elements that meet a condition.
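Both uses can be shown in a short sketch. The `words` list and the variable names are made up for illustration.

```python
words = ['bobby', 'hadz', '', 'com']

# Perform an operation for every element
upper = [word.upper() for word in words]

# Select only the elements that meet a condition (non-empty strings)
non_empty = [word for word in words if word]

print(upper)      # ['BOBBY', 'HADZ', '', 'COM']
print(non_empty)  # ['bobby', 'hadz', 'com']
```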

On each iteration, a nested generator expression iterates over the characters of the current string.

The string.punctuation constant is a string that contains the ASCII punctuation characters.

main.py

    import string

    print(string.punctuation)  # 👉️ !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

On each iteration, we use the not in operator to exclude any punctuation characters from the current string.

The last step is to join the non-punctuation characters into a string using the str.join() method.

The str.join() method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.

The string the method is called on is used as the separator between the elements.
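As a small illustration of the separator behavior (the `chars` list is just example data):

```python
chars = ['c', 'o', 'm']

# An empty string as the separator glues the characters together
joined = ''.join(chars)

# Any other string is placed between the elements
dashed = '-'.join(chars)

print(joined)  # com
print(dashed)  # c-o-m
```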

If you need to remove empty string elements from the list, add an if statement.

main.py

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    # The trailing `if` skips empty strings in the source list
    new_list = [
        ''.join(char for char in item if char not in string.punctuation)
        for item in a_list
        if item != ''
    ]

    print(new_list)  # 👉️ ['bobby', 'hadz', 'com']

Alternatively, you can use the re.sub() method.

# Remove punctuation from a List of strings using re.sub()

This is a three-step process:

  1. Use a list comprehension to iterate over the list.
  2. Use the re.sub() method to remove the punctuation from each string in the list.
  3. The strings in the new list won't contain any punctuation marks.

main.py

    import re

    a_list = ['b.o.b.by', 'h,adz', 'c:om']

    # Remove every character that is not a word character or whitespace
    new_list = [re.sub(r'[^\w\s]', '', item) for item in a_list]

    print(new_list)  # 👉️ ['bobby', 'hadz', 'com']

The re.sub() method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.

main.py
import re a_str = 'bobby,hadz.com;' result = re.sub(r'[^\w\s]', '', a_str) print(result) # ๐Ÿ‘‰๏ธ bobbyhadzcom

If the pattern isn't found, the string is returned as is.
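A quick check of that behavior, using a sample string that contains no punctuation:

```python
import re

clean = 'bobby hadz'

# Nothing matches the pattern, so the text comes back unchanged
result = re.sub(r'[^\w\s]', '', clean)

print(result)           # bobby hadz
print(result == clean)  # True
```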

The first argument we passed to the re.sub() method is a regular expression.

The square brackets [] are used to indicate a set of characters.

The \w character matches:

  • characters that can be part of a word in any language
  • numbers
  • the underscore character

The \s character matches Unicode whitespace characters like [ \t\n\r\f\v].

The caret ^ at the beginning of the set means "NOT".

In other words, match all characters that are not a part of a word in any language, numbers, underscores or whitespace and replace them with an empty string (remove them).
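One practical consequence is that the regex and `string.punctuation` approaches can disagree. The sketch below uses a made-up sample string (written with escapes so the characters are unambiguous) to show two differences: the regex keeps underscores, because `\w` matches them, and it removes non-ASCII punctuation such as curly quotes, which `string.punctuation` does not contain.

```python
import re
import string

# 'café_au_lait: “très bon”!' written with explicit escapes
text = 'caf\u00e9_au_lait: \u201ctr\u00e8s bon\u201d!'

# Regex approach: keeps letters (any language), digits, underscores, whitespace
regex_result = re.sub(r'[^\w\s]', '', text)

# string.punctuation approach: removes ASCII punctuation only
ascii_result = ''.join(ch for ch in text if ch not in string.punctuation)

print(regex_result)  # café_au_lait très bon
print(ascii_result)  # caféaulait “très bon”
```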

Because the set is negated, any character you add between the square brackets is excluded from the match and is therefore kept in the result.
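For example, the following sketch keeps hyphens and apostrophes by adding them to the negated set, and separately shows one possible way to strip underscores as well by adding an alternation. The sample text and variable names are illustrative.

```python
import re

text = "well-known, isn't it? snake_case!"

# Adding ' and - to the negated set keeps them in the result
kept = re.sub(r"[^\w\s'-]", '', text)

# \w matches the underscore, so remove it explicitly with `|_` if needed
no_underscore = re.sub(r'[^\w\s]|_', '', text)

print(kept)           # well-known isn't it snake_case
print(no_underscore)  # wellknown isnt it snakecase
```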

If you ever need help reading or writing a regular expression, consult the regular expression syntax subheading in the official docs.

The page contains a list of all of the special characters with many useful examples.

Alternatively, you can use a for loop.

# Remove punctuation from a List of strings using a for loop

This is a four-step process:

  1. Declare a new variable and initialize it to an empty list.
  2. Use a for loop to iterate over the list of strings.
  3. Use the re.sub() method to remove the punctuation from each string.
  4. Append the results to the new list.

main.py

    import re

    a_list = ['b.o.b.by', 'h,adz', 'c:om']

    new_list = []

    for item in a_list:
        # Strip the punctuation and collect the cleaned string
        new_list.append(re.sub(r'[^\w\s]', '', item))

    print(new_list)  # 👉️ ['bobby', 'hadz', 'com']

We used a for loop to iterate over the list of strings.

On each iteration, we use the re.sub() method to replace all punctuation marks in the string with empty strings.

The last step is to use the list.append() method to add the results to the list.

The list.append() method adds an item to the end of the list.

Which approach you pick is a matter of personal preference. I'd use a list comprehension with the re.sub() method as I find it more readable than using a nested for.

# Remove punctuation from a List of strings using str.translate

You can also use the str.translate method to remove the punctuation from a list of strings.

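main.py

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    # str.maketrans('', '', chars) builds a table that deletes every character in chars
    table = str.maketrans('', '', string.punctuation)

    new_list = [item.translate(table) for item in a_list]

    print(new_list)  # 👉️ ['bobby', 'hadz', '', 'com']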

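The str.translate() method maps or deletes characters according to a translation table, not a plain string of characters. Passing string.punctuation directly leaves the strings unchanged, so build the table with str.maketrans('', '', string.punctuation), whose third argument lists the characters to delete.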
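To see what str.translate() actually receives, this minimal sketch inspects the table that str.maketrans() builds. The variable names are only illustrative.

```python
import string

# The third argument lists the characters that should be deleted
table = str.maketrans('', '', string.punctuation)

# The table is a dict that maps each code point to None (meaning "delete")
dot_mapping = table[ord('.')]
table_size = len(table)

cleaned = 'b.o.b.by'.translate(table)

print(dot_mapping)  # None
print(table_size)   # 32
print(cleaned)      # bobby
```

Building the table once and reusing it for every string avoids recreating it on each iteration.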

If you need to remove the empty strings from the list, use an if statement.

main.py

    import string

    a_list = ['b.o.b.by', 'h,adz', '', 'c:om']

    table = str.maketrans('', '', string.punctuation)

    # The trailing `if` skips empty strings in the source list
    new_list = [item.translate(table) for item in a_list if item != '']

    print(new_list)  # 👉️ ['bobby', 'hadz', 'com']


Copyright © 2024 Borislav Hadzhiev