Remove duplicate characters from a String in Python

Borislav Hadzhiev

Last updated: Aug 13, 2022

To remove the duplicate characters from a string:

  1. Use the dict.fromkeys() method to get a dictionary with the string's characters as keys.
  2. Use the join() method to join the keys of the dictionary into a string.
  3. The new string won't contain duplicate characters.
main.py
from collections import OrderedDict
my_str = 'aappllee'
result = ''.join(OrderedDict.fromkeys(my_str))
print(result)  # 👉️ 'aple'
# -------------
# ✅ Use native dict if Python v3.7+
result_2 = ''.join(dict.fromkeys(my_str))
print(result_2)  # 👉️ 'aple'

The first example uses the OrderedDict class to remove the duplicate characters from a string.

The second example uses the dict class.

As of Python 3.7, the standard dict class is guaranteed to preserve insertion order as well.

The standard dict class is also a little more performant than the OrderedDict class and doesn't require an import statement.
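One rough way to check the performance difference on your own machine is the timeit module. This is only a sketch: the helper names dedupe_dict and dedupe_ordered are made up for illustration, and the timings depend on your hardware and Python version.

```python
import timeit
from collections import OrderedDict

# A longer input so the timings are easier to compare
my_str = 'aappllee' * 1000


def dedupe_dict(text):
    # Hypothetical helper: remove duplicates with the built-in dict
    return ''.join(dict.fromkeys(text))


def dedupe_ordered(text):
    # Hypothetical helper: remove duplicates with OrderedDict
    return ''.join(OrderedDict.fromkeys(text))


dict_time = timeit.timeit(lambda: dedupe_dict(my_str), number=200)
ordered_time = timeit.timeit(lambda: dedupe_ordered(my_str), number=200)

print(f'dict:        {dict_time:.4f}s')
print(f'OrderedDict: {ordered_time:.4f}s')
```

Both helpers return the same string, so the only thing being compared is the time each one takes.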

The dict.fromkeys method takes an iterable and an optional value and creates a new dictionary with keys from the iterable and every value set to the provided value (None if no value is given).

main.py
my_str = 'aappllee'
# 👇️ {'a': None, 'p': None, 'l': None, 'e': None}
print(dict.fromkeys(my_str))

We didn't pass a value to the method because we only need the keys.
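For comparison, here is a small sketch of what happens when a value is passed. The value 0 and the variable names are arbitrary and chosen only for illustration.

```python
my_str = 'aappllee'

# No value passed: every key maps to None
without_value = dict.fromkeys(my_str)
print(without_value)  # {'a': None, 'p': None, 'l': None, 'e': None}

# Value passed explicitly: every key maps to 0
with_value = dict.fromkeys(my_str, 0)
print(with_value)  # {'a': 0, 'p': 0, 'l': 0, 'e': 0}
```

The keys are the same in both cases, so the value makes no difference when we only join the keys.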

The keys in a dictionary are guaranteed to be unique, so any duplicate characters get automatically dropped.
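To see why the duplicates disappear, the same dictionary can be built by hand with a loop. This is a sketch of the idea and not how dict.fromkeys is implemented internally; the name unique_chars is made up for illustration.

```python
my_str = 'aappllee'

unique_chars = {}
for char in my_str:
    # Assigning to a key that already exists overwrites its value;
    # it does not add a second key or change the key's position
    unique_chars[char] = None

print(unique_chars)  # {'a': None, 'p': None, 'l': None, 'e': None}
print(len(my_str), len(unique_chars))  # 8 4
```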

The last step is to join the keys of the dictionary into a string.

main.py
my_str = 'aappllee'
result_2 = ''.join(dict.fromkeys(my_str))
print(result_2)  # 👉️ 'aple'

The str.join method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.

We joined the collection of strings without a separator.
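The string that join() is called on is used as the separator between the elements. A short sketch that contrasts an empty separator with a hyphen (the hyphen is an arbitrary choice for illustration):

```python
my_str = 'aappllee'
keys = dict.fromkeys(my_str)

# Empty string as the separator: the characters are glued together
no_separator = ''.join(keys)
print(no_separator)  # aple

# Hyphen as the separator: placed between each pair of characters
with_separator = '-'.join(keys)
print(with_separator)  # a-p-l-e
```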

If you decide to use the OrderedDict class, make sure to import it.

main.py
from collections import OrderedDict
my_str = 'aappllee'
result_2 = ''.join(OrderedDict.fromkeys(my_str))
print(result_2)  # 👉️ 'aple'

If you need to remove the duplicate characters from the string, but don't need to preserve the order, use the set() class.

main.py
my_str = 'aappllee'
result = ''.join(set(my_str))
print(result)  # 👉️ eapl (the order may differ between runs)

We used the set() class to convert the string to a set object.

main.py
my_str = 'aappllee'
print(set(my_str))  # 👉️ {'p', 'a', 'e', 'l'} (the order may differ between runs)

Set objects are an unordered collection of unique elements, so any duplicate characters get automatically removed when the conversion takes place.

Make sure to only use a set object if the order of the characters in the string doesn't have to be preserved.
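The difference between the two approaches can be checked side by side. In this sketch the variable names ordered and unordered are made up for illustration; the set-based result contains the same characters, but their order is not guaranteed.

```python
my_str = 'aappllee'

# dict.fromkeys keeps the characters in order of first appearance
ordered = ''.join(dict.fromkeys(my_str))
print(ordered)  # aple

# set() gives no guarantee about the order of the characters
unordered = ''.join(set(my_str))
print(unordered)  # some arrangement of a, p, l and e

# Both results contain exactly the same characters
print(sorted(ordered) == sorted(unordered))  # True
```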
