Count the number of unique words in a String in Python

Borislav Hadzhiev

Last updated: Sep 22, 2022

To count the number of unique words in a string:

  1. Use the str.split() method to split the string into a list of words.
  2. Use the set() class to convert the list to a set.
  3. Use the len() function to get the count of unique words in the string.
main.py
my_str = 'one one two two'

unique_words = set(my_str.split())
print(unique_words)  # 👉️ {'one', 'two'}

length = len(unique_words)
print(length)  # 👉️ 2

We first used the str.split() method to split the string into a list of words.

main.py
my_str = 'one one two two'

print(my_str.split())  # 👉️ ['one', 'one', 'two', 'two']

The str.split() method splits the string into a list of substrings using a delimiter.

When no separator is passed to the str.split() method, it splits the input string on one or more whitespace characters.
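To illustrate the difference the separator makes, here is a minimal sketch. The sample string `text` is made up for this example and is not part of the original code.

```python
# Hypothetical sample string with mixed whitespace: two spaces, a tab, a newline.
text = 'one  one\ttwo\ntwo'

# No separator: any run of whitespace counts as a single delimiter.
words_default = text.split()
print(words_default)  # ['one', 'one', 'two', 'two']

# Explicit single-space separator: every space is a delimiter, so the
# two consecutive spaces yield an empty string, and the tab and newline
# are not treated as delimiters at all.
words_space = text.split(' ')
print(words_space)  # ['one', '', 'one\ttwo\ntwo']
```

For counting words, calling `split()` without arguments is usually the safer choice, because it never produces empty strings.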

The next step is to use the set() class to convert the list of words to a set object.

main.py
my_str = 'one one two two'

unique_words = set(my_str.split())
print(unique_words)  # 👉️ {'one', 'two'}

The set() class takes an optional iterable argument and returns a new set object with elements taken from the iterable.

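Set objects store an unordered collection of unique elements, so converting the list to a set removes all duplicate elements. Because sets are unordered, the elements may be printed in a different order than they appear in the string.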
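As a rough illustration of how set() treats different iterables, consider the sketch below. The variable names are made up for this example, and the comparisons use `==` so the result does not depend on the order in which a set happens to be printed.

```python
# Calling set() with no argument returns an empty set.
empty = set()
print(len(empty))  # 0

# Any iterable works: a list, a tuple, or a string
# (a string is iterated character by character).
from_list = set(['one', 'one', 'two'])
from_tuple = set(('a', 'b', 'a'))
from_string = set('aab')

print(from_list == {'one', 'two'})  # True
print(from_tuple == {'a', 'b'})     # True
print(from_string == {'a', 'b'})    # True
```

Note that passing the string itself to set() would collect unique characters, which is why the string is split into words first.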

The last step is to use the len() function to get the number of unique words.

main.py
my_str = 'one one two two'

unique_words = set(my_str.split())
print(unique_words)  # 👉️ {'one', 'two'}

length = len(unique_words)
print(length)  # 👉️ 2

The len() function returns the length (the number of items) of an object.

The argument the function takes may be a sequence (a string, tuple, list, range or bytes) or a collection (a dictionary, set, or frozen set).
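A short sketch of len() applied to each of the types mentioned above. The values are arbitrary and the `lengths` dictionary is only a convenient way to collect the results.

```python
# len() applied to sequences and collections; all values are arbitrary examples.
lengths = {
    'string': len('hello'),                     # number of characters
    'tuple': len(('a', 'b')),                   # number of items
    'list': len(['a', 'b', 'c']),               # number of items
    'range': len(range(10)),                    # number of values in the range
    'bytes': len(b'abc'),                       # number of bytes
    'dict': len({'name': 'bobby'}),             # number of keys
    'set': len({'one', 'two'}),                 # number of unique elements
    'frozenset': len(frozenset(['one', 'one'])),  # duplicates are removed first
}

for kind, length in lengths.items():
    print(kind, length)
```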

Alternatively, you can use a for loop.

Count the number of unique words in a String using a for loop

To count the number of unique words in a string:

  1. Declare a new variable that stores an empty list.
  2. Use the str.split() method to split the string into a list of words.
  3. Use a for loop to iterate over the list.
  4. Use the list.append() method to append each word that is not already in the list.
  5. Use the len() function to get the length of the list.
main.py
my_str = 'one one two two'

unique_words = []

for word in my_str.split():
    if word not in unique_words:
        unique_words.append(word)

print(len(unique_words))  # 👉️ 2
print(unique_words)  # 👉️ ['one', 'two']

We used the str.split() method to split the string into a list of words and used a for loop to iterate over the list.

On each iteration, we used the not in operator to check whether the current word is already in the unique_words list, and appended it only if it is not.

The in operator tests for membership. For example, x in l evaluates to True if x is a member of l, otherwise it evaluates to False.

x not in l returns the negation of x in l.
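A minimal sketch of both operators, using a made-up list named `words`:

```python
# Hypothetical list used only to demonstrate membership tests.
words = ['one', 'two']

is_member = 'one' in words
is_missing = 'three' not in words

print(is_member)          # True
print('three' in words)   # False
print(is_missing)         # True

# `x not in l` gives the same result as `not (x in l)`.
same_result = ('three' not in words) == (not ('three' in words))
print(same_result)        # True
```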

The list.append() method adds an item to the end of the list.

main.py
my_list = ['bobby', 'hadz']

my_list.append('com')

print(my_list)  # 👉️ ['bobby', 'hadz', 'com']

The last step is to use the len() function to get the number of unique words in the string.
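The loop-based steps can also be wrapped in a small reusable helper. This is only a sketch: `unique_words_in_order` is a hypothetical name chosen for this example. Unlike a set, the list keeps the words in the order in which they first appear.

```python
def unique_words_in_order(text):
    """Return the unique whitespace-separated words of text in first-seen order."""
    unique_words = []
    for word in text.split():
        # Append the word only the first time it is encountered.
        if word not in unique_words:
            unique_words.append(word)
    return unique_words


result = unique_words_in_order('two one two one three')
print(result)       # ['two', 'one', 'three']
print(len(result))  # 3

# The count matches the set-based approach.
print(len(result) == len(set('two one two one three'.split())))  # True
```

The membership check scans the list on every iteration, so for very long strings the set-based approach is likely to be faster. The loop is mainly useful when the original order of the words matters.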
