Remove duplicates from a list of lists in Python

Borislav Hadzhiev

Last updated: Jun 30, 2022

To remove the duplicates from a list of lists:

  1. Declare a new variable and set it to an empty list.
  2. Iterate over the list of lists and check if each nested list is not in the new list.
  3. If the condition is met, add the list to the new list.
main.py
list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

new_list = []

for l in list_of_lists:
    if l not in new_list:
        new_list.append(l)

# 👇️ [['a', 1], ['b', 2], ['c', 3]]
print(new_list)

After declaring the empty list, we iterate over the list of lists.

On each iteration, we check if the current nested list is not present in the new list.

The in operator tests for membership. For example, x in l evaluates to True if x is a member of l, otherwise it evaluates to False.

x not in l returns the negation of x in l.
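As a small illustration of the membership test with nested lists, the sketch below shows that `in` matches items by equality, so a separately created list with the same contents counts as present. The variable names are only illustrative.

```python
nested = [['a', 1], ['b', 2]]

# A different list object with equal contents is still found,
# because the membership test compares items with ==
candidate = ['a', 1]
is_member = candidate in nested

# A list with different contents is reported as missing
is_missing = ['c', 3] not in nested

print(is_member, is_missing)  # True True
```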

If the condition is met, we use the list.append() method to add the list to the new list.

The list.append() method adds an item to the end of the list.

The new list doesn't contain any duplicate lists.
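If you need this in more than one place, the loop can be wrapped in a helper. This is only a sketch and the function name `remove_duplicates` is made up for the example. It returns a new list and leaves the original untouched.

```python
def remove_duplicates(list_of_lists):
    """Return a new list holding the first occurrence of each nested list."""
    new_list = []
    for nested in list_of_lists:
        # Only keep nested lists we have not seen yet
        if nested not in new_list:
            new_list.append(nested)
    return new_list


data = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]
result = remove_duplicates(data)

print(result)     # [['a', 1], ['b', 2], ['c', 3]]
print(len(data))  # 5
```

Because every nested list is compared against the items already collected, this approach keeps the order of first occurrence but gets slower as the list grows.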

Alternatively, you can use the itertools.groupby() function.

To remove the duplicates from a list of lists:

  1. Use the list.sort() method to sort the list.
  2. Use the itertools.groupby() function to group adjacent, equal nested lists.
  3. Use the list() class to collect the key of each group into a list.
main.py
import itertools

list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

# 👇️ sort list first
list_of_lists.sort()

new_list = list(l for l, _ in itertools.groupby(list_of_lists))

# 👇️ [['a', 1], ['b', 2], ['c', 3]]
print(new_list)

We used the itertools.groupby() function to remove the duplicates from a list of lists.

Note that the list has to be sorted before we call groupby() because the function starts a new group every time the key changes, so only duplicates that sit next to each other end up in the same group.
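To see why the sort matters, here is a quick sketch that calls groupby() on the same data without sorting it first. Equal nested lists are never adjacent in this input, so every item starts its own group and no duplicates are removed.

```python
import itertools

list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

# No sorting: each item differs from the one before it,
# so groupby() produces one group per item
unsorted_keys = [key for key, _ in itertools.groupby(list_of_lists)]

print(unsorted_keys)
# [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]
```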

main.py
import itertools

list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

# 👇️ sort list first
list_of_lists.sort()

# 👇️ [(['a', 1], <itertools._grouper object at 0x7fb416cdf520>),
#     (['b', 2], <itertools._grouper object at 0x7fb416cdf550>),
#     (['c', 3], <itertools._grouper object at 0x7fb416cddbd0>)]
print(list(itertools.groupby(list_of_lists)))

We used a generator expression to iterate over the iterator and used an underscore to ignore the itertools._grouper objects.

Generator expressions are used to perform some operation for every element or select a subset of elements that meet a condition.
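Both uses of a generator expression can be sketched with plain numbers. The data below is made up for the example: the first expression performs an operation on every element, the second selects a subset that meets a condition.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Perform an operation for every element
doubled = list(n * 2 for n in numbers)

# Select only the elements that meet a condition
evens = list(n for n in numbers if n % 2 == 0)

print(doubled)  # [2, 4, 6, 8, 10, 12]
print(evens)    # [2, 4, 6]
```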

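Note that this approach requires sorting the list in advance, so the result comes back in sorted order rather than in the original order.

The list.sort() method sorts the list in place and uses only < comparisons between items, so the nested lists have to be comparable with each other.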
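If you would rather not modify the original list, one option is to pass a sorted copy to groupby() instead. The sketch below swaps list.sort() for the built-in sorted() function, which returns a new list. The input order is changed from the earlier examples to make the difference visible.

```python
import itertools

list_of_lists = [['b', 2], ['a', 1], ['b', 2], ['c', 3], ['a', 1]]

# sorted() returns a new sorted list and leaves list_of_lists as it was
new_list = [key for key, _ in itertools.groupby(sorted(list_of_lists))]

print(new_list)       # [['a', 1], ['b', 2], ['c', 3]]
print(list_of_lists)  # [['b', 2], ['a', 1], ['b', 2], ['c', 3], ['a', 1]]
```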

Alternatively, you can convert the list of lists to a set.

To remove the duplicates from a list of lists:

  1. Use a list comprehension to convert each nested list to a tuple.
  2. Convert the list of tuples to a set to remove the duplicates.
  3. Use a list comprehension to convert the set to a list of lists.
main.py
list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

my_set = set(tuple(l) for l in list_of_lists)

new_list = [list(tup) for tup in my_set]

# 👉️ one possible output: [['a', 1], ['c', 3], ['b', 2]]
print(new_list)

Notice that we had to convert each nested list to a tuple before converting to a set.

This is necessary because tuples are immutable and hashable, whereas lists are mutable and unhashable (cannot be members of a set).
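A short sketch of what happens if you skip the conversion: putting a list into a set raises a TypeError, while the tuple version of the same data is accepted. The exact wording of the error message can differ between Python versions, so the example only checks that an error was raised.

```python
nested = ['a', 1]

try:
    set([nested])  # a list cannot be a member of a set
    raised = False
except TypeError as error:
    raised = True
    message = str(error)

print(raised)  # True

# The same data as a tuple is hashable, so this works
as_tuple = tuple(nested)
tuple_set = {as_tuple}

print(tuple_set)  # {('a', 1)}
```

Keep in mind that a tuple is only hashable if everything inside it is hashable too, so nested lists that themselves contain lists would need a deeper conversion.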

Set objects are an unordered collection of unique elements, so when we convert the list of tuples to a set, all duplicates are automatically removed.

Note, however, that set objects are unordered, so there is no guarantee that the original order of the nested lists is preserved. Running the same code again can produce a different order.
main.py
list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

my_set = set(tuple(l) for l in list_of_lists)

new_list = [list(tup) for tup in my_set]

# 👉️ another possible output: [['c', 3], ['a', 1], ['b', 2]]
print(new_list)

If you need to be sure that the order of the nested lists is preserved, use the for loop solution.
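Another option that keeps the order while still relying on hashing is to use dictionary keys instead of a set. This is an alternative to the approaches above rather than part of them, and it assumes Python 3.7 or later, where dictionaries preserve insertion order.

```python
list_of_lists = [['a', 1], ['b', 2], ['a', 1], ['b', 2], ['c', 3]]

# Dictionary keys are unique and keep the order in which they were inserted
unique_tuples = dict.fromkeys(tuple(l) for l in list_of_lists)

# Convert each tuple key back to a list
new_list = [list(tup) for tup in unique_tuples]

print(new_list)  # [['a', 1], ['b', 2], ['c', 3]]
```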

The set() class takes an optional iterable argument and returns a new set object with elements taken from the iterable.
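A few illustrative calls, with made-up data, show how set() behaves with and without an argument:

```python
# No argument: an empty set
empty = set()

# Any iterable works, here a string (the repeated "l" is kept once)
letters = set('hello')

# Duplicate tuples collapse into a single member
pairs = set([('a', 1), ('a', 1), ('b', 2)])

print(len(empty), len(letters), len(pairs))  # 0 4 2
```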
