Combine two Lists and remove Duplicates in Python

Borislav Hadzhiev

Last updated: Apr 10, 2024

# Table of Contents

  1. Combine two lists and remove duplicates in Python
  2. Combine two lists and remove duplicates using numpy
  3. Combine two lists and remove duplicates using list.extend()

# Combine two lists and remove duplicates in Python

To combine two lists and remove duplicates:

  1. Use the set() class to convert the lists to set objects.
  2. Get the difference between the sets.
  3. Use the list() class to convert the result back to a list.
  4. Use the addition (+) operator to combine the two lists.

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = list1 + list(set(list2) - set(list1))
print(result)  # 👉️ [1, 4, 6, 9, 14, 7]

The code for this article is available on GitHub
If you need a solution that doesn't involve set objects, scroll down to the numpy and list.extend() subheadings.

The first step is to use the set() class to convert the lists to set objects.

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
print(set(list1))  # 👉️ {1, 4, 6, 9}
print(set(list2))  # 👉️ {4, 7, 6, 14}

Set objects are unordered collections of unique elements and implement a difference() method.
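
Because sets are unordered, the order in which the new items come out of a set difference is an implementation detail. As a small sketch (the sorted() call is an addition for illustration, not part of the steps above), you can sort the difference if you need a predictable result:

```python
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]

# set() drops repeated values
print(set([4, 4, 6]) == {4, 6})  # 👉️ True

# Sorting the difference makes the order of the appended items predictable
new_items = sorted(set(list2) - set(list1))
result = list1 + new_items
print(result)  # 👉️ [1, 4, 6, 9, 7, 14]
```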

# Using subtraction vs using set.difference()

The following 2 examples combine the two lists and remove the duplicates.

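The minus sign is shorthand for calling the difference() method on the set, with one distinction: the - operator requires both operands to be sets, whereas difference() accepts any iterable. Because sets are unordered, the two results may list the new items in a different order.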

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = list1 + list(set(list2) - set(list1))
print(result)  # 👉️ [1, 4, 6, 9, 14, 7]
result = list1 + list(set(list2).difference(list1))
print(result)  # 👉️ [1, 4, 6, 9, 7, 14]

The difference() method returns a new set with elements in the set that are not in the provided iterable.

In other words, set(list2).difference(list1) returns a new set that contains the items in list2 that are not in list1.

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
print(set(list2).difference(list1))  # 👉️ {7, 14}
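
The fact that difference() takes any iterable is the main practical distinction from the minus sign. A short sketch of that behavior (the error_message variable is only there for illustration):

```python
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]

# difference() accepts any iterable, so list1 can be passed as-is
diff = set(list2).difference(list1)
print(diff == {14, 7})  # 👉️ True

# The - operator only works between sets, so a list operand raises TypeError
try:
    set(list2) - list1
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

This is why the subtraction example wraps both lists in set(), while the difference() example only wraps list2.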

The last step is to convert the set to a list and use the addition (+) operator to combine the two lists.

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = list1 + list(set(list2) - set(list1))
print(result)  # 👉️ [1, 4, 6, 9, 14, 7]
result = list1 + list(set(list2).difference(list1))
print(result)  # 👉️ [1, 4, 6, 9, 7, 14]

When the addition (+) operator is used with two lists, it combines the lists into a single list.

main.py
print(['a'] + ['b'])  # 👉️ ['a', 'b']
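
The + operator builds a new list and leaves both operands unchanged. One thing to keep in mind is that the set-difference approach only filters the second list, so values repeated inside the first list are kept. The sketch below shows this with made-up sample data and, as an alternative that is not part of the approach above, uses dict.fromkeys() to drop every repeated value while keeping first-seen order:

```python
list1 = [1, 4, 4, 9]  # sample data with a repeated 4
list2 = [4, 6, 14, 7]

# sorted() is only used here to make the output order predictable
combined = list1 + sorted(set(list2) - set(list1))
print(combined)  # 👉️ [1, 4, 4, 9, 6, 7, 14]
print(list1)     # 👉️ [1, 4, 4, 9] (unchanged)

# dict keys are unique and keep insertion order (Python 3.7+)
deduped = list(dict.fromkeys(list1 + list2))
print(deduped)   # 👉️ [1, 4, 9, 6, 14, 7]
```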

# Combine two lists and remove duplicates using numpy

If you use numpy, you can also use the numpy.unique() function.

main.py
import numpy as np
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = np.unique(list1 + list2).tolist()
print(result)  # 👉️ [1, 4, 6, 7, 9, 14]

Make sure you have the numpy module installed to be able to run the code sample.

shell
pip install numpy
# 👇️ or with pip3
pip3 install numpy

The numpy.unique() function returns the sorted unique elements of the provided array-like object, so the original order of the items is not preserved.

The ndarray.tolist() method converts the resulting array to a regular Python list.
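
For flat lists of comparable values, roughly the same sorted result can be obtained without numpy. This is a sketch using only built-ins, offered as an alternative rather than as a description of what numpy.unique() does internally:

```python
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]

# set() removes the duplicates, sorted() returns them as a sorted list
result = sorted(set(list1 + list2))
print(result)  # 👉️ [1, 4, 6, 7, 9, 14]
```

Like numpy.unique(), this discards the original order of the items.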

Alternatively, you can use the list.extend() method.

# Combine two lists and remove duplicates using list.extend()

This is a four-step process:

  1. Use the list.copy() method to create a copy of the first list.
  2. Use a generator expression to iterate over the second list.
  3. Check if each item is not present in the copied list.
  4. Use the list.extend() method to combine the two lists.

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = list1.copy()
result.extend(item for item in list2 if item not in result)
print(result)  # 👉️ [1, 4, 6, 9, 14, 7]

The list.copy() method returns a shallow copy of the object on which the method was called.

main.py
list1 = [1, 4, 6, 9]
result = list1.copy()
print(result)  # 👉️ [1, 4, 6, 9]
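
To illustrate what "shallow" means here, the following sketch (with made-up nested sample data) shows that the copy is a separate list, but any nested objects are shared with the original:

```python
list1 = [1, 4, 6, 9]
result = list1.copy()

result.append(14)
print(list1)   # 👉️ [1, 4, 6, 9] (original is unaffected)
print(result)  # 👉️ [1, 4, 6, 9, 14]

# Shallow: nested objects are shared between the original and the copy
nested = [[1, 2], [3]]
nested_copy = nested.copy()
nested_copy[0].append(99)
print(nested)  # 👉️ [[1, 2, 99], [3]]
```

For flat lists of numbers or strings, as in this article, the shallow copy is all that's needed to leave the first list untouched.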

We used a generator expression to iterate over the second list.

Generator expressions are used to perform some operation for every element or select a subset of elements that meet a condition.
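
A brief sketch of both uses, plus the lazy evaluation that distinguishes a generator expression from a list (the variable names are only for illustration):

```python
list2 = [4, 6, 14, 7]

# Perform an operation for every element
doubled = list(item * 2 for item in list2)
print(doubled)  # 👉️ [8, 12, 28, 14]

# Select the subset of elements that meet a condition
large = list(item for item in list2 if item > 5)
print(large)  # 👉️ [6, 14, 7]

# A generator expression is lazy: values are produced on demand
gen = (item for item in list2 if item > 5)
first = next(gen)
print(first)  # 👉️ 6
```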

On each iteration, we check whether the current item is absent from the copied list. Only the items that pass the check are passed on to list.extend().

main.py
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]
result = list1.copy()
result.extend(item for item in list2 if item not in result)
print(result)  # 👉️ [1, 4, 6, 9, 14, 7]
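
If the one-liner feels dense, the same idea can be sketched as an explicit for loop. This is an equivalent rewrite for illustration, using sample data in which 14 appears twice in the second list:

```python
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7, 14]  # sample data: 14 appears twice

result = list1.copy()
for item in list2:
    # Only append items that are not in the result yet
    if item not in result:
        result.append(item)

print(result)  # 👉️ [1, 4, 6, 9, 14, 7]
```

Because result grows inside the loop, a value repeated in the second list is added only once.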

The in operator tests for membership. For example, x in l evaluates to True if x is a member of l, otherwise, it evaluates to False.
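
Here are a couple of quick membership checks, followed by a sketch of a variation for longer lists. A check such as item not in result scans the list each time, so keeping a helper set (named seen here, a hypothetical name) makes each lookup faster on average. This assumes the items are hashable:

```python
list1 = [1, 4, 6, 9]
list2 = [4, 6, 14, 7]

print(4 in list1)       # 👉️ True
print(14 not in list1)  # 👉️ True

# Variation: track seen values in a set for fast membership tests
seen = set(list1)
result = list1.copy()
for item in list2:
    if item not in seen:
        seen.add(item)
        result.append(item)

print(result)  # 👉️ [1, 4, 6, 9, 14, 7]
```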

The last step is to extend the copy with the items of the second list that are not in the first.

The list.extend() method takes an iterable and extends the list by appending all of the items from the iterable.

main.py
my_list = ['bobby']
my_list.extend(['hadz', '.', 'com'])
print(my_list)  # 👉️ ['bobby', 'hadz', '.', 'com']

The list.extend() method returns None as it mutates the original list.
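
A common pitfall that follows from this is assigning the result of extend() to a variable. A minimal sketch (the names wrong and right are only for illustration):

```python
list1 = [1, 4, 6, 9]

# Pitfall: assigning the return value of extend() stores None
wrong = list1.copy().extend([14, 7])
print(wrong)  # 👉️ None

# Call extend() on its own line and keep using the list variable
right = list1.copy()
right.extend([14, 7])
print(right)  # 👉️ [1, 4, 6, 9, 14, 7]
```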

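Which approach you pick is largely a matter of personal preference, with a few practical differences: the set-based approach doesn't guarantee the order of the appended items, numpy.unique() returns a sorted result, and list.extend() preserves the original order. I'd use a generator expression because I find them quite direct and easy to read.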

I've also written an article on how to find the common values in multiple lists.

Copyright © 2024 Borislav Hadzhiev