How to Create a Set from a Series in Pandas [5 Ways]

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Table of Contents

  1. How to Create a Set from a Series in Pandas
  2. How to convert a Series from a DataFrame to a Set
  3. Passing the result of calling unique() to the set() constructor
  4. Create a Set from a Series in Pandas using a for loop

# How to Create a Set from a Series in Pandas

To create a Set from a Series in Pandas:

  1. Use the Series.unique() method if you need to get an array containing the unique values in the Series.
  2. Use the set() class if you need to convert the Series to a set object.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

unique = s.unique()
print(unique)  # 👉️ [1 2 3 4 5]

# 👇️ <class 'numpy.ndarray'>
print(type(unique))

a_set = set(unique)
# 👇️ {1, 2, 3, 4, 5} (NumPy 2+ prints the elements as np.int64(1), ...)
print(a_set)
```

The code for this article is available on GitHub

The Series.unique() method returns the unique values contained in a Series object.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

unique = s.unique()
print(unique)  # 👉️ [1 2 3 4 5]
```

The unique() method returns the unique values as a NumPy array.
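As a small illustration of what that array contains, the sketch below uses a different, hypothetical Series to show that unique() keeps values in order of first appearance rather than sorting them, and that the elements are NumPy scalars. Calling tolist() on the array is one way to get built-in Python integers instead.

```python
import pandas as pd

# Hypothetical sample data, chosen so the values are not already sorted
s = pd.Series([3, 1, 3, 2, 1])

unique = s.unique()

# unique() preserves the order of first appearance; it does not sort
unique_list = unique.tolist()
print(unique_list)  # [3, 1, 2]

# Elements of the array are NumPy scalars, not built-in ints
print(type(unique[0]).__name__)  # int64

# tolist() converts them to built-in Python ints
print(type(unique_list[0]).__name__)  # int
```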

If you need to get the result as a set, you can use the set() constructor instead.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set(s)
print(a_set)  # 👉️ {1, 2, 3, 4, 5}

print(type(a_set))  # 👉️ <class 'set'>
```

A set is an unordered collection of unique elements.
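A short sketch of what this means in practice, reusing the Series from the example above: the set drops the duplicates, supports fast membership tests, and cannot be indexed by position.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])
a_set = set(s)

# The Series has 8 values, but only 5 of them are distinct
print(len(s))      # 8
print(len(a_set))  # 5

# Membership tests work on sets
print(3 in a_set)  # True

# Sets have no positions, so indexing raises a TypeError
try:
    a_set[0]
except TypeError as error:
    print(error)  # 'set' object is not subscriptable
```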

# How to convert a Series from a DataFrame to a Set

If you need to convert a Series in a DataFrame to a Set, access it before using the unique() method or set() constructor.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'salary': [100, 100, 100, 200]
})

unique = df['salary'].unique()
print(unique)  # 👉️ [100 200]

a_set = set(df['salary'])
print(a_set)  # 👉️ {200, 100}
```

We used bracket notation [] to access the Series before calling the unique() method.

If you need to get an array containing the unique values in the Series, the unique() method will suffice.

If you need to get a set object, use the set() constructor.

Notice that the elements in the set are not ordered.
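If you need the unique values in a predictable order, one option is to pass the set to the built-in sorted() function, which returns a new list. This is a minimal sketch that reuses the DataFrame from the example above.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'salary': [100, 100, 100, 200]
})

a_set = set(df['salary'])

# sorted() returns a new list with the elements in ascending order
sorted_salaries = sorted(a_set)
print(sorted_salaries)  # [100, 200]
```

Note that the result is a list, not a set, because a set cannot preserve an ordering.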

# Passing the result of calling unique() to the set() constructor

If you work with a large Series, it is often faster to:

  1. Use the unique() method to remove the duplicates from the Series.
  2. Pass the resulting array of unique values to the set() constructor.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set(s.unique())
# 👇️ {1, 2, 3, 4, 5} (NumPy 2+ prints the elements as np.int64(1), ...)
print(a_set)
```

We first remove the duplicates from the Series using unique() and then pass the resulting NumPy array of unique values to set().

This tends to be faster for a large Series with many duplicates, because unique() removes the duplicates in optimized code before Python builds the set.
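One way to check this on your own data is to time both approaches with the standard library's timeit module. The sketch below is only illustrative: the Series size, the value range and the number of repetitions are assumptions, and the timings depend on your machine and on your pandas and NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 200,000 integers drawn from only 100 distinct values
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 100, size=200_000))

# Time each approach over 3 runs
direct = timeit.timeit(lambda: set(s), number=3)
via_unique = timeit.timeit(lambda: set(s.unique()), number=3)

print(f'set(s):          {direct:.4f}s')
print(f'set(s.unique()): {via_unique:.4f}s')

# Both approaches produce the same set of values
same_result = set(s) == set(s.unique())
print(same_result)  # True
```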

# Create a Set from a Series in Pandas using a for loop

You can also use a basic for loop to create a set from a Series in Pandas.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set()

for element in s.unique():
    a_set.add(element)

# 👇️ {1, 2, 3, 4, 5} (NumPy 2+ prints the elements as np.int64(1), ...)
print(a_set)
```

We used a for loop to iterate over the unique values in the Series and used the set.add() method to add each element to the set.

You don't necessarily have to call the unique() method to achieve the same result.

main.py
```python
import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set()

for element in s:
    a_set.add(element)

print(a_set)  # 👉️ {1, 2, 3, 4, 5}
```

The code sample achieves the same result because set objects only store unique elements, so no duplicates can get added to the set.

In other words, adding a duplicate value to a set is a no-op (no operation).
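The no-op behavior can be seen with a small standalone sketch that uses a plain Python set. The values here are only illustrative.

```python
a_set = {1, 2, 3}

# Adding a value that is already in the set changes nothing
a_set.add(3)
print(a_set)       # {1, 2, 3}
print(len(a_set))  # 3

# Adding a new value grows the set
a_set.add(4)
print(len(a_set))  # 4
```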

Copyright © 2024 Borislav Hadzhiev