Last updated: Apr 12, 2024

To create a set from a Series in Pandas:

1. Use the Series.unique() method if you need to get an array containing the unique values in the Series.
2. Use the set() class if you need to convert the Series to a set object.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

unique = s.unique()
print(unique)  # [1 2 3 4 5]
print(type(unique))  # <class 'numpy.ndarray'>

a_set = set(unique)
print(a_set)  # {1, 2, 3, 4, 5} (NumPy 2.x prints each element as np.int64(...))

The Series.unique() method returns the unique values contained in a Series object.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

unique = s.unique()
print(unique)  # [1 2 3 4 5]

The unique() method returns the unique values as a NumPy array.
If you need to get the result as a set, you can use the set() constructor
instead.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set(s)
print(a_set)  # {1, 2, 3, 4, 5}
print(type(a_set))  # <class 'set'>

Set objects are unordered collections of unique elements.
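
The difference in ordering can be seen with a small sketch. The sample values below are made up for illustration. unique() returns values in order of first appearance, whereas a set has no defined order, so the example sorts the set before printing it.

```python
import pandas as pd

# Illustrative data, chosen so that first-appearance order differs from sorted order
s = pd.Series([3, 1, 3, 2, 1])

# unique() keeps the values in the order in which they first appear
unique = s.unique()
print(unique.tolist())  # [3, 1, 2]

# A set has no defined order, so sort it when a stable order is needed
a_set = set(s)
print(sorted(a_set))  # [1, 2, 3]
```

If the order of the values matters, the array from unique() is likely the better fit. If you only need membership checks, a set is usually enough.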
If you need to convert a Series in a DataFrame to a set, access the column
first and then call the unique() method or pass the column to the set() constructor.

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'salary': [100, 100, 100, 200]
})

unique = df['salary'].unique()
print(unique)  # [100 200]

a_set = set(df['salary'])
print(a_set)  # {200, 100}

We used bracket notation [] to access the Series before calling the
unique() method.
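
The same bracket access can be repeated for every column. The following is a minimal sketch that reuses the DataFrame from the example above. The name sets_by_column is hypothetical and only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'salary': [100, 100, 100, 200]
})

# Hypothetical variable name: maps each column label to a set of that column's values
sets_by_column = {column: set(df[column]) for column in df.columns}

print(sets_by_column['salary'] == {100, 200})  # True
print(len(sets_by_column['name']))  # 4
```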
If you need to get an array containing the unique values in the Series, the
unique() method will suffice.
If you need to get a set object, use the set() constructor.
Note that the elements of a set are not ordered.

Passing the result of unique() to the set() constructor

If you work with large Series objects, it is faster to:

1. Use the unique() method to remove the duplicates from the Series.
2. Pass the array of unique values to the set() constructor.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set(s.unique())
print(a_set)  # {1, 2, 3, 4, 5} (NumPy 2.x prints each element as np.int64(...))

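We first remove the duplicates from the Series using unique() and then pass the
resulting array of unique values to the set() constructor.
This is typically faster for large Series objects with many duplicates, because
unique() removes them in compiled code and set() then only has to iterate over
the distinct values.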
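
One way to check the performance claim on your own machine is a rough timing sketch with the standard timeit module. The test data (one million integers drawn from 100 distinct values) is an assumption chosen to include many duplicates. Actual timings depend on your hardware, library versions and data, so no numbers are shown here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: 1,000,000 integers drawn from only 100 distinct values
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 100, size=1_000_000))

# Time both approaches over a few runs
direct = timeit.timeit(lambda: set(s), number=3)
via_unique = timeit.timeit(lambda: set(s.unique()), number=3)

print(f'set(s):          {direct:.3f}s')
print(f'set(s.unique()): {via_unique:.3f}s')

# Both approaches produce equal sets
print(set(s) == set(s.unique()))  # True
```

Note that iterating over the array returned by unique() yields NumPy scalars. If you need plain Python integers in the set, set(s.unique().tolist()) converts them first.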
Using a for loop

You can also use a basic for loop to create a set from a Series in Pandas.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set()

for element in s.unique():
    a_set.add(element)

print(a_set)  # {1, 2, 3, 4, 5} (elements are NumPy integers)

We used a for loop to iterate over the unique values in the Series and used
the set.add() method to add each element to the set.
You don't necessarily have to call the unique() method to achieve the same
result.

import pandas as pd

s = pd.Series([1, 2, 3, 3, 1, 4, 5, 5])

a_set = set()

for element in s:
    a_set.add(element)

print(a_set)  # {1, 2, 3, 4, 5}

The code sample achieves the same result because set objects only store unique
elements, so no duplicates can get added to the set.
In other words, adding a duplicate value to a set is a no-op (no operation).
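
The no-op behavior can be verified with a short sketch that uses only built-in set operations. The values are arbitrary.

```python
a_set = {1, 2, 3}

# 3 is already in the set, so add() leaves the set unchanged
a_set.add(3)
print(len(a_set))  # 3

# 4 is new, so the set grows by one element
a_set.add(4)
print(len(a_set))  # 4
```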