Last updated: Apr 12, 2024
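To create a dictionary from two DataFrame columns in Pandas:

1. Use the pandas.Series() constructor to create a Series.
2. Pass the values of the column that should supply the dictionary values as the data argument.
3. Pass the column that should supply the dictionary keys as the index argument.
4. Call the to_dict() method on the Series.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

print(df)
print('-' * 50)

a_dict = pd.Series(
    df['day_name'].values,
    index=df['digit']
).to_dict()

print(a_dict)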
Running the code sample produces the following output.
   digit   day_name
0      1     Monday
1      2    Tuesday
2      3  Wednesday
--------------------------------------------------
{1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
The pandas.Series class is used to create a one-dimensional ndarray with axis labels.

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
a_dict = pd.Series(
    df['day_name'].values,
    index=df['digit']
).to_dict()
We used the Series.values attribute to get a NumPy representation of the day_name column (selecting a single column with df['day_name'] returns a Series).

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# ['Monday' 'Tuesday' 'Wednesday']
print(df['day_name'].values)
The digit column is used for the index parameter of the pandas.Series class.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# 0    1
# 1    2
# 2    3
# Name: digit, dtype: int64
print(df['digit'])
The last step is to use the Series.to_dict() method to convert the Series to a dictionary.

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
a_dict = pd.Series(
    df['day_name'].values,
    index=df['digit']
).to_dict()
The to_dict() method returns a dictionary in which the index labels of the Series are the keys and the Series values are the values.
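The steps above can be wrapped in a small reusable function. The sketch below is only an illustration: the name columns_to_dict is hypothetical (it is not part of pandas), and the extra row was added to show one caveat, namely that when the key column contains duplicates, the dictionary keeps the value from the last matching row.

```python
import pandas as pd


def columns_to_dict(df, key_col, value_col):
    """Hypothetical helper: map the values of key_col to the values of value_col."""
    return pd.Series(df[value_col].values, index=df[key_col]).to_dict()


df = pd.DataFrame({
    'digit': [1, 2, 3, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday', 'Thursday'],
})

# The key 3 appears twice, so the value from the last row wins.
a_dict = columns_to_dict(df, 'digit', 'day_name')

print(a_dict)  # {1: 'Monday', 2: 'Tuesday', 3: 'Thursday'}
```

If silently overwriting duplicates is not acceptable, it may be worth checking df[key_col].is_unique before building the dictionary.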
dict()
You can also use the dict() class and the zip() function to create a dictionary from two DataFrame columns.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

print(df)
print('-' * 50)

a_dict = dict(zip(df['digit'], df['day_name']))

print(a_dict)
Running the code sample produces the following output.
   digit   day_name
0      1     Monday
1      2    Tuesday
2      3  Wednesday
--------------------------------------------------
{1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
The zip() function iterates over several iterables in parallel and produces tuples with an item from each iterable.
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# [(1, 'Monday'), (2, 'Tuesday'), (3, 'Wednesday')]
print(list(zip(df['digit'], df['day_name'])))
We can pass the iterable of tuples to the dict() class to construct a dictionary.
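Because zip() simply yields (key, value) pairs, the same pairs can also feed a dict comprehension when you need to filter or transform entries along the way. The example below is a minimal sketch; the filter (digit > 1) and the upper-casing are arbitrary choices made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday'],
})

# Same pairs as dict(zip(...)), with a filter and a transformation applied.
a_dict = {
    digit: day_name.upper()
    for digit, day_name in zip(df['digit'], df['day_name'])
    if digit > 1
}

print(a_dict)  # {2: 'TUESDAY', 3: 'WEDNESDAY'}
```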
If you'd like to swap the keys and values, swap the order of the arguments in the call to zip().

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

a_dict = dict(zip(df['day_name'], df['digit']))

# {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3}
print(a_dict)
set_index()
You can also use the DataFrame.set_index() method to create a dictionary from two DataFrame columns.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

a_dict = df.set_index('digit').to_dict()['day_name']

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
print(a_dict)
The DataFrame.set_index() method sets the DataFrame index using the specified column.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

#         day_name
# digit
# 1         Monday
# 2        Tuesday
# 3      Wednesday
print(df.set_index('digit'))
The method returns a DataFrame on which we can call the to_dict() method.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# {'day_name': {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}}
print(df.set_index('digit').to_dict())
The last step is to access the day_name key to get the nested dictionary.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# {'day_name': {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}}
print(df.set_index('digit').to_dict())

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
print(df.set_index('digit').to_dict()['day_name'])
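A possible variation on the approach above is to select the value column before calling to_dict(). This is a sketch rather than a recommendation: it yields the same dictionary, but converts only the one Series you need instead of building a nested dictionary for every remaining column.

```python
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday'],
})

# Select the value column first, then convert only that Series.
a_dict = df.set_index('digit')['day_name'].to_dict()

print(a_dict)  # {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
```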
to_records()
You can also use the DataFrame.to_records() method if your DataFrame has exactly 2 columns.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

a_dict = dict(df.to_records(index=False))

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
print(a_dict)
The DataFrame.to_records() method converts the DataFrame to a NumPy record array.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# [(0, 1, 'Monday') (1, 2, 'Tuesday') (2, 3, 'Wednesday')]
print(df.to_records())

# [(1, 'Monday') (2, 'Tuesday') (3, 'Wednesday')]
print(df.to_records(index=False))
We had to set the index argument to False to exclude the index from the resulting array of records. Each record then holds exactly two fields, so you can pass the array to the dict() class to create a dictionary.
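When the DataFrame has more than two columns, one option is to select the key and value columns before calling to_records(). The sketch below assumes a hypothetical third column named is_weekend, added purely for illustration. It also converts each key with int(), because the integers stored in a record array are NumPy integers rather than plain Python ints.

```python
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday'],
    'is_weekend': [False, False, False],  # hypothetical extra column
})

# Keep only the key and value columns so every record has exactly 2 fields.
records = df[['digit', 'day_name']].to_records(index=False)

# int(key) converts the NumPy integer stored in each record to a plain int.
a_dict = {int(key): value for key, value in records}

print(a_dict)  # {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
```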
MultiIndex.from_frame
You can also use the MultiIndex.from_frame() method.
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

a_dict = dict(pd.MultiIndex.from_frame(df))

# {1: 'Monday', 2: 'Tuesday', 3: 'Wednesday'}
print(a_dict)
The pandas.MultiIndex.from_frame() method creates a MultiIndex from a DataFrame.

import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday']
})

# MultiIndex([(1,    'Monday'),
#             (2,   'Tuesday'),
#             (3, 'Wednesday')],
#            names=['digit', 'day_name'])
print(pd.MultiIndex.from_frame(df))
Iterating over the MultiIndex yields one tuple per row, so you can pass it to the dict() class to construct a dictionary.
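Note that dict() only accepts pairs, so this approach relies on the DataFrame having exactly two columns. The sketch below assumes a hypothetical third column named is_weekend to show what happens otherwise, and how selecting two columns first avoids the problem.

```python
import pandas as pd

df = pd.DataFrame({
    'digit': [1, 2, 3],
    'day_name': ['Monday', 'Tuesday', 'Wednesday'],
    'is_weekend': [False, False, False],  # hypothetical extra column
})

# With three columns each entry is a 3-tuple, which dict() rejects.
try:
    dict(pd.MultiIndex.from_frame(df))
except ValueError as error:
    print(f'ValueError: {error}')

# Selecting the key and value columns first yields 2-tuples again.
pairs = pd.MultiIndex.from_frame(df[['digit', 'day_name']])
a_dict = dict(pairs)

print(a_dict)
```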