# Pandas: Create a Tuple from two DataFrame Columns

To create a tuple from two DataFrame columns in Pandas:

1. Use the zip() function to get a zip object of tuples with the values of the two columns.
2. Convert the zip object to a list.
3. Add the result as a DataFrame column.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

df['stats'] = list(zip(df['salary'], df['experience']))

#   first_name  salary  experience        stats
# 0      Alice   175.1          10  (175.1, 10)
# 1      Bobby   180.2          15  (180.2, 15)
# 2       Carl   190.3          20  (190.3, 20)
print(df)

The code for this article is available on GitHub

The zip() function iterates over several iterables in parallel and produces tuples with an item from each iterable.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# [(175.1, 10), (180.2, 15), (190.3, 20)]
print(list(zip(df['salary'], df['experience'])))

The first item in each tuple is the salary value and the second is the experience value.

Once we've created the list of tuples, we can add it as a column to the DataFrame using bracket notation.

main.py

df['stats'] = list(zip(df['salary'], df['experience']))

#   first_name  salary  experience        stats
# 0      Alice   175.1          10  (175.1, 10)
# 1      Bobby   180.2          15  (180.2, 15)
# 2       Carl   190.3          20  (190.3, 20)
print(df)

The zip() function returns a zip object, so make sure to convert the result to a list by using the list class.
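
The same pattern extends to more than two columns. The sketch below is only an illustration: `columns_to_tuples` is a hypothetical helper (it is not part of pandas), and the `profile` column name is made up for the example.

```python
import pandas as pd


def columns_to_tuples(frame, columns):
    # Hypothetical helper (not a pandas API): zip the selected
    # columns together and return one tuple per row.
    return list(zip(*(frame[col] for col in columns)))


df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

df['stats'] = columns_to_tuples(df, ['salary', 'experience'])
df['profile'] = columns_to_tuples(
    df, ['first_name', 'salary', 'experience']
)

# [(175.1, 10), (180.2, 15), (190.3, 20)]
print(df['stats'].tolist())

# ('Alice', 175.1, 10)
print(df['profile'].iloc[0])
```

Unpacking a generator of columns into zip() keeps the call the same no matter how many column names are passed in.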

# Pandas: Create a Tuple from two DataFrame Columns using `apply()`

You can also use the DataFrame.apply() method to create a tuple from two DataFrame columns.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

df['stats'] = df[['salary', 'experience']].apply(tuple, axis=1)

#   first_name  salary  experience          stats
# 0      Alice   175.1          10  (175.1, 10.0)
# 1      Bobby   180.2          15  (180.2, 15.0)
# 2       Carl   190.3          20  (190.3, 20.0)
print(df)

The DataFrame.apply() method applies a function along an axis of the DataFrame.

We set the axis to 1, so the given function is applied to each row.

main.py

df['stats'] = df[['salary', 'experience']].apply(tuple, axis=1)

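We passed the tuple class as the first argument to apply(), so the salary and experience values of each row get converted to a tuple. Each row is passed to the function as a Series with a single dtype, which is why the integer experience values show up as floats (e.g. 10.0) in the output.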
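
If the float conversion of the integer column is not what you want, one possible workaround is to cast the selected columns to object before calling apply(). This is a minimal sketch under that assumption, not the only way to do it; the variable names `upcast` and `preserved` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# Each row reaches tuple() as a float64 Series, so 10 becomes 10.0.
upcast = df[['salary', 'experience']].apply(tuple, axis=1)

# (175.1, 10.0)
print(upcast.iloc[0])

# Assumption: casting to object first lets each value keep its own type,
# so the experience value stays an integer.
preserved = (
    df[['salary', 'experience']]
    .astype(object)
    .apply(tuple, axis=1)
)

print(preserved.iloc[0])
```

The zip() approach from the first section avoids the issue entirely because it reads each column separately.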

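Notice that we used two sets of square brackets when selecting the two columns. The inner brackets are a list of column names, and indexing with a list returns a DataFrame that contains only those columns.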

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

#    salary  experience
# 0   175.1          10
# 1   180.2          15
# 2   190.3          20
print(df[['salary', 'experience']])
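
To make the difference between one and two sets of square brackets concrete, the short sketch below compares the types of the two selections. The variable names `single` and `double` are only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# A single column label returns a Series.
single = df['salary']

# A list of column labels returns a DataFrame.
double = df[['salary', 'experience']]

print(type(single).__name__)  # Series
print(type(double).__name__)  # DataFrame
print(double.shape)           # (3, 2)
```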

# Pandas: Create a Tuple from two DataFrame Columns using `itertuples()`

You can also use the DataFrame.itertuples() method to create a tuple from two DataFrame columns.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})


df['stats'] = list(
    df[['salary', 'experience']].itertuples(
        index=False,
        name=None
    )
)

#   first_name  salary  experience        stats
# 0      Alice   175.1          10  (175.1, 10)
# 1      Bobby   180.2          15  (180.2, 15)
# 2       Carl   190.3          20  (190.3, 20)
print(df)

The DataFrame.itertuples() method iterates over the rows of the DataFrame and, by default, returns them as named tuples.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# [(175.1, 10), (180.2, 15), (190.3, 20)]
print(
    list(df[['salary', 'experience']].itertuples(
        index=False,
        name=None
    ))
)

Notice that we set the name argument to None.

This is necessary because, by default, the tuples are named.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# [Pandas(salary=175.1, experience=10), Pandas(salary=180.2, experience=15), Pandas(salary=190.3, experience=20)]
print(
    list(df[['salary', 'experience']].itertuples(
        index=False,
    ))
)
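
Named tuples are not a problem in themselves. If you keep the default name, each field can be read by its column name, as the short sketch below shows. The variable names `rows` and `first` are only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

rows = list(df[['salary', 'experience']].itertuples(index=False))
first = rows[0]

print(first.salary)      # 175.1
print(first.experience)  # 10
print(first._fields)     # ('salary', 'experience')

# A named tuple converts to a plain tuple with tuple().
print(tuple(first))      # (175.1, 10)
```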

We also had to set the index argument to False.

If you don't, the index is the first element of each tuple.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# [(0, 175.1, 10), (1, 180.2, 15), (2, 190.3, 20)]
print(
    list(df[['salary', 'experience']].itertuples(
        name=None
    ))
)

# Pandas: Create a List from two DataFrame Columns

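If you need each row stored as a list-like value (instead of a tuple), you can also use the DataFrame.to_records() method. Note that the elements are numpy.record objects rather than plain Python lists, even though pandas displays them with square brackets.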

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})


df['stats'] = list(
    df[['salary', 'experience']].to_records(index=False)
)

#   first_name  salary  experience        stats
# 0      Alice   175.1          10  [175.1, 10]
# 1      Bobby   180.2          15  [180.2, 15]
# 2       Carl   190.3          20  [190.3, 20]
print(df)

The DataFrame.to_records() method converts the DataFrame to a NumPy record array.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})


# [(175.1, 10), (180.2, 15), (190.3, 20)]
print(
    list(
        df[['salary', 'experience']].to_records(index=False)
    )
)
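
The items produced by to_records() are NumPy records, not built-in tuples or lists. The sketch below checks the types and shows one way to get plain Python values by calling tolist() on the record array. The variable names are only used for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

records = df[['salary', 'experience']].to_records(index=False)

print(type(records).__name__)             # recarray
print(isinstance(records[0], np.record))  # True

# tolist() on the record array returns plain Python tuples.
as_tuples = records.tolist()
print(as_tuples)  # [(175.1, 10), (180.2, 15), (190.3, 20)]

# Convert each tuple to a list if real lists are needed.
as_lists = [list(row) for row in as_tuples]
print(as_lists)   # [[175.1, 10], [180.2, 15], [190.3, 20]]
```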

Notice that we had to set the index argument to False.

The argument defaults to True, which means that the index is the first element of each record.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})


# [(0, 175.1, 10), (1, 180.2, 15), (2, 190.3, 20)]
print(
    list(
        df[['salary', 'experience']].to_records(index=True)
    )
)

# Pandas: Create a List from two DataFrame Columns using `values.tolist()`

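You can also use the DataFrame.values attribute together with the tolist() method to get a similar result, this time as plain Python lists. Because the two columns are combined into a single float array, the integer values are converted to floats.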

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})


df['stats'] = df[['salary', 'experience']].values.tolist()

#   first_name  salary  experience          stats
# 0      Alice   175.1          10  [175.1, 10.0]
# 1      Bobby   180.2          15  [180.2, 15.0]
# 2       Carl   190.3          20  [190.3, 20.0]
print(df)

The DataFrame.values attribute returns a NumPy representation of the DataFrame.

main.py

import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

# [[175.1  10. ]
#  [180.2  15. ]
#  [190.3  20. ]]
print(df[['salary', 'experience']].values)

The last step is to use the tolist() method to convert the NumPy array to a list of lists.
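
The pandas documentation recommends DataFrame.to_numpy() over the values attribute. The sketch below repeats the example with to_numpy(); it should behave the same way here, including the conversion of the integers to floats. The variable name `arr` is only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
    'experience': [10, 15, 20]
})

arr = df[['salary', 'experience']].to_numpy()

# The float and int columns share one array, so the dtype is float64.
print(arr.dtype)  # float64

df['stats'] = arr.tolist()

# [[175.1, 10.0], [180.2, 15.0], [190.3, 20.0]]
print(df['stats'].tolist())
```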

Borislav Hadzhiev