Last updated: Apr 12, 2024
To swap two DataFrame columns in Pandas, list the column names in the new order and pass the list to the DataFrame.reindex() method.

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'age': [29, 30, 31, 32],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    print(df)

    # The column names in the order we want them
    column_names = ['name', 'salary', 'age']
    df = df.reindex(columns=column_names)

    print('-' * 50)
    print(df)
Running the code sample produces the following output.
        name  age  salary
    0  Alice   29   175.1
    1  Bobby   30   180.2
    2   Carl   31   190.3
    3    Dan   32   205.4
    --------------------------------------------------
        name  salary  age
    0  Alice   175.1   29
    1  Bobby   180.2   30
    2   Carl   190.3   31
    3    Dan   205.4   32
Note: the example only changes the order of the columns in the DataFrame. Each column keeps its own data. If you need to swap the column labels and leave the data in place, see the approach that uses the columns attribute later in this article.
The DataFrame.reindex() method conforms the DataFrame to the new index.

    column_names = ['name', 'salary', 'age']
    df = df.reindex(columns=column_names)
The columns argument is an array-like object that stores the new labels for the columns.
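One behavior worth knowing about, shown here as a minimal sketch: a label passed in columns that does not exist in the DataFrame is not an error. reindex() adds it as a column filled with NaN, and any existing column that is left out of the list is dropped. The bonus label below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby'],
    'age': [29, 30],
    'salary': [175.1, 180.2],
})

# 'bonus' is a hypothetical label that is not in the DataFrame.
reindexed = df.reindex(columns=['name', 'salary', 'bonus'])

# 'age' is dropped because it was not listed, and 'bonus' is
# created with missing values only.
print(list(reindexed.columns))
print(reindexed['bonus'].isna().all())
```

A typo in the list of names therefore silently produces an empty column, so it can help to check the result when the names are typed by hand.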
You can also achieve the same result by using the loc indexer.

    import pandas as pd

    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'age': [29, 30, 31, 32],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    column_names = list(df)
    print(column_names)  # 👇️ ['name', 'age', 'salary']

    column_names[1], column_names[2] = column_names[2], column_names[1]

    print('-' * 50)
    print(column_names)  # 👇️ ['name', 'salary', 'age']

    df = df.loc[:, column_names]

    print('-' * 50)

    #     name  salary  age
    # 0  Alice   175.1   29
    # 1  Bobby   180.2   30
    # 2   Carl   190.3   31
    # 3    Dan   205.4   32
    print(df)
We used the list() class to get a list containing the column names.
The next step is to swap the columns using their respective indices.

    column_names[1], column_names[2] = column_names[2], column_names[1]

Lastly, we use the df.loc indexer to select all rows and the columns in the updated order.

    df = df.loc[:, column_names]
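The loc indexer is stricter than reindex() about unknown labels. As a small sketch, assuming a recent pandas version (1.0 or later), selecting a list that contains a label missing from the DataFrame raises a KeyError. The bonus label is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby'],
    'age': [29, 30],
    'salary': [175.1, 180.2],
})

# 'bonus' is a hypothetical label that is not in the DataFrame.
try:
    df.loc[:, ['name', 'salary', 'bonus']]
    raised = False
except KeyError:
    # Assumption: pandas >= 1.0, where missing labels in a list raise.
    raised = True

print(raised)
```

This makes loc a reasonable choice when a mistyped column name should fail loudly.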
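The two previous examples only change the order of the DataFrame columns: each column moves together with its data.
If you instead need to swap the labels while the data stays in place, so that each name ends up on the other column's values, assign the updated list of names to the columns attribute. This is useful when the labels were attached to the wrong data, as in the following example.

    import pandas as pd

    # Note that the 'salary' and 'age' labels are on the wrong data
    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'salary': [29, 30, 31, 32],
        'age': [175.1, 180.2, 190.3, 205.4],
    })

    print(df)

    column_names = list(df)
    column_names[1], column_names[2] = column_names[2], column_names[1]

    # Relabel the columns; the data is not moved
    df.columns = column_names

    print('-' * 50)
    print(df)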
Running the code sample produces the following output.
        name  salary    age
    0  Alice      29  175.1
    1  Bobby      30  180.2
    2   Carl      31  190.3
    3    Dan      32  205.4
    --------------------------------------------------
        name  age  salary
    0  Alice   29   175.1
    1  Bobby   30   180.2
    2   Carl   31   190.3
    3    Dan   32   205.4
We used the list() class to get a list of the names of the columns in the DataFrame.

    # 👇️ ['name', 'salary', 'age']
    column_names = list(df)

The next step is to reorder the column names in the list.

    column_names = list(df)

    # 👇️ ['name', 'salary', 'age']
    print(column_names)

    column_names[1], column_names[2] = column_names[2], column_names[1]

    # 👇️ ['name', 'age', 'salary']
    print(column_names)
Once the order is correct, assign the list to the DataFrame.columns attribute to relabel the columns.
This approach swaps only the labels. The data stays where it is, so each of the two names now refers to the other column's values. The previous two approaches, in contrast, moved each column together with its data.
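The difference between reordering and relabeling is easy to check side by side. The following is a minimal sketch on a two-row DataFrame; the variable names reordered and relabeled are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby'],
    'age': [29, 30],
    'salary': [175.1, 180.2],
})

new_order = ['name', 'salary', 'age']

# Reordering: each column moves together with its data.
reordered = df.reindex(columns=new_order)

# Relabeling: the data stays in place and only the labels change.
relabeled = df.copy()
relabeled.columns = new_order

print(reordered['age'].tolist())  # the values originally under 'age'
print(relabeled['age'].tolist())  # the values originally under 'salary'
```

Both results show the same column order when printed, but looking up a column by name returns different data.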
If you have a large DataFrame with many columns, it might be easier to define a reusable function that swaps the two columns.

    import pandas as pd


    def swap_df_columns(df, col1, col2):
        col_list = list(df)

        # Positions of the two columns in the list of names
        a, b = col_list.index(col1), col_list.index(col2)
        col_list[b], col_list[a] = col_list[a], col_list[b]

        # Select the columns in the new order
        df = df[col_list]

        return df


    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'age': [29, 30, 31, 32],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    print(df)

    df = swap_df_columns(df, 'salary', 'age')

    print('-' * 50)
    print(df)
Running the code sample produces the following output.
        name  age  salary
    0  Alice   29   175.1
    1  Bobby   30   180.2
    2   Carl   31   190.3
    3    Dan   32   205.4
    --------------------------------------------------
        name  salary  age
    0  Alice   175.1   29
    1  Bobby   180.2   30
    2   Carl   190.3   31
    3    Dan   205.4   32
The function does all the heavy lifting for us.
It takes the DataFrame and the names of the two columns as arguments.

    def swap_df_columns(df, col1, col2):
        col_list = list(df)

        a, b = col_list.index(col1), col_list.index(col2)
        col_list[b], col_list[a] = col_list[a], col_list[b]

        df = df[col_list]

        return df

The function creates a list of the column names of the DataFrame.
Then, the list.index() method is used to get the index of each supplied column name.
The column names are swapped using the indices.
The last step is to select the columns in the new order and return the resulting DataFrame.
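If one of the supplied names is not a column, list.index() raises a ValueError. The sketch below is a hypothetical variant (swap_df_columns_checked is not part of the original example) that checks the names first and raises a more descriptive error.

```python
import pandas as pd


def swap_df_columns_checked(df, col1, col2):
    """Hypothetical variant of swap_df_columns that validates its input."""
    col_list = list(df)

    # Fail early with a descriptive error instead of the generic
    # ValueError that list.index() raises for a missing item.
    missing = [col for col in (col1, col2) if col not in col_list]
    if missing:
        raise KeyError(f'Columns not found: {missing}')

    a, b = col_list.index(col1), col_list.index(col2)
    col_list[a], col_list[b] = col_list[b], col_list[a]

    # Select the columns in the new order.
    return df[col_list]


df = pd.DataFrame({
    'name': ['Alice', 'Bobby'],
    'age': [29, 30],
    'salary': [175.1, 180.2],
})

swapped = swap_df_columns_checked(df, 'salary', 'age')
print(list(swapped.columns))
```

Raising KeyError here is a design choice that mirrors what pandas does for unknown labels. Any exception type that suits your code base works just as well.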
We could've also used the loc indexer to achieve the same result.

    import pandas as pd


    def swap_df_columns(df, col1, col2):
        col_list = list(df)

        a, b = col_list.index(col1), col_list.index(col2)
        col_list[b], col_list[a] = col_list[a], col_list[b]

        # ✅ Using loc indexer instead
        df = df.loc[:, col_list]

        return df


    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'age': [29, 30, 31, 32],
        'salary': [175.1, 180.2, 190.3, 205.4],
    })

    print(df)

    df = swap_df_columns(df, 'salary', 'age')

    print('-' * 50)
    print(df)
The code sample produces the same output.
        name  age  salary
    0  Alice   29   175.1
    1  Bobby   30   180.2
    2   Carl   31   190.3
    3    Dan   32   205.4
    --------------------------------------------------
        name  salary  age
    0  Alice   175.1   29
    1  Bobby   180.2   30
    2   Carl   190.3   31
    3    Dan   205.4   32
You can slightly tweak the function if you need to swap only the labels of the two columns and leave the data in place.

    import pandas as pd


    def swap_df_columns(df, col1, col2):
        col_list = list(df)

        a, b = col_list.index(col1), col_list.index(col2)
        col_list[b], col_list[a] = col_list[a], col_list[b]

        # ✅ Relabel the columns; the data is not moved.
        # Note: this also modifies the DataFrame that was passed in.
        df.columns = col_list

        return df


    df = pd.DataFrame({
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'salary': [29, 30, 31, 32],
        'age': [175.1, 180.2, 190.3, 205.4],
    })

    print(df)

    df = swap_df_columns(df, 'salary', 'age')

    print('-' * 50)
    print(df)
Running the code sample produces the following output.
        name  salary    age
    0  Alice      29  175.1
    1  Bobby      30  180.2
    2   Carl      31  190.3
    3    Dan      32  205.4
    --------------------------------------------------
        name  age  salary
    0  Alice   29   175.1
    1  Bobby   30   180.2
    2   Carl   31   190.3
    3    Dan   32   205.4
We used the DataFrame.columns attribute to swap the labels of the two columns, so each name now refers to the other column's data.
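Assigning to DataFrame.columns changes the DataFrame that was passed to the function. If the original should stay untouched, one alternative (a sketch, not part of the original examples) is DataFrame.rename() with a mapping that sends each label to the other one. rename() returns a new DataFrame by default.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby'],
    'salary': [29, 30],
    'age': [175.1, 180.2],
})

# Map each label to the other one. rename() applies the whole
# mapping at once, so the two labels are exchanged.
relabeled = df.rename(columns={'salary': 'age', 'age': 'salary'})

print(list(df.columns))         # the original labels are unchanged
print(list(relabeled.columns))  # the two labels are swapped
```

The data itself is not reordered. Only the labels differ between df and relabeled.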