Pandas: Select first N or last N columns of DataFrame

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Table of Contents

  1. Pandas: Select first N columns of DataFrame
  2. Pandas: Select last N columns of DataFrame
  3. Exclude the last N columns from a DataFrame
  4. Select the Last N columns of a DataFrame using DataFrame.columns

# Pandas: Select first N columns of DataFrame

Use the DataFrame.iloc integer-based indexer to select the first N columns of a DataFrame in Pandas.

The n value goes after the comma in the expression, as the end of the column slice.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

print(df)

print('-' * 50)

first_2_columns = df.iloc[:, :2]
print(first_2_columns)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
    name  experience  salary
0  Alice           1   175.1
1  Bobby           1   180.2
2   Carl           5   190.3
3    Dan           7   205.4
4  Ethan           7   210.5
--------------------------------------------------
    name  experience
0  Alice           1
1  Bobby           1
2   Carl           5
3    Dan           7
4  Ethan           7

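The DataFrame.iloc indexer selects by integer position, unlike DataFrame.loc, which selects by label.

The slice before the comma selects rows and the slice after it selects columns, so : keeps every row and :2 keeps the first 2 columns. The n value can also be stored in a variable.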
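
To make the difference between position-based and label-based selection concrete, here is a minimal sketch on a smaller DataFrame. The variable names by_position and by_label are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl'],
    'experience': [1, 5, 7],
    'salary': [175.1, 190.3, 205.4],
})

# iloc slices by integer position; the end position is exclusive.
by_position = df.iloc[:, :2]

# loc slices by label; the end label is inclusive.
by_label = df.loc[:, :'experience']

# Both selections contain the 'name' and 'experience' columns.
print(by_position.equals(by_label))  # True
```

The iloc version only depends on the column order, so it keeps working if the columns are renamed.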

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

print(df)

print('-' * 50)

n = 2

first_2_columns = df.iloc[:, :n]

#     name  experience
# 0  Alice           1
# 1  Bobby           1
# 2   Carl           5
# 3    Dan           7
# 4  Ethan           7
print(first_2_columns)

If you have to do this often, define a reusable function.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def select_first_n_columns(data_frame, n):
    # Keep all rows and the first n columns.
    return data_frame.iloc[:, :n]


print(select_first_n_columns(df, 2))

print('-' * 50)

print(select_first_n_columns(df, 1))

Here is the output of running the code sample.

shell
    name  experience
0  Alice           1
1  Bobby           1
2   Carl           5
3    Dan           7
4  Ethan           7
--------------------------------------------------
    name
0  Alice
1  Bobby
2   Carl
3    Dan
4  Ethan
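
It may help to know what a helper like this does when n is larger than the number of columns or when n is 0. The sketch below reuses the same sample data; the function name select_first_n_columns and the variables too_many and zero_columns are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def select_first_n_columns(data_frame, n):
    # A slice end past the last column is clipped instead of raising.
    return data_frame.iloc[:, :n]


too_many = select_first_n_columns(df, 10)
zero_columns = select_first_n_columns(df, 0)

print(too_many.shape)      # (5, 3)
print(zero_columns.shape)  # (5, 0)
```

Out-of-range slices behave like Python list slices, so no bounds check is needed for the first N columns.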

# Pandas: Select last N columns of DataFrame

You can also use the DataFrame.iloc position-based indexer to select the last N columns of a DataFrame.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

last_2_columns = df.iloc[:, -2:]

#    experience  salary
# 0           1   175.1
# 1           1   180.2
# 2           5   190.3
# 3           7   205.4
# 4           7   210.5
print(last_2_columns)

The code sample selects the last 2 columns of the DataFrame.

Notice that the column slice starts at -n (here -2:), which counts n positions back from the last column and selects through to the end.

If you have to do this often, define a reusable function.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def last_n_columns(data_frame, n):
    # Keep all rows and the last n columns.
    return data_frame.iloc[:, -n:]


print(last_n_columns(df, 2))

print('-' * 50)

print(last_n_columns(df, 1))

Running the code sample produces the following output.

shell
   experience  salary
0           1   175.1
1           1   180.2
2           5   190.3
3           7   205.4
4           7   210.5
--------------------------------------------------
   salary
0   175.1
1   180.2
2   190.3
3   205.4
4   210.5

The function takes the DataFrame and n as parameters and returns the last n columns of the DataFrame.
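
One edge case is worth keeping in mind: when n is 0, the slice -0: is the same as 0:, so it selects every column instead of none. The sketch below shows one possible way to guard against this. The helper name last_n_columns_safe is hypothetical and not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def last_n_columns_safe(data_frame, n):
    # Hypothetical helper: compute a non-negative start position
    # instead of relying on the negative slice -n:.
    start = max(data_frame.shape[1] - n, 0)
    return data_frame.iloc[:, start:]


# -0 equals 0, so this slice keeps all 3 columns.
print(df.iloc[:, -0:].shape)             # (5, 3)

# The guarded helper returns no columns for n = 0.
print(last_n_columns_safe(df, 0).shape)  # (5, 0)

print(list(last_n_columns_safe(df, 2).columns))  # ['experience', 'salary']
```

If n is always at least 1 in your code, the plain -n: slice is enough.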

# Exclude the last N columns from a DataFrame

A similar approach can be used to exclude the last N columns from a DataFrame.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

print(df)

print('-' * 50)

exclude_last_2_columns = df.iloc[:, :-2]
print(exclude_last_2_columns)

Running the code sample produces the following output.

shell
    name  experience  salary
0  Alice           1   175.1
1  Bobby           1   180.2
2   Carl           5   190.3
3    Dan           7   205.4
4  Ethan           7   210.5
--------------------------------------------------
    name
0  Alice
1  Bobby
2   Carl
3    Dan
4  Ethan

We excluded the last 2 columns from the DataFrame.

If you have to do this often, define a reusable function.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def exclude_last_n_columns(data_frame, n):
    # Keep all rows and drop the last n columns.
    return data_frame.iloc[:, :-n]


print(exclude_last_n_columns(df, 2))

print('-' * 50)

print(exclude_last_n_columns(df, 1))

Running the code sample produces the following output.

shell
    name
0  Alice
1  Bobby
2   Carl
3    Dan
4  Ethan
--------------------------------------------------
    name  experience
0  Alice           1
1  Bobby           1
2   Carl           5
3    Dan           7
4  Ethan           7
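
Excluding 0 columns is a case where the negative slice gives a surprising result: :-0 is the same as :0, so it returns no columns instead of all of them. Here is a sketch of one way to handle it. The helper name exclude_last_n_columns_safe is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})


def exclude_last_n_columns_safe(data_frame, n):
    # Hypothetical helper: compute a non-negative stop position
    # instead of relying on the negative slice :-n.
    stop = max(data_frame.shape[1] - n, 0)
    return data_frame.iloc[:, :stop]


# -0 equals 0, so this slice keeps no columns.
print(df.iloc[:, :-0].shape)                     # (5, 0)

# The guarded helper keeps all columns for n = 0.
print(exclude_last_n_columns_safe(df, 0).shape)  # (5, 3)

print(list(exclude_last_n_columns_safe(df, 2).columns))  # ['name']
```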

# Select the Last N columns of a DataFrame using DataFrame.columns

You can also use slicing with the DataFrame.columns attribute to select the last N columns of a DataFrame.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

print(df)

print('-' * 50)

last_2_columns = df[df.columns[-2:]]
print(last_2_columns)

Running the code sample produces the following output.

shell
    name  experience  salary
0  Alice           1   175.1
1  Bobby           1   180.2
2   Carl           5   190.3
3    Dan           7   205.4
4  Ethan           7   210.5
--------------------------------------------------
   experience  salary
0           1   175.1
1           1   180.2
2           5   190.3
3           7   205.4
4           7   210.5

The code sample selects the last 2 columns of the DataFrame using the DataFrame.columns attribute.

The attribute returns an index that contains the column labels of the DataFrame.

main.py
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

# Index(['name', 'experience', 'salary'], dtype='object')
print(df.columns)
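
The same slicing works for the first N columns. One caveat: selecting with df.columns goes through the column labels, so the result can differ from DataFrame.iloc when labels are duplicated. The sketch below illustrates both points. The dup DataFrame is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan', 'Ethan'],
    'experience': [1, 1, 5, 7, 7],
    'salary': [175.1, 180.2, 190.3, 205.4, 210.5],
})

# First 2 columns, selected through their labels.
first_2_by_label = df[df.columns[:2]]
print(first_2_by_label.equals(df.iloc[:, :2]))  # True

# Made-up DataFrame with a duplicated column label.
dup = pd.DataFrame([[1, 2, 3]], columns=['a', 'b', 'a'])

# The labels ['b', 'a'] match 3 columns, because 'a' appears twice.
print(dup[dup.columns[-2:]].shape)  # (1, 3)

# Position-based selection always returns exactly 2 columns.
print(dup.iloc[:, -2:].shape)       # (1, 2)
```

When column labels are unique, both approaches return the same columns.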
