How to modify a Subset of Rows in a Pandas DataFrame

avatar
Borislav Hadzhiev

Last updated: Apr 12, 2024
5 min

banner

# Table of Contents

  1. How to modify a Subset of Rows in a Pandas DataFrame
  2. Modify a Subset of Rows in a Pandas DataFrame using numpy.where()
  3. Modify a Subset of Rows in multiple columns in a Pandas DataFrame
  4. Modify a Subset of Rows in a Pandas DataFrame using apply()

# How to modify a Subset of Rows in a Pandas DataFrame

To modify a subset of rows in a Pandas DataFrame:

  1. Use the DataFrame.loc label-based indexer to select the rows you'd like to modify.
  2. Set the values in the selected rows to the replacement value.
main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) df.loc[df.experience == 1, 'salary'] = 150 print('-' * 50) print(df)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 1 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1 150.0 1 Bobby 3 180.2 2 Carl 1 150.0 3 Dan 7 205.4

pandas modify subset of rows in dataframe

We used the DataFrame.loc label-based indexer to access specific rows of the DataFrame by a boolean Series.

The df.experience == 1 expression returns a boolean Series containing True values for the rows that match the condition and False for those that don't.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) # 0 True # 1 False # 2 True # 3 False # Name: experience, dtype: bool print(df.experience == 1)
The code for this article is available on GitHub

Notice that we specified the name of the column we want to update after the comma.

main.py
df.loc[df.experience == 1, 'salary'] = 150

We basically select the rows in which the "experience" column values are equal to 1 and update the corresponding "salary" values to 150.

You can also calculate the value on the right-hand side using the df.loc indexer.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) df.loc[df.experience == 1, 'salary'] = df.loc[ df.experience == 1, 'salary'] - 100 print('-' * 50) print(df)

Running the code sample produces the following output.

shell
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 1 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1 75.1 1 Bobby 3 180.2 2 Carl 1 90.3 3 Dan 7 205.4

calculating value on right hand side with loc indexer

The code sample updates the salary values of the rows in which the experience value is equal to 1.

main.py
df.loc[df.experience == 1, 'salary'] = df.loc[ df.experience == 1, 'salary'] - 100

The salary value of the matching rows is set to the current value minus 100.

In other words, the syntax is:

  • df.loc[selection criteria, columns to update] = value
The DataFrame.loc indexer is primarily label-based, but it can also be used with a boolean array (or Series).

The boolean array (or Series) tells Pandas which rows we want to select.

The column name after the comma instructs Pandas about the rows in which column we want to update.

# Modify a Subset of Rows in a Pandas DataFrame using numpy.where()

You can also use the numpy.where() method to modify a subset of rows in a Pandas DataFrame.

First, make sure you have the numpy module installed.

shell
pip install numpy # or with pip3 pip3 install numpy

Now, import the numpy module and use the numpy.where() method.

main.py
import pandas as pd import numpy as np df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) replacement_value = 150 df['salary'] = np.where( df.experience.values == 1, replacement_value, df.salary.values ) print('-' * 50) print(df)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Carl 1 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1 150.0 1 Bobby 3 180.2 2 Carl 1 150.0 3 Dan 7 205.4

modify subset of rows in dataframe using numpy where

The numpy.where() method returns the elements chosen from x or y depending on the supplied condition.

main.py
df['salary'] = np.where( df.experience.values == 1, replacement_value, df.salary.values )

The first argument the method takes is the condition and the second and third arguments are x and y.

main.py
import pandas as pd import numpy as np df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) replacement_value = 150 # [150. 180.2 150. 205.4] print(np.where( df.experience.values == 1, replacement_value, df.salary.values ))
The code for this article is available on GitHub

The where() method returns an array with the elements from x where the condition is True and the elements from y elsewhere.

# Modify a Subset of Rows in multiple columns in a Pandas DataFrame

If you need to modify a subset of rows in multiple columns in a DataFrame, use the DataFrame.values attribute.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Bobby', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) df.loc[df.name == 'Bobby', ['experience', 'salary'] ] = df.loc[ df.name == 'Bobby', ['experience', 'salary'] ].values / 2 print(df)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Bobby 1 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1.0 175.10 1 Bobby 1.5 90.10 2 Bobby 0.5 95.15 3 Dan 7.0 205.40

modify subset of rows in multiple columns in dataframe

The code sample updates the values in the experience and salary columns where the name column has a value of "Bobby".

main.py
df.loc[df.name == 'Bobby', ['experience', 'salary'] ] = df.loc[ df.name == 'Bobby', ['experience', 'salary'] ].values / 2

The corresponding values in the experience and salary columns are set to their current value divided by 2.

# Modify a Subset of Rows in a Pandas DataFrame using apply()

You can also use the DataFrame.apply method to modify a subset of rows in a Pandas DataFrame.

main.py
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Bobby', 'Dan'], 'experience': [1, 3, 1, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) print(df) print('-' * 50) replacement_value = 150 df['salary'] = df.apply( lambda x: replacement_value if x['experience'] == 1 else x['salary'], axis=1 ) print(df)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
name experience salary 0 Alice 1 175.1 1 Bobby 3 180.2 2 Bobby 1 190.3 3 Dan 7 205.4 -------------------------------------------------- name experience salary 0 Alice 1 150.0 1 Bobby 3 180.2 2 Bobby 1 150.0 3 Dan 7 205.4

modify subset of rows in pandas dataframe using apply

The DataFrame.apply() method applies a function along an axis of the DataFrame.

main.py
df['salary'] = df.apply( lambda x: replacement_value if x['experience'] == 1 else x['salary'], axis=1 )

The axis parameter defines the axis along which the function is applied.

By default, the axis parameter is set to 0 which means that the function is applied to each column.

However, we want to apply the function to the matching rows, so we set the axis argument to 1.

The code sample updates the rows in the salary column where the value in the experience column is equal to 1.

# Additional Resources

You can learn more about the related topics by checking out the following tutorials:

I wrote a book in which I share everything I know about how to become a better, more efficient programmer.
book cover
You can use the search field on my Home Page to filter through all of my articles.