Update a Pandas DataFrame while iterating over its rows

Borislav Hadzhiev

Last updated: Apr 12, 2024

# Table of Contents

  1. Update a Pandas DataFrame while iterating over its rows
  2. Update a Pandas DataFrame while iterating over its rows based on multiple conditions
  3. Update a Pandas DataFrame while iterating over its rows using DataFrame.index
  4. Update a Pandas DataFrame while iterating over its rows using DataFrame.itertuples
  5. Update a Pandas DataFrame while iterating over its rows using apply()

# Update a Pandas DataFrame while iterating over its rows

To update a Pandas DataFrame while iterating over its rows:

  1. Use the DataFrame.iterrows() method to iterate over the DataFrame row by row.
  2. Check if a certain condition is met.
  3. If the condition is met, use the DataFrame.at accessor to update the value of the column for the current row.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index, row in df.iterrows():
    if row['salary'] < 190:
        df.at[index, 'salary'] = 200

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   190.3
# 3    Dan           7   205.4
print(df)
```

The code for this article is available on GitHub

The DataFrame.iterrows() method enables us to iterate over the DataFrame's rows as (index, Series) pairs.

main.py
```python
for index, row in df.iterrows():
    if row.salary < 190:
        df.at[index, 'salary'] = 200
```

On each iteration, we access the salary attribute on the current row and check if the value is less than 190.

If the condition is met, we use the DataFrame.at accessor to update the value of the salary column for the current row. The accessor is indexed with square brackets, as in df.at[row_label, column_label].

The code sample sets the "salary" values that are less than 190 to 200.

shell
```text
    name  experience  salary
0  Alice           1   200.0
1  Bobby           3   200.0
2   Carl           5   190.3
3    Dan           7   205.4
```
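As a rough sketch of why the update goes through df.at rather than the row variable: for a DataFrame with mixed column types like this one, iterrows() yields a copy of each row, so assigning to row should leave the DataFrame untouched. The variable name unchanged is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index, row in df.iterrows():
    if row['salary'] < 190:
        # This writes to the temporary row Series only, not to df.
        row['salary'] = 200

# The original salaries are still in the DataFrame.
unchanged = df['salary'].tolist()
print(unchanged)
```

Writing through df.at[index, 'salary'] targets the DataFrame itself, which is why the examples in this article use it.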

# Update a Pandas DataFrame while iterating over its rows based on multiple conditions

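If you need to update a Pandas DataFrame while iterating over its rows based on multiple conditions, combine the conditions with the & (AND) or | (OR) operators. Wrap each condition in parentheses, because & and | bind more tightly than comparison operators such as < and ==.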

The following code sample iterates over the DataFrame row by row and updates the "salary" values if:

  1. The "salary" value of the current row is less than 190.
  2. And the "name" value of the current row is equal to "Alice".

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index, row in df.iterrows():
    if (row['salary'] < 190) & (row['name'] == 'Alice'):
        df.at[index, 'salary'] = 200

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   180.2
# 2   Carl           5   190.3
# 3    Dan           7   205.4
print(df)
```

Both conditions have to be met for the "salary" value of the current row to be updated.
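Inside a row-by-row loop, each comparison produces a single boolean rather than a whole column, so Python's and keyword should give the same result here. The following is a minimal sketch of that alternative, not a replacement for the approach above; the variable name result is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index, row in df.iterrows():
    # Both comparisons are scalar checks, so `and` needs no parentheses.
    if row['salary'] < 190 and row['name'] == 'Alice':
        df.at[index, 'salary'] = 200

result = df['salary'].tolist()
print(result)
```

The same reasoning applies to the or keyword when only one of the conditions has to be met.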

If you need to satisfy only one condition for the value to get updated, use the logical OR | operator instead.

The following example updates the DataFrame while iterating over it if:

  1. The "salary" value of the current row is less than 190.
  2. Or the "name" value of the current row is equal to "Dan".

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index, row in df.iterrows():
    if (row['salary'] < 190) | (row['name'] == 'Dan'):
        df.at[index, 'salary'] = 200

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   190.3
# 3    Dan           7   200.0
print(df)
```

# Update a Pandas DataFrame while iterating over its rows using DataFrame.index

You can also use the DataFrame.index attribute to update a Pandas DataFrame while iterating over its rows.

The index attribute is used to access the row labels of the DataFrame.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index in df.index:
    if df.at[index, 'salary'] < 190:
        df.at[index, 'salary'] = 200

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   190.3
# 3    Dan           7   205.4
print(df)
```

We used the index attribute to iterate over the DataFrame.

On each iteration, we check if the current "salary" value is less than 190.

If the condition is met, we update the "salary" value of the current row, setting it to 200.
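Because df.index holds row labels rather than positions, the same loop should work when the index is not the default 0, 1, 2, 3. Here is a small sketch that assumes a made-up string index ('a' to 'd') purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
        'salary': [175.1, 180.2, 190.3, 205.4],
    },
    index=['a', 'b', 'c', 'd'],  # hypothetical labels, not from the article
)

for label in df.index:
    # df.at looks rows up by label, so 'a', 'b', ... are used directly.
    if df.at[label, 'salary'] < 190:
        df.at[label, 'salary'] = 200

result = df['salary'].to_dict()
print(result)
```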

You can also add an else statement or an elif statement.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index in df.index:
    if df.at[index, 'salary'] < 190:
        df.at[index, 'salary'] = 200
    else:
        df.at[index, 'salary'] = 300

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   300.0
# 3    Dan           7   300.0
print(df)
```
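The example above only shows the else branch. An elif branch can be added in the same way, as in the following sketch. The thresholds (180 and 200) and the replacement value 250 are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for index in df.index:
    # Read the value once so every branch checks the original salary.
    salary = df.at[index, 'salary']

    if salary < 180:
        df.at[index, 'salary'] = 200
    elif salary < 200:
        df.at[index, 'salary'] = 250
    else:
        df.at[index, 'salary'] = 300

result = df['salary'].tolist()
print(result)
```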

# Update a Pandas DataFrame while iterating over its rows using DataFrame.itertuples

You can also use the DataFrame.itertuples method to update a Pandas DataFrame while iterating over its rows.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

for row in df.itertuples():
    if row.salary < 190:
        df.at[row.Index, 'salary'] = 200

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   190.3
# 3    Dan           7   205.4
print(df)
```

The DataFrame.itertuples() method enables us to iterate over the DataFrame rows as named tuples.

The method returns an iterator that yields one named tuple per row. The first field, Index, holds the row's index label and the remaining fields hold the column values.
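To make the shape of these named tuples concrete, the sketch below inspects the first row. It also uses the index and name parameters of itertuples() to get plain tuples without the index. The variable names first, fields, plain and read_only are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# Take the first named tuple produced by the iterator.
first = next(df.itertuples())
fields = first._fields
print(fields)
print(first.Index, first.name, first.salary)

# Named tuples are read-only, so updates have to go through df.at.
read_only = False
try:
    first.salary = 200
except AttributeError:
    read_only = True

# index=False drops the index field; name=None yields regular tuples.
plain = next(df.itertuples(index=False, name=None))
print(plain)
```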

# Update a Pandas DataFrame while iterating over its rows using apply()

You can also use the DataFrame.apply() method to compute the new value for each row and assign the result back to the column in a single statement.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

df['salary'] = df.apply(
    lambda x: 200 if x.salary < 190 else 300,
    axis=1
)

#     name  experience  salary
# 0  Alice           1     200
# 1  Bobby           3     200
# 2   Carl           5     300
# 3    Dan           7     300
print(df)
```

The DataFrame.apply() method applies a function along an axis of the DataFrame.

We set the axis argument to 1 to have the function applied to each row.

main.py
```python
df['salary'] = df.apply(
    lambda x: 200 if x.salary < 190 else 300,
    axis=1
)
```

We check if the salary value of each row is less than 190.

If the condition is met, then the salary value gets set to 200, otherwise, it gets set to 300.
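When the logic no longer fits comfortably in a one-line lambda, a regular function can be passed to apply() instead. The following is a minimal sketch. The function name adjust_salary and its thresholds are hypothetical and only serve to show the pattern.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})


def adjust_salary(row):
    """Return the new salary for one row (illustrative thresholds)."""
    if row['salary'] < 190 and row['experience'] < 3:
        return 200
    elif row['salary'] < 190:
        return 250
    return 300


# With axis=1, adjust_salary receives each row as a Series.
df['salary'] = df.apply(adjust_salary, axis=1)

result = df['salary'].tolist()
print(result)
```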

If you don't want to change the value when the condition isn't met, return the current value in the else branch.

main.py
```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

df['salary'] = df.apply(
    lambda x: 200 if x.salary < 190 else x.salary,
    axis=1
)

#     name  experience  salary
# 0  Alice           1   200.0
# 1  Bobby           3   200.0
# 2   Carl           5   190.3
# 3    Dan           7   205.4
print(df)
```

Instead of returning a different value in the else statement, we return the current "salary" value of the row.
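When the condition only depends on the column being updated, as it does here, calling apply() on that single column may be enough. This sketch uses Series.apply(), so the lambda receives one salary value at a time instead of a whole row. The variable name result is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'experience': [1, 3, 5, 7],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

# The lambda gets a single salary value, so no axis argument is needed.
df['salary'] = df['salary'].apply(lambda s: 200 if s < 190 else s)

result = df['salary'].tolist()
print(result)
```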

