Last updated: Apr 12, 2024
Reading time·6 min
To update a Pandas DataFrame
while iterating over its rows:
DataFrame.iterrows()
method to iterate over the DataFrame
row by
row.DataFrame.at()
method to update the value
of the column for the current row.import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for index, row in df.iterrows(): if row['salary'] < 190: df.at[index, 'salary'] = 200 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
The DataFrame.iterrows() method enables us to iterate over the DataFrame's rows as (index, Series) pairs.
for index, row in df.iterrows(): if row.salary < 190: df.at[index, 'salary'] = 200
On each iteration, we access the salary
attribute on the current row and check
if the value is less than 190
.
If the condition is met, we use the
DataFrame.at()
method to update the value of the salary
column for the current row.
The code sample sets the "salary"
values that are less than 190
to 200
.
# name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
If you need to update a Pandas DataFrame
while iterating over its rows based
on multiple conditions use the logical AND &
or
logical OR |
operators.
The following code sample iterates over the DataFrame
row by row and updates
the "salary"
values if:
"salary"
value of the current row is less than 190
."name"
value of the current row is equal to "Alice"
.import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for index, row in df.iterrows(): if (row['salary'] < 190) & (row['name'] == 'Alice'): df.at[index, 'salary'] = 200 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 180.2 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
Both conditions have to be met for the "salary"
value of the current row to be
updated.
If you need to satisfy only one condition for the value to get updated, use
the logical OR |
operator instead.
The following example updates the DataFrame
while iterating over it if:
"salary"
value of the current row is less than 190
."name"
value of the current row is equal to "Dan"
.import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for index, row in df.iterrows(): if (row['salary'] < 190) | (row['name'] == 'Dan'): df.at[index, 'salary'] = 200 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 200.0 print(df)
DataFrame.index
You can also use the DataFrame.index
attribute to update a Pandas DataFrame
while iterating over its rows.
The index
attribute is used to access the row labels of the DataFrame
.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for index in df.index: if df.at[index, 'salary'] < 190: df.at[index, 'salary'] = 200 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
We used the index
attribute to iterate over the DataFrame
.
On each iteration, we check if the current "salary"
value is less than 190
.
If the condition is met, we update the "salary"
value of the current row,
setting it to 200
.
You can also add an else
statement or an elif
statement.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for index in df.index: if df.at[index, 'salary'] < 190: df.at[index, 'salary'] = 200 else: df.at[index, 'salary'] = 300 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 300.0 # 3 Dan 7 300.0 print(df)
DataFrame.itertuples
You can also use the
DataFrame.itertuples
method to update a Pandas DataFrame
while iterating over its rows.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) for row in df.itertuples(): if row.salary < 190: df.at[row.Index, 'salary'] = 200 # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
The DataFrame.itertuples()
method enables us to iterate over the DataFrame
rows as named tuples.
DataFrame
with the first field being the index and the following fields being the column values.apply()
You can also use the DataFrame.apply()
method to update a Pandas DataFrame
while iterating over it in a single line.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) df['salary'] = df.apply( lambda x: 200 if x.salary < 190 else 300, axis=1 ) # name experience salary # 0 Alice 1 200 # 1 Bobby 3 200 # 2 Carl 5 300 # 3 Dan 7 300 print(df)
The
DataFrame.apply()
method applies a function along an axis of the DataFrame
.
We set the axis
argument to 1
to have the function applied to each row.
df['salary'] = df.apply( lambda x: 200 if x.salary < 190 else 300, axis=1 )
We check if the salary
value of each row is less than 190
.
If the condition is met, then the salary
value gets set to 200
, otherwise,
it gets set to 300
.
If you don't want to update the value of the row in the else
statement, return
it as is.
import pandas as pd df = pd.DataFrame({ 'name': ['Alice', 'Bobby', 'Carl', 'Dan'], 'experience': [1, 3, 5, 7], 'salary': [175.1, 180.2, 190.3, 205.4], }) df['salary'] = df.apply( lambda x: 200 if x.salary < 190 else x.salary, axis=1 ) # name experience salary # 0 Alice 1 200.0 # 1 Bobby 3 200.0 # 2 Carl 5 190.3 # 3 Dan 7 205.4 print(df)
Instead of returning a different value in the else
statement, we return the
current "salary"
value of the row.
You can learn more about the related topics by checking out the following tutorials:
pd.read_json()