How to repeat Rows N times in a Pandas DataFrame

Borislav Hadzhiev

Last updated: Apr 12, 2024


# Table of Contents

  1. How to repeat Rows N times in a Pandas DataFrame
  2. Repeating each row N times in a DataFrame based on another column
  3. Repeat Rows N times in a Pandas DataFrame using np.repeat()
  4. Repeat Rows N times in a Pandas DataFrame using pd.concat()

# How to repeat Rows N times in a Pandas DataFrame

Use the DataFrame.index.repeat() method to repeat the rows in a Pandas DataFrame N times.

The repeat() method repeats each label in the DataFrame's index the specified number of times.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = df.loc[df.index.repeat(2)]

    print(df2)
The code for this article is available on GitHub

Running the code sample produces the following output.

shell
      first_name  salary
    0      Alice   175.1
    0      Alice   175.1
    1      Bobby   180.2
    1      Bobby   180.2
    2       Carl   190.3
    2       Carl   190.3


The index.repeat() method repeats the elements of an index.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    # 👇️ Index([0, 0, 1, 1, 2, 2], dtype='int64')
    print(df.index.repeat(2))

We then passed the repeated index to the DataFrame.loc indexer, which selects rows by label, so each row is returned once for every time its label appears.

main.py
    df2 = df.loc[df.index.repeat(2)]

    #   first_name  salary
    # 0      Alice   175.1
    # 0      Alice   175.1
    # 1      Bobby   180.2
    # 1      Bobby   180.2
    # 2       Carl   190.3
    # 2       Carl   190.3
    print(df2)

Notice that the index labels are repeated in the output as well.

If you want to reset the indices, use the DataFrame.reset_index() method.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = df.loc[df.index.repeat(2)].reset_index(drop=True)

    print(df2)

Running the code sample produces the following output.

shell
      first_name  salary
    0      Alice   175.1
    1      Alice   175.1
    2      Bobby   180.2
    3      Bobby   180.2
    4       Carl   190.3
    5       Carl   190.3


The DataFrame.reset_index method resets the index of the DataFrame, so the default index is used.
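One caveat is worth a small sketch. Because df.loc selects by label, the approach assumes that the index labels of the original DataFrame are unique. If a label already appears more than once, every repeated label matches several rows and the result grows by more than N times. The example below is an illustration, not part of the original code sample: the duplicate index and the variable names by_label, positions and by_position are assumptions. It repeats row positions with DataFrame.iloc instead.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame whose index already contains a duplicate label (0)
df = pd.DataFrame(
    {'first_name': ['Alice', 'Bobby', 'Carl']},
    index=[0, 0, 1],
)

# Label-based selection: each repeated label 0 matches two rows
by_label = df.loc[df.index.repeat(2)]

# Position-based selection: repeat the positions 0..len(df)-1 instead
positions = np.arange(len(df)).repeat(2)
by_position = df.iloc[positions].reset_index(drop=True)

print(len(by_label))     # more rows than len(df) * 2
print(len(by_position))  # len(df) * 2
print(by_position)
```

When the index is unique, as in the examples above, both approaches should return the same rows.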

# Repeating each row N times in a DataFrame based on another column

The same approach can be used to repeat each row N times based on another column.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'times': [1, 2, 3],
    })

    df2 = df.loc[df.index.repeat(df.times)].reset_index(drop=True)

    #   first_name  times
    # 0      Alice      1
    # 1      Bobby      2
    # 2      Bobby      2
    # 3       Carl      3
    # 4       Carl      3
    # 5       Carl      3
    print(df2)


We used the times column to repeat each row N times.

Each value in the times column is the total number of times the row appears in the result: the first row appears once (it is not repeated), the second row appears twice and the third row appears three times.

The code sample also uses the reset_index() method to reset the index. However, this is optional.
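Because the counts are per-row totals, a count of 0 should drop the row from the result entirely. Here is a minimal sketch of that behavior. The counts [0, 2, 1] are made up for illustration, and dropping the helper column afterwards is an optional extra step that the original example doesn't perform.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'times': [0, 2, 1],  # hypothetical counts; 0 means "leave this row out"
})

df2 = (
    df.loc[df.index.repeat(df['times'])]
    .drop(columns='times')   # optional: remove the helper column
    .reset_index(drop=True)
)

#   first_name
# 0      Bobby
# 1      Bobby
# 2       Carl
print(df2)
```

The counts are expected to be non-negative integers, one per row.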

# Repeat Rows N times in a Pandas DataFrame using np.repeat()

You can also use the numpy.repeat() function to repeat the rows of a DataFrame N times.

main.py
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = pd.DataFrame(np.repeat(df.values, 2, axis=0))
    df2.columns = df.columns

    print(df2)

Running the code sample produces the following output.

shell
      first_name salary
    0      Alice  175.1
    1      Alice  175.1
    2      Bobby  180.2
    3      Bobby  180.2
    4       Carl  190.3
    5       Carl  190.3


Make sure you have the numpy module installed to be able to run the code sample.

shell
    pip install pandas numpy

    # or with pip3
    pip3 install pandas numpy

The numpy.repeat() function takes an input array, the number of repetitions and the axis as parameters and returns an array with the repeated values.

main.py
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    # [['Alice' 175.1]
    #  ['Alice' 175.1]
    #  ['Bobby' 180.2]
    #  ['Bobby' 180.2]
    #  ['Carl' 190.3]
    #  ['Carl' 190.3]]
    print(np.repeat(df.values, 2, axis=0))
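To see what the axis argument does in isolation, here is a small sketch on a plain 2D array. The array arr and the variable names are made up for illustration. With axis=0 whole rows are repeated, which is what we need for a DataFrame. With axis=1 the columns are repeated, and without an axis the array is flattened first.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# axis=0 repeats each row
rows_repeated = np.repeat(arr, 2, axis=0)
print(rows_repeated.tolist())  # [[1, 2], [1, 2], [3, 4], [3, 4]]

# axis=1 repeats each column
cols_repeated = np.repeat(arr, 2, axis=1)
print(cols_repeated.tolist())  # [[1, 1, 2, 2], [3, 3, 4, 4]]

# no axis: the array is flattened, then each element is repeated
flattened = np.repeat(arr, 2)
print(flattened.tolist())      # [1, 1, 2, 2, 3, 3, 4, 4]
```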

We have to pass the array to the pandas.DataFrame() constructor and set the columns of the new DataFrame to the columns of the existing DataFrame.

main.py
    df2 = pd.DataFrame(np.repeat(df.values, 2, axis=0))
    df2.columns = df.columns

You can also set the columns when initializing the new DataFrame.

main.py
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = pd.DataFrame(np.repeat(df.values, 2, axis=0), columns=df.columns)

    #   first_name salary
    # 0      Alice  175.1
    # 1      Alice  175.1
    # 2      Bobby  180.2
    # 3      Bobby  180.2
    # 4       Carl  190.3
    # 5       Carl  190.3
    print(df2)

The code sample uses the columns argument of the pandas.DataFrame() class to achieve the same result.

You can also use the loc indexer as we did in the first subheading.

main.py
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = df.loc[np.repeat(df.index, 2)].reset_index(drop=True)

    #   first_name  salary
    # 0      Alice   175.1
    # 1      Alice   175.1
    # 2      Bobby   180.2
    # 3      Bobby   180.2
    # 4       Carl   190.3
    # 5       Carl   190.3
    print(df2)

However, when using the df.loc indexer, we have to manually reset the indices of the DataFrame by calling reset_index().
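There is one more difference between the two NumPy-based approaches that may matter in practice. When a DataFrame mixes strings and numbers, df.values is an array of generic Python objects, so a DataFrame rebuilt from it no longer has a float64 salary column, whereas the df.loc approach keeps the original dtypes. The sketch below shows this. The names via_values, via_loc and restored are made up for illustration, and casting back with astype() is one possible fix, not the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
})

# Rebuilding from df.values loses the numeric dtype of the salary column
via_values = pd.DataFrame(np.repeat(df.values, 2, axis=0), columns=df.columns)
print(via_values['salary'].dtype)  # object

# Selecting with df.loc keeps the original dtypes
via_loc = df.loc[np.repeat(df.index, 2)].reset_index(drop=True)
print(via_loc['salary'].dtype)     # float64

# Cast the rebuilt DataFrame back to the dtypes of the original
restored = via_values.astype(df.dtypes.to_dict())
print(restored['salary'].dtype)    # float64
```

An object column still prints the same values, but numeric operations on it tend to be slower and can behave differently, so it is worth checking df2.dtypes after using the df.values approach.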

# Repeat Rows N times in a Pandas DataFrame using pd.concat()

You can also use the pandas.concat method to repeat the rows in a DataFrame N times.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = pd.concat([df] * 2).sort_index().reset_index(drop=True)

    #   first_name  salary
    # 0      Alice   175.1
    # 1      Alice   175.1
    # 2      Bobby   180.2
    # 3      Bobby   180.2
    # 4       Carl   190.3
    # 5       Carl   190.3
    print(df2)


The pandas.concat() method concatenates Pandas objects along a particular axis.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    #   first_name  salary
    # 0      Alice   175.1
    # 1      Bobby   180.2
    # 2       Carl   190.3
    # 0      Alice   175.1
    # 1      Bobby   180.2
    # 2       Carl   190.3
    print(pd.concat([df] * 2))

We then have to:

  1. Use the sort_index method to sort the indices.
  2. Use the reset_index() method to reset the indices of the DataFrame.

main.py
    import pandas as pd

    df = pd.DataFrame({
        'first_name': ['Alice', 'Bobby', 'Carl'],
        'salary': [175.1, 180.2, 190.3],
    })

    df2 = pd.concat([df] * 2).sort_index().reset_index(drop=True)

    #   first_name  salary
    # 0      Alice   175.1
    # 1      Alice   175.1
    # 2      Bobby   180.2
    # 3      Bobby   180.2
    # 4       Carl   190.3
    # 5       Carl   190.3
    print(df2)
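If the order of the rows doesn't matter, the sorting step can be skipped. As a sketch of the difference (the variable names blocks and grouped are made up for illustration), passing ignore_index=True to pd.concat() renumbers the rows but keeps the copies stacked one after another, while sorting by the original index first places the copies of each row next to each other.

```python
import pandas as pd

df = pd.DataFrame({
    'first_name': ['Alice', 'Bobby', 'Carl'],
    'salary': [175.1, 180.2, 190.3],
})

# ignore_index=True gives a fresh 0..5 index, rows stay in block order
blocks = pd.concat([df] * 2, ignore_index=True)
print(blocks['first_name'].tolist())
# ['Alice', 'Bobby', 'Carl', 'Alice', 'Bobby', 'Carl']

# sort_index() groups the copies of each row before the index is reset
grouped = pd.concat([df] * 2).sort_index().reset_index(drop=True)
print(grouped['first_name'].tolist())
# ['Alice', 'Alice', 'Bobby', 'Bobby', 'Carl', 'Carl']
```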


