Last updated: Apr 12, 2024
Reading time · 4 min
To get the memory size of a DataFrame in Pandas:
1. Use the DataFrame.memory_usage() method to get the number of bytes each column occupies.
2. Call the sum() method on the result to get the total memory size of the DataFrame.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Index    128
# Name      24
# Date      24
# dtype: int64
print(df.memory_usage())

print('-' * 50)

print(df.memory_usage(index=True).sum())  # 👇️ 176
The pandas.DataFrame.memory_usage() method returns the memory usage of each column of the DataFrame in bytes.

You can use the index argument to specify whether the contribution of the index should be included in the calculation.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Index    128
# Name      24
# Date      24
# dtype: int64
print(df.memory_usage(index=True))

print('-' * 50)

# Name    24
# Date    24
# dtype: int64
print(df.memory_usage(index=False))
By default, the index argument is set to True, which means the memory usage of the DataFrame's index is included in the returned Series. When index is set to True, the memory consumption of the index is the first row in the output.

To calculate the memory consumption of the entire DataFrame (in bytes), sum the memory usage of all columns.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

print(df.memory_usage(index=True).sum())  # 👇️ 176
DataFrame.memory_usage() returns a Series, so the chained call is Series.sum(), which returns the sum of the values. The method is equivalent to numpy.sum().
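Raw byte counts get hard to read once a DataFrame grows. The sketch below converts the total into a human-readable string. Note that format_bytes is a hypothetical helper written for this example, not a pandas function, and it assumes binary units (1 KiB = 1024 bytes).

```python
import pandas as pd


def format_bytes(num_bytes):
    """Hypothetical helper: render a byte count using binary units."""
    size = float(num_bytes)
    for unit in ('bytes', 'KiB', 'MiB'):
        if size < 1024:
            return f'{size:.1f} {unit}'
        size /= 1024
    # Anything that reaches this point is reported in GiB.
    return f'{size:.1f} GiB'


df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Total size in bytes, including the index.
total = int(df.memory_usage(index=True).sum())

print(format_bytes(total))
```

The exact number printed depends on your pandas version and platform, so treat the helper as a formatting aid rather than a measurement tool.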
If you want to include the memory footprint of object dtype columns in the result, set the deep argument to True when calling DataFrame.memory_usage().
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

print(df.memory_usage(deep=True).sum())  # 👇️ 514

print(df.memory_usage(deep=False).sum())  # 👇️ 176
If the deep argument is set to True, the calculation accounts for the full memory usage of the objects contained in the DataFrame.
By default, the deep argument is set to False, so the full memory footprint of object dtype columns is not included.
Here is an example of setting deep to True without chaining a sum() call.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Index    128
# Name     185
# Date     201
# dtype: int64
print(df.memory_usage(deep=True))

print('-' * 50)

# Index    128
# Name      24
# Date      24
# dtype: int64
print(df.memory_usage(deep=False))
Passing deep=False is the same as not passing the argument at all because False is its default value.
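To see which columns the deep argument actually affects, you can put the shallow and deep results side by side. The sketch below is an illustration under a few assumptions: the Age column is made up for this example, and the Name column is forced to the object dtype so that the result does not depend on how your pandas version infers string columns.

```python
import pandas as pd

df = pd.DataFrame({
    # Explicit object dtype (assumption made for this example).
    'Name': pd.Series(['Alice', 'Bobby', 'Carl'], dtype=object),
    # Hypothetical numeric column, not part of the original DataFrame.
    'Age': [29, 30, 31]
})

shallow = df.memory_usage(deep=False)
deep = df.memory_usage(deep=True)

# One row per column (plus the index), one column per measurement.
comparison = pd.DataFrame({'shallow': shallow, 'deep': deep})
comparison['difference'] = comparison['deep'] - comparison['shallow']

print(comparison)
```

The numeric column should report the same size in both modes, because its values are stored directly in the underlying array. Only the object column grows, since deep=True also counts the Python string objects the column points to.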
Get the memory size of a DataFrame using sys.getsizeof()
You can also use the sys.getsizeof() function to get the memory size of a DataFrame.
import sys

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

print(df.memory_usage(deep=True).sum())  # 👇️ 514

print(sys.getsizeof(df))  # 👇️ 530
The function returns the size of the supplied object in bytes.
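The small gap between the two numbers has a likely explanation: pandas objects appear to report their deep memory usage through the __sizeof__() method, and sys.getsizeof() adds the garbage collector overhead of the interpreter on top of that. The sketch below measures the gap. The size of the overhead is an implementation detail of CPython and may differ between versions, so no expected value is shown.

```python
import sys

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Deep memory usage as reported by pandas.
deep_total = int(df.memory_usage(deep=True).sum())

# Size as reported by the interpreter.
reported = sys.getsizeof(df)

# The remainder is interpreter overhead, not DataFrame data.
overhead = reported - deep_total

print(deep_total, reported, overhead)
```

If you only care about the data held by the DataFrame, memory_usage(deep=True).sum() is the more direct measurement.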
Get the memory size of a DataFrame using DataFrame.info()
You can also use the DataFrame.info() method to get the memory size of a DataFrame.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# info() prints the summary itself and returns None,
# so there is no need to wrap it in print().
# memory usage: 176.0+ bytes
df.info()
The DataFrame.info() method prints a concise summary of a DataFrame.

You should be able to see the memory usage toward the end of the output.
You can also set the memory_usage argument to "deep" to include the memory footprint of object dtype columns.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# info() prints the summary itself and returns None.
# memory usage: 514.0 bytes
df.info(memory_usage='deep')
When memory_usage is set to "deep", the calculation accounts for the full memory usage of the objects contained in the DataFrame.
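Because DataFrame.info() prints its summary instead of returning it, you cannot assign the result to a variable directly. One way around this, sketched below, is to pass an in-memory text buffer via the buf argument and pick out the memory line yourself. The variable names are only illustrative, and the sketch assumes the summary contains a line that starts with "memory usage".

```python
import io

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bobby', 'Carl'],
    'Date': ['2023-07-12', '2023-08-23', '2023-08-21']
})

# Write the summary into a buffer instead of standard output.
buffer = io.StringIO()
df.info(buf=buffer, memory_usage='deep')

summary = buffer.getvalue()

# Keep only the line that reports the memory usage.
memory_line = next(
    line for line in summary.splitlines()
    if line.startswith('memory usage')
)

print(memory_line)
```

If you need the number itself rather than formatted text, DataFrame.memory_usage(deep=True).sum() is simpler than parsing this line.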
You can learn more about the related topics by checking out the following tutorials: