NumPy: Unable to allocate array with shape and data type

Borislav Hadzhiev

Last updated: Apr 12, 2024


# Table of Contents

  1. NumPy: Unable to allocate array with shape and data type
  2. Setting the overcommit mode to 1
  3. NumPy: Unable to allocate array with shape and data type on Windows
  4. Make sure your Python installation is 64-bit
  5. Try setting the dtype argument to uint8
  6. Splitting the data structure or reading it in chunks

# NumPy: Unable to allocate array with shape and data type

The NumPy error "Unable to allocate array with shape and data type" occurs when the operating system refuses to reserve the memory the array needs. On Linux, the most common cause is the kernel's default overcommit handling.

By default, the overcommit mode is set to 0 (heuristic mode).

This means that obvious overcommits of address space are refused automatically.

Here is an example of how the error occurs.

main.py
import numpy as np

# ⛔️ numpy.core._exceptions._ArrayMemoryError: Unable to allocate 236. GiB for an array with shape (126816, 36, 55406) and data type uint8
arr = np.zeros((126816, 36, 55406), dtype='uint8')


Note: if you got the error on Windows, skip to the "NumPy: Unable to allocate array with shape and data type on Windows" section.

On Linux, you can check your overcommit mode by issuing the following command.

shell
cat /proc/sys/vm/overcommit_memory


The default overcommit mode is 0.
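If you would rather check the mode from Python, the same file can be read directly. This is a minimal sketch: the `read_overcommit_mode` helper and the `OVERCOMMIT_MODES` mapping are names made up for this example, and the function returns `None` when the file does not exist (for example on Windows or macOS).

```python
from pathlib import Path

# Meanings of the three modes, as described in the Linux kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic (obvious overcommits are refused)",
    1: "always overcommit",
    2: "never overcommit (strict accounting)",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the overcommit mode as an int, or None if the file is missing."""
    proc_file = Path(path)
    if not proc_file.exists():  # e.g. not a Linux system
        return None
    return int(proc_file.read_text().strip())


mode = read_overcommit_mode()
if mode is None:
    print("No overcommit setting found (not a Linux system?)")
else:
    print(mode, OVERCOMMIT_MODES.get(mode, "unknown"))
```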

When we tried to create the array, we tried to allocate 236 GiB.

main.py
import numpy as np

# ⛔️ numpy.core._exceptions._ArrayMemoryError: Unable to allocate 236. GiB for an array with shape (126816, 36, 55406) and data type uint8
arr = np.zeros((126816, 36, 55406), dtype='uint8')

Here is how this is calculated.

main.py
# uint8 uses 1 byte per element, so the byte count equals the element count
# 👇️ ~235.58 (NumPy reports it as 236. GiB)
print(126816 * 36 * 55406 / 1024 ** 3)


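Trying to allocate 236 GiB fails when the overcommit mode is set to 0, because on a machine with far less RAM and swap the kernel treats the request as an obvious overcommit.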
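To see how much memory an array would need before you try to create it, you can multiply the number of elements by the size of one element. The `estimate_gib` helper below is a hypothetical name used for illustration; it only does arithmetic and never allocates the array.

```python
import math

import numpy as np


def estimate_gib(shape, dtype):
    """Estimate the size of an array in GiB without allocating it."""
    n_elements = math.prod(shape)
    bytes_per_element = np.dtype(dtype).itemsize
    return n_elements * bytes_per_element / 1024 ** 3


shape = (126816, 36, 55406)
print(f"{estimate_gib(shape, 'uint8'):.2f}")    # 235.58
print(f"{estimate_gib(shape, 'float64'):.2f}")  # 1884.62
```

The same shape needs 8 times more memory as float64, because each element takes 8 bytes instead of 1.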

# Setting the overcommit mode to 1

One way to resolve the issue is to set the overcommit mode to 1.

You can do this by issuing the following 2 commands.

shell
# switch to super user
sudo su

# set overcommit mode to 1
echo 1 > /proc/sys/vm/overcommit_memory


Note that you need superuser privileges to run the echo command; otherwise, you get a permission denied error.

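When the overcommit mode is set to 1 ("always overcommit"), the kernel never refuses an allocation request, so the call succeeds even if the array is larger than your physical memory. Physical pages are only assigned when they are actually written to, so filling such an array with data can still exhaust memory and trigger the out-of-memory killer.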

main.py
import numpy as np

arr = np.zeros((126816, 36, 55406), dtype='uint8')

print(arr)


Notice that we are now able to create the NumPy array, after setting the overcommit mode to 1.

You can use the cat command to verify that the overcommit mode has been set to 1.

shell
cat /proc/sys/vm/overcommit_memory


Note that changing your overcommit mode as shown above will not persist after a reboot.

You might have to rerun the following 2 commands after each reboot if you want to change the overcommit mode to 1.

shell
# switch to super user
sudo su

# set overcommit mode to 1
echo 1 > /proc/sys/vm/overcommit_memory


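There are also ways to make the change persistent. On most distributions this means adding vm.overcommit_memory = 1 to /etc/sysctl.conf or to a file under /etc/sysctl.d/, but the exact location is distribution-specific.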

# NumPy: Unable to allocate array with shape and data type on Windows

If you got the NumPy "Unable to allocate array with shape and data type" error on Windows:

  1. Click on the Windows key or on the search bar and type "SystemPropertiesAdvanced".
  2. Right-click on "SystemPropertiesAdvanced" and select "Run as administrator".
  3. In the Advanced tab, under Performance, click on Settings....
  4. Click on the Advanced tab and then click Change....
  5. Uncheck the "Automatically manage paging file size for all drives" checkbox.
  6. Click on "Custom size" and type in your preferred Initial and Maximum sizes in MB.
  7. Click on Set, press OK to confirm and exit.
  8. Restart your computer.

If your RAM and paging file together are large enough to hold your NumPy array, the error should be resolved.

A common rule of thumb for the initial size is: initial size = 1.5 * total system memory.

The rule of thumb for the maximum size is: maximum size = 3 * initial size.

For example, if you have 16 GB of memory (1024 MB * 16 = 16384 MB), the initial size would be 16384 * 1.5 = 24576 MB and the maximum size would be 3 * 24576 = 73728 MB.
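The two formulas above can be sketched as a small helper, so you can plug in your own amount of RAM. The function name `paging_file_sizes_mb` is made up for this example, and it assumes 1 GB = 1024 MB, as in the calculation above.

```python
def paging_file_sizes_mb(total_ram_gb):
    """Return (initial, maximum) paging file sizes in MB for the given RAM."""
    total_ram_mb = total_ram_gb * 1024
    initial = int(total_ram_mb * 1.5)  # initial size = 1.5 * total memory
    maximum = initial * 3              # maximum size = 3 * initial size
    return initial, maximum


print(paging_file_sizes_mb(16))  # (24576, 73728)
```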

# Make sure your Python installation is 64-bit

Another thing you should check is that your Python installation is 64-bit.

I've written a detailed guide on how to Check if Python is running as 32-bit or 64-bit.

In short, you can use the following command.

shell
python -c "import sys; print(sys.maxsize > 2**32)"


It prints True if your Python interpreter is 64-bit.

If it prints False, your Python installation is 32-bit.
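The same check can be done from inside a script. This sketch adds a second, independent test based on the pointer size reported by the standard library `struct` module; the variable names are only illustrative.

```python
import struct
import sys

# Size of a C pointer in bits: 32 on a 32-bit build, 64 on a 64-bit build.
pointer_bits = struct.calcsize("P") * 8

# The check from the shell command above.
is_64bit = sys.maxsize > 2 ** 32

print(f"{pointer_bits}-bit interpreter, 64-bit: {is_64bit}")
```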

If your Python installation is 32-bit, it can address a maximum of 4 GB of RAM, and often less in practice.

Even if your PC has more than 4 GB of RAM, a 32-bit Python interpreter cannot use the rest.

You have to visit the official Python downloads page and download the 64-bit Python version (x86-64 label).

You might also run into this issue if you install the 32-bit version of the PyCharm IDE by mistake.

Make sure to install the 64-bit version instead.

# Try setting the dtype argument to uint8

If the error persists, try to set the dtype argument to "uint8" when creating the array.

main.py
import numpy as np

arr = np.zeros((26816, 36, 55406), dtype='uint8')

print(arr)

The uint8 data type represents unsigned integers of 8 bits.

It contains all whole numbers from 0 to 255.

Note that the numbers are unsigned, so they cannot be negative.

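The uint8 data type uses 1 byte per element, whereas float64 (the default for np.zeros) and int64 use 8 bytes, so it often helps resolve memory issues. This only works if all of your values fit in the 0 to 255 range, because larger values wrap around when cast.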
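Because values outside the 0 to 255 range are silently changed when cast to uint8, it can help to check the range first. The `safe_downcast` helper below is a hypothetical function written for this example, not part of NumPy, and it assumes an integer input array.

```python
import numpy as np


def safe_downcast(values, dtype=np.uint8):
    """Cast an integer array to a smaller integer dtype, refusing lossy casts."""
    info = np.iinfo(dtype)
    if values.size and (values.min() < info.min or values.max() > info.max):
        raise ValueError(f"values do not fit in {np.dtype(dtype).name}")
    return values.astype(dtype)


big = np.array([0, 17, 255], dtype=np.int64)
small = safe_downcast(big)

print(big.nbytes, small.nbytes)  # 24 3
```

Calling `safe_downcast(np.array([300]))` raises a `ValueError` instead of returning a wrapped value.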

If you got the error when working with a Pandas DataFrame, use the astype() method to set the type of the values to uint8.

main.py
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'x': np.array([1, 2, 3])
})

df['x'] = df['x'].astype(np.uint8)

print(df)


The code sample sets the type of the values in the x column to uint8.
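To confirm that the conversion actually saves memory, you can compare the column's memory usage before and after the cast. This is a small sketch with made-up data: the column starts as int64 explicitly, so the numbers are the same on every platform.

```python
import numpy as np
import pandas as pd

# 200 values in the 0..199 range, which all fit in uint8.
df = pd.DataFrame({"x": np.arange(200, dtype="int64")})

before = df["x"].memory_usage(index=False)  # bytes used by the values only
df["x"] = df["x"].astype(np.uint8)
after = df["x"].memory_usage(index=False)

print(before, after)  # 1600 200
```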

# Splitting the data structure or reading it in chunks

If the error persists, you might be able to split the dataset you're working with into multiple chunks.

If the object is already initialized and stores all of the data, you might be able to only read a certain amount of the dataset at a time.

Each chunk should be processed separately, so you don't run out of memory.
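One possible way to process data in chunks is to keep it in a file and map it with `np.memmap`, so only the rows you slice are read into memory. The sketch below is an illustration under assumptions: the `chunked_sum` function, the file name, and the small demo array are made up, and the demo file stands in for a dataset that is too large to load at once.

```python
import os
import tempfile

import numpy as np


def chunked_sum(path, shape, dtype="uint8", chunk_rows=1000):
    """Sum an on-disk array by reading `chunk_rows` rows at a time."""
    # mode="r" maps the file read-only; data is loaded only when sliced.
    data = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    total = 0
    for start in range(0, shape[0], chunk_rows):
        chunk = np.asarray(data[start:start + chunk_rows])
        # Use a wide accumulator so the uint8 values do not overflow.
        total += int(chunk.sum(dtype=np.uint64))
    return total


# Hypothetical demo data: a small file standing in for a much larger one.
shape = (5000, 16)
path = os.path.join(tempfile.mkdtemp(), "demo.dat")

writer = np.memmap(path, dtype="uint8", mode="w+", shape=shape)
writer[:] = 1
writer.flush()
del writer  # release the writable mapping before reading

print(chunked_sum(path, shape))  # 80000
```

Only one chunk of `chunk_rows` rows is held in memory at a time, so the peak memory usage depends on the chunk size rather than on the size of the whole file.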
