
Sunday, February 26, 2017

How to solve CNMEM_STATUS_OUT_OF_MEMORY error with Theano on CUDA

Have you come across the CNMEM_STATUS_OUT_OF_MEMORY error when using Keras with Theano on CUDA? You might have been trying to train a slightly larger model, and just as training starts it throws this error and fails.

The CNMEM_STATUS_OUT_OF_MEMORY thrown in Theano with CUDA

The full error stack looks something like this:


 Traceback (most recent call last):  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\theano\compile\function_module.py", line 884, in __call__  
   self.fn() if output_subset is None else\  
 MemoryError: Error allocating 513802240 bytes of device memory (CNMEM_STATUS_OUT_OF_MEMORY).  
   
 During handling of the above exception, another exception occurred:  
   
 Traceback (most recent call last):  
  File "TheMachine.py", line 123, in <module>  
   batch_size=training_batch_size, nb_epoch=10, verbose=1)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\keras\models.py", line 672, in fit  
   initial_epoch=initial_epoch)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\keras\engine\training.py", line 1196, in fit  
   initial_epoch=initial_epoch)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\keras\engine\training.py", line 891, in _fit_loop  
   outs = f(ins_batch)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\keras\backend\theano_backend.py", line 959, in __call__  
   return self.function(*inputs)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\theano\compile\function_module.py", line 898, in __call__  
   storage_map=getattr(self.fn, 'storage_map', None))  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\theano\gof\link.py", line 325, in raise_with_op  
   reraise(exc_type, exc_value, exc_trace)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\six.py", line 685, in reraise  
   raise value.with_traceback(tb)  
  File "C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\theano\compile\function_module.py", line 884, in __call__  
   self.fn() if output_subset is None else\  
 MemoryError: Error allocating 513802240 bytes of device memory (CNMEM_STATUS_OUT_OF_MEMORY).  
 Apply node that caused the error: GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}(CudaNdarrayConstant{[[[[ 0.5]]]]}, GpuElemwise{Add}[(0, 0)].0)  
 Toposort index: 109  
 Inputs types: [CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D)]  
 Inputs shapes: [(1, 1, 1, 1), (128, 20, 224, 224)]  
 Inputs strides: [(0, 0, 0, 0), (1003520, 50176, 224, 1)]  
 Inputs values: [b'CudaNdarray([[[[ 0.5]]]])', 'not shown']  
 Outputs clients: [[GpuContiguous(GpuElemwise{Composite{(i0 * (i1 + Abs(i1)))},no_inplace}.0)]]  
   
 HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.  
 HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.  

You might think that your training dataset is too big, and that's what's causing the error. But actually, it's the size of your machine learning model that's the cause. When a model gets more complex (more layers, more convolutions, etc.), it needs far more GPU memory to train.
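You can see this directly in the traceback above: the "Inputs shapes" line reports an intermediate activation tensor of shape (128, 20, 224, 224), and multiplying that out in float32 gives exactly the 513802240 bytes the allocator failed to find. A quick sanity check:

```python
# The failing allocation in the traceback is the float32 activation
# tensor whose shape is reported under "Inputs shapes".
batch, channels, height, width = 128, 20, 224, 224
bytes_per_float32 = 4

activation_bytes = batch * channels * height * width * bytes_per_float32
print(activation_bytes)  # matches the byte count in the MemoryError
print("%.1f MB" % (activation_bytes / 1024.0 ** 2))
```

Note that this is just one intermediate tensor; training also needs memory for weights, gradients, and the activations of every other layer, which is why complex models exhaust GPU memory so quickly.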

So, how do you solve it?
One way to get around the issue is by reducing the training batch size, which is controlled by the batch_size parameter in model.fit().
 model.fit(trainData, trainLabels, batch_size=128, nb_epoch=50, verbose=1)  

If you got the error with batch_size=128, try reducing it to 64, then 32, and so on until a batch fits in GPU memory. Keep in mind that you may also need to adjust the learning rate when you change the batch size.
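That trial-and-error process can be wrapped in a small helper. This is only a sketch: `fit_with_fallback` is a hypothetical function name, and it assumes a Keras model raises MemoryError (as CNMeM does in the traceback above) when the batch doesn't fit.

```python
def fit_with_fallback(model, x, y, batch_sizes=(128, 64, 32, 16)):
    """Try model.fit() with progressively smaller batch sizes until
    one fits in GPU memory. Returns the batch size that worked."""
    for batch_size in batch_sizes:
        try:
            # Same call as in the snippet above, just with a varying batch size.
            model.fit(x, y, batch_size=batch_size, nb_epoch=50, verbose=1)
            return batch_size
        except MemoryError:
            print("batch_size=%d did not fit in GPU memory, "
                  "retrying with a smaller batch" % batch_size)
    raise MemoryError("even the smallest batch size did not fit")
```

Remember that once you find a working batch size, you should retrain from scratch with an appropriately adjusted learning rate rather than keep the partially failed run.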

Build Deeper: Deep Learning Beginners' Guide is the ultimate guide for anyone taking their first step into Deep Learning.

Get your copy now!
