# Common issues
## luajit: out of memory
This most likely happened while training a model on long sequences, causing the LuaJIT memory limit to be reached. You will need to switch to Lua 5.2 instead.
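A minimal sketch of one way to do this, assuming a standard Torch distro checkout in `~/torch` (the paths and the `clean.sh` step may differ for your installation):

```bash
# Reinstall the Torch distribution against Lua 5.2 instead of LuaJIT.
# Assumes the usual ~/torch checkout of the torch/distro repository.
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
./clean.sh                            # remove a previous LuaJIT-based build, if any
TORCH_LUA_VERSION=LUA52 ./install.sh  # build Torch with plain Lua 5.2
```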
## THCudaCheck FAIL [...]: out of memory
This means your model was too large to fit in the available GPU memory.

To work around this error during training, follow these steps in order and stop as soon as the training no longer fails (an example command line is sketched after the list):

- Prefix your command line with `THC_CACHING_ALLOCATOR=0`
- Reduce the `-max_batch_size` value (160 by default)
- Reduce the `-max_tokens` value (1800 by default)
- Reduce the `-src_seq_length` and `-tgt_seq_length` values during the preprocessing
- Reduce your model size (`-layers`, `-rnn_size`, etc.)
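For illustration only, here is roughly what the first steps could look like with the `train.lua` and `preprocess.lua` scripts; the file names (`demo-train.t7`, `src-train.txt`, ...) and the reduced values are hypothetical and should be adapted to your setup:

```bash
# Disable the caching allocator and lower the batch size / token budget.
THC_CACHING_ALLOCATOR=0 th train.lua -data demo-train.t7 -save_model demo \
    -gpuid 1 -max_batch_size 64 -max_tokens 1000

# If that is not enough, re-run the preprocessing with shorter sequences...
th preprocess.lua -train_src src-train.txt -train_tgt tgt-train.txt \
    -valid_src src-val.txt -valid_tgt tgt-val.txt -save_data demo \
    -src_seq_length 30 -tgt_seq_length 30

# ...or train a smaller model.
th train.lua -data demo-train.t7 -save_model demo -gpuid 1 -layers 1 -rnn_size 250
```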
## unknown Torch class <torch.CudaTensor>
This means you tried to load a GPU model but did not set the `-gpuid` option to define which GPU to use.
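For example, a translation command using the first GPU could look like this (the model and data file names are placeholders):

```bash
# -gpuid 1 selects the first GPU so the CUDA tensors in the model can be loaded.
th translate.lua -model model.t7 -src src-test.txt -output pred.txt -gpuid 1
```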