batch_size, since increasing it would result in OOM and decreasing it would result in under-utilization of compute resources.
Translator object? Can you post these details in this issue: OpenNMT/CTranslate2#414
Before CUBLAS_NOT_SUPPORTED we got an "Out of memory" error.
We run two models, each ~300 MB. But nvidia-smi showed that only 1 GB of the 40 GB was in use, and then we get "out of memory".
To convert the models to CTranslate2, I used this command (to create 8-bit models):
ct2-opennmt-tf-converter --model_path INPUT_ONMT_MODEL_DIR --model_spec TransformerBig --output_dir OUTPUT_DIR --quantization int8
Maybe it's because of the "fabric manager" that needs to run with A100 GPUs?
The Translator object is created like this:
model = ctranslate2.Translator(path, device=DEVICE)
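For reference, a minimal sketch of how the loading options could be made explicit; `MODEL_DIR` and `make_translator_kwargs` are illustrative names, and the `compute_type="int8"` assumption is only valid because the model above was converted with `--quantization int8`:

```python
import os

MODEL_DIR = "OUTPUT_DIR"  # illustrative: directory produced by ct2-opennmt-tf-converter

def make_translator_kwargs(device="cuda", device_index=0, compute_type="int8"):
    """Collect the options passed to ctranslate2.Translator.

    device_index pins the model to one GPU; compute_type should match
    the quantization used at conversion time (int8 in this report).
    """
    return {
        "device": device,
        "device_index": device_index,
        "compute_type": compute_type,
    }

# Only load the model if the converted directory actually exists,
# since this sketch has no model bundled with it.
if os.path.isdir(MODEL_DIR):
    import ctranslate2
    translator = ctranslate2.Translator(MODEL_DIR, **make_translator_kwargs())
```

Making `device_index` and `compute_type` explicit also makes it easier to report the exact configuration when filing the issue.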
The package version: