torch.set_num_threads(16)
but didn't see any improvements. I'm running translation on a machine with multiple CPU cores.
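For reference, a minimal sketch of how the thread settings can be set and checked in plain PyTorch (the value 16 is just taken from the post above; whether it helps depends on whether translation is actually CPU-bound in PyTorch ops):

import torch

# Cap the number of intra-op threads PyTorch uses for CPU kernels.
# Call this before any heavy PyTorch work starts.
torch.set_num_threads(16)

# Optionally also size the inter-op thread pool (must be set once, early).
torch.set_num_interop_threads(16)

print("intra-op threads:", torch.get_num_threads())
print("inter-op threads:", torch.get_num_interop_threads())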
Hi, I was using OpenNMT for English to Gujarati translation. I used the SentencePiece transform for the Gujarati corpus and trained the model on 70k sentences. But when I use translate.py and print the verbose output, I find that most of the subwords are being mapped to the <unk> tag, e.g.:
"""
[2020-11-30 12:07:40,417 INFO]
SENT 1: ['but', 'some', 'really', 'good', 'work', 'is', 'being', 'done', 'around', 'us', ',', 'in', 'our', 'society', '.', 'and', 'all', 'this', 'has', 'been', 'possible', 'through', 'the', 'collective', 'efforts', 'of', 'a', '130', 'crore', 'countrymen', '.']
PRED 1: ▁છત <unk> <unk> ▁કરવા . . . . ▁સમગ્ર ▁ચ વ ર્ક ી સુર ક્ષ િત ▁અને ▁સમગ્ર ત ▁વધાર ો ▁છે ▁.
PRED SCORE: -30.2231
GOLD 1: <unk> <unk> <unk> <unk> <unk> <unk> <unk> આપણી <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> તે <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> .
GOLD SCORE: -674.3291
[2020-11-30 12:07:40,417 INFO]
SENT 2: ['this', 'can', 'pave', 'a', 'new', 'way', 'to', 'serve', 'the', 'purpose', 'of', 'ek', 'bharat', 'shresht', 'bharat', 'abhiyan', ',', 'ultimately', 'providing', 'a', 'great', 'boost', 'to', 'india’s', 'tourism', 'development', '.']
PRED 2: ▁સરપંચ ▁છું ▁ , ▁વિમુદ્રીકરણ ▁ <unk> ▁ , ▁ડેરી ▁ઉદ્યોગ નું ▁ઉદ્ઘાટન ▁કરવું ▁ , ▁ડેરી ઓ ▁ ,
PRED SCORE: -21.4394
GOLD 2: <unk> એક ભારત , શ્રેષ્ઠ ભારત <unk> <unk> પૂર્ણ <unk> <unk> માર્ગ <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> .
GOLD SCORE: -431.0326
"""
So is there any specific format in which the input has to be given? Please guide me with this issue.
Thanks in advance.
Here the GOLD sentences are the target ground-truth sentences.
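If the model was trained on SentencePiece pieces, then both the source file given to translate.py and the reference file given with -tgt (which produces the GOLD lines) normally have to be segmented with the same SentencePiece models; raw words that are not in the subword vocabulary show up as <unk>. A minimal sketch, assuming a SentencePiece model file from the preprocessing step (the file name spm_english.model is hypothetical; the Gujarati reference side needs its own model in the same way):

import sentencepiece as spm

# Load the same SentencePiece model that was applied to the training data
# (hypothetical file name -- use the model produced during preprocessing).
sp = spm.SentencePieceProcessor()
sp.Load("spm_english.model")

line = "but some really good work is being done around us , in our society ."

# Segment the raw sentence into the subword pieces found in the model vocab.
pieces = sp.EncodeAsPieces(line)
print(" ".join(pieces))

# Pieces from the model's prediction can be turned back into plain text.
print(sp.DecodePieces(pieces))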
example_vocab.src
has the following snippet:
products 117
still 117
market 116
Hotel 116
see 115
great 114
today 114
through 114
against 112
rights 112
That 111
believe 111
therefore 110
under 110
case 109
same 109
But 109
offer 109
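Looking at that snippet, the entries are whole words with no '▁'-prefixed subword pieces, which is consistent with the SentencePiece subwords not matching the vocabulary. A rough check, assuming the file has one "token count" pair per line as shown:

# Rough diagnostic: does example_vocab.src contain SentencePiece pieces or words?
with open("example_vocab.src", encoding="utf-8") as f:
    tokens = [line.rsplit(" ", 1)[0] for line in f if line.strip()]

piece_like = [t for t in tokens if t.startswith("▁")]
print(f"{len(piece_like)} of {len(tokens)} entries look like SentencePiece pieces")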
I had a few questions related to the REST server as well. The server.py mentioned in the documentation appears to be unavailable. Was the server also upgraded along with the library?
I don't understand. Post an issue on the repo with the exact command and error trace.
Note:
some features require Python 3.5 or later (e.g. distributed multi-GPU, entmax)
we currently only support PyTorch 1.4
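A quick way to confirm the running environment matches that note (just a sketch; the authoritative requirements are whatever the installed OpenNMT-py release states):

import sys
import torch

print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)

# Some features (e.g. distributed multi-GPU) need Python 3.5+.
assert sys.version_info >= (3, 5), "Python 3.5 or later is required for some features"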
I can see that PyTorch can find GPU resources:
>>> torch.cuda.is_available()
True
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7f8c0a3cec50>
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'NVIDIA Tesla K80'
However, OpenNMT is not finding the GPU resource. What is going on here?
I am using OpenNMT from https://github.com/OpenNMT/OpenNMT-py/tree/1.2.0
Traceback (most recent call last):
File "../../OpenNMT-py/train.py", line 6, in <module>
main()
File "/root/work/context/huggingface-models/OpenNMT-py/onmt/bin/train.py", line 197, in main
train(opt)
File "/root/work/context/huggingface-models/OpenNMT-py/onmt/bin/train.py", line 91, in train
p.join()
File "/root/miniconda3/envs/open-nmt-env/lib/python3.7/multiprocessing/process.py", line 140, in join
res = self._popen.wait(timeout)
File "/root/miniconda3/envs/open-nmt-env/lib/python3.7/multiprocessing/popen_fork.py", line 48, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/root/miniconda3/envs/open-nmt-env/lib/python3.7/multiprocessing/popen_fork.py", line 28, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/root/work/context/huggingface-models/OpenNMT-py/onmt/bin/train.py", line 181, in signal_handler
raise Exception(msg)
Exception:
-- Tracebacks above this line can probably
be ignored --
Traceback (most recent call last):
File "/root/work/context/huggingface-models/OpenNMT-py/onmt/bin/train.py", line 135, in run
gpu_rank = onmt.utils.distributed.multi_init(opt, device_id)
File "/root/work/context/huggingface-models/OpenNMT-py/onmt/utils/distributed.py", line 27, in multi_init
world_size=dist_world_size, rank=opt.gpu_ranks[device_id])
File "/root/miniconda3/envs/open-nmt-env/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 510, in init_process_group
timeout=timeout))
File "/root/miniconda3/envs/open-nmt-env/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 603, in _new_process_group_helper
timeout)
RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found!
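The final RuntimeError means the spawned training process tried to set up a NCCL process group (used for multi-GPU training) but could not see any CUDA device, even though the interactive checks above succeeded. A hedged diagnostic, independent of OpenNMT-py, is to run the same checks from the exact shell and environment that launch train.py, paying attention to CUDA_VISIBLE_DEVICES:

import os
import torch

# Run with the same interpreter, conda env and shell that launch train.py.
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("torch built with CUDA:", torch.version.cuda)
print("torch.cuda.is_available():", torch.cuda.is_available())
print("torch.cuda.device_count():", torch.cuda.device_count())

# ProcessGroupNCCL needs at least one visible GPU; if device_count() is 0
# here, the distributed init in train.py will fail with the same error.

If device_count() comes back as 0 in that context, the most common causes are a CPU-only PyTorch build or a CUDA_VISIBLE_DEVICES value that hides the GPU from the shell running the training command.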