torch.set_num_threads(16)
but didn't see any improvement. I'm running translation on a machine with multiple CPU cores.
Hi, I am using OpenNMT for English-to-Gujarati translation. I applied the SentencePiece transform to the Gujarati corpus and trained the model on 70k sentences. But when I ran translate.py and printed the verbose output, I found that most of the subwords were being mapped to the <unk> tag. E.g.:
"""
[2020-11-30 12:07:40,417 INFO]
SENT 1: ['but', 'some', 'really', 'good', 'work', 'is', 'being', 'done', 'around', 'us', ',', 'in', 'our', 'society', '.', 'and', 'all', 'this', 'has', 'been', 'possible', 'through', 'the', 'collective', 'efforts', 'of', 'a', '130', 'crore', 'countrymen', '.']
PRED 1: ▁છત <unk> <unk> ▁કરવા . . . . ▁સમગ્ર ▁ચ વ ર્ક ી સુર ક્ષ િત ▁અને ▁સમગ્ર ત ▁વધાર ો ▁છે ▁.
PRED SCORE: -30.2231
GOLD 1: <unk> <unk> <unk> <unk> <unk> <unk> <unk> આપણી <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> તે <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> .
GOLD SCORE: -674.3291
[2020-11-30 12:07:40,417 INFO]
SENT 2: ['this', 'can', 'pave', 'a', 'new', 'way', 'to', 'serve', 'the', 'purpose', 'of', 'ek', 'bharat', 'shresht', 'bharat', 'abhiyan', ',', 'ultimately', 'providing', 'a', 'great', 'boost', 'to', 'india’s', 'tourism', 'development', '.']
PRED 2: ▁સરપંચ ▁છું ▁ , ▁વિમુદ્રીકરણ ▁ <unk> ▁ , ▁ડેરી ▁ઉદ્યોગ નું ▁ઉદ્ઘાટન ▁કરવું ▁ , ▁ડેરી ઓ ▁ ,
PRED SCORE: -21.4394
GOLD 2: <unk> એક ભારત , શ્રેષ્ઠ ભારત <unk> <unk> પૂર્ણ <unk> <unk> માર્ગ <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> .
GOLD SCORE: -431.0326
"""
So is there a specific format in which the input has to be given? Please guide me on this issue.
Thanks in advance.
Here the GOLD sentences are the target ground-truth sentences.
example_vocab.src has the following snippet:
products 117
still 117
market 116
Hotel 116
see 115
great 114
today 114
through 114
against 112
rights 112
That 111
believe 111
therefore 110
under 110
case 109
same 109
But 109
offer 109
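A quick way to see how well a tokenized file matches a vocabulary in the `token count` format shown above is to count how many of its tokens are missing from that vocabulary. The following is only a minimal sketch: `load_vocab` and `unk_rate` are hypothetical helper names, not part of OpenNMT, the sample sentences are made up, and tokens are assumed to be whitespace-separated. If the GOLD side scores a very high rate, one possible cause (an assumption, not confirmed in this thread) is that the reference file was not segmented the same way as the training data.

```python
from typing import Iterable, Set


def load_vocab(lines: Iterable[str]) -> Set[str]:
    """Parse 'token count' lines and return the set of tokens."""
    vocab = set()
    for line in lines:
        parts = line.split()
        if parts:  # skip blank lines
            vocab.add(parts[0])
    return vocab


def unk_rate(sentences: Iterable[str], vocab: Set[str]) -> float:
    """Fraction of whitespace-separated tokens that are not in the vocabulary."""
    total = 0
    unknown = 0
    for sentence in sentences:
        for token in sentence.split():
            total += 1
            if token not in vocab:
                unknown += 1
    return unknown / total if total else 0.0


# Illustrative data only: a few vocabulary lines and two made-up sentences.
vocab = load_vocab(["products 117", "still 117", "market 116", "great 114"])
rate = unk_rate(["still great products", "a great market today"], vocab)
print(f"out-of-vocabulary rate: {rate:.2f}")
```

To use it on real files, pass the open vocabulary file to `load_vocab` and the open tokenized text file to `unk_rate`, since both accept any iterable of lines.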
I had a few questions related to the REST server as well. The server.py mentioned in the documentation is showing as unavailable. Has the server been updated along with the library upgrade?
I don't understand. Post an issue on the repo with exact command and error trace.
Note:
Some features require Python 3.5 or later (e.g. distributed multi-GPU, entmax).
We currently only support PyTorch 1.4.