Guillaume Klein
@guillaumekln
With input feeding, the output of the attention layer at t is concatenated to the word embedding input at t + 1. cf. https://arxiv.org/abs/1508.04025
The brnn_merge option only applies to bidirectional layers, which the rnn encoder type does not have.
@stribizhev Disabling the CUDA caching memory allocator will have a small impact on training time, at most 10% I would say (I did not measure it though). Regarding the network size, people frequently add these options: -encoder_type brnn -layers 4 -rnn_size 1000.
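As an aside on the input feeding mechanism described above, here is a minimal PyTorch-style sketch of a single decoder step; the sizes, names, and the dot-product attention are illustrative assumptions, not OpenNMT's actual code:

```python
import torch
import torch.nn as nn

# Illustrative sizes (not OpenNMT defaults).
vocab_size, emb_size, rnn_size = 10000, 500, 500

embedding = nn.Embedding(vocab_size, emb_size)
# With input feeding, the RNN input is the word embedding concatenated with
# the previous attentional output, hence emb_size + rnn_size.
cell = nn.LSTMCell(emb_size + rnn_size, rnn_size)
attn_out_proj = nn.Linear(2 * rnn_size, rnn_size)

def decoder_step(word_ids, prev_attn_output, state, encoder_states):
    # word_ids: (batch,) target word at time t + 1
    # prev_attn_output: (batch, rnn_size) attention layer output at time t
    # encoder_states: (batch, src_len, rnn_size)
    emb = embedding(word_ids)
    rnn_input = torch.cat([emb, prev_attn_output], dim=-1)  # input feeding
    h, c = cell(rnn_input, state)
    # Global dot-product attention over the encoder states.
    scores = torch.bmm(encoder_states, h.unsqueeze(-1)).squeeze(-1)
    align = torch.softmax(scores, dim=-1)
    context = torch.bmm(align.unsqueeze(1), encoder_states).squeeze(1)
    # The attentional output fed to the generator and to the next time step.
    attn_output = torch.tanh(attn_out_proj(torch.cat([context, h], dim=-1)))
    return attn_output, (h, c)
```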
Ayushi Dalmia
@ayushidalmia
Thanks!
stribizhev
@stribizhev
@guillaumekln Hi Guillaume, thank you for the feedback. I kept the CUDA CMA enabled and the training completed in 1.5 days. The frequently used options are used frequently, I think, because they are the ones from the WMT16 training tutorial. The main problem for me with the network settings is that I can't find any guidelines on how to calculate the size. I see that people mostly rely on gut feeling here. http://forum.opennmt.net/t/issues-when-running-the-english-german-wmt15-training/228/19?u=wiktor.stribizew and http://forum.opennmt.net/t/how-should-i-choose-parameters/994 show that it is not so evident. I used -layers 4 -rnn_size 1000 -encoder_type brnn -word_vec_size 600 for the next training (ENJA), with 4M segments, and Epoch 1 has been training for ~2 days. Maybe that is OK, but it also takes a lot of GPU memory and constantly crashes with "out of memory" errors.
stribizhev
@stribizhev
Ah, as for the out of memory, it turned out that after the nohup background process crashed, the /torch/install/bin/luajit process for that training was still running. Strange.
Ayushi Dalmia
@ayushidalmia
@guillaumekln I have a torch model trained using OpenNMT. Can I load it in OpenNMT tensorflow?
Guillaume Klein
@guillaumekln
No, the two projects have very different internals.
Ayushi Dalmia
@ayushidalmia
Ok. Thanks @guillaumekln.
Jacker
@jackeri
Any ideas why I am
getting error loading module 'tools.restserver.restserver' from file 'tools/restserver/restserver': cannot read tools/restserver/restserver: Is a directory
when I try to start up the REST server?
Guillaume Klein
@guillaumekln
Hello! Which version of the code are you using?
SantoshSanas
@SantoshSanas
What will happen if I do not use tokenization in OpenNMT?
Jean Senellart
@jsenellart
Hi Santosh - nothing bad! You can use external tokenization, but you do need to have detokenization too.
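For example, with the OpenNMT Tokenizer's Python bindings (pyonmttok, used here only as one possible choice since no tool is named above), an external tokenize/detokenize round trip could look like this sketch:

```python
# Sketch of external tokenization/detokenization around OpenNMT.
# The library (pyonmttok) and its options are one possible choice, not a requirement.
import pyonmttok

tokenizer = pyonmttok.Tokenizer("aggressive", joiner_annotate=True)

# Before training/translation: tokenize the text you feed to OpenNMT.
tokens, _ = tokenizer.tokenize("Hello, world!")
line = " ".join(tokens)

# After translation: detokenize the model output back to plain text.
detokenized = tokenizer.detokenize(line.split())
print(detokenized)  # -> "Hello, world!"
```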
SantoshSanas
@SantoshSanas
Thanks Jean!!!
Shivani Poddar
@shivanipoddariiith
Hi, I was hoping to add context vectors for each source sentence to condition the decoder output in the baseline encoder-decoder architecture for ONMT. The task of integrating these into the TextDataset seems non-trivial. Does anyone have ideas on what would be a simplified design choice to achieve this?
Thanks a lot
fgaim
@fgaim
@guillaumekln In regards to target-side word features (factored MT), the docs for OpenNMT (Lua) say the decoder predicts the features of the decoded sentence. So, is the prediction loss being backpropagated along with the translation loss, hence eventually improving both the translation and feature prediction tasks? Or does the translation in any way benefit from target-side features in the current implementation? Thanks!
Guillaume Klein
@guillaumekln

    is the prediction loss being backpropagated along with the translation loss, hence eventually improving both the translation and feature prediction tasks?

    Yes, the losses are summed together before backpropagation.
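A minimal PyTorch-style sketch of that combination (names and shapes are illustrative, not OpenNMT's implementation):

```python
import torch.nn as nn

criterion = nn.NLLLoss(ignore_index=0)  # 0 assumed to be the padding id

def training_step(word_log_probs, word_targets, feat_log_probs, feat_targets):
    # word_log_probs: (batch * len, word_vocab) log-probabilities
    # feat_log_probs: (batch * len, feat_vocab) log-probabilities
    word_loss = criterion(word_log_probs, word_targets)
    feat_loss = criterion(feat_log_probs, feat_targets)
    total_loss = word_loss + feat_loss  # losses summed together
    total_loss.backward()               # one backward pass trains both heads
    return total_loss.item()
```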

stribizhev
@stribizhev
@guillaumekln @jsenellart Is there a way to combine two BPE models? The reason is that I have a generic model with its BPE model; then I train an incremental model using new corpora, for which I create a new BPE model. When running incremental training, I use -update_vocab merge. If the vocabulary is merged, I guess I should also merge the BPE models to use when translating, right?
Vincent Nguyen
@vince62s
Don't create a new BPE model, just tokenize with the first one.
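For example, assuming subword-nmt style BPE codes (the actual BPE tool is not stated above, and file names are placeholders), reusing the generic model's codes on the new corpora looks like:

```python
# Sketch: apply the generic model's existing BPE codes to the incremental
# corpora instead of training a second BPE model.
from subword_nmt.apply_bpe import BPE

with open("generic.bpe.codes", encoding="utf-8") as codes:
    bpe = BPE(codes)

with open("new_corpus.tok", encoding="utf-8") as src, \
     open("new_corpus.bpe", "w", encoding="utf-8") as out:
    for line in src:
        out.write(bpe.process_line(line))
```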
Ayushi Dalmia
@ayushidalmia
Hi, Is there a way to generate the attention map for a given torch model?
hypnoseal
@hypnoseal
Hello, I am interested in learning more about machine learning and OpenNMT has me captivated. I have an idea for a project to learn more with OpenNMT, but I need more information on preparing the source and target data files. I've looked through the documentation, but I'm having trouble finding resources about pre-preprocessing the data. That is, are there best practices for preparing the data in source.txt and target.txt before preprocessing? Are there guidelines on the corpus of text required (number of sentences/size of sentences/theme/etc.)? Any reading materials or other resources from which I could learn more about preparing the data would be greatly appreciated. Thank you very much for the help!
To add to the above question: I plan on using a corpus of text where one body is English and the other is written in Canadian Aboriginal syllabics. Are there considerations I should have with a Latin to non-Latin character set? It is possible to convert the syllabics to Roman orthography.
Rausch Merscen
@rauschmerscen
@rauschmerscen
Can OpenNMT help me achieve a tool like https://spinbot.com
I have a theory that if I do English->German (or any other language)->English, I can get an article spinner
?
Aditya
@adityakpathak
Hello sir, I have a question related to audio identification: can I use OpenNMT for language identification and then, after the language is identified, convert the audio into text?
Is that possible?
Jean Senellart
@jsenellart
@rauschmerscen - the idea is good, but I am not sure it will work since the models are generally trained on the same dataset, and EN>DE>EN will be very close to the original. On spinbot they apparently have a list of paraphrases they use to generate these alternatives. You can find a good review of paraphrase building and novel approaches here: http://aclweb.org/anthology/E17-1083
Rausch Merscen
@rauschmerscen
@jsenellart Thanks for looking into my project,
Jean Senellart
@jsenellart
@DA1234k - technically it is possible, and we did something like that recently using opennmt-tf (language id on an audio signal) - unfortunately we cannot share the paper yet as it is in submission. But it is not very hard, you just need data for that.
Rausch Merscen
@rauschmerscen
@jsenellart very helpful paper
Can you suggest the easiest way to achieve a spinbot? I can't find any repo.
How can I generate, or where can I get, a list of paraphrases like spinbot has?
Saurabh Vyas
@saurabhvyas
Has anyone worked with image-to-markup using OpenNMT? I am trying to perform inference on a single image using a pretrained model, but it currently requires the whole test set. I don't have labels, just the image.
Yuntian
@da03
@saurabhvyas Are you using the LuaTorch version or the PyTorch version? If it's the PyTorch version, then no -tgt needs to be provided to translate.py; if it's the LuaTorch version, you can provide dummy ground-truth labels such as 'aaa'.
stribizhev
@stribizhev
Hi guys, any idea why I am getting THCudaCheck FAIL file=/tmp/luarocks_cunn-scm-1-1394/cunn/lib/THCUNN/generic/SoftMax.cu line=72 error=48 : no kernel image is available for execution on the device /torch/install/bin/luajit: /torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (48) : no kernel image is available for execution on the device at /tmp/luarocks_cunn-scm-1-1394/cunn/lib/THCUNN/generic/SoftMax.cu:72 after [07/06/18 11:09:23 INFO] Preparing memory optimization...? I run training on a Tesla V100 machine with 4 GPUs and can't work around this issue. I start training with th ./train.lua ... -gpuid 1 2 3 4.
stribizhev
@stribizhev
OK, I reinstalled Torch and all Luarocks, rebooted, tried to install NCCL 2.2, updated nn and cunn, and it now runs with CUDA_VISIBLE_DEVICES=0,1,2,3 th ./train.lua ... -gpuid 1 2 3 4, but for the life of me, it still says there is no NCCL even though I installed it from the NVIDIA site, both the Ubuntu 16.04 and OS-agnostic versions (for CUDA 9.0, though I have CUDA 9.1 installed - but there is no NCCL for 9.1 :().
Saurabh Vyas
@saurabhvyas
@da03 Thanks for your reply, sorry for the late response. It was for Lua, and with dummy data it worked nicely. However, I am trying the PyTorch version now, and also a TensorFlow implementation (https://github.com/guillaumegenthial/im2latex/). In the TF version I trained with custom font rendering and size augmentation on the same formulas.lst provided by the im2latex dataset. Even though training perplexity reached 1.07 and validation perplexity 1.15, almost 80% of the formulas do not match exactly on the test set. Should I play with the hyperparameters?
Sirogha
@Sirogha
Hi. How can I find out which CUDA version is used by OpenNMT?
Guillaume Klein
@guillaumekln
@Sirogha Do you mean the CUDA version that is required by OpenNMT?
M Saiful Bari
@sbmaruf
@guillaumekln hi there...!
In case two languages don't share the same characters, how do we do the translation of a name? For example, "Rahul" (a male name) in English should be "راهول" in Arabic and "राहुल" in Hindi.
So how does the <unk> tag replacement actually work in that case?
If there is any related reference paper you can cite, that would be great too. And also, does OpenNMT support this?
Guillaume Klein
@guillaumekln
Hi! In OpenNMT-Lua, there is the -phrase_table option that can be used for this: for any <unk> token in the target, the corresponding source token is looked up in the table to find a translation. Other approaches include splitting names into characters and letting the model learn the translation or, more commonly, replacing entities with placeholder tokens and having separate processing software replace these placeholders.
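A rough sketch of that replacement step, assuming you already have, for each target position, the index of the most-attended source token; the phrase table format used here (one "src ||| tgt" pair per line) is just an illustration:

```python
# Sketch of phrase-table based <unk> replacement using attention alignments.
def load_phrase_table(path):
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            src, tgt = line.rstrip("\n").split(" ||| ")
            table[src] = tgt
    return table

def replace_unks(src_tokens, tgt_tokens, alignments, phrase_table):
    # alignments[t] = index of the source token most attended at target step t
    out = []
    for t, token in enumerate(tgt_tokens):
        if token == "<unk>":
            src_token = src_tokens[alignments[t]]
            # Fall back to copying the source token if it is not in the table.
            out.append(phrase_table.get(src_token, src_token))
        else:
            out.append(token)
    return out
```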
M Saiful Bari
@sbmaruf
Is it also available in OpenNMT-py/tf?
I don't see any -phrase_table option for OpenNMT-py.
Guillaume Klein
@guillaumekln
Yes, it's only in OpenNMT-Lua unfortunately, but it's not complicated to add to OpenNMT-py. This feature gets less attention nowadays because Transformer models are incompatible with it, as it requires alignment information from the model.
M Saiful Bari
@sbmaruf
So if I'm using a Transformer model, there's no way I can solve this?
Guillaume Klein
@guillaumekln
You can try to split them into characters if you believe there are enough examples in the training data for the model to learn something. Otherwise, this requires external pre- and post-processing.
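If you go the character-splitting route, a toy pre/post-processing sketch could look like the following; the name list and the "@c@" marker are made-up conventions for illustration, not an OpenNMT option:

```python
# Toy sketch: split name tokens into characters before training/translation so
# the model can learn a character-level transliteration, then rejoin them.
KNOWN_NAMES = {"Rahul", "Guillaume"}  # placeholder name detector

def split_names(tokens):
    out = []
    for token in tokens:
        if token in KNOWN_NAMES:
            # Mark every character so it can be rejoined in post-processing.
            out.extend("@c@" + ch for ch in token)
        else:
            out.append(token)
    return out

def join_names(tokens):
    out, buf = [], []
    for token in tokens:
        if token.startswith("@c@"):
            buf.append(token[3:])
        else:
            if buf:
                out.append("".join(buf))
                buf = []
            out.append(token)
    if buf:
        out.append("".join(buf))
    return out

print(split_names("Rahul went home".split()))
# ['@c@R', '@c@a', '@c@h', '@c@u', '@c@l', 'went', 'home']
```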
M Saiful Bari
@sbmaruf
I was thinking about doing some kind of transliteration/romanization of the target language into the source language.
Otherwise, this requires external pre- and post-processing.
What is the pre/post-processing? Any resources out there?
Apart from that, if I look at Google Translate (in the following link), see the red-marked text there. Are those the pre/post-processing steps?
https://drive.google.com/file/d/1POkH3XE6yfCoNkJn4DyMRP6J11Ecyqjl/view?usp=sharing
Guillaume Klein
@guillaumekln

What is the pre/post-processing?

There are several ways to achieve this and I don't know your current workflow for NMT, so it's difficult for me to answer.

M Saiful Bari
@sbmaruf
@guillaumekln Thank you for your answer. I will think about and work on your points. If I come across any specific problem, I will ask you again. Actually, right now I'm exploring NMT. I wanted to see how to handle the different issues of NMT; this is a pedagogical exploration. Right now I don't have any specific workflow. My research goal is to do a kind of discourse exploration in NMT. I have just started.