    Guillaume Klein
    @guillaumekln
    This option is not implemented in OpenNMT-tf.
    alrudak
    @alrudak
    thanks
    James
    @JOHW85

    I stopped a training run midway and changed the corpus (to a slightly bigger, cleaned one), but now I'm having trouble resuming training. The error I get:
    tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
    (0) Invalid argument: StringToNumberOp could not correctly convert string: XXXXXXXXXXXX

    XXXXX tends to be untokenized text which doesn't even exist in my corpus (tokenized or untokenized) and changes randomly on each run.

    The command I use to train (my YAML file is unchanged): onmt-main --model_type TransformerBig --config v4_big.yml --auto_config train --with_eval

    Guillaume Klein
    @guillaumekln
    Can you post your YAML configuration?
    James
    @JOHW85

    model_dir: model_v4_big/

    data:
      train_features_file: v4_big.cleaned.zh
      train_labels_file: v4_big.cleaned.en
      example_weights: v4_big.cleaned.score
      eval_features_file: v4_big_val_old.tok.zh
      eval_labels_file: v4_big_val_old.tok.en
      source_vocabulary: sp5_zh.opennmttf.txt
      target_vocabulary: sp5_en.opennmttf.txt

    train:
      max_step: 5000000
      save_checkpoints_steps: 5000
      batch_size: 4096
      batch_type: tokens
      sample_buffer_size: 40000000
      maximum_features_length: 250
      maximum_labels_length: 250

    eval:
      scorers: bleu
      steps: 5000
      export_on_best: bleu
      early_stopping:
        min_improvement: 0.001
        steps: 20

    Guillaume Klein
    @guillaumekln
    Can you check that the file "v4_big.cleaned.score" is correctly generated and does not contain these texts?
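    (The StringToNumberOp failure points at the example_weights file: each line has to parse as a single number and stay aligned with the corpus. A minimal sketch of such a check, assuming one float per line; check_weights_file is just an illustrative helper.)

    def check_weights_file(path):
        """Report lines of an example_weights file that do not parse as a number."""
        with open(path, encoding="utf-8") as f:
            for i, line in enumerate(f, 1):
                try:
                    float(line.strip())
                except ValueError:
                    print("line %d is not a number: %r" % (i, line))

    check_weights_file("v4_big.cleaned.score")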
    James
    @JOHW85
    None of the files I provided in the YAML config contained those lines. But since I reduced the corpus to exclude the back translations (halving the file), I haven't had the problem. However, running out of system memory was another error I hit previously (the process usually gets killed rather than raising an error).
    alrudak
    @alrudak
    How do I force CTranslate2 to run on CPU only, even if there is a GPU on the server?
    Guillaume Klein
    @guillaumekln
    By default it runs on CPU only (the default value of the "device" option is "cpu").
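    (A minimal sketch of pinning CTranslate2 to the CPU explicitly; the model path and tokens are placeholders, and the exact result layout varies slightly across CTranslate2 versions.)

    import ctranslate2

    # device="cpu" is the default, but it can also be passed explicitly.
    translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
    output = translator.translate_batch([["▁Hello", "▁world"]])  # pre-tokenized input
    print(output[0])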
    alrudak
    @alrudak
    We tried with the “cpu” parameter, but it still ran on the GPU.
    We also added the CUDA_VISIBLE_DEVICES: 0 environment variable, but it didn't help.
    Guillaume Klein
    @guillaumekln
    How did you verify it was executed on the GPU?
    alrudak
    @alrudak
    We ran nvidia-smi during the tests and saw that the process was listed and consuming GPU memory.
    Guillaume Klein
    @guillaumekln
    I did not see this issue before. How do you initialize and run CTranslate2?
    alrudak
    @alrudak
    @guillaumekln Sorry, it was our fault. Works fine on CPU
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I want to fine-tune a pretrained model and freeze the word embeddings and the first few layers of the encoder/decoder. Is that possible?
    Guillaume Klein
    @guillaumekln
    Hi, yes it's possible. For example this configuration freezes the word embeddings and the first layer of the encoder and decoder:
    params:
      freeze_layers:
        - examples_inputter
        - encoder/layers/0
        - decoder/layers/0
    VishalKakkar
    @VishalKakkar
    Thank you @guillaumekln
    elx42
    @elx42

    Hi, we are trying to run OpenNMT-tf models in Triton using the exported SavedModel. It works perfectly on GPU, but on CPU we are seeing the following error message:

    InferenceServerException: The CPU implementation of FusedBatchNorm only supports NHWC tensor format for now.
        [[{{node transformer_base_1/self_attention_encoder_1/self_attention_encoder_layer_6/transformer_layer_wrapper_30/layer_norm_33/FusedBatchNormV3}}]]

    Did anyone run into the same issue and know a solution? Thank you!

    Guillaume Klein
    @guillaumekln
    Hi, what is the TensorFlow version used by the Triton server?
    Jordi Mas
    @jordimas
    Hello. For visibility: I have used OpenNMT to solve a GEC (grammatical error correction) task. I documented what I did and the resulting model here:
    https://github.com/jordimas/gec-opennmt-english
    If this can be useful for others, I can make a PR to add it to https://opennmt.net/Models-tf/
    nashid
    @nashid
    I have a CPU-only environment. What flag do I have to pass for training in that case?
    python3 train.py \
        -data $data_path/final \
        -encoder_type brnn \
        -enc_layers 2 \
        -decoder_type rnn \
        -dec_layers 2 \
        -rnn_size 256 \
        -global_attention general \
        -batch_size 32 \
        -word_vec_size 256 \
        -bridge \
        -copy_attn \
        -reuse_copy_attn \
        -train_steps 20000 \
        -save_checkpoint_steps 10000 \
        -save_model $data_path/final-model
    Guillaume Klein
    @guillaumekln
    You probably want to post this question in https://gitter.im/OpenNMT/OpenNMT-py or on the forum.
    Benjamin
    @bmkor

    Hello, I'm new to opennmt-tf and would like to ask a stupid question.

    I tried training a Chinese-to-English translation model with around 10,000 lines of training data and basically followed the procedure in the quick start tutorial (https://opennmt.net/OpenNMT-tf/quickstart.html), except that the vocabulary was the output of a SentencePiece model (trained on the same training data). Everything seemed to work fine, except that the inference output (after running onmt-main --config data.yml --auto_config infer --features_file src_test.txt) is all <unk>.

    I just wonder if this is expected? (Perhaps the training sample is too small?) I hope someone can shed some light on this. Thanks in advance.

    Guillaume Klein
    @guillaumekln
    Hi, 10k lines is usually not enough to train Transformer models. You should first look to gather more data.
    Benjamin
    @bmkor
    Thanks @guillaumekln. The sample is from a specific domain (weather-related, so the sample size will stay small), and I'm wondering if there is any fine-tuning/domain adaptation approach I can try with such a small sample. I've googled for quite a while and still found no clue. Perhaps I need some directions or keywords for my search...
    Guillaume Klein
    @guillaumekln
    See this post which explores multiple domain adaptation techniques: https://forum.opennmt.net/t/domain-adaptation-techniques/3382
    Benjamin
    @bmkor
    Thanks a lot. @guillaumekln
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I have trained a custom RNN model. How can I use it in Python for inference? I can use onmt-main infer, but in production I don't want to read from a text file to run inference. I tried to write Python inference code but got errors. Can you please share some reference code? I am using OpenNMT-tf 1.25, and here is my custom model code:

    import opennmt
    import tensorflow as tf
    from opennmt.utils.misc import merge_dict

    class MyCustomRnn(opennmt.models.SequenceToSequence):
        """Defines a medium-sized bidirectional LSTM encoder-decoder model."""

        def auto_config(self, num_devices=1):
            # Default hyperparameters merged into the user configuration.
            config = super(MyCustomRnn, self).auto_config(num_devices=num_devices)
            return merge_dict(config, {
                "params": {
                    "optimizer": "AdamOptimizer",
                    "learning_rate": 0.0002,
                    "param_init": 0.1,
                    "clip_gradients": 5.0,
                    "beam_width": 11,
                },
                "train": {
                    "batch_size": 64,
                    "maximum_features_length": 80,
                    "maximum_labels_length": 80
                }
            })

        def __init__(self):
            # Bidirectional LSTM encoder and attentional LSTM decoder, 128 units each.
            super(MyCustomRnn, self).__init__(
                source_inputter=opennmt.inputters.WordEmbedder(
                    vocabulary_file_key="source_words_vocabulary",
                    embedding_size=128),
                target_inputter=opennmt.inputters.WordEmbedder(
                    vocabulary_file_key="target_words_vocabulary",
                    embedding_size=128),
                encoder=opennmt.encoders.BidirectionalRNNEncoder(
                    num_layers=1,
                    num_units=128,
                    reducer=opennmt.layers.ConcatReducer(),
                    cell_class=tf.nn.rnn_cell.LSTMCell,
                    dropout=0.3,
                    residual_connections=False),
                decoder=opennmt.decoders.AttentionalRNNDecoder(
                    num_layers=1,
                    num_units=128,
                    bridge=opennmt.layers.CopyBridge(),
                    attention_mechanism_class=tf.contrib.seq2seq.LuongAttention,
                    cell_class=tf.nn.rnn_cell.LSTMCell,
                    dropout=0.3,
                    residual_connections=False))

    model = MyCustomRnn

    Guillaume Klein
    @guillaumekln
    Hi, see this documentation: https://opennmt.net/OpenNMT-tf/serving.html
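    (For OpenNMT-tf v1 the documented route is to export a SavedModel and query it outside of onmt-main. A rough TF1 sketch, assuming a model exported with onmt-main export; the path is a placeholder and the signature contents should be checked rather than assumed.)

    import tensorflow as tf

    export_dir = "run/export/latest/1569580144"  # placeholder: exported SavedModel directory

    with tf.Session(graph=tf.Graph()) as sess:
        # In TF1, load() takes the session, a tag set, and the export directory.
        meta_graph = tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], export_dir)
        # Inspect the exported signatures to find the input/output tensor names
        # to feed and fetch with sess.run().
        for name, signature in meta_graph.signature_def.items():
            print(name, signature)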
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I was trying the example from this link: https://github.com/OpenNMT/OpenNMT-tf/blob/v1.25.3/examples/library/minimal_transformer_training.py . I got the following error:
    VishalKakkar
    @VishalKakkar
    @guillaumekln I get an error at the line sess.run(tf.tables_initializer()), saying it is not able to find the vocab file, even though I already pass the path to the inputter.
    VishalKakkar
    @VishalKakkar
    @guillaumekln I am getting an error in serving as well: load() missing 2 required positional arguments: 'tags' and 'export_dir'. I am using an older version of OpenNMT.
    Guillaume Klein
    @guillaumekln
    I suggest updating to TensorFlow 2 and OpenNMT-tf 2 if possible. OpenNMT-tf v1 is no longer supported.
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, is it possible to get the log probabilities for each time step from CTranslate2?
    Guillaume Klein
    @guillaumekln
    The scores of each step are not returned during translation, but you can look into the score_batch method to score an existing translation.
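    (A small sketch of score_batch; the model path and tokens are placeholders, and the return type differs between CTranslate2 versions.)

    import ctranslate2

    translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
    # score_batch scores an existing translation against its source and
    # returns token-level log probabilities.
    scores = translator.score_batch(
        [["▁Hello", "▁world"]],   # tokenized source
        [["▁Hallo", "▁Welt"]])    # tokenized target to score
    print(scores[0])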
    nashid
    @nashid
    What is the difference between OpenNMT-tf and tensor2tensor?
    Guillaume Klein
    @guillaumekln
    I answered in the GitHub issue OpenNMT/OpenNMT-tf#931 (no need to post the same question in multiple places)
    Karen Lastmann Assaraf
    @karen.lastmannassaraf_gitlab
    Hi!
    Is there a simple way to retrieve the corresponding Keras architecture for the OpenNMT SelfAttentionDecoder?
    Guillaume Klein
    @guillaumekln
    Hi, what are you looking for exactly? The SelfAttentionDecoder is already a Keras layer.
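    (A small sketch, assuming OpenNMT-tf 2.x: SelfAttentionDecoder subclasses tf.keras.layers.Layer, so it can be constructed and composed like any other Keras layer; the constructor arguments below are illustrative.)

    import tensorflow as tf
    import opennmt

    decoder = opennmt.decoders.SelfAttentionDecoder(
        num_layers=6, num_units=512, num_heads=8, ffn_inner_dim=2048)
    print(isinstance(decoder, tf.keras.layers.Layer))  # True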
    neo
    @code2graph

    For a Neural Machine Translation (NMT) task, my input data has relational information. I could probably use a Graph Neural Network (GNN) with a Graph2Seq model, but I can't find a good generation model for GNNs.

    So I want to use a Transformer model. But then the big problem is how to embed the structural information in a Transformer. Is there any open-source artefact for a Relational Transformer that I can use out of the box?

    neo
    @code2graph
    Can you use both a copy mechanism and BPE?
    I read that there are two techniques to alleviate the Out-of-Vocabulary (OOV) problem:
    1. BPE
    2. Copy mechanism
      They appear to me to be two orthogonal approaches. Can we combine the two, i.e., use both the copy mechanism and BPE? Is there any work out there that combines them? I can't find any.
    Guillaume Klein
    @guillaumekln
    Hello, please do not post the same questions in multiple channels. For example OpenNMT-tf does not support copy mechanism so it does not make sense to ask the question here. Thanks!
    neo
    @code2graph
    How to do hyperparameter optimization for an NMT task?

    I would like to try out different hyperparameters and compare their performance. There are libraries like Hyperopt, etc. for tuning.

    I would be interested to know whether there is a programmatic way to do automated hyperparameter tuning. Any ideas are welcome.

    neo
    @code2graph

    Can I use BERT embeddings for an LSTM seq2seq model? I am trying to understand how the model would know which embedding to use, since with BERT there can be multiple embeddings depending on the usage context.

    Or do I have to use a Transformer to use BERT embeddings?

    Can anyone please give me pointers?