    Guillaume Klein
    @guillaumekln
    But how do you explain that one performs better than the other if they are using the same configuration?
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I am debugging the performance diff. I am using this code https://www.tensorflow.org/tutorials/text/transformer in my implementation. I can see two major differences between this code and the OpenNMT code: 1) the OpenNMT code has an extra normalization layer after the encoder and decoder layers, and 2) I think there is also a difference in how the two codebases use the positional embeddings.
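    (For context, a minimal sketch of the first difference: OpenNMT-tf uses a pre-norm Transformer, which applies one extra LayerNormalization after the last encoder/decoder layer, while the TensorFlow tutorial uses the post-norm variant. The class below is illustrative only, not OpenNMT-tf's actual code.)

    import tensorflow as tf

    class PreNormStack(tf.keras.layers.Layer):
        """Wraps a list of pre-norm Transformer layers and adds the final
        LayerNormalization that post-norm implementations do not have."""

        def __init__(self, layers):
            super().__init__()
            self.stack = layers
            self.output_norm = tf.keras.layers.LayerNormalization()

        def call(self, x, training=False):
            for layer in self.stack:
                x = layer(x, training=training)
            return self.output_norm(x)  # the "extra" normalization layer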
    Guillaume Klein
    @guillaumekln
    OK. So it is not the same model, and unfortunately you can't convert it to CTranslate2.
    VishalKakkar
    @VishalKakkar
    What if I add normalization in my code, load my model weights into the OpenNMT model, and then use CTranslate2 to convert? Will that work?
    Guillaume Klein
    @guillaumekln
    But I don't see how the TensorFlow tutorial could perform better than OpenNMT outside of toy examples. It is missing beam search and multi-GPU/gradient accumulation.
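    (As an aside, a minimal sketch of gradient accumulation, one of the features mentioned above; this is illustrative and not OpenNMT-tf's implementation. `loss_fn` and the batch list are assumptions.)

    import tensorflow as tf

    def accumulated_train_step(model, optimizer, loss_fn, batches):
        """Runs one optimizer update from the summed gradients of several
        batches, emulating a larger batch size on limited memory."""
        grads = [tf.zeros_like(v) for v in model.trainable_variables]
        for batch in batches:
            with tf.GradientTape() as tape:
                # Scale each loss so the sum matches one large-batch gradient.
                loss = loss_fn(model, batch) / len(batches)
            step_grads = tape.gradient(loss, model.trainable_variables)
            grads = [g + sg for g, sg in zip(grads, step_grads)]
        optimizer.apply_gradients(zip(grads, model.trainable_variables))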
    VishalKakkar
    @VishalKakkar
    @guillaumekln yes, you are right; I am debugging the diff.
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, can we train a language model using OpenNMT, and can we get scores from a trained language model in CTranslate2?
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, is it possible to get alignment scores without training the model with alignments, if I am using only one decoder layer in the Transformer?
    Guillaume Klein
    @guillaumekln
    You could get the attention values, but they will not represent alignments as you would expect.
    VishalKakkar
    @VishalKakkar
    @guillaumekln are you saying this because of the multiple heads? Or is there another reason?
    Guillaume Klein
    @guillaumekln
    Yes because there are multiple heads and no single head is expected to reflect alignments.
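    (To illustrate the point, a small NumPy sketch of the common heuristic of averaging the heads and taking the argmax over source positions; nothing in standard training forces the result to behave like a word alignment. Shapes and values are made up.)

    import numpy as np

    # Attention weights from one decoder layer: [num_heads, target_len, source_len].
    attention = np.random.rand(8, 5, 7)
    attention /= attention.sum(axis=-1, keepdims=True)  # normalize per target position

    # Average the heads, then pick the most-attended source position for each
    # target position. This is only a rough proxy for an alignment.
    mean_attention = attention.mean(axis=0)            # [target_len, source_len]
    pseudo_alignment = mean_attention.argmax(axis=-1)  # [target_len]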
    Anna Samiotou
    @annasamt
    Hello, for OpenNMT-tf v2.x, are all versions compatible (backward & forward)? For example, I have trained with v2.2.1 in the test environment, but in the production environment (used for translations) I have installed v2.6.0. I also plan to update the test environment to the latest version, v2.8.0. Thanks in advance.
    Guillaume Klein
    @guillaumekln
    Yes. Versions are backward compatible within the same major release (e.g. 2.x). Forward compatibility, however, is not guaranteed.
    Anna Samiotou
    @annasamt
    Many thanks, Guillaume, it's backward compatibility I'm interested in.
    Anna Samiotou
    @annasamt
    Hello, I train a shared SentencePiece model (spm_train) and then run onmt-build-vocab for each language. I then train the OpenNMT-tf v2.x model. I notice that funny words are sometimes generated in the output, the result of gluing together subword chunks that don't form a valid word in the target language. spm_encode can be run with the parameters --vocabulary and --vocabulary_threshold to only produce symbols that exist in the vocabulary, possibly with some minimum frequency. Is something similar possible at inference time, to avoid generating non-existing target words? And how can I introduce it? Thanks.
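    (For reference, a minimal Python sketch of the SentencePiece-side restriction Anna describes, using the sentencepiece package; file names are illustrative. Note this constrains how text is segmented at encoding time; it does not constrain what the NMT decoder can generate.)

    import sentencepiece as spm

    sp = spm.SentencePieceProcessor()
    sp.Load("shared.model")  # hypothetical shared model path

    # Keep only pieces seen at least `threshold` times, from a
    # "piece<TAB>frequency" vocabulary file (e.g. one built per language).
    threshold = 50
    with open("vocab.tgt") as f:
        vocab = [line.split("\t")[0] for line in f
                 if int(line.rstrip("\n").split("\t")[1]) >= threshold]
    sp.SetVocabulary(vocab)

    pieces = sp.EncodeAsPieces("an example sentence")
    sp.ResetVocabulary()  # back to the unrestricted vocabulary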
    Guillaume Klein
    @guillaumekln
    For how long did you train the model? This is usually not an issue when the model is well trained.
    Anna Samiotou
    @annasamt
    20K steps.
    The model is trained only on domain data, so it performs well for the use case. It is not very frequent, but it happens.
    Guillaume Klein
    @guillaumekln
    20k steps does not seem enough. Maybe let it run longer. On fairly large datasets, you still see gains after 500k steps.
    Anna Samiotou
    @annasamt
    The datasets for this use case are a few hundred thousand segments. I continued training for longer with the condition that the BLEU score improve by at least 0.2 points, and training stopped at 24K steps. After that it kind of overfits.
    VishalKakkar
    @VishalKakkar
    @guillaumekln I am using OpenNMT for seq2seq, and I want to put a condition on accuracy improvement for automatic stopping. Is it possible to configure this in OpenNMT?
    Guillaume Klein
    @guillaumekln
    I'm not sure I understand. Can you describe in more detail what you want to do?
    VishalKakkar
    @VishalKakkar
    Basically, I want to check accuracy at each saving step instead of the BLEU score, and if it is not improving for a certain number of steps, I want to stop training.
    Guillaume Klein
    @guillaumekln
    Accuracy is not implemented for seq2seq models at the moment. But you could prepare a custom model definition that overrides the get_metrics and update_metrics methods. Then you could select your newly defined metric in the early stopping config.
    VishalKakkar
    @VishalKakkar
    Can you provide a reference to examples, if possible?
    Guillaume Klein
    @guillaumekln
    The SequenceClassifier model defines an accuracy metric; maybe that could help: https://github.com/OpenNMT/OpenNMT-tf/blob/master/opennmt/models/sequence_classifier.py#L68-L72
    VishalKakkar
    @VishalKakkar
    Thanks @guillaumekln
    VishalKakkar
    @VishalKakkar

    Hi @guillaumekln, I tried the above method but I am getting this error: "No scorer associated with the name: accuracy". Here is the code of my custom model:

    import opennmt
    import tensorflow as tf

    class MyCustomTransformer(opennmt.models.Transformer):
        def __init__(self):
            super().__init__(
                source_inputter=opennmt.inputters.WordEmbedder(
                    embedding_size=128, vocabulary_file_key="source_words_vocabulary"),
                target_inputter=opennmt.inputters.WordEmbedder(
                    embedding_size=128, vocabulary_file_key="target_words_vocabulary"),
                num_layers=1,
                num_units=128,
                num_heads=8,
                ffn_inner_dim=512)

        # Here you can override any method from the Model class for a customized behavior.

        def get_metrics(self):
            return {"accuracy": tf.keras.metrics.Accuracy()}

        def update_metrics(self, metrics, predictions, labels):
            metrics["accuracy"].update_state(labels, predictions)

    model = MyCustomTransformer
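    (For completeness: a custom model file like this is usually passed to the trainer with the --model flag, e.g. onmt-main train_and_eval --model my_model.py --config my_config.yml on OpenNMT-tf v1, or onmt-main --model my_model.py --config my_config.yml train --with_eval on v2; the file names here are illustrative.)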

    Guillaume Klein
    @guillaumekln
    What is your YAML configuration?
    VishalKakkar
    @VishalKakkar
    Here it is:

    eval:
      eval_delay: 360  # seconds between evaluations
      external_evaluators: accuracy
      beam_width: 5
      early_stopping:
        metric: accuracy
        min_improvement: 0.2
        steps: 4

    infer:
      batch_size: 64
      beam_width: 5

    Guillaume Klein
    @guillaumekln
    In this case, accuracy is not an external evaluator, as it is attached to the model. You should remove that line.
    Are you using a recent version of OpenNMT-tf? eval_delay no longer exists; see https://opennmt.net/OpenNMT-tf/v2_transition.html
    VishalKakkar
    @VishalKakkar
    I am using version 1.25.0, as it is the same version as on the prod boxes. Tell me one thing: if it is not an external evaluator, then how will I track that early stopping is happening on accuracy, or whether it is happening at all?
    Guillaume Klein
    @guillaumekln
    First, please note that early stopping was added in version 2.0.0, so you should update first. Early stopping will work on accuracy if you set metric: accuracy in the configuration (as you did).
    VishalKakkar
    @VishalKakkar
    OK, got it; let me then try with a newer version of OpenNMT. Thanks. @guillaumekln, is there any way we can also set it in external_evaluators, to track accuracy after every save_checkpoint_steps?
    Guillaume Klein
    @guillaumekln
    Evaluation reports the loss, the metrics declared by the model, and the scores returned by external evaluators. As you declared the accuracy as a model metric, it will be automatically reported during evaluation.
    VishalKakkar
    @VishalKakkar
    Sure Thanks
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I am missing something here, as after save_checkpoint_steps I am getting this:

    INFO:tensorflow:Done running local_init_op.
    INFO:tensorflow:Evaluation predictions saved to ft_only_single_error_with_auto_stopping/eval/predictions.txt.141000
    INFO:tensorflow:BLEU evaluation score: 61.290000
    INFO:tensorflow:Finished evaluation at 2020-03-13-22:13:50
    INFO:tensorflow:Saving dict for global step 141000: global_step = 141000, loss = 0.98452455
    INFO:tensorflow:Saving 'checkpoint_path' summary for global step 141000: ft_only_single_error_with_auto_stopping/model.ckpt-141000
    INFO:tensorflow:Calling model_fn.

    Accuracy is nowhere to be seen, and this was my config:

    eval:
      external_evaluators: bleu
      beam_width: 5
      early_stopping:
        metric: accuracy
        min_improvement: 0.2
        steps: 4

    infer:
      batch_size: 64
      beam_width: 5

    Guillaume Klein
    @guillaumekln
    Everything we discussed above is for OpenNMT-tf v2, so please update first.
    VishalKakkar
    @VishalKakkar
    @guillaumekln ohh ok
    VishalKakkar
    @VishalKakkar
    Hi @guillaumekln, I have updated the OpenNMT version, but now at the evaluation step I am getting a weird error. Here is the log:
    Traceback (most recent call last):
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/bin/onmt-main", line 8, in <module>
        sys.exit(main())
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/bin/main.py", line 204, in main
        checkpoint_path=args.checkpoint_path)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/runner.py", line 208, in train
        moving_average_decay=train_config.get("moving_average_decay"))
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/training.py", line 104, in __call__
        early_stop = self._evaluate(evaluator, step, moving_average=moving_average)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/training.py", line 183, in _evaluate
        evaluator(step)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/evaluation.py", line 268, in __call__
        loss, predictions = self._eval_fn(source, target)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
        result = self._call(*args, **kwds)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 615, in _call
        self._initialize(args, kwds, add_initializers_to=initializers)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 497, in _initialize
        *args, **kwds))
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2389, in _get_concrete_function_internal_garbage_collected
        graph_function, _, _ = self._maybe_define_function(args, kwargs)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2703, in _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2593, in _create_graph_function
        capture_by_value=self._capture_by_value),
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 978, in func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 439, in wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
      File "/Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 968, in wrapper
        raise e.ag_error_metadata.to_exception(e)
    tensorflow.python.framework.errors_impl.NotFoundError: in converted code:
        /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/models/model.py:137 evaluate  *
            outputs, predictions = self(features, labels=labels)
        /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py:778 __call__
            outputs = call_fn(cast_inputs, *args, **kwargs)
        /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/models/sequence_to_sequence.py:165 call  *
            predictions = self._dynamic_decode(
        /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/models/sequence_to_sequence.py:239 _dynamic_decode  *
            sampled_ids, sampled_length, log_probs, alignment, _ = self.decoder.dynamic_decode(
        /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/decoders/decoder.py:396 dynamic_decode  *
            return decoding.dynamic_deco
    VishalKakkar
    @VishalKakkar
    And this error is not just for the accuracy metric; even for the BLEU score I am getting the same error.
    Guillaume Klein
    @guillaumekln
    Is that the complete error log? It looks like the end is missing.
    VishalKakkar
    @VishalKakkar
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/decoders/decoder.py:396 dynamic_decode
        return decoding.dynamic_decode(
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/utils/decoding.py:491 dynamic_decode
        ids, attention, lengths = decoding_strategy._finalize(  # pylint: disable=protected-access
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/opennmt/utils/decoding.py:344 _finalize
        ids = tfa.seq2seq.gather_tree(step_ids, parent_ids, maximum_lengths, end_id)
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_addons/seq2seq/beam_search_decoder.py:35 gather_tree
        return _beam_search_so.ops.addons_gather_tree(*args, **kwargs)
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_addons/utils/resource_loader.py:49 ops
        self._ops = tf.load_op_library(get_path_to_datafile(self.relative_path))
    /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_core/python/framework/load_library.py:57 load_op_library
        lib_handle = py_tf.TF_LoadLibrary(library_filename)

    NotFoundError: /Vernacular/vishal.Linux_debian_9_3.py36.latest_onmt/lib/python3.6/site-packages/tensorflow_addons/custom_ops/seq2seq/_beam_search_ops.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs
    Here is the other part, @guillaumekln.
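    (For context: an undefined C++ symbol in a tensorflow_addons .so file usually means the installed tensorflow-addons wheel was built against a different TensorFlow version than the one in the environment, so aligning the tensorflow and tensorflow-addons package versions is the typical fix.)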