Is everything comparable? How much slower in your benchmark?
Not everything, but what I mean by "slow" is: 1) how fast the loss goes down (in terms of the number of steps needed): after ~50k steps the overall loss is still around 1, while t2t's is around 0.5; 2) how fast it steps (per 100 steps): this might be 5x slower.
These measurements are oversimplified, but that is my impression.
Actually there are still some odd things going on... Still investigating and might need to refine the behavior in a patch version. It's unclear what TensorFlow is doing under the hood.
`score` is usually used to score an existing prediction. You could define a custom model that extends the base `SequenceClassifier`, something like:
```python
import tensorflow as tf
import opennmt as onmt

class MyClassifier(onmt.models.SequenceClassifier):
    def __init__(self):
        super(MyClassifier, self).__init__(...)

    def call(self, *args, **kwargs):
        # Call the base model, then attach softmax probabilities to the predictions.
        logits, _ = super(MyClassifier, self).call(*args, **kwargs)
        predictions = dict(probs=tf.nn.softmax(logits))
        return logits, predictions

    def print_prediction(self, prediction, params=None, stream=None):
        print(prediction["probs"], file=stream)
```
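To illustrate what `tf.nn.softmax` produces in the `probs` field, here is a framework-free sketch of the softmax computation (plain Python, no TensorFlow dependency):

```python
import math

def softmax(logits):
    # Shift by the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# The class with the largest logit gets the largest probability,
# and the probabilities sum to 1.
probs = softmax([2.0, 1.0, 0.1])
```

So for each example, `prediction["probs"]` is a distribution over the classifier's labels rather than raw logits.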