    lemon-bean
    @lemon-bean
    @leytondsilva Hi Leyton, I got the same error: "cannot import name 'sentencepiece_vocabulary' from 't5.data' (/usr/local/lib/python3.7/dist-packages/t5/data/__init__.py)". Have you solved it? Can you help?
    Leyton D'silva
    @leytondsilva
    @lemon-bean try !pip install trax==1.3.5
    !pip install t5==0.8.1
    lemon-bean
    @lemon-bean
    @leytondsilva Thank you!
    Leyton D'silva
    @leytondsilva
    Is it possible to convert the generative conversation to a proper chatbot?
    I really need help!!
    Phillip Bock
    @friesel
    I'm finetuning a pretrained model, which converges MUCH faster if I eval more often (at every 500 steps rather than every 2500 steps).
    Only explanation I have is a bug in my data-pipeline. Or could it be that the eval-process does something with the learning rate scheduler or resets other stuff? Anybody had a similar experience?
    Phillip Bock
    @friesel
    Replying to my own post: It was (as it usually is) a bug in my inputs-preparation. Case closed.
    Andrei Nesterov
    @manifest

    Hi everyone,

    I wonder how you implement an early stopping strategy in Trax. I may be missing something, but I don't see any implementation of it.

    Is there any built-in way to store a copy of the model parameters every time the error on the validation set improves?

    Phillip Bock
    @friesel
    @manifest You're training with the Trainer() class? Check out the checkpoint_highest parameter. That should be the angle.
    1 reply
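    (For reference, a minimal sketch of what that can look like via trax's trainer_lib.train entry point. The checkpoint_highest / checkpoint_lowest parameter names follow trax 1.3.x, and my_model / my_inputs are placeholders, so double-check against your version:)

    from trax.supervised import trainer_lib

    # checkpoint_highest / checkpoint_lowest name an eval metric to track; trax
    # then writes a separate checkpoint (e.g. highest_accuracy.pkl) whenever
    # that metric reaches a new best value on the eval set.
    state = trainer_lib.train(
        output_dir='model_dir',
        model=my_model,                 # placeholder: your model constructor
        inputs=my_inputs,               # placeholder: your trax inputs
        steps=10000,
        eval_frequency=500,             # eval (and maybe checkpoint) every 500 steps
        checkpoint_highest='accuracy',  # snapshot weights at best eval accuracy
        checkpoint_lowest='loss',       # snapshot weights at lowest eval loss
    )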
    memora0101
    @memora0101
    Hello everyone, I need help with my predictions. I am trying to predict the Angry and Happy labels as seen in my notebook; however, the output is not quite right and I have been struggling to make the right changes to predict "sent", [prob-HAPPY, prob-Angry]. The following is a copy of my notebook.
    Leyton D'silva
    @leytondsilva
    Is it possible to convert the generative conversation to a proper chatbot?
    memora0101
    @memora0101
    Hello all, I'm not sure why I'm getting the following error, can anyone help please:
    [screenshots of the error attached]
    Omar Alsaqa
    @OmarAlsaqa

    Hi @memora0101, you just need to apply np.argmax() over axis=1 to your prediction output, so it returns whether index 0 (Angry) or index 1 (Happy) has the higher probability.

    for sent in train_x[60:70]:
        inputs = np.array(sent_to_tensor(sent, vocab_dict=Vocab))
        inputs = inputs[None, :]
        #### This gets you 16 inputs (your used batch size) and 16 targets. Not needed here.
        # example_input = next(val_generator(batch_size=batch_size, shuffle=True))
        # tmp_inputs, tmp_targets, tmp_example_weights = example_input
        tmp_pred = training_loop.eval_model(inputs)
        print(f'example input str: {sent}')
        print(f'example input array:{inputs}')
        print(f'Model predicted sentiment: {tmp_pred.argmax(axis=1)}')
        print('_________')

    part of the output:

    example input str: Great, I’m glad I could help you!! Thanks for being a customer of ToyCityInc. – We hope your daughter has a very happy birthday!
    example input array:[[ 37 46 47 48 12 71 9 96 249 250 251 47 252 177 253 202 254]]
    Model predicted sentiment: [1]


    example input str: I appreciate it!
    example input array:[[173]]
    Model predicted sentiment: [1]


    example input str: I have a problem with one of the products that I have purchased at your store
    example input array:[[244 107 255 256 29]]
    Model predicted sentiment: [0]


    example input str: It is an iphone 10, it has been a week about now and the phone has started to overheat.
    example input array:[[257 258 259 78 260 261]]
    Model predicted sentiment: [0]

    If you want to work with eval:

    for sent in range(5):
        example_input = next(val_generator(batch_size=batch_size, shuffle=True))
        tmp_inputs, tmp_targets, tmp_example_weights = example_input
        tmp_pred = training_loop.eval_model(tmp_inputs)
        print(f'model predictions: {tmp_pred.argmax(axis=1)}')
        print(f'targets: {tmp_targets}')
        print('_________')

    outputs:

    model predictions: [0 1 1 1 0 0 1 1 1 0 1 0 0 0 0 1]
    targets: [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]


    model predictions: [1 0 0 1 1 1 1 1 0 0 0 0 0 0 0 1]
    targets: [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]


    model predictions: [0 0 1 1 1 0 1 1 0 0 0 0 0 1 1 0]
    targets: [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]


    model predictions: [1 0 1 1 1 1 1 1 0 1 0 0 0 1 1 0]
    targets: [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]


    model predictions: [1 1 0 1 1 1 1 0 0 0 0 1 0 0 1 0]
    targets: [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]

    The model needs more data: it fits the training inputs but fails on validation (it overfits). Alternatively, reduce the number of steps (I got 80% with 20 steps). Also, you can try adding layers to the model until you get the best results with this small amount of data; see the sketch below.
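    (To illustrate the "add layers" suggestion above, a hedged sketch of a slightly deeper trax classifier; the sizes here are arbitrary, not a recommendation:)

    import trax.layers as tl

    # Embedding -> mean over tokens -> extra Dense+Relu block -> 2-class head.
    def deeper_classifier(vocab_size=9088, d_model=64, n_classes=2):
        return tl.Serial(
            tl.Embedding(vocab_size=vocab_size, d_feature=d_model),
            tl.Mean(axis=1),      # average embeddings over the sequence
            tl.Dense(d_model),
            tl.Relu(),            # the added hidden block
            tl.Dense(n_classes),
            tl.LogSoftmax(),
        )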

    memora0101
    @memora0101
    @OmarAlsaqa Thank you very much for guiding me! I will add this to the model!
    Sudhanshu Bhoi
    @SudhanshuBhoi
    Hello, how can we create a custom wrapper layer over a combinator layer like Serial? Can this wrapper layer run the underlying Serial layer multiple times using a for loop?
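    (One hedged sketch of how that can work, assuming unrolling at construction time is acceptable: since trax layers compose, a wrapper can instantiate the sublayer n times inside a Serial. Repeated is an illustrative name, not a trax API:)

    import trax.layers as tl

    def Repeated(layer_fn, n_times):
        """Run n_times fresh copies of the wrapped layer in sequence."""
        return tl.Serial(*[layer_fn() for _ in range(n_times)])

    # Usage: three stacked Dense+Relu blocks.
    block = lambda: tl.Serial(tl.Dense(64), tl.Relu())
    model = Repeated(block, n_times=3)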
    JayyZu
    @jayyzu:matrix.org
    [m]
    Hi all, does trax have a higher-level API like keras/fastai?
    edorsi
    @edorsi
    Hi everyone, quite new to this space... I'm trying to install trax in my local env; could someone point me to the correct versions of Python and TF?
    I've tried Python 3.7 and TF 2.4 but I get this error when importing trax: tensorflow.python.framework.errors_impl.NotFoundError: /home/edo/miniconda3/envs/NMT_TF_01/lib/python3.7/site-packages/tensorflow_text/python/metrics/_text_similarity_metric_ops.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb
    Thanks
    seamusl
    @seamusl
    Reformer: Machine Translation colab not working
    There are problems with the following sample Machine Translation colab: https://colab.research.google.com/github/google/trax/blob/master/trax/models/reformer/machine_translation.ipynb . I get the following error when I try to run the first cell: Copying gs://trax-ml/reformer/jaxlib-0.1.39-cp36-none-manylinux2010_x86_64.whl...
    / [1 files][ 26.9 MiB/ 26.9 MiB]
    Operation completed over 1 objects/26.9 MiB.
    Copying gs://trax-ml/reformer/jax-0.1.59-cp36-none-manylinux2010_x86_64.whl...
    / [1 files][312.6 KiB/312.6 KiB]
    Operation completed over 1 objects/312.6 KiB.
    ERROR: jaxlib-0.1.39-cp36-none-manylinux2010_x86_64.whl is not a supported wheel on this platform.
    ERROR: jax-0.1.59-cp36-none-manylinux2010_x86_64.whl is not a supported wheel on this platform.
    grpc://10.63.67.10:8470
    ...
    Reformer: Machine Translation colab not working. Beam search not working.
    seamusl
    @seamusl

    I also get errors when trying to run the cell which uses a Beam Search in this colab:

    beam_decoder = Search(
        trax.models.Reformer, model_weights,
        beam_size=4,
        alpha=0.6,           # For length normalization, set to 0.6 following Vaswani et al.
        eos_id=1,            # The stop token has id 1 in the vocabulary we use.
        max_decode_len=146,
    )

    I get the following error:

    AttributeError                            Traceback (most recent call last)
    <ipython-input> in <module>()
          5     alpha=0.6,  # For length normalization, set to 0.6 following Vaswani et al.
          6     eos_id=1,   # The stop token has id 1 in the vocabulary we use.
    ----> 7     max_decode_len=146,
          8 )

    /usr/local/lib/python3.7/dist-packages/trax/models/beam_search.py in __init__(self, model, weights, max_decode_len, beam_size, temperature, alpha, eos_id)
        469     # Work around a jax error
        470     # Ref: google/jax#1919 (comment)
    --> 471     jax_partial_eval._thread_local_state.remat = True  # pylint: disable=protected-access
        472
        473   def _get_initial_state(self, inputs, targets_prefix, batch_size):

    AttributeError: module 'jax.interpreters.partial_eval' has no attribute '_thread_local_state'

    Has anyone been able to get the Reformer: Machine Translation colab working? If so, I'd love to know. I spent some time at it this morning and I'll give up on it if I can't even get a basic example working. Thanks!
    memora0101
    @memora0101
    Hello all, I've been working through the DeepLearning.AI "Improving Deep Neural Networks" course and am wondering how weights are treated. In the first assignment they pointed out that He initialization helps reduce cost more than random initialization. I want to test that with my model, so I was looking at how trax handles weights. I have the following code:
    class Dense(Layer):
        """
        A dense (fully-connected) layer.
        """
        # __init__ is implemented for you
        def __init__(self, n_units, init_stdev=0.1):
            # Set the number of units in this layer
            self._n_units = n_units
            self._init_stdev = init_stdev

        def forward(self, x):
            # Matrix multiply x and the weight matrix
            dense = np.dot(x, self.weights)
            return dense

        def init_weights_and_state(self, input_signature, random_key):
            # The input_signature has a .shape attribute that gives the shape as a tuple
            input_shape = (input_signature.shape[-1], self._n_units)
            # Generate the weight matrix from a normal distribution
            # with standard deviation self._init_stdev
            w = trax.fastmath.random.normal(key=random_key, shape=input_shape) * self._init_stdev
            self.weights = w
            return self.weights
    I believe this is random initialization, if I'm not mistaken, but I'd like to know if I can change it to He initialization, similar to the following code:

    # GRADED FUNCTION: initialize_parameters_he
    def initialize_parameters_he(layers_dims):
        """
        Arguments:
        layers_dims -- python array (list) containing the size of each layer.

        Returns:
        parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                        W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                        b1 -- bias vector of shape (layers_dims[1], 1)
                        ...
                        WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                        bL -- bias vector of shape (layers_dims[L], 1)
        """
        np.random.seed(3)
        parameters = {}
        L = len(layers_dims) - 1  # integer representing the number of layers

        for l in range(1, L + 1):
            # He initialization: scale by sqrt(2 / fan_in)
            parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * np.sqrt(2. / layers_dims[l-1])
            parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))

        return parameters
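    (If it helps: adapting the custom Dense above to He initialization should only require scaling by sqrt(2 / fan_in) instead of a fixed stdev. A hedged sketch of the changed method:)

    import trax
    from trax.fastmath import numpy as jnp

    def init_weights_and_state(self, input_signature, random_key):
        fan_in = input_signature.shape[-1]
        input_shape = (fan_in, self._n_units)
        # He initialization: normal(0, sqrt(2 / fan_in)).
        w = (trax.fastmath.random.normal(key=random_key, shape=input_shape)
             * jnp.sqrt(2.0 / fan_in))
        self.weights = w
        return self.weights

    (trax also ships ready-made initializers, e.g. tl.initializers.KaimingNormalInitializer, which the built-in tl.Dense accepts via its kernel_initializer argument, so for standard layers you likely don't need a custom class at all.)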
    Leyton D'silva
    @leytondsilva

    Hi, I am very new to Trax and I am trying to make the Chatbot Coursera course work:

    I am getting a layer error; my imports are:
    !pip install trax==1.3.5
    !pip install t5==0.8.1


    LayerError                                Traceback (most recent call last)

    <ipython-input-46-83df29050cce> in <module>()
    ----> 1 model.init_from_file('/content/drive/My Drive/work/chatbot_model1.pkl.gz', weights_only=True, input_signature=shape11)
          2
          3 STARTING_STATE = model.state

    9 frames
    /usr/local/lib/python3.7/dist-packages/trax/layers/base.py in init(self, input_signature, rng, use_cache)
        286       name, trace = self._name, _short_traceback(skip=3)
        287       raise LayerError(name, 'init', self._caller,
    --> 288                        input_signature, trace) from None
        289
        290   def init_from_file(self, file_name, weights_only=False, input_signature=None):

    LayerError: Exception passing through layer Serial (in init):
    layer created in file [...]/models/reformer/reformer.py, line 236
    layer input shapes: Traced<ShapedArray(int32[1,1])>with<DynamicJaxprTrace(level=1/0)>

    File [...]/trax/layers/combinators.py, line 104, in init_weights_and_state
      sublayer.init(inputs, use_cache=True))

    LayerError: Exception passing through layer ReversibleSerial (in init):
    layer created in file [...]/models/reformer/reformer.py, line 229
    layer input shapes: (ShapeDtype{shape:(1, 1, 512), dtype:float32}, ShapeDtype{shape:(1, 1, 512), dtype:float32})

    File [...]/trax/layers/combinators.py, line 105, in init_weights_and_state
      outputs, _ = sublayer._forward_abstract(inputs)

    LayerError: Exception passing through layer ReversibleHalfResidual (in _forward_abstract):
    layer created in file [...]/models/reformer/reformer.py, line 113
    layer input shapes: (ShapeDtype{shape:(1, 1, 512), dtype:float32}, ShapeDtype{shape:(1, 1, 512), dtype:float32})

    File [...]/jax/interpreters/partial_eval.py, line 404, in abstract_eval_fun
      _, avals_out, _ = trace_to_jaxpr_dynamic(lu.wrap_init(fun, params), avals)

    File [...]/jax/interpreters/partial_eval.py, line 1178, in trace_to_jaxpr_dynamic
      jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(fun, main, in_avals)

    File [...]/jax/interpreters/partial_eval.py, line 1188, in trace_to_subjaxpr_dynamic
      ans = fun.call_wrapped(*in_tracers)

    File [...]/dist-packages/jax/linear_util.py, line 166, in call_wrapped
      ans = self.f(*args, **dict(self.params, **kwargs))

    File [...]/dist-packages/jax/linear_util.py, line 166, in call_wrapped
      ans = self.f(*args, **dict(self.params, **kwargs))

    LayerError: Exception passing through layer ReversibleHalfResidual (in pure_fn):
    layer created in file [...]/models/reformer/reformer.py, line 113
    layer input shapes: (ShapeDtype{shape:(1, 1, 512), dtype:float32}, ShapeDtype{shape:(1, 1, 512), dtype:float32})

    File [...]/trax/layers/base.py, line 588, in _do_custom_gradients
      do_forward = fastmath.custom_grad(do_forward_vjp, _do_forward)

    File [...]/trax/fastmath/ops.py, line 260, in custom_grad
      return backend()['custom_grad'](f_vjp, f_original)

    File [...]/trax/fastmath/jax.py, line 158, in _custom_grad
      f_ = jax.custom_transforms(f_original)

    AttributeError: module 'jax' has no attribute 'custom_transforms'

    knetch
    @knetch

    Anyone have issues setting backend?

    import trax
    trax.fastmath.use_backend('numpy')
    print(f'using backend: {trax.fastmath.backend_name()}')

    Always prints:
    using backend: jax

    I am on macOS without any GPUs.

    5 replies
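    (A likely explanation, hedged: in recent trax versions fastmath.use_backend is a context manager rather than a global setter, so calling it bare has no effect. A minimal sketch:)

    import trax
    from trax import fastmath

    # The backend only changes inside the with-block.
    with fastmath.use_backend('numpy'):
        print(f'inside block: {fastmath.backend_name()}')   # numpy
    print(f'outside block: {fastmath.backend_name()}')      # back to jax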
    Rosalvo Neto
    @rosalvoneto
    Hi everyone, please, where can I find information about the pre-trained models available in gs://trax-ml/transformer_c4_pretrained_1.3.2/ ? I would like to know the dataset used for these models.
    2 replies
    elx42
    @elx42
    Hi everyone, I have a very specific, and perhaps stupid, question: I previously trained a custom machine translation model using the tensor2tensor library, with very good translation quality. At that time, I exported the model using the provided binaries and was able to host it successfully with tensorflow_serving. Inference works in a way that I can send a single vector and receive the "translated" output vector.

    Now I was planning to migrate to the Trax library for our future developments. I managed to train a custom translation model using the transformer architecture, and also to export it to keras so that I can host it similarly via tensorflow_serving. However, the way I need to provide information to get a proper translation seems to work differently in Trax: the way to get a proper translation seems to be via the decoding.autoregressive_sample function, which, if I understand it correctly, uses the output of step t-1 to predict the symbol at time t until it sees an EOS tag. I guess I would need to implement it similarly for the model hosted in tf_serving, which would require a lot of back-and-forth requests per sentence. It also seems to be much slower compared to what I currently have running.

    Is there any way to mimic the behaviour of the tensor2tensor model (sending a single request and receiving the translation)? Is this autoregressive logic implemented in the exported model in tensor2tensor? I'm also trying to understand whether there are architectural differences between the transformer model in t2t vs trax. I can't get my head around how to solve this issue. If anyone could help me out here, that would be greatly appreciated! Many thanks!
    3 replies
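    (For reference, a minimal sketch of the decoding loop described above. The signature follows trax.supervised.decoding.autoregressive_sample but is worth verifying against your trax version; model and input_tokens are assumed to exist:)

    from trax.supervised import decoding

    # Assumes `model` is a trained Transformer in 'predict' mode with weights
    # loaded, and `input_tokens` is a [1, seq_len] int array of source tokens.
    output_tokens = decoding.autoregressive_sample(
        model,
        inputs=input_tokens,
        temperature=0.0,  # greedy: always pick the most likely next symbol
        eos_id=1,         # stop once the model emits the EOS token
        max_length=128,
    )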
    Çağatay Onur Şengör
    @csengor:matrix.org
    [m]
    Hello everyone. I recently found out about Trax and am trying to build an NMT pipeline for English > Turkish using the Transformer model. I've mixed and matched the tutorials I could find into https://colab.research.google.com/drive/175zm5P1XB-Ig855AYhhe_z0QrwUz-74a#scrollTo=wCSrzbRGNy8T, but I'm not getting anywhere. Thank you for your time!
    3 replies
    edorsi
    @edorsi
    Hi, I've recreated the NMT example from Coursera week one (NLP with Attention Models), but when I try to use the model to translate the sample sentence, the next_symbol function keeps giving me back the same element.
    @lukaszkaiser or anybody else, could you please point me in the right direction?
    Thanks
    2 replies
    [image attached]
    memora0101
    @memora0101
    Hello all, I'm having trouble fixing this problem and am getting errors while running my trax model:
    [screenshots of the errors attached]
    memora0101
    @memora0101
    @OmarAlsaqa After implementing the recommendations, the following errors occurred; please refer to the images above.
    memora0101
    @memora0101
    After printing sent and inputs together, I found the following: for "Why not, it is 8265456366" it's an empty tensor: []
    [screenshot attached]
    Omar Alsaqa
    @OmarAlsaqa
    @memora0101 You should remove numbers from the dataset.
    thomas chaton
    @tchaton
    Hey everyone! Happy to join this community.
    Afroz Mohiuddin
    @afrozenator
    @OmarAlsaqa - Would you like to submit the awesome NMT with Transformers/Reformers using Trax.ipynb notebook as a PR?
    1 reply
    Cambridge Yang
    @thisiscam
    Hi all, I'm wondering if trax has support for building a "packed" version of a text dataset, for better performance on TPU. For example, this is done in a flax example: https://github.com/google/flax/blob/master/examples/wmt/input_pipeline.py#L72.
    In general, since trax uses Python-based data pipelining, does anyone have any observations about its performance compared to tf.data pipelining?
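    (On the packing question, a minimal pure-Python sketch of the idea; pack_examples is a hypothetical helper, not a trax or flax API. It greedily concatenates tokenized examples into fixed-length rows and emits per-token segment ids so attention can be masked across example boundaries:)

    import numpy as np

    def pack_examples(token_streams, max_len=512, pad_id=0):
        """Greedily pack variable-length token lists into fixed-length rows."""
        packs, segs = [], []
        cur, cur_seg, seg_id = [], [], 1
        for toks in token_streams:
            toks = toks[:max_len]  # drop tokens that could never fit
            if len(cur) + len(toks) > max_len:  # current row is full; emit it
                packs.append(cur + [pad_id] * (max_len - len(cur)))
                segs.append(cur_seg + [0] * (max_len - len(cur_seg)))
                cur, cur_seg, seg_id = [], [], 1
            cur.extend(toks)
            cur_seg.extend([seg_id] * len(toks))
            seg_id += 1
        if cur:  # flush the last partial row
            packs.append(cur + [pad_id] * (max_len - len(cur)))
            segs.append(cur_seg + [0] * (max_len - len(cur_seg)))
        return np.array(packs), np.array(segs)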