    James Reed
    @jamesr66a
    Please feel free to chime in with new use cases, considerations for backends (HW and framework), as well as general issues or comments
    Masood S. Mortazavi
    @masoodmortazavi
    @jamesr66a Has the canonical example (mentioned in the first reference) worked out?
    James Reed
    @jamesr66a
    Working on it right now :) We just have to tie up some loose ends in PyTorch, then we can deploy it
    rshlomo
    @rshlomo
    Can you please add details to the example for the wavefront implementation? It is mentioned under the "Loop" construct, but this sentence confuses me: "(with time being the inner looping dimension)". It is not completely clear whether a nested loop over layer/timestep is required (as one could understand from the word "inner"). I believe it isn't, but wanted to make sure. Thanks a lot
    James Reed
    @jamesr66a
    I think I see where the confusion is. In ONNX, multiple layers will likely not present themselves as a "loop" per se, but as being unrolled into the network definition, like this: https://gist.github.com/jamesr66a/bc26918bcdb3fb6ae542984a0b971915 The wording there is confusing in that respect; is that correct?
    rshlomo
    @rshlomo
    Thank you for the quick response, it clarifies.
    G. Ramalingam
    @gramalingam
    (Repeating the question I asked at onnx/onnx#502): I find the description of LoopIndexTensor confusing, namely: "This is a special operator only valid inside the loop that supports the common case behavior of accessing the correct element of the input sequence in an RNN. This operator MUST be directly given the passed-in iteration number to the body of a Loop graph. This signals to back-ends that this is a direct indexing operation, with no transforms applied to the index." I think it can be made a normal operator (sort of a special case of Gather) without this kind of restriction. What is the intention behind the above restrictions?
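    Editorial sketch (not from the chat): what the "direct indexing" access pattern boils down to, written as a plain Gather-style index on axis 0 of the input sequence. X and t below are arbitrary stand-ins, not names from the spec.

```python
# Illustrative only: select the t-th element of the scanned input sequence,
# which is just a special case of Gather along axis 0.
import numpy as np

X = np.arange(12).reshape(3, 4)   # [seq_len, features], made-up data
t = 1                             # the current iteration number
x_t = np.take(X, t, axis=0)       # equivalent to X[t]
```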
    James Reed
    @jamesr66a
    @gramalingam When we've discussed with hardware vendors, they would like us to provide guarantees about the access patterns within control constructs so they can more aggressively optimize (e.g. no indirect indexing or anything like that). Admittedly this solution is fairly confusing and I think we're going to re-think it
    G. Ramalingam
    @gramalingam
    Following up on our discussions at the workshop: let us add zero or more scan-inputs to the loop construct and add an attribute to the loop construct to tell us how many of the inputs are loop-carried and how many are scan-inputs. We can then eliminate LoopIndexTensor. Does this seem reasonable?
    James Reed
    @jamesr66a
    Sure, can you write up a PR to do so?
    BTW Prasanth is supposed to send out all of the summary slides today
    G. Ramalingam
    @gramalingam
    Sure
    leodestiny
    @leodestiny
    Hello, I have a question about the LSTM inputs. There are some optional inputs in the LSTM operator. How does a backend recognize different numbers of optional inputs and their order? Maybe using names?
    James Reed
    @jamesr66a
    Sorry for the late response, I couldn't access Gitter for some reason. The inputs are all positional, and optional inputs can be omitted by passing an empty string as the input name
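    Editorial sketch of the positional convention, assuming the input order X, W, R, B, sequence_lens, initial_h from the public LSTM operator docs; the names and hidden_size value are arbitrary illustrations.

```python
# A minimal sketch: optional inputs are skipped positionally by passing "".
from onnx import helper

lstm = helper.make_node(
    "LSTM",
    # X, W, R, then "" twice to omit the optional B and sequence_lens,
    # followed by initial_h; trailing optional inputs can simply be dropped.
    inputs=["X", "W", "R", "", "", "initial_h"],
    outputs=["Y", "Y_h"],
    hidden_size=8,
)
```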
    rshlomo
    @rshlomo
    cuDNN expects the input tensor of an RNN to be sorted by valid time steps: when receiving a batch of inputs, the longest input should be at B=0 and the shortest input at B=N-1.
    This is currently not specified in ONNX.
    Is there any reason to keep it unspecified? Having the input sorted might make it easier for HW vendors to support
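    Editorial illustration of the sorting convention being discussed, in plain Python with made-up data.

```python
# Sort a batch of variable-length sequences so the longest is at B=0 and the
# shortest at B=N-1, keeping the permutation so outputs can be un-sorted.
sequences = [[1, 2], [3, 4, 5, 6], [7, 8, 9]]          # toy ragged batch

order = sorted(range(len(sequences)),
               key=lambda i: len(sequences[i]), reverse=True)
sorted_batch = [sequences[i] for i in order]            # longest first
seq_lens = [len(s) for s in sorted_batch]               # [4, 3, 2]

# inverse[i] is the row of the sorted batch that holds original sequence i,
# so outputs can be permuted back to the caller's order after the RNN runs.
inverse = [0] * len(order)
for dst, src in enumerate(order):
    inverse[src] = dst
```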
    James Reed
    @jamesr66a
    @rshlomo sounds good to me from a back-end perspective. From a front-end perspective, I'm interested in whether anyone will have trouble with this given their existing RNN APIs. I believe PyTorch already conforms to this. Anyone else have comments?
    @gramalingam any update on scan_inputs?
    James Reed
    @jamesr66a
    Here's a data point about sorting sequences by length: our NLP team is having to jump through hoops to preserve the order of sequences for character models, in which the batch is a sentence and each sequence is a word. The order of the sequences is significant here
    G. Ramalingam
    @gramalingam
    Hi, apologies for the delayed follow-up: please take a look at https://github.com/gramalingam/onnx/tree/loop-redesign for the proposed replacement Scan and ScanWhile for the existing Loop and LoopIndexTensor. I can create a PR if that will be easier to review, but I thought it would help if we discussed as a workgroup before creating a PR. The op documentation is at: https://github.com/gramalingam/onnx/blob/loop-redesign/docs/Operators.md#Scan and https://github.com/gramalingam/onnx/blob/loop-redesign/docs/Operators.md#ScanWhile … there is also a generic While at https://github.com/gramalingam/onnx/blob/loop-redesign/docs/Operators.md#While but that is less critical.
    James Reed
    @jamesr66a
    Thanks @gramalingam! We're going to look over it
    G. Ramalingam
    @gramalingam
    Thanks @jamesr66a … I will also go ahead and create a PR so that others not in here can provide feedback … I will create the PR without the "while loop" to keep it somewhat focused.
    bas-aarts
    @bas-aarts
    @gramalingam, would it be possible to add examples on how to use the Scan primitive for bidirectional and/or ragged batched RNNs?
    G. Ramalingam
    @gramalingam
    For backward scan, we could add an attribute. For bidirectional, we would need to duplicate things (one for each direction): for example a different graph for each direction. Do we really need to do that? Conceptually, it is just two different scans, which the runtime can fuse for performance, if desired … as for ragged/batched RNNs, that is not within the scope of this PR … we would have to add packing-information if we want that …
    G. Ramalingam
    @gramalingam
    @bas-aarts : on further thought, once we add an attribute to indicate whether the scan is forward or backward (a separate flag for each scan-input), bidirectional can also be handled … it is just the same tensor X appearing twice in the scan-input list, once as backward and once as forward.
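    Editorial plain-Python reference for that idea (not the ONNX API): a backward scan is the same step function run over the reversed input, and bidirectional is just two independent scans over the same tensor. The step function and data are toy stand-ins.

```python
# Reference semantics of a forward/backward scan over one scan-input.
def scan(step, init, xs, reverse=False):
    state, outputs = init, []
    for x in (reversed(xs) if reverse else xs):
        state, y = step(state, x)
        outputs.append(y)
    if reverse:
        outputs.reverse()            # keep outputs aligned with input order
    return state, outputs

X = [1.0, 2.0, 3.0]
step = lambda h, x: (h + x, h + x)   # toy recurrence: running sum
_, fwd = scan(step, 0.0, X)                  # forward direction
_, bwd = scan(step, 0.0, X, reverse=True)    # backward direction, same X
bidirectional = list(zip(fwd, bwd))          # combine per time step as needed
```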
    G. Ramalingam
    @gramalingam
    We can definitely add support for ragged/batched RNNs … I have heard that this is critical for training, but have also heard that this is less critical for inference. If any of you think this is important for inference too, please do let us know … that would be useful to know.
    G. Ramalingam
    @gramalingam
    One of the other suggestions is to add a recurrence-step/stride attribute k (to handle recurrences of the form Y[t] = f ( Y[t-k], x[t] )) … that is, in each iteration, the value of the loop-carried-dependence used is the value computed k iterations earlier. (The first k iterations all use the initial value of the corresponding state-variable.) While this can be encoded using the current construct, we would need to explicitly represent the k values as k different state-variables. Furthermore, the attribute k is a declarative assertion that can be used to run the scan as k different sub-scans in parallel (if desired).
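    Editorial plain-Python illustration of the encoding described above: the last k values are kept as k rotating state variables, so each iteration reads the value computed k steps earlier. f, k and the inputs are arbitrary.

```python
# Encode Y[t] = f(Y[t-k], x[t]) using k state variables with the current construct.
def strided_recurrence(f, y_init, xs, k):
    states = [y_init] * k            # states[0] holds Y[t-k], states[-1] holds Y[t-1]
    ys = []
    for x in xs:
        y = f(states[0], x)          # use the value computed k iterations earlier
        states = states[1:] + [y]    # rotate: drop the oldest, append the newest
        ys.append(y)
    return ys

# k=2 example: y[t] = y[t-2] + x[t] with initial value 0
print(strided_recurrence(lambda y, x: y + x, 0, [1, 2, 3, 4], k=2))  # [1, 2, 4, 6]
```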
    G. Ramalingam
    @gramalingam
    @jamesr66a , Any feedback on the PR? The primary motivation was replacement of LoopIndexTensor, since it was ill-defined, by scan-inputs. Given that the inputs are scanned automatically, it was not clear why the loop index variables were required, and they were dropped. Now, if an iteration count is desirable, we can add that. (I thought that for the case where scan-inputs are present, the implicit iteration-count might suffice; and for the general case we have the while condition anyway.)
    James Reed
    @jamesr66a
    @gramalingam do you folks have control flow operators implemented in a frontend or backend? I think the best way to go in terms of de-risking these operators as part of the core spec is to achieve buy-in from different classes of partners (hardware backend, framework frontend, framework backend) and surface the issues we see there. That way, the reviews of proposed changes can be grounded in concrete implementation terms as well
    G. Ramalingam
    @gramalingam
    @jamesr66a taking a step back: (a) I got the impression from Prasanth that there is a desire to "finalize" control-flow operations for ONNX 1.3, coming up within a matter of weeks; what is your perception? (b) When you say different classes of partners, I'm unclear how the interaction is supposed to happen between the partners except via this working group; if there are other channels of communication between the partners, I am, personally, unaware of these. (c) I am guessing you have concerns with the proposal … if so, I would find it useful if you could elaborate on the concerns … is the concern that (A) the constructs are not expressive enough (for a frontend to express what it wants), or is it a concern that (B) the constructs are not restrictive enough (to enable a backend to implement it efficiently)? These two concerns pull us in different directions.
    James Reed
    @jamesr66a
    @gramalingam (a) My perception is that we shouldn't be pushing to merge these in if we don't have sufficient engagement from the different classes of partners that are involved in the effort. Others' opinions may differ (b) I agree with you, and the issue is that there has been little engagement within this working group. This is my call to action for additional engagement (c) I agree that these two design spaces are opposed to each other and I have in fact complained about this at the previous partner workshop, but we have not seen any clarifying action thus far. I am reluctant to push the project in one direction or another with the control flow operators, and thus comes the request that we see concrete concerns from the different classes of partners

    I propose 3 alternatives:
    1) Engage the community and move the operators out of experimental. This will involve seeing a) demand and b) implementations from different classes of partners.
    2) Leave them in experimental, and thus not compel backends to implement them. This will allow users to utilize them if their given front-end/back-end supports them, but we are not compelling anyone to support them if it makes analysis/compliance harder
    3) Remove the operators. If we do not see any signal of demand for these operators (please let me know if there is demand outside FB), we may just remove them. This will simplify the specification

    Please let me know your thoughts

    G. Ramalingam
    @gramalingam
    @jamesr66a : I interpret (part of) your message as pushing for a better understanding of the requirements, and I agree that is very desirable. Here's my partial answer based on what I have learnt: CNTK has a general "Recurrence" operator, which is used quite a bit, which cannot be expressed with the non-experimental ONNX ops currently. The proposed "scan" operator is less general than CNTK's "Recurrence", so I think CNTK's recurrence implementation is a proof-of-implementability of the scan operation. I think TensorFlow's "scan" is also similar to the proposed "scan", so that is some further justification.
    I believe that "scan" should be okay from (B) perspective above (it is restricted enough to be efficiently implementable). However, it does suffer from not being expressive enough (in some scenarios). Hence, I think we may need to go for multiple ops (at least two): one that is general enough to serve as a fallback allowing us to express everything we want to, and one (or more) that is restricted enough to allow certain implementations.
    G. Ramalingam
    @gramalingam
    My impression is that Prasanth is suggesting (and I don't know how representative this is of the other ONNX partners' positions) option (1) of your alternatives: move the operators out of experimental.
    G. Ramalingam
    @gramalingam
    Moving operators out of experimental seems like a reasonable option … however, the existing "LoopIndexTensor" seems ill-defined (we can't quite associate it with a mathematical semantics), so at the least we need to remove it and replace it with some equivalent functionality. I suggest we try to identify a "scan" or "fold" like primitive that gives us enough value to justify introducing such an operator, while not being too hard/complicated (so we don't get pushback from that direction). We can always add more ops that add more expressive power gradually. What are the minimum features you would like to see in a scan op? What features would you prefer to avoid in this op?
    G. Ramalingam
    @gramalingam
    Can we schedule a meeting to discuss what would be acceptable in moving ops out of experimental?
    danshirron
    @danshirron
    Regarding the Loop op: do I have to pass all body_network constants (weights and biases) as loop-carried dependencies? (This is assuming I read them from a file.)
    G. Ramalingam
    @gramalingam
    Constants can be embedded directly in the graph representing the loop body, so there is no issue there. However, the question that is worth debating is whether the loop body graph can contain references to names of tensors defined in the outer scope (e.g., an initializer in the outer graph). There are pros and cons to either decision here. Allowing such references simplifies the job of the exporter creating such a loop, but it complicates the job of backends that implement such loops.
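    Editorial rough sketch of the first option, assuming the Loop/body-graph signature from the operator docs: the weight W lives as an initializer inside the body graph instead of being threaded through as a loop-carried dependency. Names and shapes are arbitrary.

```python
# Embed a constant weight in the Loop body graph as an initializer.
import numpy as np
from onnx import TensorProto, helper, numpy_helper

W = numpy_helper.from_array(np.ones((4, 4), dtype=np.float32), name="W")

body = helper.make_graph(
    nodes=[
        helper.make_node("MatMul", ["h_in", "W"], ["h_out"]),     # uses embedded W
        helper.make_node("Identity", ["cond_in"], ["cond_out"]),  # pass condition through
    ],
    name="loop_body",
    inputs=[
        helper.make_tensor_value_info("iter_num", TensorProto.INT64, []),
        helper.make_tensor_value_info("cond_in", TensorProto.BOOL, []),
        helper.make_tensor_value_info("h_in", TensorProto.FLOAT, [1, 4]),
    ],
    outputs=[
        helper.make_tensor_value_info("cond_out", TensorProto.BOOL, []),
        helper.make_tensor_value_info("h_out", TensorProto.FLOAT, [1, 4]),
    ],
    initializer=[W],   # the constant is part of the body, not a loop input
)

loop = helper.make_node("Loop", ["trip_count", "cond", "h0"], ["h_final"], body=body)
```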
    Emad Barsoum
    @ebarsoum

    Thanks everyone who joined the control flow and loop track in the ONNX workshop; here is the summary of the discussion and what's next:

    • if node: no disagreement; it should be moved out of experimental and into ONNX core. The description needs to clarify that the variadic outputs need to have the same type for both branches of the if statement.
    • select node: the if node only works on a scalar condition and executes only the then or else branch, so we need a select node, which is an element-wise select. Both branches are tensors with the same shape, and the result is another tensor that combines both based on the select condition. The condition in this case is a tensor, not a scalar (see the sketch after this list).
    • Scan node: formerly named ScanLoop; we need to add a reverse attribute and a list of sequence lengths. Here is the current PR: onnx/onnx#1177. ScanWhile is gone.
    • Loop and LoopTensorIndex nodes: LoopTensorIndex is not needed, and Loop will be the more generic while loop. We should be able to implement Scan on top of Loop if needed. Also, we should remove some of the current restrictions and address feedback.
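    Editorial sketch of the element-wise select semantics mentioned in the second bullet, using NumPy's where as a stand-in; this is not the final ONNX op definition.

```python
# Element-wise select: a boolean tensor picks, per element, between two
# tensors of the same shape; contrast with If, which takes a scalar condition
# and executes only one branch.
import numpy as np

cond = np.array([[True, False], [False, True]])
then_branch = np.array([[1, 2], [3, 4]])
else_branch = np.array([[10, 20], [30, 40]])

out = np.where(cond, then_branch, else_branch)   # [[1, 20], [30, 4]]
```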

    Next steps: I will send an email to the participants and the owner of each of the above work items. Please provide feedback. The feedback on both Loop and Scan has been pretty sparse so far.

    danshirron
    @danshirron
    Just to make sure I got this right: ScanLoop has state_variables, scan_inputs and scan_outputs, and Loop has state_variables, scan_inputs, scan_outputs and a condition, so the only difference is the condition input? Regarding feedback: I used the Caffe2 ONNXWhile op to implement WaveNet. Instead of using the LoopTensorIndex op, I used Caffe2's ability to do dynamic slicing. The ScanLoop op should be able to match my needs. The only problem I encountered was how to pass the internal loop weights and biases. Since Caffe2 has a separate init network, I had to specify all of them as state_variables so they propagate to the internal network. Another way to do this could have been (as suggested by @gramalingam above) to embed them directly in the body predict_net, but I guess this is not the preferred way in Caffe2. Anyone have a suggestion here? Is there any other backend (other than Caffe2) that supports the ONNX Loop/Scan ops?
    G. Ramalingam
    @gramalingam
    The previous Loop did not have scan_inputs … that is one of the differences between ScanLoop and Loop.
    G. Ramalingam
    @gramalingam
    Please see the issue onnx/onnx#1273 … I think we need to settle on some solution here for the typing of variadic inputs/outputs that happens with control-flow constructs. There is at least one solution (solution 1) which is fairly simple and doesn't impact anything else much.
    G. Ramalingam
    @gramalingam
    Was there any discussion in the workshop about whether the nested graphs can reference names of tensors defined in the outer scope graph?
    Emad Barsoum
    @ebarsoum
    No, even without it, we can pass the tensors in the outer scope to the nested loop.
    Dmytro Dzhulgakov
    @dzhulgakov

    Back to the reference of the enclosing scope

    Actually, the current description of the Loop operator is "Values from the enclosing scope (i.e. variable a here) are in scope" (https://github.com/onnx/onnx/blob/master/docs/Operators.md#Loop)

    The rationale for this is that otherwise referring to too many initializers is too verbose (imagine passing every weight tensor into the loop as an input). And lexical analysis is a simple pass to carry out in the backend

    G. Ramalingam
    @gramalingam
    The PR is at: onnx/onnx#1296
    G. Ramalingam
    @gramalingam
    I personally am not convinced by the verbosity argument … I completely buy it in languages where users write code by hand, but in an IR it is not clear we need to optimize for human readability, and it is not clear that the model size will be significantly affected by this. But it is true that the backend can do the work of resolving these references to the outer scope …
    Emad Barsoum
    @ebarsoum
    Thanks everyone who participated in this work group. Loop, Scan and If are in, as part of the just-released ONNX 1.3 (https://github.com/onnx/onnx/tree/rel-1.3.0).
    cayleyhamilton
    @cayleyhamilton
    @ebarsoum @jamesr66a Is there a sample for beam search implemented in ONNX? (Ideally, an NMT in ONNX as described in the "NLP-in-ONNX: Strategy Proposal" )
    Prasanth Pulavarthi
    @prasanthpul
    @cayleyhamilton that's a great idea. @ebarsoum what do you think?