    Adam Pocock
    @Craigacp
    Fair enough, I haven't gone through the docs in detail and there didn't seem to be a history page. That explains more about the design of the system. Presumably it was internal to BoA until 2014?
    Samuel Audet
    @saudet
    Motorola I think, but anyway, the point is, he's working solo on it and managed to use things like MKL all by himself to make it faster and usable on larger datasets for modern applications. You say Tribuo is supposed to be like that, so basic stuff like that should be there, but isn't.
    Adam Pocock
    @Craigacp
    Oracle Labs has been working with the BLIS/FLAME group at UT Austin on BLAS for Java. Once the next Tribuo feature release is done we're also going to prototype a vectorised linear algebra system on top of Java 16's vector API (the design for that will be a little tricky as Tribuo's linear algebra is heavily lambda based as it reduces allocation, and lambdas don't work well with the Java 16 version of the vector API). Neither of these are ready for a Tribuo release yet, but those are the directions we're going in. Tribuo has been plenty fast enough for our needs so far but I agree there are things that could be sped up (there always are).
    Adam Pocock
    @Craigacp
    Note by BLAS for Java I don't mean something that would live inside the JDK, I mean we're looking at how to use a BLAS in modern Java using Panama features as an external library in the same way as nd4j or neanderthal.
    Samuel Audet
    @saudet
    Right, so that's fine for the "labs" in "Oracle Labs", but it's not going to get something out in production today
    Adam Pocock
    @Craigacp
    Yep. Today in Tribuo you have the pure Java one, or use TF Java. As I said we've not had issues in our deployments.
    Karl Lessard
    @karllessard

    Any chance we could rename FloatNdArray.read(dst) (et al) to FloatNdArray.readTo(dst), and rename write(src) to writeFrom(src)?

    @jxtps, I don’t see why not, any chance you can create a PR for this?

    however, I'm not sure of the proper way to use StdArrays or an equivalent to take an existing byte[][] and turn it into a TString

    @killeent , try TString.tensorOfBytes(StdArrays.ndCopyOf(byte[][]))

    Karl Lessard
    @karllessard
    @killeent , about my last comment, ideally you’d feed your data directly into a ByteNdArray instead of going through a byte[][]; check if that’s possible in your case
    bhack
    @bhack
    I don't know the final target of this https://github.com/flashlight/flashlight
    But it seems they have an even larger wrapping surface, no?
    Gili Tzabari
    @cowwoc
    Machine learning libraries are like Javascript frameworks. It's becoming impossible to keep up :)
    @bhack Out of curiosity, what does "wrapping surface" mean in this context?
    bhack
    @bhack
    If a framework expands beyond Python, I think it will be hard not to duplicate the whole API in the other languages instead of wrapping it
    Samuel Audet
    @saudet
    That's why I think Python is already pretty much the "JavaScript of AI". For Java, we can bundle everything in JAR files (actually Maven artifacts), which is what people have been doing for "JavaScript on Java" for a while now. We can do the same with Python. Check out how it feels in the case of libraries like SciPy and TVM:
    https://github.com/bytedeco/javacpp-presets/tree/master/scipy
    http://bytedeco.org/news/2020/12/12/deploy-models-with-javacpp-and-tvm/
    Of course, CPython has limitations, but when that becomes a problem, we just need to put more resources in efforts such as GraalVM:
    https://github.com/oracle/graalpython
    As for Flashlight, it sounds like they might have chosen PyTorch if the C++ API existed before 2018: flashlight/flashlight#1
    yottabytt
    @yottabytt
    Hi all, I am from PyTorch land. I am very new to Tensorflow as well as its JVM APIs. I came across these links:
    1. https://www.tensorflow.org/api_docs/python/tf/io/TFRecordWriter
    2. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/lib/io/tf_record.py .
      Is there an equivalent API in Java? That is, I am looking to avoid introducing elements related to the computation graph in my project; I only need a low-level API that writes to and reads from TFRecord files.
    Karl Lessard
    @karllessard
    Hi @yottabytt , TFRecords are basically just protos that could be read/written directly. You can use these pregenerated bindings to do so
    This is a VERY outdated example where I’m playing with similar protos, if that can give you some ideas to start with
    yottabytt
    @yottabytt
    Hi @karllessard, thank you so much for your reply. Yeah, I understand the Example, Features and Feature protos. I am actually looking to see whether there is an API in tensorflow/java that does this particular job (TFRecord).
    Please correct me if I am missing something.
    yottabytt
    @yottabytt
    Based on this tutorial, the TFRecordDataset in tensorflow/java was available in org.tensorflow.framework. However, it expects an Ops param. That's why I meant "I am looking to avoid the introduction of elements/concepts related to the computation graph".
    Such a thing is actually out there in Python. I wonder if there will be an equivalent in tensorflow/java soon.
    Adam Pocock
    @Craigacp
    That Python code is deprecated, the new ways all involve operating in eager mode. The Hadoop record reader doesn't seem to do much; you could achieve the same by reading the protobuf using the generated Java protos.
    Adam Pocock
    @Craigacp
    If all you're looking for is iterating a file containing example protos then it might be easier to do that directly.
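    Reading the records directly is feasible because the TFRecord container format is tiny. Below is a hedged plain-Java sketch of the documented framing (8-byte little-endian length, 4-byte masked CRC32C of the length bytes, payload, 4-byte masked CRC32C of the payload); the class and method names are made up for illustration, and this hasn't been validated against files produced by TensorFlow itself:

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.UncheckedIOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.util.zip.CRC32C;

    // Minimal TFRecord framing: each record is an 8-byte LE length, a 4-byte LE
    // masked CRC32C of those length bytes, the payload, then a 4-byte LE masked
    // CRC32C of the payload. No TensorFlow dependency.
    public final class TfRecordIo {

        private static final int MASK_DELTA = 0xa282ead8;

        // TFRecord's "masked" CRC: rotate the CRC32C right by 15 bits, add a constant.
        static int maskedCrc32c(byte[] data) {
            CRC32C crc = new CRC32C();
            crc.update(data, 0, data.length);
            return Integer.rotateRight((int) crc.getValue(), 15) + MASK_DELTA;
        }

        static void writeRecord(OutputStream out, byte[] payload) throws IOException {
            byte[] lengthBytes = ByteBuffer.allocate(8)
                    .order(ByteOrder.LITTLE_ENDIAN).putLong(payload.length).array();
            ByteBuffer header = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
            header.put(lengthBytes).putInt(maskedCrc32c(lengthBytes));
            out.write(header.array());
            out.write(payload);
            out.write(ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN)
                    .putInt(maskedCrc32c(payload)).array());
        }

        // Returns the next payload, or null at a clean end of stream.
        static byte[] readRecord(DataInputStream in) throws IOException {
            byte[] lengthBytes = new byte[8];
            try {
                in.readFully(lengthBytes);
            } catch (EOFException eof) {
                return null;
            }
            // DataInputStream.readInt is big-endian; reverse to get the LE value.
            if (Integer.reverseBytes(in.readInt()) != maskedCrc32c(lengthBytes))
                throw new IOException("corrupted record length");
            long length = ByteBuffer.wrap(lengthBytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
            byte[] payload = new byte[(int) length];
            in.readFully(payload);
            if (Integer.reverseBytes(in.readInt()) != maskedCrc32c(payload))
                throw new IOException("corrupted record payload");
            return payload;
        }

        // Convenience round trip through an in-memory stream.
        static byte[] roundTrip(byte[] payload) {
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                writeRecord(out, payload);
                return readRecord(new DataInputStream(new ByteArrayInputStream(out.toByteArray())));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public static void main(String[] args) {
            byte[] payload = "hello tfrecord".getBytes();
            if (!java.util.Arrays.equals(roundTrip(payload), payload))
                throw new AssertionError("round trip failed");
            System.out.println("round trip ok");
        }
    }
    ```

    Each payload would typically be a serialized Example proto, which could then be parsed with the pregenerated Java bindings Karl linked above.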
    yottabytt
    @yottabytt
    I see. Cool. Thank you so much !!
    bhack
    @bhack
    Have you solved tensorflow/java#283 with C++?
    Or are you still interested in tensorflow/tensorflow#47748 ?
    Samuel Audet
    @saudet
    We can get away with anything that works with pybind11, so that's most likely not a problem. The API is getting pretty ugly though...
    Adam Pocock
    @Craigacp
    A stable C API would be nice. It's not clear how stable the one Samuel is wrapping is, given it's marked experimental.
    Stanislav Zemlyakov
    @rcd27
    Hello everyone! I have trained a CNN model for detecting particular images in Python. I need a direction for my next steps to use it in Java. I have a Mat object from OpenCV (a preprocessed image of a symbol to be "detected") and a loaded model with a session and stuff. Maybe there are some open-source examples.
    Stanislav Zemlyakov
    @rcd27
    Found this one: https://stackoverflow.com/questions/62241546/java-tensorflow-keras-equivalent-of-model-predict/62295153#62295153 . So what I think I should do: prepare my Mat object to be a representation of Tensor<Float32> and use the session runner, as explained in the StackOverflow answer. Am I missing something?
    Adam Pocock
    @Craigacp
    Things have moved on since Karl wrote that answer. Now TFloat32 is a subtype of Tensor so you need to make one of those. The SavedModelBundle exposes a call entry point which executes the requested function in the model (e.g. https://github.com/tensorflow/java/blob/35b73ce43cb821e5462ffc9ece51d6528dad224d/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SavedModelBundleTest.java#L131).
    Stanislav Zemlyakov
    @rcd27
    @Craigacp thanks, I have something now. Much better than nothing.
    So now I should prepare a TFloat32 object from my image and pass it to function.call(..). The API says I need a 4-dimensional tensor. How can it be made from a 2-dimensional array like an image?
    Ryan Nett
    @rnett
    I'm guessing it's expecting [batch, height, width, channels], although it depends on what your Python model's input shape was
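    For a single grayscale image, that typically means wrapping the 2-D pixel array in a batch dimension and a channel dimension. A plain-Java sketch (no TensorFlow dependency; assuming the [batch, height, width, channels] layout Ryan mentions, the resulting float[1][h][w][1] could then be copied into an NdArray/TFloat32, e.g. via StdArrays.ndCopyOf):

    ```java
    // Expand a 2-D grayscale image to the 4-D [batch, height, width, channels]
    // layout many Keras CNNs expect: a batch of one image with a single channel.
    public final class BatchReshape {

        static float[][][][] toBatchOfOne(float[][] image) {
            int h = image.length, w = image[0].length;
            float[][][][] batched = new float[1][h][w][1];
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    batched[0][y][x][0] = image[y][x];
            return batched;
        }

        public static void main(String[] args) {
            float[][] img = {{1f, 2f}, {3f, 4f}};
            float[][][][] b = toBatchOfOne(img);
            System.out.println(b.length + "x" + b[0].length + "x"
                    + b[0][0].length + "x" + b[0][0][0].length); // prints 1x2x2x1
        }
    }
    ```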
    Ryan Nett
    @rnett
    It should just be a basic reshape. The best way is probably to do something like NdArray.slice(Indices.newaxis(), Indices.ellipsis(), Indices.newaxis()) on your ndarray. Make sure the width/height order matches your model though. In general, if you need to do ndarray operations your best bet is usually to load it into an eager session (Ops.constantOf) and do it there.
    Karl Lessard
    @karllessard
    @rcd27 , if that can help, you can see how to migrate from TF-Java 0.2.0 to 0.3.1 by looking at this guide
    torito
    @torito

    Hi, I have the following

    val input = TString.tensorOf(Shape.of(1, 1), DataBuffers.ofObjects(title))
    val runner = session.runner()
    runner.feed("serving_default_text", input)
    runner.fetch("StatefulPartitionedCall").run()

    however, I don't know how I can make predictions in batch; I googled for some hints without success. Do I need to make a matrix? A dataset? I'm a little lost without much Java documentation. Or is it ok if I reuse the runner and iterate over my inputs? Thanks

    Adam Pocock
    @Craigacp
    You should make a new runner for each call to run. But depending on how your model is set up, the first dimension of the input is usually the batch size, so you could pass in multiple inputs in a single batch tensor.
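    A "single batch tensor" is just the inputs stacked along dimension 0: a tensor of shape (n, 1) stores its elements in row-major order, so input i sits at flat index i in the backing buffer. A small sketch of that row-major indexing (hypothetical helper for illustration, not TF-Java API):

    ```java
    // Row-major flat index for a coordinate in a dense tensor: the rightmost
    // dimension varies fastest, so for shape (n, 1) element (i, 0) is at index i.
    public final class RowMajor {

        static long flatIndex(long[] shape, long[] coords) {
            long index = 0, stride = 1;
            for (int d = shape.length - 1; d >= 0; d--) {
                index += coords[d] * stride;
                stride *= shape[d];
            }
            return index;
        }

        public static void main(String[] args) {
            System.out.println(flatIndex(new long[]{4, 1}, new long[]{2, 0}));   // prints 2
            System.out.println(flatIndex(new long[]{4, 189}, new long[]{2, 5})); // prints 383 (2 * 189 + 5)
        }
    }
    ```

    This is why passing n titles in one buffer with shape (n, 1) lines them up one per batch row.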
    torito
    @torito

    That means I need to put something like (in Scala)

    TString.tensorOf(Shape.of(titles.length, 1), DataBuffers.ofObjects(titles:_*))

    but the model should be built to handle the batch size.

    I thought running in batch was independent of the model and handled by the TensorFlow framework
    Adam Pocock
    @Craigacp
    Not necessarily, batch inference is part of the model, usually by specifying an unknown dimension as the first dimension in input tensors.
    Karl Lessard
    @karllessard
    Keras will implicitly add this dimension to your model, maybe that’s what you mean @torito . So the dimension is there but is hidden by Keras (not by the TF core lib on which we rely).
    You can get more info on the expected shape of your input tensors by looking at the signature of your model (e.g. savedModel.function(“serving_default”).signature().toString())
    torito
    @torito
    I have this
    {serving_default=inputs {
      key: "text"
      value {
        name: "serving_default_text:0"
        dtype: DT_STRING
        tensor_shape {
          dim {
            size: -1
          }
          dim {
            size: 1
          }
        }
      }
    }
    outputs {
      key: "dense_2"
      value {
        name: "StatefulPartitionedCall:0"
        dtype: DT_FLOAT
        tensor_shape {
          dim {
            size: -1
          }
          dim {
            size: 189
          }
        }
      }
    }
    method_name: "tensorflow/serving/predict"
    , __saved_model_init_op=outputs {
      key: "__saved_model_init_op"
      value {
        name: "NoOp"
        tensor_shape {
          unknown_rank: true
        }
      }
    }
    }
    Adam Pocock
    @Craigacp
    So the first dimension is -1 which means it's unknown and is used (in this case) to represent the batch size.
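    In other words, the -1 acts as a wildcard when a runtime tensor is checked against the signature: any size is accepted in that position, all other dimensions must match exactly. A hypothetical plain-Java sketch of that compatibility rule (just the idea, not the TF-Java API):

    ```java
    // A signature dimension of -1 (unknown) matches any runtime size; every
    // other dimension must match exactly. This is why an input declared as
    // (-1, 1) accepts a batch of any length, producing output of shape (batch, 189).
    public final class ShapeCompat {

        static boolean isCompatible(long[] signature, long[] actual) {
            if (signature.length != actual.length) return false;
            for (int d = 0; d < signature.length; d++)
                if (signature[d] != -1 && signature[d] != actual[d]) return false;
            return true;
        }

        public static void main(String[] args) {
            System.out.println(isCompatible(new long[]{-1, 1}, new long[]{32, 1})); // prints true
            System.out.println(isCompatible(new long[]{-1, 1}, new long[]{32, 2})); // prints false
        }
    }
    ```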
    torito
    @torito
    Ok, so if after running it I only have one result, that means the model is not taking the first dimension into account. I'll check with the person who wrote the model in Python, thanks