    Samuel Audet
    @saudet
    @cowwoc BTW, if you're interested in non-DL methods, https://github.com/haifengl/smile is the closest thing that Java has to scikit-learn
    bhack
    @bhack
    Are you interested to comment in tensorflow/community#331 ?
    Adam Pocock
    @Craigacp
    Wrt sig-core yes but I'm not sure what progress has been made recently.
    I'm a little biased as I make a competitor to smile, but it always struck me as odd that it insists on implementing everything from scratch and doesn't wrap the good libraries which already exist. Tribuo landed in a different design point along a few axes anyway, but one thing we do differently is we're happy to hand interfaces to other libraries like TF and XGBoost. Plus he changed the license to LGPL which causes more issues from a commercial perspective.
    Adam Pocock
    @Craigacp
    *hand = have. Gitter's not letting me edit things today.
    Also I think I was one of the last people to comment on the SIG-core discussion page, and there was no response. If there is something happening then we'd be interested in commenting.
    Samuel Audet
    @saudet
    @Craigacp I think it's just because Smile predates things like libsvm-java and XGBoost. I would say Tribuo is the one that "insists on implementing everything from scratch and doesn't wrap the good libraries which already exist".
    Samuel Audet
    @saudet
    I suppose being LGPL is a valid reason to not want to use it, but that has not prevented projects like Linux, FFmpeg, and OpenJDK from succeeding in the industry.
    Adam Pocock
    @Craigacp
    Well, smile seems to have been released in 2014, and libsvm had been available for years at that point. Tribuo starts by wrapping ML libraries that already exist; that's one of the things it does. It's easier to wrap ones which are more focused, but it's entirely doable. We could wrap smile in Tribuo, but it would be a lot of code and I admit I've not looked at it much due to licensing issues since they changed to the LGPL. Even if I can link against it there are still problems with looking at the code under that kind of license, especially when I make a competitor.
    Anyway, scikit-learn is successful as everyone implements its methods, and because Python is duck typed, that works. We can't duck type things in Java, so we need well defined interfaces everyone can agree on. In Tribuo we're happy to accept PRs which add new trainers and models wrapping other libraries (provided they are compatibly licensed, which rules out Weka for example).
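A minimal sketch of the "well defined interfaces" point above, under loud assumptions: `Model`, `Trainer`, `ExternalThresholdClassifier`, and `ExternalThresholdTrainer` are hypothetical names invented for illustration and are not Tribuo's (or any library's) actual API. It only shows the shape of the idea: an external library with its own method names is adapted to a shared contract, since Java cannot rely on duck typing.

```java
import java.util.List;

// Hypothetical shared contracts; these are NOT Tribuo's actual interfaces.
interface Model {
    int predict(double[] features);
}

interface Trainer {
    Model train(List<double[]> features, List<Integer> labels);
}

// Stand-in for an external library that has its own, differently shaped API.
final class ExternalThresholdClassifier {
    private double threshold;

    // Toy rule: threshold feature 0 at the midpoint between the two class means.
    // Assumes both classes are present and class 1 has the larger mean.
    void fit(double[][] x, int[] y) {
        double sum0 = 0, sum1 = 0;
        int n0 = 0, n1 = 0;
        for (int i = 0; i < x.length; i++) {
            if (y[i] == 0) { sum0 += x[i][0]; n0++; } else { sum1 += x[i][0]; n1++; }
        }
        threshold = (sum0 / n0 + sum1 / n1) / 2.0;
    }

    int classify(double[] row) {
        return row[0] > threshold ? 1 : 0;
    }
}

// Adapter: converts the shared data types to the external API and back.
final class ExternalThresholdTrainer implements Trainer {
    @Override
    public Model train(List<double[]> features, List<Integer> labels) {
        double[][] x = features.toArray(new double[0][]);
        int[] y = labels.stream().mapToInt(Integer::intValue).toArray();
        ExternalThresholdClassifier wrapped = new ExternalThresholdClassifier();
        wrapped.fit(x, y);
        return wrapped::classify; // Model has a single method, so a method reference works
    }
}
```

Callers depend only on `Trainer` and `Model`, so swapping in a different wrapped library means adding another adapter class and leaving the calling code alone.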
    Adam Pocock
    @Craigacp
    If the data mining JSRs from the early 2000s had succeeded things would be quite different. It'll be interesting to see if the visrec jsr can bring together a community around ML interfaces in Java.
    Samuel Audet
    @saudet
    Smile dates back to at least 2008: https://github.com/haifengl/smile/issues/86#issuecomment-226485089
    The first open-source release was around the time you mention though, but there's a difference.
    The way you think about it, does this mean you believe Java didn't exist before 2007? :)
    Adam Pocock
    @Craigacp
    Fair enough, I haven't gone through the docs in detail and there didn't seem to be a history page. That explains more about the design of the system. Presumably it was internal to BoA until 2014?
    Samuel Audet
    @saudet
    Motorola I think, but anyway, the point is, he's working solo on it and managed to use things like MKL all by himself to make it faster and usable on larger datasets for modern applications. You say Tribuo is supposed to be like that, so basic stuff like that should be there, but isn't.
    Adam Pocock
    @Craigacp
    Oracle Labs has been working with the BLIS/FLAME group at UT Austin on BLAS for Java. Once the next Tribuo feature release is done we're also going to prototype a vectorised linear algebra system on top of Java 16's vector API (the design for that will be a little tricky as Tribuo's linear algebra is heavily lambda based as it reduces allocation, and lambdas don't work well with the Java 16 version of the vector API). Neither of these are ready for a Tribuo release yet, but those are the directions we're going in. Tribuo has been plenty fast enough for our needs so far but I agree there are things that could be sped up (there always are).
    Adam Pocock
    @Craigacp
    Note by BLAS for Java I don't mean something that would live inside the JDK, I mean we're looking at how to use a BLAS in modern Java using Panama features as an external library in the same way as nd4j or neanderthal.
    Samuel Audet
    @saudet
    Right, so that's fine for the "labs" in "Oracle Labs", but it's not going to get something out in production today
    Adam Pocock
    @Craigacp
    Yep. Today in Tribuo you have the pure Java one, or use TF Java. As I said we've not had issues in our deployments.
    Karl Lessard
    @karllessard

    Any chance we could rename FloatNdArray.read(dst) (et al) to FloatNdArray.readTo(dst), and rename write(src) to writeFrom(src)?

    @jxtps, I don’t see why not, any chance you can create a PR for this?

    however, I'm not sure of the proper way to use StdArrays or equivalent to take an existing byte[][] and turn it into a TString

    @killeent , try TString.tensorOfBytes(StdArrays.ndCopyOf(byte[][]))

    Karl Lessard
    @karllessard
    @killeent , about my last comment, ideally you’d feed your data directly into a ByteNdArray instead of going through a byte[][], check if that’s possible in your case
    bhack
    @bhack
    I don't know the final target of this https://github.com/flashlight/flashlight
    But it seems that they have more wrapping surface, or not?
    Gili Tzabari
    @cowwoc
    Machine learning libraries are like Javascript frameworks. It's becoming impossible to keep up :)
    @bhack Out of curiosity, what does "wrapping surface" mean in this context?
    bhack
    @bhack
    If a framework expands on the Python side, I think it will be hard not to duplicate all of the API in the other languages instead of wrapping it
    Samuel Audet
    @saudet
    That's why I think Python is already pretty much the "JavaScript of AI". For Java, we can bundle everything in JAR files (actually Maven artifacts), which people have been doing to do "JavaScript on Java" for a while now. We can do the same with Python. Check out how it feels in the case of libraries like SciPy and TVM:
    https://github.com/bytedeco/javacpp-presets/tree/master/scipy
    http://bytedeco.org/news/2020/12/12/deploy-models-with-javacpp-and-tvm/
    Of course, CPython has limitations, but when that becomes a problem, we just need to put more resources in efforts such as GraalVM:
    https://github.com/oracle/graalpython
    As for Flashlight, it sounds like they might have chosen PyTorch if the C++ API existed before 2018: flashlight/flashlight#1
    yottabytt
    @yottabytt
    Hi all, I am from PyTorch land. I am very new to Tensorflow as well as its JVM APIs. I came across these links:
    1. https://www.tensorflow.org/api_docs/python/tf/io/TFRecordWriter
    2. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/lib/io/tf_record.py .
      Is there an equivalent API in Java? That is, I am looking to avoid introducing elements related to the computation graph into my project. I only need a low-level API that does the job of writing to and reading from TFRecord files.
    Karl Lessard
    @karllessard
    Hi @yottabytt , TFRecords are basically just protos that could be read/written directly. You can use these pregenerated bindings to do so
    This is a VERY outdated example where I’m playing with similar protos, if that can give you some ideas to start with
    yottabytt
    @yottabytt
    Hi @karllessard, Thank you so much for your reply. Yeah I understand Example, Features and Feature protos . I am actually looking if there is an API in tensorflow/java that does this particular job (TFRecord)
    Please correct me if I am missing something.
    yottabytt
    @yottabytt
    Based on this tutorial, the TFRecordDataset in tensorflow/java was available in org.tensorflow.framework. However, it expects an Ops param. That's why I meant "I am looking to avoid the introduction of elements/concepts related to the computation graph".
    Such a thing is actually out there in python. I wonder if there would be an equivalent in tensorflow/java soon.
    Adam Pocock
    @Craigacp
    That Python code is deprecated; the new ways all involve operating in eager mode. The hadoop record reader doesn't seem to do much, you could achieve the same by reading the protobuf using the generated Java protos.
    Adam Pocock
    @Craigacp
    If all you're looking for is iterating a file containing example protos then it might be easier to do that directly.
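One way to read the "do that directly" suggestion, as a minimal sketch rather than anything shipped in tensorflow/java. A TFRecord file is a sequence of frames, each made of a little-endian uint64 length, a masked CRC32C of the length bytes, the payload, and a masked CRC32C of the payload. The class `TfRecordIo` and its method names are hypothetical. The payload is treated as opaque bytes here; in practice it would be a serialized Example proto built with the generated Java proto classes. Assumes Java 11 or later (`java.util.zip.CRC32C`, `InputStream.readNBytes`).

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32C;

/** Hypothetical helper (not part of tensorflow/java) for TFRecord framing only. */
final class TfRecordIo {
    private static final int MASK_DELTA = 0xa282ead8;

    /** Masked CRC32C: rotate right by 15 bits, then add a constant (32-bit wraparound). */
    static int maskedCrc(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        int c = (int) crc.getValue();
        return ((c >>> 15) | (c << 17)) + MASK_DELTA;
    }

    /** Appends one framed record to the stream. */
    static void write(OutputStream out, byte[] payload) throws IOException {
        byte[] len = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(payload.length).array();
        out.write(len);
        out.write(intLe(maskedCrc(len)));
        out.write(payload);
        out.write(intLe(maskedCrc(payload)));
    }

    /** Returns the next payload, or null at a clean end of stream. */
    static byte[] read(InputStream in) throws IOException {
        byte[] len = in.readNBytes(8);
        if (len.length == 0) {
            return null;
        }
        if (len.length < 8) {
            throw new EOFException("truncated length header");
        }
        checkCrc(in, len, "length");
        long n = ByteBuffer.wrap(len).order(ByteOrder.LITTLE_ENDIAN).getLong();
        if (n < 0 || n > Integer.MAX_VALUE) {
            throw new IOException("unsupported record length: " + n);
        }
        byte[] payload = in.readNBytes((int) n);
        if (payload.length < n) {
            throw new EOFException("truncated payload");
        }
        checkCrc(in, payload, "payload");
        return payload;
    }

    // Reads the 4-byte stored checksum and compares it with the one computed from data.
    private static void checkCrc(InputStream in, byte[] data, String what) throws IOException {
        byte[] crc = in.readNBytes(4);
        if (crc.length < 4) {
            throw new EOFException("truncated " + what + " checksum");
        }
        int stored = ByteBuffer.wrap(crc).order(ByteOrder.LITTLE_ENDIAN).getInt();
        if (stored != maskedCrc(data)) {
            throw new IOException("corrupt " + what + " checksum");
        }
    }

    private static byte[] intLe(int v) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(v).array();
    }
}
```

To iterate a file, wrap a `FileInputStream` in a `BufferedInputStream` and call `TfRecordIo.read` until it returns null, parsing each payload with the Example proto's generated parser. No graph, session, or `Ops` instance is involved.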
    yottabytt
    @yottabytt
    I see. Cool. Thank you so much !!
    bhack
    @bhack
    Have you solved with C++ tensorflow/java#283?
    Or are you still interested in tensorflow/tensorflow#47748 ?
    Samuel Audet
    @saudet
    We can get away with anything that works with pybind11, so that's most likely not a problem. The API is getting pretty ugly though...
    Adam Pocock
    @Craigacp
    A stable C API would be nice. It's not clear how stable that one Samuel is wrapping is, given it's marked experimental.
    Stanislav Zemlyakov
    @rcd27
    Hello everyone! I have trained a CNN model for detecting particular images in Python. I need a direction for my next steps for using it in Java. I have a Mat object from OpenCV (a preprocessed image of a symbol to be "detected") and a loaded model with session and stuff. Maybe there are some open-source examples.
    Stanislav Zemlyakov
    @rcd27
    Found this one: https://stackoverflow.com/questions/62241546/java-tensorflow-keras-equivalent-of-model-predict/62295153#62295153 . So what I think I should do: prepare my Mat object to be a representation of Tensor<Float32> and use the session runner, as explained in the StackOverflow answer. Am I missing something?
    Adam Pocock
    @Craigacp
    Things have moved on since Karl wrote that answer. Now TFloat32 is a subtype of Tensor so you need to make one of those. The SavedModelBundle exposes a call entry point which executes the requested function in the model (e.g. https://github.com/tensorflow/java/blob/35b73ce43cb821e5462ffc9ece51d6528dad224d/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/SavedModelBundleTest.java#L131).
    Stanislav Zemlyakov
    @rcd27
    @Craigacp thanks, I have something now. Much better than nothing
    So now I should prepare a TFloat32 object from my image and pass it to function.call(..). The API says I need a 4-dimensional tensor. How can it be made from a 2-dimensional array like an image?
    Ryan Nett
    @rnett
    I'm guessing it's expecting [batch, height, width, channels], although it depends on what your Python model's input shape was
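A minimal sketch of that reshaping in plain Java, assuming a single grayscale image, a [batch, height, width, channels] layout with batch = 1 and channels = 1, and pixel values scaled to [0, 1]. All of these are assumptions that must match how the Python model was trained. `ImageToBatch` and `toNhwc` are hypothetical names; pulling the pixel values out of the OpenCV Mat is left out.

```java
/** Hypothetical helper: wraps one grayscale image into a 4-D batch array. */
final class ImageToBatch {

    /**
     * Converts 8-bit grayscale pixels indexed [height][width] into
     * floats indexed [1][height][width][1], scaled to the range [0, 1].
     */
    static float[][][][] toNhwc(int[][] gray) {
        if (gray.length == 0 || gray[0].length == 0) {
            throw new IllegalArgumentException("image must be non-empty");
        }
        int height = gray.length;
        int width = gray[0].length;
        float[][][][] batch = new float[1][height][width][1];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Assumed preprocessing: divide by 255; use whatever training used.
                batch[0][y][x][0] = gray[y][x] / 255.0f;
            }
        }
        return batch;
    }
}
```

By analogy with the `TString.tensorOfBytes(StdArrays.ndCopyOf(...))` call mentioned earlier, the resulting array could probably be turned into a tensor with something like `TFloat32.tensorOf(StdArrays.ndCopyOf(batch))`, but check that against the tensorflow/java version in use.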
    Stanislav Zemlyakov
    @rcd27