    torito
    @torito
    Ok, so if after running it I only have one result, that means the model is not taking the first dimension into account. I'll check with the person who wrote the model in Python, thanks
    Gili Tzabari
    @cowwoc
    Andrew Ng keeps talking about plotting learning curves (training and validation errors) to detect high bias or variance. Is this still considered a best practice nowadays? Is there a standard function for this in most libraries?
    Adam Pocock
    @Craigacp
    Plotting the learning curve is still good practice when training large models, but we don't have explicit plotting support in TF Java. I usually use TensorBoard for that, but I admit I've not tried to use TensorBoard from TF Java yet.
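    Since TF Java has no built-in plotting, one library-agnostic option is to record per-epoch training and validation loss yourself and compare them. The sketch below is a minimal illustration of the bias/variance reading of a learning curve; the thresholds are purely illustrative, not from any library:

        public class LearningCurve {
            // A large train/validation gap suggests high variance (overfitting);
            // both losses staying high suggests high bias (underfitting).
            // The cutoffs 0.5 and 1.0 are arbitrary, for illustration only.
            static String diagnose(double trainLoss, double valLoss) {
                if (valLoss - trainLoss > 0.5) return "high variance (overfitting)";
                if (trainLoss > 1.0) return "high bias (underfitting)";
                return "ok";
            }

            public static void main(String[] args) {
                System.out.println(diagnose(0.1, 0.9)); // high variance (overfitting)
                System.out.println(diagnose(1.5, 1.6)); // high bias (underfitting)
            }
        }

    In practice you would log the two losses every epoch (e.g. to CSV) and plot them externally or via TensorBoard.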
    Stanislav Zemlyakov
    @rcd27

    I do have some progress, thanks to this chat. My model has this signature:

    Signature for "serving_default":
                Method: "tensorflow/serving/predict"
                Inputs:
                    "conv2d_input": dtype=DT_FLOAT, shape=(-1, 30, 30, 3)
                Outputs:
                    "dense_2": dtype=DT_FLOAT, shape=(-1, 56)

    I've converted the image to the required tensor, but I'm having problems with the input shape, which should be 4-dimensional (and the image is only 3-dimensional). As @rnett said, the model expects [batch, height, width, channels]. So what I have now is:

    val image = ImageIO.read(File("./src/test/resources/tensorflow/A.png"))
    Truth.assertThat(image).isNotNull()

    val byte: ByteArray = (image.data.dataBuffer as DataBufferByte).data!!
    Truth.assertThat(byte).isNotEmpty()

    val desiredShape = Shape.of(0L, 30, 30, 3)

    // FIXME: seems like the image tensor is prepared badly. Try to wrap the image in an NdArray and then expand its dimensions by 1
    val imageTensor: TFloat32 = TFloat32.tensorOf(
        /*Shape.of(image.height.toLong(), image.width.toLong(), 3)*/desiredShape,
        NioDataBufferFactory.create(FloatBuffer.wrap(byte.map { it.toFloat() }.toFloatArray()))
    )
            Truth.assertThat(imageTensor).isNotNull()
    
            val function: ConcreteFunction = model.function(Signature.DEFAULT_KEY)
            val result: Tensor = function.call(imageTensor) as TFloat32
    
            Truth.assertThat(result).isNotNull()
            println("Result rank: ${result.rank()}") // 2
    
            println(result.size()) // 0

    So now I'm thinking about preparing a correctly shaped image tensor using NdArray. Am I thinking in the right direction?

    Karl Lessard
    @karllessard

    If you intend to pass a single image per batch, then the size of the first dimension should be 1, not 0.

    On another note, there is also this endpoint that allows you to allocate a tensor of any type from a buffer of bytes; I don’t know if that would work in your case and simplify it: https://github.com/tensorflow/java/blob/35b73ce43cb821e5462ffc9ece51d6528dad224d/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/Tensor.java#L180
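    The reason a leading dimension of 0 yields an empty result is that a tensor's element count is the product of its dimension sizes, so any zero-sized dimension makes the whole tensor empty. A plain-Java sanity check (not TF-specific):

        public class ShapeCheck {
            // Element count of a tensor is the product of its dimension sizes.
            static long elementCount(long[] shape) {
                long count = 1;
                for (long dim : shape) {
                    count *= dim;
                }
                return count;
            }

            public static void main(String[] args) {
                System.out.println(elementCount(new long[] {0, 30, 30, 3})); // 0: empty tensor
                System.out.println(elementCount(new long[] {1, 30, 30, 3})); // 2700 floats
            }
        }

    This is why `Shape.of(0L, 30, 30, 3)` in the snippet above produced `result.size() == 0`, while a batch size of 1 gives the expected 30 × 30 × 3 = 2700 input elements.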

    I usually use Tensorboard for that, but I admit I've not tried to use tensorboard from TF Java yet.

    @Craigacp , I tried that a few years ago and it is unfortunately not obvious how to write out summaries as expected by TensorBoard. Here’s an old example doing it on TF1.x. We should check if this is still the right way to do it and, if so, wrap this functionality in some Java utility classes.

    Stanislav Zemlyakov
    @rcd27
    @karllessard I've changed the batch size to 1, and now the result tensor has a size of 56, like the model is supposed to! Seems like I'm close to a prediction :) The last thing is to extract it from this result tensor.
    Stanislav Zemlyakov
    @rcd27
    The only float I was able to extract from the result TFloat32 was 0.0...
    Adam Pocock
    @Craigacp
    Did you extract all 56 elements or just the first one?
    Stanislav Zemlyakov
    @rcd27
    @Craigacp I just used getFloat() with no args. Should I iterate over the entire collection?
    Adam Pocock
    @Craigacp
    You should be able to inspect each element by passing in the coordinates, or copy it out into a Java array if necessary.
    Stanislav Zemlyakov
    @rcd27
    @Craigacp I iterated through the TFloat32 by calling .getFloat(0, idx) (where idx is in 0..result.size()) and all values seem to be 0.0
    Stanislav Zemlyakov
    @rcd27
    I think the image is wrapped in the tensor incorrectly. I passed a 4-dim shape, but the image itself is only 3-dim. I should probably create an NdArray to wrap the image bytes and then expand it by one dimension, like this is done in Python:
    self.model.predict(np.expand_dims(im, axis=0))
    Stanislav Zemlyakov
    @rcd27
    Is there a way to expand an NdArray by one dimension?
    Adam Pocock
    @Craigacp
    Yes, but as it's stored as a blob of memory, all it does is change the shape variable, so I don't think it's likely to fix your issue (assuming you're using code similar to what you posted above).
    Stanislav Zemlyakov
    @rcd27
    @Craigacp I'll try to solve it from the other side: train the model to expect shape = (30, 30, 3)
    Adam Pocock
    @Craigacp
    No, I think you probably want to keep the batch dimension. An ndarray of shape (1, 30, 30, 3) has the same data as one of shape (30, 30, 3), but you should check that you're passing in the pixels in the right order, and that your data is laid out in row-major order.
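    To make the row-major point concrete: in a [batch, height, width, channels] (NHWC) buffer, the flat offset of each element is fully determined by the shape. A small sketch (the dimension sizes match the (1, 30, 30, 3) model input above; the helper name is mine):

        public class NhwcIndex {
            // Flat offset of element (n, y, x, c) in a row-major
            // [batch, height, width, channels] buffer.
            static int offset(int n, int y, int x, int c, int h, int w, int channels) {
                return ((n * h + y) * w + x) * channels + c;
            }

            public static void main(String[] args) {
                // For a (1, 30, 30, 3) image: channel 0 of pixel (y=0, x=1) sits
                // at offset 3, and channel 0 of pixel (y=1, x=0) at offset 90.
                System.out.println(offset(0, 0, 1, 0, 30, 30, 3)); // 3
                System.out.println(offset(0, 1, 0, 0, 30, 30, 3)); // 90
            }
        }

    If the bytes pulled out of the `DataBufferByte` are in a different pixel or channel order (e.g. BGR, as some `BufferedImage` types store them), the tensor will be laid out wrong even though its shape is right.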
    Stanislav Zemlyakov
    @rcd27
    @Craigacp I've got a 1.0 value from one of the resulting TFloat32 elements!
    Stanislav Zemlyakov
    @rcd27
    That's a big deal for me, a person who knows little more than nothing about how TF works. Thank you guys so much! I'll continue to implement my idea. But I also want to understand TF. Any recommendations for courses?
    I understand Python, but my main stack is Java/Kotlin (so I will probably train models with the Python infrastructure and use them from Kotlin)
    Stanislav Zemlyakov
    @rcd27
    I would never have got this done if not for tensorflow/java and this open community, which helps people. Damn... Special thanks to @karllessard and @Craigacp
    Stanislav Zemlyakov
    @rcd27
    @karllessard I've updated your stackoverflow answer: https://stackoverflow.com/a/67289545/6748943
    Karl Lessard
    @karllessard
    @rcd27 , it is pretty common to normalize the image data when converting it to float as well, i.e. dividing it by 255.0f. See here, where we use TF eagerly to execute this preprocessing. If you were using Keras in Python, then your data was normalized.
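    One pitfall worth flagging when doing this from Java/Kotlin: bytes are signed, so a bare cast to float (as in the `it.toFloat()` in the Kotlin snippet above) turns any pixel value above 127 negative. Mask with 0xFF first, then divide. A minimal sketch:

        public class PixelNormalize {
            // Convert raw image bytes to floats in [0, 1]. Java/Kotlin bytes are
            // signed, so mask with 0xFF first; a bare cast would turn pixel
            // value 255 into -1.0f instead of 1.0f.
            static float[] normalize(byte[] pixels) {
                float[] out = new float[pixels.length];
                for (int i = 0; i < pixels.length; i++) {
                    out[i] = (pixels[i] & 0xFF) / 255.0f;
                }
                return out;
            }

            public static void main(String[] args) {
                float[] n = normalize(new byte[] {0, (byte) 255, (byte) 128});
                System.out.println(n[1]); // 1.0
            }
        }

    If the model was trained on inputs in [0, 1] (the usual Keras preprocessing), feeding it raw or sign-flipped byte values is enough to produce the kind of all-zero output seen earlier.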
    Adam Pocock
    @Craigacp
    Understanding TF is probably pretty tricky from a standing start. It depends what your goals are and your current familiarity with ML/DL.
    Adam Pocock
    @Craigacp
    I'm seeing some deeply weird non-determinism in the training code, which seems to be in the way that the TF C API generates gradients. I've opened an issue upstream, but if people are seeing non-determinism in their training runs, this might be why - tensorflow/tensorflow#48855
    jxtps
    @jxtps

    I've run into a weird Out Of Memory issue with TFJ 0.3.1. I initialize a couple of models on server startup & process a couple of images. CPU, not GPU. That works. Then the server idles for some time (half an hour?). Then if I process another image, it sometimes throws java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (57976M) > maxPhysicalBytes (37568M) at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:695). The server was typically using <10 gigs (as reported by Java, not Pointer.physicalBytes) before this, so it's very weird that it would shoot up to almost 60 gigs with no TF usage!?

    This all started happening after I introduced a GPU microservice - the servers used to process one image per couple of seconds with CPU inference and never crashed, but now that they only process an image if the GPU microservice is being too slow (= fallback), they've all of a sudden started crashing!? The GPU microservice is holding up fine - it's processing say 1 image per second or something and not crashing.

    Now, the resident set (RES column in top) is vastly greater than the ram usage reported by java - I'm for example seeing >50gigs of RES but <10 gigs of Runtime.getRuntime.totalMemory() - Runtime.getRuntime.freeMemory().
    Any suggestions for how to track this down? Is there a way to list / enumerate the ram used by TFJ?
    jxtps
    @jxtps
    And any ideas why RES (= 50 gigs) would be so much higher than e.g. Runtime.getRuntime.totalMemory() (= 20 gigs)
    Hmm... I'm realizing we switched from G1 GC to ZGC in the interim (unrelated issues on a separate site where G1 GC would mode-switch and start using >20% CPU after 12-24 hours of significant memory churn). Also switched from Java 14 to 15 (couldn't switch to 16 due to issues with Play framework not working with that version)
    Ryan Nett
    @rnett
    @jxtps Try using G1 again maybe? tensorflow/java#315 seems like the same issue, but just w/ ZGC
    Adam Pocock
    @Craigacp
    I don't think runtime.totalMemory() includes memory allocated via JNI (e.g. inside TF or through JavaCPP).
    Adam Pocock
    @Craigacp
    What GC tuning parameters are you setting in addition to the use of ZGC? Things like Xmx, and then anything ZGC specific or -XX:+DisableExplicitGC etc. ZGC maps the memory multiple times, which is known to inflate the resident set size (https://mail.openjdk.java.net/pipermail/zgc-dev/2018-November/000540.html), throwing off JavaCPP's calculation.
    jxtps
    @jxtps
    @rnett yeah, that sounds like the culprit, thanks! @Craigacp I think you're absolutely right about that, but 30 gigs for TFJ!? I run two models, one "small" and one "large", but that's an outrageous amount of RAM being used?! The graphics card they trained on had 32 gigs, and used batches!?
    Adam Pocock
    @Craigacp
    So it might not actually be a lot of memory, just an interaction between JavaCPP's accounting (which uses Linux's resident set size, which is known to be inaccurate in ZGC's use case) and ZGC.
    jxtps
    @jxtps
    Ah, ok, yeah, 3x makes a big difference. No specific switches for ZGC, just turned it on. Xmx is set to 60% of whatever free -m returns in the Mem row, Total column
    Sounds like I'll be reverting some of those recent changes then, thanks!
    Adam Pocock
    @Craigacp
    I think that turning off JavaCPP's native memory accounting should be sufficient.
    You're actually staying within the acceptable memory limits, but the RSS-based method that JavaCPP uses can't figure that out (because Linux doesn't know that either).
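    For reference, JavaCPP's memory limits are controlled through system properties on the JVM command line, and setting a limit to 0 disables the corresponding check. The property names below are from JavaCPP's `Pointer` class, but double-check them against the JavaCPP docs for your version; `server.jar` is a placeholder for your application:

        # Disable JavaCPP's physical-memory check (the source of the
        # OutOfMemoryError above); 0 means "no limit".
        java -Dorg.bytedeco.javacpp.maxphysicalbytes=0 \
             -Dorg.bytedeco.javacpp.maxbytes=0 \
             -jar server.jar

    Only the physical-bytes check is thrown off by ZGC's multi-mapping, so disabling that one alone may be enough.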
    Gili Tzabari
    @cowwoc
    Based on https://docs.oracle.com/en/java/javase/15/gctuning/available-collectors.html#GUID-C7B19628-27BA-4945-9004-EC0F08C76003 it sounds like one should use the parallel or G1 garbage collectors, tuned for max throughput, when training a model. That said, ZGC might still make sense when evaluating a pre-trained model.
    Gili Tzabari
    @cowwoc
    Hey. I posted a modeling question over at https://ai.stackexchange.com/q/27657/46787. I would appreciate your advice even though it's not TensorFlow-specific.
    Jakob Sultan Ericsson
    @jakeri

    Hello, we run TensorFlow Java 0.2.0 (TF 2.3.1), and we have a model that produces an image as a byte[] output. This works fine when running the model in Python: we can write the bytes to a file. But when trying to do the same thing using TF Java, we get too few bytes from the output. I think I have managed to boil it down to a unit test with TString.

        public void testTFNdArray() throws Exception {
            ClassLoader contextClassLoader = Thread.currentThread().getContextClassLoader();
            byte[] readAllBytes = Files.readAllBytes(Path.of(contextClassLoader.getResource("img_12.png").getFile()));
            NdArray<byte[]> vectorOfObjects = NdArrays.vectorOfObjects(readAllBytes);
    
            Tensor<TString> tensorOfBytes = TString.tensorOfBytes(vectorOfObjects);
            TString data = tensorOfBytes.data();
    
            byte[] asBytes = tensorOfBytes.data().asBytes().getObject(0);
    
            System.out.println("Bytes original file: " + readAllBytes.length);
            System.out.println("NdArray byte[] length: " + vectorOfObjects.getObject().length);
            System.out.println("Tensor numbytes: " + tensorOfBytes.numBytes());
            System.out.println("TString size: " + data.size());
            System.out.println("Bytes with reading from TString (WRONG):  " + asBytes.length);
        }

    This is the same problem I get when running through a real model. How can we get the full byte[] out again?

    Samuel Audet
    @saudet
    The implementation of TString has changed a lot recently. Could you please try again with TF Java 0.3.1?
    Jakob Sultan Ericsson
    @jakeri
    Thanks, I will give it a try.
    We wanted to stay on the same version as the generated Python code.
    Jakob Sultan Ericsson
    @jakeri
    I rewrote the above test with 0.3.1 and the results seem somewhat better, but the process core dumps 9 out of 10 times. On my MacBook Pro, Big Sur 11.3.1.
    Jakob Sultan Ericsson
    @jakeri
    It seems to be enough to write TString.scalarOf("hello"); to get the core dump.
    Jakob Sultan Ericsson
    @jakeri
    This doesn't fail if I run it in a Linux VM, so the problem is probably isolated to macOS (our dev environment).
    Adam Pocock
    @Craigacp
    What does your version of the test look like with 0.3.1?
    Jakob Sultan Ericsson
    @jakeri
        @Test
        public void testTString() throws Exception {
            TString.scalarOf("hello");
        }
    The JVM core dumps.
    But running this in a Docker Linux VM, the test passes.