If you use Eclipse, then you have to import the "shared-utilities" project into your Eclipse Package Explorer as well. That fixed it for me.
Or you can refer to eclipse/deeplearning4j#8307
git clone https://github.com/eclipse/deeplearning4j-examples
cd deeplearning4j-examples/shared-utilities
mvn install
cd ../dl4j-examples
mvn install
Hello. I am trying to train a MultiLayerNetwork using the fit method. I first tried to create an INDArrayDataSetIterator, passing one INDArray as the feature matrix (64 rows and 6 columns) and another INDArray for the labels (64 rows and 2 columns).
However, it didn't seem to train correctly, as I was constantly getting the same values. I then tried to do it without a dataset iterator, but nothing changed.
Any ideas on what I'm doing wrong? I don't think it is related to the network configuration or the learning rate. Only one parameter of the network output seems to change.
Amlesh Sivanantham
hey guys, I'm having some trouble running valgrind to test my Java class running some dl4j code. I was following the guide on the old nd4j wiki, but the output is not what I expect. They create a script called valgrindJava which essentially wraps the java process in valgrind. I tried running it with my actual arguments, through Maven, and standalone without the script. In all cases, valgrind seems to print its summary before the java process even runs. Yet in the valgrind logs I do see the full java command being printed, so it is getting it. Hoping someone here understands what the issue could be. Here is the wiki page I was referring to: https://github.com/deeplearning4j/libnd4j/wiki/Debugging-libnd4j
Amlesh Sivanantham
Hey all, I identified the issue. The java in my PATH was actually a shell script that forwarded the call to the actual java binary. Once I passed that binary in directly, it's correctly being scanned by valgrind.
Yuniel Acosta Pérez
In the process of transforming my dataset, can I convert a column that contains text into a vector using Word2Vec? If so, can someone explain to me how?
Yuniel Acosta Pérez
I need help please.
@yuniel-acosta You can, see the example in https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-word2vec; you just need to feed in the data from your file instead of raw_sentences.txt
Can anybody suggest a good tutorial on how to set up a simple reinforcement learning environment in DL4J? The examples require frameworks, the toy sample is not documented, and the objects involved are not easy to understand even when diving into the code.
new to DL4J
trying to do the setup
getting the below exception when I try to run CSVExample.java in IntelliJ
I completed the Maven clean install
any help please?
@sskmaestro As you can see, the files on the Azure cloud where they are supposed to be hosted are no longer available, so the link is broken. I solved it by searching for those files and caching them myself in my local folders, e.g., C:\Users\myself\dl4j-examples-data\dl4j-examples\nlp\raw_sentences.txt
You can find the local path of IrisData.zip in the project "shared-utilities*/.../DownloaderUtility.java", in this case:
IRISDATA("IrisData.zip", "datavec-examples", "bb49e38bb91089634d7ef37ad8e430b8", "1KB"),
so it would be:
Unfortunately I can't find the mirror I used.
@artsakenos I don't see the IrisData.zip under \dl4j-examples-data\dl4j-examples\datavec-examples\
Manish Patel
Hi, is there a way for ImageRecordReader to autodetect the correct number of channels, e.g. from the first image in the data set?

@sskmaestro Yes, what I meant is that you have to cache it yourself in that folder, i.e., download it and put it there. For example, this link should have an iris data folder: https://archive.ics.uci.edu/ml/datasets/Iris
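In case it helps anyone scripting the workaround above: a plain-Java sketch of dropping a manually downloaded file into the examples' cache folder. The cache path follows the one mentioned earlier in this thread; the class and method names are my own, purely illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CacheDataFile {
    // Copies a manually downloaded dataset file into a cache directory,
    // creating the directory if needed and overwriting any stale copy.
    static Path cacheFile(Path downloaded, Path cacheDir) {
        try {
            Files.createDirectories(cacheDir);
            Path target = cacheDir.resolve(downloaded.getFileName());
            return Files.copy(downloaded, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Cache folder the examples reportedly look in (per the messages above)
        Path cacheDir = Paths.get(System.getProperty("user.home"),
                "dl4j-examples-data", "dl4j-examples", "nlp");
        System.out.println("Would cache into: " + cacheDir);
        // e.g. cacheFile(Paths.get("raw_sentences.txt"), cacheDir);
    }
}
```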

I'm so sad that Java developers and tools for machine learning always seem like they're being abandoned.


Hi, I'm debugging an application where physical memory usage grows indefinitely, resulting in OutOfMemory errors. The errors may occur in a few places, but the ultimate result is "Physical memory usage is too high: physicalBytes(40238M) > maxPhysicalBytes(40050M)". It usually takes around 1 million requests to a Tomcat app where the ComputationGraphs and DataSetIterator live.

I cannot debug the above application on a Linux machine where I can easily change code and iterate quickly, so I have created a toy app based on one of the dl4j examples. There is no Tomcat; it just loads the net built in CnnSentenceClassificationExample (I used the same training code there), then loops over the DataSetIterator and calls net.output on each batch.

I noticed that, when logging Pointer.physicalBytes(), in the toy app the value goes up until a certain point. Then there is a pause of a few seconds, and the totalBytes number drops. Once I see this happen, I see that it happens each time the value reaches a certain maximum.

In the Tomcat app logs, I don't see the value ever go down. I wasn't logging physical bytes when I saw the OOM, and will do more tests. But I'm currently trying to pin down which code in dl4j, nd4j, or javacpp runs during this pause (which appears to be garbage collection for native memory). I'd like to find that, to see if for some reason that same code path is being skipped in my Tomcat app. Any advice here would be appreciated, thanks.

The OutOfMemory errors generally occur in FloatPointer's constructor (<init>).
Samuel Audet
@tc64 It's possible that garbage collection is taking more time with Tomcat because of other applications running in the container. Try increasing the number of retries of System.gc(): http://bytedeco.org/javacpp/apidocs/org/bytedeco/javacpp/Pointer.html#maxRetries
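For reference, maxRetries can reportedly also be set as a system property before JavaCPP's Pointer class is first loaded. The lowercase property name below is my assumption, following the pattern of JavaCPP's other org.bytedeco.javacpp.* properties (e.g. maxbytes, maxphysicalbytes); please verify it against the javadoc linked above.

```java
public class JavaCppRetries {
    public static void main(String[] args) {
        // Assumed property name (not verified against JavaCPP source).
        // JavaCPP reads its org.bytedeco.javacpp.* properties when Pointer is
        // initialized, so this must run before any ND4J/DL4J class is touched,
        // or be passed on the command line as -Dorg.bytedeco.javacpp.maxretries=10.
        System.setProperty("org.bytedeco.javacpp.maxretries", "10");
        System.out.println(System.getProperty("org.bytedeco.javacpp.maxretries"));
    }
}
```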
Hello, how do I change the model in the interface? Example:
Hi, can we use deeplearning4j for video analysis?

I'm seeing something interesting in Nd4j's getRow. When doing getRow(0) it appears to return all of the data, not just the first row. Code snippet:
val shape = outMat.shape()
val out = which + ";" + (0L until shape(0)).map(ll => {
  val data = outMat.getRow(ll).data().asDouble()
  // ... (rest of snippet elided)
})
outMat is a 2D INDArray of shape (5, 256). Obviously using Scala, and not the latest version of DL4J (1.0.0-beta2), so forgive me if this has been seen before!

this code snippet yields:

Martin Krybus
Hello guys, could someone direct me please? I'm using QLearningDiscreteConv to teach a car to drive on a straight road. Rewards are set up as follows: if the car deviates from the middle of the road it gets -0.1, and if it gets closer to the middle it gets 0.1. The intended result is for the car to learn to drive down the straight road while staying in the middle. There are only 2 actions: turn left or turn right by 4 degrees. The lower the epsilon, the worse the car's results; in other words, it does better taking random actions than taking actions based on the q-table values. What am I doing wrong? Typically, if policy.EpsGreedy - EP: is a very low number, it sticks to one action, which is very frustrating. As input I'm using the immediate surroundings of the car, to optimize the training process.
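For concreteness, the reward rule described above can be sketched in plain Java as follows. This is my paraphrase of the rule, not the poster's code; the class and method names are hypothetical.

```java
public class RoadReward {
    // Reward rule from the message above: +0.1 if the car moved closer to the
    // middle of the road this step, -0.1 otherwise (it moved away or stayed put).
    static double reward(double prevDistanceFromMiddle, double newDistanceFromMiddle) {
        return Math.abs(newDistanceFromMiddle) < Math.abs(prevDistanceFromMiddle)
                ? 0.1
                : -0.1;
    }

    public static void main(String[] args) {
        System.out.println(reward(2.0, 1.0)); // moved closer: 0.1
        System.out.println(reward(1.0, 2.0)); // moved away:  -0.1
    }
}
```

One thing worth noting with a scheme like this: the reward depends only on the change in distance, so the agent is never rewarded for simply being near the middle, which can make the learned policy oscillate.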
Rural Hunter

Hi, I have a question regarding using external error: https://github.com/eclipse/deeplearning4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/misc/externalerrors/MultiLayerNetworkExternalErrors.java
If my nn is like this:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .updater(new Nesterovs(0.001, 0.9))
            .list()
            .layer(0, new DenseLayer.Builder().nOut(20).build())
            .layer(1, new DenseLayer.Builder().activation(Activation.SOFTMAX).nOut(4).build())
            .build();

To calculate the gradient, should I use the vanilla error or the error after softmax activation in the net.backpropGradient() function?
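Not an authoritative answer, but from reading the linked MultiLayerNetworkExternalErrors example: backpropGradient appears to take dL/dOutput, i.e. the gradient of the loss with respect to the network's activated (post-softmax) output, and the network then backpropagates through the activation itself. As a plain-Java illustration (my example, not from the chat), this is what such an external gradient looks like for an MSE loss:

```java
import java.util.Arrays;

public class ExternalErrorGradient {
    // For MSE loss L = (1/n) * sum_i (out_i - label_i)^2, the gradient with
    // respect to the network output (the quantity fed to backpropGradient in
    // the external-errors example, as I read it) is dL/dout_i = 2*(out_i - label_i)/n.
    static double[] mseGradient(double[] output, double[] labels) {
        int n = output.length;
        double[] grad = new double[n];
        for (int i = 0; i < n; i++) {
            grad[i] = 2.0 * (output[i] - labels[i]) / n;
        }
        return grad;
    }

    public static void main(String[] args) {
        double[] g = mseGradient(new double[]{0.5, 0.5}, new double[]{1.0, 0.0});
        System.out.println(Arrays.toString(g)); // [-0.5, 0.5]
    }
}
```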

what is thingsboard exactly??
Hello, I have a question about random seeds. I am new to GPU training, so this will not be my last question :). If I set .seed(123) in FineTuneConfiguration and train the model on CPU, it works as it should: every run with everything the same gives the same results. But on GPU the results are different every time (with the same seed). I also tried the Nd4j.getRandom().setSeed() method, but it didn't help. The code is the same on both machines. Do you know what could be wrong? Am I missing something?
Torsten Bergh Moss


Does anybody have any idea of what sort of classification speed one can expect using DL4J?

I have a single-hidden-LSTM-layer RNN doing sentiment analysis of tweets with the CUDA 10.1 backend (without cuDNN; I'm working on getting that installed, but I have limited privileges on the machine) and two Tesla P100 16GB GPUs. Classifying using net.output(), I get a throughput of about 100 tweets processed per second. This is way lower than I was hoping for, as I achieved a throughput of 15k tweets per second using a CPU-based implementation of Naive Bayes last semester.

Does anybody have any experience making NNs faster and more scalable? I would greatly appreciate any nudge in the right direction.

Torsten Bergh Moss
Why am I using net.output() instead of a DataSetIterator, you might ask: I am using the network in a streaming context, not on a static dataset.
@bergh_moss_twitter I think the usual way to get an answer is to file an issue on GitHub and post the link there.
@bergh_moss_twitter Please note that DL4J forum moved to https://community.konduit.ai

I am trying to use KerasModelImport.importKerasSequentialModelAndWeights() to load a Keras model generated by Python Keras, but I cannot get the same result from Python and Java.

# Python code:
input_length = 3
model = Sequential()
model.add(Embedding(7, 4, input_length=input_length))
model.add(Dense(1, activation='relu'))
model.compile(optimizer=SGD(lr=0.01, momentum=0.9), loss='mean_squared_error')

# model.save('.\\data\\simple_mlp_4.h5')
model = load_model('.\\data\\simple_mlp_4.h5')

data = [[1, 2, 3], [4, 3, 6]]
data = np.asarray(data, dtype=np.int32)

output = model.predict(data)
// Java code:
final String modelFile = new File(dataLocalPath,"simple_mlp_4.h5").getAbsolutePath();
MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights(modelFile, true);

INDArray input = Nd4j.create(new float[]{1,2,3,4,3,6},new int[]{2,3}).castTo(DataType.INT);
INDArray output = model.output(input);


The Python code prints out [[0.04062647] [0.01502968]], while the Java code prints out [0.0604, 0.0385]. They load the same model and weights but produce different results. Can anyone help?

Can someone help with this issue? eclipse/deeplearning4j-examples#944
Hi everyone! I'm using Arbiter and I can't see the relationship between the candidate iterations that I obtain, the number of epochs, and the batchSize and totalSize of my dataset. I'm expecting (totalSizeDataset/batchSize), rounded up to an integer, multiplied by the number of epochs. Is that true?
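The expectation in the message above can be written out as plain arithmetic, which might make it easier to compare against what Arbiter actually reports. This is just a sketch of the formula as stated; whether Arbiter counts iterations exactly this way is the open question.

```java
public class ArbiterIterations {
    // Expected iteration count per the formula above:
    // ceil(totalSize / batchSize) batches per epoch, times the number of epochs.
    static long expectedIterations(long totalSize, int batchSize, int epochs) {
        long batchesPerEpoch = (totalSize + batchSize - 1) / batchSize; // integer ceil
        return batchesPerEpoch * epochs;
    }

    public static void main(String[] args) {
        // 1000 examples, batch size 32 -> ceil(31.25) = 32 batches/epoch; 5 epochs
        System.out.println(expectedIterations(1000, 32, 5)); // 160
    }
}
```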
Hello everyone! Has anyone attempted incremental training with DL4J on an already-trained model? I am trying to reload the computation graph and re-train the model with new data on top of the existing one. Any ideas or suggestions on this would be helpful. Thanks.
Hello friends, I received this error with the ResNet50 zoo model:

A fatal error has been detected by the Java Runtime Environment:


EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff842b3a1a8, pid=3620, tid=0x000000000000303c

Can you help me to fix it?
Hello everyone, I'm trying to get my application running with CUDA 10.2. I tried beta6 and the snapshot with the CUDA redist dependency. GPU usage is very low, 0%-4%. Here is the output when learning starts: [main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [JCublasBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
[main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 0
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [4]; Memory: [14,2GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [CUBLAS]
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 10.2.89
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [GeForce 840M]; cc: [5.0]; Total memory: [2147483648]
[main] INFO org.deeplearning4j.models.sequencevectors.SequenceVectors - Starting vocabulary building...

and the stack trace (2020-02-28 16:35:54):
Full thread dump OpenJDK 64-Bit Server VM (25.242-b08 mixed mode):

"VectorCalculationsThread 0" #21 prio=5 os_prio=0 tid=0x00000253b5a3c000 nid=0x1a38 runnable [0x000000c0accfe000]
java.lang.Thread.State: RUNNABLE
at org.nd4j.nativeblas.Nd4jCuda.execCustomOp2(Native Method)
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:2498)
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:2306)
at org.deeplearning4j.models.embeddings.learning.impl.elements.SkipGram.iterateSample(SkipGram.java:534)
at org.deeplearning4j.models.sequencevectors.SequenceVectors$VectorCalculationsThread.run(SequenceVectors.java:1317)

Locked ownable synchronizers:

- None

"AsyncSequencer thread" #20 daemon prio=5 os_prio=0 tid=0x00000253b5a3b000 nid=0x1848 runnable [0x000000c0acbff000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
at org.deeplearning4j.util.ThreadUtils.uncheckedSleep(ThreadUtils.java:26)
at org.deeplearning4j.models.sequencevectors.SequenceVectors$AsyncSequencer.run(SequenceVectors.java:1170)

Locked ownable synchronizers:

- None

"DeallocatorServiceThread_1" #19 daemon prio=5 os_prio=0 tid=0x00000253b5a33800 nid=0x1b50 in Object.wait() [0x000000c0acaff000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)

- locked <0x00000006b7f205a0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at org.nd4j.linalg.memory.deallocation.DeallocatorService$DeallocatorServiceThread.run(DeallocatorService.java:123)

Locked ownable synchronizers:

- None

"DeallocatorServiceThread_0" #18 daemon prio=5 os_prio=0 tid=0x00000253b5a32000 nid=0x1504 in Object.wait() [0x000000c0ac9ff000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)

- locked <0x00000006b7f20758> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at org.nd4j.linalg.memory.deallocation.DeallocatorService$DeallocatorServiceThread.run(DeallocatorService.java:123)

Locked ownable synchronizers:

- None

"Threadly clock updater" #13 daemon prio=5 os_prio=0 tid=0x0000025338537000 nid=0x2534 in Object.wait() [0x000000c0ac7fe000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.threadly.util.Clock$ClockUpdater.run(Clock.java:250)

- locked <0x00000003c0bfe020> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:748)

Locked ownable synchronizers:

- None

"JavaCPP Deallocator" #11 daemon prio=10 os_prio=2 tid=0x000002537c5f7000 nid=0x2498 in Object.wait() [0x000000c0ac5fe000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)

- locked <0x00000003c0098080> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at org.bytedeco.javacpp.Pointer$DeallocatorThread.run(Pointer.java:375)

Locked ownable synchronizers:

- None

"Service Thread" #10 daemon prio=9 os_prio=0 tid=0x000002537bc8a800 nid=0x9bc runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:

- None

"C1 CompilerThread2" #9 daemon prio=9 os_prio=2 tid=0x000002537bc83000 nid=0x638 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:

- None

"C2 CompilerThread1" #8 daemon prio=9 os_prio=2 tid=0x000002537bc13800 nid=0x17e8 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:

- None

"C2 CompilerThread0" #7 daemon prio=9 os_prio=2 tid=0x000002537bc13000 nid=0x5bc waiting on condition