Are there DL4J resources for loading time-series data stored in HDFS and constructing an iterator out of it to fit a model?
As in, locally fitting from remotely stored data?
Yes, it's similar to a local data pipeline (e.g. for CSV), but you should use an InputSplit that can read from HDFS instead of, say, a local FileSplit. For example:
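A minimal sketch of one such pipeline, assuming CSV sequence data and using Hadoop's FileSystem API together with DataVec's InputStreamInputSplit (the path, batch size and label index are placeholders, not values from an official example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader;
    import org.datavec.api.split.InputStreamInputSplit;
    import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator;
    import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
    import java.io.InputStream;

    // Open a stream to a file stored in HDFS (path is a placeholder)
    FileSystem fs = FileSystem.get(new Configuration());
    InputStream stream = fs.open(new Path("hdfs:///data/series/sequence_0.csv"));

    // Feed the remote stream into a local record reader
    CSVSequenceRecordReader reader = new CSVSequenceRecordReader(0, ",");
    reader.initialize(new InputStreamInputSplit(stream));

    // Wrap it in an iterator and fit locally, exactly as with local files
    DataSetIterator iter = new SequenceRecordReaderDataSetIterator(
            reader, 32, -1, 1, true);  // batch 32, regression, label at column 1

For many files you would loop over fs.listFiles(...) (or use an HDFS-aware split) rather than opening one stream at a time.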
I must not be looking in the right place: could someone explain to me, or point me to, how to stop the thread(s) performing the training? I'm running this code in a SwingWorker:
final QLearningDiscreteDense<State> qldd = new QLearningDiscreteDense<State>(mdp, Configuration.NETWORK, Configuration.Q_LEARNING, Configuration.DATA_MANAGER);
and would like to stop the training when cancelling the worker (using "this.worker.cancel(true)" or some other means).
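One possible approach: stop the training cooperatively from a listener instead of interrupting the thread. A sketch, assuming a recent RL4J version that exposes SyncLearning.addListener(...) and the TrainingListener interface (the callback names have changed between betas, so check your version):

    import javax.swing.SwingWorker;
    import org.deeplearning4j.rl4j.learning.IEpochTrainer;
    import org.deeplearning4j.rl4j.learning.ILearning;
    import org.deeplearning4j.rl4j.learning.listener.TrainingListener;
    import org.deeplearning4j.rl4j.util.IDataManager;

    SwingWorker<Void, Void> worker = new SwingWorker<Void, Void>() {
        @Override
        protected Void doInBackground() {
            final QLearningDiscreteDense<State> qldd = new QLearningDiscreteDense<State>(
                    mdp, Configuration.NETWORK, Configuration.Q_LEARNING, Configuration.DATA_MANAGER);
            qldd.addListener(new TrainingListener() {
                public ListenerResponse onTrainingStart() { return ListenerResponse.CONTINUE; }
                public void onTrainingEnd() { }
                public ListenerResponse onNewEpoch(IEpochTrainer trainer) {
                    // stop the training loop as soon as the worker is cancelled
                    return isCancelled() ? ListenerResponse.STOP : ListenerResponse.CONTINUE;
                }
                public ListenerResponse onEpochTrainingResult(IEpochTrainer trainer,
                                                              IDataManager.StatEntry statEntry) {
                    return isCancelled() ? ListenerResponse.STOP : ListenerResponse.CONTINUE;
                }
                public ListenerResponse onTrainingProgress(ILearning learning) {
                    return isCancelled() ? ListenerResponse.STOP : ListenerResponse.CONTINUE;
                }
            });
            qldd.train();  // blocks until done, or until a listener returns STOP
            return null;
        }
    };
    // later, e.g. from a Cancel button: worker.cancel(false);

As far as I know, cancel(true) only sets the interrupt flag, which the RL4J training loop doesn't check, so a cooperative stop (via a listener, or a volatile flag your code checks) is needed.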
@nimishjain I'm building such a simple example using Pavlov's dog as the "problem"; you can find it here: https://gitlab.com/wimp.today/Code/tree/master/RL4J%20Tests%20%28From%20Maven%29
It has a CLI version and a UI to visually display the training (and soon the use) of the model. It doesn't require anything special (no Gym or web server...), only RL4J (through Maven) and Java.
Hello. I'm importing a Keras functional model, and since my model has 2 Lambda layers I registered them via KerasLayer.registerLambdaLayer. The lambda_1 layer is a simple function that squares values; the layer sits between convolutional and dense layers. To make the model work, I have to return InputType.recurrent(16) from InputType getOutputType, and nothing else works (for details see the code and model in the link below).
1) Why recurrent? It's not logical: the layer before the lambda is convolutional, the next one is dense, and the model doesn't contain any recurrent layers at all.
2) What is InputType getOutputType actually responsible for: the input or the output?
3) As for the 2nd lambda layer: the InputType in the lambda_2 layer is set to InputType.feedforward(2). The argument can be set to any integer and the final result doesn't change. Why is there no change? Does the parameter have no meaning?
I attached a gist with:
1) the code;
2) the model (h5);
3) a summary of the original Keras model (png).
You can run the code with the model and check; set your own path when importing the model. The code also shows the model summary and lets you put in values (in the range 0-39) to check how the result changes.
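For reference, a minimal sketch of how I'd expect such a lambda to be registered before import; my understanding is that getOutputType declares the layer's output type given its input type, so an elementwise op like squaring should just pass the input type through (the layer name mirrors the description above):

    import org.deeplearning4j.nn.conf.inputs.InputType;
    import org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaLayer;
    import org.deeplearning4j.nn.modelimport.keras.KerasLayer;
    import org.nd4j.autodiff.samediff.SDVariable;
    import org.nd4j.autodiff.samediff.SameDiff;

    // Register the lambda before calling the Keras model import on the h5 file
    KerasLayer.registerLambdaLayer("lambda_1", new SameDiffLambdaLayer() {
        @Override
        public SDVariable defineLayer(SameDiff sd, SDVariable input) {
            return input.mul(input);  // elementwise square
        }

        @Override
        public InputType getOutputType(int layerIndex, InputType inputType) {
            // declares the output type of this layer given its input type;
            // an elementwise op leaves the shape unchanged, so pass it through
            return inputType;
        }
    });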
@orausch yes, you have two options (a sketch of both follows the list):
1) use workspaces: best performance (for cyclical workloads), but a little more complex
2) deallocate manually using INDArray.close() (which is fine, but has some performance overhead associated with it)
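A rough sketch of both, assuming ND4J's workspace API (the workspace name, array sizes and loop count are arbitrary):

    import org.nd4j.linalg.api.memory.MemoryWorkspace;
    import org.nd4j.linalg.api.memory.conf.WorkspaceConfiguration;
    import org.nd4j.linalg.api.memory.enums.AllocationPolicy;
    import org.nd4j.linalg.api.memory.enums.LearningPolicy;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    // Option 1: a cyclical workspace; off-heap memory is allocated on the
    // first pass and reused on every subsequent iteration
    WorkspaceConfiguration wsConf = WorkspaceConfiguration.builder()
            .policyAllocation(AllocationPolicy.STRICT)
            .policyLearning(LearningPolicy.FIRST_LOOP)
            .build();
    for (int i = 0; i < 10; i++) {
        try (MemoryWorkspace ws = Nd4j.getWorkspaceManager()
                .getAndActivateWorkspace(wsConf, "LOOP_WS")) {
            INDArray tmp = Nd4j.rand(1000, 1000);  // allocated inside the workspace
            // tmp is invalidated when the workspace closes; detach() it to keep it
        }
    }

    // Option 2: manual deallocation
    INDArray arr = Nd4j.rand(1000, 1000);
    // ... use arr ...
    arr.close();  // releases the off-heap buffer immediately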
Hi everyone, I'm currently facing an issue using Spark and DL4J.
I'm trying to fit a neural network for time-series forecasting. I have done all the data preparation with Spark: I load my data as an RDD of domain objects and end up with an RDD<DataSet>.
Then I create my SparkDl4jMultiLayer to fit my network, using the gradient sharing implementation.
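Roughly, my setup looks like this (simplified and shown in Java for clarity; names and values here are placeholders, not my actual code):

    import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer;
    import org.deeplearning4j.spark.parameterserver.training.SharedTrainingMaster;
    import org.nd4j.parameterserver.distributed.conf.VoidConfiguration;

    VoidConfiguration voidConfig = VoidConfiguration.builder()
            .unicastPort(40123)          // port for the parameter server
            .build();

    SharedTrainingMaster tm = new SharedTrainingMaster.Builder(voidConfig, batchSize)
            .batchSizePerWorker(batchSize)
            .workersPerNode(1)
            .build();

    SparkDl4jMultiLayer sparkNet = new SparkDl4jMultiLayer(sc, networkConf, tm);
    sparkNet.fit(trainingData);          // trainingData is the RDD<DataSet> from above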
The problem is that I encounter a NullPointerException coming from SharedTrainingWrapper.java; here is a more detailed stack trace:
19/10/02 22:18:13 WARN SharedTrainingWrapper: Exception encountered during fit operation
java.lang.NullPointerException
    at org.deeplearning4j.spark.parameterserver.pw.SharedTrainingWrapper.run(SharedTrainingWrapper.java:475)
    at org.deeplearning4j.spark.parameterserver.functions.SharedFlatMapPathsAdapter.call(SharedFlatMapPaths.java:94)
    at org.deeplearning4j.spark.parameterserver.functions.SharedFlatMapPathsAdapter.call(SharedFlatMapPaths.java:62)
    at org.datavec.spark.transform.BaseFlatMapFunctionAdaptee.call(BaseFlatMapFunctionAdaptee.java:40)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:153)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:153)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
After running the debugger, it looks like the SharedTrainingWrapper instance has an attribute consumer that is initialized to null and never gets updated. The only place where it seems to be updated is at line 310:
    // if we're running in spark localhost mode - we don't want double initialization
    if (!ModelParameterServer.getInstance().isInitialized())
But since I'm running Spark in localhost mode, that condition is false and the body of the if statement never runs.
My project is in Scala, built with sbt. I'm using Spark 2.3.1 and deeplearning4j 1.0.0-beta4.
Could someone help me with this?