Dear everyone,
I am trying to generate an uber-jar using the sbt compile and sbt package commands to run my application on our remote server, where Spark is installed in standalone mode. I used the deeplearning4j framework to build an LSTM neural network and intend to train the model through Spark. Nevertheless, I ran into an issue when running the spark-submit command:
spark-submit --class "lstm.SparkLSTM" --master local[*] stock_prediction_scala_2.11-0.1.jar --packages org.deeplearning4j:deeplearning4j-core:0.9.1 "/home/hadoop/ScalaWorkspace/Stock_Prediction_Scala/target/lstm_train/prices-split-adjusted.csv" "WLTW"
The problem is that spark-submit seemingly did not take effect in my case. It finished right after I entered the command, without throwing any error, and I have not seen any training progress in the output.
[hadoop@gaion34 lstm_train]$ spark-submit --class "lstm.SparkLSTM" --master local[*] stock_prediction_scala_2.11-0.1.jar --packages org.deeplearning4j:deeplearning4j-core:0.9.1 "/home/hadoop/ScalaWorkspace/Stock_Prediction_Scala/target/lstm_train/prices-split-adjusted.csv" "WLTW"
2018-04-25 17:06:50 WARN Utils:66 - Your hostname, gaion34 resolves to a loopback address: 127.0.0.1; using 192.168.0.173 instead (on interface eno1)
2018-04-25 17:06:50 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-04-25 17:06:51 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-04-25 17:06:51 INFO ShutdownHookManager:54 - Shutdown hook called
2018-04-25 17:06:51 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-c4aee15e-d23b-4c03-95a7-12d9d39f714a
[hadoop@abc lstm_train]$ spark-submit --class "lstm.SparkLSTM" --master local[*] stock_prediction_scala_2.11-0.1.jar --packages org.deeplearning4j:deeplearning4j-nn:0.9.1 "/home/hadoop/ScalaWorkspace/Stock_Prediction_Scala/target/lstm_train/prices-split-adjusted.csv" "WLTW"
2018-04-25 17:07:12 WARN Utils:66 - Your hostname, abc resolves to a loopback address: 127.0.0.1; using 192.168.0.11 instead (on interface eno1)
2018-04-25 17:07:12 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-04-25 17:07:13 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-04-25 17:07:13 INFO ShutdownHookManager:54 - Shutdown hook called
2018-04-25 17:07:13 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-82fdaebf-1121-4e31-8c4f-37aea9683922
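One thing I notice: per the Spark docs, spark-submit treats everything placed after the application jar as arguments to the application itself, so --packages above may be going to my main class rather than to spark-submit. A reordered command (same jar and inputs, assuming the paths are unchanged) would be:

spark-submit --class "lstm.SparkLSTM" --master local[*] --packages org.deeplearning4j:deeplearning4j-core:0.9.1 stock_prediction_scala_2.11-0.1.jar "/home/hadoop/ScalaWorkspace/Stock_Prediction_Scala/target/lstm_train/prices-split-adjusted.csv" "WLTW"

With the original ordering, args.length in my main class would be 4 rather than 2, so the guard below would fall through silently.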
My main class:

import org.apache.log4j.BasicConfigurator

object SparkLSTM {
  def main(args: Array[String]): Unit = {
    // NB: when args.length != 2 this falls through silently and the job exits at once
    if (args.length == 2) {
      val filePath = args(0) //"/Users/kym1992/STUDY/NEU/CSYE7200/Dataset/nyse/prices-split-adjusted.csv"
      val symbolName = args(1)
      BasicConfigurator.configure()
      val prepared = StockPricePredictionLSTM.prepare(filePath, symbolName, 0.90)
      val result = StockPricePredictionLSTM.predictPriceOneAhead(prepared._1, prepared._2, prepared._3, prepared._4, prepared._5)
      println("predicts, actual")
      (result.predicts, result.actuals).zipped.foreach((x, y) => println(x + ", " + y))
      saveAsCsv(result, symbolName)
      result.predicts.foreach(r => println(r))
    }
  }
}
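A minimal sketch of one way to surface the silent fall-through (the usage message here is illustrative) would be an else branch on the guard:

    if (args.length == 2) {
      // ... existing body ...
    } else {
      // makes the no-op visible instead of exiting silently
      System.err.println(s"Expected 2 args (csvPath symbol), got ${args.length}: ${args.mkString(" ")}")
      sys.exit(1)
    }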
Has anyone experienced this issue before? Please advise. Thanks!
docker pull hello-world pulls a very small test image. If you suspect problems pulling, this documentation should help: https://docs.docker.com/engine/reference/commandline/pull/. Note the proxy configuration section.
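For reference, the client-side proxy settings live in ~/.docker/config.json; a minimal sketch, with proxy.example.com:3128 standing in for your actual proxy:

{
  "proxies": {
    "default": {
      "httpProxy": "http://proxy.example.com:3128",
      "httpsProxy": "http://proxy.example.com:3128",
      "noProxy": "localhost,127.0.0.1"
    }
  }
}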
Open localhost:8888 in your browser. In some terminal environments you can control/command-click the URL to do this without having to copy and paste. Let me know if this doesn't work.
%showTypes on isn't really needed; the types are shown anyway, so I removed that cell. Later on, you'll run into problems with %%dataframe, but I added cells showing the alternative way to render a DataFrame, using the show() method.
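For example, a cell along these lines (the data is illustrative, and spark is the session the notebook kernel already provides):

import spark.implicits._

val df = Seq(("WLTW", 120.5), ("AAPL", 170.3)).toDF("symbol", "price")
df.show()  // renders the DataFrame as a plain ASCII table, no %%dataframe magic needed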
I used the jupyter/all-spark-notebook image with Spark 2.4. By the way, the current instructions will lead the reader to Spark 3.0.1; I'm not sure whether that will conflict with the notebook content. Anyway, thank you for sharing this notebook, it's helpful!
Hi guys,
We have open-sourced a small library, spark-property-tests, for writing easy tests on Spark DataFrames. We have been using it internally at Zeotap data engineering for some time now and thought that the community might benefit from it.
We have published the artifacts to Maven Central for Spark 2.4.x; other versions are on the way. Please review the utility and let us know your feedback.
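For a flavor of the kind of check this targets, here is a hand-rolled sketch in plain Spark; it does not use the spark-property-tests API, and the column names and data are illustrative:

import org.apache.spark.sql.SparkSession

object UniqueIdCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("unique-id-check").getOrCreate()
    import spark.implicits._

    val users = Seq((1, "a@x.com"), (2, "b@x.com")).toDF("id", "email")

    // Property: the id column contains no duplicates
    assert(users.select("id").distinct().count() == users.count(), "id must be unique")

    spark.stop()
  }
}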