java.nio.channels.ClosedChannelException
. This one you need to go after. There are quite a few posts on Stack Overflow about it; I suggest trying the advice there. Jobserver seems to be working fine here, so you probably need to tune Spark :)
Hi @valan4ik, thanks a lot for the quick response. I was using SJS 0.9.0, and noticed that spark-jobserver starts the H2 DB if spark.jobserver.startH2Server is enabled in env.conf.
Actual code in JobServer.scala:
// start embedded H2 server
if (config.getBoolean("spark.jobserver.startH2Server")) {
  val rootDir = config.getString("spark.jobserver.sqldao.rootdir")
  val h2 = org.h2.tools.Server.createTcpServer("-tcpAllowOthers", "-baseDir", rootDir).start()
  logger.info("Embeded H2 server started with base dir {} and URL {}", rootDir, h2.getURL: Any)
}
The above code starts the H2 database on port 9092, and spark-jobserver comes up if we use spark.jobserver.sqldao.jdbc.url="jdbc:h2:tcp://<serverHost>:9092/h2-db;AUTO_RECONNECT=TRUE". The real issue arises when port 9092 is busy: H2 then comes up on a free port, and we won't be able to set the right spark.jobserver.sqldao.jdbc.url in env.conf.
I used the following code to overcome this issue. Let me know if you know a better solution.
// start embedded H2 server
if (config.getBoolean("spark.jobserver.startH2Server")) {
  val rootDir = config.getString("spark.jobserver.sqldao.rootdir")
  // getString throws ConfigException.Missing for an absent path rather than
  // returning null, so guard with hasPath before reading the port
  val h2ServerPort =
    if (config.hasPath("spark.jobserver.h2ServerPort")) config.getString("spark.jobserver.h2ServerPort")
    else "9092"
  val h2 = org.h2.tools.Server.createTcpServer(
    "-tcpPort", h2ServerPort, "-tcpAllowOthers", "-baseDir", rootDir).start()
  logger.info("Embedded H2 server started with base dir {} and URL {}", rootDir, h2.getURL: Any)
}
And env.conf contains the following properties:
spark.jobserver.h2ServerPort=7272
spark.jobserver.sqldao.jdbc.url="jdbc:h2:tcp://<serverHost>:7272/h2-db;AUTO_RECONNECT=TRUE"
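A related alternative to hard-coding a fallback port (my own sketch, not jobserver code; the helper name choosePort is hypothetical): probe whether the preferred port is free and otherwise let the OS assign one, then log the chosen port so the JDBC URL can be built to match it. In plain JVM terms:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortChooser {
    // Try the preferred port first; if it is already taken, fall back to an
    // OS-assigned ephemeral port. The caller must log the returned value so
    // the sqldao JDBC URL can point at the port actually in use.
    static int choosePort(int preferred) {
        try (ServerSocket s = new ServerSocket(preferred)) {
            return s.getLocalPort();
        } catch (IOException busy) {
            try (ServerSocket s = new ServerSocket(0)) {
                return s.getLocalPort();
            } catch (IOException e) {
                throw new IllegalStateException("no free port available", e);
            }
        }
    }

    public static void main(String[] args) {
        // e.g. feed this into "-tcpPort", String.valueOf(port)
        int port = choosePort(9092);
        System.out.println("H2 port: " + port);
    }
}
```

The downside is that the JDBC URL in env.conf can no longer be static, which is why pinning the port via a config property, as above, is simpler in practice.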
Hi All,
What is the best way to get the failure reason for a Spark task when jobs are submitted asynchronously? I am getting the following response for the ERROR case:
[{
  "duration": "5.948 secs",
  "classPath": "<ApplicationClass>",
  "startTime": "2019-10-15T11:19:30.803+05:30",
  "context": "INFA_DATA_PREVIEW_APP_DIS_SJS",
  "status": "ERROR",
  "jobId": "0e79d7b3-7e31-4232-b354-b1fb01b0928a",
  "contextId": "08ee61f8-7425-45f2-a4ce-c4b456fd24b8"
}]
Do we need to do anything extra in the ApplicationClass in order to get the error stack trace in the job response body?
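One workaround (an assumption on my side, not a jobserver feature) is to catch failures inside the job body and return the stack trace as part of the result, so it survives serialization into the response regardless of how the server reports exceptions. The helper below is a hypothetical sketch; `body` stands in for the work done inside the application's runJob method:

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class JobResult {
    // Run the job body; on failure, return the stack trace as data so it
    // appears in the serialized job result instead of only an ERROR status.
    static Map<String, Object> runCaptured(Supplier<Object> body) {
        Map<String, Object> result = new LinkedHashMap<>();
        try {
            Object value = body.get();
            result.put("status", "OK");
            result.put("value", value);
        } catch (RuntimeException e) {
            StringWriter sw = new StringWriter();
            e.printStackTrace(new PrintWriter(sw));
            result.put("status", "ERROR");
            result.put("message", e.getMessage());
            result.put("stack", sw.toString());
        }
        return result;
    }
}
```

With this shape, even the job-list endpoint shows a result map containing the message and stack for failed runs, at the cost of the job no longer being marked ERROR by the server itself.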
{
"duration": "69.903 secs",
"classPath": "org.air.ebds.organize.spark.jobserver.GBSegmentJob",
"startTime": "2019-11-18T02:41:07.524Z",
"context": "test-context",
"result": {
"message": "Job aborted due to stage failure: Task 2 in stage 10.0 failed 1 times, most recent failure: Lost task 2.0 in stage 10.0 (TID 530, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space\n\tat scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:117)\n\tat scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:115)\n\tat scala.Array$.ofDim(Array.scala:218)\n\tat geotrellis.raster.IntArrayTile$.ofDim(IntArrayTile.scala:271)\n\tat geotrellis.raster.ArrayTile$.alloc(ArrayTile.scala:422)\n\tat geotrellis.raster.CroppedTile.mutable(CroppedTile.scala:141)\n\tat geotrellis.raster.CroppedTile.mutable(CroppedTile.scala:133)\n\tat geotrellis.raster.CroppedTile.toArrayTile(CroppedTile.scala:125)\n\tat geotrellis.raster.crop.SinglebandTileCropMethods$class.crop(SinglebandTileCropMethods.scala:45)\n\tat geotrellis.raster.package$withTileMethods.crop(package.scala:55)\n\tat geotrellis.raster.crop.MultibandTileCropMethods$$anonfun$cropBands$1.apply$mcVI$sp(MultibandTileCropMethods.scala:45)\n\tat geotrellis.raster.crop.MultibandTileCropMethods$$anonfun$cropBands$1.apply(MultibandTileCropMethods.scala:44)\n\tat geotrellis.raster.crop.MultibandTileCropMethods$$anonfun$cropBands$1.apply(MultibandTileCropMethods.scala:44)\n\tat scala.collection.immutable.Range.foreach(Range.scala:160)\n\tat geotrellis.raster.crop.MultibandTileCropMethods$class.cropBands(MultibandTileCropMethods.scala:44)\n\tat geotrellis.raster.package$withMultibandTileMethods.cropBands(package.scala:83)\n\tat geotrellis.raster.crop.MultibandTileCropMethods$class.crop(MultibandTileCropMethods.scala:72)\n\tat geotrellis.raster.package$withMultibandTileMethods.crop(package.scala:83)\n\tat geotrellis.raster.package$withMultibandTileMethods.crop(package.scala:83)\n\tat geotrellis.raster.crop.CropMethods$class.crop(CropMethods.scala:66)\n\tat geotrellis.raster.package$withMultibandTileMethods.crop(package.scala:83)\n\tat 
geotrellis.spark.buffer.BufferTiles$$anonfun$17.geotrellis$spark$buffer$BufferTiles$$anonfun$$genSection$1(BufferTiles.scala:544)\n\tat geotrellis.spark.buffer.BufferTiles$$anonfun$17$$anonfun$apply$17.apply(BufferTiles.scala:549)\n\tat geotrellis.spark.buffer.BufferTiles$$anonfun$17$$anonfun$apply$17.apply(BufferTiles.scala:549)\n\tat scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)\n\tat scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)\n\tat scala.collection.immutable.List.foreach(List.scala:381)\n\tat scala.collection.TraversableLike$class.map(TraversableLike.scala:234)\n\tat scala.collection.immutable.List.map(List.scala:285)\n\tat geotrellis.spark.buffer.BufferTiles$$anonfun$17.apply(BufferTiles.scala:549)\n\tat geotrellis.spark.buffer.BufferTiles$$anonfun$17.apply(BufferTiles.scala:521)\n\tat scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)\n\nDriver stacktrace:"
  }
}
You can use the following to increase the driver memory:
curl -X POST "localhost:8090/contexts/test-context?launcher.spark.driver.memory=512m"
Valid values are 512m, 1g, 2g, etc.