These are chat archives for thunder-project/thunder

24th
Sep 2015
timberonce
@timberonce
Sep 24 2015 02:00
@PhC-PhD thanks! Another question: How can I build thunder from the source code?
alexandrelaborde
@AlexandreLaborde
Sep 24 2015 10:20
@PhC-PhD this is what I get when I load the fish series
data = tsc.loadExample('fish-series')

15/09/24 11:16:06 INFO MemoryStore: ensureFreeSpace(288336) called with curMem=0, maxMem=556038881
15/09/24 11:16:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 281.6 KB, free 530.0 MB)
15/09/24 11:16:07 INFO MemoryStore: ensureFreeSpace(19020) called with curMem=288336, maxMem=556038881
15/09/24 11:16:07 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 18.6 KB, free 530.0 MB)
15/09/24 11:16:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:58898 (size: 18.6 KB, free: 530.3 MB)
15/09/24 11:16:07 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at PythonRDD.scala:512
15/09/24 11:16:07 INFO MemoryStore: ensureFreeSpace(154568) called with curMem=307356, maxMem=556038881
15/09/24 11:16:07 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 150.9 KB, free 529.8 MB)
15/09/24 11:16:07 INFO MemoryStore: ensureFreeSpace(18980) called with curMem=461924, maxMem=556038881
15/09/24 11:16:07 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 18.5 KB, free 529.8 MB)
15/09/24 11:16:07 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:58898 (size: 18.5 KB, free: 530.2 MB)
15/09/24 11:16:07 INFO SparkContext: Created broadcast 1 from broadcast at PythonRDD.scala:469
15/09/24 11:16:07 INFO FileInputFormat: Total input paths to process : 1
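(The INFO lines above are routine Spark output, not errors. If the verbosity gets in the way, the stock log4j template that ships with Spark 1.x can be dialed down — a minimal sketch, assuming the default conf/log4j.properties.template:

```properties
# conf/log4j.properties (copied from conf/log4j.properties.template)
# Change the root logger from INFO to WARN to silence routine startup output
log4j.rootCategory=WARN, console
```

)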
alexandrelaborde
@AlexandreLaborde
Sep 24 2015 10:55

On a somewhat related note, I am trying to connect a worker in a VM to my cluster, which is already running, but I keep getting this error:

15/09/24 11:49:18 INFO Worker: Connecting to master 10.40.11.145:7077...
15/09/24 11:49:19 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@10.40.11.145:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
15/09/24 11:49:30 INFO Worker: Retrying connection to master (attempt # 1)

I searched online and this type of error appears to be related to incorrect heartbeat times... do you have any idea what this is?

alexandrelaborde
@AlexandreLaborde
Sep 24 2015 14:19
this is what my master is reporting
15/09/24 15:12:39 INFO Worker: Successfully registered with master spark://10.40.11.145:7077
15/09/24 15:13:12 INFO Master: Registering worker 10.40.11.152:60669 with 8 cores, 14.6 GB RAM
15/09/24 15:13:57 INFO Master: Registering worker 10.40.11.43:43224 with 8 cores, 14.6 GB RAM
15/09/24 15:14:25 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@10.40.11.114:45367] has failed, address is now gated for [5000] ms. Reason is: [org.apache.spark.deploy.DeployMessages$RegisterWorker; local class incompatible: stream classdesc serialVersionUID = -7760370200516123021, local class serialVersionUID = -8324752651840721060].
15/09/24 15:14:25 INFO Master: akka.tcp://sparkWorker@10.40.11.114:45367 got disassociated, removing it.
15/09/24 15:14:38 ERROR Remoting: org.apache.spark.deploy.DeployMessages$RegisterWorker; local class incompatible: stream classdesc serialVersionUID = -7760370200516123021, local class serialVersionUID = -8324752651840721060
java.io.InvalidClassException: org.apache.spark.deploy.DeployMessages$RegisterWorker; local class incompatible: stream classdesc serialVersionUID = -7760370200516123021, local class serialVersionUID = -8324752651840721060
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
    at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
    at scala.util.Try$.apply(Try.scala:161)
    at akka.serialization.Serialization.deserialize(Serialization.scala:98)
    at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
    at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:58)
    at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:76)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:937)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

I built a cluster with the master at 10.40.11.145, a slave on 152, and another on 43; now I am trying to add my VM worker, which is on 114...
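The InvalidClassException above carries the diagnosis in its two serialVersionUID values: the remote (stream) and local JVMs hold different builds of the RegisterWorker message class, which in practice almost always means different Spark versions on master and worker. As an illustration only (the regex and variable names here are mine, not Spark's), the mismatch can be pulled straight out of such a log line:

```python
import re

# Example log line, taken verbatim from the trace above
line = ("java.io.InvalidClassException: org.apache.spark.deploy.DeployMessages$RegisterWorker; "
        "local class incompatible: stream classdesc serialVersionUID = -7760370200516123021, "
        "local class serialVersionUID = -8324752651840721060")

m = re.search(r"stream classdesc serialVersionUID = (-?\d+), "
              r"local class serialVersionUID = (-?\d+)", line)
stream_uid, local_uid = (int(g) for g in m.groups())

# A mismatch means the sender and receiver were compiled from different
# versions of the class, i.e. mismatched Spark builds across the cluster.
if stream_uid != local_uid:
    print("serialVersionUID mismatch: remote %d vs local %d" % (stream_uid, local_uid))
```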
alexandrelaborde
@AlexandreLaborde
Sep 24 2015 15:22
Solved the problem... it looks like Spark is not that backwards compatible: my cluster was built on Spark 1.4.1 and my VM had Spark 1.5.0. I used the old version and it worked.
alexandrelaborde
@AlexandreLaborde
Sep 24 2015 21:45
@freeman-lab how can I increase the amount of memory Thunder uses?
Jeremy Freeman
@freeman-lab
Sep 24 2015 21:48
@seethakris thanks for the info! that’s bizarre, though it looks from this http://stackoverflow.com/questions/31058504/spark-1-4-increase-maxresultsize-memory that you’re not alone (maybe that’s where you saw the trick)
Jeremy Freeman
@freeman-lab
Sep 24 2015 21:56
@AlexandreLaborde glad you figured it out, using different versions across different parts of a Spark deployment can definitely be a problem
by VM you mean the driver?
they should be the same
and I’m not totally sure what you mean by increasing the memory? it’ll all depend on Spark’s memory settings, of which unfortunately there are many (see all parameters here http://spark.apache.org/docs/1.4.1/configuration.html)
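(For reference, the memory knobs that usually matter here are ordinary Spark configuration parameters rather than anything Thunder-specific. A minimal sketch assembling the equivalent spark-submit flags — the keys are standard Spark 1.x settings, but the 4g/2g values are placeholders, not recommendations:

```python
# Standard Spark memory-related configuration keys (values are placeholders)
mem_conf = {
    "spark.driver.memory": "4g",         # heap for the driver process
    "spark.executor.memory": "4g",       # heap per executor
    "spark.driver.maxResultSize": "2g",  # cap on results collected to the driver
}

# Assemble the equivalent spark-submit command-line flags
flags = " ".join("--conf %s=%s" % kv for kv in sorted(mem_conf.items()))
print(flags)
```

)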
alexandrelaborde
@AlexandreLaborde
Sep 24 2015 22:22
@freeman-lab I was assuming I could not load the fish series due to some memory limitation, but since my VM has 5GB of RAM that cannot be the problem, so I thought maybe it is Thunder itself that doesn't have enough memory.
I am just trying to do the PCA example
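(For what it's worth, on data this small the computation itself is tiny; memory trouble is far more likely to come from Spark's settings than from the algorithm. As a plain-Python illustration of what PCA extracts — toy 2-D data, no Spark or Thunder involved — the leading principal direction of a point cloud can be computed in closed form:

```python
import math

# Toy 2-D point cloud; PCA finds the direction of maximal variance.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
          (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1),
          (1.5, 1.6), (1.1, 0.9)]

n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n

# Sample covariance matrix entries
sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

# Largest eigenvalue of the 2x2 symmetric covariance matrix (quadratic formula)
tr, det = sxx + syy, sxx * syy - sxy ** 2
lam = tr / 2 + math.sqrt(tr ** 2 / 4 - det)

# Corresponding eigenvector, normalised: the first principal component
vx, vy = sxy, lam - sxx
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)
print("leading eigenvalue %.4f, pc1 = (%.4f, %.4f)" % (lam, pc1[0], pc1[1]))
```

)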