These are chat archives for typelevel/cats

24th
Mar 2015
Pascal Voitot
@mandubian
Mar 24 2015 07:24
I hope this weekend I'll have a few minutes to tackle free reorg in cats... so busy lately...
Mike (stew) O'Connor
@stew
Mar 24 2015 07:25
sounds good, let me know if I can help
i've also been way busier than I would like lately, and that is unfortunately probably the sign of a medium term trend
I've been thinking that we should probably push for getting a snapshot released to get some fresh life into the project
Pascal Voitot
@mandubian
Mar 24 2015 07:27
that might be a solution yes...
we shouldn't lose motivation... maybe it was too much too fast at the beginning ;)
fast & furious cats :cat:
Adelbert Chang
@adelbertc
Mar 24 2015 08:13
what are people's thoughts on non/cats#260 ? if people are not completely opposed i may spend some time tacking Serializable onto some stuff
also would a snapshot of cats also involve a snapshot of algebra? @non
Mike (stew) O'Connor
@stew
Mar 24 2015 08:20
@adelbertc when this came up during a google hangout there were no objections to us trying to make this as friendly as possible for spark users WRT serializable
stew @stew -> bed
Adelbert Chang
@adelbertc
Mar 24 2015 08:25
@stew awesome possum - a serializable PR coming soon to a github near you
Miles Sabin
@milessabin
Mar 24 2015 08:37
@adelbertc @stew I'm keen to make shapeless Spark-friendly too. What's the recipe? Sprinkle Serializable everywhere?
Miles Sabin
@milessabin
Mar 24 2015 11:15
Anyone?
I'm reluctant to start slapping Serializable on everything if there's any sort of viable alternative ... what are the options for Spark?
Ben Hutchison
@benhutchison
Mar 24 2015 11:46
@milessabin Comes with java and kryo support, but ultimately pluggable via this interface http://spark.apache.org/docs/1.2.1/api/scala/index.html#org.apache.spark.serializer.SerializerInstance
Miles Sabin
@milessabin
Mar 24 2015 11:48
So what's the benefit of @adelbertc's PR for #260?
Wouldn't we be better off providing some sort of type class based mechanism (generally, not just for Cats or shapeless) and providing a corresponding Serializer implementation?
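For concreteness, a hypothetical sketch of the kind of type-class-based mechanism being suggested; none of these names exist in cats or shapeless, and a Spark Serializer implementation would then dispatch to such instances instead of relying on java.io.Serializable:

```scala
import java.nio.ByteBuffer

// Hypothetical codec type class: describes how to turn a value into bytes
// and back, per type, without that type having to extend Serializable.
trait ByteCodec[A] {
  def encode(a: A): ByteBuffer
  def decode(bytes: ByteBuffer): A
}

object ByteCodec {
  // Example instance for String, just to show the shape of the encoding.
  implicit val stringCodec: ByteCodec[String] = new ByteCodec[String] {
    def encode(a: String): ByteBuffer =
      ByteBuffer.wrap(a.getBytes("UTF-8"))
    def decode(bytes: ByteBuffer): String = {
      val arr = new Array[Byte](bytes.remaining())
      bytes.get(arr)
      new String(arr, "UTF-8")
    }
  }
}
```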
Ben Hutchison
@benhutchison
Mar 24 2015 11:58
Imo, that's the elegant approach. Implementing Serializable is convenient and will interop with java libs well, but smells a bit.
Miles Sabin
@milessabin
Mar 24 2015 11:59
It's also a lot of busywork, and I think breaks binary compatibility?
If I do do it for shapeless, I want to be confident that it really is a desirable option.
Ben Hutchison
@benhutchison
Mar 24 2015 12:10
Depends so much on individual usage scenarios. If someone's putting lots of preexisting java classes into spark, they'll probably want Serializable, and typeclasses will feel like hard work. OTOH in a strongly typed fp style system typeclasses will be easy. Spark wants one Serializer across the cluster, i think, so can't mix and match styles. Hence Serializable as lowest common denominator.
Miles Sabin
@milessabin
Mar 24 2015 13:25
Yay for "standards"! :-(
Mike (stew) O'Connor
@stew
Mar 24 2015 14:35
is it possible to use some kind of serialization other than Serializable with spark?
Mike (stew) O'Connor
@stew
Mar 24 2015 14:45
oh wow, ok
Miles Sabin
@milessabin
Mar 24 2015 15:43
So what should we do? I'm still not clear what the general consensus is?
Pascal Voitot
@mandubian
Mar 24 2015 15:47
let's also remember that Spark is still on scala 2.10.x ... :(
Erik Osheim
@non
Mar 24 2015 16:19
@milessabin so -- in algebra and spire (and maybe in cats?) i think i will just suck it up and extend Serializable.
(we are already doing this in algebra, fwiw)
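Roughly what that looks like in practice — a sketch assuming a cats/algebra-style type class, with illustrative names rather than the actual algebra source:

```scala
// Making the type class trait itself extend Serializable means every
// instance (usually a val or object) can be shipped inside a Spark closure
// via plain Java serialization.
trait Semigroup[A] extends Serializable {
  def combine(x: A, y: A): A
}

object Semigroup {
  // The anonymous class inherits Serializable from the trait, so a closure
  // that captures this instance serializes without further changes.
  implicit val intAdditionSemigroup: Semigroup[Int] =
    new Semigroup[Int] {
      def combine(x: Int, y: Int): Int = x + y
    }
}
```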
Miles Sabin
@milessabin
Mar 24 2015 16:20
For every type class and instance?
Erik Osheim
@non
Mar 24 2015 16:20
yes.
Miles Sabin
@milessabin
Mar 24 2015 16:20
Ouch :-(
Erik Osheim
@non
Mar 24 2015 16:20
yeah it's not great.
unfortunately i don't think anyone who hits serialization errors will ask why hadoop/scalding/spark are broken
rather they will ask why algebra/spire are broken
in other broken news: our Comonad[OneAnd[?, F]] instance appears wrong
Miles Sabin
@milessabin
Mar 24 2015 16:21
Is adding extends Serializable binary compatible?
Erik Osheim
@non
Mar 24 2015 16:22
i'm not sure about that.
in algebra's case there is no prior release.
Rodolfo Hansen
@kryptt
Mar 24 2015 16:24
@milessabin I would expect it to be, yes. The jvm treats it as a tag when it starts to serialize object graphs, I don't think much bytecode gets transfigured (not 100% sure though)
Erik Osheim
@non
Mar 24 2015 16:24
(to follow up on my claim about OneAnd -- the definition of coflatMap just drops the .tail on the floor, which is definitely wrong.)
Mike (stew) O'Connor
@stew
Mar 24 2015 16:25
does anyone ever actually use coflatmap for something other than "let me show you what you might use coflatmap for"
Erik Osheim
@non
Mar 24 2015 16:25
haha i'm not sure
i was just comparing our definition to haskell's for nonempty
as you can see, i think the "right" thing to do is to take the tail apart and recursively evaluate f on all the tails
which would probably require a Comonad[F] in this case
my first instinct is to just remove the instance for now.
we can worry about it later. but i'm happy to try to fix it up if that seems useful
(i noticed it while i was adding a reducible instance)
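For concreteness, a minimal sketch of the "visit every tail" behaviour being described, mirroring Haskell's Comonad instance for NonEmpty; NEL here is a stand-in type, not the cats NonEmptyList:

```scala
object NelComonadSketch {
  final case class NEL[A](head: A, tail: List[A]) {
    def toList: List[A] = head :: tail
  }

  // coflatMap applies f to the whole value and then recursively to every
  // non-empty tail, instead of discarding the tail.
  def coflatMap[A, B](fa: NEL[A])(f: NEL[A] => B): NEL[B] =
    NEL(
      f(fa),
      fa.tail match {
        case Nil    => Nil
        case h :: t => coflatMap(NEL(h, t))(f).toList
      }
    )

  // e.g. coflatMap(NEL(1, List(2, 3)))(_.toList.sum) == NEL(6, List(5, 3)),
  // whereas a definition that drops the tail would return NEL(6, Nil).
}
```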
Miles Sabin
@milessabin
Mar 24 2015 16:41
I've sent out a "help wanted" call for Serializable for shapeless. @non ... if you've got any specific advice based on your experience with Algebra I'd very much appreciate it if you could comment on the ticket here: milessabin/shapeless#343
Erik Osheim
@non
Mar 24 2015 16:42
@milessabin ok -- i'll try to comment today after work. we have some tests in algebra that were added to catch serialization problems.
Rob Norris
@tpolecat
Mar 24 2015 16:46
@stew i have used it once or twice with trees ... it "visits" every subtree, which is useful from time to time
but it's definitely something i get really excited about whenever i do it, so pretty rare
Mike (stew) O'Connor
@stew
Mar 24 2015 16:46
well i don't want to take that excitement away from you
but you should now write up some examples :)
Erik Osheim
@non
Mar 24 2015 16:47
looking into it a bit more, i'm going to just remove the Comonad[OneAnd[...]] instance for now. it's impossible to implement correctly unless you have a way of turning an F[A] into an Either[A, (A, F[A])]
which would require a bit more machinery than we have now.
(ironically, we have this for all the concrete uses of OneAnd, e.g. Nel, just not the generalization.)
Miles Sabin
@milessabin
Mar 24 2015 16:48
@non I didn't see @SerialVersionUID in Algebra ... intentional?
Erik Osheim
@non
Mar 24 2015 16:49
we probably still need to add them
Miles Sabin
@milessabin
Mar 24 2015 16:50
Damn ... that was the biggest part of the chore, IIRC :-(
Erik Osheim
@non
Mar 24 2015 16:51
yeah... :/
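For reference, a sketch of what that chore looks like on a single class; the class here is illustrative only:

```scala
// Pinning an explicit id keeps recompilation from silently changing the
// JVM-generated serialVersionUID and breaking deserialization across
// library versions.
@SerialVersionUID(1L)
final class IntAdditionMonoid extends Serializable {
  def empty: Int = 0
  def combine(x: Int, y: Int): Int = x + y
}
```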
Adelbert Chang
@adelbertc
Mar 24 2015 17:32
@milessabin @stew @non @benhutchison just opened my laptop, playing catch-up with the Serializable stuff
not sure how it would be used though..
Adelbert Chang
@adelbertc
Mar 24 2015 17:37
org.apache.spark.serializer.SerializerInstance seems to serve a different purpose.. https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/serializer/Serializer.scala
@benhutchison have you been using the SerializerInstance stuff to get around Serializable issues with types that don't <: Serializable ?
Adelbert Chang
@adelbertc
Mar 24 2015 18:14
i would love a type class based approach
Adelbert Chang
@adelbertc
Mar 24 2015 18:27
oooo http://spark.apache.org/docs/latest/configuration.html#compression-and-serialization the conf value spark.serializer lets you specify an alternative Serializer, seems like we could do a type class-y approach. i also maybe smell some scodec. can you confirm @benhutchison
Ben Hutchison
@benhutchison
Mar 24 2015 20:44
@adelbertc that's my understanding, but i haven't plugged in a custom serializer myself
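For concreteness, a sketch of the configuration hook being discussed; the SparkConf API and the spark.serializer / Kryo settings are standard Spark, the app name is just a placeholder:

```scala
import org.apache.spark.SparkConf

// spark.serializer selects the Serializer used for shuffle and cached data;
// Kryo is the stock alternative to Java serialization, and a custom
// implementation can be named the same way.
val conf = new SparkConf()
  .setAppName("serialization-demo")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // optional: fail fast when a class hasn't been registered with Kryo
  .set("spark.kryo.registrationRequired", "true")
```

Note that, at least in the Spark 1.x line, closures passed to RDD operations still go through the Java closure serializer regardless of this setting, which is why Serializable on instances keeps coming up.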
Adelbert Chang
@adelbertc
Mar 24 2015 20:56
though now i'm not sure how you enforce that everything inside a block passed to, say, RDD#map is serializable.. bah
Miles Sabin
@milessabin
Mar 24 2015 21:18
@adelbertc like I mentioned a while back ... my experience with Spark a couple of years ago was that the main problem wasn't serialization per se, but accidental capture of non-serializable things ... which is exactly what Spores is intended to address.
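A minimal sketch of that accidental-capture failure mode; the class and fields are made up for illustration:

```scala
import org.apache.spark.rdd.RDD

// The data in the RDD is fine; the problem is what the closure drags along.
class ReportBuilder(log: java.io.PrintWriter) { // PrintWriter is not Serializable
  val prefix = "row: "

  def label(rdd: RDD[String]): RDD[String] =
    // Referencing the field captures `this` (and the PrintWriter with it),
    // so Spark fails with NotSerializableException when shipping the task.
    rdd.map(line => prefix + line)

  def labelSafely(rdd: RDD[String]): RDD[String] = {
    // Copying the field into a local val keeps `this` out of the closure.
    val p = prefix
    rdd.map(line => p + line)
  }
}
```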
Adelbert Chang
@adelbertc
Mar 24 2015 22:03
@milessabin yeah makes sense, i pinged heather & spark folks on twitter
Pascal Voitot
@mandubian
Mar 24 2015 22:35
spores & pickling will be cool in the future but as long as Spark hasn't migrated to scala 2.11, we still have to find other solutions