These are chat archives for nrinaudo/kantan.csv

2nd
May 2017
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:17
@aroberts it's fixed locally (except for joda && java8 instances, which I still need to work on), but I can't make a new release for the moment
it's complicated, but I'm waiting for a new version of tut - I've contributed a few fixes I needed and depend on them in the local version
it shouldn't be long though, I'll keep you posted
but Serializable is a mess.
Andrew Roberts
@aroberts
May 02 2017 19:19
that’s great to hear!
and Serializable … ffffff - tell me about it. pretty much the worst
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:21
the one thing I won't be able to provide a generic fix for is - wait for it - collections
RowDecoder[List[Int]], for example, is not, and won't ever be generically Serializable
Andrew Roberts
@aroberts
May 02 2017 19:22
collections?
huh- how is that represented in csv?
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:22
pretty much anything in scala.collection._
well, your typical CSV row is a List[String] for example
Andrew Roberts
@aroberts
May 02 2017 19:22
does that represent a flexible-width csv of ints?
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:22
for example
or you might only be interested in unique values, so you represent rows as Sets
Andrew Roberts
@aroberts
May 02 2017 19:23
oh, I see
that’s interesting
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:23
what's interesting is that you'd never thought of it before, it says a lot about the way your brain works
everybody thinks of CSV rows as collections first
you must have been using strongly typed languages for ages
(that sounded bad, I meant it as a compliment)
Andrew Roberts
@aroberts
May 02 2017 19:24
actually, no, but I’m very committed to them in my mental model
basically I learned very early how to offload various checks I used to perform in my head to the compiler
now that I have a compiler that can do them
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:25
I'm currently training a node.js team who's transitioning to Scala. That's a leap that they're having a really hard time making
a few things like that - or that runtime exceptions are bad, or that return is evil
Andrew Roberts
@aroberts
May 02 2017 19:26
javascript...
that’s a steep climb
that whole universe is ad-hoc built on top of ad-hoc
I don’t envy you that task :\
what’s the issue with scala collection? they appear to implement serializable...
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:27
well, they're the dev. team I inherited when my company was bought by a larger one and I was made CTO. So it's either train them or learn node.js
oh, the problem is not the collections themselves, but bloody CanBuildFrom
so you have a RowDecoder[List[A]]. This works by creating a new list builder and filling it with the values in the row. In order to create that builder, I need a CanBuildFrom
which is overcomplicated and whose variance makes it a nightmare to reason about, but in this simple use case, is a great tool
it's just - it's not serializable. You need to provide your own instance if you want it to be
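(Editor's sketch) The point above can be illustrated without CanBuildFrom itself. In the sketch below, BuilderFactory, SerializableListFactory, IntListDecoder and SerializationSketch are all hypothetical names invented for illustration, not kantan.csv or standard library types. It only aims to show that a decoder which holds on to a non-serializable factory cannot be Java-serialized, and that supplying your own serializable instance is one way around it.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for CanBuildFrom: it creates builders and is not Serializable.
trait BuilderFactory[A, C] {
  def newBuilder: mutable.Builder[A, C]
}

// A user-provided factory that opts in to serialization.
final class SerializableListFactory extends BuilderFactory[Int, List[Int]] with Serializable {
  def newBuilder: mutable.Builder[Int, List[Int]] = List.newBuilder[Int]
}

// Hypothetical row decoder: Serializable itself, but it keeps a reference to its factory.
final class IntListDecoder(factory: BuilderFactory[Int, List[Int]]) extends Serializable {
  def decode(row: Seq[String]): List[Int] = {
    val builder = factory.newBuilder
    row.foreach(cell => builder += cell.trim.toInt)
    builder.result()
  }
}

object SerializationSketch {
  // A factory created the default way, without Serializable.
  val plainFactory: BuilderFactory[Int, List[Int]] = new BuilderFactory[Int, List[Int]] {
    def newBuilder: mutable.Builder[Int, List[Int]] = List.newBuilder[Int]
  }

  // True if Java serialization accepts the value, false if it throws.
  def canSerialize(value: AnyRef): Boolean =
    Try(new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(value)).isSuccess
}
```

Both decoders decode rows the same way; only the one built with SerializableListFactory is expected to survive canSerialize, because serialization walks every field the decoder references.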
Andrew Roberts
@aroberts
May 02 2017 19:29
gotcha
that’s a shame
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:29
I brought it up with the Scala team, the conclusion is - CanBuildFrom will not make it to the next iteration of the collection API, so there's no point in fixing it now
Andrew Roberts
@aroberts
May 02 2017 19:30
makes sense
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:30
by the way - did you say you were using joda?
Andrew Roberts
@aroberts
May 02 2017 19:30
(at least, I see their side of it)
nope
java8 time, but I deal with weird serialization there
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:30
ah. Was it java 8 date / times then?
right.
Andrew Roberts
@aroberts
May 02 2017 19:30
so I roll my own CellDecoder[ZonedDateTime]
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:31
with a DateTimeFormatter.ofPattern(...) ?
or do you have a weird date pattern that you can't express as a String?
Andrew Roberts
@aroberts
May 02 2017 19:31
a chain of them loaded from configuration, plus a method to handle a ms-from-epoch number
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:32
a chain of them?
oh yes, I remember now
you're the one I wrote CellDecoder.oneOf for, right?
Andrew Roberts
@aroberts
May 02 2017 19:32
no, but that sounds useful :)
hang on, let me get over to that branch
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:33
yeah, CellDecoder and RowDecoder both have a oneOf method that takes any number of decoders and, for each value, will take the first decoder that doesn't fail
it's basically an Either decoder, if Either was recursive
wait
did I just re-invent HList again?
damn you, Miles Sabin
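(Editor's sketch) The "first decoder that doesn't fail" behaviour described above can be sketched in a few lines. This is not kantan.csv's implementation: Decoder, Decoder.from, Decoder.oneOf and OneOfSketch are simplified, hypothetical stand-ins, and errors are plain strings rather than kantan.csv's error types.

```scala
import scala.util.Try

// Hypothetical minimal decoder type, not kantan.csv's actual CellDecoder.
trait Decoder[A] {
  def decode(cell: String): Either[String, A]
}

object Decoder {
  def from[A](f: String => Either[String, A]): Decoder[A] = new Decoder[A] {
    def decode(cell: String): Either[String, A] = f(cell)
  }

  // Tries each decoder in order and keeps the first success.
  // The iterator is lazy, so decoders after the first success are never run.
  def oneOf[A](decoders: Decoder[A]*): Decoder[A] = from[A] { cell =>
    decoders.iterator
      .map(_.decode(cell))
      .find(_.isRight)
      .getOrElse(Left(s"'$cell' could not be decoded by any decoder"))
  }
}

object OneOfSketch {
  // Decimal integers, e.g. "42".
  val decimal: Decoder[Int] =
    Decoder.from(s => Try(s.trim.toInt).toEither.left.map(_ => s"'$s' is not a decimal Int"))

  // Hexadecimal integers with an optional 0x prefix, e.g. "0x2a".
  val hex: Decoder[Int] =
    Decoder.from(s => Try(Integer.parseInt(s.trim.stripPrefix("0x"), 16)).toEither.left.map(_ => s"'$s' is not a hex Int"))

  // Decimal is attempted first, so "42" decodes as forty-two rather than as hex.
  val intDecoder: Decoder[Int] = Decoder.oneOf(decimal, hex)
}
```

Order matters in this sketch: a value that several decoders accept is always claimed by the earliest one in the argument list.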
Andrew Roberts
@aroberts
May 02 2017 19:34
  implicit val decodeZdt: CellDecoder[ZonedDateTime] = CellDecoder.from(since => {
    val formattedAttempts = Config.parsing.timestampFormats.iterator.map { f =>
      Try(LocalDateTime.parse(since, DateTimeFormatter.ofPattern(f)))
    }

    DecodeResult.fromOption(
      LocalDateTimeExtensions.parseEpochTimestamp(since).toOption
        .orElse(formattedAttempts.collectFirst { case Success(local) => local })
        .map(_.atZone(UTCZoneId)),
      TypeError(s"Couldn't parse `$since` as UTC date time")
    )
  })
haha, hlist really is a nice tool
still, though, for things that are much narrower, I think having a specific syntax that enables common operations (like oneOf/orElse etc) is really nice, and it does make things much more readable for someone picking up the code for the first time
onboarding a new hire with shapeless’s type algebra is always kind of a trip
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:36
yeah, they do, it's just annoying to realise that whenever I'm trying to make something generic, I end up re-inventing half of shapeless, only not as good
right, so in the code you just pasted, I think you can replace parts by:
val decoder = CellDecoder.oneOf(timestampFormats.map(f => zonedDateTimeDecoder(DateTimeFormatter.ofPattern(f))): _*)
and then just map on the result to apply your epoch magic
Andrew Roberts
@aroberts
May 02 2017 19:39
yeah, it looks like it
(ish, at least)
what does the failure case look like?
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:40
do you mean the error message that will go in the TypeError ?
Andrew Roberts
@aroberts
May 02 2017 19:40
yea
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:41
s"'$s' is not a valid $typeName"
Andrew Roberts
@aroberts
May 02 2017 19:42
good enough
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:42
where:
  • s is the input value
  • typeName is, in your case, ZonedDateTime
mmm hang on, no, it won't work
you need the cell value in your epoch magic, and if you just map on the CellDecoder.oneOf result, you won't have it anymore
Andrew Roberts
@aroberts
May 02 2017 19:44
I could add it to the collection though
CellDecoder.oneOf((epochCellDecoder +: formats.map(…)): _*)
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:44
oh, that's a final attempt, not something you do on a success, sorry!
Andrew Roberts
@aroberts
May 02 2017 19:44
yeah
a primary attempt, but the result is the same
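(Editor's sketch) The idea of putting the epoch attempt first in the same collection as the format-based attempts could look roughly like this, using only java.time and no kantan.csv types. TimestampSketch, timestampFormats, epochMillis and formatted are hypothetical names, and the two patterns are made-up stand-ins for whatever the real configuration holds.

```scala
import java.time.{Instant, LocalDateTime, ZoneOffset, ZonedDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

object TimestampSketch {
  // Hypothetical stand-in for the configured list of patterns.
  val timestampFormats: List[String] =
    List("yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd'T'HH:mm:ss")

  type Decode = String => Option[ZonedDateTime]

  // Primary attempt: the cell is a milliseconds-since-epoch number.
  val epochMillis: Decode = cell =>
    Try(Instant.ofEpochMilli(cell.trim.toLong).atZone(ZoneOffset.UTC)).toOption

  // Fallback attempts: the cell matches one of the configured patterns, read as UTC.
  def formatted(pattern: String): Decode = {
    val formatter = DateTimeFormatter.ofPattern(pattern)
    cell => Try(LocalDateTime.parse(cell, formatter).atZone(ZoneOffset.UTC)).toOption
  }

  // Epoch first, then every configured format, all in one ordered list.
  val attempts: List[Decode] = epochMillis :: timestampFormats.map(formatted)

  // Returns the first successful attempt, or an error message if all of them fail.
  def decode(cell: String): Either[String, ZonedDateTime] =
    attempts.iterator
      .map(attempt => attempt(cell))
      .collectFirst { case Some(zdt) => zdt }
      .toRight(s"Couldn't parse `$cell` as UTC date time")
}
```

Because each attempt receives the raw cell, the epoch parser no longer needs to be bolted on after the format-based decoders have already consumed the value.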
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:45
yeah, that works and removes a fair amount of boilerplate
I have a question though
how the !@# is that Serializable?
Andrew Roberts
@aroberts
May 02 2017 19:45
haha
ours is not to reason why
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:47
I'm guessing it breaks down if you happen to have different versions of Config - the decoder will depend on the local version of Config, which is, strictly speaking, not the expected behaviour
but who cares, it appeases the angry Spark gods
Andrew Roberts
@aroberts
May 02 2017 19:51
flink
but yeah
Nicolas Rinaudo
@nrinaudo
May 02 2017 19:51
right