These are chat archives for nrinaudo/kantan.csv

27th
Jan 2017
Andrew Roberts
@aroberts
Jan 27 2017 03:43
@nrinaudo that syntax works for me - is it documented anywhere? I didn’t see initially how I could use that to replace the anonymous function. thanks!
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 08:39
obviously I need to rethink my documentation though: you're not the first person I know who has read the docs but not found the information they were looking for
Andrew Roberts
@aroberts
Jan 27 2017 15:21
ah, thank you. If it helps, I was reading the “rows as arbitrary types” section
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 15:22
it's in there as well :)
implicit val carDecoder = RowDecoder.decoder(0, 1, 2, 3, 4) { (y: Int, m: String, mo: String, d: Option[String], p: Float) ⇒
  new Car(y, m, mo, d, p)
}
ah, but I see. I need to fix that.
thanks!
Andrew Roberts
@aroberts
Jan 27 2017 15:22
yeah, I found that, but that’s - yeah
no problem!
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 15:48
Note that the syntax will not be great, since I can't turn a constructor into a lambda
implicit val carDecoder = RowDecoder.decoder(0, 1, 2, 3, 4)(new Car(_, _, _, _, _))
Although in this instance it'll actually be:
implicit val carDecoder = RowDecoder.ordered(new Car(_, _, _, _, _))
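A minimal sketch of the difference between the two syntaxes, in plain Scala with no kantan.csv dependency. The `Car` fields and the `Build` alias are assumptions, guessed from the decoder arguments above: a case class companion's `apply` is a method and eta-expands to a function, whereas a bare constructor has to be wrapped with placeholder syntax.

```scala
object ConstructorAsFunction {
  // Hypothetical Car shape, guessed from the decoder arguments quoted in the chat.
  final case class Car(year: Int, make: String, model: String, desc: Option[String], price: Float)

  // Illustrative alias for the function shape a 5-column decoder would expect.
  type Build = (Int, String, String, Option[String], Float) => Car

  // The companion's apply method eta-expands to a function value.
  val viaApply: Build = Car.apply

  // A constructor is not a method value, so it needs the placeholder lambda.
  val viaNew: Build = new Car(_, _, _, _, _)
}
```

Both values build the same `Car`; only the call-site syntax differs.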
Andrew Roberts
@aroberts
Jan 27 2017 16:28
the apply solution is perfect for my use case, anyhow
alright @nrinaudo next hurdle for this weird problem. I am dealing with fairly heterogeneous CSV lines. I have an ADT (sealed trait + case classes) representing each case, and RowDecoder instances for each. I want to make a RowDecoder instance that matches against a specific index, and then based on the string at that index uses a more specific decoder
basically a discriminator type pattern, but in csv
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 18:24
You don't go for the easy issues, do you :smile:
I'll need to write some sample code for that, but it'll be a bit nasty. I'm away from my computer for a few hours, but I'm hoping to get something together tonight.
(Essentially, you need to write a bespoke row decoder - no fancy helper for that, you'll need to get your hands dirty)
Andrew Roberts
@aroberts
Jan 27 2017 19:09
No problem
I’m most of the way there:
  def discriminatorRowDecoder[A, C, T <: A](index: Int)(discrimator: PartialFunction[C, RowDecoder[T]])
    (implicit cd: CellDecoder[C]): RowDecoder[A] = RowDecoder.from(input => {
      for {
        data <- input.lift(index).map(cd.decode).getOrElse(DecodeResult.outOfBounds(index))
        discriminated <- discrimator(data).decode(input)
      } yield discriminated
    })
haven’t handled discriminator not being defined at data yet, but I believe the issue is stemming from the PartialFunction being diverse RowDecoder[T] instances
Andrew Roberts
@aroberts
Jan 27 2017 19:18
looks like result has the variance I need to get this done
Andrew Roberts
@aroberts
Jan 27 2017 19:50
second attempt, with bonus typos fixed
  def discriminatorRowDecoder[C, A](index: Int)(discriminator: PartialFunction[C, Seq[String] => DecodeResult[A]])
    (implicit cd: CellDecoder[C]): RowDecoder[A] = RowDecoder.from(input => {
    for {
      data <- input.lift(index).map(cd.decode).getOrElse(DecodeResult.outOfBounds(index))
      discriminated <- discriminator.applyOrElse(data, _ => _ => TypeError(s"Couldn't decode discriminator: $data"))(input)
    } yield discriminated
  })
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:18
oh, fancy, you're trying to make it into a generic combinator :)
a few comments, if I may
the return result of your discriminator is a Seq[String] => DecodeResult[A]. That's an awful lot like a RowDecoder[A]
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:24
the point of this would be to let you write your discriminator like this:
discriminatorRowDecoder(0) {
  case 1 => RowDecoder[Type1]
  case 2 => RowDecoder[Type2]
}
Andrew Roberts
@aroberts
Jan 27 2017 20:24
@nrinaudo I started there, but RowDecoder is invariant
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:25
right.
Andrew Roberts
@aroberts
Jan 27 2017 20:25
in the first paste above, discriminator is C => RowDecoder[T], but I had trouble getting the compiler to take it with different values of T (despite all Ts being subtypes of A)
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:26
I guess I need to sit down and actually play with this
thanks, that's an interesting use case. And I appreciate you trying to find a generic, reusable solution rather than a one-off for your specific problem
Andrew Roberts
@aroberts
Jan 27 2017 20:27
my pleasure! I’m interested to see where this goes :)
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:28
a small stylistic comment: it's quite common nowadays with type classes to use context bounds and instance summoning methods, rather than named instances
in plain english, this is usually the preferred syntax:
def decode[A: CellDecoder](cell: String) = CellDecoder[A].decode(cell)
Andrew Roberts
@aroberts
Jan 27 2017 20:29
ah, interesting, I didn’t know CD provided those summoners
I prefer that as well
I’m trying again with the RowDecoder approach
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:30
they don't, type classes usually do though :)
this is an apply method on CellDecoder's companion object
a sort of specialised implicitly method
it so happens that in kantan libs, this is backed by a macro, so CellDecoder[A] is replaced, at compile time, by the actual instance
(by which I mean there is no runtime cost to this)
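A rough sketch of the summoner pattern described above, using a toy type class. `Parser`, `parseNamed`, and `parseBound` are hypothetical names, not kantan.csv's API, and this plain `apply` does an ordinary implicit lookup rather than the macro mentioned in the chat.

```scala
object SummonerSketch {
  import scala.util.Try

  // Toy stand-in for a cell-decoding type class; not kantan.csv's real definition.
  trait Parser[A] {
    def parse(cell: String): Either[String, A]
  }

  object Parser {
    // Summoner: Parser[A] returns whichever implicit instance is in scope.
    def apply[A](implicit ev: Parser[A]): Parser[A] = ev

    implicit val intParser: Parser[Int] = new Parser[Int] {
      def parse(cell: String): Either[String, Int] =
        Try(cell.trim.toInt).toOption.toRight(s"Not an Int: $cell")
    }
  }

  // Named-instance style: the implicit parameter gets an explicit name.
  def parseNamed[A](cell: String)(implicit p: Parser[A]): Either[String, A] =
    p.parse(cell)

  // Context-bound style: the instance is anonymous and summoned via Parser[A].
  def parseBound[A: Parser](cell: String): Either[String, A] =
    Parser[A].parse(cell)
}
```

The two methods behave identically; the context-bound version just keeps the signature shorter.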
Andrew Roberts
@aroberts
Jan 27 2017 20:40
yeah, found that
I somehow blew up my code
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:41
I think I might need a bit more to go on, I'm not quite sure what you mean by that
Andrew Roberts
@aroberts
Jan 27 2017 20:43
sorry, just explaining my attention. connected to some other changes I made, the compiler is suddenly unhappy with the applyOrElse params I am passing. I can probably figure this one out, but …ugh
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:48
so I just realised that Decoder doesn't have a flatMap method, which would actually be useful in your case...
this is what I put together (not a generic case yet), but unfortunately, it requires turning Decoder into a proper monad:
import kantan.csv._

object Test extends App {
  val input = "true,1,2\nfalse,foo,bar"

  sealed abstract class Foo extends Product with Serializable
  final case class Ints(a: Int, b: Int) extends Foo
  final case class Strings(a: String, b: String) extends Foo

  implicit val intsRowDecoder: RowDecoder[Ints] = RowDecoder.decoder(1, 2)(Ints.apply _)
  implicit val stringsRowDecoder: RowDecoder[Strings] = RowDecoder.decoder(1, 2)(Strings.apply _)

  implicit val fooDecoder: RowDecoder[Foo] = RowDecoder.from { row =>
    RowDecoder.decoder(0)(identity[Boolean] _).flatMap { b =>
      if(b) RowDecoder[Ints].decode(row)
      else  RowDecoder[Strings].decode(row)
    }
  }

}
oh well, Result is, might as well use that. Let me work on this a bit more
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 20:55
alright, so first, the working, non generic solution for your use case, just to get you unstuck: http://scastie.org/25508
I might need to make a better combinator for this: RowDecoder.decoder(0)(identity[Boolean] _)
something like RowDecoder.field[Boolean](0)
Andrew Roberts
@aroberts
Jan 27 2017 20:59
hmm
why does identity need the type param?
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:00
so that RowDecoder.decoder knows what it's decoding to in order to pick an instance of CellDecoder
Andrew Roberts
@aroberts
Jan 27 2017 21:00
ah
yes
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:00
but if you look at my code closely, it's pretty much a non-generic version of your PartialFunction solution
Andrew Roberts
@aroberts
Jan 27 2017 21:00
right
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:01
and I'm reaching the same conclusion - Decoder is not covariant, so we're pretty much screwed
this is annoying.
and this is all due to Scala's annoying encoding of ADTs with subtyping !@#
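The variance problem can be reproduced without kantan.csv. This is a sketch under assumptions: `Inv` is a hypothetical invariant decoder and `widen` is an illustrative helper, neither taken from the library. The decoder itself does not widen from `Ints` to `Foo`, but the `Either` it returns does, because `Either` is covariant.

```scala
object VarianceSketch {
  import scala.util.Try

  sealed trait Foo
  final case class Ints(a: Int, b: Int) extends Foo
  final case class Strings(a: String, b: String) extends Foo

  // Invariant toy decoder: an Inv[Ints] is NOT an Inv[Foo].
  trait Inv[A] { self =>
    def decode(row: Seq[String]): Either[String, A]

    // Explicit widening; it type-checks because Either is covariant in its result.
    def widen[B >: A]: Inv[B] = new Inv[B] {
      def decode(row: Seq[String]): Either[String, B] = self.decode(row)
    }
  }

  val ints: Inv[Ints] = new Inv[Ints] {
    def decode(row: Seq[String]): Either[String, Ints] = row match {
      case Seq(a, b) => Try(Ints(a.toInt, b.toInt)).toOption.toRight(s"Not two ints: $row")
      case _         => Left(s"Expected two cells, got ${row.length}")
    }
  }

  // val bad: Inv[Foo] = ints          // rejected by the compiler: Inv is invariant
  val ok: Inv[Foo] = ints.widen[Foo]   // accepted: widened explicitly

  // The decoded result widens for free.
  val result: Either[String, Foo] = ints.decode(Seq("1", "2"))
}
```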
Andrew Roberts
@aroberts
Jan 27 2017 21:03
agreed - I love generic solutions and I continually run up against this variance problem - just finished hitting this wall with Finch actually
yes, 100%
not sure the JVM can represent it any other way, though, unfortunately
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:03
yeah, when working with ADTs and type classes, this is something you'll unfortunately often encounter
sure, but the compiler could represent it differently before generating the bytecode, and deal with the corresponding variance issues without bothering us with them
Andrew Roberts
@aroberts
Jan 27 2017 21:04
hah, true
yeah that would be great
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:04
in my example, Ints and Strings are not types - they're values. They're not a subtype of Foo. The compiler could understand that and not mess our reasoning up
so I'll think a bit more on your issue, but it seems like your second solution is probably our best shot
Andrew Roberts
@aroberts
Jan 27 2017 21:06
ok
if you’re still poking at this: does this compile for you?
  type Decode[A] = Seq[String] => DecodeResult[A]

  private def invalidDiscriminator[C](data: C) = RowDecoder.from(_ => DecodeResult.typeError(s"Couldn't decode discriminator: $data"))
  def discriminatorRowDecoder[C: CellDecoder, A](index: Int)(discriminator: PartialFunction[C, Decode[A]]): RowDecoder[A] =
    RowDecoder.from(input => for {
      data <- input.lift(index).map(CellDecoder[C].decode).getOrElse(DecodeResult.outOfBounds(index))
      discriminated <- discriminator.applyOrElse(data, d => invalidDiscriminator(d).decode)(input)
    } yield discriminated)
the compiler is complaining about no type parameter for d in the 2nd-to-last line
but I think that should be inferred properly
and I could swear it WAS being inferred 10 minutes ago
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:08
I don't think it could - try type annotating invalidDiscriminator
do you not need another type parameter?
it looks like it might be returning a RowDecoder[Any]
but are you not making this too hard?
Andrew Roberts
@aroberts
Jan 27 2017 21:10
invalidDiscriminator should be a RowDecoder[Nothing], right? because it always returns an error?
haha, I’ve heard that before - what do you mean specifically?
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:10
but RowDecoder is invariant though, so you're returning an A or an Any
mmm... hang on, there's something else going on...
Andrew Roberts
@aroberts
Jan 27 2017 21:13
yup, it’s the type
had to screw with it a little further
  type Decode[A] = Seq[String] => DecodeResult[A]

  private def invalidDiscriminator[C](data: C): Decode[Nothing] = RowDecoder.from(_ => DecodeResult.typeError(s"Couldn't decode discriminator: $data")).decode
  def discriminatorRowDecoder[C: CellDecoder, A](index: Int)(discriminator: PartialFunction[C, Decode[A]]): RowDecoder[A] =
    RowDecoder.from(input => for {
      data <- input.lift(index).map(CellDecoder[C].decode).getOrElse(DecodeResult.outOfBounds(index))
      discriminated <- discriminator.applyOrElse(data, invalidDiscriminator)(input)
    } yield discriminated)
thanks for your help!
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:16
now I understand.
oh, right, discriminator returns a Result, not a Decoder, and Result is covariant.
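To see why the function-based version type-checks, here is a self-contained model of the combinator, assuming `Either[String, A]` as a stand-in for `DecodeResult[A]`. The names `discriminate`, `invalid`, and `parseBool` are hypothetical. Because functions are covariant in their result and `Either` is covariant too, a `Decode[Ints]` or a `Decode[Nothing]` can be used where a `Decode[Foo]` is expected.

```scala
object DiscriminatorSketch {
  import scala.util.Try

  // Stand-in for kantan.csv's decode function shape; Either replaces DecodeResult.
  type Decode[A] = Seq[String] => Either[String, A]

  sealed trait Foo
  final case class Ints(a: Int, b: Int) extends Foo
  final case class Strings(a: String, b: String) extends Foo

  val decodeInts: Decode[Ints] = row =>
    Try(Ints(row(1).toInt, row(2).toInt)).toOption.toRight(s"Bad Ints row: $row")

  val decodeStrings: Decode[Strings] = row =>
    Try(Strings(row(1), row(2))).toOption.toRight(s"Bad Strings row: $row")

  val parseBool: String => Either[String, Boolean] = {
    case "true"  => Right(true)
    case "false" => Right(false)
    case other   => Left(s"Not a Boolean: $other")
  }

  // Always fails, so its result type is Nothing and it fits any Decode[A].
  def invalid[C](data: C): Decode[Nothing] =
    _ => Left(s"Couldn't decode discriminator: $data")

  // Reads the cell at `index`, parses it, then lets `pick` choose the row decoder.
  def discriminate[C, A](index: Int)(parse: String => Either[String, C])(
      pick: PartialFunction[C, Decode[A]]): Decode[A] =
    row =>
      for {
        cell <- row.lift(index).toRight(s"No cell at index $index")
        data <- parse(cell)
        out  <- pick.applyOrElse[C, Decode[A]](data, d => invalid(d))(row)
      } yield out

  val decodeFoo: Decode[Foo] = discriminate[Boolean, Foo](0)(parseBool) {
    case true  => decodeInts
    case false => decodeStrings
  }
}
```

With the sample rows from earlier in the chat, `decodeFoo(Seq("true", "1", "2"))` picks the `Ints` decoder and `decodeFoo(Seq("false", "foo", "bar"))` picks the `Strings` one; the `invalid` fallback only fires when the partial function does not cover the discriminator value.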
Andrew Roberts
@aroberts
Jan 27 2017 21:18
yep- for some reason I have next to no intuition around invariant types - I find myself always expecting either co or contravariance
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:18
it's not bad, with my example, you get to write this:
  implicit val fooDecoder: RowDecoder[Foo] = discriminatorRowDecoder[Boolean, Foo](0) {
    case true => RowDecoder[Ints].decode
    case false => RowDecoder[Strings].decode
  }
invariant types are so much simpler though :) The more I work with scala, the more I lean toward no subtyping of any kind
dumb data types, all the logic in type classes
I think I might steal that combinator, or some version of it. Alright if I credit you with it?
Andrew Roberts
@aroberts
Jan 27 2017 21:20
totally
Nicolas Rinaudo
@nrinaudo
Jan 27 2017 21:21
thanks. As usual, your problems melt my brains but yield interesting results
Tomas Svarovsky
@fluke777
Jan 27 2017 23:01
Hi guys. Is there a way to encode a row into a string without explicitly using writers? I am trying to use kantan in a spark job where I do not own the serialization process.
exactly what I need