These are chat archives for nrinaudo/kantan.csv

7th
Oct 2016
Andrew Roberts
@aroberts
Oct 07 2016 19:31

@nrinaudo does kantan support csvs with variable lengths? can I decode

1,"a",100
2,"b"

into a case class shaped like

case class Foo(first: Int, second: String, third: Option[Int])
I don’t need generic support for this
Andrew Roberts
@aroberts
Oct 07 2016 19:40
also, why isn’t e.g. DecodeResult covariant in type A?
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:01
@aroberts regarding your first question: I just found out you can't
but that's a bug.
if your second row was 2,"b",, then it'd work.
But yeah, your actual example fails, and that's something that needs to be addressed. Would you care to create an issue?
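For reference, the behaviour being asked for can be sketched in plain Scala. This is only an illustration under stated assumptions, not kantan.csv's decoder: `decodeFoo`, `parseInt` and `parseOptionalInt` are hypothetical helpers, rows are assumed to be already split into unquoted cells, errors are plain strings, and Scala 2.12 or later is assumed for right-biased `Either`.

```scala
import scala.util.Try

object OptionalTrailingColumn {
  final case class Foo(first: Int, second: String, third: Option[Int])

  // Hypothetical helper: a cell that is not a valid Int becomes a Left.
  private def parseInt(cell: String): Either[String, Int] =
    Try(cell.toInt).toOption.toRight(s"Not an Int: $cell")

  // A missing or empty cell decodes to None, anything else must be an Int.
  private def parseOptionalInt(cell: Option[String]): Either[String, Option[Int]] =
    cell.filter(_.nonEmpty) match {
      case None    => Right(None)
      case Some(s) => parseInt(s).map(i => Some(i))
    }

  // Accepts rows of two or three cells; the third cell is optional.
  def decodeFoo(cells: List[String]): Either[String, Foo] = cells match {
    case first :: second :: rest if rest.size <= 1 =>
      for {
        f <- parseInt(first)
        t <- parseOptionalInt(rest.headOption)
      } yield Foo(f, second, t)
    case _ => Left(s"Unexpected number of cells: ${cells.size}")
  }
}
```

With this sketch, both `List("1", "a", "100")` and `List("2", "b")` decode to a `Foo`, the second one with `third` set to `None`.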
as for DecodeResult, it's covariant in both its type parameters, but that might not be obvious. It's just a type alias for kantan.codecs.Result, which is declared that way:
https://github.com/nrinaudo/kantan.codecs/blob/master/core/src/main/scala/kantan/codecs/Result.scala#L34
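The point about aliases can be checked with a small standalone sketch. It assumes nothing from kantan: `Either` stands in for the covariant result type, and the `DecodeResult` alias below is a hypothetical one declared without any variance annotation.

```scala
object AliasVariance {
  // Hypothetical alias with no variance annotation on A.
  // Either itself is declared as Either[+A, +B].
  type DecodeResult[A] = Either[String, A]

  val narrow: DecodeResult[Int] = Right(1)

  // Compiles: the alias is transparent, so the covariance of the
  // underlying type still applies even though the alias does not show it.
  val wide: DecodeResult[Any] = narrow
}
```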
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:06
@aroberts to get back to your initial issue, I understand that it's not ideal, but you could read rows as Either[Foo, Bar], where:
case class Foo(first: Int, second: String)
case class Bar(first: Int, second: String, third: Int)
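The shape of that workaround can be sketched without the library. The `decodeRow` function below is hypothetical and works on already-split cells; it only illustrates how a two-cell row ends up on the left and a three-cell row on the right.

```scala
import scala.util.Try

object EitherRows {
  final case class Foo(first: Int, second: String)
  final case class Bar(first: Int, second: String, third: Int)

  // Hypothetical row decoder: Left for two-cell rows, Right for three-cell rows,
  // None when the row fits neither shape or a number fails to parse.
  def decodeRow(cells: List[String]): Option[Either[Foo, Bar]] = cells match {
    case List(a, b)    => Try(Foo(a.toInt, b)).toOption.map(Left(_))
    case List(a, b, c) => Try(Bar(a.toInt, b, c.toInt)).toOption.map(Right(_))
    case _             => None
  }
}
```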
Andrew Roberts
@aroberts
Oct 07 2016 20:07
thanks
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:07
but I absolutely agree that what you expected to be able to do is better and will need to be addressed.
Andrew Roberts
@aroberts
Oct 07 2016 20:08
regarding the result, I was having trouble using DecodeResult - I wanted something similar to Result.fromOption, but couldn’t get it to compile with Option.fold
I ended up using Result.fromOption with TypeError directly
looks OK but I wasn’t sure if that was what was intended or not
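The exact compile error is not shown in the conversation. One common cause in Scala 2, assuming it applies here, is that `Option.fold` infers its result type from the first argument list alone, so a `Left` there rules out a `Right` in the second. The sketch below uses `Either` and a hypothetical `DecodeResult` alias as stand-ins; giving `fold` an explicit type argument is one way around it.

```scala
object FoldSketch {
  // Stand-in alias, not kantan's DecodeResult.
  type DecodeResult[A] = Either[String, A]

  // Without the explicit type argument, Scala 2 would infer the result type
  // from Left(error) and then reject Right(_) in the second argument list.
  def fromOption[A](parsed: Option[A], error: => String): DecodeResult[A] =
    parsed.fold[DecodeResult[A]](Left(error))(Right(_))
}
```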
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:09
what was intended was for you never to need to worry about these details :)
can you tell me more about your use case?
Andrew Roberts
@aroberts
Oct 07 2016 20:10
sure. I’m decoding a timestamp that may be presented in a number of different formats
I’m using an iterator of formats, and mapping them to Try[LocalDateTime], and then calling collectFirst { case Success … }
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:10
do you use either java.util.Date or joda.date.DateTime for that?
I see
Andrew Roberts
@aroberts
Oct 07 2016 20:11
that gives me Option[LocalDateTime]
also, fwiw it might be worth supporting java.time.* over joda time - as of java 8 it supersedes joda
anyway, I ended up with Result.fromOption(parsed, TypeError(s"Couldn't decode timestamp: $s"))
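The approach described here (an iterator of formats, each mapped to a `Try`, then `collectFirst` on the first success) can be sketched with the standard library only. The list of formats is an assumption made for illustration, and `Either[String, LocalDateTime]` with `toRight` stands in for the `Result.fromOption` call and `TypeError` mentioned above.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.{Success, Try}

object TimestampDecoding {
  // Assumed formats, for illustration only.
  val formats: List[DateTimeFormatter] = List(
    DateTimeFormatter.ISO_LOCAL_DATE_TIME,
    DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss"),
    DateTimeFormatter.ofPattern("dd/MM/uuuu HH:mm")
  )

  // Tries each format lazily and keeps the first successful parse;
  // if none succeeds, the Option is turned into a Left carrying a message.
  def parseTimestamp(s: String): Either[String, LocalDateTime] =
    formats.iterator
      .map(format => Try(LocalDateTime.parse(s, format)))
      .collectFirst { case Success(parsed) => parsed }
      .toRight(s"Couldn't decode timestamp: $s")
}
```

Because the iterator is lazy, formats after the first match are never tried.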
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:12
I think what you want is kantan.codecs.Result.sequence
Andrew Roberts
@aroberts
Oct 07 2016 20:12
which isn’t horrible, but confusing given the first-class styling of the DecodeResult type
Oh, hm, let me take a look
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:13
or do you, hang on, let me think (I'm pretty tired and might not make much sense tonight)
no, you don't. Sequence lets you turn a F[Result[A]] into a Result[F[A]], if F has a Monad
but what you have is an Iterator[Option[DecodeResult[A]]], right?
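As a rough illustration of what a sequence operation does, here is a minimal version specialised to `List` and `Either`. It is not the signature of `kantan.codecs.Result.sequence`, which is not shown in this conversation, and it assumes Scala 2.12 or later for right-biased `Either`.

```scala
object SequenceSketch {
  // Turns a list of results into a result of a list.
  // Returns the leftmost Left if there is one, otherwise every Right value in order.
  def sequence[E, A](results: List[Either[E, A]]): Either[E, List[A]] =
    results.foldRight[Either[E, List[A]]](Right(Nil)) { (result, acc) =>
      for {
        value <- result
        rest  <- acc
      } yield value :: rest
    }
}
```

This is an all-or-nothing transformation, which is different from keeping the first success out of several attempts.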
Andrew Roberts
@aroberts
Oct 07 2016 20:14
I have Iterator[Try[A]]
which I suppose I could map to Iterator[DR[A]]
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:15
yeah but there's really no easy way to do that either, unfortunately
Andrew Roberts
@aroberts
Oct 07 2016 20:15
but I don’t feel like sequence is an accurate representation of the transformation I want
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:15
ok, so yeah, I'd not thought of your use case.
Andrew Roberts
@aroberts
Oct 07 2016 20:16
yeah. I have no issue with the structure I ended up with, I just think that it would be convenient if the Result helper methods were provided by a trait that was available on DecodeResult
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:17
yeah. Sounds good to me.
I'll need to give it some more thought when I'm not quite this tired, but you make a good point.
Andrew Roberts
@aroberts
Oct 07 2016 20:17
I think my use case is esoteric enough to not be explicitly supported by the library - it’s just the way I then use the Result family that wasn’t as intuitive.
Sounds good!
Love what you’re doing, in any case
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:18
to be honest, I've always been a bit uncomfortable about Result, especially now that Either is right-biased - I might get rid of it altogether and provide type aliases for backward compatibility in the future
Andrew Roberts
@aroberts
Oct 07 2016 20:18
I’m unfortunately going to be gated on that first issue I mentioned (I think), but I’m happy to file the ticket, and I’ll keep an eye on this project
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:18
oh thanks, that's always great to hear :)
well hang on. How soon do you need that issue fixed?
Andrew Roberts
@aroberts
Oct 07 2016 20:19
yeah- I feel like every transformation-oriented project has to implement their own form of Result, hah
oh, shrug, whenever?
whenever it ends up fixed I’ll pick back up my branch for switching over to kantan
:)
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:20
right, cause if it's a show stopper for you, I feel it might be relatively trivial and something I might be able to sort out this weekend
I've got a bunch of commits I need to put in a new release anyway
Andrew Roberts
@aroberts
Oct 07 2016 20:20
I mean, I’m not going to say don’t do it :)
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:21
oh I'm going to do it, now that I'm aware of the issue it's going to bother me until I get it sorted
it's just that if you have an urgent need for it, and it means the difference between having you as a user or not, I might try to make time this weekend
(oh, and about your java 8 comment: absolutely, I just need to take the time to write java8 specific builds)
(and about Result: at least mine is shared across 3 projects so far, kantan.csv, kantan.regex and kantan.xpath :) )
Andrew Roberts
@aroberts
Oct 07 2016 20:22
for sure, everything is time time time
hah, well that’s good
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:23
right. So. Can I let you create that "optional trailing column" ticket?
Andrew Roberts
@aroberts
Oct 07 2016 20:23
yep
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:24
if you feel up to it, it'd be great if you could also dump your thoughts on the Result and DecodeResult thing in another issue
Andrew Roberts
@aroberts
Oct 07 2016 20:24
for sure
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:24
if not let me know and I'll do it myself
thanks, I appreciate this
this channel is almost always dead, can't help but wonder whether no one uses kantan.csv or it's just so very intuitive and bug free :)
Andrew Roberts
@aroberts
Oct 07 2016 20:25
as far as gaining a user- right now I’m using scala-csv, and I hate that it returns an Option and yet throws exceptions. I’m very familiar with the decoding/encoding approach that you’re using (circe/finch/etc user), so I’d love to switch over. at the end of the day, there’s someone with veto power over me, but it’s not much code to write to migrate, so I’m just going to stab it out and see what people say :)
hah
well, you never know. maven downloads I guess?
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:26
unfortunately, those are hugely inflated by the various mirrors and leeches
Andrew Roberts
@aroberts
Oct 07 2016 20:26
interesting that no one’s built a blacklist to solve that problem
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:26
yeah, circe is definitely an inspiration
another reason to prefer kantan.csv over scala-csv is performance: http://nrinaudo.github.io/kantan.csv/tut/benchmarks.html
I mean this is terribly immodest, but kantan.csv is twice as fast when decoding
Andrew Roberts
@aroberts
Oct 07 2016 20:32
I believe it
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:33
truthfully, it's hardly ever noticeable unless your CSV files are massive, but still.
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:42
Thanks @aroberts, that's great.
Andrew Roberts
@aroberts
Oct 07 2016 20:43
any time! glad to contribute :)
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:43
just in time for me to reference it in the commit that has the fix, too :)
Andrew Roberts
@aroberts
Oct 07 2016 20:43
hah :) perfect
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:43
tests are still running, but I don't expect anything to break.
and to be honest, I'll probably need to write a few non-regression tests and all, but I expect this to be in a SNAPSHOT build tonight or tomorrow, with hopefully a full release this weekend
Andrew Roberts
@aroberts
Oct 07 2016 20:45
oh man, that would be perfect
what does kantan mean, anyway?
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:45
"simple" in japanese
Andrew Roberts
@aroberts
Oct 07 2016 20:45
hah, well played
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:46
yeah, it's also a great way for me to name my projects - I suck at naming things, this allows me to just stick something after kantan
I mean the next few projects are probably going to be for JSON, SQL and MongoDB: kantan.json, kantan.sql, kantan.mongodb
there, sorted.
cheers for #54, too. I'll give this more time though, I need to mull it over
but it certainly needs to be addressed.
Andrew Roberts
@aroberts
Oct 07 2016 20:48
yeah… I feel like I’ve run into this before with type in scala
there’s so much sugar that doesn’t get applied to type declarations
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 20:49
I've actually realised a workaround recently: in the case of DecodeResult, for instance, just declare val DecodeResult = kantan.codecs.Result in kantan.csv's package object
well, it doesn't work in this case because the type parameters are different, but it works well for actual aliases
(think scala.collection.Seq == scala.collection.immutable.Seq, for instance)
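The workaround for true aliases can be sketched as a type alias paired with a value alias pointing at the companion object. The `Aliases` object and the `ImmSeq` name are hypothetical, chosen only to keep the example standalone.

```scala
object Aliases {
  // Type alias: lets callers write ImmSeq[Int] in type positions.
  type ImmSeq[+A] = scala.collection.immutable.Seq[A]

  // Value alias: lets callers write ImmSeq(1, 2, 3), reusing the companion's apply
  // and any other helper methods defined on it.
  val ImmSeq = scala.collection.immutable.Seq
}
```

A caller can then write `val xs: Aliases.ImmSeq[Int] = Aliases.ImmSeq(1, 2, 3)`, so the helper methods of the original companion are reachable through the alias name.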
yeah, tests pass. Releasing the SNAPSHOT build as we speak.
Andrew Roberts
@aroberts
Oct 07 2016 20:56
excellent
Nicolas Rinaudo
@nrinaudo
Oct 07 2016 21:08
@aroberts SNAPSHOT builds released for 2.10 and 2.11, non-regression tests written, committed and pushed. My work here is done. I'm off to bed
(and intend to do a proper release this weekend)
Andrew Roberts
@aroberts
Oct 07 2016 21:12
thanks again man