These are chat archives for nrinaudo/kantan.csv

18th
Dec 2017
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:03
@nrinaudo yes I did. I ended up using a different csv library that uses shapeless (purecsv) and it did work once I increased the stack size. Only noting that because it may be something lib-specific
I ran into trouble when trying to parse with kantan; increasing the stack size helped it compile
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:06
interesting. But are you saying that shapeless was able to derive instances for case classes of more than 22 fields? I didn't think these were supported
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:07
yeah, it works fine with PureCSV if I increase the stack size so the compiler doesn't StackOverflow
and it also compiled with kantan too
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:07
seriously?
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:07
yeah
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:07
well, you taught me something today
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:08
haha
kantan actually compiled without a stack size increase up to like 120 fields, but any more than that it ran out
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:08
weird. I'll have to investigate. Thanks!
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:08
you're welcome. Looks like a great lib!
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:09
it's pretty niche, but I like it :)
purecsv is pretty good as well, although I kind of felt that it was not seeing much activity - its author has been focusing on pureconfig these days
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:11
yeah it was definitely lower on the search results for scala csv
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:12
kantan.csv is high in the scala csv search list these days? Neat!
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:12
yeah I think it was like the first result after stackoverflow posts of people doing it manually
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:13
ah, yeah, don't do it manually
csv is an unnecessarily tricky format
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:15
exactly. that's a fool's errand
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:16
well, I'm disappointed kantan.csv didn't work out for you, but I'll try and improve it. It's pretty clear @melrief knows his way around shapeless better than I do, I should take a closer look at his implementation
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:17
thanks! I am surprised to learn that scala has such a hard time handling case classes over 22 fields
since it seems to be the default "struct" data type
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:18
design flaw. Functions and case classes basically have a hard-coded limit of 22 parameters
it's been somewhat lifted in recent versions, but it's a bit of a hack and doesn't really work all that well.
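(Editor's sketch, not part of the original chat.) A minimal illustration of what "somewhat lifted" looks like, assuming Scala 2.11 or later: a case class with more than 22 fields compiles and can be constructed normally. The `Wide` class and its field names are made up for the example. In Scala 2, function types still stop at `Function22`, so anything that needs to treat the constructor as a function value remains out of reach for such a class.

```scala
object WideRecord {
  // Hypothetical 23-field record, one field past the historical limit of 22.
  final case class Wide(
    f1: Int, f2: Int, f3: Int, f4: Int, f5: Int, f6: Int, f7: Int, f8: Int,
    f9: Int, f10: Int, f11: Int, f12: Int, f13: Int, f14: Int, f15: Int, f16: Int,
    f17: Int, f18: Int, f19: Int, f20: Int, f21: Int, f22: Int, f23: Int
  )

  // Construction and field access work as for any other case class.
  val wide: Wide = Wide(
    1, 2, 3, 4, 5, 6, 7, 8,
    9, 10, 11, 12, 13, 14, 15, 16,
    17, 18, 19, 20, 21, 22, 23
  )

  def main(args: Array[String]): Unit =
    println(s"${wide.productArity} fields, last one is ${wide.f23}")
}
```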
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:18
yeah for sure. My first guess is that this is due to it being implemented on the JVM
but nothing to inform that
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:19
the official reason is if you need more than 22 parameters, you're doing it wrong
no, it's just an arbitrary number
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:19
hah interesting
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:19
to the best of my knowledge, they just thought 22 parameters was enough for everybody
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:21
am I really doing it wrong though? I have an externally dictated data structure that I have no control over. Using an HList from a 3rd party library seems a lot more wrong
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:21
well, 150 fields is a bit much :)
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:21
oh it is ridiculous for sure. such is life when working in healthcare though
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:21
can you link me to some code / spec so that I can make an informed opinion?
can you not split it into coherent groups though?
something like:
case class Foo(f1: Int, f2: Int, f3....)

// =>
case class Bar(f1: Int, f2: Int)
case class Foo(bar: Bar, f3....)
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:22
well, it's basically the flat structure of an insurance claim
so I could possibly parse it into a nested data structure
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:23
that would be my recommendation, but I've not seen the data, so maybe it makes no sense
it would also make the compiler quite a bit happier, I would think
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:23
hah yeah it would be nice to not have a 30 meg stack
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:24
shapeless is great, but it does make the compiler jump through some crazy hoops
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:24
for sure, the backtrace was insane
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:26
honestly, for this kind of problem, I tend to not use shapeless. I'll split my data into logical chunks and hand-write decoders for them
(I think purecsv calls them FieldConverters)
final case class Company(name: String, address: Address)
final case class Claimant(name: String, age: Int)
final case class Claim(claimant: Claimant, company: Company)

implicit val claimDecoder: Decoder[Claim] = ...
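(Editor's sketch, not part of the original chat.) One way the hand-written approach above could look in plain Scala (2.12 or later). Everything here is an assumption made for illustration: the `Decoder` trait is a hypothetical stand-in and not the kantan.csv or PureCSV API, `Address` gets two invented fields because its definition is not shown in the chat, and the five-cell column layout is made up.

```scala
object ClaimDecoding {
  // Hypothetical stand-in type class: turns a run of CSV cells into an A,
  // or returns an error message. Not the kantan.csv or PureCSV API.
  trait Decoder[A] {
    def decode(cells: Vector[String]): Either[String, A]
  }

  object Decoder {
    def instance[A](f: Vector[String] => Either[String, A]): Decoder[A] =
      new Decoder[A] {
        def decode(cells: Vector[String]): Either[String, A] = f(cells)
      }
  }

  // Address fields are invented for the example.
  final case class Address(street: String, city: String)
  final case class Company(name: String, address: Address)
  final case class Claimant(name: String, age: Int)
  final case class Claim(claimant: Claimant, company: Company)

  implicit val addressDecoder: Decoder[Address] = Decoder.instance[Address] {
    case Vector(street, city) => Right(Address(street, city))
    case other                => Left(s"address: expected 2 cells, got ${other.length}")
  }

  implicit val claimantDecoder: Decoder[Claimant] = Decoder.instance[Claimant] {
    case Vector(name, age) =>
      scala.util.Try(age.trim.toInt).toOption
        .toRight(s"claimant: '$age' is not an integer")
        .map(parsedAge => Claimant(name, parsedAge))
    case other => Left(s"claimant: expected 2 cells, got ${other.length}")
  }

  // A company is its name followed by the cells of its address.
  implicit val companyDecoder: Decoder[Company] = Decoder.instance[Company] {
    case name +: addressCells =>
      addressDecoder.decode(addressCells).map(address => Company(name, address))
    case _ => Left("company: expected a name followed by address cells")
  }

  // Assumed layout: cells 0-1 describe the claimant, the rest the company.
  implicit val claimDecoder: Decoder[Claim] = Decoder.instance[Claim] { cells =>
    val (claimantCells, companyCells) = cells.splitAt(2)
    for {
      claimant <- claimantDecoder.decode(claimantCells)
      company  <- companyDecoder.decode(companyCells)
    } yield Claim(claimant, company)
  }
}
```

Each decoder only knows about its own small group of cells, so no single definition has to deal with the full width of the row.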
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:27
yeah, I am using StringConverter but I thought it was using shapeless behind the scenes
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:28
oh it's definitely using shapeless to turn a list of cells into a case class
StringConverter is working at the cell level.
Preston Marshall
@bbhoss_twitter
Dec 18 2017 21:29
so the converter would work on what?
Nicolas Rinaudo
@nrinaudo
Dec 18 2017 21:30
well, bearing in mind that I'm no purecsv expert, I would have thought that StringConverter works on a cell - it lets you tell purecsv how to turn a CSV cell into an Int, say, or an Option[IP]. It looks like RawFieldsConverter is what's used to work on entire rows of data
kantan.csv calls these CellDecoder and RowDecoder
you should be able to write your own RawFieldsConverter instance, although if you have 150 flat fields in a case class, it's probably going to be better for your carpal tunnel syndrome to let shapeless derive it
one trick that you can use to ease things up on the compiler is to "cache" that derived instance by placing it in the companion object. I'm not 100% sure I understand all the ins and outs of how to do this with purecsv, so you might need to ask on their gitter channel
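(Editor's sketch, not part of the original chat.) A rough illustration of the companion-object trick, in plain Scala. The `RowDecoder` trait below is a hypothetical stand-in, and `deriveClaimant` merely plays the role of whatever derivation a shapeless-based library would perform; the exact call for PureCSV or kantan.csv is deliberately left out. The counter only shows that the instance is built once at runtime. The benefit discussed in the chat is at compile time, where implicit search can pick up the ready-made `implicit val` instead of deriving a fresh instance at every use site.

```scala
object CachedInstance {
  // Hypothetical stand-in type class; not the kantan.csv or PureCSV API.
  trait RowDecoder[A] {
    def decode(cells: List[String]): Option[A]
  }

  // Counts how many times the stand-in "derivation" runs.
  var derivations: Int = 0

  final case class Claimant(name: String, age: Int)

  object Claimant {
    // Living in the companion object, this instance is found by implicit
    // search without any import. Being a val, it is built only once.
    implicit val decoder: RowDecoder[Claimant] = deriveClaimant()
  }

  // Stand-in for the derivation a generic library would perform.
  def deriveClaimant(): RowDecoder[Claimant] = {
    derivations += 1
    new RowDecoder[Claimant] {
      def decode(cells: List[String]): Option[Claimant] = cells match {
        case List(name, age) =>
          scala.util.Try(age.trim.toInt).toOption.map(parsed => Claimant(name, parsed))
        case _ => None
      }
    }
  }

  // Any code asking for a RowDecoder[Claimant] reuses the cached instance.
  def decodeAll[A](rows: List[List[String]])(implicit d: RowDecoder[A]): List[Option[A]] =
    rows.map(d.decode)
}
```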