These are chat archives for nrinaudo/kantan.csv
Seq[String] in memory before passing it to a RowDecoder. But you can certainly decode rows to a type that doesn't use all cells
String, the remaining 99 strings will be discarded (and gc-ed out, if you have good gc options) immediately
In my use case I have large .tsv-s with
val headers = Seq("Name", "Length", "EffectiveLength", "TPM", "NumReads")
There I only want to get Name and TPM.
Currently I do it with:
type SimpleSalmon = (String, Int, Double, Double, Double)
p.toIO.unsafeReadCsv[Vector, SimpleSalmon](config.withHeader).map(v => v._1 -> v._4)
is there a way to decrease memory consumption here?
RowDecoder[SimpleSalmon] that only takes the first and fourth cell
final case class SimpleSalmon(name: String, tpm: Double)
implicit val simpleSalmonDecoder: HeaderDecoder[SimpleSalmon] = HeaderDecoder.decoder("Name", "TPM")(SimpleSalmon.apply _)
p.toIO.unsafeReadCsv[Vector, SimpleSalmon](config.withHeader)
asCsvReader, do the mapping on the iterator-like structure this returns, and then load the whole thing in memory with
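For anyone reading this archive later, here is a minimal sketch of the two ideas above combined: an index-based RowDecoder that only reads the first and fourth cell, and asCsvReader so the mapping happens row by row before anything is collected. It assumes a recent kantan.csv on the classpath. The object name, the sample data, the tab-separated config, and the final .toVector are illustrative choices, not something stated in the chat. Rows that fail to decode are silently dropped here, which may not be what you want.

```scala
import kantan.csv._
import kantan.csv.ops._

object SalmonStreamingSketch {
  // Hypothetical two-row sample standing in for a large Salmon .tsv file.
  val sample: String =
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n" +
      "tx1\t1000\t850.5\t12.5\t42.0\n" +
      "tx2\t2000\t1850.0\t0.0\t0.0"

  final case class SimpleSalmon(name: String, tpm: Double)

  // Only cells 0 (Name) and 3 (TPM) are decoded; the other cells are ignored.
  implicit val simpleSalmonDecoder: RowDecoder[SimpleSalmon] =
    RowDecoder.decoder(0, 3)(SimpleSalmon.apply _)

  // Assumed configuration: tab-separated, first row is a header to skip.
  val config: CsvConfiguration = rfc.withCellSeparator('\t').withHeader

  // asCsvReader yields one Either per row; the mapping is applied lazily,
  // and only the (name, tpm) pairs are kept when the result is materialized.
  val tpmByName: Vector[(String, Double)] =
    sample
      .asCsvReader[SimpleSalmon](config)
      .collect { case Right(row) => row.name -> row.tpm }
      .toVector

  def main(args: Array[String]): Unit =
    tpmByName.foreach(println)
}
```

With a real file, the same chain should work on p.toIO in place of sample, since kantan.csv accepts java.io.File as a CSV source.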