    Anton Kulaga
    @antonkulaga
    Hi there! I see that there are not many people here although the lib looks quite interesting
    mark lister
    @marklister
    Yeah, the interest probably doesn't warrant a chat room. I'm pretty sure that most people who use this lib do so just to perform CSV I/O and, let's face it, it's so simple there's not really much to discuss!
    Anton Kulaga
    @antonkulaga
    Regarding possibilities of MapLike access ( https://github.com/marklister/product-collections/issues/31#issuecomment-94257552 )
    I think what matters on its own is having a way to get val headers: Seq[String], because most CSV files have headers, and when I use the default CSVParsers with withHeaders = true the header information is lost to my code
    Even if I just have a headers field and everything else stays the same (with products), my code will be able to see the headers. That is rather useful when I do my analysis in something like ScalaNotebook
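    A minimal sketch of the idea above, in plain Scala and independent of the library: the header row is kept next to the typed rows instead of being discarded. The names WithHeaders and HeaderSketch are hypothetical and not part of product-collections, and the comma split is deliberately naive (no quoting or escaping).

```scala
// Hypothetical wrapper, not part of product-collections:
// keeps the header row alongside the typed rows.
final case class WithHeaders[A](headers: Seq[String], rows: Seq[A])

object HeaderSketch {
  // Assumes a two-column file (String, Double) whose first line is the header.
  // Naive comma split: a real CSV parser is needed for quoted fields.
  def read(lines: Iterator[String]): WithHeaders[(String, Double)] =
    if (!lines.hasNext) WithHeaders(Nil, Nil)
    else {
      val headers = lines.next().split(",").map(_.trim).toList
      val rows = lines.map { line =>
        val cells = line.split(",").map(_.trim)
        (cells(0), cells(1).toDouble)
      }.toList
      WithHeaders(headers, rows)
    }
}
```

    With something like this, notebook code can still inspect result.headers while result.rows stays a sequence of plain tuples.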
    Anton Kulaga
    @antonkulaga
    I think an alternative solution could be to have everything generated by macros. In that case the developer provides a case class for the CSV, and it uses a CollSeq with case class field names instead of numbers
    mark lister
    @marklister
    Yeah, the macro solution has been mooted for some years now. I documented it in 2013 or so. But you're right, it's absolutely incorrect to just throw away the header data. I've got some spare time this week and I'll prototype some ideas and perhaps we can knock them around...
    Anton Kulaga
    @antonkulaga
    I will make a PR in 10 mins
    I've managed to construct something with macro annotations
    mark lister
    @marklister
    Sounds good. Sorry, I won't be able to look at it in detail until tomorrow as the aforementioned lunch went on a bit late...
    Thanks for the input.
    Anton Kulaga
    @antonkulaga
    @marklister here is the PR marklister/product-collections#32 with macro-annotations.
    mark lister
    @marklister
    I have a less revolutionary idea for retaining header data and returning something a little more type specific based around this:
    It would be in addition to case class asFrame, just for CollSeq
    Anton Kulaga
    @antonkulaga
    I think that there are several use-cases for csv data:
    1) When the data is OK and you know its structure. In that case either tuples or case classes (if there are some headers) work well
    2) When you know the structure but the data is of poor quality and some mistakes may occur (like putting strings into double columns and so on). That is what I saw in Framian, where the data can be a Value, NA or NM
    3) Exploration of data from a worksheet or Scala notebook. Here you explore the data with DataFrames, and once you have understood the structure by looking at the headers and data you move to a more typesafe approach based on either tuples or case classes
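    Use case 2 can be sketched with a small cell type, assuming three outcomes per cell: a parsed value, a missing value, and a value that is present but unusable. The definitions below are local to this sketch and only borrow the Value / NA / NM naming; they are not Framian's actual classes, and CellSketch.parseDouble is a hypothetical helper.

```scala
import scala.util.Try

// Local stand-ins for illustration; not Framian's real types.
sealed trait Cell[+A]
final case class Value[A](get: A) extends Cell[A]
case object NA extends Cell[Nothing] // nothing was supplied
case object NM extends Cell[Nothing] // something was supplied but it is not meaningful

object CellSketch {
  // Parse one raw CSV field that is expected to hold a Double.
  def parseDouble(raw: String): Cell[Double] = {
    val s = raw.trim
    if (s.isEmpty) NA
    else Try(s.toDouble).toOption match {
      case Some(d) => Value(d)
      case None    => NM // e.g. a string put into a double column
    }
  }
}
```

    A column parsed this way keeps bad cells visible instead of failing the whole read, which suits data of uncertain quality.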
    Bob R
    @tzbob
    Hey, let's say I have a csv table with 50 columns but I'm only interested in 3.
    Is there any way to define an extractor that basically maps 3 indices to 3 to-be-extracted fields?
    Or do I have to define 50 dummy columns to get to the other 3?
    mark lister
    @marklister
    Hi Bob,
    You'd need the dummy columns, except that PC's typed io is limited to 22 columns...
    Bob R
    @tzbob
    Oh right
    Yeah that's a common issue
    mark lister
    @marklister
    The underlying parser will return an Iterator[Array[String]]
    So you could work from there... It's the same API as opencsv, but it works on Scala.js
    Bob R
    @tzbob
    Thanks
    Bob R
    @tzbob
    I could just map that to the data that I want, right?
    mark lister
    @marklister
    Yeah, call xxx.next and you get an Array[String] that corresponds to one line of data
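    A rough sketch of that approach, assuming only that some parser hands back an Iterator[Array[String]] as described above. The Picked case class, the PickColumns object and the column indices are hypothetical; how the iterator is obtained from the library is not shown here.

```scala
// Hypothetical record holding the three columns of interest.
final case class Picked(id: String, price: Double, qty: Int)

object PickColumns {
  // `rows` stands in for the Iterator[Array[String]] returned by the parser.
  // Each index says which of the (possibly 50) columns feeds which field.
  def pick(rows: Iterator[Array[String]],
           idIx: Int, priceIx: Int, qtyIx: Int): Iterator[Picked] =
    rows.map { r =>
      // toDouble / toInt throw on malformed cells; wrap in Try if the data is dirty.
      Picked(r(idIx), r(priceIx).toDouble, r(qtyIx).toInt)
    }
}
```

    Because the mapping is lazy, only the three selected cells of each row are converted, and the other columns need no dummy definitions.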
    Bob R
    @tzbob
    Yup, that's all I need
    mark lister
    @marklister
    And all the edge cases have already been tested against
    Good luck