Adelbert Chang
@adelbertc
so would i have something like Enumerator[Task, RawData], a parser that looks like Iteratee[Task, RawData, ActualData], and then the thing that runs the task like Iteratee[Task, ActualData, Unit] via foreach ?
and then at the top level glue those pieces together?
Travis Brown
@travisbrown
well, the parser would be Enumeratee[Task, RawData, ActualData], but yeah, that sounds right.
Adelbert Chang
@adelbertc
looks up Enumeratee
Travis Brown
@travisbrown
…and foreachM if you need side effects in the iteratee.
Adelbert Chang
@adelbertc
ah right
alright let me give this a go
hold my beer
Travis Brown
@travisbrown
an enumeratee is a transformer that can attach contramap-style to an iteratee (resulting in an iteratee), or map-style to an enumerator (resulting in an enumerator).
it doesn't have to maintain a 1:1 input-to-output mapping, though.
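The shape described above can be sketched roughly as follows. This is only an illustration, not the code from the conversation: it uses `cats.Eval` instead of `Task` so it needs no extra dependencies, strings and ints stand in for the hypothetical `RawData` and `ActualData` types, and it assumes a reasonably recent io.iteratee release where `through` and `into` are available on both sides.

```scala
import cats.Eval
import io.iteratee.{ Enumeratee, Enumerator, Iteratee }

// Source: an enumerator of "raw" values (hypothetical sample data).
val raw: Enumerator[Eval, String] =
  Enumerator.enumVector[Eval, String](Vector("1", "x", "2", "3"))

// Transformer: an enumeratee that is not 1:1, since it drops
// non-numeric inputs before converting the rest to Int.
val parse: Enumeratee[Eval, String, Int] =
  Enumeratee
    .filter[Eval, String](s => s.nonEmpty && s.forall(_.isDigit))
    .andThen(Enumeratee.map[Eval, String, Int](_.toInt))

// Sink: an iteratee that folds the parsed values into a sum.
val sum: Iteratee[Eval, Int, Int] = Iteratee.fold[Eval, Int, Int](0)(_ + _)

// Map-style: attach the enumeratee to the enumerator.
val viaEnumerator: Int = raw.through(parse).into(sum).value

// Contramap-style: attach the same enumeratee to the iteratee.
val viaIteratee: Int = raw.into(sum.through(parse)).value
```

Both attachments should yield the same result; which one to use is mostly a question of which piece you want to reuse elsewhere.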
Adelbert Chang
@adelbertc
ayyyyy it works
and i like it a lot more
thank you sir, you are a gentleman and a scholar
Travis Brown
@travisbrown
:+1:
Travis Brown
@travisbrown
@adelbertc I'll be curious to hear what you think about the iteratee model in general once you've spent some time with it. I've been (slowly) working on a blog post comparing it and fs2.
Adelbert Chang
@adelbertc
sounds good
dwhitney
@dwhitney
it seems that I am just following @adelbertc around since he's in all of the same gitter channels I am :)
Adelbert Chang
@adelbertc
;)
dwhitney
@dwhitney
anyway @travisbrown I just took the advice you gave me at SBTB and read up on your iteratee lib. Looks really cool!
RomanIakovlev
@RomanIakovlev

I figured I'd better bring this topic to the iteratee channel. As mentioned in the circe channel, I want to produce JSON with circe in a streaming fashion. I've started on it and run into one question. Basically I need a function like this (imports omitted):

def writeJson[T: Encoder](enum: Enumerator[Task, T], file: File): Task[Unit] = {
  val printer = Printer.noSpaces.copy(dropNullKeys = true)
  val opener = Enumerator.enumOne[Task, String]("[")
  val closer = Enumerator.enumOne[Task, String]("]")
  val entries = enum.map(_.asJson.pretty(printer) + ",")
  opener.append(entries).append(closer).into(writeLines(file))
}

The problem is the comma after the last entry, which makes the resulting JSON invalid. Is there a way to introspect the Enumerator and tell whether an entry is the last one, so it can be handled differently?

RomanIakovlev
@RomanIakovlev
Okay, I’ve found Enumeratee.intersperse, it solves this particular problem.
Travis Brown
@travisbrown
@RomanIakovlev right, intersperse is the way to go.
you could wrap all of that work up in an enumeratee, I guess—I haven't yet just because it's more straightforward than the decoding side.
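A minimal sketch of the `intersperse` approach, for illustration only: it assumes `cats.Eval` in place of `Task`, uses already-rendered JSON strings as hypothetical sample entries instead of calling circe, and collects the chunks into a string rather than writing them to a file.

```scala
import cats.Eval
import io.iteratee.{ Enumeratee, Enumerator }

// Hypothetical entries, already rendered as JSON strings.
val entries: Enumerator[Eval, String] =
  Enumerator.enumVector[Eval, String](
    Vector("""{"id":1}""", """{"id":2}""", """{"id":3}""")
  )

val opener = Enumerator.enumOne[Eval, String]("[")
val closer = Enumerator.enumOne[Eval, String]("]")

// intersperse puts the delimiter between elements only, so no comma
// follows the last entry.
val chunks: Enumerator[Eval, String] =
  opener
    .append(entries.through(Enumeratee.intersperse[Eval, String](",")))
    .append(closer)

// Collected here only to inspect the result; a real writer would
// stream the chunks into a file-writing iteratee instead.
val json: String = chunks.toVector.value.mkString
```

Because the delimiter is a separate stream element, the entries themselves no longer need `+ ","` appended.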
RomanIakovlev
@RomanIakovlev

It would be nice to have a streaming parser for non-trivial JSON structures too. By non-trivial I mean something like this:

{
  "my_objects": [… a huge list of objects …]
}

as opposed to just [… a huge list of objects …]. I have no idea how to approach it, though.

Travis Brown
@travisbrown
@RomanIakovlev agreed, but figuring out the API for that is hard.
@RomanIakovlev I could imagine some navigation methods on the decoding enumeratees that allow you to navigate into the structure before streaming starts…
RomanIakovlev
@RomanIakovlev
It surprises me how efficient streaming reading/writing is. A modest 150 MB of serialized minified JSON required 8+ GB of RAM to write and just under 8 GB to read. With streaming it runs in constant memory, under 4 GB, for both reading and writing.
It does force a rigid array-of-objects structure, though. That's tolerable for now, because I own both the producer and the consumer side.
Travis Brown
@travisbrown
@RomanIakovlev I'm more surprised about how bad that is on the non-streaming side…
RomanIakovlev
@RomanIakovlev
I didn’t dig deep into it, so I can’t provide much info now. I first had a non-trivial structure, an object with 2 huge arrays in it. It required a lot of RAM. When I switched to 2 separate arrays and streaming, it works with default SBT memory size.
RomanIakovlev
@RomanIakovlev
To be clear, the memory size I'm describing is not exactly how much the JSON processing took, but how much I had to give SBT to make my applications run (in non-forked mode). The JSON part took somewhat less, but still most of that memory.
RomanIakovlev
@RomanIakovlev
@travisbrown I'm going to give an internal tech talk for my team at work about my experience using circe streaming with iteratee. I'm trying to find good, concise definitions of iteratee's main classes: Iteratee, Enumeratee and Enumerator, what they do on their own and how they interact with each other. IIRC you promised a blog post about the iteratee architecture some time ago (no pressure! :smile: ), but in its absence, could you please explain here what they are and what they do, in general?
Travis Brown
@travisbrown
@RomanIakovlev cool! have you seen the descriptions in my original blog post?
@RomanIakovlev depending on the audience, enumerator = stream, enumeratee = transformer, iteratee = fold (or sink) might work.
RomanIakovlev
@RomanIakovlev
I don't think I've seen any blog post about Iteratee.
RomanIakovlev
@RomanIakovlev
This blog post definitely helps! I guess my questions are pretty much covered there. I had somehow missed it before.
Teodor Dimov
@teodimoff
i tried the library and i liked it ... 10x
Teodor Dimov
@teodimoff
@travisbrown i was looking at fs2, but i am a fan of your work :D
Travis Brown
@travisbrown
@teodimoff thanks! there are plenty of things fs2 can do that are out of scope for this project, but I find this model works for a lot of the stuff I need, and it's simpler / faster.
Teodor Dimov
@teodimoff
Agreed. The three abstractions are composable enough and flexible enough to make things happen... quickly
Teodor Dimov
@teodimoff
@travisbrown what's the closest thing to Iteratee.sortBy(w => w.length -> w)? i want the whole file sorted by length and then in alphabetical order.
Travis Brown
@travisbrown
@teodimoff there's not a super nice way, since none of the off-the-shelf enumeratees require gathering all elements in memory…
@teodimoff this isn't too terrible, though:
scala> import cats.Monad
import cats.Monad

scala> import io.iteratee.{ Enumeratee, Enumerator, Iteratee }
import io.iteratee.{Enumeratee, Enumerator, Iteratee}

scala> def sortBy[F[_]: Monad, A, B: Ordering](f: A => B): Enumeratee[F, A, A] =
     |   Enumeratee.sequenceI(Iteratee.consume[F, A]).map(_.sortBy(f)).andThen(Enumeratee.flatMap(Enumerator.enumVector[F, A]))
sortBy: [F[_], A, B](f: A => B)(implicit evidence$1: cats.Monad[F], implicit evidence$2: Ordering[B])io.iteratee.Enumeratee[F,A,A]

scala> import cats.instances.option._
import cats.instances.option._

scala> import io.iteratee.modules.option._
import io.iteratee.modules.option._

scala> enumVector(Vector("a", "aaa", "aa")).through(sortBy((_: String).length)).toVector
res0: Option[Vector[String]] = Some(Vector(a, aa, aaa))
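For the ordering originally asked about (by length, then alphabetically), the same helper can take a tuple key. This is a sketch under a few assumptions: it repeats the `sortBy` definition from the session so the block stands alone, uses `cats.Eval` as the effect type, and the word list is hypothetical sample data. Like the original, it holds every element in memory before emitting anything.

```scala
import cats.{ Eval, Monad }
import io.iteratee.{ Enumeratee, Enumerator, Iteratee }

// Same definition as in the REPL session: consume everything, sort
// the resulting vector, then re-emit its elements one by one.
def sortBy[F[_]: Monad, A, B: Ordering](f: A => B): Enumeratee[F, A, A] =
  Enumeratee
    .sequenceI(Iteratee.consume[F, A])
    .map(_.sortBy(f))
    .andThen(Enumeratee.flatMap(Enumerator.enumVector[F, A]))

// Hypothetical input standing in for the words of a file.
val words: Enumerator[Eval, String] =
  Enumerator.enumVector[Eval, String](Vector("bb", "a", "ccc", "ab", "b"))

// The tuple key compares lengths first and falls back to the word
// itself, using the standard library's Ordering for (Int, String).
val sorted: Vector[String] =
  words
    .through(sortBy[Eval, String, (Int, String)](w => w.length -> w))
    .toVector
    .value
```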
Teodor Dimov
@teodimoff
@travisbrown nope, it's not bad at all... thx
Srepfler Srdan
@schrepfler
:clap:
circe coming in 3… 2… 1…