Adelbert Chang
@adelbertc
cause it to actually attempt to collect the results and thereby OOM
Travis Brown
@travisbrown
@adelbertc yeah, you'd probably want to plug it into a different iteratee (a fold, etc.).
Adelbert Chang
@adelbertc
ah yes good call
apologies for the dumb questions, still wrapping my head around the iteratee model
Travis Brown
@travisbrown
no worries! but yeah, toVector generally isn't as useful as it looks.
Adelbert Chang
@adelbertc
sounds good
so if i just intend to use it to side effect
looks like i can just pipe it into a
fold(())((_, _) => ())
Travis Brown
@travisbrown
@adelbertc or foreach(_ => ())
Adelbert Chang
@adelbertc
aha. yes. or that.
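The fold-versus-foreach exchange above can be sketched with a deliberately tiny toy model (plain Scala, not the io.iteratee API; the names here are illustrative): an iteratee is, at heart, a fold over a stream's elements, and foreach(_ => ()) is just fold with the per-element effect made explicit.

```scala
// Toy model (NOT the io.iteratee API): an iteratee as a fold over a stream.
final case class ToyIteratee[E, A](run: Iterator[E] => A)

object ToyIteratee {
  // fold: consume the whole stream into an accumulator
  def fold[E, A](init: A)(f: (A, E) => A): ToyIteratee[E, A] =
    ToyIteratee(_.foldLeft(init)(f))

  // foreach: run an effect per element; equivalent to
  // fold(())((_, e) => f(e)), i.e. the fold(())((_, _) => ()) drain
  // with the effect moved into the iteratee itself
  def foreach[E](f: E => Unit): ToyIteratee[E, Unit] =
    fold(())((_, e) => f(e))
}
```

Both shapes consume everything and keep nothing, which is the point when the stream exists only for its side effects.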
Travis Brown
@travisbrown
@adelbertc I'm curious about the use case, though—typically you'd have the processing happen in an iteratee and plug your enumerator into that, rather than layering all the processing directly onto the enumerator.
Adelbert Chang
@adelbertc
@travisbrown it's very possible i'm structuring it wrong
Travis Brown
@travisbrown
I'm not sure "wrong" is the right word, but depending on how you compose the pieces they can be more or less reusable.
Adelbert Chang
@adelbertc
the use case is effectively i have A => Task[Unit], and i want to continuously pull off a queue (an Enumerator[Task, A], i believe) and run that function on elements
so the Enumerator is for all intents and purposes infinite
Travis Brown
@travisbrown
that sounds like creating a foreachM iteratee might be a better fit. then if you need to filter or transform the data, etc., you can plug a new piece in between the enumerator and iteratee more cleanly.
Adelbert Chang
@adelbertc
so would i have something like Enumerator[Task, RawData], a parser that looks like Iteratee[Task, RawData, ActualData], and then the thing that runs the task like Iteratee[Task, ActualData, Unit] via foreach ?
and then at the top level glue those pieces together?
Travis Brown
@travisbrown
well, the parser would be Enumeratee[Task, RawData, ActualData], but yeah, that sounds right.
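The pipeline shape being described — an enumerator of RawData, an enumeratee doing the parsing, and a foreach-style iteratee running the effect — can be sketched with a toy model (plain Scala, not the io.iteratee API, with Task left out for simplicity; all names here are illustrative):

```scala
// Toy model (NOT the io.iteratee API) of the three-stage pipeline:
// Enumerator[RawData] --parser--> ActualData --foreach--> effects
object PipelineSketch {
  type RawData    = String
  type ActualData = Int

  type Enumerator[E]    = Iterator[E]            // produces elements
  type Enumeratee[A, B] = Iterator[A] => Iterator[B] // transforms a stream
  type Iteratee[E, A]   = Iterator[E] => A       // consumes a stream

  // the "parser" stage: RawData in, ActualData out
  val parser: Enumeratee[RawData, ActualData] = _.map(_.trim.toInt)

  // a foreach-style iteratee that just runs an effect per element
  def foreach[E](f: E => Unit): Iteratee[E, Unit] = _.foreach(f)

  // top-level glue: plug the enumerator through the enumeratee into the iteratee
  def run(source: Enumerator[RawData], sink: Iteratee[ActualData, Unit]): Unit =
    sink(parser(source))
}
```

The payoff of this shape is that each stage is independently reusable: a filter or a different parser slots in between the enumerator and the iteratee without touching either end.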
Adelbert Chang
@adelbertc
looks up Enumeratee
Travis Brown
@travisbrown
…and foreachM if you need side effects in the iteratee.
Adelbert Chang
@adelbertc
and right
ah*
alright let me give this a go
hold my beer
Travis Brown
@travisbrown
an enumeratee is a transformer that can attach contramap-style to an iteratee (resulting in an iteratee), or map-style to an enumerator (resulting in an enumerator).
it doesn't have to maintain a 1:1 input-to-output mapping, though.
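The two attachment directions just described can be illustrated with the same kind of toy model (plain Scala, not the io.iteratee API; through, wrap, and evensOnly are hypothetical names for this sketch):

```scala
// Toy model (NOT the io.iteratee API): an enumeratee attaches in either direction.
object EnumerateeSketch {
  type Enumerator[E]    = Iterator[E]
  type Enumeratee[A, B] = Iterator[A] => Iterator[B]
  type Iteratee[E, A]   = Iterator[E] => A

  // map-style: attach to an enumerator, get an enumerator
  def through[A, B](src: Enumerator[A], et: Enumeratee[A, B]): Enumerator[B] =
    et(src)

  // contramap-style: attach to an iteratee, get an iteratee
  def wrap[A, B, R](et: Enumeratee[A, B], it: Iteratee[B, R]): Iteratee[A, R] =
    src => it(et(src))

  // not 1:1: an enumeratee may drop (or duplicate) elements
  val evensOnly: Enumeratee[Int, Int] = _.filter(_ % 2 == 0)
}
```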
Adelbert Chang
@adelbertc
ayyyyy it works
and i like it a lot more
thank you sir, you are a gentleman and a scholar
Travis Brown
@travisbrown
:+1:
Travis Brown
@travisbrown
@adelbertc I'll be curious to hear what you think about the iteratee model in general once you've spent some time with it. I've been (slowly) working on a blog post comparing it and fs2.
Adelbert Chang
@adelbertc
sounds good
dwhitney
@dwhitney
it seems that I am just following @adelbertc around since he's in all of the same gitter channels I am :)
Adelbert Chang
@adelbertc
;)
dwhitney
@dwhitney
anyway @travisbrown I just took the advice you gave me at SBTB and read up on your iteratee lib. Looks really cool!
RomanIakovlev
@RomanIakovlev

I figured I'd better bring this topic to the iteratee channel. As mentioned in the Circe channel, I want to produce json with Circe in a streaming fashion. I've tried to start and faced one question. Basically I need a function like this (imports omitted):

def writeJson[T: Encoder](enum: Enumerator[Task, T], file: File): Task[Unit] = {
    val printer = Printer.noSpaces.copy(dropNullKeys = true)
    val opener = Enumerator.enumOne[Task, String]("[")
    val closer = Enumerator.enumOne[Task, String]("]")
    val entries = enum.map(_.asJson.pretty(printer) + ",")
    opener.append(entries).append(closer).into(writeLines(file))
  }

The problem is the comma after the last entry, which makes the resulting json invalid. Is there a way to somehow introspect the Enumerator and to know if that’s the last entry, to handle it differently?

RomanIakovlev
@RomanIakovlev
Okay, I’ve found Enumeratee.intersperse, it solves this particular problem.
Travis Brown
@travisbrown
@RomanIakovlev right, intersperse is the way to go.
you could wrap all of that work up in an enumeratee, I guess—I haven't yet just because it's more straightforward than the decoding side.
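The intersperse fix can be illustrated with a self-contained sketch (plain Scala, not the io.iteratee API): emit the separator *between* entries rather than after each one, so the final entry carries no trailing comma.

```scala
// Toy sketch (NOT the io.iteratee API) of what Enumeratee.intersperse does:
// the separator only ever appears between two elements.
object IntersperseSketch {
  def intersperse[A](sep: A, xs: Iterator[A]): Iterator[A] =
    xs.zipWithIndex.flatMap {
      case (a, 0) => Iterator.single(a)  // first element: no separator
      case (a, _) => Iterator(sep, a)    // later elements: separator first
    }
}
```

In the writeJson function above, the entries stage would then drop the `+ ","` and pipe through Enumeratee.intersperse with "," instead (the exact call shape depends on your iteratee version's API), leaving the opener and closer unchanged.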
RomanIakovlev
@RomanIakovlev

It would be nice to have a streaming parser for non-trivial json structures too. By non-trivial I mean something like this:

{
  "my_objects": [… a huge list of objects …]
}

as opposed to just [… a huge list of objects …]. I have no idea how to approach it though.

Travis Brown
@travisbrown
@RomanIakovlev agreed, but figuring out the API for that is hard.
@RomanIakovlev I could imagine some navigation methods on the decoding enumeratees that allow you to navigate into the structure before streaming starts…
RomanIakovlev
@RomanIakovlev
It surprises me how efficient the streaming reading/writing is. A modest 150MB of serialized minified json required 8+ gigs of RAM to write and just under 8 gigs to read. Streaming makes it work in constant memory, under 4 gigs, for both reading and writing.
It forces a rigid array-of-objects structure, though. It's tolerable for now, because I own both the producer and consumer side.
Travis Brown
@travisbrown
@RomanIakovlev I'm more surprised about how bad that is on the non-streaming side…
RomanIakovlev
@RomanIakovlev
I didn't dig deep into it, so I can't provide much info now. I first had a non-trivial structure, an object with 2 huge arrays in it. It required a lot of RAM. When I switched to 2 separate arrays and streaming, it works with the default SBT memory size.
RomanIakovlev
@RomanIakovlev
To be clear, the memory size I'm describing is not exactly how much the json processing took, but how much I had to give SBT to make my applications run (in non-forked mode). The json part took somewhat less, but still most of that memory.
RomanIakovlev
@RomanIakovlev
@travisbrown I'm going to give an internal tech talk for my team at work about my experience using Circe streaming with iteratee. I'm trying to find good and concise definitions of iteratee's main classes — Iteratee, Enumeratee, and Enumerator — what they do on their own and how they interact with each other. IIRC you promised a blog post about the iteratee architecture some time ago (no pressure! :smile:), but in its absence, could you please explain here what they are and what they do, in general?