Travis Brown
@travisbrown
@RomanIakovlev I'm more surprised about how bad that is on the non-streaming side…
RomanIakovlev
@RomanIakovlev
I didn’t dig deep into it, so I can’t provide much info now. I first had a non-trivial structure, an object with 2 huge arrays in it. It required a lot of RAM. When I switched to 2 separate arrays and streaming, it worked with the default SBT memory size.
RomanIakovlev
@RomanIakovlev
To be clear, the memory size I’m describing is not exactly how much the JSON processing took, but how much I had to give SBT to make my applications run (in non-forked mode). The JSON part took somewhat less, but still most of that memory.
RomanIakovlev
@RomanIakovlev
@travisbrown I’m going to give an internal tech talk for my team at work about my experience using Circe streaming with Iteratee. I’m trying to find good, concise definitions of Iteratee’s main classes: Iteratee, Enumeratee and Enumerator, what they do on their own and how they interact with each other. IIRC you promised a blog post about Iteratee architecture some time ago (no pressure! :smile:), but in its absence, could you please explain here what they are and what they do, in general?
Travis Brown
@travisbrown
@RomanIakovlev cool! have you seen the descriptions in my original blog post?
@RomanIakovlev depending on the audience enumerator = stream, enumerate = transformer, iteratee = fold (or sink) might work.
s/enumerate/enumeratee
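The stream / transformer / fold analogy can be sketched with plain Scala collections. This is only a mental model, not the io.iteratee API: the type aliases `Source`, `Transformer` and `Sink` and the object name are made up for illustration.

```scala
// Plain-Scala analogy only; none of these names come from io.iteratee.
object StreamAnalogy {
  // "Enumerator": a reusable description of a stream of values.
  type Source[A] = () => Iterator[A]
  // "Enumeratee": a transformer from one stream to another.
  type Transformer[A, B] = Iterator[A] => Iterator[B]
  // "Iteratee": a fold (sink) that consumes a stream and yields a result.
  type Sink[A, R] = Iterator[A] => R

  val words: Source[String] = () => Iterator("a", "aaa", "aa")
  val lengths: Transformer[String, Int] = _.map(_.length)
  val total: Sink[Int, Int] = _.foldLeft(0)(_ + _)

  // Source -> Transformer -> Sink, evaluated only when run() is called.
  def run(): Int = total(lengths(words()))
}
```

Because `words` builds a new iterator on every call, `run()` can be invoked repeatedly and gives the same answer each time.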
RomanIakovlev
@RomanIakovlev
I don't think I've seen any blog post about Iteratee.
RomanIakovlev
@RomanIakovlev
RomanIakovlev
@RomanIakovlev
This blog post definitely helps! I guess my questions are pretty much covered there. I had somehow missed it before.
Teodor Dimov
@teodimoff
i tried the library and i liked it ... 10x
Teodor Dimov
@teodimoff
@travisbrown i was looking at fs2 , but i am a fan of your work :D
Travis Brown
@travisbrown
@teodimoff thanks! there are plenty of things fs2 can do that are out of scope for this project, but I find this model works for a lot of the stuff I need, and it's simpler / faster.
Teodor Dimov
@teodimoff
Agreed. The three abstractions are composable enough and flexible enough to make things happen... quickly
Teodor Dimov
@teodimoff
@travisbrown closest thing to Iteratee.sortBy(w => w.length -> w)? i want the whole file sorted by length and then in alphabetical order.
Travis Brown
@travisbrown
@teodimoff there's not a super nice way, since none of the off-the-shelf enumeratees require gathering all elements in memory…
@teodimoff this isn't too terrible, though:
scala> import cats.Monad
import cats.Monad

scala> import io.iteratee.{ Enumeratee, Enumerator, Iteratee }
import io.iteratee.{Enumeratee, Enumerator, Iteratee}

scala> def sortBy[F[_]: Monad, A, B: Ordering](f: A => B): Enumeratee[F, A, A] =
     |   Enumeratee.sequenceI(Iteratee.consume[F, A]).map(_.sortBy(f)).andThen(Enumeratee.flatMap(Enumerator.enumVector[F, A]))
sortBy: [F[_], A, B](f: A => B)(implicit evidence$1: cats.Monad[F], implicit evidence$2: Ordering[B])io.iteratee.Enumeratee[F,A,A]

scala> import cats.instances.option._
import cats.instances.option._

scala> import io.iteratee.modules.option._
import io.iteratee.modules.option._

scala> enumVector(Vector("a", "aaa", "aa")).through(sortBy((_: String).length)).toVector
res0: Option[Vector[String]] = Some(Vector(a, aa, aaa))
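The session above sorts by length only. The original question asked for length first and then alphabetical order, which is what the tuple key `w => w.length -> w` expresses. Since the standard library provides an `Ordering` for tuples, that key should also satisfy the `B: Ordering` bound of the `sortBy` enumeratee above, though that was not tried here. A plain-collections sketch of the same buffer, sort and re-emit idea (the helper `sortedBy` is hypothetical, not part of io.iteratee):

```scala
object SortKey {
  // Sorting needs the whole input in memory: buffer, sort, then re-emit.
  def sortedBy[A, B: Ordering](input: Iterator[A])(f: A => B): Iterator[A] =
    input.toVector.sortBy(f).iterator

  val words: Vector[String] = Vector("bb", "a", "aaa", "ab")

  // Tuple key: compare by length first, then alphabetically within a length.
  val sorted: Vector[String] =
    sortedBy(words.iterator)(w => w.length -> w).toVector
}
```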
Teodor Dimov
@teodimoff
@travisbrown nope its not bad at all... thx
Travis Brown
@travisbrown
Srepfler Srdan
@schrepfler
:clap:
circe coming in 3… 2… 1…
Travis Brown
@travisbrown
@schrepfler yep :smile:
John Sullivan
@sullivan-
so i wrote this method to convert a scala.collection.Iterator into a io.iteratee.Enumerator. it seems to be working, but i wonder if ppl would be willing to look it over to see what you think? I am still pretty shaky with this cats stuff...
   def toEnumerator[F[_], E](iterator: => Iterator[E])(implicit F: Monad[F]): Enumerator[F, E] = {
     new Enumerator[F, E] {
       final def apply[A](step: Step[F, E, A]): F[Step[F, E, A]] = {
         if (iterator.hasNext) {
           F.flatMap(step.feedEl(iterator.next))(s => apply[A](s))
         } else {
           F.pure(step)
         }
       }
     }
   }
John Sullivan
@sullivan-
@travisbrown i would especially appreciate your opinion
Travis Brown
@travisbrown
@sullivan- looks reasonable to me (you could add chunking, etc., but that's just an optimization). the only reason we don't provide something exactly like that is because the resulting enumerator inherits the mutability of the iterator.
John Sullivan
@sullivan-
thanks @travisbrown ! when you say "inherits the mutability" do you mean something like other holders of the Iterator might call methods on the iterator, which would cause the enumerator to behave differently?
Travis Brown
@travisbrown

@sullivan- that's part of it, but it's more about the mutability breaking referential transparency even when you're working just with the enumerator itself—e.g. even if you're reading a file like this:

val e = io.iteratee.monix.task.readLines(new java.io.File("build.sbt"))

you can reuse e as many times as you like and never have to worry about the internal state of the enumerator, etc.

@sullivan- that wouldn't be the case for an enumerator you'd get from toEnumerator.
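The referential-transparency point can be seen with the standard library alone. In this minimal sketch (no io.iteratee involved), the same expression on an `Iterator` gives different results on the first and second evaluation, while an `Iterable` hands out a fresh iterator each time:

```scala
object IteratorReuse {
  // An Iterator is stateful: the first traversal uses it up.
  val it: Iterator[Int] = Iterator(1, 2, 3)
  val firstPass: Int = it.sum   // consumes all three elements
  val secondPass: Int = it.sum  // same expression, but nothing is left to sum

  // An Iterable produces a new iterator per traversal, so it can be reused.
  val xs: Iterable[Int] = List(1, 2, 3)
  val again1: Int = xs.iterator.sum
  val again2: Int = xs.iterator.sum
}
```

An enumerator built directly on a single `Iterator` would behave like `it` here: running it a second time would see whatever the first run left behind.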
John Sullivan
@sullivan-

Got it! Thanks, that makes sense. So if I changed the signature from

def toCatsEnumerator[F[_], E](iterator: => Iterator[E])(implicit F: Monad[F])

to

def toCatsEnumerator[F[_], E](iteratorGen: () => Iterator[E])(implicit F: Monad[F])

i could theoretically get around that problem by calling iteratorGen in the right place inside the Enumerator, right?

Ghost
@ghost~55118a7f15522ed4b3ddbe95
That's still not enough since iteratorGen could be returning the same iterator each time.
John Sullivan
@sullivan-
so my assumption is that the caller of this toCatsEnum method knows they have to provide a function that is going to produce a new, equivalent iterator each time. e.g., i could put that in the scaladoc comment
John Sullivan
@sullivan-
@travisbrown you're right, chunking would help. its taking like 20 seconds to process 5M ints..
John Sullivan
@sullivan-
(for the record, that 20 seconds was due to a bad fold somewhere else)
Ghost
@ghost~55118a7f15522ed4b3ddbe95
@sullivan- Then it doesn't really matter if you take your Iterator by => or by () =>.
They are almost entirely equivalent.
The former is syntactically easier to use.
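One detail worth keeping in mind with both forms: a by-name argument is re-evaluated at every mention inside the method, just as a thunk is re-run at every `mk()`. The sketch below uses only the standard library; all names in it are hypothetical. If the caller passes an expression that builds a new iterator, each mention sees a fresh one.

```scala
object ByNameVsThunk {
  var evaluations = 0
  def fresh(): Iterator[Int] = { evaluations += 1; Iterator(1, 2, 3) }

  // By-name: the argument expression is re-run at every mention of `it`.
  def firstTwoByName(it: => Iterator[Int]): (Int, Int) =
    (it.next(), it.next())

  // Explicit thunk: same semantics, but each re-evaluation is visible as `mk()`.
  def firstTwoThunk(mk: () => Iterator[Int]): (Int, Int) =
    (mk().next(), mk().next())

  // Evaluate the by-name argument once, then share the single iterator.
  def firstTwoShared(it: => Iterator[Int]): (Int, Int) = {
    val i = it
    (i.next(), i.next())
  }

  val byName: (Int, Int) = firstTwoByName(fresh())
  val thunk: (Int, Int) = firstTwoThunk(() => fresh())
  val shared: (Int, Int) = firstTwoShared(fresh())
}
```

In a method shaped like `toEnumerator` above, this suggests binding the iterator to a local `val` once per run, assuming the caller may pass something other than an already-created iterator.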
John Sullivan
@sullivan-
thanks @alexknvl . yeah that was just sloppy on my part
Travis Brown
@travisbrown
@sullivan- @alexknvl I'd probably use Iterable since it basically means "something that can produce an iterator" but has the additional conventional requirement that it's a fresh one.
(Of course as far as the types are concerned Iterable[A] and () => A are equivalent.)
Travis Brown
@travisbrown
Also I think the syntactic ease of use of => Iterator[A] is exactly what I'd not want here, personally.
Sorry, not at a computer—meant () => Iterator[A] two messages up.
John Sullivan
@sullivan-
that totally makes sense, thanks Travis. It turns out what I actually want is something more like () => Iterator[A] with java.io.Closeable, so the enumerator is able to call iter.close() when the enumeration is done early.
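A rough standard-library sketch of the `() => Iterator[A] with java.io.Closeable` idea follows: the consumer stops early and still closes the underlying resource. The `Numbers` class and `takeThenClose` helper are hypothetical and stand in for whatever the enumerator would do internally.

```scala
import java.io.Closeable

object EarlyClose {
  // Hypothetical resource-backed iterator; `closed` only records that close() ran.
  final class Numbers(limit: Int) extends Iterator[Int] with Closeable {
    private var current = 0
    var closed = false
    def hasNext: Boolean = !closed && current < limit
    def next(): Int = { current += 1; current }
    def close(): Unit = { closed = true }
  }

  // Take at most n elements, then close the source even though it is not exhausted.
  def takeThenClose[A](mk: () => Iterator[A] with Closeable, n: Int): Vector[A] = {
    val it = mk()
    try it.take(n).toVector
    finally it.close()
  }

  val source = new Numbers(1000000)
  val firstThree: Vector[Int] = takeThenClose[Int](() => source, 3)
}
```

The `finally` block is what guarantees the close on early termination or on an exception during traversal.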
John Sullivan
@sullivan-
Hey @travisbrown, I just wanted to let you know I got a version now that produces a reusable Enumerator from an Iterator! (I'm not expecting you to care that much, I'm just excited and have to share :smile: ) https://github.com/longevityframework/unblocking/blob/master/core/src/main/scala/unblocking/ToCatsEnumerator.scala
Travis Brown
@travisbrown
@sullivan- :+1:
John Sullivan
@sullivan-
it occurred to me that converters from/to org.reactivestreams.Publisher[A] would probably be much more sensible than converters from/to () => Iterator[A] with Closeable
Travis Brown
@travisbrown
@sullivan- I'd like to have a reactivestreams conversions module—just haven't had the time to work on one myself (hint, hint :smile:).
John Sullivan
@sullivan-
I'm probably going to write one soon. I'd be happy to contribute it to your codebase. I'd probably need your assistance in figuring out where to put impls and tests, and how to integrate these things with your codebase. How about I write an implementation in my own repo first, and then if you like it, we can talk about how to get it into yours?
There is also this whole TCK Java-API testing framework thingie that I'm not sure how much I want to deal with.. I guess I ought to at least take a look: http://www.reactive-streams.org/reactive-streams-tck-1.0.0-javadoc/
Travis Brown
@travisbrown
@sullivan- :+1:, sounds good to me!