Pavel Chlupacek
@pchlupacek
I experimented with this back around M3, but it didn't work, partly due to bugs in cats-effect I think. But now I am quite confident this should work.
Fabio Labella
@SystemFw

I'm just saying that you can't solve this:

def a = Stream(streamA1.drain, streamA2.drain).join
def b = Stream(streamB1.drain, streamB2.drain).join
def c = Stream(a.drain, b.drain).join

you can reduce the initial cost of join, but you can't make the above be the same as

def c = Stream(streamA1.drain, streamA2.drain, streamB1.drain, streamB2.drain).join

unless you make join cost 0 (allocate no new primitives per join)

I do agree that a joinDrain or something can be useful to reduce the cost of a single join in the first place, given the very common use case of just concurrently running things for their effect (sketched below)
unless we start doing magic things like introducing concurrency in the Stream algebra tree, with on-the-fly rewriting and stuff like that
but basically 1) modularity in the code, 2) modularity in the interpreter, and 3) no extra allocation of machinery seem impossible to have all at once -- you can pick at most two
our current situation gives us 1) and 2) without 3) -- first code snippet
or 2) and 3) without 1) -- second code snippet
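(For illustration, a hypothetical joinDrain built on fs2 1.0's parJoin, which was called join in earlier versions -- the name and signature are assumptions, and it only packages the pattern rather than removing the per-join allocation discussed above:)

import cats.effect.Concurrent
import fs2.Stream

// Hypothetical combinator: run all inner streams concurrently purely for
// their effects, emitting nothing downstream.
def joinDrain[F[_]: Concurrent, A](maxOpen: Int)(
    outer: Stream[F, Stream[F, A]]): Stream[F, Nothing] =
  outer.map(_.drain).parJoin(maxOpen)

// e.g. joinDrain(4)(Stream(streamA1, streamA2, streamB1, streamB2))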
Pavel Chlupacek
@pchlupacek
@SystemFw In fact, I think we may perhaps solve even this nested situation. That would, however, require additions to the algebra and more experimenting :-)
Fabio Labella
@SystemFw
yeah, that would be 1) and 3) without 2)
the interpreter needs to know about concurrency
:P
Pavel Chlupacek
@pchlupacek
yup
Fabio Labella
@SystemFw
I'm a bit scared of going that direction unless we can make it very general
Christopher Davenport
@ChristopherDavenport
Speaking of Pulls, could someone explain the pattern for using echo?
I have a function which I think could be massively simplified.
import atto._, Atto._                  // assuming Parser/ParseResult come from atto
import atto.ParseResult.Partial
import fs2.{Pipe, Pull, Stream}

/** Parse a stream and return a single terminal ParseResult. */
def parse1[F[_], A](p: Parser[A]): Pipe[F, String, ParseResult[A]] = s => {
  def go(r: ParseResult[A])(s: Stream[F, String]): Pull[F, ParseResult[A], Unit] =
    r match {
      case Partial(_) =>
        s.pull.uncons1.flatMap {
          // feed the next string into the partial result if the stream has more values
          case Some((str, rest)) => go(r.feed(str))(rest)
          // reached stream termination while still partial: emit the partial result
          case None => Pull.output1(r)
        }
      // done or failed: emit the terminal result
      case _ => Pull.output1(r)
    }
  go(p.parse(""))(s).stream
}
I think go is terribly written compared to what it could be.
Fabio Labella
@SystemFw
echo is just a Pull that emits the whole remaining stream
it's useful when you're done processing in a pull loop and just want to emit the rest unchanged
example here
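(A minimal sketch of the idea, assuming fs2 1.0's Pull API; the pipe and its name are hypothetical:)

import fs2.{Pipe, Pull, Stream}

// Hypothetical pipe: drop elements while the predicate holds, then emit
// everything else unchanged via echo.
def dropWhileThenEcho[F[_], A](p: A => Boolean): Pipe[F, A, A] = s => {
  def go(s: Stream[F, A]): Pull[F, A, Unit] =
    s.pull.uncons1.flatMap {
      case Some((a, rest)) =>
        if (p(a)) go(rest)                      // still dropping
        else Pull.output1(a) >> rest.pull.echo  // done processing: emit a, echo the rest
      case None => Pull.done
    }
  go(s).stream
}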
Ross A. Baker
@rossabaker
@pchlupacek @ChristopherDavenport A regular scodec dependency in core won't work because of shapeless. shapeless-2.3.3 in a Spark shop is poison.
This was a huge pain at Verizon, and the main reason we forked parboiled2 instead of using the mainline one.
An internal version of scodec might be a nicer API, so that could be considered. But an http4s-kernel with a hard shapeless dependency would not work for a lot of people.
Michael Pilquist
@mpilquist
Is that because Spark depends on an old Shapeless?
Ross A. Baker
@rossabaker
Yes. 2.3.2. Which doesn't work with 2.3.3.
This fact will be mentioned on my gravestone.
Peter Aaser
@PeterAaser
\o/
Michael Pilquist
@mpilquist
Wait really?
Peter Aaser
@PeterAaser
joys of programming
Michael Pilquist
@mpilquist
I have never run into that before
Fabio Labella
@SystemFw
btw I'm coming around to core being small enough to be a serviceable kernel, more or less
Ross A. Baker
@rossabaker
Any Netty or Shapeless dependencies in a Spark environment mean you're going to have a bad day or two.
You can shade them with the assembly plugin, but then getting your tests to run against the shaded version is a whole other puddle of quicksand.
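(A shade rule of the sort Ross describes might look like this with sbt-assembly -- a sketch, with an arbitrary target package name:)

// In build.sbt: rewrite all shapeless references in the fat jar
// under a private package so they can't clash with Spark's copy.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("shapeless.**" -> "myapp_shaded.shapeless.@1").inAll
)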
Michael Pilquist
@mpilquist
Hm, how is shapeless 2.3.3 not binary compatible with 2.3.2?
Long Cao
@longcao
@rossabaker we are using Shapeless 2.3.3 and Spark 2.3.1... I wonder how I have not been bitten by this?
Michael Pilquist
@mpilquist
I've used 2.3.x interchangeably for years and have never had a problem
Ross A. Baker
@rossabaker
It's pretty easy to make a Spark job that works. It's pretty easy to align a classpath that works. And it's really, really, really hard to make a build where they both work.
Ross A. Baker
@rossabaker
Unidirectional or bidirectional compatibility?
Because the older version is the one Spark foists upon you.
Michael Pilquist
@mpilquist
Unidirectional -- if you build against 2.3.3, then you need to run against 2.3.3. Anything built against 2.3.{0,1,2} will link and work fine when run against 2.3.3.
Ross A. Baker
@rossabaker
And that's the problem.
Because Spark loads 2.3.2, even if your assembly includes 2.3.3.
If you don't shade your 2.3.3-compiled code, you lose.
Michael Pilquist
@mpilquist
I don't know Spark at all beyond knowing there's a thing called RDD, but is there not a way to customize the Spark classloader? Like the equivalent of changing the Tomcat classloader, as opposed to WAR JARs?
Ross A. Baker
@rossabaker
I imagine it's possible, but we use EMR, so that would fight against the "Hey, Amazon, spin up a cluster for me!" model.
I have some scars from commons-logging like 15 years ago in Tomcat environments, but this is just not really a problem on servlet deployments anymore. They isolate container classloaders from application classloaders in a way that doesn't happen in Spark.
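(For what it's worth, Spark does expose experimental settings that prefer the application's jars over Spark's own, though that is weaker than real classloader isolation and may not help on EMR:)

import org.apache.spark.SparkConf

// Experimental Spark flags: load the application's classes before Spark's
// bundled ones on the driver and on executors.
val conf = new SparkConf()
  .set("spark.driver.userClassPathFirst", "true")
  .set("spark.executor.userClassPathFirst", "true")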
Ross A. Baker
@rossabaker
I ran into further fun with shading when shapeless was introduced by a macro. I still don't understand why that made a difference, because macros are compile time, and shading should happen after compilation. But I had to fork that library and manually change all the references to shapeless.
Long Cao
@longcao
@rossabaker I'm really curious what runtime error you're running into
we definitely have our own shapeless code on 2.3.3 and Spark 2.3.1 on EMR, but I don't recall adding any special shading rules specifically for shapeless