Ross A. Baker
@rossabaker

I don't know all the types there, but if you're doing big ones, they get persisted to a temp file. That means:

  1. They shouldn't accrue in memory just by calling parseStreamedFile. If they do, that's an http4s bug.
  2. If you process each Part in the Multipart in a streaming fashion, it should run in constant space with respect to the file sizes.
  3. If you buffer each Part into a big blob to write at once, you'll have choppy memory usage, but it should still free when you're done. If it's not, make sure nothing in your code is holding a reference to it.

Using a profiler would help a lot in seeing who is really holding the reference.
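
To make the difference between cases 2 and 3 concrete, here is a minimal sketch using plain java.io streams rather than http4s or fs2. The object and method names (StreamingSketch, copyStreaming, copyBuffered) are hypothetical and only illustrate the two memory profiles; this is not how http4s implements multipart handling.

```scala
import java.io.{InputStream, OutputStream}

object StreamingSketch {
  // Case 2: copy through a fixed-size buffer. Memory use is bounded by
  // bufferSize, regardless of how large the input is.
  def copyStreaming(in: InputStream, out: OutputStream, bufferSize: Int = 8192): Long = {
    val buffer = new Array[Byte](bufferSize)
    var total = 0L
    var read = in.read(buffer)
    while (read != -1) {
      out.write(buffer, 0, read)
      total += read
      read = in.read(buffer)
    }
    total
  }

  // Case 3: materialize the whole input as one array, then write it at once.
  // Memory use spikes to the input size until `blob` becomes unreachable.
  def copyBuffered(in: InputStream, out: OutputStream): Long = {
    val blob = in.readAllBytes() // requires Java 9+
    out.write(blob)
    blob.length.toLong
  }
}
```

Both methods produce the same output; only the peak memory differs. If memory never comes back down after the buffered variant finishes, something is probably still holding a reference to the blob.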

alexh2o17
@alexh2o17_gitlab
I get an entity body from each body part in the multipart and then send each of them in a separate request, streamed through the blaze client
I can try with a profiler
Ross A. Baker
@rossabaker
If it's EntityBody => EntityBody, it sounds like you're in case 2 above.
If you have a good test environment and can just do nothing with the uploads, that would be interesting. Or just drain each entity without sending it via blaze.
See if we can simplify the process that causes the leak.
Rob Norris
@tpolecat
I have an AuthedRoutes and a middleware that reduces it to HttpRoutes. If the middleware is able to extract a user from the request all is well. If it fails to extract a user I can't really return 403 because then the routes can't be composed anymore because they respond to all requests.
So I'm thinking maybe AuthedRoutes isn't great and I should have code to extract the user in each route where I need one and return 403 where it's appropriate.
Or maybe there is a pattern I'm not seeing.
Fabio Labella
@SystemFw
are you using AuthedRoutes for actual auth, or for user extraction? (or both I guess)
Rob Norris
@tpolecat
It's just pulling the user out of a JWT if one is present in the headers.
Fabio Labella
@SystemFw
I think AuthedRoutes was generalised/renamed to ContextRoutes
Rob Norris
@tpolecat
If it's not present then maybe it's ok because maybe there are other routes later on that will match the request.
hm
Fabio Labella
@SystemFw
so perhaps you could have Either[FailedToExtract, User] as your context
but I don't really know what I'm talking about
but couldn't you pass through in that case though?
if it fails to extract the user, return None
Rob Norris
@tpolecat
Right but that means I get a 404 instead of a 403 if I try to do something legit but fail to pass a user.
Ryan Zeigler
@rzeigler
How does Multipart work? I see that each part has a body that is a Stream[F, Byte]. How does that work with parsing the entire Multipart ahead of time with the EntityDecoder?
Fabio Labella
@SystemFw

@tpolecat

Right but that means I get a 404 instead of a 403 if I try to do something legit but fail to pass a user.

yeah fair enough, sounds like the actual logic for that needs to be at route level then as you suspected, or it would be in contradiction with

I can't really return 403 because then the routes can't be composed anymore because they respond to all requests.

so perhaps you could have Either[FailedToExtract, User] as your context

I think you can still do this if you want to keep that extraction in a middleware and use ContextRoutes. Also I guess Option[User] is enough
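
A minimal sketch of that idea, using a simplified stand-in model instead of the real http4s types: routes are plain functions returning Option, and Request, Response, User, extractUser and withUser are all hypothetical names made up for illustration. The middleware only extracts the user and never rejects, so composition still works, and each route decides for itself whether a missing user is a 403.

```scala
object AuthSketch {
  final case class Request(path: String, headers: Map[String, String])
  final case class Response(status: Int, body: String)
  final case class User(name: String)

  // A route either handles the request (Some) or passes on it (None).
  type Routes = Request => Option[Response]
  type ContextRoutes[A] = (A, Request) => Option[Response]

  // Hypothetical extraction: a stand-in for pulling a user out of a JWT.
  def extractUser(req: Request): Option[User] =
    req.headers.get("Authorization").map(name => User(name))

  // The middleware never responds on its own; it only supplies the context.
  def withUser(routes: ContextRoutes[Option[User]]): Routes =
    req => routes(extractUser(req), req)

  // The 403 decision lives in the route that actually matched the path.
  val secured: ContextRoutes[Option[User]] = (user, req) =>
    (user, req.path) match {
      case (Some(u), "/profile") => Some(Response(200, s"hello ${u.name}"))
      case (None, "/profile")    => Some(Response(403, "forbidden"))
      case _                     => None // not ours, let later routes try
    }

  val publicRoutes: Routes = req =>
    if (req.path == "/health") Some(Response(200, "ok")) else None

  def orElse(a: Routes, b: Routes): Routes = req => a(req).orElse(b(req))

  // Anything no route claims becomes a 404.
  val app: Request => Response = req =>
    orElse(withUser(secured), publicRoutes)(req).getOrElse(Response(404, "not found"))
}
```

With this shape, a request to /profile without a user gets 403, /health still works without a user, and an unknown path still falls through to 404.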

Ryan Zeigler
@rzeigler
We ran into this, still looking at solutions but the primary consideration is replacing the OptionT in the HttpRoutes with something that can encode 'user is not authorized for subtree, but please keep looking'
Fabio Labella
@SystemFw
I think you can do that
withFallThrough
the problem is that if no other route matches you will respond with 404
I also think you might be able to hack something that can keep track of the first matching but unauthorised route with a Ref, though it gets unpleasant to think about
although it might work for Rob's case too
it boils down to how you want to slice the ugliness between routes and middleware, which in turn depends on how many routes you have etc
Ross A. Baker
@rossabaker
@rzeigler Yeah, multipart is a pain, because it's essentially a stream of streams, and you can't get later parts without buffering prior parts.
It's gotten a bit of a facelift since I last used it personally, so take this all with a grain of salt, and improving the docs here would be helpful:
The EntityDecoder for Multipart is going to have to buffer it. But large parts are written to a temp file, so that doesn't imply the whole thing is in memory. But if you want to process a huge upload in a streaming fashion, that's probably not what you want.
Ryan Zeigler
@rzeigler
mmm, I don't really have that much data so buffering isn't an issue
I went poking around inside of MultipartParser and that was magical
Ross A. Baker
@rossabaker
Yeah, that was a real pain to port to fs2, and the person who did it hasn't been around lately.
Ryan Zeigler
@rzeigler
I enjoy that the last commit message is you saying Good grief
Ross A. Baker
@rossabaker
When I wrote the original, there were two interfaces: an EntityDecoder[F, Multipart[F]] (I think we weren't parametric in F back then, but whatever), and a Pipe[F, Byte, Either[Headers, Chunk[Byte]]]
:laughing: That was a scalafmt, but it still is fair.
That latter pipe signature was an extremely low-level interface. Think of it as more like an "event based" model. A bit like SAX, if you've ever had the misfortune of parsing XML that way.
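
As a rough illustration of that event-based shape, here is a sketch with plain Scala collections standing in for fs2: a Left marks the start of a new part with its headers, and each Right carries a chunk of the current part's body. The names Event, Part and assemble, and the simplified Headers alias, are assumptions for this example and not the http4s definitions.

```scala
object EventSketch {
  type Headers = Map[String, String]
  // Left = "a new part starts, here are its headers"; Right = "more body bytes".
  type Event = Either[Headers, Vector[Byte]]
  final case class Part(headers: Headers, body: Vector[Byte])

  // Reassemble whole parts from the flat event sequence.
  def assemble(events: Iterator[Event]): Vector[Part] =
    events.foldLeft(Vector.empty[Part]) {
      case (parts, Left(headers)) =>
        parts :+ Part(headers, Vector.empty)
      case (parts, Right(chunk)) if parts.nonEmpty =>
        val current = parts.last
        parts.init :+ current.copy(body = current.body ++ chunk)
      case (parts, Right(_)) =>
        parts // a chunk before any headers is ignored in this sketch
    }
}
```

The consumer has to track which part it is in, which is what makes this style feel like SAX: flexible and streaming-friendly, but not pleasant to use directly.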
Ryan Zeigler
@rzeigler
the shape isn't uncommon. i've written push parsers before and that's not terribly fun
Ross A. Baker
@rossabaker
There's a parseToPartsStream now, which is Pipe[F, Byte, Part[F]]
From the type, it's not certain whether it parses the bodies strictly, or whether you have to consume each part before progressing to the next. That's a big hole in the (scala)docs for this, if someone cares to fill it.
Ryan Zeigler
@rzeigler
This may just be the fact that I only understand fs2 as a user, but I'm mystified that it seems to be possible to get a Multipart from a stream which must contain all the chunks' body streams
but there doesn't appear to be any buffering mechanism
Ross A. Baker
@rossabaker
Well, all those part bodies can be either buffered in memory, or represent a temp file that was created as we parse.
Ryan Zeigler
@rzeigler
Stream[F, Part[F]] makes sense, I just don't understand how one then gets a Multipart out of those parts before draining the whole stream
Ross A. Baker
@rossabaker
The parts in the multipart are a Vector, not a stream of parts. (Thinking aloud, after all this time: why not?)
Ryan Zeigler
@rzeigler
yes, that is correct, it is constructed as a .fold(Vector.empty)(_ :+ _) though
on the part stream
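
A small sketch of why that fold answers the question above, using a lazy Iterator as a stand-in for the fs2 stream of parts (FoldSketch, collect and demo are hypothetical names): the Vector only exists once every element has been pulled, so the source is fully drained before a Multipart-like value can be built.

```scala
object FoldSketch {
  // Same shape as the fold quoted above, on a plain Iterator.
  def collect[A](parts: Iterator[A]): Vector[A] =
    parts.foldLeft(Vector.empty[A])(_ :+ _)

  // Counts how many elements were pulled from the lazy source by the fold.
  def demo(): (Vector[String], Int) = {
    var pulled = 0
    val source = Iterator.tabulate(3) { i => pulled += 1; s"part-$i" }
    val all = collect(source)
    (all, pulled)
  }
}
```

So there is no hidden buffering trick: by the time the Vector is available, each part body has already been read, either into memory or into a temp file.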
Ross A. Baker
@rossabaker
It's a difficult model to make properly streaming.
We can buffer it. We can memoize and do some lazy parsing, but it's tricky not to leak memory. Or we can provide that low level sum type pushy interface.