by

Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Activity
  • 11:12
    vytas7 commented #1758
  • 11:04
    CaselIT commented #1758
  • 11:00
    vytas7 commented #1589
  • 10:56
    vytas7 commented #1589
  • 10:43
    vytas7 commented #1589
  • Aug 10 20:00
    codecov[bot] commented #1759
  • Aug 10 19:58
    codecov[bot] commented #1759
  • Aug 10 19:54
    codecov[bot] commented #1759
  • Aug 10 19:53
    CaselIT review_requested #1759
  • Aug 10 19:53
    codecov[bot] commented #1759
  • Aug 10 19:53
    CaselIT review_requested #1759
  • Aug 10 19:52
    CaselIT opened #1759
  • Aug 10 19:50

    CaselIT on middleware-typo

    docs(middleware): fix typo (compare)

  • Aug 09 19:36
    codecov[bot] commented #1754
  • Aug 09 19:35
    codecov[bot] commented #1754
  • Aug 09 19:33
    CaselIT commented #1758
  • Aug 09 17:11
    codecov[bot] commented #1732
  • Aug 09 17:10
    codecov[bot] commented #1732
  • Aug 09 17:09
    codecov[bot] commented #1732
  • Aug 09 17:09
    codecov[bot] commented #1732
Federico Caselli
@CaselIT
I'm thinking at your suggestion, and probably the best solution is to have this on all media handlers. Now for some handler it may not make sense, but if worded correctly it should be ok.
Vytautas Liuolia
@vytas7
For MultipartFormHandler this probably doesn't make much sense. And for URLEncodedFormHandler it is not needed/not applicable.
But we could add it for JSON, yes. And probably MessagePack, although I'm less convinced of that case.
Federico Caselli
@CaselIT
well multipart on empty body would just return an empty iterable
Vytautas Liuolia
@vytas7
Technically yes, but I'm not sure if it makes sense logically when thinking of HTML forms. Anyway, it could be added, yes.
Federico Caselli
@CaselIT
currently how it behaves?
Vytautas Liuolia
@vytas7
Raises an error. I mean, if you gave it a chance to, such as in the ASGI flavor.
For JSONHandler, it would probably be good to add it regardless of the framework design we settle on.
Because I noticed some people swear by None, whereas others expect a {} (even though JSON does not strictly need to be a dict).
vytas7 @vytas7 is checking browser console :slight_smile:
Federico Caselli
@CaselIT
just to clarify, you idea was to not modify the get_media (other than making it deserialize an empty body), but handle it in each handler?
the browser errors
Vytautas Liuolia
@vytas7
Wow found that out, yeah :slight_smile: I thought the browser happily gives something, that's where people are expecting it not to error.
I don't have a strong preference / a solution that I would feel is clearly the best :grimacing:
But my idea was that JSONHandler could have the empty value option regardless of the remaining of the framework.
And (I'm less convinced myself...) the remaining of the framework could probably use your machinery to raise specialized missing body exceptions (?)
Vytautas Liuolia
@vytas7
Getting back to the interesting discussion with @sisp btw.
Shall we reopen falconry/falcon#1499, add comments to the WSGI middleware issue, or create a new one?
31 replies
Federico Caselli
@CaselIT

well multipart on empty body would just return an empty iterable

I think that multipart actually should return an empty iterator (aka empty tuple) when the body is empty, at least I think it's what makes the most sense in this case since we are iterating over it anyway

Vytautas Liuolia
@vytas7
I think I disagree, but I actually developed that based on popular demand from our community and sponsors, so I might not be the right person to ask.
Because if a client submits a multipart/form-data with a boundary parameter, and that boundary is not found in the payload, that is a bad request in my estimation.
Federico Caselli
@CaselIT
yes, that is an error. but that would not be an empty body
empty body -> len(body) == 0
Vytautas Liuolia
@vytas7
The boundary parameter is provided in the Content-Type header. Body can still be unexpectedly empty.
Federico Caselli
@CaselIT
I had that doubt. I guess that is the age old question. One can also interpret that an empty body was delimited by the boundary, just no data was sent
still it could use an option regarding what to do when an empty body is encountered
Vytautas Liuolia
@vytas7
No, even an empty form (although that's probably forbidden by the spec), should have both the prologue and epilogue boundaries.
Federico Caselli
@CaselIT
did not know there were those
I actually like better the option to set that parameter in the media hander instead of each time we call get_media
Vytautas Liuolia
@vytas7
(I'm not sure these are actually called prologue and epilogue in this context. But you get the idea: https://ec.haxx.se/http/http-multipart#the-http-this-generates )
(The example form begins with --------------------------d74496d66958873e, and ends with --------------------------d74496d66958873e--).
Federico Caselli
@CaselIT
why did the feel the need to add a random boundary in the spec? wouldn't a standard one with escape rules suffice?
Are escape rules still needed even in the current spec?
Vytautas Liuolia
@vytas7
The boundary is usually random or semi-random. There is no escaping in case of general purpose binary content.
Federico Caselli
@CaselIT
that must be the reason then
Vytautas Liuolia
@vytas7
It is the client's responsibility in choosing a unique boundary that does not exist in the content.
Federico Caselli
@CaselIT
probably they could not agree on how to escape the boundary and settled for a random one :)
Vytautas Liuolia
@vytas7
Most probably it was done that way due to performance reasons, it is cheap to test inclusion.
   As noted in Section 5.1
   of [RFC2046], the boundary delimiter MUST NOT appear inside any of
   the encapsulated parts, and it is often necessary to enclose the
   "boundary" parameter values in quotes in the Content-Type header
   field.
Given the right algorithm, finding a long substring in a long string is actually very fast.
Vytautas Liuolia
@vytas7
SIMD optimizations aside, the main principle for building these scans is that, if you make sure that your long substring is not at the current position, you can skip forward its entire length.
Federico Caselli
@CaselIT
I'm not up to date on these algorithms
Vytautas Liuolia
@vytas7
These are largely irrelevant for Falcon anyway :slight_smile:
We rely on Python's in and .find(), with the exception of one pure-Cython hack in the cythonized version.
Federico Caselli
@CaselIT
these are mostly irrelevant for most people not doing performance critical systems in c/rust/other high performance langave, and even then usually throwing more hw at the problem solves it
Vytautas Liuolia
@vytas7
But HTTP parsers and clients do care about that.
usually throwing more hw at the problem solves it
Sad but true too :slight_smile:
Federico Caselli
@CaselIT
yes, but usually to the point of using the correct function of the std-lib
like don't roll your own double for that is O(n*m)
Vytautas Liuolia
@vytas7
O(fun)