    Travis Brown
    @travisbrown
    Right, almost everyone who uses Circe uses Jawn for parsing.
    Ross A. Baker
    @rossabaker
    Jawn is a JSON parser that is faster than many libraries’ native solution. In the case of Circe, which is my strong recommendation, it is the native solution.
    It also supports incremental parsing, which lends itself nicely to streaming solutions like circe-fs2 or jawn-fs2.
    Vladislav Rybin
    @VladislavRybin
    Got you, guys. Thanks for the explanation.
    Andriy Plokhotnyuk
    @plokhotnyuk
    Jawn is one of the slowest. Please see the benchmark results comparing Circe (which uses Jawn) with other JSON parsers for Scala: https://plokhotnyuk.github.io/jsoniter-scala/
    Travis Brown
    @travisbrown
    I'm not sure those comparisons are entirely fair to Jawn. Circe's AST and decoding model have some overhead that means that some of the pairings there are apples and oranges, and in any case don't tell you much about Jawn itself.
    In my experience Jawn is competitive with Jackson as a parsing backend for Circe, and it's generally faster than spray-json's parser.
    Andriy Plokhotnyuk
    @plokhotnyuk
    Here is a PR with a direct comparison of Jawn vs jsoniter-scala for parsing to Jawn's AST: plokhotnyuk/jsoniter-scala#424
    Ross A. Baker
    @rossabaker
    Converting bytes to Strings and then parsing those also slows things down unnecessarily.
    Ross A. Baker
    @rossabaker
    When your source is bytes, you should be using parseFromByteBuffer in jawn.
    Andriy Plokhotnyuk
    @plokhotnyuk
    The PR tests both ways... and it shows that the String option is still faster on JDK 11
    Ross A. Baker
    @rossabaker
    I don’t know how many of these benchmarks you had reviewed by people who are experts in the respective projects, but some of the circe usage is a bit dubious.
    I would be a bit more careful making sure to use the libraries correctly before coming into their channels and taking a shit on their work.
    Andriy Plokhotnyuk
    @plokhotnyuk
    @rossabaker feel free to provide a PR that improves those numbers... and to ask clarifying questions to dispel any doubts
    Andriy Plokhotnyuk
    @plokhotnyuk
    Also, with my PR you can reproduce the DoS/DoW vulnerability of JawnFacade and pick up a fix, shamelessly copied from Circe, that uses java.util.LinkedHashMap instead of scala.collection.mutable.HashMap. To reproduce, please clone the jsoniter-scala repo, check out the jawn-ast branch, and run the following command: sbt -no-colors 'jsoniter-scala-benchmark/jmh:run -i 1 -wi 1 -p size=1,10,100,1000,10000,100000 ExtractFieldsReading.jawn'
    You should get a result like this:
    [info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
    [info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
    [info] experiments, perform baseline and negative tests that provide experimental control, make sure
    [info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
    [info] Do not assume the numbers tell you what you want them to tell.
    [info] Benchmark                                  (size)   Mode  Cnt        Score   Error  Units
    [info] ExtractFieldsReading.jawnByteBufferParser       1  thrpt       1811139.569          ops/s
    [info] ExtractFieldsReading.jawnByteBufferParser      10  thrpt        260677.346          ops/s
    [info] ExtractFieldsReading.jawnByteBufferParser     100  thrpt         19623.857          ops/s
    [info] ExtractFieldsReading.jawnByteBufferParser    1000  thrpt           301.378          ops/s
    [info] ExtractFieldsReading.jawnByteBufferParser   10000  thrpt             1.608          ops/s
    [info] ExtractFieldsReading.jawnByteBufferParser  100000  thrpt             0.006          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala          1  thrpt       2865937.664          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala         10  thrpt        330815.496          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala        100  thrpt         26956.753          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala       1000  thrpt          2279.925          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala      10000  thrpt           203.146          ops/s
    [info] ExtractFieldsReading.jawnJsoniterScala     100000  thrpt            16.296          ops/s
    [info] ExtractFieldsReading.jawnStringParser           1  thrpt       2266465.614          ops/s
    [info] ExtractFieldsReading.jawnStringParser          10  thrpt        358482.177          ops/s
    [info] ExtractFieldsReading.jawnStringParser         100  thrpt         24793.306          ops/s
    [info] ExtractFieldsReading.jawnStringParser        1000  thrpt           352.264          ops/s
    [info] ExtractFieldsReading.jawnStringParser       10000  thrpt             1.630          ops/s
    [info] ExtractFieldsReading.jawnStringParser      100000  thrpt             0.006          ops/s
    So a 1 MB request can burn a 4 GHz CPU core for 3 minutes... I hope this (and any other indirect use of Scala's HashMap/HashSet) will be fixed before the 1.0.0 release
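Editor's sketch: one common way to trigger this kind of slowdown is to send an object whose keys all share a hash code, so every insert lands in the same bucket. Whether the benchmark above uses exactly this input is an assumption; the class and method names below are hypothetical and not part of Jawn or jsoniter-scala. The example relies only on the fact that "Aa" and "BB" have the same `String.hashCode()`, so any concatenation of those two blocks collides as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CollidingKeys {
    // Builds 2^blocks distinct strings that all share one String.hashCode().
    // "Aa" and "BB" hash to the same value, and both have length 2, so
    // swapping one for the other anywhere in a key leaves the hash unchanged.
    static List<String> generate(int blocks) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(keys.size() * 2);
            for (String k : keys) {
                next.add(k + "Aa");
                next.add(k + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = generate(10);
        long distinctHashes = keys.stream().map(String::hashCode).distinct().count();
        System.out.println(keys.size() + " keys, " + distinctHashes + " distinct hash code(s)");
    }
}
```

A map that resolves collisions with a plain linked chain does work proportional to the chain length on every insert, so n such keys cost on the order of n² comparisons. Since JDK 8, `java.util.HashMap` and `java.util.LinkedHashMap` convert long chains of `Comparable` keys into balanced trees, which is presumably why swapping the map type helps here.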
    Andriy Plokhotnyuk
    @plokhotnyuk
    Parsing JSON is a minefield... Most AST-based parsers are vulnerable to attacks that exploit the use of recursion in parsing: https://github.com/lovasoa/bad_json_parsers
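Editor's sketch: a recursive-descent parser uses one stack frame per nesting level, so input like 100000 opening brackets can overflow the stack. One possible mitigation is an explicit depth limit. The toy parser below handles only nested arrays and is not how Jawn or any library named above works; the class name and the limit of 512 are assumptions chosen for illustration.

```java
// Hypothetical toy parser for nested JSON arrays such as "[[],[[]]]".
final class DepthLimitedArrays {
    private final String input;
    private final int maxDepth;
    private int pos = 0;

    private DepthLimitedArrays(String input, int maxDepth) {
        this.input = input;
        this.maxDepth = maxDepth;
    }

    // Returns the deepest nesting level found. Throws IllegalArgumentException
    // on malformed input or when nesting exceeds maxDepth.
    static int parse(String input, int maxDepth) {
        DepthLimitedArrays p = new DepthLimitedArrays(input, maxDepth);
        int deepest = p.array(1);
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("trailing input at " + p.pos);
        }
        return deepest;
    }

    // array := '[' (array (',' array)*)? ']'
    private int array(int depth) {
        // Checking before recursing turns a potential StackOverflowError
        // into an ordinary, catchable parse error.
        if (depth > maxDepth) {
            throw new IllegalArgumentException("nesting deeper than " + maxDepth);
        }
        expect('[');
        int deepest = depth;
        if (peek() != ']') {
            deepest = Math.max(deepest, array(depth + 1));
            while (peek() == ',') {
                pos++;
                deepest = Math.max(deepest, array(depth + 1));
            }
        }
        expect(']');
        return deepest;
    }

    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse("[[],[[]]]", 512));
        String hostile = "[".repeat(100_000) + "]".repeat(100_000);
        try {
            parse(hostile, 512);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```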
    Andriy Plokhotnyuk
    @plokhotnyuk
    BTW, Jawn's API pushes users into introducing yet more security vulnerabilities, like: circe/circe#1040
    Srepfler Srdan
    @schrepfler
    just curious, how do other parsers address this kind of DoS?
    would a config option that limits, for example, big-ints to a certain number of digits be a viable solution?
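Editor's sketch: one possible shape of the digit limit asked about above, assuming JSON-style integers (an optional leading minus followed by digits). The class name, method name, and limit are hypothetical and not taken from Circe or Jawn. The idea is to count digits, which is cheap, before handing the text to `java.math.BigInteger`, whose string conversion gets disproportionately more expensive as inputs grow very long.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class BoundedNumbers {
    // Parses a JSON-style integer, rejecting it up front if it has more
    // than maxDigits digits. The sign is not counted as a digit.
    static BigInteger parseBigInteger(String text, int maxDigits) {
        if (text.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int start = text.charAt(0) == '-' ? 1 : 0;
        int digits = text.length() - start;
        if (digits > maxDigits) {
            throw new NumberFormatException(
                "number has " + digits + " digits, limit is " + maxDigits);
        }
        // BigInteger still validates the characters themselves.
        return new BigInteger(text);
    }

    public static void main(String[] args) {
        System.out.println(parseBigInteger("-12345", 100));
        try {
            parseBigInteger("9".repeat(1_000_000), 100);
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check runs on the length alone, so a hostile payload is rejected without any big-number arithmetic.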
    Travis Brown
    @travisbrown
    I've just been trying to get back to 1.0.0 preparation and I'm wondering what people think about removing the RawX layer?
    It was introduced in typelevel/jawn#102 to maintain the Facade interface, but in my view it doesn't serve any real purpose, it's badly named, and if we're about to commit ourselves to a long-term 1.0.0 now is the time to get rid of it.
    Travis Brown
    @travisbrown
    Also, if anyone has any objection to Scalafmt-ing the Jawn repo, please let us know asap: typelevel/jawn#210
    Ross A. Baker
    @rossabaker
    I'm not aware of Raw ever seeing use. The author went his own way, and I don't recall seeing it anywhere else.
    Travis Brown
    @travisbrown
    I’ll open a PR in the morning. Maybe we can get the scalafmt one merged before then?
    Ross A. Baker
    @rossabaker
    I gave it a second blessing. Just need to rerun again since the other work was merged.
    Travis Brown
    @travisbrown
    Okay, here's the PR: typelevel/jawn#219
    Travis Brown
    @travisbrown
    This week is our last chance to get changes into 1.0.0: https://github.com/typelevel/jawn/issues/193#issuecomment-573684699
    Matt Hughes
    @matthughes
    I’ve been playing around with adding SJS support to Jawn. Obviously things like File/Channel aren’t going to work, but outside of that, one of the things I’ve stumbled into is a couple places where the project relies on IndexOutOfBoundsException/StringIndexOutOfBoundsException (for charAt). Both of those exceptions by default are undefined behavior in SJS. You can get around the IndexOutOfBoundsException with a setting, but I believe StringIndexOutOfBoundsException will still be undefined.

    My ultimate goal was to add SJS support to circe-fs2.

    Anyway, I can add bounds checks to get around this problem (only when running in JS) but I only want to do this where the caller doesn’t already do bounds checks.

    However, I’m seeing a couple cases where the caller indicates it does bounds checks but is still failing. For example, parseNumSlow is supposedly slower than parseNum because it does bounds checks. Yet I’m seeing failures in JS where that doesn’t appear to be true. For example, one of the tests tries to access index 5 in string “1.1e+”. Is this expected?
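Editor's sketch: the string "1.1e+" has length 5, so its valid indices are 0 to 4 and `charAt(5)` throws `StringIndexOutOfBoundsException` on the JVM. A parser that relies on catching that exception to detect end of input cannot be ported as-is to a platform where the access is undefined behavior. The helper below shows the explicit-check alternative for the exponent part of a number; it is a hypothetical illustration, not Jawn's `parseNumSlow`.

```java
// Hypothetical helper, for illustration only.
final class ExponentCheck {
    // Given the index just after 'e' or 'E', reports whether an optional
    // sign is followed by at least one digit. Every charAt call is guarded
    // by a length check, so no exception is needed to detect end of input.
    static boolean hasExponentDigits(String s, int i) {
        if (i < s.length() && (s.charAt(i) == '+' || s.charAt(i) == '-')) {
            i++;
        }
        return i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9';
    }

    public static void main(String[] args) {
        System.out.println(hasExponentDigits("1.1e+", 4));   // false
        System.out.println(hasExponentDigits("1.1e+5", 4));  // true
    }
}
```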

    Matt Hughes
    @matthughes
    I don’t see a lot of history, but it does seem like folks have attempted this before. Are there any obvious showstoppers?
    Ross A. Baker
    @rossabaker
    No obvious showstoppers, but to diminish my own credibility: without looking, I thought we already had that support.
    Is your goal in a scala.js port speed, incrementality, or compatibility with other Jawn-dependent solutions? Maybe those answers guide how much code gets to remain in shared and how much becomes platform specific.
    Matt Hughes
    @matthughes
    My ultimate goal is getting circe-fs2 working as I wanted to support having my client consume a streaming JSON response from the server. Another approach would be to just not use jawn at all in circe-fs2 and try to use some JS-based, streaming JSON library but I haven’t found any great fits.
    I’ve patched the various places that use charAt/Array.apply to do bounds checking (only on JS side) but would rather omit that if caller is already doing the checking.
    Ross A. Baker
    @rossabaker
    Oh, right, my reading comprehension was poor.
    Ross A. Baker
    @rossabaker
    I mostly merge dependencies and lack the original context, but no, I don't see why a test would do that.
    Sean McLaughlin
    @seanmcl
    How do changes make it from github to maven central? I’d like to pull in this change: https://github.com/typelevel/jawn/pull/269/files, and am wondering if the process is automatic and I should just wait, or if it can take a while and I should patch locally.
    Ross A. Baker
    @rossabaker
    @seanmcl Some projects publish snapshots, some don't. I'm unsure on that one. But I should be able to cut a release tonight.
    Sean McLaughlin
    @seanmcl
    Thanks! Really appreciate it
    Syncing to Central now.
    Ross A. Baker
    @rossabaker
    We need a strategy for the jawn support modules.
    The withDottyCompat causes problems downstream. And two of the three support libraries that are left have no Dotty release.
    We can stop publishing Dotty builds of those and nudge Play along.
    Or we can get out of the support module business entirely.
    This channel is pretty quiet, so I'll open a PR, too.
    Julien Richard-Foy
    @julienrf
    Hello! Do you plan to backpublish jawn 1.1.2 for Scala 3.0.0, or to publish a 1.1.3?
    Julien Richard-Foy
    @julienrf
    I have submitted a PR for Scala 3.0.0