These are chat archives for mdedetrich/scala-json-ast

2nd
Mar 2016
Ichoran
@Ichoran
Mar 02 2016 00:00
If you want to restructure your JSON, you probably want to use a representation that has structural sharing, which means you want the immutable branch.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:00
@Ichoran it's kind of both tbh, usually to be very fast you do need to use mutable data structures
The other thing is that mutable data structures do allow you to be fast, due to stuff like splicing arrays when parsing
Ichoran
@Ichoran
Mar 02 2016 00:00
If you want to be really fast, you do have to use mutable data structures, but you don't necessarily want the extra tiny bit of speed that causes you to lose safety.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:01
@Ichoran the speed difference between a smart parser using mutable arrays, and using something like Vector/Map is definitely not tiny
Ichoran
@Ichoran
Mar 02 2016 00:01
Agreed.
eugene yokota
@eed3si9n
Mar 02 2016 00:01
yea. if this is the first JSON library a newcomer would pick up, i think it needs to be idiomatic in a Scala cultural way and easy to learn
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:02
Honestly I am happy to kill fast, but that means your referential transparency is gone
At least for something that could be valid JSON according to the spec, even if it's stupid
Ichoran
@Ichoran
Mar 02 2016 00:02
I think fast is trying to carry too many disparate goals.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:03
It handles one goal, and it also very nicely (out of consequence) handles another goal.
eugene yokota
@eed3si9n
Mar 02 2016 00:03
so my plea is to adopt referential transparency a la circe/argonaut in current safe
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:03
In communication it's definitely disparate, but in terms of what niches it's filling I think it actually does it quite well
Ichoran
@Ichoran
Mar 02 2016 00:03
The referential transparency is on the "safe" side.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:03
@eed3si9n I don’t think thats possible, thats the problem
At least if you want to provide an AST that normal people would want to use
I am responding to the ticket to explain why
eugene yokota
@eed3si9n
Mar 02 2016 00:09
waiting ... (unapply?)
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:09
Gimme a moment, responding to the ticket
eugene yokota
@eed3si9n
Mar 02 2016 00:09
no problem. take your time
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:16
@eed3si9n Responded with my thoughts
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:22
@eed3si9n Also iirc, the Vector/Map dual that libraries like Circe/Argonaut use for JObject is just for preserving the ordering of the Map, not duplicate keys
I don’t think they handle that, in which case those implementations aren’t referentially transparent either
Ichoran
@Ichoran
Mar 02 2016 00:24
Anyway, I'm kind of bored of talking about it now. I want to get my implementation fully working and see how much faster or slower it is than other stuff.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:25
Sure thing. I think the main thing I am trying to say is that, in my opinion, asking for referential transparency while also aiming for an AST design that is nice for users doesn’t really work well
Ichoran
@Ichoran
Mar 02 2016 00:25
Then I might be able to offer specific technical solutions to problems that "one can't do X".
I am going to try to prove otherwise, but we'll see.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:26
In regards to RT, or something else?
Ichoran
@Ichoran
Mar 02 2016 00:26
As with most of what I'm doing here, I don't really expect to succeed. But it'd be nice if I could :)
RT specifically, in this case.
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:27
If we disregard RT, then essentially we can provide what we have with safe right now (maybe with some JNumber adjustments) and thats basically it. safe, btw, is what all other languages implement in some form for their common “JSON” type
Fast got spawned for 2 reasons, one is referential transparency, the other is actually for speed (this is what @sirthias was asking for in Spray at the time). I am not sure if things have changed
eugene yokota
@eed3si9n
Mar 02 2016 00:28
it's getting late in CET, so i'll sleep on this more
Matthew de Detrich
@mdedetrich
Mar 02 2016 00:30
Okay cool, it’s definitely a lot to think about. Personally, if I put my benevolent dictator hat on, I would release a single AST, being safe (with maybe some adjustments to JNumber) and that’s it. Maybe release fast as some unsupported JSON AST, just to send the message that users shouldn’t use it directly (but maybe other libraries could work with it for interop reasons, similar to how we handle byte arrays in Java/Scala right now)
@SethTisue, what are your thoughts on this?
Matthew de Detrich
@mdedetrich
Mar 02 2016 01:20
@Ichoran what do you think of mdedetrich/scala-json-ast#3
Essentially I am going to move scala.json.ast.safe.JValue to scala.json.ast.JValue and scala.json.ast.fast.JValue to scala.json.ast.unsafe.JValue
Johannes Rudolph
@jrudolph
Mar 02 2016 09:30
:point_up: March 2, 2016 12:03 AM I don't know which thread you are referring to or if you are actually meaning me but I cannot remember saying anything like that :) BigDecimal is certainly problematic if you want the best performance.
nafg
@nafg
Mar 02 2016 09:53
Maybe he meant @jroper?
Seth Tisue
@SethTisue
Mar 02 2016 10:50
@mdedetrich I won’t have time to really think about it this week, but it certainly sounds plausible. personally, my main interest is that the SLIP should include something that’s good enough for ordinary use. also there isn’t only one chance at this; a single-AST SLIP could get in, and then we could consider a possible followup that adds the “fast” option.
I definitely like removing the “safe” marker from the regular ASTs.
I don’t actually recall how many of the technical (as opposed to philosophical) objections to the first proposal had to do with the “fast” option specifically?
Matthew de Detrich
@mdedetrich
Mar 02 2016 13:58

Fast was mainly conceived for 2 reasons.

  1. iirc @eed3si9n needed an AST that was referentially transparent (for his usecase in SBT), which meant an AST that preserved key order, and duplicate keys for JObject
  2. @sirthias needed an AST that was catered for speed (apparently the main usecase in Spray)

fast happened to cover both use cases very well

@jrudolph This wasn’t in regards to performance, it was in regards to people doing something like JNumber("1e2147483647") and then calling .toBigInt on the value, which causes Scala to freeze due to the computation being really slow
BigDecimal has obvious performance issues, basically everyone knows that, that’s what you pay for proper precision
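A minimal sketch of one way to guard against the slow conversion described above, assuming the number is kept as its raw string. The names `BoundedNumbers` and `toBigIntBounded` and the 1000-digit default are illustrative assumptions, not part of scala-json-ast:

```scala
object BoundedNumbers {
  // Hypothetical helper, not part of scala-json-ast.
  // Returns None when the input is not a number or when the integer part
  // would have more than maxDigits digits.
  def toBigIntBounded(raw: String, maxDigits: Int = 1000): Option[BigInt] =
    try {
      val bd = new java.math.BigDecimal(raw)
      // Number of digits left of the decimal point. Long arithmetic avoids
      // Int overflow when the exponent is close to Int.MaxValue.
      val integerDigits = bd.precision().toLong - bd.scale().toLong
      if (integerDigits > maxDigits) None
      else {
        // Uses the standard implicit BigInteger => BigInt conversion.
        val result: BigInt = bd.toBigInteger()
        Some(result)
      }
    } catch {
      case _: NumberFormatException => None
    }
}
```

The point of the design is that the size check only inspects precision and scale, so the expensive expansion is never attempted for oversized exponents.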
Matthew de Detrich
@mdedetrich
Mar 02 2016 14:25
Although it may have been someone else who said it
InTheNow
@InTheNow
Mar 02 2016 16:16

The real problem is that the JSON spec specifies that a number can be of any size

Which json spec:

In particular from the former: "This specification allows implementations to set limits on the range and precision of numbers accepted", even suggesting that implementations that implement more than Double may not play well with what is widely available.

This suggests that JSON is "too simple"....
If you now mix in JSON-LD, that is "pure" JSON, but can be extended with a schema and XSD Datatypes....
So you would want to use the datatypes that are in the schema
InTheNow
@InTheNow
Mar 02 2016 16:21
I'm beginning to think that, not deliberately nor out of spite, the creators of JSON made a spec so simple that it merely kicked the can down the road, perhaps explaining why there needs to be so many implementations....
InTheNow
@InTheNow
Mar 02 2016 16:39

if it's primarily an AST designed for interop and will often (usually?) be used in conjunction with a real decoding library, then JNumber(value: String) should be fine

Perhaps for a JSON-ast this is all that is needed?

InTheNow
@InTheNow
Mar 02 2016 17:17
[an aside on JSON-LD: one use case is Google Knowledge Graph]
Seth Tisue
@SethTisue
Mar 02 2016 18:09
@mdedetrich I think the default ASTs should preserve key order. I don’t find the argument on this in the original SLIP, from November, convincing. has your position on this changed since then? (sorry, I assume this has been discussed, but the volume of discussion has been so great…)
Ichoran
@Ichoran
Mar 02 2016 18:11
@SethTisue - The problem is that you pay a heavy tax for keeping that ability with immutable data structures.
Seth Tisue
@SethTisue
Mar 02 2016 18:12
ok, that could be a better argument than the one in the old SLIP
Ichoran
@Ichoran
Mar 02 2016 18:13
I don't remember what the old argument was. Some sort of excuse about how you shouldn't have multiple keys or care about their order anyway?
Anyway, the problem is that you need a list-like structure (which we have) and a map-like structure (which we have), but the only way to have both simultaneously involves either keeping track of insertion number and storing it in every node, which then gives you O(n log n) runtime on traversal, or maintaining both data structures which is not kind to your space requirements.
Circe and Argonaut do it, though, since you kind of have to for correctness.
Well, conservative correctness.
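A rough sketch of the "maintain both data structures" option mentioned above: a Vector for key order next to a Map for lookup. `OrderedObject` is a hypothetical illustration, not how Circe or Argonaut actually implement it, and it does not attempt to keep duplicate keys:

```scala
object OrderedFields {
  // Hypothetical ordered object: the Vector keeps insertion order,
  // the Map gives fast lookup. Both hold the keys, which costs memory.
  final case class OrderedObject[V](keys: Vector[String], byKey: Map[String, V]) {
    def get(key: String): Option[V] = byKey.get(key)

    // Re-inserting an existing key replaces the value but keeps its position.
    def updated(key: String, value: V): OrderedObject[V] =
      if (byKey.contains(key)) copy(byKey = byKey.updated(key, value))
      else OrderedObject(keys :+ key, byKey.updated(key, value))

    // Traversal in insertion order.
    def toVector: Vector[(String, V)] = keys.map(k => k -> byKey(k))
  }

  def empty[V]: OrderedObject[V] = OrderedObject(Vector.empty, Map.empty)
}
```

Lookup stays a plain Map lookup, while traversal follows the Vector, which is the space-for-order trade-off under discussion.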
Seth Tisue
@SethTisue
Mar 02 2016 18:21
since these aren’t the fast trees, I’m willing to pay
Ichoran
@Ichoran
Mar 02 2016 18:22
Yeah. Except Play might not be willing to pay, since they might care about memory budget. Dunno.
Should ask James.
InTheNow
@InTheNow
Mar 02 2016 18:22

Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

Suggests that unordered is actually better.

Ichoran
@Ichoran
Mar 02 2016 18:23
No, the requirements for readers and writers are different. Standard co/contravariance stuff.
When you are reading, you shouldn't care. When you are writing, you should.
So you should build exact representations and faithfully serialize them, but make the easiest API to use be the one that doesn't care about faithfulness.
InTheNow
@InTheNow
Mar 02 2016 18:24
I'm merely quoting the JSON spec
Ichoran
@Ichoran
Mar 02 2016 18:25
Yes, you missed the context. It's about parsers specifically.
That is, if you're consuming JSON, it's wiser not to care because if you do care, someone else may not have, and you may get messed up.
InTheNow
@InTheNow
Mar 02 2016 18:27
" An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array."
So ^^^you refer to preserving the original order, or am I lost (again :) )
Seth Tisue
@SethTisue
Mar 02 2016 18:32
If you read some JSON in, make a few changes, and write it back out again, it's just annoying to have the order shuffled. For human-readability reasons, for testing, diffs, etc.
It would be wrong to associate semantics with the ordering, definitely. But it isn't wrong to refrain from reordering.
InTheNow
@InTheNow
Mar 02 2016 18:40

I guess where I'm sort of heading here is: There are two (or more?) official specs that in the main are the same, major diff is (no surprise) on number format/precision. I'm guessing these can easily be combined into one api, and would be similar to the current fast. Perhaps this should/could be renamed conforming-ast (or similar).

Safe could then actually be made into two : Browser and JVM or similar, and would cover most use cases

Ditto for "names within an object SHOULD be unique." The conforming ast must allow duplicates, while the other two just follow platform convention
Dale Wijnand
@dwijnand
Mar 02 2016 20:50
Just out of curiosity, why isn't ListMap viable?
Ichoran
@Ichoran
Mar 02 2016 21:02
@dwijnand - Because it takes forever to look up a key if the object isn't tiny.
You could switch implementations depending on size, but you can't use ListMap for a large object due to the O(n) lookup time.
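For illustration, a small sketch of the ListMap trade-off: it keeps insertion order, but a lookup has to walk the entries one by one. The object name `ListMapDemo` and the sample keys are made up for this example:

```scala
import scala.collection.immutable.ListMap

object ListMapDemo {
  // ListMap remembers the order in which entries were added.
  val fields: ListMap[String, Int] = ListMap("zeta" -> 1, "alpha" -> 2, "mid" -> 3)

  // Iteration follows insertion order, not key order.
  val orderedKeys: List[String] = fields.keys.toList

  // Works, but the cost grows linearly with the number of entries.
  val lookup: Option[Int] = fields.get("alpha")
}
```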
Dale Wijnand
@dwijnand
Mar 02 2016 21:29
I see. Thanks
Matthew de Detrich
@mdedetrich
Mar 02 2016 21:57
@SethTisue There isn’t an ordered immutable map data structure with effectively constant lookup
That’s the main problem
Either that or you have to maintain a Vector and a Map and create some weird dual data structure that needs to interop with the Scala collections library
That also uses a lot of memory
@dwijnand Yeah, ListMap has O(n) lookup
Obviously ideally you would want to print out the keys in some order, although there is nothing stopping JSON libraries that use the json ast from converting the map to an ordered map before printing
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:04
One thing I don’t really want to move on is using really slow data structures, because it forces everyone to be really slow. json4s uses List for its underlying JObject structure, and it’s caused a lot of problems because it’s so slow on what is a common operation for users (looking up a value by its key, which is what you will be doing most of the time with a JObject)
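To make the lookup cost concrete, here is a small sketch contrasting a List-of-fields object with a Map-backed one. The `Field` alias and both helper names are assumptions for this example and are not taken from json4s or scala-json-ast:

```scala
object FieldLookup {
  type Field = (String, Int)

  // List-backed object: lookup scans the fields until the key matches,
  // so the cost grows with the number of fields.
  def lookupInList(fields: List[Field], key: String): Option[Int] =
    fields.collectFirst { case (k, v) if k == key => v }

  // Map-backed object: lookup goes straight to the entry
  // (effectively constant time for the standard immutable Map).
  def lookupInMap(fields: Map[String, Int], key: String): Option[Int] =
    fields.get(key)
}
```

Both return the same answer; only the cost of getting there differs as objects grow.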
Seth Tisue
@SethTisue
Mar 02 2016 22:06
LinkedHashMap behind an immutable interface?
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:08
LinkedHashMap is essentially the mutable version of having a Map and a Vector. It maintains another entire doubly linked list behind the scenes so it can preserve order
eugene yokota
@eed3si9n
Mar 02 2016 22:11
for sbt's usage I don't care about duplicate keys
Ichoran
@Ichoran
Mar 02 2016 22:11
Backing by mutable data structures isn't a good idea if we want the AST to be modifiable.
The mutable structures, viewed immutably, have no structural sharing.
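A sketch of what "no structural sharing" means in practice, assuming a LinkedHashMap hidden behind a read-only wrapper. `Frozen` is a hypothetical class written only for this illustration:

```scala
import scala.collection.mutable

object FrozenObjects {
  // Hypothetical read-only wrapper around a mutable, insertion-ordered map.
  final class Frozen[V] private (underlying: mutable.LinkedHashMap[String, V]) {
    def get(key: String): Option[V] = underlying.get(key)
    def keys: List[String] = underlying.keysIterator.toList

    // No structural sharing: every "modification" copies all entries first,
    // so the original stays untouched but the update costs O(n).
    def updated(key: String, value: V): Frozen[V] = {
      val copy = mutable.LinkedHashMap.empty[String, V]
      copy ++= underlying
      copy.update(key, value)
      new Frozen(copy)
    }
  }

  object Frozen {
    def apply[V](fields: (String, V)*): Frozen[V] = {
      val m = mutable.LinkedHashMap.empty[String, V]
      m ++= fields
      new Frozen(m)
    }
  }
}
```

An immutable Map, by contrast, can reuse most of its internal nodes when one key changes, which is the advantage being pointed out here.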
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:12
@eed3si9n Okay that’s good, I don’t think we could make that work
Ichoran
@Ichoran
Mar 02 2016 22:12
So I really think it's back to the classic immutable/mutable split, based on the backing data structures, and if we have two ASTs, that is what they should be called.
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:13
Also I am pretty sure that other languages have the same issue that we are dealing with as well, and in their case they just deal with it
Unless there is some magical data structure that I am not aware of
eugene yokota
@eed3si9n
Mar 02 2016 22:13
but the point of mdedetrich/scala-json-ast#2 is not about RT specifically. I put the subject "Goals" for a reason
Ichoran
@Ichoran
Mar 02 2016 22:13
I don't know what e.g. Haskell does. Mostly they use mutable data structures.
For instance, Julia uses a backing array with an overlaid hash map for fast lookups.
(So they can handle multiple keys in traversal but when you look up by key you get zero or one.)
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:14
I am not sure about the others
Ichoran
@Ichoran
Mar 02 2016 22:14
So they can't maintain order or handle duplicate keys.
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:15
I don’t think many people use that library, what’s used for JSON in Haskell nowadays?
@eed3si9n I think you can substitute RT for key ordering and the argument would be the same because that’s what it’s asking for
eugene yokota
@eed3si9n
Mar 02 2016 22:16
@mdedetrich sure. let's try to agree on the goals before we get into specific data structures
Ichoran
@Ichoran
Mar 02 2016 22:16
RT means that for all conforming deterministic JSON parsers, parse(String) == parse(serialize(deserialize(String)))?
Because it certainly can't mean String == serialize(deserialize(String)) unless you preserve whitespace.
("conforming" meaning "conforms to the JSON specification")
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:18
I think what it meant in that context is that key order for Map was preserved, whitespace is ignored
Referentially Transparent probably isn’t the right word
eugene yokota
@eed3si9n
Mar 02 2016 22:19
for me, it's about JValue(JObject("x" -> 1, "y" -> 2)) producing an identical JSON string every time so SHA-1 would match up
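A small sketch of why the output string matters for hashing: the same fields written in a different key order give a different SHA-1. The object name `Sha1Demo` and the two sample strings are made up for this example; the hashing uses the standard `java.security.MessageDigest`:

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object Sha1Demo {
  // Hex-encoded SHA-1 of the UTF-8 bytes of a string.
  def sha1Hex(s: String): String =
    MessageDigest.getInstance("SHA-1")
      .digest(s.getBytes(StandardCharsets.UTF_8))
      .map(b => "%02x".format(b))
      .mkString

  // Same fields, different key order in the serialized text.
  val a: String = """{"x":1,"y":2}"""
  val b: String = """{"y":2,"x":1}"""
}
```

`Sha1Demo.sha1Hex(Sha1Demo.a)` and `Sha1Demo.sha1Hex(Sha1Demo.b)` differ, so a serializer whose key order can vary between runs breaks hash comparisons.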
Ichoran
@Ichoran
Mar 02 2016 22:19
Well, but there are other things that can vary, such as whether you emit unicode or whether you u-escape everything not in 7-bit ascii, or whether you use a + sign before e in your numbers, and whether you preserve trailing zeros or the sign of -0.
So serialize(json) == serialize(deserialize(serialize(json))), @eed3si9n ?
eugene yokota
@eed3si9n
Mar 02 2016 22:20
yes
Ichoran
@Ichoran
Mar 02 2016 22:21
Do you also want json == deserialize(serialize(json))?
eugene yokota
@eed3si9n
Mar 02 2016 22:21
and I admit due to whitespace etc my RT argument is not water tight, but being able to calculate SHA-1 is a pretty common need
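The two round-trip properties asked about above can be written down as generic checks. This is only a sketch: `RoundTrip`, `stableOutput` and `losslessValue` are hypothetical names, and the `serialize`/`deserialize` functions are supplied by the caller rather than taken from any real library:

```scala
object RoundTrip {
  // serialize(json) == serialize(deserialize(serialize(json)))
  // i.e. re-reading our own output and writing it again changes nothing.
  def stableOutput[A](json: A)(serialize: A => String, deserialize: String => A): Boolean = {
    val once = serialize(json)
    serialize(deserialize(once)) == once
  }

  // json == deserialize(serialize(json))
  // i.e. no information in the value is lost by going through text.
  def losslessValue[A](json: A)(serialize: A => String, deserialize: String => A): Boolean =
    deserialize(serialize(json)) == json
}
```

The first property is the one needed for stable SHA-1 values; the second is stronger and depends on what the AST chooses to represent.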
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:21
Aeson, the most common JSON library in Haskell, also uses a HashMap
Ichoran
@Ichoran
Mar 02 2016 22:21
As long as you never calculate sha-1 off a bare string, but only off strings you produce, that shouldn't be an impossible constraint to get the same output each time.
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:22
In fact I am pretty sure that, 90% of the time, JSON asts in mainstream languages use a HashMap
so does javascript (under the hood)
Ichoran
@Ichoran
Mar 02 2016 22:22
Probably! (But as I mentioned, Julia doesn't.)
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:24
The way I see it, we have 2 choices, we can either use a backing array/vector to maintain key order, which means that, for everyone, it will use more memory and it will slow down things a bit, or we have unsafe (or mutable or w/e you want to call it), which handles that use case as well as others
eugene yokota
@eed3si9n
Mar 02 2016 22:26
if we can agree on the goals, then we can try to run benchmark to actually evaluate these claims
Ichoran
@Ichoran
Mar 02 2016 22:26
Looks like Rust does it the map way also. (Not sure about serde, just serialize::json.)
(Have to go now.)
Dale Wijnand
@dwijnand
Mar 02 2016 22:30
:+1: on @eed3si9n approach, but given this is intended for beginner audiences that don't want to specialise on what json ast they want, I think correctness should trump performance.
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:35
Its not really a question about correctness, the JSON spec isn’t (in reality) that specific on whether or not, you should preserve map key ordering
And the thing is, JSON is an interchange format. If the majority of other mainstream languages don’t preserve key ordering, you shouldn’t really rely on it, and the arguably “correct” answer is that key order is undefined
Also when users use map to look up a key, they expect something that is effectively constant, I don’t think thats an area we should move on
Dale Wijnand
@dwijnand
Mar 02 2016 22:38
From following the conversation here and before it doesn't look as clear cut as you're making it look like...
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:38
And its not something I want to impose on all other JSON libraries, I can definitely say a few of them would refuse to use the ast in such a case, or they would say “use it, but it has terrible performance in x” which isn’t a good look
What isn’t clear cut?
Dale Wijnand
@dwijnand
Mar 02 2016 22:38
Duplicates, ordering issues
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:38
Thats because the JSON spec isn’t clear cut
it doesn’t say yes or no, it uses language like “should” which it defines (albeit it’s not helpful)
Dale Wijnand
@dwijnand
Mar 02 2016 22:40
What were the pitfalls of scala.util.parsing.json?
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:40
I am saying that if we compare it to other languages’ ASTs, which have to make a decision as well, most of them appear to ignore key ordering: they map the JObject to the language’s native HashMap, which has no concept of key ordering
Not sure, I don’t use that library
eugene yokota
@eed3si9n
Mar 02 2016 22:41
why bring up other languages when we have data points within the Scala ecosystem?
Lift JSON was the de facto standard a while back, which had case class JObject(obj: List[JField]) extends JValue
Dale Wijnand
@dwijnand
Mar 02 2016 22:42
I think why scala.util.parsing.json failed should be something that is well understood for this SLIP
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:42
Well JSON is a data interchange format, so it does have to deal with how other languages treat JSON, but the reason I am bringing it up is to try and get some more clarity on what other people do
eugene yokota
@eed3si9n
Mar 02 2016 22:42
Play Json: JsObject(fields: Seq[(String, JsValue)])
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:43
Thats not right
Play json moved away from a list based JObject ages ago
They do maintain one underneath for ordering reasons, but the main structure is a Map
eugene yokota
@eed3si9n
Mar 02 2016 22:44
Argonaut/Circe: internally Vector + Map, but hides implementation to the user
spray JSON: case class JsObject(fields: Map[String, JsValue]) extends JsValue
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:46
Yeah, so what Play/Spray do is use a map internally, and they just provide a helper method that returns either the field names or the k/v pairs
eugene yokota
@eed3si9n
Mar 02 2016 22:46
I am skeptical about the time overhead of adding extra Vector
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:47
More concerned about memory or time. Spray for example sees speed as important, so they just did a helper function
eugene yokota
@eed3si9n
Mar 02 2016 22:47
and I am ok with saying "we care about reasonable field lookup speed, but we don't care about memory"
Matthew de Detrich
@mdedetrich
Mar 02 2016 22:47
Which also isn’t entirely correct, it will lose the initial key ordering
Yup, thing is other people have different opinions as well, at the end of the day we have to make a decision and pick one
eugene yokota
@eed3si9n
Mar 02 2016 22:52
another thing we should consider stop caring about is handling the mythical full legal range of JSON
if we say "here's the range of number, shape, etc we handle" and parse method returned Left or threw an exception consistently, that's not too bad
InTheNow
@InTheNow
Mar 02 2016 23:09

Thats because the JSON spec isn’t clear cut

So... .how does one model random "stuff" , like poor specs, in scala?

JSON cannot be modeled, and that is why it presents so many problems in a proper language (like scala), with people that care.

Matthew de Detrich
@mdedetrich
Mar 02 2016 23:15
I mean this is why the 2 AST split happened. We can make a solution which is in the middle, but it didn’t make anyone terribly happy (as we can see, individual frameworks have different needs). On the other hand, the 2 AST split does actually cover all bases, but we now have 2 ASTs, which is going to be a hard sell (but may also be possible)
@Ichoran Also regarding Julia, since it is backed by a mutable data structure (apparently an Array), it’s going to use proportionally less memory than a Vector (Vector being backed by a specially designed tree since it’s an immutable data structure)
Immutable data structures tend to use a lot more memory for their basic layout, of course this pays out when it comes to structural sharing
eugene yokota
@eed3si9n
Mar 02 2016 23:18
imho the value of having a single (even if it's mediocre) AST outweighs the potential confusion of having 2 or community having 5 different JSON ASTs
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:19
An immutable Vector + Map takes up more memory than a mutable Array + HashMap
I think anything over 2 is downright stupid, 2 you could get away with if you communicated it really well
But I am not sold on that either
The thing is, the AST’s are not completely disparate, you can convert between them seamlessly, so it doesn’t matter which AST the user gets. The confusion can be mitigated with better communication, which I admit hasn’t been great so far
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:25
I have just pushed my ast rename onto master, and updated the documentation so its hopefully a lot clearer
@eed3si9n We could also just ask the community
Dale Wijnand
@dwijnand
Mar 02 2016 23:25
2 is stupid for the problem set you're trying to solve, if you create 2: one more correct used by play, one more fast used by spray, how's that helping library X? he/she's switching from play's and spray's original asts to the new asts...
because 2 asts is less than 8 asts?
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:26
@dwijnand if you get back an unsafe.JValue, and you need a JValue, you can just call .toStandard

because 2 asts is less than 8 asts?

The 8 AST’s all happen to roughly share 2 disparate designs
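A heavily simplified sketch of what converting between the two ASTs could look like. The types below are hypothetical stand-ins written for this example (only strings and objects, nested in one `TwoAsts` object); the real scala-json-ast definitions have more cases and may differ in detail:

```scala
object TwoAsts {
  // Hypothetical, simplified stand-ins for the two ASTs discussed here.
  object unsafe {
    sealed trait JValue { def toStandard: TwoAsts.JValue }

    final case class JString(value: String) extends JValue {
      def toStandard: TwoAsts.JValue = TwoAsts.JString(value)
    }

    final case class JField(key: String, value: JValue)

    // Array-backed: keeps key order and duplicate keys.
    final case class JObject(fields: Array[JField]) extends JValue {
      // Converting to a Map drops ordering; for duplicate keys the last one wins.
      def toStandard: TwoAsts.JValue =
        TwoAsts.JObject(fields.map(f => f.key -> f.value.toStandard).toMap)
    }
  }

  // Map-backed "standard" AST: fast lookup, no ordering, no duplicates.
  sealed trait JValue
  final case class JString(value: String) extends JValue
  final case class JObject(fields: Map[String, JValue]) extends JValue
}
```

In this sketch the conversion is lossy in one direction only, which matches the idea that the array-backed form can represent strictly more than the Map-backed one.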

eugene yokota
@eed3si9n
Mar 02 2016 23:27
@mdedetrich we could ask the community. but I think what we should ask the community to fill out a yes/no form on is the goals, not the number of ASTs they can memorize or their favorite data structure
immutability discussion could be worked around to some degree by hiding the implementation from the interface
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:28
Yeah, the thing is, I made this point before, but if you try to achieve too many goals with a single AST then you don’t make anyone happy because you end up doing a crappy job of achieving those goals
Yes but you also lose structural sharing then, which could be potentially even worse
Dale Wijnand
@dwijnand
Mar 02 2016 23:29
given it's roughly 2 disparate designs, it's not strictly 2 designs either.. so even 2 isn't enough
eugene yokota
@eed3si9n
Mar 02 2016 23:29
I don't think you will make anyone completely happy
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:29
I think if you have 2 AST’s which are very different in their goals (and that’s communicated clearly), that’s definitely more justifiable than having 8 or more AST’s which are just really slight variations of each other
@dwijnand Lets put it this way, I don’t think anyone has complained from a technical perspective about the 2 AST’s, the main issue people have with the 2 AST design is that there are 2 AST’s. But if someone is needing x/y, the 2 AST’s cover all of the bases well enough that people would be justified in using it
eugene yokota
@eed3si9n
Mar 02 2016 23:31
if you picked any AST, I'll be happy given it's bincompat for a while
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:32
Not trying to trivialize the 2 AST’s, but they seem to have covered the use cases well enough from a technical POV. Like I said, anything more than 2 is stupid, you wouldn’t really add anything. You get the most bang for your buck with either 1 or 2 AST's
@eed3si9n Yeah I don’t think there is any disagreement in that
@dwijnand The analogy I came up with, which I am not sure if you read, is that it’s like the distinction between a byte array and a String
eugene yokota
@eed3si9n
Mar 02 2016 23:33
is there a usecase for ex-safe/normal to go to fast?
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:33
Both are very disparate even though they represent the same concept, however most (if not all?) mainstream languages have both concepts in their stdlib
@eed3si9n Typically I don’t think so
eugene yokota
@eed3si9n
Mar 02 2016 23:34
but fast might want to validate itself and become normal?
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:34
However if some library gives you a normal JValue, and you really need a unsafe.JValue, the method is there if you need it

but fast might want to validate itself and become normal?

Could you rephrase?

eugene yokota
@eed3si9n
Mar 02 2016 23:35
there are some potential use cases for unsafe.JValue to become JValue?
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:35
Oh definitely, I can see parsers and webframeworks focused on high performance (over anything else) just providing an unsafe.JValue
eugene yokota
@eed3si9n
Mar 02 2016 23:36
if this is thought to be a one-way street, then unsafe.JValue can easily be a community maintained project right?
the term unsafe reminds me of sun.misc.Unsafe, which probably is really unsafe as opposed to storing JSON in an array
there are other options like Jawn or Jackson. we don't have to make everyone happy
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:41
Well thats kinda the point, unsafe is meant to make people have that thought
unsafe can store invalid data, and it can blow up at runtime if you try to serialize it
unsafe can be community maintained, but ironically, I see unsafe as being much more stable in its design compared to the standard JValue
It really does just put the minimum JSON AST structure over just representing it as a String. So everything is an Array, or a String
The thing is, for your usecase (which I think is pretty exceptional/specific), unsafe does fulfill your goals, and you can always expose a standard JValue once you are done comparing your SHA-1 values or w/e
Its also nice in that it sends the message that “don’t expect to have ordering/duplicates for keys in a JValue unless you are doing something special” which honestly is the reality
If you put ordering by default in the proper AST, people assume that’s what the typical use case is (when in fact, it isn’t)
@eed3si9n Anyways, from a communication perspective, I think the current master is a lot better
eugene yokota
@eed3si9n
Mar 02 2016 23:47
i am not sure if i agree with that ordering logic
if the goal is fast lookup and thus Map, I see that
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:48
I primarily deal with web frameworks, and the statement there seems to be “never rely on, or expect key ordering in JObject's"
eugene yokota
@eed3si9n
Mar 02 2016 23:48
but ordering semantics is an issue for each app to not rely on
that doesn't preclude an AST from randomizing the output of your application and making it not RT
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:48
Like if you look at this answer
A lot of large frameworks (such as android) tend to say “ordering is undefined"
eugene yokota
@eed3si9n
Mar 02 2016 23:50
i am fine with JSON spec saying that
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:50
@eed3si9n From a datastructure point of view, its still “RT” in the sense you use the word
eugene yokota
@eed3si9n
Mar 02 2016 23:50
but that doesn't conflict with AST trying to be RT
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:50
Two Scala Map’s will be equal even if their keys don’t have the same order
eugene yokota
@eed3si9n
Mar 02 2016 23:50
Map or Set isn't RT the moment you call toSeq
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:51
Yup agreed, although there is nothing stopping you from doing toSeq and then ordering it (by key name for example)
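A short sketch of both points: Map equality ignores insertion order, and sorting after `toSeq` gives a deterministic sequence. `MapOrdering` and `sortedFields` are names made up for this example:

```scala
object MapOrdering {
  val a: Map[String, Int] = Map("x" -> 1, "y" -> 2)
  val b: Map[String, Int] = Map("y" -> 2, "x" -> 1)

  // Equality compares contents only, not the order entries were added in.
  val sameContents: Boolean = a == b

  // Sorting by key after toSeq makes the sequence independent of map internals.
  def sortedFields(m: Map[String, Int]): Seq[(String, Int)] =
    m.toSeq.sortBy(_._1)
}
```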
eugene yokota
@eed3si9n
Mar 02 2016 23:51
Map is perfectly fine as long as you use it as String -> JValue function
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:52
@eed3si9n I guess my question is, would you be happy with unsafe.JValue for your particular use case?
eugene yokota
@eed3si9n
Mar 02 2016 23:52
yea. so a pretty printer that does sorting by key to me is actually an ok compromise
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:52
You can always use implicit class to extend the AST to provide nice pretty printing. In fact thats what pretty much all frameworks will do
I can also provide some helper functions, but I think that’s cluttering the AST too much (and it’s already pretty big for something that is designed to be very minimal)
@eed3si9n Naively you can also override .toString so it prints the keys in an ordered fashion
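A sketch of the implicit class approach, under the assumption of a tiny Map-backed AST. The `JValue` types here are hypothetical miniatures defined only for this example, `sortedString` is a made-up method name, and string escaping is left out for brevity:

```scala
object SortedPrinting {
  // Hypothetical miniature AST, only for illustration.
  sealed trait JValue
  final case class JNumber(value: String) extends JValue
  final case class JString(value: String) extends JValue
  final case class JObject(fields: Map[String, JValue]) extends JValue

  // Adds a printing method without touching the AST itself.
  implicit class SortedPrinter(json: JValue) {
    // Renders object keys in sorted order, so the output is the same
    // on every run regardless of the Map's internal ordering.
    def sortedString: String = json match {
      case JNumber(n) => n
      case JString(s) => "\"" + s + "\""
      case JObject(fields) =>
        fields.toSeq
          .sortBy(_._1)
          .map { case (k, v) => "\"" + k + "\":" + v.sortedString }
          .mkString("{", ",", "}")
    }
  }
}
```

Keeping the printer outside the AST means a library can choose its own output conventions while sharing the same data types.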
eugene yokota
@eed3si9n
Mar 02 2016 23:55
i am personally happy with unsafe.JValue, but I also have concern about 2 AST situation
Matthew de Detrich
@mdedetrich
Mar 02 2016 23:55
I am also perfectly happy with a community maintained module that uses scala-json-ast to add some nice pretty printing functionality, which can also take into account key ordering

i am personally happy with unsafe.JValue, but I also have concern about 2 AST situation

Same, although I am more leaning towards this situation. However if we go forward with it, we must be ultra careful that we communicate it well and properly, both through code and generally

I don’t think the sell is too hard, but it has to be done right
I also want to get @SethTisue opinion on this
I think the change right now with packaging has already improved it by leaps and bounds
For now, I am actually more concerned about what to do with JNumber
I am kinda disliking the fact that Scala doesn’t have a proper number type to represent this (that combined with the JSON spec being so loose)
@Ichoran, did you come up with anything