MikkelFJ
@mikkelfj
I did some tests. The FlatBuffers decode numbers are way off (290ns vs 98ns), but the encode times are approximately correct. I made a similar test for C; it was a bit faster than C++, but not by a large margin. Protobuf was broken. C++ build times are horrifying.
MikkelFJ
@mikkelfj
There are going to be problems with alignment on some platforms
MikkelFJ
@mikkelfj
but overall the format makes sense - I’m working with something similar for fast internal types
MikkelFJ
@mikkelfj
There is no union type. There is an optional type, but that costs a lot of space.
MikkelFJ
@mikkelfj
Inheritance appears to be broken wrt versioning because parent fields are copied into the child without size information. So you can't tell whether a struct is larger because the child or the parent has changed, or both.
It still works if you can guarantee one linear history of versions.
Björn Harrtell
@bjornharrtell
Thanks for the analysis @mikkelfj, very informative.
Paulo Pinheiro
@paulovap

@aardappel

For languages that can interface with C/C++, schema-less has a big advantage: you can actually use the C++ implementation to go back and forth from JSON to FlexBuffers in a single function call.. unlike FlatBuffers

Can you tell me which call I should look for? I was looking into flexbuffers.h and couldn't find a function to parse JSON.

Paulo Pinheiro
@paulovap
I saw that one, I was trying to find the "From" equivalent
Wouter van Oortmerssen
@aardappel
Well, idl_parser.cpp has support for parsing a FlexBuffer from JSON if it sits inside a FlatBuffer.. wondering if we ever added code to allow it without one..
Paulo Pinheiro
@paulovap
perfect, thank you!
deepgrace
@deepgrace
The Art of Template MetaProgramming in Modern C++
https://github.com/deepgrace/monster
Paulo Pinheiro
@paulovap
In the FlexBuffers C++ API I see Reference::IsInt() and Reference::AsInt64(), Reference::AsInt32(), Reference::AsInt16(), and Reference::AsInt8(). But I am wondering: how can I know, using the public API, the right size of a given int? I can check if a field is an int, but I cannot tell if it's a long long or a short, for instance.
Wouter van Oortmerssen
@aardappel
FlexBuffers always stores things at the smallest size it can; the "original size" that was serialized is not preserved. If you serialize an Int64, but both the contents and the current vector allow for 16-bit storage, it will use 16 bits.
Hence, the API is such that you should ideally ask for the int at the same size as you put it in, and it will just work
if you have no idea what size it could be, you should typically use 64-bits
Paulo Pinheiro
@paulovap
My question is more towards when I don't exactly know the right size to retrieve. We have that internally
I see
Wouter van Oortmerssen
@aardappel
we could add an API that lets you know the smallest size the int could fit in, but that would likely be clumsy to use
if (ref.FitsInInt16()) { ref.AsInt16().. } else { what? }
not sure how useful that would be
Paulo Pinheiro
@paulovap
Not sure if it's useful either. I am just studying the API and worried that I might accidentally ask for a short while the actual data is a long long
Wouter van Oortmerssen
@aardappel
some data you just know it must fit in 16-bits, so you can ask for that
but if you want to write things very generically.. use 64
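A minimal Java sketch of the "smallest size the int could fit in" idea discussed above. `IntWidth`, `smallestSignedWidth`, and `fitsInInt16` are hypothetical names invented for illustration; they are not part of the FlexBuffers API, and the sketch ignores the effect of the containing vector on the stored width.

```java
/** Hypothetical helper for illustration; not part of the FlexBuffers API. */
final class IntWidth {
    /** Smallest signed width in bytes (1, 2, 4 or 8) that can hold v. */
    static int smallestSignedWidth(long v) {
        if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) return 1;
        if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) return 2;
        if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) return 4;
        return 8;
    }

    /** True when reading v as a 16-bit value would lose nothing. */
    static boolean fitsInInt16(long v) {
        return smallestSignedWidth(v) <= 2;
    }
}
```

Generic code that cannot know the range up front would simply read every value as a `long`, which is the "use 64" advice above.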
Paulo Pinheiro
@paulovap
On another note, I am working on porting FlexBuffers to Java. I was wondering if you would be interested in offering an "alternative" String API that would possibly avoid the extra copy. Right now we convert to String, duplicating the data. But you can also have a wrapper class for ByteBuffer that implements the "CharSequence" interface, which can be used for a lot of things, including UI.
In my unscientific benchmark it shows it might perform a bit better:
CharSequence time: 168 ms
String time: 559 ms
This test just returns the same string a million times as CharSequence and as String
Wouter van Oortmerssen
@aardappel
Yup, that would fit FlatBuffers/FlexBuffers way better.. it's just that a lot of APIs expect String
We could simply offer 2 accessors for each one
Paulo Pinheiro
@paulovap
-java-char-sequence-accessor or something
Wouter van Oortmerssen
@aardappel
I think we could even add it by default.. I would like to encourage Java programmers to use a non-copying representation
I guess it still won't be very efficient though, since the CharSequence has to decode UTF-8 on each char access
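A rough Java sketch of the wrapper idea, under a loud assumption: the bytes are ASCII only, so byte i is char i. `AsciiBufferChars` is a hypothetical class written for illustration, not the accessor that was proposed for FlatBuffers. General UTF-8 would need decoding inside `charAt`, which is exactly the per-access cost mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical zero-copy string view; not the FlatBuffers/FlexBuffers Java API. */
final class AsciiBufferChars implements CharSequence {
    private final ByteBuffer buf; // shared backing buffer, never copied by the view
    private final int start;      // absolute offset of the first byte
    private final int len;        // length in bytes

    AsciiBufferChars(ByteBuffer buf, int start, int len) {
        if (start < 0 || len < 0 || start + len > buf.limit()) {
            throw new IndexOutOfBoundsException("range outside buffer");
        }
        this.buf = buf;
        this.start = start;
        this.len = len;
    }

    @Override public int length() {
        return len;
    }

    @Override public char charAt(int index) {
        if (index < 0 || index >= len) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Assumption: ASCII only. Multi-byte UTF-8 would have to be decoded here.
        return (char) (buf.get(start + index) & 0xFF);
    }

    @Override public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) {
            throw new IndexOutOfBoundsException("bad range " + from + ".." + to);
        }
        return new AsciiBufferChars(buf, start + from, to - from); // still no copy
    }

    @Override public String toString() {
        // The only place that copies: callers that really need a String pay here.
        byte[] bytes = new byte[len];
        for (int i = 0; i < len; i++) bytes[i] = buf.get(start + i);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

APIs that accept `CharSequence` can take the view directly; APIs that insist on `String` call `toString()` and pay for the copy only then.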
Paulo Pinheiro
@paulovap
Lets see, I will push a WIP for FlexBuffer soon, then we check if its worth it or not. I haven't done a decent analysis, just a quick check on the idea.
Wouter van Oortmerssen
@aardappel
ok!
Paulo Pinheiro
@paulovap

Can someone help me understand the BUILDER_FLAG_SHARE_KEYS implementation? (https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flexbuffers.h#L951)

I am trying to understand how one knows, given a set of key positions (key_pool) and a new key, whether the key is already present in the buffer or not. I think I am missing something here.

Maxim Zaks
@mzaks
It's been a while, so I might be wrong, but here are my 5 cents. A dictionary is based on two vectors: a vector of keys and a vector of values. Those vectors are referenced by the dictionary. If you have multiple dictionaries which have the same set of keys, then you can reuse the same vector of keys for all those dictionaries. This however means that on build, you will have to keep track of all already added key vectors, compare them with the keys vector currently being built, and reuse the already added one if the keys are equal. This is a deduplication technique which has "no effect" on reading FlexBuffers. It can be understood as an additional feature on write and therefore IMHO can be considered optional. E.g. I didn't implement it in FlexBuffersSwift.
By "no effect" on reads I mean there is no difference in the correctness of reads; there can be a performance difference, though. But that is a more complex topic
Paulo Pinheiro
@paulovap
That part I understood. I failed to understand the implementation, because it just stores the set of positions and not something like map<string, int>. So it left me wondering how you know, while you are trying to insert a new key, whether that key is already there or not.
Maxim Zaks
@mzaks
See it has been too long :) It's actually not a vector of keys and vector of values, it's a key, value pair, where key is a reference to the string. What I just explained was BUILDER_FLAG_SHARE_KEY_VECTORS and as far as I can see it is unused.
Maxim Zaks
@mzaks
So when you add the key-value pair ("maxim", 38), you add "maxim" to the buffer and get its position in the buffer back, say 678. Then, if the int is not stored indirectly, the key-value pair is (678, 38). If you have BUILDER_FLAG_SHARE_KEYS turned on, then you check whether storing "maxim" is necessary, or whether this key is already stored so you can just use the existing position in the buffer. The C++ code is interesting because it adds the string anyway, and if the flag is set and the string is already stored, it puts the "cursor" back to where it was before adding "maxim" and returns the "old position" of "maxim".
My implementation in FlexBuffersSwift is probably more intuitive, but the C++ implementation could be more performance oriented.
https://github.com/mzaks/FlexBuffersSwift/blob/master/FlexBuffers/FlexBuffers.swift#L284
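A small Java sketch of the "check first, then write" variant described above, assuming keys are written as null-terminated UTF-8. `MapKeyPool` and its methods are hypothetical names for illustration, not code from FlexBuffersSwift or the FlatBuffers repository.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical key writer that shares keys through a String-to-offset map. */
final class MapKeyPool {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private final Map<String, Integer> keyPool = new HashMap<>();

    /** Returns the buffer offset of key, writing its bytes only on first use. */
    int key(String key) {
        Integer known = keyPool.get(key);
        if (known != null) {
            return known; // already stored: reuse the old position
        }
        int offset = out.size();
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        out.write(utf8, 0, utf8.length);
        out.write(0); // assumed null terminator
        keyPool.put(key, offset);
        return offset;
    }

    /** Number of bytes written so far. */
    int size() {
        return out.size();
    }
}
```

The cost of this approach is that every distinct key is held twice, once in the buffer and once as a `String` in the map.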
Paulo Pinheiro
@paulovap

Now I finally understand the C++ implementation. KeyOffsetCompare holds the buf, and it's called to check the uniqueness of the element. So it does the comparison by running strcmp on the keys in the buffer itself. Clever.

I did the Java implementation very similarly to what you did in Swift: a Map<String, Integer> for key_pool. Then I check if the key exists before writing it. Later I will try to mimic the C++ implementation to save some memory
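For comparison, a Java sketch of the offset-only technique as it is described in this thread: the pool stores positions, and the comparator reads the key bytes out of the buffer itself. `OffsetKeyPool` is a hypothetical class, not the flexbuffers.h code, and it assumes null-terminated keys with no embedded zero bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical key writer whose pool holds offsets only, no String copies. */
final class OffsetKeyPool {
    private byte[] buf = new byte[64];
    private int size = 0;
    // The comparator reads this.buf at call time, so regrowing buf is safe.
    private final TreeSet<Integer> keyPool = new TreeSet<Integer>(this::compareKeysAt);

    /** strcmp-style comparison of the null-terminated keys at offsets a and b. */
    private int compareKeysAt(Integer a, Integer b) {
        int i = a, j = b;
        while (true) {
            int x = buf[i++] & 0xFF, y = buf[j++] & 0xFF;
            if (x != y) return x - y;
            if (x == 0) return 0; // both keys ended together
        }
    }

    /** Writes key, then rolls the cursor back if an equal key was already stored. */
    int key(String key) {
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        int offset = size;
        if (size + utf8.length + 1 > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, size + utf8.length + 1));
        }
        System.arraycopy(utf8, 0, buf, size, utf8.length);
        size += utf8.length;
        buf[size++] = 0; // null terminator
        Integer existing = keyPool.ceiling(offset); // equal key, if any, comes back first
        if (existing != null && compareKeysAt(existing, offset) == 0) {
            size = offset; // undo the write: put the cursor back
            return existing;
        }
        keyPool.add(offset);
        return offset;
    }

    /** Number of bytes currently in use. */
    int size() {
        return size;
    }
}
```

Compared with a `Map<String, Integer>`, this trades a tree lookup with byte-wise comparisons for not keeping a second copy of each key.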

Paulo Pinheiro
@paulovap
Anyone interested in reviewing the Java implementation of FlexBuffers? https://github.com/google/flatbuffers/pull/5476/files
Wouter van Oortmerssen
@aardappel
@paulovap sorry, was away last week, will definitely review it.. though welcome additional reviewers :)
Paulo Pinheiro
@paulovap
@aardappel no worries. Java does not have unsigned types, so I had to do some.. "workarounds". So the more people looking into it the better