Maxim Zaks
@mzaks
pathComponents is an array of strings because I guess parts of the path often repeat themselves.
I made parameters just one concatenated string, because my guess is that the params are often repeated as a whole as well. But those are just my speculations; if you have examples, it is better to check those and see which separation would be better.
Maxim Zaks
@mzaks
And why would you box String and Bool? If it is for null compatibility, there is a new feature which just implemented default nulls for scalars. In fact, types like Date, Duration, int128, int256 should be defined as structs. Structs are not evolvable, but we are talking about base types which should not be evolved, and structs don't have the overhead of a virtual table and are stored inline.
Well regarding structs I guess if you use just the fbs and not the FlatBuffers as is, then it probably does not matter
Maxim Zaks
@mzaks
Ah OK, I looked a bit further through your examples. I guess you have your own generator from fbs to other languages, or do you use flatc? It seems to me that the generated classes are not typical for flatc output. So you don't use FlatBuffers for FlatBuffers' sake but rather just the schema syntax. So all the deduplication and tables vs. structs remarks I made are not really applicable in this case.
MikkelFJ
@mikkelfj

Yeah - deprecation is complicated.

It is, but I was being sarcastic:
deprecated has been deprecated ...

As to boxed types, I can see why you would want them all to be tables, but boxing actually makes more sense with structs as Max points out, except for variable length strings. int128 can be represented as a struct int128 (align: 16) { val: [byte:16]; }. I have actually thought about adding typedef support for such wrappers. These structs have no overhead compared to native integers, but contrary to native integers (for supported sizes), they can be used in Unions.
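A minimal sketch of the 16-byte wrapper idea above, in Python rather than generated FlatBuffers code. It assumes the 16 bytes hold a two's-complement value in little-endian order (the byte order is an assumption here, since a raw byte array does not define one), and the helper names are hypothetical.

```python
def pack_int128(value: int) -> bytes:
    """Encode a signed 128-bit integer as exactly 16 little-endian bytes."""
    # Raises OverflowError if the value does not fit in 128 bits.
    return value.to_bytes(16, byteorder="little", signed=True)


def unpack_int128(raw: bytes) -> int:
    """Decode 16 little-endian bytes back into a signed integer."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    return int.from_bytes(raw, byteorder="little", signed=True)


if __name__ == "__main__":
    raw = pack_int128(-2)
    print(len(raw))            # 16
    print(unpack_int128(raw))  # -2
```

The fixed size is what makes this a candidate for a struct: there is no length prefix and no vtable, only the 16 bytes themselves.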
MikkelFJ
@mikkelfj
You might want to use Int128 in upper case, and also define Int64 as struct { val: int64; } etc. because then all types can work in Unions, and all wrapped types are consistently uppercase. Lowercase int128 could theoretically conflict with a future FlatBuffer native type.
MikkelFJ
@mikkelfj
@mzaks I would keep URL simple. If needed you could add a ParsedURL or something in a separate library. Optimizing for reuse of string components likely costs more than it offers unless you have a lot of URLs with long reused components. E.g. a 6 byte component would use 16 bytes: 4 for the reference in the components array, 4 for length, 6 for content, 1 for the 0 terminator, and 1 for padding. Using it embedded in another string would only use 7 bytes including a '/'. But of course this is for metadata and not actual contents.
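The byte count above can be reproduced with a small sketch. It assumes 32-bit offsets and length prefixes and that each string body is padded to a 4-byte boundary; real padding depends on where the string lands in the buffer, and the function names are hypothetical.

```python
def standalone_string_cost(n: int) -> int:
    """Bytes used by one n-byte component stored as its own string in a vector."""
    offset = 4               # 32-bit reference held in the components vector
    body = 4 + n + 1         # length prefix + content + zero terminator
    padding = -body % 4      # pad the string body up to 4-byte alignment
    return offset + body + padding


def embedded_cost(n: int) -> int:
    """Bytes used when the same component sits inside one longer path string."""
    return n + 1             # content plus one '/' separator


if __name__ == "__main__":
    print(standalone_string_cost(6))  # 16
    print(embedded_cost(6))           # 7
```

Under these assumptions a component has to be shared by several paths before storing it separately pays off.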
adsharma
@adsharma

Ah OK, I looked a bit further through your examples. I guess you have your own generator from fbs to other languages, or do you use flatc? It seems to me that the generated classes are not typical for flatc output. So you don't use FlatBuffers for FlatBuffers' sake but rather just the schema syntax. So all the deduplication and tables vs. structs remarks I made are not really applicable in this case.

That's right. This is the compiler I use:

https://github.com/adsharma/flattools/blob/master/bin/flatc.py

adsharma
@adsharma

The idea is that flatbuffers are useful when you have large buffers and you want zero overhead deserialization. Perhaps at other times you want to choose a different serialization format (thrift supported multiple, but I haven't looked lately).

So what if each table can specify its own serialization format as an annotation? It should be possible to translate that into a swift attribute?

table message (serde: flatbuffers) {
...
}

could end up with:

@flatbuffer
class Message {
...
}

But perhaps you want to use @grpc or some other attribute for other tables. This way the choice of serialization format doesn't drive the choice of IDL.

As to boxed types, I can see why you would want them all to be tables, but boxing actually makes more sense with structs as Max points out, except for variable length strings. int128 can be represented as a struct int128 (align: 16) { val: [byte:16]; }. I have actually thought about adding typedef support for such wrappers. These structs have no overhead compared to native integers, but contrary to native integers (for supported sizes), they can be used in Unions.

I didn't write those types. The boxed types exist in Telegram's TL (type language) described here:

https://core.telegram.org/mtproto/TL

The included script (compiler.py - which I derived from another project) spits out all the fbs files in the directory.

Yes - unions are a major use case for boxed types. Having multiple constructors is another I think.

adsharma
@adsharma

You might want to use Int128 in upper case, and also define Int64 as struct { val: int64; } etc. because then all types can work in Unions, and all wrapped types are consistently uppercase. Lowercase int128 could theoretically conflict with a future FlatBuffer native type.

Yup. Int128 makes more sense. Will do.

https://github.com/adsharma/fbs-schemas/blob/main/core.fbs

Most of these types came from ActivityStreams 2.0 (which is a JSON-LD schema spec). I feel the flatbuffer schema is 100x more readable for the average programmer, even if JSON-LD is more general and might have other capabilities.

MikkelFJ
@mikkelfj

So what if each table can specify its own serialization format as an annotation?

That might get a bit clumsy, and you might want to use multiple serialization formats, for example in a gateway.

Lijie.Jiang
@lijie-jiang
Would it be possible to include multiple input paths with the '-I' option? google/flatbuffers#6346
Sargun Dhillon
@sargun
I noticed flatbuffers doesn't have canonicalization:
RE: On purpose, the format leaves a lot of details about where exactly things live in memory undefined, e.g. fields in a table can have any order, and objects to some extent can be stored in many orders. This is because the format doesn't need this information to be efficient, and it leaves room for optimization and extension (for example, fields can be packed in a way that is most compact). Instead, the format is defined in terms of offsets and adjacency only. This may mean two different implementations may produce different binaries given the same input values, and this is perfectly valid.
Is there appetite for it?
MikkelFJ
@mikkelfj

Is there appetite for it?

What do you mean?
If implementations use this to pack better?
I'm not sure how much, but for example flatcc packs vtables at the end of the buffer by default.
And in general, data are placed as they arrive as much as possible on all implementations that I am aware of, which in itself helps performance by avoiding buffering.

Sargun Dhillon
@sargun
Is there a plan to add canonicalization to the specification, or similar?
Maxim Zaks
@mzaks
What would be the benefit of it, at this point in time?
I think if there were a FlatBuffers 2.0, canonicalization might make sense. But pushing it onto an established format without big benefits is questionable.
I think it is OK to introduce guidelines explaining that the following form is more efficient and that people should follow it if they can. But the format in its current state is flexible and there is data already created in different ways, so it needs to be supported anyway.
Sargun Dhillon
@sargun
Are structs themselves always guaranteed to be reproducible?
adsharma
@adsharma

Is there a plan to add canonicalization to the specification, or similar?

https://adsharma.github.io/flattools/ - pick a canonical serialization that works for you and implement it as a decorator in your favorite language, while enjoying the benefits of flatbuffer syntax as IDL.

Maxim Zaks
@mzaks

Are structs themselves always guaranteed to be reproducible?

Yes. They are rigid. You can not evolve them. You can specify some special layout properties through attributes though. But you can't change it after you have used it as it will be a breaking change.

@sargun ^
MikkelFJ
@mikkelfj
@sargun you can also print to JSON without spaces, that is probably as close as you can get. And yes, structs are always the same, except for potential flaws in exports where padding space is not zeroed. I just found a bug in flatcc that failed to ensure that, because in some cases user code is allowed to influence it via a raw copy.
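A rough sketch of the "JSON without spaces" idea, using Python's standard json module rather than flatc or flatcc. Sorting the keys is an extra assumption made here so that plain dicts built in different orders compare equal; a FlatBuffers JSON printer would follow the schema's field order instead.

```python
import json


def compact_json(value) -> str:
    """Serialize to JSON with no optional whitespace and a fixed key order."""
    return json.dumps(value, separators=(",", ":"), sort_keys=True)


if __name__ == "__main__":
    a = {"type": "user", "id": "42"}
    b = {"id": "42", "type": "user"}   # same data, different insertion order
    print(compact_json(a))                      # {"id":"42","type":"user"}
    print(compact_json(a) == compact_json(b))   # True
```

Text produced this way can be hashed or compared byte for byte, which the binary format itself does not guarantee across implementations.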
adsharma
@adsharma

One more blog post on flattools and where it fits in the stack:

https://adsharma.github.io/flattools-programs/

Happy New Year!

cyberquarks
@cyberquarks
Hi can this work with Flatbuffers?
message Entity {
  string dir = 1; 
  string entity_type = 2;
  string entity_id = 3;
  repeated EntityBlob blobs = 4;
  map<string, EntityProperty> properties = 5;
  map<string, EntityIdList> related_entities = 6;
}
message EntityProperty {
  oneof property_value {
    string string_value = 1;
    EntityArrayProperty array_value = 2;
    EntityObjectProperty object_value = 3;
    bool bool_value = 4;
    double double_value = 5;
  }
}
message EntityArrayProperty {
  repeated EntityProperty values = 1;
}
message EntityObjectProperty {
  map<string, EntityProperty> property_map = 1;
}
message EntityIdList {
  repeated EntityId ids = 1;
}
message EntityBlob {
  string blob_name = 1;
  bytes blob_bytes = 2;
}
message EntityId {
  string type = 1;
  string id = 2;
}
cyberquarks
@cyberquarks

I tried to translate this with flatc and I got this with the "Anonymous0" table:

// Generated from schema.proto

namespace ;

table Entity {
  dir:string;
  entity_type:string;
  entity_id:string;
  blobs:[EntityBlob];
  properties:[MapFieldEntry];
  links:[MultimapFieldEntry];
}

table EntityProperty {
  property_value:EntityProperty_.Anonymous0;
}

namespace EntityProperty_;

table Anonymous0 {
  string_value:string;
  array_value:EntityArrayProperty;
  object_value:EntityObjectProperty;
  bool_value:bool;
  double_value:double;
}

namespace ;

table EntityArrayProperty {
  values:[EntityProperty];
}

table EntityObjectProperty {
  properties:[MapFieldEntry];
}

table EntityIdList {
  ids:[EntityId];
}

table EntityBlob {
  blob_name:string;
  blob_bytes:[ubyte];
}

table EntityId {
  type:string;
  id:string;
}

table MapFieldEntry {
  key:string;
  value:EntityProperty;
}

table MultimapFieldEntry {
  key:string;
  value:EntityId;
}

What does this Anonymous0 mean?

Wouter van Oortmerssen
@aardappel
nice @adsharma
@cyberquarks Protobuf to FlatBuffers is not a 1:1 mapping, and FlatBuffers doesn't have the oneof construct. You can just rename it to something else. And since it's the only field in EntityProperty you can just replace it with EntityProperty directly. Or use a FlatBuffers union.
Also, MapFieldEntry should probably have a (key) attribute on the key field, so you can actually use it with dictionary lookup.
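To illustrate what a keyed vector buys you, here is a minimal sketch of dictionary-style lookup over entries sorted by key. It uses plain Python tuples instead of generated FlatBuffers accessors, the function name is hypothetical, and Python's string ordering stands in for whatever comparison the generated code uses.

```python
def lookup_by_key(entries, key):
    """Binary-search a list of (key, value) pairs that is sorted by key."""
    lo, hi = 0, len(entries)
    while lo < hi:
        mid = (lo + hi) // 2
        if entries[mid][0] < key:
            lo = mid + 1      # key can only be in the upper half
        else:
            hi = mid          # key is at mid or in the lower half
    if lo < len(entries) and entries[lo][0] == key:
        return entries[lo][1]
    return None               # key not present


if __name__ == "__main__":
    properties = sorted([("name", "door"), ("color", "red"), ("open", True)])
    print(lookup_by_key(properties, "color"))    # red
    print(lookup_by_key(properties, "missing"))  # None
```

The lookup only works if the vector was sorted by the key field when the buffer was built.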
vjani
@vjani
@here I have a question about tags for the flatbuffers repo: do they indicate the official releases? If so, it has been quite some time since the last one (March 2020), and I need to consume some of the later fixes. What is a good way to do this,
other than compiling from master? I have tried that and it works, but there is a danger: if I use the official flatbuffers package from the repository together with flatc built from master, flatc may generate code that is incompatible with the classes in the official package. I ran into a similar issue today, where the builder.EndVector signature changed and caused an incompatibility.
Wouter van Oortmerssen
@aardappel
I think I already answered you on discord..
kofu145
@kofu145
image.png
Heya, was looking up how I could possibly implement an ecs model into flatbuffers for easy serialization, and I came across this:
if I have tons of components though, apparently vectors of unions are going to be inefficient? If so, would anyone know of a better way to do such a thing?
kofu
@kofu:matrix.org
(sorry I'll use this account from now on, just found out that you could access gitter rooms from element)
MikkelFJ
@mikkelfj
It depends. If you have a ton of union elements and few vectors to contain them, then it isn't bad: A union vector is really two vectors one of which is just a byte array of types. The other vector is similar to a vector of tables. If you store structs in unions these will not be stored inline in the union and thus come with overhead.
If you need more advice, you need to be more specific. I have a rough idea of what ECS is, but no idea how you intend to do it in flatbuffers.
MikkelFJ
@mikkelfj
Well, I found your screenshot, which was very tiny, but it still doesn't tell whether you will have one vector per object, only one vector to control lots of objects, or a vector per property, of which there can be many. Either way, as long as you don't have a lot of very small vectors, you should be OK. Even good, since scanning a byte array for a specific object type will be efficient. Keep in mind that flatbuffers are read only.
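A small sketch of the "two parallel vectors" layout described above, modelled with plain Python objects instead of a real buffer. The component names and type tags are hypothetical; the only convention borrowed from FlatBuffers is that tag 0 means NONE.

```python
# Hypothetical union type tags for ECS components (0 is reserved for NONE).
NONE, POSITION, VELOCITY, HEALTH = 0, 1, 2, 3


def indices_of_type(types: bytes, wanted: int) -> list:
    """Scan the compact type vector and return positions holding `wanted`."""
    return [i for i, tag in enumerate(types) if tag == wanted]


if __name__ == "__main__":
    # One byte per element in the type vector, one entry per element in the
    # value vector; the two vectors always have the same length.
    types = bytes([POSITION, HEALTH, POSITION, VELOCITY])
    values = [(0, 0), 100, (5, 7), (1, -1)]

    hits = indices_of_type(types, POSITION)
    print(hits)                       # [0, 2]
    print([values[i] for i in hits])  # [(0, 0), (5, 7)]
```

Only the byte vector is touched while searching, so the values for other component types are never dereferenced.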
tsindhuja
@tsindhuja
I'm trying to use flatbuffers in C++ with Object API. I am still unable to change the values in the tables that are given as a reference to another table. Is it possible to manipulate values of a table indirectly through another table in Object API?
Wouter van Oortmerssen
@aardappel
@tsindhuja yes that is possible.. what error are you getting?
Martin Hans
@martinhansdk
I'd like to use flatbuffers on an embedded system written in C++. We don't allow dynamic memory allocation here. Is it possible to have a pre-allocated buffer and hand it to flatbuffers to serialize the data into? What about when deserializing?
MikkelFJ
@mikkelfj
Reading FlatBuffers requires no allocation in C/C++ but might in Java etc. due to string allocation. As for building buffers, I can't answer for C++, but I can give some guidance on FlatBuffers for C. C requires some dynamic allocation, but it can be limited in size if your buffers are simple. The allocator can be overridden so you can place allocations in a reserved area. There might be reallocation requests to grow areas, but that will not happen if your buffers are reasonable. It requires experimentation to decide the exact requirements for the problem at hand.
C generated buffer interfaces can be called from C++
Wouter van Oortmerssen
@aardappel
@martinhansdk yes you can specify an allocator for the C++ builder which can then supply existing memory. It can do all its building in that one buffer and will not allocate any further memory