    Jacopo Scazzosi
    @jacoscaz
    And yes, we are based in Milan
    Adrian Gschwend
    @ktk
    one of my colleagues will start playing with it
    great will ping you next time I'm there, once this 'Rona is over
    Jacopo Scazzosi
    @jacoscaz
    Oh yeah. I'm also looking forward to traveling a bit more and getting to meet some of the RDF(/JS) community, it's been long overdue.
    Adrian Gschwend
    @ktk
    absolutely
    was in milano in september but resisted pinging people I know
    was not the right moment
    Jacopo Scazzosi
    @jacoscaz
    @ktk just FYI, we have 7.1.0-beta.1 out which fixes a few issues with TS typings in dev dependencies. Your colleague might want to start from that.
    Adrian Gschwend
    @ktk
    @jacoscaz great tnx!
    Jacopo Scazzosi
    @jacoscaz
    Should RDF/JS projects refer to their own gitter room or should they point to this one, or perhaps rdfjs/public? Bigger projects like Comunica probably deserve to have a gitter of their own not to add too much noise in here but what about smaller projects like quadstore? Keeping the conversation limited to a few rooms would probably help the community grow tighter.
    Thomas Bergwinkl
    @bergos
    using rdfjs/public makes sense to me. there was a discussion about merging the channels, but i think it's good to have a channel for people using rdfjs libraries and one for people developing rdfjs libraries. i guess each group has their own topics to discuss and using one more channel should not be a problem. at least for me it doesn't matter much.
    Jacopo Scazzosi
    @jacoscaz
    Yeah, that makes sense.
    I'll wait a little bit in case anyone else wants to chime in but I was thinking of adding a reference to rdfjs/public in quadstore's README as a couple of users have asked about a dedicated gitter room.
    Jacopo Scazzosi
    @jacoscaz
    Is there consensus among the RDF/JS community on how to prevent collisions of unrelated blank nodes in the low-level API or whether this is even something that should be addressed in the low-level API?
    Thomas Bergwinkl
    @bergos
    @jacoscaz every library that generates terms/quads should accept a factory argument to share a single factory
    on a little bit higher level i have something in mind which i call environment. it implements the factory, but it's also possible to attach parsers, namespaces, fetch, ... and all attached functions will use the environment to create terms/quads/datasets.
    Jacopo Scazzosi
    @jacoscaz

    every library that generates terms/quads should accept a factory argument to share a single factory

    If I understand what you are saying, this implies that using more than one factory instance across a single RDF/JS project is not supported by the spec and may lead to data corruption, yes?

    Thomas Bergwinkl
    @bergos
    this topic is not covered explicitly by the spec, but it implies something like you said. i would just express it a little bit differently: merging quads from different sources requires using factories that share the same blank node generator.
    i already plan to have multiple factory instances. the environment can create new factories with different parsers/serializers and namespaces/prefixes attached. they will still share the same blank node generator.
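
A minimal sketch of the "multiple factories, one blank node generator" idea described above. The names `BlankNodeGenerator` and `SketchFactory` are hypothetical and not part of any RDF/JS library; only the optional `value` argument of `blankNode()` follows the RDF/JS DataFactory interface.

```typescript
// Minimal stand-in for an RDF/JS BlankNode term (only the fields used here).
interface BlankNode {
  termType: "BlankNode";
  value: string;
}

// Hypothetical generator: hands out labels that are unique for its lifetime.
class BlankNodeGenerator {
  private counter = 0;

  next(): string {
    return `b${this.counter++}`;
  }
}

// Hypothetical factory: only blankNode() is sketched here. A real RDF/JS
// DataFactory also creates named nodes, literals, quads and so on.
class SketchFactory {
  constructor(private readonly generator: BlankNodeGenerator) {}

  blankNode(value?: string): BlankNode {
    return { termType: "BlankNode", value: value ?? this.generator.next() };
  }
}

// Two factories (for example one per parser) sharing a single generator.
const sharedGenerator = new BlankNodeGenerator();
const factoryA = new SketchFactory(sharedGenerator);
const factoryB = new SketchFactory(sharedGenerator);

const fromA = factoryA.blankNode();
const fromB = factoryB.blankNode();

// For contrast: a factory with its own generator starts counting from zero again.
const isolated = new SketchFactory(new BlankNodeGenerator()).blankNode();
```

Here `fromA` and `fromB` get distinct labels because both factories draw from the same counter, while `isolated` reuses the first label and would collide with `fromA` if the two were merged into one dataset.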
    Jacopo Scazzosi
    @jacoscaz
    Sounds like something for a next revision of the spec. For the time being I assume that as implementors of RDF/JS libraries we should provide scoping options to deal with blank node collisions - yes?
    Jacopo Scazzosi
    @jacoscaz
    I think it'd be worth mentioning the nature of blank nodes in the RDF/JS spec and the potential for blank node collisions. This is particularly relevant for those willing to implement the Store interface in a manner that allows the store to be re-used across process restarts (on the backend) or page reloads (on the frontend). For example, the JSON-LD parser https://github.com/rubensworks/jsonld-streaming-parser.js produces colliding blank nodes when used to parse completely different documents across multiple restarts. I don't think this is an issue with the parser itself, though, as its behavior is coherent with the semantics of blank nodes (and so is the behavior of every other library I've tested, this is just an example).
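
To make the collision concrete, here is a toy illustration. It assumes a hypothetical parser (`parseRun`) that labels blank nodes from a counter starting at zero in every run, and uses a plain array as a stand-in for a persistent store. It does not reproduce the behavior of any specific library.

```typescript
type Triple = [string, string, string];

// Hypothetical parser run: blank node labels come from a counter that starts
// at zero each time, as it would in a freshly started process.
function parseRun(names: string[]): Triple[] {
  let counter = 0;
  return names.map(
    (personName): Triple => [
      `_:b${counter++}`,
      "http://xmlns.com/foaf/0.1/name",
      JSON.stringify(personName),
    ],
  );
}

// Stand-in for a persistent store that outlives each run.
const persisted: Triple[] = [];

persisted.push(...parseRun(["Alice"])); // first run, document 1
persisted.push(...parseRun(["Bob"])); // after a restart, unrelated document 2

// Both runs produced the label _:b0, so the two unrelated nodes are now
// indistinguishable in the store.
const distinctSubjects = new Set(persisted.map((triple) => triple[0]));
```

After both runs the store holds two triples but `distinctSubjects` contains a single label, so the data now claims that one node is named both "Alice" and "Bob".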
    Jacopo Scazzosi
    @jacoscaz
    Even factories sharing the same blank node generator as per @bergos 's proposal would not completely solve the problem. A generator-based solution would be to structure the generator's state in such a way as to be serializable with each update and re-hydratable at a later date but that would impose the need of the few (persistent stores) on the needs of the many (the entire spec) and seems to be a no-go to me.
    Based on the work I've done in quadstore to address this, I think that store implementations should be responsible for handling potential blank node collisions.
    Thomas Bergwinkl
    @bergos
    @jacoscaz i don't think it's that easy, or that just defining the leading role for one specific object will solve the problem. i'm not sure how the jsonld-streaming-parser works, maybe blank node ids from the json document are used. that would make it easier especially in a streaming context, but the right way of handling blank nodes would be a map that translates the blank node ids from the document to blank node instances generated from the given factory. that would be also the case for your store. if a factory is given you would need to translate a persistent blank node id to a blank node object from the factory. it would be also possible to define your own factory which comes with the store, so the store has the leading role for the blank nodes. from your perspective i can see that you would prefer the second option for reasons of performance and simpler logic in the code. from the perspective of somebody using the library, i would be happy to have both options. maybe i would like to use a specific factory for a good reason. if i can't hand over the factory to your store i have to do the mapping. that's something we wanted to avoid when we started working on the RDF/JS specs. i would be also happy if parsers would offer me that option. not only for performance reasons, but also for debugging. sometimes it's useful to be able to identify a blank node. but the default mode should be with the map + factory, cause that approach matches (more) the actual idea of blank nodes.
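
The "map + factory" approach described above could look roughly like this. `BlankNodeRelabeler` and the toy factory are hypothetical names introduced for illustration; the sketch assumes only that the factory exposes a `blankNode()` method with an optional label, as in the RDF/JS DataFactory interface.

```typescript
interface BlankNode {
  termType: "BlankNode";
  value: string;
}

interface BlankNodeFactory {
  blankNode(value?: string): BlankNode;
}

// Hypothetical helper: create one instance per parsed document.
class BlankNodeRelabeler {
  private readonly mapping = new Map<string, BlankNode>();

  constructor(private readonly factory: BlankNodeFactory) {}

  // Repeated labels within one document map to the same factory-generated node.
  relabel(documentLabel: string): BlankNode {
    let node = this.mapping.get(documentLabel);
    if (node === undefined) {
      node = this.factory.blankNode(); // no argument: the factory picks the label
      this.mapping.set(documentLabel, node);
    }
    return node;
  }
}

// Toy factory for the example, backed by a simple counter.
let generated = 0;
const toyFactory: BlankNodeFactory = {
  blankNode: (value?: string): BlankNode => ({
    termType: "BlankNode",
    value: value ?? `gen${generated++}`,
  }),
};

const firstDocument = new BlankNodeRelabeler(toyFactory);
const secondDocument = new BlankNodeRelabeler(toyFactory);

const sameNodeOnce = firstDocument.relabel("b0");
const sameNodeTwice = firstDocument.relabel("b0");
const otherDocumentNode = secondDocument.relabel("b0");
```

The label `b0` resolves to one node within `firstDocument`, and to a different node in `secondDocument`, even though both documents used the same label.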
    Jacopo Scazzosi
    @jacoscaz

    @bergos - thank you for brainstorming with me!

    the right way of handling blank nodes would be a map that translates the blank node ids from the document to blank node instances generated from the given factory

    Indeed, this is what I'm doing within quadstore ATM. I maintain persistent "scope" instances, each a collection of blank node mappings that can be re-used across write operations. If a scope is provided to a write operation, quadstore translates blank nodes according to the existing mappings and adds a new mapping for each previously-unencountered blank node. Scope mappings are persisted incrementally (and atomically with newly-written quads) as they are added to each scope and each scope can be dropped from the store when not needed anymore. This does not require any specific factory or generator, it's a feature of the store itself. However, this is not (yet) available in the methods coming from the RDF/JS Store interface as I'd rather keep those aligned with the spec.

    maybe i would like to use a specific factory for a good reason. if i can't hand over the factory to your store i have to do the mapping.
    that's something we wanted to avoid when we started working on the RDF/JS specs.

    Yeah, I understand. This is partially the reason behind my current approach - users are completely free to pass whatever factory they want to quadstore without any performance penalty even when using scopes.

    Thinking about your comments, I guess my point is two-fold:
    a) I think it would be best for the RDF/JS spec to explicitly mention the issue of blank node collisions to save users and developers that are not familiar with RDF a few likely headaches (one can spend their whole career without ever hearing about existential variables);
    b) I think the RDF/JS spec should settle on a shared strategy to prevent blank node collisions, and this strategy should be compatible with persistent stores.

    As for the strategy itself, I am actually rather ambivalent as long as I can maintain transactional atomicity and performance levels.
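
A rough sketch of the persistent "scope" idea described in this message. This is not quadstore's actual API: the `Scope` class, its methods, and the label format are assumptions made for illustration. It also assumes scope ids are unique within a store, and it replaces incremental, atomic persistence with a whole-scope JSON snapshot.

```typescript
type Mapping = [string, string];

// Hypothetical scope: a named, persistable collection of blank node mappings.
class Scope {
  private readonly mappings = new Map<string, string>();

  constructor(readonly id: string, entries: Mapping[] = []) {
    for (const [from, to] of entries) {
      this.mappings.set(from, to);
    }
  }

  // Translate an incoming label, adding a mapping the first time it is seen.
  // The new mapping (if any) is returned so that a caller could persist it
  // together with the quads being written.
  translate(label: string): { label: string; added?: Mapping } {
    const known = this.mappings.get(label);
    if (known !== undefined) {
      return { label: known };
    }
    const fresh = `${this.id}-${this.mappings.size}`;
    this.mappings.set(label, fresh);
    return { label: fresh, added: [label, fresh] };
  }

  // Snapshot used by JSON.stringify(); a real store would write increments.
  toJSON(): { id: string; entries: Mapping[] } {
    return { id: this.id, entries: Array.from(this.mappings.entries()) };
  }

  static fromJSON(data: { id: string; entries: Mapping[] }): Scope {
    return new Scope(data.id, data.entries);
  }
}

const scopeBeforeRestart = new Scope("scope1");
const firstWrite = scopeBeforeRestart.translate("b0").label;

// Simulate a restart: serialize the scope, then re-hydrate it.
const scopeAfterRestart = Scope.fromJSON(JSON.parse(JSON.stringify(scopeBeforeRestart)));
const secondWrite = scopeAfterRestart.translate("b0").label;
const unseenLabel = scopeAfterRestart.translate("b1").label;
```

Because the mappings survive the simulated restart, `b0` resolves to the same stored label in both writes, while the previously unseen `b1` gets a new one. Nothing here depends on a particular factory or generator, which matches the point that scoping can be a feature of the store itself.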
    Dmitri Zagidulin
    @dmitrizagidulin
    I wonder if the rdf canonicalization algo (URDNA2015 or whatever) could be useful here? (Since it deterministically derives IDs for blank nodes, from the hash of the graph)
    or it might be overkill
    (overkill because it may not be appropriate for high-throughput / low latency use cases)
    Jacopo Scazzosi
    @jacoscaz
    @dmitrizagidulin I'll tinker with rdf-canonicalize over the next few days and see how much of a performance impact it has. From a brief look at the code I expect it to have a significant impact. Any idea on how I might incrementally store and re-hydrate those blank node mappings with it?
    Thank you for the suggestion!
    Dmitri Zagidulin
    @dmitrizagidulin
    Really good question (re incrementally storing the mappings). I don't know.
    elf Pavlik
    @elf-pavlik
    maybe @gkellogg could share some experience on how ruby implementations deal with blank node collisions and if canonicalization can help here
    Blake Regalia
    @blake-regalia
    yes, canonicalization prevents bnode collisions
    The discussion around it on HN does have some interesting comments
    Adrian Gschwend
    @ktk
    can't remember a HN discussion on RDF & co that was remotely useful, is it this time?
    Jacopo Scazzosi
    @jacoscaz
    Well, I guess it depends on what each of us finds useful. I like reading reasonably-constructed criticism to approaches and technologies I use, it makes it harder to get stuck in my own bubble. There’s a lot of noise that doesn’t deserve any attention but I do find interesting points every now and then. YMMV
    Martynas Jusevicius
    @namedgraph_twitter
    where's the HN link though?
    Martynas Jusevicius
    @namedgraph_twitter
    ok i'm done with commenting over there :D
    danbri
    @danbri:matrix.org
    What’s new and news in rdfjs land?
    @namedgraph_twitter on HN?
    @namedgraph_twitter I gave up on them long time ago ;)
    danbri
    @danbri:matrix.org
    Nice, thanks!
    Ruben Taelman
    @rubensworks
    Hi all! @jacoscaz have been working on a new specification aimed at better interop between query engine libs.
    An initial version of this spec is introduced in this PR: rdfjs/query-spec#4
    Any comments are welcome!
    Tomasz Pluskiewicz
    @tpluscode
    is there a rendered preview online?
    Ruben Taelman
    @rubensworks