    kenwenzel
    @kenwenzel
    Another question: Why do you want to use DTOs? You can just pass the "real" beans to the service layer...

    oh btw, let's say i retrieved a bean via a CONSTRUCT that does not retrieve its properties

    You can also simply use a SELECT query - why do you need CONSTRUCT?

    lolive
    @lolive
    in my mental model, CONSTRUCT creates a graph of objects whereas a SELECT is more a table. so using a SELECT to build a graph of objects sounds unnatural to me.
    oh, btw, can i implement my own equals() and hashCode() on beans, via the behaviours? (that i call Support classes)
    lolive
    @lolive
    Oh, and DTOs make the service layer independent of the lower layers. For example, you avoid sending massive and unexpected SELECTs to the DB.
    1 reply
    kenwenzel
    @kenwenzel

    in my mental model, CONSTRUCT creates a graph of objects whereas a SELECT is more a table. so using a SELECT to build a graph of objects sounds unnatural to me.

    Yes, you are right. The point is that KOMMA uses the underlying RDF store as "the graph" and not a subset that is extracted via construct (only for prefetching of bean properties as discussed before).

    oh, btw, can i implement my own equals() and hashCode() on beans, via the behaviours? (that i call Support classes)

    Yes, you can. But I would avoid this since KOMMA uses RDF's resource identity, comparing URIs or BNode IDs.

    lolive
    @lolive
    ok, but when i need to add beans to a map, for example, i need to have those methods implemented
    kenwenzel
    @kenwenzel
    Those methods are already implemented in EntitySupport (base behaviour that is added to all beans of an entity manager):
    https://github.com/komma/komma/blob/master/bundles/core/net.enilink.komma.em/src/main/java/net/enilink/komma/em/internal/behaviours/EntitySupport.java#L74
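    For illustration, a minimal self-contained sketch of what such identity-based equals()/hashCode() boils down to. The Bean class here is a hypothetical stand-in, not KOMMA's actual proxy class: two beans are equal iff they denote the same RDF resource (same URI or BNode ID), regardless of their property values.

    ```java
    import java.util.Objects;

    // Hypothetical stand-in for a KOMMA bean proxy: identity is the
    // RDF resource identifier (URI or BNode ID), not the field values.
    public class UriIdentity {
        static final class Bean {
            final String id; // URI or BNode ID of the underlying resource

            Bean(String id) {
                this.id = id;
            }

            @Override
            public boolean equals(Object other) {
                // two proxies are equal iff they denote the same resource
                return other instanceof Bean && Objects.equals(id, ((Bean) other).id);
            }

            @Override
            public int hashCode() {
                return Objects.hashCode(id);
            }
        }

        public static void main(String[] args) {
            Bean a = new Bean("http://example.org/x");
            Bean b = new Bean("http://example.org/x");
            System.out.println(a.equals(b));                  // prints "true"
            System.out.println(a.hashCode() == b.hashCode()); // prints "true"
        }
    }
    ```

    With this identity scheme, two proxies obtained independently for the same resource behave correctly as map keys.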
    lolive
    @lolive
    Ok.
    My concern now is about your statement about "the graph"
    I have a unit test that might fail because of that
    I will investigate that a little bit and come back to you later in the afternoon, if you are ok
    kenwenzel
    @kenwenzel
    Yes, that's OK.
    lolive
    @lolive
    a behaviour must implement BOTH a bean interface and Behaviour<beanInterface>? that sounds like overkill, no? just Behaviour<beanInterface> would be enough, no?
    or do you want the @Override annotation to work when overriding a bean's method?
    kenwenzel
    @kenwenzel
    It MUST implement one or more bean interfaces. The Behaviour interface CAN be implemented to access the bean instance itself via getBehaviourDelegate() because 'this' points to the behaviour instance and not to the bean. The generic parameter of Behaviour is just there to avoid type casts.
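    As a standalone sketch of that contract (the interface names mirror KOMMA's, but this is a hypothetical illustration, not the real API): the behaviour implements the bean interface, and implementing Behaviour<T> additionally lets it reach the composed bean via getBehaviourDelegate() instead of 'this'.

    ```java
    public class BehaviourSketch {
        // stand-in for KOMMA's Behaviour interface (illustration only)
        interface Behaviour<T> {
            T getBehaviourDelegate();
        }

        // a bean interface
        interface Person {
            String greeting();
        }

        // another behaviour contributing the base implementation
        static final class PersonBean implements Person {
            public String greeting() {
                return "Hello";
            }
        }

        // MUST implement the bean interface; CAN implement Behaviour<Person>
        // to access the composed bean instead of 'this'
        static final class PersonSupport implements Person, Behaviour<Person> {
            private final Person delegate;

            PersonSupport(Person delegate) {
                this.delegate = delegate;
            }

            @Override
            public Person getBehaviourDelegate() {
                return delegate;
            }

            @Override
            public String greeting() {
                // 'this' is only this behaviour; methods contributed by other
                // behaviours must go through the delegate
                return getBehaviourDelegate().greeting() + ", world";
            }
        }

        public static void main(String[] args) {
            Person p = new PersonSupport(new PersonBean());
            System.out.println(p.greeting()); // prints "Hello, world"
        }
    }
    ```

    The generic parameter on Behaviour only saves a cast when calling getBehaviourDelegate(), which matches the explanation above.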
    lolive
    @lolive
    hello again. if you have some time for me, i still need some clarification about property prefetching in my CONSTRUCT query
    26 replies
    naturzukunft
    @naturzukunft:matrix.org
    [m]

    Hi there,
    i'm Fredy, currently working on https://linkedopenactors.org & https://www.iese.fraunhofer.de/de/innovation_trends/sra/smarte-landregionen.html.
    I use RDF4J to handle RDF data in both projects. This weekend i was playing with KOMMA and got it running for writing/loading my loa organisations.

    In loa i use an RDF4J HTTP repo and maybe this is the cause of the really bad performance. I create and read a really small entity
    (String addressCountry, String addressLocality, String addressRegion, String streetAddress, String postalCode) and it takes ~1 minute.

    Is it possible in principle to use KOMMA with HTTP repos, or is this rather not a use case for KOMMA?

    btw. @kenwenzel thanks for your really fast answer!

    kenwenzel
    @kenwenzel

    @naturzukunft:matrix.org
    Hello Fredy,
    KOMMA normally uses lazy loading. This results in one query per bean for its RDF types and one query for each of its properties. The alternative is to use prefetching with construct queries.
    You can find an example for this at https://github.com/numerateweb/numerateweb/blob/60154f1b543049695c3363f71736a7ee45571ae7/bundles/core/org.numerateweb.math/src/main/java/org/numerateweb/math/rdf/ObjectSupport.java#L21

    Usually it is perfectly possible to use an HTTP repo but you have to reduce the overall number of queries (by prefetching and/or caching).
    Maybe you can give some more explanations on your use case and the related SPARQL queries.
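    To make the cost concrete, here is a toy round-trip count for the lazy-loading scheme described above (one query for the bean's RDF types plus one per property) versus a single prefetching CONSTRUCT. The property count is an illustrative assumption, not measured data:

    ```java
    // Toy model of HTTP round trips against a remote repository:
    // lazy loading issues 1 query for the bean's RDF types plus 1 per
    // accessed property; a prefetching CONSTRUCT fetches everything
    // in a single query.
    public class QueryCount {
        static int lazyQueries(int properties) {
            return 1 + properties; // rdf:type query + one per property
        }

        static int prefetchQueries(int properties) {
            return 1; // one CONSTRUCT retrieves types and properties together
        }

        public static void main(String[] args) {
            int props = 5; // e.g. a bean with five string properties
            System.out.println(lazyQueries(props));     // prints "6"
            System.out.println(prefetchQueries(props)); // prints "1"
        }
    }
    ```

    With per-request HTTP latency, the difference scales linearly with the number of beans and properties touched.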

    naturzukunft
    @naturzukunft:matrix.org
    [m]
    UseCase:
    I have a lot of publications (CreativeWork) in an RDF repository that have the following structure:
    https://linkedopenactors.org/specification/information_exchange_datamodel.html#information-exchange-data-model
    This RDF repository is a clone of an existing database. It's just an RDF representation that also provides SPARQL.
    There is a scheduled job that looks for changes in the main database and brings new publications or changes to publications into the rdf repository.
    So every few minutes a variable number of publications (CreativeWork) are created or changed.
    I will try to make my code with unit test available soon. I may also have made a mistake when initialising the entity manager.
    naturzukunft
    @naturzukunft:matrix.org
    [m]

    This here seems to be the biggest problem:

    public void save(String subject, PostalAddressLoa postalAddressLoa) {
      PostalAddressKomma created = entityManager.createNamed(URIs.createURI(subject),PostalAddressKomma.class);
      created.setAddressCountry(postalAddressLoa.getAddressCountry());
      created.setAddressLocality(postalAddressLoa.getAddressLocality());
      created.setAddressRegion(postalAddressLoa.getAddressRegion());
      created.setPostalCode(postalAddressLoa.getPostalCode());
      created.setStreetAddress(postalAddressLoa.getStreetAddress());
    }

    because each line seems to do a lot of HTTP requests?! Each setter takes ~10 sec.
    I think there is another way to create objects, but haven't found it yet.

    kenwenzel
    @kenwenzel
    OK. Do you already use transactions? You should wrap the creation of your entities within a transaction by using
    ITransaction tx = entityManager.getTransaction();
    tx.begin();
    try {
      // create beans here
      tx.commit();
    } catch (KommaException e) {
      // undo any partial changes
      tx.rollback();
    }
    BTW, 10 s for each setter is way too long. Which KOMMA version are you using? Do you use the IModelSet API or only the entity manager?
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    Transactions are a good idea: test runtime before was 55.853 s, with a transaction 12.379 s.
    I use 1.4.0, copied from an example.
    Switching to 1.5.2 ends in applicationContext errors during initialization.
    Is there an example somewhere with 1.5.2?
    kenwenzel
    @kenwenzel
    I'll try to upgrade https://github.com/komma/komma-examples to 1.5.2 tomorrow. But we're already using it in our linked data platform enilink https://github.com/enilink/enilink without any problems... hm...
    @naturzukunft:matrix.org Can you provide your examples/unit tests somehow?
    kenwenzel
    @kenwenzel

    I think version 1.4.0 uses eager change tracking, leading to many HTTP requests, as you observed. This is fixed in version 1.5.2.
    You can try:

    DataChangeTracker changeTracker = injector.getInstance(DataChangeTracker.class);
    changeTracker.setEnabled(null, false);

    Use the code above once before executing any modifications with an entity manager.
    The enabled state of the change tracker is a thread-local.

    1 reply
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    @kenwenzel: we're currently struggling with another problem. do you have experience with using RDF4J in a cluster?
    At the moment one of our nodes seems to lock the repo directory and the other node fails.
    We would have the same problem using KOMMA.
    kenwenzel
    @kenwenzel
    @naturzukunft:matrix.org Are you trying to directly share the repo directory between multiple instances? Are you using the NativeStore?
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    yes, we do ;-) maybe not a good idea?
    we were using the memory store, but changed to the native store. Same problem with both.
    i got confused by the "The Repository API supports multithreaded access" statement because i assumed that included threads on several nodes.
    kenwenzel
    @kenwenzel
    You can't share the files between different NativeStore instances. If you need clustering then you have to switch to GraphDB https://www.ontotext.com/products/graphdb/ which AFAIK only supports full replication, or Stardog https://www.stardog.com/
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    that sounds awful. is there no open source solution?
    kenwenzel
    @kenwenzel
    You could also try Virtuoso or Blazegraph (discontinued, now Amazon Neptune). Do you really need clustering for performance reasons or only for HA?
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    In the IESE environment, we very definitely need fail-safety. It will be a kind of activityPub server, but it is still unclear which features will actually be implemented and which will be publicly accessible. We are starting very small and want to gain experience with rdf and activityPub. We would also use it as a kind of message bus, exchanging messages/tasks between systems. If we open it up to users, the number of users can reach ~ 30,000 quite quickly. However, each user would have their own repository! So load balancing could also be achieved without a cluster.
    kenwenzel
    @kenwenzel
    I would also suggest starting with the NativeStore and then switching the database technology if necessary. We also combine RDF4J stores, e.g. with LevelDB, for high-performance (> 100,000 values per second) collection of time-series data.
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    i did not really see the issue with the cluster, if there is a shared file system. What is the difference to a multithreaded app on one node? Am I missing something?
    kenwenzel
    @kenwenzel
    A multi-threaded app is able to use locks to ensure that the database file(s) are only updated by one thread at a time, so race conditions, like two simultaneous threads overwriting each other's data, are prevented. Two separate processes need to synchronize their work by other mechanisms, e.g. by locking whole files. This is exactly what you are observing with the NativeStore. But this is the same for (most) other database systems: they all need to ensure that the data files are only updated by one process at a time - it doesn't matter if the database is distributed by nature or not.
    BTW, using a network file system for a database is only a good idea if it is connected via a really fast network.
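    The whole-file locking described above can be demonstrated with plain java.nio, independent of RDF4J. The second attempt to take an exclusive lock on an already-locked file fails, which is essentially what the second NativeStore process runs into:

    ```java
    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.nio.channels.OverlappingFileLockException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public class LockDemo {
        // true if an exclusive lock on the channel's file could be acquired
        static boolean tryExclusiveLock(FileChannel channel) {
            try {
                return channel.tryLock() != null;
            } catch (OverlappingFileLockException | IOException e) {
                return false;
            }
        }

        // simulates two competing "processes" opening the same lock file
        static boolean[] demo() {
            try {
                Path file = Files.createTempFile("store", ".lock");
                try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
                     FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE)) {
                    return new boolean[] {
                        tryExclusiveLock(first),  // first opener gets the lock
                        tryExclusiveLock(second)  // second opener is locked out
                    };
                } finally {
                    Files.deleteIfExists(file);
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }

        public static void main(String[] args) {
            boolean[] result = demo();
            System.out.println(result[0]); // prints "true"
            System.out.println(result[1]); // prints "false"
        }
    }
    ```

    A real second process would see the same failure via the lock file in the NativeStore's data directory; within one JVM the conflict surfaces as an OverlappingFileLockException instead.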
    Did you make any progress with KOMMA?
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    Unfortunately there was no time for Komma today.
    kenwenzel
    @kenwenzel
    Let me know if you've made any progress :-)
    naturzukunft
    @naturzukunft:matrix.org
    [m]
    ^ found it !
    kenwenzel
    @kenwenzel

    Cool!
    BTW, to speed up the retrieval of the bean at this line

    https://git.fairkom.net/fairsync/development/loa-suite/-/blob/master/loa-repository-KOMMA/src/main/java/org/linkedopenactors/repository/PostalAddressRepository.java#L47

    you can replace em.find(...) with something like:

    em.createQuery("construct { ?s a <komma:Result> ; ?p ?o } where { ?s ?p ?o }")
        .setParameter("s", uri)
        .getSingleResult(PostalAddressKomma.class);