    Pierre-Antoine Champin
    @pchampin

    Hi, you might be interested in a small tool I just started hacking. It allows you to explore an RDF store in your filesystem, using FUSE.

    https://github.com/pchampin/rdflib-fuse

    For the moment, it is read-only, but my plan is to support mutations as well (through creating, modifying and deleting files).

    Benjamin Riesenberg
    @briesenberg07
    Hi all:
    Question from a relative newcomer to Python, and rdflib.
    I'm planning to write an XSLT script to convert RDF/XML data into an HTML document.
    As part of this workflow I'm planning on using Python + rdflib to retrieve some triples and serialize as RDF/XML. This will be run on a regular basis going forward, so I thought I should ask:
    Is the output from rdflib's RDF/XML serializer stable?
    That is to say, can it be reasonably expected to stay mostly the same going forward? Or is this functionality still under development and likely to change?
    Thank you!
    Carlos Vega
    @carlosvega
    Hello, I have an issue with blank nodes that I've posted on Stack Overflow.

    Is there any way to avoid the creation of that extra rdf:Description tag and provide parseType="Resource" to dcterms:modified?

    import rdflib
    from rdflib.namespace import Namespace, DCTERMS, RDF, XSD
    from datetime import datetime
    
    graph = rdflib.Graph()
    
    graph.bind('dcterms', DCTERMS)
    graph.bind('xsd', XSD)
    description = rdflib.URIRef(f'#TNFalpha_944')
    
    w3cdtf_node = rdflib.BNode()
    
    date = rdflib.Literal(datetime.now(), datatype=XSD.dateTime)
    graph.add((description, DCTERMS.modified, w3cdtf_node))
    graph.add((w3cdtf_node, DCTERMS.W3CDTF, date))
    
    ann = graph.serialize(format="pretty-xml").decode('utf-8')
    print(ann)

    This is the code I used to generate that output.

    joylix
    @joylix
    Hi all, will RDFLib support RDF* (also known as RDF-star)?
    Iwan Aucamp
    @aucampia
    @joylix if someone adds support for it
    I am sure there is no opposition in principle
    but resources are finite
    Daniele Nicolodi
    @dnicolodi

    Hello, to make a long story short, I have some data which I think could be nicely modeled as an RDF graph and for which SPARQL could be an effective query language. I don't know much about RDF, SPARQL or RDFLib, so I started with a tiny example to familiarize myself with the concepts:

    import rdflib
    from rdflib import FOAF
    
    graph = rdflib.Graph()
    
    for n in range(0, 1000):
        node = rdflib.term.BNode()
        graph.add((node, FOAF.age, rdflib.term.Literal(n)))
        graph.add((node, FOAF.name, rdflib.term.Literal(f'Name{n:}')))
    
    rows = graph.query("""SELECT
                            ?name ?age
                          WHERE {                 
                            ?x :age ?age . FILTER(?age > 998) 
                            ?x :name ?name . }
                          ORDER BY ?age""",
                       initNs={'': FOAF})
    
    for name, age, in rows:
        print(name, age)

    This simple test runs in about two seconds on my laptop. Unfortunately this kind of performance would not be sufficient to work with my real data. Am I doing something wrong, or is the processing in RDFlib the bottleneck? Is there a way to speed this up?

    Thank you!

    Thomas Tanon
    @Tpt
    Hi Daniele! RDFLib is written in plain Python. It is designed to be feature complete and easy to use and extend, but it is not heavily optimized for performance.
    Anyway, your code takes 0.95s on the first run on my laptop and only 0.2s if I rerun it in a REPL with the imports already loaded. If I drop the printing and replace range(0, 1000) by range(0, 100000) it takes 18s.
    Tory Clasen
    @tclasen
    @dnicolodi, without profiling the code I can only make a guess: if you build your triples as an iterable, such as a list comprehension, you can do a bulk insert into the graph using addN: https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.html#rdflib.graph.Graph.addN
    Iwan Aucamp
    @aucampia
    Hi, not sure if this is the best forum to ask, but are you open to a PR for adding linting?
    Ashley Sommer
    @ashleysommer
    Hi @aucampia
    What kind of linting are you suggesting? We already require PEP8 compliant pull-requests, and we also strongly suggest all contributors use black on their code before creating a PR (though we don't enforce that).
    We've had the discussion about linting many times in the past. The crux of the matter is we want to keep the barrier to contribution as low as possible for users of RDFLib. And the kinds of people who use RDFLib are not necessarily software engineers. We want researchers, academics, scientists, semantic experts, ontology experts, etc. to be able to contribute to RDFLib. I don't like to stereotype or put people into categories, but in my experience a lot of experts do not want to jump through hoops to make their code compliant.
    Iwan Aucamp
    @aucampia
    @ashleysommer thanks for the reply, those are valid points. I was thinking of PEP8, specifically autopep8. I guess I read right past that requirement: when I ran autopep8 on the codebase there were many warnings, so I assumed it was not being used. It makes sense to only apply it to pull requests, though.
    Iwan Aucamp
    @aucampia
    so I am looking at this, to add tests for transitive_objects and transitive_subjects: https://github.com/RDFLib/rdflib/blob/master/examples/transitive.py
    nvm, I get it I think
    Iwan Aucamp
    @aucampia
    are there some standard graphs that you run tests on?
    Iwan Aucamp
    @aucampia
    I made a PR here: RDFLib/rdflib#1307 - suggestions for tests welcome
    Iwan Aucamp
    @aucampia
    This issue can be closed: RDFLib/rdflib#1279
    Any suggestions on tests here would be appreciated: RDFLib/rdflib#1291
    Iwan Aucamp
    @aucampia
    Are there any plans to make further 5.x releases?
    And if not can I look at it?
    white_gecko
    @white_gecko:matrix.org
    As I see it, we have merged some breaking changes already so we will head for 6.x
    Iwan Aucamp
    @aucampia
    Well, we could branch 5.0 and cherry-pick some changes
    but I guess if there is no big pressure it is not needed; if 6.x is not too far on the horizon it makes sense
    Iwan Aucamp
    @aucampia
    This should be closed also: RDFLib/rdflib#1291
    Iwan Aucamp
    @aucampia
    What is the policy on type annotations, are they allowed? Encouraged/Discouraged?
    Iwan Aucamp
    @aucampia
    Asked it here also: RDFLib/rdflib#1311
    Have you considered enabling the GitLab Community features?
    So there is a place to ask questions like that without making issues
    Iwan Aucamp
    @aucampia
    why is there so little funding for RDF :/ - really need more people on RDFLib
    If I were in control of universities I would get people to make PRs for things like this instead of dumb coding assignments
    Iwan Aucamp
    @aucampia
    @dnicolodi if you want performance (and better maintenance) try using rdf4j or jena - they are both fast and get more maintenance, though you will need to run on the JVM
    white_gecko
    @white_gecko:matrix.org
    @aucampia: you are right, there is a lot to do in rdflib. We try our best to maintain it and are happy about contributions that improve it.
    Also, there is some activity around improving the performance of rdflib
    Iwan Aucamp
    @aucampia
    My recommendation of rdf4j and jena is not meant to disparage rdflib. I also use rdflib most of the time, because I usually don't want to struggle with the JVM, and the JVM does not have pip, pipx, etc. I just don't want people to avoid RDF because of a performance concern with rdflib
    rdflib is awesome for what it is
    but if I were to build something production grade that needs good performance I would use Jena or RDF4J
    I think the best hope is to find more commercial applications for RDF
    The more commercial use the more funding and more contributions
    But if universities were better actors in the ecosystem it would help. If they directed resources to maintaining existing stuff instead of making yet another research project that will be abandoned, it would be very beneficial
    Iwan Aucamp
    @aucampia
    I am going to submit a fix for tox this weekend, and then also submit changes to add mypy to the CI pipeline (and eventually to tox) - I hope it is well received. I will also try to make CONTRIBUTING.md similar to this: https://github.com/pallets/click/blob/main/CONTRIBUTING.rst