    greattang123
    @greattang123

    Dear rdflib maintainers,
    I am a user of rdflib. When I use rdflib.Graph to parse an .nt file, the terminal output is
    rdflib.exceptions.ParserError: Invalid line: http://dbpedia.org/resource/2015_African_Rugby_Under-19_Cup_Division_"A" .
    When I remove the double quotes around "A" (i.e., "A" -> A), the program runs fine. Is there a way to parse the .nt file without removing the double quotes around "A"?
    Thanks a lot.
    Yours sincerely.

    Appendix
    Data{ <a:> http://www.w3.org/2002/07/owl#sameAs http://dbpedia.org/resource/2015_African_Rugby_Under-19_Cup_Division_"A" .}
    Code{
    from rdflib import Graph
    g = Graph()
    g.parse("./data/example.nt")
    print(f"Graph g has {len(g)} statements.")
    print(g.serialize(format="turtle"))
    }

    Graham Higgins
    @gjhiggins
    Looks like a problematic URL for dbpedia. With or without the double quotes, that dbpedia URL returns a “301 See other” and the redirect target is https://dbpedia.org/resource/2015_African_Rugby_Under-19_Cup_Division_%22A%22 - which itself fails to resolve. OTOH, wikipedia returns a corresponding redirect, https://en.wikipedia.org/wiki/2015_African_Rugby_Under-19_Cup_Division_%22A%22, which is found. The dbpedia page is essentially blank; it has none of the info for that entry on wikipedia.
    Iwan Aucamp
    @aucampia
    @greattang123 I once had similar issues, and what I did was basically to run urllib.parse.quote on the bad URL first
    @greattang123 https://gitlab.com/aucampia/incubator/-/blob/master/osdu-ld/tools.py/src/osdu/ld/util.py#L143 < nasty code, but that is one place where I used it
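A minimal sketch of the percent-encoding approach mentioned above, assuming the offending character sits in the path part of the IRI. The helper name `quote_iri` is made up for illustration and is not part of rdflib; it relies only on the standard library's `urllib.parse`.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_iri(iri: str) -> str:
    """Percent-encode unsafe characters in the path of an IRI.

    Hypothetical helper (not an rdflib API). Only the path is quoted;
    scheme, host, query and fragment are passed through unchanged.
    """
    parts = urlsplit(iri)
    # Keep "/" so the path separators survive; characters such as '"'
    # become percent escapes (e.g. %22).
    safe_path = quote(parts.path, safe="/")
    return urlunsplit(parts._replace(path=safe_path))


bad = 'http://dbpedia.org/resource/2015_African_Rugby_Under-19_Cup_Division_"A"'
print(quote_iri(bad))
# http://dbpedia.org/resource/2015_African_Rugby_Under-19_Cup_Division_%22A%22
```

One way to use this would be to rewrite the IRIs in the .nt file with such a helper before handing the file to `Graph.parse()`. Note that, as written, it would double-encode an IRI whose path already contains `%` escapes.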
    Thomas Tanon
    @Tpt
    Hi! I would love to get feedback on: RDFLib/rdflib#1396
    I am moving forward with my Oxigraph project (https://github.com/oxigraph/oxigraph) and I would love to be able to provide a fast rdflib store based on it.
    The task description is a bit convoluted. I can work to make it clearer if needed
    Graham Higgins
    @gjhiggins
    I'm working in that area atm, trying to progress the "identifier-as-context" issue that's been extant since 2010 or thereabouts and which has had a couple of implementations, the last one being oohlaf's PR of RDFLib/rdflib#958. It's looking reasonably good thus far but there's a coupla aspects I'm not sure about but may just be down to terminological inexactitude - f'rinstance https://github.com/RDFLib/rdflib/blob/9379a69d6ec6e18819aa0c5a0d54849e7abb8223/rdflib/graph.py#L1558 where ConjunctiveGraph binds a Graph to the instance variable self.default_context.
    Thomas Tanon
    @Tpt
    @gjhiggins Great! Thank you! The PR you link does not solve my problem but seems a good step in the right direction
    Iwan Aucamp
    @aucampia
    @gjhiggins you developing on windows? If you want I can tune up the docker compose file a bit to make it easier and quicker to run tests
    should probably do it anyway
    Graham Higgins
    @gjhiggins
    's not my primary dev env, I'm on Mint and Python 3.9. I have an up-to-date-ish Win10 VM but that's running Python 3.10, so for modularity, I've just reinstated an old Windows 10 VM and installed 3.8 on it. I also have Big Sur on a VM so I can investigate any macOS-specific issues.
    Graham Higgins
    @gjhiggins
    @aucampia huh, tests all pass on Windows 10 VM with Python 3.8.10
    Graham Higgins
    @gjhiggins
    @aucampia aha, triples order varies with run, sometimes the sets match, sometimes they don't.
    Iwan Aucamp
    @aucampia
    I was actually just wondering how people run tests. I set up a virtual environment and run them in there; tox also works but it is quite fiddly if you just want to run one test file. Docker compose can maybe be the simplest option to run tests.
    Graham Higgins
    @gjhiggins
    venvs here, ftw. I have sublime text set up to run pytest on saving the file, I also just execute ./runtests.py /test/<the test> or just ./runtests.py
    Graham Higgins
    @gjhiggins
    I also have sublime text run black every time the file is saved and mypy every time it's opened. Might as well take advantage of these coding/development assistants, they do seem to help avoid committing typos/mispastes.
    If only they helped with brain farts :(
    Graham Higgins
    @gjhiggins
    @aucampia Thanks! I stole aucampia/rdflib@20b3d1c
    Iwan Aucamp
    @aucampia
    glad it helped :D
    Graham Higgins
    @gjhiggins
    Sadly, I had to revert, that particular approach creates failures in a whole buncha DAWG tests :(
    Iwan Aucamp
    @aucampia
    if you share branch and let me know what is happening I don't mind having a look also
    Graham Higgins
    @gjhiggins
    Will do, grateful for the offer. If that approach can be made to work, it'll relieve a particular pain point: ConjunctiveGraph.parse() creates a Graph to handle the parsing and has a profoundly unwanted side effect: it creates a BNode in the Store's contexts, which screws things up from the Dataset side.
    iwan.aucamp
    @iwan.aucamp:matrix.org
    [m]
    okay will check it out
    is there a test to reproduce? Maybe I can write one
    Graham Higgins
    @gjhiggins
    I pared it all down to just the rebased PR plus the changes from your commit and a file of tests which I took from my alternate thread of work - which preserves the original approach of using a Graph for parsing. https://github.com/DOACC/rdflib/tree/identifier-as-context
    Graham Higgins
    @gjhiggins
    In another repo (https://github.com/gjhiggins/rdflib/tree/identifier-as-context), I have a way more extensive re-working (tbh, it's something of an ad hoc mess, very much a cluttered workbench because of the extensive impact of the changes, but the tests are keeping me fairly straight). There are a number of issues that are pertinent and serve to set the scope of the changes required to make the result coherent. The core dataset test file is test_dataset_default.py; it exercises the three aspects of dataset functionality: direct manipulation with Python, indirect manipulation via RDF input, and indirect manipulation via SPARQL update. (I've also had to keep the ConjunctiveGraph in lockstep because of Dataset's inheritance of that class.) There are two main working outputs, test_dataset_anomalies.py and test_dataset_anomalies_resolved.py; there's also test_dataset_anomalies_wip.py and test_dataset_graph_ops.py, both for work-in-progress on different aspects of the subject. Apologies for the mess but I've completely failed to be able to reduce it to a logical sequence of changes.
    Iwan Aucamp
    @aucampia
    :wave:
    sorry been a bit swamped @gjhiggins
    Graham Higgins
    @gjhiggins
    np, been busy on tests
    Iwan Aucamp
    @aucampia
    You have been quite prolific :D - I will try to check in here more often so it is less of a ghost town :smile:
    Graham Higgins
    @gjhiggins
    I happen to have plenty of spare time :)
    white_gecko
    @white_gecko:matrix.org
    [m]
    Very nice, my inbox is full of messages from you. I hope someone will find time to go through it.
    Iwan Aucamp
    @aucampia
    I process one or two a day :smile: - will finish them all next year
    Iwan Aucamp
    @aucampia
    There was some json format for RDF that nick mentioned, which consisted of something like ["subject", "predicate", "object", "type", "language"]
    but I can't recall it now
    Iwan Aucamp
    @aucampia
    @gjhiggins you need help with rebase?
    doing a review of this now: RDFLib/rdflib#1646
    Graham Higgins
    @gjhiggins
    @aucampia Dunno if it's worth it really, all I intended to do originally was rebase oohlaf's rebase of gromgull's original PR. Problem is that it doesn't work, so I thrashed around for a while trying to find a means of getting it to work without a major rewrite, found something that works but without the rewriting of the parsers, it's bound to be crude.
    The only actual upside is the raft of extant issues that get resolved but some of them date back years, so the implication is that they're not exactly showstoppers.
    Graham Higgins
    @gjhiggins
    @aucampia my concern is not to waste people's time with an inadequate contribution. I've never had the skillset for core dev, my role in RDFLib has been primarily of ancillary support.
    Iwan Aucamp
    @aucampia
    Well I'm checking, maybe the right solution is just to reduce the scope of changes a bit, what you are trying to do is quite critical, annoying that the API uses graphs instead of identifiers
    Graham Higgins
    @gjhiggins
    Reducing the scope has been an objective but not easy to maintain as the effects of the changes ripple out. As long as I'm not actually being a nuisance or a burden, I'm okay with it. Thanks for the understanding.
    Iwan Aucamp
    @aucampia
    @gjhiggins there are some minor changes intermixed with your PR that are very uncontroversial and safe, like
    $ git diff origin/master  -- rdflib/util.py
    diff --git a/rdflib/util.py b/rdflib/util.py
    index 7318dcc3..c98dfdac 100644
    --- a/rdflib/util.py
    +++ b/rdflib/util.py
    @@ -364,6 +364,7 @@ SUFFIX_FORMAT_MAP = {
         "html": "rdfa",
         "svg": "rdfa",
         "nq": "nquads",
    +    "nquads": "nquads",
         "trig": "trig",
         "json": "json-ld",
         "jsonld": "json-ld",
    Would you mind if I slice these off and make separate PRs? I will keep your commits but basically just take slices of them; I will make an example to show you.
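For context, a rough sketch of how a suffix-to-format map like the one in the diff can be used to guess a parser format from a file name. This is a simplified stand-in, not rdflib's actual code: the map below holds only the entries visible in the diff, and `guess_format_sketch` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

# Only the entries visible in the diff above; the real map in
# rdflib/util.py is longer.
SUFFIX_FORMAT_MAP = {
    "html": "rdfa",
    "svg": "rdfa",
    "nq": "nquads",
    "nquads": "nquads",  # the line added by the diff
    "trig": "trig",
    "json": "json-ld",
    "jsonld": "json-ld",
}


def guess_format_sketch(path: str) -> Optional[str]:
    """Return a format name for the file's suffix, or None if unknown.

    Hypothetical helper for illustration only.
    """
    suffix = Path(path).suffix.lstrip(".").lower()
    return SUFFIX_FORMAT_MAP.get(suffix)


print(guess_format_sketch("dataset.nquads"))  # nquads
print(guess_format_sketch("README"))          # None
```

With the extra `"nquads"` entry, a file named `dataset.nquads` maps to the same format as `dataset.nq`; without it, the lookup in this sketch would return None for that name.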
    Graham Higgins
    @gjhiggins
    I have absolutely no objections to anything you wish to do with the PR, I'm pathetically grateful for the benefit of your skill and experience.
    No need for example
    I have separated it out into a single commit (https://github.com/RDFLib/rdflib/pull/1646/commits/2914a109c00877ee4ef99103dd975a8c28b26140) if that helps.