Sarven Capadisli
@csarven
@michielbdejong I'll run NSS on my personal site and for dokie.li's inbox
Actually, dokie.li's inbox already uses NSS v5.1.6
I use NSS4 on my local machine.
I don't keep dokie.li's notifications around long though. People experiment all the time on the homepage.
showing all annotations gets a bit out of hand at the moment.. need to improve the UI
a lot of the annotations require authentication so the app ends up fetching the notifications on the homepage but then hits a 401 on the annotation. no point in keeping the tests around..
Sarven Capadisli
@csarven

Some resources respond with a relative Content-Location, e.g.:

curl -I https://www.w3.org/TR/webarch/

content-location: Overview.html

in comparison to the ones that don't send one:

curl -I http://csarven.ca/

So, when I do something like:

curl -ki 'https://web.archive.org/save/https://www.w3.org/TR/webarch/' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-CA,en;q=0.7,en-US;q=0.3' \
  --compressed \
  -H 'Referer: https://localhost:8443/' \
  -H 'Origin: https://localhost:8443' \
  -H 'Connection: keep-alive' \
  -H 'DNT: 1' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache'

I get:

content-location: Overview.html

And that kind of screws things up for me because I can't figure out the actual snapshot location from the headers. It's okay if a JS-enabled agent is making the request because it eventually redirects.. but that's not what I want, because I'm making this call from dokieli and only want to work with the headers (or whatever proper structured data is available.. as opposed to scraping stuff).

This is in comparison to say:

curl -ki 'https://web.archive.org/save/http://csarven.ca/' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-CA,en;q=0.7,en-US;q=0.3' \
  --compressed \
  -H 'Referer: https://localhost:8443/' \
  -H 'Origin: https://localhost:8443' \
  -H 'Connection: keep-alive' \
  -H 'DNT: 1' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache'

which gives a nice workable:

content-location: /web/20190708123256/http://csarven.ca/

Have I missed something obvious that I can use?
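For illustration, a rough sketch (not dokieli's actual code) of only using the Content-Location header when it actually identifies the snapshot; the '/web/' prefix check is an assumption based on the absolute example above.

// Rough sketch: derive the snapshot URL from the Content-Location header
// of a Wayback 'save' response. The '/web/' prefix check is an assumption
// based on the absolute example above.
async function snapshotFromHeaders(targetUrl) {
  const response = await fetch('https://web.archive.org/save/' + targetUrl);
  const contentLocation = response.headers.get('Content-Location');

  // Absolute snapshot path, e.g. /web/20190708123256/http://csarven.ca/
  if (contentLocation && contentLocation.startsWith('/web/')) {
    return new URL(contentLocation, 'https://web.archive.org/').href;
  }

  // A relative value like 'Overview.html' does not identify the snapshot,
  // so the headers alone are not enough here.
  return null;
}
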

Sarven Capadisli
@csarven
@rubensworks https://github.com/rubensworks/rdfa-streaming-parser.js great! How do we get it on dokieli? :)
Ruben Taelman
@rubensworks
@csarven I haven't used simplerdf myself, but it should be just a matter of plugging it in here: https://github.com/linkeddata/dokieli/blob/master/src/simplerdf.js
The parser is based on RDFJS, so it should be compatible with RDF-ext
Sarven Capadisli
@csarven
@rubensworks Yes, hopefully. I think SimpleRDF's parse is a bit buggy though. I had to patch it on the go because we couldn't get the fix into an upstream package.. IIRC it was something DOM related. So, maybe if we change all of the parsers/serialisers in SimpleRDF to use whatever RDFJS is available, that'll work.
Ruben Taelman
@rubensworks
@csarven This may also be relevant for you then: https://www.npmjs.com/package/rdf-parse (the RDFa parser will be added to that soon)
It's basically a convenience package that exposes the same functionality that Comunica uses to parse RDF files.
Not sure if it will work well in your pipeline though.
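For reference, a rough sketch of what using rdf-parse could look like, based on its README at the time; the Turtle snippet and base IRI are made up for illustration.

const rdfParser = require('rdf-parse').default;
const { Readable } = require('stream');

// Made-up Turtle input, just to show the flow.
const textStream = Readable.from(['<#me> <http://xmlns.com/foaf/0.1/name> "Sarven" .']);

rdfParser.parse(textStream, { contentType: 'text/turtle', baseIRI: 'https://example.org/doc' })
  .on('data', quad => console.log(quad.subject.value, quad.predicate.value, quad.object.value))
  .on('error', console.error)
  .on('end', () => console.log('done'));
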
Sarven Capadisli
@csarven
@rubensworks rdf-parse looks like a good candidate. I think we'll have to rework the code around whatever we do. I don't mind dropping SimpleRDF... finally. It served its purpose, but we can't keep messing around with the issues it brings either.
We also need Turtle and JSON-LD serialisers.
Ruben Taelman
@rubensworks
@csarven I'm not really familiar with the internals of Dokieli, but LDflex, Clownface or rdf-object may be good alternatives for SimpleRDF.
Serializers for Turtle and JSON-LD that are based on RDFJS exist as well :-)
Sarven Capadisli
@csarven
--
Fixed the InternetArchive issue for resources with a relative Content-Location on GET. I hate the fix because it looks for a JS line in the snapshot-to-be to extract the snapshot URL. Out of our control.. maybe when IA fixes their Content-Location header we can drop it too.
Ruben Taelman
@rubensworks
But if you need JSON-LD framing, then you'll probably have to use this one: https://www.npmjs.com/package/jsonld
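For example, framing with the jsonld package looks roughly like this; the document and frame are made up.

const jsonld = require('jsonld');

// Made-up input document and frame, just to show the call.
const doc = {
  '@context': { '@vocab': 'http://schema.org/' },
  '@graph': [
    { '@id': 'https://example.org/#alice', '@type': 'Person', 'name': 'Alice' }
  ]
};

const frame = {
  '@context': { '@vocab': 'http://schema.org/' },
  '@type': 'Person'
};

jsonld.frame(doc, frame)
  .then(framed => console.log(JSON.stringify(framed, null, 2)))
  .catch(console.error);
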
Sarven Capadisli
@csarven
SimpleRDF's convenience for fetching was quite useful in the early days, but then so many cases emerged where it made sense to get closer to the fetcher calls with options. Right now SimpleRDF is used to help re-serialise and find/select triples.
Does one of those helper packages let you set the subject URI so that you can grab its p, o?
@rubensworks Do you know @awwright 's https://github.com/awwright/node-rdf ?
Ruben Taelman
@rubensworks
rdf-parse is pretty low-level, it only returns a quad stream, so you'll have to pipe it into some kind of store and execute SPO queries manually.
LDflex on the other hand is more high-level, as it uses a SPARQL engine internally. It allows you to do things like const name = await path.create('http://example.org/mydoc.ttl').friends.name
The former will be more flexible if you need to handle a lot of edge cases manually. The latter is a lot more powerful, and allows you to do complex things using a very convenient syntax.
Ah, and if LDflex queries aren't expressive enough, you could also use GraphQL-like queries with this: https://github.com/rubensworks/graphql-ld.js
I have never used node-rdf myself tbh, so I can't really say much about it, but it looks like it can do a lot.
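To connect the two points above: a sketch of piping the rdf-parse quad stream into a store and then asking for a given subject's predicate/object pairs. N3.Store is just one RDFJS store that would work here (it isn't mentioned in this thread), and the document and subject URIs are made up.

const rdfParser = require('rdf-parse').default;
const { Store, DataFactory } = require('n3');
const { Readable } = require('stream');

const { namedNode } = DataFactory;
const store = new Store();

// Made-up Turtle input standing in for a fetched document.
const input = Readable.from(['<#me> <http://xmlns.com/foaf/0.1/name> "Sarven" .']);
const quadStream = rdfParser.parse(input, { contentType: 'text/turtle', baseIRI: 'https://example.org/doc' });

store.import(quadStream).on('end', () => {
  // "Set the subject URI and grab its p, o": fix the subject, leave p and o open.
  for (const quad of store.getQuads(namedNode('https://example.org/doc#me'), null, null, null)) {
    console.log(quad.predicate.value, quad.object.value);
  }
});
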
Thomas Bergwinkl
@bergos
i think https://github.com/rdfjs/fetch-lite using the dataset should be good enough. simplerdf was mainly used to fetch the rdf data, but selecting the triples was done more in a standard .match() way.
to directly use jsonld from a RDF/JS quad stream, this package could be useful: https://github.com/rdfjs/serializer-jsonld-ext
it's not yet released, but it should be ready and is just waiting for some reviews/feedback
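A rough sketch of the fetch-lite route, assuming it takes a formats object (e.g. @rdfjs/formats-common) and a factory with dataset support (rdf-ext here); the URL is a placeholder.

const fetch = require('@rdfjs/fetch-lite');
const formats = require('@rdfjs/formats-common');
const factory = require('rdf-ext');

fetch('https://example.org/data.ttl', { formats, factory })
  .then(res => res.dataset())
  .then(dataset => {
    // Standard .match() selection over the fetched dataset, as described above.
    for (const quad of dataset.match(null, null, null)) {
      console.log(quad.subject.value, quad.predicate.value, quad.object.value);
    }
  })
  .catch(console.error);
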
Sarven Capadisli
@csarven
@rubensworks Thanks! I don't need to do advanced queries right now, but they can come in handy later. Right now there are some SPARQL queries in place for e.g. statistical linked data, and maybe we can revisit that. For now, I'd like to do the main replacement around SimpleRDF.
@bergos fetch-lite looks great! Do you and @rubensworks plan to add other parsers/serializers to formats-common? It'd be great to see rdfa-streaming-parser in there!
In fact, I think the RDFa parser is the only additional one needed in formats-common as far as dokieli is concerned. We can repackage it ourselves of course, but better in formats-common I think. What do you think?
Ruben Taelman
@rubensworks
It's not on my todo-list at the moment, but it should be easy to PR it in, as the parser implements .import like the other parsers in there.
My next plans with the RDFa parser are to combine it with other kinds of HTML processing, such as extracting script tags that include JSON-LD. But this will require a more elaborate architecture if I want to parse the HTML document only once.
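In the meantime, a possible stopgap on the consumer side, given that the parser implements .import like the others: register it in the shared parsers map locally. Whether formats-common's parsers map accepts exactly this registration is an assumption.

const formats = require('@rdfjs/formats-common');
const { RdfaParser } = require('rdfa-streaming-parser');

// Register the RDFa parser under text/html until it ships in formats-common,
// so anything that looks parsers up by media type (e.g. fetch-lite above) finds it.
formats.parsers.set('text/html', new RdfaParser({ contentType: 'text/html' }));
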
Sarven Capadisli
@csarven
Why JSON-LD specifically?
Thomas Bergwinkl
@bergos
PR to add the rdfa parser to formats-common is very welcome
i think JSON-LD is the most common RDF format embedded into html script tags. schema.org examples are done in that style: https://schema.org/Movie#Movie-gen-25
Sarven Capadisli
@csarven
Sure re JSON-LD in script, however I was trying to understand why it needs to be limited to that, if so. If it's capable of detecting script blocks, why not extract any or all of the RDF syntaxes? I presume the parsing of what's in the block will be handled by a dedicated parser anyway.
Thomas Bergwinkl
@bergos
ah, i think it was just an example. i expect @rubensworks will use media types to find the parser, so it should not be a problem to add other formats.
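A sketch of that idea: collect the data blocks from the HTML once and hand each to a parser chosen by the script element's type attribute. The list of media types and the parseByMediaType helper are placeholders, not an existing API.

// The media types listed and parseByMediaType are placeholders for whatever
// parser lookup (e.g. a formats-common parsers map) ends up being used.
const RDF_SCRIPT_TYPES = ['application/ld+json', 'text/turtle', 'application/n-triples'];

function extractDataBlocks(doc) {
  return Array.from(doc.querySelectorAll('script[type]'))
    .filter(script => RDF_SCRIPT_TYPES.includes(script.getAttribute('type')))
    .map(script => ({ mediaType: script.getAttribute('type'), data: script.textContent }));
}

// e.g.: extractDataBlocks(document).forEach(({ mediaType, data }) => parseByMediaType(mediaType, data));
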
Ruben Taelman
@rubensworks
Indeed, JSON-LD was just an example, as it's the most common one that is embedded in script blocks.
Sarven Capadisli
@csarven
Great!
Sarven Capadisli
@csarven
Added @dr0i and @bergos to https://github.com/linkeddata/dokieli#contributors . Pardon me for not doing this earlier. I'd overlooked @dr0i 's contributions.. and @bergos has been helping with the project over the past few years.
timbl @timbl wonders why the “open” icon in dokieli is a coffee cup
Sarven Capadisli
@csarven
@timbl Random reasons.. :) I figured it'd be like the user starting their day by opening up their work with a cup of coffee. I couldn't find anything particularly suitable that wasn't too mainstream.
Dmitri Zagidulin
@dmitrizagidulin
@csarven lol :) mainstream is a /good/ thing when it comes to icons :)
that's what you /want/
Sarven Capadisli
@csarven
@dmitrizagidulin I'm on to you.. right behind you! Booo
Dmitri Zagidulin
@dmitrizagidulin
:P
Sarven Capadisli
@csarven
What are you up to now?
Dmitri Zagidulin
@dmitrizagidulin
usual shenanigans :) working with digital bazaar, consulting with various, working on a contact mgmt mobile app
Tim Berners-Lee
@timbl
Agreed, the art of icon language is knowing the mainstream user base culture, not avoiding it :-)
Sarven Capadisli
@csarven
Arguably... but I'm just not sold on using iconography like floppy disks to communicate 'saving'