eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
I think maybe a simple closed tagset with unspecific labels like "Label 1", "Label 2", "Label 3" might be sensible
otherwise the quite useful radio-choice is hard to find
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
The upcoming v23.4 release will be the 90th release on github
not counting release candidates
github's release counter is already at 101 ;)
Ogun Oz
@ogunoz
Screen Shot 2022-05-10 at 11.36.14.png
Hello, first of all thank you so much for making such a great annotation tool. I have a question regarding the KB search.
I have a local, custom knowledge base with 500k entities, each with an rdfs:label. When I search for an entry, it takes a long time. I tried increasing knowledge-base.cache-size and decreasing knowledge-base.default-max-results in settings.properties, but it didn't help. Do you have any suggestions? What I wanted was a simple title search (it could even be a typeahead). But maybe there is a more complicated process running under the hood.
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
how long is long?
which FTS mode is configured on your KB?
Ogun Oz
@ogunoz
A single-word query takes around 16 seconds, a two-word query around 30 seconds. I used the default FTS mode, lucenesail#matches, as suggested for local KBs in the documentation.
Is there anything I can configure? That includes restructuring the KB, for example having 500k classes vs. 1 class with 500k instances. At some point I felt like concept linking also runs when I trigger the request, but maybe I am wrong.
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
Do you limit the KB search to a scope in the feature settings or do you use custom KB roots in the KB settings?
Ogun Oz
@ogunoz
I am not sure where the feature settings are. But I chose the OWL schema for the IRI schema and didn't customize anything else. I have 500k entities, each with triples like the ones below:
<{some_domain}/{some_uuid}> a owl:Class ;
    rdfs:label "National cycling route network"@en ;
    dc:identifier "{'external_ids': {...}, 'id': '{some_uuid}'}" ;
    rdfs:comment "A national cycling route network is a nationwide network of designated long-distance cycling routes ..."@en .
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
check in the project settings in the "layer" tab, select the annotation layer you want to link to the KB, select the concept feature and check if any "scope" is set there
Ogun Oz
@ogunoz
image.png
Ah thanks for the instructions. Named entity layer is the one i want to connect and "scope" field is empty
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
can you share the RDF data for investigation?
Ogun Oz
@ogunoz
Sure, i need to anonymize the data a bit then I will share a download link here. Thanks
Ogun Oz
@ogunoz
Hi again, i prepared this mock data with the same number of entries. Triple structure is also same and has slow performance
https://www.swisstransfer.com/d/b63e81c3-bed3-4b60-9adf-155b55bcc2c0
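For anyone who wants to reproduce a KB of this shape without the download, here is a minimal sketch of how such mock data could be generated. It mirrors the triple structure quoted earlier; the domain, the label and comment texts, and the function names are all assumptions made for illustration, and dc:identifier is simplified to the bare UUID.

```python
import uuid

# Prefix declarations needed by the Turtle snippets below.
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix dc: <http://purl.org/dc/elements/1.1/> .\n"
)


def escape(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def make_entity(domain: str, label: str, comment: str) -> str:
    """Return one owl:Class entry with a label, identifier and comment."""
    entity_id = str(uuid.uuid4())
    return (
        f"<{domain}/{entity_id}> a owl:Class ;\n"
        f'    rdfs:label "{escape(label)}"@en ;\n'
        # Simplification: the original data stores a richer value here.
        f'    dc:identifier "{entity_id}" ;\n'
        f'    rdfs:comment "{escape(comment)}"@en .\n'
    )


def mock_kb(n: int, domain: str = "http://example.org/entity") -> str:
    """Build a Turtle document with n mock entities (hypothetical content)."""
    parts = [PREFIXES]
    for i in range(n):
        parts.append(
            make_entity(
                domain,
                f"Mock entity {i}",
                f"Placeholder description for mock entity {i}.",
            )
        )
    return "\n".join(parts)


if __name__ == "__main__":
    # Use n=500_000 to approximate the size discussed here.
    print(mock_kb(3))
```

Redirecting the output to a .ttl file should give something that can be imported into an empty local KB.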
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
thanks - it would be great if you could open a bug report here as well so that you may get a notification when it is fixed or such that there is a way to get back to you: https://github.com/inception-project/inception/issues/new/choose
it may take some time
Ogun Oz
@ogunoz
Of course, thanks for the suggestion, i am on it!
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
did you change the local KB to be "read only"? if you do not make updates to the KB, then switching it to read-only allows more aggressive caching
FYI: I created a project using the "entity linking (wikidata)" quick template, deleted the wikidata KB, created an empty local OWL KB, saved it and then imported your mock data
then imported a toy text, marked up 1-4 words and pressed "space"
the dropdown takes a moment to appear but not 16-30 seconds
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
ConceptLinkingServiceImpl - Generated [367] candidates in 2615ms when pressing space on 5 words
seems to get very slow though when I enter a query term into the "identifier" field
ConceptLinkingServiceImpl - Found [0] candidates exactly matching [mountain, Chicago and taught constitutional law]
ConceptLinkingServiceImpl - Found [79] candidates starting with [mountain]]
ConceptLinkingServiceImpl - Found [1000] candidates using matching [mountain, Chicago and taught constitutional law]
ConceptLinkingServiceImpl - Generated [1000] candidates in 52875ms
seems to depend on the term I enter - e.g. "hexachloride" returns quickly
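To compare timings across query terms, the "Generated [...] candidates in ...ms" lines can be pulled out of the log. The following is a rough sketch that assumes the message format shown above; the function name and the sample list are made up for illustration.

```python
import re

# Matches lines such as "... Generated [367] candidates in 2615ms".
GENERATED = re.compile(r"Generated \[(\d+)\] candidates in (\d+)ms")


def parse_timings(lines):
    """Return (candidate_count, milliseconds) for each matching log line."""
    results = []
    for line in lines:
        match = GENERATED.search(line)
        if match:
            results.append((int(match.group(1)), int(match.group(2))))
    return results


# Sample lines copied from the log excerpts above.
sample = [
    "ConceptLinkingServiceImpl - Generated [367] candidates in 2615ms",
    "ConceptLinkingServiceImpl - Found [79] candidates starting with [mountain]]",
    "ConceptLinkingServiceImpl - Generated [1000] candidates in 52875ms",
]

if __name__ == "__main__":
    for count, millis in parse_timings(sample):
        print(f"{count} candidates in {millis / 1000:.1f} s")
```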
Ogun Oz
@ogunoz
Thank you so much for the initial investigation. I didn't know the effect of the read-only option. I am following the issue on the bug ticket
Shivakumar
@shivain22
Hi, how can I enable logging for the SPARQL queries being fired?
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
try -Dlogging.level.de.tudarmstadt.ukp.inception.kb.querybuilder.SPARQLQueryBuilder=TRACE
which version are you using?
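As a sketch of where that flag goes: JVM system properties must come before -jar on the command line. The helper below only assembles the argument list; the JAR file name is a hypothetical placeholder, not the real artifact name.

```python
# Logger name taken from the flag quoted above.
LOGGER = "de.tudarmstadt.ukp.inception.kb.querybuilder.SPARQLQueryBuilder"


def build_command(jar_path: str, level: str = "TRACE") -> list:
    """Assemble a java command line that sets the logger's level."""
    return [
        "java",
        # System properties have to precede -jar to reach the JVM.
        f"-Dlogging.level.{LOGGER}={level}",
        "-jar",
        jar_path,
    ]


if __name__ == "__main__":
    # "inception-app.jar" is an assumed file name for illustration only.
    print(" ".join(build_command("inception-app.jar")))
```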
Shivakumar
@shivain22
thanks @eckart:messenger.ukp.informatik.tu-darmstadt.de i got the logs
:-)
Richard Eckart de Castilho
@reckart
... with KB improvements ;)
@ogunoz 23.6 should perform much better on your KB
Shivakumar
@shivain22
Hi, I am trying to use AWS Neptune with OpenSearch as the KB. But the SPARQL syntax for OpenSearch full-text search differs from the SPARQL that INCEpTION produces.
Does anybody know of an existing, working solution? I am eager to hear if anyone has implemented this successfully.
eckart
@eckart:messenger.ukp.informatik.tu-darmstadt.de
[m]
hi @shivain22
I have recently added support for the text search in Stardog to INCEpTION
you can look at the pull request and follow the example to see if you can make the necessary extensions for AWS neptune
Shivakumar
@shivain22
@eckart:messenger.ukp.informatik.tu-darmstadt.de wow, that was awesome. I just wonder when you sleep