Rojuinex
@Rojuinex
Hey, just wondering how I can suppress warning W007? I've looked around and all I can see is how to do it from an environment variable, but I'm running in Google CoLab so don't think I can set env vars... Any ideas?
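One possible workaround, sketched under the assumption that the environment variable in question is `SPACY_WARNING_IGNORE` (the name used by spaCy v2.x): a notebook can set environment variables from Python itself via `os.environ`. The catch is that it probably has to happen before spaCy is imported, so in Colab this would go in the first cell of a fresh runtime.

```python
import os

# Assumption: spaCy v2.x reads SPACY_WARNING_IGNORE once, at import time,
# so this line has to run before the first `import spacy` in the runtime.
os.environ["SPACY_WARNING_IGNORE"] = "W007"

# import spacy  # import only after the variable is set
```

Several warning codes can be given as a comma-separated string. If spaCy was already imported, restarting the runtime is likely needed for the setting to take effect.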
Kuro Denjiro
@kurodenjiro
I got the error "The nlp object should have a pretrained ner component." after training the linked model.
How can I fix it?
myeghaneh
@myeghaneh
Hi all, can someone help me with this question?
Sai
@sai-prasanna
Hi, I was trying out displaCy in Streamlit, but the arc labels are not rendering. Any clue what's wrong?
Sai
@sai-prasanna
This code specifically, doesn't render the arc labels as shown in the video.
Qi
@mira67
I am trying to use Dask + spaCy. Does anyone have experience or examples? I need to deploy spaCy to workers in a K8s environment and am not sure if this is feasible.
Kaique Spagnol Tofoli
@kakatofoli_gitlab
Hello people, I am trying to extract the sentence in which the entity was found but I am not having success. Can anyone help me?
the code is entities=[(i, i.label_, i.start_char, i.end_char, i.sent) for i in nytimes.ents]
Gabriel Altay
@galtay
hi friends, I was trying to find an example from Matt Honnibal I'd seen online a while back, it involved streaming through a text corpus that also had metadata. the technique had something to do with splitting the generator so one could stream through the text and the other could yield the metadata and then joining them back up again (i.e. I want to pass one to a tokenizer and one to something that pulls out labels). this might have been in the text classification demo maybe? ringing any bells for anyone?
Gabriel Altay
@galtay
I would be happy if anyone can help me with a response.
Thanks.
Steph van Schalkwyk
@svanschalkwyk
Hi. Trying to use spaCy with bert-squad. How does one set a mask, etc.? Logits? Is spaCy even usable for SQuAD?
delta_ark
@delta_ark_twitter
Hey y'all, is there a general NLP slack/discord/gitter community that people recommend; or, is this sort of where the action is?
I've got GPT2 writing some good poetry for me, and I'm trying to get another network to select the "best" of the output; wondering if I should ask about that here, or in a more general NLP community?
Kevin Brubeck Unhammer
@unhammer
train another transformer on literary criticism, replace the quotations in a template with ones from the first ones =P
delta_ark
@delta_ark_twitter
@unhammer lol
Luca Foppiano
@lfoppiano
Dear all, I was wondering if it's possible to add additional properties to a single Span instance. For example, I extract entities from a document and I have their coordinates, which I would like to attach to the span when I map the text, tokens and entities into spaCy. Any idea on how to do that?
I checked the set/get extension methods, but they are at class level, if I understood correctly.
Jack Rory Staunton
@jack-rory-staunton
hello all! just stumbled on this (never heard of gitter) but I use spaCy every day for nlp/ner! incidentally if there's anyone in the DC area there's a meetup this evening on NER in spaCy
@kakatofoli_gitlab your code works!
Jack Rory Staunton
@jack-rory-staunton
@lfoppiano you can set custom extensions on Token, Span, and/or Doc objects. I found this article to be the most illuminating: https://medium.com/@ashiqgiga07/spacy-and-the-set-extension-attributes-47a094c921c7
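A minimal sketch of what that could look like for the coordinates case. The extension is registered once on the `Span` class, but the value is stored per span (keyed by its boundaries in the doc), so each entity can carry its own coordinates. The attribute name `coords`, the sentence, and the box values are made up for illustration; `spacy.blank("en")` is used so no model download is needed.

```python
import spacy
from spacy.tokens import Span

# Register the attribute once, at class level. force=True only makes
# re-running the cell safe. "coords" is a hypothetical name.
Span.set_extension("coords", default=None, force=True)

nlp = spacy.blank("en")  # tokenizer only, no trained components
doc = nlp("The sample was heated to 300 K in Tsukuba.")

span_a = doc[5:7]  # "300 K"
span_b = doc[8:9]  # "Tsukuba"

# Set the value on one span only (hypothetical x, y, width, height).
span_a._.coords = (10, 20, 50, 12)

print(span_a.text, span_a._.coords)
print(span_b.text, span_b._.coords)  # still the default
```

Since the value lives in the doc's user data and not on the Python object, a new slice over the same tokens (`doc[5:7]`) should see the same `coords`.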
Jack Rory Staunton
@jack-rory-staunton
@galtay not sure if it's what you have in mind but https://support.prodi.gy/t/how-to-incorporate-document-metadata/296/3
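For reference, the "split the generator and join it back up" idea can be sketched with `itertools.tee` from the standard library. This is not necessarily the example from the demo; `stream_records` and `tokenize` are hypothetical stand-ins (the latter for a real pipeline call such as `nlp.pipe(texts)`).

```python
from itertools import tee

def stream_records():
    """Hypothetical corpus stream: each record carries text plus metadata."""
    yield {"text": "spaCy is fast", "label": "POSITIVE"}
    yield {"text": "This parser is slow", "label": "NEGATIVE"}

def tokenize(texts):
    """Stand-in for a real pipeline call such as nlp.pipe(texts)."""
    for text in texts:
        yield text.split()

# Split the single stream into two independent iterators.
records_for_text, records_for_meta = tee(stream_records())
texts = (rec["text"] for rec in records_for_text)
labels = (rec["label"] for rec in records_for_meta)

# Join them back up. zip advances both branches together, so tee only
# has to buffer the records that one branch is ahead of the other by.
paired = list(zip(tokenize(texts), labels))
print(paired)
```

If the goal is just to keep metadata next to each processed text, `nlp.pipe` also accepts `as_tuples=True` and takes `(text, context)` pairs, which may be simpler than splitting the stream by hand.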
Luca Foppiano
@lfoppiano
@jack-rory-staunton thanks!
delta_ark
@delta_ark_twitter
So, no thoughts on making an "NLP editor" yet, eh? I did a bunch of reading about NLP GANs, which seem to be on the way, but aren't here yet.
I also started looking at poetry recommendation engines, and will keep researching that, bc people are coming up with lists of features to use, but those features are often quite silly/arbitrary
Jack Rory Staunton
@jack-rory-staunton
@delta_ark_twitter interesting application you have there. have you tried a synonym mapping (e.g. from a thesaurus or even a rhyming dictionary) for generating new combinations? just a first thought :) also, I bet adding a syllable-count attribute to your tokens would be instrumental.
delta_ark
@delta_ark_twitter
@jack-rory-staunton hey Jack; doing generation is not really the problem; I can do all sorts of cool generative tricks with GPT2. The problem is discrimination. Some of what the model spits out is bad, some is so-so, and some is truly excellent. I want to be able to generate a large text (on a topic) <-- this part isn't hard, but I want another model to come in and identify/clip the best parts and hand them to me, so I can use that to build my poem. (One day, it would be nice to add a scraping component to this whole system, but making the editor/discriminator is more important now). And yeah, this is like, what I'm doing when I'm unemployed, lol.
Jack Rory Staunton
@jack-rory-staunton
@delta_ark_twitter I see, so it's text classification then. clustering on text features ("authorship style") from e.g. textacy stats might help. binary classification (albeit requiring manual annotation) with Vowpal Wabbit might help too. another thing that could be cool: writing "parallel" lines by doing some word vector arithmetic a la [wv('king') - wv('queen') ~= wv('man') - wv('woman')]
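To make the vector arithmetic concrete, here is a toy sketch in plain Python. The 3-d vectors are hand-made so that the analogy holds exactly; real vectors would come from a trained model and would only match approximately.

```python
import math

# Hypothetical toy vectors with axes (royalty, maleness, other).
wv = {
    "king":  [0.9, 0.8, 0.3],
    "queen": [0.9, 0.1, 0.3],
    "man":   [0.1, 0.8, 0.5],
    "woman": [0.1, 0.1, 0.5],
}

def sub(a, b):
    """Element-wise difference of two vectors."""
    return [x - y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

offset_royal = sub(wv["king"], wv["queen"])
offset_common = sub(wv["man"], wv["woman"])

# Close to 1.0 means both offsets point in the same direction.
similarity = cosine(offset_royal, offset_common)
print(similarity)
```

For a "parallel line" one would go the other way: add an offset to a word's vector and look up the nearest word in the vocabulary.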
delta_ark
@delta_ark_twitter
@jack-rory-staunton hi there; cool, I'm going to follow up and research these terms (I'm kind of new to this); do you have a "text classification for dummies" article that you recommend?
Jack Rory Staunton
@jack-rory-staunton
@delta_ark_twitter I don't yet have much experience w text classification per se but this example is where I would start https://www.kaggle.com/poonaml/text-classification-using-spacy
delta_ark
@delta_ark_twitter
^ cool; thank you!
Ryan Schmukler
@rschmukler
Hey all! I'm currently looking at training spaCy with custom semantics à la https://spacy.io/usage/training#intent-parser - I was wondering - how should I go about training documents that have potentially multiple roots? e.g. find hotels with wifi and email them to susan
(perhaps not a perfect example since you could argue that emailing them, etc is connected to find... so perhaps find stores that are open and find hotels with wifi)
Ryan Schmukler
@rschmukler
Actually realized I was misreading how the head indexes worked! This is actually easy :)
Alex Manson
@NeuroWinter
Hey all, I was wondering: does spaCy NER give the salience of an entity?
Dominik
@D-Pavlov
Hi everyone, I have a question that I couldn't find an answer to anywhere.
So I'm training a spaCy NER model and on every iteration I see precision, recall and F1 score (as NER P, NER R and NER F).
My question is, are these values micro- or macro-averages, or something else completely?
Thanks!
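For anyone unsure about the distinction being asked about, here is a small sketch with made-up per-label counts. It only illustrates the two averaging schemes for precision and does not settle which one the training output uses.

```python
# Hypothetical per-label counts: (true positives, false positives, false negatives)
counts = {"PERSON": (90, 10, 10), "ORG": (5, 5, 15)}

def precision(tp, fp):
    """Precision from raw counts, 0.0 when nothing was predicted."""
    return tp / (tp + fp) if tp + fp else 0.0

# Macro-average: score each label separately, then take the plain mean.
macro_p = sum(precision(tp, fp) for tp, fp, _ in counts.values()) / len(counts)

# Micro-average: pool the counts over all labels, then score once.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
micro_p = precision(total_tp, total_fp)

print(f"macro P = {macro_p:.3f}, micro P = {micro_p:.3f}")
```

With these counts the rare ORG label drags the macro average down more than the micro average, which is the usual practical difference. Recall and F1 follow the same pattern.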
Jack Rory Staunton
@jack-rory-staunton
@delta_ark_twitter syllable tool https://github.com/sloev/spacy-syllables
Srevin Saju
@srevinsaju
Hello,
Using spaCy, is it possible to get the past tense of a lemma?
Example: take -> took
gazakhova
@gazakhova
Hi guys, I want to avoid overfitting when training my custom NER model. Is feeding it 'enough' data the only solution? Is there another solution that you would recommend? How important is it to include examples without any entities? E.g. from the spaCy documentation: ("Do they bite?", {"entities": []}). Thanks a lot.
sreejapk
@sreejapk
Hi guys, I have created a custom model for identifying new NER types. The loading time is higher in this case. Any suggestions to reduce it?