    Prateek Saxena
    @prtksxna
    o/
    Prateek Saxena
    @prtksxna
    Good morning o/
    Rachna Chakraborty
    @rachnachakraborty
    Good Evening
    Prateek Saxena
    @prtksxna
    Sorry I wasn't looking at this for a while. Not sure if anyone came here, I don't think they did.
    I'll be lurking here regularly from now
    Sanjaya Kumar Saxena
    @sanjayaksaxena
    We need to add a chat badge to all repos - a pending ToDo
    prtksxna @prtksxna nods
    Prateek Saxena
    @prtksxna
    When using bm25, is it ok to run addDoc after consolidate, and then run consolidate again?
    Sanjaya Kumar Saxena
    @sanjayaksaxena
    documents cannot be added post consolidation
    Prateek Saxena
    @prtksxna
    @sanjayaksaxena: Got it! What would you recommend if I need to start the search but also add documents later?
    @sanjayaksaxena: Thanks
    Sanjaya Kumar Saxena
    @sanjayaksaxena
    @prtksxna as of now, all the earlier documents along with the additional documents will have to be added again
    Prateek Saxena
    @prtksxna
    @sanjayaksaxena: Understood :)
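A minimal sketch of the "add everything again" approach described above, for anyone who needs to search first and add documents later. It assumes only the three calls mentioned in this thread or known from the package (`addDoc`, `consolidate`, `search`). The helper name `createRebuildableIndex` and the `createEngine` factory parameter are hypothetical, not part of wink; the factory is expected to return a fresh, already configured engine each time it is called.

```javascript
// Hypothetical helper (not part of wink): keeps the raw documents so that the
// index can be rebuilt from scratch whenever new documents arrive.
// `createEngine` is an assumed factory that returns a fresh, configured engine
// exposing addDoc(doc, id), consolidate() and search(query).
function createRebuildableIndex(createEngine) {
  const docs = [];   // every document added so far, stored as [doc, id] pairs
  let engine = null; // the current consolidated engine, or null if stale

  // Build a brand-new engine, add ALL documents to it, then consolidate.
  function rebuild() {
    const fresh = createEngine();
    docs.forEach(([doc, id]) => fresh.addDoc(doc, id));
    fresh.consolidate();
    engine = fresh;
  }

  return {
    // Adding a document only records it and marks the index as stale.
    add(doc, id) {
      docs.push([doc, id]);
      engine = null;
    },
    // Searching rebuilds lazily, so several adds cost a single rebuild.
    search(query) {
      if (engine === null) rebuild();
      return engine.search(query);
    }
  };
}
```

With the real package, the factory would presumably create an instance of wink-bm25-text-search and apply the same configuration and prep tasks on every call. Rebuilding cost grows with the size of the collection, so batching several additions before the next search keeps it manageable.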
    Nishant
    @nishantrpai
    pretty cool project i have to say
    what are all the languages which are currently supported?
    Sanjaya Kumar Saxena
    @sanjayaksaxena
    @nishantrpai thank you :)
    @nishantrpai targeted for nodejs/javascript developers
    Sanjaya Kumar Saxena
    @sanjayaksaxena
    We are announcing a big change for wink today: http://winkjs.org/blog/a-more-permissive-license.html
    allnulled
    @allnulled
    Hello
    Rachna Chakraborty
    @rachnachakraborty
    hi there
    Arye Shalev
    @pantchox
    Hello, great job on the suite of libs!
    wanted to know if there is a possibility that the tokenizer will output VB/NN/XX instead of just "word"
    Rachna Chakraborty
    @rachnachakraborty
    @pantchox Thank you for using wink packages.
    Rachna Chakraborty
    @rachnachakraborty
    The tokenizer splits the given text into valid tokens and publishes each token's type as output. For POS tags you will have to use wink-pos-tagger. The output will be in this format: [ { value: 'He', tag: 'word', normal: 'he', pos: 'PRP' }, { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' }, { value: 'trying', tag: 'word', normal: 'trying', pos: 'VBG', lemma: 'try' }, { value: 'to', tag: 'word', normal: 'to', pos: 'TO' }, { value: 'fish', tag: 'word', normal: 'fish', pos: 'VB', lemma: 'fish' } ]
    Rachna Chakraborty
    @rachnachakraborty
    Here is a better formatted output:
    [
      { value: 'He', tag: 'word', normal: 'he', pos: 'PRP' }, 
      { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' }, 
      { value: 'trying', tag: 'word', normal: 'trying', pos: 'VBG', lemma: 'try' },
      { value: 'to', tag: 'word', normal: 'to', pos: 'TO' }, 
      { value: 'fish', tag: 'word', normal: 'fish', pos: 'VB', lemma: 'fish' }
    ]
    you can easily .map this array to any format that you may require.
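As an illustration of the `.map` suggestion, here is a small sketch that turns the tagged tokens into `word/POS` strings. The `tagged` array is copied from the output shown above rather than produced by calling the tagger, and the `word/POS` target format is just one assumed example of what a caller might want.

```javascript
// Tagged tokens, copied from the sample output above. In real use this array
// would come from wink-pos-tagger instead of being written by hand.
const tagged = [
  { value: 'He', tag: 'word', normal: 'he', pos: 'PRP' },
  { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' },
  { value: 'trying', tag: 'word', normal: 'trying', pos: 'VBG', lemma: 'try' },
  { value: 'to', tag: 'word', normal: 'to', pos: 'TO' },
  { value: 'fish', tag: 'word', normal: 'fish', pos: 'VB', lemma: 'fish' }
];

// Replace each token object with a compact "value/pos" string.
const compact = tagged.map((token) => `${token.value}/${token.pos}`);

console.log(compact.join(' '));
// He/PRP is/VBZ trying/VBG to/TO fish/VB
```

The same pattern works for any other shape, for example keeping only `pos` to get a plain list of tags.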
    Arye Shalev
    @pantchox
    thanks