    Joseph Bullock
    @JosephPB

    Hi @ggqshr, no problem. If you want to reproduce the same result each time, you can set random_state to an integer value. See the parameters on the gensim page: https://radimrehurek.com/gensim/models/ldamodel.html

    Hope this helps :)

    ggqshr
    @ggqshr
    @JosephPB Unfortunately, I have set random_state, but the results are still different each time. My situation is the same as on the following page, but the passes parameter does not work.
    Philippe Rivière
    @Fil
    Hello, I'm using gensim to generate an LDA model of my documents. Then I export the vectors to MatrixMarket format and create a 2D embedding with UMAP in JavaScript. So far so good. Now I would like to do this UMAP transform in Python, but I can't find out how to "convert" the document vectors into the LDA topic space… It should be "obvious" in the sense that what I need is an n × m matrix, where n is the number of documents and m is the number of topics.
    Philippe Rivière
    @Fil

    I'm blocked here:

    transformed = lda[corpus_lda]
    X = np.array(transformed)
    embedding = umap.UMAP().fit_transform(X)

    The value of X is an array of lists instead of the numpy array that umap expects.

    Philippe Rivière
    @Fil
    I built the np.array by hand and it works
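
    For readers hitting the same issue: lda[corpus] yields, per document, a sparse list of (topic_id, probability) pairs, and low-probability topics are omitted, so the rows have different lengths and np.array cannot stack them. A minimal sketch of building the n × m array by hand follows. The helper name sparse_docs_to_dense and the sample data are hypothetical stand-ins, not the code actually used in this thread.

```python
import numpy as np


def sparse_docs_to_dense(sparse_docs, num_topics):
    """Turn per-document lists of (topic_id, weight) pairs into an n x m array.

    Hypothetical helper: topics missing from a document's list are left at 0.
    """
    sparse_docs = list(sparse_docs)  # materialise in case a lazy corpus is passed
    dense = np.zeros((len(sparse_docs), num_topics), dtype=np.float32)
    for row, doc in enumerate(sparse_docs):
        for topic_id, weight in doc:
            dense[row, topic_id] = weight
    return dense


# Hypothetical stand-in for the output of lda[corpus_lda] with 3 topics.
transformed = [
    [(0, 0.9), (2, 0.1)],  # topic 1 omitted because its weight was too small
    [(1, 1.0)],
    [],                    # a document with no topic above the threshold
]

X = sparse_docs_to_dense(transformed, num_topics=3)
print(X.shape)
```

    The resulting X can then be passed to umap.UMAP().fit_transform(X). gensim also ships gensim.matutils.corpus2dense, which returns a topics × documents matrix, so transposing its result should give the same layout.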
    Herli Menezes
    @herlimenezes
    Hi, is there any gensim module for portuguese language?
    Herli Menezes
    @herlimenezes
    More specifically: how do I handle diacritics in gensim?
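
    One common approach, sketched here with only the Python standard library, is to strip accents before tokenising. The function name strip_diacritics is hypothetical. gensim itself provides gensim.utils.deaccent, and gensim.utils.simple_preprocess accepts deacc=True, which may be all that is needed.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining accent marks, e.g. 'ação' becomes 'acao'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn") and keep everything else.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("ação coração não"))
```

    Whether to strip accents at all is a modelling choice: in Portuguese it can merge distinct words, so keeping the original text and only normalising it to a single Unicode form is also a reasonable option.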
    chet
    @chetkhatri
    Hi All, is this channel active?
    Ajda
    @ajdapretnar
    @piskvorky Quick question. I know that LSI can return fewer topics than requested (usually for short texts). I think LDA does that, too. How about HDP? Could it ever return fewer than the requested number of topics (in my interpretation, that is the m_T property)?
    Andrew M Olney
    @aolney
    Greetings, I'm teaching a class using gensim at this very moment. All my Windows users have hit “OverflowError: Python int too large to convert to C long” when executing this line of code: fakeDataset = downloader.load('fake-news'). I could try to distribute the dataset manually, but are there any other suggestions?
    Andrew M Olney
    @aolney
    I'll put an issue on GitHub. Thanks :)
    iamsainianuj
    @iamsainianuj
    I am having an issue with an LDA model: after training, when I try to see the topic distribution of some terms, it gives an empty list []. Could anyone tell me why this is happening? Thanks in advance :)
    Rob Creel
    @robcreel

    Good day. I'm going through the tutorials and I'm getting an error. On the run_corpora_and_vector_spaces.ipynb notebook, in the cell with the following code

    for vector in corpus_memory_friendly:  # load one vector into memory at a time
        print(vector)

    I get this error

    HTTPError: 404 Client Error: Not Found for url: https://radimrehurek.com/gensim/mycorpus.txt

    The code does not look like it should be calling/visiting a URL, but it seems to be trying and failing to. What's going on here? How may I run the tutorial?

    Machine specs:
    Operating System: Manjaro Linux
    Processors: 4 × Intel® Core™ i5-3320M CPU @ 2.60GHz
    Memory: 15.5 GiB of RAM
    Notebook is running in Jupyter Lab in Firefox 77.0.1 (64-bit)

    lesshaste
    @lesshaste
    Hi. In word2vec it can be useful to distinguish the context to the left of a word and the context to the right.
    does gensim support this?
    Qi Wang
    @aywq2008_gitlab
    Has sent2vec been merged yet?
    Data Knight 🎠
    @emeka_boris_twitter
    How can I start contributing to gensim?
    ddovnar
    @ddovnar
    Hello everyone. I'm a novice in gensim, so my question will be very simple. How can I train on some simple text data to detect similar text, like this:
    I want "animal" to be the most similar word when I type "cat, dog, rabbit".
    I'm trying to train on this:
    cat animal
    dog animal
    rabbit animal
    But the result is not what I need: "animal" does not come out as the most similar word.
    Stephan ☕️
    @stephankaag_twitter

    I'm trying to get some code written for an older gensim working on v4.

    This is part of the code:

        # Get the doc2vec labels from indices
        for elem in bestdoc2vec:
            ind = d_indices[elem]
            temp = model1.dv.index_to_doctag(ind)
            resultdoc2vec.append((temp, float(avgdoc2vec[elem])))

    This results in the error: AttributeError: 'KeyedVectors' object has no attribute 'index_to_doctag'.

    Any ideas how to rewrite this code for V4?
