accuracy_at_k (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.top_k_accuracy_score.html#sklearn.metrics.top_k_accuracy_score) an implementation of the
hit ratio at k (https://www.researchgate.net/publication/344486356_Hit_ratio_An_Evaluation_Metric_for_Hashtag_Recommendation)
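For anyone comparing the two, here is a minimal sketch that puts sklearn's top_k_accuracy_score next to a hand-rolled hit ratio at k. The toy scores and the `manual_hit_ratio` name are made up for illustration, and the manual version assumes the usual single-relevant-item definition of hit ratio (one true label per sample), which may differ from the variant in the linked paper.

```python
import numpy as np
from sklearn.metrics import top_k_accuracy_score

# Toy data (illustrative only): 4 samples, 3 classes.
y_true = np.array([0, 1, 2, 2])
y_score = np.array([
    [0.5, 0.2, 0.2],
    [0.3, 0.4, 0.2],
    [0.2, 0.4, 0.3],
    [0.7, 0.2, 0.1],
])
k = 2

# Library metric: fraction of samples whose true label is among the k highest scores.
sklearn_score = top_k_accuracy_score(y_true, y_score, k=k)

# Hand-rolled hit ratio at k, assuming exactly one relevant label per sample.
top_k = np.argsort(-y_score, axis=1)[:, :k]        # indices of the k best-scored classes
hits = (top_k == y_true[:, None]).any(axis=1)      # True where the true label is in the top k
manual_hit_ratio = hits.mean()

print(sklearn_score, manual_hit_ratio)
```

Under that single-label assumption the two numbers agree on this data. With several relevant items per user, the definitions can diverge.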
It really depends on the kind of data that you have. If you have a corpus of documents, LDA would be one way to get clusters/topics: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
You could also try pre-trained embeddings like word2vec and the like.
why do they have to be of the same length?
Hello, I am wondering why this PR (scikit-learn/scikit-learn#18758) doesn't show up at the top here:
Is it because I had submitted it a long time ago, but my recent changes are considered updates?
pytest sklearn and see the following. Is this OK, or is there something wrong with my build:
SKIPPED  sklearn/utils/tests/test_validation.py:1374: could not import 'pandas': No module named 'pandas'
==== 355 failed, 19625 passed, 1443 skipped, 117 xfailed, 37 xpassed, 3371 warnings in 2380.84s (0:39:40) ====
(sklearn-dev)