Hello. The format of the file is the standard Gensim binary model format. Please refer to Gensim's docs for more details.
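For reference, loading it in Python looks roughly like this (a minimal sketch; the path is a placeholder, and this assumes a gensim-native save that Word2Vec.load can read, with gensim >= 4):

```python
from gensim.models import Word2Vec

# Placeholder path; point it at the downloaded model file
model = Word2Vec.load("path/to/en.model")

# Example query; on recent gensim versions the vectors live under model.wv
print(model.wv.most_similar("president"))
```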
Hi, I am having issues with the torrent link. Is there any other option to download the model file?
Hi @apogre, see my comment on your issue: idio/wiki2vec#34
Hey, is there another way to download the model files? Torrents are not allowed in our organization.
Hi, I am trying to convert the Wikipedia dump with your class org.idio.wikipedia.dumps.ReadableWiki as described in the readme. But when executing the command (java -Xmx10G -Xms10G -cp org.idio.wikipedia.dumps.ReadableWiki wiki2vec-assembly-1.0.jar path-to-wiki-dump/eswiki-20150105-pages-articles-multistream.xml.bz2 pathTo/output/ReadableWikipedia) I get the following error: "can not find main class". I'm wondering where the main class is located?
Your command seems to be the wrong way around: the -cp argument takes the jar name, not the class name. See the corrected invocation below.
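Assuming the jar name and paths from your command, the fixed invocation just swaps the jar and the main class:

```
java -Xmx10G -Xms10G -cp wiki2vec-assembly-1.0.jar org.idio.wikipedia.dumps.ReadableWiki path-to-wiki-dump/eswiki-20150105-pages-articles-multistream.xml.bz2 pathTo/output/ReadableWikipedia
```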
Does anyone have the pre-trained model for DBpedia entities from the latest Wikipedia dump?
I tried to build the project with the following command: mvn assembly:assembly, but it failed and I got this error message: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test failed: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ? -> [Help 1]
Does anyone know the solution to this?
I am using Maven 3.5, by the way.
Which project are you building?
Hi guys, I am looking to train word2vec on a Wikipedia dump, restricted to computer-science-category articles, but I also need to convert a few bigrams into phrases, like "operating system" to operating_system. Can anyone provide a suggestion?
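For the phrase step, one common approach (a sketch, not necessarily what wiki2vec itself does) is gensim's Phrases/Phraser, which merges frequent bigrams such as "operating system" into operating_system before training; the toy sentences and thresholds below are placeholders, and this assumes gensim >= 4:

```python
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

# Sentences as lists of tokens; replace with your filtered CS-category articles
sentences = [
    ["the", "operating", "system", "schedules", "processes"],
    ["an", "operating", "system", "manages", "memory"],
]

# Learn frequent bigrams; min_count/threshold are illustrative, tune on real data
bigram = Phraser(Phrases(sentences, min_count=1, threshold=1))

# Rewrite the corpus so "operating system" becomes "operating_system"
phrased = [bigram[s] for s in sentences]

# Train word2vec on the phrased corpus
model = Word2Vec(phrased, vector_size=100, min_count=1)
```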
Hello, I've been playing with idio/json-wikipedia and I have an issue using the Java API. If someone has some insight, I'm more than interested! I filed an issue here: idio/json-wikipedia#43
@apogre: did you find any pre-trained model for DBpedia entities?
Hi, I am looking for a basic understanding of how vectors are generated using wiki2vec. Can anybody please point me to the algorithm behind it?
To my understanding, it is word2vec, but in the input data each entity is collapsed into one atomic token, e.g. Barack Obama becomes dbpedia/Barack_Obama; see the sketch below.
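Concretely, that just means the training corpus replaces entity mentions with single tokens before standard word2vec training; a minimal sketch (the corpus lines and token format are illustrative, assuming gensim >= 4):

```python
from gensim.models import Word2Vec

# Corpus where entity mentions have already been collapsed into atomic tokens;
# the dbpedia/... token format here mirrors the example above, not the exact
# convention the wiki2vec pipeline emits
corpus = [
    ["dbpedia/Barack_Obama", "was", "the", "44th", "president"],
    ["dbpedia/Barack_Obama", "served", "two", "terms"],
]

# Plain word2vec training; the entity token gets a vector like any other word
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1)

entity_vector = model.wv["dbpedia/Barack_Obama"]
```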