These are chat archives for ramin-git/word_tree_structure

24th
Apr 2015
ramin-git
@ramin-git
Apr 24 2015 12:35
I have some questions about the word tree: is it a Huffman tree?
Is the word tree generated from the training data using word frequency?
Nicholas Léonard
@nicholas-leonard
Apr 24 2015 14:19
Not a Huffman tree. Yes, you could use word frequency to separate the words into bins. As long as you structure them into the right kind of hierarchy, you should be alright. This is the doc for the SoftMaxTree: https://github.com/clementfarabet/lua---nnx#softmaxtree
ramin-git
@ramin-git
Apr 24 2015 18:47
Thanks, I read it.
Could you please send me a guide for creating a word tree for a new language or new data? I do not know how to get the child_id for a parent_id.
Nicholas Léonard
@nicholas-leonard
Apr 24 2015 19:07
You could create 1000 bins (parentId) of, say, 1000 words (childId) each. You could just divide them by frequency (put the most frequent word in bin 1, the second most frequent in bin 2, etc.).
If you want more speed, you can also group those 1000 bins (which then act as childId) into 10 higher-level bins (parentId).
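Roughly, the binning could look like the sketch below. It is in Python just to show the id bookkeeping, not the actual Lua code. The function name `build_hierarchy`, the id layout (words first, then bins, then groups, then root), and the choice to spread bins over the higher-level groups with a stride are all assumptions made for illustration; check the SoftMaxTree doc linked above for the exact hierarchy format it expects.

```python
from collections import Counter

def build_hierarchy(tokens, n_bins, n_groups=0):
    """Hypothetical helper: map each parent id to a list of child ids.

    Id layout (an assumption): words get 1..V by decreasing frequency,
    bin nodes follow, then optional group nodes, then the root.
    """
    counts = Counter(tokens)
    # Most frequent word first; ties broken alphabetically for determinism.
    vocab = sorted(counts, key=lambda w: (-counts[w], w))
    word_to_id = {w: i + 1 for i, w in enumerate(vocab)}

    n_bins = min(n_bins, len(vocab))
    next_id = len(vocab) + 1
    hierarchy = {}

    # Round-robin: most frequent word in bin 1, second in bin 2, and so on.
    bin_ids = []
    for b in range(n_bins):
        bin_id = next_id + b
        hierarchy[bin_id] = [word_to_id[w] for w in vocab[b::n_bins]]
        bin_ids.append(bin_id)
    next_id += n_bins

    # Optional extra level: group the bins under higher-level parents.
    top_level = bin_ids
    if n_groups > 1:
        top_level = []
        for g in range(n_groups):
            members = bin_ids[g::n_groups]
            if members:
                hierarchy[next_id] = members
                top_level.append(next_id)
                next_id += 1

    root_id = next_id
    hierarchy[root_id] = top_level
    return hierarchy, root_id, word_to_id


tokens = "a a a a b b b c c d".split()
hierarchy, root_id, word_to_id = build_hierarchy(tokens, n_bins=2)
print(hierarchy)  # {5: [1, 3], 6: [2, 4], 7: [5, 6]}
```

With a real vocabulary you would call it with something like `n_bins=1000, n_groups=10`. Each key is a parentId and each list holds its childIds, so every word id appears under exactly one bin.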