Hi everyone! I’m thinking of adding corpora to the CLTK. Are there any restrictions on the kind of corpus? Do they have to be dependency treebanks? Because Chinese is fairly analytic, a phrase structure grammar seems like it might be more suitable.
Also, does Old Japanese count as a classical language? I found the Oxford Corpus of Old Japanese (OCOJ, http://vsarpj.orinst.ox.ac.uk/corpus/). The text and a phonemic transcription are available online. If it’s compatible with the CLTK, I’ll ask for permission.