classifier.train on the entire corpus for each of the languages, and it was taking hours with no end in sight. But I just now tried training line by line instead, and it finished in minutes.
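For anyone following along, here is a rough sketch of what "training line by line" could look like. It is written in Python with a made-up ToyClassifier as a stand-in, so the class, its train(text, label) signature, and the train_line_by_line helper are all assumptions for illustration and not the cadmiumcr/classifier API. It only shows the shape of the loop and does not explain why the real run got faster.

```python
from collections import Counter, defaultdict


class ToyClassifier:
    """Hypothetical stand-in for a trainable text classifier (not the cadmiumcr API)."""

    def __init__(self):
        # label -> token counts accumulated across all train() calls
        self.counts = defaultdict(Counter)

    def train(self, text, label):
        # Each call only tokenizes and counts the text it is given.
        self.counts[label].update(text.lower().split())


def train_line_by_line(classifier, corpus, label):
    """Feed the corpus one non-empty line at a time instead of as one big string."""
    for line in corpus.splitlines():
        if line.strip():
            classifier.train(line, label)


# Tiny made-up corpus, purely for illustration.
corpus_en = "the quick brown fox\n\njumps over the lazy dog\n"
clf = ToyClassifier()
train_line_by_line(clf, corpus_en, "en")
```

The same loop works when reading lines from a file, which also avoids holding the whole corpus in memory as a single string.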
cadmiumcr/classifier and one waiting for cadmiumcr/distance :-)
Case tokenizer
cadmiumcr repo (without discussing it first)?
cadmiumcr/rfcs. Why not use it?
cadmium_language_detector, I'm open to replacing it with your solution, wrapping our algos as engines to a common API, etc.
cadmium_language_detector project is cool, and we can probably do some API merging, but personally I'd like to go with whichever is more accurate for the actual algorithm.
cadmium_lang is pretty dang accurate, even with small text samples.