@Orbiter Thanks for the testing. Yes, the large model is not going to work; it is mainly meant for servers. You need to use the small model.
We can add "susi" to the vocabulary if it is critical; it is not a very common word.
To recognize speech from a 5 m distance you need a special microphone (a ReSpeaker or something like that). What microphone did you use?
I ran test_microphone.py with the same German sentence as before. Vosk understood the sentence almost perfectly, with only one or two words wrong. I repeated the test several times, and there was always at least one wrong word.
I ran htop during the test, and it showed that Vosk is running on one single CPU core only. Is there any way to run it on all four cores of the RPi? That would be a massive improvement!!
Something like --num-threads 2: where could we pass such an option in? Or maybe that's too naive..
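As far as I can tell, the Vosk Python bindings do not expose Kaldi's --num-threads for a single stream; decoding one audio stream is essentially single-threaded. For batch jobs (not live listening) a common workaround is to decode several files in parallel processes, one recognizer per process. A rough sketch, where recognize_file is only a stand-in for the real per-file Vosk decoding, not actual Vosk API usage:

```python
from multiprocessing import Pool

def recognize_file(path):
    # Placeholder for real Vosk decoding of one file. In real code
    # each worker process would create its own Model/KaldiRecognizer
    # and feed it the audio read from `path`.
    return (path, "transcript of " + path)

def recognize_batch(paths, workers=4):
    # One worker per RPi core; each process decodes its own files,
    # so all four cores get used even though each decode is serial.
    with Pool(processes=workers) as pool:
        return dict(pool.map(recognize_file, paths))

if __name__ == "__main__":
    results = recognize_batch(["a.wav", "b.wav"], workers=2)
    print(results["a.wav"])
```

This only helps throughput over many recordings; it does nothing for the latency of a single live utterance.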
I pushed support for vosk to susi_installer. If you have a desktop installation or an RPi, a git pull in the two dirs should be enough.
What remains is putting the models into a directory vosk-data/LL below the speech_recognition python lib; on the desktop this is in
...
LOG (VoskAPI:ReadDataFiles():model.cc:194) Decoding params beam=10 max-active=3000 lattice-beam=2
LOG (VoskAPI:ReadDataFiles():model.cc:197) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0242901 seconds in looped compilation.
LOG (VoskAPI:ReadDataFiles():model.cc:221) Loading i-vector extractor from /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:251) Loading HCL and G from /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/HCLr.fst /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/Gr.fst
LOG (VoskAPI:ReadDataFiles():model.cc:273) Loading winfo /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/phones/word_boundary.int
...
INFO: Keyword 1 detected at time: 2021-03-22 19:57:59
DEBUG: Entering hotword callback
DEBUG: We are idle, so work on it!
DEBUG: vlcplayer: starting to say something!
DEBUG: vlcplayer: finished saying
DEBUG: notify renderer for listening
DEBUG: listening to voice command
DEBUG: Converting audio to text
INFO: Trying to recognize audio with vosk in language: en
DEBUG: recognize_audio => what is the time
...
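The "recognize_audio => what is the time" line at the end suggests the recognizer's JSON result is being reduced to plain text. Vosk's Result()/FinalResult() calls return a JSON string, so the extraction is just a JSON parse; a sketch using a canned result string instead of a real recognizer:

```python
import json

def extract_text(vosk_result_json):
    # Vosk returns results as JSON, e.g. '{"text": "what is the time"}'.
    return json.loads(vosk_result_json).get("text", "")

print(extract_text('{"text": "what is the time"}'))
# what is the time
```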
Hey all, Happy New Year.
Some news from the Vosk project: we have just released a public Japanese model for Vosk; more languages are on the way.
Is fossasia popular in Japan?
SEPIA #OpenAssistant v2.6.1 😃 It comes with a lot of improvements and new features, e.g. shared access permissions 📲📲, custom wake words 🗣🤖, an LED array interface 🚨, new TTS voices, and much more. Check it out: https://github.com/SEPIA-Framework/sepia-installation-and-setup/releases