@Orbiter Thanks for the testing. Yes, the large model is not going to work; it is mainly for servers. You need to use the small model.
We can add "susi" to the model if it is critical; it is not a very common word.
To recognize speech from a 5 m distance you need a special microphone (a ReSpeaker or something like that). What microphone did you use?
I ran test_microphone.py tests with the same German sentence as before. Vosk understood the sentence almost perfectly, with one or two words wrong. I repeated the test several times, and there was always at least one wrong word.
I ran htop during the test, and it showed that vosk is running on one single CPU only. Is there any way to run it on all four cores of the RPi? That would be a massive improvement!
Could we pass in something like --num-threads 2 somewhere? Or maybe that's too naive...
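I don't know whether the vosk Python bindings expose a thread count at all, but one cheap thing to try (purely an assumption on my side, since Kaldi links against a BLAS library that may honor these variables) is to set the usual BLAS/OpenMP thread environment variables before vosk is imported:

```python
import os

# Assumption: vosk/Kaldi's linear-algebra backend (OpenBLAS or MKL) reads
# these variables when the library is loaded, so they must be set *before*
# importing vosk. This does not parallelize the decoder itself.
os.environ["OPENBLAS_NUM_THREADS"] = "4"
os.environ["OMP_NUM_THREADS"] = "4"

# import vosk  # only after the environment is prepared
print(os.environ["OMP_NUM_THREADS"])
```

Whether this actually spreads the load over the four RPi cores would need to be verified with htop again.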
I pushed support for vosk to susi_linux and susi_installer. If you have a desktop installation or an RPi, a git pull in the two dirs should be enough.
What remains is putting the models into a directory vosk-data/LL below the speech_recognition Python lib; on the desktop this is /usr/local/lib/python3.N/dist-packages/speech_recognition/vosk-data/en/
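For reference, a minimal stdlib-only sketch of the lookup I'd expect to happen (the helper name is mine, not the actual susi_linux code): given the speech_recognition package directory and a language code LL, resolve vosk-data/LL and fail loudly if the model was not unpacked there:

```python
from pathlib import Path

def vosk_model_dir(sr_pkg_dir, lang):
    """Resolve vosk-data/<lang> below the speech_recognition package.

    Hypothetical helper for illustration only; raises if the model
    directory is missing so the caller can print a useful error.
    """
    model_dir = Path(sr_pkg_dir) / "vosk-data" / lang
    if not model_dir.is_dir():
        raise FileNotFoundError(
            f"no vosk model for language '{lang}' under {model_dir}")
    return model_dir

# Demonstration with a throwaway directory layout:
import tempfile
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "vosk-data" / "en").mkdir(parents=True)
    print(vosk_model_dir(d, "en").name)  # -> en
```

So after unpacking a model zip, the contents (graph/, ivector/, ...) should end up directly inside vosk-data/LL, as the log below confirms.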
...
LOG (VoskAPI:ReadDataFiles():model.cc:194) Decoding params beam=10 max-active=3000 lattice-beam=2
LOG (VoskAPI:ReadDataFiles():model.cc:197) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0242901 seconds in looped compilation.
LOG (VoskAPI:ReadDataFiles():model.cc:221) Loading i-vector extractor from /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:251) Loading HCL and G from /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/HCLr.fst /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/Gr.fst
LOG (VoskAPI:ReadDataFiles():model.cc:273) Loading winfo /usr/local/lib/python3.9/dist-packages/speech_recognition/vosk-data/en/graph/phones/word_boundary.int
...
INFO: Keyword 1 detected at time: 2021-03-22 19:57:59
DEBUG: Entering hotword callback
DEBUG: We are idle, so work on it!
DEBUG: vlcplayer: starting to say something!
DEBUG: vlcplayer: finished saying
DEBUG: notify renderer for listening
DEBUG: listening to voice command
DEBUG: Converting audio to text
INFO: Trying to recognize audio with vosk in language: en
DEBUG: recognize_audio => what is the time
...
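The last debug line above is the recognizer handing back its decoded text. Vosk returns results as JSON with a "text" field, so the extraction step boils down to something like this (a sketch; the function name is mine, not the susi_linux code):

```python
import json

def extract_text(vosk_result):
    """Pull the recognized sentence out of a vosk result string.

    Vosk's Result()/FinalResult() return a JSON object whose "text"
    field holds the full decoded utterance; an empty result has no text.
    """
    return json.loads(vosk_result).get("text", "")

print(extract_text('{"text": "what is the time"}'))  # -> what is the time
```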
Hey all, Happy New Year.
Some news from the Vosk project: we have just released a public Japanese model for Vosk:
https://alphacephei.com/vosk/models/vosk-model-small-ja-0.22.zip
More languages are on the way.
Is FOSSASIA popular in Japan?
Interesting also:
SEPIA #OpenAssistant v2.6.1 😃 It comes with a lot of improvements and new features, e.g. shared access permissions 📲, custom wake-words 🗣🤖, an LED array interface 🚨, new TTS voices, and much more. Check it out: https://github.com/SEPIA-Framework/sepia-installation-and-setup/releases