
westocl
@westocl
ahhh... Learned something new every day. Never knew that you could just have temporary environment variables in a one-liner.
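For reference, the same effect can be reproduced from Python itself with `subprocess`: the child process sees the temporary variable, while the parent's environment is untouched. A minimal sketch (the variable name `FOO` is made up for the example):

```python
import os
import subprocess
import sys

# Equivalent of the shell one-liner `FOO=bar python -c ...`:
# the variable exists only in the child process's environment.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['FOO'])"],
    env={**os.environ, "FOO": "bar"},
    capture_output=True,
    text=True,
)
print(result.stdout.strip())   # bar
```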
David Zurow
@daanzu
yep, can be pretty handy
westocl
@westocl
when running the text backend for tests, does the text engine execute the same recognition callback sequence ("on_begin_callback", "recognition_callback", "failure_callback", etc.)?
David Zurow
@daanzu
@Danesprite Is mimic() supposed to adhere to grammar/rule contexts?
David Zurow
@daanzu
I'm trying to remember whether it's a good idea to use one Element in multiple Rules. Perhaps this should be mentioned in https://dragonfly2.readthedocs.io/en/latest/elements.html#refelementclasses ?
Dane Finlay
@Danesprite

@westocl

Environment variables are a good way to go here. There is no way to specify arguments to the Python files in question, since they are in fact being imported as modules, not run as programs.

Regarding recognition callbacks, yes, the sequence of callbacks should be exactly the same when using the text backend. That is the whole point of that engine backend. :-)

@daanzu

Yep, mimic() is supposed to do that.

I think reusing an element is fine in most cases. I would say just copy the element with copy.copy() if it causes a problem. Mentioning this in the documentation sounds like a good idea to me.
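A minimal sketch of that copy.copy() suggestion, using a stand-in class in place of a real dragonfly element so the example is self-contained:

```python
import copy

class SpokenNumber:               # stand-in for a dragonfly element class
    def __init__(self, name):
        self.name = name

shared = SpokenNumber("n")

# Reusing the element directly across rules is fine in most cases...
rule_a_element = shared

# ...but if sharing causes a problem, give the second rule a shallow copy.
rule_b_element = copy.copy(shared)

print(rule_b_element is shared)            # False: distinct objects
print(rule_b_element.name == shared.name)  # True: same configuration
```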

westocl
@westocl
@daanzu, @Danesprite Thanks for your help.
Dane Finlay
@Danesprite
No problem. :+1:
Dane Finlay
@Danesprite

If the engine.speak() method for text-to-speech is used by anyone in this channel, I was wondering if there is any interest in changing Dragonfly to make use of the text-to-speech built into Windows for all engine back-ends, if it is available, instead of only with the WSR/SAPI 5 back-end. The Natlink back-end would still use Dragon's TTS.

I suppose this would be mostly interesting to Kaldi users.

Vojtěch Drábek
@comodoro
This seems like the right time to look here. I would definitely be interested; another advantage is already having TTS support for other languages (for possible future Dragonfly development). I wonder, though: SAPI 5 seems possible to use everywhere, but I don't know about using the MS Mobile voices.
David Zurow
@daanzu
@Danesprite Thanks for the info! Regarding TTS, that sounds good to me. It would be nice to integrate something open and cross platform, but that would entail significantly more work.
Dane Finlay
@Danesprite

@comodoro Okay then, I will have a look into this.

I hadn't considered the advantage for other languages. Windows has TTS support for quite a few languages or "voices" through SAPI 5. It should be possible to separate the TTS functionality from each engine class so that you could, for example, use the SAPI 5 TTS instead of Dragon's.

I don't think any of this would work on mobile unless pywin32 does. I would guess only x86/x86_64 devices would work.

@daanzu No worries. It is certainly possible to add an integration for eSpeak or Festival that simply shells out to the command-line programs:
$ echo "speak some words" | espeak --stdin
$ echo "speak some words" | festival --tts
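A sketch of what such an integration might look like in Python, simply shelling out as above. The `speak()` helper and its backend table are hypothetical, and it quietly does nothing if the program is not installed:

```python
import shutil
import subprocess

def speak(text, backend="espeak"):
    """Synthesize `text` by piping it to a command-line TTS program (sketch)."""
    commands = {
        "espeak":   ["espeak", "--stdin"],
        "festival": ["festival", "--tts"],
    }
    cmd = commands[backend]
    if shutil.which(cmd[0]) is None:
        return False                # program not installed; nothing spoken
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)
    return True

speak("speak some words")           # True if eSpeak is installed, else False
```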
Vojtěch Drábek
@comodoro
Yes, I remember from a while ago, I think it was a school assignment, that SAPI 5 was super easy to use, though that was in .NET. In Python it should not be too hard either. I mean using the API; I'm not saying it is no work to implement. I have been under the impression, however, that SAPI 5 TTS is slowly being phased out or frozen (like SAPI STT, perhaps) in favor of MS Mobile voices, and some searching did not reveal much about handling those. I guess there is nothing wrong with using just SAPI for now, but a Czech system, for example, has one MS Mobile voice and no SAPI 5 voice.
Dane Finlay
@Danesprite

@comodoro Ah, okay that is a shame. Thanks for elaborating on MS Mobile voices. I would never have guessed what it was from the name. Leave it to Microsoft to make things more complicated than they ought to be.

Both the TTS and STT parts of SAPI haven't really been actively worked on for a long time. It isn't too difficult to work with the API in Python, I suppose. Dragonfly's SAPI 5 engine back-end works using COM. I can see it is pretty simple to set the current TTS voice. The API error messages given could be more helpful though.

If there is a public API for utilising MS Mobile voices, it would probably require .NET. Since we are using CPython, that would be difficult. I'll just stick with SAPI for now. I suppose you could try Google TTS instead for Czech.

Vojtěch Drábek
@comodoro
Well, there is https://pypi.org/project/winrt/, but it is experimental and for Python 3.7 or later, so not worth the effort. Google TTS has one big, almost decisive disadvantage: the cloud. But open Czech STT for Dragonfly is not on the horizon unless DeepSpeech adds grammars. Or perhaps, @daanzu, what is needed for a Kaldi model? I have several hundred hours of data, and grapheme-to-phoneme transcriptions could be generated, e.g. using eSpeak; Czech is less irregular than English.
Ryan Hileman
@lunixbochs
I thought deepspeech was dead?
Shervin Emami
@shervinemami
I've tried various TTS options on Windows and Linux, mostly from when I tried out what it's like to be a fully blind computer programmer. (The answer is that it's extremely frustrating, but it is possible!)
There are basically two paths to choose between for TTS: 1) nice, natural-sounding TTS, or 2) robotic, harsh-sounding TTS.
Shervin Emami
@shervinemami
Nice-sounding TTS is great for beginners, or anyone just wanting to hear the text easily at normal playback speed (around 0.7x to 1.5x). Robotic TTS is great for people who use TTS a lot at fast playback speeds (1.5x to 4x), for example to hear the contents of a whole paragraph or page of text and consume it very quickly. If that is something you do often, you want speed and efficiency, even if it takes some weeks to get accustomed to fast robotic TTS.
Shervin Emami
@shervinemami
"eSpeak" is great at supporting the power users who want fast robotic speech: it's open source, portable, and well established. For natural, nice-sounding speech, there are various open-source options that work on Linux and other OSes, but it's also an area that Google, Microsoft, and others invest in, since it has commercial prospects for them. My preference would be a nice, natural-sounding, open-source, cross-platform solution as the default TTS backend, with the option to replace it with an alternative backend such as Microsoft, Google, or eSpeak if the user wants. But default to open source and cross-platform.
Shervin Emami
@shervinemami
I personally really like SVOX "pico2wave"; I believe it's a free, open-source TTS with a nice, natural voice on Linux. I've also tried some commercial TTS systems, including Acapela TTS and Cepstral TTS. They tend to be smarter at handling language intonation, but I prefer the way SVOX pico2wave handles symbols, which matters for programming and technical content rather than just natural-language content.
David Zurow
@daanzu
deepspeech: Mozilla terminated its involvement, but since the project is open source, I believe various people are continuing to work on it, including some of the original contributors.
@comodoro For Kaldi, what you have may be enough to work with. What is needed is the audio and matched transcripts, plus a lexicon listing all of the words and their matching pronunciations. I think a lexicon may already be available for Czech.
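For illustration, a Kaldi lexicon is typically a plain text file mapping each word to its phone sequence, one entry per line. These Czech-like entries and phone symbols are made up for the example:

```
ahoj    a h o j
slovo   s l o v o
praha   p r a h a
```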
David Zurow
@daanzu
@shervinemami thanks for the info and comparison. Very interesting!
Vojtěch Drábek
@comodoro
I fail to see an official statement, but the repository seems to be alive. Anyway, there is at minimum a fork called Coqui, recently announced by some of the same people.
@daanzu You mean a lexicon specifically for Kaldi? I will have a look; I only remember the Vystadial one, generated from its (small) dataset.
Dane Finlay
@Danesprite

@comodoro I wasn't aware of the Python winrt project. I would still prefer to just stick with SAPI 5.

Czech is listed as a supported eSpeak language BTW: http://espeak.sourceforge.net/languages.html

You can actually use eSpeak voices with SAPI 5 if you install the Windows version available on this page. I can use the Czech voice that way in Python, although I can't speak to how intelligible it is as I don't speak Czech. :-)

The voices are probably compiled as 32-bit, since I can't use them in 64-bit processes.

Vojtěch Drábek
@comodoro
Yes, that's no problem. As I say, it's not needed right now, and I would feel somewhat obliged to try it myself if it came to that. Right now I am using an English UI (on top of a base Czech system, because of the internal windows-1250 ANSI encoding still being used by some applications). I have tried eSpeak before: terrible quality for Czech, but the grapheme-to-phoneme transcription, if expanded, looks usable (for Kaldi). Exceptions are a problem, though.
David Zurow
@daanzu
@comodoro Yes, the lexicon would need to be in the right format, but it should be easy to use the Vystadial one. It is already in the Kaldi repo.
Vojtěch Drábek
@comodoro
@daanzu I see the generator https://github.com/daanzu/kaldi-fork-active-grammar/blob/master/egs/vystadial_cz/s5/local/phonetic_transcription_cs.pl, which is very simplistic and, I suspect, tailored to the Vystadial CZ dataset, but perhaps OK for a small prototype. I am starting to see why end-to-end is all the rage. :) All right, do you think it would be better to improve upon the Vystadial recipe, or to start a new one from some modern English recipe?
David Zurow
@daanzu
@comodoro The Vystadial recipe isn't bad, although it isn't completely up to date. It's probably worth a try, but it shouldn't be too hard to modify an English one either. I have a Docker image for training that I am working on, but it is still pretty janky.
Dane Finlay
@Danesprite

@comodoro Okay then, fair enough. As Shervin said, eSpeak is quite robotic and isn't for everyone. Just thought I would mention it. There are also Czech voices for Festival.

If you do end up getting Czech STT working, I can help with adding Czech support for IntegerRefs.

@shervinemami Thanks for the info. I'm happy to add to Dragonfly a nice, extensible interface for text-to-speech, allowing use of Dragon, SAPI 5, eSpeak, Festival, pico, etc. It would also not be difficult to add a Speak action class. This functionality isn't a high priority, however.
Vojtěch Drábek
@comodoro
Robotic for English, terrible for Czech:)
I will first try some Kaldi tutorials and see how it goes.
Dane Finlay
@Danesprite
Ah, gotcha :)
Alex Boche
@alexboche
Not sure if I'm following the above discussion. If this talk of TTS is about visual problems, it might be worth talking with Rudiger Wilke; he had a product called DragonEcho for that: http://www.rwilke.de/dragonecho/. Even if his product isn't what's needed per se, he might have some ideas.
Dane Finlay
@Danesprite

@alexboche The Dragonfly TTS functionality discussed above is fairly simplistic. At present, it only consists of the engine.speak(text) method which synthesises text strings into speech. Beyond a Speak action class that does the same thing and better utilisation and choice of the TTS back-end, I don't think additional TTS functionality should be added into Dragonfly.

Dedicated screen reader software like DragonEcho would be much more appropriate for users with visual impairments.

Dane Finlay
@Danesprite

@/all Dragonfly2 version 0.30.0 has now been released, as of 21 March. The additions, changes and fixes are listed in the changelog here: https://dragonfly2.readthedocs.io/en/latest/changelog.html

You can upgrade by running pip install --upgrade dragonfly2.

Thanks very much to everyone who contributed! My apologies for the long gap between this release and the last one.

@lunixbochs Sorry that the Talon integration (PR #326) was not included in this version. I haven't tested the changes sufficiently yet.
David Zurow
@daanzu
@Danesprite thanks for all your work maintaining!
tripfish
@tripfish
Thanks! :+1:
Dane Finlay
@Danesprite
No worries. :-)
timoses
@timoses:matrix.org
[m]
Shouldn't it be possible to call Key from a Function action? It somehow does not seem to execute the Key action. It does if I use Key directly in the mapping.
    def omgzo(**test):
        print('uffi')
        Key('j')

    class TmuxRule(MappingRule):

        mapping = {
            "pane (<dir>|<n>)":
                #Function(lambda **test: Key('j'))
                Function(omgzo)
                #Key('j')
        }
(It does print 'uffi')..
LexiconCode
@LexiconCode
@timoses:matrix.org When using dragonfly actions inside functions, add .execute() after the action: Key('j').execute(). You can run Python files that use dragonfly without a speech engine, and it will execute the actions along with any other code. Example below:
from dragonfly import Key

def omgzo():
    print('uffi')
    Key('j').execute()

omgzo()
timoses
@timoses:matrix.org
[m]
Aw ty. I knew there was something in the bushes : )