Dane Finlay
@Danesprite

@comodoro Okay then, I will have a look into this.

I hadn't considered the advantage for other languages. Windows has TTS support for quite a few languages or "voices" through SAPI 5. It should be possible to separate the TTS functionality from each engine class so that you could, for example, use the SAPI 5 TTS instead of Dragon's.

I don't think any of this would work on mobile unless pywin32 does. I would guess only x86/x86_64 devices would work.

@daanzu No worries. It is certainly possible to add an integration for eSpeak or Festival that simply shells out to the command-line programs:
$ echo "speak some words" | espeak --stdin
$ echo "speak some words" | festival --tts
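A minimal sketch of what such a shell-out integration might look like in Python 3, assuming the espeak and festival programs are installed and on PATH. The names TTS_COMMANDS, build_tts_command and speak are hypothetical and not part of Dragonfly; only the two command lines come from the examples above.

```python
import subprocess

# Command lines taken from the shell examples above; both programs read
# the text to speak from standard input.
TTS_COMMANDS = {
    "espeak": ["espeak", "--stdin"],
    "festival": ["festival", "--tts"],
}


def build_tts_command(backend):
    """Return the argument list for the named command-line TTS program."""
    if backend not in TTS_COMMANDS:
        raise ValueError("unknown TTS backend: %r" % (backend,))
    return list(TTS_COMMANDS[backend])


def speak(text, backend="espeak"):
    """Pipe text to the chosen program and wait until it has finished."""
    subprocess.run(
        build_tts_command(backend),
        input=text.encode("utf-8"),
        check=True,  # raise CalledProcessError if the program fails
    )


# Example (needs espeak installed, so it is left commented out):
# speak("speak some words")
```

Keeping the command table separate from speak() would make it straightforward to add another command-line program later without touching the calling code.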
Vojtěch Drábek
@comodoro
Yes, I remember a while ago, I think it was a school assignment, it was super easy to use SAPI 5, however in .NET. In Python it should not be too hard either. I mean using the API, not saying that it is not work to implement. I have been under the impression however that SAPI 5 TTS is being slowly (like the SAPI STT perhaps) phased out or frozen in favor of MS Mobile voices and some search did not reveal much about handling those. I guess there is nothing wrong with using just SAPI for now, but a Czech system for example has one MS Mobile and no SAPI 5 voice.
Dane Finlay
@Danesprite

@comodoro Ah, okay that is a shame. Thanks for elaborating on MS Mobile voices. I would never have guessed what it was from the name. Leave it to Microsoft to make things more complicated than they ought to be.

Both the TTS and STT parts of SAPI haven't really been actively worked on for a long time. It isn't too difficult to work with the API in Python, I suppose. Dragonfly's SAPI 5 engine back-end works using COM. I can see it is pretty simple to set the current TTS voice. The API error messages given could be more helpful though.

If there is a public API for utilising MS Mobile voices, it would probably require .NET. Since we are using CPython, that would be difficult. I'll just stick with SAPI for now. I suppose you could try Google TTS instead for Czech.

Vojtěch Drábek
@comodoro
Well, there is https://pypi.org/project/winrt/, but it is experimental and for Python 3.7 or later, so not worth the effort. Google TTS has one big disadvantage, almost decisive, and that is that it runs in the cloud. But open Czech STT for Dragonfly is not on the horizon, unless Deepspeech adds grammars, or perhaps, @daanzu, what is needed for a Kaldi model? I have several hundred hours of data, and grapheme-to-phoneme transcriptions could be generated, e.g. using espeak; Czech is less irregular than English.
Ryan Hileman
@lunixbochs
I thought deepspeech was dead?
Shervin Emami
@shervinemami
I've tried various TTS options in Windows & Linux, mostly from when I tried what it's like to be a fully blind computer programmer. (Answer is that it's extremely frustrating, but it is possible!)
There's basically 2 paths you'd want to choose between for TTS: 1) Nice & natural sounding TTS. 2) Robotic & harsh sounding TTS.
Shervin Emami
@shervinemami
Nice sounding TTS is great for beginners, or anyone just wanting to hear the text easily, at normal playback speed (between around 0.7x - 1.5x speeds). Whereas robotic TTS is great for people that want to use TTS a lot at fast playback speeds (1.5x - 4x speeds), such as if you wanted to hear the content of a whole paragraph or page of text and consume it very quickly, and it's something you'd do often and therefore you want speed & efficiency even if it takes some weeks to get accustomed to the fast robotic TTS.
Shervin Emami
@shervinemami
"eSpeak" is great at supporting the power users that want fast robotic speech; it's open source, portable and well established. For natural, nice-sounding speech, there are various open source options that work on Linux & other OSes, but it's an area that Google, Microsoft & others also invest in since it has commercial prospects for them. My preference is that we make a nice, natural-sounding, open-source, cross-platform solution the default TTS backend, and potentially allow people to replace it with an alternative backend such as Microsoft, Google or eSpeak if they want. But default to open-source cross-platform.
Shervin Emami
@shervinemami
I personally really like SVOX "pico2wave", I believe it's a free open source TTS with a nice & natural voice in Linux. I've also tried some commercial TTS systems including Acapela TTS & Cepstral TTS, and they tend to be smarter at handling language intonations but I prefer the way symbols are handled by SVOX pico2wave, since it's important for programming & technical content rather than just natural language content.
David Zurow
@daanzu
deepspeech: mozilla terminated its involvement, but since the project is open source, I believe various people are continuing to work on it, including some of the original contributors
@comodoro for kaldi, what you have may be enough to work. What is needed is: the audio and matched transcripts, plus a lexicon listing all of the words and their matching pronunciations. I think the lexicon may be available for czech already.
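For illustration, a Kaldi lexicon is typically a plain text file with one word per line followed by its phones, separated by spaces. Here is a minimal sketch of writing that layout in Python. The function name and the example entries are hypothetical, and real phone symbols would depend on the phone set of the chosen recipe.

```python
def format_lexicon(pronunciations):
    """Return lexicon text with one "word phone1 phone2 ..." line per word.

    `pronunciations` maps each word to a list of phone symbols.
    Lines are sorted by word so the output is stable.
    """
    lines = []
    for word in sorted(pronunciations):
        phones = pronunciations[word]
        lines.append(" ".join([word] + list(phones)))
    return "\n".join(lines) + "\n"


# Hypothetical entries, only to show the layout.
example = {
    "ano": ["a", "n", "o"],
    "ne": ["n", "e"],
}
lexicon_text = format_lexicon(example)
print(lexicon_text, end="")
```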
David Zurow
@daanzu
@shervinemami thanks for the info and comparison. Very interesting!
Vojtěch Drábek
@comodoro
I fail to see an official statement, but the repository seems to be alive. Anyway there is at the minimum a fork called Coqui, recently announced by some of the same people.
@daanzu You mean specifically lexicon for Kaldi? I will have a look, I only remember the Vystadial one, generated from its (small) dataset.
Dane Finlay
@Danesprite

@comodoro I wasn't aware of the Python winrt project. I would still prefer to just stick with SAPI 5.

Czech is listed as a supported eSpeak language BTW: http://espeak.sourceforge.net/languages.html

You can actually use eSpeak voices with SAPI 5 if you install the Windows version available on this page. I can use the Czech voice that way in Python, although I can't speak to how intelligible it is as I don't speak Czech. :-)

The voices are probably compiled as 32-bit, since I can't use them in 64-bit processes.

Vojtěch Drábek
@comodoro
Yes, that's no problem. As I say, not needed right now and I would kind of feel obliged to try it myself if it came to that. Right now I am using an English UI (on top of base Czech system because of internal windows-1250 ANSI encoding, still being in use in some applications). I have tried espeak before - terrible quality for Czech, but the grapheme to phoneme transcription, if expanded, looks usable (for Kaldi). Exceptions are a problem though.
David Zurow
@daanzu
@comodoro yes, the lexicon would need to be in the right format, but it should be easy to use the vystadial one. it is already in the kaldi repo
Vojtěch Drábek
@comodoro
@daanzu I see the generator https://github.com/daanzu/kaldi-fork-active-grammar/blob/master/egs/vystadial_cz/s5/local/phonetic_transcription_cs.pl, which is very simplistic and I suspect tailored for the Vystadial CZ dataset, but perhaps OK for a small prototype. I am starting to see why end-to-end is all the rage:) All right, do you think it would be better to try to improve upon the Vystadial recipe, or try a new one from some modern English recipe?
David Zurow
@daanzu
@comodoro the vystadial recipe isn't bad although it isn't completely up to date. It's probably worth a try, but it shouldn't be too hard to modify an english one either. I have a docker image for training I am working on, but it is still pretty janky
Dane Finlay
@Danesprite

@comodoro Okay then, fair enough. As Shervin said, eSpeak is quite robotic and isn't for everyone. Just thought I would mention it. There are also Czech voices for Festival.

If you do end up getting Czech STT working, I can help with adding Czech support for IntegerRefs.

@shervinemami Thanks for the info. I'm happy to add to Dragonfly a nice, extensible interface for text-to-speech, allowing use of Dragon, SAPI 5, eSpeak, Festival, pico, etc. It would also not be difficult to add a Speak action class. This functionality isn't a high priority, however.
Vojtěch Drábek
@comodoro
Robotic for English, terrible for Czech:)
I will first try some Kaldi tutorials and see how it goes.
Dane Finlay
@Danesprite
Ah, gotcha :)
Alex Boche
@alexboche
Not sure if I'm following the above discussion. If this talk of TTS is about helping users with visual problems, it might be worth talking with Rudiger Wilke; he had a product called Dragon Echo for that: http://www.rwilke.de/dragonecho/. Even if his thing isn't what's needed per se, he might have some ideas.
Dane Finlay
@Danesprite

@alexboche The Dragonfly TTS functionality discussed above is fairly simplistic. At present, it only consists of the engine.speak(text) method which synthesises text strings into speech. Beyond a Speak action class that does the same thing and better utilisation and choice of the TTS back-end, I don't think additional TTS functionality should be added into Dragonfly.

Dedicated screen reader software like DragonEcho would be much more appropriate for users with visual impairments.

Dane Finlay
@Danesprite

@/all Dragonfly2 version 0.30.0 has now been released, as of 21 March. The additions, changes and fixes are listed in the changelog here: https://dragonfly2.readthedocs.io/en/latest/changelog.html

You can upgrade by running pip install --upgrade dragonfly2.

Thanks very much to everyone who contributed! My apologies for the long gap between this release and the last one.

@lunixbochs Sorry that the Talon integration (PR #326) was not included in this version. I haven't tested the changes sufficiently yet.
David Zurow
@daanzu
@Danesprite thanks for all your work maintaining!
tripfish
@tripfish
Thanks! :+1:
Dane Finlay
@Danesprite
No worries. :-)
timoses
@timoses:matrix.org
Shouldn't it be possible to call Key from a Function action? It somehow does not seem to execute Key. It does if I use Key directly in the mapping.
    def omgzo(**test):
        print('uffi')
        Key('j')

    class TmuxRule(MappingRule):

        mapping = {
            "pane (<dir>|<n>)":
                #Function(lambda **test: Key('j'))
                Function(omgzo)
                #Key('j')
(It does print 'uffi')..
LexiconCode
@LexiconCode
@timoses:matrix.org When using Dragonfly actions inside functions, add .execute() after the action, e.g. Key('j').execute(). Constructing Key('j') on its own only creates the action object; it does nothing until executed. You can also run Python files that use Dragonfly without a speech engine, and they will execute the actions along with any other code. Example below:
from dragonfly import Key

def omgzo():
    print('uffi')
    Key('j').execute()

omgzo()
timoses
@timoses:matrix.org
Aw ty. I knew there was something in the bushes : )
Dane Finlay
@Danesprite
I have released Dragonfly2 version 0.30.1, which includes fixes for some DNS dictation formatting bugs.
LexiconCode
@LexiconCode
@Danesprite Awesome! Congrats on another release!
Dane Finlay
@Danesprite
:)
DanKaplanSES
@DanKaplanSES
I restarted my computer and now get this error message when I start Dragon:
[image.png]
Should I follow those instructions or will that mess things up?
Dane Finlay
@Danesprite

@tieTYT Haven't seen that error before, thanks for mentioning it on here and for the KB solution link :+1:

I don't know about Dragon, but Compatibility Mode shouldn't cause issues with pywin32, Natlink or Dragonfly since these were written to work with older (then new) versions of Windows.

Quintijn Hoogenboom
@quintijn
I have not seen this message before either. Sorry...
Vojtěch Drábek
@comodoro
Is there an easy equivalent to Choice with identical mapping, i.e. 'a': 'a' etc?
Which seems redundant
Vojtěch Drábek
@comodoro
Hmm, looks like List:)
Dane Finlay
@Danesprite
@comodoro List works. You could also use dictionary comprehension:
Choice("choice", {key:key for key in [
    "a", "b", "c"
]})
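If the same pattern is needed for several word lists, the comprehension could be wrapped in a small helper. This is only a sketch: identity_mapping is a hypothetical name, not something Dragonfly provides, and the block builds the dictionary without importing Dragonfly.

```python
def identity_mapping(words):
    """Map each spoken word to itself, e.g. "a" -> "a"."""
    return {word: word for word in words}


letters = identity_mapping(["a", "b", "c"])

# With Dragonfly installed, the result could be passed to Choice as above:
# Choice("choice", letters)
```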