David Zurow
@daanzu
@LexiconCode It would only be mutable if ListRef was used underneath.
Dane Finlay
@Danesprite
Agreed, I'll add this in the next release. No, it would not be mutable since this would use Literal, not ListRef.
Dane Finlay
@Danesprite
@tripfish Yeah, time will tell what this means for Dragon. As Aaron has said in a few other channels, Microsoft and Nuance have had a close working relationship for a while. It seems that this acquisition is more related to Nuance's cloud solutions.
westocl
@westocl
I have been calling the text engine with Popen(["python", "-m", "dragonfly", "test",...]) as a subprocess
and piping its stdin and stdout. Is there another way to do something similar without having to use a subprocess?
Dane Finlay
@Danesprite
@westocl Sure. You could initialise a "text" engine instance and use it instead. What is your use case for the test command?
westocl
@westocl

I started a server and am feeding the recognized commands from a client computer over a socket.
From there I was taking the command and piping it into the subprocess's stdin.

I was trying to see the viability of just having the text engine on the server and having the client
just have grammars that recognize and send the command over a socket or HTTP.

So if I understand you correctly, I would:
1) load the grammars
2) call get_engine("text").mimic(...)

I was still trying to simulate the on_before(), on_recognize(), and post_recognize() callbacks, so somehow I would need to call
engine.do_recognition()?

Does mimic(...) need to have the proper window title and executable, or is there a way to bypass the context?

LexiconCode
@LexiconCode
The test engine, which utilizes mimic, should utilize the grammar contexts. I'm not sure about the bypass. This would be an interesting concept for those who do remote support over VM/RDP.
I'm not sure if there's a way to recognize whether a spec is valid without executing its associated action.
westocl
@westocl

Hmmm... yeah, I have been thinking about setting this up for a couple of months now and finally got some time to do it.

I plan on getting it set up soon, even if the first version is all hacked together.

Right now I have two applications, both with the same grammars. On the client side, I stripped away all the mapping functions and process_recognition functions.
On SAPI 5 recognition, the client pings the server. The server runs the text engine and replies back with a string that the client uses for its next grammar context function.
It's crude, but it works pretty well actually.

Right now both the client and server are on the same PC, but it looks like it's going to work. When I'm done, I will put the server on a website.

I'll let you guys know when I get something up and running.

LexiconCode
@LexiconCode
@westocl what engine are you using?
westocl
@westocl
Windows Sapi5
LexiconCode
@LexiconCode
Have you considered using Kaldi? With it, it might be possible to transmit audio over the network to the host. That way you don't have to have grammars on both.
westocl
@westocl
@LexiconCode Really? How would I avoid having grammars on both? What is different about Kaldi? I have been hearing about it but didn't know what made it more useful.
@LexiconCode Just read your message again. "Transmit audio"... never thought of going that way.
LexiconCode
@LexiconCode
I spent a lot of time trying to figure out how to do this with Dragon, but DNS loves to ignore professional/virtual microphones. Kaldi would definitely not have this problem, and I can't speak to SAPI 5. I can definitely say, though, that Kaldi is faster and more accurate than SAPI 5 in my experience.
So my thought would be to set up a virtual microphone on the remote OS and stream microphone audio from the local OS.
There are some use cases where this is not practical, like remote support, but it would work when you control both the local and remote environments.
westocl
@westocl
@LexiconCode I like this idea also. It's good that there may be a couple of ways to approach it. I will definitely look at Kaldi once I get this set up. I'll let you know how it turns out. Thanks
The text engine approach guarantees they can't ignore the microphones :)
LexiconCode
@LexiconCode
:)
Dane Finlay
@Danesprite

@westocl Ah, okay then, interesting use case!

If you haven't already, I would suggest looking at the source code for the python -m dragonfly test command. It shouldn't be too difficult to do this in your Python process instead.

Regarding window contexts for engine.mimic(), you can pass the remote window context to the method (see TextInputEngine.mimic()). This is specific to the text-input back-end. This is not really good enough for this sort of server-client interaction, however. For that, you would need to have Dragonfly use remote platform implementations of classes like Window and Keyboard.

You may also want to have a look at Aenea, which does something similar with Dragonfly. This would be easier IMO. I believe it still needs to be ported to Python 3, however.
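[Editor's note: the in-process suggestion above might look roughly like the sketch below. It assumes dragonfly2 is installed; the grammar name, rule name, and "hello world" command are made up for illustration, and the import is deferred into the function so the snippet stands alone.]

```python
def recognise_in_process(commands):
    """Load a grammar on the text-input engine and mimic each command,
    instead of piping words into a ``python -m dragonfly test`` subprocess."""
    from dragonfly import get_engine, Grammar, MappingRule, Function

    results = []
    engine = get_engine("text")          # text-input back-end; no audio needed
    grammar = Grammar("example_grammar")
    grammar.add_rule(MappingRule(
        name="example_rule",
        mapping={"hello world": Function(lambda: results.append("matched"))},
    ))
    grammar.load()

    for words in commands:
        # mimic() raises MimicFailure if no loaded rule matches the words.
        engine.mimic(words)
    return results
```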

sean-hut
@sean-hut
This is a test message I sent using an IRC client and irc.gitter.im. Can someone respond if the message shows up in Gitter's web client? Thanks
Dane Finlay
@Danesprite
Loud and clear, Sean. :-)
sean-hut
@sean-hut
Danesprite: Thanks
westocl
@westocl
Quick question. I've been using WSR. I usually do not use the "on_failure" callback. However, when I do, I notice that, when a successful recognition happens, the sequence is "on_before", "on_failure", "on_recognition", "on_post_recognition". Why would "on_failure" ever run on a successful recognition? Is this a WSR bug?
Dane Finlay
@Danesprite

Hi @westocl,

Yes, this sounds like a SAPI 5/WSR bug; on_failure() shouldn't be called if the recognition was successful. I am not able to replicate this at the moment, however.

@/all Dragonfly2 version 0.31.0 has now been released. The changes and fixes are listed in the changelog here: https://dragonfly2.readthedocs.io/en/stable/changelog.html

You can upgrade by running pip install --upgrade dragonfly2.

Thanks very much to everyone who contributed!

westocl
@westocl
@Danesprite just read the change log
"Change the Choice element class to allow using list/tuple choices."
Thank you guys for putting this in!!
Dane Finlay
@Danesprite
@westocl No worries.
The documentation for Choice has been updated with info on that: https://dragonfly2.readthedocs.io/en/stable/elements.html#dragonfly.grammar.elements_compound.Choice
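[Editor's note: a short sketch of the two forms Choice accepts after this change. It assumes dragonfly2 >= 0.31.0 is installed; the element names and values are made up, and the import is deferred so the snippet stands alone.]

```python
def build_choice_elements():
    from dragonfly import Choice

    # Dict form: maps each spoken word to an arbitrary value.
    colour = Choice("colour", {"red": "#f00", "green": "#0f0"})

    # List/tuple form (new in 0.31.0): each spoken word maps to itself.
    digit = Choice("digit", ["one", "two", "three"])
    return colour, digit
```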
DanKaplanSES
@DanKaplanSES
I'm using this question as an example. I don't know if my question is due to Python ignorance or an opportunity to improve the documentation. How do I figure out what I need to import to use this action? https://dragonfly2.readthedocs.io/en/latest/actions.html#pause-action
DanKaplanSES
@DanKaplanSES
Another suggestion: maybe this should say what the numbers represent. I'm assuming it's the same as Pause
image.png
DanKaplanSES
@DanKaplanSES
And another question: I have this command that is going to run for many seconds. Is there a way I can make it interruptible?
This is the command:
        'youtube seek':
            R(Key("1/200, 2/200, 3/200, 4/200, 5/200, 6/200, 7/200, 8/200, 9/200")),
LexiconCode
@LexiconCode
With Caster's asynchronous actions you can make it interruptible.
DanKaplanSES
@DanKaplanSES
oops, that was a caster question anyway :)
Dane Finlay
@Danesprite
Cancelling or invalidating recognitions is a little different from asynchronous actions that can be interrupted. I haven't looked into this, but I guess you would do it with threads and signals.
Hawkeye Parker
@haughki
OT: on Windows, I'm looking for a lightweight Select-and-Say editor (not MS Word) with multiple undo (not Notepad) that doesn't always revert to double-spacing (not WordPad). Just want to make sure I'm not missing something obvious.
This is for Dragon.
Alex Boche
@alexboche
I think there was a thing called DragonPad that came with Dragon. But for prose writing, Dragon's native Select-and-Say is mostly just worse than a combination of sophisticated navigation commands and eye tracking. Eye tracking is probably best for long jumps; then you can use commands to move within a line or a small number of lines. There are also imitation Select-and-Say approaches (wolfmanstout's accessibility API/OCR approach; Caster or Talon probably has something). Others can elaborate.
Dane Finlay
@Danesprite
@haughki As Alex mentioned, DragonPad might be what you want. You can open it via the Dragon Tools menu or by saying "open DragonPad".
Hawkeye Parker
@haughki

@alexboche @Danesprite Tx: for some reason, I gave up on DragonPad a long time ago. I'll try again and maybe it will be great, or maybe I will remember why I stopped using it before :)

@alexboche I hear you. I don't use eye tracking or OCR, but for coding, I have a ton of navigation commands -- AceJump, gotoline, scanline, etc. For prose dictation, though, some apps are terrible: Slack and Gmail -- those are the two I can think of off the top of my head -- often insert double letters during dictation (at least for me), along with other weirdness. Other than that, I don't have any solid data, but it feels to me like Select-and-Say applications have overall better recognition, even if I don't use any of the "correct" commands. I do find the "insert before/after" commands to be really useful to quickly get the cursor exactly where I want. Hrm. Maybe it's time to try eye tracking and/or OCR.

Ryan Hileman
@lunixbochs

often insert double letters during dictation

that's a dragon bug

they send multiple key-downs but only one key-up; I think it's a timer that fires too fast or something
Dane Finlay
@Danesprite
Ah, fair enough. I don't use DragonPad either; I get by just fine with Emacs. It doesn't have Select-and-Say support, though.

James Stout contributed some accessibility / Select-and-Say functionality to Dragonfly a little while ago. You can read about it here: https://handsfreecoding.org/2018/12/27/enhanced-text-manipulation-using-accessibility-apis/

He also has a few posts on and code for OCR + eye tracking if that is something you want to look into: https://handsfreecoding.org/2020/07/25/say-what-you-see-efficient-ui-interaction-with-ocr-and-gaze-tracking/

Alex Boche
@alexboche
Yeah, I suppose Dragon Select-and-Say has some advantages. The fact that it uses the words on the page for recognition is nice. The annoying problem for me is that you have to pause before and after. I poured my soul into a clipboard Select-and-Say approach in Caster that works in almost any app; I don't know if anyone ever used it :(
I actually felt making corrections was occasionally worth doing for common phrases (if you can manage to not have your profile randomly deleted!).
I think there was a rumor that Notepad++ was Select-and-Say, but it didn't seem to work for me last time I checked.
When I was on Windows, I found the Speech Productivity dictation box pretty cool. Rudiger had another type of dictation box with seamless auto transfer that seemed cool.
It's always worth seeing what people are up to on the Talon Slack. People have ideas there.
Anatole Matveief
@amatveie
@haughki Over on the KnowBrainer forum, in addition to the Dragon capture which Rudiger built and is mentioned by Alex, there's another individual who has developed an alternative to the dictation box that gets pretty good reviews: https://www.knowbrainer.com/forums/forum/messageview.cfm?catid=25&threadid=35489&enterthread=y . It might be worth checking out. Select-and-Say definitely does a better job during dictation. In particular, it gets the spacing right and usually gets the capitalization right as well, which is a big help. With regards to Notepad++, there is a registry hack, also documented in one of those forum threads, that allows it to get some Select-and-Say functionality, but I don't think it's perfect. If you are adventurous and able to install Natlink, you could also try Mark's Vortex code, which will generally enable Select-and-Say for more apps: https://github.com/mdbridge/Vocola-2/tree/vortex . I'm surprised that you have a hard time with Gmail, as it seems to work pretty well for me in both Chrome and Edge. For Slack, what I do is use it inside of the web browser, which allows for Select-and-Say, vs. the Windows app, which does not. (I actually go one step further and install it as an 'app' in my taskbar from MS Edge: just click on the ... menu -> Apps -> Install this site as an app.) In other news, there's an effort underway to get better dictation support for other apps and speech engines by using the accessibility APIs, but that's certainly not ready for prime time.