I tried out numen to chat a few days ago, while my fingers were hurting from climbing too hard.
I'd like to share a few observations about the transcription mode and then my proposals for how it
could be improved!
Observations
- You can't (afaics) easily correct mistakes in transcribed text: not at all while dictating, and
only clumsily afterwards.
- You can't (afaics) add punctuation while transcribing: you have to wait until you leave
transcription mode and then add your punctuation.
- You have to keep speaking to keep the transcription mode active, which is stressful.
- When you start speaking, you don't see the transcription until after you stop speaking.
Suggestions
- Wouldn't it be nice to see the words come in as you speak? When the model changes its mind,
the outdated words could be erased with backspace presses. I know I make it sound simpler than it
is, but I think this would be a big improvement.
- Live closed captions for TV are often created with voice recognition. There, the system turns a
spoken "comma" into a comma symbol, for instance. I don't know how they deal with it when someone is
speaking *about* a comma, but I imagine an escape word or sound could work nicely for that.
The other way around is also possible: requiring an escape word or sound to recognize a single
non-transcription command.
- Have the option to keep scribe active until you terminate it. (It could be a command like what I
suggest in the previous bullet point.)
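
To make the first two suggestions a bit more concrete, here is a rough Python sketch of what I have
in mind. It is not based on numen's actual code: the function names, the punctuation table and the
escape word "literal" are all made up for illustration. The first function works out which
keystrokes turn the text already on screen into the model's newest guess, and the second turns
spoken punctuation words into symbols unless the escape word comes first.

```python
from os.path import commonprefix

BACKSPACE = "\b"

# Hypothetical table of spoken words and the symbols they stand for.
PUNCTUATION = {"comma": ",", "period": ".", "colon": ":"}

# Hypothetical escape word: the word right after it is typed literally.
ESCAPE_WORD = "literal"


def keystrokes_for_update(shown: str, hypothesis: str) -> str:
    """Return the keys to type so the screen goes from `shown` to `hypothesis`.

    Only the part after the longest common prefix is erased and retyped,
    so text the model has not changed its mind about stays untouched.
    """
    keep = len(commonprefix([shown, hypothesis]))
    return BACKSPACE * (len(shown) - keep) + hypothesis[keep:]


def render(words: list[str]) -> str:
    """Join recognized words, replacing punctuation words with symbols."""
    out = []
    literal_next = False
    for word in words:
        if literal_next:
            # The previous word was the escape word: keep this one as-is.
            out.append(" " + word)
            literal_next = False
        elif word == ESCAPE_WORD:
            literal_next = True
        elif word in PUNCTUATION:
            # Punctuation attaches to the previous word without a space.
            out.append(PUNCTUATION[word])
        else:
            out.append(" " + word)
    return "".join(out).lstrip()


if __name__ == "__main__":
    # The model first guessed "the whether", then revised its guess.
    print(repr(keystrokes_for_update("the whether", "the weather is")))
    print(render("hello comma world".split()))
    print(render("type a literal comma here".split()))
```

A real version would have to work on whatever partial results the speech model provides and would
probably want to avoid retyping long stretches, but I think the basic idea carries over.
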
Finally: nice work, I hope this project can keep bringing you fulfillment, and I hope to find some
time to contribute one day!