
Voice to Text
Started by Scott Harrison Oct 17 2016 09:45 AM

106 replies to this topic
#76

dmarques42
  • Enthusiast
  • 62 posts
  • Corona SDK

I have another question, about options for getting phonemes.

 

The children I am working with have language issues, so their pronunciation is a bit off sometimes (e.g., 'puh' instead of 'up'). I put these in my own hitlist for comparison, but I never get 'puh' back, obviously because it is not in any lexicon. Is there a way to get just the phonemes, so that I can do the word matching myself? Almost all my matching is single words with very specific context (at least so far). I tried setting the language to a phony one, but that didn't work.

 

Thanks for any suggestions. This is working pretty well for me now, though I can't use it for these kids unless I can get rawer output or submit my augmented lexicon.



#77

alistair.crompton
  • Observer
  • 2 posts
  • Corona SDK

Hi,

thanks for this plugin, which does a great job!

I was wondering whether there is a way to disable the default system sounds for start and stop recording?



#78

Scott Harrison
  • Corona Geek
  • 1,837 posts
  • Enterprise

>>I was wondering whether there is a way to disable the default system sounds for start and stop recording?

Yes, on Android it is voiceToText.startRecording(nil, true)



#79

dmarques42

>>Yes, on Android it is voiceToText.startRecording(nil, true)

 

When I do that, it turns off the startRecording sound, but not the stopRecording sound.



#80

alistair.crompton

>>Yes, on Android it is voiceToText.startRecording(nil, true)

Thanks, works perfectly!



#81

Scott Harrison

>>Yes, on Android it is voiceToText.startRecording(nil, true)
>>
>>When I do that, it turns off the startRecording sound, but not the stopRecording sound.

This has been fixed. Please build again.



#82

dmarques42

Scott, Thank you for fixing the microphone sounds.

 

I assume that getting the phonemes back is not an option (though still one I would prefer). What about sending in an extension to the lexicon with typical child utterances, such as these sets: duh, bub, fah, dah, puh, el, pah, dow -- partial words like that. Is there any way to get those to match? Isn't submitting lexicon extensions a typical thing for VTT to do?



#83

Scott Harrison

>>I assume that getting the phonemes back is not an option (though still one I would prefer). What about sending in an extension to the lexicon with typical child utterances, such as these sets: duh, bub, fah, dah, puh, el, pah, dow -- partial words like that. Is there any way to get those to match? Isn't submitting lexicon extensions a typical thing for VTT to do?

Unfortunately, there is no way to extend the native speech-to-text libraries on iOS or, surprisingly, Android. In fact, the APIs of these libraries are among the most limited I have seen.

#84

dmarques42

Is there any possibility of getting Kindle Fire to work? I don't know what is wrong, but it is as if init is never called. I get no initialization of the microphone (BTW, on my Android phone, I still get the microphone start/stop sounds). I get no error in the adb debug either.

 

Any thoughts?

 

BTW, we are getting pretty good results now with recognition of the children. I send back everything that the voice recognition thought it heard, and I can add words to my matching set. We are using mostly iPad with the children for now, but I want to get a cheap tablet for other families to use.
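
The "add words to my matching set" idea can be sketched as a small lookup table. This is only an illustration: the variant lists below are made-up guesses rather than strings measured from the recognizer, and the names variants, targetFor and matchTranscript are hypothetical.

```lua
-- Illustrative data only: each target word lists strings the recognizer
-- might return for it. These variants are guesses, not measured results.
local variants = {
  duck = { "duck", "doc", "dock" },
  fish = { "fish", "fit" },
}

-- Build a reverse lookup from a heard word to its target word.
local targetFor = {}
for target, list in pairs(variants) do
  for i = 1, #list do
    targetFor[list[i]] = target
  end
end

-- Return the target for the first known word in the transcript, or nil.
local function matchTranscript(text)
  for word in string.gmatch(string.lower(text), "%a+") do
    if targetFor[word] then
      return targetFor[word]
    end
  end
  return nil
end

print(matchTranscript("OK doc"))
print(matchTranscript("good luck"))
```

Matching whole words rather than substrings is a design choice here: it keeps a short variant from matching inside a longer unrelated word, at the cost of needing every variant listed explicitly.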



#85

Scott Harrison

It probably won't work on Kindle, because on Android the plugin uses Google's speech-to-text service. I believe Amazon has its own service, but the plugin does not currently support it. I don't have a Kindle to verify this, but I suspect that is the problem. As for whether I plan to support Amazon: I would have to go out and buy a Kindle, and I have not had enough interest from others to justify it.

#86

dmarques42

Kindle runs Android, though, and the plugin loads fine, and the microphone privileges show up correctly. If you are just using Android calls, and not hardware specifics, I thought it would work. But maybe the Kindle microphone needs different hardware support. Record does work fine for Alexa on the Kindle. I have asked Amazon to be part of their trial of Transcribe, but I think I am not a big enough customer; I haven't heard back. I like Kindle Fire for the children voicing app because it is only $50, so it is a low point of entry for families.

 

Thanks. Let me know if anything changes.



#87

Scott Harrison

^ I recognize the audio cue from Google Now, which leads me to believe that it uses Google Now specifically for speech to text. I don't know if that is supported on Kindle. I did a little googling and was not able to find anything on this.

#88

dmarques42

That must be it. I am sure Amazon uses its own audio queue, the one for Alexa. Thanks, I can see that would be a big addition.



#89

dmarques42

Is there any way for me in the app to see/monitor the microphone amplitude? The issue is this: there will always be a gap between when the child finishes saying a word and when the app returns with 'correct' or not, because the service has to recognize the silence before processing. I wanted to catch the 'end of the word' as soon as the child stops talking (given that we have one-word speech, this is easy in principle) and give them feedback that the app is 'working on it'. Any way my app can 'listen' to the microphone sound stream also?



#90

Scott Harrison

This is currently not supported by the plugin. You should be able to record audio while converting voice to text.
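
The end-of-word detection itself does not depend on where the volume readings come from. Below is a minimal sketch, assuming the app can obtain a stream of readings on a 0 to 1 scale from a parallel recording. That source is an assumption, not something the plugin provides, and newSilenceDetector, threshold and quietSamples are hypothetical names.

```lua
-- Hypothetical helper: reports the moment speech ends.
-- threshold: readings at or above this count as speech (assumed 0..1 scale).
-- quietSamples: consecutive quiet readings required after speech was heard.
local function newSilenceDetector(threshold, quietSamples)
  local speaking = false   -- true once a loud reading has been seen
  local quiet = 0          -- consecutive quiet readings since the last loud one
  -- Feed one reading; returns true once, when the word has ended.
  return function(volume)
    if volume >= threshold then
      speaking = true
      quiet = 0
      return false
    end
    if not speaking then
      return false         -- still waiting for the child to start talking
    end
    quiet = quiet + 1
    if quiet >= quietSamples then
      speaking = false     -- reset so the detector can be reused
      quiet = 0
      return true
    end
    return false
  end
end

-- Made-up readings: silence, one word, then silence again.
local feed = newSilenceDetector(0.2, 3)
local samples = { 0.01, 0.02, 0.6, 0.7, 0.5, 0.05, 0.03, 0.02, 0.01 }
local endedAt
for i = 1, #samples do
  if feed(samples[i]) and not endedAt then
    endedAt = i            -- here the app would show its "working on it" cue
  end
end
print(endedAt)
```

The threshold and the number of quiet readings would need tuning per device and per sampling rate; the values above are placeholders.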

#91

dmarques42

Thank you, I will look into doing that.

 

But I have a new real problem that I cannot explain. On iOS (iPhone and iPad, different users) I have been getting text strings back that make no sense at all to me. And, consecutive text strings seem to be extensions of previous ones even though I do a stop() before doing each start().

 

Here are my init and start calls:

voiceToText.init(function(e)
  if e then
    local hit
    if e.speech then
      -- here log e.speech to analytics
      for w = 1, #candidates do
        -- candidates is a list of substrings that I match against the text received
        -- (plain find, so a candidate is never interpreted as a Lua pattern)
        if string.find(string.lower(e.speech), string.lower(candidates[w]), 1, true) then
          -- keep the longest matching candidate
          if not hit then
            hit = candidates[w]
          elseif string.len(candidates[w]) > string.len(hit) then
            hit = candidates[w]
          end
        end
      end
    elseif e.response then
      if e.response == "stopped" then
        hit = "stopped"
      end
    end
    if hit then
      callBack(hit)
    end
  end
end)
 

voiceToText.startRecording(nil,true,nil,nil)

 

I have been getting strings back that I cannot account for, and today I sat with my grandson to be certain. He did it perfectly, single word answers, no one else around, no TV, no background noise, etc.

 

Here are some strings I received:

 

Glass America crack

I don't

I don't know (consecutive)

Good luck

Good luck last

Good luck last try (these 3 were consecutive)

Good luck last two adapter

 

I can swear to you that there was nothing said even remotely similar to those. He was saying "fish" for the last 4, for example.

 

Is there any chance I am getting someone else's buffer? Is nil invalid for language in the Apple version? I am very confused.



#92

dmarques42

It appears that the 'interference' is more common during typical busy times and when I leave the microphone open longer (5 secs). I changed the code to close the microphone as soon as I get any word back, and that improved the response, and I had less interference over the weekend. Nevertheless, I had a few cases of interference this afternoon:

 

jesus fish
ok doc
what you are
what
when you're
when
 
I was careful today to either say nothing or to say cheese or to say the correct word (fox, fish, duck, or cow). The only one close above was the first one, when I might have said 'fish', and the 'ok doc' when I said 'duck'. Still, extra words are strange, and the last 4 are bizarre. Is it possible that this is an Apple problem?
 
It is possible that these interference strings only happen when I leave the whole 5 secs silent. I will test that.
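
To tell from the logs whether a string merely extends the previous one (as 'Good luck', 'Good luck last', 'Good luck last try' did) or is a fresh utterance, a small comparison helper can be used. This is a sketch only; newTail is a hypothetical name, and whether the extensions are word-by-word partial results from the service is an assumption.

```lua
-- Returns the words in cur that follow prev, or nil when cur does not
-- start with prev on a word boundary (it then looks like a new utterance).
local function newTail(prev, cur)
  prev, cur = string.lower(prev), string.lower(cur)
  if string.sub(cur, 1, #prev) ~= prev then
    return nil
  end
  local rest = string.sub(cur, #prev + 1)
  -- "what" followed by "whatever" is not an extension by whole words
  if rest ~= "" and string.sub(rest, 1, 1) ~= " " then
    return nil
  end
  return (string.gsub(rest, "^%s+", ""))
end

-- Walk a log of received strings in the order they arrived.
local log = { "Good luck", "Good luck last", "Good luck last try", "I don't" }
local prev = ""
for i = 1, #log do
  local tail = newTail(prev, log[i])
  if tail == nil then
    print("new utterance: " .. log[i])
  else
    print("added: " .. tail)
  end
  prev = log[i]
end
```

If most of the odd strings turn out to be extensions, that would point at results accumulating within one recognition session rather than at unrelated audio.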


#93

Scott Harrison

Looks to be an Apple problem. I have not run into these problems myself (I am actually using this plugin in a personal project and it works for me 100% of the time). My plugin just uses the native voice-to-text service of the device. I don't know what the cause is: weak internet, old devices, an Apple bug, holding the device too close or too far away, an older iOS version, Apple server problems, etc.



#94

dmarques42

Certainly not older devices: we are using an iPhone 8, an iPhone 6, and an iPad mini less than 2 years old, and all are up to date. From my testing, it seems to be at least more common, if not always the case, when the microphone is open to silence for more than a few seconds, and I think even sometimes if the microphone is left open to silence after a single word. The words might be coming in during the stop operation. The effect might also arise from the single-word paradigm I am using, which the service is not intended for. I do not think those bizarre text strings can be accounted for by the microphone distance or other user errors.

 

I agree they seem like Apple server errors, combined perhaps with timing loopholes. Is there a way to force using the Google service within your plugin?



#95

Scott Harrison

^ This would require a rewrite of the plugin. Android and iOS have different code bases: my iOS plugin uses the native iOS speech recognition API, and the Android one uses the native Android speech recognition API. Have you tried using dictation on your devices? The recognition should be similar to that.



#96

pirx
  • Enthusiast
  • 88 posts
  • Corona SDK

Hi, thank you for your great work.

 

I'm currently having fun developing a game of which this plugin is a crucial part.

I'm having a bit of a difficult time understanding the correct workflow for it.

 

When exactly do I process the input and check what the user said?

 

I mean the following:

1. start recording

2. user says something

3. recording stops on its own

4. Now I want to check if the user said 'hello' or 'goodbye' (Example)

 

Do I do point 4 in the e.response=="stopped" part of the init callback? Or should I detect it in the 'if e.speech then' part and manually call stopRecording?

Also, how often does the init callback fire during speech? Does it fire after every new word?



#97

Scott Harrison

On Android, it will stop recording after the user stops speaking. On iOS, it will record until you stop it yourself. I wish there was a way to make Android behave like iOS, but it is a limitation on Android.

 

Edit:

On iOS e.speech will return after every new word.
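
Given that behaviour, one possible workflow is to check each e.speech as it arrives and stop as soon as a keyword is found. This is a rough sketch, not the plugin's documented usage: newKeywordListener is a hypothetical helper, stopFn stands in for whatever stop call the plugin exposes, and the event fields e.speech and e.response are taken from the code posted earlier in this thread.

```lua
-- Hypothetical factory: builds a callback that could be passed to voiceToText.init.
-- keywords: list of lowercase words to look for.
-- stopFn: stand-in for the plugin's stop call (assumption).
-- onKeyword: called once per attempt with the keyword that was heard.
local function newKeywordListener(keywords, stopFn, onKeyword)
  local done = false
  return function(e)
    if not e then return end
    if e.response == "stopped" then
      done = false               -- recording ended; ready for the next attempt
      return
    end
    if done or not e.speech then return end
    local heard = string.lower(e.speech)
    for i = 1, #keywords do
      -- plain find: the keyword is matched literally, not as a Lua pattern
      if string.find(heard, keywords[i], 1, true) then
        done = true              -- ignore the longer transcripts that follow
        stopFn()                 -- on iOS recording continues until stopped
        onKeyword(keywords[i])
        return
      end
    end
  end
end

-- Simulated events, shaped like the growing transcripts described above.
local stops, got = 0, {}
local listener = newKeywordListener({ "hello", "goodbye" },
  function() stops = stops + 1 end,
  function(word) got[#got + 1] = word end)
listener({ speech = "Well" })
listener({ speech = "Well hello" })
listener({ speech = "Well hello there" })
listener({ response = "stopped" })
print(stops, got[1])
```

In the app, the listener would be handed to voiceToText.init and recording started with voiceToText.startRecording, as in the earlier posts. On Android the stop call may be redundant because recording ends by itself; whether calling it twice is harmless is untested here.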



#98

andri.yunanto
  • Observer
  • 4 posts
  • Corona SDK

Hi Scott, does the plugin work on an Android TV box?



#99

Scott Harrison

I don't have an Android TV to verify this 100%, but I googled it and it looks like the speech library is supported on Android TV. On Stack Overflow, people report a lot of success with the speech library on Android TV.

#100

EvilCensor
  • Observer
  • 8 posts
  • Corona SDK

Hi, I bought this a couple of days ago, wanting to incorporate it into a project.

However, when I download and attempt to test the voiceToText-Demo-master example, upon clicking Stop I get the following error:

 

main.lua:40: attempt to call field 'stopRecording' (a nil value)
stack traceback:
main.lua:40: in function '?'
?: in function <?:190>

 

Can someone please advise?
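
One way to narrow an error like this down is to list which functions the loaded plugin table actually exposes, and to guard the call. This is a generic Lua sketch: listFunctions and callIfPresent are hypothetical helpers, and fakePlugin is a stand-in table, not the real plugin.

```lua
-- Collect the names of all functions a module table exposes, sorted.
local function listFunctions(lib)
  local names = {}
  for name, value in pairs(lib) do
    if type(name) == "string" and type(value) == "function" then
      names[#names + 1] = name
    end
  end
  table.sort(names)
  return names
end

-- Call lib[name](...) only if it exists; returns false when it is missing.
local function callIfPresent(lib, name, ...)
  local fn = lib[name]
  if type(fn) ~= "function" then
    return false
  end
  fn(...)
  return true
end

-- Stand-in table for the demonstration; in the app this would be the
-- table returned by require for the plugin.
local fakePlugin = { init = function() end, startRecording = function() end }
print(table.concat(listFunctions(fakePlugin), ", "))
print(callIfPresent(fakePlugin, "stopRecording"))
```

If the real table turns out to have no stopRecording entry on the device or simulator being tested, the build is probably not loading the native plugin there. That is a guess, not something confirmed in this thread.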



