Controlling my App using Voice

October 15th, 2017 by Heather Maloney

Adding voice recognition to my mobile app
In order for the apps on your smartphone to be voice controlled, they need to be specifically programmed that way.

Some of the more common voice-enabled apps you are likely to find on your smartphone are:

  • Calendar – ask your smartphone for the time of your next or first appointment on a particular day, and it will tell you the answer and automatically show that day’s calendar appointments on screen
  • Phone – tell your smartphone to call person X, or send a text message to person Y, and it will take care of these tasks, prompting you for the details as required
  • Alarm – set an alarm to go off at a particular date and time
  • Search – ask your phone to search for a topic, and it will display a clickable list of search results

Voice recognition technologies have improved significantly over the last few years, providing numerous options with regard to voice enabling mobile apps, including:

  1. The Android operating system for wearables (e.g. Android Wear watches), smartphones and tablets includes built-in voice actions for carrying out commonly used tasks such as writing a note. It also allows an app to declare its own “intents”, which listen for voice activation once the user has launched the app. Finally, it provides methods for letting the user enter free-form text for processing by your app.
  2. Google Voice Interactions API – a code library provided by Google which allows an app to be triggered via the Google Now interface – that’s what you’re using when you say ‘Okay Google’ and then say a command.
  3. Apple devices (iPhones, iPads, Apple Watch) are built on the iOS operating system. Native iOS apps are written in either Objective-C or Swift (a more recent language). With the launch of iOS 10, Apple introduced the Speech framework, which allows developers to more easily listen for voice commands and convert speech into text for use within their apps.
  4. SiriKit was released in 2016, providing a toolkit for iOS developers to add voice interaction through Siri into their iOS 10 apps.

  5. Cross-platform apps need to use third-party libraries to interface with each platform’s native speech recognition functions.
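
As a concrete illustration of point 1, an Android app can register an activity to receive Google’s in-app voice search action via an intent filter in its manifest. A minimal sketch (the activity name is hypothetical):

```xml
<!-- AndroidManifest.xml (fragment): lets "Okay Google, search for X on <app name>"
     launch this activity. The activity name .SearchableActivity is illustrative. -->
<activity android:name=".SearchableActivity">
    <intent-filter>
        <!-- Voice search action delivered by the Google app -->
        <action android:name="com.google.android.gms.actions.SEARCH_ACTION" />
        <category android:name="android.intent.category.DEFAULT" />
    </intent-filter>
</activity>
```

The activity can then read the spoken query from the incoming intent’s SearchManager.QUERY extra and act on it.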

It’s important to know that the user’s speech is processed on Apple’s or Google’s servers and then returned to the mobile device, so some lag may be noticeable, particularly for longer bursts of voice. This may also raise privacy considerations for your users.
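
On iOS, these privacy considerations are made explicit: before an app may use the Speech framework, it must declare usage-description strings in its Info.plist, and the user is asked for permission at runtime. A minimal sketch (the wording of the strings is up to you):

```xml
<!-- Info.plist (fragment): privacy strings required before iOS will grant
     the app access to speech recognition and the microphone. -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>We use speech recognition to turn your voice commands into actions.</string>
<key>NSMicrophoneUsageDescription</key>
<string>We need the microphone to hear your voice commands.</string>
```

iOS refuses speech-recognition and microphone access if these keys are missing, so they belong in any voice-enabled app alongside the runtime authorization request.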

Third-party libraries exist which run entirely on the mobile device, meaning that the user doesn’t need an internet connection to use them, and the privacy concerns are reduced. An example of such a library is CMU Sphinx – Speech Recognition Toolkit. The downside of using such a library is that you can’t avail yourself of the highly accurate voice recognition the large players have developed over time, including support for many different languages.

Obvious apps which provide the user with significant benefit from the use of voice control include:

  • An app which assists people performing hands-on tasks, e.g. chefs, surgeons, artists, hairdressers …
  • An app which is needed while a person is driving, e.g. navigation, finding locations, dictating ideas on-the-go …
  • An app needed by a person with a disability.
  • An app which involves the entry of lots of text.

We expect to see more and more support for voice in all sorts of applications in future. What would you like to be able to achieve through voice commands?
