4. 1 Voice commands
[Diagram: YOUR APP with SPEECH RECOGNITION, VOICE COMMANDS, and TEXT-TO-SPEECH (TTS)]
• Application entry point
• Can act as deep links to your application
5. 1 Voice commands
• Set up your project capabilities:
– ID_CAP_SPEECH_RECOGNITION,
– ID_CAP_MICROPHONE,
– ID_CAP_NETWORKING
• Create a new Voice Command Definition
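The capabilities listed above are declared in the app manifest (WMAppManifest.xml). A minimal sketch of the relevant fragment follows; any other capabilities the project already declares are omitted here.

```xml
<!-- Fragment of WMAppManifest.xml, inside the <App> element. -->
<Capabilities>
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
  <Capability Name="ID_CAP_MICROPHONE" />
  <Capability Name="ID_CAP_NETWORKING" />
</Capabilities>
```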
6. 1 Voice commands
<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
  <CommandSet xml:lang="en-us">
    <CommandPrefix> Contoso Widgets </CommandPrefix>
    <Example> Show today's specials </Example>
    <Command Name="showWidgets">
      <Example> Show today's specials </Example>
      <ListenFor> [Show] {widgetViews} </ListenFor>
      <ListenFor> {*} [Show] {widgetViews} </ListenFor>
      <Feedback> Showing {widgetViews} </Feedback>
      <Navigate Target="/favorites.xaml"/>
    </Command>
    <PhraseList Label="widgetViews">
      <Item> today's specials </Item>
      <Item> best sellers </Item>
    </PhraseList>
  </CommandSet>
  <!-- Other CommandSets for other languages -->
</VoiceCommands>
7. 1 Voice commands
• Install the Voice Command Definition (VCD) file
await VoiceCommandService.InstallCommandSetsFromFileAsync( new Uri("ms-appx:///ContosoWidgets.xml") );
• VCD files need to be installed again when a
backup is restored on a device.
8. 1 Voice commands
• Voice command parameters are included in the
QueryString property of the NavigationContext:
"/favorites.xaml?voiceCommandName=showWidgets&widgetViews=best%20sellers&reco=Contoso%20Widgets%20Show%20best%20sellers"
• Asterisks in ListenFor phrases are passed as “…”
– In other words, it is not possible to receive the actual
text that matched the asterisk.
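One way to act on these parameters is to route on voiceCommandName and then read the phrase-list value. The sketch below is a hypothetical helper (VoiceCommandRouter and its return strings are not part of the Windows Phone SDK); it takes the same kind of string dictionary that NavigationContext.QueryString exposes, so it can be tried outside a phone project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not an SDK type: interprets the key/value pairs
// that a voice command adds to the page's query string.
public static class VoiceCommandRouter
{
    public static string Describe(IDictionary<string, string> query)
    {
        string commandName;
        if (!query.TryGetValue("voiceCommandName", out commandName))
        {
            // The page was opened normally, not through a voice command.
            return "Not launched by a voice command";
        }

        if (commandName == "showWidgets")
        {
            // "widgetViews" is the PhraseList label used in the VCD file.
            string view;
            if (query.TryGetValue("widgetViews", out view))
            {
                return "Show widget view: " + view;
            }
            return "Show widgets";
        }

        return "Unknown voice command: " + commandName;
    }
}
```

In a page, this could be called from OnNavigatedTo as VoiceCommandRouter.Describe(NavigationContext.QueryString), assuming the target page is the one named in the Navigate element.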
10. 2 Speech recognition
• Natural interaction with your application
• Grammar-based
• Predefined grammars require an internet connection
11. 2 Speech recognition
• Predefined grammars for free-text dictation
and web search are included in WP8
• Custom grammars can be defined in two
ways:
– Programmatic list grammar (array of strings)
– XML grammar based on the Speech
Recognition Grammar Specification (SRGS) 1.0
12. 2 Speech recognition
• Predefined grammar (web search)
private async void ButtonWeatherSearch_Click(object sender, RoutedEventArgs e)
{
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();
    // Add the predefined web search grammar to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromPredefinedType("weatherSearch",
        SpeechPredefinedGrammar.WebSearch);
    // Display text to prompt the user's input.
    recoWithUI.Settings.ListenText = "Say what you want to search for";
    // Display an example of ideal expected input.
    recoWithUI.Settings.ExampleText = @"Ex. 'weather for London'";
    // Load the grammar set and start recognition.
    SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();
}
13. 2 Speech recognition
• Programmatic list grammar
private async void ButtonSR_Click(object sender, RoutedEventArgs e)
{
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();
    // You can create this string array dynamically, for example from a movie queue.
    string[] movies = { "Play The Cleveland Story", "Play The Office", "Play Psych",
        "Play Breaking Bad", "Play Valley of the Sad", "Play Shaking Mad" };
    // Create a grammar from the string array and add it to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromList("myMovieList", movies);
    // Display an example of ideal expected input.
    recoWithUI.Settings.ExampleText = @"ex. 'Play New Mocumentaries'";
    // Load the grammar set and start recognition.
    SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();
    // Play the movie given in result.RecognitionResult.Text.
}
14. 2 Speech recognition
• XML grammar
private async void ButtonSR_Click(object sender, RoutedEventArgs e)
{
    // Initialize objects ahead of time to avoid delays when starting recognition.
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();
    // Initialize a URI with a path to the SRGS-compliant XML file.
    Uri orderPizza = new Uri("ms-appx:///OrderPizza.grxml", UriKind.Absolute);
    // Add an SRGS-compliant XML grammar to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromUri("PizzaGrammar", orderPizza);
    // Preload the grammar set.
    await recoWithUI.Recognizer.PreloadGrammarsAsync();
    // Display text to prompt the user's input.
    recoWithUI.Settings.ListenText = "What kind of pizza do you want?";
    // Display an example of ideal expected input.
    recoWithUI.Settings.ExampleText = "Large combination with Italian sausage";
    // Start recognition with the preloaded grammar set.
    SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
}
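The slides do not show the contents of OrderPizza.grxml. As an illustration only, an SRGS 1.0 grammar that would accept a phrase such as "large combination with Italian sausage" might look like the sketch below; the rule names and word lists are assumptions, not the original file.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical OrderPizza.grxml: size, style, then an optional topping. -->
<grammar version="1.0" xml:lang="en-US" root="order"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="order" scope="public">
    <ruleref uri="#size"/>
    <ruleref uri="#style"/>
    <item repeat="0-1">
      with <ruleref uri="#topping"/>
    </item>
  </rule>
  <rule id="size">
    <one-of>
      <item>small</item>
      <item>medium</item>
      <item>large</item>
    </one-of>
  </rule>
  <rule id="style">
    <one-of>
      <item>combination</item>
      <item>vegetarian</item>
    </one-of>
  </rule>
  <rule id="topping">
    <one-of>
      <item>Italian sausage</item>
      <item>mushrooms</item>
    </one-of>
  </rule>
</grammar>
```

Compared with a list grammar, this form avoids enumerating every size, style, and topping combination as a separate string.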
17. 3 Text-to-speech (TTS)
• Output synthesized speech
• Provide the user with spoken instructions
18. 3 Text-to-speech (TTS)
• TTS requires only the following capability:
– ID_CAP_SPEECH_RECOGNITION
• TTS can output the following text types:
– Unformatted text strings
– Speech Synthesis Markup Language (SSML) 1.0 strings or XML files
19. 3 Text-to-speech (TTS)
• Speaking an unformatted string takes a single call, and
a voice for another language can be selected:
// Declare the SpeechSynthesizer object at the class level.
SpeechSynthesizer synth;
private async void ButtonSimpleTTS_Click(object sender, RoutedEventArgs e)
{
    // Assign the class-level field instead of declaring a local that shadows it.
    synth = new SpeechSynthesizer();
    await synth.SpeakTextAsync("You have a meeting with Peter in 15 minutes.");
}
private async void SpeakFrench_Click_1(object sender, RoutedEventArgs e)
{
    synth = new SpeechSynthesizer();
    // Query for a voice that speaks French (requires System.Linq).
    IEnumerable<VoiceInformation> frenchVoices = from voice in InstalledVoices.All
        where voice.Language == "fr-FR" select voice;
    // Set the voice as identified by the query.
    // ElementAt(0) throws if no French voice is installed.
    synth.SetVoice(frenchVoices.ElementAt(0));
    // Count in French.
    await synth.SpeakTextAsync("un, deux, trois, quatre");
}
20. 3 Text-to-speech (TTS)
• SSML 1.0 text can be spoken from a string
or from an XML file
// Speaks a string of text with SSML markup.
private async void SpeakSsml_Click(object sender, RoutedEventArgs e)
{
    SpeechSynthesizer synth = new SpeechSynthesizer();
    // Build an SSML prompt in a string (inner quotes must be escaped).
    string ssmlPrompt = "<speak version=\"1.0\" ";
    ssmlPrompt += "xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">";
    ssmlPrompt += "This voice speaks English. </speak>";
    // Speak the SSML prompt.
    await synth.SpeakSsmlAsync(ssmlPrompt);
}
// Speaks the content of a standalone SSML file.
private async void SpeakSsmlFromFile_Click(object sender, RoutedEventArgs e)
{
    SpeechSynthesizer synth = new SpeechSynthesizer();
    // Set the path to the SSML-compliant XML file (note the path separator).
    string path = Package.Current.InstalledLocation.Path + "\\ChangeVoice.ssml";
    Uri changeVoice = new Uri(path, UriKind.Absolute);
    // Speak the SSML prompt.
    await synth.SpeakSsmlFromUriAsync(changeVoice);
}
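The contents of ChangeVoice.ssml are not shown in the slides. A minimal sketch of what such a file could contain is given below, assuming it switches language partway through; the text is illustrative, and the French passage is only spoken in French if a matching voice is installed on the device.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical ChangeVoice.ssml: an English prompt followed by a French one. -->
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  This voice speaks English.
  <voice xml:lang="fr-FR">
    Cette voix parle français.
  </voice>
</speak>
```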