Speech-to-Text

2016-10-02  xiaoxingyun

This plugin provides interfaces to the following speech-to-text services: Windows (streaming), Wit.ai (non-streaming), Google (streaming and non-streaming), and IBM Watson (streaming and non-streaming). It also includes a sample scene that compares these APIs side by side. An article on the Unity Labs website explains some of the concepts behind speech recognition and discusses the motivation behind this package.

Table of Contents

Requirements

Setting up the sample scene

  1. Open the scene "speechToTextComparison.unity".
  2. Enter your credentials for each API: select each child of "Canvas/SpeechToTextServiceWidgets" in the Inspector and fill in the appropriate field(s) of its "[Specific Name Here] Speech To Text Service" component. Note that Google streaming speech-to-text uses a JSON credentials file. Save this file in the "GoogleStreamingSpeechToTextProgram" folder under Application.streamingAssetsPath, and make sure its name matches the "JSON Credentials File Name" field of the "Google Streaming Speech To Text Service" component on "Canvas/SpeechToTextServiceWidgets/GoogleStreamingSpeechToTextService". You will only receive transcriptions from APIs for which you have provided valid credentials (Windows is the exception, since it does not require any). See the "Acquiring credentials" section for instructions on acquiring credentials for each API.
  3. Configure any parameters that you wish to change for each service (timeout, audio chunk length, etc.). Then check the "Speech To Text Comparison Widget" component of "Canvas/SpeechToTextComparisonWidget" in the Inspector to make sure that all the services you wish to test are listed under "Speech To Text Service Widgets", and add or remove services from this list as needed.
  4. The scene is now ready to run. Refer to the "Recording and comparing results" section for how the sample scene works.
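
The file location described in step 2 can be sketched as a small helper. This is an illustration only: the class and method names (CredentialsPathSketch, Resolve) are hypothetical and not part of the plugin, and the helper simply assumes the folder and file name are joined directly under the streaming assets path, as the step describes. In Unity you would pass Application.streamingAssetsPath as the first argument and the value of the "JSON Credentials File Name" field as the second.

```csharp
using System.IO;

// Hypothetical helper, not part of the plugin: builds the path where the
// Google streaming JSON credentials file is expected to be saved.
public static class CredentialsPathSketch
{
    // Folder name taken from the setup instructions above.
    const string ProgramFolder = "GoogleStreamingSpeechToTextProgram";

    public static string Resolve(string streamingAssetsPath, string jsonCredentialsFileName)
    {
        // Nested two-argument Path.Combine calls also work on older .NET profiles.
        return Path.Combine(Path.Combine(streamingAssetsPath, ProgramFolder), jsonCredentialsFileName);
    }
}
```

Checking File.Exists on the resolved path before pressing Play is one quick way to catch a mismatched file name.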

Recording and comparing results

Acquiring credentials

Google Cloud Speech

  1. Sign up for a Google Cloud Platform account.
  2. Sign up for Google Cloud Speech API.
  3. Once you have been granted access to Google Cloud Speech API, refer to the documentation for instructions on generating an API key (for non-streaming speech-to-text) and a JSON service account key (for streaming speech-to-text).

IBM Watson Speech to Text

  1. Sign up for an IBM Bluemix account.
  2. Sign up for IBM Watson Speech to Text.
  3. Once you have been granted access to IBM Watson Speech to Text, refer to the documentation for instructions on generating service credentials.

Wit.ai

  1. Sign up for a Wit.ai account.
  2. Create a new app through the Wit.ai console.
  3. Your server access token will be listed under your app settings.

Architecture

Namespaces

Speech-to-text services and results inheritance hierarchy

Speech-to-text services and results base functions and properties

AudioRecordingManager functions and properties

SmartLogger and DebugFlags

Example of speech-to-text service usage

void OnError(string text)
{
    Debug.LogError(text);
}

// Note that handling interim results is only necessary if your speech-to-text service is streaming.
// Non-streaming speech-to-text services should only return one result per recording.
void OnTextResult(SpeechToTextResult result)
{
    if (result.IsFinal)
    {
        Debug.Log("Final result:");
    }
    else
    {
        Debug.Log("Interim result:");
    }
    for (int i = 0; i < result.TextAlternatives.Length; ++i)
    {
        Debug.Log("Alternative " + i + ": " + result.TextAlternatives[i].Text);
    }
}

void OnRecordingTimeout()
{
    Debug.Log("Timeout");
}
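
// Register the callbacks before starting a recording, for example in Start() or OnEnable().
m_SpeechToTextService.RegisterOnError(OnError);
m_SpeechToTextService.RegisterOnTextResult(OnTextResult);
m_SpeechToTextService.RegisterOnRecordingTimeout(OnRecordingTimeout);

// Unregister them once they are no longer needed, for example in OnDestroy() or OnDisable().
m_SpeechToTextService.UnregisterOnError(OnError);
m_SpeechToTextService.UnregisterOnTextResult(OnTextResult);
m_SpeechToTextService.UnregisterOnRecordingTimeout(OnRecordingTimeout);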
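
Why the register and unregister calls should be paired can be shown with a self-contained sketch. Everything here is a stand-in: FakeSpeechToTextService and ListenerSketch are hypothetical classes written for this illustration, and the Action<string> callback signature is an assumption modeled on the OnError handler above, not the plugin's confirmed API. In a MonoBehaviour, Enable and Disable would typically map to OnEnable and OnDisable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a speech-to-text service; not the plugin's real class.
public class FakeSpeechToTextService
{
    readonly List<Action<string>> m_OnError = new List<Action<string>>();

    public void RegisterOnError(Action<string> callback) { m_OnError.Add(callback); }
    public void UnregisterOnError(Action<string> callback) { m_OnError.Remove(callback); }

    // Simulates the service reporting an error to every registered callback.
    public void RaiseError(string text)
    {
        // Iterate over a copy so callbacks may unregister themselves safely.
        foreach (Action<string> callback in m_OnError.ToArray())
        {
            callback(text);
        }
    }
}

// Hypothetical listener that pairs registration with unregistration.
public class ListenerSketch
{
    readonly FakeSpeechToTextService m_Service;

    public int ErrorCount { get; private set; }

    public ListenerSketch(FakeSpeechToTextService service) { m_Service = service; }

    // Start receiving error callbacks.
    public void Enable() { m_Service.RegisterOnError(OnError); }

    // Stop receiving error callbacks; delegates compare equal by target and method,
    // so passing the same method group removes the earlier registration.
    public void Disable() { m_Service.UnregisterOnError(OnError); }

    void OnError(string text) { ErrorCount++; }
}
```

After Disable is called, further errors raised by the stand-in service no longer reach the listener, which is the behavior the paired calls are meant to guarantee.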

Forks

The Bitbucket repository for this project can be found here. Anyone in the community is welcome to create their own forks. Drop us a note at labs@unity3d.com if you find it useful; we'd love to hear from you!
