Angular2: Using Speech Recognition of Web Speech API in Angular2


Speech Recognition functionality of the Web Speech API in Angular2 application

Speech Recognition

Speech recognition involves receiving speech through a device’s microphone, which is then checked by a speech recognition service against a list of grammar (basically, the vocabulary you want to have recognised in a particular app.) When a word or phrase is successfully recognised, it is returned as a result (or list of results) as a text string, and further actions can be initiated as a result.

The Web Speech API has a main controller interface for this — SpeechRecognition — plus a number of closely-related interfaces for representing grammar, results, etc. Generally, the default speech recognition system available on the device will be used for the speech recognition — most modern OSes have a speech recognition system for issuing voice commands. Think about Dictation on Mac OS X, Siri on iOS, Cortana on Windows 10, Android Speech, etc.


Chrome support

Chrome currently supports speech recognition through prefixed properties, so at the start of our code we include these lines to feed the right objects to Chrome and to non-prefixed browsers, like Firefox:

var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
var SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;
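The same guard can be written as a small helper that accepts any window-like object, which makes the fallback easy to unit-test outside the browser. Note that `getSpeechRecognition` is our own hypothetical helper name, not part of the Web Speech API:

```javascript
// Hypothetical helper (not part of the Web Speech API): return whichever
// recognition constructor the environment exposes, or null if neither exists.
function getSpeechRecognition(win) {
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// In a real page you would pass the global window object:
// var Recognition = getSpeechRecognition(window);
```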

Properties

  • SpeechRecognition.continuous: Controls whether continuous results are returned for each recognition, or only a single result.
  • SpeechRecognition.lang: Sets the language of the recognition. Setting this is good practice, and therefore recommended.
  • SpeechRecognition.interimResults: Defines whether the speech recognition system should return interim results, or just final results. Final results are good enough for this simple demo.
  • SpeechRecognition.maxAlternatives: Sets the number of alternative potential matches that should be returned per result.

recognition.continuous = false;
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;

Event handlers

  • SpeechRecognition.onaudiostart: Fired when the user agent has started to capture audio.
  • SpeechRecognition.onaudioend: Fired when the user agent has finished capturing audio.
  • SpeechRecognition.onend: Fired when the speech recognition service has disconnected.
  • SpeechRecognition.onerror: Fired when a speech recognition error occurs.
  • SpeechRecognition.onnomatch: Fired when the speech recognition service returns a final result with no significant recognition. This may involve some degree of recognition, which doesn’t meet or exceed the confidence threshold.
  • SpeechRecognition.onresult: Fired when the speech recognition service returns a result — a word or phrase has been positively recognized and this has been communicated back to the app.
  • SpeechRecognition.onstart: Fired when the speech recognition service has begun listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
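Because these are plain handler slots on the recognition object, they can be wired up and exercised against a simple mock without a browser. The sketch below uses `wireHandlers`, a hypothetical helper of our own, to log lifecycle events:

```javascript
// Hypothetical sketch: attach minimal lifecycle handlers to any object that
// has the SpeechRecognition handler slots; a plain mock object works too.
function wireHandlers(recognition, log) {
  recognition.onstart = function () { log.push('listening'); };
  recognition.onend = function () { log.push('stopped'); };
  recognition.onerror = function (event) { log.push('error: ' + event.error); };
}
```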

Methods

  • SpeechRecognition.abort(): Stops the speech recognition service from listening to incoming audio, and doesn’t attempt to return a SpeechRecognitionResult.
  • SpeechRecognition.start(): Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
  • SpeechRecognition.stop(): Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.

Starting the speech recognition

recognition.start();

Receiving and handling results

recognition.onresult = function(event) {
  var last = event.results.length - 1;
  console.log(event.results[last][0].transcript);
  console.log('Confidence: ' + event.results[last][0].confidence);
}

The SpeechRecognitionEvent.results property returns a SpeechRecognitionResultList object containing SpeechRecognitionResult objects. It has a getter, so it can be accessed like an array — [last] therefore returns the SpeechRecognitionResult at the last position. Each SpeechRecognitionResult object contains SpeechRecognitionAlternative objects that hold the individual recognised words. These also have getters, so they can be accessed like arrays — [0] therefore returns the SpeechRecognitionAlternative at position 0.
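Since both levels are array-like, the lookup in the handler above can be expressed as a small pure function and verified against plain nested arrays shaped like a SpeechRecognitionResultList. `lastTranscript` is a hypothetical helper name of our own:

```javascript
// Hypothetical helper: pick the first alternative of the last result.
// `results` is anything shaped like a SpeechRecognitionResultList: an array
// of results, each an array of alternatives carrying `transcript` and
// `confidence`.
function lastTranscript(results) {
  var last = results.length - 1;
  return {
    transcript: results[last][0].transcript,
    confidence: results[last][0].confidence
  };
}
```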

Stopping the speech recognition

recognition.stop(); 

Handling error in the speech recognition

SpeechRecognition.onerror handles cases where an actual error occurs during recognition — the SpeechRecognitionError.error property contains the specific error returned:

recognition.onerror = function(event) {
  console.log(event.error);
}
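The error codes the spec defines ('no-speech', 'aborted', 'audio-capture', 'network', 'not-allowed', and a few others) are plain strings, so mapping them to user-facing messages is a convenient, testable step before surfacing them in the UI. `describeError` below is a hypothetical helper, and the message wording is our own:

```javascript
// Hypothetical helper: map Web Speech API error codes (the code strings are
// from the spec) to user-facing messages (the messages are our own wording).
function describeError(code) {
  var messages = {
    'no-speech': 'No speech was detected. Please try again.',
    'audio-capture': 'No microphone was found, or audio capture failed.',
    'not-allowed': 'Microphone permission was denied.',
    'network': 'A network error interrupted the recognition service.',
    'aborted': 'Recognition was aborted.'
  };
  return messages[code] || 'Speech recognition error: ' + code;
}
```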

Web Speech API in Angular2

See extended DEMO on Github

Angular2 Service using Web Speech API

import { Injectable, NgZone } from '@angular/core';
import { Observable } from 'rxjs/Rx';
import * as _ from "lodash";

interface IWindow extends Window {
    webkitSpeechRecognition: any;
    SpeechRecognition: any;
}

@Injectable()
export class SpeechRecognitionService {
    speechRecognition: any;

    constructor(private zone: NgZone) {
    }

    record(): Observable<string> {

        return Observable.create(observer => {
            const { webkitSpeechRecognition }: IWindow = <IWindow>window;
            this.speechRecognition = new webkitSpeechRecognition();
            //this.speechRecognition = SpeechRecognition;
            this.speechRecognition.continuous = true;
            //this.speechRecognition.interimResults = true;
            this.speechRecognition.lang = 'en-US';
            this.speechRecognition.maxAlternatives = 1;

            this.speechRecognition.onresult = speech => {
                let term: string = "";
                if (speech.results) {
                    var result = speech.results[speech.resultIndex];
                    var transcript = result[0].transcript;
                    if (result.isFinal) {
                        if (result[0].confidence < 0.3) {
                            console.log("Unrecognized result - Please try again");
                        }
                        else {
                            term = _.trim(transcript);
                            console.log("Did you say? -> " + term + ". If not, then say something else...");
                        }
                    }
                }
                this.zone.run(() => {
                    observer.next(term);
                });
            };

            this.speechRecognition.onerror = error => {
                observer.error(error);
            };

            this.speechRecognition.onend = () => {
                observer.complete();
            };

            this.speechRecognition.start();
            console.log("Say something - We are listening !!!");
        });
    }

    DestroySpeechObject() {
        if (this.speechRecognition)
            this.speechRecognition.stop();
    }

}
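The filtering inside the service's onresult handler — keep only final results at or above a 0.3 confidence threshold, then trim the transcript — is pure logic, so it can be pulled out and tested without a browser or RxJS. `acceptResult` is a hypothetical extraction of our own, not a method of the service above (it uses the native trim rather than lodash's):

```javascript
// Hypothetical extraction of the onresult filtering in the service above:
// reject non-final results and final results below the confidence threshold,
// otherwise return the trimmed transcript. 0.3 mirrors the service's check.
function acceptResult(result, threshold) {
  if (!result.isFinal) return null;
  if (result[0].confidence < threshold) return null;
  return result[0].transcript.trim();
}
```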

Component using above service

import { Component, OnInit, OnDestroy} from '@angular/core';
import { SpeechRecognitionService } from './speech-recognition.service';

@Component({
    selector: 'my-app',
    templateUrl: './App/app.component.html'
})

export class AppComponent implements OnInit, OnDestroy {
    showSearchButton: boolean;
    speechData: string;

    constructor(private speechRecognitionService: SpeechRecognitionService) {
        this.showSearchButton = true;
        this.speechData = "";
    }

    ngOnInit() {
        console.log("hello")
    }

    ngOnDestroy() {
        this.speechRecognitionService.DestroySpeechObject();
    }

    activateSpeechSearchMovie(): void {
        this.showSearchButton = false;

        this.speechRecognitionService.record()
            .subscribe(
            //listener
            (value) => {
                this.speechData = value;
                console.log(value);
            },
            //error
            (err) => {
                console.log(err);
                if (err.error == "no-speech") {
                    console.log("--restarting service--");
                    this.activateSpeechSearchMovie();
                }
            },
            //completion
            () => {
                this.showSearchButton = true;
                console.log("--complete--");
                this.activateSpeechSearchMovie();
            });
    }

}

Usage

Before running the project, install its dependencies by typing the command below in CMD:

npm install

Output

Initial State:

[screenshot_19]

Example1:

[screenshot_20]

Example2:

[screenshot_21]

Reference

Text Reference.

API Reference


