The Automatic Speech Recognition engine is used to convert speech to text. It is an alternative User Interface to DTMF in IVR applications.
Engines from the following vendors have been evaluated.
Nuance is effectively the monopoly supplier of speech recognition engines. Accuracy and performance are good, but the cost is prohibitively high for most applications. The SDK is available as a free trial for 2 months. A perpetual per-port license costs between $800 and $2000, depending on vocabulary size and natural-language support. Recognition for Indian English and Hindi is good. Other Indian languages are available but still need to be evaluated for accuracy.
NRec-9.0.19-i386-rhel3.tar.gz
NRec-9.0.1-en-IN.i386-rhel3.tar.gz
NRec-9.0.1-hi-IN.i386-rhel3.tar.gz
NLICMGR-11.7.0-x86_64-linux.tar.gz
eval-rec-9.lic
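Before starting the install, it can help to confirm that all of the files listed above are present. The following is a minimal sketch; the function name `missing_nuance_files` is hypothetical, and the directory to check is passed as an argument (defaulting to the current directory).

```shell
#!/bin/sh
# Print the name of every expected Nuance download that is NOT in the given directory.
# The file names are the ones listed above; the function name is illustrative only.
missing_nuance_files() {
    dir="${1:-.}"
    for f in \
        NRec-9.0.19-i386-rhel3.tar.gz \
        NRec-9.0.1-en-IN.i386-rhel3.tar.gz \
        NRec-9.0.1-hi-IN.i386-rhel3.tar.gz \
        NLICMGR-11.7.0-x86_64-linux.tar.gz \
        eval-rec-9.lic
    do
        # Echo only the files that are missing.
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

For example, `missing_nuance_files /root` prints nothing when everything is in place and one file name per line otherwise.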
CentOS 5.5 x86_64
tar xvzf NRec-9.0.19-i386-rhel3.tar.gz
./install.sh
tar xvzf NRec-9.0.1-en-IN.i386-rhel3.tar.gz
tar xvzf NRec-9.0.1-hi-IN.i386-rhel3.tar.gz
rpm -ivh NRec-en-IN-9.0-1.i386-rhel3.rpm
rpm -ivh NRec-hi-IN-9.0-1.i386-rhel3.rpm
yum -y install redhat-lsb
tar xvzf NLICMGR-11.7.0-x86_64-linux.tar.gz
cd Nuance_License_Manager
./install.sh
cd /usr/local/Nuance/license_manager/license
cp /root/eval-rec-9.lic .
cd ../components
./set-new-lic-file.sh /usr/local/Nuance/license_manager/license/eval-rec-9.lic
Check that the license log file /usr/local/Nuance/license_manager/license/nuance-lic.log contains lines like the following. They indicate that the License Manager has picked up the evaluation license file correctly.
19:55:22 (lmgrd) License file(s): /usr/local/Nuance/license_manager/license/eval-rec-9.lic
19:55:22 (lmgrd) lmgrd tcp-port 27000
19:55:22 (lmgrd) Starting vendor daemons ...
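The log check can be scripted rather than done by eye. This is a rough sketch that only looks for the two lines shown above (the license file path and the lmgrd TCP port); the function name `license_log_ok` is hypothetical, and the log may contain other lines this check ignores.

```shell
#!/bin/sh
# Succeed if the log shows lmgrd loading the given license file and opening a TCP port.
# Usage: license_log_ok <log file> <license file path>
license_log_ok() {
    log="$1"
    lic="$2"
    # -F: match the license line as a fixed string, since it contains parentheses.
    grep -F "(lmgrd) License file(s): $lic" "$log" >/dev/null &&
    grep -E '\(lmgrd\) lmgrd tcp-port [0-9]+' "$log" >/dev/null
}
```

With the paths used on this page, the call would be `license_log_ok /usr/local/Nuance/license_manager/license/nuance-lic.log /usr/local/Nuance/license_manager/license/eval-rec-9.lic`.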
Check that the Nuance License Manager is running.
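One way to do this, sketched under assumptions: the log above shows the daemon is named lmgrd and listens on TCP port 27000, so the check is a process lookup plus a listening-port lookup. The helper name `port_is_listening` is hypothetical, and it expects `netstat -tln` style output on standard input.

```shell
#!/bin/sh
# Succeed if the listener table on stdin (e.g. from `netstat -tln`)
# contains a LISTEN entry on the given port.
port_is_listening() {
    port="$1"
    # Match ":PORT" or ".PORT" followed by whitespace, so 27000 does not match 270001.
    grep -E "[:.]${port}[[:space:]].*LISTEN" >/dev/null
}

# Example (run on the license server):
#   pgrep -x lmgrd >/dev/null && netstat -tln | port_is_listening 27000 \
#       && echo "License Manager is running"
```

If the License Manager was configured with a different port, pass that port instead of 27000.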
Start and Test Speech Server
service NSSservice start
Check that a Nuance client running on a different machine is able to talk to the Nuance Speech Server.
Nuance Recognizer logs are in /usr/local/Nuance/Recognizer/data
Sample Lua IVR
In dialplan conf/dialplan/default.xml put the following extension
Sample English Grammar
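As an illustration only, a minimal yes/no grammar in W3C SRGS XML form can be written out from the shell as below. The function name, default file name, rule name and vocabulary are all assumptions made for this sketch; only the en-IN language tag comes from the language pack installed above.

```shell
#!/bin/sh
# Write a minimal yes/no SRGS XML grammar for Indian English to the given file.
# File name, rule id and vocabulary are illustrative, not taken from a real deployment.
write_sample_grammar() {
    out="${1:-yesno-en-IN.grxml}"
    cat > "$out" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-IN" mode="voice" root="yesno">
  <rule id="yesno" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>
EOF
}
```

Calling `write_sample_grammar /tmp/yesno.grxml` produces a grammar with a single public rule that accepts either "yes" or "no".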
Sample Hindi Grammar
This engine started from Sphinx, the free and open source project at CMU, but considerable proprietary development has since been done for MRCP and acoustic modelling. Indian English and Hindi are available, but the SDK costs around $4000, so evaluation is expensive.
Vestec offers its SDK for $25 or as a free evaluation. The per-port license fee is around $200. Indian English and Hindi are available and are yet to be evaluated.
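To put the per-port prices quoted on this page side by side, the total license cost is simply ports multiplied by price per port. The sketch below assumes a 30-port deployment purely for illustration; the function name `license_cost` is hypothetical, and SDK fees and support contracts are not included.

```shell
#!/bin/sh
# Total license cost in dollars = number of ports * price per port.
license_cost() {
    ports="$1"
    price="$2"
    echo $(( ports * price ))
}

# Example for an assumed 30-port system, using the per-port prices quoted on this page:
#   license_cost 30 800    # Nuance, low end  -> 24000
#   license_cost 30 2000   # Nuance, high end -> 60000
#   license_cost 30 200    # Vestec           -> 6000
```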
Simmortel uses a mix of open source and proprietary products to deliver good accuracy for medium-vocabulary applications in Indian English and Hindi. However, MRCP is not available, and CPU usage for concurrent calls is very high.
This vendor provides good support for European languages. It has been acquired by Nuance.
Sphinx is open source and free. It works well if you have a trained acoustic model for your language and application. MRCP integration needs to be done for any real application.