...

Code Block
cp -drp <freeswitch-src-dir>/scripts/javascript/js_modules /usr/local/freeswitch/scripts/
cp <freeswitch-src-dir>/scripts/javascript/ps_pizza.js /usr/local/freeswitch/scripts/

 


  • If you are doing this on an old install, you must copy pocketsphinx.conf.xml to the conf directory:
Code Block
cp /usr/src/freeswitch/conf/autoload_configs/pocketsphinx.conf.xml /usr/local/freeswitch/conf/autoload_configs/

 


  • Download the sound files from here
  • Move the extracted pizza directory to the sounds directory under the FreeSWITCH install (e.g., /usr/local/freeswitch/sounds/en/us)
  • Newer FreeSWITCH versions already contain /usr/local/freeswitch/conf/dialplan/default/00_pizza_demo.xml which sets up 74992 or "pizza" as an extension. If you are on an older FreeSWITCH version, make an extension like this:
Code Block
 <include>
  <extension name="pizza_demo">
    <condition field="destination_number" expression="^(pizza|74992)$"/>
    <condition field="${module_exists(mod_spidermonkey)}" expression="true"/>
    <condition field="${module_exists(mod_pocketsphinx)}" expression="true">
     <action application="javascript" data="ps_pizza.js"/>
    </condition>
  </extension>
 </include>

 

 



  • Edit your ps_pizza.js to set the location of your sound files:
Code Block
asr.setAudioBase("en/us/pizza/");

 


  • Install grammar files
Code Block
cd /usr/local/freeswitch/grammar
wget http://files.freeswitch.org/pizza_gram.tar.gz
tar xvzf pizza_gram.tar.gz

 

 



  • Give it a try by calling extension 74992 and watching the console for messages.
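
Before placing the test call, it can help to confirm that the files from the steps above are where FreeSWITCH expects them. The following is a minimal sketch, not part of FreeSWITCH: the function name check_pizza_demo is hypothetical, the default prefix /usr/local/freeswitch is the one used on this page, and the grammar check only tests that the directory exists because the contents of pizza_gram.tar.gz are not listed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which pizza demo files are not in place.
# Usage: check_pizza_demo [install-prefix]
check_pizza_demo() {
    local prefix="${1:-/usr/local/freeswitch}"
    local missing=0
    local path
    for path in \
        "scripts/ps_pizza.js" \
        "scripts/js_modules" \
        "conf/autoload_configs/pocketsphinx.conf.xml" \
        "sounds/en/us/pizza" \
        "grammar"
    do
        if [ ! -e "$prefix/$path" ]; then
            echo "MISSING: $prefix/$path"
            missing=$((missing + 1))
        fi
    done
    echo "$missing item(s) missing"
    # Return success only when nothing is missing
    [ "$missing" -eq 0 ]
}
```

Running check_pizza_demo with no argument checks the default prefix; pass another prefix if FreeSWITCH is installed elsewhere.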

...

Process

 


  • Create work directory
Code Block
mkdir <anywhere>/vf_de_test
cd <anywhere>/vf_de_test

 


  • This new directory is now our <workdir>
  • Prepare SphinxTrain

 


Code Block
tar -jxf SphinxTrain-1.0.tar.bz2
cd SphinxTrain-1.0
make
cd ..

 


  • Set up the Sphinx training environment "voxforge_de_sphinx"
  • ./SphinxTrain-1.0/scripts_pl/setup_SphinxTrain.pl -task voxforge_de_sphinx
  • Content of <workdir>/

 


Info
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 bin
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 bwaccumdir
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 etc
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 feat
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 logdir
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 model_architecture
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 model_parameters
drwxr-xr-x   3 ssw voip    4096  5. Aug 11:32 python
drwxr-xr-x  20 ssw voip    4096  5. Aug 11:32 scripts_pl
drwxr-xr-x   2 ssw voip    4096  5. Aug 11:32 wav
drwxr-xr-x  14 ssw voip    4096  5. Aug 11:02 SphinxTrain-1.0
-rw-r--r--   1 ssw voip 8297682 12. Feb 17:01 SphinxTrain-1.0.tar.bz2

 

 



  • Copy Sphinxbase version from freeswitch source directory

...

  • Extract acoustic model in a new directory

 


Code Block
mkdir am_tmp
cd am_tmp
tar -zxf AcousticModels.tgz
cd ..
Content of am_tmp:

...

Code Block
-rw-r--r-- 1 ssw voip 7862417  8. Mai 12:27 AcousticModels.tgz
-rwxr-xr-x 1 ssw voip    3435 31. Mai 2008  espeak2phones.pl
drwxr-xr-x 2 ssw voip    4096  8. Mai 00:19 etc
drwxr-xr-x 3 ssw voip    4096  8. Mai 00:19 model_parameters
drwxr-xr-x 2 ssw voip    4096  8. Mai 00:19 result
drwxr-xr-x 2 ssw voip    4096  8. Mai 00:19 test
-rwxr-xr-x 1 ssw voip    1368 31. Mai 2008  traintest

 


  • Preparing audio data (here 8kHz sample rate)
    • Put VoxForge's audio archives in <workdir>/audio
    • Extract all archives

      Code Block
      cd audio
      for i in *.tgz; do tar -zxf "$i"; done


    • Create the script "copy_and_convert_audio.sh" in <workdir>


Code Block
#Copyright 2009 Helmut Kuper
#
SOD=`pwd`
AD="${SOD}/audio"
TD="${SOD}/wav"

# Both directories must exist before anything is copied
if [ ! -d "$TD" ]
then
        echo "ERROR: No wav directory found"
        echo "Please create it"
        exit 1
elif [ ! -d "$AD" ]
then
        echo "ERROR: No audio directory found"
        exit 1
fi

copied=0
conv=0

cd $AD

for i in *
do
        if [ -d "$i/wav" ]
        then
                cd $i/wav
                for j in *.wav
                        do
                                cp $j "$TD/${i}_$j"
                                if [[ $(( copied++ % 100 )) -eq 0 ]]; then echo "wav: Copied: $((copied - 1))"; fi
                done
                cd $AD
        elif [ -d "$i/flac" ]
        then
                cd $i/flac
                for j in *.flac
                        do
                                # Pattern is unquoted: bash 3.2+ treats a quoted pattern as a literal string
                                if [[ $j =~ (.*)\.flac$ ]]
                                then
                                        fname=${BASH_REMATCH[1]}
                                        #echo "Flac: Converting '$j' to ${i}_$fname.wav"
                                        flac -f -s -d -o "$TD/${i}_$fname.wav" "$j"
                                        if [[ $(( conv++ % 100 )) -eq 0 ]]; then echo "Flac: Converted $((conv - 1))"; fi
                                fi
                        done
                cd $AD
        fi
done

cd $SOD

echo "Copied $copied files"
echo "Copied and converted $conv files"
echo "Copied $((copied + conv )) files to $TD"
echo
echo "Done"
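
The script above names every output file <archive directory>_<file name>.wav, so recordings from different archives cannot overwrite each other. As a small illustration of that naming rule only, here is a sketch that derives the target name with parameter expansion instead of a regular expression; the function name target_wav_name is hypothetical and is not used by the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: target_wav_name ARCHIVE_DIR FILE
# Prints the name FILE would get in <workdir>/wav.
target_wav_name() {
    local archive="$1" file="$2"
    local base="${file%.*}"   # strip only the last extension (.wav or .flac)
    printf '%s_%s.wav\n' "$archive" "$base"
}
```

Parameter expansion behaves the same in every bash version, which avoids the quoting differences of the =~ operator.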

 


  • Convert the audio data (some files are in FLAC format) and copy it to the <workdir>/wav directory:
  • bash ./copy_and_convert_audio.sh (you must be in the <workdir> directory)
  • Create a feature file in <workdir>:
    • vi <workdir>/my_feat.params
Code Block
-alpha 0.97
-dither yes
-doublebw no
-nfilt 40
-ncep 13
-lowerf 0
-upperf 4000
-nfft 512
-wlen 0.0256
-transform legacy
-feat s2_4x

 


  • Create script for renaming MFC files in <workdir>.
    • vi <workdir>/renameMFC.sh

 


Code Block
#Copyright 2009 Helmut Kuper
#
for i in *.ch1.mfc
do
        # Pattern is unquoted: bash 3.2+ treats a quoted pattern as a literal string
        if [[ $i =~ (.*)\.ch1\.mfc$ ]]
        then
                fname=${BASH_REMATCH[1]}
                mv "$i" "$fname.mfc"
                echo "Renaming '$i' to $fname.mfc"
        fi
done
echo "Done"

 


  • Copy Voxforge's configurations to <workdir>/etc
    • cp ./am_tmp/etc/* ./etc/
  • Replace feature file with our own
    • cp ./my_feat.params ./etc/feat.params
  • Adapt VoxForge's sphinx_train.cfg to our environment:
    • vi <workdir>/etc/sphinx_train.cfg

Code Block
$CFG_BASE_DIR = "<workdir>/vf_de_test";
$CFG_SPHINXTRAIN_DIR = "./SphinxTrain-1.0";
#$CFG_HMM_TYPE = '.cont.'; # Sphinx III
$CFG_HMM_TYPE  = '.semi.'; # Sphinx II
$CFG_LISTOFFILES    = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
$CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";



 


  • Content of <workdir>
Code Block
am_tmp
audio
bin
bwaccumdir
copy_and_convert_audio.sh
etc
feat
logdir
model_architecture
model_parameters
my_feat.params
python
renameMFC.sh
scripts_pl
sphinxbase
sphinxbase-0.4.99
SphinxTrain-1.0
SphinxTrain-1.0.tar.bz2
wav

 


  • At least one file (openpento-20080512-2_3_exp_5_1_Unit_0) is corrupt, so delete the line containing its name from each of these files:
    • ./etc/voxforge_de_sphinx_train.transcription
    • ./etc/voxforge_de_sphinx_train.fileids
    • Then delete the file "./wav/openpento-20080512-2_3_exp_5_1_Unit_0.wav"
  • Create MFC files from the WAV files
    • <workdir>/sphinxbase/bin/sphinx_fe `cat ./etc/feat.params` -c ./etc/voxforge_de_sphinx_train.fileids -di ./wav -do ./feat/ -ei wav -eo mfc -raw no -mswav yes -samprate 8000
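
Removing the corrupt utterance from the two list files, as described above, can also be scripted. The following is a minimal sketch; the function name drop_utterance is hypothetical, and it assumes a grep that supports -w (GNU and BSD grep do). Whole-word matching matters here because the ID ends in "_0" and a plain substring match could also hit IDs such as "..._Unit_01".

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop_utterance ID FILE...
# Removes every line that mentions ID from each FILE, keeping FILE.bak as a backup.
drop_utterance() {
    local id="$1" f
    shift
    for f in "$@"; do
        cp "$f" "$f.bak"
        # -F: ID is a fixed string, -w: match whole words only, -v: keep non-matching lines.
        # grep exits with 1 when no line is left, which is not an error here.
        grep -vwF -- "$id" "$f.bak" > "$f" || true
    done
}
```

Called from <workdir>, this would be: drop_utterance openpento-20080512-2_3_exp_5_1_Unit_0 ./etc/voxforge_de_sphinx_train.transcription ./etc/voxforge_de_sphinx_train.fileids. The WAV file itself still has to be deleted separately.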

...

Info
INFO: fe_interface.c(288): You are using the internal mechanism to generate the seed.

 


  • Get rid of those ".ch1." parts in some MFC files
    • cd <workdir>/feat
    • bash ../renameMFC.sh
    • cd ..
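
After the renaming step, every ID in the control file should have a matching .mfc file in ./feat. A sketch of such a check follows; it assumes the control file holds one utterance ID per line and that the feature file for an ID is feat/<ID>.mfc, as the sphinx_fe call above suggests. The function name missing_features is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: missing_features FILEIDS FEATDIR
# Prints every ID listed in FILEIDS that has no FEATDIR/ID.mfc file.
missing_features() {
    local id
    while IFS= read -r id; do
        [ -n "$id" ] || continue            # skip empty lines
        [ -f "$2/$id.mfc" ] || echo "$id"
    done < "$1"
}
```

From <workdir>, missing_features ./etc/voxforge_de_sphinx_train.fileids ./feat should print nothing if all features were created and renamed.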

...

Execute "<workdir>/scripts_pl/00.verify/verify_all.pl"

 

Code Block
MODULE: 00 verify training files
O.S. is case sensitive ("A" != "a").
Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
        Found 3019 words using 41 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive); files exist
    Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
        Total Hours Training: 4.47290213675222
        This is a small amount of data, no comment at this time
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 3016
        Words in filler dictionary: 3
    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once

...

Code Block
MODULE: 00 verify training files
O.S. is case sensitive ("A" != "a").
Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
        Found 3019 words using 41 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive); files exist
    Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
        Total Hours Training: 4.47290213675222
        This is a small amount of data, no comment at this time
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 3016
        Words in filler dictionary: 3
    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 01 Vector Quantization
MODULE: 02 Training Context Independent models for forced alignment and VTLN
Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped:  $ST::CFG_VTLN set to '' in sphinx_train.cfg
MODULE: 03 Force-aligning transcripts
Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 04 Force-aligning data for VTLN
Skipped:  $ST::CFG_VTLN set to '' in sphinx_train.cfg
MODULE: 05 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 06 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80%

 


Now you can go and get a cup of coffee or tea or go to bed or...

...

Code Block
[...]
Training for 1 Gaussian(s) completed after 4 iterations
MODULE: 90 deleted interpolation
    Phase 1: Cleaning up directories: logs...
    Phase 2: Doing interpolation...
WARNING: This step had 0 ERROR messages and 1 WARNING messages.  Please check the log file for details.
    Phase 3: Dumping senones for PocketSphinx...
MODULE: 99 Convert to Sphinx2 format models
    Phase 1: Cleaning up old log files...
    Phase 2: Copy noise dictionary
    Phase 3: Make codebooks
0
    Phase 4: Make chmm files
    Phase 5: Make senone file
    Phase 6: Make phone and map files

 


The target folder "<workdir>/model_parameters/voxforge_de_sphinx.ci_semi" now looks like this:

...

Code Block
<configuration name="pocketsphinx.conf" description="PocketSphinx ASR Configuration">
  <settings>
    <param name="threshold" value="400"/>
    <param name="silence-hits" value="25"/>
    <param name="listen-hits" value="1"/>
    <param name="auto-reload" value="true"/>
    <param name="narrowband-model" value="de4"/>
    <param name="wideband-model" value="wsj1"/>
    <param name="dictionary" value="de4.dic"/>
  </settings>
</configuration>
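
To double-check which model and dictionary names the configuration refers to before reloading the module, the values can be read back from the file. This is a rough sketch, not an XML parser: it assumes one <param .../> per line with name before value, exactly as in the listing above, and the function name conf_param is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: conf_param FILE NAME
# Prints the value attribute of the first <param name="NAME" value="..."/> line in FILE.
conf_param() {
    sed -n 's/.*<param name="'"$2"'" value="\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

For example, conf_param /usr/local/freeswitch/conf/autoload_configs/pocketsphinx.conf.xml narrowband-model should print de4 for the configuration shown above.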

 


That's all you have to do, as far as I know. The results on my side were, well, suboptimal. After reloading mod_pocketsphinx, FreeSWITCH detected simple German words, but not very reliably. I think this is because of the small amount of prepared German audio data: VoxForge recommends 130 hours for training, but currently (March 2011) only 25 hours are available.

...