Kaldi - Speech to text using Aspire


This post assumes Kaldi is already installed and shows how to install the Aspire model and use it to transcribe an audio file. A list of available Kaldi models can be found here. If you don't have Kaldi installed yet, follow this page first: How to install Kaldi-ASR on Ubuntu 18


Navigate to egs/aspire/s5 in your Kaldi directory and run the following commands.

wget https://kaldi-asr.org/models/1/0001_aspire_chain_model_with_hclg.tar.bz2

tar xfv 0001_aspire_chain_model_with_hclg.tar.bz2

steps/online/nnet3/prepare_online_decoding.sh --mfcc-config conf/mfcc_hires.conf data/lang_chain exp/nnet3/extractor exp/chain/tdnn_7b exp/tdnn_7b_chain_online

wget https://catalog.ldc.upenn.edu/desc/addenda/LDC93S1.wav

. ./cmd.sh

. ./path.sh

ffmpeg -i LDC93S1.wav -acodec pcm_s16le -ac 1 -ar 8000 output.wav
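The ffmpeg command above asks for 16-bit PCM, one channel, and an 8000 Hz sample rate. The following is a minimal sketch for confirming that the converted file has that format by reading its header with od. The function name wav_info is hypothetical, and the sketch assumes a canonical WAV header (the "fmt " chunk comes first) and a little-endian machine.

```shell
#!/bin/sh
# Hypothetical helper: print "<channels> <sample_rate> <bits_per_sample>"
# read from a WAV file header.
# Assumptions: the "fmt " chunk is the first chunk (canonical header), and
# the host is little-endian, because od decodes integers in host byte order.
wav_info() {
    # Channel count: 2-byte unsigned integer at byte offset 22.
    channels=$(od -An -t u2 -j 22 -N 2 "$1" | tr -d '[:space:]')
    # Sample rate: 4-byte unsigned integer at byte offset 24.
    rate=$(od -An -t u4 -j 24 -N 4 "$1" | tr -d '[:space:]')
    # Bits per sample: 2-byte unsigned integer at byte offset 34.
    bits=$(od -An -t u2 -j 34 -N 2 "$1" | tr -d '[:space:]')
    echo "$channels $rate $bits"
}

# Usage:
#   wav_info output.wav
```

If the assumptions hold, a file converted with the options above should report one channel, a rate of 8000, and 16 bits per sample.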

Navigate to src/online2bin in your Kaldi directory.

In the following command, the paths to online.conf, words.txt, final.mdl, HCLG.fst, and output.wav may need to be adjusted to match your installation.

./online2-wav-nnet3-latgen-faster \
--online=false \
--do-endpointing=false \
--frame-subsampling-factor=3 \
--config=/home/ubuntu/kaldi/egs/aspire/s5/exp/tdnn_7b_chain_online/conf/online.conf \
--max-active=7000 \
--beam=15.0 \
--lattice-beam=6.0 \
--acoustic-scale=1.0 \
--word-symbol-table=/home/ubuntu/kaldi/egs/aspire/s5/exp/tdnn_7b_chain_online/graph_pp/words.txt \
/home/ubuntu/kaldi/egs/aspire/s5/exp/tdnn_7b_chain_online/final.mdl \
/home/ubuntu/kaldi/egs/aspire/s5/exp/tdnn_7b_chain_online/graph_pp/HCLG.fst \
'ark:echo utterance-id1 utterance-id1|' \
'scp:echo utterance-id1 /home/ubuntu/kaldi/egs/aspire/s5/output.wav|' \
'ark:/dev/null'
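Because the paths in the command above depend on where Kaldi is installed, it can help to confirm that every file exists before decoding. The following is a minimal sketch, not part of Kaldi: the function check_files and the variables KALDI_ROOT and MODEL_DIR are hypothetical names, and /home/ubuntu/kaldi is only the location assumed in this post.

```shell
#!/bin/sh
# Hypothetical helper: print one "missing: <path>" line for each argument
# that is not an existing regular file; return non-zero if any is missing.
check_files() {
    rc=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed install location; override by setting KALDI_ROOT beforehand.
KALDI_ROOT="${KALDI_ROOT:-/home/ubuntu/kaldi}"
MODEL_DIR="$KALDI_ROOT/egs/aspire/s5/exp/tdnn_7b_chain_online"

# The same files that the decoding command refers to.
check_files \
    "$MODEL_DIR/conf/online.conf" \
    "$MODEL_DIR/final.mdl" \
    "$MODEL_DIR/graph_pp/words.txt" \
    "$MODEL_DIR/graph_pp/HCLG.fst" \
    "$KALDI_ROOT/egs/aspire/s5/output.wav" \
    || echo "Adjust the paths listed above before running the decoder."
```

Keeping the install location in one variable means that only KALDI_ROOT has to change when Kaldi lives somewhere else.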

The output contains several log statements and one line that starts with "utterance-id1"; the text that follows "utterance-id1" on that line is the transcription.
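To pull the transcription line out of the surrounding log statements, a small filter like the sketch below may help. The function name extract_transcript and the file name decode.log are hypothetical. The sketch also assumes that the decoder's messages go to standard error, which is why the usage example redirects it with 2>&1.

```shell
#!/bin/sh
# Hypothetical helper: read decoder output on stdin and print only the
# words that follow the given utterance id.
extract_transcript() {
    utt_id="$1"
    # Select lines whose first field equals the utterance id, blank out
    # that field, and trim the leftover leading and trailing spaces.
    awk -v id="$utt_id" '$1 == id {
        $1 = ""
        sub(/^ +/, "")
        sub(/ +$/, "")
        print
    }'
}

# Usage (decode.log is an assumed name for the saved decoder output):
#   ./online2-wav-nnet3-latgen-faster ... > decode.log 2>&1
#   extract_transcript utterance-id1 < decode.log
```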
