Posts

Free large voice datasets for machine learning

If you are working on speech-to-text, whether you are writing your own code or training models on existing speech recognition engines, these datasets could be helpful. All of them are large English datasets. Several smaller datasets, such as a digits dataset and a commands dataset, are deliberately omitted from this post because they would not be of much help.

VoxCeleb: over a million utterances.
The Spoken Wikipedia Corpora: hundreds of hours, over 16 GB in size.
TED-LIUM Release 3: over 450 hours of audio.
CommonVoice: over 1400 hours and 50 GB in size.
LibriSpeech: over 500 hours and 30 GB.

Kaldi - Speech to text using Aspire

Assuming you have already installed Kaldi, this post shows how to install the Aspire model. A list of available Kaldi models can be found here. If you don't have Kaldi installed yet, follow this post: How to install Kaldi-ASR on Ubuntu 18.

Navigate to egs/aspire/s5 and run:

    wget https://kaldi-asr.org/models/1/0001_aspire_chain_model_with_hclg.tar.bz2
    tar xfv 0001_aspire_chain_model_with_hclg.tar.bz2
    steps/online/nnet3/prepare_online_decoding.sh --mfcc-config conf/mfcc_hires.conf data/lang_chain exp/nnet3/extractor exp/chain/tdnn_7b exp/tdnn_7b_chain_online
    wget https://catalog.ldc.upenn.edu/desc/addenda/LDC93S1.wav
    . ./cmd.sh
    . ./path.sh
    ffmpeg -i LDC93S1.wav -acodec pcm_s16le -ac 1 -ar 8000 output.wav

Then navigate to src/online2bin. In the following command, the paths to words.txt, final.mdl, HCLG.fst, and test.wav may need to be adjusted for your installation.

    ./online2-wav-nnet3-latgen-faster \
      --online=false \
      --do-endpointing=false
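The download and extract steps only work if the archive name passed to tar matches the file that wget saved. Here is a minimal sketch, assuming a POSIX shell with wget and GNU tar available, that derives the archive name from the URL so the two cannot drift apart. The helper names archive_name and fetch_model are made up for this illustration and are not part of Kaldi.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Kaldi): keep the downloaded file name
# and the name passed to tar in sync by deriving both from the URL.

MODEL_URL="https://kaldi-asr.org/models/1/0001_aspire_chain_model_with_hclg.tar.bz2"

# Print the last path component of a URL. By default this is the file
# name wget saves to (assuming no file of that name already exists).
archive_name() {
    basename "$1"
}

# Download the archive and unpack it in the current directory.
# Meant to be run from egs/aspire/s5.
fetch_model() {
    url="$1"
    archive="$(archive_name "$url")"
    wget "$url" && tar xfv "$archive"
}

# Example (commented out so that sourcing this file has no side effects):
# fetch_model "$MODEL_URL"
```

GNU tar detects the compression format on extraction, so the same tar xfv call should handle both .tar.gz and .tar.bz2 archives.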

How to mount a EFS (Elastic File System) to a Linux Virtual Machine in AWS (Amazon Web Services) - Detailed Step by Step instructions

Amazon's EFS (Elastic File System) is a storage service that uses the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol. The Infrequent Access (IA) storage class is a lower-cost storage class designed for storing long-lived, infrequently accessed files cost-effectively. In this article, I will assume that you are familiar with creating an Ubuntu virtual machine. The instructions in this article should work without a problem on the version I used; on other versions they should work too, but I did not try them. Check the security group attached to your VM by navigating to the EC2 console and clicking Instances in the left navigation menu. Select your VM and check the Description tab in the bottom pane. Make a note of the security group name; in this article, I will refer to the security group name as 'secgrp'. Now navigate to the EFS console, click "Create Fi

EFS Automatic mounting on Ubuntu Linux

In an earlier post, How to mount a EFS (Elastic File System) to a Linux Virtual Machine in AWS (Amazon Web Services), I described how to mount an Amazon EFS in a Linux virtual machine. In this post, I describe how to mount the EFS automatically. There are several possible approaches; the following is one of them. I would love to hear about any alternatives as well. For this I used a cron job. Put the mount command in a shell script file; let's call it map.sh. Make sure the script has appropriate read and execute permissions. Then edit root's crontab:

    sudo crontab -e

In the crontab file, type the following:

    @reboot pathtoshfile/map.sh

That's about it. Reboot and give it a try.
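As a rough sketch of what map.sh could contain, assuming the EFS is mounted over NFSv4.1 with the standard Linux mount command and that the NFS client package (nfs-common on Ubuntu) is installed: the file system ID, region, and mount point below are placeholders, and the helper names efs_dns_name and mount_efs are made up for this example. The script skips the mount when the target is already a mount point.

```shell
#!/bin/sh
# map.sh: sketch of an EFS mount script for use with cron's @reboot.
# All values below are placeholders (assumptions); replace them with your own.
EFS_ID="fs-12345678"        # hypothetical file system ID
EFS_REGION="us-east-1"      # hypothetical region
MOUNT_POINT="/mnt/efs"      # hypothetical mount point

# Build the EFS DNS name: <file-system-id>.efs.<region>.amazonaws.com
efs_dns_name() {
    printf '%s.efs.%s.amazonaws.com\n' "$1" "$2"
}

# Mount the file system unless the mount point is already in use.
# Expected to run as root (for example from root's crontab).
mount_efs() {
    mkdir -p "$MOUNT_POINT"
    if mountpoint -q "$MOUNT_POINT"; then
        return 0
    fi
    mount -t nfs4 -o nfsvers=4.1,hard,timeo=600,retrans=2 \
        "$(efs_dns_name "$EFS_ID" "$EFS_REGION"):/" "$MOUNT_POINT"
}

# Only mount when asked to, so the file can be sourced without side effects.
if [ "${1:-}" = "--mount" ]; then
    mount_efs
fi
```

With this particular sketch, the crontab entry would need the extra argument, for example @reboot pathtoshfile/map.sh --mount. The --mount guard is only a design choice for this example; a script that calls the mount command directly works with the crontab line as written.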

How to install Kaldi-ASR on Ubuntu 18

Kaldi is a speech recognition toolkit. The official website is: http://kaldi-asr.org/ To learn more, please check out the official website. This blog post serves as a reference note for me and may also help some people out there. You can use the Windows Subsystem for Linux as well. These instructions are specifically for Ubuntu 18; for other versions, the apt-get package names may vary slightly, but the installation would be pretty much the same. Including the operating system, Kaldi could take about 12 to 15 GB of space, so make sure you have enough free space before proceeding.

    sudo apt update
    sudo add-apt-repository universe
    sudo add-apt-repository main
    sudo apt-get update
    sudo apt-get install build-essential
    sudo apt-get install libatlas-base-dev liblapack-dev libblas-dev
    sudo apt-get install ffmpeg sox
    mkdir kaldi
    cd kaldi
    git clone https://github.com/kaldi-asr/kaldi.git
    cd kaldi/tools
    extras/ch

How to mount a EFS (Elastic File System) to a Linux Virtual Machine in AWS (Amazon Web Services)

EFS (Elastic File System) is a service offered by Amazon in AWS. The good thing about EFS is that you pay only for the space you use; you don't have to pre-allocate storage and pay for unused space. However, the mounting instructions are a bit off, and I had to spend a little time figuring out how to mount an EFS in a Linux VM. EFS documentation can be found at: https://aws.amazon.com/efs/ Follow the instructions as provided. However, make sure that the security groups configured and used while creating the EFS have port 2049 (NFS) allowed as inbound to the VM. The source IPs can be 172.30.0.0/16, or you can be more specific by using the exact IP address shown in the EFS console page. This is a very simple blog entry, and it is straight to the po