Automatic Speech Recognition

Implementation of an LSTM-based automatic speech recognition system and DeepSpeech 2.

Project GitHub. Year: 2022

The project aimed to reproduce the DeepSpeech 2 automatic speech recognition model. In addition, the code provides an implementation of an LSTM with layer normalization. Models are provided for two languages: English (trained on LibriSpeech) and Russian (trained on Golos and Common Voice 11.0). The project supports language-model-based beam search and byte-pair encoding (BPE) for both languages.
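To illustrate the byte-pair encoding idea mentioned above, here is a minimal toy sketch in pure Python. It is not the project's tokenizer: the function names `learn_bpe`, `merge_pair`, and `encode` are hypothetical, the corpus is a made-up word list, and the tie-breaking rule is an arbitrary choice made only to keep the output deterministic. A real system would more likely rely on an existing tokenizer library.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every non-overlapping occurrence of `pair`, scanning left to right."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each distinct word starts as a tuple of characters, weighted by frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties go to the lexicographically largest
        # pair (an assumption made here only for determinism).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Apply the new merge rule to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Split a word into subword units by applying the merges in learned order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(merges)
    print(encode("lowly", merges))
```

With subword units like these, an acoustic model can predict a vocabulary larger than single characters while still being able to spell out words it has not seen during training.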

Originally developed as a homework assignment for the HSE DLA course.