For example,Бобцов

AUTOMATIC SPEECH RECOGNITION – THE MAIN STAGES OVER LAST 50 YEARS

Annotation

The main stages of automatic speech recognition systems over last 50 years are regarded. The attempt is made to evaluate different methods in the context of approaching to functioning of biological systems. The method implementation based on dynamic programming algorithm and done in 1968 is considered as a benchmark. Shortcomings of the method, which make it possible to use it only for command recognition, are considered. The next method considered is based on a formalism of Markov chains. Based on the notion of coarticulation the necessity of applying context dependent triphones and biphones instead of context independent phonemes is shown. The problems of insufficiency of speech databases for triphone training which lead to state tying methods are explained. The importance of model adaptation and feature normalization methods providing better invariance to speakers, communication channels and additive noise are shown. Deep Neural Networks and Recurrent Networks are considered as the most up-to-date methods. The similarity of deep (multilayer) neural networks and biological systems is noted. In conclusion, the problems and drawbacks of the modern systems of automatic speech recognition are described and prognosis of their development is given.

Keywords

Articles in current issue