Open database of scientific publications ITMO UNIVERSITY

ACCURACY INCREASE FOR AUTOMATIC VISUAL RUSSIAN SPEECH RECOGNITION: VISEME CLASSES OPTIMIZATION

Journal

Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Not specified

UDK

Issue: 2 (114)


Abstract

Which viseme classes give the most effective automatic lip-reading is still an actively studied question. The paper proposes a structured approach to building speaker-dependent viseme classes. The method produces a set of phoneme-viseme correspondence maps in which each map contains a different number of viseme classes, from two to forty-eight, while the number of phonemes stays constant. Viseme classes are defined by their mapping from phonemes, and the phonemes are converted into viseme groups during the speech recognition process. Using the obtained correspondence maps together with HAVRUS, a database of audio-visual Russian speech, the paper demonstrates how visual speech recognition accuracy depends on the number of viseme classes used. The use of high-speed video data made it possible to expand the optimal set of viseme classes to twenty, which improved recognition accuracy by 1.34% compared with the standard set of fourteen classes.
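To make the idea of a phoneme-viseme correspondence map concrete, the following is a minimal Python sketch, not the paper's implementation. The phoneme symbols, the class labels `V1` to `V4`, and the helper names `phonemes_to_visemes`, `merge_classes` and `num_classes` are all hypothetical and chosen only for illustration. The grouping shown is not taken from the paper or from HAVRUS. The sketch shows how a phoneme sequence is converted into viseme classes and how folding one class into another gives a coarser map over the same set of phonemes.

```python
from typing import Dict, List

# Hypothetical toy map: phoneme symbol -> viseme class label.
# The symbols and groupings are illustrative assumptions only.
PHONEME_TO_VISEME: Dict[str, str] = {
    "p": "V1", "b": "V1", "m": "V1",
    "f": "V2", "v": "V2",
    "a": "V3",
    "o": "V4", "u": "V4",
}


def phonemes_to_visemes(phonemes: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each phoneme with the viseme class it belongs to."""
    return [mapping[p] for p in phonemes]


def merge_classes(mapping: Dict[str, str], keep: str, drop: str) -> Dict[str, str]:
    """Return a coarser map in which class `drop` is folded into class `keep`.

    The set of phonemes is unchanged; only the number of classes shrinks.
    """
    return {ph: (keep if vis == drop else vis) for ph, vis in mapping.items()}


def num_classes(mapping: Dict[str, str]) -> int:
    """Count the distinct viseme classes used by a map."""
    return len(set(mapping.values()))


if __name__ == "__main__":
    print(phonemes_to_visemes(["m", "a", "m", "a"], PHONEME_TO_VISEME))
    coarser = merge_classes(PHONEME_TO_VISEME, keep="V1", drop="V2")
    print(num_classes(PHONEME_TO_VISEME), num_classes(coarser))
```

Applying `merge_classes` repeatedly would give a family of maps with decreasing class counts over a fixed phoneme inventory. Which classes to merge, and in what order, is the part the paper's method decides, and this sketch does not model it.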
