
ACCURACY INCREASE FOR AUTOMATIC VISUAL RUSSIAN SPEECH RECOGNITION: VISEME CLASSES OPTIMIZATION

Abstract

The choice of viseme classes that yields the most effective automatic lip-reading remains an active research question. This paper proposes a structured approach to building speaker-dependent viseme classes. The method produces a set of phoneme-to-viseme correspondence maps, each containing a different number of viseme classes, from two to forty-eight, while the number of phonemes stays constant. Viseme classes are defined by their mapping from phonemes, which are converted into viseme groups during the speech recognition process. Using the obtained correspondence maps together with HAVRUS, a database of audio-visual Russian speech, the paper demonstrates how visual speech recognition accuracy depends on the number of viseme classes used. The use of high-speed video data made it possible to expand the optimal set of viseme classes to twenty, which improved recognition accuracy by 1.34% compared with the standard set of fourteen classes.
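The idea of a phoneme-to-viseme correspondence map, and of a family of maps with different numbers of classes over the same phonemes, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the phoneme symbols, the class labels `V1` to `V4`, and the helper names `to_visemes`, `merge_classes` and `num_classes` are hypothetical and are not taken from the paper or from HAVRUS. In particular, the abstract does not say how classes are chosen for merging, so the merge below is picked by hand.

```python
from typing import Dict, List

# Hypothetical toy map: phoneme symbols and class labels are illustrative only,
# not the paper's actual phoneme set or viseme classes.
PHONEME_TO_VISEME: Dict[str, str] = {
    "p": "V1", "b": "V1", "m": "V1",
    "f": "V2", "v": "V2",
    "a": "V3",
    "o": "V4", "u": "V4",
}


def to_visemes(phonemes: List[str], mapping: Dict[str, str]) -> List[str]:
    """Convert a phoneme sequence into the corresponding viseme class sequence."""
    return [mapping[p] for p in phonemes]


def merge_classes(mapping: Dict[str, str], keep: str, absorb: str) -> Dict[str, str]:
    """Return a new map in which class `absorb` is folded into class `keep`.

    The set of phonemes is unchanged; only the number of classes drops by one.
    """
    return {p: (keep if v == absorb else v) for p, v in mapping.items()}


def num_classes(mapping: Dict[str, str]) -> int:
    """Count the distinct viseme classes used by a map."""
    return len(set(mapping.values()))


# A coarser map over the same phonemes: the two vowel classes are merged.
coarser_map = merge_classes(PHONEME_TO_VISEME, keep="V3", absorb="V4")
```

Applying `to_visemes` with `PHONEME_TO_VISEME` and then with `coarser_map` to the same phoneme sequence gives two viseme transcriptions of different granularity, which is the kind of comparison the abstract describes across maps of two to forty-eight classes.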

