ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION
Abstract
This paper is an analytical review of recent advances in audio-visual (AV) fusion (integration) of multimodal information. We discuss the main challenges and the approaches proposed to address them. One of the most important tasks in AV integration is understanding how the modalities interact and influence each other; the paper addresses this problem in the context of AV speech processing and speech recognition. The first part of the review sets out the basic principles of AV speech recognition and classifies the audio and visual features of speech, with special attention to systematizing existing techniques and AV data fusion methods. The second part, based on our analysis of the research area, provides a consolidated list of tasks and applications that use AV fusion, together with the methods, techniques, and audio and video features they employ. We propose a classification of AV integration approaches and discuss their advantages and disadvantages. We draw conclusions and offer our assessment of the future of AV fusion. In further research we plan to implement a system for audio-visual recognition of continuous Russian speech using advanced multimodal fusion methods.