GAUSSIAN MIXTURE MODELS FOR ADAPTATION OF DEEP NEURAL NETWORK ACOUSTIC MODELS IN AUTOMATIC SPEECH RECOGNITION SYSTEMS
Abstract
Subject of Research. We study speaker adaptation of deep neural network (DNN) acoustic models in automatic speech recognition systems. The aim of speaker adaptation techniques is to improve the accuracy of a speech recognition system for a particular speaker. Method. A novel method for training and adapting DNN acoustic models has been developed. It is based on an auxiliary Gaussian mixture model (GMM) and GMM-derived (GMMD) features. The principal advantage of the proposed GMMD features is that a DNN can be adapted by adapting the auxiliary GMM. Any adaptation method for the auxiliary GMM can be used in the proposed approach; hence, it provides a universal way of transferring adaptation algorithms developed for GMMs to DNN adaptation. Main Results. The effectiveness of the proposed approach was shown using one of the most common adaptation algorithms for GMMs, maximum a posteriori (MAP) adaptation. Different ways of integrating the proposed approach into a state-of-the-art DNN architecture have been proposed and explored. An analysis of the choice of auxiliary GMM type is given. Experimental results on the TED-LIUM corpus demonstrate that, in unsupervised adaptation mode, the proposed technique provides approximately 11–18% relative word error rate (WER) reduction on different adaptation sets compared to the speaker-independent DNN system built on conventional features, and 3–6% relative WER reduction compared to the SAT-DNN trained on fMLLR-adapted features.
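The idea of adapting a DNN through its auxiliary GMM can be illustrated with a minimal sketch. The abstract does not specify how the GMMD features are computed, so the code below assumes they are the per-component log-likelihoods of a diagonal-covariance GMM, and it uses the standard relevance-factor form of MAP mean adaptation. The function names, the toy data, and the value of the relevance factor tau are illustrative assumptions, not the authors' implementation.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmmd_features(x, weights, means, variances):
    """One weighted log-likelihood per GMM component.

    Assumption: this stands in for the GMMD feature vector fed to the DNN.
    """
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def posteriors(x, weights, means, variances):
    """Component posteriors gamma_k(x), computed stably in the log domain."""
    logs = gmmd_features(x, weights, means, variances)
    peak = max(logs)
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, tau=10.0):
    """MAP update of the component means only.

    mu_k_new = (tau * mu_k + sum_t gamma_k(t) * x_t) / (tau + sum_t gamma_k(t))
    """
    n_comp, dim = len(means), len(means[0])
    occupancy = [0.0] * n_comp
    first_order = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        gamma = posteriors(x, weights, means, variances)
        for k in range(n_comp):
            occupancy[k] += gamma[k]
            for d in range(dim):
                first_order[k][d] += gamma[k] * x[d]
    return [
        [
            (tau * means[k][d] + first_order[k][d]) / (tau + occupancy[k])
            for d in range(dim)
        ]
        for k in range(n_comp)
    ]


# Toy speaker-independent GMM: two 1-D components (hypothetical values).
weights = [0.5, 0.5]
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]

# Toy adaptation data: the speaker's frames sit near 1.0 rather than 0.0.
speaker_frames = [[1.0]] * 20
adapted_means = map_adapt_means(speaker_frames, weights, means, variances)

# The DNN input changes only because the auxiliary GMM changed.
before = gmmd_features([1.0], weights, means, variances)
after = gmmd_features([1.0], weights, adapted_means, variances)
```

In this sketch the DNN itself is untouched: the speaker's frames are rescored with the adapted GMM, and the resulting feature vector is what the network would consume. The mean of the first component moves toward the speaker data, while the second component, which receives almost no occupancy, stays close to its prior.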