Conference Travel Grants for Taiwanese Students Studying Abroad

Tuesday, April 27, 2010

Matrix Updates for Perceptron Training of Continuous Density Hidden Markov Models

Presenter: 程芝潔 (Ph.D. student, Department of Computer Science, University of California, San Diego)
 
http://www.cs.mcgill.ca/~icml2009/index.html
 
In this paper, we study a simple, fast, mistake-driven learning method for discriminative training of continuous density hidden Markov models (CD-HMMs). Most HMMs used in automatic speech recognition model their emissions with Gaussian densities, typically parameterized by a mean and a covariance matrix. For discriminative training, we reparameterize these Gaussian distributions so that a single positive semidefinite matrix encodes all of the required parameters. We describe how to update this parameter matrix in a mistake-driven fashion so as to minimize the error rate, and we experiment with several forms of the update, comparing the effects of different matrix factorizations, initializations, and averaging schemes on the error rate. We evaluate this training framework on an automatic speech recognition corpus; the results show that the mistake-driven learning method yields significant and rapid reductions in recognition error rates.
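The reparameterization described above can be sketched numerically. For a Gaussian with mean μ and precision matrix Λ = Σ⁻¹, the Mahalanobis distance (x−μ)ᵀΛ(x−μ) equals zᵀΦz, where z = [x; 1] is an augmented feature vector and Φ is a single positive semidefinite matrix built from μ and Λ. The following minimal numpy check illustrates this identity; note the paper's full parameterization also folds the Gaussian log-normalizer into the matrix, which this sketch omits:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
mu = rng.normal(size=d)            # Gaussian mean
A = rng.normal(size=(d, d))
Lam = A @ A.T + d * np.eye(d)      # precision matrix (inverse covariance), SPD

# One PSD matrix jointly encoding mean and covariance statistics:
# Phi = B' Lam B with B = [I | -mu], i.e.
# Phi = [[Lam, -Lam mu], [-mu' Lam, mu' Lam mu]]
B = np.hstack([np.eye(d), -mu[:, None]])
Phi = B.T @ Lam @ B

x = rng.normal(size=d)
z = np.append(x, 1.0)              # augmented feature vector [x; 1]

mahalanobis = (x - mu) @ Lam @ (x - mu)
print(np.allclose(z @ Phi @ z, mahalanobis))    # True: same distance
print(np.linalg.eigvalsh(Phi).min() >= -1e-9)   # True: Phi is PSD
```

Because Φ = BᵀΛB with Λ positive definite, Φ is automatically positive semidefinite, which is what makes the matrix-valued parameter space of the paper well defined.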
 
In this paper, we investigate a simple, mistake-driven learning algorithm for discriminative training of continuous density hidden Markov models (CD-HMMs). Most CD-HMMs for automatic speech recognition use multivariate Gaussian emission densities (or mixtures thereof) parameterized in terms of their means and covariance matrices. For discriminative training of CD-HMMs, we reparameterize these Gaussian distributions in terms of positive semidefinite matrices that jointly encode their mean and covariance statistics. We show how to explore the resulting parameter space in CD-HMMs with perceptron-style updates that minimize the distance between Viterbi decodings and target transcriptions. We experiment with several forms of updates, systematically comparing the effects of different matrix factorizations, initializations, and averaging schemes on phone accuracies and convergence rates. We present experimental results for context-independent CD-HMMs trained in this way on the TIMIT speech corpus. Our results show that certain types of perceptron training yield consistently significant and rapid reductions in phone error rates.
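The perceptron-style updates can be illustrated with a toy sketch. This is not the paper's exact update rule: the step size `eta`, the rank-one outer-product form, and the eigenvalue-clipping projection are illustrative assumptions. It only shows the general mistake-driven pattern: after a decoding error, lower the acoustic distance zᵀΦz on a reference frame, raise it on the frame aligned by the incorrect Viterbi decoding, and keep the parameter matrix positive semidefinite:

```python
import numpy as np

def psd_project(M):
    """Project a symmetric matrix back onto the PSD cone by clipping
    negative eigenvalues to zero (one simple way to stay in the cone)."""
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T

def perceptron_step(Phi, z_ref, z_hyp, eta=0.05):
    """One mistake-driven update on a single misrecognized frame:
    decrease the distance z' Phi z for the reference frame z_ref,
    increase it for the frame z_hyp from the incorrect decoding."""
    Phi = Phi - eta * (np.outer(z_ref, z_ref) - np.outer(z_hyp, z_hyp))
    return psd_project(Phi)

rng = np.random.default_rng(1)
d = 4
Phi = np.eye(d)                                  # toy initial parameter matrix
z_ref = np.append(rng.normal(size=d - 1), 1.0)   # augmented reference frame
z_hyp = np.append(rng.normal(size=d - 1), 1.0)   # augmented competing frame
Phi_new = perceptron_step(Phi, z_ref, z_hyp)

# the update leaves the matrix symmetric positive semidefinite
print(np.allclose(Phi_new, Phi_new.T))
print(np.linalg.eigvalsh(Phi_new).min() >= -1e-9)
```

In the paper the mistakes come from comparing full Viterbi decodings against target transcriptions, the updates are averaged over the training run, and several factorizations of the parameter matrix (e.g. updating a square-root factor rather than projecting) are compared; the sketch above only captures the basic update-and-project pattern.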