TY - GEN
T1 - Visual speech features representation for automatic lip-reading
AU - Sagheer, Alaa
AU - Tsuruta, Naoyuki
AU - Taniguchi, Rin Ichiro
AU - Maeda, Sakashi
PY - 2005
Y1 - 2005
N2 - A fundamental task in the field of pattern recognition is to find a suitable representation for a feature. In this paper, we present a new visual speech feature representation approach that combines the Hypercolumn Model (HCM) with a hidden Markov model (HMM) to build a complete lip-reading system. In this system, we use HCM to extract visual speech features from the input image. The extracted features are modeled by Gaussian distributions using the HMM. The proposed lip-reading system can work under varying lip positions and sizes. All images were captured in a natural environment without special lighting or lip markers. Experimental results compare favourably with those of two reported systems, one SOM-based and one DCT-based: HCM provides better performance than both.
AB - A fundamental task in the field of pattern recognition is to find a suitable representation for a feature. In this paper, we present a new visual speech feature representation approach that combines the Hypercolumn Model (HCM) with a hidden Markov model (HMM) to build a complete lip-reading system. In this system, we use HCM to extract visual speech features from the input image. The extracted features are modeled by Gaussian distributions using the HMM. The proposed lip-reading system can work under varying lip positions and sizes. All images were captured in a natural environment without special lighting or lip markers. Experimental results compare favourably with those of two reported systems, one SOM-based and one DCT-based: HCM provides better performance than both.
UR - http://www.scopus.com/inward/record.url?scp=33646795002&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33646795002&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2005.1415521
DO - 10.1109/ICASSP.2005.1415521
M3 - Conference contribution
AN - SCOPUS:33646795002
SN - 0780388747
SN - 9780780388741
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - II781
EP - II784
BT - 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
T2 - 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Y2 - 18 March 2005 through 23 March 2005
ER -