TY - GEN
T1 - The effect of amplitude envelope blending across frequency bands on the quality of noise-vocoded speech
AU - Ueda, Kazuo
AU - Araki, Tomoya
AU - Nakajima, Yoshitaka
PY - 2009
Y1 - 2009
AB - Ueda and Nakajima [Trans. Tech. Comm. Psychol. Physiol. Acoust., 38, 771-776, (2008); 39, 211-216, (2009)] found a consistent clustering of frequency bands common to different languages through factor analyses applied to power fluctuations of critical-band filtered speech sounds. One of the factors exhibited a characteristic two-peaked shape, which implied a correlation between a pair of distant frequency bands. The present study examined how amplitude envelope independence across frequency bands affected perception of Japanese noise-vocoded speech. The results indicated that the 20- and 4-band-synthesis conditions yielded nearly perfect performance without any systematic training or feedback, and that the conditions in which the lowest and the next-lowest frequency bands were blended, while keeping the long-term spectral energy distribution (sharpness) constant, yielded low mora accuracy. These results indicated that noise-vocoded speech synthesized with four frequency bands contained enough information for speech perception.
UR - http://www.scopus.com/inward/record.url?scp=84864717126&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84864717126&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84864717126
SN - 9781615676804
T3 - 8th European Conference on Noise Control 2009, EURONOISE 2009 - Proceedings of the Institute of Acoustics
BT - 8th European Conference on Noise Control 2009, EURONOISE 2009 - Proceedings of the Institute of Acoustics
T2 - 8th European Conference on Noise Control 2009, EURONOISE 2009
Y2 - 26 October 2009 through 28 October 2009
ER -