INTERSPEECH - RankMe

776 papers

Year	Title / Authors
2004	"liveness" verification in audio-video authentication. Michael Wagner, Girija Chetty
2004	3d lip-tracking for audio-visual speech recognition in real applications. Petr Císar, Zdenek Krnoul, Milos Zelezný
2004	8th International Conference on Spoken Language Processing, INTERSPEECH-ICSLP 2004, Jeju Island, Korea, October 4-8, 2004
2004	A Japanese dialogue-based CALL system with mispronunciation and grammar error detection. Oh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino
2004	A Korean grapheme-to-phoneme conversion system using selection procedure for exceptions. Sunhee Kim, Ju-Eun Ahn, Soon-Hyob Kim, Yang-Hee Lee
2004	A PLSA-based language model for conversational telephone speech. David Mrva, Philip C. Woodland
2004	A cepstral domain maximum likelihood beamformer for speech recognition. Dominik Raub, John W. McDonough, Matthias Wölfel
2004	A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers. Jong-Pyo Lee, Tae-Yeoub Jang
2004	A comparison of confirmation styles for error handling in a speech dialog system. Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg
2004	A comparison of normalization and training approaches for ASR-dependent speaker identification. Alex Park, Timothy J. Hazen
2004	A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss. Alastair Bruce James, Ben P. Milner, Angel Manuel Gomez
2004	A comparison of simultaneous 3-channel blind source separation to selective separation on channel pairs using 2-channel BSS. Erik M. Visser, Kwokleung Chan, Stanley Kim, Te-Won Lee
2004	A comparison of soft and hard spectral subtraction for speaker verification. Michael T. Padilla, Thomas F. Quatieri
2004	A comparison of statistical methods and features for the prediction of prosodic structures. Qin Shi, Volker Fischer
2004	A comparison of the perturbation analysis between PRAAT and Computerized Speech Lab. Jong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah
2004	A compensation method for word-familiarity difference with SNR control in intelligibility test. Shuichi Sakamoto, Yôiti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka
2004	A concurrent curve strategy for formant tracking. Yves Laprie
2004	A conversational dialogue system for cognitively overloaded users. Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao
2004	A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai. Kimiko Tsukada
2004	A cross-linguistic study of diphthongs in spoken word processing in Japanese and English. Kiyoko Yoneyama
2004	A database design for a TTS synthesis system using lexical diphones. Tanya Lambert, Andrew P. Breen
2004	A discriminative locally weighted distance measure for speaker independent template based speech recognition. Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools
2004	A distributed speech recognition system in multi-user environments. Kyu Jeong Han, Shrikanth S. Narayanan, Naveen Srinivasamurthy
2004	A dynamic vocabulary spoken dialogue interface. Stephanie Seneff, Chao Wang, I. Lee Hetherington, Grace Chung
2004	A factorial HMM approach to robust isolated digit recognition in background music. Mark Hasegawa-Johnson, Ameya N. Deoras
2004	A family-of-models approach to HMM-based segmentation for unit selection speech synthesis. John Kominek, Alan W. Black
2004	A first experience on multilingual acoustic modeling of the languages spoken in Morocco. José B. Mariño, Asunción Moreno, Albino Nogueiras
2004	A first step towards text-independent voice conversion. Hermann Ney, David Sündermann, Antonio Bonafonte, Harald Höge
2004	A forensic phonetic investigation into the duration and speech rate. Kyunghwa Kim
2004	A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterances. Michael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont
2004	A formant tracking LP model for speech processing. Qin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos
2004	A frame level boosting training scheme for acoustic modeling. Rong Zhang, Alexander I. Rudnicky
2004	A framework for dialogue data collection with a simulated ASR channel. Matthew N. Stuttle, Jason D. Williams, Steve J. Young
2004	A general approach to TTS reading of mixed-language texts. Leonardo Badino, Claudia Barolo, Silvia Quazza
2004	A genetic algorithm for unit selection based speech synthesis. Rohit Kumar
2004	A grammar-based Chinese to English speech translation system for portable devices. Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu
2004	A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition. Jan Stadermann, Gerhard Rigoll
2004	A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech. Peng Yu, Frank Torsten Bernd Seide
2004	A maximum entropy shallow functional parser for spoken language understanding. David Horowitz, Partha Lal, Pierce Gerard Buckley
2004	A memory efficient grapheme-to-phoneme conversion system for speech processing. Jun Huang, Lex Olorenshaw, Gustavo Hernández Ábrego, Lei Duan
2004	A method for glottal formant frequency estimation. Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro
2004	A minimum mean squared error estimator for single channel speaker separation. Aarthi M. Reddy, Bhiksha Raj
2004	A multi-layer conversation management approach for information seeking applications. Shimei Pan
2004	A multi-modal dialog system for a mobile robot. Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink
2004	A multimodal communication aid for global aphasia patients. Jakob Schou Pedersen, Paul Dalsgaard, Børge Lindberg
2004	A new acoustic measure for aspiration noise detection. Carlos Toshinori Ishi
2004	A new approach to channel robust speaker verification via constrained stochastic feature transformation. Man-Wai Mak, Kwok-Kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung
2004	A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering. Shang-nien Tsai, Lin-Shan Lee
2004	A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation. Bo Xu, Jianhua Tao, Yongguo Kang
2004	A new nonlinear feature extraction algorithm for speaker verification. Mohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faúndez-Zanuy
2004	A new prosodic phrasing model for Indian language Telugu. Nemala Sridhar Krishna, Hema A. Murthy
2004	A new score normalization method for speaker verification with virtual impostor model. Woo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan
2004	A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR. Muhammad Ghulam, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta
2004	A novel method for two-speaker segmentation. Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan
2004	A novel target-driven generalized JMAP adaptation algorithm. Zhaobing Han, Shuwu Zhang, Bo Xu
2004	A novel voice conversion system based on codebook mapping with phoneme-tied weighting. Zixiang Wang, Ren-Hua Wang, Zhiwei Shuang, Zhen-Hua Ling
2004	A packet loss concealment method using recursive linear prediction. Kazuhiro Kondo, Kiyoshi Nakagawa
2004	A piecewise interpolation method based on log-least square error criterion for HRTF. Jie Zhang, Zhenyang Wu
2004	A proposal to quantitatively select the right intonation unit in data-driven intonation modeling. David Escudero Mancebo, Valentín Cardeñoso-Payo
2004	A prosodic phrasing model for a Korean text-to-speech synthesis system. Kyuchul Yoon
2004	A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech. Li Deng, Yu Dong, Alex Acero
2004	A robust glottal source model estimation technique. Qiang Fu, Peter Murphy
2004	A robust training algorithm based on neighborhood information. Wing-Hei Au, Man-Hung Siu
2004	A robust understanding model for spoken dialogues. Junyan Chen, Ji Wu, Zuoying Wang
2004	A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensor. Cenk Demiroglu, David V. Anderson
2004	A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots. Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino
2004	A state model for the realization of visual perceptive feedback in SmartKom. Peter Poller, Norbert Reithinger
2004	A statistical discrimination measure for hidden Markov models based on divergence. Jorge F. Silva, Shrikanth S. Narayanan
2004	A statistical lexicon for non-native speech recognition. Rainer Gruhn, Konstantin Markov, Satoshi Nakamura
2004	A study of minimum classification error training for segmental switching linear Gaussian hidden Markov models. Jian Wu, Donglai Zhu, Qiang Huo
2004	A study of tone classification for continuous Thai speech recognition. Tan Li, Montri Karnjanadecha, Thanate Khaorapapong
2004	A study on automatic detection of Japanese vowel devoicing for speech synthesis. Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang
2004	A study on model-based equal error rate estimation for automatic speaker verification. Hsiao-Chuan Wang, Jyh-Min Cheng
2004	A study on nasal coda loss in continuous speech. Qiang Fang
2004	A style control technique for HMM-based speech synthesis. Takashi Masuko, Takao Kobayashi, Keisuke Miyanaga
2004	A theoretical analysis of speech recognition based on feature trajectory models. Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri
2004	A trainable prosodic model: learning the contours implementing communicative functions within a superpositional model of intonation. Gérard Bailly, Bleicke Holm, Véronique Aubergé
2004	A two phase Arabic language model for speech recognition and other language applications. Mohsen A. Rashwan
2004	A two-level schema for detecting recognition errors. Zhengyu Zhou, Helen M. Meng
2004	A two-phase pitch marking method for TD-PSOLA synthesis. Cheng-Yuan Lin, Jyh-Shing Roger Jang
2004	A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects". Ren-yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu
2004	A universal speech interface for appliances. Thomas K. Harris, Roni Rosenfeld
2004	A vector-based method for efficiently representing multivariate environmental information. Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa
2004	A voice conversion method based on joint pitch and spectral envelope transformation. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
2004	A Wizard of Oz framework for collecting spoken human-computer dialogs. Rohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt
2004	ACCDIST: a metric for comparing speakers' accents. Mark A. Huckvale
2004	ASR on speech reconstructed from short-time Fourier phase spectra. Leigh David Alsteris, Kuldip K. Paliwal
2004	AVICAR: audio-visual speech corpus in a car environment. Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas S. Huang
2004	Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement. Hugo Van hamme, Patrick Wambacq, Veronique Stouten
2004	Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect. Shigeyoshi Kitazawa, Shinya Kiriyama
2004	Acoustic correlates of phrase-internal lexical boundaries in Dutch. Taehong Cho, Elizabeth K. Johnson
2004	Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task. Shinji Watanabe
2004	Acoustic model adaptation for coded speech using synthetic speech. Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge
2004	Acoustic phonetic modeling using local codebook features. Frank Diehl, Asunción Moreno
2004	Acoustic-to-articulatory inversion mapping with Gaussian mixture model. Tomoki Toda, Alan W. Black, Keiichi Tokuda
2004	Active perception: using a priori knowledge from clean speech models to ignore non-target features. Bert Cranen, Johan de Veth
2004	Adaptation for soft whisper recognition using a throat microphone. Szu-Chen Stan Jou, Tanja Schultz, Alex Waibel
2004	Adaptation in the pronunciation space for non-native speech recognition. Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth
2004	Adaptation of front end parameters in a speech recognizer. Karthik Visweswariah, Ramesh A. Gopinath
2004	Adaptive beamforming combined with particle filtering for acoustic source localization. Reinhold Haeb-Umbach, Sven Peschke, Ernst Warsitz
2004	Adaptive classifier cascade for multimodal speaker identification. Engin Erzin, Yucel Yemez, A. Murat Tekalp
2004	Adaptive cross-channel interference cancellation on blind signal separation outputs using source absence/presence detection and spectral subtraction. Gil-Jin Jang, Changkyu Choi, Yongbeom Lee, Yung-Hwan Oh
2004	Adaptive long-term predictive analysis of disordered speech. Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen
2004	Adult and infant sensitivity to phonotactic features in spoken Japanese. Sachiyo Kajikawa, Laurel Fais, Shigeaki Amano, Janet F. Werker
2004	Alignment of human prosodic patterns for spoken dialogue systems. Noriko Suzuki, Yasuhiro Katagiri
2004	An acoustic shock limiting algorithm using time and frequency domain speech features. Tina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Robert L. Brennan
2004	An acoustic study of emotions expressed in speech. Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan, Carlos Busso
2004	An acoustic study of speech rhythm in Taiwan English. Hua-Li Jian
2004	An acoustic-analytic role for the deviation between the scansion and reading of poems. Key-Seop Kim, Un Lim, Dong-Il Shin
2004	An adaptive MEL-LPC analysis for speech recognition. Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada
2004	An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments. Kun-Ching Wang
2004	An adaptive Kalman filter for the enhancement of speech signals. Marcel Gabrea
2004	An analysis of packet loss models for distributed speech recognition. Ben P. Milner, Alastair Bruce James
2004	An efficient codebook design in SDCHMM for mobile communication environments. Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh
2004	An efficient partial matching algorithm toward speech retrieval by speech. Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee
2004	An efficient repair procedure for quick transcriptions. Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde
2004	An energy normalization scheme for improved robustness in speech recognition. Seyed Mohammad Ahadi, Hamid Sheikhzadeh, Robert L. Brennan, George H. Freeman
2004	An evaluation of a spoken document retrieval baseline system in Finnish. Mikko Kurimo, Ville T. Turunen, Inger Ekman
2004	An experimental method for measuring transfer functions of acoustic tubes. Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto
2004	An implement of speech DB gathering system using VoiceXML. Dong-Hyun Kim, Yong-Wan Roh, Kwang-Seok Hong
2004	An improved pair-wise variability index for comparing the timing characteristics of speech. Hua-Li Jian
2004	An improved preprocessor for the automatic transcription of broadcast news audio stream. Jindrich Zdánský, Petr David, Jan Nouza
2004	An information extraction approach for spoken language understanding. Jihyun Eun, Changki Lee, Gary Geunbae Lee
2004	An interactive English pronunciation dictionary for Korean learners. Chao Wang, Mitchell Peabody, Stephanie Seneff, Jong-mi Kim
2004	An intonation model for embedded devices based on natural F0 samples. Gerasimos Xydas, Georgios Kouroupetroglou
2004	An online audio indexing system. Jitendra Ajmera, Iain McCowan, Hervé Bourlard
2004	An understanding strategy based on plausibility score in recognition history using CSR confidence measure. Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi
2004	Analysis of F0 contours of Cantonese utterances based on the command-response model. Wentao Gu, Keikichi Hirose, Hiroya Fujisaki
2004	Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice. Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi
2004	Analysis of emotional speech in voice mail messages: the influence of speakers' gender. Noël Chateau, Valérie Maffiolo, Christophe Blouin
2004	Analysis of hypernasality by synthesis. P. Vijayalakshmi, M. Ramasubba Reddy
2004	Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpus. Hiroshi Fujimura, Katsunobu Itou, Kazuya Takeda, Fumitada Itakura
2004	Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models. Makoto Shozakai, Goshu Nagino
2004	Analysis of the phone level contributions to objective evaluation of English speech by non-natives. Yasuo Suzuki, Yoshinori Sagisaka, Katsuhiko Shirai, Makiko Muto
2004	Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filtering. Hannu Pulakka, Paavo Alku, Svante Granqvist, Stellan Hertegard, Hans Larsson, Anne-Maria Laukkanen, Per-Ake Lindestad, Erkki Vilkman
2004	Analysis on disappearing and thriving of speech applications for ergonomic design guidelines and recommendations. Rinzou Ebukuro
2004	Application of long-term filtering to formant estimation. Hong You
2004	Application of voice conversion to hearing-impaired Mandarin speech enhancement. Chen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang
2004	Apply n-best list re-ranking to acoustic model combinations of boosting training. Rong Zhang, Alexander I. Rudnicky
2004	Applying pitch connection control in Mandarin speech synthesis. Yi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen
2004	Applying the Aurora feature extraction schemes to a phoneme based recognition task. Hans-Günter Hirsch, Harald Finster
2004	Approach to interchange-format based Chinese generation. Wenjie Cao, Chengqing Zong, Bo Xu
2004	Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study. Emi Zuiki Murano, Mihoko Teshigawara
2004	Articulatory feature recognition using dynamic Bayesian networks. Joe Frankel, Mirjam Wester, Simon King
2004	Articulatory feature-based conditional pronunciation modeling for speaker verification. Ka-Yee Leung, Man-Wai Mak, Sun-Yuan Kung
2004	Aspects of named entity processing. Michael Levit, Allen L. Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth
2004	Aspects of speaking-face data corpus design methodology. J. Bruce Millar, Michael Wagner, Roland Goecke
2004	Assessment of non-native phones in anglicisms by German listeners. Julia Abresch, Stefan Breuer
2004	Audio source separation from the mixture using empirical mode decomposition with independent subspace analysis. Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu
2004	Audio watermarking in sub-band signals using multiple echo kernels. In-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, Rémy Prost
2004	Audio-visual speaker localization for car navigation systems. Xianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno
2004	Audio-visual spoken language processing. Jinyoung Kim, Jeesun Kim, Chris Davis
2004	Audiovisual perceptual evaluation of resynthesised speech movements. Matthias Odisio, Gérard Bailly
2004	Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition. Antoine Raux
2004	Automatic adaptation of the MOMEL F0 stylisation algorithm to new corpora. Salma Mouline, Olivier Boëffard, Paul C. Bagshaw
2004	Automatic detection of contrast for speech understanding. Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang
2004	Automatic detection of dialog acts based on multilevel information. Sophie Rosset, Lori Lamel
2004	Automatic detection of vocal fold paralysis and edema. Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras
2004	Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers. Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo
2004	Automatic extraction of phonetically rich sentences from large text corpus of Indian languages. Karunesh Arora, Sunita Arora, Kapil Verma, Shyam Sunder Agrawal
2004	Automatic language identification using discrete hidden Markov model. Kakeung Wong, Man-Hung Siu
2004	Automatic lips reading for audio-visual speech processing and recognition. Josef Chaloupka
2004	Automatic network optimization of voice applications. Juan M. Huerta, Chaitanya Ekanadham
2004	Automatic phonetic base form generation based on maximum context tree. Changxue Ma
2004	Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals. Attila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee
2004	Automatic prosody labeling of read Norwegian. Per Olav Heggtveit, Jon Emil Natvig
2004	Automatic pruning of unit selection speech databases for synthesis without loss of naturalness. Rohit Kumar, S. Prahallad Kishore
2004	Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approach. Larry P. Heck, Mark Z. Mao
2004	Automatic transcription of continuous speech using unsupervised and incremental training. L. Sarada Ghadiyaram, Hemalatha Nagarajan, Nagarajan Thangavelu, Hema A. Murthy
2004	Automatic transformation of lecture transcription into document style using statistical framework. Tatsuya Kawahara, Kazuya Shitaoka, Hiroaki Nanjo
2004	Beginning of utterance detection algorithm for low complexity ASR engines. Tommi Lahti
2004	Belief-based nonlinear rescoring in Thai speech understanding. Chai Wutiwiwatchai, Sadaoki Furui
2004	Best speaker-based structure tree for speaker verification. Chakib Tadj, Christian S. Gargour, Nabil Badri
2004	Biomechanical parameter fingerprint in the mucosal wave power spectral density. Juan Ignacio Godino-Llorente, María Victoria Rodellar Biarge, Pedro Gómez-Vilda, Francisco Díaz Pérez, Agustín Álvarez-Marquina, Rafael Martínez-Olalla
2004	Blind separation of speech and sub-Gaussian signals in underdetermined case. Sang-Gyun Kim, Chang D. Yoo
2004	BonnTempo-corpus and BonnTempo-tools: a database for the study of speech rhythm and rate. Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner
2004	Bootstrapping phonetic lexicons for new languages. Sameer Maskey, Alan W. Black, Laura Tomokiya
2004	CIAIR in-car speech database. Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura
2004	Canonicalization of feature parameters for automatic speech recognition. Takashi Fukuda, Tsuneo Nitta
2004	Channel frequency response correction for speaker recognition. Stanley J. Wenndt, Richard M. Floyd
2004	Characterizing and classifying cued speech vowels from labial parameters. Denis Beautemps, Thomas Burger, Laurent Girin
2004	Characterizing task-oriented dialog using a simulated ASR channel. Jason D. Williams, Steve J. Young
2004	Children's emotion recognition in an intelligent tutoring scenario. Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang
2004	Chinese prosody phrase break prediction based on maximum entropy model. Jianfeng Li, Guoping Hu, Ren-Hua Wang
2004	Chinese text word-segmentation considering semantic links among sentences. Leonardo Badino
2004	Classification of pathological voice including severely noisy cases. Cheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung-Soon Kim, Tao Li
2004	Classifying emotion in Chinese speech by decomposing prosodic features. Dan-Ning Jiang, Lian-Hong Cai
2004	Clause types and filled pauses in Japanese spontaneous monologues. Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu
2004	Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification. Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen
2004	Clustering similar nouns for selecting related news articles. Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi
2004	Coarticulatory variability and directionality in [s, ..]: an EPG study. Mitsuhiro Nakamura
2004	Combination of speech features using smoothed heteroscedastic linear discriminant analysis. Lukás Burget
2004	Combination of standard and throat microphones for robust speech recognition in highly noisy environments. Martin Graciarena, Federico Cesari, Horacio Franco, Gregory K. Myers, Cregg Cowan, Victor Abrash
2004	Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling. Zhaobing Han, Shuwu Zhang, Bo Xu
2004	Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation. Grace Chung, Chao Wang, Stephanie Seneff, Edward Filisko, Min Tang
2004	Communicative competence and adaptation in a spoken dialogue system. Kristiina Jokinen
2004	Compact acoustic model for embedded implementation. Junho Park, Hanseok Ko
2004	Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative function. Frédéric Berthommier
2004	Comparing intonation of two varieties of French using normalized F0 values. Svetlana Kaminskaia, François Poiré
2004	Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition. Panu Somervuo
2004	Comparison of several speaker verification procedures based on GMM. Vlasta Radová, Ales Padrta
2004	Comparison of transmitter-based packet-loss recovery techniques for voice transmission. Moo Young Kim, W. Bastiaan Kleijn
2004	Complex emotion recognition system for a specific user using SOM based on prosodic features. Atsushi Iwai, Yoshikazu Yano, Shigeru Okuma
2004	Complex spectrum circle centroid for microphone-array-based noisy speech recognition. Shigeki Sagayama, Okajima Takashi, Yutaka Kamamoto, Takuya Nishimoto
2004	Compression of speech database by feature separation and pattern clustering using STRAIGHT. Zhen-Hua Ling, Yu Hu, Zhiwei Shuang, Ren-Hua Wang
2004	Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation. Kunitoshi Motoki, Hiroki Matsuzaki
2004	Conditional maximum likelihood estimation for improving annotation performance of n-gram models incorporating stochastic finite state grammars. Vaibhava Goel
2004	Confirmation strategy for document retrieval systems with spoken dialog interface. Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani
2004	Constrained minimization technique for topic identification using discriminative training and support vector machines. Imed Zitouni, Minkyu Lee, Hui Jiang
2004	Construct a multi-lingual speech corpus in Taiwan with extracting phonetically balanced articles. Min-Siong Liang, Dau-Cheng Lyu, Yuang-Chin Chiang, Ren-yuan Lyu
2004	Constructing emotional speech synthesizers with limited speech database. Heiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth S. Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda
2004	Context based emotion detection from text input. Jianhua Tao
2004	Context dependent "long units" for speech recognition. Denis Jouvet, Ronaldo O. Messina
2004	Context dependent phoneme duration modeling with tree-based state tying. Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee
2004	Context dependent statistical augmentation of Persian transcripts. Panayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani Mehr
2004	Contextual revision in information seeking conversation systems. Keith Houck
2004	Continuous speech recognition using joint features derived from the modified group delay function and MFCC. Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde
2004	Convolutional networks for speech detection. Somsak Sukittanon, Arun C. Surendran, John C. Platt, Christopher J. C. Burges
2004	Coping with disfluencies in spontaneous speech recognition. Frederik Stouten, Jean-Pierre Martens
2004	Correcting Korean vowel speech recognition errors with limited lip features. Ki-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee
2004	Correlation between VOT and F0 in the perception of Korean stops and affricates. Midam Kim
2004	Cost-sensitive call classification. Gökhan Tür
2004	Cough detection in spoken dialogue system for home health care. Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta
2004	Creating speech recognition grammars from regular expressions for alphanumeric concepts. Ye-Yi Wang, Yun-Cheng Ju
2004	Cross domain dialogue modelling: an object-based approach. Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear
2004	Cross-lingual phoneme mapping for multilingual synthesis systems. Marko Moberg, Kimmo Pärssinen, Juha Iso-Sipilä
2004	Crosscorrelation-based multispeaker speech activity detection. Kornel Laskowski, Qin Jin, Tanja Schultz
2004	DFW-based spectral smoothing for concatenative speech synthesis. Hartmut R. Pfitzinger
2004	DOA estimation of speech signals using semi-blind source separation techniques. Ilyas Potamitis, Panagiotis Zervas, Nikos Fakotakis
2004	DORIS, a multiagent/IP platform for multimodal dialogue applications. Johann L'Hour, Olivier Boëffard, Jacques Siroux, Laurent Miclet, Francis Charpentier, Thierry Moudenc
2004	DWT-based classification of acoustic-phonetic classes and phonetic units. Gernot Kubin, Tuan Van Pham
2004	Data driven multidialectal phone set for Spanish dialects. Mónica Caballero, Asunción Moreno, Albino Nogueiras
2004	Data driven number-of-states selection in HMM topologies. Dirk Knoblauch
2004	Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems. Zhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen
2004	Data-driven approaches for automatic detection of syllable boundaries. Jilei Tian
2004	Decision-tree backing-off in HMM-based speech synthesis. Shunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura
2004	Decomposing linguistic and affective components of phonatory quality. Ailbhe Ní Chasaide, Christer Gobl
2004	Default phrasing and attachment preference in Korean. Sun-Ah Jun
2004	Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case. Kazuyuki Takagi, Kazuhiko Ozeki
2004	Dependency structure analysis and sentence boundary detection in spontaneous Japanese. Tatsuya Kawahara, Kiyotaka Uchimoto, Hitoshi Isahara, Kazuya Shitaoka
2004	Dereverberation of speech signals based on linear prediction. Marc Delcroix, Takafumi Hikichi, Masato Miyoshi
2004	Design and construction of Korean-spoken English corpus. Seok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang
2004	Design of compact acoustic models through clustering of tied-covariance Gaussians. Mark Z. Mao, Vincent Vanhoucke
2004	Design of ready-made acoustic model library by two-dimensional visualization of acoustic space. Goshu Nagino, Makoto Shozakai
2004	Design strategies for a virtual language tutor. Jonas Beskow, Olov Engwall, Björn Granström, Preben Wik
2004	Detecting user engagement in everyday conversations. Chen Yu, Paul M. Aoki, Allison Woodruff
2004	Detection of vowel onset points in continuous speech using autoassociative neural network models. Suryakanth V. Gangashetty, Chellu Chandra Sekhar, B. Yegnanarayana
2004	Deterministic annealing EM algorithm in parameter estimation for acoustic model. Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
2004	Development of the knowledge-based spoken English evaluation system and its application. Seok-Chae Rhee, Jeon G. Park
2004	Developmental changes in voiced-segment ratio for Japanese infants and parents. Shigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo
2004	Dialect analysis and modeling for automatic classification. John H. L. Hansen, Umit H. Yapanel, Rongqing Huang, Ayako Ikeno
2004	Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction. Gustavo Hernández Ábrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf
2004	Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition. Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
2004	Discriminative combination of multiple linear predictions for speech recognition. Zhijian Ou, Zuoying Wang
2004	Discriminative training of compound-word based multinomial classifiers for speech routing. Xiang Li, Juan M. Huerta
2004	Discriminative training of naive Bayes classifiers for natural language call routing. Hui Jiang, Pengfei Liu, Imed Zitouni
2004	Discriminative training with tied covariance matrices. Wolfgang Macherey, Ralf Schlüter, Hermann Ney
2004	Distributed speaker recognition using earth mover's distance. Yoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren
2004	Distributed speaker recognition. Veena Desai, Hema A. Murthy
2004	Domain adaptation methods in the IBM trainable text-to-speech system. Volker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann
2004	Duration modeling for Hindi text-to-speech synthesis system. Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan
2004	Duration modeling techniques for continuous speech recognition. Janne Pylkkönen, Mikko Kurimo
2004	Dynamic beam pruning strategy using adaptive control. Dongbin Zhang, Limin Du
2004	Dynamic language modeling for broadcast news. Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda
2004	Dynamic time windows for multimodal input fusion. Anurag Kumar Gupta, Tasos Anastasakos
2004	EVITA-RAD: an extensible enterprise voice porTAl - rapid application development tool. Yu Chen
2004	Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of English. Valérie Hazan, Anke Sennema, Andrew Faulkner
2004	Effect of speaking rate on the acceptability of change in segment duration. Hiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto
2004	Effect of voice prosody on the decision making process in human-computer interaction. Yohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi
2004	Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition. Jing Zheng, Horacio Franco, Andreas Stolcke
2004	Effects of language modeling on speech-driven question answering. Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba
2004	Effects of phonetic contexts on the duration of phonetic segments in fluent read speech. Sorin Dusan
2004	Effects of prosodic boundaries on ambiguous syntactic clause boundaries in Japanese. Shari R. Speer, Soyoung Kang
2004	Efficient compression method for pronunciation dictionaries. Jilei Tian
2004	Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition. Etienne Marcheret, Stephen M. Chu, Vaibhava Goel, Gerasimos Potamianos
2004	Efficient online cohort selection method for speaker verification. Tomi Kinnunen, Evgeny Karpov, Pasi Fränti
2004	Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications. David Malah, Slava Shechtman
2004	Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features. Jinsong Zhang, Satoshi Nakamura, Keikichi Hirose
2004	Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser. Stephen So, Kuldip K. Paliwal
2004	Eigen-prosody analysis for robust speaker recognition under mismatch handset environment. Zi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang
2004	Elements of interactivity in telephone conversations. Florian Hammer, Peter Reichl, Alexander Raake
2004	Emotion recognition based on phoneme classes. Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan
2004	Emotion verification for emotion detection and unknown emotion rejection. Hoon-Young Cho, Kaisheng Yao, Te-Won Lee
2004	Enhancement of reverberant speech using excitation source information. M. Chaitanya, S. R. Mahadeva Prasanna, B. Yegnanarayana
2004	Enhancing existing form-based dialogue managers with reasoning capabilities. Dirk Bühler
2004	Entropy based combination of tandem representations for noise robust ASR. Shajith Ikbal, Hemant Misra, Sunil Sivadas, Hynek Hermansky, Hervé Bourlard
2004	Environmental robust features for speech detection. Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros
2004	Error-weighted discriminative training for HMM parameter estimation. Daniel Willett
2004	Estimating detailed spectral envelopes using articulatory clustering. Yoshinori Shiga, Simon King
2004	Estimating speaking rate in spontaneous speech from z-scores of pattern durations. Kazuyuki Ashimura, Hideki Kashioka, Nick Campbell
2004	Estimating syntactic structure from prosodic features in Japanese speech. Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa
2004	Estimation of semantic confidences on lattice hierarchies. Robert Lieb, Tibor Fábián, Günther Ruske, Matthias Thomae
2004	Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks. Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim
2004	Etiology of user experience with natural language speech. Christopher J. Pavlovski, Jennifer C. Lai, Stella Mitchell
2004	European initiatives to promote cooperation between speech and text communities. Nicoletta Calzolari
2004	Evaluating cognitive load in spoken language interfaces using a dual-task paradigm. Ellen Campana, Michael K. Tanenhaus, James F. Allen, Roger W. Remington
2004	Evaluating system metaphors via the speech output of a smart home system. Sebastian Möller, Jan Felix Krebber, Paula M. T. Smeele
2004	Evaluation of a prosodic labeling system utilizing linguistic information. Shinya Kiriyama, Shigeyoshi Kitazawa
2004	Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speech. Keiichi Takamaru
2004	Evaluation of an inverse filtering technique using physical modeling of voice production. Paavo Alku, Matti Airas, Brad H. Story
2004	Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis. Panagiotis Zervas, Nikos Fakotakis, George K. Kokkinakis, Georgios Kouroupetroglou, Gerasimos Xydas
2004	Evaluation of the difference between the driving behavior of a speech based and a speech-visual based task of an in-car compute. Zhan Fu, Lay Ling Pow, Fang Chen
2004	Evaluation of the speech output of a smart-home system in a car environment. Paula M. T. Smeele, Sebastian Möller, Jan Felix Krebber
2004	Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database. Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui
2004	Evaluation of universal compensation on Aurora 2 and 3 and beyond. Ji Ming, Baochun Hou
2004	Evolutionary optimization of an adaptive prosody model. Oliver Jokisch, Michael Hofmann
2004	Evolutive speaker segmentation using a repository system. Xavier Anguera Miró, Javier Hernando Pericas
2004	Example-based spoken dialogue system with online example augmentation. Hiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Yasuyoshi Inagaki
2004	Example-based training of dialogue planning incorporating user and situation models. Ian Richard Lane, Tatsuya Kawahara, Shinichi Ueno
2004	Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system. Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara
2004	Explicit duration modeling for Cantonese connected-digit recognition. Yu Zhu, Tan Lee
2004	Exploiting models intrinsic robustness for noisy speech recognition. Christophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina
2004	Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective. Folkert de Vriend, Giulio Maltese
2004	Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion. Guo-Hong Ding, Bo Xu
2004	F0 and formant frequency distribution of dysarthric speech - a comparative study. Hiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi
2004	Fast GMM-based voice conversion for text-to-speech synthesis systems. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
2004	Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format. Peder A. Olsen, Karthik Visweswariah
2004	Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition. Takaaki Hori, Chiori Hori, Yasuhiro Minami
2004	Fast parameter estimation for joint maximum entropy language models. Edward James Schofield
2004	Fast semi-automatic semantic annotation for spoken dialog systems. Ruhi Sarikaya, Yuqing Gao, Paola Virga
2004	Fast speech adaptation in linear spectral domain for additive and convolutional noise. Dongsuk Yook, Donghyun Kim
2004	Feature-based pronunciation modeling with trainable asynchrony probabilities. Karen Livescu, James R. Glass
2004	Feature-dependent compensation in speech recognition. Ivan Brito, Néstor Becerra Yoma, Carlos Molina
2004	Fiction database for emotion detection in abnormal situations. Ioana Vasilescu, Laurence Devillers, Chloé Clavel, Thibaut Ehrette
2004	Finite-state-based and phrase-based statistical machine translation. Josep Maria Crego, José B. Mariño, Adrià de Gispert
2004	Flexible dialogue management using distributed and dynamic dialogue control. Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk
2004	Florence: a dialogue manager framework for spoken dialogue systems. Giuseppe Di Fabbrizio, Charles Lewis
2004	Flow representation through the glottis having a polygonal boundary shape. Yosuke Tanabe, Tokihiko Kaburagi
2004	Foreign-accented speaker-independent speech recognition. Stefanie Aalburg, Harald Höge
2004	Formulating contextual tonal variations in Mandarin. Jinfu Ni, Hisashi Kawai, Keikichi Hirose
2004	Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems. Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky, Jahanzeb Sherwani
2004	Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian). Rob van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes
2004	Frequency warped ARMA analysis of the closed and the open phase of voiced speech. Pedro J. Quintana-Morales, Juan L. Navarro-Mesa
2004	Friendly speech analysis and perception in standard Chinese. Aijun Li, Haibo Wang
2004	From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition. Andrew Cameron Morris, Viktoria Maier, Phil D. Green
2004	From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication process. Jacqueline Vaissière
2004	From decoding-driven to detection-based paradigms for automatic speech recognition. Chin-Hui Lee
2004	From real-time MRI to 3d tongue movements. Olov Engwall
2004	From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system. Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, David Gelbart, Nikki Mirghafori, Tuomo W. Pirinen
2004	Fujisaki model based F0 contours in Vietnamese TTS. Dung Tien Nguyen, Chi Mai Luong, Bang Kim Vu, Hansjörg Mixdorff, Huy Hoang Ngo
2004	Functions of intonation boundaries during spoken language comprehension in English. Allison Blodgett
2004	Fuzzy logic decision fusion in a multimodal biometric system. Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu Sang Moon, Yeung Yam
2004	Generating gestures from speech. Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo
2004	Grapheme-to-phoneme conversion for Chinese text-to-speech. Jun Xu, Guohong Fu, Haizhou Li
2004	Graphical model approach to pitch tracking. Xiao Li, Jonathan Malkin, Jeff A. Bilmes
2004	HLT modules scalability within the NESPOLE! project. Hervé Blanchon
2004	HMM-based feature compensation method: an evaluation using the AURORA2. Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano
2004	Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtraction. Masanori Tsujikawa, Ken-ichi Iso
2004	Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window. Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham Zolfaghari
2004	Hidden factor dynamic Bayesian networks for speech recognition. Filip Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina
2004	Hidden semi-Markov model based speech synthesis. Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
2004	Higgins - a spoken dialogue system for investigating error handling techniques. Jens Edlund, Gabriel Skantze, Rolf Carlson
2004	High quality text-to-pinyin conversion using two-phase unknown word prediction. Juhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim
2004	High-level feature weighted GMM network for audio stream classification. Rongqing Huang, John H. L. Hansen
2004	Highband spectrum envelope estimation of telephone speech using hard/soft-classification. Yasheng Qian, Peter Kabal
2004	Histogram normalisation and the recognition of names and ontology words in the MUMIS project. Eric Sanders, Febe de Wet
2004	Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivity. Peter Reichl, Florian Hammer
2004	How does the integration of speech recognition controls and spatialized auditory displays affect user workload? Ellen C. Haas
2004	How sparse can we make the auditory representation of speech? Christian Feldbauer, Gernot Kubin
2004	How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for French. Nicole Beringer
2004	Human language acquisition methods in a machine learning task. Nicole Beringer
2004	Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers. Youngkyu Cho, Sung-a Kim, Dongsuk Yook
2004	Hybrid named entity recognition for question-answering system. Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang
2004	Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence. Minho Jin, Gyucheol Jang, Sungrack Yun, Chang Dong Yoo
2004	ICA-based feature extraction for phoneme recognition. Oh-Wook Kwon, Te-Won Lee
2004	Identifying emotion in speech prosody using acoustical cues of harmony. Takashi X. Fujisawa, Norman D. Cook
2004	Identifying local corrections in human-computer dialogue. Gina-Anne Levow
2004	Implementation of an intonational quality assessment system for a handheld device. Kisun You, Hoyoun Kim, Wonyong Sung
2004	Implementation of dialog applications in an open-source voiceXML platform. Fernando Fernández Martínez, Valentín Sama, Luis Fernando D'Haro, Rubén San Segundo, Ricardo de Córdoba, Juan Manuel Montero
2004	Improved differential phase spectrum processing for formant tracking. Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro
2004	Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation. Yasunari Obuchi
2004	Improved iterative Wiener filtering for non-stationary noise speech enhancement. T. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy
2004	Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification system. Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki
2004	Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation. Jeff Siu-Kei Au-Yeung, Man-Hung Siu
2004	Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains. Shang-nien Tsai
2004	Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition. Gwo-hwa Ju, Lin-Shan Lee
2004	Improved spoken language translation using n-best speech recognition hypotheses. Ruiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai Kit Lo
2004	Improved voice activity detection combining noise reduction and subband divergence measures. Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio
2004	Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis. Keikichi Hirose
2004	Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition. Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami
2004	Improvement of confidence measure performance using background model set algorithm. Byoung-Don Kim, Jin Young Kim, Seung Ho Choi, Young-Bum Lee, Kyoung-Rok Lee
2004	Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation. Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi
2004	Improving eigenspace-based MLLR adaptation by kernel PCA. Brian Kan-Wing Mak, Roger Wend-Huu Hsiao
2004	Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction. Gabriel Webster
2004	Improving performance of text-independent speaker identification by utilizing contextual principal curves filtering. Yong Guan, Wenju Liu, Hongwei Qi, Jue Wang
2004	Improving the topic indexation and segmentation modules of a media watch system. Rui Amaral, Isabel Trancoso
2004	In search of a universal phonetic alphabet - theory and application of an organic visible speech. Hyun-Bok Lee
2004	In-phase feature induction: an effective compensation technique for robust speech recognition. Siu Wa Lee, Pak-Chung Ching
2004	In-vehicle based speech processing for hearing impaired subjects. Xianxian Zhang, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz
2004	Including dynamic and phonetic information in voice conversion systems. Antonio Bonafonte, Alexander Kain, Jan P. H. van Santen, Helenca Duxans
2004	Including uncertainty of speech observations in robust speech recognition. José C. Segura, Ángel de la Torre, Javier Ramírez, Antonio J. Rubio, M. Carmen Benítez
2004	Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach. Takatoshi Jitsuhiro, Satoshi Nakamura
2004	Indonesian speech recognition for hearing and speaking impaired people. Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol
2004	Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowels. Frantz Clermont, Thomas John Millhouse
2004	Influence of temporal discretization schemes on formant frequencies and bandwidths in time domain simulations of the vocal tract system. Peter Birkholz, Dietmar Jackèl
2004	Inner product based-multiband vector quantization for wideband speech coding at 16 kbps. Seung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang
2004	Integrating layer concept information into n-gram modeling for spoken language understanding. Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai
2004	Integration of articulatory dynamic parameters in HMM/BN based speech recognition system. Konstantin Markov, Satoshi Nakamura, Jianwu Dang
2004	Integration of n-best recognition results obtained by multiple noise reduction algorithms. Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki
2004	Integration patterns during multimodal interaction. Anurag Kumar Gupta, Tasos Anastasakos
2004	Intelligibility of degraded speech from smeared STRAIGHT spectrum. Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin
2004	Interface for barge-in free spoken dialogue system using adaptive sound field control. Tatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano
2004	Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI. Taejin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson
2004	Intonation modeling for Indian languages. Krothapalli Sreenivasa Rao, Bayya Yegnanarayana
2004	Intonation recognition for Indonesian speech based on Fujisaki model. Nazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul
2004	Investigating automatic recognition of non-native children's speech. Matteo Gerosa, Diego Giuliani
2004	Investigating speech style specific pronunciation variation in large spoken language corpora. Christophe Van Bael, Henk van den Heuvel, Helmer Strik
2004	Issues in meeting transcription - the ISL meeting transcription system. Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen
2004	Issues in the development of auditory-visual speech perception: adults, infants, and children. Kaoru Sekiyama, Denis Burnham
2004	Jacobian adaptation with improved noise reference for speaker verification. Jan Anguita, Javier Hernando, Alberto Abad
2004	Joint extraction and prediction of Fujisaki's intonation model parameters. Pablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte
2004	Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task. Masahiko Matsushita, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro
2004	Keyword spotting for highly inflectional languages. Lubos Smídl, Ludek Müller
2004	Korean prosody generation and artificial neural networks. Kyung-Joong Min, Un-Cheon Lim
2004	LP-TRAP: linear predictive temporal patterns. Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis
2004	Language detection by neural discrimination. Celestin Sedogbo, Sébastien Herry, Bruno Gas, Jean-Luc Zarader
2004	Language identification techniques based on full recognition in an air traffic control task. Ricardo de Córdoba, Javier Ferreiros, Valentín Sama, Javier Macías Guarasa, Luis Fernando D'Haro, Fernando Fernández Martínez
2004	Language model adaptation based on PLSA of topics and speakers. Yuya Akita, Tatsuya Kawahara
2004	Language recognition using phone lattices. Jean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk
2004	Language specific phonetic rules: evidence from domain-initial strengthening. Sung-A. Kim
2004	Large vocabulary continuous speech recognition based on cross-morpheme phonetic information. In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon
2004	Large vocabulary continuous speech recognition for Estonian using morpheme classes. Tanel Alumäe
2004	Latent semantic analysis for speaker recognition. A. Nayeemulla Khan, Bayya Yegnanarayana
2004	Learning dialogue policies using state aggregation in reinforcement learning. Matthias Denecke, Kohji Dohsaka, Mikio Nakano
2004	Learning for transliteration of Arabic-numeral expressions using decision tree for Korean TTS. Youngim Jung, Donghun Lee, HyeonSook Nam, Ae-sun Yoon, Hyuk-Chul Kwon
2004	Learning long-term temporal features in LVCSR using neural networks. Barry Y. Chen, Qifeng Zhu, Nelson Morgan
2004	Learning nonnegative features of spectro-temporal sounds for classification. Yong-Choon Cho, Seungjin Choi
2004	Learning subject drift for topic tracking. Fumiyo Fukumoto, Yoshimi Suzuki
2004	Letter-to-sound for small-footprint multilingual TTS engine. Gui-Lin Chen, Ke-Song Han
2004	Lexical representation of non-native phonemes. Mirjam Broersma, K. Marieke Kolkman
2004	Long term modeling of phase trajectories within the speech sinusoidal model framework. Laurent Girin, Mohammad Firouzmand, Sylvain Marchand
2004	Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet. Hisako Asano, Hideharu Nakajima, Hideyuki Mizuno, Masahiro Oku
2004	MAP prediction of pitch from MFCC vectors for speech reconstruction. Xu Shao, Ben P. Milner
2004	METRIC-SEQDAC: a hybrid approach for audio segmentation. Hsin-Min Wang, Shih-Sian Cheng
2004	MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition. Benjamin J. Shannon, Kuldip K. Paliwal
2004	MICot: a tool for multimodal input data collection. Raymond H. Lee, Anurag Kumar Gupta
2004	MLLR adaptation for hidden semi-Markov model based speech synthesis. Junichi Yamagishi, Takashi Masuko, Takao Kobayashi
2004	MS connect: a fully featured auto-attendant: system design, implementation and performance. David Ollason, Yun-Cheng Ju, Siddharth Bhatia, Daniel Herron, Jackie Liu
2004	Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions. Antoine Raux, Rita Singh
2004	Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition. Hyung Bae Jeon, Dong Kook Kim
2004	Maximum entropy direct model as a unified model for acoustic modeling in speech recognition. Hong-Kwang Jeff Kuo, Yuqing Gao
2004	Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants. Toshiko Isei-Jaakkola
2004	Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs. Haibin Liu, Zhenyang Wu
2004	Measuring convergence in language model estimation using relative entropy. Abhinav Sethy, Shrikanth S. Narayanan, Bhuvana Ramabhadran
2004	Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks. Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo
2004	Memory and computation reduction for embedded ASR systems. Sangbae Jeong, Icksang Han, Eugene Jon, Jeongsu Kim
2004	Memory efficient decoding graph compilation with wide cross-word acoustic context. Miroslav Novak, Vladimír Bergl
2004	Methods for task adaptation of acoustic models with limited transcribed in-domain data. Enrico Bocchieri, Michael Riley, Murat Saraclar
2004	Minimum phase compensation in speech coding using Hammerstein model. Jari Juhani Turunen, Juha T. Tanttu, Frank Cameron
2004	Mining customer care dialogs for "daily news". Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert M. Bell, Mazin G. Rahim, Deborah F. Swayne, Chris Volinsky
2004	Mining of association patterns for language modeling. Jen-Tzung Chien, Hung-Ying Chen
2004	Mis-recognized utterance detection using hierarchical language model. Hirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka
2004	Mixture Gaussian model training against impostor model parameters: an application to speaker identification. T. V. Sreenivas, Sameer Badaskar
2004	Mixture language models for call routing. Qiang Huang, Stephen J. Cox
2004	Model composition by Lagrange polynomial approximation for robust speech recognition in noisy environment. Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama
2004	Model quality evaluation during enrolment for speaker verification. Javier R. Saeta, Javier Hernando
2004	Model-based sequential organization for cochannel speaker identification. Yang Shao, DeLiang Wang
2004	Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models. Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole
2004	Modeling audio-visual speech perception: back on fusion architectures and fusion control. Jean-Luc Schwartz, Marie-Agnès Cathiard
2004	Modeling auxiliary features in tandem systems. Mathew Magimai-Doss, Shajith Ikbal, Todd A. Stephenson, Hervé Bourlard
2004	Modeling data entry rates for ASR and alternative input methods. Roger K. Moore
2004	Modeling generic dialog applications for embedded systems. Gerhard Hanrieder, Stefan W. Hamerich
2004	Modeling phones coarticulation effects in a neural network based speech recognition system. Leila Ansary, Seyyed Ali Seyyed Salehi
2004	Modeling pronunciation variation using artificial neural networks for English spontaneous speech. Ken Chen, Mark Hasegawa-Johnson
2004	Modelling and ranking of differences across formants of British, Australian and American accents. Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho
2004	Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speech. Juan L. Navarro-Mesa, Pedro J. Quintana-Morales
2004	Morphology-based language modeling for Arabic speech recognition. Dimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke
2004	Multi-codebook vector quantization algorithm for speaker identification. Mohamed Fathy Abu-ElYazeed, Nemat S. Abdel Kader, Mohammed El-Henawy
2004	Multi-context rules for phonological processing in polyglot TTS synthesis. Harald Romsdorfer, Beat Pfister
2004	Multi-eigenspace normalization for robust speech recognition in noisy environments. Yoonjae Lee, Hanseok Ko
2004	Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying. Xiangyu Mu, Shuwu Zhang, Bo Xu
2004	Multi-mode harmonic transform excitation LPC coding for speech and music. Jong-Hark Kim, Jae-Hyun Shin, Insung Lee
2004	Multi-pass ASR using vocabulary expansion. Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi
2004	Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka
2004	Multi-sample fusion with constrained feature transformation for robust speaker verification. Ming-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung
2004	Multilayer subword units for open-vocabulary spoken document retrieval. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh
2004	Multilingual corpora for speech-to-speech translation research. Gen-ichiro Kikui, Toshiyuki Takezawa, Seiichi Yamamoto
2004	Multilingual e-mail text processing for speech synthesis. Daniela Oria, Akos Vetek
2004	Multimodal expression for humanoid robots by integration of human speech mimicking and facial color. Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino
2004	Mutual information based visual feature selection for lipreading. Patricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu
2004	Mutual-information based segment pre-selection in concatenative text-to-speech. Wei Zhang, Ling Jin, Xijun Ma
2004	N-gram language modeling of Japanese using bunsetsu boundaries. Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu
2004	Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition. T. V. Sreenivas, G. V. Kiran, A. G. Krishna
2004	Neural network language models for conversational speech recognition. Holger Schwenk, Jean-Luc Gauvain
2004	Neurocognition of speech-specific audiovisual perception. Mikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev
2004	New background modeling for speaker verification. Dat Tran
2004	New challenges in usability evaluation - beyond task-oriented spoken dialogue systems. Laila Dybkjær, Niels Ole Bernsen, Wolfgang Minker
2004	New features based on multiple word graphs for utterance verification. Alberto Sanchís, Alfons Juan, Enrique Vidal
2004	New harmonicity measures for pitch estimation and voice activity detection. An-Tze Yu, Hsiao-Chuan Wang
2004	New nonsense syllables database - analyses and preliminary ASR experiments. Petr Fousek, Frantisek Grézl, Hynek Hermansky, Petr Svojanovsky
2004	Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping. Xuechuan Wang, Douglas D. O'Shaughnessy
2004	Noise adaptive spoken dialog system based on selection of multiple dialog strategies. Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino
2004	Noise reduction using hybrid noise estimation technique and post-filtering. Junfeng Li, Masato Akagi
2004	Noise robust digit recognition using a glottal radar sensor for voicing detection. Cenk Demiroglu, David V. Anderson
2004	Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs. Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano
2004	Noise-robust speaker verification using F0 features. Koji Iwano, Taichi Asami, Sadaoki Furui
2004	Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone. Panikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
2004	Number of output nodes of artificial neural networks for Korean prosody generation. Kyung-Joong Min, Chan-Goo Kang, Un-Cheon Lim
2004	Objective wavelet packet features for speaker verification. Mihalis Siafarikas, Todor Ganchev, Nikos Fakotakis
2004	Of the top of the head: audio-visual speech perception from the nose up. Chris Davis, Jeesun Kim
2004	On a n-gram model approach for packet loss concealment. Minkyu Lee, Imed Zitouni, Qiru Zhou
2004	On binary and ratio time-frequency masks for robust speech recognition. Soundararajan Srinivasan, Nicoleta Roman, DeLiang Wang
2004	On latent semantic language modeling and smoothing. Jen-Tzung Chien, Meng-Sung Wu, Hua-Jui Peng
2004	On the development of telephone applications: some practical issues and evaluation. Andrea Facco, Daniele Falavigna, Roberto Gretter, Marcello Viganò
2004	On the integration of speech recognition into personal networks. Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg
2004	On the time variability of vocal tract for speaker recognition. Samuel Kim, Thomas Eriksson, Hong-Goo Kang
2004	On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input. Federico Flego, Luca Armani, Maurizio Omologo
2004	On using MLP features in LVCSR. Qifeng Zhu, Barry Y. Chen, Nelson Morgan, Andreas Stolcke
2004	On-line incremental adaptation based on reinforcement learning for robust speech recognition. Masafumi Nishida, Yoshitaka Mamiya, Yasuo Horiuchi, Akira Ichikawa
2004	Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm. Tor André Myrvoll, Satoshi Nakamura
2004	Optimal acoustic and language model weights for minimizing word verification errors. Frank K. Soong, Wai Kit Lo, Satoshi Nakamura
2004	Optimizing an engine network that allows dynamic masking. Frédéric Tendeau
2004	Optimizing boosting with discriminative criteria. Rong Zhang, Alexander I. Rudnicky
2004	Optimizing regression for in-car speech recognition using multiple distributed microphones. Weifeng Li, Fumitada Itakura, Kazuya Takeda
2004	Orientel-turkish: telephone speech database description and notes on the experience. Tolga Çiloglu, Dinc Acar, Ahmet Tokatli
2004	PROSPECT features and their application to missing data techniques for robust speech recognition. Hugo Van hamme
2004	Parallel feature generation based on maximizing normalized acoustic likelihood. Xiang Li, Richard M. Stern
2004	Parallel tone score association method for tone language speech recognition. William S.-Y. Wang, Gang Peng
2004	Partially lexicalized parsing model utilizing rich features. So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim
2004	Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversation. Nick Campbell
2004	Perception of non-native phonemes in noise. Nicole Cooper, Anne Cutler
2004	Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS. Francesc Alías, Xavier Llorà, Ignasi Iriondo Sanz, Joan Claudi Socoró, Xavier Sevillano, Lluís Formiga
2004	Perceptual discrimination of prosodic types and their preliminary acoustic analysis. Masahiko Komatsu, Tsutomu Sugawara, Takayuki Arai
2004	Perceptual wavelet packet audio coder. Teddy Surya Gunawan, Eliathamby Ambikairajah, Julien Epps
2004	Performance analysis of transcoding algorithms in packet-loss environments. Sung-Kyo Jung, Hong-Goo Kang, Dae Hee Youn, Chang-Heon Lee
2004	Performance improvement of connected digit recognition using unsupervised fast speaker adaptation. Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim
2004	Performance of speech recognition and synthesis in packet-based networks. Sebastian Möller, Jan Felix Krebber, Alexander Raake
2004	Phase-space representation of speech. Hua Yu
2004	Phone classification in pseudo-euclidean vector spaces. Alexander Gutkin, Simon King
2004	Phoneme restoration in degraded speech communication. Slobodan Jovicic, Sandra Antesevic, Zoran Saric
2004	Phoneme-based word activation in spoken-word recognition: evidence from Japanese school children. Takashi Otake, Yoko Sakamoto, Yasuyuki Konomi
2004	Phonemic repertoire and similarity within the vocabulary. Anne Cutler, Dennis Norris, Núria Sebastián-Gallés
2004	Phonetic confusion based document expansion for spoken document retrieval. Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora
2004	Phonetic realization of the suffix-suppressed accentual phrase in Korean. Mira Oh, Kee-Ho Kim
2004	Phonology of exceptions for Korean grapheme-to-phoneme conversion. Sunhee Kim
2004	Phonotactics vs. phonetic cues in native and non-native listening: dutch and Korean listeners' perception of dutch and English. Taehong Cho, James M. McQueen
2004	Phoxsy: multi-phone segments for unit selection speech synthesis. Stefan Breuer, Julia Abresch
2004	Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition. Vlasios Doumpiotis, William Byrne
2004	Poetry assistant. Isabel Trancoso, Paulo Araújo, Céu Viana, Nuno J. Mamede
2004	Policy analysis framework for conversational biometrics. Upendra V. Chaudhari, Ganesh N. Ramaswamy
2004	Polynomial regression model for duration prediction in Mandarin. Yu Hu, Ren-Hua Wang, Lu Sun
2004	Positional and phonotactic effects on the realization of taiwan Mandarin tone 2. Hui-Ju Hsu, Janice Fon
2004	Posteriori probabilities and likelihoods combination for speech and speaker recognition. Mohamed Faouzi BenZeghiba, Hervé Bourlard
2004	Practical use of English pronunciation system for Japanese students in the CALL classroom. Tatsuya Kawahara, Masatake Dantsuji, Yasushi Tsubota
2004	Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French. Marion Dohen, Hélène Loevenbruck
2004	Precise phone boundary detection using wavelet packet and recurrent neural networks. Farshad Almasganj
2004	Predicting word correct rate from acoustic and linguistic confusability. Gies Bouwman, Bert Cranen, Lou Boves
2004	Prediction of the glottal LF parameters using regression trees. Michelle Tooher, John G. McKenna
2004	Probabilistic speaker identification with dual penalized logistic regression machine. Tomoko Matsui, Kunio Tanabe
2004	Procedure "senza vibrato": a key component for morphing singing. Hideki Kawahara, Yumi Hirachi, Masanori Morise, Hideki Banno
2004	Prolongation in spontaneous Mandarin. Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund
2004	Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure. Nobuaki Minematsu
2004	Pronunciation assessment based upon the phonological distortions observed in language learners' utterances. Nobuaki Minematsu
2004	Pronunciation lexicon adaptation for TTS voice building. Yeon-Jun Kim, Ann K. Syrdal, Alistair Conkie
2004	Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition. Kyong-Nim Lee, Minhwa Chung
2004	Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis. Enrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino
2004	Prosodic characteristics of czech contrastive topic. Katerina Vesela, Nino Peterek, Eva Hajicová
2004	Prosody based attitude recognition with feature selection and its application to spoken dialog system as para-linguistic information. Shinya Fujie, Tetsunori Kobayashi, Daizo Yagi, Hideaki Kikuchi
2004	Question-answering in webtalk: an evaluation study. Junlan Feng, Srinivas Bangalore, Mazin G. Rahim
2004	Rapid EM training based on model-integration. Shinichi Yoshizawa, Kiyohiro Shikano
2004	Rapid acoustic model development using Gaussian mixture clustering and language adaptation. Nikos Chatzichrisafis, Vassilios Digalakis, Vassilios Diakoloukas, Costas Harizakis
2004	Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments. Juan M. Huerta, Etienne Marcheret, Sreeram Balakrishnan
2004	Real-time speaker identification. Pasi Fränti, Evgeny Karpov, Tomi Kinnunen
2004	Recent improvements on ARTIC: czech text-to-speech system. Jindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl
2004	Recent progress of open-source LVCSR engine julius and Japanese model repository. Tatsuya Kawahara, Akinobu Lee, Kazuya Takeda, Katsunobu Itou, Kiyohiro Shikano
2004	Recognition of read and spontaneous children's speech using two new corpora. Martin J. Russell, Shona D'Arcy, Lit Ping Wong
2004	Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robot. Naoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa
2004	Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system. Wael Hamza, Ellen Eide, Raimo Bakis
2004	Reconstruction filter design for bone-conducted speech. Toshiki Tamiya, Tetsuya Shimamura
2004	Reference marking in children's computer-directed speech: an integrated analysis of discourse and gestures. Simona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth S. Narayanan
2004	Restructuring HMM states for speaker adaptation in Mandarin speech recognition. Xianghua Xu, Qiang Guo, Jie Zhu
2004	Revisiting dysarthria assessment intelligibility metrics. Phil D. Green, James Carmichael
2004	Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context. Panji Setiawan, Sorel Stan, Tim Fingscheidt
2004	Rhythm in read british English: interdialect variability. Emmanuel Ferragne, François Pellegrino
2004	Robot motion control using listener's back-channels and head gesture information. Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi
2004	Robust ASR model adaptation by feature-based statistical data mapping. Xuechuan Wang, Douglas D. O'Shaughnessy
2004	Robust and adaptive architecture for multilingual spoken dialogue systems. Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen
2004	Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environments. Zili Li, Hesham Tolba, Douglas D. O'Shaughnessy
2004	Robust dependency parsing of spontaneous Japanese speech and its evaluation. Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki
2004	Robust distant speech recognition based on position dependent CMN. Norihide Kitaoka, Longbiao Wang, Seiichi Nakagawa
2004	Robust speaker identification based on perceptual log area ratio and Gaussian mixture models. David Chow, Waleed H. Abdulla
2004	Robust speech recognition based on HMM composition and modified wiener filter. Sumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa
2004	Robust speech recognition in client-server scenarios. Richard C. Rose, Hong Kook Kim
2004	Robust speech recognition over packet networks: an overview. Naveen Srinivasamurthy, Kyu Jeong Han, Shrikanth S. Narayanan
2004	Robust speech recognition using data-driven temporal filters based on independent component analysis. Junhui Zhao, Jingming Kuang, Xiang Xie
2004	Robust speech recognition with spectral subtraction in low SNR. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
2004	Robust verification of recognized words in noise. Wai Kit Lo, Frank K. Soong, Satoshi Nakamura
2004	Robustness aspects of active learning for acoustic modeling. Gerard G. L. Meyer, Teresa M. Kamm
2004	Role of segmental and suprasegmental cues in the perception of maghrebian-accented French. Belynda Brahimi, Philippe Boula de Mareüil, Cédric Gendrot
2004	SVM kernel adaptation in speaker classification and verification. Purdy Ho, Pedro J. Moreno
2004	SVM modeling of "SNERF-grams" for speaker recognition. Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin S. Kajarekar
2004	Scalable distributed speech recognition using multi-frame GMM-based block quantization. Kuldip K. Paliwal, Stephen So
2004	Scoring and direct methods for the interpretation of evidence in forensic speaker recognition. Anil Alexander, Andrzej Drygajlo
2004	Scoring unknown speaker clustering: VB vs. BIC. Fabio Valente, Christian Wellekens
2004	Segmental differences in the visual contribution to speech intelligibility. Kuniko Y. Nielsen
2004	Segmental speech coding model for storage applications. Anssi Rämö, Jani Nurminen, Sakari Himanen, Ari Heikkinen
2004	Segmentation and relevance measure for speaker verification. Jérôme Louradour, Régine André-Obrecht, Khalid Daoudi
2004	Segmenting ambiguous phrases using phoneme duration. Keren B. Shatzman
2004	Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-forming. Changkyu Choi, Donggeon Kong, Hyoung-Ki Lee, Sang Min Yoon
2004	Shaping spoken input in user-initiative systems. Stefanie Tomko, Roni Rosenfeld
2004	Side effect free dialogue management in a voice enabled procedure browser. Manny Rayner, Beth Ann Hockey
2004	Signaling and detecting uncertainty in audiovisual speech by children and adults. Emiel Krahmer, Marc Swerts
2004	Simulating multimodal applications. Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Lévy
2004	Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation. Hyung Soon Kim, Hwa Jeon Song
2004	Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensor. Rongqiang Hu, David V. Anderson
2004	Soft features for improved distributed speech recognition over wireless networks. Reinhold Haeb-Umbach, Valentin Ion
2004	Some articulatory measurements of real sadness. Donna Erickson, Caroline Menezes, Akinori Fujino
2004	Sound source localization based on zero-crossing peak-amplitude coding. Young-Ik Kim, Rhee Man Kil
2004	Source separation using particle filters. Mital Gandhi, Mark Hasegawa-Johnson
2004	Source-filter separation for articulation-to-speech synthesis. Yoshinori Shiga, Simon King
2004	Speaker adaptation method for CALL system using bilingual speakers' utterances. Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino
2004	Speaker adaptation of a three-dimensional tongue model. Olov Engwall
2004	Speaker clustering of speech utterances using a voice characteristic reference space. Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang
2004	Speaker dependent model order selection of spectral envelopes. Matthias Wölfel
2004	Speaker diarization from speech transcripts. Lori Lamel, Jean-Luc Gauvain, Leonardo Canseco-Rodriguez
2004	Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs. Michael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier
2004	Speaker identification using probabilistic PCA model selection. Jen-Tzung Chien, Chuan-Wei Ting
2004	Speaker indexing in audio archives using test utterance Gaussian mixture modeling. Hagai Aronowitz, David Burshtein, Amihood Amir
2004	Speaker model quantization for unsupervised speaker indexing. Soonil Kwon, Shrikanth S. Narayanan
2004	Speaker normalization through constrained MLLR based transforms. Diego Giuliani, Matteo Gerosa, Fabio Brugnara
2004	Speaker segmentation and clustering in meetings. Qin Jin, Tanja Schultz
2004	Speaker-and-environment change detection in broadcast news using the common component GMM-based divergence measure. Yih-Ru Wang, Chi-Han Huang
2004	Spectral characteristics of the release bursts in Korean alveolar stops. Hansang Park
2004	Spectral moment vs. bark cepstral analysis of children's word-initial voiceless stops. H. Timothy Bunnell, James B. Polikoff, Jane McNicholas
2004	Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition. Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg
2004	Spectro-temporal activity pattern (STAP) features for noise robust ASR. Shajith Ikbal, Mathew Magimai-Doss, Hemant Misra, Hervé Bourlard
2004	Speech act identification using an ontology-based partial pattern tree. Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen
2004	Speech coding using trajectory compression and multiple sensors. Sorin Dusan, James L. Flanagan, Amod Karve, Mridul Balaraman
2004	Speech enhanced multi-Span language model. A. Nayeemulla Khan, B. Yegnanarayana
2004	Speech enhancement and recognition by integrating adaptive beamforming and wiener filtering. Alberto Abad, Javier Hernando
2004	Speech enhancement based on magnitude estimation using the gamma prior. Weifeng Li, Kazuya Takeda, Fumitada Itakura, Tran Huy Dat
2004	Speech enhancement based on smoothing of spectral noise floor. Hyoung-Gook Kim, Thomas Sikora
2004	Speech enhancement using adaptive time-domain segmentation. Sriram Srinivasan, W. Bastiaan Kleijn
2004	Speech input and output module assessment for remote access to a smart-home spoken dialog system. Jan Felix Krebber, Sebastian Möller, Alexander Raake
2004	Speech intention understanding based on decision tree learning. Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki
2004	Speech interaction system - how to increase its usability? Fang Chen
2004	Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary. Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa
2004	Speech probability distribution based on generalized gamma distribution. Jong Won Shin, Joon-Hyuk Chang, Nam Soo Kim
2004	Speech production based on lossy tube models: unit concatenation and sound transitions. Karl Schnell, Arild Lacroix
2004	Speech quality estimation using Gaussian mixture models. Tiago H. Falk, Wai-Yip Chan, Peter Kabal
2004	Speech recognition error analysis on the English MALACH corpus. Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig
2004	Speech recognition error correction using maximum entropy language model. Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee
2004	Speech recognition experiments with the SPEECON database using several robust front-ends. Pere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho
2004	Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models. Tobias Cincarek, Rainer Gruhn, Satoshi Nakamura
2004	Speech recognition system robust to noise and speaking styles. Shigeki Matsuda, Takatoshi Jitsuhiro, Konstantin Markov, Satoshi Nakamura
2004	Speech recognition using motion based lipreading. Maria José Sanchez Martinez, Juan Pablo de la Cruz Gutiérrez
2004	Speech recognition using synchronization between speech and finger tapping. Hiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Fumitada Itakura, Kazuya Takeda
2004	Speech recognition, syllabification and statistical phonetics. Melvyn John Hunt
2004	Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations. Masataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi
2004	Speech timing and rhythmic structure in arabic dialects: a comparison of two approaches. Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, François Pellegrino
2004	Speech translation: past, present and future. Alex Waibel
2004	Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system. Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi
2004	Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA. Brian Mak, Simon Ka-Lung Ho, James T. Kwok
2004	Spoken language interface in ECMA/ISO telecommunication standards. Kuansan Wang
2004	Spokenquery: an alternate approach to choosing items with speech. Peter Wolf, Joseph Woelfel, Jan C. van Gemert, Bhiksha Raj, David Wong
2004	Spontaneous speech recognition using a massively parallel decoder. Takahiro Shinozaki, Sadaoki Furui
2004	Spread of high tone in akita Japanese. Kenji Yoshida
2004	Statistical Chinese spoken document retrieval using latent topical information. Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang
2004	Statistical corpus-based speech segmentation. Vincent Pollet, Geert Coorman
2004	Statistical feature language model. Salma Jamoussi, David Langlois, Jean Paul Haton, Kamel Smaïli
2004	Statistical machine translation and its challenges. Hermann Ney
2004	Statistical model migration in speaker recognition. Jirí Navrátil, Ganesh N. Ramaswamy, Ran D. Zilca
2004	Statistics-based direction finding for training vowels. Cheolwoo Jo, Ilsuh Bak
2004	Stochastic gradient adaptation of front-end parameters. Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel
2004	Stop consonant classification by dynamic formant trajectory. Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys
2004	Strategies for optimizing a stochastic spoken natural language parser. Wolfgang Minker, Dirk Bühler, Christiane Beuschel
2004	Strategies to reduce design time in multimodal/multilingual dialog applications. Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, José Manuel Pardo
2004	Structured interview-based evaluation of spoken multimodal conversation with h.c. andersen. Niels Ole Bernsen, Laila Dybkjær
2004	Structuring of baseball live games based on speech recognition using task dependant knowledge. Atsushi Sako, Yasuo Ariki
2004	Study on emotional speech features in Korean with its application to voice color conversion. Sang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn
2004	Subjective evaluation of join cost functions used in unit selection speech synthesis. Jithendra Vepa, Simon King
2004	Subjective evaluation of spoken dialogue systems using SERVQUAL method. Mikko Hartikainen, Esa-Pekka Salonen, Markku Turunen
2004	Subtopic segmentation in the lecture speech. Noboru Kanedera, Asuka Sumida, Takao Ikehata, Tetsuo Funada
2004	Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASR. Louis ten Bosch, Lou Boves
2004	Syllable-based probabilistic morphological analysis model of Korean. Do-Gil Lee, Hae-Chang Rim
2004	Synchronization of speaker selection for centralized tandem free voIP conferencing. Peter Kabal, Colm Elliott
2004	Synthesis of vowels and tones in Thai language by articulatory modeling. Thanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas
2004	Synthesizing speech from speech recognition parameters. Kris Demuynck, Oscar Garcia, Dirk Van Compernolle
2004	TRAP based features for LVCSR of meeting data. Frantisek Grézl, Martin Karafiát, Jan Cernocký
2004	Target practice on talking faces. Adriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer
2004	Task adaptation of acoustic and language models based on large quantities of data. Karthik Visweswariah, Ramesh A. Gopinath, Vaibhava Goel
2004	Task-specific minimum Bayes-risk decoding using learned edit distance. Izhak Shafran, William Byrne
2004	Temporal normalization techniques for transform-type speech coding and application to split-band wideband coders. Kyung-Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn
2004	Temporal variables in parkinsonian speech. Danielle Duez
2004	Text independent speaker recognition using speaker dependent word spotting. Hagai Aronowitz, David Burshtein, Amihood Amir
2004	The GEMINI platform: semi-automatic generation of dialogue applications. Stefan W. Hamerich, Volker Schless, Basilis Kladis, Volker Schubert, Otilia Kocsis, Stefan Igel, Ricardo de Córdoba, Luis Fernando D'Haro, José Manuel Pardo
2004	The IBM expressive speech synthesis system. Wael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John F. Pitrelli
2004	The ICSI-SRI-UW metadata extraction system. Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary P. Harper, Yang Liu
2004	The MIT finite-state transducer toolkit for speech and language processing. I. Lee Hetherington
2004	The audio-video australian English speech data corpus AVOZES. J. Bruce Millar, Roland Goecke
2004	The automatic news transcription system: ANTS, some real time experiments. Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina
2004	The development of anticipatory labial coarticulation in French: a pioneering study. Aude Noiray, Lucie Ménard, Marie-Agnès Cathiard, Christian Abry, Christophe Savariaux
2004	The duration of pitch transition phase and its relative factors. Ziyu Xiong, Juanwen Chen
2004	The effect of intonation on perception of Cantonese lexical tones. Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma
2004	The efficient generation of pronunciation dictionaries: human factors during bootstrapping. Marelie H. Davel, Etienne Barnard
2004	The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping. Marelie H. Davel, Etienne Barnard
2004	The influence of target size and distance on the production of speech and gesture in multimodal referring expressions. Ielka van der Sluis, Emiel Krahmer
2004	The modified group delay feature: a new spectral representation of speech. Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde
2004	The role of pitch range variation in the discourse structure and intonation structure of Korean. Eunjong Kong
2004	The role of prosodic cues in word segmentation of Korean. Sahyang Kim
2004	The stochastic weighted viterbi algorithm: a frame work to compensate additive noise and low-bit rate coding distortion. Néstor Becerra Yoma, Ivan Brito, Carlos Molina
2004	The superior effectiveness of the F0 range for identifying the context from sounds without phonemes. Yasuko Nagasaki, Takanori Komatsu
2004	The use of typical sequences for robust speaker identification. Mohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel
2004	The voice-logbook: integrating human factors for a chronic care system. Lesley-Ann Black, Norman D. Black, Roy Harper, Michelle Lemon, Michael F. McTear
2004	Theory and data in spoken language assessment. Jared Bernstein, Isabella Barbier, Elizabeth Rosenfeld, John H. A. L. de Jong
2004	Theory for speaker recognition over IP. Thomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee
2004	Three-way system-user-expert interactions help you expand the capabilities of an existing spoken dialogue system. Gregory Aist
2004	Throat microphone signal for speaker recognition. Bayya Yegnanarayana, A. Shahina, M. R. Kesheorey
2004	Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recovering. Claudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza
2004	Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition. Christian Fügen, Hartwig Holzapfel, Alex Waibel
2004	Time-frequency analysis of vocal source signal for speaker recognition. Nengheng Zheng, P. C. Ching, Tan Lee
2004	Time delay estimation using weighted CPSP function. Hong-Seok Kwon, Siho Kim, Keun-Sung Bae
2004	Time-scaling of speech using independent subspace analysis. R. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik
2004	Tone information as a confidence measure for improving Cantonese LVCSR. Yao Qian, Tan Lee, Frank K. Soong
2004	Topic classification and verification modeling for out-of-domain utterance detection. Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura
2004	Topic structure extraction for meeting indexing. Katsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga
2004	Towards a grammar of spoken language - prosody of ill-formed utterances and listener's understanding in discourse -. Miyoko Sugito
2004	Towards a harmonious coexistence of spoken and written language. Hyun-Bok Lee
2004	Towards a new level of annotation detail of multilingual speech corpora. Anja Geumann
2004	Towards automatic word segmentation of dialect speech. Eric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik
2004	Towards better understanding of the model implied by the use of dynamic features in HMMs. John Scott Bridle
2004	Towards large vocabulary ASR on embedded platforms. Miroslav Novak
2004	Towards ubiquitous task management. Porfírio P. Filipe, Nuno J. Mamede
2004	Towards understanding mixed-initiative in task-oriented dialogues. Fan Yang, Peter A. Heeman, Kristy Hollingshead
2004	Transcription of arabic broadcast news. Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain
2004	Transformation and combination of hidden Markov models for speaker selection training. Chao Huang, Tao Chen, Eric Chang
2004	Transformation-based error correction for speech-to-text systems. Jochen Peters, Christina Drexel
2004	Translingual grammar induction. John Lee, Stephanie Seneff
2004	Triphone-based confidence system for speaker identification. Aaron D. Lawson, Mark C. Huggins
2004	Two-way speech-to-speech translation on handheld devices. Bowen Zhou, Daniel Déchelotte, Yuqing Gao
2004	Unified language modeling using finite-state transducers with first applications. Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz
2004	Unscented kalman filtering of line spectral frequencies. Andrew Errity, John McKenna, Stephen Isard
2004	Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognition. Yh-Her Yang, Yuan-Fu Liao
2004	Unsupervised language model adaptation methods for spontaneous speech. Luc Lussier, Edward W. D. Whittaker, Sadaoki Furui
2004	Unsupervised learning from users' error correction in speech dictation. Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng
2004	Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems. Tomohiro Watanabe, Hiromitsu Nishizaki, Takehito Utsuro, Seiichi Nakagawa
2004	Unsupervised topic adaptation for lecture speech retrieval. Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba
2004	Usability considerations of speech-to-speech translation system. Youngjik Lee, Jun Park, Seung-Shin Oh
2004	Use of formants in stressed and unstressed continuous speech recognition. Davood Gharavian, Seyed Mohammad Ahadi
2004	Use of metadata to improve recognition of spontaneous speech and named entities. Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig
2004	Use of neural network mapping and extended kalman filter to recover vocal tract resonances from the MFCC parameters of speech. Li Deng, Roberto Togneri
2004	Use of prosodic features for speech recognition. Keikichi Hirose, Nobuaki Minematsu
2004	Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of English. Midori Iba, Anke Sennema, Valérie Hazan, Andrew Faulkner
2004	Using RASTA in task independent TANDEM feature extraction. Guillermo Aradilla, John Dines, Sunil Sivadas
2004	Using VTLN for broadcast news transcription. Do Yeong Kim, Srinivasan Umesh, Mark J. F. Gales, Thomas Hain, Philip C. Woodland
2004	Using a depth-restricted search to reduce delays in unit selection. Nobuyuki Nishizawa, Hisashi Kawai
2004	Using computer simulation to compare two models of mixed-initiative. Fan Yang, Peter A. Heeman
2004	Using context to correct phone recognition errors. Stephen Cox
2004	Using linear interpolation to improve histogram equalization for speech recognition. Filip Korkmazsky, Dominique Fohr, Irina Illina
2004	Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection. Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper
2004	Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework. Yu Zheng, Gary Geunbae Lee, Byeongchang Kim
2004	Using part-of-speech for predicting phrase breaks. Ian Read, Stephen Cox
2004	Using quick transcriptions to improve conversational speech models. Owen Kimball, Chia-Lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul
2004	Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants. Satanjeev Banerjee, Alexander I. Rudnicky
2004	Using word lattice information for a tighter coupling in speech translation systems. Tanja Schultz, Szu-Chen Stan Jou, Stephan Vogel, Shirin Saleem
2004	Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs. Jan Nouza, Dana Nejedlová, Jindrich Zdánský, Jan Kolorenc
2004	Video-realistic synthetic speech with a parametric visual speech synthesizer. Sascha Fagel
2004	Visual recalibration of auditory speech versus selective speech adaptation: different build-up courses. Jean Vroomen, Sabine van Linden, Béatrice de Gelder, Paul Bertelson
2004	Visualizing dynamic features of expressions in speech. Peter Robinson, Tal Sobol Shikler
2004	Vocabulary and language model adaptation using information retrieval. Brigitte Bigi, Yan Huang, Renato De Mori
2004	Vocal tract normalization based on spectral warping. Wei Wang, Stephen A. Zahorian
2004	Voice activation using prosodic features. Marco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann
2004	Voice activity detection using global soft decision with mixture of Gaussian model. Kiyoung Park, Changkyu Choi, Jeongsu Kim
2004	Voice conversion for unknown speakers. Hui Ye, Steve J. Young
2004	Voice enhancement of male speakers with laryngeal neoplasm. Gernot Kubin, Martin Hagmüller
2004	Voice portal services in packet network and VoIP environment. Wu Chou, Feng Liu
2004	Voicebuilder: a framework for automatic speech application development. Miguel Angel Rodriguez-Moreno, Heriberto Cuayáhuitl, Juventino Montiel-Hernández
2004	Weighting observation vectors for robust speech recognition in noisy environments. Zhenyu Xiong, Thomas Fang Zheng, Wenhu Wu
2004	What concept-to-speech can gain for prosody. Markus Schnell, Rüdiger Hoffmann
2004	What makes a non-native accent?: a study of Korean English. Jong-mi Kim, Suzanne Flynn
2004	Why speech recognizers make errors? a robustness view. Hong Kook Kim, Mazin G. Rahim
2004	Word confusability prediction in automatic speech recognition. Jan Anguita, Stéphane Peillon, Javier Hernando, Alexandre Bramoulle
2004	Word n-gram probability estimation from a Japanese raw corpus. Shinsuke Mori, Daisuke Takuma
2004	Worldwide ongoing activities on multilingual speech to speech translation. Gianni Lazzari, Alex Waibel, Chengqing Zong
2004	XML representation languages as a way of interconnecting TTS modules. Marc Schröder, Stefan Breuer
2004	Zeros of z-transform (ZZT) decomposition of speech for source-tract separation. Boris Doval, Baris Bozkurt, Christophe d'Alessandro, Thierry Dutoit