INTERSPEECH - RankMe

799 papers

Year	Title / Authors
2003	"do not attempt to light with match!": some thoughts on progress and research goals in spoken dialog systems. Paul Heisterkamp
2003	"syncpitch": a pseudo pitch synchronous algorithm for speaker recognition. Ran D. Zilca, Jirí Navrátil, Ganesh N. Ramaswamy
2003	8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003
2003	A DP algorithm for speaker change detection. Michele Vescovi, Mauro Cettolo, Romeo Rizzi
2003	A DTW-based DAG technique for speech and speaker feature analysis. Jingwei Liu
2003	A clustering approach to on-line audio source separation. Julien Bourgeois
2003	A comparative study of some discriminative feature reduction algorithms on the AURORA 2000 and the DaimlerChrysler in-car ASR tasks. Joan Marí Hilario, Fritz Class
2003	A comparative study on maximum entropy and discriminative training for acoustic modeling in automatic speech recognition. Wolfgang Macherey, Hermann Ney
2003	A comparison of the data requirements of automatic speech recognition systems and human listeners. Roger K. Moore
2003	A comparison of three non-linear observation models for noisy speech features. Jasha Droppo, Li Deng, Alex Acero
2003	A computational model of arm gestures in conversation. Dafydd Gibbon, Ulrike Gut, Benjamin Hell, Karin Looks, Alexandra Thies, Thorsten Trippel
2003	A context resolution server for the Galaxy conversational systems. Edward Filisko, Stephanie Seneff
2003	A contrastive investigation of standard Mandarin and accented Mandarin. Aijun Li, Xia Wang
2003	A corpus-based decompounding algorithm for German lexical modeling in LVCSR. Martine Adda-Decker
2003	A cross-media retrieval system for lecture videos. Atsushi Fujii, Katunobu Itou, Tomoyosi Akiba, Tetsuya Ishikawa
2003	A discriminative decision tree learning approach to acoustic modeling. Sheng Gao, Chin-Hui Lee
2003	A dynamic cross-reference pruning strategy for multiple feature fusion at decoder run time. Yonghong Yan, Chengyi Zheng, Jianping Zhang, Jielin Pan, Jiang Han, Jian Liu
2003	A fast, accurate and stream-based speaker segmentation and clustering algorithm. An Vandecatseye, Jean-Pierre Martens
2003	A harmonic-model-based front end for robust speech recognition. Michael L. Seltzer, Jasha Droppo, Alex Acero
2003	A hidden Markov model-based missing data imputation approach. Yu Luo, Limin Du
2003	A hybrid method oriented to concatenative text-to-speech synthesis. Ignasi Iriondo Sanz, Francesc Alías, Javier Sanchis, Javier Melenchón
2003	A latent analogy framework for grapheme-to-phoneme conversion. Jerome R. Bellegarda
2003	A memory-based approach to Cantonese tone recognition. Michael Emonts, Deryle Lonsdale
2003	A method for on-line speaker indexing using generic reference models. Soonil Kwon, Shrikanth S. Narayanan
2003	A multimodal conversational interface for a concept vehicle. Roberto Pieraccini, Krishna Dayanidhi, Jonathan Bloom, Jean-Gui Dahan, Michael Phillips, Bryan R. Goodman, K. Venkatesh Prasad
2003	A neural network approach to dependency analysis of Japanese sentences using prosodic information. Kazuyuki Takagi, Mamiko Okimoto, Yoshio Ogawa, Kazuhiko Ozeki
2003	A new HMM-based approach to broad phonetic classification of speech. Jouni Pohjalainen
2003	A new SVM approach to speaker identification and verification using probabilistic distance kernels. Pedro J. Moreno, Purdy Ho
2003	A new adaptive long-term spectral estimation voice activity detector. Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio
2003	A new approach to minimize utterance verification error rate for a specific operating point. Wing-Hei Au, Man-Hung Siu
2003	A new approach to reducing alarm noise in speech. Yilmaz Gul, Aladdin M. Ariyaeeinia, Oliver Dewhirst
2003	A new approach to segment and detect syllables from high-speed speech. D. W. Ying, W. Gao, W. Q. Wang
2003	A new approach to voice activity detection based on self-organizing maps. Stephan Grashey
2003	A new decoder design for large vocabulary Turkish speech recognition. Onur Cilingir, Mübeccel Demirekler
2003	A new method for pitch prediction from spectral envelope and its application in voice conversion. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
2003	A new perspective on feature extraction for robust in-vehicle speech recognition. Umit H. Yapanel, John H. L. Hansen
2003	A new pitch modeling approach for Mandarin speech. Wen-Hsing Lai, Yih-Ru Wang, Sin-Horng Chen
2003	A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering. Ramon Prieto, Jing Jiang, Chi-Ho Choi
2003	A new spectral transformation for speaker normalization. Pierre L. Dognin, Amro El-Jaroudi
2003	A new supervised-predictive compensation scheme for noisy speech recognition. Khalid Daoudi, Murat Deviren
2003	A noise-robust ASR back-end technique based on weighted Viterbi recognition. Xiaodong Cui, Alexis Bernard, Abeer Alwan
2003	A novel method of analysing and comparing responses of hearing aid algorithms using auditory time-frequency representation. G. V. Kiran, Thippur V. Sreenivas
2003	A novel rate selection algorithm for transcoding CELP-type codec and SMV. Dalwon Jang, Seongho Seo, Sunil Lee, Chang D. Yoo
2003	A novel transcoding algorithm for SMV and G.723.1 speech coders via direct parameter transformation. Seongho Seo, Dalwon Jang, Sunil Lee, Chang D. Yoo
2003	A novel use of residual noise model for modified PMC. Cailian Miao, Yangsheng Wang
2003	A programmable policy manager for conversational biometrics. Ganesh N. Ramaswamy, Ran D. Zilca, Oleg Alecksandrovich
2003	A pronunciation lexicon for Turkish based on two-level morphology. Kemal Oflazer, Sharon Inkelas
2003	A pronunciation training system for Japanese lexical accents with corrective feedback in learner's voice. Keikichi Hirose, Frédéric Gendrin, Nobuaki Minematsu
2003	A reconstruction of Farkas Kempelen's speaking machine. P. Nikleczy, Gábor Olaszy
2003	A robust and sensitive word boundary decision algorithm. Jong Uk Kim, Sang-Gyun Kim, Chang D. Yoo
2003	A robust noise and echo canceller. Khaldoon Al-Naimi, Christian Sturt, Ahmet M. Kondoz
2003	A segment-based algorithm of speech enhancement for robust speech recognition. Guokang Fu, Ta-Hsin Li
2003	A semantic representation for spoken dialogs. Hélène Bonneau-Maynard, Sophie Rosset
2003	A semi-blind source separation method for hands-free speech recognition of multiple talkers. Panikos Heracleous, Satoshi Nakamura, Kiyohiro Shikano
2003	A sequential metric-based audio segmentation method via the Bayesian information criterion. Shih-Sian Cheng, Hsin-Min Wang
2003	A source model mitigation technique for distributed speech recognition over lossy packet channels. Angel M. Gomez, Antonio M. Peinado, Victoria E. Sánchez, Antonio J. Rubio
2003	A speech dereverberation method based on the MTF concept. Masashi Unoki, Keigo Sakata, Masato Akagi
2003	A speech model of acoustic inventories based on asynchronous interpolation. Alexander Kain, Jan P. H. van Santen
2003	A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments. Kaisheng Yao, Erik M. Visser, Oh-Wook Kwon, Te-Won Lee
2003	A spoken language interface to an electronic programme guide. Jianhong Jin, Martin J. Russell, Michael J. Carey, James Chapman, Harvey Lloyd-Thomas, Graham Tattersall
2003	A statistical approach to assessing speech and voice variability in speaker verification. Klaus R. Scherer, Didier Grandjean, Tom Johnstone, Gudrun Klasmeyer, Tanja Bänziger
2003	A statistical method of evaluating pronunciation proficiency for English words spoken by Japanese. Seiichi Nakagawa, Kazumasa Mori, Naoki Nakamura
2003	A study on domain recognition of spoken dialogue systems. Toshihiro Isobe, Shoji Hayakawa, Hiroya Murao, Tatsuji Mizutani, Kazuya Takeda, Fumitada Itakura
2003	A switching linear Gaussian hidden Markov model and its application to nonstationary noise compensation for robust speech recognition. Jian Wu, Qiang Huo
2003	A syllable segmentation algorithm for English and Italian. Massimo Petrillo, Francesco Cutugno
2003	A system for voice conversion based on adaptive filtering and line spectral frequency distance optimization for text-to-speech synthesis. Özgül Salor, Mübeccel Demirekler, Bryan L. Pellom
2003	A topic classification system based on parametric trajectory mixture models. William Belfield, Herbert Gish
2003	A trainable generator for recommendations in multimodal dialog. Marilyn A. Walker, Rashmi Prasad, Amanda Stent
2003	A trainable speech enhancement technique based on mixture models for speech and noise. Ilyas Potamitis, Nikos Fakotakis, George Kokkinakis
2003	A visual context-aware multimodal system for spoken language processing. Niloy Mukherjee, Deb Roy
2003	A voice-driven web browser for blind people. Bostjan Vesnicer, Janez Zibert, Simon Dobrisek, Nikola Pavesic, France Mihelic
2003	Accentual lengthening in standard Chinese: evidence from four-syllable constituents. Yiya Chen
2003	Accuracy improved double-talk detector based on state transition diagram. Sang-Gyun Kim, Jong Uk Kim, Chang D. Yoo
2003	Acoustic change detection and segment clustering of two-way telephone conversations. Xin Zhong, Mark A. Clements, Sung Lim
2003	Acoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis. Wentao Gu, Keikichi Hirose
2003	Acoustic modeling of American English lateral approximants. Zhaoyan Zhang, Carol Y. Espy-Wilson, Mark Tiede
2003	Acoustic modeling with mixtures of subspace constrained exponential models. Karthik Visweswariah, Scott Axelrod, Ramesh A. Gopinath
2003	Acoustic normalization of children's speech. Georg Stemmer, Christian Hacker, Stefan Steidl, Elmar Nöth
2003	Acoustic variations of focused disyllabic words in Mandarin Chinese: analysis, synthesis and perception. Zhenglai Gu, Hiroki Mori, Hideki Kasuya
2003	Acoustic, phonetic, and discriminative approaches to automatic language identification. Elliot Singer, Pedro A. Torres-Carrasquillo, Terry P. Gleason, William M. Campbell, Douglas A. Reynolds
2003	Acquiring lexical information from multilevel temporal annotations. Thorsten Trippel, Felix Sasaki, Benjamin Hell, Dafydd Gibbon
2003	Active and unsupervised learning for automatic speech recognition. Giuseppe Riccardi, Dilek Hakkani-Tür
2003	Active labeling for spoken language understanding. Gökhan Tür, Mazin G. Rahim, Dilek Hakkani-Tür
2003	Adaptation of acoustic model using the gain-adapted HMM decomposition method. Akira Sasou, Futoshi Asano, Kazuyo Tanaka, Satoshi Nakamura
2003	Adapting acoustic models to new domains and conditions using untranscribed data. Asela Gunawardana, Alex Acero
2003	Adapting language models for frequent fixed phrases by emphasizing n-gram subsets. Tomoyosi Akiba, Katunobu Itou, Atsushi Fujii
2003	Adaptive beamforming in room with reverberation. Zoran Saric, Slobodan Jovicic
2003	Adaptive decision fusion for multi-sample speaker verification over GSM networks. Ming-Cheung Cheung, Man-Wai Mak, Sun-Yuan Kung
2003	Adaptive noise estimation using second generation and perceptual wavelet transforms. Essa Jafer, Abdulhussain E. Mahdi
2003	Adding fricatives to the Portuguese articulatory synthesiser. António J. S. Teixeira, Luis M. T. Jesus, Roberto Martinez
2003	Additive noise and channel distortion-robust parametrization tool - performance evaluation on Aurora 2 & 3. Petr Fousek, Petr Pollák
2003	Agents for integrated tutoring in spoken dialogue systems. Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen
2003	An NN-based approach to prosodic information generation for synthesizing English words embedded in Chinese text. Wei-Chih Kuo, Li-Feng Lin, Yih-Ru Wang, Sin-Horng Chen
2003	An accurate noise compensation algorithm in the log-spectral domain for robust speech recognition. Mohamed Afify
2003	An acoustic phonetic analysis of diphthongs in Ningbo Chinese. Fang Hu
2003	An acquisition model of speech perception with considerations of temporal information. Ching-Pong Au
2003	An approach to common acoustical pole and zero modeling of consecutive periods of voiced speech. Pedro J. Quintana-Morales, Juan L. Navarro-Mesa
2003	An approach to multilingual acoustic modeling for portable devices. Yan Ming Cheng, Chen Liu, Yuanjun Wei, Lynette Melnar, Changxue Ma
2003	An architecture for rapid decoding of large vocabulary conversational speech. George Saon, Geoffrey Zweig, Brian Kingsbury, Lidia Mangu, Upendra V. Chaudhari
2003	An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker. Chong-kai Wang, Ren-yuan Lyu, Yuang-Chin Chiang
2003	An efficient integrated gender detection scheme and time mediated averaging of gender dependent acoustic models. Peder A. Olsen, Satya Dharanipragada
2003	An efficient keyword spotting technique using a complementary language for filler models training. Panikos Heracleous, Tohru Shimizu
2003	An efficient Viterbi algorithm on DBNs. Wei Hu, Yimin Zhang, Qian Diao, Shan Huang
2003	An efficient, fast matching approach using posterior probability estimates in speech recognition. Sherif M. Abdou, Michael S. Scordilis
2003	An empirical text transformation method for spontaneous speech synthesizers. Shiva Sundaram, Shrikanth S. Narayanan
2003	An evaluation of VTS and IMM for speaker verification in noise. Suhadi Suhadi, Sorel Stan, Tim Fingscheidt, Christophe Beaugeant
2003	An expandable web-based audiovisual text-to-speech synthesis system. Sascha Fagel, Walter F. Sendlmeier
2003	An improved model-based speaker segmentation system. Peng Yu, Frank Seide, Chengyuan Ma, Eric Chang
2003	An information theoretic approach for using word cluster information in natural language call routing. Li Li, Feng Liu, Wu Chou
2003	An integrated system for smart-home control of appliances based on remote speech interaction. Ilyas Potamitis, Kallirroi Georgila, Nikos Fakotakis, George K. Kokkinakis
2003	An integrated toolkit deploying speech technology for computer based speech training with application to dysarthric speakers. Athanassios Hatzis, Phil D. Green, James Carmichael, Stuart P. Cunningham, Rebecca Palmer, Mark Parker, Peter O'Neill
2003	An investigation of intensity patterns for German. Oliver Jokisch, Marco Kühne
2003	An optimized multi-duration HMM for spontaneous speech recognition. Yuichi Ohkawa, Akihiro Yoshida, Motoyuki Suzuki, Akinori Ito, Shozo Makino
2003	Analysis and compensation of packet loss in distributed speech recognition using interleaving. Ben P. Milner, Alastair Bruce James
2003	Analysis and modeling of f_0 contours of Portuguese utterances based on the command-response model. Hiroya Fujisaki, Shuichi Narusawa, Sumio Ohno, Diamantino Freitas
2003	Analysis and modeling of syllable duration for Thai speech synthesis. Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Rungkarn Siricharoenchai, Yoshinori Sagisaka
2003	Analysis of lossy vocal tract models for speech production. Karl Schnell, Arild Lacroix
2003	Analysis of the Aurora large vocabulary evaluations. Naveen Parihar, Joseph Picone
2003	Analysis of voice source characteristics using a constrained polynomial model. Tokihiko Kaburagi, Koji Kawai
2003	Applications of computer generated expressive speech for communication disorders. Jan P. H. van Santen, Lois M. Black, Gilead Cohen, Alexander Kain, Esther Klabbers, Taniya Mishra, Jacques de Villiers, Xiaochuan Niu
2003	Approaches to foreign-accented speaker-independent speech recognition. Stefanie Aalburg, Harald Höge
2003	Arabic in my hand: small-footprint synthesis of Egyptian Arabic. Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo
2003	Assessment of dereverberation algorithms for large vocabulary speech recognition systems. Koen Eneman, Jacques Duchateau, Marc Moonen, Dirk Van Compernolle, Hugo Van hamme
2003	Assessment of spoken dialogue system usability - what are we really measuring? Lars Bo Larsen
2003	Audio-visual speech recognition in challenging environments. Gerasimos Potamianos, Chalapathy Neti
2003	Audiovisual speech enhancement based on the association between speech envelope and video features. Frédéric Berthommier
2003	Auditory principles in speech processing - do computers need silicon ears? Birger Kollmeier
2003	Auditory-instrumental forensic speaker recognition. Stefan G. Gfrörer
2003	Automated closed-captioning of live TV broadcast news in French. Julie Brousseau, Jean-Francois Beaumont, Gilles Boulianne, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Frédéric Osterrath, Pierre Ouellet
2003	Automated speaker recognition in real world conditions: controlling the uncontrollable. Hirotaka Nakasone
2003	Automated transcription and topic segmentation of large spoken archives. Martin Franz, Bhuvana Ramabhadran, Todd Ward, Michael Picheny
2003	Automatic baseform generation from acoustic data. Benoît Maison
2003	Automatic call-routing without transcriptions. Qiang Huang, Stephen J. Cox
2003	Automatic construction of unique signatures and confusable sets for natural language directory assistance applications. E. E. Jan, Benoît Maison, Lidia Mangu, Geoffrey Zweig
2003	Automatic disfluency identification in conversational speech using multiple knowledge sources. Yang Liu, Elizabeth Shriberg, Andreas Stolcke
2003	Automatic estimation of perceptual age using speaker modeling techniques. Nobuaki Minematsu, Keita Yamauchi, Keikichi Hirose
2003	Automatic extraction of bilingual chunk lexicon for spoken language translation. Limin Du, Boxing Chen
2003	Automatic generation of context-independent variable parameter models using successive state and mixture splitting. Soo-Young Suk, Ho-Youl Jung, Hyun-Yeol Chung
2003	Automatic generation of non-uniform context-dependent HMM topologies based on the MDL criterion. Takatoshi Jitsuhiro, Tomoko Matsui, Satoshi Nakamura
2003	Automatic induction of n-gram language models from a natural language grammar. Stephanie Seneff, Chao Wang, Timothy J. Hazen
2003	Automatic phone set extension with confidence measure for spontaneous speech. Yi Liu, Pascale Fung
2003	Automatic prosodic prominence detection in speech using acoustic features: an unsupervised system. Fabio Tamburini
2003	Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction. Jindrich Matousek, Daniel Tihelka, Josef Psutka
2003	Automatic segmentation of film dialogues into phonemes and graphemes. Gilles Boulianne, Jean-Francois Beaumont, Patrick Cardinal, Michel Comeau, Pierre Ouellet, Pierre Dumouchel
2003	Automatic singer identification of popular music recordings via estimation and modeling of solo vocal signal. Wei-Ho Tsai, Hsin-Min Wang, Dwight Rodgers
2003	Automatic speech recognition with sparse training data for dysarthric speakers. Phil D. Green, James Carmichael, Athanassios Hatzis, Pam Enderby, Mark S. Hawley, Mark Parker
2003	Automatic speech segmentation and verification for concatenative synthesis. Chih-Chung Kuo, Chi-Shiang Kuo, Jau-Hung Chen, Sen-Chia Chang
2003	Automatic summarization of broadcast news using structural features. Sameer Maskey, Julia Hirschberg
2003	Automatic title generation for Chinese spoken documents considering the special structure of the language. Lin-Shan Lee, Shun-Chuan Chen
2003	Automatic title generation for Chinese spoken documents using an adaptive k nearest-neighbor approach. Shun-Chuan Chen, Lin-Shan Lee
2003	Automatic transcription of football commentaries in the MUMIS project. Janienke Sturm, Judith M. Kessens, Mirjam Wester, Febe de Wet, Eric Sanders, Helmer Strik
2003	Automatic transformation of environmental sounds into sound-imitation words based on Japanese syllable structure. Kazushi Ishihara, Yasushi Tsubota, Hiroshi G. Okuno
2003	Autoregressive modeling based feature extraction for Aurora3 DSR task. Petr Motlícek, Jan Cernocký
2003	Average instantaneous frequency (AIF) and average log-envelopes (ALE) for ASR with the Aurora 2 database. Yadong Wang, Jesse Hansen, Gopi Krishna Allu, Ramdas Kumaresan
2003	Band-independent speech-event categories for TRAP based ASR. Hynek Hermansky, Pratibha Jain
2003	Bandwidth mismatch compensation for robust speech recognition. Yuan-Fu Liao, Jeng-Shien Lin, Wei-Ho Tsai
2003	Bayesian induction of intonational phrase breaks. Panagiotis Zervas, Manolis Maragoudakis, Nikos Fakotakis, George Kokkinakis
2003	Bayesian networks for spoken dialogue management in multimodal systems of tour-guide robots. Plamen J. Prodanov, Andrzej Drygajlo
2003	Beyond a single critical-band in TRAP based ASR. Pratibha Jain, Hynek Hermansky
2003	Blind inversion of multidimensional functions for speech enhancement. John Hogden, Patrick Valdez, Shigeru Katagiri, Erik McDermott
2003	Blind normalization of speech from different channels. David N. Levin
2003	Blind separation and deconvolution for convolutive mixture of speech using SIMO-model-based ICA and multichannel inverse filtering. Hiroaki Yamajo, Hiroshi Saruwatari, Tomoya Takatani, Tsuyoki Nishikawa, Kiyohiro Shikano
2003	Brain imaging correlates of temporal quantization in spoken language. David Poeppel
2003	Broad focus across sentence types in Greek. Mary Baltazani
2003	Building a test collection for speech-driven web retrieval. Atsushi Fujii, Katunobu Itou
2003	CART-based factor analysis of intelligibility reduction in Japanese English. Nobuaki Minematsu, Changchen Guo, Keikichi Hirose
2003	CFA-BF: a novel combined fixed/adaptive beamforming for robust speech recognition in real car environments. Xianxian Zhang, John H. L. Hansen
2003	Characteristics of authentic anger in Hebrew speech. Noam Amir, Shirley Ziv, Rachel Cohen
2003	Child and adult speaker adaptation during error resolution in a publicly available spoken dialogue system. Linda Bell, Joakim Gustafson
2003	Classification with free energy at raised temperatures. Rita Singh, Manfred K. Warmuth, Bhiksha Raj, Paul Lamere
2003	Classifying subject ratings of emotional speech using acoustic features. Jackson Liscombe, Jennifer J. Venditti, Julia Hirschberg
2003	Collecting machine-translation-aided bilingual dialogues for corpus-based speech translation. Toshiyuki Takezawa, Gen-ichiro Kikui
2003	Combination of CFG and n-gram modeling in semantic grammar learning. Ye-Yi Wang, Alex Acero
2003	Combination of a hidden tag model and a traditional n-gram model: a case study in Czech speech recognition. Pavel Krbec, Petr Podveský, Jan Hajic
2003	Combination of finite state automata and neural network for spoken language understanding. Chai Wutiwiwatchai, Sadaoki Furui
2003	Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -. Masakiyo Fujimoto, Yasuo Ariki
2003	Combining non-uniform unit selection with diphone based synthesis. Michael Pucher, Friedrich Neubarth, Erhard Rank, Georg Niklfeld, Qi Guan
2003	Comparative analysis and synthesis of formant trajectories of British and broad Australian accents. Qin Yan, Saeed Vaseghi, Ching-Hsiang Ho, Dimitrios Rentzos, Emir Turajlic
2003	Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments. Hesham Tolba, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy
2003	Comparative study of boosting and non-boosting training for constructing ensembles of acoustic models. Rong Zhang, Alexander I. Rudnicky
2003	Comparative study on Hungarian acoustic model sets and training methods. Tibor Fegyó, Péter Mihajlik, Péter Tatai
2003	Comparing the usability of a user driven and a mixed initiative multimodal dialogue system for train timetable information. Janienke Sturm, Ilse Bakx, Bert Cranen, Jacques M. B. Terken
2003	Comparison of effects of acoustic and language knowledge on spontaneous speech perception/recognition between human and automatic speech recognizer. Norihide Kitaoka, Masahisa Shingu, Seiichi Nakagawa
2003	Compensation of channel distortion in line spectrum frequency domain. An-Tze Yu, Hsiao-Chuan Wang
2003	Compiling large-context phonetic decision trees into finite-state transducers. Stanley F. Chen
2003	Compound decomposition in Dutch large vocabulary speech recognition. Roeland Ordelman, Arjan van Hessen, Franciska de Jong
2003	Computational auditory scene analysis by using statistics of high-dimensional speech dynamics and sound source direction. Johannes Nix, Michael Kleinschmidt, Volker Hohmann
2003	Conceptual decoding for spoken dialog systems. Yannick Estève, Christian Raymond, Frédéric Béchet, Renato De Mori
2003	Conditional and joint models for grapheme-to-phoneme conversion. Stanley F. Chen
2003	Confidence measure driven scalable two-pass recognition strategy for large list grammars. Miroslav Novak, Diego Ruiz
2003	Confidence measures for phonetic segmentation of continuous speech. Samir Nefti, Olivier Boëffard, Thierry Moudenc
2003	Confusion matrix based entropy correction in multi-stream combination. Hemant Misra, Andrew C. Morris
2003	Connectionist classification and specific stochastic models in the understanding process of a dialogue system. David Vilar, María José Castro, Emilio Sanchis
2003	Consideration of muscle co-contraction in a physiological articulatory model. Jianwu Dang, Kiyoshi Honda
2003	Considerations on vowel durations for Japanese CALL system. Taro Mouri, Keikichi Hirose, Nobuaki Minematsu
2003	Construction of an advanced in-car spoken dialogue corpus and its characteristic analysis. Itsuki Kishida, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki
2003	Context awareness using environmental noise classification. Ling Ma, Dan J. Smith, Ben P. Milner
2003	Context-dependent output densities for hidden Markov models in speech recognition. Georg Stemmer, Viktor Zeißler, Christian Hacker, Elmar Nöth, Heinrich Niemann
2003	Context-sensitive evaluation and correction of phone recognition output. Michael Levit, Hiyan Alshawi, Allen L. Gorin, Elmar Nöth
2003	Continuous speech recognition and verification based on a combination score. Binfeng Yan, Rui Guo, Xiaoyan Zhu
2003	Control and prediction of the impact of pitch modification on synthetic speech quality. Esther Klabbers, Jan P. H. van Santen
2003	Control in task-oriented dialogues. Peter A. Heeman, Fan Yang, Susan E. Strayer
2003	Convergence improvement for oversampled subband adaptive noise and echo cancellation. Hamid Reza Abutalebi, Hamid Sheikhzadeh, Robert L. Brennan, George H. Freeman
2003	Corpus-based modeling of naturalness estimation in timing control for non-native speech. Makiko Muto, Yoshinori Sagisaka, Takuro Naito, Daiju Maeki, Aki Kondo, Katsuhiko Shirai
2003	Corpus-based syntax-prosody tree matching. Dafydd Gibbon
2003	Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model. Keikichi Hirose, Takayuki Ono, Nobuaki Minematsu
2003	Correction of disfluencies in spontaneous speech using a noisy-channel approach. Matthias Honal, Tanja Schultz
2003	Coupling vs. unifying: modeling techniques for speech-to-speech translation. Yuqing Gao
2003	Covariation and weighting of harmonically decomposed streams for ASR. Philip J. B. Jackson, David M. Moreno, Martin J. Russell, Javier Hernando
2003	Creating corpora for speech-to-speech translation. Gen-ichiro Kikui, Eiichiro Sumita, Toshiyuki Takezawa, Seiichi Yamamoto
2003	Cross domain Chinese speech understanding and answering based on named-entity extraction. Yun-Tien Lee, Shun-Chuan Chen, Lin-Shan Lee
2003	Cross-lingual pronunciation modelling for Indonesian speech recognition. Terrence Martin, Torbjørn Svendsen, Sridha Sridharan
2003	Cross-modal informational masking due to mismatched audio cues in a speechreading task. Douglas Brungart, Brian D. Simpson, Alexander J. Kordik
2003	Cross-stream observation dependencies for multi-stream speech recognition. Özgür Çetin, Mari Ostendorf
2003	Custom-tailoring TTS voice font - keeping the naturalness when reducing database size. Yong Zhao, Min Chu, Hu Peng, Eric Chang
2003	Cycle extraction for perfect reconstruction and rate scalability. Miguel Arjona Ramírez
2003	DOA estimation of speech signal using equilateral-triangular microphone array. Yusuke Hioka, Nozomu Hamada
2003	DTW-based phonetic alignment using multiple acoustic features. Sérgio Paulo, Luís C. Oliveira
2003	Data driven example based continuous speech recognition. Mathias De Wachter, Kris Demuynck, Dirk Van Compernolle, Patrick Wambacq
2003	Data driven generation of broad classes for decision tree construction in acoustic modeling. Andrej Zgank, Zdravko Kacic, Bogomir Horvat
2003	Data-driven pronunciation modeling for ASR using acoustic subword units. Thurid Spiess, Britta Wrede, Gernot A. Fink, Franz Kummert
2003	Database adaptation for ASR in cross-environmental conditions in the SPEECON project. Christophe Couvreur, Oren Gedge, Klaus Linhard, Shaunie Shammass, Johan Vantieghem
2003	Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling. Heiga Zen, Keiichi Tokuda, Tadashi Kitamura
2003	Dependence of GMM adaptation on feature post-processing for speaker recognition. Robbie Vogt, Jason W. Pelecanos, Sridha Sridharan
2003	Design and evaluation of a limited two-way speech translator. David Stallard, John Makhoul, Fred Choi, Ehry MacRostie, Premkumar Natarajan, Richard M. Schwartz, Bushra Zawaydeh
2003	Design of the CMU Sphinx-4 decoder. Paul Lamere, Philip Kwok, William Walker, Evandro B. Gouvêa, Rita Singh, Bhiksha Raj, Peter Wolf
2003	Designing for errors: similarities and differences of disfluency rates and prosodic characteristics across domains. Guergana K. Savova, Joan Bachenko
2003	Detection and recognition of correction utterance in spontaneously spoken dialog. Norihide Kitaoka, Naoko Kakutani, Seiichi Nakagawa
2003	Detection and separation of speech segment using audio and video information fusion. Futoshi Asano, Yoichi Motomura, Hideki Asoh, Takashi Yoshimura, Naoyuki Ichimura, Kiyoshi Yamamoto, Nobuhiko Kitawaki, Satoshi Nakamura
2003	Detection of list-type sentences. Taniya Mishra, Esther Klabbers, Jan P. H. van Santen
2003	Development of a bilingual spoken dialog system for weather information retrieval. Janez Zibert, Sanda Martincic-Ipsic, Melita Hajdinjak, Ivo Ipsic, France Mihelic
2003	Development of a stochastic dialog manager driven by semantics. Francisco Torres, Emilio Sanchis, Encarna Segarra
2003	Development of phrase translation systems for handheld computers: from concept to field. Horacio Franco, Jing Zheng, Kristin Precoda, Federico Cesari, Victor Abrash, Dimitra Vergyri, Anand Venkataraman, Harry Bratt, Colleen Richey, Ace Sarich
2003	Development of the Estonian SpeechDat-like database. Einar Meister, Jürgen Lasn, Lya Meister
2003	Dialog systems for automotive environments. Julie Baca, Feng Zheng, Hualin Gao, Joseph Picone
2003	Discriminative estimation of subspace precision and mean (SPAM) models. Vaibhava Goel, Scott Axelrod, Ramesh A. Gopinath, Peder A. Olsen, Karthik Visweswariah
2003	Discriminative methods for improving named entity extraction on speech data. James Horlock, Simon King
2003	Discriminative optimization of large vocabulary Mandarin conversational speech recognition system. Peng Ding, Zhenbiao Chen, Sheng Hu, Shuwu Zhang, Bo Xu
2003	Discriminative training and maximum likelihood detector for speaker identification. Mohamed Mihoubi, Gilles Boulianne, Pierre Dumouchel
2003	Discriminative training of n-gram classifiers for speech and text routing. Ciprian Chelba, Alex Acero
2003	Discriminative weight training for unit-selection based speech synthesis. Seung Seop Park, Chong Kyu Kim, Nam Soo Kim
2003	Disfluency under feedback and time-pressure. H. B. M. Nicholson, Ellen Gurman Bard, Anne H. Anderson, María L. Flecha-García, David Kenicer, Lucy Smallwood, Jim Mullin, Robin J. Lickley, Yiya Chen
2003	Distributed genetic algorithm to discover a wavelet packet best basis for speech recognition. Robert van Kommer, Béat Hirsbrunner
2003	Distributed speech recognition on the WSJ task. Jan Stadermann, Gerhard Rigoll
2003	Domain adaptation augmented by state-dependence in spoken dialog systems. Wei He, Honglian Li, Baozong Yuan
2003	Dominance spectrum based V/UV classification and f_0 estimation. Tomohiro Nakatani, Toshio Irino, Parham Zolfaghari
2003	Dual-mode wideband speech recovery from narrowband speech. Yasheng Qian, Peter Kabal
2003	Duration normalization and hypothesis combination for improved spontaneous speech recognition. Jon P. Nedel, Richard M. Stern
2003	Durational characteristics of Hindi stop consonants. K. Samudravijaya
2003	Dynamic channel compensation based on maximum a posteriori estimation. Huayun Zhang, Zhaobing Han, Bo Xu
2003	Earwitness line-ups: effects of speech duration, retention interval and acoustic environment on identification accuracy. Jose H. Kerstholt, E. J. M. Jansen, A. G. van Amelsvoort, A. P. A. Broeders
2003	Effect of foreign accent on speech recognition in the NATO N-4 corpus. Aaron D. Lawson, David M. Harris, John J. Grieco
2003	Effects of voice prosody by computers on human behaviors. Noriko Suzuki, Yohei Yabuta, Yugo Takeuchi, Yasuhiro Katagiri
2003	Efficient linear combination for distant n-gram models. David Langlois, Kamel Smaïli, Jean Paul Haton
2003	Efficient quantization of speech excitation parameters using temporal decomposition. Phu Chien Nguyen, Masato Akagi
2003	Efficient speech enhancement based on left-right HMM with state sequence detection using LRT. J. J. Lee, J. H. Lee, K. Y. Lee
2003	Efficient spoken dialogue control depending on the speech recognition rate and system's database. Kohji Dohsaka, Norihito Yasuda, Kiyoaki Aikawa
2003	Emotion control of Chinese speech synthesis in natural environment. Jianhua Tao
2003	Emotion recognition by speech signals. Oh-Wook Kwon, Kwokleung Chan, Jiucang Hao, Te-Won Lee
2003	Emotion recognition using a data-driven fuzzy inference system. Chul Min Lee, Shrikanth S. Narayanan
2003	Empowering end users to personalize dialogue systems through spoken interaction. Stephanie Seneff, Grace Chung, Chao Wang
2003	Energy contour extraction for in-car speech recognition. Tai-Hwei Hwang
2003	Enhance low-frequency suppression of GSC beamforming. Zhaorong Hou, Ying Jia
2003	Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition. Hua Yu, Tanja Schultz
2003	Enhancement of hearing-impaired Mandarin speech. Chen-Long Lee, Ya-ru Yang, Wen-Whei Chang, Yuan-Chuan Chiang
2003	Enhancement of noisy speech for noise robust front-end and speech reconstruction at back-end of DSR system. Hyoung-Gook Kim, Markus Schwab, Nicolas Moreau, Thomas Sikora
2003	Enhancement of speech in multispeaker environment. B. Yegnanarayana, S. R. Mahadeva Prasanna, Mathew Magimai-Doss
2003	Entropy constrained quantization of LSP parameters. Turaj Zakizadeh Shabestary, Per Hedelin, Fredrik Nordén
2003	Entropy-optimized channel error mitigation with application to speech recognition over wireless. Victoria E. Sánchez, Antonio M. Peinado, Angel M. Gomez, José L. Pérez-Córdoba
2003	Environment adaptation for robust speaker verification. Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung
2003	Environment adaptive control of noise reduction parameters for improved robustness of ASR. Chng Chin Soon, Bernt Andrassy, Josef G. Bauer, Günther Ruske
2003	Environmental sniffing: robust digit recognition for an in-vehicle environment. Murat Akbacak, John H. L. Hansen
2003	Environmental sound source identification based on hidden Markov model for robust speech recognition. Takanobu Nishiura, Satoshi Nakamura, Kazuhiro Miki, Kiyohiro Shikano
2003	Estimating Japanese word accent from syllable sequence using support vector machine. Hideharu Nakajima, Masaaki Nagata, Hisako Asano, Masanobu Abe
2003	Estimating speech recognition error rate without acoustic test data. Yonggang Deng, Milind Mahajan, Alex Acero
2003	Estimating the spectral envelope of voiced speech using multi-frame analysis. Yoshinori Shiga, Simon King
2003	Estimating the vocal-tract area function and the derivative of the glottal wave from a speech signal. Huiqun Deng, Michael P. Beddoes, Rabab Kreidieh Ward, Murray Hodgson
2003	Estimating the weight of evidence in forensic speaker verification. Beat Pfister, René Beutler
2003	Estimation of GMM in voice conversion including unaligned data. Helenca Duxans, Antonio Bonafonte
2003	Estimation of resonant characteristics based on AR-HMM modeling and spectral envelope conversion of vowel sounds. Nobuyuki Nishizawa, Keikichi Hirose, Nobuaki Minematsu
2003	Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis. Hans Kruschke, Michael Lenz
2003	Estimation of vocal noise in running speech by means of bi-directional double linear prediction. Frédéric Bettens, Francis Grenez, Jean Schoentgen
2003	Estimation of voice source and vocal tract characteristics based on multi-frame analysis. Yoshinori Shiga, Simon King
2003	Evaluating and correcting phoneme segmentation for unit selection synthesis. John Kominek, Christina L. Bennett, Alan W. Black
2003	Evaluating discourse understanding in spoken dialogue systems. Ryuichiro Higashinaka, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa
2003	Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task. Masahiko Matsushita, Hiromitsu Nishizaki, Takehito Utsuro, Yasuhiro Kodama, Seiichi Nakagawa
2003	Evaluating the effect of predicting oral reading miscues. Satanjeev Banerjee, Joseph E. Beck, Jack Mostow
2003	Evaluation frameworks for speech translation technologies. Marcello Federico
2003	Evaluation method for automatic speech summarization. Chiori Hori, Takaaki Hori, Sadaoki Furui
2003	Evaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus. Satoru Tsuge, Shingo Kuroiwa, Kenji Kita
2003	Evaluation of a speech-driven telephone information service using the PARADISE framework: a closer look at subjective measures. Paula M. T. Smeele, Juliette A. J. S. Waals
2003	Evaluation of an alert system for selective dissemination of broadcast news. Isabel Trancoso, João Paulo Neto, Hugo Meinedo, Rui Amaral
2003	Evaluation of model-based feature enhancement on the AURORA-4 task. Veronique Stouten, Hugo Van hamme, Jacques Duchateau, Patrick Wambacq
2003	Evaluation of quantile based histogram equalization with filter combination on the Aurora 3 and 4 databases. Florian Hilger, Hermann Ney
2003	Evaluation of the affect of speech intonation using a model of the perception of interval dissonance and harmonic tension. Norman D. Cook, Takeshi Fujisawa, Kazuaki Takami
2003	Evaluation of the stochastic morphosyntactic language model on a one million word Hungarian dictation task. Máté Szarvas, Sadaoki Furui
2003	Evaluation of units selection criteria in corpus-based speech synthesis. Hélène François, Olivier Boëffard
2003	Evaluation on the Aurora 2 database of acoustic models that are less noise-sensitive. Edmondo Trentin, Marco Matassoni, Marco Gori
2003	Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis. Francesc Alías, Xavier Llorà
2003	Example-based bi-directional Chinese-English machine translation with semi-automatically induced grammars. Kai-Chung Siu, Helen M. Meng, Chin-Chung Wong
2003	Experimental evaluation of the relevance of prosodic features in Spanish using machine learning techniques. David Escudero Mancebo, Valentín Cardeñoso-Payo, Antonio Bonafonte
2003	Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: effects of voice gender and signal quality. Catherine J. Stevens, Nicole Lees, Julie Vonwiller
2003	Exploiting order-preserving perfect hashing to speedup n-gram language model lookahead. Xiaolong Li, Yunxin Zhao
2003	Exploiting speech for recognizing elderly users to respond to their special needs. Christian A. Müller, Frank Wittig, Jörg Baus
2003	Exploiting time warping in AMR-NB and AMR-WB speech coders. Lasse Laaksonen, Sakari Himanen, Ari Heikkinen, Jani Nurminen
2003	Exploiting unlabeled utterances for spoken language understanding. Gökhan Tür, Dilek Hakkani-Tür
2003	Extracting an AV speech source from a mixture of signals. David Sodoyer, Laurent Girin, Christian Jutten, Jean-Luc Schwartz
2003	Extraction methods of voicing feature for robust speech recognition. András Zolnay, Ralf Schlüter, Hermann Ney
2003	FEM analysis based on 3-d time-varying vocal tract shape. Koji Sasaki, Nobuhiro Miki, Yoshikazu Miyanaga
2003	FLaVoR: a flexible architecture for LVCSR. Kris Demuynck, Tom Laureys, Dirk Van Compernolle, Hugo Van hamme
2003	F_0 estimation of one or several voices. Alain de Cheveigné, Alexis Baskind
2003	Factorial models and refiltering for speech separation and denoising. Sam T. Roweis
2003	Far-field ASR on inexpensive microphones. Laura Docío Fernández, David Gelbart, Nelson Morgan
2003	Fast incremental adaptation using maximum likelihood regression and stochastic gradient descent. Sreeram V. Balakrishnan
2003	Feature compensation scheme based on parallel combined mixture model. Wooil Kim, Sungjoo Ahn, Hanseok Ko
2003	Feature compensation technique for robust speech recognition in noisy environments. Young Joon Kim, Hyun Woo Kim, Woohyung Lim, Nam Soo Kim
2003	Feature generation based on maximum classification probability for improved speech recognition. Xiang Li, Richard M. Stern
2003	Feature selection for the classification of crosstalk in multi-channel audio. Stuart N. Wrigley, Guy J. Brown, Vincent Wan, Steve Renals
2003	Feature transformations and combinations for improving ASR performance. Panu Somervuo, Barry Y. Chen, Qifeng Zhu
2003	Features for tree based dialogue course management. Klaus Macherey, Hermann Ney
2003	Features of contracted syllables of spontaneous Mandarin. Shu-Chuan Tseng
2003	Fitting class-based language models into weighted finite-state transducer framework. Pavel Ircing, Josef Psutka
2003	Flexible speech act identification of spontaneous speech with disfluency. Chung-Hsien Wu, Gwo-Lang Yan
2003	Flooring the observation probability for robust ASR in impulsive noise. Pei Ding, Bertram E. Shi, Pascale Fung, Zhigang Cao
2003	French intonational rises and their role in speech segmentation. Pauline Welby
2003	Frequency distribution based weighted sub-band approach for classification of emotional/stressful content in speech. Mandar A. Rahurkar, John H. L. Hansen
2003	Frequency-related representation of speech. Kuldip K. Paliwal, Bishnu S. Atal
2003	From Switchboard to Fisher: telephone collection protocols, their uses and yields. Christopher Cieri, David Miller, Kevin Walker
2003	Fusing high- and low-level features for speaker recognition. Joseph P. Campbell, Douglas A. Reynolds, Robert B. Dunn
2003	GMM-based voice conversion applied to emotional speech synthesis. Hiromichi Kawanami, Yohei Iwami, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
2003	Gaussian dynamic warping (GDW) method applied to text-dependent speaker detection and verification. Jean-François Bonastre, Philippe Morin, Jean-Claude Junqua
2003	Generation and perception of f_0 markedness in conversational speech with adverbs expressing degrees. Takumi Yamashita, Yoshinori Sagisaka
2003	Generation of natural response timing using decision tree based on prosodic and linguistic information. Masashi Takeuchi, Norihide Kitaoka, Seiichi Nakagawa
2003	Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation. Huayun Zhang, Bo Xu
2003	Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis. Parham Zolfaghari, Tomohiro Nakatani, Toshio Irino, Hideki Kawahara, Fumitada Itakura
2003	Glottal spectrum based inverse filtering. Ixone Arroabarren, Alfonso Carlosena
2003	Grapheme based speech recognition. Mirjam Killer, Sebastian Stüker, Tanja Schultz
2003	Grapheme to phoneme conversion and dictionary verification using graphonemes. Paul Vozila, Jeff Adams, Yuliya Lobacheva, Ryan Thomas
2003	HARTFEX: a multi-dimensional system of HMM based recognisers for articulatory features extraction. Tarek Abu-Amer, Julie Carson-Berndsen
2003	Harmonic alternatives to sine-wave speech. László Tóth, András Kocsor
2003	Harmonic weighting for all-pole modeling of the voiced speech. Davor Petrinovic
2003	Hidden feature models for speech recognition using dynamic Bayesian networks. Karen Livescu, James R. Glass, Jeff A. Bilmes
2003	Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition. Imed Zitouni, Olivier Siohan, Chin-Hui Lee
2003	Hierarchical topic classification for dialog speech recognition based on language model switching. Ian R. Lane, Tatsuya Kawahara, Tomoko Matsui, Satoshi Nakamura
2003	High-likelihood model based on reliability statistics for robust combination of features: application to noisy speech recognition. Peter Jancovic, Münevver Köküer, Fionn Murtagh
2003	How NLP techniques can improve speech understanding: ROMUS - a robust chunk based message understanding system using link grammars. Jérôme Goulian, Jean-Yves Antoine, Franck Poirier
2003	How does human segment the speech by prosody? Toshie Hatano, Yasuo Horiuchi, Akira Ichikawa
2003	Hybrid HMM/BN ASR system integrating spectrum and articulatory features. Konstantin Markov, Jianwu Dang, Yosuke Iizuka, Satoshi Nakamura
2003	ISCA special session: hot topics in speech synthesis. Gérard Bailly, Nick Campbell, Bernd Möbius
2003	Identifying speakers in children's stories for speech synthesis. Jason Y. Zhang, Alan W. Black, Richard Sproat
2003	Illusory continuity of intermittent pure tone in binaural listening and its dependency on interaural time difference. Mamoru Iwaki, Norio Nakamura
2003	Impact of audio segmentation and segment clustering on automated transcription accuracy of large spoken archives. Bhuvana Ramabhadran, Jing Huang, Upendra V. Chaudhari, Giridharan Iyengar, Harriet J. Nock
2003	Impact of word graph density on the quality of posterior probability based confidence measures. Tibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae
2003	Implementation and evaluation of a text-to-speech synthesis system for Turkish. Özgül Salor, Bryan L. Pellom, Mübeccel Demirekler
2003	Implementing an SSML compliant concatenative TTS system. Andrew P. Breen, Steve Minnis, Barry Eggleton
2003	Improved Chinese broadcast news transcription by language modeling with temporally consistent training corpora and iterative phrase extraction. Pi-Chuan Chang, Shuo-Peng Liao, Lin-Shan Lee
2003	Improved emotion recognition with large set of statistical features. Vladimir Hozjan, Zdravko Kacic
2003	Improved feature extraction based on spectral noise reduction and nonlinear feature normalization. José C. Segura, Javier Ramírez, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio
2003	Improved Kalman filter-based speech enhancement. Jianqiang Wei, Limin Du, Zhaoli Yan, Hui Zeng
2003	Improved name recognition with user modeling. Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero
2003	Improved robustness of automatic speech recognition using a new class definition in linear discriminant analysis. Martin Schafföner, Marcel Katz, Sven E. Krüger, Andreas Wendemuth
2003	Improved speaker verification through probabilistic subspace adaptation. Simon Lucey, Tsuhan Chen
2003	Improvement of non-native speech recognition by effectively modeling frequently observed pronunciation habits. Nobuaki Minematsu, Koichi Osaki, Keikichi Hirose
2003	Improving "how may i help you?" systems using the output of recognition lattices. James Allen, David Attwater, Peter J. Durston, Mark Farrell
2003	Improving a connectionist based syntactical language model. Ahmad Emami
2003	Improving speech intelligibility by steady-state suppression as pre-processing in small to medium sized halls. Nao Hodoshima, Takayuki Arai, Tsuyoshi Inoue, Keisuke Kinoshita, Akiko Kusumoto
2003	Improving statistical natural concept generation in interlingua-based speech-to-speech translation. Liang Gu, Yuqing Gao, Michael Picheny
2003	Improving the accuracy of pronunciation prediction for unit selection TTS. Justin Fackrell, Wojciech Skut, Kathrine Hammervold
2003	Improving the competitiveness of discriminant neural networks in speaker verification. Carlos Vivaracho-Pascual, Javier Ortega-Garcia, Luis Alonso Romero, Q. Isaac Moro-Sancho
2003	Improving the efficiency of automatic speech recognition by feature transformation and dimensionality reduction. Xuechuan Wang, Douglas D. O'Shaughnessy
2003	In search of target class definition in tandem feature extraction. Sunil Sivadas, Hynek Hermansky
2003	Incremental and iterative monolingual clustering algorithms. Sergio Barrachina, Juan Miguel Vilar
2003	Incremental learning of new user formulations in automatic directory assistance. Marco Andorno, Luciano Fissore, Pietro Laface, Mario Nigra, Cosmin Popovici, Franco Ravera, Claudio Vair
2003	Independent automatic segmentation by self-learning categorial pronunciation rules. Nicole Beringer
2003	Influence of recording equipment on the identification of second language phoneme contrasts. Hiroaki Kato, Masumi Nukinay, Hideki Kawahara, Reiko Akahane-Yamada
2003	Influence of the waveguide propagation on the antenna performance in a car cabin. Leonid G. Krasny, Ali S. Khayrallah
2003	Information retrieval based call classification. Jan Kneissler, Anne K. Kienappel, Dietrich Klakow
2003	Information structure and efficiency in speech production. R. J. J. H. van Son, Louis C. W. Pols
2003	Inhibitory priming effect in auditory word recognition: the role of the phonological mismatch length between primes and targets. Sophie Dufour, Ronald Peereman
2003	Inline updates for HMMs. Ashutosh Garg, Manfred K. Warmuth
2003	Integrated pitch and MFCC extraction for speech reconstruction and speech recognition applications. Xu Shao, Ben P. Milner, Stephen J. Cox
2003	Integrating multilingual articulatory features into speech recognition. Sebastian Stüker, Florian Metze, Tanja Schultz, Alex Waibel
2003	Integrating statistical and rule-based knowledge for continuous German speech recognition. René Beutler, Beat Pfister
2003	Integration of noise reduction algorithms for Aurora2 task. Takeshi Yamada, Jiro Okada, Kazuya Takeda, Norihide Kitaoka, Masakiyo Fujimoto, Shingo Kuroiwa, Kazumasa Yamamoto, Takanobu Nishiura, Mitsunori Mizumachi, Satoshi Nakamura
2003	Integration of speaker recognition into conversational spoken dialogue systems. Timothy J. Hazen, Douglas A. Jones, Alex Park, Linda C. Kukolich, Douglas A. Reynolds
2003	Introduction of the CELP structure of the GSM coder in the acoustic echo canceller for the GSM network. H. Gnaba, Monia Turki-Hadj Alouane, Meriem Jaïdane-Saïdane, Pascal Scalart
2003	Investigation of emotionally morphed speech perception and its structure using a high quality speech manipulation system. Hisami Matsui, Hideki Kawahara
2003	Is there an emotion signature in intonational patterns? and can it be used in synthesis? Tanja Bänziger, Michel Morel, Klaus R. Scherer
2003	Isolated word verification using cohort word-level verification. Kishan Thambiratnam, Sridha Sridharan
2003	Jacobian adaptation based on the frequency-filtered spectral energies. Alberto Abad, Climent Nadeu, Javier Hernando, Jaume Padrell
2003	Japanese prosodic labeling support system utilizing linguistic information. Shinya Kiriyama, Yoshifumi Mitsuta, Yuta Hosokawa, Yoshikazu Hashimoto, Toshihiko Itoh, Shigeyoshi Kitazawa
2003	Jaspis^2 - an architecture for supporting distributed spoken dialogues. Markku Turunen, Jaakko Hakulinen
2003	Joint estimation of thresholds in a bi-threshold verification problem. Simon Ka-Lung Ho, Brian Mak
2003	Joint model and feature based compensation for robust speech recognition under non-stationary noise environments. Chuan Jia, Peng Ding, Bo Xu
2003	Kalman-filter based join cost for unit-selection speech synthesis. Jithendra Vepa, Simon King
2003	Keeping rare events rare. Ove Andersen, Charles Hoequist
2003	LET's GO: improving spoken dialog systems for the elderly and non-natives. Antoine Raux, Brian Langner, Alan W. Black, Maxine Eskénazi
2003	LUCIA a new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model. Piero Cosi, Andrea Fusaro, Graziano Tisato
2003	Language identification using parallel sub-word recognition - an ergodic HMM equivalence. V. Ramasubramanian, A. K. V. Sai Jayram, T. V. Sreenivas
2003	Language model accuracy and uncertainty in noise cancelling in the stochastic weighted Viterbi algorithm. Néstor Becerra Yoma, Ivan Brito, Jorge F. Silva
2003	Language model adaptation using cross-lingual information. Woosung Kim, Sanjeev Khudanpur
2003	Language model adaptation using word clustering. Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
2003	Language-adaptive Persian speech recognition. Naveen Srinivasamurthy, Shrikanth S. Narayanan
2003	Language-reconfigurable universal phone recognition. Brenton D. Walker, Bradley C. Lackey, Jennifer S. Muller, Patrick John Schone
2003	Large corpus experiments for broadcast news recognition. Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua
2003	Large lexica for speech-to-speech translation: from specification to creation. Elviira Hartikainen, Giulio Maltese, Asunción Moreno, Shaunie Shammass, Ute Ziegenhain
2003	Large margin methods for label sequence learning. Yasemin Altun, Thomas Hofmann
2003	Large vocabulary ASR for spontaneous Czech in the MALACH project. Josef Psutka, Pavel Ircing, Josef V. Psutka, Vlasta Radová, William J. Byrne, Jan Hajic, Jirí Mírovský, Samuel Gustman
2003	Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system. Vassilios Digalakis, Dimitris Oikonomidis, Dimitris Pratsolis, Nikos Tsourakis, Christos Vosnidis, Nikos Chatzichrisafis, Vassilios Diakoloukas
2003	Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices. Scott Axelrod, Vaibhava Goel, Brian Kingsbury, Karthik Visweswariah, Ramesh A. Gopinath
2003	Large vocabulary noise robustness on Aurora4. Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua
2003	Large vocabulary speaker independent isolated word recognition for embedded systems. Sergey Astrov, Bernt Andrassy
2003	Large vocabulary Taiwanese (Min-Nan) speech recognition using tone features and statistical pronunciation modeling. Dau-Cheng Lyu, Min-Siong Liang, Yuang-Chin Chiang, Chun-Nan Hsu, Ren-yuan Lyu
2003	Latent ability to manipulate phonemes by Japanese preliterates in Roman alphabet. Takashi Otake, Yoko Sakamoto
2003	Lattice segmentation and minimum Bayes risk discriminative training. Vlasios Doumpiotis, Stavros Tsakalidis, William J. Byrne
2003	Learning Chinese tones. Valery A. Petrushin
2003	Learning discriminative temporal patterns in speech: development of novel TRAPS-like classifiers. Barry Y. Chen, Shuangyu Chang, Sunil Sivadas
2003	Learning intra-speaker model parameter correlations from many short speaker segments. Anne K. Kienappel
2003	Learning linguistically valid pronunciations from acoustic data. Françoise Beaufays, Ananth Sankar, Shaun Williams, Mitch Weintraub
2003	Learning phrase break detection in Thai text-to-speech. Virongrong Tesprasit, Paisarn Charoenpornsawat, Virach Sornlertlamvanich
2003	Learning rule ranking by dynamic construction of context-free grammars using AND/OR graphs. Anna Corazza, Louis ten Bosch
2003	Learning to boost GMM based speaker verification. Stan Z. Li, Dong Zhang, Chengyuan Ma, Heung-Yeung Shum, Eric Chang
2003	Lexica and corpora for speech-to-speech translation: a trilingual approach. David Conejero, Jesús Giménez, Victoria Arranz, Antonio Bonafonte, Neus Pascual, Núria Castell, Asunción Moreno
2003	Likelihood ratio test with complex Laplacian model for voice activity detection. Joon-Hyuk Chang, Jong Won Shin, Nam Soo Kim
2003	Linear predictive method with low-frequency emphasis. Paavo Alku, Tom Bäckström
2003	Live speech recognition in sports games by adaptation of acoustic model and language model. Yasuo Ariki, Takeru Shigemori, Tsuyoshi Kaneko, Jun Ogata, Masakiyo Fujimoto
2003	Local averaging and differentiating of spectral plane for TRAP-based ASR. Frantisek Grézl, Hynek Hermansky
2003	Local regularity analysis at glottal opening and closure instants in electroglottogram signal using wavelet transform modulus maxima. Aïcha Bouzid, Noureddine Ellouze
2003	Localized spectro-temporal features for automatic speech recognition. Michael Kleinschmidt
2003	Locally recurrent probabilistic neural network for text-independent speaker verification. Todor Ganchev, Dimitris K. Tasoulis, Michael N. Vrahatis, Nikos Fakotakis
2003	Locus equations determination using the SpeechDat(II). Bojan Petek
2003	Low complexity joint optimization of excitation parameters in analysis-by-synthesis speech coding. Udar Mittal, James P. Ashley, Edgardo M. Cruz-Zeno
2003	Low memory acoustic models for HMM based speech recognition. Tommi Lahti, Olli Viikki, Marcel Vasilache
2003	Low resource lip finding and tracking algorithm for embedded devices. Jesus F. Guitarte Perez, Klaus Lukas, Alejandro F. Frangi
2003	Low-latency incremental speech transcription in the Synface project. Alexander Seward
2003	MMI-MAP and MPE-MAP for acoustic model adaptation. Daniel Povey, Mark J. F. Gales, Do Yeong Kim, Philip C. Woodland
2003	Mandarin speech prosody: issues, pitfalls and directions. Chiu-yu Tseng
2003	Markov chain Monte Carlo methods for noise robust feature extraction using the autoregressive model. Robert W. Morris, Jon A. Arrowood, Mark A. Clements
2003	Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs. Wu Chou, Xiaodong He
2003	Maximum conditional mutual information projection for speech recognition. Mohamed Kamal Omar, Mark Hasegawa-Johnson
2003	Maximum entropy Good-Turing estimator for language modeling. Juan P. Piantanida, Claudio Estienne
2003	Maximum likelihood endpoint detection with time-domain features. Marco Orlandi, Alfiero Santarelli, Daniele Falavigna
2003	Maximum likelihood normalization for robust speech recognition. Yiu-Pong Lai, Man-Hung Siu
2003	Maximum likelihood sub-band weighting for robust speech recognition. Donglai Zhu, Satoshi Nakamura, Kuldip K. Paliwal, Ren-Hua Wang
2003	Measuring the readability of automatic speech-to-text transcripts. Douglas A. Jones, Florian Wolf, Edward Gibson, Elliott Williams, Evelina Fedorenko, Douglas A. Reynolds, Marc A. Zissman
2003	Methods for estimation of glottal pulses waveforms exciting voiced speech. Milan Bostik, Milan Sigmund
2003	Methods to improve its portability of a spoken dialog system both on task domains and languages. YunBiao Xu, Fengying Di, Masahiro Araki, Yasuhisa Niimi
2003	Microphone array voice activity detection and noise suppression using wideband generalized likelihood ratio. Ilyas Potamitis, Eran Fishler
2003	Minimum classification error (MCE) model adaptation of continuous density HMMs. Xiaodong He, Wu Chou
2003	Minimum variance distortionless response on a warped frequency scale. Matthias Wölfel, John W. McDonough, Alex Waibel
2003	Mis-recognized utterance detection using multiple language models generated by clustered sentences. Katsuhisa Fujinaga, Hiroaki Kokubo, Hirofumi Yamamoto, Gen-ichiro Kikui, Hiroshi Shimodaira
2003	Missing feature theory applied to robust speech recognition over IP network. Toshiki Endo, Shingo Kuroiwa, Satoshi Nakamura
2003	Mixed physical modeling techniques applied to speech production. Matti Karjalainen
2003	Mixed-lingual spoken word recognition by using VQ codebook sequences of variable length segments. Hiroaki Kojima, Kazuyo Tanaka
2003	Mixed-lingual text analysis for polyglot TTS synthesis. Beat Pfister, Harald Romsdorfer
2003	Model based noisy speech recognition with environment parameters estimated by noise adaptive speech recognition with prior. Kaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura
2003	Model compression for GMM based speaker recognition systems. Douglas A. Reynolds
2003	Model-integration rapid training based on maximum likelihood for speech recognition. Shinichi Yoshizawa, Kiyohiro Shikano
2003	Modeling Cantonese pronunciation variation by acoustic model refinement. Patgi Kam, Tan Lee, Frank K. Soong
2003	Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition. Kyong-Nim Lee, Minhwa Chung
2003	Modeling duration patterns for speaker recognition. Luciana Ferrer, Harry Bratt, Venkata Ramana Rao Gadde, Sachin S. Kajarekar, Elizabeth Shriberg, M. Kemal Sönmez, Andreas Stolcke, Anand Venkataraman
2003	Modeling linguistic features in speech recognition. Min Tang, Stephanie Seneff, Victor W. Zue
2003	Modeling of various speaking styles and emotions for HMM-based speech synthesis. Junichi Yamagishi, Koji Onishi, Takashi Masuko, Takao Kobayashi
2003	Modeling speaking rate for voice fonts. Ashish Verma, Arun Kumar
2003	Modelling human speech recognition using automatic speech recognition paradigms in SpeM. Odette Scharenborg, James M. McQueen, Louis ten Bosch, Dennis Norris
2003	Modulation spectral filtering of speech. Les E. Atlas
2003	Modulation spectrum for pitch and speech pause detection. Olaf Schreiner
2003	Morpheme-based lexical modeling for Korean broadcast news transcription. Young-Hee Park, Dong-Hoon Ahn, Minhwa Chung
2003	Morphological filtering of speech spectrograms in the context of additive noise. Francisco Romero Rodriguez, Wei Ming Liu, Nicholas W. D. Evans, John S. D. Mason
2003	Multi-array fusion for beamforming and localization of moving speakers. Ilyas Potamitis, George Tremoulis, Nikos Fakotakis, George Kokkinakis
2003	Multi-channel sentence classification for spoken dialogue language modeling. Frédéric Béchet, Giuseppe Riccardi, Dilek Z. Hakkani-Tür
2003	Multi-class extractive voicemail summarization. Konstantinos Koumpis, Steve Renals
2003	Multi-mode matrix quantizer for low bit rate LSF quantization. Ulpu Sinervo, Jani Nurminen, Ari Heikkinen, Jukka Saarinen
2003	Multi-mode quantization of adjacent speech parameters using a low-complexity prediction scheme. Jani Nurminen
2003	Multi-rate extension of the scalable to lossless PSPIHT audio coder. Mohammed Raad, Ian S. Burnett, Alfred Mertins
2003	Multi-referenced correction of the voice timbre distortions in telephone networks. Gaël Mahé, André Gilloire
2003	Multi-resolution auditory scene analysis: robust speech recognition using pattern-matching from a noisy signal. Sue Harding, Georg F. Meyer
2003	Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval. Wai Kit Lo, Yuk-Chi Li, Gina-Anne Levow, Hsin-Min Wang, Helen M. Meng
2003	Multi-source training and adaptation for generic speech recognition. Fabrice Lefèvre, Jean-Luc Gauvain, Lori Lamel
2003	Multi-speaker DOA tracking using interactive multiple models and probabilistic data association. Ilyas Potamitis, George Tremoulis, Nikos Fakotakis
2003	Multigram-based grapheme-to-phoneme conversion for LVCSR. Maximilian Bisani, Hermann Ney
2003	Multilayered extensions to the speech synthesis markup language for describing expressiveness. Ellen Eide, Raimo Bakis, Wael Hamza, John F. Pitrelli
2003	Multilingual acoustic modeling using graphemes. Stephan Kanthak, Hermann Ney
2003	Multilingual phone clustering for recognition of spontaneous Indonesian speech utilising pronunciation modelling techniques. Eddie Wong, Terrence Martin, Torbjørn Svendsen, Sridha Sridharan
2003	Multimodal interaction on PDA's integrating speech and pen inputs. Sorin Dusan, Gregory J. Gadbois, James L. Flanagan
2003	Multimodality and speech technology: verbal and non-verbal communication in talking agents. Björn Granström, David House
2003	Multitask learning in connectionist robust ASR using recurrent neural networks. Shahla Parveen, Phil D. Green
2003	My voice, your prosody: sharing a speaker specific prosody model across speakers in unit selection TTS. Matthew P. Aylett, Justin Fackrell, Peter Rutten
2003	NIST 2003 language recognition evaluation. Alvin F. Martin, Mark A. Przybocki
2003	Named entity extraction from Japanese broadcast news. Akio Kobayashi, Franz Josef Och, Hermann Ney
2003	Named entity extraction from word lattices. James Horlock, Simon King
2003	Natural language response generation in mixed-initiative dialogs using task goals and dialog acts. Helen M. Meng, Wing Lin Yip, Oi Yan Mok, Shuk Fong Chan
2003	Nearest-neighbor search algorithms based on subcodebook selection and its application to speech recognition. José A. R. Fonollosa
2003	Neural networks versus codebooks in an application for bandwidth extension of speech signals. Bernd Iser, Gerhard Schmidt
2003	New MAP estimators for speaker recognition. Patrick Kenny, Mohamed Mihoubi, Pierre Dumouchel
2003	New model-based HMM distances with applications to run-time ASR error estimation and model tuning. Chao-Shih Huang, Chin-Hui Lee, Hsiao-Chuan Wang
2003	Noise reduction using paired-microphones on non-equally-spaced microphone arrangement. Mitsunori Mizumachi, Satoshi Nakamura
2003	Noise robust digit recognition with missing frames. Cenk Demiroglu, David V. Anderson
2003	Noise robust speech parameterization based on joint wavelet packet decomposition and autoregressive modeling. Bojan Kotnik, Zdravko Kacic, Bogomir Horvat
2003	Noise robustness in speech to speech translation. Fu-Hua Liu, Yuqing Gao, Liang Gu, Michael Picheny
2003	Noise-robust ASR by using distinctive phonetic features approximated with logarithmic normal distribution of HMM. Takashi Fukuda, Tsuneo Nitta
2003	Noise-robust automatic speech recognition using orthogonalized distinctive phonetic feature vectors. Takashi Fukuda, Tsuneo Nitta
2003	Non-audible murmur recognition. Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell
2003	Non-intrusive assessment of perceptual speech quality using a self-organising map. Dorel Picovici, Abdulhussain E. Mahdi
2003	Non-linear compression of feature vectors using transform coding and non-uniform bit allocation. Ben P. Milner
2003	Non-linear maximum likelihood feature transformation for speech recognition. Mohamed Kamal Omar, Mark Hasegawa-Johnson
2003	Non-native spontaneous speech recognition through polyphone decision tree specialization. Zhirong Wang, Tanja Schultz
2003	Nonlinear analysis of speech signals: generalized dimensions and Lyapunov exponents. Vassilis Pitsikalis, Iasonas Kokkinos, Petros Maragos
2003	Normalization of time-derivative parameters using histogram equalization. Yasunari Obuchi, Richard M. Stern
2003	Novel approaches for one- and two-speaker detection. Sachin S. Kajarekar, André Gustavo Adami, Hynek Hermansky
2003	On cohort selection for speaker verification. Yaniv Zigel, Arnon Cohen
2003	On divergence based clustering of normal distributions and its application to HMM adaptation. Tor André Myrvoll, Frank K. Soong
2003	On factorizing spectral dynamics for robust speech recognition. Vivek Tyagi, Iain McCowan, Hervé Bourlard, Hemant Misra
2003	On lexicon creation for Turkish LVCSR. Kadri Hacioglu, Bryan L. Pellom, Tolga Çiloglu, Özlem Öztürk, Mikko Kurimo, Mathias Creutz
2003	On the advantage of frequency-filtering features for speech recognition with variable sampling frequencies. Experiments with SpeechDat-Car databases. Hermann Bauerecker, Climent Nadeu, Jaume Padrell
2003	On the amount of speech data necessary for successful speaker identification. Ales Padrta, Vlasta Radová
2003	On the combination of speech and speaker recognition. Mohamed Faouzi BenZeghiba, Hervé Bourlard
2003	On the design of cost functions for unit-selection speech synthesis. Francisco Campillo Díaz, Eduardo Rodríguez Banga
2003	On the fusion of dissimilarity-based classifiers for speaker identification. Tomi Kinnunen, Ville Hautamäki, Pasi Fränti
2003	On the limits of cluster-based acoustic modeling. S. Douglas Peters
2003	On the number of Gaussian components in a mixture: an application to speaker verification tasks. Mijail Arcienega, Andrzej Drygajlo
2003	On the role of intonation in the organization of Mandarin Chinese speech prosody. Chiu-yu Tseng
2003	On the use of kernel PCA for feature extraction in speech recognition. Amaro A. de Lima, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
2003	On unit analysis for Cantonese corpus-based TTS. Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li
2003	On-line parametric histogram equalization techniques for noise robust embedded speech recognition. Hemmo Haverinen, Imre Kiss
2003	On-line user modelling in a mobile spoken dialogue system. Niels Ole Bernsen
2003	Optimality criteria in inverse problems for tongue-jaw interaction. Alexander S. Leonov, Victor N. Sorokin
2003	Optimization of the CELP model in the LSP domain. Khosrow Lashkari, Toshio Miki
2003	Optimization of window and LSF interpolation factor for the ITU-T G.729 speech coding standard. Wai C. Chu, Toshio Miki
2003	Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations. Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki
2003	Orientel: recording telephone speech of Turkish speakers in Germany. Christoph Draxler
2003	Overlapped di-tone modeling for tone recognition in continuous Cantonese speech. Yao Qian, Tan Lee, Yujia Li
2003	PPRLM optimization for language identification in air traffic control tasks. Ricardo de Córdoba, G. Prime, Javier Macías Guarasa, Juan Manuel Montero, Javier Ferreiros, José Manuel Pardo
2003	Parametric multi-band automatic gain control for noisy speech enhancement. Mikhail Stolbov, Serguei Koval, Mikhail Khitrov
2003	Parsing spontaneous speech. Rodolfo Delmonte
2003	Perceiving emotions by ear and by eye. Béatrice de Gelder
2003	Perception of English lexical stress by English and Japanese speakers: effect of duration and "realistic" intensity change. Shinichi Tokuma
2003	Perception of voice-individuality for distortions of resonance/source characteristics and waveforms. Hisao Kuwabara
2003	Perceptual MVDR-based cepstral coefficients (PMCCs) for high accuracy speech recognition. Umit H. Yapanel, Satya Dharanipragada, John H. L. Hansen
2003	Perceptual based speech enhancement for normal-hearing and hearing-impaired individuals. Ajay Natarajan, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz
2003	Perceptual irrelevancy removal in narrowband speech coding. Marja Lahdekorpi, Jani Nurminen, Ari Heikkinen, Jukka Saarinen
2003	Perceptual wavelet adaptive denoising of speech. Qiang Fu, Eric A. Wan
2003	Perceptually weighted linear transformations for voice conversion. Hui Ye, Steve J. Young
2003	Perceptually-constrained generalized singular value decomposition-based approach for enhancing speech corrupted by colored noise. Gwo-hwa Ju, Lin-Shan Lee
2003	Perceptually-related acoustic-prosodic features of phrase finals in spontaneous speech. Carlos Toshinori Ishi, Parham Mokhtari, Nick Campbell
2003	Performance evaluation of IFAS-based fundamental frequency estimator in noisy environment. Dhany Arifianto, Takao Kobayashi
2003	Performance evaluation of phonotactic and contextual onset-rhyme models for speech recognition of Thai language. Somchai Jitapunkul, Ekkarit Maneenoi, Visarut Ahkuputra, Sudaporn Luksaneeyanawin
2003	Performance improvement of rapid speaker adaptation based on eigenvoice and bias compensation. Jong Se Park, Hwa Jeon Song, Hyung Soon Kim
2003	Person authentication by voice: a need for caution. Jean-François Bonastre, Frédéric Bimbot, Louis-Jean Boë, Joseph P. Campbell, Douglas A. Reynolds, Ivan Magrin-Chagnolleau
2003	Phonetic class-based speaker verification. Matthieu Hébert, Larry P. Heck
2003	Physical and perceptual configurations of Japanese fricatives from multidimensional scaling analyses. Won Tokuma
2003	Pitch estimation using phase locked loops. Patricia A. Pelle, Matias L. Capeletto
2003	Polar quantization of sinusoids from speech signal blocks. Harald Pobloth, Renat Vafin, W. Bastiaan Kleijn
2003	Potential audiovisual correlates of contrastive focus in French. Marion Dohen, Hélène Loevenbruck, Marie-Agnès Cathiard, Jean-Luc Schwartz
2003	Predicting the perceptive judgment of voices in a telecom context: selection of acoustic parameters. Thibaut Ehrette, Noël Chateau, Christophe d'Alessandro, Valérie Maffiolo
2003	Prediction of Fujisaki model's phrase commands. João Paulo Ramos Teixeira, Diamantino Freitas, Hiroya Fujisaki
2003	Prediction of sentence importance for speech summarization using prosodic parameters. Akira Inoue, Takayoshi Mikami, Yoichi Yamashita
2003	Predictive hidden Markov model selection for decision tree state tying. Jen-Tzung Chien, Sadaoki Furui
2003	Preference, perception, and task completion of open, menu-based, and directed prompts for call routing: a case study. Jason D. Williams, Andrew T. Shaw, Lawrence Piano, Michael Abt
2003	Probability models of formant parameters for voice conversion. Dimitrios Rentzos, Saeed Vaseghi, Qin Yan, Ching-Hsiang Ho, Emir Turajlic
2003	Product of Gaussians as a distributed representation for speech recognition. S. S. Airey, Mark J. F. Gales
2003	Prosodic analysis and modeling of the NAGAUTA singing to synthesize its prosodic patterns from the standard notation. Nobuaki Minematsu, Bungo Matsuoka, Keikichi Hirose
2003	Prosodic correlates of contrastive and non-contrastive themes in German. Bettina Braun, D. Robert Ladd
2003	Prosodic cues for emotion characterization in real-life spoken dialogs. Laurence Devillers, Ioana Vasilescu
2003	Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries. Ken Chen, Sarah Borys, Mark Hasegawa-Johnson, Jennifer Cole
2003	Prosody-based classification of emotions in spoken Finnish. Tapio Seppänen, Eero Väyrynen, Juhani Toivanen
2003	Pruning transitions in a hidden Markov model with optimal brain surgeon. Brian Kan-Wing Mak, Kin-Wah Chan
2003	Quality control of language resources at ELRA. Henk van den Heuvel, Khalid Choukri, Harald Höge, Bente Maegaard, Jan Odijk, Valérie Mapelli
2003	Quality enhancement of CELP coded speech by using an MFCC based Gaussian mixture model. D. G. Raza, C. F. Chan
2003	Quality-complexity trade-off in predictive LSF quantization. Davorka Petrinovic, Davor Petrinovic
2003	Quantifying the impact of system characteristics on perceived quality dimensions of a spoken dialogue service. Sebastian Möller, Janto Skowronek
2003	Quantitative analysis and synthesis of syllabic tones in Vietnamese. Hansjörg Mixdorff, Nguyen Hung Bach, Hiroya Fujisaki, Chi Mai Luong
2003	Quantity comparison of Japanese and Finnish in various word structures. Toshiko Isei-Jaakkola
2003	Ravenclaw: dialog management using hierarchical task decomposition and an expectation agenda. Dan Bohus, Alexander I. Rudnicky
2003	Reaction time as an indicator of discrete intonational contrasts in English. Aoju Chen
2003	Read my tongue movements: bimodal learning to perceive and produce non-native speech /r/ and /l/. Dominic W. Massaro, Joanna Light
2003	Recent enhancements in CU VOCAL for Chinese TTS-enabled applications. Helen M. Meng, Yuk-Chi Li, Tien Ying Fung, Man Cheuk Ho, Chi-Kin Keung, Tin Hang Lo, Wai Kit Lo, P. C. Ching
2003	Recent progress in the decoding of non-native speech with multilingual acoustic models. Volker Fischer, Eric Janke, Siegfried Kunzmann
2003	Recognising 'real-life' speech with SpeM: a speech-based computational model of human speech recognition. Odette Scharenborg, Louis ten Bosch, Lou Boves
2003	Recognition of emotions in interactive voice response systems. Sherif M. Yacoub, Steven J. Simske, Xiaofan Lin, John Burns
2003	Recognition of intonation patterns in Thai utterance. Patavee Charnvivit, Nuttakorn Thubthong, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin, Somchai Jitapunkul
2003	Recognition of out-of-vocabulary words with sub-lexical language models. Lucian Galescu
2003	Recognition of phoneme strings using TRAP technique. Petr Schwarz, Pavel Matejka, Jan Cernocký
2003	Reduction of dimension of HMM parameters using ICA and PCA in MLLR framework for speaker adaptation. Jiun Kim, Jaeho Chung
2003	Reproducing laryngeal mechanisms with a two-mass model. Denisse Sciamarella, Christophe d'Alessandro
2003	Residual echo power estimation for speech reinforcement systems in vehicles. Alfonso Ortega, Eduardo Lleida, Enrique Masgrau
2003	Restricted unlimited domain synthesis. Antje Schweitzer, Norbert Braunschweiler, Tanja Klankert, Bernd Möbius, Bettina Säuberlich
2003	Resynthesis of 3D tongue movements from facial data. Olov Engwall, Jonas Beskow
2003	Revisiting scenarios and methods for variable frame rate analysis in automatic speech recognition. Javier Macías Guarasa, J. Ordóñez, Juan Manuel Montero, Javier Ferreiros, Ricardo de Córdoba, Luis Fernando D'Haro
2003	Roadmaps, journeys and destinations: speculations on the future of speech technology research. Ronald A. Cole
2003	Robust energy demodulation based on continuous models with application to speech recognition. Dimitrios Dimitriadis, Petros Maragos
2003	Robust feature extraction and acoustic modeling at Multitel: experiments on the Aurora databases. Stéphane Dupont, Christophe Ris
2003	Robust jointly optimized multistage vector quantization for speech coding. Venkatesh Krishnan, David V. Anderson
2003	Robust likelihood ratio estimation in Bayesian forensic speaker recognition. Joaquin Gonzalez-Rodriguez, Daniel Garcia-Romero, Marta Garcia-Gomar, Daniel Ramos, Javier Ortega-Garcia
2003	Robust methods in automatic speech recognition and understanding. Sadaoki Furui
2003	Robust multi-class boosting. Gunnar Rätsch
2003	Robust multiple resolution analysis for automatic speech recognition. Roberto Gemello, Franco Mana, Dario Albesano, Renato De Mori
2003	Robust parsing of utterances in negotiative dialogue. Johan Boye, Mats Wirén
2003	Robust speaker identification using posterior union models. Ji Ming, Darryl Stewart, Philip Hanna, Pat Corr, Francis Jack Smith, Saeed Vaseghi
2003	Robust speech interaction in a mobile environment through the use of multiple and different media input types. Rainer Wasinger, Christoph Stahl, Antonio Krüger
2003	Robust speech recognition to non-stationary noise based on model-driven approaches. Christophe Cerisara, Irina Illina
2003	Robust speech recognition using missing feature theory in the cepstral or LDA domain. Hugo Van hamme
2003	Robust speech recognition using model-based feature enhancement. Veronique Stouten, Hugo Van hamme, Kris Demuynck, Patrick Wambacq
2003	Robust speech recognition using non-linear spectral smoothing. Michael J. Carey
2003	Robust speech understanding based on expected discourse plan. Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta
2003	Robust techniques for pre- and post-surgical voice analysis. Claudia Manfredi, Giorgio Peretti
2003	SAG: a procedural tactical generator for dialog systems. Dalina Kallulli
2003	SOM as likelihood estimator for speaker clustering. Itshak Lapidot
2003	SYNFACE - a talking face telephone. Inger Karlsson, Andrew Faulkner, Giampiero Salvi
2003	Say-as classification for alphabetic words in Japanese texts. Hisako Asano, Masaaki Nagata, Masanobu Abe
2003	Schema-based modeling of phonemic restoration. Soundararajan Srinivasan, DeLiang Wang
2003	Score normalisation applied to open-set, text-independent speaker identification. P. Sivakumaran, J. Fortuna, Aladdin M. Ariyaeeinia
2003	Segmental durations predicted with a neural network. João Paulo Ramos Teixeira, Diamantino Freitas
2003	Segmentation of speech for speaker and language recognition. André Gustavo Adami, Hynek Hermansky
2003	Segmentation of speech into syllable-like units. T. Nagarajan, Hema A. Murthy, Rajesh M. Hegde
2003	Segmenting multiple concurrent speakers using microphone arrays. Guillaume Lathoud, Iain McCowan, Darren Moore
2003	Semantic and dialogic annotation for automated multilingual customer service. Hilda Hardy, Kirk Baker, Hélène Bonneau-Maynard, Laurence Devillers, Sophie Rosset, Tomek Strzalkowski
2003	Semantic object synchronous understanding in SALT for highly interactive user interface. Kuansan Wang
2003	Semi-tied full deviation matrices for Laplacian density models. Christoph Neukirchen
2003	Sentence boundary detection in Arabic speech. Amit Srivastava, Francis Kubala
2003	Sentence verification in spoken dialogue system. Huei-Ming Wang, Yi-Chung Lin
2003	Several HKU approaches for robust speech recognition and their evaluation on Aurora connected digit recognition tasks. Jian Wu, Qiang Huo
2003	Shared resources for robust speech-to-text technology. Stephanie M. Strassel, David Miller, Kevin Walker, Christopher Cieri
2003	Should I tell all?: an experiment on conciseness in spoken dialogue. Stephen Whittaker, Marilyn A. Walker, Preetam Maloor
2003	Simple designing methods of corpus-based visual speech synthesis. Tatsuya Shiraishi, Tomoki Toda, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
2003	SmartKom-home - an advanced multi-modal interface to home entertainment. Thomas Portele, Silke Goronzy, Martin C. Emele, Andreas Kellner, Sunna Torge, Jürgen te Vrugt
2003	Spanish broadcast news transcription. Gerhard Backfried, Roser Jaquemot Caldes
2003	Speaker adaptation based on confidence-weighted training. Gyucheol Jang, Minho Jin, Chang D. Yoo
2003	Speaker adaptation for non-native speakers using bilingual English lexicon and acoustic models. Shoichi Matsunaga, Atsunori Ogawa, Yoshikazu Yamaguchi, Akihiro Imamura
2003	Speaker adaptation using regression classes generated by phonetic decision tree-based successive state splitting. Se-Jin Oh, Kwang-Dong Kim, Duk-Gyoo Roh, Woo-Chang Sung, Hyun-Yeol Chung
2003	Speaker characterization using principal component analysis and wavelet transform for speaker verification. Chakib Tadj, A. Benlahouar
2003	Speaker conversion in ARX-based source-formant type speech synthesis. Hiroki Mori, Hideki Kasuya
2003	Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation. Masafumi Nishida, Tatsuya Kawahara
2003	Speaker modeling from selected neighbors applied to speaker recognition. Yassine Mami, Delphine Charlet
2003	Speaker recognition using MPEG-7 descriptors. Hyoung-Gook Kim, Edgar Berdahl, Nicolas Moreau, Thomas Sikora
2003	Speaker recognition using local models. Ryan Rifkin
2003	Speaker verification based on G.729 and G.723.1 coder parameters and handset mismatch compensation. Eric W. M. Yu, Man-Wai Mak, Chin-Hung Sit, Sun-Yuan Kung
2003	Speaker verification based on the German veridat database. Ulrich Türk, Florian Schiel
2003	Speaker verification systems and security considerations. David A. van Leeuwen
2003	Spectral maxima representation for robust automatic speech recognition. J. Sujatha, K. R. Prasanna Kumar, K. R. Ramakrishnan, N. Balakrishnan
2003	Spectro-temporal interactions in auditory and auditory-visual speech processing. Ken W. Grant, Steven Greenberg
2003	Speech analysis with the short-time chirp transform. Luis Weruaga, Marián Képesi
2003	Speech and language processing: where have we been and where are we going? Kenneth Ward Church
2003	Speech enhancement and improved recognition accuracy by integrating wavelet transform and spectral subtraction algorithm. Gwo-hwa Ju, Lin-Shan Lee
2003	Speech enhancement for a car environment using LP residual signal and spectral subtraction. Agustín Álvarez, Victor Nieto Lluis, Pedro Gómez-Vilda, Rafael Martínez
2003	Speech enhancement for hands-free car phones by adaptive compensation of harmonic engine noise components. Henning Puder
2003	Speech enhancement using a-priori information. Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn
2003	Speech enhancement using weighting function based on the variance of wavelet coefficients. Ching-Ta Lu, Hsiao-Chuan Wang
2003	Speech enhancement with microphone array and Fourier / wavelet spectral subtraction in real noisy environments. Yuki Denda, Takanobu Nishiura, Hideki Kawahara
2003	Speech generation from concept for realizing conversation with an agent in a virtual room. Keikichi Hirose, Junji Tago, Nobuaki Minematsu
2003	Speech recognition based on syllable recovery. Li Zhang, William H. Edmondson
2003	Speech recognition of double talk using SAFIA-based audio segregation. Toshiyuki Sekiya, Tetsuji Ogawa, Tetsunori Kobayashi
2003	Speech recognition over Bluetooth wireless channels. Ziad Al Bawab, Ivo Locher, Jianxia Xue, Abeer Alwan
2003	Speech recognition using EMG; mime speech recognition. Hiroyuki Manabe, Akira Hiraiwa, Toshiaki Sugimura
2003	Speech recognition with a generative factor analyzed hidden Markov model. Kaisheng Yao, Kuldip K. Paliwal, Te-Won Lee
2003	Speech recognition with dynamic grammars using finite-state transducers. Johan Schalkwyk, I. Lee Hetherington, Ezra Story
2003	Speech segregation based on fundamental event information using an auditory vocoder. Toshio Irino, Roy D. Patterson, Hideki Kawahara
2003	Speech shift: direct speech-input-mode switching through intentional control of voice pitch. Masataka Goto, Yukihiro Omoto, Katunobu Itou, Tetsunori Kobayashi
2003	Speech starter: noise-robust endpoint detection by using filled pauses. Koji Kitayama, Masataka Goto, Katunobu Itou, Tetsunori Kobayashi
2003	Speech summarization using weighted finite-state transducers. Takaaki Hori, Chiori Hori, Yasuhiro Minami
2003	Speech watermarking by parametric embedding with an ℓ∞ fidelity criterion. Aparna Gurijala, John R. Deller Jr.
2003	Speech-based, manual-visual, and multi-modal interaction with an in-car computer - evaluation of a pilot study. Rogier Woltjer, Wah Jin Tan, Fang Chen
2003	Speechalator: two-way speech-to-speech translation on a consumer PDA. Alex Waibel, Ahmed Badran, Alan W. Black, Robert E. Frederking, Donna Gates, Alon Lavie, Lori S. Levin, Kevin A. Lenzo, Laura Mayfield Tomokiyo, Jürgen Reichert, Tanja Schultz, Dorcas Wallace, Monika Woszczyna, Jing Zhang
2003	Spoken cross-language access to image collection via captions. Hsin-Hsi Chen
2003	Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy. Tatsuya Kawahara, Ryosuke Ito, Kazunori Komatani
2003	Spoken language and e-inclusion. Alan F. Newell
2003	Spoken language condensation in the 21st century. Klaus Zechner
2003	Spoken language output: realising the vision. Roger K. Moore
2003	Spotting "hot spots" in meetings: human judgments and prosodic cues. Britta Wrede, Elizabeth Shriberg
2003	Statistical estimation of phoneme's most stable point based on universal constraint. Shigeki Okawa, Katsuhiko Shirai
2003	Statistical evaluation of the influence of stress on pitch frequency and phoneme durations in Farsi language. Davood Gharavian, Seyed Mohammad Ahadi
2003	Statistical methods and Bayesian interpretation of evidence in forensic automatic speaker recognition. Andrzej Drygajlo, Didier Meuwly, Anil Alexander
2003	Statistical signal processing with nonnegativity constraints. Lawrence K. Saul, Fei Sha, Daniel D. Lee
2003	Statistical speech-to-speech translation with multilingual speech recognition and bilingual-chunk parsing. Bo Xu, Shuwu Zhang, Chengqing Zong
2003	Stem-based maximum entropy language models for inflectional languages. Dimitris Oikonomidis, Vassilios Digalakis
2003	Strategies for automatic multi-tier annotation of spoken language corpora. Steven Greenberg
2003	Stress-based speech segmentation revisited. Sven L. Mattys
2003	Structural linear model-space transformations for speaker adaptation. Driss Matrouf, Olivier Bellot, Pascal Nocera, Georges Linarès, Jean-François Bonastre
2003	Structural state-based frame synchronous compensation. Vincent Barreaud, Irina Illina, Dominique Fohr, Filipp Korkmazsky
2003	Subband-based acoustic shock limiting algorithm on a low-resource DSP system. Gary Choy, David Hermann, Robert L. Brennan, Todd Schneider, Hamid Sheikhzadeh, Etienne Cornu
2003	Subjective evaluations for perception of speaker identity through acoustic feature transplantations. Oytun Türk, Levent M. Arslan
2003	Syllable classification using articulatory-acoustic features. Mirjam Wester
2003	Syllable structure based phonetic units for context-dependent continuous Thai speech recognition. Supphanat Kanokphara
2003	Syllable-based acoustic modeling for Japanese spontaneous speech recognition. Jun Ogata, Yasuo Ariki
2003	Techniques for effective vocabulary selection. Anand Venkataraman, Wen Wang
2003	Temporal aspects of articulatory control. Elliot Saltzman
2003	Temporal properties of the nasals and nasalization in Cantonese. Beatrice Fung-Wah Khioe
2003	Text design for TTS speech corpus building using a modified greedy selection. Baris Bozkurt, Özlem Öztürk, Thierry Dutoit
2003	Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM. Seiichi Nakagawa, Wei Zhang
2003	Tfarsdat - the telephone Farsi speech database. Mahmood Bijankhan, Javad Sheykhzadegan, Mahmood R. Roohani, Rahman Zarrintare, Seyyed Z. Ghasemi, Mohammad E. Ghasedi
2003	The /i/-/a/-/u/-ness of spoken vowels. Hartmut R. Pfitzinger
2003	The 300k LIMSI German broadcast news transcription system. Kevin McTait, Martine Adda-Decker
2003	The LIUM-AVS database: a corpus to test lip segmentation and speechreading systems in natural conditions. Philippe Daubias, Paul Deléglise
2003	The NESPOLE! VoIP multilingual corpora in tourism and medical domains. Nadia Mana, Susanne Burger, Roldano Cattoni, Laurent Besacier, Victoria MacLaren, John W. McDonough, Florian Metze
2003	The application of interactive speech unit selection in TTS systems. Peter Rutten, Justin Fackrell
2003	The awe and mystery of T-norm. Jirí Navrátil, Ganesh N. Ramaswamy
2003	The Basque speech_dat (II) database: a description and first test recognition results. Inmaculada Hernáez, Iker Luengo, Eva Navas, Maria Luisa Zubizarreta, Iñaki Gaminde, Jon Sánchez
2003	The Czech speech and prosody database both for ASR and TTS purposes. Jáchym Kolár, Jan Romportl, Josef Psutka
2003	The development of a multi-purpose spoken dialogue system. João Paulo Neto, Nuno J. Mamede, Renato Cassaca, Luís C. Oliveira
2003	The dynamic, multi-lingual lexicon in SmartKom. Silke Goronzy, Zica Valsan, Martin C. Emele, Juergen Schimanowski
2003	The effect of amplitude compression on wide band telephone speech for hearing-impaired elderly people. Mutsumi Saito, Kimio Shiraishi, Kimitoshi Fukudome
2003	The effect of an intermediate articulatory layer on the performance of a segmental HMM. Martin J. Russell, Philip J. B. Jackson
2003	The effect of speech rate and noise on bilinguals' speech perception: the case of native speakers of Arabic in Israel. Judith Rosenhouse, Liat Kishon-Rabin
2003	The effect of surrounding phrase lengths on pause duration. Elena Zvonik, Fred Cummins
2003	The perceptual cues of a high level pitch-accent pattern in Japanese: pitch-accent patterns and duration. Tsutomu Sato
2003	The Queen's Communicator: an object-oriented dialogue manager. Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear
2003	The statistical approach to machine translation and a roadmap for speech translation. Hermann Ney
2003	The temporal organisation of speech as gauged by speech synthesis. Brigitte Zellner Keller
2003	The use of confidence measures in vector based call-routing. Stephen J. Cox, Gavin C. Cawley
2003	The use of multiple pause information in dependency structure analysis of spoken Japanese sentences. Meirong Lu, Kazuyuki Takagi, Kazuhiko Ozeki
2003	Three simultaneous speech recognition by integration of active audition and face recognition for humanoid. Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroshi Tsujino
2003	Time adjustable mixture weights for speaking rate fluctuation. Takahiro Shinozaki, Sadaoki Furui
2003	Time alignment for scenario and sounds with voice, music and BGM. Yamato Wada, Masahide Sugiyama
2003	Time delay estimation based on hearing characteristic. Zhaoli Yan, Limin Du, Jianqiang Wei, Hui Zeng
2003	Time is of the essence - dynamic approaches to spoken language. Steven Greenberg
2003	Time-domain based temporal processing with application of orthogonal transformations. Petr Motlícek, Jan Cernocký
2003	Tone pattern discrimination combining parametric modeling and maximum likelihood estimation. Jinfu Ni, Hisashi Kawai
2003	Topic segmentation and retrieval system for lecture videos based on spontaneous speech recognition. Natsuo Yamamoto, Jun Ogata, Yasuo Ariki
2003	Topic-specific parser design in an air travel natural language understanding application. Chaitanya Ekanadham, Juan M. Huerta
2003	Toward domain-independent conversational speech recognition. Brian Kingsbury, Lidia Mangu, George Saon, Geoffrey Zweig, Scott Axelrod, Vaibhava Goel, Karthik Visweswariah, Michael Picheny
2003	Towards a personal robot with language interface. Luís Seabra Lopes, António J. S. Teixeira, Mário Rodrigues, Diogo Gomes, Cláudio Teixeira, Liliana da Silva Ferreira, Pedro Filipe Soares, João Girão, Nuno Sénica
2003	Towards a repository of digital talking books. António Joaquim Serralheiro, Isabel Trancoso, Diamantino Caseiro, Teresa Chambel, Luís Carriço, Nuno Guimarães
2003	Towards an evaluation standard for speech control concepts in real-world scenarios. Jens Maase, Diane Hirschfeld, Uwe Koloska, Timo Westfeld, Jörg Helbig
2003	Towards best practices for speech user interface design. Bernhard Suhm
2003	Towards dynamic multi-domain dialogue processing. Botond Pakucs
2003	Towards missing data recognition with cepstral features. Christophe Cerisara
2003	Towards multimodal interaction with an intelligent room. Petra Gieselmann, Matthias Denecke
2003	Towards optimal encoding for classification with applications to distributed speech recognition. Naveen Srinivasamurthy, Antonio Ortega, Shrikanth S. Narayanan
2003	Towards synthesising expressive speech; designing and collecting expressive speech data. Nick Campbell
2003	Towards the automatic extraction of Fujisaki model parameters for Mandarin. Hansjörg Mixdorff, Hiroya Fujisaki, Gao Peng Chen, Yu Hu
2003	Towards the automatic generation of mixed-initiative dialogue systems from web content. Joseph Polifroni, Grace Chung, Stephanie Seneff
2003	Towards the development of a Brazilian Portuguese text-to-speech system based on HMM. Ranniery Maia, Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Fernando Gil Vianna Resende Jr.
2003	Tracking a moving speaker using excitation source information. Vikas C. Raykar, Ramani Duraiswami, B. Yegnanarayana, S. R. Mahadeva Prasanna
2003	Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint. Li Deng, Issam Bazzi, Alex Acero
2003	Training a confidence measure for a reading tutor that listens. Yik-Cheung Tam, Jack Mostow, Joseph E. Beck, Satanjeev Banerjee
2003	Training data optimization for language model adaptation. Xiaoshan Fang, Jianfeng Gao, Jianfeng Li, Huanye Sheng
2003	Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features. Keiichi Tokuda, Heiga Zen, Tadashi Kitamura
2003	Transcoding algorithm for G.723.1 and AMR speech coders: for interoperability between VoIP and mobile networks. Sung-Wan Yoon, Jin-Kyu Choi, Hong-Goo Kang, Dae Hee Youn
2003	Transforming F0 contours. Ben Gillett, Simon King
2003	Transforming voice quality. Ben Gillett, Simon King
2003	Translation and rotation of the cricothyroid joint revealed by phonation-synchronized high-resolution MRI. Sayoko Takano, Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada, Ichiro Fujimoto
2003	Tree-structured noise-adapted HMM modeling for piecewise linear-transformation-based adaptation. Zhipeng Zhang, Kiyotaka Otsuji, Sadaoki Furui
2003	Two correction models for likelihoods in robust speech recognition using missing feature theory. Hugo Van hamme
2003	Two studies of open vs. directed dialog strategies in spoken dialog systems. Silke M. Witt, Jason D. Williams
2003	Understanding process for speech recognition. Salma Jamoussi, Kamel Smaïli, Jean Paul Haton
2003	Unified analysis of glottal source spectrum. Ixone Arroabarren, Alfonso Carlosena
2003	Unit selection and emotional speech. Alan W. Black
2003	Unit selection based on voice recognition. Yi Zhou, Yiqing Zu
2003	Unit selection in concatenative TTS synthesis systems based on mel filter bank amplitudes and phonetic context. Tanya Lambert, Andrew P. Breen, Barry Eggleton, Stephen J. Cox, Ben P. Milner
2003	Unit size in unit selection speech synthesis. S. Prahallad Kishore, Alan W. Black
2003	Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner. Vesa Siivola, Teemu Hirsimäki, Mathias Creutz, Mikko Kurimo
2003	Unsupervised speaker adaptation based on HMM sufficient statistics in various noisy environments. Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
2003	Unsupervised speaker indexing using anchor models and automatic transcription of discussions. Yuya Akita, Tatsuya Kawahara
2003	Unsupervised topic discovery applied to segmentation of news transcriptions. Sreenivasa Sista, Amit Srivastava, Francis Kubala, Richard M. Schwartz
2003	Use of a CSP-based voice activity detector for distant-talking ASR. Luca Armani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer
2003	Use of linguistic information for automatic extraction of F0 contour generation process model parameters. Keikichi Hirose, Yusuke Furuyama, Shuichi Narusawa, Nobuaki Minematsu, Hiroya Fujisaki
2003	Use of trajectory models for automatic accent classification. Pongtep Angkititrakul, John H. L. Hansen
2003	Usefulness of phase spectrum in human speech perception. Kuldip K. Paliwal, Leigh D. Alsteris
2003	User modeling in spoken dialogue systems for flexible guidance generation. Kazunori Komatani, Shinichi Ueno, Tatsuya Kawahara, Hiroshi G. Okuno
2003	Using accent information in ASR models for Swedish. Giampiero Salvi
2003	Using acoustic models to choose pronunciation variations for synthetic voices. Christina L. Bennett, Alan W. Black
2003	Using both global and local hidden Markov models for automatic speech unit segmentation. Hong Zheng, Yiqing Lu
2003	Using confidence measures and domain knowledge to improve speech recognition. Pascal Wiggers, Léon J. M. Rothkrantz
2003	Using corpus-based methods for spoken access to news texts on the web. Alexandra Klein, Harald Trost
2003	Using genetic algorithms for rapid speaker adaptation. Fabrice Lauri, Irina Illina, Dominique Fohr, Filipp Korkmazsky
2003	Using mutual information to design class-specific phone recognizers. Patricia Scanlon, Daniel P. W. Ellis, Richard B. Reilly
2003	Using pitch frequency information in speech recognition. Mathew Magimai-Doss, Todd A. Stephenson, Hervé Bourlard
2003	Using place name data to train language identification models. Stanley F. Chen, Benoît Maison
2003	Using statistical language modelling to identify new vocabulary in a grammar-based speech recognition system. Genevieve Gorrell
2003	Using syllable-based indexing features and language models to improve German spoken document retrieval. Martha A. Larson, Stefan Eickeler
2003	Using the web for fast language model construction in minority languages. Viet Bac Le, Brigitte Bigi, Laurent Besacier, Eric Castelli
2003	Using untranscribed user utterances for improving language models based on confidence scoring. Mikio Nakano, Timothy J. Hazen
2003	Using word confidence measure for OOV words detection in a spontaneous spoken dialog system. Hui Sun, Guoliang Zhang, Fang Zheng, Mingxing Xu
2003	Utterance verification under distributed detection and fusion framework. Taeyoon Kim, Hanseok Ko
2003	Utterance verification using an optimized k-nearest neighbour classifier. Roberto Paredes, Alberto Sanchís, Enrique Vidal, Alfons Juan
2003	VISPER II - enhanced version of the educational software for speech processing courses. Miroslav Holada, Jan Nouza
2003	Validation of phonetic transcriptions based on recognition performance. Christophe Van Bael, Diana Binnenpoorte, Helmer Strik, Henk van den Heuvel
2003	Variable bit rate control with trellis diagram approximation. Kei Kikuiri, Nobuhiko Naka, Tomoyuki Ohya
2003	Variable length mixtures of inverse covariances. Vincent Vanhoucke, Ananth Sankar
2003	Variational Bayesian GMM for speech recognition. Fabio Valente, Christian Wellekens
2003	Very-low-rate speech compression by indexation of polyphones. Charles du Jeu, Maurice Charbit, Gérard Chollet
2003	Visualisation of the vocal tract based on estimation of vocal area functions and formant frequencies. Abdulhussain E. Mahdi
2003	Vocal tract normalization as linear transformation of MFCC. Michael Pitz, Hermann Ney
2003	Voice conversion methods for vocal tract and pitch contour modification. Oytun Türk, Levent M. Arslan
2003	Voice conversion with smoothed GMM and MAP adaptation. Yining Chen, Min Chu, Eric Chang, Jia Liu, Runsheng Liu
2003	Voice quality modification for emotional speech synthesis. Christophe d'Alessandro, Boris Doval
2003	Voice quality normalization in an utterance for robust ASR. Muhammad Ghulam, Takashi Fukuda, Tsuneo Nitta
2003	Voicing controlled frame loss concealment for adaptive multi-rate (AMR) speech frames in voice-over-IP. Frank Mertz, Hervé Taddei, Imre Varga, Peter Vary
2003	Voicing parameter and energy based speech/non-speech detection for speech recognition in adverse conditions. Arnaud Martin, Laurent Mauuary
2003	Voxenter™ - intelligent voice enabled call center for Hungarian. Tibor Fegyó, Péter Mihajlik, Máté Szarvas, Péter Tatai, Gábor Tatai
2003	Wavelet-based perceptual speech enhancement using adaptive threshold estimation. Essa Jafer, Abdulhussain E. Mahdi
2003	We are not amused - but how do you know? user states in a multi-modal dialogue system. Anton Batliner, Viktor Zeißler, Carmen Frank, Johann Adelhardt, Rui Ping Shi, Elmar Nöth
2003	Weighted automata kernels - general framework and algorithms. Corinna Cortes, Patrick Haffner, Mehryar Mohri
2003	Weighted entropy training for the decision tree based text-to-phoneme mapping. Jilei Tian, Janne Suontausta, Juha Häkkinen
2003	Who knows Carl Bildt? - and what if you don't? Elisabeth Zetterholm, Kirk P. H. Sullivan, James Green, Erik J. Eriksson, Jan van Doorn, Peter E. Czigler
2003	Why and how to control the authentic emotional speech corpora. Véronique Aubergé, Nicolas Audibert, Albert Rilliard
2003	Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization. Lin-Shan Lee, Yuan Ho, Jia-Fu Chen, Shun-Chuan Chen
2003	Word activation model by Japanese school children without knowledge of Roman alphabet. Takashi Otake, Miki Komatsu
2003	Word class modeling for speech recognition with out-of-task words using a hierarchical language model. Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Gen-ichiro Kikui