| 2003 | "do not attempt to light with match!": some thoughts on progress and research goals in spoken dialog systems. Paul Heisterkamp |
| 2003 | "syncpitch": a pseudo pitch synchronous algorithm for speaker recognition. Ran D. Zilca, Jirí Navrátil, Ganesh N. Ramaswamy |
| 2003 | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003 |
| 2003 | A DP algorithm for speaker change detection. Michele Vescovi, Mauro Cettolo, Romeo Rizzi |
| 2003 | A DTW-based DAG technique for speech and speaker feature analysis. Jingwei Liu |
| 2003 | A clustering approach to on-line audio source separation. Julien Bourgeois |
| 2003 | A comparative study of some discriminative feature reduction algorithms on the AURORA 2000 and the DaimlerChrysler in-car ASR tasks. Joan Marí Hilario, Fritz Class |
| 2003 | A comparative study on maximum entropy and discriminative training for acoustic modeling in automatic speech recognition. Wolfgang Macherey, Hermann Ney |
| 2003 | A comparison of the data requirements of automatic speech recognition systems and human listeners. Roger K. Moore |
| 2003 | A comparison of three non-linear observation models for noisy speech features. Jasha Droppo, Li Deng, Alex Acero |
| 2003 | A computational model of arm gestures in conversation. Dafydd Gibbon, Ulrike Gut, Benjamin Hell, Karin Looks, Alexandra Thies, Thorsten Trippel |
| 2003 | A context resolution server for the GALAXY conversational systems. Edward Filisko, Stephanie Seneff |
| 2003 | A contrastive investigation of standard Mandarin and accented Mandarin. Aijun Li, Xia Wang |
| 2003 | A corpus-based decompounding algorithm for German lexical modeling in LVCSR. Martine Adda-Decker |
| 2003 | A cross-media retrieval system for lecture videos. Atsushi Fujii, Katunobu Itou, Tomoyosi Akiba, Tetsuya Ishikawa |
| 2003 | A discriminative decision tree learning approach to acoustic modeling. Sheng Gao, Chin-Hui Lee |
| 2003 | A dynamic cross-reference pruning strategy for multiple feature fusion at decoder run time. Yonghong Yan, Chengyi Zheng, Jianping Zhang, Jielin Pan, Jiang Han, Jian Liu |
| 2003 | A fast, accurate and stream-based speaker segmentation and clustering algorithm. An Vandecatseye, Jean-Pierre Martens |
| 2003 | A harmonic-model-based front end for robust speech recognition. Michael L. Seltzer, Jasha Droppo, Alex Acero |
| 2003 | A hidden Markov model-based missing data imputation approach. Yu Luo, Limin Du |
| 2003 | A hybrid method oriented to concatenative text-to-speech synthesis. Ignasi Iriondo Sanz, Francesc Alías, Javier Sanchis, Javier Melenchón |
| 2003 | A latent analogy framework for grapheme-to-phoneme conversion. Jerome R. Bellegarda |
| 2003 | A memory-based approach to Cantonese tone recognition. Michael Emonts, Deryle Lonsdale |
| 2003 | A method for on-line speaker indexing using generic reference models. Soonil Kwon, Shrikanth S. Narayanan |
| 2003 | A multimodal conversational interface for a concept vehicle. Roberto Pieraccini, Krishna Dayanidhi, Jonathan Bloom, Jean-Gui Dahan, Michael Phillips, Bryan R. Goodman, K. Venkatesh Prasad |
| 2003 | A neural network approach to dependency analysis of Japanese sentences using prosodic information. Kazuyuki Takagi, Mamiko Okimoto, Yoshio Ogawa, Kazuhiko Ozeki |
| 2003 | A new HMM-based approach to broad phonetic classification of speech. Jouni Pohjalainen |
| 2003 | A new SVM approach to speaker identification and verification using probabilistic distance kernels. Pedro J. Moreno, Purdy Ho |
| 2003 | A new adaptive long-term spectral estimation voice activity detector. Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio |
| 2003 | A new approach to minimize utterance verification error rate for a specific operating point. Wing-Hei Au, Man-Hung Siu |
| 2003 | A new approach to reducing alarm noise in speech. Yilmaz Gul, Aladdin M. Ariyaeeinia, Oliver Dewhirst |
| 2003 | A new approach to segment and detect syllables from high-speed speech. D. W. Ying, W. Gao, W. Q. Wang |
| 2003 | A new approach to voice activity detection based on self-organizing maps. Stephan Grashey |
| 2003 | A new decoder design for large vocabulary Turkish speech recognition. Onur Cilingir, Mübeccel Demirekler |
| 2003 | A new method for pitch prediction from spectral envelope and its application in voice conversion. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel |
| 2003 | A new perspective on feature extraction for robust in-vehicle speech recognition. Umit H. Yapanel, John H. L. Hansen |
| 2003 | A new pitch modeling approach for Mandarin speech. Wen-Hsing Lai, Yih-Ru Wang, Sin-Horng Chen |
| 2003 | A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering. Ramon Prieto, Jing Jiang, Chi-Ho Choi |
| 2003 | A new spectral transformation for speaker normalization. Pierre L. Dognin, Amro El-Jaroudi |
| 2003 | A new supervised-predictive compensation scheme for noisy speech recognition. Khalid Daoudi, Murat Deviren |
| 2003 | A noise-robust ASR back-end technique based on weighted Viterbi recognition. Xiaodong Cui, Alexis Bernard, Abeer Alwan |
| 2003 | A novel method of analysing and comparing responses of hearing aid algorithms using auditory time-frequency representation. G. V. Kiran, Thippur V. Sreenivas |
| 2003 | A novel rate selection algorithm for transcoding CELP-type codec and SMV. Dalwon Jang, Seongho Seo, Sunil Lee, Chang D. Yoo |
| 2003 | A novel transcoding algorithm for SMV and G.723.1 speech coders via direct parameter transformation. Seongho Seo, Dalwon Jang, Sunil Lee, Chang D. Yoo |
| 2003 | A novel use of residual noise model for modified PMC. Cailian Miao, Yangsheng Wang |
| 2003 | A programmable policy manager for conversational biometrics. Ganesh N. Ramaswamy, Ran D. Zilca, Oleg Alecksandrovich |
| 2003 | A pronunciation lexicon for Turkish based on two-level morphology. Kemal Oflazer, Sharon Inkelas |
| 2003 | A pronunciation training system for Japanese lexical accents with corrective feedback in learner's voice. Keikichi Hirose, Frédéric Gendrin, Nobuaki Minematsu |
| 2003 | A reconstruction of Farkas Kempelen's speaking machine. P. Nikleczy, Gábor Olaszy |
| 2003 | A robust and sensitive word boundary decision algorithm. Jong Uk Kim, Sang-Gyun Kim, Chang D. Yoo |
| 2003 | A robust noise and echo canceller. Khaldoon Al-Naimi, Christian Sturt, Ahmet M. Kondoz |
| 2003 | A segment-based algorithm of speech enhancement for robust speech recognition. Guokang Fu, Ta-Hsin Li |
| 2003 | A semantic representation for spoken dialogs. Hélène Bonneau-Maynard, Sophie Rosset |
| 2003 | A semi-blind source separation method for hands-free speech recognition of multiple talkers. Panikos Heracleous, Satoshi Nakamura, Kiyohiro Shikano |
| 2003 | A sequential metric-based audio segmentation method via the Bayesian information criterion. Shih-Sian Cheng, Hsin-Min Wang |
| 2003 | A source model mitigation technique for distributed speech recognition over lossy packet channels. Angel M. Gomez, Antonio M. Peinado, Victoria E. Sánchez, Antonio J. Rubio |
| 2003 | A speech dereverberation method based on the MTF concept. Masashi Unoki, Keigo Sakata, Masato Akagi |
| 2003 | A speech model of acoustic inventories based on asynchronous interpolation. Alexander Kain, Jan P. H. van Santen |
| 2003 | A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments. Kaisheng Yao, Erik M. Visser, Oh-Wook Kwon, Te-Won Lee |
| 2003 | A spoken language interface to an electronic programme guide. Jianhong Jin, Martin J. Russell, Michael J. Carey, James Chapman, Harvey Lloyd-Thomas, Graham Tattersall |
| 2003 | A statistical approach to assessing speech and voice variability in speaker verification. Klaus R. Scherer, Didier Grandjean, Tom Johnstone, Gudrun Klasmeyer, Tanja Bänziger |
| 2003 | A statistical method of evaluating pronunciation proficiency for English words spoken by Japanese. Seiichi Nakagawa, Kazumasa Mori, Naoki Nakamura |
| 2003 | A study on domain recognition of spoken dialogue systems. Toshihiro Isobe, Shoji Hayakawa, Hiroya Murao, Tatsuji Mizutani, Kazuya Takeda, Fumitada Itakura |
| 2003 | A switching linear Gaussian hidden Markov model and its application to nonstationary noise compensation for robust speech recognition. Jian Wu, Qiang Huo |
| 2003 | A syllable segmentation algorithm for English and Italian. Massimo Petrillo, Francesco Cutugno |
| 2003 | A system for voice conversion based on adaptive filtering and line spectral frequency distance optimization for text-to-speech synthesis. Özgül Salor, Mübeccel Demirekler, Bryan L. Pellom |
| 2003 | A topic classification system based on parametric trajectory mixture models. William Belfield, Herbert Gish |
| 2003 | A trainable generator for recommendations in multimodal dialog. Marilyn A. Walker, Rashmi Prasad, Amanda Stent |
| 2003 | A trainable speech enhancement technique based on mixture models for speech and noise. Ilyas Potamitis, Nikos Fakotakis, George Kokkinakis |
| 2003 | A visual context-aware multimodal system for spoken language processing. Niloy Mukherjee, Deb Roy |
| 2003 | A voice-driven web browser for blind people. Bostjan Vesnicer, Janez Zibert, Simon Dobrisek, Nikola Pavesic, France Mihelic |
| 2003 | Accentual lengthening in standard Chinese: evidence from four-syllable constituents. Yiya Chen |
| 2003 | Accuracy improved double-talk detector based on state transition diagram. Sang-Gyun Kim, Jong Uk Kim, Chang D. Yoo |
| 2003 | Acoustic change detection and segment clustering of two-way telephone conversations. Xin Zhong, Mark A. Clements, Sung Lim |
| 2003 | Acoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis. Wentao Gu, Keikichi Hirose |
| 2003 | Acoustic modeling of American English lateral approximants. Zhaoyan Zhang, Carol Y. Espy-Wilson, Mark Tiede |
| 2003 | Acoustic modeling with mixtures of subspace constrained exponential models. Karthik Visweswariah, Scott Axelrod, Ramesh A. Gopinath |
| 2003 | Acoustic normalization of children's speech. Georg Stemmer, Christian Hacker, Stefan Steidl, Elmar Nöth |
| 2003 | Acoustic variations of focused disyllabic words in Mandarin Chinese: analysis, synthesis and perception. Zhenglai Gu, Hiroki Mori, Hideki Kasuya |
| 2003 | Acoustic, phonetic, and discriminative approaches to automatic language identification. Elliot Singer, Pedro A. Torres-Carrasquillo, Terry P. Gleason, William M. Campbell, Douglas A. Reynolds |
| 2003 | Acquiring lexical information from multilevel temporal annotations. Thorsten Trippel, Felix Sasaki, Benjamin Hell, Dafydd Gibbon |
| 2003 | Active and unsupervised learning for automatic speech recognition. Giuseppe Riccardi, Dilek Hakkani-Tür |
| 2003 | Active labeling for spoken language understanding. Gökhan Tür, Mazin G. Rahim, Dilek Hakkani-Tür |
| 2003 | Adaptation of acoustic model using the gain-adapted HMM decomposition method. Akira Sasou, Futoshi Asano, Kazuyo Tanaka, Satoshi Nakamura |
| 2003 | Adapting acoustic models to new domains and conditions using untranscribed data. Asela Gunawardana, Alex Acero |
| 2003 | Adapting language models for frequent fixed phrases by emphasizing n-gram subsets. Tomoyosi Akiba, Katunobu Itou, Atsushi Fujii |
| 2003 | Adaptive beamforming in room with reverberation. Zoran Saric, Slobodan Jovicic |
| 2003 | Adaptive decision fusion for multi-sample speaker verification over GSM networks. Ming-Cheung Cheung, Man-Wai Mak, Sun-Yuan Kung |
| 2003 | Adaptive noise estimation using second generation and perceptual wavelet transforms. Essa Jafer, Abdulhussain E. Mahdi |
| 2003 | Adding fricatives to the Portuguese articulatory synthesiser. António J. S. Teixeira, Luis M. T. Jesus, Roberto Martinez |
| 2003 | Additive noise and channel distortion-robust parametrization tool - performance evaluation on Aurora 2 & 3. Petr Fousek, Petr Pollák |
| 2003 | Agents for integrated tutoring in spoken dialogue systems. Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen |
| 2003 | An NN-based approach to prosodic information generation for synthesizing English words embedded in Chinese text. Wei-Chih Kuo, Li-Feng Lin, Yih-Ru Wang, Sin-Horng Chen |
| 2003 | An accurate noise compensation algorithm in the log-spectral domain for robust speech recognition. Mohamed Afify |
| 2003 | An acoustic phonetic analysis of diphthongs in Ningbo Chinese. Fang Hu |
| 2003 | An acquisition model of speech perception with considerations of temporal information. Ching-Pong Au |
| 2003 | An approach to common acoustical pole and zero modeling of consecutive periods of voiced speech. Pedro J. Quintana-Morales, Juan L. Navarro-Mesa |
| 2003 | An approach to multilingual acoustic modeling for portable devices. Yan Ming Cheng, Chen Liu, Yuanjun Wei, Lynette Melnar, Changxue Ma |
| 2003 | An architecture for rapid decoding of large vocabulary conversational speech. George Saon, Geoffrey Zweig, Brian Kingsbury, Lidia Mangu, Upendra V. Chaudhari |
| 2003 | An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker. Chong-kai Wang, Ren-yuan Lyu, Yuang-Chin Chiang |
| 2003 | An efficient integrated gender detection scheme and time mediated averaging of gender dependent acoustic models. Peder A. Olsen, Satya Dharanipragada |
| 2003 | An efficient keyword spotting technique using a complementary language for filler models training. Panikos Heracleous, Tohru Shimizu |
| 2003 | An efficient Viterbi algorithm on DBNs. Wei Hu, Yimin Zhang, Qian Diao, Shan Huang |
| 2003 | An efficient, fast matching approach using posterior probability estimates in speech recognition. Sherif M. Abdou, Michael S. Scordilis |
| 2003 | An empirical text transformation method for spontaneous speech synthesizers. Shiva Sundaram, Shrikanth S. Narayanan |
| 2003 | An evaluation of VTS and IMM for speaker verification in noise. Suhadi Suhadi, Sorel Stan, Tim Fingscheidt, Christophe Beaugeant |
| 2003 | An expandable web-based audiovisual text-to-speech synthesis system. Sascha Fagel, Walter F. Sendlmeier |
| 2003 | An improved model-based speaker segmentation system. Peng Yu, Frank Seide, Chengyuan Ma, Eric Chang |
| 2003 | An information theoretic approach for using word cluster information in natural language call routing. Li Li, Feng Liu, Wu Chou |
| 2003 | An integrated system for smart-home control of appliances based on remote speech interaction. Ilyas Potamitis, Kallirroi Georgila, Nikos Fakotakis, George K. Kokkinakis |
| 2003 | An integrated toolkit deploying speech technology for computer based speech training with application to dysarthric speakers. Athanassios Hatzis, Phil D. Green, James Carmichael, Stuart P. Cunningham, Rebecca Palmer, Mark Parker, Peter O'Neill |
| 2003 | An investigation of intensity patterns for German. Oliver Jokisch, Marco Kühne |
| 2003 | An optimized multi-duration HMM for spontaneous speech recognition. Yuichi Ohkawa, Akihiro Yoshida, Motoyuki Suzuki, Akinori Ito, Shozo Makino |
| 2003 | Analysis and compensation of packet loss in distributed speech recognition using interleaving. Ben P. Milner, Alastair Bruce James |
| 2003 | Analysis and modeling of F0 contours of Portuguese utterances based on the command-response model. Hiroya Fujisaki, Shuichi Narusawa, Sumio Ohno, Diamantino Freitas |
| 2003 | Analysis and modeling of syllable duration for Thai speech synthesis. Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Rungkarn Siricharoenchai, Yoshinori Sagisaka |
| 2003 | Analysis of lossy vocal tract models for speech production. Karl Schnell, Arild Lacroix |
| 2003 | Analysis of the Aurora large vocabulary evaluations. Naveen Parihar, Joseph Picone |
| 2003 | Analysis of voice source characteristics using a constrained polynomial model. Tokihiko Kaburagi, Koji Kawai |
| 2003 | Applications of computer generated expressive speech for communication disorders. Jan P. H. van Santen, Lois M. Black, Gilead Cohen, Alexander Kain, Esther Klabbers, Taniya Mishra, Jacques de Villiers, Xiaochuan Niu |
| 2003 | Approaches to foreign-accented speaker-independent speech recognition. Stefanie Aalburg, Harald Höge |
| 2003 | Arabic in my hand: small-footprint synthesis of Egyptian Arabic. Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo |
| 2003 | Assessment of dereverberation algorithms for large vocabulary speech recognition systems. Koen Eneman, Jacques Duchateau, Marc Moonen, Dirk Van Compernolle, Hugo Van hamme |
| 2003 | Assessment of spoken dialogue system usability - what are we really measuring? Lars Bo Larsen |
| 2003 | Audio-visual speech recognition in challenging environments. Gerasimos Potamianos, Chalapathy Neti |
| 2003 | Audiovisual speech enhancement based on the association between speech envelope and video features. Frédéric Berthommier |
| 2003 | Auditory principles in speech processing - do computers need silicon ears? Birger Kollmeier |
| 2003 | Auditory-instrumental forensic speaker recognition. Stefan G. Gfrörer |
| 2003 | Automated closed-captioning of live TV broadcast news in French. Julie Brousseau, Jean-Francois Beaumont, Gilles Boulianne, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Frédéric Osterrath, Pierre Ouellet |
| 2003 | Automated speaker recognition in real world conditions: controlling the uncontrollable. Hirotaka Nakasone |
| 2003 | Automated transcription and topic segmentation of large spoken archives. Martin Franz, Bhuvana Ramabhadran, Todd Ward, Michael Picheny |
| 2003 | Automatic baseform generation from acoustic data. Benoît Maison |
| 2003 | Automatic call-routing without transcriptions. Qiang Huang, Stephen J. Cox |
| 2003 | Automatic construction of unique signatures and confusable sets for natural language directory assistance applications. E. E. Jan, Benoît Maison, Lidia Mangu, Geoffrey Zweig |
| 2003 | Automatic disfluency identification in conversational speech using multiple knowledge sources. Yang Liu, Elizabeth Shriberg, Andreas Stolcke |
| 2003 | Automatic estimation of perceptual age using speaker modeling techniques. Nobuaki Minematsu, Keita Yamauchi, Keikichi Hirose |
| 2003 | Automatic extraction of bilingual chunk lexicon for spoken language translation. Limin Du, Boxing Chen |
| 2003 | Automatic generation of context-independent variable parameter models using successive state and mixture splitting. Soo-Young Suk, Ho-Youl Jung, Hyun-Yeol Chung |
| 2003 | Automatic generation of non-uniform context-dependent HMM topologies based on the MDL criterion. Takatoshi Jitsuhiro, Tomoko Matsui, Satoshi Nakamura |
| 2003 | Automatic induction of n-gram language models from a natural language grammar. Stephanie Seneff, Chao Wang, Timothy J. Hazen |
| 2003 | Automatic phone set extension with confidence measure for spontaneous speech. Yi Liu, Pascale Fung |
| 2003 | Automatic prosodic prominence detection in speech using acoustic features: an unsupervised system. Fabio Tamburini |
| 2003 | Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction. Jindrich Matousek, Daniel Tihelka, Josef Psutka |
| 2003 | Automatic segmentation of film dialogues into phonemes and graphemes. Gilles Boulianne, Jean-Francois Beaumont, Patrick Cardinal, Michel Comeau, Pierre Ouellet, Pierre Dumouchel |
| 2003 | Automatic singer identification of popular music recordings via estimation and modeling of solo vocal signal. Wei-Ho Tsai, Hsin-Min Wang, Dwight Rodgers |
| 2003 | Automatic speech recognition with sparse training data for dysarthric speakers. Phil D. Green, James Carmichael, Athanassios Hatzis, Pam Enderby, Mark S. Hawley, Mark Parker |
| 2003 | Automatic speech segmentation and verification for concatenative synthesis. Chih-Chung Kuo, Chi-Shiang Kuo, Jau-Hung Chen, Sen-Chia Chang |
| 2003 | Automatic summarization of broadcast news using structural features. Sameer Maskey, Julia Hirschberg |
| 2003 | Automatic title generation for Chinese spoken documents considering the special structure of the language. Lin-Shan Lee, Shun-Chuan Chen |
| 2003 | Automatic title generation for Chinese spoken documents using an adaptive k nearest-neighbor approach. Shun-Chuan Chen, Lin-Shan Lee |
| 2003 | Automatic transcription of football commentaries in the MUMIS project. Janienke Sturm, Judith M. Kessens, Mirjam Wester, Febe de Wet, Eric Sanders, Helmer Strik |
| 2003 | Automatic transformation of environmental sounds into sound-imitation words based on Japanese syllable structure. Kazushi Ishihara, Yasushi Tsubota, Hiroshi G. Okuno |
| 2003 | Autoregressive modeling based feature extraction for Aurora3 DSR task. Petr Motlícek, Jan Cernocký |
| 2003 | Average instantaneous frequency (AIF) and average log-envelopes (ALE) for ASR with the Aurora 2 database. Yadong Wang, Jesse Hansen, Gopi Krishna Allu, Ramdas Kumaresan |
| 2003 | Band-independent speech-event categories for TRAP based ASR. Hynek Hermansky, Pratibha Jain |
| 2003 | Bandwidth mismatch compensation for robust speech recognition. Yuan-Fu Liao, Jeng-Shien Lin, Wei-Ho Tsai |
| 2003 | Bayesian induction of intonational phrase breaks. Panagiotis Zervas, Manolis Maragoudakis, Nikos Fakotakis, George Kokkinakis |
| 2003 | Bayesian networks for spoken dialogue management in multimodal systems of tour-guide robots. Plamen J. Prodanov, Andrzej Drygajlo |
| 2003 | Beyond a single critical-band in TRAP based ASR. Pratibha Jain, Hynek Hermansky |
| 2003 | Blind inversion of multidimensional functions for speech enhancement. John Hogden, Patrick Valdez, Shigeru Katagiri, Erik McDermott |
| 2003 | Blind normalization of speech from different channels. David N. Levin |
| 2003 | Blind separation and deconvolution for convolutive mixture of speech using SIMO-model-based ICA and multichannel inverse filtering. Hiroaki Yamajo, Hiroshi Saruwatari, Tomoya Takatani, Tsuyoki Nishikawa, Kiyohiro Shikano |
| 2003 | Brain imaging correlates of temporal quantization in spoken language. David Poeppel |
| 2003 | Broad focus across sentence types in Greek. Mary Baltazani |
| 2003 | Building a test collection for speech-driven web retrieval. Atsushi Fujii, Katunobu Itou |
| 2003 | CART-based factor analysis of intelligibility reduction in Japanese English. Nobuaki Minematsu, Changchen Guo, Keikichi Hirose |
| 2003 | CFA-BF: a novel combined fixed/adaptive beamforming for robust speech recognition in real car environments. Xianxian Zhang, John H. L. Hansen |
| 2003 | Characteristics of authentic anger in Hebrew speech. Noam Amir, Shirley Ziv, Rachel Cohen |
| 2003 | Child and adult speaker adaptation during error resolution in a publicly available spoken dialogue system. Linda Bell, Joakim Gustafson |
| 2003 | Classification with free energy at raised temperatures. Rita Singh, Manfred K. Warmuth, Bhiksha Raj, Paul Lamere |
| 2003 | Classifying subject ratings of emotional speech using acoustic features. Jackson Liscombe, Jennifer J. Venditti, Julia Hirschberg |
| 2003 | Collecting machine-translation-aided bilingual dialogues for corpus-based speech translation. Toshiyuki Takezawa, Gen-ichiro Kikui |
| 2003 | Combination of CFG and n-gram modeling in semantic grammar learning. Ye-Yi Wang, Alex Acero |
| 2003 | Combination of a hidden tag model and a traditional n-gram model: a case study in Czech speech recognition. Pavel Krbec, Petr Podveský, Jan Hajic |
| 2003 | Combination of finite state automata and neural network for spoken language understanding. Chai Wutiwiwatchai, Sadaoki Furui |
| 2003 | Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -. Masakiyo Fujimoto, Yasuo Ariki |
| 2003 | Combining non-uniform unit selection with diphone based synthesis. Michael Pucher, Friedrich Neubarth, Erhard Rank, Georg Niklfeld, Qi Guan |
| 2003 | Comparative analysis and synthesis of formant trajectories of British and broad Australian accents. Qin Yan, Saeed Vaseghi, Ching-Hsiang Ho, Dimitrios Rentzos, Emir Turajlic |
| 2003 | Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments. Hesham Tolba, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy |
| 2003 | Comparative study of boosting and non-boosting training for constructing ensembles of acoustic models. Rong Zhang, Alexander I. Rudnicky |
| 2003 | Comparative study on Hungarian acoustic model sets and training methods. Tibor Fegyó, Péter Mihajlik, Péter Tatai |
| 2003 | Comparing the usability of a user driven and a mixed initiative multimodal dialogue system for train timetable information. Janienke Sturm, Ilse Bakx, Bert Cranen, Jacques M. B. Terken |
| 2003 | Comparison of effects of acoustic and language knowledge on spontaneous speech perception/recognition between human and automatic speech recognizer. Norihide Kitaoka, Masahisa Shingu, Seiichi Nakagawa |
| 2003 | Compensation of channel distortion in line spectrum frequency domain. An-Tze Yu, Hsiao-Chuan Wang |
| 2003 | Compiling large-context phonetic decision trees into finite-state transducers. Stanley F. Chen |
| 2003 | Compound decomposition in Dutch large vocabulary speech recognition. Roeland Ordelman, Arjan van Hessen, Franciska de Jong |
| 2003 | Computational auditory scene analysis by using statistics of high-dimensional speech dynamics and sound source direction. Johannes Nix, Michael Kleinschmidt, Volker Hohmann |
| 2003 | Conceptual decoding for spoken dialog systems. Yannick Estève, Christian Raymond, Frédéric Béchet, Renato De Mori |
| 2003 | Conditional and joint models for grapheme-to-phoneme conversion. Stanley F. Chen |
| 2003 | Confidence measure driven scalable two-pass recognition strategy for large list grammars. Miroslav Novak, Diego Ruiz |
| 2003 | Confidence measures for phonetic segmentation of continuous speech. Samir Nefti, Olivier Boëffard, Thierry Moudenc |
| 2003 | Confusion matrix based entropy correction in multi-stream combination. Hemant Misra, Andrew C. Morris |
| 2003 | Connectionist classification and specific stochastic models in the understanding process of a dialogue system. David Vilar, María José Castro, Emilio Sanchis |
| 2003 | Consideration of muscle co-contraction in a physiological articulatory model. Jianwu Dang, Kiyoshi Honda |
| 2003 | Considerations on vowel durations for Japanese CALL system. Taro Mouri, Keikichi Hirose, Nobuaki Minematsu |
| 2003 | Construction of an advanced in-car spoken dialogue corpus and its characteristic analysis. Itsuki Kishida, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki |
| 2003 | Context awareness using environmental noise classification. Ling Ma, Dan J. Smith, Ben P. Milner |
| 2003 | Context-dependent output densities for hidden Markov models in speech recognition. Georg Stemmer, Viktor Zeißler, Christian Hacker, Elmar Nöth, Heinrich Niemann |
| 2003 | Context-sensitive evaluation and correction of phone recognition output. Michael Levit, Hiyan Alshawi, Allen L. Gorin, Elmar Nöth |
| 2003 | Continuous speech recognition and verification based on a combination score. Binfeng Yan, Rui Guo, Xiaoyan Zhu |
| 2003 | Control and prediction of the impact of pitch modification on synthetic speech quality. Esther Klabbers, Jan P. H. van Santen |
| 2003 | Control in task-oriented dialogues. Peter A. Heeman, Fan Yang, Susan E. Strayer |
| 2003 | Convergence improvement for oversampled subband adaptive noise and echo cancellation. Hamid Reza Abutalebi, Hamid Sheikhzadeh, Robert L. Brennan, George H. Freeman |
| 2003 | Corpus-based modeling of naturalness estimation in timing control for non-native speech. Makiko Muto, Yoshinori Sagisaka, Takuro Naito, Daiju Maeki, Aki Kondo, Katsuhiko Shirai |
| 2003 | Corpus-based syntax-prosody tree matching. Dafydd Gibbon |
| 2003 | Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model. Keikichi Hirose, Takayuki Ono, Nobuaki Minematsu |
| 2003 | Correction of disfluencies in spontaneous speech using a noisy-channel approach. Matthias Honal, Tanja Schultz |
| 2003 | Coupling vs. unifying: modeling techniques for speech-to-speech translation. Yuqing Gao |
| 2003 | Covariation and weighting of harmonically decomposed streams for ASR. Philip J. B. Jackson, David M. Moreno, Martin J. Russell, Javier Hernando |
| 2003 | Creating corpora for speech-to-speech translation. Gen-ichiro Kikui, Eiichiro Sumita, Toshiyuki Takezawa, Seiichi Yamamoto |
| 2003 | Cross domain Chinese speech understanding and answering based on named-entity extraction. Yun-Tien Lee, Shun-Chuan Chen, Lin-Shan Lee |
| 2003 | Cross-lingual pronunciation modelling for Indonesian speech recognition. Terrence Martin, Torbjørn Svendsen, Sridha Sridharan |
| 2003 | Cross-modal informational masking due to mismatched audio cues in a speechreading task. Douglas Brungart, Brian D. Simpson, Alexander J. Kordik |
| 2003 | Cross-stream observation dependencies for multi-stream speech recognition. Özgür Çetin, Mari Ostendorf |
| 2003 | Custom-tailoring TTS voice font - keeping the naturalness when reducing database size. Yong Zhao, Min Chu, Hu Peng, Eric Chang |
| 2003 | Cycle extraction for perfect reconstruction and rate scalability. Miguel Arjona Ramírez |
| 2003 | DOA estimation of speech signal using equilateral-triangular microphone array. Yusuke Hioka, Nozomu Hamada |
| 2003 | DTW-based phonetic alignment using multiple acoustic features. Sérgio Paulo, Luís C. Oliveira |
| 2003 | Data driven example based continuous speech recognition. Mathias De Wachter, Kris Demuynck, Dirk Van Compernolle, Patrick Wambacq |
| 2003 | Data driven generation of broad classes for decision tree construction in acoustic modeling. Andrej Zgank, Zdravko Kacic, Bogomir Horvat |
| 2003 | Data-driven pronunciation modeling for ASR using acoustic subword units. Thurid Spiess, Britta Wrede, Gernot A. Fink, Franz Kummert |
| 2003 | Database adaptation for ASR in cross-environmental conditions in the SPEECON project. Christophe Couvreur, Oren Gedge, Klaus Linhard, Shaunie Shammass, Johan Vantieghem |
| 2003 | Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling. Heiga Zen, Keiichi Tokuda, Tadashi Kitamura |
| 2003 | Dependence of GMM adaptation on feature post-processing for speaker recognition. Robbie Vogt, Jason W. Pelecanos, Sridha Sridharan |
| 2003 | Design and evaluation of a limited two-way speech translator. David Stallard, John Makhoul, Fred Choi, Ehry MacRostie, Premkumar Natarajan, Richard M. Schwartz, Bushra Zawaydeh |
| 2003 | Design of the CMU Sphinx-4 decoder. Paul Lamere, Philip Kwok, William Walker, Evandro B. Gouvêa, Rita Singh, Bhiksha Raj, Peter Wolf |
| 2003 | Designing for errors: similarities and differences of disfluency rates and prosodic characteristics across domains. Guergana K. Savova, Joan Bachenko |
| 2003 | Detection and recognition of correction utterance in spontaneously spoken dialog. Norihide Kitaoka, Naoko Kakutani, Seiichi Nakagawa |
| 2003 | Detection and separation of speech segment using audio and video information fusion. Futoshi Asano, Yoichi Motomura, Hideki Asoh, Takashi Yoshimura, Naoyuki Ichimura, Kiyoshi Yamamoto, Nobuhiko Kitawaki, Satoshi Nakamura |
| 2003 | Detection of list-type sentences. Taniya Mishra, Esther Klabbers, Jan P. H. van Santen |
| 2003 | Development of a bilingual spoken dialog system for weather information retrieval. Janez Zibert, Sanda Martincic-Ipsic, Melita Hajdinjak, Ivo Ipsic, France Mihelic |
| 2003 | Development of a stochastic dialog manager driven by semantics. Francisco Torres, Emilio Sanchis, Encarna Segarra |
| 2003 | Development of phrase translation systems for handheld computers: from concept to field. Horacio Franco, Jing Zheng, Kristin Precoda, Federico Cesari, Victor Abrash, Dimitra Vergyri, Anand Venkataraman, Harry Bratt, Colleen Richey, Ace Sarich |
| 2003 | Development of the Estonian SpeechDat-like database. Einar Meister, Jürgen Lasn, Lya Meister |
| 2003 | Dialog systems for automotive environments. Julie Baca, Feng Zheng, Hualin Gao, Joseph Picone |
| 2003 | Discriminative estimation of subspace precision and mean (SPAM) models. Vaibhava Goel, Scott Axelrod, Ramesh A. Gopinath, Peder A. Olsen, Karthik Visweswariah |
| 2003 | Discriminative methods for improving named entity extraction on speech data. James Horlock, Simon King |
| 2003 | Discriminative optimization of large vocabulary Mandarin conversational speech recognition system. Peng Ding, Zhenbiao Chen, Sheng Hu, Shuwu Zhang, Bo Xu |
| 2003 | Discriminative training and maximum likelihood detector for speaker identification. Mohamed Mihoubi, Gilles Boulianne, Pierre Dumouchel |
| 2003 | Discriminative training of n-gram classifiers for speech and text routing. Ciprian Chelba, Alex Acero |
| 2003 | Discriminative weight training for unit-selection based speech synthesis. Seung Seop Park, Chong Kyu Kim, Nam Soo Kim |
| 2003 | Disfluency under feedback and time-pressure. H. B. M. Nicholson, Ellen Gurman Bard, Anne H. Anderson, María L. Flecha-García, David Kenicer, Lucy Smallwood, Jim Mullin, Robin J. Lickley, Yiya Chen |
| 2003 | Distributed genetic algorithm to discover a wavelet packet best basis for speech recognition. Robert van Kommer, Béat Hirsbrunner |
| 2003 | Distributed speech recognition on the WSJ task. Jan Stadermann, Gerhard Rigoll |
| 2003 | Domain adaptation augmented by state-dependence in spoken dialog systems. Wei He, Honglian Li, Baozong Yuan |
| 2003 | Dominance spectrum based V/UV classification and f_0 estimation. Tomohiro Nakatani, Toshio Irino, Parham Zolfaghari |
| 2003 | Dual-mode wideband speech recovery from narrowband speech. Yasheng Qian, Peter Kabal |
| 2003 | Duration normalization and hypothesis combination for improved spontaneous speech recognition. Jon P. Nedel, Richard M. Stern |
| 2003 | Durational characteristics of Hindi stop consonants. K. Samudravijaya |
| 2003 | Dynamic channel compensation based on maximum a posteriori estimation. Huayun Zhang, Zhaobing Han, Bo Xu |
| 2003 | Earwitness line-ups: effects of speech duration, retention interval and acoustic environment on identification accuracy. Jose H. Kerstholt, E. J. M. Jansen, A. G. van Amelsvoort, A. P. A. Broeders |
| 2003 | Effect of foreign accent on speech recognition in the NATO N-4 corpus. Aaron D. Lawson, David M. Harris, John J. Grieco |
| 2003 | Effects of voice prosody by computers on human behaviors. Noriko Suzuki, Yohei Yabuta, Yugo Takeuchi, Yasuhiro Katagiri |
| 2003 | Efficient linear combination for distant n-gram models. David Langlois, Kamel Smaïli, Jean Paul Haton |
| 2003 | Efficient quantization of speech excitation parameters using temporal decomposition. Phu Chien Nguyen, Masato Akagi |
| 2003 | Efficient speech enhancement based on left-right HMM with state sequence detection using LRT. J. J. Lee, J. H. Lee, K. Y. Lee |
| 2003 | Efficient spoken dialogue control depending on the speech recognition rate and system's database. Kohji Dohsaka, Norihito Yasuda, Kiyoaki Aikawa |
| 2003 | Emotion control of Chinese speech synthesis in natural environment. Jianhua Tao |
| 2003 | Emotion recognition by speech signals. Oh-Wook Kwon, Kwokleung Chan, Jiucang Hao, Te-Won Lee |
| 2003 | Emotion recognition using a data-driven fuzzy inference system. Chul Min Lee, Shrikanth S. Narayanan |
| 2003 | Empowering end users to personalize dialogue systems through spoken interaction. Stephanie Seneff, Grace Chung, Chao Wang |
| 2003 | Energy contour extraction for in-car speech recognition. Tai-Hwei Hwang |
| 2003 | Enhance low-frequency suppression of GSC beamforming. Zhaorong Hou, Ying Jia |
| 2003 | Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition. Hua Yu, Tanja Schultz |
| 2003 | Enhancement of hearing-impaired Mandarin speech. Chen-Long Lee, Ya-ru Yang, Wen-Whei Chang, Yuan-Chuan Chiang |
| 2003 | Enhancement of noisy speech for noise robust front-end and speech reconstruction at back-end of DSR system. Hyoung-Gook Kim, Markus Schwab, Nicolas Moreau, Thomas Sikora |
| 2003 | Enhancement of speech in multispeaker environment. B. Yegnanarayana, S. R. Mahadeva Prasanna, Mathew Magimai-Doss |
| 2003 | Entropy constrained quantization of LSP parameters. Turaj Zakizadeh Shabestary, Per Hedelin, Fredrik Nordén |
| 2003 | Entropy-optimized channel error mitigation with application to speech recognition over wireless. Victoria E. Sánchez, Antonio M. Peinado, Angel M. Gomez, José L. Pérez-Córdoba |
| 2003 | Environment adaptation for robust speaker verification. Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung |
| 2003 | Environment adaptive control of noise reduction parameters for improved robustness of ASR. Chng Chin Soon, Bernt Andrassy, Josef G. Bauer, Günther Ruske |
| 2003 | Environmental sniffing: robust digit recognition for an in-vehicle environment. Murat Akbacak, John H. L. Hansen |
| 2003 | Environmental sound source identification based on hidden Markov model for robust speech recognition. Takanobu Nishiura, Satoshi Nakamura, Kazuhiro Miki, Kiyohiro Shikano |
| 2003 | Estimating Japanese word accent from syllable sequence using support vector machine. Hideharu Nakajima, Masaaki Nagata, Hisako Asano, Masanobu Abe |
| 2003 | Estimating speech recognition error rate without acoustic test data. Yonggang Deng, Milind Mahajan, Alex Acero |
| 2003 | Estimating the spectral envelope of voiced speech using multi-frame analysis. Yoshinori Shiga, Simon King |
| 2003 | Estimating the vocal-tract area function and the derivative of the glottal wave from a speech signal. Huiqun Deng, Michael P. Beddoes, Rabab Kreidieh Ward, Murray Hodgson |
| 2003 | Estimating the weight of evidence in forensic speaker verification. Beat Pfister, René Beutler |
| 2003 | Estimation of GMM in voice conversion including unaligned data. Helenca Duxans, Antonio Bonafonte |
| 2003 | Estimation of resonant characteristics based on AR-HMM modeling and spectral envelope conversion of vowel sounds. Nobuyuki Nishizawa, Keikichi Hirose, Nobuaki Minematsu |
| 2003 | Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis. Hans Kruschke, Michael Lenz |
| 2003 | Estimation of vocal noise in running speech by means of bi-directional double linear prediction. Frédéric Bettens, Francis Grenez, Jean Schoentgen |
| 2003 | Estimation of voice source and vocal tract characteristics based on multi-frame analysis. Yoshinori Shiga, Simon King |
| 2003 | Evaluating and correcting phoneme segmentation for unit selection synthesis. John Kominek, Christina L. Bennett, Alan W. Black |
| 2003 | Evaluating discourse understanding in spoken dialogue systems. Ryuichiro Higashinaka, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa |
| 2003 | Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task. Masahiko Matsushita, Hiromitsu Nishizaki, Takehito Utsuro, Yasuhiro Kodama, Seiichi Nakagawa |
| 2003 | Evaluating the effect of predicting oral reading miscues. Satanjeev Banerjee, Joseph E. Beck, Jack Mostow |
| 2003 | Evaluation frameworks for speech translation technologies. Marcello Federico |
| 2003 | Evaluation method for automatic speech summarization. Chiori Hori, Takaaki Hori, Sadaoki Furui |
| 2003 | Evaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus. Satoru Tsuge, Shingo Kuroiwa, Kenji Kita |
| 2003 | Evaluation of a speech-driven telephone information service using the PARADISE framework: a closer look at subjective measures. Paula M. T. Smeele, Juliette A. J. S. Waals |
| 2003 | Evaluation of an alert system for selective dissemination of broadcast news. Isabel Trancoso, João Paulo Neto, Hugo Meinedo, Rui Amaral |
| 2003 | Evaluation of model-based feature enhancement on the AURORA-4 task. Veronique Stouten, Hugo Van hamme, Jacques Duchateau, Patrick Wambacq |
| 2003 | Evaluation of quantile based histogram equalization with filter combination on the Aurora 3 and 4 databases. Florian Hilger, Hermann Ney |
| 2003 | Evaluation of the affect of speech intonation using a model of the perception of interval dissonance and harmonic tension. Norman D. Cook, Takeshi Fujisawa, Kazuaki Takami |
| 2003 | Evaluation of the stochastic morphosyntactic language model on a one million word Hungarian dictation task. Máté Szarvas, Sadaoki Furui |
| 2003 | Evaluation of units selection criteria in corpus-based speech synthesis. Hélène François, Olivier Boëffard |
| 2003 | Evaluation on the Aurora 2 database of acoustic models that are less noise-sensitive. Edmondo Trentin, Marco Matassoni, Marco Gori |
| 2003 | Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis. Francesc Alías, Xavier Llorà |
| 2003 | Example-based bi-directional Chinese-English machine translation with semi-automatically induced grammars. Kai-Chung Siu, Helen M. Meng, Chin-Chung Wong |
| 2003 | Experimental evaluation of the relevance of prosodic features in Spanish using machine learning techniques. David Escudero Mancebo, Valentín Cardeñoso-Payo, Antonio Bonafonte |
| 2003 | Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: effects of voice gender and signal quality. Catherine J. Stevens, Nicole Lees, Julie Vonwiller |
| 2003 | Exploiting order-preserving perfect hashing to speedup n-gram language model lookahead. Xiaolong Li, Yunxin Zhao |
| 2003 | Exploiting speech for recognizing elderly users to respond to their special needs. Christian A. Müller, Frank Wittig, Jörg Baus |
| 2003 | Exploiting time warping in AMR-NB and AMR-WB speech coders. Lasse Laaksonen, Sakari Himanen, Ari Heikkinen, Jani Nurminen |
| 2003 | Exploiting unlabeled utterances for spoken language understanding. Gökhan Tür, Dilek Hakkani-Tür |
| 2003 | Extracting an AV speech source from a mixture of signals. David Sodoyer, Laurent Girin, Christian Jutten, Jean-Luc Schwartz |
| 2003 | Extraction methods of voicing feature for robust speech recognition. András Zolnay, Ralf Schlüter, Hermann Ney |
| 2003 | FEM analysis based on 3-d time-varying vocal tract shape. Koji Sasaki, Nobuhiro Miki, Yoshikazu Miyanaga |
| 2003 | FLaVoR: a flexible architecture for LVCSR. Kris Demuynck, Tom Laureys, Dirk Van Compernolle, Hugo Van hamme |
| 2003 | F_0 estimation of one or several voices. Alain de Cheveigné, Alexis Baskind |
| 2003 | Factorial models and refiltering for speech separation and denoising. Sam T. Roweis |
| 2003 | Far-field ASR on inexpensive microphones. Laura Docío Fernández, David Gelbart, Nelson Morgan |
| 2003 | Fast incremental adaptation using maximum likelihood regression and stochastic gradient descent. Sreeram V. Balakrishnan |
| 2003 | Feature compensation scheme based on parallel combined mixture model. Wooil Kim, Sungjoo Ahn, Hanseok Ko |
| 2003 | Feature compensation technique for robust speech recognition in noisy environments. Young Joon Kim, Hyun Woo Kim, Woohyung Lim, Nam Soo Kim |
| 2003 | Feature generation based on maximum classification probability for improved speech recognition. Xiang Li, Richard M. Stern |
| 2003 | Feature selection for the classification of crosstalk in multi-channel audio. Stuart N. Wrigley, Guy J. Brown, Vincent Wan, Steve Renals |
| 2003 | Feature transformations and combinations for improving ASR performance. Panu Somervuo, Barry Y. Chen, Qifeng Zhu |
| 2003 | Features for tree based dialogue course management. Klaus Macherey, Hermann Ney |
| 2003 | Features of contracted syllables of spontaneous Mandarin. Shu-Chuan Tseng |
| 2003 | Fitting class-based language models into weighted finite-state transducer framework. Pavel Ircing, Josef Psutka |
| 2003 | Flexible speech act identification of spontaneous speech with disfluency. Chung-Hsien Wu, Gwo-Lang Yan |
| 2003 | Flooring the observation probability for robust ASR in impulsive noise. Pei Ding, Bertram E. Shi, Pascale Fung, Zhigang Cao |
| 2003 | French intonational rises and their role in speech segmentation. Pauline Welby |
| 2003 | Frequency distribution based weighted sub-band approach for classification of emotional/stressful content in speech. Mandar A. Rahurkar, John H. L. Hansen |
| 2003 | Frequency-related representation of speech. Kuldip K. Paliwal, Bishnu S. Atal |
| 2003 | From Switchboard to Fisher: telephone collection protocols, their uses and yields. Christopher Cieri, David Miller, Kevin Walker |
| 2003 | Fusing high- and low-level features for speaker recognition. Joseph P. Campbell, Douglas A. Reynolds, Robert B. Dunn |
| 2003 | GMM-based voice conversion applied to emotional speech synthesis. Hiromichi Kawanami, Yohei Iwami, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2003 | Gaussian dynamic warping (GDW) method applied to text-dependent speaker detection and verification. Jean-François Bonastre, Philippe Morin, Jean-Claude Junqua |
| 2003 | Generation and perception of f_0 markedness in conversational speech with adverbs expressing degrees. Takumi Yamashita, Yoshinori Sagisaka |
| 2003 | Generation of natural response timing using decision tree based on prosodic and linguistic information. Masashi Takeuchi, Norihide Kitaoka, Seiichi Nakagawa |
| 2003 | Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation. Huayun Zhang, Bo Xu |
| 2003 | Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis. Parham Zolfaghari, Tomohiro Nakatani, Toshio Irino, Hideki Kawahara, Fumitada Itakura |
| 2003 | Glottal spectrum based inverse filtering. Ixone Arroabarren, Alfonso Carlosena |
| 2003 | Grapheme based speech recognition. Mirjam Killer, Sebastian Stüker, Tanja Schultz |
| 2003 | Grapheme to phoneme conversion and dictionary verification using graphonemes. Paul Vozila, Jeff Adams, Yuliya Lobacheva, Ryan Thomas |
| 2003 | HARTFEX: a multi-dimensional system of HMM based recognisers for articulatory features extraction. Tarek Abu-Amer, Julie Carson-Berndsen |
| 2003 | Harmonic alternatives to sine-wave speech. László Tóth, András Kocsor |
| 2003 | Harmonic weighting for all-pole modeling of the voiced speech. Davor Petrinovic |
| 2003 | Hidden feature models for speech recognition using dynamic Bayesian networks. Karen Livescu, James R. Glass, Jeff A. Bilmes |
| 2003 | Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition. Imed Zitouni, Olivier Siohan, Chin-Hui Lee |
| 2003 | Hierarchical topic classification for dialog speech recognition based on language model switching. Ian R. Lane, Tatsuya Kawahara, Tomoko Matsui, Satoshi Nakamura |
| 2003 | High-likelihood model based on reliability statistics for robust combination of features: application to noisy speech recognition. Peter Jancovic, Münevver Köküer, Fionn Murtagh |
| 2003 | How NLP techniques can improve speech understanding: ROMUS - a robust chunk based message understanding system using link grammars. Jérôme Goulian, Jean-Yves Antoine, Franck Poirier |
| 2003 | How does human segment the speech by prosody? Toshie Hatano, Yasuo Horiuchi, Akira Ichikawa |
| 2003 | Hybrid HMM/BN ASR system integrating spectrum and articulatory features. Konstantin Markov, Jianwu Dang, Yosuke Iizuka, Satoshi Nakamura |
| 2003 | ISCA special session: hot topics in speech synthesis. Gérard Bailly, Nick Campbell, Bernd Möbius |
| 2003 | Identifying speakers in children's stories for speech synthesis. Jason Y. Zhang, Alan W. Black, Richard Sproat |
| 2003 | Illusory continuity of intermittent pure tone in binaural listening and its dependency on interaural time difference. Mamoru Iwaki, Norio Nakamura |
| 2003 | Impact of audio segmentation and segment clustering on automated transcription accuracy of large spoken archives. Bhuvana Ramabhadran, Jing Huang, Upendra V. Chaudhari, Giridharan Iyengar, Harriet J. Nock |
| 2003 | Impact of word graph density on the quality of posterior probability based confidence measures. Tibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae |
| 2003 | Implementation and evaluation of a text-to-speech synthesis system for Turkish. Özgül Salor, Bryan L. Pellom, Mübeccel Demirekler |
| 2003 | Implementing an SSML compliant concatenative TTS system. Andrew P. Breen, Steve Minnis, Barry Eggleton |
| 2003 | Improved Chinese broadcast news transcription by language modeling with temporally consistent training corpora and iterative phrase extraction. Pi-Chuan Chang, Shuo-Peng Liao, Lin-Shan Lee |
| 2003 | Improved emotion recognition with large set of statistical features. Vladimir Hozjan, Zdravko Kacic |
| 2003 | Improved feature extraction based on spectral noise reduction and nonlinear feature normalization. José C. Segura, Javier Ramírez, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio |
| 2003 | Improved Kalman filter-based speech enhancement. Jianqiang Wei, Limin Du, Zhaoli Yan, Hui Zeng |
| 2003 | Improved name recognition with user modeling. Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero |
| 2003 | Improved robustness of automatic speech recognition using a new class definition in linear discriminant analysis. Martin Schafföner, Marcel Katz, Sven E. Krüger, Andreas Wendemuth |
| 2003 | Improved speaker verification through probabilistic subspace adaptation. Simon Lucey, Tsuhan Chen |
| 2003 | Improvement of non-native speech recognition by effectively modeling frequently observed pronunciation habits. Nobuaki Minematsu, Koichi Osaki, Keikichi Hirose |
| 2003 | Improving "how may I help you?" systems using the output of recognition lattices. James Allen, David Attwater, Peter J. Durston, Mark Farrell |
| 2003 | Improving a connectionist based syntactical language model. Ahmad Emami |
| 2003 | Improving speech intelligibility by steady-state suppression as pre-processing in small to medium sized halls. Nao Hodoshima, Takayuki Arai, Tsuyoshi Inoue, Keisuke Kinoshita, Akiko Kusumoto |
| 2003 | Improving statistical natural concept generation in interlingua-based speech-to-speech translation. Liang Gu, Yuqing Gao, Michael Picheny |
| 2003 | Improving the accuracy of pronunciation prediction for unit selection TTS. Justin Fackrell, Wojciech Skut, Kathrine Hammervold |
| 2003 | Improving the competitiveness of discriminant neural networks in speaker verification. Carlos Vivaracho-Pascual, Javier Ortega-Garcia, Luis Alonso Romero, Q. Isaac Moro-Sancho |
| 2003 | Improving the efficiency of automatic speech recognition by feature transformation and dimensionality reduction. Xuechuan Wang, Douglas D. O'Shaughnessy |
| 2003 | In search of target class definition in tandem feature extraction. Sunil Sivadas, Hynek Hermansky |
| 2003 | Incremental and iterative monolingual clustering algorithms. Sergio Barrachina, Juan Miguel Vilar |
| 2003 | Incremental learning of new user formulations in automatic directory assistance. Marco Andorno, Luciano Fissore, Pietro Laface, Mario Nigra, Cosmin Popovici, Franco Ravera, Claudio Vair |
| 2003 | Independent automatic segmentation by self-learning categorial pronunciation rules. Nicole Beringer |
| 2003 | Influence of recording equipment on the identification of second language phoneme contrasts. Hiroaki Kato, Masumi Nukinay, Hideki Kawahara, Reiko Akahane-Yamada |
| 2003 | Influence of the waveguide propagation on the antenna performance in a car cabin. Leonid G. Krasny, Ali S. Khayrallah |
| 2003 | Information retrieval based call classification. Jan Kneissler, Anne K. Kienappel, Dietrich Klakow |
| 2003 | Information structure and efficiency in speech production. R. J. J. H. van Son, Louis C. W. Pols |
| 2003 | Inhibitory priming effect in auditory word recognition: the role of the phonological mismatch length between primes and targets. Sophie Dufour, Ronald Peereman |
| 2003 | Inline updates for HMMs. Ashutosh Garg, Manfred K. Warmuth |
| 2003 | Integrated pitch and MFCC extraction for speech reconstruction and speech recognition applications. Xu Shao, Ben P. Milner, Stephen J. Cox |
| 2003 | Integrating multilingual articulatory features into speech recognition. Sebastian Stüker, Florian Metze, Tanja Schultz, Alex Waibel |
| 2003 | Integrating statistical and rule-based knowledge for continuous German speech recognition. René Beutler, Beat Pfister |
| 2003 | Integration of noise reduction algorithms for Aurora2 task. Takeshi Yamada, Jiro Okada, Kazuya Takeda, Norihide Kitaoka, Masakiyo Fujimoto, Shingo Kuroiwa, Kazumasa Yamamoto, Takanobu Nishiura, Mitsunori Mizumachi, Satoshi Nakamura |
| 2003 | Integration of speaker recognition into conversational spoken dialogue systems. Timothy J. Hazen, Douglas A. Jones, Alex Park, Linda C. Kukolich, Douglas A. Reynolds |
| 2003 | Introduction of the CELP structure of the GSM coder in the acoustic echo canceller for the GSM network. H. Gnaba, Monia Turki-Hadj Alouane, Meriem Jaïdane-Saïdane, Pascal Scalart |
| 2003 | Investigation of emotionally morphed speech perception and its structure using a high quality speech manipulation system. Hisami Matsui, Hideki Kawahara |
| 2003 | Is there an emotion signature in intonational patterns? and can it be used in synthesis? Tanja Bänziger, Michel Morel, Klaus R. Scherer |
| 2003 | Isolated word verification using cohort word-level verification. Kishan Thambiratnam, Sridha Sridharan |
| 2003 | Jacobian adaptation based on the frequency-filtered spectral energies. Alberto Abad, Climent Nadeu, Javier Hernando, Jaume Padrell |
| 2003 | Japanese prosodic labeling support system utilizing linguistic information. Shinya Kiriyama, Yoshifumi Mitsuta, Yuta Hosokawa, Yoshikazu Hashimoto, Toshihiko Itoh, Shigeyoshi Kitazawa |
| 2003 | Jaspis^2 - an architecture for supporting distributed spoken dialogues. Markku Turunen, Jaakko Hakulinen |
| 2003 | Joint estimation of thresholds in a bi-threshold verification problem. Simon Ka-Lung Ho, Brian Mak |
| 2003 | Joint model and feature based compensation for robust speech recognition under non-stationary noise environments. Chuan Jia, Peng Ding, Bo Xu |
| 2003 | Kalman-filter based join cost for unit-selection speech synthesis. Jithendra Vepa, Simon King |
| 2003 | Keeping rare events rare. Ove Andersen, Charles Hoequist |
| 2003 | LET's GO: improving spoken dialog systems for the elderly and non-natives. Antoine Raux, Brian Langner, Alan W. Black, Maxine Eskénazi |
| 2003 | LUCIA a new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model. Piero Cosi, Andrea Fusaro, Graziano Tisato |
| 2003 | Language identification using parallel sub-word recognition - an ergodic HMM equivalence. V. Ramasubramanian, A. K. V. Sai Jayram, T. V. Sreenivas |
| 2003 | Language model accuracy and uncertainty in noise cancelling in the stochastic weighted Viterbi algorithm. Néstor Becerra Yoma, Ivan Brito, Jorge F. Silva |
| 2003 | Language model adaptation using cross-lingual information. Woosung Kim, Sanjeev Khudanpur |
| 2003 | Language model adaptation using word clustering. Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh |
| 2003 | Language-adaptive Persian speech recognition. Naveen Srinivasamurthy, Shrikanth S. Narayanan |
| 2003 | Language-reconfigurable universal phone recognition. Brenton D. Walker, Bradley C. Lackey, Jennifer S. Muller, Patrick John Schone |
| 2003 | Large corpus experiments for broadcast news recognition. Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua |
| 2003 | Large lexica for speech-to-speech translation: from specification to creation. Elviira Hartikainen, Giulio Maltese, Asunción Moreno, Shaunie Shammass, Ute Ziegenhain |
| 2003 | Large margin methods for label sequence learning. Yasemin Altun, Thomas Hofmann |
| 2003 | Large vocabulary ASR for spontaneous Czech in the MALACH project. Josef Psutka, Pavel Ircing, Josef V. Psutka, Vlasta Radová, William J. Byrne, Jan Hajic, Jirí Mírovský, Samuel Gustman |
| 2003 | Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system. Vassilios Digalakis, Dimitris Oikonomidis, Dimitris Pratsolis, Nikos Tsourakis, Christos Vosnidis, Nikos Chatzichrisafis, Vassilios Diakoloukas |
| 2003 | Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices. Scott Axelrod, Vaibhava Goel, Brian Kingsbury, Karthik Visweswariah, Ramesh A. Gopinath |
| 2003 | Large vocabulary noise robustness on Aurora4. Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua |
| 2003 | Large vocabulary speaker independent isolated word recognition for embedded systems. Sergey Astrov, Bernt Andrassy |
| 2003 | Large vocabulary Taiwanese (Min-nan) speech recognition using tone features and statistical pronunciation modeling. Dau-Cheng Lyu, Min-Siong Liang, Yuang-Chin Chiang, Chun-Nan Hsu, Ren-yuan Lyu |
| 2003 | Latent ability to manipulate phonemes by Japanese preliterates in Roman alphabet. Takashi Otake, Yoko Sakamoto |
| 2003 | Lattice segmentation and minimum Bayes risk discriminative training. Vlasios Doumpiotis, Stavros Tsakalidis, William J. Byrne |
| 2003 | Learning Chinese tones. Valery A. Petrushin |
| 2003 | Learning discriminative temporal patterns in speech: development of novel TRAPS-like classifiers. Barry Y. Chen, Shuangyu Chang, Sunil Sivadas |
| 2003 | Learning intra-speaker model parameter correlations from many short speaker segments. Anne K. Kienappel |
| 2003 | Learning linguistically valid pronunciations from acoustic data. Françoise Beaufays, Ananth Sankar, Shaun Williams, Mitch Weintraub |
| 2003 | Learning phrase break detection in Thai text-to-speech. Virongrong Tesprasit, Paisarn Charoenpornsawat, Virach Sornlertlamvanich |
| 2003 | Learning rule ranking by dynamic construction of context-free grammars using AND/OR graphs. Anna Corazza, Louis ten Bosch |
| 2003 | Learning to boost GMM based speaker verification. Stan Z. Li, Dong Zhang, Chengyuan Ma, Heung-Yeung Shum, Eric Chang |
| 2003 | Lexica and corpora for speech-to-speech translation: a trilingual approach. David Conejero, Jesús Giménez, Victoria Arranz, Antonio Bonafonte, Neus Pascual, Núria Castell, Asunción Moreno |
| 2003 | Likelihood ratio test with complex Laplacian model for voice activity detection. Joon-Hyuk Chang, Jong Won Shin, Nam Soo Kim |
| 2003 | Linear predictive method with low-frequency emphasis. Paavo Alku, Tom Bäckström |
| 2003 | Live speech recognition in sports games by adaptation of acoustic model and language model. Yasuo Ariki, Takeru Shigemori, Tsuyoshi Kaneko, Jun Ogata, Masakiyo Fujimoto |
| 2003 | Local averaging and differentiating of spectral plane for TRAP-based ASR. Frantisek Grézl, Hynek Hermansky |
| 2003 | Local regularity analysis at glottal opening and closure instants in electroglottogram signal using wavelet transform modulus maxima. Aïcha Bouzid, Noureddine Ellouze |
| 2003 | Localized spectro-temporal features for automatic speech recognition. Michael Kleinschmidt |
| 2003 | Locally recurrent probabilistic neural network for text-independent speaker verification. Todor Ganchev, Dimitris K. Tasoulis, Michael N. Vrahatis, Nikos Fakotakis |
| 2003 | Locus equations determination using the SpeechDat(II). Bojan Petek |
| 2003 | Low complexity joint optimization of excitation parameters in analysis-by-synthesis speech coding. Udar Mittal, James P. Ashley, Edgardo M. Cruz-Zeno |
| 2003 | Low memory acoustic models for HMM based speech recognition. Tommi Lahti, Olli Viikki, Marcel Vasilache |
| 2003 | Low resource lip finding and tracking algorithm for embedded devices. Jesus F. Guitarte Perez, Klaus Lukas, Alejandro F. Frangi |
| 2003 | Low-latency incremental speech transcription in the Synface project. Alexander Seward |
| 2003 | MMI-MAP and MPE-MAP for acoustic model adaptation. Daniel Povey, Mark J. F. Gales, Do Yeong Kim, Philip C. Woodland |
| 2003 | Mandarin speech prosody: issues, pitfalls and directions. Chiu-yu Tseng |
| 2003 | Markov chain Monte Carlo methods for noise robust feature extraction using the autoregressive model. Robert W. Morris, Jon A. Arrowood, Mark A. Clements |
| 2003 | Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs. Wu Chou, Xiaodong He |
| 2003 | Maximum conditional mutual information projection for speech recognition. Mohamed Kamal Omar, Mark Hasegawa-Johnson |
| 2003 | Maximum entropy Good-Turing estimator for language modeling. Juan P. Piantanida, Claudio Estienne |
| 2003 | Maximum likelihood endpoint detection with time-domain features. Marco Orlandi, Alfiero Santarelli, Daniele Falavigna |
| 2003 | Maximum likelihood normalization for robust speech recognition. Yiu-Pong Lai, Man-Hung Siu |
| 2003 | Maximum likelihood sub-band weighting for robust speech recognition. Donglai Zhu, Satoshi Nakamura, Kuldip K. Paliwal, Ren-Hua Wang |
| 2003 | Measuring the readability of automatic speech-to-text transcripts. Douglas A. Jones, Florian Wolf, Edward Gibson, Elliott Williams, Evelina Fedorenko, Douglas A. Reynolds, Marc A. Zissman |
| 2003 | Methods for estimation of glottal pulses waveforms exciting voiced speech. Milan Bostik, Milan Sigmund |
| 2003 | Methods to improve its portability of a spoken dialog system both on task domains and languages. YunBiao Xu, Fengying Di, Masahiro Araki, Yasuhisa Niimi |
| 2003 | Microphone array voice activity detection and noise suppression using wideband generalized likelihood ratio. Ilyas Potamitis, Eran Fishler |
| 2003 | Minimum classification error (MCE) model adaptation of continuous density HMMs. Xiaodong He, Wu Chou |
| 2003 | Minimum variance distortionless response on a warped frequency scale. Matthias Wölfel, John W. McDonough, Alex Waibel |
| 2003 | Mis-recognized utterance detection using multiple language models generated by clustered sentences. Katsuhisa Fujinaga, Hiroaki Kokubo, Hirofumi Yamamoto, Gen-ichiro Kikui, Hiroshi Shimodaira |
| 2003 | Missing feature theory applied to robust speech recognition over IP network. Toshiki Endo, Shingo Kuroiwa, Satoshi Nakamura |
| 2003 | Mixed physical modeling techniques applied to speech production. Matti Karjalainen |
| 2003 | Mixed-lingual spoken word recognition by using VQ codebook sequences of variable length segments. Hiroaki Kojima, Kazuyo Tanaka |
| 2003 | Mixed-lingual text analysis for polyglot TTS synthesis. Beat Pfister, Harald Romsdorfer |
| 2003 | Model based noisy speech recognition with environment parameters estimated by noise adaptive speech recognition with prior. Kaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura |
| 2003 | Model compression for GMM based speaker recognition systems. Douglas A. Reynolds |
| 2003 | Model-integration rapid training based on maximum likelihood for speech recognition. Shinichi Yoshizawa, Kiyohiro Shikano |
| 2003 | Modeling Cantonese pronunciation variation by acoustic model refinement. Patgi Kam, Tan Lee, Frank K. Soong |
| 2003 | Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition. Kyong-Nim Lee, Minhwa Chung |
| 2003 | Modeling duration patterns for speaker recognition. Luciana Ferrer, Harry Bratt, Venkata Ramana Rao Gadde, Sachin S. Kajarekar, Elizabeth Shriberg, M. Kemal Sönmez, Andreas Stolcke, Anand Venkataraman |
| 2003 | Modeling linguistic features in speech recognition. Min Tang, Stephanie Seneff, Victor W. Zue |
| 2003 | Modeling of various speaking styles and emotions for HMM-based speech synthesis. Junichi Yamagishi, Koji Onishi, Takashi Masuko, Takao Kobayashi |
| 2003 | Modeling speaking rate for voice fonts. Ashish Verma, Arun Kumar |
| 2003 | Modelling human speech recognition using automatic speech recognition paradigms in SpeM. Odette Scharenborg, James M. McQueen, Louis ten Bosch, Dennis Norris |
| 2003 | Modulation spectral filtering of speech. Les E. Atlas |
| 2003 | Modulation spectrum for pitch and speech pause detection. Olaf Schreiner |
| 2003 | Morpheme-based lexical modeling for Korean broadcast news transcription. Young-Hee Park, Dong-Hoon Ahn, Minhwa Chung |
| 2003 | Morphological filtering of speech spectrograms in the context of additive noise. Francisco Romero Rodriguez, Wei Ming Liu, Nicholas W. D. Evans, John S. D. Mason |
| 2003 | Multi-array fusion for beamforming and localization of moving speakers. Ilyas Potamitis, George Tremoulis, Nikos Fakotakis, George Kokkinakis |
| 2003 | Multi-channel sentence classification for spoken dialogue language modeling. Frédéric Béchet, Giuseppe Riccardi, Dilek Z. Hakkani-Tür |
| 2003 | Multi-class extractive voicemail summarization. Konstantinos Koumpis, Steve Renals |
| 2003 | Multi-mode matrix quantizer for low bit rate LSF quantization. Ulpu Sinervo, Jani Nurminen, Ari Heikkinen, Jukka Saarinen |
| 2003 | Multi-mode quantization of adjacent speech parameters using a low-complexity prediction scheme. Jani Nurminen |
| 2003 | Multi-rate extension of the scalable to lossless PSPIHT audio coder. Mohammed Raad, Ian S. Burnett, Alfred Mertins |
| 2003 | Multi-referenced correction of the voice timbre distortions in telephone networks. Gaël Mahé, André Gilloire |
| 2003 | Multi-resolution auditory scene analysis: robust speech recognition using pattern-matching from a noisy signal. Sue Harding, Georg F. Meyer |
| 2003 | Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval. Wai Kit Lo, Yuk-Chi Li, Gina-Anne Levow, Hsin-Min Wang, Helen M. Meng |
| 2003 | Multi-source training and adaptation for generic speech recognition. Fabrice Lefèvre, Jean-Luc Gauvain, Lori Lamel |
| 2003 | Multi-speaker DOA tracking using interactive multiple models and probabilistic data association. Ilyas Potamitis, George Tremoulis, Nikos Fakotakis |
| 2003 | Multigram-based grapheme-to-phoneme conversion for LVCSR. Maximilian Bisani, Hermann Ney |
| 2003 | Multilayered extensions to the speech synthesis markup language for describing expressiveness. Ellen Eide, Raimo Bakis, Wael Hamza, John F. Pitrelli |
| 2003 | Multilingual acoustic modeling using graphemes. Stephan Kanthak, Hermann Ney |
| 2003 | Multilingual phone clustering for recognition of spontaneous Indonesian speech utilising pronunciation modelling techniques. Eddie Wong, Terrence Martin, Torbjørn Svendsen, Sridha Sridharan |
| 2003 | Multimodal interaction on PDA's integrating speech and pen inputs. Sorin Dusan, Gregory J. Gadbois, James L. Flanagan |
| 2003 | Multimodality and speech technology: verbal and non-verbal communication in talking agents. Björn Granström, David House |
| 2003 | Multitask learning in connectionist robust ASR using recurrent neural networks. Shahla Parveen, Phil D. Green |
| 2003 | My voice, your prosody: sharing a speaker specific prosody model across speakers in unit selection TTS. Matthew P. Aylett, Justin Fackrell, Peter Rutten |
| 2003 | NIST 2003 language recognition evaluation. Alvin F. Martin, Mark A. Przybocki |
| 2003 | Named entity extraction from Japanese broadcast news. Akio Kobayashi, Franz Josef Och, Hermann Ney |
| 2003 | Named entity extraction from word lattices. James Horlock, Simon King |
| 2003 | Natural language response generation in mixed-initiative dialogs using task goals and dialog acts. Helen M. Meng, Wing Lin Yip, Oi Yan Mok, Shuk Fong Chan |
| 2003 | Nearest-neighbor search algorithms based on subcodebook selection and its application to speech recognition. José A. R. Fonollosa |
| 2003 | Neural networks versus codebooks in an application for bandwidth extension of speech signals. Bernd Iser, Gerhard Schmidt |
| 2003 | New MAP estimators for speaker recognition. Patrick Kenny, Mohamed Mihoubi, Pierre Dumouchel |
| 2003 | New model-based HMM distances with applications to run-time ASR error estimation and model tuning. Chao-Shih Huang, Chin-Hui Lee, Hsiao-Chuan Wang |
| 2003 | Noise reduction using paired-microphones on non-equally-spaced microphone arrangement. Mitsunori Mizumachi, Satoshi Nakamura |
| 2003 | Noise robust digit recognition with missing frames. Cenk Demiroglu, David V. Anderson |
| 2003 | Noise robust speech parameterization based on joint wavelet packet decomposition and autoregressive modeling. Bojan Kotnik, Zdravko Kacic, Bogomir Horvat |
| 2003 | Noise robustness in speech to speech translation. Fu-Hua Liu, Yuqing Gao, Liang Gu, Michael Picheny |
| 2003 | Noise-robust ASR by using distinctive phonetic features approximated with logarithmic normal distribution of HMM. Takashi Fukuda, Tsuneo Nitta |
| 2003 | Noise-robust automatic speech recognition using orthogonalized distinctive phonetic feature vectors. Takashi Fukuda, Tsuneo Nitta |
| 2003 | Non-audible murmur recognition. Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell |
| 2003 | Non-intrusive assessment of perceptual speech quality using a self-organising map. Dorel Picovici, Abdulhussain E. Mahdi |
| 2003 | Non-linear compression of feature vectors using transform coding and non-uniform bit allocation. Ben P. Milner |
| 2003 | Non-linear maximum likelihood feature transformation for speech recognition. Mohamed Kamal Omar, Mark Hasegawa-Johnson |
| 2003 | Non-native spontaneous speech recognition through polyphone decision tree specialization. Zhirong Wang, Tanja Schultz |
| 2003 | Nonlinear analysis of speech signals: generalized dimensions and Lyapunov exponents. Vassilis Pitsikalis, Iasonas Kokkinos, Petros Maragos |
| 2003 | Normalization of time-derivative parameters using histogram equalization. Yasunari Obuchi, Richard M. Stern |
| 2003 | Novel approaches for one- and two-speaker detection. Sachin S. Kajarekar, André Gustavo Adami, Hynek Hermansky |
| 2003 | On cohort selection for speaker verification. Yaniv Zigel, Arnon Cohen |
| 2003 | On divergence based clustering of normal distributions and its application to HMM adaptation. Tor André Myrvoll, Frank K. Soong |
| 2003 | On factorizing spectral dynamics for robust speech recognition. Vivek Tyagi, Iain McCowan, Hervé Bourlard, Hemant Misra |
| 2003 | On lexicon creation for Turkish LVCSR. Kadri Hacioglu, Bryan L. Pellom, Tolga Çiloglu, Özlem Öztürk, Mikko Kurimo, Mathias Creutz |
| 2003 | On the advantage of frequency-filtering features for speech recognition with variable sampling frequencies. experiments with speechdatcar databases. Hermann Bauerecker, Climent Nadeu, Jaume Padrell |
| 2003 | On the amount of speech data necessary for successful speaker identification. Ales Padrta, Vlasta Radová |
| 2003 | On the combination of speech and speaker recognition. Mohamed Faouzi BenZeghiba, Hervé Bourlard |
| 2003 | On the design of cost functions for unit-selection speech synthesis. Francisco Campillo Díaz, Eduardo Rodríguez Banga |
| 2003 | On the fusion of dissimilarity-based classifiers for speaker identification. Tomi Kinnunen, Ville Hautamäki, Pasi Fränti |
| 2003 | On the limits of cluster-based acoustic modeling. S. Douglas Peters |
| 2003 | On the number of Gaussian components in a mixture: an application to speaker verification tasks. Mijail Arcienega, Andrzej Drygajlo |
| 2003 | On the role of intonation in the organization of Mandarin Chinese speech prosody. Chiu-yu Tseng |
| 2003 | On the use of kernel PCA for feature extraction in speech recognition. Amaro A. de Lima, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura |
| 2003 | On unit analysis for Cantonese corpus-based TTS. Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li |
| 2003 | On-line parametric histogram equalization techniques for noise robust embedded speech recognition. Hemmo Haverinen, Imre Kiss |
| 2003 | On-line user modelling in a mobile spoken dialogue system. Niels Ole Bernsen |
| 2003 | Optimality criteria in inverse problems for tongue-jaw interaction. Alexander S. Leonov, Victor N. Sorokin |
| 2003 | Optimization of the CELP model in the LSP domain. Khosrow Lashkari, Toshio Miki |
| 2003 | Optimization of window and LSF interpolation factor for the ITU-T G.729 speech coding standard. Wai C. Chu, Toshio Miki |
| 2003 | Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations. Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki |
| 2003 | OrienTel: recording telephone speech of Turkish speakers in Germany. Christoph Draxler |
| 2003 | Overlapped di-tone modeling for tone recognition in continuous Cantonese speech. Yao Qian, Tan Lee, Yujia Li |
| 2003 | PPRLM optimization for language identification in air traffic control tasks. Ricardo de Córdoba, G. Prime, Javier Macías Guarasa, Juan Manuel Montero, Javier Ferreiros, José Manuel Pardo |
| 2003 | Parametric multi-band automatic gain control for noisy speech enhancement. Mikhail Stolbov, Serguei Koval, Mikhail Khitrov |
| 2003 | Parsing spontaneous speech. Rodolfo Delmonte |
| 2003 | Perceiving emotions by ear and by eye. Béatrice de Gelder |
| 2003 | Perception of English lexical stress by English and Japanese speakers: effect of duration and "realistic" intensity change. Shinichi Tokuma |
| 2003 | Perception of voice-individuality for distortions of resonance/source characteristics and waveforms. Hisao Kuwabara |
| 2003 | Perceptual MVDR-based cepstral coefficients (PMCCs) for high accuracy speech recognition. Umit H. Yapanel, Satya Dharanipragada, John H. L. Hansen |
| 2003 | Perceptual based speech enhancement for normal-hearing and hearing-impaired individuals. Ajay Natarajan, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz |
| 2003 | Perceptual irrelevancy removal in narrowband speech coding. Marja Lahdekorpi, Jani Nurminen, Ari Heikkinen, Jukka Saarinen |
| 2003 | Perceptual wavelet adaptive denoising of speech. Qiang Fu, Eric A. Wan |
| 2003 | Perceptually weighted linear transformations for voice conversion. Hui Ye, Steve J. Young |
| 2003 | Perceptually-constrained generalized singular value decomposition-based approach for enhancing speech corrupted by colored noise. Gwo-hwa Ju, Lin-Shan Lee |
| 2003 | Perceptually-related acoustic-prosodic features of phrase finals in spontaneous speech. Carlos Toshinori Ishi, Parham Mokhtari, Nick Campbell |
| 2003 | Performance evaluation of IFAS-based fundamental frequency estimator in noisy environment. Dhany Arifianto, Takao Kobayashi |
| 2003 | Performance evaluation of phonotactic and contextual onset-rhyme models for speech recognition of Thai language. Somchai Jitapunkul, Ekkarit Maneenoi, Visarut Ahkuputra, Sudaporn Luksaneeyanawin |
| 2003 | Performance improvement of rapid speaker adaptation based on eigenvoice and bias compensation. Jong Se Park, Hwa Jeon Song, Hyung Soon Kim |
| 2003 | Person authentication by voice: a need for caution. Jean-François Bonastre, Frédéric Bimbot, Louis-Jean Boë, Joseph P. Campbell, Douglas A. Reynolds, Ivan Magrin-Chagnolleau |
| 2003 | Phonetic class-based speaker verification. Matthieu Hébert, Larry P. Heck |
| 2003 | Physical and perceptual configurations of Japanese fricatives from multidimensional scaling analyses. Won Tokuma |
| 2003 | Pitch estimation using phase locked loops. Patricia A. Pelle, Matias L. Capeletto |
| 2003 | Polar quantization of sinusoids from speech signal blocks. Harald Pobloth, Renat Vafin, W. Bastiaan Kleijn |
| 2003 | Potential audiovisual correlates of contrastive focus in French. Marion Dohen, Hélène Loevenbruck, Marie-Agnès Cathiard, Jean-Luc Schwartz |
| 2003 | Predicting the perceptive judgment of voices in a telecom context: selection of acoustic parameters. Thibaut Ehrette, Noël Chateau, Christophe d'Alessandro, Valérie Maffiolo |
| 2003 | Prediction of Fujisaki model's phrase commands. João Paulo Ramos Teixeira, Diamantino Freitas, Hiroya Fujisaki |
| 2003 | Prediction of sentence importance for speech summarization using prosodic parameters. Akira Inoue, Takayoshi Mikami, Yoichi Yamashita |
| 2003 | Predictive hidden Markov model selection for decision tree state tying. Jen-Tzung Chien, Sadaoki Furui |
| 2003 | Preference, perception, and task completion of open, menu-based, and directed prompts for call routing: a case study. Jason D. Williams, Andrew T. Shaw, Lawrence Piano, Michael Abt |
| 2003 | Probability models of formant parameters for voice conversion. Dimitrios Rentzos, Saeed Vaseghi, Qin Yan, Ching-Hsiang Ho, Emir Turajlic |
| 2003 | Product of Gaussians as a distributed representation for speech recognition. S. S. Airey, Mark J. F. Gales |
| 2003 | Prosodic analysis and modeling of the NAGAUTA singing to synthesize its prosodic patterns from the standard notation. Nobuaki Minematsu, Bungo Matsuoka, Keikichi Hirose |
| 2003 | Prosodic correlates of contrastive and non-contrastive themes in German. Bettina Braun, D. Robert Ladd |
| 2003 | Prosodic cues for emotion characterization in real-life spoken dialogs. Laurence Devillers, Ioana Vasilescu |
| 2003 | Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries. Ken Chen, Sarah Borys, Mark Hasegawa-Johnson, Jennifer Cole |
| 2003 | Prosody-based classification of emotions in spoken Finnish. Tapio Seppänen, Eero Väyrynen, Juhani Toivanen |
| 2003 | Pruning transitions in a hidden Markov model with optimal brain surgeon. Brian Kan-Wing Mak, Kin-Wah Chan |
| 2003 | Quality control of language resources at ELRA. Henk van den Heuvel, Khalid Choukri, Harald Höge, Bente Maegaard, Jan Odijk, Valérie Mapelli |
| 2003 | Quality enhancement of CELP coded speech by using an MFCC based Gaussian mixture model. D. G. Raza, C. F. Chan |
| 2003 | Quality-complexity trade-off in predictive LSF quantization. Davorka Petrinovic, Davor Petrinovic |
| 2003 | Quantifying the impact of system characteristics on perceived quality dimensions of a spoken dialogue service. Sebastian Möller, Janto Skowronek |
| 2003 | Quantitative analysis and synthesis of syllabic tones in Vietnamese. Hansjörg Mixdorff, Nguyen Hung Bach, Hiroya Fujisaki, Chi Mai Luong |
| 2003 | Quantity comparison of Japanese and Finnish in various word structures. Toshiko Isei-Jaakkola |
| 2003 | RavenClaw: dialog management using hierarchical task decomposition and an expectation agenda. Dan Bohus, Alexander I. Rudnicky |
| 2003 | Reaction time as an indicator of discrete intonational contrasts in English. Aoju Chen |
| 2003 | Read my tongue movements: bimodal learning to perceive and produce non-native speech /r/ and /l/. Dominic W. Massaro, Joanna Light |
| 2003 | Recent enhancements in CU VOCAL for Chinese TTS-enabled applications. Helen M. Meng, Yuk-Chi Li, Tien Ying Fung, Man Cheuk Ho, Chi-Kin Keung, Tin Hang Lo, Wai Kit Lo, P. C. Ching |
| 2003 | Recent progress in the decoding of non-native speech with multilingual acoustic models. Volker Fischer, Eric Janke, Siegfried Kunzmann |
| 2003 | Recognising 'real-life' speech with SpeM: a speech-based computational model of human speech recognition. Odette Scharenborg, Louis ten Bosch, Lou Boves |
| 2003 | Recognition of emotions in interactive voice response systems. Sherif M. Yacoub, Steven J. Simske, Xiaofan Lin, John Burns |
| 2003 | Recognition of intonation patterns in Thai utterance. Patavee Charnvivit, Nuttakorn Thubthong, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin, Somchai Jitapunkul |
| 2003 | Recognition of out-of-vocabulary words with sub-lexical language models. Lucian Galescu |
| 2003 | Recognition of phoneme strings using TRAP technique. Petr Schwarz, Pavel Matejka, Jan Cernocký |
| 2003 | Reduction of dimension of HMM parameters using ICA and PCA in MLLR framework for speaker adaptation. Jiun Kim, Jaeho Chung |
| 2003 | Reproducing laryngeal mechanisms with a two-mass model. Denisse Sciamarella, Christophe d'Alessandro |
| 2003 | Residual echo power estimation for speech reinforcement systems in vehicles. Alfonso Ortega, Eduardo Lleida, Enrique Masgrau |
| 2003 | Restricted unlimited domain synthesis. Antje Schweitzer, Norbert Braunschweiler, Tanja Klankert, Bernd Möbius, Bettina Säuberlich |
| 2003 | Resynthesis of 3D tongue movements from facial data. Olov Engwall, Jonas Beskow |
| 2003 | Revisiting scenarios and methods for variable frame rate analysis in automatic speech recognition. Javier Macías Guarasa, J. Ordóñez, Juan Manuel Montero, Javier Ferreiros, Ricardo de Córdoba, Luis Fernando D'Haro |
| 2003 | Roadmaps, journeys and destinations: speculations on the future of speech technology research. Ronald A. Cole |
| 2003 | Robust energy demodulation based on continuous models with application to speech recognition. Dimitrios Dimitriadis, Petros Maragos |
| 2003 | Robust feature extraction and acoustic modeling at Multitel: experiments on the Aurora databases. Stéphane Dupont, Christophe Ris |
| 2003 | Robust jointly optimized multistage vector quantization for speech coding. Venkatesh Krishnan, David V. Anderson |
| 2003 | Robust likelihood ratio estimation in Bayesian forensic speaker recognition. Joaquin Gonzalez-Rodriguez, Daniel Garcia-Romero, Marta Garcia-Gomar, Daniel Ramos, Javier Ortega-Garcia |
| 2003 | Robust methods in automatic speech recognition and understanding. Sadaoki Furui |
| 2003 | Robust multi-class boosting. Gunnar Rätsch |
| 2003 | Robust multiple resolution analysis for automatic speech recognition. Roberto Gemello, Franco Mana, Dario Albesano, Renato De Mori |
| 2003 | Robust parsing of utterances in negotiative dialogue. Johan Boye, Mats Wirén |
| 2003 | Robust speaker identification using posterior union models. Ji Ming, Darryl Stewart, Philip Hanna, Pat Corr, Francis Jack Smith, Saeed Vaseghi |
| 2003 | Robust speech interaction in a mobile environment through the use of multiple and different media input types. Rainer Wasinger, Christoph Stahl, Antonio Krüger |
| 2003 | Robust speech recognition to non-stationary noise based on model-driven approaches. Christophe Cerisara, Irina Illina |
| 2003 | Robust speech recognition using missing feature theory in the cepstral or LDA domain. Hugo Van hamme |
| 2003 | Robust speech recognition using model-based feature enhancement. Veronique Stouten, Hugo Van hamme, Kris Demuynck, Patrick Wambacq |
| 2003 | Robust speech recognition using non-linear spectral smoothing. Michael J. Carey |
| 2003 | Robust speech understanding based on expected discourse plan. Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta |
| 2003 | Robust techniques for pre- and post-surgical voice analysis. Claudia Manfredi, Giorgio Peretti |
| 2003 | SAG: a procedural tactical generator for dialog systems. Dalina Kallulli |
| 2003 | SOM as likelihood estimator for speaker clustering. Itshak Lapidot |
| 2003 | SYNFACE - a talking face telephone. Inger Karlsson, Andrew Faulkner, Giampiero Salvi |
| 2003 | Say-as classification for alphabetic words in Japanese texts. Hisako Asano, Masaaki Nagata, Masanobu Abe |
| 2003 | Schema-based modeling of phonemic restoration. Soundararajan Srinivasan, DeLiang Wang |
| 2003 | Score normalisation applied to open-set, text-independent speaker identification. P. Sivakumaran, J. Fortuna, Aladdin M. Ariyaeeinia |
| 2003 | Segmental durations predicted with a neural network. João Paulo Ramos Teixeira, Diamantino Freitas |
| 2003 | Segmentation of speech for speaker and language recognition. André Gustavo Adami, Hynek Hermansky |
| 2003 | Segmentation of speech into syllable-like units. T. Nagarajan, Hema A. Murthy, Rajesh M. Hegde |
| 2003 | Segmenting multiple concurrent speakers using microphone arrays. Guillaume Lathoud, Iain McCowan, Darren Moore |
| 2003 | Semantic and dialogic annotation for automated multilingual customer service. Hilda Hardy, Kirk Baker, Hélène Bonneau-Maynard, Laurence Devillers, Sophie Rosset, Tomek Strzalkowski |
| 2003 | Semantic object synchronous understanding in SALT for highly interactive user interface. Kuansan Wang |
| 2003 | Semi-tied full deviation matrices for Laplacian density models. Christoph Neukirchen |
| 2003 | Sentence boundary detection in Arabic speech. Amit Srivastava, Francis Kubala |
| 2003 | Sentence verification in spoken dialogue system. Huei-Ming Wang, Yi-Chung Lin |
| 2003 | Several HKU approaches for robust speech recognition and their evaluation on Aurora connected digit recognition tasks. Jian Wu, Qiang Huo |
| 2003 | Shared resources for robust speech-to-text technology. Stephanie M. Strassel, David Miller, Kevin Walker, Christopher Cieri |
| 2003 | Should I tell all?: an experiment on conciseness in spoken dialogue. Stephen Whittaker, Marilyn A. Walker, Preetam Maloor |
| 2003 | Simple designing methods of corpus-based visual speech synthesis. Tatsuya Shiraishi, Tomoki Toda, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2003 | SmartKom-Home - an advanced multi-modal interface to home entertainment. Thomas Portele, Silke Goronzy, Martin C. Emele, Andreas Kellner, Sunna Torge, Jürgen te Vrugt |
| 2003 | Spanish broadcast news transcription. Gerhard Backfried, Roser Jaquemot Caldes |
| 2003 | Speaker adaptation based on confidence-weighted training. Gyucheol Jang, Minho Jin, Chang D. Yoo |
| 2003 | Speaker adaptation for non-native speakers using bilingual English lexicon and acoustic models. Shoichi Matsunaga, Atsunori Ogawa, Yoshikazu Yamaguchi, Akihiro Imamura |
| 2003 | Speaker adaptation using regression classes generated by phonetic decision tree-based successive state splitting. Se-Jin Oh, Kwang-Dong Kim, Duk-Gyoo Roh, Woo-Chang Sung, Hyun-Yeol Chung |
| 2003 | Speaker characterization using principal component analysis and wavelet transform for speaker verification. Chakib Tadj, A. Benlahouar |
| 2003 | Speaker conversion in ARX-based source-formant type speech synthesis. Hiroki Mori, Hideki Kasuya |
| 2003 | Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation. Masafumi Nishida, Tatsuya Kawahara |
| 2003 | Speaker modeling from selected neighbors applied to speaker recognition. Yassine Mami, Delphine Charlet |
| 2003 | Speaker recognition using MPEG-7 descriptors. Hyoung-Gook Kim, Edgar Berdahl, Nicolas Moreau, Thomas Sikora |
| 2003 | Speaker recognition using local models. Ryan Rifkin |
| 2003 | Speaker verification based on G.729 and G.723.1 coder parameters and handset mismatch compensation. Eric W. M. Yu, Man-Wai Mak, Chin-Hung Sit, Sun-Yuan Kung |
| 2003 | Speaker verification based on the German veridat database. Ulrich Türk, Florian Schiel |
| 2003 | Speaker verification systems and security considerations. David A. van Leeuwen |
| 2003 | Spectral maxima representation for robust automatic speech recognition. J. Sujatha, K. R. Prasanna Kumar, K. R. Ramakrishnan, N. Balakrishnan |
| 2003 | Spectro-temporal interactions in auditory and auditory-visual speech processing. Ken W. Grant, Steven Greenberg |
| 2003 | Speech analysis with the short-time chirp transform. Luis Weruaga, Marián Képesi |
| 2003 | Speech and language processing: where have we been and where are we going? Kenneth Ward Church |
| 2003 | Speech enhancement and improved recognition accuracy by integrating wavelet transform and spectral subtraction algorithm. Gwo-hwa Ju, Lin-Shan Lee |
| 2003 | Speech enhancement for a car environment using LP residual signal and spectral subtraction. Agustín Álvarez, Victor Nieto Lluis, Pedro Gómez-Vilda, Rafael Martínez |
| 2003 | Speech enhancement for hands-free car phones by adaptive compensation of harmonic engine noise components. Henning Puder |
| 2003 | Speech enhancement using a-priori information. Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn |
| 2003 | Speech enhancement using weighting function based on the variance of wavelet coefficients. Ching-Ta Lu, Hsiao-Chuan Wang |
| 2003 | Speech enhancement with microphone array and Fourier / wavelet spectral subtraction in real noisy environments. Yuki Denda, Takanobu Nishiura, Hideki Kawahara |
| 2003 | Speech generation from concept for realizing conversation with an agent in a virtual room. Keikichi Hirose, Junji Tago, Nobuaki Minematsu |
| 2003 | Speech recognition based on syllable recovery. Li Zhang, William H. Edmondson |
| 2003 | Speech recognition of double talk using SAFIA-based audio segregation. Toshiyuki Sekiya, Tetsuji Ogawa, Tetsunori Kobayashi |
| 2003 | Speech recognition over Bluetooth wireless channels. Ziad Al Bawab, Ivo Locher, Jianxia Xue, Abeer Alwan |
| 2003 | Speech recognition using EMG; mime speech recognition. Hiroyuki Manabe, Akira Hiraiwa, Toshiaki Sugimura |
| 2003 | Speech recognition with a generative factor analyzed hidden Markov model. Kaisheng Yao, Kuldip K. Paliwal, Te-Won Lee |
| 2003 | Speech recognition with dynamic grammars using finite-state transducers. Johan Schalkwyk, I. Lee Hetherington, Ezra Story |
| 2003 | Speech segregation based on fundamental event information using an auditory vocoder. Toshio Irino, Roy D. Patterson, Hideki Kawahara |
| 2003 | Speech shift: direct speech-input-mode switching through intentional control of voice pitch. Masataka Goto, Yukihiro Omoto, Katunobu Itou, Tetsunori Kobayashi |
| 2003 | Speech starter: noise-robust endpoint detection by using filled pauses. Koji Kitayama, Masataka Goto, Katunobu Itou, Tetsunori Kobayashi |
| 2003 | Speech summarization using weighted finite-state transducers. Takaaki Hori, Chiori Hori, Yasuhiro Minami |
| 2003 | Speech watermarking by parametric embedding with an $l_\infty$ fidelity criterion. Aparna Gurijala, John R. Deller Jr. |
| 2003 | Speech-based, manual-visual, and multi-modal interaction with an in-car computer - evaluation of a pilot study. Rogier Woltjer, Wah Jin Tan, Fang Chen |
| 2003 | Speechalator: two-way speech-to-speech translation on a consumer PDA. Alex Waibel, Ahmed Badran, Alan W. Black, Robert E. Frederking, Donna Gates, Alon Lavie, Lori S. Levin, Kevin A. Lenzo, Laura Mayfield Tomokiyo, Jürgen Reichert, Tanja Schultz, Dorcas Wallace, Monika Woszczyna, Jing Zhang |
| 2003 | Spoken cross-language access to image collection via captions. Hsin-Hsi Chen |
| 2003 | Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy. Tatsuya Kawahara, Ryosuke Ito, Kazunori Komatani |
| 2003 | Spoken language and e-inclusion. Alan F. Newell |
| 2003 | Spoken language condensation in the 21st century. Klaus Zechner |
| 2003 | Spoken language output: realising the vision. Roger K. Moore |
| 2003 | Spotting "hot spots" in meetings: human judgments and prosodic cues. Britta Wrede, Elizabeth Shriberg |
| 2003 | Statistical estimation of phoneme's most stable point based on universal constraint. Shigeki Okawa, Katsuhiko Shirai |
| 2003 | Statistical evaluation of the influence of stress on pitch frequency and phoneme durations in Farsi language. Davood Gharavian, Seyed Mohammad Ahadi |
| 2003 | Statistical methods and Bayesian interpretation of evidence in forensic automatic speaker recognition. Andrzej Drygajlo, Didier Meuwly, Anil Alexander |
| 2003 | Statistical signal processing with nonnegativity constraints. Lawrence K. Saul, Fei Sha, Daniel D. Lee |
| 2003 | Statistical speech-to-speech translation with multilingual speech recognition and bilingual-chunk parsing. Bo Xu, Shuwu Zhang, Chengqing Zong |
| 2003 | Stem-based maximum entropy language models for inflectional languages. Dimitris Oikonomidis, Vassilios Digalakis |
| 2003 | Strategies for automatic multi-tier annotation of spoken language corpora. Steven Greenberg |
| 2003 | Stress-based speech segmentation revisited. Sven L. Mattys |
| 2003 | Structural linear model-space transformations for speaker adaptation. Driss Matrouf, Olivier Bellot, Pascal Nocera, Georges Linarès, Jean-François Bonastre |
| 2003 | Structural state-based frame synchronous compensation. Vincent Barreaud, Irina Illina, Dominique Fohr, Filipp Korkmazsky |
| 2003 | Subband-based acoustic shock limiting algorithm on a low-resource DSP system. Gary Choy, David Hermann, Robert L. Brennan, Todd Schneider, Hamid Sheikhzadeh, Etienne Cornu |
| 2003 | Subjective evaluations for perception of speaker identity through acoustic feature transplantations. Oytun Türk, Levent M. Arslan |
| 2003 | Syllable classification using articulatory-acoustic features. Mirjam Wester |
| 2003 | Syllable structure based phonetic units for context-dependent continuous Thai speech recognition. Supphanat Kanokphara |
| 2003 | Syllable-based acoustic modeling for Japanese spontaneous speech recognition. Jun Ogata, Yasuo Ariki |
| 2003 | Techniques for effective vocabulary selection. Anand Venkataraman, Wen Wang |
| 2003 | Temporal aspects of articulatory control. Elliot Saltzman |
| 2003 | Temporal properties of the nasals and nasalization in Cantonese. Beatrice Fung-Wah Khioe |
| 2003 | Text design for TTS speech corpus building using a modified greedy selection. Baris Bozkurt, Özlem Öztürk, Thierry Dutoit |
| 2003 | Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM. Seiichi Nakagawa, Wei Zhang |
| 2003 | Tfarsdat - the telephone Farsi speech database. Mahmood Bijankhan, Javad Sheykhzadegan, Mahmood R. Roohani, Rahman Zarrintare, Seyyed Z. Ghasemi, Mohammad E. Ghasedi |
| 2003 | The /i/-/a/-/u/-ness of spoken vowels. Hartmut R. Pfitzinger |
| 2003 | The 300k LIMSI German broadcast news transcription system. Kevin McTait, Martine Adda-Decker |
| 2003 | The LIUM-AVS database: a corpus to test lip segmentation and speechreading systems in natural conditions. Philippe Daubias, Paul Deléglise |
| 2003 | The NESPOLE! VoIP multilingual corpora in tourism and medical domains. Nadia Mana, Susanne Burger, Roldano Cattoni, Laurent Besacier, Victoria MacLaren, John W. McDonough, Florian Metze |
| 2003 | The application of interactive speech unit selection in TTS systems. Peter Rutten, Justin Fackrell |
| 2003 | The awe and mystery of T-norm. Jirí Navrátil, Ganesh N. Ramaswamy |
| 2003 | The Basque speech_dat (II) database: a description and first test recognition results. Inmaculada Hernáez, Iker Luengo, Eva Navas, Maria Luisa Zubizarreta, Iñaki Gaminde, Jon Sánchez |
| 2003 | The Czech speech and prosody database both for ASR and TTS purposes. Jáchym Kolár, Jan Romportl, Josef Psutka |
| 2003 | The development of a multi-purpose spoken dialogue system. João Paulo Neto, Nuno J. Mamede, Renato Cassaca, Luís C. Oliveira |
| 2003 | The dynamic, multi-lingual lexicon in SmartKom. Silke Goronzy, Zica Valsan, Martin C. Emele, Juergen Schimanowski |
| 2003 | The effect of amplitude compression on wide band telephone speech for hearing-impaired elderly people. Mutsumi Saito, Kimio Shiraishi, Kimitoshi Fukudome |
| 2003 | The effect of an intermediate articulatory layer on the performance of a segmental HMM. Martin J. Russell, Philip J. B. Jackson |
| 2003 | The effect of speech rate and noise on bilinguals' speech perception: the case of native speakers of Arabic in Israel. Judith Rosenhouse, Liat Kishon-Rabin |
| 2003 | The effect of surrounding phrase lengths on pause duration. Elena Zvonik, Fred Cummins |
| 2003 | The perceptual cues of a high level pitch-accent pattern in Japanese: pitch-accent patterns and duration. Tsutomu Sato |
| 2003 | The Queen's Communicator: an object-oriented dialogue manager. Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear |
| 2003 | The statistical approach to machine translation and a roadmap for speech translation. Hermann Ney |
| 2003 | The temporal organisation of speech as gauged by speech synthesis. Brigitte Zellner Keller |
| 2003 | The use of confidence measures in vector based call-routing. Stephen J. Cox, Gavin C. Cawley |
| 2003 | The use of multiple pause information in dependency structure analysis of spoken Japanese sentences. Meirong Lu, Kazuyuki Takagi, Kazuhiko Ozeki |
| 2003 | Three simultaneous speech recognition by integration of active audition and face recognition for humanoid. Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroshi Tsujino |
| 2003 | Time adjustable mixture weights for speaking rate fluctuation. Takahiro Shinozaki, Sadaoki Furui |
| 2003 | Time alignment for scenario and sounds with voice, music and BGM. Yamato Wada, Masahide Sugiyama |
| 2003 | Time delay estimation based on hearing characteristic. Zhaoli Yan, Limin Du, Jianqiang Wei, Hui Zeng |
| 2003 | Time is of the essence - dynamic approaches to spoken language. Steven Greenberg |
| 2003 | Time-domain based temporal processing with application of orthogonal transformations. Petr Motlícek, Jan Cernocký |
| 2003 | Tone pattern discrimination combining parametric modeling and maximum likelihood estimation. Jinfu Ni, Hisashi Kawai |
| 2003 | Topic segmentation and retrieval system for lecture videos based on spontaneous speech recognition. Natsuo Yamamoto, Jun Ogata, Yasuo Ariki |
| 2003 | Topic-specific parser design in an air travel natural language understanding application. Chaitanya Ekanadham, Juan M. Huerta |
| 2003 | Toward domain-independent conversational speech recognition. Brian Kingsbury, Lidia Mangu, George Saon, Geoffrey Zweig, Scott Axelrod, Vaibhava Goel, Karthik Visweswariah, Michael Picheny |
| 2003 | Towards a personal robot with language interface. Luís Seabra Lopes, António J. S. Teixeira, Mário Rodrigues, Diogo Gomes, Cláudio Teixeira, Liliana da Silva Ferreira, Pedro Filipe Soares, João Girão, Nuno Sénica |
| 2003 | Towards a repository of digital talking books. António Joaquim Serralheiro, Isabel Trancoso, Diamantino Caseiro, Teresa Chambel, Luís Carriço, Nuno Guimarães |
| 2003 | Towards an evaluation standard for speech control concepts in real-world scenarios. Jens Maase, Diane Hirschfeld, Uwe Koloska, Timo Westfeld, Jörg Helbig |
| 2003 | Towards best practices for speech user interface design. Bernhard Suhm |
| 2003 | Towards dynamic multi-domain dialogue processing. Botond Pakucs |
| 2003 | Towards missing data recognition with cepstral features. Christophe Cerisara |
| 2003 | Towards multimodal interaction with an intelligent room. Petra Gieselmann, Matthias Denecke |
| 2003 | Towards optimal encoding for classification with applications to distributed speech recognition. Naveen Srinivasamurthy, Antonio Ortega, Shrikanth S. Narayanan |
| 2003 | Towards synthesising expressive speech; designing and collecting expressive speech data. Nick Campbell |
| 2003 | Towards the automatic extraction of Fujisaki model parameters for Mandarin. Hansjörg Mixdorff, Hiroya Fujisaki, Gao Peng Chen, Yu Hu |
| 2003 | Towards the automatic generation of mixed-initiative dialogue systems from web content. Joseph Polifroni, Grace Chung, Stephanie Seneff |
| 2003 | Towards the development of a Brazilian Portuguese text-to-speech system based on HMM. Ranniery Maia, Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Fernando Gil Vianna Resende Jr. |
| 2003 | Tracking a moving speaker using excitation source information. Vikas C. Raykar, Ramani Duraiswami, B. Yegnanarayana, S. R. Mahadeva Prasanna |
| 2003 | Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint. Li Deng, Issam Bazzi, Alex Acero |
| 2003 | Training a confidence measure for a reading tutor that listens. Yik-Cheung Tam, Jack Mostow, Joseph E. Beck, Satanjeev Banerjee |
| 2003 | Training data optimization for language model adaptation. Xiaoshan Fang, Jianfeng Gao, Jianfeng Li, Huanye Sheng |
| 2003 | Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features. Keiichi Tokuda, Heiga Zen, Tadashi Kitamura |
| 2003 | Transcoding algorithm for G.723.1 and AMR speech coders: for interoperability between VoIP and mobile networks. Sung-Wan Yoon, Jin-Kyu Choi, Hong-Goo Kang, Dae Hee Youn |
| 2003 | Transforming F0 contours. Ben Gillett, Simon King |
| 2003 | Transforming voice quality. Ben Gillett, Simon King |
| 2003 | Translation and rotation of the cricothyroid joint revealed by phonation-synchronized high-resolution MRI. Sayoko Takano, Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada, Ichiro Fujimoto |
| 2003 | Tree-structured noise-adapted HMM modeling for piecewise linear-transformation-based adaptation. Zhipeng Zhang, Kiyotaka Otsuji, Sadaoki Furui |
| 2003 | Two correction models for likelihoods in robust speech recognition using missing feature theory. Hugo Van hamme |
| 2003 | Two studies of open vs. directed dialog strategies in spoken dialog systems. Silke M. Witt, Jason D. Williams |
| 2003 | Understanding process for speech recognition. Salma Jamoussi, Kamel Smaïli, Jean Paul Haton |
| 2003 | Unified analysis of glottal source spectrum. Ixone Arroabarren, Alfonso Carlosena |
| 2003 | Unit selection and emotional speech. Alan W. Black |
| 2003 | Unit selection based on voice recognition. Yi Zhou, Yiqing Zu |
| 2003 | Unit selection in concatenative TTS synthesis systems based on mel filter bank amplitudes and phonetic context. Tanya Lambert, Andrew P. Breen, Barry Eggleton, Stephen J. Cox, Ben P. Milner |
| 2003 | Unit size in unit selection speech synthesis. S. Prahallad Kishore, Alan W. Black |
| 2003 | Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner. Vesa Siivola, Teemu Hirsimäki, Mathias Creutz, Mikko Kurimo |
| 2003 | Unsupervised speaker adaptation based on HMM sufficient statistics in various noisy environments. Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2003 | Unsupervised speaker indexing using anchor models and automatic transcription of discussions. Yuya Akita, Tatsuya Kawahara |
| 2003 | Unsupervised topic discovery applied to segmentation of news transcriptions. Sreenivasa Sista, Amit Srivastava, Francis Kubala, Richard M. Schwartz |
| 2003 | Use of a CSP-based voice activity detector for distant-talking ASR. Luca Armani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer |
| 2003 | Use of linguistic information for automatic extraction of F0 contour generation process model parameters. Keikichi Hirose, Yusuke Furuyama, Shuichi Narusawa, Nobuaki Minematsu, Hiroya Fujisaki |
| 2003 | Use of trajectory models for automatic accent classification. Pongtep Angkititrakul, John H. L. Hansen |
| 2003 | Usefulness of phase spectrum in human speech perception. Kuldip K. Paliwal, Leigh D. Alsteris |
| 2003 | User modeling in spoken dialogue systems for flexible guidance generation. Kazunori Komatani, Shinichi Ueno, Tatsuya Kawahara, Hiroshi G. Okuno |
| 2003 | Using accent information in ASR models for Swedish. Giampiero Salvi |
| 2003 | Using acoustic models to choose pronunciation variations for synthetic voices. Christina L. Bennett, Alan W. Black |
| 2003 | Using both global and local hidden Markov models for automatic speech unit segmentation. Hong Zheng, Yiqing Lu |
| 2003 | Using confidence measures and domain knowledge to improve speech recognition. Pascal Wiggers, Léon J. M. Rothkrantz |
| 2003 | Using corpus-based methods for spoken access to news texts on the web. Alexandra Klein, Harald Trost |
| 2003 | Using genetic algorithms for rapid speaker adaptation. Fabrice Lauri, Irina Illina, Dominique Fohr, Filipp Korkmazsky |
| 2003 | Using mutual information to design class-specific phone recognizers. Patricia Scanlon, Daniel P. W. Ellis, Richard B. Reilly |
| 2003 | Using pitch frequency information in speech recognition. Mathew Magimai-Doss, Todd A. Stephenson, Hervé Bourlard |
| 2003 | Using place name data to train language identification models. Stanley F. Chen, Benoît Maison |
| 2003 | Using statistical language modelling to identify new vocabulary in a grammar-based speech recognition system. Genevieve Gorrell |
| 2003 | Using syllable-based indexing features and language models to improve German spoken document retrieval. Martha A. Larson, Stefan Eickeler |
| 2003 | Using the web for fast language model construction in minority languages. Viet Bac Le, Brigitte Bigi, Laurent Besacier, Eric Castelli |
| 2003 | Using untranscribed user utterances for improving language models based on confidence scoring. Mikio Nakano, Timothy J. Hazen |
| 2003 | Using word confidence measure for OOV words detection in a spontaneous spoken dialog system. Hui Sun, Guoliang Zhang, Fang Zheng, Mingxing Xu |
| 2003 | Utterance verification under distributed detection and fusion framework. Taeyoon Kim, Hanseok Ko |
| 2003 | Utterance verification using an optimized k-nearest neighbour classifier. Roberto Paredes, Alberto Sanchís, Enrique Vidal, Alfons Juan |
| 2003 | VISPER II - enhanced version of the educational software for speech processing courses. Miroslav Holada, Jan Nouza |
| 2003 | Validation of phonetic transcriptions based on recognition performance. Christophe Van Bael, Diana Binnenpoorte, Helmer Strik, Henk van den Heuvel |
| 2003 | Variable bit rate control with trellis diagram approximation. Kei Kikuiri, Nobuhiko Naka, Tomoyuki Ohya |
| 2003 | Variable length mixtures of inverse covariances. Vincent Vanhoucke, Ananth Sankar |
| 2003 | Variational Bayesian GMM for speech recognition. Fabio Valente, Christian Wellekens |
| 2003 | Very-low-rate speech compression by indexation of polyphones. Charles du Jeu, Maurice Charbit, Gérard Chollet |
| 2003 | Visualisation of the vocal tract based on estimation of vocal area functions and formant frequencies. Abdulhussain E. Mahdi |
| 2003 | Vocal tract normalization as linear transformation of MFCC. Michael Pitz, Hermann Ney |
| 2003 | Voice conversion methods for vocal tract and pitch contour modification. Oytun Türk, Levent M. Arslan |
| 2003 | Voice conversion with smoothed GMM and MAP adaptation. Yining Chen, Min Chu, Eric Chang, Jia Liu, Runsheng Liu |
| 2003 | Voice quality modification for emotional speech synthesis. Christophe d'Alessandro, Boris Doval |
| 2003 | Voice quality normalization in an utterance for robust ASR. Muhammad Ghulam, Takashi Fukuda, Tsuneo Nitta |
| 2003 | Voicing controlled frame loss concealment for adaptive multi-rate (AMR) speech frames in voice-over-IP. Frank Mertz, Hervé Taddei, Imre Varga, Peter Vary |
| 2003 | Voicing parameter and energy based speech/non-speech detection for speech recognition in adverse conditions. Arnaud Martin, Laurent Mauuary |
| 2003 | Voxenter™ - intelligent voice enabled call center for Hungarian. Tibor Fegyó, Péter Mihajlik, Máté Szarvas, Péter Tatai, Gábor Tatai |
| 2003 | Wavelet-based perceptual speech enhancement using adaptive threshold estimation. Essa Jafer, Abdulhussain E. Mahdi |
| 2003 | We are not amused - but how do you know? user states in a multi-modal dialogue system. Anton Batliner, Viktor Zeißler, Carmen Frank, Johann Adelhardt, Rui Ping Shi, Elmar Nöth |
| 2003 | Weighted automata kernels - general framework and algorithms. Corinna Cortes, Patrick Haffner, Mehryar Mohri |
| 2003 | Weighted entropy training for the decision tree based text-to-phoneme mapping. Jilei Tian, Janne Suontausta, Juha Häkkinen |
| 2003 | Who knows Carl Bildt? - and what if you don't? Elisabeth Zetterholm, Kirk P. H. Sullivan, James Green, Erik J. Eriksson, Jan van Doorn, Peter E. Czigler |
| 2003 | Why and how to control the authentic emotional speech corpora. Véronique Aubergé, Nicolas Audibert, Albert Rilliard |
| 2003 | Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization. Lin-Shan Lee, Yuan Ho, Jia-Fu Chen, Shun-Chuan Chen |
| 2003 | Word activation model by Japanese school children without knowledge of roman alphabet. Takashi Otake, Miki Komatsu |
| 2003 | Word class modeling for speech recognition with out-of-task words using a hierarchical language model. Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Gen-ichiro Kikui |