| 2004 | "liveness" verification in audio-video authentication. Michael Wagner, Girija Chetty |
| 2004 | 3d lip-tracking for audio-visual speech recognition in real applications. Petr Císar, Zdenek Krnoul, Milos Zelezný |
| 2004 | 8th International Conference on Spoken Language Processing, INTERSPEECH-ICSLP 2004, Jeju Island, Korea, October 4-8, 2004 |
| 2004 | A Japanese dialogue-based CALL system with mispronunciation and grammar error detection. Oh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino |
| 2004 | A Korean grapheme-to-phoneme conversion system using selection procedure for exceptions. Sunhee Kim, Ju-Eun Ahn, Soon-Hyob Kim, Yang-Hee Lee |
| 2004 | A PLSA-based language model for conversational telephone speech. David Mrva, Philip C. Woodland |
| 2004 | A cepstral domain maximum likelihood beamformer for speech recognition. Dominik Raub, John W. McDonough, Matthias Wölfel |
| 2004 | A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers. Jong-Pyo Lee, Tae-Yeoub Jang |
| 2004 | A comparison of confirmation styles for error handling in a speech dialog system. Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg |
| 2004 | A comparison of normalization and training approaches for ASR-dependent speaker identification. Alex Park, Timothy J. Hazen |
| 2004 | A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss. Alastair Bruce James, Ben P. Milner, Angel Manuel Gomez |
| 2004 | A comparison of simultaneous 3-channel blind source separation to selective separation on channel pairs using 2-channel BSS. Erik M. Visser, Kwokleung Chan, Stanley Kim, Te-Won Lee |
| 2004 | A comparison of soft and hard spectral subtraction for speaker verification. Michael T. Padilla, Thomas F. Quatieri |
| 2004 | A comparison of statistical methods and features for the prediction of prosodic structures. Qin Shi, Volker Fischer |
| 2004 | A comparison of the perturbation analysis between PRAAT and Computerized Speech Lab. Jong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah |
| 2004 | A compensation method for word-familiarity difference with SNR control in intelligibility test. Shuichi Sakamoto, Yôiti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka |
| 2004 | A concurrent curve strategy for formant tracking. Yves Laprie |
| 2004 | A conversational dialogue system for cognitively overloaded users. Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao |
| 2004 | A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai. Kimiko Tsukada |
| 2004 | A cross-linguistic study of diphthongs in spoken word processing in Japanese and English. Kiyoko Yoneyama |
| 2004 | A database design for a TTS synthesis system using lexical diphones. Tanya Lambert, Andrew P. Breen |
| 2004 | A discriminative locally weighted distance measure for speaker independent template based speech recognition. Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools |
| 2004 | A distributed speech recognition system in multi-user environments. Kyu Jeong Han, Shrikanth S. Narayanan, Naveen Srinivasamurthy |
| 2004 | A dynamic vocabulary spoken dialogue interface. Stephanie Seneff, Chao Wang, I. Lee Hetherington, Grace Chung |
| 2004 | A factorial HMM approach to robust isolated digit recognition in background music. Mark Hasegawa-Johnson, Ameya N. Deoras |
| 2004 | A family-of-models approach to HMM-based segmentation for unit selection speech synthesis. John Kominek, Alan W. Black |
| 2004 | A first experience on multilingual acoustic modeling of the languages spoken in Morocco. José B. Mariño, Asunción Moreno, Albino Nogueiras |
| 2004 | A first step towards text-independent voice conversion. Hermann Ney, David Sündermann, Antonio Bonafonte, Harald Höge |
| 2004 | A forensic phonetic investigation into the duration and speech rate. Kyunghwa Kim |
| 2004 | A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterances. Michael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont |
| 2004 | A formant tracking LP model for speech processing. Qin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos |
| 2004 | A frame level boosting training scheme for acoustic modeling. Rong Zhang, Alexander I. Rudnicky |
| 2004 | A framework for dialogue data collection with a simulated ASR channel. Matthew N. Stuttle, Jason D. Williams, Steve J. Young |
| 2004 | A general approach to TTS reading of mixed-language texts. Leonardo Badino, Claudia Barolo, Silvia Quazza |
| 2004 | A genetic algorithm for unit selection based speech synthesis. Rohit Kumar |
| 2004 | A grammar-based Chinese to English speech translation system for portable devices. Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu |
| 2004 | A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition. Jan Stadermann, Gerhard Rigoll |
| 2004 | A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech. Peng Yu, Frank Torsten Bernd Seide |
| 2004 | A maximum entropy shallow functional parser for spoken language understanding. David Horowitz, Partha Lal, Pierce Gerard Buckley |
| 2004 | A memory efficient grapheme-to-phoneme conversion system for speech processing. Jun Huang, Lex Olorenshaw, Gustavo Hernández Ábrego, Lei Duan |
| 2004 | A method for glottal formant frequency estimation. Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro |
| 2004 | A minimum mean squared error estimator for single channel speaker separation. Aarthi M. Reddy, Bhiksha Raj |
| 2004 | A multi-layer conversation management approach for information seeking applications. Shimei Pan |
| 2004 | A multi-modal dialog system for a mobile robot. Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink |
| 2004 | A multimodal communication aid for global aphasia patients. Jakob Schou Pedersen, Paul Dalsgaard, Børge Lindberg |
| 2004 | A new acoustic measure for aspiration noise detection. Carlos Toshinori Ishi |
| 2004 | A new approach to channel robust speaker verification via constrained stochastic feature transformation. Man-Wai Mak, Kwok-Kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung |
| 2004 | A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering. Shang-nien Tsai, Lin-Shan Lee |
| 2004 | A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation. Bo Xu, Jianhua Tao, Yongguo Kang |
| 2004 | A new nonlinear feature extraction algorithm for speaker verification. Mohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faúndez-Zanuy |
| 2004 | A new prosodic phrasing model for Indian language Telugu. Nemala Sridhar Krishna, Hema A. Murthy |
| 2004 | A new score normalization method for speaker verification with virtual impostor model. Woo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan |
| 2004 | A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR. Muhammad Ghulam, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta |
| 2004 | A novel method for two-speaker segmentation. Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan |
| 2004 | A novel target-driven generalized JMAP adaptation algorithm. Zhaobing Han, Shuwu Zhang, Bo Xu |
| 2004 | A novel voice conversion system based on codebook mapping with phoneme-tied weighting. Zixiang Wang, Ren-Hua Wang, Zhiwei Shuang, Zhen-Hua Ling |
| 2004 | A packet loss concealment method using recursive linear prediction. Kazuhiro Kondo, Kiyoshi Nakagawa |
| 2004 | A piecewise interpolation method based on log-least square error criterion for HRTF. Jie Zhang, Zhenyang Wu |
| 2004 | A proposal to quantitatively select the right intonation unit in data-driven intonation modeling. David Escudero Mancebo, Valentín Cardeñoso-Payo |
| 2004 | A prosodic phrasing model for a Korean text-to-speech synthesis system. Kyuchul Yoon |
| 2004 | A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech. Li Deng, Yu Dong, Alex Acero |
| 2004 | A robust glottal source model estimation technique. Qiang Fu, Peter Murphy |
| 2004 | A robust training algorithm based on neighborhood information. Wing-Hei Au, Man-Hung Siu |
| 2004 | A robust understanding model for spoken dialogues. Junyan Chen, Ji Wu, Zuoying Wang |
| 2004 | A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensor. Cenk Demiroglu, David V. Anderson |
| 2004 | A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots. Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino |
| 2004 | A state model for the realization of visual perceptive feedback in SmartKom. Peter Poller, Norbert Reithinger |
| 2004 | A statistical discrimination measure for hidden Markov models based on divergence. Jorge F. Silva, Shrikanth S. Narayanan |
| 2004 | A statistical lexicon for non-native speech recognition. Rainer Gruhn, Konstantin Markov, Satoshi Nakamura |
| 2004 | A study of minimum classification error training for segmental switching linear Gaussian hidden Markov models. Jian Wu, Donglai Zhu, Qiang Huo |
| 2004 | A study of tone classification for continuous Thai speech recognition. Tan Li, Montri Karnjanadecha, Thanate Khaorapapong |
| 2004 | A study on automatic detection of Japanese vowel devoicing for speech synthesis. Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang |
| 2004 | A study on model-based equal error rate estimation for automatic speaker verification. Hsiao-Chuan Wang, Jyh-Min Cheng |
| 2004 | A study on nasal coda loss in continuous speech. Qiang Fang |
| 2004 | A style control technique for HMM-based speech synthesis. Takashi Masuko, Takao Kobayashi, Keisuke Miyanaga |
| 2004 | A theoretical analysis of speech recognition based on feature trajectory models. Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri |
| 2004 | A trainable prosodic model: learning the contours implementing communicative functions within a superpositional model of intonation. Gérard Bailly, Bleicke Holm, Véronique Aubergé |
| 2004 | A two phase Arabic language model for speech recognition and other language applications. Mohsen A. Rashwan |
| 2004 | A two-level schema for detecting recognition errors. Zhengyu Zhou, Helen M. Meng |
| 2004 | A two-phase pitch marking method for TD-PSOLA synthesis. Cheng-Yuan Lin, Jyh-Shing Roger Jang |
| 2004 | A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects". Ren-yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu |
| 2004 | A universal speech interface for appliances. Thomas K. Harris, Roni Rosenfeld |
| 2004 | A vector-based method for efficiently representing multivariate environmental information. Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa |
| 2004 | A voice conversion method based on joint pitch and spectral envelope transformation. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel |
| 2004 | A Wizard of Oz framework for collecting spoken human-computer dialogs. Rohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt |
| 2004 | ACCDIST: a metric for comparing speakers' accents. Mark A. Huckvale |
| 2004 | ASR on speech reconstructed from short-time Fourier phase spectra. Leigh David Alsteris, Kuldip K. Paliwal |
| 2004 | AVICAR: audio-visual speech corpus in a car environment. Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas S. Huang |
| 2004 | Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement. Hugo Van hamme, Patrick Wambacq, Veronique Stouten |
| 2004 | Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect. Shigeyoshi Kitazawa, Shinya Kiriyama |
| 2004 | Acoustic correlates of phrase-internal lexical boundaries in Dutch. Taehong Cho, Elizabeth K. Johnson |
| 2004 | Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task. Shinji Watanabe |
| 2004 | Acoustic model adaptation for coded speech using synthetic speech. Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge |
| 2004 | Acoustic phonetic modeling using local codebook features. Frank Diehl, Asunción Moreno |
| 2004 | Acoustic-to-articulatory inversion mapping with Gaussian mixture model. Tomoki Toda, Alan W. Black, Keiichi Tokuda |
| 2004 | Active perception: using a priori knowledge from clean speech models to ignore non-target features. Bert Cranen, Johan de Veth |
| 2004 | Adaptation for soft whisper recognition using a throat microphone. Szu-Chen Stan Jou, Tanja Schultz, Alex Waibel |
| 2004 | Adaptation in the pronunciation space for non-native speech recognition. Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth |
| 2004 | Adaptation of front end parameters in a speech recognizer. Karthik Visweswariah, Ramesh A. Gopinath |
| 2004 | Adaptive beamforming combined with particle filtering for acoustic source localization. Reinhold Haeb-Umbach, Sven Peschke, Ernst Warsitz |
| 2004 | Adaptive classifier cascade for multimodal speaker identification. Engin Erzin, Yucel Yemez, A. Murat Tekalp |
| 2004 | Adaptive cross-channel interference cancellation on blind signal separation outputs using source absence/presence detection and spectral subtraction. Gil-Jin Jang, Changkyu Choi, Yongbeom Lee, Yung-Hwan Oh |
| 2004 | Adaptive long-term predictive analysis of disordered speech. Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen |
| 2004 | Adult and infant sensitivity to phonotactic features in spoken Japanese. Sachiyo Kajikawa, Laurel Fais, Shigeaki Amano, Janet F. Werker |
| 2004 | Alignment of human prosodic patterns for spoken dialogue systems. Noriko Suzuki, Yasuhiro Katagiri |
| 2004 | An acoustic shock limiting algorithm using time and frequency domain speech features. Tina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Robert L. Brennan |
| 2004 | An acoustic study of emotions expressed in speech. Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan, Carlos Busso |
| 2004 | An acoustic study of speech rhythm in Taiwan English. Hua-Li Jian |
| 2004 | An acoustic-analytic role for the deviation between the scansion and reading of poems. Key-Seop Kim, Un Lim, Dong-Il Shin |
| 2004 | An adaptive MEL-LPC analysis for speech recognition. Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada |
| 2004 | An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments. Kun-Ching Wang |
| 2004 | An adaptive Kalman filter for the enhancement of speech signals. Marcel Gabrea |
| 2004 | An analysis of packet loss models for distributed speech recognition. Ben P. Milner, Alastair Bruce James |
| 2004 | An efficient codebook design in SDCHMM for mobile communication environments. Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh |
| 2004 | An efficient partial matching algorithm toward speech retrieval by speech. Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee |
| 2004 | An efficient repair procedure for quick transcriptions. Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde |
| 2004 | An energy normalization scheme for improved robustness in speech recognition. Seyed Mohammad Ahadi, Hamid Sheikhzadeh, Robert L. Brennan, George H. Freeman |
| 2004 | An evaluation of a spoken document retrieval baseline system in Finnish. Mikko Kurimo, Ville T. Turunen, Inger Ekman |
| 2004 | An experimental method for measuring transfer functions of acoustic tubes. Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto |
| 2004 | An implement of speech DB gathering system using voiceXML. Dong-Hyun Kim, Yong-Wan Roh, Kwang-Seok Hong |
| 2004 | An improved pair-wise variability index for comparing the timing characteristics of speech. Hua-Li Jian |
| 2004 | An improved preprocessor for the automatic transcription of broadcast news audio stream. Jindrich Zdánský, Petr David, Jan Nouza |
| 2004 | An information extraction approach for spoken language understanding. Jihyun Eun, Changki Lee, Gary Geunbae Lee |
| 2004 | An interactive English pronunciation dictionary for Korean learners. Chao Wang, Mitchell Peabody, Stephanie Seneff, Jong-mi Kim |
| 2004 | An intonation model for embedded devices based on natural F0 samples. Gerasimos Xydas, Georgios Kouroupetroglou |
| 2004 | An online audio indexing system. Jitendra Ajmera, Iain McCowan, Hervé Bourlard |
| 2004 | An understanding strategy based on plausibility score in recognition history using CSR confidence measure. Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi |
| 2004 | Analysis of F0 contours of Cantonese utterances based on the command-response model. Wentao Gu, Keikichi Hirose, Hiroya Fujisaki |
| 2004 | Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice. Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi |
| 2004 | Analysis of emotional speech in voice mail messages: the influence of speakers' gender. Noël Chateau, Valérie Maffiolo, Christophe Blouin |
| 2004 | Analysis of hypernasality by synthesis. P. Vijayalakshmi, M. Ramasubba Reddy |
| 2004 | Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpus. Hiroshi Fujimura, Katsunobu Itou, Kazuya Takeda, Fumitada Itakura |
| 2004 | Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models. Makoto Shozakai, Goshu Nagino |
| 2004 | Analysis of the phone level contributions to objective evaluation of English speech by non-natives. Yasuo Suzuki, Yoshinori Sagisaka, Katsuhiko Shirai, Makiko Muto |
| 2004 | Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filtering. Hannu Pulakka, Paavo Alku, Svante Granqvist, Stellan Hertegard, Hans Larsson, Anne-Maria Laukkanen, Per-Ake Lindestad, Erkki Vilkman |
| 2004 | Analysis on disappearing and thriving of speech applications for ergonomic design guidelines and recommendations. Rinzou Ebukuro |
| 2004 | Application of long-term filtering to formant estimation. Hong You |
| 2004 | Application of voice conversion to hearing-impaired Mandarin speech enhancement. Chen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang |
| 2004 | Apply n-best list re-ranking to acoustic model combinations of boosting training. Rong Zhang, Alexander I. Rudnicky |
| 2004 | Applying pitch connection control in Mandarin speech synthesis. Yi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen |
| 2004 | Applying the Aurora feature extraction schemes to a phoneme based recognition task. Hans-Günter Hirsch, Harald Finster |
| 2004 | Approach to interchange-format based Chinese generation. Wenjie Cao, Chengqing Zong, Bo Xu |
| 2004 | Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study. Emi Zuiki Murano, Mihoko Teshigawara |
| 2004 | Articulatory feature recognition using dynamic Bayesian networks. Joe Frankel, Mirjam Wester, Simon King |
| 2004 | Articulatory feature-based conditional pronunciation modeling for speaker verification. Ka-Yee Leung, Man-Wai Mak, Sun-Yuan Kung |
| 2004 | Aspects of named entity processing. Michael Levit, Allen L. Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth |
| 2004 | Aspects of speaking-face data corpus design methodology. J. Bruce Millar, Michael Wagner, Roland Goecke |
| 2004 | Assessment of non-native phones in anglicisms by German listeners. Julia Abresch, Stefan Breuer |
| 2004 | Audio source separation from the mixture using empirical mode decomposition with independent subspace analysis. Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu |
| 2004 | Audio watermarking in sub-band signals using multiple echo kernels. In-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, Rémy Prost |
| 2004 | Audio-visual speaker localization for car navigation systems. Xianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno |
| 2004 | Audio-visual spoken language processing. Jinyoung Kim, Jeesun Kim, Chris Davis |
| 2004 | Audiovisual perceptual evaluation of resynthesised speech movements. Matthias Odisio, Gérard Bailly |
| 2004 | Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition. Antoine Raux |
| 2004 | Automatic adaptation of the Momel F0 stylisation algorithm to new corpora. Salma Mouline, Olivier Boëffard, Paul C. Bagshaw |
| 2004 | Automatic detection of contrast for speech understanding. Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang |
| 2004 | Automatic detection of dialog acts based on multilevel information. Sophie Rosset, Lori Lamel |
| 2004 | Automatic detection of vocal fold paralysis and edema. Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras |
| 2004 | Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers. Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo |
| 2004 | Automatic extraction of phonetically rich sentences from large text corpus of Indian languages. Karunesh Arora, Sunita Arora, Kapil Verma, Shyam Sunder Agrawal |
| 2004 | Automatic language identification using discrete hidden Markov model. Kakeung Wong, Man-Hung Siu |
| 2004 | Automatic lips reading for audio-visual speech processing and recognition. Josef Chaloupka |
| 2004 | Automatic network optimization of voice applications. Juan M. Huerta, Chaitanya Ekanadham |
| 2004 | Automatic phonetic base form generation based on maximum context tree. Changxue Ma |
| 2004 | Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals. Attila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee |
| 2004 | Automatic prosody labeling of read Norwegian. Per Olav Heggtveit, Jon Emil Natvig |
| 2004 | Automatic pruning of unit selection speech databases for synthesis without loss of naturalness. Rohit Kumar, S. Prahallad Kishore |
| 2004 | Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approach. Larry P. Heck, Mark Z. Mao |
| 2004 | Automatic transcription of continuous speech using unsupervised and incremental training. L. Sarada Ghadiyaram, Hemalatha Nagarajan, Nagarajan Thangavelu, Hema A. Murthy |
| 2004 | Automatic transformation of lecture transcription into document style using statistical framework. Tatsuya Kawahara, Kazuya Shitaoka, Hiroaki Nanjo |
| 2004 | Beginning of utterance detection algorithm for low complexity ASR engines. Tommi Lahti |
| 2004 | Belief-based nonlinear rescoring in Thai speech understanding. Chai Wutiwiwatchai, Sadaoki Furui |
| 2004 | Best speaker-based structure tree for speaker verification. Chakib Tadj, Christian S. Gargour, Nabil Badri |
| 2004 | Biomechanical parameter fingerprint in the mucosal wave power spectral density. Juan Ignacio Godino-Llorente, María Victoria Rodellar Biarge, Pedro Gómez-Vilda, Francisco Díaz Pérez, Agustín Álvarez-Marquina, Rafael Martínez-Olalla |
| 2004 | Blind separation of speech and sub-Gaussian signals in underdetermined case. Sang-Gyun Kim, Chang D. Yoo |
| 2004 | Bonntempo-corpus and bonntempo-tools: a database for the study of speech rhythm and rate. Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner |
| 2004 | Bootstrapping phonetic lexicons for new languages. Sameer Maskey, Alan W. Black, Laura Tomokiya |
| 2004 | CIAIR in-car speech database. Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura |
| 2004 | Canonicalization of feature parameters for automatic speech recognition. Takashi Fukuda, Tsuneo Nitta |
| 2004 | Channel frequency response correction for speaker recognition. Stanley J. Wenndt, Richard M. Floyd |
| 2004 | Characterizing and classifying cued speech vowels from labial parameters. Denis Beautemps, Thomas Burger, Laurent Girin |
| 2004 | Characterizing task-oriented dialog using a simulated ASR channel. Jason D. Williams, Steve J. Young |
| 2004 | Children's emotion recognition in an intelligent tutoring scenario. Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang |
| 2004 | Chinese prosody phrase break prediction based on maximum entropy model. Jianfeng Li, Guoping Hu, Ren-Hua Wang |
| 2004 | Chinese text word-segmentation considering semantic links among sentences. Leonardo Badino |
| 2004 | Classification of pathological voice including severely noisy cases. Cheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung-Soon Kim, Tao Li |
| 2004 | Classifying emotion in Chinese speech by decomposing prosodic features. Dan-Ning Jiang, Lian-Hong Cai |
| 2004 | Clause types and filled pauses in Japanese spontaneous monologues. Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu |
| 2004 | Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification. Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen |
| 2004 | Clustering similar nouns for selecting related news articles. Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi |
| 2004 | Coarticulatory variability and directionality in [s, ..]: an EPG study. Mitsuhiro Nakamura |
| 2004 | Combination of speech features using smoothed heteroscedastic linear discriminant analysis. Lukás Burget |
| 2004 | Combination of standard and throat microphones for robust speech recognition in highly noisy environments. Martin Graciarena, Federico Cesari, Horacio Franco, Gregory K. Myers, Cregg Cowan, Victor Abrash |
| 2004 | Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling. Zhaobing Han, Shuwu Zhang, Bo Xu |
| 2004 | Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation. Grace Chung, Chao Wang, Stephanie Seneff, Edward Filisko, Min Tang |
| 2004 | Communicative competence and adaptation in a spoken dialogue system. Kristiina Jokinen |
| 2004 | Compact acoustic model for embedded implementation. Junho Park, Hanseok Ko |
| 2004 | Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative function. Frédéric Berthommier |
| 2004 | Comparing intonation of two varieties of French using normalized F0 values. Svetlana Kaminskaia, François Poiré |
| 2004 | Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition. Panu Somervuo |
| 2004 | Comparison of several speaker verification procedures based on GMM. Vlasta Radová, Ales Padrta |
| 2004 | Comparison of transmitter-based packet-loss recovery techniques for voice transmission. Moo Young Kim, W. Bastiaan Kleijn |
| 2004 | Complex emotion recognition system for a specific user using SOM based on prosodic features. Atsushi Iwai, Yoshikazu Yano, Shigeru Okuma |
| 2004 | Complex spectrum circle centroid for microphone-array-based noisy speech recognition. Shigeki Sagayama, Okajima Takashi, Yutaka Kamamoto, Takuya Nishimoto |
| 2004 | Compression of speech database by feature separation and pattern clustering using STRAIGHT. Zhen-Hua Ling, Yu Hu, Zhiwei Shuang, Ren-Hua Wang |
| 2004 | Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation. Kunitoshi Motoki, Hiroki Matsuzaki |
| 2004 | Conditional maximum likelihood estimation for improving annotation performance of n-gram models incorporating stochastic finite state grammars. Vaibhava Goel |
| 2004 | Confirmation strategy for document retrieval systems with spoken dialog interface. Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani |
| 2004 | Constrained minimization technique for topic identification using discriminative training and support vector machines. Imed Zitouni, Minkyu Lee, Hui Jiang |
| 2004 | Construct a multi-lingual speech corpus in Taiwan with extracting phonetically balanced articles. Min-Siong Liang, Dau-Cheng Lyu, Yuang-Chin Chiang, Ren-yuan Lyu |
| 2004 | Constructing emotional speech synthesizers with limited speech database. Heiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth S. Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda |
| 2004 | Context based emotion detection from text input. Jianhua Tao |
| 2004 | Context dependent "long units" for speech recognition. Denis Jouvet, Ronaldo O. Messina |
| 2004 | Context dependent phoneme duration modeling with tree-based state tying. Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee |
| 2004 | Context dependent statistical augmentation of Persian transcripts. Panayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani Mehr |
| 2004 | Contextual revision in information seeking conversation systems. Keith Houck |
| 2004 | Continuous speech recognition using joint features derived from the modified group delay function and MFCC. Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde |
| 2004 | Convolutional networks for speech detection. Somsak Sukittanon, Arun C. Surendran, John C. Platt, Christopher J. C. Burges |
| 2004 | Coping with disfluencies in spontaneous speech recognition. Frederik Stouten, Jean-Pierre Martens |
| 2004 | Correcting Korean vowel speech recognition errors with limited lip features. Ki-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee |
| 2004 | Correlation between VOT and F0 in the perception of Korean stops and affricates. Midam Kim |
| 2004 | Cost-sensitive call classification. Gökhan Tür |
| 2004 | Cough detection in spoken dialogue system for home health care. Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta |
| 2004 | Creating speech recognition grammars from regular expressions for alphanumeric concepts. Ye-Yi Wang, Yun-Cheng Ju |
| 2004 | Cross domain dialogue modelling: an object-based approach. Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear |
| 2004 | Cross-lingual phoneme mapping for multilingual synthesis systems. Marko Moberg, Kimmo Pärssinen, Juha Iso-Sipilä |
| 2004 | Crosscorrelation-based multispeaker speech activity detection. Kornel Laskowski, Qin Jin, Tanja Schultz |
| 2004 | DFW-based spectral smoothing for concatenative speech synthesis. Hartmut R. Pfitzinger |
| 2004 | DOA estimation of speech signals using semi-blind source separation techniques. Ilyas Potamitis, Panagiotis Zervas, Nikos Fakotakis |
| 2004 | DORIS, a multiagent/IP platform for multimodal dialogue applications. Johann L'Hour, Olivier Boëffard, Jacques Siroux, Laurent Miclet, Francis Charpentier, Thierry Moudenc |
| 2004 | DWT-based classification of acoustic-phonetic classes and phonetic units. Gernot Kubin, Tuan Van Pham |
| 2004 | Data driven multidialectal phone set for Spanish dialects. Mónica Caballero, Asunción Moreno, Albino Nogueiras |
| 2004 | Data driven number-of-states selection in HMM topologies. Dirk Knoblauch |
| 2004 | Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems. Zhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen |
| 2004 | Data-driven approaches for automatic detection of syllable boundaries. Jilei Tian |
| 2004 | Decision-tree backing-off in HMM-based speech synthesis. Shunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura |
| 2004 | Decomposing linguistic and affective components of phonatory quality. Ailbhe Ní Chasaide, Christer Gobl |
| 2004 | Default phrasing and attachment preference in Korean. Sun-Ah Jun |
| 2004 | Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case. Kazuyuki Takagi, Kazuhiko Ozeki |
| 2004 | Dependency structure analysis and sentence boundary detection in spontaneous Japanese. Tatsuya Kawahara, Kiyotaka Uchimoto, Hitoshi Isahara, Kazuya Shitaoka |
| 2004 | Dereverberation of speech signals based on linear prediction. Marc Delcroix, Takafumi Hikichi, Masato Miyoshi |
| 2004 | Design and construction of Korean-spoken English corpus. Seok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang |
| 2004 | Design of compact acoustic models through clustering of tied-covariance Gaussians. Mark Z. Mao, Vincent Vanhoucke |
| 2004 | Design of ready-made acoustic model library by two-dimensional visualization of acoustic space. Goshu Nagino, Makoto Shozakai |
| 2004 | Design strategies for a virtual language tutor. Jonas Beskow, Olov Engwall, Björn Granström, Preben Wik |
| 2004 | Detecting user engagement in everyday conversations. Chen Yu, Paul M. Aoki, Allison Woodruff |
| 2004 | Detection of vowel onset points in continuous speech using autoassociative neural network models. Suryakanth V. Gangashetty, Chellu Chandra Sekhar, B. Yegnanarayana |
| 2004 | Deterministic annealing EM algorithm in parameter estimation for acoustic model. Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura |
| 2004 | Development of the knowledge-based spoken English evaluation system and its application. Seok-Chae Rhee, Jeon G. Park |
| 2004 | Developmental changes in voiced-segment ratio for Japanese infants and parents. Shigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo |
| 2004 | Dialect analysis and modeling for automatic classification. John H. L. Hansen, Umit H. Yapanel, Rongqing Huang, Ayako Ikeno |
| 2004 | Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction. Gustavo Hernández Ábrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf |
| 2004 | Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition. Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno |
| 2004 | Discriminative combination of multiple linear predictions for speech recognition. Zhijian Ou, Zuoying Wang |
| 2004 | Discriminative training of compound-word based multinomial classifiers for speech routing. Xiang Li, Juan M. Huerta |
| 2004 | Discriminative training of naive Bayes classifiers for natural language call routing. Hui Jiang, Pengfei Liu, Imed Zitouni |
| 2004 | Discriminative training with tied covariance matrices. Wolfgang Macherey, Ralf Schlüter, Hermann Ney |
| 2004 | Distributed speaker recognition using earth mover's distance. Yoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren |
| 2004 | Distributed speaker recognition. Veena Desai, Hema A. Murthy |
| 2004 | Domain adaptation methods in the IBM trainable text-to-speech system. Volker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann |
| 2004 | Duration modeling for Hindi text-to-speech synthesis system. Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan |
| 2004 | Duration modeling techniques for continuous speech recognition. Janne Pylkkönen, Mikko Kurimo |
| 2004 | Dynamic beam pruning strategy using adaptive control. Dongbin Zhang, Limin Du |
| 2004 | Dynamic language modeling for broadcast news. Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda |
| 2004 | Dynamic time windows for multimodal input fusion. Anurag Kumar Gupta, Tasos Anastasakos |
| 2004 | EVITA-RAD: an extensible enterprise voice porTAl - rapid application development tool. Yu Chen |
| 2004 | Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of English. Valérie Hazan, Anke Sennema, Andrew Faulkner |
| 2004 | Effect of speaking rate on the acceptability of change in segment duration. Hiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto |
| 2004 | Effect of voice prosody on the decision making process in human-computer interaction. Yohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi |
| 2004 | Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition. Jing Zheng, Horacio Franco, Andreas Stolcke |
| 2004 | Effects of language modeling on speech-driven question answering. Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba |
| 2004 | Effects of phonetic contexts on the duration of phonetic segments in fluent read speech. Sorin Dusan |
| 2004 | Effects of prosodic boundaries on ambiguous syntactic clause boundaries in Japanese. Shari R. Speer, Soyoung Kang |
| 2004 | Efficient compression method for pronunciation dictionaries. Jilei Tian |
| 2004 | Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition. Etienne Marcheret, Stephen M. Chu, Vaibhava Goel, Gerasimos Potamianos |
| 2004 | Efficient online cohort selection method for speaker verification. Tomi Kinnunen, Evgeny Karpov, Pasi Fränti |
| 2004 | Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications. David Malah, Slava Shechtman |
| 2004 | Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features. Jinsong Zhang, Satoshi Nakamura, Keikichi Hirose |
| 2004 | Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser. Stephen So, Kuldip K. Paliwal |
| 2004 | Eigen-prosody analysis for robust speaker recognition under mismatch handset environment. Zi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang |
| 2004 | Elements of interactivity in telephone conversations. Florian Hammer, Peter Reichl, Alexander Raake |
| 2004 | Emotion recognition based on phoneme classes. Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan |
| 2004 | Emotion verification for emotion detection and unknown emotion rejection. Hoon-Young Cho, Kaisheng Yao, Te-Won Lee |
| 2004 | Enhancement of reverberant speech using excitation source information. M. Chaitanya, S. R. Mahadeva Prasanna, B. Yegnanarayana |
| 2004 | Enhancing existing form-based dialogue managers with reasoning capabilities. Dirk Bühler |
| 2004 | Entropy based combination of tandem representations for noise robust ASR. Shajith Ikbal, Hemant Misra, Sunil Sivadas, Hynek Hermansky, Hervé Bourlard |
| 2004 | Environmental robust features for speech detection. Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros |
| 2004 | Error-weighted discriminative training for HMM parameter estimation. Daniel Willett |
| 2004 | Estimating detailed spectral envelopes using articulatory clustering. Yoshinori Shiga, Simon King |
| 2004 | Estimating speaking rate in spontaneous speech from z-scores of pattern durations. Kazuyuki Ashimura, Hideki Kashioka, Nick Campbell |
| 2004 | Estimating syntactic structure from prosodic features in Japanese speech. Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa |
| 2004 | Estimation of semantic confidences on lattice hierarchies. Robert Lieb, Tibor Fábián, Günther Ruske, Matthias Thomae |
| 2004 | Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks. Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim |
| 2004 | Etiology of user experience with natural language speech. Christopher J. Pavlovski, Jennifer C. Lai, Stella Mitchell |
| 2004 | European initiatives to promote cooperation between speech and text communities. Nicoletta Calzolari |
| 2004 | Evaluating cognitive load in spoken language interfaces using a dual-task paradigm. Ellen Campana, Michael K. Tanenhaus, James F. Allen, Roger W. Remington |
| 2004 | Evaluating system metaphors via the speech output of a smart home system. Sebastian Möller, Jan Felix Krebber, Paula M. T. Smeele |
| 2004 | Evaluation of a prosodic labeling system utilizing linguistic information. Shinya Kiriyama, Shigeyoshi Kitazawa |
| 2004 | Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speech. Keiichi Takamaru |
| 2004 | Evaluation of an inverse filtering technique using physical modeling of voice production. Paavo Alku, Matti Airas, Brad H. Story |
| 2004 | Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis. Panagiotis Zervas, Nikos Fakotakis, George K. Kokkinakis, Georgios Kouroupetroglou, Gerasimos Xydas |
| 2004 | Evaluation of the difference between the driving behavior of a speech based and a speech-visual based task of an in-car computer. Zhan Fu, Lay Ling Pow, Fang Chen |
| 2004 | Evaluation of the speech output of a smart-home system in a car environment. Paula M. T. Smeele, Sebastian Möller, Jan Felix Krebber |
| 2004 | Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database. Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui |
| 2004 | Evaluation of universal compensation on Aurora 2 and 3 and beyond. Ji Ming, Baochun Hou |
| 2004 | Evolutionary optimization of an adaptive prosody model. Oliver Jokisch, Michael Hofmann |
| 2004 | Evolutive speaker segmentation using a repository system. Xavier Anguera Miró, Javier Hernando Pericas |
| 2004 | Example-based spoken dialogue system with online example augmentation. Hiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Yasuyoshi Inagaki |
| 2004 | Example-based training of dialogue planning incorporating user and situation models. Ian Richard Lane, Tatsuya Kawahara, Shinichi Ueno |
| 2004 | Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system. Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara |
| 2004 | Explicit duration modeling for Cantonese connected-digit recognition. Yu Zhu, Tan Lee |
| 2004 | Exploiting models intrinsic robustness for noisy speech recognition. Christophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina |
| 2004 | Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective. Folkert de Vriend, Giulio Maltese |
| 2004 | Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion. Guo-Hong Ding, Bo Xu |
| 2004 | F0 and formant frequency distribution of dysarthric speech - a comparative study. Hiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi |
| 2004 | Fast GMM-based voice conversion for text-to-speech synthesis systems. Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel |
| 2004 | Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format. Peder A. Olsen, Karthik Visweswariah |
| 2004 | Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition. Takaaki Hori, Chiori Hori, Yasuhiro Minami |
| 2004 | Fast parameter estimation for joint maximum entropy language models. Edward James Schofield |
| 2004 | Fast semi-automatic semantic annotation for spoken dialog systems. Ruhi Sarikaya, Yuqing Gao, Paola Virga |
| 2004 | Fast speech adaptation in linear spectral domain for additive and convolutional noise. Dongsuk Yook, Donghyun Kim |
| 2004 | Feature-based pronunciation modeling with trainable asynchrony probabilities. Karen Livescu, James R. Glass |
| 2004 | Feature-dependent compensation in speech recognition. Ivan Brito, Néstor Becerra Yoma, Carlos Molina |
| 2004 | Fiction database for emotion detection in abnormal situations. Ioana Vasilescu, Laurence Devillers, Chloé Clavel, Thibaut Ehrette |
| 2004 | Finite-state-based and phrase-based statistical machine translation. Josep Maria Crego, José B. Mariño, Adrià de Gispert |
| 2004 | Flexible dialogue management using distributed and dynamic dialogue control. Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk |
| 2004 | Florence: a dialogue manager framework for spoken dialogue systems. Giuseppe Di Fabbrizio, Charles Lewis |
| 2004 | Flow representation through the glottis having a polygonal boundary shape. Yosuke Tanabe, Tokihiko Kaburagi |
| 2004 | Foreign-accented speaker-independent speech recognition. Stefanie Aalburg, Harald Höge |
| 2004 | Formulating contextual tonal variations in Mandarin. Jinfu Ni, Hisashi Kawai, Keikichi Hirose |
| 2004 | Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems. Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky, Jahanzeb Sherwani |
| 2004 | Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian). Rob van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes |
| 2004 | Frequency warped ARMA analysis of the closed and the open phase of voiced speech. Pedro J. Quintana-Morales, Juan L. Navarro-Mesa |
| 2004 | Friendly speech analysis and perception in standard Chinese. Aijun Li, Haibo Wang |
| 2004 | From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition. Andrew Cameron Morris, Viktoria Maier, Phil D. Green |
| 2004 | From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication process. Jacqueline Vaissière |
| 2004 | From decoding-driven to detection-based paradigms for automatic speech recognition. Chin-Hui Lee |
| 2004 | From real-time MRI to 3d tongue movements. Olov Engwall |
| 2004 | From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system. Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, David Gelbart, Nikki Mirghafori, Tuomo W. Pirinen |
| 2004 | Fujisaki model based F0 contours in Vietnamese TTS. Dung Tien Nguyen, Chi Mai Luong, Bang Kim Vu, Hansjörg Mixdorff, Huy Hoang Ngo |
| 2004 | Functions of intonation boundaries during spoken language comprehension in English. Allison Blodgett |
| 2004 | Fuzzy logic decision fusion in a multimodal biometric system. Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu Sang Moon, Yeung Yam |
| 2004 | Generating gestures from speech. Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo |
| 2004 | Grapheme-to-phoneme conversion for Chinese text-to-speech. Jun Xu, Guohong Fu, Haizhou Li |
| 2004 | Graphical model approach to pitch tracking. Xiao Li, Jonathan Malkin, Jeff A. Bilmes |
| 2004 | HLT modules scalability within the NESPOLE! project. Hervé Blanchon |
| 2004 | HMM-based feature compensation method: an evaluation using the AURORA2. Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano |
| 2004 | Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtraction. Masanori Tsujikawa, Ken-ichi Iso |
| 2004 | Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window. Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham Zolfaghari |
| 2004 | Hidden factor dynamic Bayesian networks for speech recognition. Filip Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina |
| 2004 | Hidden semi-Markov model based speech synthesis. Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura |
| 2004 | Higgins - a spoken dialogue system for investigating error handling techniques. Jens Edlund, Gabriel Skantze, Rolf Carlson |
| 2004 | High quality text-to-pinyin conversion using two-phase unknown word prediction. Juhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim |
| 2004 | High-level feature weighted GMM network for audio stream classification. Rongqing Huang, John H. L. Hansen |
| 2004 | Highband spectrum envelope estimation of telephone speech using hard/soft-classification. Yasheng Qian, Peter Kabal |
| 2004 | Histogram normalisation and the recognition of names and ontology words in the MUMIS project. Eric Sanders, Febe de Wet |
| 2004 | Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivity. Peter Reichl, Florian Hammer |
| 2004 | How does the integration of speech recognition controls and spatialized auditory displays affect user workload? Ellen C. Haas |
| 2004 | How sparse can we make the auditory representation of speech? Christian Feldbauer, Gernot Kubin |
| 2004 | How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for French. Nicole Beringer |
| 2004 | Human language acquisition methods in a machine learning task. Nicole Beringer |
| 2004 | Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers. Youngkyu Cho, Sung-a Kim, Dongsuk Yook |
| 2004 | Hybrid named entity recognition for question-answering system. Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang |
| 2004 | Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence. Minho Jin, Gyucheol Jang, Sungrack Yun, Chang Dong Yoo |
| 2004 | ICA-based feature extraction for phoneme recognition. Oh-Wook Kwon, Te-Won Lee |
| 2004 | Identifying emotion in speech prosody using acoustical cues of harmony. Takashi X. Fujisawa, Norman D. Cook |
| 2004 | Identifying local corrections in human-computer dialogue. Gina-Anne Levow |
| 2004 | Implementation of an intonational quality assessment system for a handheld device. Kisun You, Hoyoun Kim, Wonyong Sung |
| 2004 | Implementation of dialog applications in an open-source VoiceXML platform. Fernando Fernández Martínez, Valentín Sama, Luis Fernando D'Haro, Rubén San Segundo, Ricardo de Córdoba, Juan Manuel Montero |
| 2004 | Improved differential phase spectrum processing for formant tracking. Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro |
| 2004 | Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation. Yasunari Obuchi |
| 2004 | Improved iterative Wiener filtering for non-stationary noise speech enhancement. T. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy |
| 2004 | Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification system. Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki |
| 2004 | Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation. Jeff Siu-Kei Au-Yeung, Man-Hung Siu |
| 2004 | Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains. Shang-nien Tsai |
| 2004 | Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition. Gwo-hwa Ju, Lin-Shan Lee |
| 2004 | Improved spoken language translation using n-best speech recognition hypotheses. Ruiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai Kit Lo |
| 2004 | Improved voice activity detection combining noise reduction and subband divergence measures. Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio |
| 2004 | Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis. Keikichi Hirose |
| 2004 | Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition. Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami |
| 2004 | Improvement of confidence measure performance using background model set algorithm. Byoung-Don Kim, Jin Young Kim, Seung Ho Choi, Young-Bum Lee, Kyoung-Rok Lee |
| 2004 | Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation. Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi |
| 2004 | Improving eigenspace-based MLLR adaptation by kernel PCA. Brian Kan-Wing Mak, Roger Wend-Huu Hsiao |
| 2004 | Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction. Gabriel Webster |
| 2004 | Improving performance of text-independent speaker identification by utilizing contextual principal curves filtering. Yong Guan, Wenju Liu, Hongwei Qi, Jue Wang |
| 2004 | Improving the topic indexation and segmentation modules of a media watch system. Rui Amaral, Isabel Trancoso |
| 2004 | In search of a universal phonetic alphabet - theory and application of an organic visible speech. Hyun-Bok Lee |
| 2004 | In-phase feature induction: an effective compensation technique for robust speech recognition. Siu Wa Lee, Pak-Chung Ching |
| 2004 | In-vehicle based speech processing for hearing impaired subjects. Xianxian Zhang, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz |
| 2004 | Including dynamic and phonetic information in voice conversion systems. Antonio Bonafonte, Alexander Kain, Jan P. H. van Santen, Helenca Duxans |
| 2004 | Including uncertainty of speech observations in robust speech recognition. José C. Segura, Ángel de la Torre, Javier Ramírez, Antonio J. Rubio, M. Carmen Benítez |
| 2004 | Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach. Takatoshi Jitsuhiro, Satoshi Nakamura |
| 2004 | Indonesian speech recognition for hearing and speaking impaired people. Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol |
| 2004 | Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowels. Frantz Clermont, Thomas John Millhouse |
| 2004 | Influence of temporal discretization schemes on formant frequencies and bandwidths in time domain simulations of the vocal tract system. Peter Birkholz, Dietmar Jackèl |
| 2004 | Inner product based-multiband vector quantization for wideband speech coding at 16 kbps. Seung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang |
| 2004 | Integrating layer concept information into n-gram modeling for spoken language understanding. Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai |
| 2004 | Integration of articulatory dynamic parameters in HMM/BN based speech recognition system. Konstantin Markov, Satoshi Nakamura, Jianwu Dang |
| 2004 | Integration of n-best recognition results obtained by multiple noise reduction algorithms. Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki |
| 2004 | Integration patterns during multimodal interaction. Anurag Kumar Gupta, Tasos Anastasakos |
| 2004 | Intelligibility of degraded speech from smeared STRAIGHT spectrum. Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin |
| 2004 | Interface for barge-in free spoken dialogue system using adaptive sound field control. Tatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2004 | Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI. Taejin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson |
| 2004 | Intonation modeling for Indian languages. Krothapalli Sreenivasa Rao, Bayya Yegnanarayana |
| 2004 | Intonation recognition for Indonesian speech based on Fujisaki model. Nazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul |
| 2004 | Investigating automatic recognition of non-native children's speech. Matteo Gerosa, Diego Giuliani |
| 2004 | Investigating speech style specific pronunciation variation in large spoken language corpora. Christophe Van Bael, Henk van den Heuvel, Helmer Strik |
| 2004 | Issues in meeting transcription - the ISL meeting transcription system. Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen |
| 2004 | Issues in the development of auditory-visual speech perception: adults, infants, and children. Kaoru Sekiyama, Denis Burnham |
| 2004 | Jacobian adaptation with improved noise reference for speaker verification. Jan Anguita, Javier Hernando, Alberto Abad |
| 2004 | Joint extraction and prediction of Fujisaki's intonation model parameters. Pablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte |
| 2004 | Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task. Masahiko Matsushita, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro |
| 2004 | Keyword spotting for highly inflectional languages. Lubos Smídl, Ludek Müller |
| 2004 | Korean prosody generation and artificial neural networks. Kyung-Joong Min, Un-Cheon Lim |
| 2004 | LP-TRAP: linear predictive temporal patterns. Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis |
| 2004 | Language detection by neural discrimination. Celestin Sedogbo, Sébastien Herry, Bruno Gas, Jean-Luc Zarader |
| 2004 | Language identification techniques based on full recognition in an air traffic control task. Ricardo de Córdoba, Javier Ferreiros, Valentín Sama, Javier Macías Guarasa, Luis Fernando D'Haro, Fernando Fernández Martínez |
| 2004 | Language model adaptation based on PLSA of topics and speakers. Yuya Akita, Tatsuya Kawahara |
| 2004 | Language recognition using phone lattices. Jean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk |
| 2004 | Language specific phonetic rules: evidence from domain-initial strengthening. Sung-A. Kim |
| 2004 | Large vocabulary continuous speech recognition based on cross-morpheme phonetic information. In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon |
| 2004 | Large vocabulary continuous speech recognition for Estonian using morpheme classes. Tanel Alumäe |
| 2004 | Latent semantic analysis for speaker recognition. A. Nayeemulla Khan, Bayya Yegnanarayana |
| 2004 | Learning dialogue policies using state aggregation in reinforcement learning. Matthias Denecke, Kohji Dohsaka, Mikio Nakano |
| 2004 | Learning for transliteration of Arabic-numeral expressions using decision tree for Korean TTS. Youngim Jung, Donghun Lee, HyeonSook Nam, Ae-sun Yoon, Hyuk-Chul Kwon |
| 2004 | Learning long-term temporal features in LVCSR using neural networks. Barry Y. Chen, Qifeng Zhu, Nelson Morgan |
| 2004 | Learning nonnegative features of spectro-temporal sounds for classification. Yong-Choon Cho, Seungjin Choi |
| 2004 | Learning subject drift for topic tracking. Fumiyo Fukumoto, Yoshimi Suzuki |
| 2004 | Letter-to-sound for small-footprint multilingual TTS engine. Gui-Lin Chen, Ke-Song Han |
| 2004 | Lexical representation of non-native phonemes. Mirjam Broersma, K. Marieke Kolkman |
| 2004 | Long term modeling of phase trajectories within the speech sinusoidal model framework. Laurent Girin, Mohammad Firouzmand, Sylvain Marchand |
| 2004 | Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet. Hisako Asano, Hideharu Nakajima, Hideyuki Mizuno, Masahiro Oku |
| 2004 | MAP prediction of pitch from MFCC vectors for speech reconstruction. Xu Shao, Ben P. Milner |
| 2004 | METRIC-SEQDAC: a hybrid approach for audio segmentation. Hsin-Min Wang, Shih-Sian Cheng |
| 2004 | MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition. Benjamin J. Shannon, Kuldip K. Paliwal |
| 2004 | MICot: a tool for multimodal input data collection. Raymond H. Lee, Anurag Kumar Gupta |
| 2004 | MLLR adaptation for hidden semi-Markov model based speech synthesis. Junichi Yamagishi, Takashi Masuko, Takao Kobayashi |
| 2004 | MS connect: a fully featured auto-attendant: system design, implementation and performance. David Ollason, Yun-Cheng Ju, Siddharth Bhatia, Daniel Herron, Jackie Liu |
| 2004 | Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions. Antoine Raux, Rita Singh |
| 2004 | Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition. Hyung Bae Jeon, Dong Kook Kim |
| 2004 | Maximum entropy direct model as a unified model for acoustic modeling in speech recognition. Hong-Kwang Jeff Kuo, Yuqing Gao |
| 2004 | Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants. Toshiko Isei-Jaakkola |
| 2004 | Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs. Haibin Liu, Zhenyang Wu |
| 2004 | Measuring convergence in language model estimation using relative entropy. Abhinav Sethy, Shrikanth S. Narayanan, Bhuvana Ramabhadran |
| 2004 | Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks. Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo |
| 2004 | Memory and computation reduction for embedded ASR systems. Sangbae Jeong, Icksang Han, Eugene Jon, Jeongsu Kim |
| 2004 | Memory efficient decoding graph compilation with wide cross-word acoustic context. Miroslav Novak, Vladimír Bergl |
| 2004 | Methods for task adaptation of acoustic models with limited transcribed in-domain data. Enrico Bocchieri, Michael Riley, Murat Saraclar |
| 2004 | Minimum phase compensation in speech coding using Hammerstein model. Jari Juhani Turunen, Juha T. Tanttu, Frank Cameron |
| 2004 | Mining customer care dialogs for "daily news". Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert M. Bell, Mazin G. Rahim, Deborah F. Swayne, Chris Volinsky |
| 2004 | Mining of association patterns for language modeling. Jen-Tzung Chien, Hung-Ying Chen |
| 2004 | Mis-recognized utterance detection using hierarchical language model. Hirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka |
| 2004 | Mixture Gaussian model training against impostor model parameters: an application to speaker identification. T. V. Sreenivas, Sameer Badaskar |
| 2004 | Mixture language models for call routing. Qiang Huang, Stephen J. Cox |
| 2004 | Model composition by Lagrange polynomial approximation for robust speech recognition in noisy environment. Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama |
| 2004 | Model quality evaluation during enrolment for speaker verification. Javier R. Saeta, Javier Hernando |
| 2004 | Model-based sequential organization for cochannel speaker identification. Yang Shao, DeLiang Wang |
| 2004 | Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models. Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole |
| 2004 | Modeling audio-visual speech perception: back on fusion architectures and fusion control. Jean-Luc Schwartz, Marie-Agnès Cathiard |
| 2004 | Modeling auxiliary features in tandem systems. Mathew Magimai-Doss, Shajith Ikbal, Todd A. Stephenson, Hervé Bourlard |
| 2004 | Modeling data entry rates for ASR and alternative input methods. Roger K. Moore |
| 2004 | Modeling generic dialog applications for embedded systems. Gerhard Hanrieder, Stefan W. Hamerich |
| 2004 | Modeling phones coarticulation effects in a neural network based speech recognition system. Leila Ansary, Seyyed Ali Seyyed Salehi |
| 2004 | Modeling pronunciation variation using artificial neural networks for English spontaneous speech. Ken Chen, Mark Hasegawa-Johnson |
| 2004 | Modelling and ranking of differences across formants of British, Australian and American accents. Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho |
| 2004 | Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speech. Juan L. Navarro-Mesa, Pedro J. Quintana-Morales |
| 2004 | Morphology-based language modeling for Arabic speech recognition. Dimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke |
| 2004 | Multi-codebook vector quantization algorithm for speaker identification. Mohamed Fathy Abu-ElYazeed, Nemat S. Abdel Kader, Mohammed El-Henawy |
| 2004 | Multi-context rules for phonological processing in polyglot TTS synthesis. Harald Romsdorfer, Beat Pfister |
| 2004 | Multi-eigenspace normalization for robust speech recognition in noisy environments. Yoonjae Lee, Hanseok Ko |
| 2004 | Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying. Xiangyu Mu, Shuwu Zhang, Bo Xu |
| 2004 | Multi-mode harmonic transform excitation LPC coding for speech and music. Jong-Hark Kim, Jae-Hyun Shin, Insung Lee |
| 2004 | Multi-pass ASR using vocabulary expansion. Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi |
| 2004 | Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka |
| 2004 | Multi-sample fusion with constrained feature transformation for robust speaker verification. Ming-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung |
| 2004 | Multilayer subword units for open-vocabulary spoken document retrieval. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh |
| 2004 | Multilingual corpora for speech-to-speech translation research. Gen-ichiro Kikui, Toshiyuki Takezawa, Seiichi Yamamoto |
| 2004 | Multilingual e-mail text processing for speech synthesis. Daniela Oria, Akos Vetek |
| 2004 | Multimodal expression for humanoid robots by integration of human speech mimicking and facial color. Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino |
| 2004 | Mutual information based visual feature selection for lipreading. Patricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu |
| 2004 | Mutual-information based segment pre-selection in concatenative text-to-speech. Wei Zhang, Ling Jin, Xijun Ma |
| 2004 | N-gram language modeling of Japanese using bunsetsu boundaries. Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu |
| 2004 | Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition. T. V. Sreenivas, G. V. Kiran, A. G. Krishna |
| 2004 | Neural network language models for conversational speech recognition. Holger Schwenk, Jean-Luc Gauvain |
| 2004 | Neurocognition of speech-specific audiovisual perception. Mikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev |
| 2004 | New background modeling for speaker verification. Dat Tran |
| 2004 | New challenges in usability evaluation - beyond task-oriented spoken dialogue systems. Laila Dybkjær, Niels Ole Bernsen, Wolfgang Minker |
| 2004 | New features based on multiple word graphs for utterance verification. Alberto Sanchís, Alfons Juan, Enrique Vidal |
| 2004 | New harmonicity measures for pitch estimation and voice activity detection. An-Tze Yu, Hsiao-Chuan Wang |
| 2004 | New nonsense syllables database - analyses and preliminary ASR experiments. Petr Fousek, Frantisek Grézl, Hynek Hermansky, Petr Svojanovsky |
| 2004 | Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping. Xuechuan Wang, Douglas D. O'Shaughnessy |
| 2004 | Noise adaptive spoken dialog system based on selection of multiple dialog strategies. Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino |
| 2004 | Noise reduction using hybrid noise estimation technique and post-filtering. Junfeng Li, Masato Akagi |
| 2004 | Noise robust digit recognition using a glottal radar sensor for voicing detection. Cenk Demiroglu, David V. Anderson |
| 2004 | Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs. Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2004 | Noise-robust speaker verification using F0 features. Koji Iwano, Taichi Asami, Sadaoki Furui |
| 2004 | Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone. Panikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2004 | Number of output nodes of artificial neural networks for Korean prosody generation. Kyung-Joong Min, Chan-Goo Kang, Un-Cheon Lim |
| 2004 | Objective wavelet packet features for speaker verification. Mihalis Siafarikas, Todor Ganchev, Nikos Fakotakis |
| 2004 | Of the top of the head: audio-visual speech perception from the nose up. Chris Davis, Jeesun Kim |
| 2004 | On a n-gram model approach for packet loss concealment. Minkyu Lee, Imed Zitouni, Qiru Zhou |
| 2004 | On binary and ratio time-frequency masks for robust speech recognition. Soundararajan Srinivasan, Nicoleta Roman, DeLiang Wang |
| 2004 | On latent semantic language modeling and smoothing. Jen-Tzung Chien, Meng-Sung Wu, Hua-Jui Peng |
| 2004 | On the development of telephone applications: some practical issues and evaluation. Andrea Facco, Daniele Falavigna, Roberto Gretter, Marcello Viganò |
| 2004 | On the integration of speech recognition into personal networks. Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg |
| 2004 | On the time variability of vocal tract for speaker recognition. Samuel Kim, Thomas Eriksson, Hong-Goo Kang |
| 2004 | On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input. Federico Flego, Luca Armani, Maurizio Omologo |
| 2004 | On using MLP features in LVCSR. Qifeng Zhu, Barry Y. Chen, Nelson Morgan, Andreas Stolcke |
| 2004 | On-line incremental adaptation based on reinforcement learning for robust speech recognition. Masafumi Nishida, Yoshitaka Mamiya, Yasuo Horiuchi, Akira Ichikawa |
| 2004 | Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm. Tor André Myrvoll, Satoshi Nakamura |
| 2004 | Optimal acoustic and language model weights for minimizing word verification errors. Frank K. Soong, Wai Kit Lo, Satoshi Nakamura |
| 2004 | Optimizing an engine network that allows dynamic masking. Frédéric Tendeau |
| 2004 | Optimizing boosting with discriminative criteria. Rong Zhang, Alexander I. Rudnicky |
| 2004 | Optimizing regression for in-car speech recognition using multiple distributed microphones. Weifeng Li, Fumitada Itakura, Kazuya Takeda |
| 2004 | OrienTel-Turkish: telephone speech database description and notes on the experience. Tolga Çiloglu, Dinc Acar, Ahmet Tokatli |
| 2004 | PROSPECT features and their application to missing data techniques for robust speech recognition. Hugo Van hamme |
| 2004 | Parallel feature generation based on maximizing normalized acoustic likelihood. Xiang Li, Richard M. Stern |
| 2004 | Parallel tone score association method for tone language speech recognition. William S.-Y. Wang, Gang Peng |
| 2004 | Partially lexicalized parsing model utilizing rich features. So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim |
| 2004 | Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversation. Nick Campbell |
| 2004 | Perception of non-native phonemes in noise. Nicole Cooper, Anne Cutler |
| 2004 | Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS. Francesc Alías, Xavier Llorà, Ignasi Iriondo Sanz, Joan Claudi Socoró, Xavier Sevillano, Lluís Formiga |
| 2004 | Perceptual discrimination of prosodic types and their preliminary acoustic analysis. Masahiko Komatsu, Tsutomu Sugawara, Takayuki Arai |
| 2004 | Perceptual wavelet packet audio coder. Teddy Surya Gunawan, Eliathamby Ambikairajah, Julien Epps |
| 2004 | Performance analysis of transcoding algorithms in packet-loss environments. Sung-Kyo Jung, Hong-Goo Kang, Dae Hee Youn, Chang-Heon Lee |
| 2004 | Performance improvement of connected digit recognition using unsupervised fast speaker adaptation. Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim |
| 2004 | Performance of speech recognition and synthesis in packet-based networks. Sebastian Möller, Jan Felix Krebber, Alexander Raake |
| 2004 | Phase-space representation of speech. Hua Yu |
| 2004 | Phone classification in pseudo-Euclidean vector spaces. Alexander Gutkin, Simon King |
| 2004 | Phoneme restoration in degraded speech communication. Slobodan Jovicic, Sandra Antesevic, Zoran Saric |
| 2004 | Phoneme-based word activation in spoken-word recognition: evidence from Japanese school children. Takashi Otake, Yoko Sakamoto, Yasuyuki Konomi |
| 2004 | Phonemic repertoire and similarity within the vocabulary. Anne Cutler, Dennis Norris, Núria Sebastián-Gallés |
| 2004 | Phonetic confusion based document expansion for spoken document retrieval. Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora |
| 2004 | Phonetic realization of the suffix-suppressed accentual phrase in Korean. Mira Oh, Kee-Ho Kim |
| 2004 | Phonology of exceptions for Korean grapheme-to-phoneme conversion. Sunhee Kim |
| 2004 | Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English. Taehong Cho, James M. McQueen |
| 2004 | Phoxsy: multi-phone segments for unit selection speech synthesis. Stefan Breuer, Julia Abresch |
| 2004 | Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition. Vlasios Doumpiotis, William Byrne |
| 2004 | Poetry assistant. Isabel Trancoso, Paulo Araújo, Céu Viana, Nuno J. Mamede |
| 2004 | Policy analysis framework for conversational biometrics. Upendra V. Chaudhari, Ganesh N. Ramaswamy |
| 2004 | Polynomial regression model for duration prediction in Mandarin. Yu Hu, Ren-Hua Wang, Lu Sun |
| 2004 | Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2. Hui-Ju Hsu, Janice Fon |
| 2004 | Posteriori probabilities and likelihoods combination for speech and speaker recognition. Mohamed Faouzi BenZeghiba, Hervé Bourlard |
| 2004 | Practical use of English pronunciation system for Japanese students in the CALL classroom. Tatsuya Kawahara, Masatake Dantsuji, Yasushi Tsubota |
| 2004 | Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French. Marion Dohen, Hélène Loevenbruck |
| 2004 | Precise phone boundary detection using wavelet packet and recurrent neural networks. Farshad Almasganj |
| 2004 | Predicting word correct rate from acoustic and linguistic confusability. Gies Bouwman, Bert Cranen, Lou Boves |
| 2004 | Prediction of the glottal LF parameters using regression trees. Michelle Tooher, John G. McKenna |
| 2004 | Probabilistic speaker identification with dual penalized logistic regression machine. Tomoko Matsui, Kunio Tanabe |
| 2004 | Procedure "senza vibrato": a key component for morphing singing. Hideki Kawahara, Yumi Hirachi, Masanori Morise, Hideki Banno |
| 2004 | Prolongation in spontaneous Mandarin. Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund |
| 2004 | Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure. Nobuaki Minematsu |
| 2004 | Pronunciation assessment based upon the phonological distortions observed in language learners' utterances. Nobuaki Minematsu |
| 2004 | Pronunciation lexicon adaptation for TTS voice building. Yeon-Jun Kim, Ann K. Syrdal, Alistair Conkie |
| 2004 | Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition. Kyong-Nim Lee, Minhwa Chung |
| 2004 | Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis. Enrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino |
| 2004 | Prosodic characteristics of Czech contrastive topic. Katerina Vesela, Nino Peterek, Eva Hajicová |
| 2004 | Prosody based attitude recognition with feature selection and its application to spoken dialog system as para-linguistic information. Shinya Fujie, Tetsunori Kobayashi, Daizo Yagi, Hideaki Kikuchi |
| 2004 | Question-answering in webtalk: an evaluation study. Junlan Feng, Srinivas Bangalore, Mazin G. Rahim |
| 2004 | Rapid EM training based on model-integration. Shinichi Yoshizawa, Kiyohiro Shikano |
| 2004 | Rapid acoustic model development using Gaussian mixture clustering and language adaptation. Nikos Chatzichrisafis, Vassilios Digalakis, Vassilios Diakoloukas, Costas Harizakis |
| 2004 | Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments. Juan M. Huerta, Etienne Marcheret, Sreeram Balakrishnan |
| 2004 | Real-time speaker identification. Pasi Fränti, Evgeny Karpov, Tomi Kinnunen |
| 2004 | Recent improvements on ARTIC: Czech text-to-speech system. Jindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl |
| 2004 | Recent progress of open-source LVCSR engine Julius and Japanese model repository. Tatsuya Kawahara, Akinobu Lee, Kazuya Takeda, Katsunobu Itou, Kiyohiro Shikano |
| 2004 | Recognition of read and spontaneous children's speech using two new corpora. Martin J. Russell, Shona D'Arcy, Lit Ping Wong |
| 2004 | Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robot. Naoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa |
| 2004 | Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system. Wael Hamza, Ellen Eide, Raimo Bakis |
| 2004 | Reconstruction filter design for bone-conducted speech. Toshiki Tamiya, Tetsuya Shimamura |
| 2004 | Reference marking in children's computer-directed speech: an integrated analysis of discourse and gestures. Simona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth S. Narayanan |
| 2004 | Restructuring HMM states for speaker adaptation in Mandarin speech recognition. Xianghua Xu, Qiang Guo, Jie Zhu |
| 2004 | Revisiting dysarthria assessment intelligibility metrics. Phil D. Green, James Carmichael |
| 2004 | Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context. Panji Setiawan, Sorel Stan, Tim Fingscheidt |
| 2004 | Rhythm in read British English: interdialect variability. Emmanuel Ferragne, François Pellegrino |
| 2004 | Robot motion control using listener's back-channels and head gesture information. Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi |
| 2004 | Robust ASR model adaptation by feature-based statistical data mapping. Xuechuan Wang, Douglas D. O'Shaughnessy |
| 2004 | Robust and adaptive architecture for multilingual spoken dialogue systems. Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen |
| 2004 | Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environments. Zili Li, Hesham Tolba, Douglas D. O'Shaughnessy |
| 2004 | Robust dependency parsing of spontaneous Japanese speech and its evaluation. Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki |
| 2004 | Robust distant speech recognition based on position dependent CMN. Norihide Kitaoka, Longbiao Wang, Seiichi Nakagawa |
| 2004 | Robust speaker identification based on perceptual log area ratio and Gaussian mixture models. David Chow, Waleed H. Abdulla |
| 2004 | Robust speech recognition based on HMM composition and modified Wiener filter. Sumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa |
| 2004 | Robust speech recognition in client-server scenarios. Richard C. Rose, Hong Kook Kim |
| 2004 | Robust speech recognition over packet networks: an overview. Naveen Srinivasamurthy, Kyu Jeong Han, Shrikanth S. Narayanan |
| 2004 | Robust speech recognition using data-driven temporal filters based on independent component analysis. Junhui Zhao, Jingming Kuang, Xiang Xie |
| 2004 | Robust speech recognition with spectral subtraction in low SNR. Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2004 | Robust verification of recognized words in noise. Wai Kit Lo, Frank K. Soong, Satoshi Nakamura |
| 2004 | Robustness aspects of active learning for acoustic modeling. Gerard G. L. Meyer, Teresa M. Kamm |
| 2004 | Role of segmental and suprasegmental cues in the perception of Maghrebian-accented French. Belynda Brahimi, Philippe Boula de Mareüil, Cédric Gendrot |
| 2004 | SVM kernel adaptation in speaker classification and verification. Purdy Ho, Pedro J. Moreno |
| 2004 | SVM modeling of "SNERF-grams" for speaker recognition. Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin S. Kajarekar |
| 2004 | Scalable distributed speech recognition using multi-frame GMM-based block quantization. Kuldip K. Paliwal, Stephen So |
| 2004 | Scoring and direct methods for the interpretation of evidence in forensic speaker recognition. Anil Alexander, Andrzej Drygajlo |
| 2004 | Scoring unknown speaker clustering: VB vs. BIC. Fabio Valente, Christian Wellekens |
| 2004 | Segmental differences in the visual contribution to speech intelligibility. Kuniko Y. Nielsen |
| 2004 | Segmental speech coding model for storage applications. Anssi Rämö, Jani Nurminen, Sakari Himanen, Ari Heikkinen |
| 2004 | Segmentation and relevance measure for speaker verification. Jérôme Louradour, Régine André-Obrecht, Khalid Daoudi |
| 2004 | Segmenting ambiguous phrases using phoneme duration. Keren B. Shatzman |
| 2004 | Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-forming. Changkyu Choi, Donggeon Kong, Hyoung-Ki Lee, Sang Min Yoon |
| 2004 | Shaping spoken input in user-initiative systems. Stefanie Tomko, Roni Rosenfeld |
| 2004 | Side effect free dialogue management in a voice enabled procedure browser. Manny Rayner, Beth Ann Hockey |
| 2004 | Signaling and detecting uncertainty in audiovisual speech by children and adults. Emiel Krahmer, Marc Swerts |
| 2004 | Simulating multimodal applications. Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Lévy |
| 2004 | Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation. Hyung Soon Kim, Hwa Jeon Song |
| 2004 | Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensor. Rongqiang Hu, David V. Anderson |
| 2004 | Soft features for improved distributed speech recognition over wireless networks. Reinhold Haeb-Umbach, Valentin Ion |
| 2004 | Some articulatory measurements of real sadness. Donna Erickson, Caroline Menezes, Akinori Fujino |
| 2004 | Sound source localization based on zero-crossing peak-amplitude coding. Young-Ik Kim, Rhee Man Kil |
| 2004 | Source separation using particle filters. Mital Gandhi, Mark Hasegawa-Johnson |
| 2004 | Source-filter separation for articulation-to-speech synthesis. Yoshinori Shiga, Simon King |
| 2004 | Speaker adaptation method for CALL system using bilingual speakers' utterances. Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino |
| 2004 | Speaker adaptation of a three-dimensional tongue model. Olov Engwall |
| 2004 | Speaker clustering of speech utterances using a voice characteristic reference space. Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang |
| 2004 | Speaker dependent model order selection of spectral envelopes. Matthias Wölfel |
| 2004 | Speaker diarization from speech transcripts. Lori Lamel, Jean-Luc Gauvain, Leonardo Canseco-Rodriguez |
| 2004 | Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs. Michael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier |
| 2004 | Speaker identification using probabilistic PCA model selection. Jen-Tzung Chien, Chuan-Wei Ting |
| 2004 | Speaker indexing in audio archives using test utterance Gaussian mixture modeling. Hagai Aronowitz, David Burshtein, Amihood Amir |
| 2004 | Speaker model quantization for unsupervised speaker indexing. Soonil Kwon, Shrikanth S. Narayanan |
| 2004 | Speaker normalization through constrained MLLR based transforms. Diego Giuliani, Matteo Gerosa, Fabio Brugnara |
| 2004 | Speaker segmentation and clustering in meetings. Qin Jin, Tanja Schultz |
| 2004 | Speaker-and-environment change detection in broadcast news using the common component GMM-based divergence measure. Yih-Ru Wang, Chi-Han Huang |
| 2004 | Spectral characteristics of the release bursts in Korean alveolar stops. Hansang Park |
| 2004 | Spectral moment vs. bark cepstral analysis of children's word-initial voiceless stops. H. Timothy Bunnell, James B. Polikoff, Jane McNicholas |
| 2004 | Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition. Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg |
| 2004 | Spectro-temporal activity pattern (STAP) features for noise robust ASR. Shajith Ikbal, Mathew Magimai-Doss, Hemant Misra, Hervé Bourlard |
| 2004 | Speech act identification using an ontology-based partial pattern tree. Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen |
| 2004 | Speech coding using trajectory compression and multiple sensors. Sorin Dusan, James L. Flanagan, Amod Karve, Mridul Balaraman |
| 2004 | Speech enhanced multi-Span language model. A. Nayeemulla Khan, B. Yegnanarayana |
| 2004 | Speech enhancement and recognition by integrating adaptive beamforming and Wiener filtering. Alberto Abad, Javier Hernando |
| 2004 | Speech enhancement based on magnitude estimation using the gamma prior. Weifeng Li, Kazuya Takeda, Fumitada Itakura, Tran Huy Dat |
| 2004 | Speech enhancement based on smoothing of spectral noise floor. Hyoung-Gook Kim, Thomas Sikora |
| 2004 | Speech enhancement using adaptive time-domain segmentation. Sriram Srinivasan, W. Bastiaan Kleijn |
| 2004 | Speech input and output module assessment for remote access to a smart-home spoken dialog system. Jan Felix Krebber, Sebastian Möller, Alexander Raake |
| 2004 | Speech intention understanding based on decision tree learning. Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki |
| 2004 | Speech interaction system - how to increase its usability? Fang Chen |
| 2004 | Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary. Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa |
| 2004 | Speech probability distribution based on generalized gamma distribution. Jong Won Shin, Joon-Hyuk Chang, Nam Soo Kim |
| 2004 | Speech production based on lossy tube models: unit concatenation and sound transitions. Karl Schnell, Arild Lacroix |
| 2004 | Speech quality estimation using Gaussian mixture models. Tiago H. Falk, Wai-Yip Chan, Peter Kabal |
| 2004 | Speech recognition error analysis on the English MALACH corpus. Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig |
| 2004 | Speech recognition error correction using maximum entropy language model. Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee |
| 2004 | Speech recognition experiments with the SPEECON database using several robust front-ends. Pere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho |
| 2004 | Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models. Tobias Cincarek, Rainer Gruhn, Satoshi Nakamura |
| 2004 | Speech recognition system robust to noise and speaking styles. Shigeki Matsuda, Takatoshi Jitsuhiro, Konstantin Markov, Satoshi Nakamura |
| 2004 | Speech recognition using motion based lipreading. Maria José Sanchez Martinez, Juan Pablo de la Cruz Gutiérrez |
| 2004 | Speech recognition using synchronization between speech and finger tapping. Hiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Fumitada Itakura, Kazuya Takeda |
| 2004 | Speech recognition, syllabification and statistical phonetics. Melvyn John Hunt |
| 2004 | Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations. Masataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi |
| 2004 | Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches. Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, François Pellegrino |
| 2004 | Speech translation: past, present and future. Alex Waibel |
| 2004 | Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system. Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi |
| 2004 | Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA. Brian Mak, Simon Ka-Lung Ho, James T. Kwok |
| 2004 | Spoken language interface in ECMA/ISO telecommunication standards. Kuansan Wang |
| 2004 | Spokenquery: an alternate approach to choosing items with speech. Peter Wolf, Joseph Woelfel, Jan C. van Gemert, Bhiksha Raj, David Wong |
| 2004 | Spontaneous speech recognition using a massively parallel decoder. Takahiro Shinozaki, Sadaoki Furui |
| 2004 | Spread of high tone in Akita Japanese. Kenji Yoshida |
| 2004 | Statistical Chinese spoken document retrieval using latent topical information. Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang |
| 2004 | Statistical corpus-based speech segmentation. Vincent Pollet, Geert Coorman |
| 2004 | Statistical feature language model. Salma Jamoussi, David Langlois, Jean Paul Haton, Kamel Smaïli |
| 2004 | Statistical machine translation and its challenges. Hermann Ney |
| 2004 | Statistical model migration in speaker recognition. Jirí Navrátil, Ganesh N. Ramaswamy, Ran D. Zilca |
| 2004 | Statistics-based direction finding for training vowels. Cheolwoo Jo, Ilsuh Bak |
| 2004 | Stochastic gradient adaptation of front-end parameters. Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel |
| 2004 | Stop consonant classification by dynamic formant trajectory. Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys |
| 2004 | Strategies for optimizing a stochastic spoken natural language parser. Wolfgang Minker, Dirk Bühler, Christiane Beuschel |
| 2004 | Strategies to reduce design time in multimodal/multilingual dialog applications. Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, José Manuel Pardo |
| 2004 | Structured interview-based evaluation of spoken multimodal conversation with H.C. Andersen. Niels Ole Bernsen, Laila Dybkjær |
| 2004 | Structuring of baseball live games based on speech recognition using task dependent knowledge. Atsushi Sako, Yasuo Ariki |
| 2004 | Study on emotional speech features in Korean with its application to voice color conversion. Sang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn |
| 2004 | Subjective evaluation of join cost functions used in unit selection speech synthesis. Jithendra Vepa, Simon King |
| 2004 | Subjective evaluation of spoken dialogue systems using SERVQUAL method. Mikko Hartikainen, Esa-Pekka Salonen, Markku Turunen |
| 2004 | Subtopic segmentation in the lecture speech. Noboru Kanedera, Asuka Sumida, Takao Ikehata, Tetsuo Funada |
| 2004 | Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASR. Louis ten Bosch, Lou Boves |
| 2004 | Syllable-based probabilistic morphological analysis model of Korean. Do-Gil Lee, Hae-Chang Rim |
| 2004 | Synchronization of speaker selection for centralized tandem free VoIP conferencing. Peter Kabal, Colm Elliott |
| 2004 | Synthesis of vowels and tones in Thai language by articulatory modeling. Thanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas |
| 2004 | Synthesizing speech from speech recognition parameters. Kris Demuynck, Oscar Garcia, Dirk Van Compernolle |
| 2004 | TRAP based features for LVCSR of meeting data. Frantisek Grézl, Martin Karafiát, Jan Cernocký |
| 2004 | Target practice on talking faces. Adriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer |
| 2004 | Task adaptation of acoustic and language models based on large quantities of data. Karthik Visweswariah, Ramesh A. Gopinath, Vaibhava Goel |
| 2004 | Task-specific minimum Bayes-risk decoding using learned edit distance. Izhak Shafran, William Byrne |
| 2004 | Temporal normalization techniques for transform-type speech coding and application to split-band wideband coders. Kyung-Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn |
| 2004 | Temporal variables in parkinsonian speech. Danielle Duez |
| 2004 | Text independent speaker recognition using speaker dependent word spotting. Hagai Aronowitz, David Burshtein, Amihood Amir |
| 2004 | The GEMINI platform: semi-automatic generation of dialogue applications. Stefan W. Hamerich, Volker Schless, Basilis Kladis, Volker Schubert, Otilia Kocsis, Stefan Igel, Ricardo de Córdoba, Luis Fernando D'Haro, José Manuel Pardo |
| 2004 | The IBM expressive speech synthesis system. Wael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John F. Pitrelli |
| 2004 | The ICSI-SRI-UW metadata extraction system. Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary P. Harper, Yang Liu |
| 2004 | The MIT finite-state transducer toolkit for speech and language processing. I. Lee Hetherington |
| 2004 | The audio-video Australian English speech data corpus AVOZES. J. Bruce Millar, Roland Goecke |
| 2004 | The automatic news transcription system: ANTS, some real time experiments. Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina |
| 2004 | The development of anticipatory labial coarticulation in French: a pioneering study. Aude Noiray, Lucie Ménard, Marie-Agnès Cathiard, Christian Abry, Christophe Savariaux |
| 2004 | The duration of pitch transition phase and its relative factors. Ziyu Xiong, Juanwen Chen |
| 2004 | The effect of intonation on perception of Cantonese lexical tones. Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma |
| 2004 | The efficient generation of pronunciation dictionaries: human factors during bootstrapping. Marelie H. Davel, Etienne Barnard |
| 2004 | The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping. Marelie H. Davel, Etienne Barnard |
| 2004 | The influence of target size and distance on the production of speech and gesture in multimodal referring expressions. Ielka van der Sluis, Emiel Krahmer |
| 2004 | The modified group delay feature: a new spectral representation of speech. Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde |
| 2004 | The role of pitch range variation in the discourse structure and intonation structure of Korean. Eunjong Kong |
| 2004 | The role of prosodic cues in word segmentation of Korean. Sahyang Kim |
| 2004 | The stochastic weighted Viterbi algorithm: a framework to compensate additive noise and low-bit rate coding distortion. Néstor Becerra Yoma, Ivan Brito, Carlos Molina |
| 2004 | The superior effectiveness of the F0 range for identifying the context from sounds without phonemes. Yasuko Nagasaki, Takanori Komatsu |
| 2004 | The use of typical sequences for robust speaker identification. Mohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel |
| 2004 | The voice-logbook: integrating human factors for a chronic care system. Lesley-Ann Black, Norman D. Black, Roy Harper, Michelle Lemon, Michael F. McTear |
| 2004 | Theory and data in spoken language assessment. Jared Bernstein, Isabella Barbier, Elizabeth Rosenfeld, John H. A. L. de Jong |
| 2004 | Theory for speaker recognition over IP. Thomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee |
| 2004 | Three-way system-user-expert interactions help you expand the capabilities of an existing spoken dialogue system. Gregory Aist |
| 2004 | Throat microphone signal for speaker recognition. Bayya Yegnanarayana, A. Shahina, M. R. Kesheorey |
| 2004 | Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recovering. Claudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza |
| 2004 | Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition. Christian Fügen, Hartwig Holzapfel, Alex Waibel |
| 2004 | Time-frequency analysis of vocal source signal for speaker recognition. Nengheng Zheng, P. C. Ching, Tan Lee |
| 2004 | Time delay estimation using weighted CPSP function. Hong-Seok Kwon, Siho Kim, Keun-Sung Bae |
| 2004 | Time-scaling of speech using independent subspace analysis. R. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik |
| 2004 | Tone information as a confidence measure for improving Cantonese LVCSR. Yao Qian, Tan Lee, Frank K. Soong |
| 2004 | Topic classification and verification modeling for out-of-domain utterance detection. Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura |
| 2004 | Topic structure extraction for meeting indexing. Katsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga |
| 2004 | Towards a grammar of spoken language - prosody of ill-formed utterances and listener's understanding in discourse -. Miyoko Sugito |
| 2004 | Towards a harmonious coexistence of spoken and written language. Hyun-Bok Lee |
| 2004 | Towards a new level of annotation detail of multilingual speech corpora. Anja Geumann |
| 2004 | Towards automatic word segmentation of dialect speech. Eric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik |
| 2004 | Towards better understanding of the model implied by the use of dynamic features in HMMs. John Scott Bridle |
| 2004 | Towards large vocabulary ASR on embedded platforms. Miroslav Novak |
| 2004 | Towards ubiquitous task management. Porfírio P. Filipe, Nuno J. Mamede |
| 2004 | Towards understanding mixed-initiative in task-oriented dialogues. Fan Yang, Peter A. Heeman, Kristy Hollingshead |
| 2004 | Transcription of Arabic broadcast news. Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain |
| 2004 | Transformation and combination of hidden Markov models for speaker selection training. Chao Huang, Tao Chen, Eric Chang |
| 2004 | Transformation-based error correction for speech-to-text systems. Jochen Peters, Christina Drexel |
| 2004 | Translingual grammar induction. John Lee, Stephanie Seneff |
| 2004 | Triphone-based confidence system for speaker identification. Aaron D. Lawson, Mark C. Huggins |
| 2004 | Two-way speech-to-speech translation on handheld devices. Bowen Zhou, Daniel Déchelotte, Yuqing Gao |
| 2004 | Unified language modeling using finite-state transducers with first applications. Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz |
| 2004 | Unscented Kalman filtering of line spectral frequencies. Andrew Errity, John McKenna, Stephen Isard |
| 2004 | Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognition. Yh-Her Yang, Yuan-Fu Liao |
| 2004 | Unsupervised language model adaptation methods for spontaneous speech. Luc Lussier, Edward W. D. Whittaker, Sadaoki Furui |
| 2004 | Unsupervised learning from users' error correction in speech dictation. Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng |
| 2004 | Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems. Tomohiro Watanabe, Hiromitsu Nishizaki, Takehito Utsuro, Seiichi Nakagawa |
| 2004 | Unsupervised topic adaptation for lecture speech retrieval. Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba |
| 2004 | Usability considerations of speech-to-speech translation system. Youngjik Lee, Jun Park, Seung-Shin Oh |
| 2004 | Use of formants in stressed and unstressed continuous speech recognition. Davood Gharavian, Seyed Mohammad Ahadi |
| 2004 | Use of metadata to improve recognition of spontaneous speech and named entities. Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig |
| 2004 | Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speech. Li Deng, Roberto Togneri |
| 2004 | Use of prosodic features for speech recognition. Keikichi Hirose, Nobuaki Minematsu |
| 2004 | Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of English. Midori Iba, Anke Sennema, Valérie Hazan, Andrew Faulkner |
| 2004 | Using RASTA in task independent TANDEM feature extraction. Guillermo Aradilla, John Dines, Sunil Sivadas |
| 2004 | Using VTLN for broadcast news transcription. Do Yeong Kim, Srinivasan Umesh, Mark J. F. Gales, Thomas Hain, Philip C. Woodland |
| 2004 | Using a depth-restricted search to reduce delays in unit selection. Nobuyuki Nishizawa, Hisashi Kawai |
| 2004 | Using computer simulation to compare two models of mixed-initiative. Fan Yang, Peter A. Heeman |
| 2004 | Using context to correct phone recognition errors. Stephen Cox |
| 2004 | Using linear interpolation to improve histogram equalization for speech recognition. Filip Korkmazsky, Dominique Fohr, Irina Illina |
| 2004 | Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection. Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper |
| 2004 | Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework. Yu Zheng, Gary Geunbae Lee, Byeongchang Kim |
| 2004 | Using part-of-speech for predicting phrase breaks. Ian Read, Stephen Cox |
| 2004 | Using quick transcriptions to improve conversational speech models. Owen Kimball, Chia-Lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul |
| 2004 | Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants. Satanjeev Banerjee, Alexander I. Rudnicky |
| 2004 | Using word lattice information for a tighter coupling in speech translation systems. Tanja Schultz, Szu-Chen Stan Jou, Stephan Vogel, Shirin Saleem |
| 2004 | Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs. Jan Nouza, Dana Nejedlová, Jindrich Zdánský, Jan Kolorenc |
| 2004 | Video-realistic synthetic speech with a parametric visual speech synthesizer. Sascha Fagel |
| 2004 | Visual recalibration of auditory speech versus selective speech adaptation: different build-up courses. Jean Vroomen, Sabine van Linden, Béatrice de Gelder, Paul Bertelson |
| 2004 | Visualizing dynamic features of expressions in speech. Peter Robinson, Tal Sobol Shikler |
| 2004 | Vocabulary and language model adaptation using information retrieval. Brigitte Bigi, Yan Huang, Renato De Mori |
| 2004 | Vocal tract normalization based on spectral warping. Wei Wang, Stephen A. Zahorian |
| 2004 | Voice activation using prosodic features. Marco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann |
| 2004 | Voice activity detection using global soft decision with mixture of Gaussian model. Kiyoung Park, Changkyu Choi, Jeongsu Kim |
| 2004 | Voice conversion for unknown speakers. Hui Ye, Steve J. Young |
| 2004 | Voice enhancement of male speakers with laryngeal neoplasm. Gernot Kubin, Martin Hagmüller |
| 2004 | Voice portal services in packet network and VoIP environment. Wu Chou, Feng Liu |
| 2004 | Voicebuilder: a framework for automatic speech application development. Miguel Angel Rodriguez-Moreno, Heriberto Cuayáhuitl, Juventino Montiel-Hernández |
| 2004 | Weighting observation vectors for robust speech recognition in noisy environments. Zhenyu Xiong, Thomas Fang Zheng, Wenhu Wu |
| 2004 | What concept-to-speech can gain for prosody. Markus Schnell, Rüdiger Hoffmann |
| 2004 | What makes a non-native accent?: a study of Korean English. Jong-mi Kim, Suzanne Flynn |
| 2004 | Why speech recognizers make errors? a robustness view. Hong Kook Kim, Mazin G. Rahim |
| 2004 | Word confusability prediction in automatic speech recognition. Jan Anguita, Stéphane Peillon, Javier Hernando, Alexandre Bramoulle |
| 2004 | Word n-gram probability estimation from a Japanese raw corpus. Shinsuke Mori, Daisuke Takuma |
| 2004 | Worldwide ongoing activities on multilingual speech to speech translation. Gianni Lazzari, Alex Waibel, Chengqing Zong |
| 2004 | XML representation languages as a way of interconnecting TTS modules. Marc Schröder, Stefan Breuer |
| 2004 | Zeros of z-transform (ZZT) decomposition of speech for source-tract separation. Boris Doval, Baris Bozkurt, Christophe d'Alessandro, Thierry Dutoit |