| 2009 | 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, Brighton, United Kingdom, September 6-10, 2009 |
| 2009 | 2-D processing of speech for multi-pitch analysis. Tianyu T. Wang, Thomas F. Quatieri |
| 2009 | A Bayesian approach to Hidden Semi-Markov Model based speech synthesis. Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda |
| 2009 | A Bayesian approach to non-intrusive quality assessment of speech. Petko Nikolov Petkov, Iman S. Mossavat, W. Bastiaan Kleijn |
| 2009 | A Policy-switching learning approach for adaptive spoken dialogue agents. Heriberto Cuayáhuitl, Juventino Montiel-Hernández |
| 2009 | A WFST-based log-linear framework for speaking-style transformation. Graham Neubig, Shinsuke Mori, Tatsuya Kawahara |
| 2009 | A back-off discriminative acoustic model for automatic speech recognition. Hung-An Chang, James R. Glass |
| 2009 | A close look into the probabilistic concatenation model for corpus-based speech synthesis. Shinsuke Sakai, Ranniery Maia, Hisashi Kawai, Satoshi Nakamura |
| 2009 | A closer look at quality judgments of spoken dialog systems. Klaus-Peter Engelbrecht, Felix Hartard, Florian Gödde, Sebastian Möller |
| 2009 | A comparison of audio-free speech recognition error prediction methods. Preethi Jyothi, Eric Fosler-Lussier |
| 2009 | A comparison of linear and nonlinear dimensionality reduction methods applied to synthetic speech. Andrew Errity, John McKenna |
| 2009 | A comparison of query-by-example methods for spoken term detection. Wade Shen, Christopher M. White, Timothy J. Hazen |
| 2009 | A correlation-maximization denoising filter used as an enhancement frontend for noise robust bird call classification. Wei Chu, Abeer Alwan |
| 2009 | A data-driven approach for estimating the time-frequency binary mask. Gibak Kim, Philipos C. Loizou |
| 2009 | A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis. Ranniery Maia, Tomoki Toda, Keiichi Tokuda, Shinsuke Sakai, Satoshi Nakamura |
| 2009 | A detailed study of word-position effects on emotion expression in speech. Jangwon Kim, Sungbok Lee, Shrikanth S. Narayanan |
| 2009 | A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis. Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit |
| 2009 | A fast online algorithm for large margin training of continuous density hidden Markov models. Chih-Chieh Cheng, Fei Sha, Lawrence K. Saul |
| 2009 | A framework for discriminative SVM/GMM systems for language recognition. William M. Campbell, Zahi N. Karam |
| 2009 | A framework for rapid development of conversational natural language call routing systems for call centers. Ea-Ee Jan, Hong-Kwang Kuo, Osamuyimen Stewart, David M. Lubensky |
| 2009 | A fully data parallel WFST-based large vocabulary continuous speech recognition on a graphics processing unit. Jike Chong, Ekaterina Gonina, Youngmin Yi, Kurt Keutzer |
| 2009 | A fundamental study of shouted speech for acoustic-based security system. Hiroaki Nanjo, Hiroki Mikami, Hiroshi Kawano, Takanobu Nishiura |
| 2009 | A general-purpose 32 ms prosodic vector for hidden Markov modeling. Kornel Laskowski, Mattias Heldner, Jens Edlund |
| 2009 | A generalized composition algorithm for weighted finite-state transducers. Cyril Allauzen, Michael Riley, Johan Schalkwyk |
| 2009 | A human benchmark for language recognition. Rosemary Orr, David A. van Leeuwen |
| 2009 | A language-independent feature set for the automatic evaluation of prosody. Andreas K. Maier, Florian Hönig, Viktor Zeißler, Anton Batliner, Erik Körner, Nobuyuki Yamanaka, Peter Ackermann, Elmar Nöth |
| 2009 | A large Greek-English dictionary with incorporated speech and language processing tools. Dimitrios P. Lyras, George K. Kokkinakis, Alexandros Lazaridis, Kyriakos N. Sgarbas, Nikos Fakotakis |
| 2009 | A media-specific FEC based on Huffman coding for distributed speech recognition. Young Han Lee, Hong Kook Kim |
| 2009 | A microphone-independent visualization technique for speech disorders. Andreas K. Maier, Stefan Wenhardt, Tino Haderlein, Maria Schuster, Elmar Nöth |
| 2009 | A minimum V/U error approach to F0 generation in HMM-based TTS. Yao Qian, Frank K. Soong, Miaomiao Wang, Zhizheng Wu |
| 2009 | A multi-level context-dependent prosodic model applied to durational modeling. Nicolas Obin, Xavier Rodet, Anne Lacheret-Dujour |
| 2009 | A new quality measure for topic segmentation of text and speech. Mehryar Mohri, Pedro J. Moreno, Eugene Weinstein |
| 2009 | A noise robust method for pattern discovery in quantized time series: the concept matrix approach. Okko Johannes Räsänen, Unto Kalervo Laine, Toomas Altosaar |
| 2009 | A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition. Vikramjit Mitra, Bengt J. Borgstrom, Carol Y. Espy-Wilson, Abeer Alwan |
| 2009 | A non-intrusive signal-based model for speech quality evaluation using automatic classification of background noises. Adrien Leman, Julien Faure, Etienne Parizet |
| 2009 | A novel approach to cost weighting in unit selection TTS. Jerome R. Bellegarda |
| 2009 | A novel codebook search technique for estimating the open quotient. Yen-Liang Shue, Jody Kreiman, Abeer Alwan |
| 2009 | A novel method for epoch extraction from speech signals. Lakshmish Kaushik, Douglas D. O'Shaughnessy |
| 2009 | A novel model-based pitch conversion method for Mandarin speech. Hsin-Te Hwang, Chen-Yu Chiang, Po-Yi Sung, Sin-Horng Chen |
| 2009 | A novel technique for voice conversion based on style and content decomposition with bilinear models. Victor Popa, Jani Nurminen, Moncef Gabbouj |
| 2009 | A one-step tone recognition approach using MSD-HMM for continuous speech. Changliang Liu, Fengpei Ge, Fuping Pan, Bin Dong, Yonghong Yan |
| 2009 | A parallel training algorithm for hierarchical Pitman-Yor process language models. Songfang Huang, Steve Renals |
| 2009 | A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English. Ioana Vasilescu, Martine Adda-Decker, Lori Lamel, Pierre A. Hallé |
| 2009 | A posterior probability-based system hybridisation and combination for spoken term detection. Javier Tejedor, Dong Wang, Simon King, Joe Frankel, José Colás |
| 2009 | A quantitative study of F0 peak alignment and sentence modality. Hansjörg Mixdorff, Hartmut R. Pfitzinger |
| 2009 | A robust variational method for the acoustic-to-articulatory problem. Blaise Potard, Yves Laprie |
| 2009 | A self-labeling speech corpus: collecting spoken words with an online educational game. Ian McGraw, Alexander Gruenstein, Andrew M. Sutherland |
| 2009 | A semi-blind source separation method with a less amount of computation suitable for tiny DSP modules. Kazunobu Kondo, Makoto Yamada, Hideki Kenmochi |
| 2009 | A semi-supervised version of heteroscedastic linear discriminant analysis. Haolang Zhou, Damianos G. Karakos, Andreas G. Andreou |
| 2009 | A sequential minimization algorithm for finite-state pronunciation lexicon models. Simon Dobrisek, Bostjan Vesnicer, France Mihelic |
| 2009 | A statistical dialog manager for the LUNA project. David Griol, Giuseppe Riccardi, Emilio Sanchis |
| 2009 | A study of bootstrapping with multiple acoustic features for improved automatic speech recognition. Xiaodong Cui, Jian Xue, Bing Xiang, Bowen Zhou |
| 2009 | A study of mutual front-end processing method based on statistical model for noise robust speech recognition. Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani |
| 2009 | A study of new approaches to speaker diarization. Douglas A. Reynolds, Patrick Kenny, Fabio Castaldo |
| 2009 | A study on multiple sound source localization with a distributed microphone system. Kook Cho, Takanobu Nishiura, Yoichi Yamashita |
| 2009 | A study on soft margin estimation of linear regression parameters for speaker adaptation. Shigeki Matsuda, Yu Tsao, Jinyu Li, Satoshi Nakamura, Chin-Hui Lee |
| 2009 | A study on the influence of covariance adaptation on Jacobian compensation in vocal tract length normalization. D. Rama Sanand, Shakti Prasad Rath, Srinivasan Umesh |
| 2009 | A system for detecting miscues in dyslexic read speech. Morten Højfeldt Rasmussen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen |
| 2009 | A user modeling-based performance analysis of a wizarded uncertainty-adaptive dialogue system corpus. Katherine Forbes-Riley, Diane J. Litman |
| 2009 | A voice search approach to replying to SMS messages in automobiles. Yun-Cheng Ju, Tim Paek |
| 2009 | AM-FM estimation for speech based on a time-varying sinusoidal model. Yannis Pantazis, Olivier Rosec, Yannis Stylianou |
| 2009 | ANN based decision fusion for speech emotion recognition. Lu Xu, Mingxing Xu, Dali Yang |
| 2009 | ASR based pronunciation evaluation with automatically generated competing vocabulary. Carlos Molina, Néstor Becerra Yoma, Jorge Wuth, Hiram Vivanco |
| 2009 | ASR corpus design for resource-scarce languages. Etienne Barnard, Marelie H. Davel, Charl Johannes van Heerden |
| 2009 | Accounting for the uncertainty of speech estimates in the complex domain for minimum mean square error speech enhancement. Ramón Fernandez Astudillo, Dorothea Kolossa, Reinhold Orglmeister |
| 2009 | Acoustic and high-speed digital imaging based analysis of pathological voice contributes to better understanding and differential diagnosis of neurological dysphonias and of mimicking phonatory disorders. Krzysztof Izdebski, Yuling Yan, Melda Kunduk |
| 2009 | Acoustic and perceptual effects of vocal training in amateur male singing. Takeshi Saitou, Masataka Goto |
| 2009 | Acoustic characteristics of ejectives in Amharic. Hussien Seid Worku, S. Rajendran, B. Yegnanarayana |
| 2009 | Acoustic class specific VTLN-warping using regression class trees. Shakti Prasad Rath, Srinivasan Umesh |
| 2009 | Acoustic cues of palatalisation in plosive + lateral onset clusters. Daniela Müller, Sidney Martin Mota |
| 2009 | Acoustic emotion recognition using dynamic Bayesian networks and multi-space distributions. Roberto Barra-Chicote, Fernando Fernández Martínez, Syaheerah L. Lutfi, Juan Manuel Lucas-Cuesta, Javier Macías Guarasa, Juan Manuel Montero, Rubén San Segundo, José Manuel Pardo |
| 2009 | Acoustic event detection for spotting "hot spots" in podcasts. Kouhei Sumi, Tatsuya Kawahara, Jun Ogata, Masataka Goto |
| 2009 | Acoustic modeling using exponential families. Vaibhava Goel, Peder A. Olsen |
| 2009 | Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. Atef Ben Youssef, Pierre Badin, Gérard Bailly, Panikos Heracleous |
| 2009 | Adaptation of a predictive model of tongue shapes. Chao Qin, Miguel Á. Carreira-Perpiñán |
| 2009 | Adapting the acoustic model of a speech recognizer for varied proficiency non-native spontaneous speech using read speech with language-specific pronunciation difficulty. Klaus Zechner, Derrick Higgins, René Lawless, Yoko Futagi, Sarah Ohls, George Ivanov |
| 2009 | Adaptive individual background model for speaker verification. Yossi Bar-Yosef, Yuval Bistritz |
| 2009 | Adaptive non-negative matrix factorization in a computational model of language acquisition. Joris Driesen, Louis ten Bosch, Hugo Van hamme |
| 2009 | Adaptive training with noisy constrained maximum likelihood linear regression for noise robust speech recognition. D. K. Kim, Mark J. F. Gales |
| 2009 | Advanced unsupervised joint prosody labeling and modeling for Mandarin speech and its application to prosody generation for TTS. Chen-Yu Chiang, Sin-Horng Chen, Yih-Ru Wang |
| 2009 | Advancements in whisper-island detection within normally phonated audio streams. Chi Zhang, John H. L. Hansen |
| 2009 | Aerodynamics of fricative production in European Portuguese. Cátia M. R. Pinho, Luis M. T. Jesus, Anna Barney |
| 2009 | Age recognition for spoken dialogue systems: do we need it? Maria Klara Wolters, Ravichander Vipperla, Steve Renals |
| 2009 | Age verification using a hybrid speech processing approach. Ron M. Hecht, Omer Hezroni, Amit Manna, Ruth Aloni-Lavi, Gil Dobry, Amir Alfandary, Yaniv Zigel |
| 2009 | Algorithms for speech indexing in Microsoft Recite. Kunal Mukerjee, Shankar L. Regunathan, Jeffrey Cole |
| 2009 | Alleviating the one-to-many mapping problem in voice conversion with context-dependent modeling. Elizabeth Godoy, Olivier Rosec, Thierry Chonavel |
| 2009 | An adaptive BIC approach for robust audio stream segmentation. Janez Zibert, Andrej Brodnik, France Mihelic |
| 2009 | An adaptive threshold computation for unsupervised speaker segmentation. Laura Docío Fernández, Paula Lopez-Otero, Carmen García-Mateo |
| 2009 | An analysis of speech rate strategies in aging. Frits van Brenk, Hayo Terband, Pascal van Lieshout, Anja Lowit, Ben Maassen |
| 2009 | An analytic derivation of a phase-sensitive observation model for noise robust speech recognition. Volker Leutnant, Reinhold Haeb-Umbach |
| 2009 | An articulatory analysis of phonological transfer using real-time MRI. Joseph Tepperman, Erik Bresch, Yoon-Chul Kim, Sungbok Lee, Louis Goldstein, Shrikanth S. Narayanan |
| 2009 | An audio-visual approach to measuring discourse synchrony in multimodal conversation data. Nick Campbell |
| 2009 | An audio-visual attention system for online association learning. Martin Heckmann, Holger Brandl, Xavier Domont, Bram Bolder, Frank Joublin, Christian Goerick |
| 2009 | An evaluation methodology for prosody transformation systems based on chirp signals. Damien Lolive, Nelly Barbot, Olivier Boëffard |
| 2009 | An evaluation of formant tracking methods on an Arabic database. Imen Jemaa, Oussama Rekhis, Kaïs Ouni, Yves Laprie |
| 2009 | An evaluation of objective quality measures for speech intelligibility prediction. Cees H. Taal, Richard C. Hendriks, Richard Heusdens, Jesper Jensen, Ulrik Kjems |
| 2009 | An improved minimum generation error based model adaptation for HMM-based speech synthesis. Yi-Jian Wu, Long Qin, Keiichi Tokuda |
| 2009 | An improved speech segmentation quality measure: the r-value. Okko Johannes Räsänen, Unto Kalervo Laine, Toomas Altosaar |
| 2009 | An indexing weight for voice-to-text search. Chen Liu |
| 2009 | Analysis and recognition of accentual patterns. Agnieszka Wagner |
| 2009 | Analysis and utilization of MLLR speaker adaptation technique for learners' pronunciation evaluation. Dean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose |
| 2009 | Analysis of Lombard speech using excitation source information. G. Bapineedu, B. Avinash, Suryakanth V. Gangashetty, B. Yegnanarayana |
| 2009 | Analysis of band structures for speaker-specific information in FM feature extraction. Tharmarajah Thiruvaran, Eliathamby Ambikairajah, Julien Epps |
| 2009 | Analysis of laugh signals for detecting in continuous speech. K. Sudheer Kumar, Sri Harish Reddy Mallidi, K. Sri Rama Murty, B. Yegnanarayana |
| 2009 | Analysis of low-resource acoustic model self-training. Scott Novotney, Richard M. Schwartz |
| 2009 | Analysis of voice fundamental frequency contours of continuing and terminating prosodic phrases in four Swiss German dialects. Adrian Leemann, Keikichi Hirose, Hiroya Fujisaki |
| 2009 | Analyzing GMMs to characterize resonance anomalies in speakers suffering from apnoea. José Luis Blanco Murillo, Rubén Fernández Pozo, David Díaz Pardo de Vera, Álvaro Sigüenza, Luis A. Hernández Gómez, José Alcázar Ramírez |
| 2009 | Analyzing features for automatic age estimation on cross-sectional data. Werner Spiegl, Georg Stemmer, Eva Lasarcyk, Varada Kolhatkar, Andrew Cassidy, Blaise Potard, Stephen Shum, Young Chol Song, Puyang Xu, Peter Beyerlein, James D. Harnsberger, Elmar Nöth |
| 2009 | Annotating communicative function and semantic content in dialogue act for construction of consulting dialogue systems. Teruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Satoshi Nakamura |
| 2009 | Annotation and features of non-native Mandarin tone quality. Mitchell Peabody, Stephanie Seneff |
| 2009 | Application of differential microphone array for IS-127 EVRC rate determination algorithm. Henry Widjaja, Suryoadhi Wibowo |
| 2009 | Application of noise robust MDT speech recognition on the SPEECON and SpeechDat-Car databases. Jort F. Gemmeke, Yujun Wang, Maarten Van Segbroeck, Bert Cranen, Hugo Van hamme |
| 2009 | Applying non-negative matrix factorization on time-frequency reassignment spectra for missing data mask estimation. Maarten Van Segbroeck, Hugo Van hamme |
| 2009 | Approximate intrinsic Fourier analysis of speech. Frank Tompkins, Patrick J. Wolfe |
| 2009 | Are real tongue movements easier to speech read than synthesized? Olov Engwall, Preben Wik |
| 2009 | Are we 'in sync': turn-taking in collaborative dialogues. Stefan Benus |
| 2009 | Arithmetic coding of sub-band residuals in FDLP speech/audio codec. Petr Motlícek, Sriram Ganapathy, Hynek Hermansky |
| 2009 | Arousal and valence prediction in spontaneous emotional speech: felt versus perceived emotion. Khiet P. Truong, David A. van Leeuwen, Mark A. Neerincx, Franciska M. G. de Jong |
| 2009 | Articulatory feature asynchrony analysis and compensation in detection-based ASR. I-Fan Chen, Hsin-Min Wang |
| 2009 | Articulatory modeling based on semi-polar coordinates and guided PCA technique. Jun Cai, Yves Laprie, Julie Busset, Fabrice Hirsch |
| 2009 | Articulatory phonological code for word classification. Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis Goldstein, Elliot Saltzman |
| 2009 | Artificial nasalization of speech sounds based on pole-zero models of spectral relations between mouth and nose signals. Karl Schnell, Arild Lacroix |
| 2009 | Artificial speech synthesizer control by brain-computer interface. Jonathan S. Brumberg, Philip R. Kennedy, Frank H. Guenther |
| 2009 | Assessing a speaker for fast speech in unit selection speech synthesis. Donata Moers, Petra Wagner |
| 2009 | Assessing context and learning for isiZulu tone recognition. Gina-Anne Levow |
| 2009 | Asynchronous F0 and spectrum modeling for HMM-based speech synthesis. Cheng-Cheng Wang, Zhen-Hua Ling, Li-Rong Dai |
| 2009 | Audio keyword extraction by unsupervised word discovery. Armando Muscariello, Guillaume Gravier, Frédéric Bimbot |
| 2009 | Audio spatialisation strategies for multitasking during teleconferences. Stuart N. Wrigley, Simon Tucker, Guy J. Brown, Steve Whittaker |
| 2009 | Audio-visual prosody of social attitudes in Vietnamese: building and evaluating a tones balanced corpus. Dang-Khoa Mac, Véronique Aubergé, Albert Rilliard, Eric Castelli |
| 2009 | Audio-visual speech asynchrony modeling in a talking head. Alexey Karpov, Liliya Tsirulnik, Zdenek Krnoul, Andrey Ronzhin, Boris Lobanov, Milos Zelezný |
| 2009 | Auditory model based optimization of MFCCs improves automatic speech recognition performance. Saikat Chatterjee, Christos Koniaris, W. Bastiaan Kleijn |
| 2009 | Auto-checking speech transcriptions by multiple template constrained posterior. Lijuan Wang, Shenghao Qin, Frank K. Soong |
| 2009 | Auto-meshing algorithm for acoustic analysis of vocal tract. Kyohei Hayashi, Nobuhiro Miki |
| 2009 | Automated pronunciation scoring using confidence scoring and landmark-based SVM. Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat |
| 2009 | Automatic accent detection: effect of base units and boundary information. Je Hun Jeon, Yang Liu |
| 2009 | Automatic detection and prediction of topic changes through automatic detection of register variations and pause duration. Céline De Looze, Stéphane Rauzy |
| 2009 | Automatic detection of audio advertisements. I. Dan Melamed, Yeon-Jun Kim |
| 2009 | Automatic estimation of decoding parameters using large-margin iterative linear programming. Brian Mak, Tom Ko |
| 2009 | Automatic formant extraction for sociolinguistic analysis of large corpora. Keelan Evanini, Stephen Isard, Mark Y. Liberman |
| 2009 | Automatic intonation classification for speech training systems. György Szaszák, Dávid Sztahó, Klára Vicsi |
| 2009 | Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone lattices. Petr Motlícek |
| 2009 | Automatic syllabification for Danish text-to-speech systems. Jeppe Beck, Daniela Braga, João Nogueira, Miguel Sales Dias, Luís Pinto Coelho |
| 2009 | Automatic topic detection of recorded voice messages. Caroline Clemens, Stefan Feldes, Karlheinz Schuhmacher, Joachim Stegmann |
| 2009 | Automatic transcription system for meetings of the Japanese national congress. Yuya Akita, Masato Mimura, Tatsuya Kawahara |
| 2009 | Automatic vs. human question answering over multimedia meeting recordings. Quoc Anh Le, Andrei Popescu-Belis |
| 2009 | Automatically rating pronunciation through articulatory phonology. Joseph Tepperman, Louis Goldstein, Sungbok Lee, Shrikanth S. Narayanan |
| 2009 | Autoregressive HMMs for speech synthesis. Matt Shannon, William Byrne |
| 2009 | BUT system for NIST 2008 speaker recognition evaluation. Lukás Burget, Michal Fapso, Valiantsina Hubeika, Ondrej Glembek, Martin Karafiát, Marcel Kockmann, Pavel Matejka, Petr Schwarz, Jan Cernocký |
| 2009 | Back-off language model compression. Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat |
| 2009 | Backchannel-inviting cues in task-oriented dialogue. Agustín Gravano, Julia Hirschberg |
| 2009 | Balanced corpus of informal spoken Czech: compilation, design and findings. Martina Waclawicová, Michal Kren, Lucie Válková |
| 2009 | Bark-shift based nonlinear speaker normalization using the second subglottal resonance. Shizhen Wang, Yi-Hui Lee, Abeer Alwan |
| 2009 | Basic speech recognition for spoken dialogues. Charl Johannes van Heerden, Etienne Barnard, Marelie H. Davel |
| 2009 | Bayes risk approximations using time overlap with an application to system combination. Björn Hoffmeister, Ralf Schlüter, Hermann Ney |
| 2009 | Bayesian learning of confidence measure function for generation of utterances and motions in object manipulation dialogue task. Komei Sugiura, Naoto Iwahashi, Hideki Kashioka, Satoshi Nakamura |
| 2009 | Bilinear transformation space-based maximum likelihood linear regression frameworks. Hwa Jeon Song, Yongwon Jeong, Hyung Soon Kim |
| 2009 | Brno University of Technology system for Interspeech 2009 emotion challenge. Marcel Kockmann, Lukás Burget, Jan Cernocký |
| 2009 | CMAC for speech emotion profiling. Norhaslinda Kamaruddin, Abdul Wahab |
| 2009 | CRANDEM: conditional random fields for word recognition. Jeremy Morris, Eric Fosler-Lussier |
| 2009 | Categorical perception of speech without stimulus repetition. Jack C. Rogers, Matthew H. Davis |
| 2009 | Categories and gradience in intonation: evidence from linguistics and neurobiology. Brechtje Post, Francis Nolan, Emmanuel A. Stamatakis, Toby Hudson |
| 2009 | Cepstral analysis of vocal dysperiodicities in disordered connected speech. Ali Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez, P. Murphy |
| 2009 | Cepstral and long-term features for emotion recognition. Pierre Dumouchel, Najim Dehak, Yazid Attabi, Réda Dehak, Narjès Boufaden |
| 2009 | Characteristics of two-dimensional finite difference techniques for vocal tract analysis and voice synthesis. Matt Speed, Damian T. Murphy, David M. Howard |
| 2009 | Characterizing silent and pseudo-silent speech using radar-like sensors. John F. Holzrichter |
| 2009 | Characterizing speaker variability using spectral envelopes of vowel sounds. A. N. Harish, D. Rama Sanand, Srinivasan Umesh |
| 2009 | Classification of disfluent phenomena as fluent communicative devices in specific prosodic contexts. Helena Moniz, Isabel Trancoso, Ana Isabel Mata |
| 2009 | Classification-based strategies for combining multiple 5-W question answering systems. Sibel Yaman, Dilek Hakkani-Tür, Gökhan Tür, Ralph Grishman, Mary P. Harper, Kathleen R. McKeown, Adam Meyers, Kartavya Sharma |
| 2009 | Classifying clear and conversational speech based on acoustic features. Akiko Amano-Kusumoto, John-Paul Hosom, Izhak Shafran |
| 2009 | Classifying turn-level uncertainty using word-level prosody. Diane J. Litman, Mihai Rotaru, Greg Nicholas |
| 2009 | Closely related languages, different ways of realizing focus. Szu-Wei Chen, Bei Wang, Yi Xu |
| 2009 | ClusterRank: a graph based method for meeting summarization. Nikhil Garg, Benoît Favre, Korbinian Riedhammer, Dilek Hakkani-Tür |
| 2009 | Collision threshold pressure before and after vocal loading. Laura Enflo, Johan Sundberg, Friedemann Pabst |
| 2009 | Combination of acoustic and lexical speaker adaptation for disordered speech recognition. Oscar Saz, Eduardo Lleida, Antonio Miguel |
| 2009 | Combined discriminative training for multi-stream HMM-based audio-visual speech recognition. Jing Huang, Karthik Visweswariah |
| 2009 | Combined low level and high level features for out-of-vocabulary word detection. Benjamin Lecouteux, Georges Linarès, Benoît Favre |
| 2009 | Combining semantic and syntactic information sources for 5-W question answering. Sibel Yaman, Dilek Hakkani-Tür, Gökhan Tür |
| 2009 | Combining spectral and prosodic information for emotion recognition in the Interspeech 2009 emotion challenge. Iker Luengo, Eva Navas, Inmaculada Hernáez |
| 2009 | Compacting discriminative feature space transforms for embedded devices. Etienne Marcheret, Jia-Yu Chen, Petr Fousek, Peder A. Olsen, Vaibhava Goel |
| 2009 | Comparing methods to find a best exemplar in a multidimensional space. Titia Benders, Paul Boersma |
| 2009 | Comparison of Fujisaki-model extractors and F0 stylizers. Hartmut R. Pfitzinger, Hansjörg Mixdorff, Jan Schwarz |
| 2009 | Comparison of estimation techniques in joint uncertainty decoding for noise robust speech recognition. Haitian Xu, K. K. Chin |
| 2009 | Comparison of manual and automated estimates of subglottal resonances. Wolfgang Wokurek, Andreas Madsack |
| 2009 | Comparison of vowel structures of Japanese and English in articulatory and auditory spaces. Jianwu Dang, Mark Tiede, Jiahong Yuan |
| 2009 | Complementarity of MFCC, PLP and Gabor features in the presence of speech-intrinsic variabilities. Bernd T. Meyer, Birger Kollmeier |
| 2009 | Complex cepstrum-based decomposition of speech for glottal source estimation. Thomas Drugman, Baris Bozkurt, Thierry Dutoit |
| 2009 | Compression and truncation revisited. Claudia K. Ohl, Hartmut R. Pfitzinger |
| 2009 | Compression techniques applied to multiple speech recognition systems. Catherine Breslin, Matthew N. Stuttle, Kate M. Knill |
| 2009 | Concept segmentation and labeling for conversational speech. Marco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi |
| 2009 | Connecting human and machine learning via probabilistic models of cognition. Thomas L. Griffiths |
| 2009 | Connecting rhythm and prominence in automatic ESL pronunciation scoring. Emily Nava, Joseph Tepperman, Louis Goldstein, Maria Luisa Zubizarreta, Shrikanth S. Narayanan |
| 2009 | Constrained probabilistic subspace maps applied to speech enhancement. Kaustubh Kalgaonkar, Mark A. Clements |
| 2009 | Constraint selection for topic-based MDI adaptation of language models. Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot |
| 2009 | Context effects and the processing of ambiguous words: further evidence from semantic incongruence. Michael C. W. Yip |
| 2009 | Context-dependent additive log F0 model for HMM-based speech synthesis. Heiga Zen, Norbert Braunschweiler |
| 2009 | Context-driven automatic bilingual movie subtitle alignment. Andreas Tsiartas, Prasanta Kumar Ghosh, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2009 | Contextual effects on protrusion and lip opening for /i, y/. Anne Bonneau, Julie Buquet, Brigitte Wrobel-Dautcourt |
| 2009 | Continuous speech recognition using attention shift decoding with soft decision. Ozlem Kalinli, Shrikanth S. Narayanan |
| 2009 | Control of human generating force by use of acoustic information - study on onomatopoeic utterances for controlling small lifting-force. Miki Iimura, Taichi Sato, Kihachiro Tanaka |
| 2009 | Conversation robot participating in and activating a group communication. Shinya Fujie, Yoichi Matsuyama, Hikaru Taniyama, Tetsunori Kobayashi |
| 2009 | Cross-cultural perception of discourse phenomena. Rolf Carlson, Julia Hirschberg |
| 2009 | Cross-language F0 modeling for under-resourced tonal languages: a case study on Thai-Mandarin. Vataya Boonpiam, Anocha Rugchatjaroen, Chai Wutiwiwatchai |
| 2009 | Cross-language bootstrapping for unsupervised acoustic model training: rapid development of a Polish speech recognition system. Jonas Lööf, Christian Gollan, Hermann Ney |
| 2009 | Cross-language voice conversion based on eigenvoices. Malorie Charlier, Yamato Ohtani, Tomoki Toda, Alexis Moinet, Thierry Dutoit |
| 2009 | Cross-variety rhythm typology in Portuguese. Plínio Almeida Barbosa, Céu Viana, Isabel Trancoso |
| 2009 | Cued speech recognition for augmentative communication in normal-hearing and hearing-impaired subjects. Panikos Heracleous, Denis Beautemps, Noureddine Aboutabit |
| 2009 | Data-driven clustering in emotional space for affect recognition using discriminatively trained LSTM networks. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Ellen Douglas-Cowie, Roddy Cowie |
| 2009 | Data-driven phonetic comparison and conversion between South African, British and American English pronunciations. Linsen Loots, Thomas Niesler |
| 2009 | Decision tree acoustic models for ASR. Jitendra Ajmera, Masami Akamine |
| 2009 | Deriving vocal tract shapes from electromagnetic articulograph data via geometric adaptation and matching. Ziad Al Bawab, Lorenzo Turicchia, Richard M. Stern, Bhiksha Raj |
| 2009 | Designing spoken tutorial dialogue with children to elicit predictable but educationally valuable responses. Gregory Aist, Jack Mostow |
| 2009 | Detailed description of triphone model using SSS-free algorithm. Motoyuki Suzuki, Daisuke Honma, Akinori Ito, Shozo Makino |
| 2009 | Detecting audio events for semantic video search. Miguel M. F. Bugalho, José Portelo, Isabel Trancoso, Thomas Pellegrini, Alberto Abad |
| 2009 | Detecting changes in speech expressiveness in participants of a radio program. Plínio A. Barbosa |
| 2009 | Detecting subjectivity in multiparty speech. Gabriel Murray, Giuseppe Carenini |
| 2009 | Determining intonational boundaries from the acoustic signal. Lourdes Aguilar, Antonio Bonafonte, Francisco Campillo, David Escudero Mancebo |
| 2009 | Deterministic annealing based training algorithm for Bayesian speech recognition. Sayaka Shiota, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda |
| 2009 | Developing an automatic functional annotation system for British English intonation. Saandia Ali, Daniel Hirst |
| 2009 | Development of a Kenyan English text to speech system: a method of developing a TTS for a previously undefined English dialect. Mucemi Gakuru |
| 2009 | Development of the 2008 SRI Mandarin speech-to-text system for broadcast news and conversation. Xin Lei, Wei Wu, Wen Wang, Arindam Mandal, Andreas Stolcke |
| 2009 | Development of the GALE 2008 Mandarin LVCSR system. Christian Plahl, Björn Hoffmeister, Georg Heigold, Jonas Lööf, Ralf Schlüter, Hermann Ney |
| 2009 | Development of voicing categorization in deaf children with cochlear implant. Victoria Medina, Willy Serniclaes |
| 2009 | Dialectal characteristics of Osaka and Tokyo Japanese: analyses of phonologically identical words. Kanae Amino, Takayuki Arai |
| 2009 | Did you say a BLUE banana? the prosody of contrast and abnormality in Bulgarian and Dutch. Diana V. Dimitrova, Gisela Redeker, John C. J. Hoeks |
| 2009 | Differential vector quantization of feature vectors for distributed speech recognition. José Enrique García Laínez, Alfonso Ortega, Antonio Miguel, Eduardo Lleida |
| 2009 | Dimension reducing of LSF parameters based on radial basis function neural network. Hongjun Sun, Jianhua Tao, Huibin Jia |
| 2009 | Dimension reduction approaches for SVM based speaker age estimation. Gil Dobry, Ron M. Hecht, Mireille Avigal, Yaniv Zigel |
| 2009 | Direct, modular and hybrid audio to visual speech conversion methods - a comparative study. György Takács |
| 2009 | Discovering consistent word confusions in noise. Martin Cooke |
| 2009 | Discovering keywords from cross-modal input: ecological vs. engineering methods for enhancing acoustic repetitions. Guillaume Aimetti, Roger K. Moore, Louis ten Bosch, Okko Johannes Räsänen, Unto Kalervo Laine |
| 2009 | Discriminant spectrotemporal features for phoneme recognition. Nima Mesgarani, Garimella S. V. S. Sivaram, Sridhar Krishna Nemala, Mounya Elhilali, Hynek Hermansky |
| 2009 | Discriminative acoustic language recognition via channel-compensated GMM statistics. Niko Brümmer, Albert Strasheim, Valiantsina Hubeika, Pavel Matejka, Lukás Burget, Ondrej Glembek |
| 2009 | Discriminative feature transformation using output coding for speech recognition. Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li |
| 2009 | Discriminative n-gram selection for dialect recognition. Fred S. Richardson, William M. Campbell, Pedro A. Torres-Carrasquillo |
| 2009 | Disordered speech recognition using acoustic and sEMG signals. Yunbin Deng, Rupal Patel, James T. Heaton, Glen Colby, L. Donald Gilmore, Joao Cabrera, Serge H. Roy, Carlo J. De Luca, Geoffrey S. Meltzner |
| 2009 | Distorted visual information influences audiovisual perception of voicing. Ragnhild Eg, Dawn M. Behne |
| 2009 | Do humans and speaker verification system use the same information to differentiate voices? Juliette Kahn, Solange Rossato |
| 2009 | Do multiple caregivers speed up language acquisition? Louis ten Bosch, Okko Johannes Räsänen, Joris Driesen, Guillaume Aimetti, Toomas Altosaar, Lou Boves, A. Corns |
| 2009 | Does session variability compensation in speaker recognition model intrinsic variation under mismatched conditions? Elizabeth Shriberg, Sachin S. Kajarekar, Nicolas Scheffer |
| 2009 | Dynamic features in the linear domain for robust automatic speech recognition in a reverberant environment. Osamu Ichikawa, Takashi Fukuda, Ryuki Tachibana, Masafumi Nishimura |
| 2009 | Effect of contralateral noise on energetic and informational masking on speech-in-speech intelligibility. Marjorie Dole, Michel Hoen, Fanny Meunier |
| 2009 | Effect of noise reduction on reaction time to speech in noise. Mark A. Huckvale, Jayne Leak |
| 2009 | Effect of r-resonance information on intelligibility. Antje Heinrich, Sarah Hawkins |
| 2009 | Effective use of pause information in language modelling for speech recognition. Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa |
| 2009 | Effects of language mixing for automatic recognition of Cantonese-English code-mixing utterances. Houwei Cao, P. C. Ching, Tan Lee |
| 2009 | Effects of mora-timing in English rhythm control by Japanese learners. Shizuka Nakamura, Hiroaki Kato, Yoshinori Sagisaka |
| 2009 | Effects of tempo in radio commercials on young and elderly listeners. Hanny den Ouden, Hugo Quené |
| 2009 | Efficient combination of confidence measures for machine translation. Sylvain Raybaud, David Langlois, Kamel Smaïli |
| 2009 | Efficient generation and use of MLP features for Arabic speech recognition. Junho Park, Frank Diehl, Mark J. F. Gales, Marcus Tomalin, Philip C. Woodland |
| 2009 | Efficient modeling of temporal structure of speech for applications in voice transformation. Binh Phu Nguyen, Masato Akagi |
| 2009 | Electrolaryngeal speech enhancement based on statistical voice conversion. Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2009 | Eliciting a hierarchical structure of human consonant perception task errors using formal concept analysis. Carmen Peláez-Moreno, Ana I. García-Moral, Francisco J. Valverde-Albacete |
| 2009 | Emotion classification in children's speech using fusion of acoustic and linguistic features. Tim Polzehl, Shiva Sundaram, Hamed Ketabdar, Michael Wagner, Florian Metze |
| 2009 | Emotion dimensions and formant position. Martijn Goudbeek, Jean-Philippe Goldman, Klaus R. Scherer |
| 2009 | Emotion recognition from speech using extended feature selection and a simple classifier. Ali Hassan, Robert I. Damper |
| 2009 | Emotion recognition using a hierarchical binary decision tree approach. Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan |
| 2009 | Emotion recognition using linear transformations in combination with video. Rok Gajsek, Vitomir Struc, Simon Dobrisek, France Mihelic |
| 2009 | Enabling a user to specify an item at any time during system enumeration - item identification for barge-in-able conversational dialogue systems. Kyoko Matsuyama, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno |
| 2009 | Enhanced minimum statistics technique incorporating soft decision for noise suppression. Yun-Sik Park, Ji-Hyun Song, Jae-Hun Choi, Joon-Hyuk Chang |
| 2009 | Enhancement of binaural speech using codebook constrained iterative binaural Wiener filter. Nadir Cazi, T. V. Sreenivas |
| 2009 | Enhancing audio speech using visual speech features. Ibrahim Almajai, Ben Milner |
| 2009 | Entropy based overlapped speech detection as a pre-processing stage for speaker diarization. Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman |
| 2009 | Entropy-based feature analysis for speech recognition. Panji Setiawan, Harald Höge, Tim Fingscheidt |
| 2009 | Error correction of proportions in spoken opinion surveys. Nathalie Camelin, Renato De Mori, Frédéric Béchet, Géraldine Damnati |
| 2009 | Error metrics for impaired auditory nerve responses of different phoneme groups. Andrew Hines, Naomi Harte |
| 2009 | Estimating the position and orientation of an acoustic source with a microphone array network. Alberto Yoshihiro Nakano, Seiichi Nakagawa, Kazumasa Yamamoto |
| 2009 | Estimating the potential of signal and interlocutor-track information for language modeling. Nigel G. Ward, Benjamin H. Walker |
| 2009 | Estimation of articulatory gesture patterns from speech acoustics. Prasanta Kumar Ghosh, Shrikanth S. Narayanan, Pierre L. Divenyi, Louis Goldstein, Elliot Saltzman |
| 2009 | Evaluating evaluators: a case study in understanding the benefits and pitfalls of multi-evaluator modeling. Emily Mower, Maja J. Mataric, Shrikanth S. Narayanan |
| 2009 | Evaluating parameters for mapping adult vowels to imitative babbling. Ilana Heintz, Mary E. Beckman, Eric Fosler-Lussier, Lucie Ménard |
| 2009 | Evaluating the potential utility of ASR n-best lists for incremental spoken dialogue systems. Timo Baumann, Okko Buß, Michaela Atterer, David Schlangen |
| 2009 | Evaluation of English intonation based on combination of multiple evaluation scores. Akinori Ito, Tomoaki Konno, Masashi Ito, Shozo Makino |
| 2009 | Evaluation of external and internal articulator dynamics for pronunciation learning. Lan Wang, Hui Chen, Jianjun Ouyang |
| 2009 | Evaluation of phone lattice based speech decoding. Jacques Duchateau, Kris Demuynck, Hugo Van hamme |
| 2009 | Evaluation of the effect of the GSM full rate codec on the automatic detection of laryngeal pathologies based on cepstral analysis. Rubén Fraile, Carmelo Sánchez, Juan Ignacio Godino-Llorente, Nicolás Sáenz-Lechón, Víctor Osma-Ruiz, Juana M. Gutiérrez |
| 2009 | Example-based speech recognition using formulaic phrases. Christopher James Watkins, Stephen J. Cox |
| 2009 | Experiments on automatic prosodic labeling. Antje Schweitzer, Bernd Möbius |
| 2009 | Exploiting Chinese character models to improve speech recognition performance. Jim L. Hieronymus, Xunying Liu, Mark J. F. Gales, Philip C. Woodland |
| 2009 | Exploration of vocal excitation modulation features for speaker recognition. Ning Wang, P. C. Ching, Tan Lee |
| 2009 | Exploring automatic similarity measures for unit selection tuning. Daniel Tihelka, Jan Romportl |
| 2009 | Exploring complex vowels as phrase break correlates in a corpus of English speech with proPOSEL, a prosody and POS English lexicon. Claire Brierley, Eric Atwell |
| 2009 | Exploring speech therapy games with children on the autism spectrum. Mohammed E. Hoque, Joseph K. Lane, Rana El Kaliouby, Matthew S. Goodwin, Rosalind W. Picard |
| 2009 | Exploring the benefits of discretization of acoustic features for speech emotion recognition. Thurid Vogt, Elisabeth André |
| 2009 | Exploring the role of spectral smoothing in context of children's speech recognition. Shweta Ghai, Rohit Sinha |
| 2009 | Exploring universal attribute characterization of spoken languages for spoken language recognition. Sabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee |
| 2009 | Exploring vocalization of /l/ in English: an EPG and EMA study. Mitsuhiro Nakamura |
| 2009 | Extreme reductions: contraction of disyllables into monosyllables in Taiwan Mandarin. Chierh Cheng, Yi Xu |
| 2009 | Eye tracking for the online evaluation of prosody in speech synthesis: not so fast! Michael White, Rajakrishnan Rajkumar, Kiwako Ito, Shari R. Speer |
| 2009 | F0 cues for the discourse functions of "hã" in Hindi. Kalika Bali |
| 2009 | Factor analysis and SVM for language recognition. Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert |
| 2009 | Factor analysis for audio-based video genre classification. Mickael Rouvier, Driss Matrouf, Georges Linarès |
| 2009 | Factor analyzed HMM topology for speech recognition. Chuan-Wei Ting, Jen-Tzung Chien |
| 2009 | Fast GMM computation for speaker verification using scalar quantization and discrete densities. Guoli Ye, Brian Mak, Man-Wai Mak |
| 2009 | Fast keyword detection using suffix array. Kouichi Katsurada, Shigeki Teshima, Tsuneo Nitta |
| 2009 | Fast speech recognition for voice destination entry in a car navigation system. Hoon Chung, JeonGue Park, HyeonBae Jeon, Yunkeun Lee |
| 2009 | Fast transcription of unstructured audio recordings. Brandon Roy, Deb Roy |
| 2009 | Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction. Chanwoo Kim, Richard M. Stern |
| 2009 | Feature-based and channel-based analyses of intrinsic variability in speaker verification. Martin Graciarena, Tobias Bocklet, Elizabeth Shriberg, Andreas Stolcke, Sachin S. Kajarekar |
| 2009 | Feature-based summary space for stochastic dialogue modeling with hierarchical semantic frames. Florian Pinault, Fabrice Lefèvre, Renato De Mori |
| 2009 | Feedback loop for prosody prediction in concatenative speech synthesis. Javier Latorre, Sergio Gracia, Masami Akamine |
| 2009 | Feedforward control of a 3d physiological articulatory model for vowel production. Qiang Fang, Akikazu Nishikido, Jianwu Dang, Aijun Li |
| 2009 | Finding allophones: an evaluation on consonants in the TIMIT corpus. Timothy Kempton, Roger K. Moore |
| 2009 | Fine-granular scalable MELP coder based on embedded vector quantization. Mouloud Djamah, Douglas D. O'Shaughnessy |
| 2009 | Finite mixture spectrogram modeling for multipitch tracking using a factorial hidden Markov model. Michael Wohlmayr, Franz Pernkopf |
| 2009 | Forensic speaker recognition using traditional features comparing automatic and human-in-the-loop formant tracking. Alberto de Castro, Daniel Ramos, Joaquin Gonzalez-Rodriguez |
| 2009 | Formant trajectories for acoustic-to-articulatory inversion. I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler |
| 2009 | From experiments to articulatory motion - a three dimensional talking head model. Xiao Bo Lu, William Thorpe, Kylie Foster, Peter Hunter |
| 2009 | Functional data analysis as a tool for analyzing speech dynamics - a case study on the French word c'était. Michele Gubian, Francisco Torreira, Helmer Strik, Lou Boves |
| 2009 | Fusing audio and video information for online speaker diarization. Joerg Schmalenstroeer, Martin Kelling, Volker Leutnant, Reinhold Haeb-Umbach |
| 2009 | Fusing fast algorithms to achieve efficient speech detection in FM broadcasts. Stéphane Pigeon, Patrick Verlinde |
| 2009 | GMM kernel by Taylor series for speaker verification. Minqiang Xu, Xi Zhou, Beiqian Dai, Thomas S. Huang |
| 2009 | GTM-URL contribution to the INTERSPEECH 2009 emotion challenge. Santiago Planet, Ignasi Iriondo Sanz, Joan Claudi Socoró, Carlos Monzo, Jordi Adell |
| 2009 | Gender differences in the realization of vowel-initial glottalization. Elke Philburn |
| 2009 | Generalized discriminative feature transformation for speech recognition. Roger Hsiao, Tanja Schultz |
| 2009 | German boundary tones show categorical perception and a perceptual magnet effect when presented in different contexts. Katrin Schneider, Grzegorz Dogil, Bernd Möbius |
| 2009 | Glottal closure and opening instant detection from speech signals. Thomas Drugman, Thierry Dutoit |
| 2009 | Grapheme to phoneme conversion using an SMT system. Antoine Laurent, Paul Deléglise, Sylvain Meignier |
| 2009 | Graphical models for discrete hidden Markov models in speech recognition. Antonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida |
| 2009 | Group-delay-deviation based spectral analysis of speech. Anthony P. Stark, Kuldip K. Paliwal |
| 2009 | HEAR: an hybrid episodic-abstract speech recognizer. Sébastien Demange, Dirk Van Compernolle |
| 2009 | HMM adaptation and voice conversion for the synthesis of child speech: a comparison. Oliver Watts, Junichi Yamagishi, Simon King, Kay Berkling |
| 2009 | HMM-based automatic eye-blink synthesis from speech. Michal Dziemianko, Gregor Hofer, Hiroshi Shimodaira |
| 2009 | HMM-based speaker characteristics emphasis using average voice model. Takashi Nose, Junichi Adada, Takao Kobayashi |
| 2009 | Hidden conditional random field with distribution constraints for phone classification. Dong Yu, Li Deng, Alex Acero |
| 2009 | Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system. Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri |
| 2009 | High front vowels in Czech: a contrast in quantity or quality? Václav Jonás Podlipský, Radek Skarnitzl, Jan Volín |
| 2009 | High performance automatic mispronunciation detection method based on neural network and TRAP features. Hongyan Li, Shijin Wang, Jiaen Liang, Shen Huang, Bo Xu |
| 2009 | High-accuracy, low-complexity voice activity detection based on a posteriori SNR weighted energy. Zheng-Hua Tan, Børge Lindberg |
| 2009 | Hill-climbing feature selection for multi-stream ASR. David Gelbart, Nelson Morgan, Alexey Tsymbal |
| 2009 | How similar are clusters resulting from schwa deletion in French to identical underlying clusters? Audrey Bürki, Cécile Fougeron, Christophe Veaux, Ulrich H. Frauenfelder |
| 2009 | How speaker tongue and name source language affect the automatic recognition of spoken names. Bert Réveil, Jean-Pierre Martens, Bart D'hoore |
| 2009 | How to improve TTS systems for emotional expressivity. Antonio Rui Ferreira Rebordão, Shaikh Mostafa Al Masum, Keikichi Hirose, Nobuaki Minematsu |
| 2009 | How to loose confidence: probabilistic linear machines for multiclass classification. Hui Lin, Jeff A. Bilmes, Koby Crammer |
| 2009 | How to select a good training-data subset for transcription: submodular active selection for sequences. Hui Lin, Jeff A. Bilmes |
| 2009 | Human audio-visual consonant recognition analyzed with three bimodal integration models. Zhanyu Ma, Arne Leijon |
| 2009 | Human translations guided language discovery for ASR systems. Sebastian Stüker, Laurent Besacier, Alex Waibel |
| 2009 | Human voice or prompt generation? can they co-exist in an application? Géza Németh, Csaba Zainkó, Mátyás Bartalis, Gábor Olaszy, Géza Kiss |
| 2009 | Hybrid approach to grapheme to phoneme conversion for Korean. Jinsik Lee, Byeongchang Kim, Gary Geunbae Lee |
| 2009 | Hybridisation of expertise and reinforcement learning in dialogue systems. Romain Laroche, Ghislain Putois, Philippe Bretier, Bernadette Bouchon-Meunier |
| 2009 | Hybrids of supervised and unsupervised models for extractive speech summarization. Shih-Hsiang Lin, Yueng-Tien Lo, Yao-Ming Yeh, Berlin Chen |
| 2009 | Identification and automatic detection of parasitic speech sounds. Jindrich Matousek, Radek Skarnitzl, Pavel Machac, Jan Trmal |
| 2009 | Identification of contrast and its emphatic realization in HMM based speech synthesis. Leonardo Badino, J. Sebastian Andersson, Junichi Yamagishi, Robert A. J. Clark |
| 2009 | Identifying uncertain words within an utterance via prosodic features. Heather Pon-Barry, Stuart M. Shieber |
| 2009 | Impact of different speaking modes on EMG-based speech recognition. Michael Wand, Szu-Chen Stan Jou, Arthur R. Toth, Tanja Schultz |
| 2009 | Importance of nasality measures for speaker recognition data selection and performance prediction. Howard Lei, Eduardo López Gonzalo |
| 2009 | Improved GMM-based speaker verification using SVM-driven impostor dataset selection. Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan |
| 2009 | Improved language modelling using bag of word pairs. Langzhou Chen, K. K. Chin, Kate M. Knill |
| 2009 | Improved speaker diarization of meeting speech with recurrent selection of representative speech segments and participant interaction pattern modeling. Kyu Jeong Han, Shrikanth S. Narayanan |
| 2009 | Improved speech summarization with multiple-hypothesis representations and Kullback-Leibler divergence measures. Shih-Hsiang Lin, Berlin Chen |
| 2009 | Improvements to the LIUM French ASR system based on CMU Sphinx: what helps to significantly reduce the word error rate? Paul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin |
| 2009 | Improving acceptability assessment for the labelling of affective speech corpora. Zoraida Callejas, Ramón López-Cózar |
| 2009 | Improving automatic emotion recognition from speech signals. Elif Bozkurt, Engin Erzin, Çigdem Eroglu Erdem, A. Tanju Erdem |
| 2009 | Improving broadcast news transcription with a precision grammar and discriminative reranking. Tobias Kaufmann, Thomas Ewender, Beat Pfister |
| 2009 | Improving consistence of phonetic transcription for text-to-speech. Pablo Daniel Agüero, Antonio Bonafonte, Juan Carlos Tulli |
| 2009 | Improving detection of acoustic events using audiovisual data and feature level fusion. Taras Butko, Cristian Canton-Ferrer, Carlos Segura, Xavier Giró, Climent Nadeu, Javier Hernando, Josep R. Casas |
| 2009 | Improving emotion recognition using class-level spectral features. Dmitri Bitouk, Ani Nenkova, Ragini Verma |
| 2009 | Improving initial boundary estimation for HMM-based automatic phonetic segmentation. Kalu U. Ogbureke, Julie Carson-Berndsen |
| 2009 | Improving perceived accuracy for in-car media search. Yun-Cheng Ju, Michael L. Seltzer, Ivan Tashev |
| 2009 | Improving phone recognition performance via phonetically-motivated units. Hyejin Hong, Minhwa Chung |
| 2009 | Improving speaker segmentation via speaker identification and text segmentation. Runxin Li, Tanja Schultz, Qin Jin |
| 2009 | Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding models. Masaki Katsumaru, Mikio Nakano, Kazunori Komatani, Kotaro Funakoshi, Tetsuya Ogata, Hiroshi G. Okuno |
| 2009 | Improving the recognition of names by document-level clustering. Bin Zhang, Wei Wu, Jeremy G. Kahn, Mari Ostendorf |
| 2009 | Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approach. Vaishali Patil, Shrikant Joshi, Preeti Rao |
| 2009 | Improving the robustness with multiple sets of HMMs. Hans-Günter Hirsch, Andreas Kitzig |
| 2009 | In search of non-uniqueness in the acoustic-to-articulatory mapping. Gopal Ananthakrishnan, Daniel Neiberg, Olov Engwall |
| 2009 | Incremental adaptation with VTS and joint adaptively trained systems. Federico Flego, Mark J. F. Gales |
| 2009 | Incremental composition of static decoding graphs. Miroslav Novak |
| 2009 | Incremental dialog clustering for speech-to-speech translation. David Stallard, Stavros Tsakalidis, Shirin Saleem |
| 2009 | Influence of training on direct and indirect measures for the evaluation of multimodal systems. Julia Seebode, Stefan Schaffer, Ina Wechsung, Florian Metze |
| 2009 | Influences of vowel duration on speaker-size estimation and discrimination. Chihiro Takeshima, Minoru Tsuzaki, Toshio Irino |
| 2009 | Information bottleneck based age verification. Ron M. Hecht, Omer Hezroni, Amit Manna, Gil Dobry, Yaniv Zigel, Naftali Tishby |
| 2009 | Integrating codebook and utterance information in cepstral statistics normalization techniques for robust speech recognition. Guan-min He, Jeih-weih Hung |
| 2009 | Intelligibility assessment in children with cleft lip and palate in Italian and German. Marcello Scipioni, Matteo Gerosa, Diego Giuliani, Elmar Nöth, Andreas K. Maier |
| 2009 | Intercultural differences in evaluation of pathological voice quality: perceptual and acoustical comparisons between RASATI and GRBASI scales. Emi Juliana Yamauchi, Satoshi Imaizumi, Hagino Maruyama, Tomoyuki Haji |
| 2009 | Intonation of Japanese sentences spoken by English speakers. Chiharu Tsurutani |
| 2009 | Intonation segments and segmental intonation. Oliver Niebuhr |
| 2009 | Intonational features for identifying regional accents of Italian. Michelina Savino |
| 2009 | Intrinsic vowel duration and the post-vocalic voicing effect: some evidence from dialects of North American English. Joshua Tauberer, Keelan Evanini |
| 2009 | Invariant-integration method for robust feature extraction in speaker-independent speech recognition. Florian Müller, Alfred Mertins |
| 2009 | Investigating /l/ variation in English through forced alignment. Jiahong Yuan, Mark Y. Liberman |
| 2009 | Investigating changes in the rhythm of Maori over time. Margaret Maclagan, Catherine Inez Watson, Jeanette King, Ray Harlow, Laura Thompson, Peter Keegan |
| 2009 | Investigating phonetic information reduction and lexical confusability. William Hartmann, Eric Fosler-Lussier |
| 2009 | Investigating privacy-sensitive features for speech detection in multiparty conversations. Sree Hari Krishnan Parthasarathi, Mathew Magimai-Doss, Hervé Bourlard, Daniel Gatica-Perez |
| 2009 | Investigating the use of morphological decomposition and diacritization for improving Arabic LVCSR. Amr El-Desoky, Christian Gollan, David Rybach, Ralf Schlüter, Hermann Ney |
| 2009 | Investigation into bottle-neck features for meeting speech recognition. Frantisek Grézl, Martin Karafiát, Lukás Burget |
| 2009 | Investigation into variants of joint factor analysis for speaker recognition. Lukás Burget, Pavel Matejka, Valiantsina Hubeika, Jan Cernocký |
| 2009 | Investigation of morph-based speech recognition improvements across speech genres. Péter Mihajlik, Balázs Tarján, Zoltán Tüske, Tibor Fegyó |
| 2009 | Investigations on convex optimization using log-linear HMMs for digit string recognition. Georg Heigold, David Rybach, Ralf Schlüter, Hermann Ney |
| 2009 | Investigations on discriminative training in large scale acoustic model estimation. Janne Pylkkönen |
| 2009 | Is tonal alignment interpretation independent of methodology? Caterina Petrone, Mariapaola D'Imperio |
| 2009 | Iterative sentence-pair extraction from quasi-parallel corpora for machine translation. Ruhi Sarikaya, Sameer Maskey, R. Zhang, Ea-Ee Jan, D. Wang, Bhuvana Ramabhadran, Salim Roukos |
| 2009 | JTrans: an open-source software for semi-automatic text-to-speech alignment. Christophe Cerisara, Odile Mella, Dominique Fohr |
| 2009 | Japanese children's acquisition of prosodic politeness expressions. Takaaki Shochi, Donna Erickson, Kaoru Sekiyama, Albert Rilliard, Véronique Aubergé |
| 2009 | Japanese pitch conversion for voice morphing based on differential modeling. Ryuki Tachibana, Zhiwei Shuang, Masafumi Nishimura |
| 2009 | Joint noise reduction and dereverberation of speech using hybrid TF-GSC and adaptive MMSE estimator. Behdad Dashtbozorg, Hamid Reza Abutalebi |
| 2009 | Joint quantization strategies for low bit-rate sinusoidal coding. Emre Unver, Stephane Villette, Ahmet M. Kondoz |
| 2009 | Joint segmentation and classification of dialog acts using conditional random fields. Matthias Zimmermann |
| 2009 | Joint speech enhancement and speaker identification using Monte Carlo methods. Ciira Wa Maina, John MacLaren Walsh |
| 2009 | KL realignment for speaker diarization with multiple feature streams. Deepu Vijayasenan, Fabio Valente, Hervé Bourlard |
| 2009 | KLAIR: a virtual infant for spoken language acquisition research. Mark A. Huckvale, Ian S. Howard, Sascha Fagel |
| 2009 | LS regularization of group delay features for speaker recognition. Jia Min Karen Kua, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi |
| 2009 | Language identification for speech-to-speech translation. Daniel Chung Yong Lim, Ian R. Lane |
| 2009 | Language modeling and dialog management for address recognition. Rajesh Balchandran, Leonid Rachevsky, Larry Sansone |
| 2009 | Language modeling for what-with-where on GOOG-411. Charl Johannes van Heerden, Johan Schalkwyk, Brian Strope |
| 2009 | Language recognition using language factors. Fabio Castaldo, Sandro Cumani, Pietro Laface, Daniele Colibro |
| 2009 | Language score calibration using adapted Gaussian back-end. Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel |
| 2009 | Large margin estimation of Gaussian mixture model parameters with extended Baum-Welch for spoken language recognition. Donglai Zhu, Bin Ma, Haizhou Li |
| 2009 | Large-scale Polish SLU. Patrick Lehnen, Stefan Hahn, Hermann Ney, Agnieszka Mykowiecka |
| 2009 | Large-scale analysis of formant frequency estimation variability in conversational telephone speech. Nancy F. Chen, Wade Shen, Joseph P. Campbell, Reva Schwartz |
| 2009 | Laying the foundation for in-car alcohol detection by speech. Florian Schiel, Christian Heinrich |
| 2009 | Learning and generalization of novel contrastive cues. Meghan Sumner |
| 2009 | Learning lexicons from spoken utterances based on statistical model selection. Ryo Taguchi, Naoto Iwahashi, Takashi Nose, Kotaro Funakoshi, Mikio Nakano |
| 2009 | Learning the structure of human-computer and human-human dialogs. David Griol, Giuseppe Riccardi, Emilio Sanchis |
| 2009 | Letter-to-phoneme conversion by inference of rewriting rules. Vincent Claveau |
| 2009 | Leveraging sentence weights in a concept-based optimization framework for extractive meeting summarization. Shasha Xie, Benoît Favre, Dilek Hakkani-Tür, Yang Liu |
| 2009 | Lexical and phonetic modeling for Arabic automatic speech recognition. Long Nguyen, Tim Ng, Kham Nguyen, Rabih Zbib, John Makhoul |
| 2009 | Lexical embedding in spoken Dutch. Odette Scharenborg, Stefanie Okolowski |
| 2009 | Lexical tone production by Cantonese speakers with Parkinson's disease. Joan Ka-Yin Ma |
| 2009 | Linguistically-motivated automatic classification of regional French varieties. Cécile Woehrling, Philippe Boula de Mareüil, Martine Adda-Decker |
| 2009 | Local minimum generation error criterion for hybrid HMM speech synthesis. Xavi Gonzalvo, Alexander Gutkin, Joan Claudi Socoró, Ignasi Iriondo Sanz, Paul Taylor |
| 2009 | Local projections and support vector based feature selection in speech recognition. Antonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida |
| 2009 | Localization of speech recognition in spoken dialog systems: how machine translation can make our lives easier. David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini |
| 2009 | Log-linear model combination with word-dependent scaling factors. Björn Hoffmeister, Ruoying Liang, Ralf Schlüter, Hermann Ney |
| 2009 | Log-spectral magnitude MMSE estimators under super-Gaussian densities. Richard C. Hendriks, Richard Heusdens, Jesper Jensen |
| 2009 | Long term examination of intra-session and inter-session speaker variability. Aaron D. Lawson, Allen R. Stauffer, Brett Y. Smolenski, Benjamin B. Pokines, Matthew Leonard, Edward J. Cupples |
| 2009 | Low-cost call type classification for contact center calls using partial transcripts. Youngja Park, Wilfried Teiken, Stephen C. Gates |
| 2009 | Mandarin spontaneous narrative planning - prosodic evidence from National Taiwan University lecture corpus. Chiu-yu Tseng, Zhao-yu Su, Lin-Shan Lee |
| 2009 | Many-to-many eigenvoice conversion with reference voice. Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2009 | Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training. Erik McDermott, Shinji Watanabe, Atsushi Nakamura |
| 2009 | Maximum likelihood unit selection for corpus-based speech synthesis. Abubeker Gamboa Rosales, Hamurabi Gamboa-Rosales, Rüdiger Hoffmann |
| 2009 | Maximum mutual information estimation via second order cone programming for large vocabulary continuous speech recognition. Dalei Wu, Baojie Li, Hui Jiang |
| 2009 | Maximum mutual information multi-phone units in direct modeling. Geoffrey Zweig, Patrick Nguyen |
| 2009 | Measuring speech rhythm variation in a model-based framework. Plínio A. Barbosa |
| 2009 | Measuring tagging performance of a joint language model. Denis Filimonov, Mary P. Harper |
| 2009 | Measuring the gap between HMM-based ASR and TTS. John Dines, Junichi Yamagishi, Simon King |
| 2009 | Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition. Howard Lei, Eduardo López Gonzalo |
| 2009 | Merging search spaces for subword spoken term detection. Timo Mertens, Daniel Schneider, Joachim Köhler |
| 2009 | Mi-DJ: a multi-source intelligent DJ service. Ching-Hsien Lee, Hsu-Chih Wu |
| 2009 | Minimum hypothesis phone error as a decoding method for speech recognition. Haihua Xu, Daniel Povey, Jie Zhu, Guanyong Wu |
| 2009 | Minivectors: an improved GMM-SVM approach for speaker verification. Xavier Anguera |
| 2009 | Model based feature enhancement for automatic speech recognition in reverberant environments. Alexander Krueger, Reinhold Haeb-Umbach |
| 2009 | Model-based automatic evaluation of L2 learner's English timing. Chatchawarn Hansakunbuntheung, Hiroaki Kato, Yoshinori Sagisaka |
| 2009 | Model-based estimation of instantaneous pitch in noisy speech. Jung Ook Hong, Patrick J. Wolfe |
| 2009 | Model-based speech separation: identifying transcription using orthogonality. Siu Wa Lee, Frank K. Soong, Tan Lee |
| 2009 | Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions. Chi-Chun Lee, Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan |
| 2009 | Modeling northern and southern varieties of Dutch for STT. Julien Despres, Petr Fousek, Jean-Luc Gauvain, Sandrine Gay, Yvan Josse, Lori Lamel, Abdelkhalek Messaoudi |
| 2009 | Modeling other talkers for improved dialog act recognition in meetings. Kornel Laskowski, Elizabeth Shriberg |
| 2009 | Modeling the intonation of topic structure: two approaches. Margaret Zellers, Brechtje Post, Mariapaola D'Imperio |
| 2009 | Modelling similarity perception of intonation. Uwe D. Reichel, Felicitas Kleber, Raphael Winkelmann |
| 2009 | Modelling vocabulary growth from birth to young adulthood. Roger K. Moore, Louis ten Bosch |
| 2009 | Modulation domain spectral subtraction for speech enhancement. Kuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki |
| 2009 | Monaural segregation of voiced speech using discriminative random fields. Rohit Prabhavalkar, Zhaozhang Jin, Eric Fosler-Lussier |
| 2009 | Morphological analysis and decomposition for Arabic speech-to-text systems. Frank Diehl, Mark J. F. Gales, Marcus Tomalin, Philip C. Woodland |
| 2009 | Multi-stream to many-stream: using spectro-temporal features for ASR. Sherry Y. Zhao, Suman V. Ravuri, Nelson Morgan |
| 2009 | Multifactor adaptation for Mandarin broadcast news and conversation speech recognition. Wen Wang, Arindam Mandal, Xin Lei, Andreas Stolcke, Jing Zheng |
| 2009 | Multimodal HMM-based NAM-to-speech conversion. Viet-Anh Tran, Gérard Bailly, Hélène Loevenbruck, Tomoki Toda |
| 2009 | Multimodal speaker verification using ancillary known speaker characteristics such as gender or age. Girija Chetty, Michael Wagner |
| 2009 | Multiple text segmentation for statistical language modeling. Sopheap Seng, Laurent Besacier, Brigitte Bigi, Eric Castelli |
| 2009 | NIST 2008 speaker recognition evaluation: performance across telephone and room microphone channels. Alvin F. Martin, Craig S. Greenberg |
| 2009 | Named entity network based on Wikipedia. Sameer Maskey, Wisam Dakka |
| 2009 | Nearly perfect detection of continuous F0 contour and frame classification for TTS synthesis. Thomas Ewender, Sarah Hoffmann, Beat Pfister |
| 2009 | New horizons in the study of child language acquisition. Deb Roy |
| 2009 | New method for delexicalization and its application to prosodic tagging for text-to-speech synthesis. Martti Vainio, Antti Suni, Tuomo Raitio, Jani Nurminen, Juhani Järvikivi, Paavo Alku |
| 2009 | New methods for the analysis of repeated utterances. Geoffrey Zweig |
| 2009 | No sooner said than done? testing incrementality of semantic interpretations of spontaneous speech. Michaela Atterer, Timo Baumann, David Schlangen |
| 2009 | Noise robustness of tract variables and their application to speech recognition. Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein |
| 2009 | Noise-robust feature extraction based on forward masking. Sheng-Chiuan Chiou, Chia-Ping Chen |
| 2009 | Noisy speech recognition by using output combination of discrete-mixture HMMs and continuous-mixture HMMs. Tetsuo Kosaka, You Saito, Masaharu Kato |
| 2009 | Non-automaticity of use of orthographic knowledge in phoneme evaluation. Anne Cutler, Chris Davis, Jeesun Kim |
| 2009 | Nonstationary latent Dirichlet allocation for speech recognition. Chuang-Hua Chueh, Jen-Tzung Chien |
| 2009 | Normalized modulation spectral features for cross-database voice pathology detection. Maria E. Markaki, Yannis Stylianou |
| 2009 | Observation of empirical cumulative distribution of vowel spectral distances and its application to vowel based voice conversion. Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino |
| 2009 | On acquiring speech production knowledge from articulatory measurements for phoneme recognition. Daniel Neiberg, Gopal Ananthakrishnan, Mats Blomberg |
| 2009 | On invariant structural representation for speech recognition: theoretical validation and experimental improvement. Yu Qiao, Nobuaki Minematsu, Keikichi Hirose |
| 2009 | On the cost of backward compatibility for communication codecs. Konstantin Schmidt, Markus Schnell, Nikolaus Rettelbach, Manfred Lutzky, Jochen Issing |
| 2009 | On the development of matched and mismatched Italian children's speech recognition systems. Piero Cosi |
| 2009 | On the estimation and the use of confusion-matrices for improving ASR accuracy. Santiago Omar Caballero Morales, Stephen J. Cox |
| 2009 | On the mutual information between source and filter contributions for voice pathology detection. Thomas Drugman, Thomas Dubuisson, Thierry Dutoit |
| 2009 | On the production of sandhi phenomena in French: psycholinguistic and acoustic data. Odile Bagou, Violaine Michel, Marina Laganaro |
| 2009 | On the relevance of high-level features for speaker independent emotion recognition of spontaneous speech. Marko Lugger, Bin Yang |
| 2009 | On the semi-supervised learning of multi-layered perceptrons. Jonathan Malkin, Amarnag Subramanya, Jeff A. Bilmes |
| 2009 | On the use of phonological features for automatic accent analysis. Abhijeet Sangwan, John H. L. Hansen |
| 2009 | On the use of pitch normalization for improving children's speech recognition. Rohit Sinha, Shweta Ghai |
| 2009 | On-line formant shifting as a function of F0. Katerina Chládková, Paul Boersma, Václav Jonás Podlipský |
| 2009 | Online detecting end times of spoken utterances for synchronization of live speech and its transcripts. Jie Gao, Qingwei Zhao, Yonghong Yan |
| 2009 | Online discriminative training for grapheme-to-phoneme conversion. Sittichai Jiampojamarn, Grzegorz Kondrak |
| 2009 | Online generation of acoustic models for multilingual speech recognition. Martin Raab, Guillermo Aradilla, Rainer Gruhn, Elmar Nöth |
| 2009 | Online model adaptation for voice conversion using model-based speech synthesis techniques. Dalei Wu, Baojie Li, Hui Jiang, Qian-Jie Fu |
| 2009 | Open-set speaker identification under mismatch conditions. Surosh G. Pillay, Aladdin M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski |
| 2009 | Optimal event search using a structural cost function - improvement of structure to speech conversion. Daisuke Saito, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose |
| 2009 | Optimization of dereverberation parameters based on likelihood of speech recognizer. Randy Gomez, Tatsuya Kawahara |
| 2009 | Optimization of discriminative kernels in SVM speaker verification. Shi-Xiong Zhang, Man-Wai Mak |
| 2009 | Optimization of t-tilt F0 modeling. Ausdang Thangthai, Anocha Rugchatjaroen, Nattanun Thatphithakkul, Ananlada Chotimongkol, Chai Wutiwiwatchai |
| 2009 | Optimized feature set to assess acoustic perturbations in dysarthric speech. Sunil Nagaraja, Eduardo Castillo Guerra |
| 2009 | Optimizing CRFs for SLU tasks in various languages using modified training criteria. Stefan Hahn, Patrick Lehnen, Georg Heigold, Hermann Ney |
| 2009 | Optimizing non-native speech recognition for CALL applications. Joost van Doremalen, Helmer Strik, Catia Cucchiarini |
| 2009 | Overall performance metrics for multi-condition speaker recognition evaluations. David A. van Leeuwen |
| 2009 | No time to lose? time shrinking effects enhance the impression of rhythmic "isochrony" and fast speech rate. Petra Wagner, Andreas Windmann |
| 2009 | Parallel fast likelihood computation for LVCSR using mixture decomposition. Naveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen |
| 2009 | Parallelized Viterbi processor for 5,000-word large-vocabulary real-time continuous speech recognition FPGA system. Tsuyoshi Fujinaga, Kazuo Miura, Hiroki Noguchi, Hiroshi Kawaguchi, Masahiko Yoshimoto |
| 2009 | Parameterization of vocal fry in HMM-based speech synthesis. Hanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj |
| 2009 | Pause and gap length in face-to-face interaction. Jens Edlund, Mattias Heldner, Julia Hirschberg |
| 2009 | Perceived loudness and voice quality in affect cueing. Irena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide |
| 2009 | Perceived naturalness of a synthesizer of disordered voices. Samia Fraj, Francis Grenez, Jean Schoentgen |
| 2009 | Perceiving surprise on cue words: prosody and semantics interact on right and really. Catherine Lai |
| 2009 | Perception and production of boundary tones in whispered Dutch. Willemijn Heeren, Vincent J. van Heuven |
| 2009 | Perception of English compound vs. phrasal stress: natural vs. synthetic speech. Irene Vogel, Arild Hestvik, H. Timothy Bunnell, Laura Spinu |
| 2009 | Perception of temporal cues at discourse boundaries. Hsin-Yi Lin, Janice Fon |
| 2009 | Perception of the evolution of prosody in the French broadcast news style. Philippe Boula de Mareüil, Albert Rilliard, Alexandre Allauzen |
| 2009 | Perceptual cost function for cross-fading based concatenation. Qi Miao, Alexander Kain, Jan P. H. van Santen |
| 2009 | Perceptual grouping of alternating word pairs: effect of pitch difference and presentation rate. Nandini Iyer, Douglas Brungart, Brian D. Simpson |
| 2009 | Perceptual training of singleton and geminate stops in Japanese language by Korean learners. Mee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka |
| 2009 | Performance comparison of HMM and VQ based single channel speech separation. Mohammad H. Radfar, Wai-Yip Chan, Richard M. Dansereau, Willy Wong |
| 2009 | Performance comparisons of the integrated parallel model combination approaches with front-end noise reduction. Guanghu Shen, Soo-Young Suk, Hyun-Yeol Chung |
| 2009 | Personalizing synthetic voices for people with progressive speech disorders: judging voice similarity. Sarah M. Creer, Stuart P. Cunningham, Phil D. Green, K. Fatema |
| 2009 | Phonetic alignment for speech synthesis in under-resourced languages. Daniel R. van Niekerk, Etienne Barnard |
| 2009 | Phrase and word level strategies for detecting appositions in speech. Benoît Favre, Dilek Hakkani-Tür |
| 2009 | Physiologically-inspired feature extraction for emotion recognition. Yu Zhou, Yanqing Sun, Junfeng Li, Jianping Zhang, Yonghong Yan |
| 2009 | Pitch accents and information status in a German radio news corpus. Katrin Schweitzer, Arndt Riester, Michael Walsh, Grzegorz Dogil |
| 2009 | Pitch adaptation in different age groups: boundary tones versus global pitch. Marie Nilsenová, Marc Swerts, Véronique Houtepen, Heleen Dittrich |
| 2009 | Pitch contour parameterisation based on linear stylisation for emotion recognition. Vidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps |
| 2009 | Pitch variation estimation. Tom Bäckström, Stefan Bayer, Sascha Disch |
| 2009 | PodCastle: collaborative training of acoustic models on the basis of wisdom of crowds for podcast transcription. Jun Ogata, Masataka Goto |
| 2009 | Polyglot speech prosody control. Harald Romsdorfer |
| 2009 | Porting an European Portuguese broadcast news recognition system to Brazilian Portuguese. Alberto Abad, Isabel Trancoso, Nelson Neto, Céu Viana |
| 2009 | Posterior-based out of vocabulary word detection in telephone speech. Stefan Kombrink, Lukás Burget, Pavel Matejka, Martin Karafiát, Hynek Hermansky |
| 2009 | Precision of phoneme boundaries derived using hidden Markov models. Ladan Baghai-Ravary, Greg Kochanski, John S. Coleman |
| 2009 | Predicting children's reading ability using evaluator-informed features. Matthew Black, Joseph Tepperman, Sungbok Lee, Shrikanth S. Narayanan |
| 2009 | Predicting how it sounds: re-ranking dialogue prompts based on TTS quality for adaptive spoken dialogue systems. Cédric Boidin, Verena Rieser, Lonneke van der Plas, Oliver Lemon, Jonathan Chevelu |
| 2009 | Predicting the quality of multimodal systems based on judgments of single modalities. Ina Wechsung, Klaus-Peter Engelbrecht, Anja B. Naumann, Stefan Schaffer, Julia Seebode, Florian Metze, Sebastian Möller |
| 2009 | Preliminary inversion mapping results with a new EMA corpus. Korin Richmond |
| 2009 | Probabilistic and possibilistic language models based on the world wide web. Stanislas Oger, Vladimir Popescu, Georges Linarès |
| 2009 | Probabilistic effects on French [t] duration. Francisco Torreira, Mirjam Ernestus |
| 2009 | Processing affected speech within human machine interaction. Bogdan Vlasenko, Andreas Wendemuth |
| 2009 | Processing liaison-initial words in native and non-native French: evidence from eye movements. Annie Tremblay |
| 2009 | Production boundary between fricative and affricate in Japanese and Korean speakers. Kimiko Yamakawa, Shigeaki Amano, Shuichi Itahashi |
| 2009 | Profiling large-vocabulary continuous speech recognition on embedded devices: a hardware resource sensitivity analysis. Kai Yu, Rob A. Rutenbar |
| 2009 | Progressive memory-based parametric non-linear feature equalization. Luz García, Roberto Gemello, Franco Mana, José C. Segura |
| 2009 | Pronunciation dictionary development in resource-scarce environments. Marelie H. Davel, Olga Martirosian |
| 2009 | Pronunciation-based ASR for names. Henk van den Heuvel, Bert Réveil, Jean-Pierre Martens |
| 2009 | Prosodic analysis of foreign-accented English. Hansjörg Mixdorff, John Ingram |
| 2009 | Prosodic effects on vowel production: evidence from formant structure. Yoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson |
| 2009 | Prosodic issues in synthesising Thadou, a Tibeto-Burman tone language. Dafydd Gibbon, Pramod Pandey, D. Mary Kim Haokip, Jolanta Bachan |
| 2009 | Pulse density representation of spectrum for statistical speech processing. Yoshinori Shiga |
| 2009 | Quantifying wideband speech codec degradations via impairment factors: the new ITU-T P.834.1 methodology and its application to the G.711.1 codec. Sebastian Möller, Nicolas Côté, Atsuko Kurashima, Noritsugu Egi, Akira Takahashi |
| 2009 | RTTS: towards enterprise-level real-time speech transcription and translation services. Juan M. Huerta, Cheng Wu, Andrej Sakrajda, Sasha Caskey, Ea-Ee Jan, Alexander Faisman, Shai Ben-David, Wen Liu, Antonio Lee, Osamuyimen Stewart, Michael Frissora, David M. Lubensky |
| 2009 | Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models. Satoshi Kobashikawa, Atsunori Ogawa, Yoshikazu Yamaguchi, Satoshi Takahashi |
| 2009 | Rarefaction gestures and coarticulation in Mangetti Dune !Xung clicks. Amanda Miller, Abigail Scott, Bonny E. Sands, Sheena Shah |
| 2009 | Real voice and TTS accent effects on intelligibility and comprehension for Indian speakers of English as a second language. Frederick Weber, Kalika Bali |
| 2009 | Real-time ASR from meetings. Philip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiát, Danil Korchagin, Mike Lincoln, Vincent Wan, Le Zhang |
| 2009 | Real-time correction of closed-captions. Patrick Cardinal, Gilles Boulianne |
| 2009 | Real-time lexical competitions during speech-in-speech comprehension. Véronique Boulenger, Michel Hoen, François Pellegrino, Fanny Meunier |
| 2009 | Real-time live broadcast news subtitling system for Spanish. Alfonso Ortega, José Enrique García Laínez, Antonio Miguel, Eduardo Lleida |
| 2009 | Recent advances in WFST-based dialog system. Chiori Hori, Kiyonori Ohtake, Teruhisa Misu, Hideki Kashioka, Satoshi Nakamura |
| 2009 | Recognising interest in conversational speech - comparing bag of frames and supra-segmental features. Björn W. Schuller, Gerhard Rigoll |
| 2009 | Recognition and correction of voice web search queries. Keith Vertanen, Per Ola Kristensson |
| 2009 | Reconstructing clean speech from noisy MFCC vectors. Ben Milner, Jonathan Darch, Ibrahim Almajai |
| 2009 | Redefining the Bayesian information criterion for speaker diarisation. Themos Stafylakis, Vassilis Katsouros, George Carayannis |
| 2009 | Reduced complexity equalization of Lombard effect for speech recognition in noisy adverse environments. Hynek Boril, John H. L. Hansen |
| 2009 | Refactoring acoustic models using variational expectation-maximization. Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen |
| 2009 | Reinforcement learning for dialog management using least-squares policy iteration and fast feature selection. Lihong Li, Jason D. Williams, Suhrid Balakrishnan |
| 2009 | Relation of formants and subglottal resonances in Hungarian vowels. Tamás Gábor Csapó, Zsuzsanna Bárkányi, Tekla Etelka Gráczi, Tamás Bohm, Steven M. Lulich |
| 2009 | Relative importance of formant and whole-spectral cues for vowel perception. Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano |
| 2009 | Replacing uncertainty decoding with subband re-estimation for large vocabulary speech recognition in noise. Jianhua Lu, Ji Ming, Roger F. Woods |
| 2009 | Resources for speech research: present and future infrastructure needs. Lou Boves, Rolf Carlson, Erhard W. Hinrichs, David House, Steven Krauwer, Lothar Lemnitzer, Martti Vainio, Peter Wittenburg |
| 2009 | Responding to user emotional state by adding emotional coloring to utterances. Jaime C. Acosta, Nigel G. Ward |
| 2009 | Results of the N-Best 2008 Dutch speech recognition evaluation. David A. van Leeuwen, Judith M. Kessens, Eric Sanders, Henk van den Heuvel |
| 2009 | Rhythm measures with language-independent segmentation. Anastassia Loukina, Greg Kochanski, Chilin Shih, Elinor Keane, Ian Watson |
| 2009 | Rich context modeling for high quality HMM-based TTS. Zhi-Jie Yan, Yao Qian, Frank K. Soong |
| 2009 | Robust F0 estimation based on log-time scale autocorrelation and its application to Mandarin tone recognition. Yusuke Kida, Masaru Sakai, Takashi Masuko, Akinori Kawamura |
| 2009 | Robust LTS rules with the Combilex speech technology lexicon. Korin Richmond, Robert A. J. Clark, Susan Fitt |
| 2009 | Robust angry speech detection employing a TEO-based discriminative classifier combination. Wooil Kim, John H. L. Hansen |
| 2009 | Robust audio-based classification of video genre. Mickael Rouvier, Georges Linarès, Driss Matrouf |
| 2009 | Robust audio-visual speech synchrony detection by generalized bimodal linear prediction. Kshitiz Kumar, Jirí Navrátil, Etienne Marcheret, Vit Libal, Gerasimos Potamianos |
| 2009 | Robust dependency parsing for spoken language understanding of spontaneous speech. Frédéric Béchet, Alexis Nasr |
| 2009 | Robust in-car spelling recognition - a tandem BLSTM-HMM approach. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Yang Sun, Tobias Moosmayr, Nhu Nguyen-Thien |
| 2009 | Robust keyword spotting with rapidly adapting point process models. Aren Jansen, Partha Niyogi |
| 2009 | Robust minimal variance distortionless speech power spectra enhancement using order statistic filter for microphone array. Tao Yu, John H. L. Hansen |
| 2009 | Robust speech recognition using VAD-measure-embedded decoder. Tasuku Oonishi, Paul R. Dixon, Koji Iwano, Sadaoki Furui |
| 2009 | Robustness of phase based features for speaker recognition. R. Padmanabhan, Sree Hari Krishnan Parthasarathi, Hema A. Murthy |
| 2009 | Role of natural language understanding in voice local search. Junlan Feng, Srinivas Bangalore, Mazin Gilbert |
| 2009 | Rule-based voice quality variation with formant synthesis. Felix Burkhardt |
| 2009 | SHoUT, the University of Twente submission to the N-Best 2008 speech recognition evaluation for Dutch. Marijn Huijbregts, Roeland Ordelman, Laurens van der Werff, Franciska M. G. de Jong |
| 2009 | STFT-based speech enhancement by reconstructing the harmonics. Iman Haji Abolhassani, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy |
| 2009 | SUXES - user experience evaluation method for spoken and multimodal interaction. Markku Turunen, Jaakko Hakulinen, Aleksi Melto, Tomi Heimonen, Tuuli Laivo, Juho Hella |
| 2009 | Same tone, different category: linguistic-tonetic variation in the areal tone acoustics of Chuqu Wu. William Steed, Phil Rose |
| 2009 | Second language discrimination vowel contrasts by adults speakers with a five vowel system. Bianca Sisinni, Mirko Grimaldi |
| 2009 | Selected topics from 40 years of research on speech and speaker recognition. Sadaoki Furui |
| 2009 | Selection of the best set of shifted delta cepstral features in speaker verification using mutual information. José R. Calvo, Rafael Fernández, Gabriel Hernández |
| 2009 | Self-learning vector quantization for pattern discovery from speech. Okko Johannes Räsänen, Unto K. Laine, Toomas Altosaar |
| 2009 | Self-voice recognition in 4 to 5-year-old children. Sofia Strömbergsson |
| 2009 | Semantic context effects in the recognition of acoustically unreduced and reduced words. Marco van de Ven, Benjamin V. Tucker, Mirjam Ernestus |
| 2009 | Semantic role labeling with discriminative feature selection for spoken language understanding. Chao-Hong Liu, Chung-Hsien Wu |
| 2009 | Sentence-final particles in Hong Kong Cantonese: are they tonal or intonational? Wing Li Wu |
| 2009 | Sentiment classification in English from sentence-level annotations of emotions regarding models of affect. Alexandre Trilla, Francesc Alías |
| 2009 | Sequencing of articulatory gestures using cost optimization. Juraj Simko, Fred Cummins |
| 2009 | Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain. Chanwoo Kim, Kshitiz Kumar, Bhiksha Raj, Richard M. Stern |
| 2009 | Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering. Kyu Jeong Han, Shrikanth S. Narayanan |
| 2009 | Simple physical models of the vocal tract for education in speech science. Takayuki Arai |
| 2009 | Simultaneous estimation of confidence and error cause in speech recognition using discriminative model. Atsunori Ogawa, Atsushi Nakamura |
| 2009 | Singing voice detection in polyphonic music using predominant pitch. Vishweshwara Rao, S. Ramakrishnan, Preeti Rao |
| 2009 | Sliding vocal-tract model and its application for vowel production. Takayuki Arai |
| 2009 | Soft decision-based acoustic echo suppression in a frequency domain. Yun-Sik Park, Ji-Hyun Song, Jae-Hun Choi, Joon-Hyuk Chang |
| 2009 | Speaker adaptation based on two-step active learning. Koichi Shinoda, Hiroko Murakami, Sadaoki Furui |
| 2009 | Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS. Anocha Rugchatjaroen, Nattanun Thatphithakkul, Ananlada Chotimongkol, Ausdang Thangthai, Chai Wutiwiwatchai |
| 2009 | Speaker dependent emotion recognition using prosodic supervectors. Ignacio López-Moreno, Carlos Ortego-Resa, Joaquin Gonzalez-Rodriguez, Daniel Ramos |
| 2009 | Speaker dependent mapping for low bit rate coding of throat microphone speech. Joseph M. Anand, B. Yegnanarayana, Sanjeev Gupta, M. R. Kesheorey |
| 2009 | Speaker diarization for meeting room audio. Hanwu Sun, Tin Lay Nwe, Bin Ma, Haizhou Li |
| 2009 | Speaker diarization using divide-and-conquer. Shih-Sian Cheng, Chun-Han Tseng, Chia-Ping Chen, Hsin-Min Wang |
| 2009 | Speaker discriminability for visual speech modes. Jeesun Kim, Chris Davis, Christian Kroos, Harold Hill |
| 2009 | Speaker identification for whispered speech using modified temporal patterns and MFCCs. Xing Fan, John H. L. Hansen |
| 2009 | Speaker identification using warped MVDR cepstral features. Matthias Wölfel, Qian Yang, Qin Jin, Tanja Schultz |
| 2009 | Speaker normalization for template based speech recognition. Sébastien Demange, Dirk Van Compernolle |
| 2009 | Speaker recognition by Gaussian information bottleneck. Ron M. Hecht, Elad Noor, Naftali Tishby |
| 2009 | Speaker recognition on lossy compressed speech using the Speex codec. A. R. Stauffer, Aaron D. Lawson |
| 2009 | Speaker segmentation and clustering for simultaneously presented speech. Lingyun Gu, Richard M. Stern |
| 2009 | Speaking in the presence of a competing talker. Youyi Lu, Martin Cooke |
| 2009 | Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM. Yusuke Ijima, Takeshi Matsubara, Takashi Nose, Takao Kobayashi |
| 2009 | Spectral and temporal modulation features for phonetic recognition. Stephen A. Zahorian, Hongbing Hu, Zhengqing Chen, Jiang Wu |
| 2009 | Speech enhancement in a 2-dimensional area based on power spectrum estimation of multiple areas with investigation of existence of active sources. Yusuke Hioka, Ken'ichi Furuya, Youichi Haneda, Akitoshi Kataoka |
| 2009 | Speech enhancement minimizing generalized Euclidean distortion using supergaussian priors. Amit Das, John H. L. Hansen |
| 2009 | Speech generation from hand gestures based on space mapping. Aki Kunikoshi, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose |
| 2009 | Speech overlap detection in a two-pass speaker diarization system. Marijn Huijbregts, David A. van Leeuwen, Franciska M. G. de Jong |
| 2009 | Speech rate and pauses in non-native Finnish. Minnaleena Toivola, Mietta Lennes, Eija Aho |
| 2009 | Speech rate effects on European Portuguese nasal vowels. Catarina Oliveira, Paula Martins, António J. S. Teixeira |
| 2009 | Speech rate effects on linguistic change. Alexsandro R. Meireles, Plínio A. Barbosa |
| 2009 | Speech recognition with speech synthesis models by marginalising over decision tree leaves. John Dines, Lakshmi Babu Saheer, Hui Liang |
| 2009 | Speech recordings via the internet: an overview of the VOYS project in Scotland. Catherine Dickie, Felix Schaeffler, Christoph Draxler, Klaus Jänsch |
| 2009 | Speech sample salience analysis for speech cycle detection. Christophe Mertens, Francis Grenez, Jean Schoentgen |
| 2009 | Speech style and speaker recognition: a case study. Marco Grimaldi, Fred Cummins |
| 2009 | Speech synthesis based on the plural unit selection and fusion method using FWF model. Ryo Morinaka, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima |
| 2009 | Speech synthesis without a phone inventory. Matthew P. Aylett, Simon King, Junichi Yamagishi |
| 2009 | Speech-based and multimodal media center for different user groups. Markku Turunen, Jaakko Hakulinen, Aleksi Melto, Juho Hella, Juha-Pekka Rajaniemi, Erno Mäkinen, Jussi Rantala, Tomi Heimonen, Tuuli Laivo, Hannu Soronen, Mervi Hansen, Pellervo Valkama, Toni Miettinen, Roope Raisamo |
| 2009 | SplaSH (spoken language search hawk): integrating time-aligned with text-aligned annotations. Sara Romano, Elvio Cecere, Francesco Cutugno |
| 2009 | Stability and composition of functional synergies for speech movements in children and adults. Hayo Terband, Frits van Brenk, Pascal van Lieshout, Lian Nijland, Ben Maassen |
| 2009 | Standard information from patients: the usefulness of self-evaluation (measured with the French version of the VHI). Lise Crevier-Buchman, Stephanie Borel, Stéphane Hans, Madeleine Menard, Jacqueline Vaissière |
| 2009 | State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis. Yi-Jian Wu, Yoshihiko Nankaku, Keiichi Tokuda |
| 2009 | Static and dynamic modulation spectrum for speech recognition. Sriram Ganapathy, Samuel Thomas, Hynek Hermansky |
| 2009 | Steganographic band width extension for the AMR codec of low-bit-rate modes. Akira Nishimura |
| 2009 | Stereo-input speech recognition using sparseness-based time-frequency masking in a reverberant environment. Yosuke Izumi, Kenta Nishiki, Shinji Watanabe, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama |
| 2009 | Stochastic pronunciation modelling for spoken term detection. Dong Wang, Simon King, Joe Frankel |
| 2009 | Strategies for accelerating the design of dialogue applications using heuristic information from the backend database. Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Javier Macías Guarasa, José Manuel Pardo |
| 2009 | Stream-based context-sensitive phone mapping for cross-lingual speech recognition. Khe Chai Sim, Haizhou Li |
| 2009 | Structural analysis of dialects, sub-dialects and sub-sub-dialects of Chinese. Xuebin Ma, Akira Nemoto, Nobuaki Minematsu, Yu Qiao, Keikichi Hirose |
| 2009 | Structure and annotation of Polish LVCSR speech database. Katarzyna Klessa, Grazyna Demenko |
| 2009 | Studying L2 suprasegmental features in Asian Englishes: a position paper. Helen Meng, Chiu-yu Tseng, Mariko Kondo, Alissa M. Harrison, Tanya Visceglia |
| 2009 | Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments. Xugang Lu, Masashi Unoki, Satoshi Nakamura |
| 2009 | Subjective experiments on influence of response timing in spoken dialogues. Toshihiko Itoh, Norihide Kitaoka, Ryota Nishimura |
| 2009 | Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification. Najim Dehak, Réda Dehak, Patrick Kenny, Niko Brümmer, Pierre Ouellet, Pierre Dumouchel |
| 2009 | Syllable HMM based Mandarin TTS and comparison with concatenative TTS. Zhiwei Shuang, Shiyin Kang, Qin Shi, Yong Qin, Lianhong Cai |
| 2009 | Synthesizing speech from electromyography using voice transformation techniques. Arthur R. Toth, Michael Wand, Tanja Schultz |
| 2009 | System request detection in human conversation based on multi-resolution Gabor wavelet features. Tomoyuki Yamagata, Tetsuya Takiguchi, Yasuo Ariki |
| 2009 | Talking heads for interacting with spoken dialog smart-home systems. Christine Kühnel, Benjamin Weiss, Sebastian Möller |
| 2009 | Tandem representations of spectral envelope and modulation frequency features for ASR. Samuel Thomas, Sriram Ganapathy, Hynek Hermansky |
| 2009 | Target speech GMM-based spectral compensation for noise robust speech recognition. Takahiro Shinozaki, Sadaoki Furui |
| 2009 | Target-aware language models for spoken language recognition. Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng, Kong-Aik Lee |
| 2009 | Techniques for rapid and robust topic identification of conversational telephone speech. Jonathan Wintrode, Scott Kulp |
| 2009 | Technologies for processing body-conducted speech detected with non-audible murmur microphone. Tomoki Toda, Keigo Nakamura, Takayuki Nagai, Tomomi Kaino, Yoshitaka Nakajima, Kiyohiro Shikano |
| 2009 | Temporal modulation processing of speech signals for noise robust ASR. Hong You, Abeer Alwan |
| 2009 | Term-dependent confidence for out-of-vocabulary term detection. Dong Wang, Simon King, Joe Frankel, Peter Bell |
| 2009 | Text-independent speaker identification using vocal tract length normalization for building universal background model. Achintya Kumar Sarkar, Srinivasan Umesh, Shakti Prasad Rath |
| 2009 | Text-independent speaker verification using rank threshold in large number of speaker models. Haruka Okamoto, Satoru Tsuge, Amira Abdelwahab, Masafumi Nishida, Yasuo Horiuchi, Shingo Kuroiwa |
| 2009 | The HMM synthesis algorithm of an embedded unified speech recognizer and synthesizer. Guntram Strecha, Matthias Wolff, Frank Duckhorn, Sören Wittenberg, Constanze Tschöpe |
| 2009 | The INTERSPEECH 2009 emotion challenge. Björn W. Schuller, Stefan Steidl, Anton Batliner |
| 2009 | The MIT Lincoln Laboratory 2008 speaker recognition system. Douglas E. Sturim, William M. Campbell, Zahi N. Karam, Douglas A. Reynolds, Fred S. Richardson |
| 2009 | The MonAMI reminder: a spoken dialogue system for face-to-face interaction. Jonas Beskow, Jens Edlund, Björn Granström, Joakim Gustafson, Gabriel Skantze, Helena Tobiasson |
| 2009 | The RWTH Aachen University open source speech recognition system. David Rybach, Christian Gollan, Georg Heigold, Björn Hoffmeister, Jonas Lööf, Ralf Schlüter, Hermann Ney |
| 2009 | The acoustic characteristics of Russian vowels in children of 6 and 7 years of age. Elena E. Lyakso, Olga V. Frolova, Aleks S. Grigoriev |
| 2009 | The acoustics of Mangetti Dune !Xung clicks. Amanda Miller, Sheena Shah |
| 2009 | The articulatory and acoustic impact of Scottish English /r/ on the preceding vowel-onset. Janine Lilienthal |
| 2009 | The broadcast narrow band speech corpus: a new resource type for large scale language recognition. Christopher Cieri, Linda Brandschain, Abby Neely, David Graff, Kevin Walker, Chris Caruso, Alvin F. Martin, Craig S. Greenberg |
| 2009 | The case for case-based automatic speech recognition. Viktoria Maier, Roger K. Moore |
| 2009 | The dynamic dimension of the global speech-rhythm attributes. Jan Volín, Petr Pollák |
| 2009 | The effect of F0 peak-delay on the L1 / L2 perception of English lexical stress. Shinichi Tokuma, Yi Xu |
| 2009 | The effects of different voices for speech-based in-vehicle interfaces: impact of young and old voices on driving performance and attitude. Ing-Marie Jonsson, Nils Dahlbäck |
| 2009 | The effects of fundamental frequency and formant space on speaker discrimination through bone-conducted ultrasonic hearing. Takayuki Kagomiya, Seiji Nakagawa |
| 2009 | The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts. Sylvain Galliano, Guillaume Gravier, Laura Chaubard |
| 2009 | The KlattGrid speech synthesizer. David Weenink |
| 2009 | The majority wins: a method for combining speaker diarization systems. Marijn Huijbregts, David A. van Leeuwen, Franciska M. G. de Jong |
| 2009 | The monophthongs and diphthongs of north-eastern Welsh: an acoustic study. Robert Mayr, Hannah Davies |
| 2009 | The multi-session audio research project (MARP) corpus: goals, design and initial findings. Aaron D. Lawson, A. R. Stauffer, Edward J. Cupples, Stanley J. Wenndt, W. P. Bray, John J. Grieco |
| 2009 | The phrase-final accent in Kammu: effects of tone, focus and engagement. David House, Anastasia Karlsson, Jan-Olof Svantesson, Damrong Tayanin |
| 2009 | The rhythm of text and the rhythm of utterances: from metrics to models. Daniel Hirst |
| 2009 | The role of age in factor analysis for speaker identification. Yun Lei, John H. L. Hansen |
| 2009 | The role of glottal pulse rate and vocal tract length in the perception of speaker identity. Etienne Gaudrain, Su Li, Vin Shen Ban, Roy D. Patterson |
| 2009 | The roles of reconstruction and lexical storage in the comprehension of regular pronunciation variants. Mirjam Ernestus |
| 2009 | The semi-supervised Switchboard transcription project. Amarnag Subramanya, Jeff A. Bilmes |
| 2009 | The use of telephone speech recordings for assessment and monitoring of cognitive function in elderly people. Viliam Rapcan, Shona D'Arcy, Nils Penard, Ian H. Robertson, Richard B. Reilly |
| 2009 | Thousands of voices for HMM-based speech synthesis. Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian, Rile Hu, Yong Guan, Keiichiro Oura, Keiichi Tokuda, Reima Karhila, Mikko Kurimo |
| 2009 | Three-way laryngeal categorization of Japanese, French, English and Chinese plosives by Korean speakers. Tomohiko Ooigawa, Shigeko Shinohara |
| 2009 | Tied-state multi-path HMnet model using three-domain successive state splitting. Soo-Young Suk, Hiroaki Kojima |
| 2009 | Time-varying autoregressive tests for multiscale speech analysis. Daniel Rudoy, Thomas F. Quatieri, Patrick J. Wolfe |
| 2009 | Tonal alignment in three varieties of Hiberno-English. Raya Kalaldeh, Amelie Dorn, Ailbhe Ní Chasaide |
| 2009 | Tonal articulatory feature for Mandarin and its application to conversational LVCSR. Qingqing Zhang, Jielin Pan, Yonghong Yan |
| 2009 | Topic dependent language model based on topic voting on noun history. Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa |
| 2009 | Towards flexible representations for analysis of accommodation of temporal features in spontaneous dialogue speech. Spyros Kousidis, David Dorran, Ciaran McDonnell, Eugene Coyle |
| 2009 | Towards fusion of feature extraction and acoustic model training: a top down process for robust speech recognition. Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern |
| 2009 | Towards intonation control in unit selection speech synthesis. Cédric Boidin, Olivier Boëffard, Thierry Moudenc, Géraldine Damnati |
| 2009 | Towards robust glottal source modeling. Javier Pérez, Antonio Bonafonte |
| 2009 | Towards unsupervised articulatory resynthesis of German utterances using EMA data. Ingmar Steiner, Korin Richmond |
| 2009 | Towards using hybrid word and fragment units for vocabulary independent LVCSR systems. Ariya Rastrow, Abhinav Sethy, Bhuvana Ramabhadran, Frederick Jelinek |
| 2009 | Transcribing human-directed speech for spoken language processing. Mari Ostendorf |
| 2009 | Transformation-based learning for semantic parsing. Filip Jurcícek, Milica Gasic, Simon Keizer, François Mairesse, Blaise Thomson, Kai Yu, Steve J. Young |
| 2009 | Transforming features to compensate speech recogniser models for noise. Rogier C. van Dalen, Federico Flego, Mark J. F. Gales |
| 2009 | Tree-based estimation of speaker characteristics for speech recognition. Mats Blomberg, Daniel Elenius |
| 2009 | Trimmed KL divergence between Gaussian mixtures for robust unsupervised acoustic anomaly detection. Nash M. Borges, Gerard G. L. Meyer |
| 2009 | Tuning support vector machines for robust phoneme classification with acoustic waveforms. Jibran Yousafzai, Zoran Cvetkovic, Peter Sollich |
| 2009 | Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models. Matthew Gibson |
| 2009 | Two-wire nuisance attribute projection. Yosef A. Solewicz, Hagai Aronowitz |
| 2009 | Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems. Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda |
| 2009 | UBM-based sequence kernel for speaker recognition. Zhenchun Lei |
| 2009 | Ultra low bit-rate speech coding based on unit-selection with joint spectral-residual quantization: no transmission of any residual information. V. Ramasubramanian, D. Harish |
| 2009 | Understanding speaker-listener interactions. Dirk Heylen |
| 2009 | Unit selection based speech synthesis for poor channel condition. Ling Cen, Minghui Dong, Paul Y. Chan, Haizhou Li |
| 2009 | Universal access: speech recognition for talkers with spastic dysarthria. Harsh Vardhan Sharma, Mark Hasegawa-Johnson |
| 2009 | Universidade de Aveiro's voice evaluation protocol. Luis M. T. Jesus, Anna Barney, Ricardo Santos, Janine Caetano, Juliana Jorge, Pedro Sá-Couto |
| 2009 | Unsupervised estimation of the language model scaling factor. Christopher M. White, Ariya Rastrow, Sanjeev Khudanpur, Frederick Jelinek |
| 2009 | Unsupervised lattice-based acoustic model adaptation for speaker-dependent conversational telephone speech transcription. Kishan Thambiratnam, Frank Seide |
| 2009 | Unsupervised training of an HMM-based speech recognizer for topic classification. Herbert Gish, Man-Hung Siu, Arthur Chan, William Belfield |
| 2009 | Unsupervised training scheme with non-stereo data for empirical feature vector compensation. Luis Buera, Antonio Miguel, Alfonso Ortega, Eduardo Lleida, Richard M. Stern |
| 2009 | Usability study of VUI consistent with GUI focusing on age-groups. Jun Okamoto, Tomoyuki Kato, Makoto Shozakai |
| 2009 | Use of contexts in language model interpolation and adaptation. Xunying Liu, Mark J. F. Gales, Philip C. Woodland |
| 2009 | Use of harmonic phase information for polarity detection in speech signals. Ibon Saratxaga, Daniel Erro, Inmaculada Hernáez, Iñaki Sainz, Eva Navas |
| 2009 | Using VTLN matrices for rapid and computationally-efficient speaker adaptation with robustness to first-pass transcription errors. Shakti Prasad Rath, Srinivasan Umesh, Achintya Kumar Sarkar |
| 2009 | Using dialogue-based dynamic language models for improving speech recognition. Juan Manuel Lucas-Cuesta, Fernando Fernández Martínez, Javier Ferreiros |
| 2009 | Using durational cues in a computational model of spoken-word recognition. Odette Scharenborg |
| 2009 | Using graphical models for mixed-initiative dialog management systems with realtime policies. Stefan Schwärzler, Stefan Maier, Joachim Schenk, Frank Wallhoff, Gerhard Rigoll |
| 2009 | Using location cues to track speaker changes from mobile, binaural microphones. Heidi Christensen, Jon Barker |
| 2009 | Using parallel architectures in speech recognition. Patrick Cardinal, Pierre Dumouchel, Gilles Boulianne |
| 2009 | Using prosody and phonotactics in Arabic dialect identification. Fadi Biadsy, Julia Hirschberg |
| 2009 | Using responsive prosodic variation to acknowledge the user's current state. Nigel G. Ward, Rafael Escalante-Ruiz |
| 2009 | Using same-language machine translation to create alternative target sequences for text-to-speech synthesis. Peter Cahill, Jinhua Du, Andy Way, Julie Carson-Berndsen |
| 2009 | Using sensor orientation information for computational head stabilisation in 3D electromagnetic articulography (EMA). Christian Kroos |
| 2009 | Using syntax in large-scale audio document translation. Jing Zheng, Necip Fazil Ayan, Wen Wang, David Burkett |
| 2009 | Variability and stability in collaborative dialogues: turn-taking and filled pauses. Stefan Benus |
| 2009 | Variability compensated support vector machines applied to speaker verification. Zahi N. Karam, William M. Campbell |
| 2009 | Variational dynamic kernels for speaker verification. Chris Longworth, Rogier C. van Dalen, Mark J. F. Gales |
| 2009 | Variational loopy belief propagation for multi-talker speech recognition. Steven J. Rennie, John R. Hershey, Peder A. Olsen |
| 2009 | Variational model composition for robust speech recognition with time-varying background noise. Wooil Kim, John H. L. Hansen |
| 2009 | Very large vocabulary voice dictation for mobile devices. Jan Nouza, Petr Cerva, Jindrich Zdánský |
| 2009 | Virtual speech reading support for hard of hearing in a domestic multi-media setting. Samer Al Moubayed, Jonas Beskow, Anne-Marie Öster, Giampiero Salvi, Björn Granström, Nic van Son, Ellen Ormel |
| 2009 | Visuo-phonetic decoding using multi-stream and context-dependent models for an ultrasound-based silent speech interface. Thomas Hueber, Elie-Laurent Benaroya, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone |
| 2009 | Vocabulary expansion through automatic abbreviation generation for Chinese voice search. Dong Yang, Yi-Cheng Pan, Sadaoki Furui |
| 2009 | Vocalic sandwich, a unit designed for unit selection TTS. Didier Cadic, Cédric Boidin, Christophe d'Alessandro |
| 2009 | Voice activity detection using partially observable Markov decision process. Chiyoun Park, Namhoon Kim, Jeongmi Cho |
| 2009 | Voice activity detection using singular value decomposition-based filter. Hwa Jeon Song, Sung Min Ban, Hyung Soon Kim |
| 2009 | Voice conversion using k-histograms and frame selection. Alejandro José Uriz, Pablo Daniel Agüero, Antonio Bonafonte, Juan Carlos Tulli |
| 2009 | Voice morphing based on interpolation of vocal tract area functions using AR-HMM analysis of speech. Yoshiki Nambu, Masahiko Mikawa, Kazuyo Tanaka |
| 2009 | Voice production model employing an interactive boundary-layer analysis of glottal flow. Tokihiko Kaburagi, Katsunori Daimo, Shogo Nakamura |
| 2009 | Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling. Jón Guðnason, Mark R. P. Thomas, Patrick A. Naylor, Daniel P. W. Ellis |
| 2009 | Voiced/unvoiced decision algorithm for HMM-based speech synthesis. Shiyin Kang, Zhiwei Shuang, Quansheng Duan, Yong Qin, Lianhong Cai |
| 2009 | Voicing profile of Polish sonorants: [r] in obstruent clusters. Jagoda Sieczkowska, Bernd Möbius, Antje Schweitzer, Michael Walsh, Grzegorz Dogil |
| 2009 | Vowel category perception affected by microdurational variations. Einar Meister, Stefan Werner |
| 2009 | Vowel duration in pre-geminate contexts in Polish. Zofia Malisz |
| 2009 | Watermark recovery from speech using inverse filtering and sign correlation. Robert Morris, Ralph Johnson, Vladimir Goncharoff, Joseph DiVita |
| 2009 | Wavelet-based speaker change detection in single channel speech data. Michael Wiesenegger, Franz Pernkopf |
| 2009 | Weighted linear prediction for speech analysis in noisy conditions. Jouni Pohjalainen, Heikki Kallasjoki, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku |
| 2009 | Weighted neural network ensemble models for speech prosody control. Harald Romsdorfer |
| 2009 | What's in an ontology for spoken language understanding. Silvia Quarteroni, Giuseppe Riccardi, Marco Dinarelli |
| 2009 | Why would aspiration lower the pitch of the following vowel? observations from Leng-shui-jiang Chinese. Caicai Zhang |
| 2009 | Within-session variability modelling for factor analysis speaker verification. Robbie Vogt, Jason W. Pelecanos, Nicolas Scheffer, Sachin S. Kajarekar, Sridha Sridharan |
| 2009 | Word confidence using duration models. Stefano Scanzio, Pietro Laface, Daniele Colibro, Roberto Gemello |
| 2009 | Word stress assessment for computer aided language learning. Juan Pablo Arias, Néstor Becerra Yoma, Hiram Vivanco |
| 2009 | Word-final [t]-deletion: an analysis on the segmental and sub-segmental level. Barbara Schuppler, Wim A. van Dommelen, Jacques C. Koreman, Mirjam Ernestus |
| 2009 | XTrans: a speech annotation and transcription tool. Meghan Lammie Glenn, Stephanie M. Strassel, Haejoong Lee |
| 2009 | ZZT-domain immiscibility of the opening and closing phases of the LF GFM under frame length variations. Christian Fischer Pedersen, Ove Andersen, Paul Dalsgaard |