| 2010 | "flat pitch accents" in Czech. Tomás Dubeda |
| 2010 | 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, Chiba, Japan, September 26-30, 2010. Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura |
| 2010 | 2010, a speech oddity: phonetic transcription of reversed speech. François Pellegrino, Emmanuel Ferragne, Fanny Meunier |
| 2010 | A Bayesian approach to voice activity detection using multiple statistical models and discriminative training. Tao Yu, John H. L. Hansen |
| 2010 | A DOA estimation algorithm based on equalization-cancellation theory. Duc Thanh Chau, Junfeng Li, Masato Akagi |
| 2010 | A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation. Ramón Fernandez Astudillo, Reinhold Orglmeister |
| 2010 | A blind signal-to-noise ratio estimator for high noise speech recordings. Charles Mercier, Roch Lefebvre |
| 2010 | A classifier-based target cost for unit selection speech synthesis trained on perceptual data. Volker Strom, Simon King |
| 2010 | A cluster-profile representation of emotion using agglomerative hierarchical clustering. Emily Mower, Kyu Jeong Han, Sungbok Lee, Shrikanth S. Narayanan |
| 2010 | A comparative large scale study of MLP features for Mandarin ASR. Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri, Wen Wang |
| 2010 | A comparative study of constrained and unconstrained approaches for segmentation of speech signal. Venkatesh Keri, Kishore Prahallad |
| 2010 | A comparative study of noise estimation algorithms for VTS-based robust speech recognition. Yong Zhao, Biing-Hwang Juang |
| 2010 | A comparison of pronunciation modeling approaches for HMM-TTS. Gabriel Webster, Sacha Krstulovic, Kate M. Knill |
| 2010 | A corpus-based approach to speech enhancement from nonstationary noise. Ji Ming, Ramji Srinivasan, Danny Crookes |
| 2010 | A discriminative performance metric for GMM-UBM speaker identification. Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li |
| 2010 | A discriminative splitting criterion for phonetic decision trees. Simon Wiesler, Georg Heigold, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney |
| 2010 | A duration modeling technique with incremental speech rate normalization. Hiroshi Fujimura, Takashi Masuko, Mitsuyoshi Tachimori |
| 2010 | A factorial sparse coder model for single channel source separation. Robert Peharz, Michael Stark, Franz Pernkopf, Yannis Stylianou |
| 2010 | A fast implementation of factor analysis for speaker verification. Qingsong Liu, Wei Huang, Dongxing Xu, Hongbin Cai, Beiqian Dai |
| 2010 | A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data. Taras Butko, Climent Nadeu |
| 2010 | A fast query by humming system based on notes. Jingzhou Yang, Jia Liu, Weiqiang Zhang |
| 2010 | A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computation. Konstantin Biatov |
| 2010 | A feature extraction method for automatic speech recognition based on the cochlear nucleus. Serajul Haque, Roberto Togneri |
| 2010 | A hierarchical F0 modeling method for HMM-based speech synthesis. Ming Lei, Yi-Jian Wu, Frank K. Soong, Zhen-Hua Ling, Li-Rong Dai |
| 2010 | A hybrid approach to online speaker diarization. Carlos Vaquero, Oriol Vinyals, Gerald Friedland |
| 2010 | A hybrid approach to robust word lattice generation via acoustic-based word detection. Icksang Han, Chiyoun Park, Jeongmi Cho, Jeongsu Kim |
| 2010 | A hybrid architecture for mobile voice user interfaces. Imre Kiss, Joseph Polifroni, Chao Wang, Ghinwa F. Choueiter, Mike Phillips |
| 2010 | A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor. Chang Huai You, Haizhou Li, Kong-Aik Lee |
| 2010 | A language-identification inspired method for spontaneous speech detection. Mickael Rouvier, Richard Dufour, Georges Linarès, Yannick Estève |
| 2010 | A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcripts. Sebastian Tschöpel, Daniel Schneider |
| 2010 | A longest matching segment approach for text-independent speaker recognition. Ayeh Jafari, Ramji Srinivasan, Danny Crookes, Ji Ming |
| 2010 | A maximum a posteriori sound source localization in reverberant and noisy conditions. Jinho Choi, Chang D. Yoo |
| 2010 | A minimum classification error approach to pronunciation variation modeling of non-native proper names. Line Adde, Bert Réveil, Jean-Pierre Martens, Torbjørn Svendsen |
| 2010 | A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion. Xiaodan Zhuang, Lijuan Wang, Frank K. Soong, Mark Hasegawa-Johnson |
| 2010 | A modified parameterization of the Fujisaki model. Robert Schubert, Oliver Jokisch, Diane Hirschfeld |
| 2010 | A multidomain approach for automatic home environmental sound classification. Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis |
| 2010 | A multimodal density function estimation approach to formant tracking. Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas |
| 2010 | A multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks. José L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González |
| 2010 | A multistream multiresolution framework for phoneme recognition. Nima Mesgarani, Samuel Thomas, Hynek Hermansky |
| 2010 | A new VAD framework using statistical model and human knowledge based empirical rule. Ji Wu, Xiao-Lei Zhang, Wei Li |
| 2010 | A new approach for automatic tone error detection in strong accented Mandarin based on dominant set. Taotao Zhu, Dengfeng Ke, Zhenbiao Chen, Bo Xu |
| 2010 | A new binary mask based on noise constraints for improved speech intelligibility. Gibak Kim, Philipos C. Loizou |
| 2010 | A new multichannel multi modal dyadic interaction database. Viktor Rozgic, Bo Xiao, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2010 | A novel approach for matched reverberant training of HMMs using data pairs. Armin Sehr, Christian Hofmann, Roland Maas, Walter Kellermann |
| 2010 | A novel confidence measure based on marginalization of jointly estimated error cause probabilities. Atsunori Ogawa, Atsushi Nakamura |
| 2010 | A novel feature extraction strategy for multi-stream robust emotion identification. Gang Liu, Yun Lei, John H. L. Hansen |
| 2010 | A novel hybrid approach for Mandarin speech synthesis. Shifeng Pan, Meng Zhang, Jianhua Tao |
| 2010 | A novel path extension framework using steady segment detection for Mandarin speech recognition. Zhanlei Yang, Wenju Liu |
| 2010 | A novel speaker binary key derived from anchor models. Xavier Anguera, Jean-François Bonastre |
| 2010 | A novel text-independent phonetic segmentation algorithm based on the microcanonical multiscale formalism. Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein M. Yahia |
| 2010 | A particle filter feature compensation approach to robust speech recognition. Aleem Mushtaq, Yu Tsao, Chin-Hui Lee |
| 2010 | A perceptual study of acceleration parameters in HMM-based TTS. Yining Chen, Zhi-Jie Yan, Frank K. Soong |
| 2010 | A phoneme recognition framework based on auditory spectro-temporal receptive fields. Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky |
| 2010 | A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identity. Kayoko Yanagisawa, Mark A. Huckvale |
| 2010 | A procedure for estimating gestural scores from natural speech. Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson |
| 2010 | A quick sequential forward floating feature selection algorithm for emotion detection from speech. Mátyás Brendel, Riccardo Zaccarelli, Laurence Devillers |
| 2010 | A regularized discriminative training method of acoustic models derived by minimum relative entropy discrimination. Yotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi |
| 2010 | A robust audio-visual speech recognition using audio-visual voice activity detection. Satoshi Tamura, Masato Ishikawa, Takashi Hashiba, Shin'ichi Takeuchi, Satoru Hayamizu |
| 2010 | A robust speech recognition system against the ego noise of a robot. Gökhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura |
| 2010 | A rule-based backchannel prediction model using pitch and pause information. Khiet P. Truong, Ronald Poppe, Dirk Heylen |
| 2010 | A segment-based non-parametric approach for monophone recognition. Ladan Golipour, Douglas D. O'Shaughnessy |
| 2010 | A semi-supervised cluster-and-label approach for utterance classification. Amparo Albalate, Aparna Suchindranath, David Suendermann, Wolfgang Minker |
| 2010 | A singing style modeling system for singing voice synthesizers. Keijiro Saino, Makoto Tachibana, Hideki Kenmochi |
| 2010 | A spectral LF model based approach to voice source parameterisation. John Kane, Mark Kane, Christer Gobl |
| 2010 | A speech-in-noise test based on spoken digits: comparison of normal and impaired listeners using a computer model. Matthew Robertson, Guy J. Brown, Wendy Lecluyse, Manasa Panda, Christine M. Tan |
| 2010 | A spoken term detection framework for recovering out-of-vocabulary words using the web. Carolina Parada, Abhinav Sethy, Mark Dredze, Frederick Jelinek |
| 2010 | A statistical segment-based approach for spoken language understanding. Lucía Ortega, Isabel Galiano, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra |
| 2010 | A stochastic finite-state transducer approach to spoken dialog management. Lluís F. Hurtado, Joaquin Planells, Encarna Segarra, Emilio Sanchis, David Griol |
| 2010 | A study of interplay between articulatory movement and prosodic characteristics in emotional speech production. Jangwon Kim, Sungbok Lee, Shrikanth S. Narayanan |
| 2010 | A study of intra-speaker and inter-speaker affective variability using electroglottograph and inverse filtered glottal waveforms. Daniel Bone, Samuel Kim, Sungbok Lee, Shrikanth S. Narayanan |
| 2010 | A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR. Guangchuan Shi, Yu Shi, Qiang Huo |
| 2010 | A study of term weighting in phonotactic approach to spoken language recognition. Sirinoot Boonsuk, Donglai Zhu, Bin Ma, Atiwong Suchato, Proadpran Punyabukkana, Nattanun Thatphithakkul, Chai Wutiwiwatchai |
| 2010 | A super-resolution spectrogram using coupled PLCA. Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel |
| 2010 | A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification. Chi-Sang Jung, Kyu Jeong Han, Hyunson Seo, Shrikanth S. Narayanan, Hong-Goo Kang |
| 2010 | Accelerating hierarchical acoustic likelihood computation on graphics processors. Pavel Kveton, Miroslav Novak |
| 2010 | Accurate pitch marking for prosodic modification of speech segments. Thomas Ewender, Beat Pfister |
| 2010 | Acoustic analysis of intonation in Parkinson's disease. Joan K. Y. Ma, Rüdiger Hoffmann |
| 2010 | Acoustic correlates of meaning structure in conversational speech. Alexei V. Ivanov, Giuseppe Riccardi, Sucheta Ghosh, Sara Tonelli, Evgeny A. Stepanov |
| 2010 | Acoustic correlates of voice quality improvement by voice training. Kiyoaki Aikawa, Junko Uenuma, Tomoko Akitake |
| 2010 | Acoustic feature analysis in speech emotion primitives estimation. Dongrui Wu, Thomas D. Parsons, Shrikanth S. Narayanan |
| 2010 | Acoustic feature diversity and speaker verification. R. Padmanabhan, Hema A. Murthy |
| 2010 | Acoustic modeling with bootstrap and restructuring for low-resourced languages. Xiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou |
| 2010 | Acoustic vector resampling for GMMSVM-based speaker verification. Man-Wai Mak, Wei Rao |
| 2010 | Acoustic-based recognition of head gestures accompanying speech. Akira Sasou, Yasuharu Hashimoto, Katsuhiko Sakaue |
| 2010 | Acoustic-to-articulatory inversion based on local regression. Samer Al Moubayed, Gopal Ananthakrishnan |
| 2010 | Acoustics-based phonetic transcription method for proper nouns. Antoine Laurent, Sylvain Meignier, Téva Merlin, Paul Deléglise |
| 2010 | Active appearance models for photorealistic visual speech synthesis. Wesley Mattheyses, Lukas Latacz, Werner Verhelst |
| 2010 | Active word learning under uncertain input conditions. Maarten Versteegh, Louis ten Bosch, Lou Boves |
| 2010 | Adaptation of a tongue shape model by local feature transformations. Chao Qin, Miguel Á. Carreira-Perpiñán, Mohsen Farhadloo |
| 2010 | Adapting a duration synthesis model to rate children's oral reading prosody. Minh Duong, Jack Mostow |
| 2010 | Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environments. Mark C. Huggins, Brett Y. Smolenski, Aaron D. Lawson |
| 2010 | Adaptive voice-quality control based on one-to-many eigenvoice conversion. Kumi Ohta, Tomoki Toda, Yamato Ohtani, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2010 | Advanced speech communication system for deaf people. Rubén San Segundo, Verónica López-Ludeña, Raquel Martín, Syaheerah L. Lutfi, Javier Ferreiros, Ricardo de Córdoba, José Manuel Pardo |
| 2010 | Advances in fast multistream diarization based on the information bottleneck framework. Deepu Vijayasenan, Fabio Valente, Hervé Bourlard |
| 2010 | Affective story teller: a TTS system for emotional expressivity. Shaikh Mostafa Al Masum, Antonio Rui Ferreira Rebordão, Keikichi Hirose |
| 2010 | Age and gender classification from speech using decision level fusion and ensemble based techniques. Florian Lingenfelser, Johannes Wagner, Thurid Vogt, Jonghwa Kim, Elisabeth André |
| 2010 | Age and gender classification using fusion of acoustic and prosodic features. Hugo Meinedo, Isabel Trancoso |
| 2010 | Age and gender recognition based on multiple systems - early vs. late fusion. Tobias Bocklet, Georg Stemmer, Viktor Zeißler, Elmar Nöth |
| 2010 | Age recognition based on speech signals using weights supervector. Royi Porat, Dan Lange, Yaniv Zigel |
| 2010 | An HMM trajectory tiling (HTT) approach to high quality TTS. Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Xin Zhuang, Shengyi Kong |
| 2010 | An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation. Hui Liang, John Dines |
| 2010 | An analysis of sparseness and regularization in exemplar-based methods for speech classification. Dimitri Kanevsky, Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo |
| 2010 | An analytic modeling approach to enhancing throat microphone speech commands for keyword spotting. Jun Cai, Stefano Marini, Pierre Malarme, Francis Grenez, Jean Schoentgen |
| 2010 | An auditory based modulation spectral feature for reverberant speech recognition. Hari Krishna Maganti, Marco Matassoni |
| 2010 | An effect of formant amplitude in vowel perception. Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano |
| 2010 | An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation. Wooil Kim, Jun-Won Suh, John H. L. Hansen |
| 2010 | An empirical comparison of the t Josef R. Novak, Paul R. Dixon, Sadaoki Furui |
| 2010 | An exploration of voice source correlates of focus. Irena Yanushevskaya, Christer Gobl, John Kane, Ailbhe Ní Chasaide |
| 2010 | An implementation of decision tree-based context clustering on graphics processing units. Nicholas Pilkington, Heiga Zen |
| 2010 | An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models. Kyu Jeong Han, Shrikanth S. Narayanan |
| 2010 | An improved wavelet-based dereverberation for robust automatic speech recognition. Randy Gomez, Tatsuya Kawahara |
| 2010 | An integrated top-down/bottom-up approach to speaker diarization. Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Dong Wang, Raphaël Troncy |
| 2010 | An intonation model for TTS in Sepedi. Daniel R. van Niekerk, Etienne Barnard |
| 2010 | An intrusive super-wideband speech quality model: DIAL. Nicolas Côté, Vincent Koehl, Valérie Gautier-Turbin, Alexander Raake, Sebastian Möller |
| 2010 | An investigation into direct scoring methods without SVM training in speaker verification. Ce Zhang, Rong Zheng, Bo Xu |
| 2010 | An investigation of formant frequencies for cognitive load classification. Tet Fei Yap, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi |
| 2010 | An unsupervised approach to creating web audio contents-based HMM voices. Jinfu Ni, Hisashi Kawai |
| 2010 | Analysis and detection of cognitive load and frustration in drivers' speech. Hynek Boril, Seyed Omid Sadjadi, Tristan Kleinschmidt, John H. L. Hansen |
| 2010 | Analysis of excitation source information in emotional speech. S. R. Mahadeva Prasanna, D. Govind |
| 2010 | Analysis of gender normalization using MLP and VTLN features. Thomas Schaaf, Florian Metze |
| 2010 | Analytical assessment and distance modeling of speech transmission quality. Marcel Wältermann, Alexander Raake, Sebastian Möller |
| 2010 | Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracy. Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno |
| 2010 | Applying geometric source separation for improved pitch extraction in human-robot interaction. Martin Heckmann, Claudius Gläser, Frank Joublin, Kazuhiro Nakadai |
| 2010 | Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech. Wei Zhang, Xiaodong Cui |
| 2010 | Applying voice conversion to concatenative singing-voice synthesis. Fernando Villavicencio, Jordi Bonada |
| 2010 | Approaching human listener accuracy with modern speaker verification. Ville Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li |
| 2010 | Articulatory grounding of southern Salentino harmony processes. Mirko Grimaldi, Andrea Calabrese, Francesco Sigona, Luigia Garrapa, Bianca Sisinni |
| 2010 | Articulatory inversion of American English /ɹ/ by conditional density modes. Chao Qin, Miguel Á. Carreira-Perpiñán |
| 2010 | Articulatory synthesis and perception of plosive-vowel syllables with virtual consonant targets. Peter Birkholz, Bernd J. Kröger, Christiane Neuschaefer-Rube |
| 2010 | Articulatory-functional modeling of speech prosody: a review. Yi Xu, Santitham Prom-on |
| 2010 | Artificial and online acquired noise dictionaries for noise robust ASR. Jort F. Gemmeke, Tuomas Virtanen |
| 2010 | Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions. Seyed Omid Sadjadi, John H. L. Hansen |
| 2010 | Assessment of spoken and multimodal applications: lessons learned from laboratory and field studies. Markku Turunen, Jaakko Hakulinen, Tomi Heimonen |
| 2010 | Asymptotically exact noise-corrupted speech likelihoods. Rogier C. van Dalen, Mark J. F. Gales |
| 2010 | Audio analytics by template modeling and 1-pass DP based decoding. Srikanth Cherla, V. Ramasubramanian |
| 2010 | Audio-based sports highlight detection by Fourier local auto-correlations. Jiaxing Ye, Takumi Kobayashi, Tetsuya Higuchi |
| 2010 | Audio-visual anticipatory coarticulation modeling by human and machine. Louis H. Terry, Karen Livescu, Janet B. Pierrehumbert, Aggelos K. Katsaggelos |
| 2010 | Audio-visual synchronisation for speaker diarisation. Giulia Garau, Alfred Dielmann, Hervé Bourlard |
| 2010 | Audiovisual congruence and pragmatic focus marking. Charlotte Wollermann, Bernhard Schröder, Ulrich Schade |
| 2010 | Augmentation of adaptation data. Ravichander Vipperla, Steve Renals, Joe Frankel |
| 2010 | Augmented context features for Arabic speech recognition. Ahmad Emami, Hong-Kwang Jeff Kuo, Imed Zitouni, Lidia Mangu |
| 2010 | Augmented set of features for confidence estimation in spoken term detection. Javier Tejedor, Doroteo T. Toledano, Miguel Bautista, Simon King, Dong Wang, José Colás |
| 2010 | AutoBI - a tool for automatic ToBI annotation. Andrew Rosenberg |
| 2010 | Autocorrelation and double autocorrelation based spectral representations for a noisy word recognition system. Tetsuya Shimamura, Ngoc Dinh Nguyen |
| 2010 | Automated vocal emotion recognition using phoneme class specific features. Géza Kiss, Jan P. H. van Santen |
| 2010 | Automatic analysis of the intonation of a tone language. Applying the Momel algorithm to spontaneous Standard Chinese (Beijing). Na Zhi, Daniel Hirst, Pier Marco Bertinetto |
| 2010 | Automatic classification of married couples' behavior using audio features. Matthew Black, Athanasios Katsamanis, Chi-Chun Lee, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2010 | Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system. Wai Kit Lo, Shuang Zhang, Helen M. Meng |
| 2010 | Automatic detection of abnormal stress patterns in unit selection synthesis. Yeon-Jun Kim, Marc C. Beutnagel |
| 2010 | Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram. Sunao Hara, Norihide Kitaoka, Kazuya Takeda |
| 2010 | Automatic discriminative measurement of voice onset time. Morgan Sonderegger, Joseph Keshet |
| 2010 | Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier. Heng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, Ren-Hua Wang |
| 2010 | Automatic estimation of transcription accuracy and difficulty. Brandon Roy, Soroush Vosoughi, Deb Roy |
| 2010 | Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniques. Kuniaki Hirabayashi, Seiichi Nakagawa |
| 2010 | Automatic excitement-level detection for sports highlights generation. Hynek Boril, Abhijeet Sangwan, Taufiq Hasan, John H. L. Hansen |
| 2010 | Automatic perceptual categorization of disordered connected speech. Ali Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez |
| 2010 | Automatic pronunciation scoring using learning to rank and DP-based score segmentation. Liang-Yu Chen, Jyh-Shing Roger Jang |
| 2010 | Automatic reference independent evaluation of prosody quality using multiple knowledge fusions. Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu |
| 2010 | Automatic selection of thresholds for signal separation algorithms based on interaural delay. Chanwoo Kim, Richard M. Stern, Kiwan Eom, Jaewon Lee |
| 2010 | Automatic speaker age and gender recognition in the car for tailoring dialog and mobile services. Michael Feld, Felix Burkhardt, Christian A. Müller |
| 2010 | Automatic speech recognition for assistive writing in speech supplemented word prediction. John-Paul Hosom, Tom Jakobs, Allen Baker, Susan Fager |
| 2010 | Automatic speech recognition of multiple accented English data. Dimitra Vergyri, Lori Lamel, Jean-Luc Gauvain |
| 2010 | Automatic speech recognition system channel modeling. Qun Feng Tan, Kartik Audhkhasi, Panayiotis G. Georgiou, Emil Ettelaie, Shrikanth S. Narayanan |
| 2010 | Automatic turn segmentation in spoken conversations. Alexei V. Ivanov, Giuseppe Riccardi |
| 2010 | Autoregressive clustering for HMM speech synthesis. Matt Shannon, William Byrne |
| 2010 | Autoregressive modelling for linear prediction of ultrasonic speech. Farzaneh Ahmadi, Ian Vince McLoughlin, Hamid R. Sharifzadeh |
| 2010 | Bandwidth expansion of speech based on wavelet transform modulus maxima vector mapping. Zhe Chen, You-Chi Cheng, Fuliang Yin, Chin-Hui Lee |
| 2010 | Bayes factor based speaker segmentation for speaker diarization. David Wang, Robert Vogt, Sridha Sridharan |
| 2010 | Bayesian speaker recognition using Gaussian mixture model and Laplace approximation. Shih-Sian Cheng, I-Fan Chen, Hsin-Min Wang |
| 2010 | Beyond sentence prosody. Chiu-yu Tseng |
| 2010 | Bias considerations for minimum subspace noise tracking. Mahdi Triki, Kees Janse |
| 2010 | Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancement. Qingju Liu, Wenwu Wang, Philip J. B. Jackson |
| 2010 | Binary coding of speech spectrograms using a deep auto-encoder. Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton |
| 2010 | Boosted mixture learning of Gaussian mixture HMMs for speech recognition. Jun Du, Yu Hu, Hui Jiang |
| 2010 | Boosting systems for LVCSR. George Saon, Hagen Soltau |
| 2010 | Brazilian Portuguese acoustic model training based on data borrowing from other language. Kazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura |
| 2010 | Brno University of Technology system for Interspeech 2010 Paralinguistic Challenge. Marcel Kockmann, Lukás Burget, Jan Cernocký |
| 2010 | Building transcribed speech corpora quickly and cheaply for many languages. Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, Mike LeBeau |
| 2010 | CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language. Jingli Lu, Ruili Wang, Liyanage C. De Silva, Yang Gao, Jia Liu |
| 2010 | CRF-based combination of contextual features to improve a posteriori word-level confidence measures. Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier, Patrick Gros |
| 2010 | CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detection. Dong Wang, Simon King, Nicholas W. D. Evans, Raphaël Troncy |
| 2010 | Can conversational word usage be used to predict speaker demographics? Dan Gillick |
| 2010 | Can tongue be recovered from face? The answer of data-driven statistical models. Atef Ben Youssef, Pierre Badin, Gérard Bailly |
| 2010 | Canonical state models for automatic speech recognition. Mark J. F. Gales, Kai Yu |
| 2010 | Cantonese tone word learning by tone and non-tone language speakers. Angela Cooper, Yue Wang |
| 2010 | Catalog-based single-channel speech-music separation. Cemil Demir, A. Taylan Cemgil, Murat Saraclar |
| 2010 | Challenging the speech intelligibility index: macroscopic vs. microscopic prediction of sentence recognition in normal and hearing-impaired listeners. Tim Jürgens, Stefan Fredelake, Ralf M. Meyer, Birger Kollmeier, Thomas Brand |
| 2010 | Changes in temporal processing of speech across the adult lifespan. Diane Kewley-Port, Larry E. Humes, Daniel Fogerty |
| 2010 | Channel detectors for system fusion in the context of NIST LRE 2009. Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert |
| 2010 | Chirp complex cepstrum-based decomposition for asynchronous glottal analysis. Thomas Drugman, Thierry Dutoit |
| 2010 | Classifying dialog acts in human-human and human-machine spoken conversations. Silvia Quarteroni, Giuseppe Riccardi |
| 2010 | Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures. Tatsuya Kawahara, Norihiro Katsumaru, Yuya Akita, Shinsuke Mori |
| 2010 | Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface. Jani Even, Carlos Toshinori Ishi, Hiroshi Saruwatari, Norihiro Hagita |
| 2010 | Cluster analysis of differential spectral envelopes on emotional speech. Giampiero Salvi, Fabio Tesser, Enrico Zovato, Piero Cosi |
| 2010 | Cluster-based language model for spoken document retrieval using NMF-based document clustering. Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura |
| 2010 | Combination of probabilistic and possibilistic language models. Stanislas Oger, Vladimir Popescu, Georges Linarès |
| 2010 | Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression. Sha Meng, Weiqiang Zhang, Jia Liu |
| 2010 | Combining five acoustic level modeling methods for automatic speaker age and gender recognition. Ming Li, Chi-Sang Jung, Kyu Jeong Han |
| 2010 | Combining many alignments for speech to speech translation. Sameer Maskey, Steven J. Rennie, Bowen Zhou |
| 2010 | Combining monaural and binaural evidence for reverberant speech segregation. John Woodruff, Rohit Prabhavalkar, Eric Fosler-Lussier, DeLiang Wang |
| 2010 | Combining text categorization and dialog modeling for speaker role identification on call center conversations. Rémi Lavalley, Chloé Clavel, Patrice Bellot, Marc El-Bèze |
| 2010 | Combining user intention and error modeling for statistical dialog simulators. Silvia Quarteroni, Meritxell González, Giuseppe Riccardi, Sebastian Varges |
| 2010 | Combining word-based features, statistical language models, and parsing for named entity recognition. Joseph Polifroni, Stephanie Seneff |
| 2010 | Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activity. Nick Campbell, Stefan Scherer |
| 2010 | Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish. Martine Adda-Decker, Lori Lamel, Natalie D. Snoeren |
| 2010 | Comparison of HMM and TMDN methods for lip synchronisation. Gregor Hofer, Korin Richmond |
| 2010 | Comparison of approaches for instrumentally predicting the quality of text-to-speech systems. Sebastian Möller, Florian Hinterleitner, Tiago H. Falk, Tim Polzehl |
| 2010 | Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems. Bo Li, Khe Chai Sim |
| 2010 | Comparison of methods for topic classification in a speech-oriented guidance system. Rafael Torres, Shota Takeuchi, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2010 | Competition in the perception of spoken Japanese words. Takashi Otake, James M. McQueen, Anne Cutler |
| 2010 | Concurrent speaker localization using multi-band position-pitch (m-popi) algorithm with spectro-temporal pre-processing. Tania Habib, Harald Romsdorfer |
| 2010 | Conditional models for detecting lambda-functions in a spoken language understanding system. Frédéric Duvert, Renato De Mori |
| 2010 | Confidence measures for speaker segmentation and their relation to speaker verification. Carlos Vaquero, Alfonso Ortega, Jesús Antonio Villalba López, Antonio Miguel, Eduardo Lleida |
| 2010 | Constructing Japanese test collections for spoken term detection. Yoshiaki Itoh, Hiromitsu Nishizaki, Xinhui Hu, Hiroaki Nanjo, Tomoyosi Akiba, Tatsuya Kawahara, Seiichi Nakagawa, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa |
| 2010 | Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition. Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura |
| 2010 | Content-based advertisement detection. Patrick Cardinal, Vishwa Gupta, Gilles Boulianne |
| 2010 | Context adaptive training with factorized decision trees for HMM-based speech synthesis. Kai Yu, Heiga Zen, François Mairesse, Steve J. Young |
| 2010 | Context dependent modelling approaches for hybrid speech recognizers. Alberto Abad, Thomas Pellegrini, Isabel Trancoso, João Paulo Neto |
| 2010 | Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling. Martin Wöllmer, Angeliki Metallinou, Florian Eyben, Björn W. Schuller, Shrikanth S. Narayanan |
| 2010 | Contextual verification for open vocabulary spoken term detection. Daniel Schneider, Timo Mertens, Martha A. Larson, Joachim Köhler |
| 2010 | Continuous speech recognition with a TF-IDF acoustic model. Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, Alex Acero |
| 2010 | Conversational spontaneous speech synthesis using average voice model. Tomoki Koriyama, Takashi Nose, Takao Kobayashi |
| 2010 | Convexity and fast speech extraction by split Bregman method. Meng Yu, Wenye Ma, Jack Xin, Stanley J. Osher |
| 2010 | Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic features. Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang, C.-H. Alvin Chen |
| 2010 | Creating a linguistic plausibility dataset with non-expert annotators. Benjamin Lambert, Rita Singh, Bhiksha Raj |
| 2010 | Cross-cultural investigation of prosody in verbal feedback in interactional rapport. Gina-Anne Levow, Susan Duncan, Edward T. King |
| 2010 | Cross-lingual acoustic modeling for dialectal Arabic speech recognition. Mohamed Elmahdy, Rainer Gruhn, Wolfgang Minker, Slim Abdennadher |
| 2010 | Cross-lingual and multi-stream posterior features for low resource LVCSR systems. Samuel Thomas, Sriram Ganapathy, Hynek Hermansky |
| 2010 | Cross-lingual speaker adaptation via Gaussian component mapping. Houwei Cao, Tan Lee, P. C. Ching |
| 2010 | Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translation. Fabrice Lefèvre, François Mairesse, Steve J. Young |
| 2010 | Cross-lingual talker discrimination. Mirjam Wester |
| 2010 | Dajare is not the lowest form of wit. Takashi Otake |
| 2010 | Data pruning for template-based automatic speech recognition. Dino Seppi, Dirk Van Compernolle |
| 2010 | Data selection for language modeling using sparse representations. Abhinav Sethy, Tara N. Sainath, Bhuvana Ramabhadran, Dimitri Kanevsky |
| 2010 | Data-dependent evaluator modeling and its application to emotional valence classification from speech. Kartik Audhkhasi, Shrikanth S. Narayanan |
| 2010 | Data-driven analysis of realtime vocal tract MRI using correlated image regions. Adam C. Lammert, Michael I. Proctor, Shrikanth S. Narayanan |
| 2010 | Decision tree based tone modeling with corrective feedbacks for automatic Mandarin tone assessment. Hsien-Cheng Liao, Jiang-Chun Chen, Sen-Chia Chang, Ying-Hua Guan, Chin-Hui Lee |
| 2010 | Decision tree state clustering with word and syllable features. Hank Liao, Christopher Alberti, Michiel Bacchiani, Olivier Siohan |
| 2010 | Declarative sentence intonation patterns in 8 Swiss German dialects. Adrian Leemann, Lucy Zuberbühler |
| 2010 | Decoding with shrinkage-based language models. Ahmad Emami, Stanley F. Chen, Abraham Ittycheriah, Hagen Soltau, Bing Zhao |
| 2010 | Decoupling session variability modelling and speaker characterisation. Anthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre |
| 2010 | Deep-structured hidden conditional random fields for phonetic recognition. Dong Yu, Li Deng |
| 2010 | Detailed pronunciation variant modeling for speech transcription. Denis Jouvet, Dominique Fohr, Irina Illina |
| 2010 | Detecting politeness and efficiency in a cooperative social interaction. Paul M. Brunet, Marcela Charfuelan, Roderick Cowie, Marc Schröder, Hastings Donnan, Ellen Douglas-Cowie |
| 2010 | Detecting categorical perception in continuous discrimination data. Paul Boersma, Katerina Chládková |
| 2010 | Detecting novel objects in acoustic scenes through classifier incongruence. Jörg-Hendrik Bach, Jörn Anemüller |
| 2010 | Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterances. Narichika Nomoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi |
| 2010 | Detection of hot spots in poster conversations based on reactive tokens of audience. Tatsuya Kawahara, Kouhei Sumi, Zhi-qiang Chang, Katsuya Takanashi |
| 2010 | Determining optimal features for emotion recognition from speech by applying an evolutionary algorithm. David Philippou-Hübner, Bogdan Vlasenko, Tobias Grosser, Andreas Wendemuth |
| 2010 | Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training. Wen Cao, Dongning Wang, Jinsong Zhang, Ziyu Xiong |
| 2010 | Dialect recognition using a phone-GMM-supervector-based SVM kernel. Fadi Biadsy, Julia Hirschberg, Michael Collins |
| 2010 | Dialog prediction for a general model of turn-taking. Nigel G. Ward, Olac Fuentes, Alejandro Vega |
| 2010 | Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrix. Wei-Bin Liang, Chung-Hsien Wu, Yu-Cheng Hsiao |
| 2010 | Dialogue act tagging and segmentation with a single perceptron. Ramón Granell, Stephen G. Pulman, Carlos D. Martínez-Hinarejos, José-Miguel Benedí |
| 2010 | Did you say susi or shushi? measuring the emergence of robust fricative contrasts in English- and Japanese-acquiring children. Jeffrey J. Holliday, Mary E. Beckman, Chanelle Mays |
| 2010 | Direct construction of compact context-dependency transducers from data. David Rybach, Michael Riley |
| 2010 | Direct observation of pruning errors (DOPE): a search analysis tool. Volker Steinbiss, Martin Sundermeyer, Hermann Ney |
| 2010 | Disambiguating the functions of conversational sounds with prosody: the case of 'yeah'. Khiet P. Truong, Dirk Heylen |
| 2010 | Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching. Guillaume Aimetti, Roger K. Moore, Louis ten Bosch |
| 2010 | Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT). Xiaojun Qian, Frank K. Soong, Helen M. Meng |
| 2010 | Discriminative adaptation based on fast combination of DMAP and dfMLLR. Lukás Machlica, Zbynek Zajíc, Ludek Müller |
| 2010 | Discriminative adaptation for log-linear acoustic models. Jonas Lööf, Ralf Schlüter, Hermann Ney |
| 2010 | Discriminative language modeling using simulated ASR errors. Preethi Jyothi, Eric Fosler-Lussier |
| 2010 | Discriminative training and unsupervised adaptation for labeling prosodic events with limited training data. Raul Fernandez, Bhuvana Ramabhadran |
| 2010 | Discriminative training for hierarchical clustering in speaker diarization. Oriol Vinyals, Gerald Friedland, Nelson Morgan |
| 2010 | Distribution and trichotomic realization of voiced velars in Japanese - an experimental study. Shin-ichiro Sano, Tomohiko Ooigawa |
| 2010 | Does sentence complexity interfere with intelligibility in noise? evaluation of the Oldenburg linguistically and audiologically controlled sentence test (OLACS). Verena N. Uslar, Thomas Brand, Mirko Hanke, Rebecca Carroll, Esther Ruigendijk, Cornelia Hamann, Birger Kollmeier |
| 2010 | Domain adaptation and compensation for emotion detection. Michelle Hewlett Sanchez, Gökhan Tür, Luciana Ferrer, Dilek Hakkani-Tür |
| 2010 | Durational structure of Japanese single/geminate stops in three- and four-mora words spoken at varied rates. Yukari Hirata, Shigeaki Amano |
| 2010 | Dynamic language model adaptation using keyword category classification. Hitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda |
| 2010 | Dynamic language modeling using Bayesian networks for spoken dialog systems. Antoine Raux, Neville Mehta, Deepak Ramachandran, Rakesh Gupta |
| 2010 | Dynamic model selection for spectral voice conversion. Pierre Lanchantin, Xavier Rodet |
| 2010 | Effect of spatial separation on speech-in-noise comprehension in dyslexic adults. Marjorie Dole, Michel Hoen, Fanny Meunier |
| 2010 | Effects of Korean learners' consonant cluster reduction strategies on English speech recognition performance. Hyejin Hong, Jina Kim, Minhwa Chung |
| 2010 | Effects of accent typicality and phonotactic frequency on nonword immediate serial recall performance in Japanese. Yuuki Tanida, Taiji Ueno, Satoru Saito, Matthew A. Lambon Ralph |
| 2010 | Effects of enhancement of spectral changes on speech quality and subjective speech intelligibility. Jing Chen, Thomas Baer, Brian C. J. Moore |
| 2010 | Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognition. Nobuhide Yamakawa, Tetsuro Kitahara, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno |
| 2010 | Effects of the phonological relevance in speaker verification. Yanhua Long, Li-Rong Dai, Bin Ma, Wu Guo |
| 2010 | Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract model. Kunitoshi Motoki |
| 2010 | Efficient HMM-based estimation of missing features, with applications to packet loss concealment. Bengt J. Borgström, Per Henrik Borgström, Abeer Alwan |
| 2010 | Efficient combined approach for named entity recognition in spoken language. Azeddine Zidouni, Sophie Rosset, Hervé Glotin |
| 2010 | Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models. Satoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi |
| 2010 | Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension. Tanel Alumäe, Mikko Kurimo |
| 2010 | Efficient manycore CHMM speech recognition for audiovisual and multistream data. Dorothea Kolossa, Jike Chong, Steffen Zeiler, Kurt Keutzer |
| 2010 | Efficient three-stage pitch estimation for packet loss concealment. Xuejing Sun, Sameer Gadre |
| 2010 | Emotion recognition using imperfect speech recognition. Florian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Björn W. Schuller, Stefan Steidl |
| 2010 | Empirical mode decomposition for noise-robust automatic speech recognition. Kuo-Hao Wu, Chia-Ping Chen |
| 2010 | Energy reallocation strategies for speech enhancement in known noise conditions. Yan Tang, Martin Cooke |
| 2010 | English spoken term detection in multilingual recordings. Petr Motlícek, Fabio Valente, Philip N. Garner |
| 2010 | Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience. Romain Laroche, Philippe Bretier, Ghislain Putois |
| 2010 | Enhanced speech yielding higher intelligibility for all listeners and environments. Takayuki Arai, Nao Hodoshima |
| 2010 | Enhanced word classing for model M. Stanley F. Chen, Stephen M. Chu |
| 2010 | Enhancements of Viterbi search for fast unit selection synthesis. Daniel Tihelka, Jirí Kala, Jindrich Matousek |
| 2010 | Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization. Shweta Ghai, Rohit Sinha |
| 2010 | Estimating missing data sequences in x-ray microbeam recordings. Chao Qin, Miguel Á. Carreira-Perpiñán |
| 2010 | Estimating noise from noisy speech features with a Monte Carlo variant of the expectation maximization algorithm. Friedrich Faubel, Dietrich Klakow |
| 2010 | Estimation of glottal area function using stereo-endoscopic high-speed digital imaging. Hiroshi Imagawa, Ken-Ichi Sakakibara, Isao T. Tokuda, Mamiko Otsuka, Niro Tayama |
| 2010 | Estimation of speech lip features from discrete cosinus transform. Zuheng Ming, Denis Beautemps, Gang Feng, Sébastien Schmerber |
| 2010 | Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models. Kazuhiro Kondo, Yusuke Takano |
| 2010 | Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model. Heikki Rasilo, Unto K. Laine, Okko Johannes Räsänen |
| 2010 | Evaluating a dialog language generation system: comparing the mountain system to other NLG approaches. Brian Langner, Stephan Vogel, Alan W. Black |
| 2010 | Evaluation of a silent speech interface based on magnetic sensing. Robin Hofe, Stephen R. Ell, Michael J. Fagan, James M. Gilbert, Phil D. Green, Roger K. Moore, Sergey I. Rybchenko |
| 2010 | Evaluation of bone-conducted ultrasonic hearing-aid regarding transmission of paralinguistic information: a comparison with cochlear implant simulator. Takayuki Kagomiya, Seiji Nakagawa |
| 2010 | Evaluation of prosodic contextual factors for HMM-based speech synthesis. Shuji Yokomizo, Takashi Nose, Takao Kobayashi |
| 2010 | Evaluation of speaker mimic technology for personalizing SGD voices. Esther Klabbers, Alexander Kain, Jan P. H. van Santen |
| 2010 | Excitation modeling based on waveform interpolation for HMM-based speech synthesis. June Sig Sung, Doo Hwa Hong, Kyung Hwan Oh, Nam Soo Kim |
| 2010 | Expectations for discourse genre identification: a prosodic study. Nicolas Obin, Volker Dellwo, Anne Lacheret, Xavier Rodet |
| 2010 | Exploitation of phase information for speaker recognition. Ning Wang, P. C. Ching, Tan Lee |
| 2010 | Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition. Sabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee |
| 2010 | Exploiting glottal formant parameters for glottal inverse filtering and parameterization. Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle |
| 2010 | Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription. Oscar Koller, Alberto Abad, Isabel Trancoso, Céu Viana |
| 2010 | Exploring goodness of prosody by diverse matching templates. Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu |
| 2010 | Exploring recognition network representations for efficient speech inference on highly parallel platforms. Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer |
| 2010 | Exploring speaker characteristics for meeting summarization. Fei Liu, Yang Liu |
| 2010 | Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals. B. Avinash, Sunitha Guruprasad, B. Yegnanarayana |
| 2010 | Exploring the mechanism of tonal contraction in Taiwan Mandarin. Chierh Cheng, Yi Xu, Michele Gubian |
| 2010 | Exploring web-browser based runtimes engines for creating ubiquitous speech interfaces. Paul R. Dixon, Sadaoki Furui |
| 2010 | Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions. Jouni Pohjalainen, Rahim Saeidi, Tomi Kinnunen, Paavo Alku |
| 2010 | Extending the punctuation module for European Portuguese. Fernando Batista, Helena Moniz, Isabel Trancoso, Hugo Meinedo, Ana Isabel Mata, Nuno J. Mamede |
| 2010 | Extractive speech summarization - from the view of decision theory. Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen |
| 2010 | Extractive summarization using a latent variable model. Asli Celikyilmaz, Dilek Hakkani-Tür |
| 2010 | F Jiahong Yuan, Mark Y. Liberman |
| 2010 | FSM-based pronunciation modeling using articulatory phonological code. Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson |
| 2010 | Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework. Achintya Kumar Sarkar, Srinivasan Umesh |
| 2010 | Fast converging iterative Kalman filtering for speech enhancement using long and overlapped tapered windows with large side lobe attenuation. Stephen So, Kuldip K. Paliwal |
| 2010 | Fast least-squares solution for sinusoidal, harmonic and quasi-harmonic models. Georgios Tzedakis, Yannis Pantazis, Olivier Rosec, Yannis Stylianou |
| 2010 | Feature selection for pose invariant lip biometrics. Adrian Pass, Jianguo Zhang, Darryl Stewart |
| 2010 | Feature versus model based noise robustness. Kris Demuynck, Xueru Zhang, Dirk Van Compernolle, Hugo Van hamme |
| 2010 | Floor holder detection and end of speaker turn prediction in meetings. Alfred Dielmann, Giulia Garau, Hervé Bourlard |
| 2010 | Fluency and structural complexity as predictors of L2 oral proficiency. Jared Bernstein, Jian Cheng, Masanori Suzuki |
| 2010 | Focus-sensitive operator or focus inducer: always and only. Yong-Cheol Lee, Satoshi Nambu |
| 2010 | Foreign accent matters most when timing is wrong. Chiharu Tsurutani |
| 2010 | Formant-based frequency warping for improving speaker adaptation in HMM TTS. Xin Zhuang, Yao Qian, Frank K. Soong, Yi-Jian Wu, Bo Zhang |
| 2010 | Frequency of occurrence effects on pitch accent realisation. Katrin Schweitzer, Michael Walsh, Bernd Möbius, Hinrich Schütze |
| 2010 | Frequency-domain delexicalization using surrogate vowels. Alexander Kain, Jan P. H. van Santen |
| 2010 | Full body aero-tactile integration in speech perception. Donald Derrick, Bryan Gick |
| 2010 | Fully automatic segmentation for prosodic speech corpora. Sarah Hoffmann, Beat Pfister |
| 2010 | Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events. Okko Johannes Räsänen |
| 2010 | Functional imaging of brain regions sensitive to communication sounds in primates. Christopher I. Petkov, Benjamin Wilson |
| 2010 | Fuzzy support vector machines for age and gender classification. Phuoc Nguyen, Trung Le, Dat Tran, Xu Huang, Dharmendra Sharma |
| 2010 | GMM-UBM based open-set online speaker diarization. Jürgen T. Geiger, Frank Wallhoff, Gerhard Rigoll |
| 2010 | Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation. Rok Gajsek, Janez Zibert, Tadej Justin, Vitomir Struc, Bostjan Vesnicer, France Mihelic |
| 2010 | Gesture and speech coordination: the influence of the relationship between manual gesture and speech. Benjamin Roustan, Marion Dohen |
| 2010 | Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis. Zhen-Hua Ling, Yu Hu, Li-Rong Dai |
| 2010 | Glottal parameters estimation on speech using the zeros of the z-transform. Nicolas Sturmel, Christophe d'Alessandro, Boris Doval |
| 2010 | Glottal-based analysis of the Lombard effect. Thomas Drugman, Thierry Dutoit |
| 2010 | Graph-embedding for speaker recognition. Zahi N. Karam, William M. Campbell |
| 2010 | HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition. Michael L. Seltzer, Alex Acero |
| 2010 | HMM based TTS for mixed language text. Zhiwei Shuang, Shiyin Kang, Yong Qin, Li-Rong Dai, Lianhong Cai |
| 2010 | HMM-based automatic visual speech segmentation using facial data. Utpala Musti, Asterios Toutios, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger |
| 2010 | HMM-based prosodic structure model using rich linguistic context. Nicolas Obin, Xavier Rodet, Anne Lacheret |
| 2010 | HMM-based singing voice synthesis system using pitch-shifted pseudo training data. Ayami Mase, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda |
| 2010 | HMM-based text-to-articulatory-movement prediction and analysis of critical articulators. Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi |
| 2010 | Hands free audio analysis from home entertainment. Danil Korchagin, Philip N. Garner, Petr Motlícek |
| 2010 | Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversion. Kalu U. Ogbureke, Peter Cahill, Julie Carson-Berndsen |
| 2010 | Hidden logistic linear regression for support vector machine based phone verification. Bo Li, Khe Chai Sim |
| 2010 | Hierarchical bottle neck features for LVCSR. Christian Plahl, Ralf Schlüter, Hermann Ney |
| 2010 | Hierarchical classification for speech-to-speech translation. Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2010 | Hierarchical multilayer perceptron based language identification. David Imseng, Mathew Magimai-Doss, Hervé Bourlard |
| 2010 | Hierarchical neural net architectures for feature extraction in ASR. Frantisek Grézl, Martin Karafiát |
| 2010 | How abstract is phonetics? Osamu Fujimura |
| 2010 | How children acquire situation understanding skills?: a developmental analysis utilizing multimodal speech behavior corpus. Shogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa |
| 2010 | Identification of abnormal audio events based on probabilistic novelty detection. Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis |
| 2010 | Identifying articulatory goals from kinematic data using principal differential analysis. Michael Reimer, Frank Rudzicz |
| 2010 | Impact of lack of acoustic feedback in EMG-based silent speech recognition. Matthias Janke, Michael Wand, Tanja Schultz |
| 2010 | Impact of word classing on shrinkage-based language models. Ruhi Sarikaya, Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran |
| 2010 | Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model. Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, Nobuaki Minematsu |
| 2010 | Improved language recognition using mixture components statistics. Abualsoud Hanani, Michael J. Carey, Martin J. Russell |
| 2010 | Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis. Hongwei Hu, Martin J. Russell |
| 2010 | Improved n-gram phonotactic models for language recognition. Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel |
| 2010 | Improved neural network based language modelling and adaptation. Junho Park, Xunying Liu, Mark J. F. Gales, Philip C. Woodland |
| 2010 | Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features. Shang-wen Li, Liang-Che Sun, Lin-Shan Lee |
| 2010 | Improved real-time MRI of oral-velar coordination using a golden-ratio spiral view order. Yoon-Chul Kim, Shrikanth S. Narayanan, Krishna S. Nayak |
| 2010 | Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback. Hung-yi Lee, Chia-Ping Chen, Ching-Feng Yeh, Lin-Shan Lee |
| 2010 | Improved spoken term detection by feature space pseudo-relevance feedback. Chia-Ping Chen, Hung-yi Lee, Ching-Feng Yeh, Lin-Shan Lee |
| 2010 | Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision. Man-Hung Siu, Herbert Gish, Arthur Chan, William Belfield |
| 2010 | Improved training of excitation for HMM-based parametric speech synthesis. Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai |
| 2010 | Improvement on plural unit selection and fusion. Jian Luan, Jian Li |
| 2010 | Improvements of search error risk minimization in Viterbi beam search for speech recognition. Takaaki Hori, Shinji Watanabe, Atsushi Nakamura |
| 2010 | Improvements to generalized discriminative feature transformation for speech recognition. Roger Hsiao, Florian Metze, Tanja Schultz |
| 2010 | Improvements to the equal-parameter BIC for speaker diarization. Themos Stafylakis, Xavier Anguera |
| 2010 | Improving ASR error detection with non-decoder based features. Thomas Pellegrini, Isabel Trancoso |
| 2010 | Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relations. Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot |
| 2010 | Improving Mandarin segmental duration prediction with automatically extracted syntax features. Miaomiao Wen, Miaomiao Wang, Keikichi Hirose, Nobuaki Minematsu |
| 2010 | Improving back-off models with bag of words and hollow-grams. Benjamin Lecouteux, Raphaël Rubino, Georges Linarès |
| 2010 | Improving cross database prediction of dialogue quality using mixture of experts. Klaus-Peter Engelbrecht, Hamed Ketabdar, Sebastian Möller |
| 2010 | Improving monaural speaker identification by double-talk detection. Rahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng-Hua Tan, Mads Græsbøll Christensen, Søren Holdt Jensen, Pasi Fränti |
| 2010 | Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction. Zhigang Chen, Guoping Hu, Wei Jiang |
| 2010 | Improving speech synthesis of machine translation output. Alok Parlikar, Alan W. Black, Stephan Vogel |
| 2010 | Improving the readability of class lecture ASR results using a confusion network. Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa |
| 2010 | Incorporating MAP estimation and covariance transform for SVM based speaker recognition. Cheung-Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma, Haizhou Li |
| 2010 | Incorporating sparse representation phone identification features in automatic speech recognition using exponential families. Vaibhava Goel, Tara N. Sainath, Bhuvana Ramabhadran, Peder A. Olsen, David Nahamoo, Dimitri Kanevsky |
| 2010 | Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm. Björn W. Schuller, Laurence Devillers |
| 2010 | Incremental composition of static decoding graphs with label pushing. Miroslav Novak |
| 2010 | Incremental diarization of telephone conversations. Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman |
| 2010 | Incremental word learning using large-margin discriminative training and variance floor estimation. Irene Ayllón Clemente, Martin Heckmann, Alexander Denecke, Britta Wrede, Christian Goerick |
| 2010 | Influence of gestural salience on the interpretation of spoken requests. Gideon Kowadlo, Patrick Ye, Ingrid Zukerman |
| 2010 | Influence of lexical tones on intonation in Kammu. Anastasia Karlsson, David House, Jan-Olof Svantesson, Damrong Tayanin |
| 2010 | Influence of musical training on perception of L2 speech. Makiko Sadakata, Lotte van der Zanden, Kaoru Sekiyama |
| 2010 | Integrate template matching and statistical modeling for speech recognition. Xie Sun, Yunxin Zhao |
| 2010 | Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detection. Miao Yao, Weiqian Liang |
| 2010 | Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling. Xin Chen, Yunxin Zhao |
| 2010 | Integration of cache-based model and topic dependent class model with soft clustering and soft voting. Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa |
| 2010 | Integration of multilayer regression analysis with structure-based pronunciation assessment. Masayuki Suzuki, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose |
| 2010 | Intelligibility predictions for speech against fluctuating masker. Juan-Pablo Ramirez, Hamed Ketabdar, Alexander Raake |
| 2010 | Interaction of syntax-marked focus and wh-question induced focus in Standard Chinese. Yuan Jia, Aijun Li |
| 2010 | Intra-frame variability as a predictor of frame classifiability. Trond Skogstad, Torbjørn Svendsen |
| 2010 | Invariant integration features combined with speaker-adaptation methods. Florian Müller, Alfred Mertins |
| 2010 | Investigating articulatory setting - pauses, ready position, and rest - using real-time MRI. Vikram Ramanarayanan, Dani Byrd, Louis Goldstein, Shrikanth S. Narayanan |
| 2010 | Investigating multiple approaches for SLU portability to a new language. Bassam Jabaian, Laurent Besacier, Fabrice Lefèvre |
| 2010 | Investigation of full-sequence training of deep belief networks for speech recognition. Abdel-rahman Mohamed, Dong Yu, Li Deng |
| 2010 | Is it possible to predict task completion in automated troubleshooters? Alexander Schmitt, Michael Scholz, Wolfgang Minker, Jackson Liscombe, David Suendermann |
| 2010 | It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality. Sebastian Egger, Raimund Schatz, Stefan Scherer |
| 2010 | Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs. Satoshi Natori, Hiromitsu Nishizaki, Yoshihiro Sekiguchi |
| 2010 | Jointly optimized discriminative features for speech recognition. Tim Ng, Bing Zhang, Long Nguyen |
| 2010 | Kinematic analysis of tongue movement control in spastic dysarthria. Heejin Kim, Panying Rong, Torrey M. Loucks, Mark Hasegawa-Johnson |
| 2010 | Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realization. Mirjam Broersma |
| 2010 | L2 experience and non-native vowel categorization of L1-Mandarin speakers. Bo-ren Hsieh, Ho-Hsien Pan |
| 2010 | Landmark-based automated pronunciation error detection. Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat |
| 2010 | Language acquisition and cross-modal associations: computational simulation of the result of infant studies. Louis ten Bosch, Lou Boves |
| 2010 | Language model cross adaptation for LVCSR system combination. Xunying Liu, Mark J. F. Gales, Philip C. Woodland |
| 2010 | Language specific effects of emotion on phoneme duration. Martijn Goudbeek, Mirjam Broersma |
| 2010 | Language-specific influence on phoneme development: French and Drehu data. Julia Monnin, Hélène Loevenbruck |
| 2010 | Large margin Gaussian mixture models for speaker identification. Reda Jourani, Khalid Daoudi, Régine André-Obrecht, Driss Aboutajdine |
| 2010 | Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data. Shinji Watanabe, Takaaki Hori, Atsushi Nakamura |
| 2010 | Laryngeal characteristics during the production of geminate consonants. Masako Fujimoto, Kikuo Maekawa, Seiya Funatsu |
| 2010 | Laryngeal voice quality in the expression of focus. Martti Vainio, Matti Airas, Juhani Järvikivi, Paavo Alku |
| 2010 | Laryngealization and features for Chinese tonal recognition. Kristine M. Yu |
| 2010 | Latent affective mapping: a novel framework for the data-driven analysis of emotion in text. Jerome R. Bellegarda |
| 2010 | Latent perceptual mapping: a new acoustic modeling framework for speech recognition. Shiva Sundaram, Jerome R. Bellegarda |
| 2010 | Learning a language model from continuous speech. Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara |
| 2010 | Learning from human errors: prediction of phoneme confusions based on modified ASR training. Bernd T. Meyer, Birger Kollmeier |
| 2010 | Learning naturally spoken commands for a robot. Anja Austermann, Seiji Yamada, Kotaro Funakoshi, Mikio Nakano |
| 2010 | Learning new word pronunciations from spoken examples. Ibrahim Badr, Ian McGraw, James R. Glass |
| 2010 | Learning speaker normalization using semisupervised manifold alignment. Andrew R. Plummer, Mary E. Beckman, Mikhail Belkin, Eric Fosler-Lussier, Benjamin Munson |
| 2010 | Learning words and speech units through natural interactions. Jonas Hörnstein, José Santos-Victor |
| 2010 | Lecture speech recognition by combining word graphs of various acoustic models. Tetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Katoh |
| 2010 | Lecture subtopic retrieval by retrieval keyword expansion using subordinate concept. Noboru Kanedera, Tetsuo Funada, Seiichi Nakagawa |
| 2010 | Level of interest sensing in spoken dialog using multi-level fusion of acoustic and lexical evidence. Je Hun Jeon, Rui Xia, Yang Liu |
| 2010 | Lexical entrainment of real users in the Let's Go spoken dialog system. Gabriel Parent, Maxine Eskénazi |
| 2010 | Lightly supervised recognition for automatic alignment of large coherent speech recordings. Norbert Braunschweiler, Mark J. F. Gales, Sabine Buchholz |
| 2010 | Linguistic rhythm in foreign accent. Jiahong Yuan |
| 2010 | Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract model. Adam C. Lammert, Louis Goldstein, Khalil Iskarous |
| 2010 | Long short-term memory networks for noise robust speech recognition. Martin Wöllmer, Yang Sun, Florian Eyben, Björn W. Schuller |
| 2010 | Longitudinal changes of selected voice source parameters. Hideki Kasuya, Hajime Yoshida, Satoshi Ebihara, Hiroki Mori |
| 2010 | Looking for relevant features for speaker role recognition. Benjamin Bigot, Julien Pinquier, Isabelle Ferrané, Régine André-Obrecht |
| 2010 | Low-dimensional space transforms of posteriors in speech recognition. Jan Zelinka, Jan Trmal, Ludek Müller |
| 2010 | MAP estimation of subspace transform for speaker recognition. Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung, Haizhou Li |
| 2010 | Machine learning for text selection with expressive unit-selection voices. Dominic Espinosa, Michael White, Eric Fosler-Lussier, Chris Brew |
| 2010 | Mandarin digit recognition assisted by selective tone distinction. Xiaodong Wang, Kunihiko Owa, Makoto Shozakai |
| 2010 | Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram. Yow-Bang Wang, Lin-Shan Lee |
| 2010 | Manipulating tracheoesophageal speech. R. J. J. H. van Son, Irene Jacobi, Frans J. M. Hilgers |
| 2010 | Mask estimation in non-stationary noise environments for missing feature based robust speech recognition. Shirin Badiezadegan, Richard C. Rose |
| 2010 | Masking of vowel-analog transitions by vowel-analog distracters. Pierre L. Divenyi |
| 2010 | Masking property based microphone array post-filter design. Ning Cheng, Wenju Liu, Lan Wang |
| 2010 | Maximum a posteriori voice conversion using sequential Monte Carlo methods. Elina Helander, Hanna Silén, Joaquín Míguez, Moncef Gabbouj |
| 2010 | Maximum lexical cohesion for fine-grained news story segmentation. Zihan Liu, Lei Xie, Wei Feng |
| 2010 | Measuring basic tempo across languages and some implications for speech rhythm. Gertraud Fenk-Oczlon, August Fenk |
| 2010 | Mechanical vocal-tract models for speech dynamics. Takayuki Arai |
| 2010 | Melody pitch estimation based on range estimation and candidate extraction using harmonic structure model. Seokhwan Jo, Sihyun Joo, Chang D. Yoo |
| 2010 | Memory-based active learning for French broadcast news. Frédéric Tantini, Christophe Cerisara, Claire Gardent |
| 2010 | Methods for robust speech recognition in reverberant environments: a comparison. Rico Petrick, Thomas Fehér, Masashi Unoki, Rüdiger Hoffmann |
| 2010 | Metric subspace indexing for fast spoken term detection. Taisuke Kaneko, Tomoyosi Akiba |
| 2010 | Minimally invasive surgery for spoken dialog systems. David Suendermann, Jackson Liscombe, Roberto Pieraccini |
| 2010 | Modal analysis of vocal fold vibrations using laryngotopography. Ken-Ichi Sakakibara, Hiroshi Imagawa, Miwako Kimura, Hisayuki Yokonishi, Niro Tayama |
| 2010 | Model synthesis for band-limited speech recognition. Yongjun He, Jiqing Han |
| 2010 | Modeling liaison in French by using decision trees. Josafá de Jesus Aguiar Pontes, Sadaoki Furui |
| 2010 | Modeling of sentence-medial pauses in Bangla readout speech: occurrence and duration. Shyamal Kr. Das Mandal, Arup Saha, Tulika Basu, Keikichi Hirose, Hiroya Fujisaki |
| 2010 | Modeling perceived vocal age in American English. James D. Harnsberger, Rahul Shrivastav, W. S. Brown Jr. |
| 2010 | Modeling posterior probabilities using the linear exponential family. Peder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey |
| 2010 | Modeling pronunciation variation with context-dependent articulatory feature decision trees. Samuel R. Bowman, Karen Livescu |
| 2010 | Modelling speech line spectral frequencies with Dirichlet mixture models. Zhanyu Ma, Arne Leijon |
| 2010 | Modelling the effect of speaker familiarity and noise on infant word recognition. Christina Bergmann, Michele Gubian, Lou Boves |
| 2010 | Modified spatial audio object coding scheme with harmonic extraction and elimination structure for interactive audio service. Jihoon Park, Kwang-Ki Kim, Jeongil Seo, Minsoo Hahn |
| 2010 | Morphological and predictability effects on schwa reduction: the case of Dutch word-initial syllables. Iris Hanique, Barbara Schuppler, Mirjam Ernestus |
| 2010 | Multi resolution discriminative models for subvocalic speech recognition. Mark Raugas, Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan |
| 2010 | Multi-channel iterative dereverberation based on codebook constrained iterative multi-channel Wiener filter. Ajay Srinivasamurthy, Thippur V. Sreenivas |
| 2010 | Multi-class and hierarchical SVMs for emotion recognition. Ali Hassan, Robert I. Damper |
| 2010 | Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics. Tianyu T. Wang, Thomas F. Quatieri |
| 2010 | MultiBIC: an improved speaker segmentation technique for TV shows. Paula Lopez-Otero, Laura Docío Fernández, Carmen García-Mateo |
| 2010 | Multichannel noise reduction using low order RTF estimate. Subhojit Chakladar, Nam Soo Kim, Yu Gwang Jin, Tae Gyoon Kang |
| 2010 | Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model. Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto |
| 2010 | Multimodal dialog in the car: combining speech and turn-and-push dial to control comfort functions. Sandro Castronovo, Angela Mahr, Margarita Pentcheva, Christian A. Müller |
| 2010 | Multimodal speaker diarization using oriented optical flow histograms. Mary Tai Knox, Gerald Friedland |
| 2010 | Multivariate analysis of vocal fatigue in continuous reading. Marie-José Caraty, Claude Montacié |
| 2010 | Mutual information analysis for feature and sensor subset selection in surface electromyography based speech recognition. Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan |
| 2010 | Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems. Ian R. Lane, Alex Waibel |
| 2010 | Native and non-native speaker judgements on the quality of synthesized speech. Anna C. Janska, Robert A. J. Clark |
| 2010 | Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems. Filip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve J. Young |
| 2010 | Near field sound source localization based on cross-power spectrum phase analysis with multiple microphones. Kohei Hayashida, Masanori Morise, Takanobu Nishiura |
| 2010 | New insights into subspace noise tracking. Mahdi Triki |
| 2010 | New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules. Ramón López-Cózar, David Griol |
| 2010 | Noise robust voice activity detection using features extracted from the time-domain autocorrelation function. Houman Ghaemmaghami, Brendan Baker, Robert Vogt, Sridha Sridharan |
| 2010 | Non-audible murmur recognition based on fusion of audio and visual streams. Panikos Heracleous, Norihiro Hagita |
| 2010 | Non-linear predictive vector quantization of feature vectors for distributed speech recognition. José Enrique García Laínez, Alfonso Ortega, Antonio Miguel, Eduardo Lleida |
| 2010 | Non-negative matrix factorization based compensation of music for automatic speech recognition. Bhiksha Raj, Tuomas Virtanen, Sourish Chaudhuri, Rita Singh |
| 2010 | Nonlinear enhancement of onset for robust speech recognition. Chanwoo Kim, Richard M. Stern |
| 2010 | Novel probabilistic control of noise reduction for improved microphone array beamforming. Jungpyo Hong, Seung Ho Han, Sangbae Jeong, Minsoo Hahn |
| 2010 | Novel weighting scheme for unsupervised language model adaptation using latent Dirichlet allocation. Md. Akmal Haidar, Douglas D. O'Shaughnessy |
| 2010 | Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian. Tomás Dubeda, Katalin Mády |
| 2010 | Numerical study of turbulent flow-induced sound production in presence of a tooth-shaped obstacle: towards sibilant [s] physical modeling. Julien Cisonni, Kazunori Nozaki, Annemie Van Hirtum, Shigeo Wada |
| 2010 | Observation uncertainty measures for sparse imputation. Jort F. Gemmeke, Ulpu Remes, Kalle J. Palomäki |
| 2010 | On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speech. Claudio Garretón, Néstor Becerra Yoma |
| 2010 | On evaluation of the f Keiichi Funaki |
| 2010 | On generating Combilex pronunciations via morphological analysis. Korin Richmond, Robert A. J. Clark, Susan Fitt |
| 2010 | On speaker adaptive training of artificial neural networks. Jan Trmal, Jan Zelinka, Ludek Müller |
| 2010 | On the automatic ToBI accent type identification from data. César González Ferreras, Carlos Vivaracho-Pascual, David Escudero Mancebo, Valentín Cardeñoso-Payo |
| 2010 | On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonances. Pirros Tsiakoulis, Alexandros Potamianos |
| 2010 | On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition. Volker Leutnant, Reinhold Haeb-Umbach |
| 2010 | On the importance of glottal flow spectral energy for the recognition of emotions in speech. Ling He, Margaret Lech, Nicholas B. Allen |
| 2010 | On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measures. Yen-Liang Shue, Gang Chen, Abeer Alwan |
| 2010 | On the potential of channel selection for recognition of reverberated speech with multiple microphones. Martin Wolf, Climent Nadeu |
| 2010 | On the potential of glottal signatures for speaker recognition. Thomas Drugman, Thierry Dutoit |
| 2010 | On the relation of Bayes risk, word error, and word posteriors in ASR. Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney |
| 2010 | On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification. Rong Zheng, Bo Xu |
| 2010 | On using Gaussian mixture model for double-talk detection in acoustic echo suppression. Ji-Hyun Song, Kyu-Ho Lee, Yun-Sik Park, Sang-Ick Kang, Joon-Hyuk Chang |
| 2010 | On using missing-feature theory with cepstral features - approximations to the multivariate integral. Frank Seide, Pei Zhao |
| 2010 | On using voice source measures in automatic gender classification of children's speech. Gang Chen, Xue Feng, Yen-Liang Shue, Abeer Alwan |
| 2010 | On-demand language model interpolation for mobile speech input. Brandon Ballinger, Cyril Allauzen, Alexander Gruenstein, Johan Schalkwyk |
| 2010 | On-the-fly lattice rescoring for real-time automatic speech recognition. Hasim Sak, Murat Saraclar, Tunga Güngör |
| 2010 | One-model speech recognition and synthesis based on articulatory movement HMMs. Tsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada |
| 2010 | Online Gaussian process for nonstationary speech separation. Hsin-Lung Hsieh, Jen-Tzung Chien |
| 2010 | Online SLU model adaptation with a partial oracle. Pierre Gotab, Géraldine Damnati, Frédéric Béchet, Lionel Delphin-Poulat |
| 2010 | Online adaptive learning for speech recognition decoding. Jeff A. Bilmes, Hui Lin |
| 2010 | Optimising a handcrafted dialogue system design. Romain Laroche, Ghislain Putois, Philippe Bretier |
| 2010 | Optimizing spoken dialogue management with fitted value iteration. Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin |
| 2010 | Oriented PCA method for blind speech separation of convolutive mixtures. Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy |
| 2010 | Overlap detection for speaker diarization by fusing spectral and spatial features. Martin Zelenák, Carlos Segura, Javier Hernando |
| 2010 | PDF-optimized LSF vector quantization based on beta mixture models. Zhanyu Ma, Arne Leijon |
| 2010 | Parallel lexical-tree based LVCSR on multi-core processors. Naveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen |
| 2010 | Parallel processing of interruptions and feedback in companions affective dialogue system. Jaakko Hakulinen, Markku Turunen, Raúl Santos de la Cámara, Nigel T. Crook |
| 2010 | Parallel training of neural networks for speech recognition. Karel Veselý, Lukás Burget, Frantisek Grézl |
| 2010 | Parameters describing multimodal interaction - definitions and three usage scenarios. Christine Kühnel, Benjamin Weiss, Sebastian Möller |
| 2010 | Paraphrase generation to improve text-to-speech synthesis. Ghislain Putois, Jonathan Chevelu, Cédric Boidin |
| 2010 | Perception of Estonian vowel categories by native and non-native speakers. Lya Meister, Einar Meister |
| 2010 | Perception of voiceless fricatives by Japanese listeners of advanced and intermediate level English proficiency. Hinako Masuda, Takayuki Arai |
| 2010 | Perception on pitch reset at discourse boundaries. Hsin-Yi Lin, Janice Fon |
| 2010 | Perception-based automatic approximation of F0 contours in Cantonese speech. Yujia Li, Tan Lee |
| 2010 | Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processing. Amy V. Beeston, Guy J. Brown |
| 2010 | Perceptual wavelet decomposition for speech segmentation. Mariusz Ziólko, Jakub Galka, Bartosz Ziólko, Tomasz Drwiega |
| 2010 | Performance estimation of noisy speech recognition considering recognition task complexity. Takeshi Yamada, Tomohiro Nakajima, Nobuhiko Kitawaki, Shoji Makino |
| 2010 | Performance estimation of reverberant speech recognition based on reverberant criteria RSR-d Takahiro Fukumori, Masanori Morise, Takanobu Nishiura |
| 2010 | Phase equalization-based autoregressive model of speech signals. Sadao Hiroya, Takemi Mochida |
| 2010 | Phone boundary detection using sample-based acoustic parameters. You-yu Lin, Yih-Ru Wang, Yuan-Fu Liao |
| 2010 | Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer. Chang Woo Han, Shin Jae Kang, Chul Min Lee, Nam Soo Kim |
| 2010 | Phoneme classification and lattice rescoring based on a k-NN approach. Ladan Golipour, Douglas D. O'Shaughnessy |
| 2010 | Phoneme lattice based texttiling towards multilingual story segmentation. Xiaoxuan Wang, Lei Xie, Bin Ma, Engsiong Chng, Haizhou Li |
| 2010 | Phonetic imitation of Japanese vowel devoicing. Kuniko Y. Nielsen |
| 2010 | Phonetic realization of second occurrence focus in Japanese. Satoshi Nambu, Yong-Cheol Lee |
| 2010 | Phonetic segmentation of singing voice using MIDI and parallel speech. Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua |
| 2010 | Phonetic subspace mixture model for speaker diarization. I-Fan Chen, Shih-Sian Cheng, Hsin-Min Wang |
| 2010 | Phrase alignment confidence for statistical machine translation. Sankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan |
| 2010 | Phrase-medial vowel devoicing in spontaneous French. Francisco Torreira, Mirjam Ernestus |
| 2010 | Physics of body-conducted silent speech - production, propagation and representation of non-audible murmur. Makoto Otani, Tatsuya Hirahara |
| 2010 | Pitch determination using autocorrelation function in spectral domain. M. Shahidur Rahman, Tetsuya Shimamura |
| 2010 | Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks. Feng Huang, Tan Lee |
| 2010 | Pitch similarity in the vicinity of backchannels. Mattias Heldner, Jens Edlund, Julia Hirschberg |
| 2010 | Positional variability of pitch accents in Czech. Tomás Dubeda |
| 2010 | Post-aspiration in standard Italian: some first cross-regional acoustic evidence. Mary Stevens, John Hajek |
| 2010 | Pre- and short-term posttreatment vocal functioning in patients with advanced head and neck cancer treated with concomitant chemoradiotherapy. Irene Jacobi, Lisette van der Molen, Maya van Rossum, Frans J. M. Hilgers |
| 2010 | Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties. Barbara Schuppler, Mirjam Ernestus, Wim A. van Dommelen, Jacques C. Koreman |
| 2010 | Predicting unseen articulations from multi-speaker articulatory models. Gopal Ananthakrishnan, Pierre Badin, Julián Andrés Valdés Vargas, Olov Engwall |
| 2010 | Predicting word accuracy for the automatic speech recognition of non-native speech. Su-Youn Yoon, Lei Chen, Klaus Zechner |
| 2010 | Prior information for rapid speaker adaptation. Catherine Breslin, K. K. Chin, Mark J. F. Gales, Kate M. Knill, Haitian Xu |
| 2010 | Probabilistic integration of joint density model and speaker model for voice conversion. Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu |
| 2010 | Probabilistic state clustering using conditional random field for context-dependent acoustic modelling. Khe Chai Sim |
| 2010 | Production and perception of Vietnamese short vowels in V1V2 context. Viet Son Nguyen, Eric Castelli, René Carré |
| 2010 | Prominence based scoring of speech segments for automatic speech-to-speech summarization. Sree Harsha Yella, Vasudeva Varma, Kishore Prahallad |
| 2010 | Prominence detection in Swedish using syllable correlates. Samer Al Moubayed, Jonas Beskow |
| 2010 | Prosodic grouping and relative clause disambiguation in Mandarin. Jianjing Kuang |
| 2010 | Prosodic speaker verification using subspace multinomial models with intersession compensation. Marcel Kockmann, Lukás Burget, Ondrej Glembek, Luciana Ferrer, Jan Cernocký |
| 2010 | Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillator. Michael C. Brady |
| 2010 | Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual information. Chao-Hong Liu, Chung-Hsien Wu |
| 2010 | Prosody and voice quality of vocal social signals: the case of dominance in scenario meetings. Marcela Charfuelan, Marc Schröder, Ingmar Steiner |
| 2010 | Prosody cues for classification of the discourse particle "hã" in Hindi. Sankalan Prasad, Kalika Bali |
| 2010 | Prosody for the eyes: quantifying visual prosody using guided principal component analysis. Erin Cvejic, Jeesun Kim, Chris Davis, Guillaume Gibert |
| 2010 | Psychological evaluation of a group communication activation robot in a party game. Yoichi Matsuyama, Shinya Fujie, Hikaru Taniyama, Tetsunori Kobayashi |
| 2010 | Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditions. Seyed Omid Sadjadi, Sanjay A. Patil, John H. L. Hansen |
| 2010 | Quality-based playout buffering with FEC for conversational VoIP. Qipeng Gong, Peter Kabal |
| 2010 | Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples. Chi-Chun Lee, Matthew Black, Athanasios Katsamanis, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2010 | Quantized HMMs for low footprint text-to-speech synthesis. Alexander Gutkin, Xavi Gonzalvo, Stefan Breuer, Paul Taylor |
| 2010 | Rapid bootstrapping of five Eastern European languages using the rapid language adaptation toolkit. Ngoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanja Schultz |
| 2010 | Rapid development of speech translation using consecutive interpretation. Matthias Paulik, Alex Waibel |
| 2010 | Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis. Michael I. Proctor, Daniel Bone, Athanasios Katsamanis, Shrikanth S. Narayanan |
| 2010 | Real-life emotion-related states detection in call centers: a cross-corpora study. Laurence Devillers, Christophe Vaudable, Clément Chastagnol |
| 2010 | Recognition of spontaneous conversational speech using long short-term memory phoneme predictions. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll |
| 2010 | Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients. Cong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic |
| 2010 | Recurrent neural network based language model. Tomás Mikolov, Martin Karafiát, Lukás Burget, Jan Cernocký, Sanjeev Khudanpur |
| 2010 | Redescribing intonational categories with functional data analysis. Margaret Zellers, Michele Gubian, Brechtje Post |
| 2010 | Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman method. Wenye Ma, Meng Yu, Jack Xin, Stanley J. Osher |
| 2010 | Reduction of broadband noise in speech signals by multilinear subspace analysis. Yusuke Sato, Tetsuya Hoya, Hovagim Bakardjian, Andrzej Cichocki |
| 2010 | Regularized-MLLR speaker adaptation for computer-assisted language learning system. Dean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose |
| 2010 | Reinforced blocking matrix with cross channel projection for speech enhancement. Inho Lee, Jongsung Yoon, Yoonjae Lee, Hanseok Ko |
| 2010 | Reliable tracking based on speech sample salience of vocal cycle length perturbations. Christophe Mertens, Francis Grenez, Lise Crevier-Buchman, Jean Schoentgen |
| 2010 | Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database. Daniel Felps, Christian Geng, Michael Berger, Korin Richmond, Ricardo Gutierrez-Osuna |
| 2010 | Repair strategies on trial: which error recovery do users like best? Alexander Zgorzelski, Alexander Schmitt, Tobias Heinroth, Wolfgang Minker |
| 2010 | Resources for turn competition in overlap in multi-party conversations: speech rate, pausing and duration. Emina Kurtic, Guy J. Brown, Bill Wells |
| 2010 | Restructuring exponential family mixture models. Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen |
| 2010 | Revisiting VTLN using linear transformation on conventional MFCC. Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney |
| 2010 | Rhythm and formant features for automatic alcohol detection. Florian Schiel, Christian Heinrich, Veronika Neumeyer |
| 2010 | Robust and efficient pitch estimation using an iterative ARMA technique. Jung Ook Hong, Patrick J. Wolfe |
| 2010 | Robust automatic speech recognition with decoder oriented ideal binary mask estimation. Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson |
| 2010 | Robust mixture modeling using t-distribution: application to speaker ID. Sundar Harshavardhan, Thippur V. Sreenivas |
| 2010 | Robust noise estimation using minimum correction with harmonicity control. Xuejing Sun, Kuan-Chieh Yen, Rogerio Guedes Alves |
| 2010 | Robust statistical voice activity detection using a likelihood ratio sign test. Shiwen Deng, Jiqing Han |
| 2010 | Robust voice activity detection in stereo recording with crosstalk. Prasanta Kumar Ghosh, Andreas Tsiartas, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2010 | Robust word recognition using articulatory trajectories and gestures. Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein |
| 2010 | Role of language models in spoken fluency evaluation. Om Deshmukh, Harish Doddala, Ashish Verma, Karthik Visweswariah |
| 2010 | Roles of the average voice in speaker-adaptive HMM-based speech synthesis. Junichi Yamagishi, Oliver Watts, Simon King, Bela Usabaev |
| 2010 | Round-robin discrimination model for reranking ASR hypotheses. Takanobu Oba, Takaaki Hori, Atsushi Nakamura |
| 2010 | Russian infants and children's sounds and speech corpuses for language acquisition studies. Elena E. Lyakso, Olga V. Frolova, Anna V. Kurazhova, Julia S. Gaikova |
| 2010 | SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech. Wei Chu, Abeer Alwan |
| 2010 | SCARF: a segmental conditional random field toolkit for speech recognition. Geoffrey Zweig, Patrick Nguyen |
| 2010 | SEAME: a Mandarin-English code-switching speech corpus in South-East Asia. Dau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li |
| 2010 | SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environment. Ji Hun Park, Seon Man Kim, Jae Sam Yoon, Hong Kook Kim, Sung Joo Lee, Yunkeun Lee |
| 2010 | Say it as you mean it - analyzing free user comments in the VOICE awards corpus. Florian Gödde, Sebastian Möller |
| 2010 | Say what? why users choose to speak their web queries. Maryam Kamvar, Doug Beeferman |
| 2010 | Score-level compensation of extreme speech duration variability in speaker verification. Sergio Perez-Gomez, Daniel Ramos, Javier Gonzalez-Dominguez, Joaquin Gonzalez-Rodriguez |
| 2010 | Search by voice in Mandarin Chinese. Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche, Pedro J. Moreno |
| 2010 | Selecting phonotactic features for language recognition. Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng |
| 2010 | Selective gammatone filterbank feature for robust sound event recognition. Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li |
| 2010 | Semantic facilitation in bilingual everyday speech comprehension. Marco van de Ven, Benjamin V. Tucker, Mirjam Ernestus |
| 2010 | Semi-automated update of automatic transcription system for the Japanese national congress. Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya Kawahara |
| 2010 | Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition. Khe Chai Sim, Shilin Liu |
| 2010 | Semi-supervised extractive speech summarization via co-training algorithm. Shasha Xie, Hui Lin, Yang Liu |
| 2010 | Semi-supervised learning for improved expression of uncertainty in discriminative classifiers. Jonathan Malkin, Jeff A. Bilmes |
| 2010 | Semi-supervised part-of-speech tagging in speech applications. Richard Dufour, Benoît Favre |
| 2010 | Semi-supervised training of Gaussian mixture models by conditional entropy minimization. Jui-Ting Huang, Mark Hasegawa-Johnson |
| 2010 | Session variability contrasts in the MARP corpus. Keith W. Godin, John H. L. Hansen |
| 2010 | Setup for acoustic-visual speech synthesis by concatenating bimodal units. Asterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger |
| 2010 | Shape-invariant speech transformation with the phase vocoder. Axel Röbel |
| 2010 | Shrinkage model adaptation in automatic speech recognition. Jinyu Li, Yu Tsao, Chin-Hui Lee |
| 2010 | Signal interaction and the devil function. John R. Hershey, Peder A. Olsen, Steven J. Rennie |
| 2010 | Signal-based accent and phrase marking using the Fujisaki model. Hussein Hussein, Rüdiger Hoffmann |
| 2010 | Significance of pitch synchronous analysis for speaker recognition using AANN models. Sri Harish Reddy Mallidi, Kishore Prahallad, Suryakanth V. Gangashetty, B. Yegnanarayana |
| 2010 | Silent vs vocalized articulation for a portable ultrasound-based silent speech interface. Victoria M. Florescu, Lise Crevier-Buchman, Bruce Denby, Thomas Hueber, Antonia Colazo-Simon, Claire Pillot-Loiseau, Pierre Roussel-Ragot, Cédric Gendrot, Sophie Quattrocchi |
| 2010 | Similar n-gram language model. Christian Gillot, Christophe Cerisara, David Langlois, Jean Paul Haton |
| 2010 | Similarity of effects of emotions on the speech organ configuration with and without speaking. Tatsuya Kitamura |
| 2010 | Similarity scoring for recognizing repeated out-of-vocabulary words. Mirko Hannemann, Stefan Kombrink, Martin Karafiát, Lukás Burget |
| 2010 | Simple and efficient speaker comparison using approximate KL divergence. William M. Campbell, Zahi N. Karam |
| 2010 | Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems. Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino |
| 2010 | Single-channel speech enhancement using Kalman filtering in the modulation domain. Stephen So, Kamil K. Wójcicki, Kuldip K. Paliwal |
| 2010 | Single-speaker/multi-speaker co-channel speech classification. Stéphane Rossignol, Olivier Pietquin |
| 2010 | Sinusoidal model parameterization for HMM-based TTS system. Slava Shechtman, Alexander Sorin |
| 2010 | Social role discovery from spoken language using dynamic Bayesian networks. Sibel Yaman, Dilek Hakkani-Tür, Gökhan Tür |
| 2010 | Sound-based assistive technology supporting "seeing", "hearing" and "speaking" for the disabled and the elderly. Tohru Ifukube |
| 2010 | Sparse auto-associative neural networks: theory and application to speech recognition. Garimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky |
| 2010 | Sparse component analysis for speech recognition in multi-speaker environment. Afsaneh Asaei, Hervé Bourlard, Philip N. Garner |
| 2010 | Sparse representation features for speech recognition. Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky, Abhinav Sethy |
| 2010 | Sparse representations for text categorization. Tara N. Sainath, Sameer Maskey, Dimitri Kanevsky, Bhuvana Ramabhadran, David Nahamoo, Julia Hirschberg |
| 2010 | Speaker adaptation based on nonlinear spectral transform for speech recognition. Toyohiro Hayashi, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda |
| 2010 | Speaker adaptation based on system combination using speaker-class models. Tetsuo Kosaka, Takashi Ito, Masaharu Katoh, Masaki Kohda |
| 2010 | Speaker adaptation in transformation space using two-dimensional PCA. Yongwon Jeong, Young Rok Song, Hyung Soon Kim |
| 2010 | Speaker and language adaptive training for HMM-based polyglot speech synthesis. Heiga Zen |
| 2010 | Speaker characterization using long-term and temporal information. Chien-Lin Huang, Hanwu Sun, Bin Ma, Haizhou Li |
| 2010 | Speaker diarization in meeting audio for single distant microphone. Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li |
| 2010 | Speaker recognition experiments using connectionist transformation network features. Alberto Abad, Isabel Trancoso |
| 2010 | Speaker recognition using supervised probabilistic principal component analysis. Yun Lei, John H. L. Hansen |
| 2010 | Speaker recognition using the resynthesized speech via spectrum modeling. Xiang Zhang, Chuan Cao, Lin Yang, Hongbin Suo, Jianping Zhang, Yonghong Yan |
| 2010 | Speaker tracking in an unsupervised speech controlled system. Tobias Herbig, Franz Gerl, Wolfgang Minker |
| 2010 | Speaker-dependent mapping of source and system features for enhancement of throat microphone speech. Anand Joseph Xavier Medabalimi, Sri Harish Reddy Mallidi, B. Yegnanarayana |
| 2010 | Speaker-independent HMM-based voice conversion using quantized fundamental frequency. Takashi Nose, Takao Kobayashi |
| 2010 | Speaking style dependency of formant targets. Akiko Amano-Kusumoto, John-Paul Hosom, Alexander Kain |
| 2010 | Specification in context - devoicing processes in Polish, French, American English and German sonorants. Jagoda Sieczkowska, Bernd Möbius, Grzegorz Dogil |
| 2010 | Spectral entropy-based voice activity detector for videoconferencing systems. Bowon Lee, Debargha Mukherjee |
| 2010 | Spectro-temporal modulations for robust speech emotion recognition. Lan-Ying Yeh, Tai-Shih Chi |
| 2010 | Speech categorization context effects in seven- to nine-month-old infants. Ellen Marklund, Francisco Lacerda, Anna Ericsson |
| 2010 | Speech database reduction method for corpus-based TTS system. Mitsuaki Isogai, Hideyuki Mizuno |
| 2010 | Speech dominoes and phonetic convergence. Gérard Bailly, Amélie Lelong |
| 2010 | Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering. Kai Li, Qiang Fu, Yonghong Yan |
| 2010 | Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signals. Hiroaki Kawashima, Yu Horii, Takashi Matsuyama |
| 2010 | Speech intelligibility of diagonally localized speech with competing noise using bone-conduction headphones. Kazuhiro Kondo, Takayuki Kanda, Yosuke Kobayashi, Hiroyuki Yagyu |
| 2010 | Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding. Xiaoqiang Xiao, Robert M. Nickel |
| 2010 | Speech recognition using long-term phase information. Kazumasa Yamamoto, Eiichi Sueyoshi, Seiichi Nakagawa |
| 2010 | Speech recognition with a seamlessly updated language model for real-time closed-captioning. Toru Imai, Shinichi Homma, Akio Kobayashi, Takahiro Oku, Shoei Sato |
| 2010 | Speech recognizer optimization under speed constraints. Ivan Bulyko |
| 2010 | Speech robot mimicking human articulatory motion. Kotaro Fukui, Toshihiro Kusano, Yoshikazu Mukaeda, Yuto Suzuki, Atsuo Takanishi, Masaaki Honda |
| 2010 | Speech synthesis by modeling harmonics structure with multiple function. Toru Nakashika, Ryuki Tachibana, Masafumi Nishimura, Tetsuya Takiguchi, Yasuo Ariki |
| 2010 | Speech-based automated cognitive status assessment. Dilek Hakkani-Tür, Dimitra Vergyri, Gökhan Tür |
| 2010 | Spoken English assessment system for non-native speakers using acoustic and prosodic features. Qin Shi, Kun Li, Shilei Zhang, Stephen M. Chu, Ji Xiao, Zhijian Ou |
| 2010 | Spoken document retrieval for oral presentations integrating global document similarities into local document similarities. Hiroaki Nanjo, Yusuke Iyonaga, Takehiko Yoshimi |
| 2010 | State-based labelling for a sparse representation of speech and its application to robust speech recognition. Tuomas Virtanen, Jort F. Gemmeke, Antti Hurmalainen |
| 2010 | Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases. Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Hidehisa Nagano, Kunio Kashino |
| 2010 | Statistical multi-stream modeling of real-time MRI articulatory speech data. Erik Bresch, Athanasios Katsamanis, Louis Goldstein, Shrikanth S. Narayanan |
| 2010 | Still talking to machines (cognitively speaking). Steve J. Young |
| 2010 | Strategies for statistical spoken language understanding with small amount of data - an empirical study. Ye-Yi Wang |
| 2010 | Study on interaction between entropy pruning and Kneser-Ney smoothing. Ciprian Chelba, Thorsten Brants, Will Neveitt, Peng Xu |
| 2010 | Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding. Masatsune Tamura, Takehiko Kagoshima, Masami Akamine |
| 2010 | Superwideband extension of G.718 and G.729.1 speech codecs. Lasse Laaksonen, Mikko Tammi, Vladimir Malenovsky, Tommy Vaillancourt, Mi Suk Lee, Tomofumi Yamanashi, Masahiro Oshikiri, Claude Lamblin, Balázs Kövesi, Lei Miao, Deming Zhang, Jon Gibbs, Holly Francois |
| 2010 | Syllable-level prominence detection with acoustic evidence. Je Hun Jeon, Yang Liu |
| 2010 | Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners. Michael Pucher, Dietmar Schabus, Junichi Yamagishi |
| 2010 | Synthesizing photo-real talking head via trajectory-guided sample selection. Lijuan Wang, Xiaojun Qian, Wei Han, Frank K. Soong |
| 2010 | System output combination for improved speaker diarization. Simon Bozonnet, Nicholas W. D. Evans, Xavier Anguera, Oriol Vinyals, Gerald Friedland, Corinne Fredouille |
| 2010 | Techniques for topic detection based processing in spoken dialog systems. Rajesh Balchandran, Leonid Rachevsky, Bhuvana Ramabhadran, Miroslav Novak |
| 2010 | Template-based spectral estimation using microphone array for speech recognition. Satoshi Tamura, Eriko Hishikawa, Wataru Taguchi, Satoru Hayamizu |
| 2010 | Text normalization based on statistical machine translation and internet user support. Tim Schlippe, Chenfei Zhu, Jan Gebhardt, Tanja Schultz |
| 2010 | Text-based unstressed syllable prediction in Mandarin. Ya Li, Jianhua Tao, Meng Zhang, Shifeng Pan, Xiaoying Xu |
| 2010 | Text-independent F0 transformation with non-parallel data for voice conversion. Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li |
| 2010 | The 2010 CMU GALE speech-to-text system. Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz |
| 2010 | The AMIDA 2009 meeting transcription system. Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan |
| 2010 | The CHiME corpus: a resource and a challenge for computational hearing in multisource environments. Heidi Christensen, Jon Barker, Ning Ma, Phil D. Green |
| 2010 | The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems. Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li |
| 2010 | The INTERSPEECH 2010 paralinguistic challenge. Björn W. Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, Shrikanth S. Narayanan |
| 2010 | The NIST 2010 speaker recognition evaluation. Alvin F. Martin, Craig S. Greenberg |
| 2010 | The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. David Dean, Sridha Sridharan, Robert Vogt, Michael Mason |
| 2010 | The RWTH 2009 Quaero ASR evaluation system for English and German. Markus Nußbaum-Thom, Simon Wiesler, Martin Sundermeyer, Christian Plahl, Stefan Hahn, Ralf Schlüter, Hermann Ney |
| 2010 | The characterization of the relative information content by spectral features for the objective intelligibility assessment of nonlinearly processed speech. Anton Schlesinger, Marinus M. Boone |
| 2010 | The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systems. Takuya Nishimoto, Takayuki Watanabe |
| 2010 | The effect of a word embedded in a sentence and speaking rate variation on the perceptual training of geminate and singleton consonant distinction. Mee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka |
| 2010 | The effect of audience familiarity on the perception of modified accent. Jonathan Teutenberg, Catherine Inez Watson |
| 2010 | The effects of EMA-based augmented visual feedback on the English speakers' acquisition of the Japanese flap: a perceptual study. June S. Levitt, William F. Katz |
| 2010 | The estimation and kernel metric of spectral correlation for text-independent speaker verification. Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai |
| 2010 | The impact of ASR on abstractive vs. extractive meeting summaries. Gabriel Murray, Giuseppe Carenini, Raymond T. Ng |
| 2010 | The influence of actual and perceived sexual orientation on diadochokinetic rate in women and men. Benjamin Munson |
| 2010 | The influence of expertise and efficiency on modality selection strategies and perceived mental effort. Ina Wechsung, Stefan Schaffer, Robert Schleicher, Anja Naumann, Sebastian Möller |
| 2010 | The interrelation between the stimulus range and the number of response categories in vowel categorization. Titia Benders, Paola Escudero |
| 2010 | The prosody of Swedish conversational grunts. Daniel Neiberg, Joakim Gustafson |
| 2010 | The relation between pitch perception preference and emotion identification. Marie Nilsenová, Martijn Goudbeek, Luuk Kempen |
| 2010 | The relevance of timing, pauses and overlaps in dialogues: detecting topic changes in scenario based meetings. Saturnino Luz, Jing Su |
| 2010 | The role of higher-level linguistic features in HMM-based speech synthesis. Oliver Watts, Junichi Yamagishi, Simon King |
| 2010 | The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion. Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano |
| 2010 | The use of sense in unsupervised training of acoustic models for ASR systems. Rita Singh, Benjamin Lambert, Bhiksha Raj |
| 2010 | The use of subvector quantization and discrete densities for fast GMM computation for speaker verification. Guoli Ye, Brian Mak |
| 2010 | Time conditioned search in automatic speech recognition reconsidered. David Nolden, Hermann Ney, Ralf Schlüter |
| 2010 | Topic and style-adapted language modeling for Thai broadcast news ASR. Markpong Jongtaveesataporn, Sadaoki Furui |
| 2010 | Topic-dependent n-gram models based on optimization of context lengths in LDA. Akira Nakamura, Satoru Hayamizu |
| 2010 | Topological representation of speech for speaker recognition. Gabriel Hernández Sierra, Jean-François Bonastre, Driss Matrouf, José R. Calvo |
| 2010 | Toward aero-acoustical analysis of the sibilant /s/: an oral cavity modeling. Kazunori Nozaki, Youhei Ohnishi, Takashi Suda, Shigeo Wada, Shinji Shimojo |
| 2010 | Toward detecting voice activity employing soft decision in second-order conditional MAP. Sang-Kyun Kim, Jae-Hun Choi, Sang-Ick Kang, Ji-Hyun Song, Joon-Hyuk Chang |
| 2010 | Towards a robust face recognition system using compressive sensing. Allen Y. Yang, Zihan Zhou, Yi Ma, Shankar Sastry |
| 2010 | Towards affective state modeling in narrative and conversational settings. Bart Jochems, Martha A. Larson, Roeland Ordelman, Ronald Poppe, Khiet P. Truong |
| 2010 | Towards an ASR-free objective analysis of pathological speech. Catherine Middag, Yvan Saeys, Jean-Pierre Martens |
| 2010 | Towards long-range prosodic attribute modeling for language recognition. Raymond W. M. Ng, Cheung-Chi Leung, Ville Hautamäki, Tan Lee, Bin Ma, Haizhou Li |
| 2010 | Towards mixed language speech recognition systems. David Imseng, Hervé Bourlard, Mathew Magimai-Doss |
| 2010 | Towards spoken term discovery at scale with zero resources. Aren Jansen, Kenneth Church, Hynek Hermansky |
| 2010 | Tracter: a lightweight dataflow framework. Philip N. Garner, John Dines |
| 2010 | Training a parametric-based logF0 model with the minimum generation error criterion. Javier Latorre, Mark J. F. Gales, Heiga Zen |
| 2010 | Transcript-dependent speaker recognition using Mixer 1 and 2. Fred S. Richardson, Joseph P. Campbell |
| 2010 | Turn taking-based conversation detection by using DOA estimation. Yohei Kawaguchi, Masahito Togami, Yasunari Obuchi |
| 2010 | Turn-alignment using eye-gaze and speech in conversational interaction. Kristiina Jokinen, Kazuaki Harada, Masafumi Nishida, Seiichi Yamamoto |
| 2010 | Two new estimation methods for a superpositional intonation model. Humberto M. Torres, Hansjörg Mixdorff, Jorge A. Gurlekian, Hartmut R. Pfitzinger |
| 2010 | Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots. Takami Yoshida, Kazuhiro Nakadai |
| 2010 | Ungrounded independent non-negative factor analysis. Bhiksha Raj, Kevin W. Wilson, Alexander Krueger, Reinhold Haeb-Umbach |
| 2010 | Unscented transform with online distortion estimation for HMM adaptation. Jinyu Li, Dong Yu, Yifan Gong, Li Deng |
| 2010 | Unsupervised acoustic model adaptation for multi-origin non native ASR. Sethserey Sam, Eric Castelli, Laurent Besacier |
| 2010 | Unsupervised discovery and training of maximally dissimilar cluster models. Françoise Beaufays, Vincent Vanhoucke, Brian Strope |
| 2010 | Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model. Kouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka |
| 2010 | Unsupervised model adaptation on targeted speech segments for LVCSR system combination. Richard Dufour, Fethi Bougares, Yannick Estève, Paul Deléglise |
| 2010 | Unsupervised sequential organization for cochannel speech separation. Ke Hu, DeLiang Wang |
| 2010 | Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping. Chun-an Chan, Lin-Shan Lee |
| 2010 | Unvoiced speech segregation based on CASA and spectral subtraction. Ke Hu, DeLiang Wang |
| 2010 | Using a DBN to integrate sparse classification and GMM-based ASR. Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves |
| 2010 | Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition. Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Germán Bordel |
| 2010 | Using dependency parsing and machine learning for factoid question answering on spoken documents. Pere Comas, Jordi Turmo, Lluís Màrquez |
| 2010 | Using harmonic phase information to improve ASR rate. Ibon Saratxaga, Inma Hernáez, Igor Odriozola, Eva Navas, Iker Luengo, Daniel Erro |
| 2010 | Using high-level information to detect key audio events in a tennis game. Qiang Huang, Stephen J. Cox |
| 2010 | Using non-native error patterns to improve pronunciation verification. Joost van Doremalen, Catia Cucchiarini, Helmer Strik |
| 2010 | Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech. Gang Wang, Xiaojun Wu, Thomas Fang Zheng |
| 2010 | Using prosody to improve Mandarin automatic speech recognition. Chong-Jia Ni, Wenju Liu, Bo Xu |
| 2010 | Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor quality. Hanna Silén, Elina Helander, Jani Nurminen, Konsta Koppinen, Moncef Gabbouj |
| 2010 | Using spectro-temporal features to improve AFE feature extraction for ASR. Suman V. Ravuri, Nelson Morgan |
| 2010 | Utilizing a noisy-channel approach for Korean LVCSR. Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura |
| 2010 | Utterance selection for speech acts in a cognitive tourguide scenario. Felix Putze, Tanja Schultz |
| 2010 | VAD-measure-embedded decoder with online model adaptation. Tasuku Oonishi, Koji Iwano, Sadaoki Furui |
| 2010 | Validation of a training method for L2 continuous-speech segmentation. Anne Cutler, Janise Shanley |
| 2010 | Variant time-frequency cepstral features for speaker recognition. Weiqiang Zhang, Yan Deng, Liang He, Jia Liu |
| 2010 | Verifying pronunciation dictionaries using conflict analysis. Marelie H. Davel, Febe de Wet |
| 2010 | Viseme-dependent weight optimization for CHMM-based audio-visual speech recognition. Alexey Karpov, Andrey Ronzhin, Konstantin Markov, Milos Zelezný |
| 2010 | Vocabulary independent spoken query: a case for subword units. Evandro B. Gouvêa, Tony Ezzat |
| 2010 | Vocal tract contour analysis of emotional speech by the functional data curve representation. Sungbok Lee, Shrikanth S. Narayanan |
| 2010 | Voice activity detection based on conditional random fields using multiple features. Akira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda |
| 2010 | Voice activity detection in a regularized reproducing kernel Hilbert space. Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura |
| 2010 | Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization. Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani |
| 2010 | Voice attributes affecting likability perception. Benjamin Weiss, Felix Burkhardt |
| 2010 | Voice quality evaluation of recent open source codecs. Anssi Rämö, Henri Toukomaa |
| 2010 | Voice search for development. Etienne Barnard, Johan Schalkwyk, Charl Johannes van Heerden, Pedro J. Moreno |
| 2010 | WFST compression for automatic speech recognition. Diamantino Caseiro |
| 2010 | What do you mean, you're uncertain?: the interpretation of cue words and rising intonation in dialogue. Catherine Lai |
| 2010 | What else is new than the Hamming window? robust MFCCs for speaker recognition via multitapering. Tomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria Hansson-Sandsten |
| 2010 | When is indexical information about speech activated? evidence from a cross-modal priming experiment. Benjamin Munson, Renata Solum |
| 2010 | Wiktionary as a source for automatic pronunciation extraction. Tim Schlippe, Sebastian Ochs, Tanja Schultz |
| 2010 | Within and across sentence boundary language model. Saeedeh Momtazi, Friedrich Faubel, Dietrich Klakow |