INTERSPEECH - RankMe

782 papers

Year	Title / Authors
2010	"flat pitch accents" in Czech. Tomás Dubeda
2010	11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, Chiba, Japan, September 26-30, 2010. Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura
2010	2010, a speech oddity: phonetic transcription of reversed speech. François Pellegrino, Emmanuel Ferragne, Fanny Meunier
2010	A Bayesian approach to voice activity detection using multiple statistical models and discriminative training. Tao Yu, John H. L. Hansen
2010	A DOA estimation algorithm based on equalization-cancellation theory. Duc Thanh Chau, Junfeng Li, Masato Akagi
2010	A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation. Ramón Fernandez Astudillo, Reinhold Orglmeister
2010	A blind signal-to-noise ratio estimator for high noise speech recordings. Charles Mercier, Roch Lefebvre
2010	A classifier-based target cost for unit selection speech synthesis trained on perceptual data. Volker Strom, Simon King
2010	A cluster-profile representation of emotion using agglomerative hierarchical clustering. Emily Mower, Kyu Jeong Han, Sungbok Lee, Shrikanth S. Narayanan
2010	A comparative large scale study of MLP features for Mandarin ASR. Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri, Wen Wang
2010	A comparative study of constrained and unconstrained approaches for segmentation of speech signal. Venkatesh Keri, Kishore Prahallad
2010	A comparative study of noise estimation algorithms for VTS-based robust speech recognition. Yong Zhao, Biing-Hwang Juang
2010	A comparison of pronunciation modeling approaches for HMM-TTS. Gabriel Webster, Sacha Krstulovic, Kate M. Knill
2010	A corpus-based approach to speech enhancement from nonstationary noise. Ji Ming, Ramji Srinivasan, Danny Crookes
2010	A discriminative performance metric for GMM-UBM speaker identification. Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li
2010	A discriminative splitting criterion for phonetic decision trees. Simon Wiesler, Georg Heigold, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney
2010	A duration modeling technique with incremental speech rate normalization. Hiroshi Fujimura, Takashi Masuko, Mitsuyoshi Tachimori
2010	A factorial sparse coder model for single channel source separation. Robert Peharz, Michael Stark, Franz Pernkopf, Yannis Stylianou
2010	A fast implementation of factor analysis for speaker verification. Qingsong Liu, Wei Huang, Dongxing Xu, Hongbin Cai, Beiqian Dai
2010	A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data. Taras Butko, Climent Nadeu
2010	A fast query by humming system based on notes. Jingzhou Yang, Jia Liu, Weiqiang Zhang
2010	A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computation. Konstantin Biatov
2010	A feature extraction method for automatic speech recognition based on the cochlear nucleus. Serajul Haque, Roberto Togneri
2010	A hierarchical F0 modeling method for HMM-based speech synthesis. Ming Lei, Yi-Jian Wu, Frank K. Soong, Zhen-Hua Ling, Li-Rong Dai
2010	A hybrid approach to online speaker diarization. Carlos Vaquero, Oriol Vinyals, Gerald Friedland
2010	A hybrid approach to robust word lattice generation via acoustic-based word detection. Icksang Han, Chiyoun Park, Jeongmi Cho, Jeongsu Kim
2010	A hybrid architecture for mobile voice user interfaces. Imre Kiss, Joseph Polifroni, Chao Wang, Ghinwa F. Choueiter, Mike Phillips
2010	A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor. Chang Huai You, Haizhou Li, Kong-Aik Lee
2010	A language-identification inspired method for spontaneous speech detection. Mickael Rouvier, Richard Dufour, Georges Linarès, Yannick Estève
2010	A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcripts. Sebastian Tschöpel, Daniel Schneider
2010	A longest matching segment approach for text-independent speaker recognition. Ayeh Jafari, Ramji Srinivasan, Danny Crookes, Ji Ming
2010	A maximum a posteriori sound source localization in reverberant and noisy conditions. Jinho Choi, Chang D. Yoo
2010	A minimum classification error approach to pronunciation variation modeling of non-native proper names. Line Adde, Bert Réveil, Jean-Pierre Martens, Torbjørn Svendsen
2010	A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion. Xiaodan Zhuang, Lijuan Wang, Frank K. Soong, Mark Hasegawa-Johnson
2010	A modified parameterization of the Fujisaki model. Robert Schubert, Oliver Jokisch, Diane Hirschfeld
2010	A multidomain approach for automatic home environmental sound classification. Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis
2010	A multimodal density function estimation approach to formant tracking. Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas
2010	A multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks. José L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González
2010	A multistream multiresolution framework for phoneme recognition. Nima Mesgarani, Samuel Thomas, Hynek Hermansky
2010	A new VAD framework using statistical model and human knowledge based empirical rule. Ji Wu, Xiao-Lei Zhang, Wei Li
2010	A new approach for automatic tone error detection in strong accented Mandarin based on dominant set. Taotao Zhu, Dengfeng Ke, Zhenbiao Chen, Bo Xu
2010	A new binary mask based on noise constraints for improved speech intelligibility. Gibak Kim, Philipos C. Loizou
2010	A new multichannel multi modal dyadic interaction database. Viktor Rozgic, Bo Xiao, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010	A novel approach for matched reverberant training of HMMs using data pairs. Armin Sehr, Christian Hofmann, Roland Maas, Walter Kellermann
2010	A novel confidence measure based on marginalization of jointly estimated error cause probabilities. Atsunori Ogawa, Atsushi Nakamura
2010	A novel feature extraction strategy for multi-stream robust emotion identification. Gang Liu, Yun Lei, John H. L. Hansen
2010	A novel hybrid approach for Mandarin speech synthesis. Shifeng Pan, Meng Zhang, Jianhua Tao
2010	A novel path extension framework using steady segment detection for Mandarin speech recognition. Zhanlei Yang, Wenju Liu
2010	A novel speaker binary key derived from anchor models. Xavier Anguera, Jean-François Bonastre
2010	A novel text-independent phonetic segmentation algorithm based on the microcanonical multiscale formalism. Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein M. Yahia
2010	A particle filter feature compensation approach to robust speech recognition. Aleem Mushtaq, Yu Tsao, Chin-Hui Lee
2010	A perceptual study of acceleration parameters in HMM-based TTS. Yining Chen, Zhi-Jie Yan, Frank K. Soong
2010	A phoneme recognition framework based on auditory spectro-temporal receptive fields. Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky
2010	A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identity. Kayoko Yanagisawa, Mark A. Huckvale
2010	A procedure for estimating gestural scores from natural speech. Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson
2010	A quick sequential forward floating feature selection algorithm for emotion detection from speech. Mátyás Brendel, Riccardo Zaccarelli, Laurence Devillers
2010	A regularized discriminative training method of acoustic models derived by minimum relative entropy discrimination. Yotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi
2010	A robust audio-visual speech recognition using audio-visual voice activity detection. Satoshi Tamura, Masato Ishikawa, Takashi Hashiba, Shin'ichi Takeuchi, Satoru Hayamizu
2010	A robust speech recognition system against the ego noise of a robot. Gökhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura
2010	A rule-based backchannel prediction model using pitch and pause information. Khiet P. Truong, Ronald Poppe, Dirk Heylen
2010	A segment-based non-parametric approach for monophone recognition. Ladan Golipour, Douglas D. O'Shaughnessy
2010	A semi-supervised cluster-and-label approach for utterance classification. Amparo Albalate, Aparna Suchindranath, David Suendermann, Wolfgang Minker
2010	A singing style modeling system for singing voice synthesizers. Keijiro Saino, Makoto Tachibana, Hideki Kenmochi
2010	A spectral LF model based approach to voice source parameterisation. John Kane, Mark Kane, Christer Gobl
2010	A speech-in-noise test based on spoken digits: comparison of normal and impaired listeners using a computer model. Matthew Robertson, Guy J. Brown, Wendy Lecluyse, Manasa Panda, Christine M. Tan
2010	A spoken term detection framework for recovering out-of-vocabulary words using the web. Carolina Parada, Abhinav Sethy, Mark Dredze, Frederick Jelinek
2010	A statistical segment-based approach for spoken language understanding. Lucía Ortega, Isabel Galiano, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra
2010	A stochastic finite-state transducer approach to spoken dialog management. Lluís F. Hurtado, Joaquin Planells, Encarna Segarra, Emilio Sanchis, David Griol
2010	A study of interplay between articulatory movement and prosodic characteristics in emotional speech production. Jangwon Kim, Sungbok Lee, Shrikanth S. Narayanan
2010	A study of intra-speaker and inter-speaker affective variability using electroglottograph and inverse filtered glottal waveforms. Daniel Bone, Samuel Kim, Sungbok Lee, Shrikanth S. Narayanan
2010	A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR. Guangchuan Shi, Yu Shi, Qiang Huo
2010	A study of term weighting in phonotactic approach to spoken language recognition. Sirinoot Boonsuk, Donglai Zhu, Bin Ma, Atiwong Suchato, Proadpran Punyabukkana, Nattanun Thatphithakkul, Chai Wutiwiwatchai
2010	A super-resolution spectrogram using coupled PLCA. Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel
2010	A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification. Chi-Sang Jung, Kyu Jeong Han, Hyunson Seo, Shrikanth S. Narayanan, Hong-Goo Kang
2010	Accelerating hierarchical acoustic likelihood computation on graphics processors. Pavel Kveton, Miroslav Novak
2010	Accurate pitch marking for prosodic modification of speech segments. Thomas Ewender, Beat Pfister
2010	Acoustic analysis of intonation in Parkinson's disease. Joan K. Y. Ma, Rüdiger Hoffmann
2010	Acoustic correlates of meaning structure in conversational speech. Alexei V. Ivanov, Giuseppe Riccardi, Sucheta Ghosh, Sara Tonelli, Evgeny A. Stepanov
2010	Acoustic correlates of voice quality improvement by voice training. Kiyoaki Aikawa, Junko Uenuma, Tomoko Akitake
2010	Acoustic feature analysis in speech emotion primitives estimation. Dongrui Wu, Thomas D. Parsons, Shrikanth S. Narayanan
2010	Acoustic feature diversity and speaker verification. R. Padmanabhan, Hema A. Murthy
2010	Acoustic modeling with bootstrap and restructuring for low-resourced languages. Xiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou
2010	Acoustic vector resampling for GMMSVM-based speaker verification. Man-Wai Mak, Wei Rao
2010	Acoustic-based recognition of head gestures accompanying speech. Akira Sasou, Yasuharu Hashimoto, Katsuhiko Sakaue
2010	Acoustic-to-articulatory inversion based on local regression. Samer Al Moubayed, Gopal Ananthakrishnan
2010	Acoustics-based phonetic transcription method for proper nouns. Antoine Laurent, Sylvain Meignier, Téva Merlin, Paul Deléglise
2010	Active appearance models for photorealistic visual speech synthesis. Wesley Mattheyses, Lukas Latacz, Werner Verhelst
2010	Active word learning under uncertain input conditions. Maarten Versteegh, Louis ten Bosch, Lou Boves
2010	Adaptation of a tongue shape model by local feature transformations. Chao Qin, Miguel Á. Carreira-Perpiñán, Mohsen Farhadloo
2010	Adapting a duration synthesis model to rate children's oral reading prosody. Minh Duong, Jack Mostow
2010	Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environments. Mark C. Huggins, Brett Y. Smolenski, Aaron D. Lawson
2010	Adaptive voice-quality control based on one-to-many eigenvoice conversion. Kumi Ohta, Tomoki Toda, Yamato Ohtani, Hiroshi Saruwatari, Kiyohiro Shikano
2010	Advanced speech communication system for deaf people. Rubén San Segundo, Verónica López-Ludeña, Raquel Martín, Syaheerah L. Lutfi, Javier Ferreiros, Ricardo de Córdoba, José Manuel Pardo
2010	Advances in fast multistream diarization based on the information bottleneck framework. Deepu Vijayasenan, Fabio Valente, Hervé Bourlard
2010	Affective story teller: a TTS system for emotional expressivity. Shaikh Mostafa Al Masum, Antonio Rui Ferreira Rebordão, Keikichi Hirose
2010	Age and gender classification from speech using decision level fusion and ensemble based techniques. Florian Lingenfelser, Johannes Wagner, Thurid Vogt, Jonghwa Kim, Elisabeth André
2010	Age and gender classification using fusion of acoustic and prosodic features. Hugo Meinedo, Isabel Trancoso
2010	Age and gender recognition based on multiple systems - early vs. late fusion. Tobias Bocklet, Georg Stemmer, Viktor Zeißler, Elmar Nöth
2010	Age recognition based on speech signals using weights supervector. Royi Porat, Dan Lange, Yaniv Zigel
2010	An HMM trajectory tiling (HTT) approach to high quality TTS. Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Xin Zhuang, Shengyi Kong
2010	An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation. Hui Liang, John Dines
2010	An analysis of sparseness and regularization in exemplar-based methods for speech classification. Dimitri Kanevsky, Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo
2010	An analytic modeling approach to enhancing throat microphone speech commands for keyword spotting. Jun Cai, Stefano Marini, Pierre Malarme, Francis Grenez, Jean Schoentgen
2010	An auditory based modulation spectral feature for reverberant speech recognition. Hari Krishna Maganti, Marco Matassoni
2010	An effect of formant amplitude in vowel perception. Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano
2010	An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation. Wooil Kim, Jun-Won Suh, John H. L. Hansen
2010	An empirical comparison of the t Josef R. Novak, Paul R. Dixon, Sadaoki Furui
2010	An exploration of voice source correlates of focus. Irena Yanushevskaya, Christer Gobl, John Kane, Ailbhe Ní Chasaide
2010	An implementation of decision tree-based context clustering on graphics processing units. Nicholas Pilkington, Heiga Zen
2010	An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models. Kyu Jeong Han, Shrikanth S. Narayanan
2010	An improved wavelet-based dereverberation for robust automatic speech recognition. Randy Gomez, Tatsuya Kawahara
2010	An integrated top-down/bottom-up approach to speaker diarization. Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Dong Wang, Raphaël Troncy
2010	An intonation model for TTS in Sepedi. Daniel R. van Niekerk, Etienne Barnard
2010	An intrusive super-wideband speech quality model: DIAL. Nicolas Côté, Vincent Koehl, Valérie Gautier-Turbin, Alexander Raake, Sebastian Möller
2010	An investigation into direct scoring methods without SVM training in speaker verification. Ce Zhang, Rong Zheng, Bo Xu
2010	An investigation of formant frequencies for cognitive load classification. Tet Fei Yap, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi
2010	An unsupervised approach to creating web audio contents-based HMM voices. Jinfu Ni, Hisashi Kawai
2010	Analysis and detection of cognitive load and frustration in drivers' speech. Hynek Boril, Seyed Omid Sadjadi, Tristan Kleinschmidt, John H. L. Hansen
2010	Analysis of excitation source information in emotional speech. S. R. Mahadeva Prasanna, D. Govind
2010	Analysis of gender normalization using MLP and VTLN features. Thomas Schaaf, Florian Metze
2010	Analytical assessment and distance modeling of speech transmission quality. Marcel Wältermann, Alexander Raake, Sebastian Möller
2010	Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracy. Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
2010	Applying geometric source separation for improved pitch extraction in human-robot interaction. Martin Heckmann, Claudius Gläser, Frank Joublin, Kazuhiro Nakadai
2010	Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech. Wei Zhang, Xiaodong Cui
2010	Applying voice conversion to concatenative singing-voice synthesis. Fernando Villavicencio, Jordi Bonada
2010	Approaching human listener accuracy with modern speaker verification. Ville Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li
2010	Articulatory grounding of Southern Salentino harmony processes. Mirko Grimaldi, Andrea Calabrese, Francesco Sigona, Luigia Garrapa, Bianca Sisinni
2010	Articulatory inversion of American English /ɹ/ by conditional density modes. Chao Qin, Miguel Á. Carreira-Perpiñán
2010	Articulatory synthesis and perception of plosive-vowel syllables with virtual consonant targets. Peter Birkholz, Bernd J. Kröger, Christiane Neuschaefer-Rube
2010	Articulatory-functional modeling of speech prosody: a review. Yi Xu, Santitham Prom-on
2010	Artificial and online acquired noise dictionaries for noise robust ASR. Jort F. Gemmeke, Tuomas Virtanen
2010	Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions. Seyed Omid Sadjadi, John H. L. Hansen
2010	Assessment of spoken and multimodal applications: lessons learned from laboratory and field studies. Markku Turunen, Jaakko Hakulinen, Tomi Heimonen
2010	Asymptotically exact noise-corrupted speech likelihoods. Rogier C. van Dalen, Mark J. F. Gales
2010	Audio analytics by template modeling and 1-pass DP based decoding. Srikanth Cherla, V. Ramasubramanian
2010	Audio-based sports highlight detection by Fourier local auto-correlations. Jiaxing Ye, Takumi Kobayashi, Tetsuya Higuchi
2010	Audio-visual anticipatory coarticulation modeling by human and machine. Louis H. Terry, Karen Livescu, Janet B. Pierrehumbert, Aggelos K. Katsaggelos
2010	Audio-visual synchronisation for speaker diarisation. Giulia Garau, Alfred Dielmann, Hervé Bourlard
2010	Audiovisual congruence and pragmatic focus marking. Charlotte Wollermann, Bernhard Schröder, Ulrich Schade
2010	Augmentation of adaptation data. Ravichander Vipperla, Steve Renals, Joe Frankel
2010	Augmented context features for Arabic speech recognition. Ahmad Emami, Hong-Kwang Jeff Kuo, Imed Zitouni, Lidia Mangu
2010	Augmented set of features for confidence estimation in spoken term detection. Javier Tejedor, Doroteo T. Toledano, Miguel Bautista, Simon King, Dong Wang, José Colás
2010	AutoBI - a tool for automatic ToBI annotation. Andrew Rosenberg
2010	Autocorrelation and double autocorrelation based spectral representations for a noisy word recognition system. Tetsuya Shimamura, Ngoc Dinh Nguyen
2010	Automated vocal emotion recognition using phoneme class specific features. Géza Kiss, Jan P. H. van Santen
2010	Automatic analysis of the intonation of a tone language. Applying the Momel algorithm to spontaneous Standard Chinese (Beijing). Na Zhi, Daniel Hirst, Pier Marco Bertinetto
2010	Automatic classification of married couples' behavior using audio features. Matthew Black, Athanasios Katsamanis, Chi-Chun Lee, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010	Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system. Wai Kit Lo, Shuang Zhang, Helen M. Meng
2010	Automatic detection of abnormal stress patterns in unit selection synthesis. Yeon-Jun Kim, Marc C. Beutnagel
2010	Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram. Sunao Hara, Norihide Kitaoka, Kazuya Takeda
2010	Automatic discriminative measurement of voice onset time. Morgan Sonderegger, Joseph Keshet
2010	Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier. Heng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, Ren-Hua Wang
2010	Automatic estimation of transcription accuracy and difficulty. Brandon Roy, Soroush Vosoughi, Deb Roy
2010	Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniques. Kuniaki Hirabayashi, Seiichi Nakagawa
2010	Automatic excitement-level detection for sports highlights generation. Hynek Boril, Abhijeet Sangwan, Taufiq Hasan, John H. L. Hansen
2010	Automatic perceptual categorization of disordered connected speech. Ali Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez
2010	Automatic pronunciation scoring using learning to rank and DP-based score segmentation. Liang-Yu Chen, Jyh-Shing Roger Jang
2010	Automatic reference independent evaluation of prosody quality using multiple knowledge fusions. Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu
2010	Automatic selection of thresholds for signal separation algorithms based on interaural delay. Chanwoo Kim, Richard M. Stern, Kiwan Eom, Jaewon Lee
2010	Automatic speaker age and gender recognition in the car for tailoring dialog and mobile services. Michael Feld, Felix Burkhardt, Christian A. Müller
2010	Automatic speech recognition for assistive writing in speech supplemented word prediction. John-Paul Hosom, Tom Jakobs, Allen Baker, Susan Fager
2010	Automatic speech recognition of multiple accented English data. Dimitra Vergyri, Lori Lamel, Jean-Luc Gauvain
2010	Automatic speech recognition system channel modeling. Qun Feng Tan, Kartik Audhkhasi, Panayiotis G. Georgiou, Emil Ettelaie, Shrikanth S. Narayanan
2010	Automatic turn segmentation in spoken conversations. Alexei V. Ivanov, Giuseppe Riccardi
2010	Autoregressive clustering for HMM speech synthesis. Matt Shannon, William Byrne
2010	Autoregressive modelling for linear prediction of ultrasonic speech. Farzaneh Ahmadi, Ian Vince McLoughlin, Hamid R. Sharifzadeh
2010	Bandwidth expansion of speech based on wavelet transform modulus maxima vector mapping. Zhe Chen, You-Chi Cheng, Fuliang Yin, Chin-Hui Lee
2010	Bayes factor based speaker segmentation for speaker diarization. David Wang, Robert Vogt, Sridha Sridharan
2010	Bayesian speaker recognition using Gaussian mixture model and Laplace approximation. Shih-Sian Cheng, I-Fan Chen, Hsin-Min Wang
2010	Beyond sentence prosody. Chiu-yu Tseng
2010	Bias considerations for minimum subspace noise tracking. Mahdi Triki, Kees Janse
2010	Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancement. Qingju Liu, Wenwu Wang, Philip J. B. Jackson
2010	Binary coding of speech spectrograms using a deep auto-encoder. Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton
2010	Boosted mixture learning of Gaussian mixture HMMs for speech recognition. Jun Du, Yu Hu, Hui Jiang
2010	Boosting systems for LVCSR. George Saon, Hagen Soltau
2010	Brazilian Portuguese acoustic model training based on data borrowing from other language. Kazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010	Brno University of Technology system for Interspeech 2010 Paralinguistic Challenge. Marcel Kockmann, Lukás Burget, Jan Cernocký
2010	Building transcribed speech corpora quickly and cheaply for many languages. Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, Mike LeBeau
2010	CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language. Jingli Lu, Ruili Wang, Liyanage C. De Silva, Yang Gao, Jia Liu
2010	CRF-based combination of contextual features to improve a posteriori word-level confidence measures. Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier, Patrick Gros
2010	CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detection. Dong Wang, Simon King, Nicholas W. D. Evans, Raphaël Troncy
2010	Can conversational word usage be used to predict speaker demographics? Dan Gillick
2010	Can tongue be recovered from face? the answer of data-driven statistical models. Atef Ben Youssef, Pierre Badin, Gérard Bailly
2010	Canonical state models for automatic speech recognition. Mark J. F. Gales, Kai Yu
2010	Cantonese tone word learning by tone and non-tone language speakers. Angela Cooper, Yue Wang
2010	Catalog-based single-channel speech-music separation. Cemil Demir, A. Taylan Cemgil, Murat Saraclar
2010	Challenging the speech intelligibility index: macroscopic vs. microscopic prediction of sentence recognition in normal and hearing-impaired listeners. Tim Jürgens, Stefan Fredelake, Ralf M. Meyer, Birger Kollmeier, Thomas Brand
2010	Changes in temporal processing of speech across the adult lifespan. Diane Kewley-Port, Larry E. Humes, Daniel Fogerty
2010	Channel detectors for system fusion in the context of NIST LRE 2009. Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert
2010	Chirp complex cepstrum-based decomposition for asynchronous glottal analysis. Thomas Drugman, Thierry Dutoit
2010	Classifying dialog acts in human-human and human-machine spoken conversations. Silvia Quarteroni, Giuseppe Riccardi
2010	Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures. Tatsuya Kawahara, Norihiro Katsumaru, Yuya Akita, Shinsuke Mori
2010	Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface. Jani Even, Carlos Toshinori Ishi, Hiroshi Saruwatari, Norihiro Hagita
2010	Cluster analysis of differential spectral envelopes on emotional speech. Giampiero Salvi, Fabio Tesser, Enrico Zovato, Piero Cosi
2010	Cluster-based language model for spoken document retrieval using NMF-based document clustering. Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010	Combination of probabilistic and possibilistic language models. Stanislas Oger, Vladimir Popescu, Georges Linarès
2010	Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression. Sha Meng, Weiqiang Zhang, Jia Liu
2010	Combining five acoustic level modeling methods for automatic speaker age and gender recognition. Ming Li, Chi-Sang Jung, Kyu Jeong Han
2010	Combining many alignments for speech to speech translation. Sameer Maskey, Steven J. Rennie, Bowen Zhou
2010	Combining monaural and binaural evidence for reverberant speech segregation. John Woodruff, Rohit Prabhavalkar, Eric Fosler-Lussier, DeLiang Wang
2010	Combining text categorization and dialog modeling for speaker role identification on call center conversations. Rémi Lavalley, Chloé Clavel, Patrice Bellot, Marc El-Bèze
2010	Combining user intention and error modeling for statistical dialog simulators. Silvia Quarteroni, Meritxell González, Giuseppe Riccardi, Sebastian Varges
2010	Combining word-based features, statistical language models, and parsing for named entity recognition. Joseph Polifroni, Stephanie Seneff
2010	Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activity. Nick Campbell, Stefan Scherer
2010	Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish. Martine Adda-Decker, Lori Lamel, Natalie D. Snoeren
2010	Comparison of HMM and TMDN methods for lip synchronisation. Gregor Hofer, Korin Richmond
2010	Comparison of approaches for instrumentally predicting the quality of text-to-speech systems. Sebastian Möller, Florian Hinterleitner, Tiago H. Falk, Tim Polzehl
2010	Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems. Bo Li, Khe Chai Sim
2010	Comparison of methods for topic classification in a speech-oriented guidance system. Rafael Torres, Shota Takeuchi, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano
2010	Competition in the perception of spoken Japanese words. Takashi Otake, James M. McQueen, Anne Cutler
2010	Concurrent speaker localization using multi-band position-pitch (m-popi) algorithm with spectro-temporal pre-processing. Tania Habib, Harald Romsdorfer
2010	Conditional models for detecting lambda-functions in a spoken language understanding system. Frédéric Duvert, Renato De Mori
2010	Confidence measures for speaker segmentation and their relation to speaker verification. Carlos Vaquero, Alfonso Ortega, Jesús Antonio Villalba López, Antonio Miguel, Eduardo Lleida
2010	Constructing Japanese test collections for spoken term detection. Yoshiaki Itoh, Hiromitsu Nishizaki, Xinhui Hu, Hiroaki Nanjo, Tomoyosi Akiba, Tatsuya Kawahara, Seiichi Nakagawa, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa
2010	Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition. Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010	Content-based advertisement detection. Patrick Cardinal, Vishwa Gupta, Gilles Boulianne
2010	Context adaptive training with factorized decision trees for HMM-based speech synthesis. Kai Yu, Heiga Zen, François Mairesse, Steve J. Young
2010	Context dependent modelling approaches for hybrid speech recognizers. Alberto Abad, Thomas Pellegrini, Isabel Trancoso, João Paulo Neto
2010	Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling. Martin Wöllmer, Angeliki Metallinou, Florian Eyben, Björn W. Schuller, Shrikanth S. Narayanan
2010	Contextual verification for open vocabulary spoken term detection. Daniel Schneider, Timo Mertens, Martha A. Larson, Joachim Köhler
2010	Continuous speech recognition with a TF-IDF acoustic model. Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, Alex Acero
2010	Conversational spontaneous speech synthesis using average voice model. Tomoki Koriyama, Takashi Nose, Takao Kobayashi
2010	Convexity and fast speech extraction by split Bregman method. Meng Yu, Wenye Ma, Jack Xin, Stanley J. Osher
2010	Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic features. Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang, C.-H. Alvin Chen
2010	Creating a linguistic plausibility dataset with non-expert annotators. Benjamin Lambert, Rita Singh, Bhiksha Raj
2010	Cross-cultural investigation of prosody in verbal feedback in interactional rapport. Gina-Anne Levow, Susan Duncan, Edward T. King
2010	Cross-lingual acoustic modeling for dialectal Arabic speech recognition. Mohamed Elmahdy, Rainer Gruhn, Wolfgang Minker, Slim Abdennadher
2010	Cross-lingual and multi-stream posterior features for low resource LVCSR systems. Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
2010	Cross-lingual speaker adaptation via Gaussian component mapping. Houwei Cao, Tan Lee, P. C. Ching
2010	Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translation. Fabrice Lefèvre, François Mairesse, Steve J. Young
2010	Cross-lingual talker discrimination. Mirjam Wester
2010	Dajare is not the lowest form of wit. Takashi Otake
2010	Data pruning for template-based automatic speech recognition. Dino Seppi, Dirk Van Compernolle
2010	Data selection for language modeling using sparse representations. Abhinav Sethy, Tara N. Sainath, Bhuvana Ramabhadran, Dimitri Kanevsky
2010	Data-dependent evaluator modeling and its application to emotional valence classification from speech. Kartik Audhkhasi, Shrikanth S. Narayanan
2010	Data-driven analysis of realtime vocal tract MRI using correlated image regions. Adam C. Lammert, Michael I. Proctor, Shrikanth S. Narayanan
2010	Decision tree based tone modeling with corrective feedbacks for automatic Mandarin tone assessment. Hsien-Cheng Liao, Jiang-Chun Chen, Sen-Chia Chang, Ying-Hua Guan, Chin-Hui Lee
2010	Decision tree state clustering with word and syllable features. Hank Liao, Christopher Alberti, Michiel Bacchiani, Olivier Siohan
2010	Declarative sentence intonation patterns in 8 Swiss German dialects. Adrian Leemann, Lucy Zuberbühler
2010	Decoding with shrinkage-based language models. Ahmad Emami, Stanley F. Chen, Abraham Ittycheriah, Hagen Soltau, Bing Zhao
2010	Decoupling session variability modelling and speaker characterisation. Anthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre
2010	Deep-structured hidden conditional random fields for phonetic recognition. Dong Yu, Li Deng
2010	Detailed pronunciation variant modeling for speech transcription. Denis Jouvet, Dominique Fohr, Irina Illina
2010	Detecting politeness and efficiency in a cooperative social interaction. Paul M. Brunet, Marcela Charfuelan, Roderick Cowie, Marc Schröder, Hastings Donnan, Ellen Douglas-Cowie
2010	Detecting categorical perception in continuous discrimination data. Paul Boersma, Katerina Chládková
2010	Detecting novel objects in acoustic scenes through classifier incongruence. Jörg-Hendrik Bach, Jörn Anemüller
2010	Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterances. Narichika Nomoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2010	Detection of hot spots in poster conversations based on reactive tokens of audience. Tatsuya Kawahara, Kouhei Sumi, Zhi-qiang Chang, Katsuya Takanashi
2010	Determining optimal features for emotion recognition from speech by applying an evolutionary algorithm. David Philippou-Hübner, Bogdan Vlasenko, Tobias Grosser, Andreas Wendemuth
2010	Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training. Wen Cao, Dongning Wang, Jinsong Zhang, Ziyu Xiong
2010	Dialect recognition using a phone-GMM-supervector-based SVM kernel. Fadi Biadsy, Julia Hirschberg, Michael Collins
2010	Dialog prediction for a general model of turn-taking. Nigel G. Ward, Olac Fuentes, Alejandro Vega
2010	Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrix. Wei-Bin Liang, Chung-Hsien Wu, Yu-Cheng Hsiao
2010	Dialogue act tagging and segmentation with a single perceptron. Ramón Granell, Stephen G. Pulman, Carlos D. Martínez-Hinarejos, José-Miguel Benedí
2010	Did you say susi or shushi? measuring the emergence of robust fricative contrasts in English- and Japanese-acquiring children. Jeffrey J. Holliday, Mary E. Beckman, Chanelle Mays
2010	Direct construction of compact context-dependency transducers from data. David Rybach, Michael Riley
2010	Direct observation of pruning errors (DOPE): a search analysis tool. Volker Steinbiss, Martin Sundermeyer, Hermann Ney
2010	Disambiguating the functions of conversational sounds with prosody: the case of 'yeah'. Khiet P. Truong, Dirk Heylen
2010	Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching. Guillaume Aimetti, Roger K. Moore, Louis ten Bosch
2010	Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT). Xiaojun Qian, Frank K. Soong, Helen M. Meng
2010	Discriminative adaptation based on fast combination of DMAP and dfMLLR. Lukás Machlica, Zbynek Zajíc, Ludek Müller
2010	Discriminative adaptation for log-linear acoustic models. Jonas Lööf, Ralf Schlüter, Hermann Ney
2010	Discriminative language modeling using simulated ASR errors. Preethi Jyothi, Eric Fosler-Lussier
2010	Discriminative training and unsupervised adaptation for labeling prosodic events with limited training data. Raul Fernandez, Bhuvana Ramabhadran
2010	Discriminative training for hierarchical clustering in speaker diarization. Oriol Vinyals, Gerald Friedland, Nelson Morgan
2010	Distribution and trichotomic realization of voiced velars in Japanese - an experimental study. Shin-ichiro Sano, Tomohiko Ooigawa
2010	Does sentence complexity interfere with intelligibility in noise? evaluation of the Oldenburg linguistically and audiologically controlled sentence test (OLACS). Verena N. Uslar, Thomas Brand, Mirko Hanke, Rebecca Carroll, Esther Ruigendijk, Cornelia Hamann, Birger Kollmeier
2010	Domain adaptation and compensation for emotion detection. Michelle Hewlett Sanchez, Gökhan Tür, Luciana Ferrer, Dilek Hakkani-Tür
2010	Durational structure of Japanese single/geminate stops in three- and four-mora words spoken at varied rates. Yukari Hirata, Shigeaki Amano
2010	Dynamic language model adaptation using keyword category classification. Hitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda
2010	Dynamic language modeling using Bayesian networks for spoken dialog systems. Antoine Raux, Neville Mehta, Deepak Ramachandran, Rakesh Gupta
2010	Dynamic model selection for spectral voice conversion. Pierre Lanchantin, Xavier Rodet
2010	Effect of spatial separation on speech-in-noise comprehension in dyslexic adults. Marjorie Dole, Michel Hoen, Fanny Meunier
2010	Effects of Korean learners' consonant cluster reduction strategies on English speech recognition performance. Hyejin Hong, Jina Kim, Minhwa Chung
2010	Effects of accent typicality and phonotactic frequency on nonword immediate serial recall performance in Japanese. Yuuki Tanida, Taiji Ueno, Satoru Saito, Matthew A. Lambon Ralph
2010	Effects of enhancement of spectral changes on speech quality and subjective speech intelligibility. Jing Chen, Thomas Baer, Brian C. J. Moore
2010	Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognition. Nobuhide Yamakawa, Tetsuro Kitahara, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
2010	Effects of the phonological relevance in speaker verification. Yanhua Long, Li-Rong Dai, Bin Ma, Wu Guo
2010	Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract model. Kunitoshi Motoki
2010	Efficient HMM-based estimation of missing features, with applications to packet loss concealment. Bengt J. Borgström, Per Henrik Borgström, Abeer Alwan
2010	Efficient combined approach for named entity recognition in spoken language. Azeddine Zidouni, Sophie Rosset, Hervé Glotin
2010	Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models. Satoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi
2010	Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension. Tanel Alumäe, Mikko Kurimo
2010	Efficient manycore CHMM speech recognition for audiovisual and multistream data. Dorothea Kolossa, Jike Chong, Steffen Zeiler, Kurt Keutzer
2010	Efficient three-stage pitch estimation for packet loss concealment. Xuejing Sun, Sameer Gadre
2010	Emotion recognition using imperfect speech recognition. Florian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Björn W. Schuller, Stefan Steidl
2010	Empirical mode decomposition for noise-robust automatic speech recognition. Kuo-Hao Wu, Chia-Ping Chen
2010	Energy reallocation strategies for speech enhancement in known noise conditions. Yan Tang, Martin Cooke
2010	English spoken term detection in multilingual recordings. Petr Motlícek, Fabio Valente, Philip N. Garner
2010	Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience. Romain Laroche, Philippe Bretier, Ghislain Putois
2010	Enhanced speech yielding higher intelligibility for all listeners and environments. Takayuki Arai, Nao Hodoshima
2010	Enhanced word classing for model M. Stanley F. Chen, Stephen M. Chu
2010	Enhancements of Viterbi search for fast unit selection synthesis. Daniel Tihelka, Jirí Kala, Jindrich Matousek
2010	Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization. Shweta Ghai, Rohit Sinha
2010	Estimating missing data sequences in x-ray microbeam recordings. Chao Qin, Miguel Á. Carreira-Perpiñán
2010	Estimating noise from noisy speech features with a Monte Carlo variant of the expectation maximization algorithm. Friedrich Faubel, Dietrich Klakow
2010	Estimation of glottal area function using stereo-endoscopic high-speed digital imaging. Hiroshi Imagawa, Ken-Ichi Sakakibara, Isao T. Tokuda, Mamiko Otsuka, Niro Tayama
2010	Estimation of speech lip features from discrete cosinus transform. Zuheng Ming, Denis Beautemps, Gang Feng, Sébastien Schmerber
2010	Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models. Kazuhiro Kondo, Yusuke Takano
2010	Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model. Heikki Rasilo, Unto K. Laine, Okko Johannes Räsänen
2010	Evaluating a dialog language generation system: comparing the mountain system to other NLG approaches. Brian Langner, Stephan Vogel, Alan W. Black
2010	Evaluation of a silent speech interface based on magnetic sensing. Robin Hofe, Stephen R. Ell, Michael J. Fagan, James M. Gilbert, Phil D. Green, Roger K. Moore, Sergey I. Rybchenko
2010	Evaluation of bone-conducted ultrasonic hearing-aid regarding transmission of paralinguistic information: a comparison with cochlear implant simulator. Takayuki Kagomiya, Seiji Nakagawa
2010	Evaluation of prosodic contextual factors for HMM-based speech synthesis. Shuji Yokomizo, Takashi Nose, Takao Kobayashi
2010	Evaluation of speaker mimic technology for personalizing SGD voices. Esther Klabbers, Alexander Kain, Jan P. H. van Santen
2010	Excitation modeling based on waveform interpolation for HMM-based speech synthesis. June Sig Sung, Doo Hwa Hong, Kyung Hwan Oh, Nam Soo Kim
2010	Expectations for discourse genre identification: a prosodic study. Nicolas Obin, Volker Dellwo, Anne Lacheret, Xavier Rodet
2010	Exploitation of phase information for speaker recognition. Ning Wang, P. C. Ching, Tan Lee
2010	Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition. Sabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee
2010	Exploiting glottal formant parameters for glottal inverse filtering and parameterization. Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle
2010	Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription. Oscar Koller, Alberto Abad, Isabel Trancoso, Céu Viana
2010	Exploring goodness of prosody by diverse matching templates. Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu
2010	Exploring recognition network representations for efficient speech inference on highly parallel platforms. Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer
2010	Exploring speaker characteristics for meeting summarization. Fei Liu, Yang Liu
2010	Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals. B. Avinash, Sunitha Guruprasad, B. Yegnanarayana
2010	Exploring the mechanism of tonal contraction in Taiwan Mandarin. Chierh Cheng, Yi Xu, Michele Gubian
2010	Exploring web-browser based runtimes engines for creating ubiquitous speech interfaces. Paul R. Dixon, Sadaoki Furui
2010	Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions. Jouni Pohjalainen, Rahim Saeidi, Tomi Kinnunen, Paavo Alku
2010	Extending the punctuation module for European Portuguese. Fernando Batista, Helena Moniz, Isabel Trancoso, Hugo Meinedo, Ana Isabel Mata, Nuno J. Mamede
2010	Extractive speech summarization - from the view of decision theory. Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen
2010	Extractive summarization using a latent variable model. Asli Celikyilmaz, Dilek Hakkani-Tür
2010	F0 declination in English and Mandarin broadcast news speech. Jiahong Yuan, Mark Y. Liberman
2010	FSM-based pronunciation modeling using articulatory phonological code. Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson
2010	Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework. Achintya Kumar Sarkar, Srinivasan Umesh
2010	Fast converging iterative Kalman filtering for speech enhancement using long and overlapped tapered windows with large side lobe attenuation. Stephen So, Kuldip K. Paliwal
2010	Fast least-squares solution for sinusoidal, harmonic and quasi-harmonic models. Georgios Tzedakis, Yannis Pantazis, Olivier Rosec, Yannis Stylianou
2010	Feature selection for pose invariant lip biometrics. Adrian Pass, Jianguo Zhang, Darryl Stewart
2010	Feature versus model based noise robustness. Kris Demuynck, Xueru Zhang, Dirk Van Compernolle, Hugo Van hamme
2010	Floor holder detection and end of speaker turn prediction in meetings. Alfred Dielmann, Giulia Garau, Hervé Bourlard
2010	Fluency and structural complexity as predictors of L2 oral proficiency. Jared Bernstein, Jian Cheng, Masanori Suzuki
2010	Focus-sensitive operator or focus inducer: always and only. Yong-Cheol Lee, Satoshi Nambu
2010	Foreign accent matters most when timing is wrong. Chiharu Tsurutani
2010	Formant-based frequency warping for improving speaker adaptation in HMM TTS. Xin Zhuang, Yao Qian, Frank K. Soong, Yi-Jian Wu, Bo Zhang
2010	Frequency of occurrence effects on pitch accent realisation. Katrin Schweitzer, Michael Walsh, Bernd Möbius, Hinrich Schütze
2010	Frequency-domain delexicalization using surrogate vowels. Alexander Kain, Jan P. H. van Santen
2010	Full body aero-tactile integration in speech perception. Donald Derrick, Bryan Gick
2010	Fully automatic segmentation for prosodic speech corpora. Sarah Hoffmann, Beat Pfister
2010	Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events. Okko Johannes Räsänen
2010	Functional imaging of brain regions sensitive to communication sounds in primates. Christopher I. Petkov, Benjamin Wilson
2010	Fuzzy support vector machines for age and gender classification. Phuoc Nguyen, Trung Le, Dat Tran, Xu Huang, Dharmendra Sharma
2010	GMM-UBM based open-set online speaker diarization. Jürgen T. Geiger, Frank Wallhoff, Gerhard Rigoll
2010	Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation. Rok Gajsek, Janez Zibert, Tadej Justin, Vitomir Struc, Bostjan Vesnicer, France Mihelic
2010	Gesture and speech coordination: the influence of the relationship between manual gesture and speech. Benjamin Roustan, Marion Dohen
2010	Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis. Zhen-Hua Ling, Yu Hu, Li-Rong Dai
2010	Glottal parameters estimation on speech using the zeros of the z-transform. Nicolas Sturmel, Christophe d'Alessandro, Boris Doval
2010	Glottal-based analysis of the Lombard effect. Thomas Drugman, Thierry Dutoit
2010	Graph-embedding for speaker recognition. Zahi N. Karam, William M. Campbell
2010	HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition. Michael L. Seltzer, Alex Acero
2010	HMM based TTS for mixed language text. Zhiwei Shuang, Shiyin Kang, Yong Qin, Li-Rong Dai, Lianhong Cai
2010	HMM-based automatic visual speech segmentation using facial data. Utpala Musti, Asterios Toutios, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger
2010	HMM-based prosodic structure model using rich linguistic context. Nicolas Obin, Xavier Rodet, Anne Lacheret
2010	HMM-based singing voice synthesis system using pitch-shifted pseudo training data. Ayami Mase, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2010	HMM-based text-to-articulatory-movement prediction and analysis of critical articulators. Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi
2010	Hands free audio analysis from home entertainment. Danil Korchagin, Philip N. Garner, Petr Motlícek
2010	Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversion. Kalu U. Ogbureke, Peter Cahill, Julie Carson-Berndsen
2010	Hidden logistic linear regression for support vector machine based phone verification. Bo Li, Khe Chai Sim
2010	Hierarchical bottle neck features for LVCSR. Christian Plahl, Ralf Schlüter, Hermann Ney
2010	Hierarchical classification for speech-to-speech translation. Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010	Hierarchical multilayer perceptron based language identification. David Imseng, Mathew Magimai-Doss, Hervé Bourlard
2010	Hierarchical neural net architectures for feature extraction in ASR. Frantisek Grézl, Martin Karafiát
2010	How abstract is phonetics? Osamu Fujimura
2010	How children acquire situation understanding skills?: a developmental analysis utilizing multimodal speech behavior corpus. Shogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa
2010	Identification of abnormal audio events based on probabilistic novelty detection. Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis
2010	Identifying articulatory goals from kinematic data using principal differential analysis. Michael Reimer, Frank Rudzicz
2010	Impact of lack of acoustic feedback in EMG-based silent speech recognition. Matthias Janke, Michael Wand, Tanja Schultz
2010	Impact of word classing on shrinkage-based language models. Ruhi Sarikaya, Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran
2010	Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model. Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, Nobuaki Minematsu
2010	Improved language recognition using mixture components statistics. Abualsoud Hanani, Michael J. Carey, Martin J. Russell
2010	Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis. Hongwei Hu, Martin J. Russell
2010	Improved n-gram phonotactic models for language recognition. Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel
2010	Improved neural network based language modelling and adaptation. Junho Park, Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2010	Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features. Shang-wen Li, Liang-Che Sun, Lin-Shan Lee
2010	Improved real-time MRI of oral-velar coordination using a golden-ratio spiral view order. Yoon-Chul Kim, Shrikanth S. Narayanan, Krishna S. Nayak
2010	Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback. Hung-yi Lee, Chia-Ping Chen, Ching-Feng Yeh, Lin-Shan Lee
2010	Improved spoken term detection by feature space pseudo-relevance feedback. Chia-Ping Chen, Hung-yi Lee, Ching-Feng Yeh, Lin-Shan Lee
2010	Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision. Man-Hung Siu, Herbert Gish, Arthur Chan, William Belfield
2010	Improved training of excitation for HMM-based parametric speech synthesis. Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai
2010	Improvement on plural unit selection and fusion. Jian Luan, Jian Li
2010	Improvements of search error risk minimization in Viterbi beam search for speech recognition. Takaaki Hori, Shinji Watanabe, Atsushi Nakamura
2010	Improvements to generalized discriminative feature transformation for speech recognition. Roger Hsiao, Florian Metze, Tanja Schultz
2010	Improvements to the equal-parameter BIC for speaker diarization. Themos Stafylakis, Xavier Anguera
2010	Improving ASR error detection with non-decoder based features. Thomas Pellegrini, Isabel Trancoso
2010	Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relations. Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot
2010	Improving Mandarin segmental duration prediction with automatically extracted syntax features. Miaomiao Wen, Miaomiao Wang, Keikichi Hirose, Nobuaki Minematsu
2010	Improving back-off models with bag of words and hollow-grams. Benjamin Lecouteux, Raphaël Rubino, Georges Linarès
2010	Improving cross database prediction of dialogue quality using mixture of experts. Klaus-Peter Engelbrecht, Hamed Ketabdar, Sebastian Möller
2010	Improving monaural speaker identification by double-talk detection. Rahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng-Hua Tan, Mads Græsbøll Christensen, Søren Holdt Jensen, Pasi Fränti
2010	Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction. Zhigang Chen, Guoping Hu, Wei Jiang
2010	Improving speech synthesis of machine translation output. Alok Parlikar, Alan W. Black, Stephan Vogel
2010	Improving the readability of class lecture ASR results using a confusion network. Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa
2010	Incorporating MAP estimation and covariance transform for SVM based speaker recognition. Cheung-Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma, Haizhou Li
2010	Incorporating sparse representation phone identification features in automatic speech recognition using exponential families. Vaibhava Goel, Tara N. Sainath, Bhuvana Ramabhadran, Peder A. Olsen, David Nahamoo, Dimitri Kanevsky
2010	Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm. Björn W. Schuller, Laurence Devillers
2010	Incremental composition of static decoding graphs with label pushing. Miroslav Novak
2010	Incremental diarization of telephone conversations. Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman
2010	Incremental word learning using large-margin discriminative training and variance floor estimation. Irene Ayllón Clemente, Martin Heckmann, Alexander Denecke, Britta Wrede, Christian Goerick
2010	Influence of gestural salience on the interpretation of spoken requests. Gideon Kowadlo, Patrick Ye, Ingrid Zukerman
2010	Influence of lexical tones on intonation in Kammu. Anastasia Karlsson, David House, Jan-Olof Svantesson, Damrong Tayanin
2010	Influence of musical training on perception of L2 speech. Makiko Sadakata, Lotte van der Zanden, Kaoru Sekiyama
2010	Integrate template matching and statistical modeling for speech recognition. Xie Sun, Yunxin Zhao
2010	Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detection. Miao Yao, Weiqian Liang
2010	Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling. Xin Chen, Yunxin Zhao
2010	Integration of cache-based model and topic dependent class model with soft clustering and soft voting. Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa
2010	Integration of multilayer regression analysis with structure-based pronunciation assessment. Masayuki Suzuki, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose
2010	Intelligibility predictions for speech against fluctuating masker. Juan-Pablo Ramirez, Hamed Ketabdar, Alexander Raake
2010	Interaction of syntax-marked focus and wh-question induced focus in Standard Chinese. Yuan Jia, Aijun Li
2010	Intra-frame variability as a predictor of frame classifiability. Trond Skogstad, Torbjørn Svendsen
2010	Invariant integration features combined with speaker-adaptation methods. Florian Müller, Alfred Mertins
2010	Investigating articulatory setting - pauses, ready position, and rest - using real-time MRI. Vikram Ramanarayanan, Dani Byrd, Louis Goldstein, Shrikanth S. Narayanan
2010	Investigating multiple approaches for SLU portability to a new language. Bassam Jabaian, Laurent Besacier, Fabrice Lefèvre
2010	Investigation of full-sequence training of deep belief networks for speech recognition. Abdel-rahman Mohamed, Dong Yu, Li Deng
2010	Is it possible to predict task completion in automated troubleshooters? Alexander Schmitt, Michael Scholz, Wolfgang Minker, Jackson Liscombe, David Suendermann
2010	It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality. Sebastian Egger, Raimund Schatz, Stefan Scherer
2010	Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs. Satoshi Natori, Hiromitsu Nishizaki, Yoshihiro Sekiguchi
2010	Jointly optimized discriminative features for speech recognition. Tim Ng, Bing Zhang, Long Nguyen
2010	Kinematic analysis of tongue movement control in spastic dysarthria. Heejin Kim, Panying Rong, Torrey M. Loucks, Mark Hasegawa-Johnson
2010	Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realization. Mirjam Broersma
2010	L2 experience and non-native vowel categorization of L1-Mandarin speakers. Bo-ren Hsieh, Ho-Hsien Pan
2010	Landmark-based automated pronunciation error detection. Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat
2010	Language acquisition and cross-modal associations: computational simulation of the result of infant studies. Louis ten Bosch, Lou Boves
2010	Language model cross adaptation for LVCSR system combination. Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2010	Language specific effects of emotion on phoneme duration. Martijn Goudbeek, Mirjam Broersma
2010	Language-specific influence on phoneme development: French and Drehu data. Julia Monnin, Hélène Loevenbruck
2010	Large margin Gaussian mixture models for speaker identification. Reda Jourani, Khalid Daoudi, Régine André-Obrecht, Driss Aboutajdine
2010	Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data. Shinji Watanabe, Takaaki Hori, Atsushi Nakamura
2010	Laryngeal characteristics during the production of geminate consonants. Masako Fujimoto, Kikuo Maekawa, Seiya Funatsu
2010	Laryngeal voice quality in the expression of focus. Martti Vainio, Matti Airas, Juhani Järvikivi, Paavo Alku
2010	Laryngealization and features for Chinese tonal recognition. Kristine M. Yu
2010	Latent affective mapping: a novel framework for the data-driven analysis of emotion in text. Jerome R. Bellegarda
2010	Latent perceptual mapping: a new acoustic modeling framework for speech recognition. Shiva Sundaram, Jerome R. Bellegarda
2010	Learning a language model from continuous speech. Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara
2010	Learning from human errors: prediction of phoneme confusions based on modified ASR training. Bernd T. Meyer, Birger Kollmeier
2010	Learning naturally spoken commands for a robot. Anja Austermann, Seiji Yamada, Kotaro Funakoshi, Mikio Nakano
2010	Learning new word pronunciations from spoken examples. Ibrahim Badr, Ian McGraw, James R. Glass
2010	Learning speaker normalization using semisupervised manifold alignment. Andrew R. Plummer, Mary E. Beckman, Mikhail Belkin, Eric Fosler-Lussier, Benjamin Munson
2010	Learning words and speech units through natural interactions. Jonas Hörnstein, José Santos-Victor
2010	Lecture speech recognition by combining word graphs of various acoustic models. Tetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Katoh
2010	Lecture subtopic retrieval by retrieval keyword expansion using subordinate concept. Noboru Kanedera, Tetsuo Funada, Seiichi Nakagawa
2010	Level of interest sensing in spoken dialog using multi-level fusion of acoustic and lexical evidence. Je Hun Jeon, Rui Xia, Yang Liu
2010	Lexical entrainment of real users in the Let's Go spoken dialog system. Gabriel Parent, Maxine Eskénazi
2010	Lightly supervised recognition for automatic alignment of large coherent speech recordings. Norbert Braunschweiler, Mark J. F. Gales, Sabine Buchholz
2010	Linguistic rhythm in foreign accent. Jiahong Yuan
2010	Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract model. Adam C. Lammert, Louis Goldstein, Khalil Iskarous
2010	Long short-term memory networks for noise robust speech recognition. Martin Wöllmer, Yang Sun, Florian Eyben, Björn W. Schuller
2010	Longitudinal changes of selected voice source parameters. Hideki Kasuya, Hajime Yoshida, Satoshi Ebihara, Hiroki Mori
2010	Looking for relevant features for speaker role recognition. Benjamin Bigot, Julien Pinquier, Isabelle Ferrané, Régine André-Obrecht
2010	Low-dimensional space transforms of posteriors in speech recognition. Jan Zelinka, Jan Trmal, Ludek Müller
2010	MAP estimation of subspace transform for speaker recognition. Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung, Haizhou Li
2010	Machine learning for text selection with expressive unit-selection voices. Dominic Espinosa, Michael White, Eric Fosler-Lussier, Chris Brew
2010	Mandarin digit recognition assisted by selective tone distinction. Xiaodong Wang, Kunihiko Owa, Makoto Shozakai
2010	Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram. Yow-Bang Wang, Lin-Shan Lee
2010	Manipulating tracheoesophageal speech. R. J. J. H. van Son, Irene Jacobi, Frans J. M. Hilgers
2010	Mask estimation in non-stationary noise environments for missing feature based robust speech recognition. Shirin Badiezadegan, Richard C. Rose
2010	Masking of vowel-analog transitions by vowel-analog distracters. Pierre L. Divenyi
2010	Masking property based microphone array post-filter design. Ning Cheng, Wenju Liu, Lan Wang
2010	Maximum a posteriori voice conversion using sequential Monte Carlo methods. Elina Helander, Hanna Silén, Joaquín Míguez, Moncef Gabbouj
2010	Maximum lexical cohesion for fine-grained news story segmentation. Zihan Liu, Lei Xie, Wei Feng
2010	Measuring basic tempo across languages and some implications for speech rhythm. Gertraud Fenk-Oczlon, August Fenk
2010	Mechanical vocal-tract models for speech dynamics. Takayuki Arai
2010	Melody pitch estimation based on range estimation and candidate extraction using harmonic structure model. Seokhwan Jo, Sihyun Joo, Chang D. Yoo
2010	Memory-based active learning for French broadcast news. Frédéric Tantini, Christophe Cerisara, Claire Gardent
2010	Methods for robust speech recognition in reverberant environments: a comparison. Rico Petrick, Thomas Fehér, Masashi Unoki, Rüdiger Hoffmann
2010	Metric subspace indexing for fast spoken term detection. Taisuke Kaneko, Tomoyosi Akiba
2010	Minimally invasive surgery for spoken dialog systems. David Suendermann, Jackson Liscombe, Roberto Pieraccini
2010	Modal analysis of vocal fold vibrations using laryngotopography. Ken-Ichi Sakakibara, Hiroshi Imagawa, Miwako Kimura, Hisayuki Yokonishi, Niro Tayama
2010	Model synthesis for band-limited speech recognition. Yongjun He, Jiqing Han
2010	Modeling liaison in French by using decision trees. Josafá de Jesus Aguiar Pontes, Sadaoki Furui
2010	Modeling of sentence-medial pauses in Bangla readout speech: occurrence and duration. Shyamal Kr. Das Mandal, Arup Saha, Tulika Basu, Keikichi Hirose, Hiroya Fujisaki
2010	Modeling perceived vocal age in American English. James D. Harnsberger, Rahul Shrivastav, W. S. Brown Jr.
2010	Modeling posterior probabilities using the linear exponential family. Peder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey
2010	Modeling pronunciation variation with context-dependent articulatory feature decision trees. Samuel R. Bowman, Karen Livescu
2010	Modelling speech line spectral frequencies with Dirichlet mixture models. Zhanyu Ma, Arne Leijon
2010	Modelling the effect of speaker familiarity and noise on infant word recognition. Christina Bergmann, Michele Gubian, Lou Boves
2010	Modified spatial audio object coding scheme with harmonic extraction and elimination structure for interactive audio service. Jihoon Park, Kwang-Ki Kim, Jeongil Seo, Minsoo Hahn
2010	Morphological and predictability effects on schwa reduction: the case of Dutch word-initial syllables. Iris Hanique, Barbara Schuppler, Mirjam Ernestus
2010	Multi resolution discriminative models for subvocalic speech recognition. Mark Raugas, Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan
2010	Multi-channel iterative dereverberation based on codebook constrained iterative multi-channel Wiener filter. Ajay Srinivasamurthy, Thippur V. Sreenivas
2010	Multi-class and hierarchical SVMs for emotion recognition. Ali Hassan, Robert I. Damper
2010	Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics. Tianyu T. Wang, Thomas F. Quatieri
2010	MultiBIC: an improved speaker segmentation technique for TV shows. Paula Lopez-Otero, Laura Docío Fernández, Carmen García-Mateo
2010	Multichannel noise reduction using low order RTF estimate. Subhojit Chakladar, Nam Soo Kim, Yu Gwang Jin, Tae Gyoon Kang
2010	Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model. Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto
2010	Multimodal dialog in the car: combining speech and turn-and-push dial to control comfort functions. Sandro Castronovo, Angela Mahr, Margarita Pentcheva, Christian A. Müller
2010	Multimodal speaker diarization using oriented optical flow histograms. Mary Tai Knox, Gerald Friedland
2010	Multivariate analysis of vocal fatigue in continuous reading. Marie-José Caraty, Claude Montacié
2010	Mutual information analysis for feature and sensor subset selection in surface electromyography based speech recognition. Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan
2010	Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems. Ian R. Lane, Alex Waibel
2010	Native and non-native speaker judgements on the quality of synthesized speech. Anna C. Janska, Robert A. J. Clark
2010	Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems. Filip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve J. Young
2010	Near field sound source localization based on cross-power spectrum phase analysis with multiple microphones. Kohei Hayashida, Masanori Morise, Takanobu Nishiura
2010	New insights into subspace noise tracking. Mahdi Triki
2010	New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules. Ramón López-Cózar, David Griol
2010	Noise robust voice activity detection using features extracted from the time-domain autocorrelation function. Houman Ghaemmaghami, Brendan Baker, Robert Vogt, Sridha Sridharan
2010	Non-audible murmur recognition based on fusion of audio and visual streams. Panikos Heracleous, Norihiro Hagita
2010	Non-linear predictive vector quantization of feature vectors for distributed speech recognition. José Enrique García Laínez, Alfonso Ortega, Antonio Miguel, Eduardo Lleida
2010	Non-negative matrix factorization based compensation of music for automatic speech recognition. Bhiksha Raj, Tuomas Virtanen, Sourish Chaudhuri, Rita Singh
2010	Nonlinear enhancement of onset for robust speech recognition. Chanwoo Kim, Richard M. Stern
2010	Novel probabilistic control of noise reduction for improved microphone array beamforming. Jungpyo Hong, Seung Ho Han, Sangbae Jeong, Minsoo Hahn
2010	Novel weighting scheme for unsupervised language model adaptation using latent Dirichlet allocation. Md. Akmal Haidar, Douglas D. O'Shaughnessy
2010	Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian. Tomás Dubeda, Katalin Mády
2010	Numerical study of turbulent flow-induced sound production in presence of a tooth-shaped obstacle: towards sibilant [s] physical modeling. Julien Cisonni, Kazunori Nozaki, Annemie Van Hirtum, Shigeo Wada
2010	Observation uncertainty measures for sparse imputation. Jort F. Gemmeke, Ulpu Remes, Kalle J. Palomäki
2010	On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speech. Claudio Garretón, Néstor Becerra Yoma
2010	On evaluation of the f Keiichi Funaki
2010	On generating Combilex pronunciations via morphological analysis. Korin Richmond, Robert A. J. Clark, Susan Fitt
2010	On speaker adaptive training of artificial neural networks. Jan Trmal, Jan Zelinka, Ludek Müller
2010	On the automatic ToBI accent type identification from data. César González Ferreras, Carlos Vivaracho-Pascual, David Escudero Mancebo, Valentín Cardeñoso-Payo
2010	On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonances. Pirros Tsiakoulis, Alexandros Potamianos
2010	On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition. Volker Leutnant, Reinhold Haeb-Umbach
2010	On the importance of glottal flow spectral energy for the recognition of emotions in speech. Ling He, Margaret Lech, Nicholas B. Allen
2010	On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measures. Yen-Liang Shue, Gang Chen, Abeer Alwan
2010	On the potential of channel selection for recognition of reverberated speech with multiple microphones. Martin Wolf, Climent Nadeu
2010	On the potential of glottal signatures for speaker recognition. Thomas Drugman, Thierry Dutoit
2010	On the relation of Bayes risk, word error, and word posteriors in ASR. Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney
2010	On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification. Rong Zheng, Bo Xu
2010	On using Gaussian mixture model for double-talk detection in acoustic echo suppression. Ji-Hyun Song, Kyu-Ho Lee, Yun-Sik Park, Sang-Ick Kang, Joon-Hyuk Chang
2010	On using missing-feature theory with cepstral features - approximations to the multivariate integral. Frank Seide, Pei Zhao
2010	On using voice source measures in automatic gender classification of children's speech. Gang Chen, Xue Feng, Yen-Liang Shue, Abeer Alwan
2010	On-demand language model interpolation for mobile speech input. Brandon Ballinger, Cyril Allauzen, Alexander Gruenstein, Johan Schalkwyk
2010	On-the-fly lattice rescoring for real-time automatic speech recognition. Hasim Sak, Murat Saraclar, Tunga Güngör
2010	One-model speech recognition and synthesis based on articulatory movement HMMs. Tsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada
2010	Online Gaussian process for nonstationary speech separation. Hsin-Lung Hsieh, Jen-Tzung Chien
2010	Online SLU model adaptation with a partial oracle. Pierre Gotab, Géraldine Damnati, Frédéric Béchet, Lionel Delphin-Poulat
2010	Online adaptive learning for speech recognition decoding. Jeff A. Bilmes, Hui Lin
2010	Optimising a handcrafted dialogue system design. Romain Laroche, Ghislain Putois, Philippe Bretier
2010	Optimizing spoken dialogue management with fitted value iteration. Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin
2010	Oriented PCA method for blind speech separation of convolutive mixtures. Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy
2010	Overlap detection for speaker diarization by fusing spectral and spatial features. Martin Zelenák, Carlos Segura, Javier Hernando
2010	PDF-optimized LSF vector quantization based on beta mixture models. Zhanyu Ma, Arne Leijon
2010	Parallel lexical-tree based LVCSR on multi-core processors. Naveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen
2010	Parallel processing of interruptions and feedback in companions affective dialogue system. Jaakko Hakulinen, Markku Turunen, Raúl Santos de la Cámara, Nigel T. Crook
2010	Parallel training of neural networks for speech recognition. Karel Veselý, Lukás Burget, Frantisek Grézl
2010	Parameters describing multimodal interaction - definitions and three usage scenarios. Christine Kühnel, Benjamin Weiss, Sebastian Möller
2010	Paraphrase generation to improve text-to-speech synthesis. Ghislain Putois, Jonathan Chevelu, Cédric Boidin
2010	Perception of Estonian vowel categories by native and non-native speakers. Lya Meister, Einar Meister
2010	Perception of voiceless fricatives by Japanese listeners of advanced and intermediate level English proficiency. Hinako Masuda, Takayuki Arai
2010	Perception on pitch reset at discourse boundaries. Hsin-Yi Lin, Janice Fon
2010	Perception-based automatic approximation of F0 contours in Cantonese speech. Yujia Li, Tan Lee
2010	Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processing. Amy V. Beeston, Guy J. Brown
2010	Perceptual wavelet decomposition for speech segmentation. Mariusz Ziólko, Jakub Galka, Bartosz Ziólko, Tomasz Drwiega
2010	Performance estimation of noisy speech recognition considering recognition task complexity. Takeshi Yamada, Tomohiro Nakajima, Nobuhiko Kitawaki, Shoji Makino
2010	Performance estimation of reverberant speech recognition based on reverberant criteria RSR-d Takahiro Fukumori, Masanori Morise, Takanobu Nishiura
2010	Phase equalization-based autoregressive model of speech signals. Sadao Hiroya, Takemi Mochida
2010	Phone boundary detection using sample-based acoustic parameters. You-yu Lin, Yih-Ru Wang, Yuan-Fu Liao
2010	Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer. Chang Woo Han, Shin Jae Kang, Chul Min Lee, Nam Soo Kim
2010	Phoneme classification and lattice rescoring based on a k-NN approach. Ladan Golipour, Douglas D. O'Shaughnessy
2010	Phoneme lattice based TextTiling towards multilingual story segmentation. Xiaoxuan Wang, Lei Xie, Bin Ma, Engsiong Chng, Haizhou Li
2010	Phonetic imitation of Japanese vowel devoicing. Kuniko Y. Nielsen
2010	Phonetic realization of second occurrence focus in Japanese. Satoshi Nambu, Yong-Cheol Lee
2010	Phonetic segmentation of singing voice using MIDI and parallel speech. Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua
2010	Phonetic subspace mixture model for speaker diarization. I-Fan Chen, Shih-Sian Cheng, Hsin-Min Wang
2010	Phrase alignment confidence for statistical machine translation. Sankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan
2010	Phrase-medial vowel devoicing in spontaneous French. Francisco Torreira, Mirjam Ernestus
2010	Physics of body-conducted silent speech - production, propagation and representation of non-audible murmur. Makoto Otani, Tatsuya Hirahara
2010	Pitch determination using autocorrelation function in spectral domain. M. Shahidur Rahman, Tetsuya Shimamura
2010	Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks. Feng Huang, Tan Lee
2010	Pitch similarity in the vicinity of backchannels. Mattias Heldner, Jens Edlund, Julia Hirschberg
2010	Positional variability of pitch accents in Czech. Tomás Dubeda
2010	Post-aspiration in standard Italian: some first cross-regional acoustic evidence. Mary Stevens, John Hajek
2010	Pre- and short-term posttreatment vocal functioning in patients with advanced head and neck cancer treated with concomitant chemoradiotherapy. Irene Jacobi, Lisette van der Molen, Maya van Rossum, Frans J. M. Hilgers
2010	Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties. Barbara Schuppler, Mirjam Ernestus, Wim A. van Dommelen, Jacques C. Koreman
2010	Predicting unseen articulations from multi-speaker articulatory models. Gopal Ananthakrishnan, Pierre Badin, Julián Andrés Valdés Vargas, Olov Engwall
2010	Predicting word accuracy for the automatic speech recognition of non-native speech. Su-Youn Yoon, Lei Chen, Klaus Zechner
2010	Prior information for rapid speaker adaptation. Catherine Breslin, K. K. Chin, Mark J. F. Gales, Kate M. Knill, Haitian Xu
2010	Probabilistic integration of joint density model and speaker model for voice conversion. Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu
2010	Probabilistic state clustering using conditional random field for context-dependent acoustic modelling. Khe Chai Sim
2010	Production and perception of Vietnamese short vowels in V1V2 context. Viet Son Nguyen, Eric Castelli, René Carré
2010	Prominence based scoring of speech segments for automatic speech-to-speech summarization. Sree Harsha Yella, Vasudeva Varma, Kishore Prahallad
2010	Prominence detection in Swedish using syllable correlates. Samer Al Moubayed, Jonas Beskow
2010	Prosodic grouping and relative clause disambiguation in Mandarin. Jianjing Kuang
2010	Prosodic speaker verification using subspace multinomial models with intersession compensation. Marcel Kockmann, Lukás Burget, Ondrej Glembek, Luciana Ferrer, Jan Cernocký
2010	Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillator. Michael C. Brady
2010	Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual information. Chao-Hong Liu, Chung-Hsien Wu
2010	Prosody and voice quality of vocal social signals: the case of dominance in scenario meetings. Marcela Charfuelan, Marc Schröder, Ingmar Steiner
2010	Prosody cues for classification of the discourse particle "hã" in Hindi. Sankalan Prasad, Kalika Bali
2010	Prosody for the eyes: quantifying visual prosody using guided principal component analysis. Erin Cvejic, Jeesun Kim, Chris Davis, Guillaume Gibert
2010	Psychological evaluation of a group communication activation robot in a party game. Yoichi Matsuyama, Shinya Fujie, Hikaru Taniyama, Tetsunori Kobayashi
2010	Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditions. Seyed Omid Sadjadi, Sanjay A. Patil, John H. L. Hansen
2010	Quality-based playout buffering with FEC for conversational VoIP. Qipeng Gong, Peter Kabal
2010	Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples. Chi-Chun Lee, Matthew Black, Athanasios Katsamanis, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010	Quantized HMMs for low footprint text-to-speech synthesis. Alexander Gutkin, Xavi Gonzalvo, Stefan Breuer, Paul Taylor
2010	Rapid bootstrapping of five Eastern European languages using the rapid language adaptation toolkit. Ngoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanja Schultz
2010	Rapid development of speech translation using consecutive interpretation. Matthias Paulik, Alex Waibel
2010	Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis. Michael I. Proctor, Daniel Bone, Athanasios Katsamanis, Shrikanth S. Narayanan
2010	Real-life emotion-related states detection in call centers: a cross-corpora study. Laurence Devillers, Christophe Vaudable, Clément Chastagnol
2010	Recognition of spontaneous conversational speech using long short-term memory phoneme predictions. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll
2010	Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients. Cong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic
2010	Recurrent neural network based language model. Tomás Mikolov, Martin Karafiát, Lukás Burget, Jan Cernocký, Sanjeev Khudanpur
2010	Redescribing intonational categories with functional data analysis. Margaret Zellers, Michele Gubian, Brechtje Post
2010	Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman method. Wenye Ma, Meng Yu, Jack Xin, Stanley J. Osher
2010	Reduction of broadband noise in speech signals by multilinear subspace analysis. Yusuke Sato, Tetsuya Hoya, Hovagim Bakardjian, Andrzej Cichocki
2010	Regularized-MLLR speaker adaptation for computer-assisted language learning system. Dean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose
2010	Reinforced blocking matrix with cross channel projection for speech enhancement. Inho Lee, Jongsung Yoon, Yoonjae Lee, Hanseok Ko
2010	Reliable tracking based on speech sample salience of vocal cycle length perturbations. Christophe Mertens, Francis Grenez, Lise Crevier-Buchman, Jean Schoentgen
2010	Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database. Daniel Felps, Christian Geng, Michael Berger, Korin Richmond, Ricardo Gutierrez-Osuna
2010	Repair strategies on trial: which error recovery do users like best? Alexander Zgorzelski, Alexander Schmitt, Tobias Heinroth, Wolfgang Minker
2010	Resources for turn competition in overlap in multi-party conversations: speech rate, pausing and duration. Emina Kurtic, Guy J. Brown, Bill Wells
2010	Restructuring exponential family mixture models. Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen
2010	Revisiting VTLN using linear transformation on conventional MFCC. Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney
2010	Rhythm and formant features for automatic alcohol detection. Florian Schiel, Christian Heinrich, Veronika Neumeyer
2010	Robust and efficient pitch estimation using an iterative ARMA technique. Jung Ook Hong, Patrick J. Wolfe
2010	Robust automatic speech recognition with decoder oriented ideal binary mask estimation. Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
2010	Robust mixture modeling using t-distribution: application to speaker ID. Sundar Harshavardhan, Thippur V. Sreenivas
2010	Robust noise estimation using minimum correction with harmonicity control. Xuejing Sun, Kuan-Chieh Yen, Rogerio Guedes Alves
2010	Robust statistical voice activity detection using a likelihood ratio sign test. Shiwen Deng, Jiqing Han
2010	Robust voice activity detection in stereo recording with crosstalk. Prasanta Kumar Ghosh, Andreas Tsiartas, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010	Robust word recognition using articulatory trajectories and gestures. Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein
2010	Role of language models in spoken fluency evaluation. Om Deshmukh, Harish Doddala, Ashish Verma, Karthik Visweswariah
2010	Roles of the average voice in speaker-adaptive HMM-based speech synthesis. Junichi Yamagishi, Oliver Watts, Simon King, Bela Usabaev
2010	Round-robin discrimination model for reranking ASR hypotheses. Takanobu Oba, Takaaki Hori, Atsushi Nakamura
2010	Russian infants and children's sounds and speech corpuses for language acquisition studies. Elena E. Lyakso, Olga V. Frolova, Anna V. Kurazhova, Julia S. Gaikova
2010	SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech. Wei Chu, Abeer Alwan
2010	SCARF: a segmental conditional random field toolkit for speech recognition. Geoffrey Zweig, Patrick Nguyen
2010	SEAME: a Mandarin-English code-switching speech corpus in South-East Asia. Dau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li
2010	SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environment. Ji Hun Park, Seon Man Kim, Jae Sam Yoon, Hong Kook Kim, Sung Joo Lee, Yunkeun Lee
2010	Say it as you mean it - analyzing free user comments in the VOICE awards corpus. Florian Gödde, Sebastian Möller
2010	Say what? why users choose to speak their web queries. Maryam Kamvar, Doug Beeferman
2010	Score-level compensation of extreme speech duration variability in speaker verification. Sergio Perez-Gomez, Daniel Ramos, Javier Gonzalez-Dominguez, Joaquin Gonzalez-Rodriguez
2010	Search by voice in Mandarin Chinese. Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche, Pedro J. Moreno
2010	Selecting phonotactic features for language recognition. Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng
2010	Selective gammatone filterbank feature for robust sound event recognition. Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li
2010	Semantic facilitation in bilingual everyday speech comprehension. Marco van de Ven, Benjamin V. Tucker, Mirjam Ernestus
2010	Semi-automated update of automatic transcription system for the Japanese national congress. Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya Kawahara
2010	Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition. Khe Chai Sim, Shilin Liu
2010	Semi-supervised extractive speech summarization via co-training algorithm. Shasha Xie, Hui Lin, Yang Liu
2010	Semi-supervised learning for improved expression of uncertainty in discriminative classifiers. Jonathan Malkin, Jeff A. Bilmes
2010	Semi-supervised part-of-speech tagging in speech applications. Richard Dufour, Benoît Favre
2010	Semi-supervised training of Gaussian mixture models by conditional entropy minimization. Jui-Ting Huang, Mark Hasegawa-Johnson
2010	Session variability contrasts in the MARP corpus. Keith W. Godin, John H. L. Hansen
2010	Setup for acoustic-visual speech synthesis by concatenating bimodal units. Asterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger
2010	Shape-invariant speech transformation with the phase vocoder. Axel Röbel
2010	Shrinkage model adaptation in automatic speech recognition. Jinyu Li, Yu Tsao, Chin-Hui Lee
2010	Signal interaction and the devil function. John R. Hershey, Peder A. Olsen, Steven J. Rennie
2010	Signal-based accent and phrase marking using the Fujisaki model. Hussein Hussein, Rüdiger Hoffmann
2010	Significance of pitch synchronous analysis for speaker recognition using AANN models. Sri Harish Reddy Mallidi, Kishore Prahallad, Suryakanth V. Gangashetty, B. Yegnanarayana
2010	Silent vs vocalized articulation for a portable ultrasound-based silent speech interface. Victoria M. Florescu, Lise Crevier-Buchman, Bruce Denby, Thomas Hueber, Antonia Colazo-Simon, Claire Pillot-Loiseau, Pierre Roussel-Ragot, Cédric Gendrot, Sophie Quattrocchi
2010	Similar n-gram language model. Christian Gillot, Christophe Cerisara, David Langlois, Jean Paul Haton
2010	Similarity of effects of emotions on the speech organ configuration with and without speaking. Tatsuya Kitamura
2010	Similarity scoring for recognizing repeated out-of-vocabulary words. Mirko Hannemann, Stefan Kombrink, Martin Karafiát, Lukás Burget
2010	Simple and efficient speaker comparison using approximate KL divergence. William M. Campbell, Zahi N. Karam
2010	Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems. Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino
2010	Single-channel speech enhancement using Kalman filtering in the modulation domain. Stephen So, Kamil K. Wójcicki, Kuldip K. Paliwal
2010	Single-speaker/multi-speaker co-channel speech classification. Stéphane Rossignol, Olivier Pietquin
2010	Sinusoidal model parameterization for HMM-based TTS system. Slava Shechtman, Alexander Sorin
2010	Social role discovery from spoken language using dynamic Bayesian networks. Sibel Yaman, Dilek Hakkani-Tür, Gökhan Tür
2010	Sound-based assistive technology supporting "seeing", "hearing" and "speaking" for the disabled and the elderly. Tohru Ifukube
2010	Sparse auto-associative neural networks: theory and application to speech recognition. Garimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky
2010	Sparse component analysis for speech recognition in multi-speaker environment. Afsaneh Asaei, Hervé Bourlard, Philip N. Garner
2010	Sparse representation features for speech recognition. Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky, Abhinav Sethy
2010	Sparse representations for text categorization. Tara N. Sainath, Sameer Maskey, Dimitri Kanevsky, Bhuvana Ramabhadran, David Nahamoo, Julia Hirschberg
2010	Speaker adaptation based on nonlinear spectral transform for speech recognition. Toyohiro Hayashi, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
2010	Speaker adaptation based on system combination using speaker-class models. Tetsuo Kosaka, Takashi Ito, Masaharu Katoh, Masaki Kohda
2010	Speaker adaptation in transformation space using two-dimensional PCA. Yongwon Jeong, Young Rok Song, Hyung Soon Kim
2010	Speaker and language adaptive training for HMM-based polyglot speech synthesis. Heiga Zen
2010	Speaker characterization using long-term and temporal information. Chien-Lin Huang, Hanwu Sun, Bin Ma, Haizhou Li
2010	Speaker diarization in meeting audio for single distant microphone. Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li
2010	Speaker recognition experiments using connectionist transformation network features. Alberto Abad, Isabel Trancoso
2010	Speaker recognition using supervised probabilistic principal component analysis. Yun Lei, John H. L. Hansen
2010	Speaker recognition using the resynthesized speech via spectrum modeling. Xiang Zhang, Chuan Cao, Lin Yang, Hongbin Suo, Jianping Zhang, Yonghong Yan
2010	Speaker tracking in an unsupervised speech controlled system. Tobias Herbig, Franz Gerl, Wolfgang Minker
2010	Speaker-dependent mapping of source and system features for enhancement of throat microphone speech. Anand Joseph Xavier Medabalimi, Sri Harish Reddy Mallidi, B. Yegnanarayana
2010	Speaker-independent HMM-based voice conversion using quantized fundamental frequency. Takashi Nose, Takao Kobayashi
2010	Speaking style dependency of formant targets. Akiko Amano-Kusumoto, John-Paul Hosom, Alexander Kain
2010	Specification in context - devoicing processes in Polish, French, American English and German sonorants. Jagoda Sieczkowska, Bernd Möbius, Grzegorz Dogil
2010	Spectral entropy-based voice activity detector for videoconferencing systems. Bowon Lee, Debargha Muhkerjee
2010	Spectro-temporal modulations for robust speech emotion recognition. Lan-Ying Yeh, Tai-Shih Chi
2010	Speech categorization context effects in seven- to nine-month-old infants. Ellen Marklund, Francisco Lacerda, Anna Ericsson
2010	Speech database reduction method for corpus-based TTS system. Mitsuaki Isogai, Hideyuki Mizuno
2010	Speech dominoes and phonetic convergence. Gérard Bailly, Amélie Lelong
2010	Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering. Kai Li, Qiang Fu, Yonghong Yan
2010	Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signals. Hiroaki Kawashima, Yu Horii, Takashi Matsuyama
2010	Speech intelligibility of diagonally localized speech with competing noise using bone-conduction headphones. Kazuhiro Kondo, Takayuki Kanda, Yosuke Kobayashi, Hiroyuki Yagyu
2010	Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding. Xiaoqiang Xiao, Robert M. Nickel
2010	Speech recognition using long-term phase information. Kazumasa Yamamoto, Eiichi Sueyoshi, Seiichi Nakagawa
2010	Speech recognition with a seamlessly updated language model for real-time closed-captioning. Toru Imai, Shinichi Homma, Akio Kobayashi, Takahiro Oku, Shoei Sato
2010	Speech recognizer optimization under speed constraints. Ivan Bulyko
2010	Speech robot mimicking human articulatory motion. Kotaro Fukui, Toshihiro Kusano, Yoshikazu Mukaeda, Yuto Suzuki, Atsuo Takanishi, Masaaki Honda
2010	Speech synthesis by modeling harmonics structure with multiple function. Toru Nakashika, Ryuki Tachibana, Masafumi Nishimura, Tetsuya Takiguchi, Yasuo Ariki
2010	Speech-based automated cognitive status assessment. Dilek Hakkani-Tür, Dimitra Vergyri, Gökhan Tür
2010	Spoken English assessment system for non-native speakers using acoustic and prosodic features. Qin Shi, Kun Li, Shilei Zhang, Stephen M. Chu, Ji Xiao, Zhijian Ou
2010	Spoken document retrieval for oral presentations integrating global document similarities into local document similarities. Hiroaki Nanjo, Yusuke Iyonaga, Takehiko Yoshimi
2010	State-based labelling for a sparse representation of speech and its application to robust speech recognition. Tuomas Virtanen, Jort F. Gemmeke, Antti Hurmalainen
2010	Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases. Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Hidehisa Nagano, Kunio Kashino
2010	Statistical multi-stream modeling of real-time MRI articulatory speech data. Erik Bresch, Athanasios Katsamanis, Louis Goldstein, Shrikanth S. Narayanan
2010	Still talking to machines (cognitively speaking). Steve J. Young
2010	Strategies for statistical spoken language understanding with small amount of data - an empirical study. Ye-Yi Wang
2010	Study on interaction between entropy pruning and Kneser-Ney smoothing. Ciprian Chelba, Thorsten Brants, Will Neveitt, Peng Xu
2010	Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding. Masatsune Tamura, Takehiko Kagoshima, Masami Akamine
2010	Superwideband extension of G.718 and G.729.1 speech codecs. Lasse Laaksonen, Mikko Tammi, Vladimir Malenovsky, Tommy Vaillancourt, Mi Suk Lee, Tomofumi Yamanashi, Masahiro Oshikiri, Claude Lamblin, Balázs Kövesi, Lei Miao, Deming Zhang, Jon Gibbs, Holly Francois
2010	Syllable-level prominence detection with acoustic evidence. Je Hun Jeon, Yang Liu
2010	Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners. Michael Pucher, Dietmar Schabus, Junichi Yamagishi
2010	Synthesizing photo-real talking head via trajectory-guided sample selection. Lijuan Wang, Xiaojun Qian, Wei Han, Frank K. Soong
2010	System output combination for improved speaker diarization. Simon Bozonnet, Nicholas W. D. Evans, Xavier Anguera, Oriol Vinyals, Gerald Friedland, Corinne Fredouille
2010	Techniques for topic detection based processing in spoken dialog systems. Rajesh Balchandran, Leonid Rachevsky, Bhuvana Ramabhadran, Miroslav Novak
2010	Template-based spectral estimation using microphone array for speech recognition. Satoshi Tamura, Eriko Hishikawa, Wataru Taguchi, Satoru Hayamizu
2010	Text normalization based on statistical machine translation and internet user support. Tim Schlippe, Chenfei Zhu, Jan Gebhardt, Tanja Schultz
2010	Text-based unstressed syllable prediction in Mandarin. Ya Li, Jianhua Tao, Meng Zhang, Shifeng Pan, Xiaoying Xu
2010	Text-independent F0 transformation with non-parallel data for voice conversion. Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li
2010	The 2010 CMU GALE speech-to-text system. Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz
2010	The AMIDA 2009 meeting transcription system. Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan
2010	The CHiME corpus: a resource and a challenge for computational hearing in multisource environments. Heidi Christensen, Jon Barker, Ning Ma, Phil D. Green
2010	The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems. Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li
2010	The INTERSPEECH 2010 paralinguistic challenge. Björn W. Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, Shrikanth S. Narayanan
2010	The NIST 2010 speaker recognition evaluation. Alvin F. Martin, Craig S. Greenberg
2010	The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. David Dean, Sridha Sridharan, Robert Vogt, Michael Mason
2010	The RWTH 2009 Quaero ASR evaluation system for English and German. Markus Nußbaum-Thom, Simon Wiesler, Martin Sundermeyer, Christian Plahl, Stefan Hahn, Ralf Schlüter, Hermann Ney
2010	The characterization of the relative information content by spectral features for the objective intelligibility assessment of nonlinearly processed speech. Anton Schlesinger, Marinus M. Boone
2010	The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systems. Takuya Nishimoto, Takayuki Watanabe
2010	The effect of a word embedded in a sentence and speaking rate variation on the perceptual training of geminate and singleton consonant distinction. Mee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka
2010	The effect of audience familiarity on the perception of modified accent. Jonathan Teutenberg, Catherine Inez Watson
2010	The effects of EMA-based augmented visual feedback on the English speakers' acquisition of the Japanese flap: a perceptual study. June S. Levitt, William F. Katz
2010	The estimation and kernel metric of spectral correlation for text-independent speaker verification. Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai
2010	The impact of ASR on abstractive vs. extractive meeting summaries. Gabriel Murray, Giuseppe Carenini, Raymond T. Ng
2010	The influence of actual and perceived sexual orientation on diadochokinetic rate in women and men. Benjamin Munson
2010	The influence of expertise and efficiency on modality selection strategies and perceived mental effort. Ina Wechsung, Stefan Schaffer, Robert Schleicher, Anja Naumann, Sebastian Möller
2010	The interrelation between the stimulus range and the number of response categories in vowel categorization. Titia Benders, Paola Escudero
2010	The prosody of Swedish conversational grunts. Daniel Neiberg, Joakim Gustafson
2010	The relation between pitch perception preference and emotion identification. Marie Nilsenová, Martijn Goudbeek, Luuk Kempen
2010	The relevance of timing, pauses and overlaps in dialogues: detecting topic changes in scenario based meetings. Saturnino Luz, Jing Su
2010	The role of higher-level linguistic features in HMM-based speech synthesis. Oliver Watts, Junichi Yamagishi, Simon King
2010	The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion. Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
2010	The use of sense in unsupervised training of acoustic models for ASR systems. Rita Singh, Benjamin Lambert, Bhiksha Raj
2010	The use of subvector quantization and discrete densities for fast GMM computation for speaker verification. Guoli Ye, Brian Mak
2010	Time conditioned search in automatic speech recognition reconsidered. David Nolden, Hermann Ney, Ralf Schlüter
2010	Topic and style-adapted language modeling for Thai broadcast news ASR. Markpong Jongtaveesataporn, Sadaoki Furui
2010	Topic-dependent n-gram models based on optimization of context lengths in LDA. Akira Nakamura, Satoru Hayamizu
2010	Topological representation of speech for speaker recognition. Gabriel Hernández Sierra, Jean-François Bonastre, Driss Matrouf, José R. Calvo
2010	Toward aero-acoustical analysis of the sibilant /s/: an oral cavity modeling. Kazunori Nozaki, Youhei Ohnishi, Takashi Suda, Shigeo Wada, Shinji Shimojo
2010	Toward detecting voice activity employing soft decision in second-order conditional MAP. Sang-Kyun Kim, Jae-Hun Choi, Sang-Ick Kang, Ji-Hyun Song, Joon-Hyuk Chang
2010	Towards a robust face recognition system using compressive sensing. Allen Y. Yang, Zihan Zhou, Yi Ma, Shankar Sastry
2010	Towards affective state modeling in narrative and conversational settings. Bart Jochems, Martha A. Larson, Roeland Ordelman, Ronald Poppe, Khiet P. Truong
2010	Towards an ASR-free objective analysis of pathological speech. Catherine Middag, Yvan Saeys, Jean-Pierre Martens
2010	Towards long-range prosodic attribute modeling for language recognition. Raymond W. M. Ng, Cheung-Chi Leung, Ville Hautamäki, Tan Lee, Bin Ma, Haizhou Li
2010	Towards mixed language speech recognition systems. David Imseng, Hervé Bourlard, Mathew Magimai-Doss
2010	Towards spoken term discovery at scale with zero resources. Aren Jansen, Kenneth Church, Hynek Hermansky
2010	Tracter: a lightweight dataflow framework. Philip N. Garner, John Dines
2010	Training a parametric-based logF0 model with the minimum generation error criterion. Javier Latorre, Mark J. F. Gales, Heiga Zen
2010	Transcript-dependent speaker recognition using mixer 1 and 2. Fred S. Richardson, Joseph P. Campbell
2010	Turn taking-based conversation detection by using DOA estimation. Yohei Kawaguchi, Masahito Togami, Yasunari Obuchi
2010	Turn-alignment using eye-gaze and speech in conversational interaction. Kristiina Jokinen, Kazuaki Harada, Masafumi Nishida, Seiichi Yamamoto
2010	Two new estimation methods for a superpositional intonation model. Humberto M. Torres, Hansjörg Mixdorff, Jorge A. Gurlekian, Hartmut R. Pfitzinger
2010	Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots. Takami Yoshida, Kazuhiro Nakadai
2010	Ungrounded independent non-negative factor analysis. Bhiksha Raj, Kevin W. Wilson, Alexander Krueger, Reinhold Haeb-Umbach
2010	Unscented transform with online distortion estimation for HMM adaptation. Jinyu Li, Dong Yu, Yifan Gong, Li Deng
2010	Unsupervised acoustic model adaptation for multi-origin non native ASR. Sethserey Sam, Eric Castelli, Laurent Besacier
2010	Unsupervised discovery and training of maximally dissimilar cluster models. Françoise Beaufays, Vincent Vanhoucke, Brian Strope
2010	Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model. Kouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka
2010	Unsupervised model adaptation on targeted speech segments for LVCSR system combination. Richard Dufour, Fethi Bougares, Yannick Estève, Paul Deléglise
2010	Unsupervised sequential organization for cochannel speech separation. Ke Hu, DeLiang Wang
2010	Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping. Chun-an Chan, Lin-Shan Lee
2010	Unvoiced speech segregation based on CASA and spectral subtraction. Ke Hu, DeLiang Wang
2010	Using a DBN to integrate sparse classification and GMM-based ASR. Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves
2010	Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition. Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Germán Bordel
2010	Using dependency parsing and machine learning for factoid question answering on spoken documents. Pere Comas, Jordi Turmo, Lluís Màrquez
2010	Using harmonic phase information to improve ASR rate. Ibon Saratxaga, Inma Hernáez, Igor Odriozola, Eva Navas, Iker Luengo, Daniel Erro
2010	Using high-level information to detect key audio events in a tennis game. Qiang Huang, Stephen J. Cox
2010	Using non-native error patterns to improve pronunciation verification. Joost van Doremalen, Catia Cucchiarini, Helmer Strik
2010	Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech. Gang Wang, Xiaojun Wu, Thomas Fang Zheng
2010	Using prosody to improve Mandarin automatic speech recognition. Chong-Jia Ni, Wenju Liu, Bo Xu
2010	Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor quality. Hanna Silén, Elina Helander, Jani Nurminen, Konsta Koppinen, Moncef Gabbouj
2010	Using spectro-temporal features to improve AFE feature extraction for ASR. Suman V. Ravuri, Nelson Morgan
2010	Utilizing a noisy-channel approach for Korean LVCSR. Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010	Utterance selection for speech acts in a cognitive tourguide scenario. Felix Putze, Tanja Schultz
2010	VAD-measure-embedded decoder with online model adaptation. Tasuku Oonishi, Koji Iwano, Sadaoki Furui
2010	Validation of a training method for L2 continuous-speech segmentation. Anne Cutler, Janise Shanley
2010	Variant time-frequency cepstral features for speaker recognition. Weiqiang Zhang, Yan Deng, Liang He, Jia Liu
2010	Verifying pronunciation dictionaries using conflict analysis. Marelie H. Davel, Febe de Wet
2010	Viseme-dependent weight optimization for CHMM-based audio-visual speech recognition. Alexey Karpov, Andrey Ronzhin, Konstantin Markov, Milos Zelezný
2010	Vocabulary independent spoken query: a case for subword units. Evandro B. Gouvêa, Tony Ezzat
2010	Vocal tract contour analysis of emotional speech by the functional data curve representation. Sungbok Lee, Shrikanth S. Narayanan
2010	Voice activity detection based on conditional random fields using multiple features. Akira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
2010	Voice activity detection in a regularized reproducing kernel Hilbert space. Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010	Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization. Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani
2010	Voice attributes affecting likability perception. Benjamin Weiss, Felix Burkhardt
2010	Voice quality evaluation of recent open source codecs. Anssi Rämö, Henri Toukomaa
2010	Voice search for development. Etienne Barnard, Johan Schalkwyk, Charl Johannes van Heerden, Pedro J. Moreno
2010	WFST compression for automatic speech recognition. Diamantino Caseiro
2010	What do you mean, you're uncertain?: the interpretation of cue words and rising intonation in dialogue. Catherine Lai
2010	What else is new than the Hamming window? robust MFCCs for speaker recognition via multitapering. Tomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria Hansson-Sandsten
2010	When is indexical information about speech activated? evidence from a cross-modal priming experiment. Benjamin Munson, Renata Solum
2010	Wiktionary as a source for automatic pronunciation extraction. Tim Schlippe, Sebastian Ochs, Tanja Schultz
2010	Within and across sentence boundary language model. Saeedeh Momtazi, Friedrich Faubel, Dietrich Klakow