INTERSPEECH A

782 papers

YearTitle / Authors
2010"flat pitch accents" in Czech.
Tomás Dubeda
2010 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, Chiba, Japan, September 26-30, 2010
Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura
2010 2010, a speech oddity: phonetic transcription of reversed speech.
François Pellegrino, Emmanuel Ferragne, Fanny Meunier
2010A Bayesian approach to voice activity detection using multiple statistical models and discriminative training.
Tao Yu, John H. L. Hansen
2010A DOA estimation algorithm based on equalization-cancellation theory.
Duc Thanh Chau, Junfeng Li, Masato Akagi
2010A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation.
Ramón Fernandez Astudillo, Reinhold Orglmeister
2010A blind signal-to-noise ratio estimator for high noise speech recordings.
Charles Mercier, Roch Lefebvre
2010A classifier-based target cost for unit selection speech synthesis trained on perceptual data.
Volker Strom, Simon King
2010A cluster-profile representation of emotion using agglomerative hierarchical clustering.
Emily Mower, Kyu Jeong Han, Sungbok Lee, Shrikanth S. Narayanan
2010A comparative large scale study of MLP features for Mandarin ASR.
Fabio Valente, Mathew Magimai-Doss, Christian Plahl, Suman V. Ravuri, Wen Wang
2010A comparative study of constrained and unconstrained approaches for segmentation of speech signal.
Venkatesh Keri, Kishore Prahallad
2010A comparative study of noise estimation algorithms for VTS-based robust speech recognition.
Yong Zhao, Biing-Hwang Juang
2010A comparison of pronunciation modeling approaches for HMM-TTS.
Gabriel Webster, Sacha Krstulovic, Kate M. Knill
2010A corpus-based approach to speech enhancement from nonstationary noise.
Ji Ming, Ramji Srinivasan, Danny Crookes
2010A discriminative performance metric for GMM-UBM speaker identification.
Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li
2010A discriminative splitting criterion for phonetic decision trees.
Simon Wiesler, Georg Heigold, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney
2010A duration modeling technique with incremental speech rate normalization.
Hiroshi Fujimura, Takashi Masuko, Mitsuyoshi Tachimori
2010A factorial sparse coder model for single channel source separation.
Robert Peharz, Michael Stark, Franz Pernkopf, Yannis Stylianou
2010A fast implementation of factor analysis for speaker verification.
Qingsong Liu, Wei Huang, Dongxing Xu, Hongbin Cai, Beiqian Dai
2010A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data.
Taras Butko, Climent Nadeu
2010A fast query by humming system based on notes.
Jingzhou Yang, Jia Liu, Weiqiang Zhang
2010A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computation.
Konstantin Biatov
2010A feature extraction method for automatic speech recognition based on the cochlear nucleus.
Serajul Haque, Roberto Togneri
2010A hierarchical F0 modeling method for HMM-based speech synthesis.
Ming Lei, Yi-Jian Wu, Frank K. Soong, Zhen-Hua Ling, Li-Rong Dai
2010A hybrid approach to online speaker diarization.
Carlos Vaquero, Oriol Vinyals, Gerald Friedland
2010A hybrid approach to robust word lattice generation via acoustic-based word detection.
Icksang Han, Chiyoun Park, Jeongmi Cho, Jeongsu Kim
2010A hybrid architecture for mobile voice user interfaces.
Imre Kiss, Joseph Polifroni, Chao Wang, Ghinwa F. Choueiter, Mike Phillips
2010A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor.
Chang Huai You, Haizhou Li, Kong-Aik Lee
2010A language-identification inspired method for spontaneous speech detection.
Mickael Rouvier, Richard Dufour, Georges Linarès, Yannick Estève
2010A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcripts.
Sebastian Tschöpel, Daniel Schneider
2010A longest matching segment approach for text-independent speaker recognition.
Ayeh Jafari, Ramji Srinivasan, Danny Crookes, Ji Ming
2010A maximum a posteriori sound source localization in reverberant and noisy conditions.
Jinho Choi, Chang D. Yoo
2010A minimum classification error approach to pronunciation variation modeling of non-native proper names.
Line Adde, Bert Réveil, Jean-Pierre Martens, Torbjørn Svendsen
2010A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.
Xiaodan Zhuang, Lijuan Wang, Frank K. Soong, Mark Hasegawa-Johnson
2010A modified parameterization of the Fujisaki model.
Robert Schubert, Oliver Jokisch, Diane Hirschfeld
2010A multidomain approach for automatic home environmental sound classification.
Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis
2010A multimodal density function estimation approach to formant tracking.
Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas
2010A multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks.
José L. Carmona, Angel M. Gomez, Antonio M. Peinado, José L. Pérez-Córdoba, José A. González
2010A multistream multiresolution framework for phoneme recognition.
Nima Mesgarani, Samuel Thomas, Hynek Hermansky
2010A new VAD framework using statistical model and human knowledge based empirical rule.
Ji Wu, Xiao-Lei Zhang, Wei Li
2010A new approach for automatic tone error detection in strong accented Mandarin based on dominant set.
Taotao Zhu, Dengfeng Ke, Zhenbiao Chen, Bo Xu
2010A new binary mask based on noise constraints for improved speech intelligibility.
Gibak Kim, Philipos C. Loizou
2010A new multichannel multi modal dyadic interaction database.
Viktor Rozgic, Bo Xiao, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010A novel approach for matched reverberant training of HMMs using data pairs.
Armin Sehr, Christian Hofmann, Roland Maas, Walter Kellermann
2010A novel confidence measure based on marginalization of jointly estimated error cause probabilities.
Atsunori Ogawa, Atsushi Nakamura
2010A novel feature extraction strategy for multi-stream robust emotion identification.
Gang Liu, Yun Lei, John H. L. Hansen
2010A novel hybrid approach for Mandarin speech synthesis.
Shifeng Pan, Meng Zhang, Jianhua Tao
2010A novel path extension framework using steady segment detection for Mandarin speech recognition.
Zhanlei Yang, Wenju Liu
2010A novel speaker binary key derived from anchor models.
Xavier Anguera, Jean-François Bonastre
2010A novel text-independent phonetic segmentation algorithm based on the microcanonical multiscale formalism.
Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein M. Yahia
2010A particle filter feature compensation approach to robust speech recognition.
Aleem Mushtaq, Yu Tsao, Chin-Hui Lee
2010A perceptual study of acceleration parameters in HMM-based TTS.
Yining Chen, Zhi-Jie Yan, Frank K. Soong
2010A phoneme recognition framework based on auditory spectro-temporal receptive fields.
Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky
2010A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identity.
Kayoko Yanagisawa, Mark A. Huckvale
2010A procedure for estimating gestural scores from natural speech.
Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson
2010A quick sequential forward floating feature selection algorithm for emotion detection from speech.
Mátyás Brendel, Riccardo Zaccarelli, Laurence Devillers
2010A regularized discriminative training method of acoustic models derived by minimum relative entropy discrimination.
Yotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi
2010A robust audio-visual speech recognition using audio-visual voice activity detection.
Satoshi Tamura, Masato Ishikawa, Takashi Hashiba, Shin'ichi Takeuchi, Satoru Hayamizu
2010A robust speech recognition system against the ego noise of a robot.
Gökhan Ince, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun-ichi Imura
2010A rule-based backchannel prediction model using pitch and pause information.
Khiet P. Truong, Ronald Poppe, Dirk Heylen
2010A segment-based non-parametric approach for monophone recognition.
Ladan Golipour, Douglas D. O'Shaughnessy
2010A semi-supervised cluster-and-label approach for utterance classification.
Amparo Albalate, Aparna Suchindranath, David Suendermann, Wolfgang Minker
2010A singing style modeling system for singing voice synthesizers.
Keijiro Saino, Makoto Tachibana, Hideki Kenmochi
2010A spectral LF model based approach to voice source parameterisation.
John Kane, Mark Kane, Christer Gobl
2010A speech-in-noise test based on spoken digits: comparison of normal and impaired listeners using a computer model.
Matthew Robertson, Guy J. Brown, Wendy Lecluyse, Manasa Panda, Christine M. Tan
2010A spoken term detection framework for recovering out-of-vocabulary words using the web.
Carolina Parada, Abhinav Sethy, Mark Dredze, Frederick Jelinek
2010A statistical segment-based approach for spoken language understanding.
Lucía Ortega, Isabel Galiano, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra
2010A stochastic finite-state transducer approach to spoken dialog management.
Lluís F. Hurtado, Joaquin Planells, Encarna Segarra, Emilio Sanchis, David Griol
2010A study of interplay between articulatory movement and prosodic characteristics in emotional speech production.
Jangwon Kim, Sungbok Lee, Shrikanth S. Narayanan
2010A study of intra-speaker and inter-speaker affective variability using electroglottograph and inverse filtered glottal waveforms.
Daniel Bone, Samuel Kim, Sungbok Lee, Shrikanth S. Narayanan
2010A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR.
Guangchuan Shi, Yu Shi, Qiang Huo
2010A study of term weighting in phonotactic approach to spoken language recognition.
Sirinoot Boonsuk, Donglai Zhu, Bin Ma, Atiwong Suchato, Proadpran Punyabukkana, Nattanun Thatphithakkul, Chai Wutiwiwatchai
2010A super-resolution spectrogram using coupled PLCA.
Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel
2010A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification.
Chi-Sang Jung, Kyu Jeong Han, Hyunson Seo, Shrikanth S. Narayanan, Hong-Goo Kang
2010Accelerating hierarchical acoustic likelihood computation on graphics processors.
Pavel Kveton, Miroslav Novak
2010Accurate pitch marking for prosodic modification of speech segments.
Thomas Ewender, Beat Pfister
2010Acoustic analysis of intonation in Parkinson's disease.
Joan K. Y. Ma, Rüdiger Hoffmann
2010Acoustic correlates of meaning structure in conversational speech.
Alexei V. Ivanov, Giuseppe Riccardi, Sucheta Ghosh, Sara Tonelli, Evgeny A. Stepanov
2010Acoustic correlates of voice quality improvement by voice training.
Kiyoaki Aikawa, Junko Uenuma, Tomoko Akitake
2010Acoustic feature analysis in speech emotion primitives estimation.
Dongrui Wu, Thomas D. Parsons, Shrikanth S. Narayanan
2010Acoustic feature diversity and speaker verification.
R. Padmanabhan, Hema A. Murthy
2010Acoustic modeling with bootstrap and restructuring for low-resourced languages.
Xiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou
2010Acoustic vector resampling for GMMSVM-based speaker verification.
Man-Wai Mak, Wei Rao
2010Acoustic-based recognition of head gestures accompanying speech.
Akira Sasou, Yasuharu Hashimoto, Katsuhiko Sakaue
2010Acoustic-to-articulatory inversion based on local regression.
Samer Al Moubayed, Gopal Ananthakrishnan
2010Acoustics-based phonetic transcription method for proper nouns.
Antoine Laurent, Sylvain Meignier, Téva Merlin, Paul Deléglise
2010Active appearance models for photorealistic visual speech synthesis.
Wesley Mattheyses, Lukas Latacz, Werner Verhelst
2010Active word learning under uncertain input conditions.
Maarten Versteegh, Louis ten Bosch, Lou Boves
2010Adaptation of a tongue shape model by local feature transformations.
Chao Qin, Miguel Á. Carreira-Perpiñán, Mohsen Farhadloo
2010Adapting a duration synthesis model to rate children's oral reading prosody.
Minh Duong, Jack Mostow
2010Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environments.
Mark C. Huggins, Brett Y. Smolenski, Aaron D. Lawson
2010Adaptive voice-quality control based on one-to-many eigenvoice conversion.
Kumi Ohta, Tomoki Toda, Yamato Ohtani, Hiroshi Saruwatari, Kiyohiro Shikano
2010Advanced speech communication system for deaf people.
Rubén San Segundo, Verónica López-Ludeña, Raquel Martín, Syaheerah L. Lutfi, Javier Ferreiros, Ricardo de Córdoba, José Manuel Pardo
2010Advances in fast multistream diarization based on the information bottleneck framework.
Deepu Vijayasenan, Fabio Valente, Hervé Bourlard
2010Affective story teller: a TTS system for emotional expressivity.
Shaikh Mostafa Al Masum, Antonio Rui Ferreira Rebordão, Keikichi Hirose
2010Age and gender classification from speech using decision level fusion and ensemble based techniques.
Florian Lingenfelser, Johannes Wagner, Thurid Vogt, Jonghwa Kim, Elisabeth André
2010Age and gender classification using fusion of acoustic and prosodic features.
Hugo Meinedo, Isabel Trancoso
2010Age and gender recognition based on multiple systems - early vs. late fusion.
Tobias Bocklet, Georg Stemmer, Viktor Zeißler, Elmar Nöth
2010Age recognition based on speech signals using weights supervector.
Royi Porat, Dan Lange, Yaniv Zigel
2010An HMM trajectory tiling (HTT) approach to high quality TTS.
Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Xin Zhuang, Shengyi Kong
2010An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation.
Hui Liang, John Dines
2010An analysis of sparseness and regularization in exemplar-based methods for speech classification.
Dimitri Kanevsky, Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo
2010An analytic modeling approach to enhancing throat microphone speech commands for keyword spotting.
Jun Cai, Stefano Marini, Pierre Malarme, Francis Grenez, Jean Schoentgen
2010An auditory based modulation spectral feature for reverberant speech recognition.
Hari Krishna Maganti, Marco Matassoni
2010An effect of formant amplitude in vowel perception.
Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano
2010An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation.
Wooil Kim, Jun-Won Suh, John H. L. Hansen
2010An empirical comparison of the t
Josef R. Novak, Paul R. Dixon, Sadaoki Furui
2010An exploration of voice source correlates of focus.
Irena Yanushevskaya, Christer Gobl, John Kane, Ailbhe Ní Chasaide
2010An implementation of decision tree-based context clustering on graphics processing units.
Nicholas Pilkington, Heiga Zen
2010An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models.
Kyu Jeong Han, Shrikanth S. Narayanan
2010An improved wavelet-based dereverberation for robust automatic speech recognition.
Randy Gomez, Tatsuya Kawahara
2010An integrated top-down/bottom-up approach to speaker diarization.
Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Dong Wang, Raphaël Troncy
2010An intonation model for TTS in Sepedi.
Daniel R. van Niekerk, Etienne Barnard
2010An intrusive super-wideband speech quality model: DIAL.
Nicolas Côté, Vincent Koehl, Valérie Gautier-Turbin, Alexander Raake, Sebastian Möller
2010An investigation into direct scoring methods without SVM training in speaker verification.
Ce Zhang, Rong Zheng, Bo Xu
2010An investigation of formant frequencies for cognitive load classification.
Tet Fei Yap, Julien Epps, Eliathamby Ambikairajah, Eric H. C. Choi
2010An unsupervised approach to creating web audio contents-based HMM voices.
Jinfu Ni, Hisashi Kawai
2010Analysis and detection of cognitive load and frustration in drivers' speech.
Hynek Boril, Seyed Omid Sadjadi, Tristan Kleinschmidt, John H. L. Hansen
2010Analysis of excitation source information in emotional speech.
S. R. Mahadeva Prasanna, D. Govind
2010Analysis of gender normalization using MLP and VTLN features.
Thomas Schaaf, Florian Metze
2010Analytical assessment and distance modeling of speech transmission quality.
Marcel Wältermann, Alexander Raake, Sebastian Möller
2010Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracy.
Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno
2010Applying geometric source separation for improved pitch extraction in human-robot interaction.
Martin Heckmann, Claudius Gläser, Frank Joublin, Kazuhiro Nakadai
2010Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech.
Wei Zhang, Xiaodong Cui
2010Applying voice conversion to concatenative singing-voice synthesis.
Fernando Villavicencio, Jordi Bonada
2010Approaching human listener accuracy with modern speaker verification.
Ville Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li
2010Articulatory grounding of Southern Salentino harmony processes.
Mirko Grimaldi, Andrea Calabrese, Francesco Sigona, Luigia Garrapa, Bianca Sisinni
2010Articulatory inversion of American English /ɹ/ by conditional density modes.
Chao Qin, Miguel Á. Carreira-Perpiñán
2010Articulatory synthesis and perception of plosive-vowel syllables with virtual consonant targets.
Peter Birkholz, Bernd J. Kröger, Christiane Neuschaefer-Rube
2010Articulatory-functional modeling of speech prosody: a review.
Yi Xu, Santitham Prom-on
2010Artificial and online acquired noise dictionaries for noise robust ASR.
Jort F. Gemmeke, Tuomas Virtanen
2010Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions.
Seyed Omid Sadjadi, John H. L. Hansen
2010Assessment of spoken and multimodal applications: lessons learned from laboratory and field studies.
Markku Turunen, Jaakko Hakulinen, Tomi Heimonen
2010Asymptotically exact noise-corrupted speech likelihoods.
Rogier C. van Dalen, Mark J. F. Gales
2010Audio analytics by template modeling and 1-pass DP based decoding.
Srikanth Cherla, V. Ramasubramanian
2010Audio-based sports highlight detection by Fourier local auto-correlations.
Jiaxing Ye, Takumi Kobayashi, Tetsuya Higuchi
2010Audio-visual anticipatory coarticulation modeling by human and machine.
Louis H. Terry, Karen Livescu, Janet B. Pierrehumbert, Aggelos K. Katsaggelos
2010Audio-visual synchronisation for speaker diarisation.
Giulia Garau, Alfred Dielmann, Hervé Bourlard
2010Audiovisual congruence and pragmatic focus marking.
Charlotte Wollermann, Bernhard Schröder, Ulrich Schade
2010Augmentation of adaptation data.
Ravichander Vipperla, Steve Renals, Joe Frankel
2010Augmented context features for Arabic speech recognition.
Ahmad Emami, Hong-Kwang Jeff Kuo, Imed Zitouni, Lidia Mangu
2010Augmented set of features for confidence estimation in spoken term detection.
Javier Tejedor, Doroteo T. Toledano, Miguel Bautista, Simon King, Dong Wang, José Colás
2010AutoBI - a tool for automatic ToBI annotation.
Andrew Rosenberg
2010Autocorrelation and double autocorrelation based spectral representations for a noisy word recognition system.
Tetsuya Shimamura, Ngoc Dinh Nguyen
2010Automated vocal emotion recognition using phoneme class specific features.
Géza Kiss, Jan P. H. van Santen
2010Automatic analysis of the intonation of a tone language. Applying the Momel algorithm to spontaneous Standard Chinese (Beijing).
Na Zhi, Daniel Hirst, Pier Marco Bertinetto
2010Automatic classification of married couples' behavior using audio features.
Matthew Black, Athanasios Katsamanis, Chi-Chun Lee, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system.
Wai Kit Lo, Shuang Zhang, Helen M. Meng
2010Automatic detection of abnormal stress patterns in unit selection synthesis.
Yeon-Jun Kim, Marc C. Beutnagel
2010Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram.
Sunao Hara, Norihide Kitaoka, Kazuya Takeda
2010Automatic discriminative measurement of voice onset time.
Morgan Sonderegger, Joseph Keshet
2010Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier.
Heng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, Ren-Hua Wang
2010Automatic estimation of transcription accuracy and difficulty.
Brandon Roy, Soroush Vosoughi, Deb Roy
2010Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniques.
Kuniaki Hirabayashi, Seiichi Nakagawa
2010Automatic excitement-level detection for sports highlights generation.
Hynek Boril, Abhijeet Sangwan, Taufiq Hasan, John H. L. Hansen
2010Automatic perceptual categorization of disordered connected speech.
Ali Alpan, Jean Schoentgen, Youri Maryn, Francis Grenez
2010Automatic pronunciation scoring using learning to rank and DP-based score segmentation.
Liang-Yu Chen, Jyh-Shing Roger Jang
2010Automatic reference independent evaluation of prosody quality using multiple knowledge fusions.
Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu
2010Automatic selection of thresholds for signal separation algorithms based on interaural delay.
Chanwoo Kim, Richard M. Stern, Kiwan Eom, Jaewon Lee
2010Automatic speaker age and gender recognition in the car for tailoring dialog and mobile services.
Michael Feld, Felix Burkhardt, Christian A. Müller
2010Automatic speech recognition for assistive writing in speech supplemented word prediction.
John-Paul Hosom, Tom Jakobs, Allen Baker, Susan Fager
2010Automatic speech recognition of multiple accented English data.
Dimitra Vergyri, Lori Lamel, Jean-Luc Gauvain
2010Automatic speech recognition system channel modeling.
Qun Feng Tan, Kartik Audhkhasi, Panayiotis G. Georgiou, Emil Ettelaie, Shrikanth S. Narayanan
2010Automatic turn segmentation in spoken conversations.
Alexei V. Ivanov, Giuseppe Riccardi
2010Autoregressive clustering for HMM speech synthesis.
Matt Shannon, William Byrne
2010Autoregressive modelling for linear prediction of ultrasonic speech.
Farzaneh Ahmadi, Ian Vince McLoughlin, Hamid R. Sharifzadeh
2010Bandwidth expansion of speech based on wavelet transform modulus maxima vector mapping.
Zhe Chen, You-Chi Cheng, Fuliang Yin, Chin-Hui Lee
2010Bayes factor based speaker segmentation for speaker diarization.
David Wang, Robert Vogt, Sridha Sridharan
2010Bayesian speaker recognition using Gaussian mixture model and Laplace approximation.
Shih-Sian Cheng, I-Fan Chen, Hsin-Min Wang
2010Beyond sentence prosody.
Chiu-yu Tseng
2010Bias considerations for minimum subspace noise tracking.
Mahdi Triki, Kees Janse
2010Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancement.
Qingju Liu, Wenwu Wang, Philip J. B. Jackson
2010Binary coding of speech spectrograms using a deep auto-encoder.
Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton
2010Boosted mixture learning of Gaussian mixture HMMs for speech recognition.
Jun Du, Yu Hu, Hui Jiang
2010Boosting systems for LVCSR.
George Saon, Hagen Soltau
2010Brazilian Portuguese acoustic model training based on data borrowing from other language.
Kazuhiko Abe, Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010Brno University of Technology system for INTERSPEECH 2010 Paralinguistic Challenge.
Marcel Kockmann, Lukás Burget, Jan Cernocký
2010Building transcribed speech corpora quickly and cheaply for many languages.
Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, Mike LeBeau
2010CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language.
Jingli Lu, Ruili Wang, Liyanage C. De Silva, Yang Gao, Jia Liu
2010CRF-based combination of contextual features to improve a posteriori word-level confidence measures.
Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier, Patrick Gros
2010CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detection.
Dong Wang, Simon King, Nicholas W. D. Evans, Raphaël Troncy
2010Can conversational word usage be used to predict speaker demographics?
Dan Gillick
2010Can tongue be recovered from face? the answer of data-driven statistical models.
Atef Ben Youssef, Pierre Badin, Gérard Bailly
2010Canonical state models for automatic speech recognition.
Mark J. F. Gales, Kai Yu
2010Cantonese tone word learning by tone and non-tone language speakers.
Angela Cooper, Yue Wang
2010Catalog-based single-channel speech-music separation.
Cemil Demir, A. Taylan Cemgil, Murat Saraclar
2010Challenging the speech intelligibility index: macroscopic vs. microscopic prediction of sentence recognition in normal and hearing-impaired listeners.
Tim Jürgens, Stefan Fredelake, Ralf M. Meyer, Birger Kollmeier, Thomas Brand
2010Changes in temporal processing of speech across the adult lifespan.
Diane Kewley-Port, Larry E. Humes, Daniel Fogerty
2010Channel detectors for system fusion in the context of NIST LRE 2009.
Florian Verdet, Driss Matrouf, Jean-François Bonastre, Jean Hennebert
2010Chirp complex cepstrum-based decomposition for asynchronous glottal analysis.
Thomas Drugman, Thierry Dutoit
2010Classifying dialog acts in human-human and human-machine spoken conversations.
Silvia Quarteroni, Giuseppe Riccardi
2010Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures.
Tatsuya Kawahara, Norihiro Katsumaru, Yuya Akita, Shinsuke Mori
2010Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface.
Jani Even, Carlos Toshinori Ishi, Hiroshi Saruwatari, Norihiro Hagita
2010Cluster analysis of differential spectral envelopes on emotional speech.
Giampiero Salvi, Fabio Tesser, Enrico Zovato, Piero Cosi
2010Cluster-based language model for spoken document retrieval using NMF-based document clustering.
Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010Combination of probabilistic and possibilistic language models.
Stanislas Oger, Vladimir Popescu, Georges Linarès
2010Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression.
Sha Meng, Weiqiang Zhang, Jia Liu
2010Combining five acoustic level modeling methods for automatic speaker age and gender recognition.
Ming Li, Chi-Sang Jung, Kyu Jeong Han
2010Combining many alignments for speech to speech translation.
Sameer Maskey, Steven J. Rennie, Bowen Zhou
2010Combining monaural and binaural evidence for reverberant speech segregation.
John Woodruff, Rohit Prabhavalkar, Eric Fosler-Lussier, DeLiang Wang
2010Combining text categorization and dialog modeling for speaker role identification on call center conversations.
Rémi Lavalley, Chloé Clavel, Patrice Bellot, Marc El-Bèze
2010Combining user intention and error modeling for statistical dialog simulators.
Silvia Quarteroni, Meritxell González, Giuseppe Riccardi, Sebastian Varges
2010Combining word-based features, statistical language models, and parsing for named entity recognition.
Joseph Polifroni, Stephanie Seneff
2010Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activity.
Nick Campbell, Stefan Scherer
2010Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish.
Martine Adda-Decker, Lori Lamel, Natalie D. Snoeren
2010Comparison of HMM and TMDN methods for lip synchronisation.
Gregor Hofer, Korin Richmond
2010Comparison of approaches for instrumentally predicting the quality of text-to-speech systems.
Sebastian Möller, Florian Hinterleitner, Tiago H. Falk, Tim Polzehl
2010Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems.
Bo Li, Khe Chai Sim
2010Comparison of methods for topic classification in a speech-oriented guidance system.
Rafael Torres, Shota Takeuchi, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano
2010Competition in the perception of spoken Japanese words.
Takashi Otake, James M. McQueen, Anne Cutler
2010Concurrent speaker localization using multi-band position-pitch (m-popi) algorithm with spectro-temporal pre-processing.
Tania Habib, Harald Romsdorfer
2010Conditional models for detecting lambda-functions in a spoken language understanding system.
Frédéric Duvert, Renato De Mori
2010Confidence measures for speaker segmentation and their relation to speaker verification.
Carlos Vaquero, Alfonso Ortega, Jesús Antonio Villalba López, Antonio Miguel, Eduardo Lleida
2010Constructing Japanese test collections for spoken term detection.
Yoshiaki Itoh, Hiromitsu Nishizaki, Xinhui Hu, Hiroaki Nanjo, Tomoyosi Akiba, Tatsuya Kawahara, Seiichi Nakagawa, Tomoko Matsui, Yoichi Yamashita, Kiyoaki Aikawa
2010Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition.
Xinhui Hu, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010Content-based advertisement detection.
Patrick Cardinal, Vishwa Gupta, Gilles Boulianne
2010Context adaptive training with factorized decision trees for HMM-based speech synthesis.
Kai Yu, Heiga Zen, François Mairesse, Steve J. Young
2010Context dependent modelling approaches for hybrid speech recognizers.
Alberto Abad, Thomas Pellegrini, Isabel Trancoso, João Paulo Neto
2010Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling.
Martin Wöllmer, Angeliki Metallinou, Florian Eyben, Björn W. Schuller, Shrikanth S. Narayanan
2010Contextual verification for open vocabulary spoken term detection.
Daniel Schneider, Timo Mertens, Martha A. Larson, Joachim Köhler
2010Continuous speech recognition with a TF-IDF acoustic model.
Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, Alex Acero
2010Conversational spontaneous speech synthesis using average voice model.
Tomoki Koriyama, Takashi Nose, Takao Kobayashi
2010Convexity and fast speech extraction by split Bregman method.
Meng Yu, Wenye Ma, Jack Xin, Stanley J. Osher
2010Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic features.
Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang, C.-H. Alvin Chen
2010Creating a linguistic plausibility dataset with non-expert annotators.
Benjamin Lambert, Rita Singh, Bhiksha Raj
2010Cross-cultural investigation of prosody in verbal feedback in interactional rapport.
Gina-Anne Levow, Susan Duncan, Edward T. King
2010Cross-lingual acoustic modeling for dialectal Arabic speech recognition.
Mohamed Elmahdy, Rainer Gruhn, Wolfgang Minker, Slim Abdennadher
2010Cross-lingual and multi-stream posterior features for low resource LVCSR systems.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
2010Cross-lingual speaker adaptation via Gaussian component mapping.
Houwei Cao, Tan Lee, P. C. Ching
2010Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translation.
Fabrice Lefèvre, François Mairesse, Steve J. Young
2010Cross-lingual talker discrimination.
Mirjam Wester
2010Dajare is not the lowest form of wit.
Takashi Otake
2010Data pruning for template-based automatic speech recognition.
Dino Seppi, Dirk Van Compernolle
2010Data selection for language modeling using sparse representations.
Abhinav Sethy, Tara N. Sainath, Bhuvana Ramabhadran, Dimitri Kanevsky
2010Data-dependent evaluator modeling and its application to emotional valence classification from speech.
Kartik Audhkhasi, Shrikanth S. Narayanan
2010Data-driven analysis of realtime vocal tract MRI using correlated image regions.
Adam C. Lammert, Michael I. Proctor, Shrikanth S. Narayanan
2010Decision tree based tone modeling with corrective feedbacks for automatic Mandarin tone assessment.
Hsien-Cheng Liao, Jiang-Chun Chen, Sen-Chia Chang, Ying-Hua Guan, Chin-Hui Lee
2010Decision tree state clustering with word and syllable features.
Hank Liao, Christopher Alberti, Michiel Bacchiani, Olivier Siohan
2010Declarative sentence intonation patterns in 8 Swiss German dialects.
Adrian Leemann, Lucy Zuberbühler
2010Decoding with shrinkage-based language models.
Ahmad Emami, Stanley F. Chen, Abraham Ittycheriah, Hagen Soltau, Bing Zhao
2010Decoupling session variability modelling and speaker characterisation.
Anthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre
2010Deep-structured hidden conditional random fields for phonetic recognition.
Dong Yu, Li Deng
2010Detailed pronunciation variant modeling for speech transcription.
Denis Jouvet, Dominique Fohr, Irina Illina
2010Detecting politeness and efficiency in a cooperative social interaction.
Paul M. Brunet, Marcela Charfuelan, Roderick Cowie, Marc Schröder, Hastings Donnan, Ellen Douglas-Cowie
2010Detecting categorical perception in continuous discrimination data.
Paul Boersma, Katerina Chládková
2010Detecting novel objects in acoustic scenes through classifier incongruence.
Jörg-Hendrik Bach, Jörn Anemüller
2010Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterances.
Narichika Nomoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2010Detection of hot spots in poster conversations based on reactive tokens of audience.
Tatsuya Kawahara, Kouhei Sumi, Zhi-qiang Chang, Katsuya Takanashi
2010Determining optimal features for emotion recognition from speech by applying an evolutionary algorithm.
David Philippou-Hübner, Bogdan Vlasenko, Tobias Grosser, Andreas Wendemuth
2010Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training.
Wen Cao, Dongning Wang, Jinsong Zhang, Ziyu Xiong
2010Dialect recognition using a phone-GMM-supervector-based SVM kernel.
Fadi Biadsy, Julia Hirschberg, Michael Collins
2010Dialog prediction for a general model of turn-taking.
Nigel G. Ward, Olac Fuentes, Alejandro Vega
2010Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrix.
Wei-Bin Liang, Chung-Hsien Wu, Yu-Cheng Hsiao
2010Dialogue act tagging and segmentation with a single perceptron.
Ramón Granell, Stephen G. Pulman, Carlos D. Martínez-Hinarejos, José-Miguel Benedí
2010Did you say susi or shushi? measuring the emergence of robust fricative contrasts in English- and Japanese-acquiring children.
Jeffrey J. Holliday, Mary E. Beckman, Chanelle Mays
2010Direct construction of compact context-dependency transducers from data.
David Rybach, Michael Riley
2010Direct observation of pruning errors (DOPE): a search analysis tool.
Volker Steinbiss, Martin Sundermeyer, Hermann Ney
2010Disambiguating the functions of conversational sounds with prosody: the case of 'yeah'.
Khiet P. Truong, Dirk Heylen
2010Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching.
Guillaume Aimetti, Roger K. Moore, Louis ten Bosch
2010Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT).
Xiaojun Qian, Frank K. Soong, Helen M. Meng
2010Discriminative adaptation based on fast combination of DMAP and dfMLLR.
Lukás Machlica, Zbynek Zajíc, Ludek Müller
2010Discriminative adaptation for log-linear acoustic models.
Jonas Lööf, Ralf Schlüter, Hermann Ney
2010Discriminative language modeling using simulated ASR errors.
Preethi Jyothi, Eric Fosler-Lussier
2010Discriminative training and unsupervised adaptation for labeling prosodic events with limited training data.
Raul Fernandez, Bhuvana Ramabhadran
2010Discriminative training for hierarchical clustering in speaker diarization.
Oriol Vinyals, Gerald Friedland, Nelson Morgan
2010Distribution and trichotomic realization of voiced velars in Japanese - an experimental study.
Shin-ichiro Sano, Tomohiko Ooigawa
2010Does sentence complexity interfere with intelligibility in noise? evaluation of the Oldenburg linguistically and audiologically controlled sentence test (OLACS).
Verena N. Uslar, Thomas Brand, Mirko Hanke, Rebecca Carroll, Esther Ruigendijk, Cornelia Hamann, Birger Kollmeier
2010Domain adaptation and compensation for emotion detection.
Michelle Hewlett Sanchez, Gökhan Tür, Luciana Ferrer, Dilek Hakkani-Tür
2010Durational structure of Japanese single/geminate stops in three- and four-mora words spoken at varied rates.
Yukari Hirata, Shigeaki Amano
2010Dynamic language model adaptation using keyword category classification.
Hitoshi Yamamoto, Ken Hanazawa, Kiyokazu Miki, Koichi Shinoda
2010Dynamic language modeling using Bayesian networks for spoken dialog systems.
Antoine Raux, Neville Mehta, Deepak Ramachandran, Rakesh Gupta
2010Dynamic model selection for spectral voice conversion.
Pierre Lanchantin, Xavier Rodet
2010Effect of spatial separation on speech-in-noise comprehension in dyslexic adults.
Marjorie Dole, Michel Hoen, Fanny Meunier
2010Effects of Korean learners' consonant cluster reduction strategies on English speech recognition performance.
Hyejin Hong, Jina Kim, Minhwa Chung
2010Effects of accent typicality and phonotactic frequency on nonword immediate serial recall performance in Japanese.
Yuuki Tanida, Taiji Ueno, Satoru Saito, Matthew A. Lambon Ralph
2010Effects of enhancement of spectral changes on speech quality and subjective speech intelligibility.
Jing Chen, Thomas Baer, Brian C. J. Moore
2010Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognition.
Nobuhide Yamakawa, Tetsuro Kitahara, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
2010Effects of the phonological relevance in speaker verification.
Yanhua Long, Li-Rong Dai, Bin Ma, Wu Guo
2010Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract model.
Kunitoshi Motoki
2010Efficient HMM-based estimation of missing features, with applications to packet loss concealment.
Bengt J. Borgström, Per Henrik Borgström, Abeer Alwan
2010Efficient combined approach for named entity recognition in spoken language.
Azeddine Zidouni, Sophie Rosset, Hervé Glotin
2010Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models.
Satoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi
2010Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension.
Tanel Alumäe, Mikko Kurimo
2010Efficient manycore CHMM speech recognition for audiovisual and multistream data.
Dorothea Kolossa, Jike Chong, Steffen Zeiler, Kurt Keutzer
2010Efficient three-stage pitch estimation for packet loss concealment.
Xuejing Sun, Sameer Gadre
2010Emotion recognition using imperfect speech recognition.
Florian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Björn W. Schuller, Stefan Steidl
2010Empirical mode decomposition for noise-robust automatic speech recognition.
Kuo-Hao Wu, Chia-Ping Chen
2010Energy reallocation strategies for speech enhancement in known noise conditions.
Yan Tang, Martin Cooke
2010English spoken term detection in multilingual recordings.
Petr Motlícek, Fabio Valente, Philip N. Garner
2010Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience.
Romain Laroche, Philippe Bretier, Ghislain Putois
2010Enhanced speech yielding higher intelligibility for all listeners and environments.
Takayuki Arai, Nao Hodoshima
2010Enhanced word classing for model M.
Stanley F. Chen, Stephen M. Chu
2010Enhancements of Viterbi search for fast unit selection synthesis.
Daniel Tihelka, Jirí Kala, Jindrich Matousek
2010Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization.
Shweta Ghai, Rohit Sinha
2010Estimating missing data sequences in x-ray microbeam recordings.
Chao Qin, Miguel Á. Carreira-Perpiñán
2010Estimating noise from noisy speech features with a Monte Carlo variant of the expectation maximization algorithm.
Friedrich Faubel, Dietrich Klakow
2010Estimation of glottal area function using stereo-endoscopic high-speed digital imaging.
Hiroshi Imagawa, Ken-Ichi Sakakibara, Isao T. Tokuda, Mamiko Otsuka, Niro Tayama
2010Estimation of speech lip features from discrete cosinus transform.
Zuheng Ming, Denis Beautemps, Gang Feng, Sébastien Schmerber
2010Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models.
Kazuhiro Kondo, Yusuke Takano
2010Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model.
Heikki Rasilo, Unto K. Laine, Okko Johannes Räsänen
2010Evaluating a dialog language generation system: comparing the mountain system to other NLG approaches.
Brian Langner, Stephan Vogel, Alan W. Black
2010Evaluation of a silent speech interface based on magnetic sensing.
Robin Hofe, Stephen R. Ell, Michael J. Fagan, James M. Gilbert, Phil D. Green, Roger K. Moore, Sergey I. Rybchenko
2010Evaluation of bone-conducted ultrasonic hearing-aid regarding transmission of paralinguistic information: a comparison with cochlear implant simulator.
Takayuki Kagomiya, Seiji Nakagawa
2010Evaluation of prosodic contextual factors for HMM-based speech synthesis.
Shuji Yokomizo, Takashi Nose, Takao Kobayashi
2010Evaluation of speaker mimic technology for personalizing SGD voices.
Esther Klabbers, Alexander Kain, Jan P. H. van Santen
2010Excitation modeling based on waveform interpolation for HMM-based speech synthesis.
June Sig Sung, Doo Hwa Hong, Kyung Hwan Oh, Nam Soo Kim
2010Expectations for discourse genre identification: a prosodic study.
Nicolas Obin, Volker Dellwo, Anne Lacheret, Xavier Rodet
2010Exploitation of phase information for speaker recognition.
Ning Wang, P. C. Ching, Tan Lee
2010Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition.
Sabato Marco Siniscalchi, Jeremy Reed, Torbjørn Svendsen, Chin-Hui Lee
2010Exploiting glottal formant parameters for glottal inverse filtering and parameterization.
Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle
2010Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription.
Oscar Koller, Alberto Abad, Isabel Trancoso, Céu Viana
2010Exploring goodness of prosody by diverse matching templates.
Shen Huang, Hongyan Li, Shijin Wang, Jiaen Liang, Bo Xu
2010Exploring recognition network representations for efficient speech inference on highly parallel platforms.
Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer
2010Exploring speaker characteristics for meeting summarization.
Fei Liu, Yang Liu
2010Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals.
B. Avinash, Sunitha Guruprasad, B. Yegnanarayana
2010Exploring the mechanism of tonal contraction in Taiwan Mandarin.
Chierh Cheng, Yi Xu, Michele Gubian
2010Exploring web-browser based runtimes engines for creating ubiquitous speech interfaces.
Paul R. Dixon, Sadaoki Furui
2010Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions.
Jouni Pohjalainen, Rahim Saeidi, Tomi Kinnunen, Paavo Alku
2010Extending the punctuation module for European Portuguese.
Fernando Batista, Helena Moniz, Isabel Trancoso, Hugo Meinedo, Ana Isabel Mata, Nuno J. Mamede
2010Extractive speech summarization - from the view of decision theory.
Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen
2010Extractive summarization using a latent variable model.
Asli Celikyilmaz, Dilek Hakkani-Tür
2010F
Jiahong Yuan, Mark Y. Liberman
2010FSM-based pronunciation modeling using articulatory phonological code.
Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson
2010Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework.
Achintya Kumar Sarkar, Srinivasan Umesh
2010Fast converging iterative Kalman filtering for speech enhancement using long and overlapped tapered windows with large side lobe attenuation.
Stephen So, Kuldip K. Paliwal
2010Fast least-squares solution for sinusoidal, harmonic and quasi-harmonic models.
Georgios Tzedakis, Yannis Pantazis, Olivier Rosec, Yannis Stylianou
2010Feature selection for pose invariant lip biometrics.
Adrian Pass, Jianguo Zhang, Darryl Stewart
2010Feature versus model based noise robustness.
Kris Demuynck, Xueru Zhang, Dirk Van Compernolle, Hugo Van hamme
2010Floor holder detection and end of speaker turn prediction in meetings.
Alfred Dielmann, Giulia Garau, Hervé Bourlard
2010Fluency and structural complexity as predictors of L2 oral proficiency.
Jared Bernstein, Jian Cheng, Masanori Suzuki
2010Focus-sensitive operator or focus inducer: always and only.
Yong-Cheol Lee, Satoshi Nambu
2010Foreign accent matters most when timing is wrong.
Chiharu Tsurutani
2010Formant-based frequency warping for improving speaker adaptation in HMM TTS.
Xin Zhuang, Yao Qian, Frank K. Soong, Yi-Jian Wu, Bo Zhang
2010Frequency of occurrence effects on pitch accent realisation.
Katrin Schweitzer, Michael Walsh, Bernd Möbius, Hinrich Schütze
2010Frequency-domain delexicalization using surrogate vowels.
Alexander Kain, Jan P. H. van Santen
2010Full body aero-tactile integration in speech perception.
Donald Derrick, Bryan Gick
2010Fully automatic segmentation for prosodic speech corpora.
Sarah Hoffmann, Beat Pfister
2010Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events.
Okko Johannes Räsänen
2010Functional imaging of brain regions sensitive to communication sounds in primates.
Christopher I. Petkov, Benjamin Wilson
2010Fuzzy support vector machines for age and gender classification.
Phuoc Nguyen, Trung Le, Dat Tran, Xu Huang, Dharmendra Sharma
2010GMM-UBM based open-set online speaker diarization.
Jürgen T. Geiger, Frank Wallhoff, Gerhard Rigoll
2010Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation.
Rok Gajsek, Janez Zibert, Tadej Justin, Vitomir Struc, Bostjan Vesnicer, France Mihelic
2010Gesture and speech coordination: the influence of the relationship between manual gesture and speech.
Benjamin Roustan, Marion Dohen
2010Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis.
Zhen-Hua Ling, Yu Hu, Li-Rong Dai
2010Glottal parameters estimation on speech using the zeros of the z-transform.
Nicolas Sturmel, Christophe d'Alessandro, Boris Doval
2010Glottal-based analysis of the Lombard effect.
Thomas Drugman, Thierry Dutoit
2010Graph-embedding for speaker recognition.
Zahi N. Karam, William M. Campbell
2010HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition.
Michael L. Seltzer, Alex Acero
2010HMM based TTS for mixed language text.
Zhiwei Shuang, Shiyin Kang, Yong Qin, Li-Rong Dai, Lianhong Cai
2010HMM-based automatic visual speech segmentation using facial data.
Utpala Musti, Asterios Toutios, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger
2010HMM-based prosodic structure model using rich linguistic context.
Nicolas Obin, Xavier Rodet, Anne Lacheret
2010HMM-based singing voice synthesis system using pitch-shifted pseudo training data.
Ayami Mase, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2010HMM-based text-to-articulatory-movement prediction and analysis of critical articulators.
Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi
2010Hands free audio analysis from home entertainment.
Danil Korchagin, Philip N. Garner, Petr Motlícek
2010Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversion.
Kalu U. Ogbureke, Peter Cahill, Julie Carson-Berndsen
2010Hidden logistic linear regression for support vector machine based phone verification.
Bo Li, Khe Chai Sim
2010Hierarchical bottle neck features for LVCSR.
Christian Plahl, Ralf Schlüter, Hermann Ney
2010Hierarchical classification for speech-to-speech translation.
Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010Hierarchical multilayer perceptron based language identification.
David Imseng, Mathew Magimai-Doss, Hervé Bourlard
2010Hierarchical neural net architectures for feature extraction in ASR.
Frantisek Grézl, Martin Karafiát
2010How abstract is phonetics?
Osamu Fujimura
2010How children acquire situation understanding skills?: a developmental analysis utilizing multimodal speech behavior corpus.
Shogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa
2010Identification of abnormal audio events based on probabilistic novelty detection.
Stavros Ntalampiras, Ilyas Potamitis, Nikos Fakotakis
2010Identifying articulatory goals from kinematic data using principal differential analysis.
Michael Reimer, Frank Rudzicz
2010Impact of lack of acoustic feedback in EMG-based silent speech recognition.
Matthias Janke, Michael Wand, Tanja Schultz
2010Impact of word classing on shrinkage-based language models.
Ruhi Sarikaya, Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran
2010Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model.
Miaomiao Wang, Miaomiao Wen, Keikichi Hirose, Nobuaki Minematsu
2010Improved language recognition using mixture components statistics.
Abualsoud Hanani, Michael J. Carey, Martin J. Russell
2010Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis.
Hongwei Hu, Martin J. Russell
2010Improved n-gram phonotactic models for language recognition.
Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel
2010Improved neural network based language modelling and adaptation.
Junho Park, Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2010Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features.
Shang-wen Li, Liang-Che Sun, Lin-Shan Lee
2010Improved real-time MRI of oral-velar coordination using a golden-ratio spiral view order.
Yoon-Chul Kim, Shrikanth S. Narayanan, Krishna S. Nayak
2010Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback.
Hung-yi Lee, Chia-Ping Chen, Ching-Feng Yeh, Lin-Shan Lee
2010Improved spoken term detection by feature space pseudo-relevance feedback.
Chia-Ping Chen, Hung-yi Lee, Ching-Feng Yeh, Lin-Shan Lee
2010Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision.
Man-Hung Siu, Herbert Gish, Arthur Chan, William Belfield
2010Improved training of excitation for HMM-based parametric speech synthesis.
Yoshinori Shiga, Tomoki Toda, Shinsuke Sakai, Hisashi Kawai
2010Improvement on plural unit selection and fusion.
Jian Luan, Jian Li
2010Improvements of search error risk minimization in Viterbi beam search for speech recognition.
Takaaki Hori, Shinji Watanabe, Atsushi Nakamura
2010Improvements to generalized discriminative feature transformation for speech recognition.
Roger Hsiao, Florian Metze, Tanja Schultz
2010Improvements to the equal-parameter BIC for speaker diarization.
Themos Stafylakis, Xavier Anguera
2010Improving ASR error detection with non-decoder based features.
Thomas Pellegrini, Isabel Trancoso
2010Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relations.
Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot
2010Improving Mandarin segmental duration prediction with automatically extracted syntax features.
Miaomiao Wen, Miaomiao Wang, Keikichi Hirose, Nobuaki Minematsu
2010Improving back-off models with bag of words and hollow-grams.
Benjamin Lecouteux, Raphaël Rubino, Georges Linarès
2010Improving cross database prediction of dialogue quality using mixture of experts.
Klaus-Peter Engelbrecht, Hamed Ketabdar, Sebastian Möller
2010Improving monaural speaker identification by double-talk detection.
Rahim Saeidi, Pejman Mowlaee, Tomi Kinnunen, Zheng-Hua Tan, Mads Græsbøll Christensen, Søren Holdt Jensen, Pasi Fränti
2010Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction.
Zhigang Chen, Guoping Hu, Wei Jiang
2010Improving speech synthesis of machine translation output.
Alok Parlikar, Alan W. Black, Stephan Vogel
2010Improving the readability of class lecture ASR results using a confusion network.
Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa
2010Incorporating MAP estimation and covariance transform for SVM based speaker recognition.
Cheung-Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma, Haizhou Li
2010Incorporating sparse representation phone identification features in automatic speech recognition using exponential families.
Vaibhava Goel, Tara N. Sainath, Bhuvana Ramabhadran, Peder A. Olsen, David Nahamoo, Dimitri Kanevsky
2010Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm.
Björn W. Schuller, Laurence Devillers
2010Incremental composition of static decoding graphs with label pushing.
Miroslav Novak
2010Incremental diarization of telephone conversations.
Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman
2010Incremental word learning using large-margin discriminative training and variance floor estimation.
Irene Ayllón Clemente, Martin Heckmann, Alexander Denecke, Britta Wrede, Christian Goerick
2010Influence of gestural salience on the interpretation of spoken requests.
Gideon Kowadlo, Patrick Ye, Ingrid Zukerman
2010Influence of lexical tones on intonation in Kammu.
Anastasia Karlsson, David House, Jan-Olof Svantesson, Damrong Tayanin
2010Influence of musical training on perception of L2 speech.
Makiko Sadakata, Lotte van der Zanden, Kaoru Sekiyama
2010Integrate template matching and statistical modeling for speech recognition.
Xie Sun, Yunxin Zhao
2010Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detection.
Miao Yao, Weiqian Liang
2010Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling.
Xin Chen, Yunxin Zhao
2010Integration of cache-based model and topic dependent class model with soft clustering and soft voting.
Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa
2010Integration of multilayer regression analysis with structure-based pronunciation assessment.
Masayuki Suzuki, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose
2010Intelligibility predictions for speech against fluctuating masker.
Juan-Pablo Ramirez, Hamed Ketabdar, Alexander Raake
2010Interaction of syntax-marked focus and wh-question induced focus in standard Chinese.
Yuan Jia, Aijun Li
2010Intra-frame variability as a predictor of frame classifiability.
Trond Skogstad, Torbjørn Svendsen
2010Invariant integration features combined with speaker-adaptation methods.
Florian Müller, Alfred Mertins
2010Investigating articulatory setting - pauses, ready position, and rest - using real-time MRI.
Vikram Ramanarayanan, Dani Byrd, Louis Goldstein, Shrikanth S. Narayanan
2010Investigating multiple approaches for SLU portability to a new language.
Bassam Jabaian, Laurent Besacier, Fabrice Lefèvre
2010Investigation of full-sequence training of deep belief networks for speech recognition.
Abdel-rahman Mohamed, Dong Yu, Li Deng
2010Is it possible to predict task completion in automated troubleshooters?
Alexander Schmitt, Michael Scholz, Wolfgang Minker, Jackson Liscombe, David Suendermann
2010It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality.
Sebastian Egger, Raimund Schatz, Stefan Scherer
2010Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs.
Satoshi Natori, Hiromitsu Nishizaki, Yoshihiro Sekiguchi
2010Jointly optimized discriminative features for speech recognition.
Tim Ng, Bing Zhang, Long Nguyen
2010Kinematic analysis of tongue movement control in spastic dysarthria.
Heejin Kim, Panying Rong, Torrey M. Loucks, Mark Hasegawa-Johnson
2010Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realization.
Mirjam Broersma
2010L2 experience and non-native vowel categorization of L1-Mandarin speakers.
Bo-ren Hsieh, Ho-Hsien Pan
2010Landmark-based automated pronunciation error detection.
Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat
2010Language acquisition and cross-modal associations: computational simulation of the result of infant studies.
Louis ten Bosch, Lou Boves
2010Language model cross adaptation for LVCSR system combination.
Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2010Language specific effects of emotion on phoneme duration.
Martijn Goudbeek, Mirjam Broersma
2010Language-specific influence on phoneme development: French and Drehu data.
Julia Monnin, Hélène Loevenbruck
2010Large margin Gaussian mixture models for speaker identification.
Reda Jourani, Khalid Daoudi, Régine André-Obrecht, Driss Aboutajdine
2010Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data.
Shinji Watanabe, Takaaki Hori, Atsushi Nakamura
2010Laryngeal characteristics during the production of geminate consonants.
Masako Fujimoto, Kikuo Maekawa, Seiya Funatsu
2010Laryngeal voice quality in the expression of focus.
Martti Vainio, Matti Airas, Juhani Järvikivi, Paavo Alku
2010Laryngealization and features for Chinese tonal recognition.
Kristine M. Yu
2010Latent affective mapping: a novel framework for the data-driven analysis of emotion in text.
Jerome R. Bellegarda
2010Latent perceptual mapping: a new acoustic modeling framework for speech recognition.
Shiva Sundaram, Jerome R. Bellegarda
2010Learning a language model from continuous speech.
Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara
2010Learning from human errors: prediction of phoneme confusions based on modified ASR training.
Bernd T. Meyer, Birger Kollmeier
2010Learning naturally spoken commands for a robot.
Anja Austermann, Seiji Yamada, Kotaro Funakoshi, Mikio Nakano
2010Learning new word pronunciations from spoken examples.
Ibrahim Badr, Ian McGraw, James R. Glass
2010Learning speaker normalization using semisupervised manifold alignment.
Andrew R. Plummer, Mary E. Beckman, Mikhail Belkin, Eric Fosler-Lussier, Benjamin Munson
2010Learning words and speech units through natural interactions.
Jonas Hörnstein, José Santos-Victor
2010Lecture speech recognition by combining word graphs of various acoustic models.
Tetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Katoh
2010Lecture subtopic retrieval by retrieval keyword expansion using subordinate concept.
Noboru Kanedera, Tetsuo Funada, Seiichi Nakagawa
2010Level of interest sensing in spoken dialog using multi-level fusion of acoustic and lexical evidence.
Je Hun Jeon, Rui Xia, Yang Liu
2010Lexical entrainment of real users in the Let's Go spoken dialog system.
Gabriel Parent, Maxine Eskénazi
2010Lightly supervised recognition for automatic alignment of large coherent speech recordings.
Norbert Braunschweiler, Mark J. F. Gales, Sabine Buchholz
2010Linguistic rhythm in foreign accent.
Jiahong Yuan
2010Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract model.
Adam C. Lammert, Louis Goldstein, Khalil Iskarous
2010Long short-term memory networks for noise robust speech recognition.
Martin Wöllmer, Yang Sun, Florian Eyben, Björn W. Schuller
2010Longitudinal changes of selected voice source parameters.
Hideki Kasuya, Hajime Yoshida, Satoshi Ebihara, Hiroki Mori
2010Looking for relevant features for speaker role recognition.
Benjamin Bigot, Julien Pinquier, Isabelle Ferrané, Régine André-Obrecht
2010Low-dimensional space transforms of posteriors in speech recognition.
Jan Zelinka, Jan Trmal, Ludek Müller
2010MAP estimation of subspace transform for speaker recognition.
Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung, Haizhou Li
2010Machine learning for text selection with expressive unit-selection voices.
Dominic Espinosa, Michael White, Eric Fosler-Lussier, Chris Brew
2010Mandarin digit recognition assisted by selective tone distinction.
Xiaodong Wang, Kunihiko Owa, Makoto Shozakai
2010Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram.
Yow-Bang Wang, Lin-Shan Lee
2010Manipulating tracheoesophageal speech.
R. J. J. H. van Son, Irene Jacobi, Frans J. M. Hilgers
2010Mask estimation in non-stationary noise environments for missing feature based robust speech recognition.
Shirin Badiezadegan, Richard C. Rose
2010Masking of vowel-analog transitions by vowel-analog distracters.
Pierre L. Divenyi
2010Masking property based microphone array post-filter design.
Ning Cheng, Wenju Liu, Lan Wang
2010Maximum a posteriori voice conversion using sequential Monte Carlo methods.
Elina Helander, Hanna Silén, Joaquín Míguez, Moncef Gabbouj
2010Maximum lexical cohesion for fine-grained news story segmentation.
Zihan Liu, Lei Xie, Wei Feng
2010Measuring basic tempo across languages and some implications for speech rhythm.
Gertraud Fenk-Oczlon, August Fenk
2010Mechanical vocal-tract models for speech dynamics.
Takayuki Arai
2010Melody pitch estimation based on range estimation and candidate extraction using harmonic structure model.
Seokhwan Jo, Sihyun Joo, Chang D. Yoo
2010Memory-based active learning for French broadcast news.
Frédéric Tantini, Christophe Cerisara, Claire Gardent
2010Methods for robust speech recognition in reverberant environments: a comparison.
Rico Petrick, Thomas Fehér, Masashi Unoki, Rüdiger Hoffmann
2010Metric subspace indexing for fast spoken term detection.
Taisuke Kaneko, Tomoyosi Akiba
2010Minimally invasive surgery for spoken dialog systems.
David Suendermann, Jackson Liscombe, Roberto Pieraccini
2010Modal analysis of vocal fold vibrations using laryngotopography.
Ken-Ichi Sakakibara, Hiroshi Imagawa, Miwako Kimura, Hisayuki Yokonishi, Niro Tayama
2010Model synthesis for band-limited speech recognition.
Yongjun He, Jiqing Han
2010Modeling liaison in French by using decision trees.
Josafá de Jesus Aguiar Pontes, Sadaoki Furui
2010Modeling of sentence-medial pauses in Bangla readout speech: occurrence and duration.
Shyamal Kr. Das Mandal, Arup Saha, Tulika Basu, Keikichi Hirose, Hiroya Fujisaki
2010Modeling perceived vocal age in American English.
James D. Harnsberger, Rahul Shrivastav, W. S. Brown Jr.
2010Modeling posterior probabilities using the linear exponential family.
Peder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey
2010Modeling pronunciation variation with context-dependent articulatory feature decision trees.
Samuel R. Bowman, Karen Livescu
2010Modelling speech line spectral frequencies with Dirichlet mixture models.
Zhanyu Ma, Arne Leijon
2010Modelling the effect of speaker familiarity and noise on infant word recognition.
Christina Bergmann, Michele Gubian, Lou Boves
2010Modified spatial audio object coding scheme with harmonic extraction and elimination structure for interactive audio service.
Jihoon Park, Kwang-Ki Kim, Jeongil Seo, Minsoo Hahn
2010Morphological and predictability effects on schwa reduction: the case of Dutch word-initial syllables.
Iris Hanique, Barbara Schuppler, Mirjam Ernestus
2010Multi resolution discriminative models for subvocalic speech recognition.
Mark Raugas, Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan
2010Multi-channel iterative dereverberation based on codebook constrained iterative multi-channel Wiener filter.
Ajay Srinivasamurthy, Thippur V. Sreenivas
2010Multi-class and hierarchical SVMs for emotion recognition.
Ali Hassan, Robert I. Damper
2010Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics.
Tianyu T. Wang, Thomas F. Quatieri
2010MultiBIC: an improved speaker segmentation technique for TV shows.
Paula Lopez-Otero, Laura Docío Fernández, Carmen García-Mateo
2010Multichannel noise reduction using low order RTF estimate.
Subhojit Chakladar, Nam Soo Kim, Yu Gwang Jin, Tae Gyoon Kang
2010Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model.
Tomohiro Nakatani, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto
2010Multimodal dialog in the car: combining speech and turn-and-push dial to control comfort functions.
Sandro Castronovo, Angela Mahr, Margarita Pentcheva, Christian A. Müller
2010Multimodal speaker diarization using oriented optical flow histograms.
Mary Tai Knox, Gerald Friedland
2010Multivariate analysis of vocal fatigue in continuous reading.
Marie-José Caraty, Claude Montacié
2010Mutual information analysis for feature and sensor subset selection in surface electromyography based speech recognition.
Vivek Kumar Rangarajan Sridhar, Rohit Prasad, Prem Natarajan
2010Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems.
Ian R. Lane, Alex Waibel
2010Native and non-native speaker judgements on the quality of synthesized speech.
Anna C. Janska, Robert A. J. Clark
2010Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems.
Filip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve J. Young
2010Near field sound source localization based on cross-power spectrum phase analysis with multiple microphones.
Kohei Hayashida, Masanori Morise, Takanobu Nishiura
2010New insights into subspace noise tracking.
Mahdi Triki
2010New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules.
Ramón López-Cózar, David Griol
2010Noise robust voice activity detection using features extracted from the time-domain autocorrelation function.
Houman Ghaemmaghami, Brendan Baker, Robert Vogt, Sridha Sridharan
2010Non-audible murmur recognition based on fusion of audio and visual streams.
Panikos Heracleous, Norihiro Hagita
2010Non-linear predictive vector quantization of feature vectors for distributed speech recognition.
José Enrique García Laínez, Alfonso Ortega, Antonio Miguel, Eduardo Lleida
2010Non-negative matrix factorization based compensation of music for automatic speech recognition.
Bhiksha Raj, Tuomas Virtanen, Sourish Chaudhuri, Rita Singh
2010Nonlinear enhancement of onset for robust speech recognition.
Chanwoo Kim, Richard M. Stern
2010Novel probabilistic control of noise reduction for improved microphone array beamforming.
Jungpyo Hong, Seung Ho Han, Sangbae Jeong, Minsoo Hahn
2010Novel weighting scheme for unsupervised language model adaptation using latent Dirichlet allocation.
Md. Akmal Haidar, Douglas D. O'Shaughnessy
2010Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian.
Tomás Dubeda, Katalin Mády
2010Numerical study of turbulent flow-induced sound production in presence of a tooth-shaped obstacle: towards sibilant [s] physical modeling.
Julien Cisonni, Kazunori Nozaki, Annemie Van Hirtum, Shigeo Wada
2010Observation uncertainty measures for sparse imputation.
Jort F. Gemmeke, Ulpu Remes, Kalle J. Palomäki
2010On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speech.
Claudio Garretón, Néstor Becerra Yoma
2010On evaluation of the f
Keiichi Funaki
2010On generating Combilex pronunciations via morphological analysis.
Korin Richmond, Robert A. J. Clark, Susan Fitt
2010On speaker adaptive training of artificial neural networks.
Jan Trmal, Jan Zelinka, Ludek Müller
2010On the automatic ToBI accent type identification from data.
César González Ferreras, Carlos Vivaracho-Pascual, David Escudero Mancebo, Valentín Cardeñoso-Payo
2010On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonances.
Pirros Tsiakoulis, Alexandros Potamianos
2010On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition.
Volker Leutnant, Reinhold Haeb-Umbach
2010On the importance of glottal flow spectral energy for the recognition of emotions in speech.
Ling He, Margaret Lech, Nicholas B. Allen
2010On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measures.
Yen-Liang Shue, Gang Chen, Abeer Alwan
2010On the potential of channel selection for recognition of reverberated speech with multiple microphones.
Martin Wolf, Climent Nadeu
2010On the potential of glottal signatures for speaker recognition.
Thomas Drugman, Thierry Dutoit
2010On the relation of Bayes risk, word error, and word posteriors in ASR.
Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney
2010On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification.
Rong Zheng, Bo Xu
2010On using Gaussian mixture model for double-talk detection in acoustic echo suppression.
Ji-Hyun Song, Kyu-Ho Lee, Yun-Sik Park, Sang-Ick Kang, Joon-Hyuk Chang
2010On using missing-feature theory with cepstral features - approximations to the multivariate integral.
Frank Seide, Pei Zhao
2010On using voice source measures in automatic gender classification of children's speech.
Gang Chen, Xue Feng, Yen-Liang Shue, Abeer Alwan
2010On-demand language model interpolation for mobile speech input.
Brandon Ballinger, Cyril Allauzen, Alexander Gruenstein, Johan Schalkwyk
2010On-the-fly lattice rescoring for real-time automatic speech recognition.
Hasim Sak, Murat Saraclar, Tunga Güngör
2010One-model speech recognition and synthesis based on articulatory movement HMMs.
Tsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada
2010Online Gaussian process for nonstationary speech separation.
Hsin-Lung Hsieh, Jen-Tzung Chien
2010Online SLU model adaptation with a partial oracle.
Pierre Gotab, Géraldine Damnati, Frédéric Béchet, Lionel Delphin-Poulat
2010Online adaptive learning for speech recognition decoding.
Jeff A. Bilmes, Hui Lin
2010Optimising a handcrafted dialogue system design.
Romain Laroche, Ghislain Putois, Philippe Bretier
2010Optimizing spoken dialogue management with fitted value iteration.
Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin
2010Oriented PCA method for blind speech separation of convolutive mixtures.
Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy
2010Overlap detection for speaker diarization by fusing spectral and spatial features.
Martin Zelenák, Carlos Segura, Javier Hernando
2010PDF-optimized LSF vector quantization based on beta mixture models.
Zhanyu Ma, Arne Leijon
2010Parallel lexical-tree based LVCSR on multi-core processors.
Naveen Parihar, Ralf Schlüter, David Rybach, Eric A. Hansen
2010Parallel processing of interruptions and feedback in companions affective dialogue system.
Jaakko Hakulinen, Markku Turunen, Raúl Santos de la Cámara, Nigel T. Crook
2010Parallel training of neural networks for speech recognition.
Karel Veselý, Lukás Burget, Frantisek Grézl
2010Parameters describing multimodal interaction - definitions and three usage scenarios.
Christine Kühnel, Benjamin Weiss, Sebastian Möller
2010Paraphrase generation to improve text-to-speech synthesis.
Ghislain Putois, Jonathan Chevelu, Cédric Boidin
2010Perception of Estonian vowel categories by native and non-native speakers.
Lya Meister, Einar Meister
2010Perception of voiceless fricatives by Japanese listeners of advanced and intermediate level English proficiency.
Hinako Masuda, Takayuki Arai
2010Perception on pitch reset at discourse boundaries.
Hsin-Yi Lin, Janice Fon
2010Perception-based automatic approximation of F0 contours in Cantonese speech.
Yujia Li, Tan Lee
2010Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processing.
Amy V. Beeston, Guy J. Brown
2010Perceptual wavelet decomposition for speech segmentation.
Mariusz Ziólko, Jakub Galka, Bartosz Ziólko, Tomasz Drwiega
2010Performance estimation of noisy speech recognition considering recognition task complexity.
Takeshi Yamada, Tomohiro Nakajima, Nobuhiko Kitawaki, Shoji Makino
2010Performance estimation of reverberant speech recognition based on reverberant criteria RSR-d
Takahiro Fukumori, Masanori Morise, Takanobu Nishiura
2010Phase equalization-based autoregressive model of speech signals.
Sadao Hiroya, Takemi Mochida
2010Phone boundary detection using sample-based acoustic parameters.
You-yu Lin, Yih-Ru Wang, Yuan-Fu Liao
2010Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer.
Chang Woo Han, Shin Jae Kang, Chul Min Lee, Nam Soo Kim
2010Phoneme classification and lattice rescoring based on a k-NN approach.
Ladan Golipour, Douglas D. O'Shaughnessy
2010Phoneme lattice based TextTiling towards multilingual story segmentation.
Xiaoxuan Wang, Lei Xie, Bin Ma, Engsiong Chng, Haizhou Li
2010Phonetic imitation of Japanese vowel devoicing.
Kuniko Y. Nielsen
2010Phonetic realization of second occurrence focus in Japanese.
Satoshi Nambu, Yong-Cheol Lee
2010Phonetic segmentation of singing voice using MIDI and parallel speech.
Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua
2010Phonetic subspace mixture model for speaker diarization.
I-Fan Chen, Shih-Sian Cheng, Hsin-Min Wang
2010Phrase alignment confidence for statistical machine translation.
Sankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan
2010Phrase-medial vowel devoicing in spontaneous French.
Francisco Torreira, Mirjam Ernestus
2010Physics of body-conducted silent speech - production, propagation and representation of non-audible murmur.
Makoto Otani, Tatsuya Hirahara
2010Pitch determination using autocorrelation function in spectral domain.
M. Shahidur Rahman, Tetsuya Shimamura
2010Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks.
Feng Huang, Tan Lee
2010Pitch similarity in the vicinity of backchannels.
Mattias Heldner, Jens Edlund, Julia Hirschberg
2010Positional variability of pitch accents in Czech.
Tomás Dubeda
2010Post-aspiration in standard Italian: some first cross-regional acoustic evidence.
Mary Stevens, John Hajek
2010Pre- and short-term posttreatment vocal functioning in patients with advanced head and neck cancer treated with concomitant chemoradiotherapy.
Irene Jacobi, Lisette van der Molen, Maya van Rossum, Frans J. M. Hilgers
2010Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties.
Barbara Schuppler, Mirjam Ernestus, Wim A. van Dommelen, Jacques C. Koreman
2010Predicting unseen articulations from multi-speaker articulatory models.
Gopal Ananthakrishnan, Pierre Badin, Julián Andrés Valdés Vargas, Olov Engwall
2010Predicting word accuracy for the automatic speech recognition of non-native speech.
Su-Youn Yoon, Lei Chen, Klaus Zechner
2010Prior information for rapid speaker adaptation.
Catherine Breslin, K. K. Chin, Mark J. F. Gales, Kate M. Knill, Haitian Xu
2010Probabilistic integration of joint density model and speaker model for voice conversion.
Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu
2010Probabilistic state clustering using conditional random field for context-dependent acoustic modelling.
Khe Chai Sim
2010Production and perception of Vietnamese short vowels in V1V2 context.
Viet Son Nguyen, Eric Castelli, René Carré
2010Prominence based scoring of speech segments for automatic speech-to-speech summarization.
Sree Harsha Yella, Vasudeva Varma, Kishore Prahallad
2010Prominence detection in Swedish using syllable correlates.
Samer Al Moubayed, Jonas Beskow
2010Prosodic grouping and relative clause disambiguation in Mandarin.
Jianjing Kuang
2010Prosodic speaker verification using subspace multinomial models with intersession compensation.
Marcel Kockmann, Lukás Burget, Ondrej Glembek, Luciana Ferrer, Jan Cernocký
2010Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillator.
Michael C. Brady
2010Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual information.
Chao-Hong Liu, Chung-Hsien Wu
2010Prosody and voice quality of vocal social signals: the case of dominance in scenario meetings.
Marcela Charfuelan, Marc Schröder, Ingmar Steiner
2010Prosody cues for classification of the discourse particle "hã" in Hindi.
Sankalan Prasad, Kalika Bali
2010Prosody for the eyes: quantifying visual prosody using guided principal component analysis.
Erin Cvejic, Jeesun Kim, Chris Davis, Guillaume Gibert
2010Psychological evaluation of a group communication activation robot in a party game.
Yoichi Matsuyama, Shinya Fujie, Hikaru Taniyama, Tetsunori Kobayashi
2010Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditions.
Seyed Omid Sadjadi, Sanjay A. Patil, John H. L. Hansen
2010Quality-based playout buffering with FEC for conversational VoIP.
Qipeng Gong, Peter Kabal
2010Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples.
Chi-Chun Lee, Matthew Black, Athanasios Katsamanis, Adam C. Lammert, Brian R. Baucom, Andrew Christensen, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010Quantized HMMs for low footprint text-to-speech synthesis.
Alexander Gutkin, Xavi Gonzalvo, Stefan Breuer, Paul Taylor
2010Rapid bootstrapping of five Eastern European languages using the rapid language adaptation toolkit.
Ngoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanja Schultz
2010Rapid development of speech translation using consecutive interpretation.
Matthias Paulik, Alex Waibel
2010Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis.
Michael I. Proctor, Daniel Bone, Athanasios Katsamanis, Shrikanth S. Narayanan
2010Real-life emotion-related states detection in call centers: a cross-corpora study.
Laurence Devillers, Christophe Vaudable, Clément Chastagnol
2010Recognition of spontaneous conversational speech using long short-term memory phoneme predictions.
Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll
2010Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients.
Cong-Thanh Do, Dominique Pastor, Gaël Le Lan, André Goalic
2010Recurrent neural network based language model.
Tomás Mikolov, Martin Karafiát, Lukás Burget, Jan Cernocký, Sanjeev Khudanpur
2010Redescribing intonational categories with functional data analysis.
Margaret Zellers, Michele Gubian, Brechtje Post
2010Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman method.
Wenye Ma, Meng Yu, Jack Xin, Stanley J. Osher
2010Reduction of broadband noise in speech signals by multilinear subspace analysis.
Yusuke Sato, Tetsuya Hoya, Hovagim Bakardjian, Andrzej Cichocki
2010Regularized-MLLR speaker adaptation for computer-assisted language learning system.
Dean Luo, Yu Qiao, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose
2010Reinforced blocking matrix with cross channel projection for speech enhancement.
Inho Lee, Jongsung Yoon, Yoonjae Lee, Hanseok Ko
2010Reliable tracking based on speech sample salience of vocal cycle length perturbations.
Christophe Mertens, Francis Grenez, Lise Crevier-Buchman, Jean Schoentgen
2010Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database.
Daniel Felps, Christian Geng, Michael Berger, Korin Richmond, Ricardo Gutierrez-Osuna
2010Repair strategies on trial: which error recovery do users like best?
Alexander Zgorzelski, Alexander Schmitt, Tobias Heinroth, Wolfgang Minker
2010Resources for turn competition in overlap in multi-party conversations: speech rate, pausing and duration.
Emina Kurtic, Guy J. Brown, Bill Wells
2010Restructuring exponential family mixture models.
Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen
2010Revisiting VTLN using linear transformation on conventional MFCC.
Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney
2010Rhythm and formant features for automatic alcohol detection.
Florian Schiel, Christian Heinrich, Veronika Neumeyer
2010Robust and efficient pitch estimation using an iterative ARMA technique.
Jung Ook Hong, Patrick J. Wolfe
2010Robust automatic speech recognition with decoder oriented ideal binary mask estimation.
Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
2010Robust mixture modeling using t-distribution: application to speaker ID.
Sundar Harshavardhan, Thippur V. Sreenivas
2010Robust noise estimation using minimum correction with harmonicity control.
Xuejing Sun, Kuan-Chieh Yen, Rogerio Guedes Alves
2010Robust statistical voice activity detection using a likelihood ratio sign test.
Shiwen Deng, Jiqing Han
2010Robust voice activity detection in stereo recording with crosstalk.
Prasanta Kumar Ghosh, Andreas Tsiartas, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2010Robust word recognition using articulatory trajectories and gestures.
Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, Louis Goldstein
2010Role of language models in spoken fluency evaluation.
Om Deshmukh, Harish Doddala, Ashish Verma, Karthik Visweswariah
2010Roles of the average voice in speaker-adaptive HMM-based speech synthesis.
Junichi Yamagishi, Oliver Watts, Simon King, Bela Usabaev
2010Round-robin discrimination model for reranking ASR hypotheses.
Takanobu Oba, Takaaki Hori, Atsushi Nakamura
2010Russian infants and children's sounds and speech corpuses for language acquisition studies.
Elena E. Lyakso, Olga V. Frolova, Anna V. Kurazhova, Julia S. Gaikova
2010SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech.
Wei Chu, Abeer Alwan
2010SCARF: a segmental conditional random field toolkit for speech recognition.
Geoffrey Zweig, Patrick Nguyen
2010SEAME: a Mandarin-English code-switching speech corpus in South-East Asia.
Dau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li
2010SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environment.
Ji Hun Park, Seon Man Kim, Jae Sam Yoon, Hong Kook Kim, Sung Joo Lee, Yunkeun Lee
2010Say it as you mean it - analyzing free user comments in the VOICE awards corpus.
Florian Gödde, Sebastian Möller
2010Say what? why users choose to speak their web queries.
Maryam Kamvar, Doug Beeferman
2010Score-level compensation of extreme speech duration variability in speaker verification.
Sergio Perez-Gomez, Daniel Ramos, Javier Gonzalez-Dominguez, Joaquin Gonzalez-Rodriguez
2010Search by voice in Mandarin Chinese.
Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche, Pedro J. Moreno
2010Selecting phonotactic features for language recognition.
Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng
2010Selective gammatone filterbank feature for robust sound event recognition.
Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li
2010Semantic facilitation in bilingual everyday speech comprehension.
Marco van de Ven, Benjamin V. Tucker, Mirjam Ernestus
2010Semi-automated update of automatic transcription system for the Japanese national congress.
Yuya Akita, Masato Mimura, Graham Neubig, Tatsuya Kawahara
2010Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition.
Khe Chai Sim, Shilin Liu
2010Semi-supervised extractive speech summarization via co-training algorithm.
Shasha Xie, Hui Lin, Yang Liu
2010Semi-supervised learning for improved expression of uncertainty in discriminative classifiers.
Jonathan Malkin, Jeff A. Bilmes
2010Semi-supervised part-of-speech tagging in speech applications.
Richard Dufour, Benoît Favre
2010Semi-supervised training of Gaussian mixture models by conditional entropy minimization.
Jui-Ting Huang, Mark Hasegawa-Johnson
2010Session variability contrasts in the MARP corpus.
Keith W. Godin, John H. L. Hansen
2010Setup for acoustic-visual speech synthesis by concatenating bimodal units.
Asterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte, Brigitte Wrobel-Dautcourt, Marie-Odile Berger
2010Shape-invariant speech transformation with the phase vocoder.
Axel Röbel
2010Shrinkage model adaptation in automatic speech recognition.
Jinyu Li, Yu Tsao, Chin-Hui Lee
2010Signal interaction and the devil function.
John R. Hershey, Peder A. Olsen, Steven J. Rennie
2010Signal-based accent and phrase marking using the Fujisaki model.
Hussein Hussein, Rüdiger Hoffmann
2010Significance of pitch synchronous analysis for speaker recognition using AANN models.
Sri Harish Reddy Mallidi, Kishore Prahallad, Suryakanth V. Gangashetty, B. Yegnanarayana
2010Silent vs vocalized articulation for a portable ultrasound-based silent speech interface.
Victoria M. Florescu, Lise Crevier-Buchman, Bruce Denby, Thomas Hueber, Antonia Colazo-Simon, Claire Pillot-Loiseau, Pierre Roussel-Ragot, Cédric Gendrot, Sophie Quattrocchi
2010Similar n-gram language model.
Christian Gillot, Christophe Cerisara, David Langlois, Jean Paul Haton
2010Similarity of effects of emotions on the speech organ configuration with and without speaking.
Tatsuya Kitamura
2010Similarity scoring for recognizing repeated out-of-vocabulary words.
Mirko Hannemann, Stefan Kombrink, Martin Karafiát, Lukás Burget
2010Simple and efficient speaker comparison using approximate KL divergence.
William M. Campbell, Zahi N. Karam
2010Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems.
Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino
2010Single-channel speech enhancement using Kalman filtering in the modulation domain.
Stephen So, Kamil K. Wójcicki, Kuldip K. Paliwal
2010Single-speaker/multi-speaker co-channel speech classification.
Stéphane Rossignol, Olivier Pietquin
2010Sinusoidal model parameterization for HMM-based TTS system.
Slava Shechtman, Alexander Sorin
2010Social role discovery from spoken language using dynamic Bayesian networks.
Sibel Yaman, Dilek Hakkani-Tür, Gökhan Tür
2010Sound-based assistive technology supporting "seeing", "hearing" and "speaking" for the disabled and the elderly.
Tohru Ifukube
2010Sparse auto-associative neural networks: theory and application to speech recognition.
Garimella S. V. S. Sivaram, Sriram Ganapathy, Hynek Hermansky
2010Sparse component analysis for speech recognition in multi-speaker environment.
Afsaneh Asaei, Hervé Bourlard, Philip N. Garner
2010Sparse representation features for speech recognition.
Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky, Abhinav Sethy
2010Sparse representations for text categorization.
Tara N. Sainath, Sameer Maskey, Dimitri Kanevsky, Bhuvana Ramabhadran, David Nahamoo, Julia Hirschberg
2010Speaker adaptation based on nonlinear spectral transform for speech recognition.
Toyohiro Hayashi, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
2010Speaker adaptation based on system combination using speaker-class models.
Tetsuo Kosaka, Takashi Ito, Masaharu Katoh, Masaki Kohda
2010Speaker adaptation in transformation space using two-dimensional PCA.
Yongwon Jeong, Young Rok Song, Hyung Soon Kim
2010Speaker and language adaptive training for HMM-based polyglot speech synthesis.
Heiga Zen
2010Speaker characterization using long-term and temporal information.
Chien-Lin Huang, Hanwu Sun, Bin Ma, Haizhou Li
2010Speaker diarization in meeting audio for single distant microphone.
Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li
2010Speaker recognition experiments using connectionist transformation network features.
Alberto Abad, Isabel Trancoso
2010Speaker recognition using supervised probabilistic principal component analysis.
Yun Lei, John H. L. Hansen
2010Speaker recognition using the resynthesized speech via spectrum modeling.
Xiang Zhang, Chuan Cao, Lin Yang, Hongbin Suo, Jianping Zhang, Yonghong Yan
2010Speaker tracking in an unsupervised speech controlled system.
Tobias Herbig, Franz Gerl, Wolfgang Minker
2010Speaker-dependent mapping of source and system features for enhancement of throat microphone speech.
Anand Joseph Xavier Medabalimi, Sri Harish Reddy Mallidi, B. Yegnanarayana
2010Speaker-independent HMM-based voice conversion using quantized fundamental frequency.
Takashi Nose, Takao Kobayashi
2010Speaking style dependency of formant targets.
Akiko Amano-Kusumoto, John-Paul Hosom, Alexander Kain
2010Specification in context - devoicing processes in Polish, French, American English and German sonorants.
Jagoda Sieczkowska, Bernd Möbius, Grzegorz Dogil
2010Spectral entropy-based voice activity detector for videoconferencing systems.
Bowon Lee, Debargha Mukherjee
2010Spectro-temporal modulations for robust speech emotion recognition.
Lan-Ying Yeh, Tai-Shih Chi
2010Speech categorization context effects in seven- to nine-month-old infants.
Ellen Marklund, Francisco Lacerda, Anna Ericsson
2010Speech database reduction method for corpus-based TTS system.
Mitsuaki Isogai, Hideyuki Mizuno
2010Speech dominoes and phonetic convergence.
Gérard Bailly, Amélie Lelong
2010Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering.
Kai Li, Qiang Fu, Yonghong Yan
2010Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signals.
Hiroaki Kawashima, Yu Horii, Takashi Matsuyama
2010Speech intelligibility of diagonally localized speech with competing noise using bone-conduction headphones.
Kazuhiro Kondo, Takayuki Kanda, Yosuke Kobayashi, Hiroyuki Yagyu
2010Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding.
Xiaoqiang Xiao, Robert M. Nickel
2010Speech recognition using long-term phase information.
Kazumasa Yamamoto, Eiichi Sueyoshi, Seiichi Nakagawa
2010Speech recognition with a seamlessly updated language model for real-time closed-captioning.
Toru Imai, Shinichi Homma, Akio Kobayashi, Takahiro Oku, Shoei Sato
2010Speech recognizer optimization under speed constraints.
Ivan Bulyko
2010Speech robot mimicking human articulatory motion.
Kotaro Fukui, Toshihiro Kusano, Yoshikazu Mukaeda, Yuto Suzuki, Atsuo Takanishi, Masaaki Honda
2010Speech synthesis by modeling harmonics structure with multiple function.
Toru Nakashika, Ryuki Tachibana, Masafumi Nishimura, Tetsuya Takiguchi, Yasuo Ariki
2010Speech-based automated cognitive status assessment.
Dilek Hakkani-Tür, Dimitra Vergyri, Gökhan Tür
2010Spoken English assessment system for non-native speakers using acoustic and prosodic features.
Qin Shi, Kun Li, Shilei Zhang, Stephen M. Chu, Ji Xiao, Zhijian Ou
2010Spoken document retrieval for oral presentations integrating global document similarities into local document similarities.
Hiroaki Nanjo, Yusuke Iyonaga, Takehiko Yoshimi
2010State-based labelling for a sparse representation of speech and its application to robust speech recognition.
Tuomas Virtanen, Jort F. Gemmeke, Antti Hurmalainen
2010Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases.
Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Hidehisa Nagano, Kunio Kashino
2010Statistical multi-stream modeling of real-time MRI articulatory speech data.
Erik Bresch, Athanasios Katsamanis, Louis Goldstein, Shrikanth S. Narayanan
2010Still talking to machines (cognitively speaking).
Steve J. Young
2010Strategies for statistical spoken language understanding with small amount of data - an empirical study.
Ye-Yi Wang
2010Study on interaction between entropy pruning and Kneser-Ney smoothing.
Ciprian Chelba, Thorsten Brants, Will Neveitt, Peng Xu
2010Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding.
Masatsune Tamura, Takehiko Kagoshima, Masami Akamine
2010Superwideband extension of G.718 and G.729.1 speech codecs.
Lasse Laaksonen, Mikko Tammi, Vladimir Malenovsky, Tommy Vaillancourt, Mi Suk Lee, Tomofumi Yamanashi, Masahiro Oshikiri, Claude Lamblin, Balázs Kövesi, Lei Miao, Deming Zhang, Jon Gibbs, Holly Francois
2010Syllable-level prominence detection with acoustic evidence.
Je Hun Jeon, Yang Liu
2010Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners.
Michael Pucher, Dietmar Schabus, Junichi Yamagishi
2010Synthesizing photo-real talking head via trajectory-guided sample selection.
Lijuan Wang, Xiaojun Qian, Wei Han, Frank K. Soong
2010System output combination for improved speaker diarization.
Simon Bozonnet, Nicholas W. D. Evans, Xavier Anguera, Oriol Vinyals, Gerald Friedland, Corinne Fredouille
2010Techniques for topic detection based processing in spoken dialog systems.
Rajesh Balchandran, Leonid Rachevsky, Bhuvana Ramabhadran, Miroslav Novak
2010Template-based spectral estimation using microphone array for speech recognition.
Satoshi Tamura, Eriko Hishikawa, Wataru Taguchi, Satoru Hayamizu
2010Text normalization based on statistical machine translation and internet user support.
Tim Schlippe, Chenfei Zhu, Jan Gebhardt, Tanja Schultz
2010Text-based unstressed syllable prediction in Mandarin.
Ya Li, Jianhua Tao, Meng Zhang, Shifeng Pan, Xiaoying Xu
2010Text-independent F0 transformation with non-parallel data for voice conversion.
Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li
2010The 2010 CMU GALE speech-to-text system.
Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz
2010The AMIDA 2009 meeting transcription system.
Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan
2010The CHiME corpus: a resource and a challenge for computational hearing in multisource environments.
Heidi Christensen, Jon Barker, Ning Ma, Phil D. Green
2010The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems.
Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li
2010The INTERSPEECH 2010 paralinguistic challenge.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. Müller, Shrikanth S. Narayanan
2010The NIST 2010 speaker recognition evaluation.
Alvin F. Martin, Craig S. Greenberg
2010The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms.
David Dean, Sridha Sridharan, Robert Vogt, Michael Mason
2010The RWTH 2009 Quaero ASR evaluation system for English and German.
Markus Nußbaum-Thom, Simon Wiesler, Martin Sundermeyer, Christian Plahl, Stefan Hahn, Ralf Schlüter, Hermann Ney
2010The characterization of the relative information content by spectral features for the objective intelligibility assessment of nonlinearly processed speech.
Anton Schlesinger, Marinus M. Boone
2010The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systems.
Takuya Nishimoto, Takayuki Watanabe
2010The effect of a word embedded in a sentence and speaking rate variation on the perceptual training of geminate and singleton consonant distinction.
Mee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka
2010The effect of audience familiarity on the perception of modified accent.
Jonathan Teutenberg, Catherine Inez Watson
2010The effects of EMA-based augmented visual feedback on the English speakers' acquisition of the Japanese flap: a perceptual study.
June S. Levitt, William F. Katz
2010The estimation and kernel metric of spectral correlation for text-independent speaker verification.
Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai
2010The impact of ASR on abstractive vs. extractive meeting summaries.
Gabriel Murray, Giuseppe Carenini, Raymond T. Ng
2010The influence of actual and perceived sexual orientation on diadochokinetic rate in women and men.
Benjamin Munson
2010The influence of expertise and efficiency on modality selection strategies and perceived mental effort.
Ina Wechsung, Stefan Schaffer, Robert Schleicher, Anja Naumann, Sebastian Möller
2010The interrelation between the stimulus range and the number of response categories in vowel categorization.
Titia Benders, Paola Escudero
2010The prosody of Swedish conversational grunts.
Daniel Neiberg, Joakim Gustafson
2010The relation between pitch perception preference and emotion identification.
Marie Nilsenová, Martijn Goudbeek, Luuk Kempen
2010The relevance of timing, pauses and overlaps in dialogues: detecting topic changes in scenario based meetings.
Saturnino Luz, Jing Su
2010The role of higher-level linguistic features in HMM-based speech synthesis.
Oliver Watts, Junichi Yamagishi, Simon King
2010The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion.
Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
2010The use of sense in unsupervised training of acoustic models for ASR systems.
Rita Singh, Benjamin Lambert, Bhiksha Raj
2010The use of subvector quantization and discrete densities for fast GMM computation for speaker verification.
Guoli Ye, Brian Mak
2010Time conditioned search in automatic speech recognition reconsidered.
David Nolden, Hermann Ney, Ralf Schlüter
2010Topic and style-adapted language modeling for Thai broadcast news ASR.
Markpong Jongtaveesataporn, Sadaoki Furui
2010Topic-dependent n-gram models based on optimization of context lengths in LDA.
Akira Nakamura, Satoru Hayamizu
2010Topological representation of speech for speaker recognition.
Gabriel Hernández Sierra, Jean-François Bonastre, Driss Matrouf, José R. Calvo
2010Toward aero-acoustical analysis of the sibilant /s/: an oral cavity modeling.
Kazunori Nozaki, Youhei Ohnishi, Takashi Suda, Shigeo Wada, Shinji Shimojo
2010Toward detecting voice activity employing soft decision in second-order conditional MAP.
Sang-Kyun Kim, Jae-Hun Choi, Sang-Ick Kang, Ji-Hyun Song, Joon-Hyuk Chang
2010Towards a robust face recognition system using compressive sensing.
Allen Y. Yang, Zihan Zhou, Yi Ma, Shankar Sastry
2010Towards affective state modeling in narrative and conversational settings.
Bart Jochems, Martha A. Larson, Roeland Ordelman, Ronald Poppe, Khiet P. Truong
2010Towards an ASR-free objective analysis of pathological speech.
Catherine Middag, Yvan Saeys, Jean-Pierre Martens
2010Towards long-range prosodic attribute modeling for language recognition.
Raymond W. M. Ng, Cheung-Chi Leung, Ville Hautamäki, Tan Lee, Bin Ma, Haizhou Li
2010Towards mixed language speech recognition systems.
David Imseng, Hervé Bourlard, Mathew Magimai-Doss
2010Towards spoken term discovery at scale with zero resources.
Aren Jansen, Kenneth Church, Hynek Hermansky
2010Tracter: a lightweight dataflow framework.
Philip N. Garner, John Dines
2010Training a parametric-based logF0 model with the minimum generation error criterion.
Javier Latorre, Mark J. F. Gales, Heiga Zen
2010Transcript-dependent speaker recognition using Mixer 1 and 2.
Fred S. Richardson, Joseph P. Campbell
2010Turn taking-based conversation detection by using DOA estimation.
Yohei Kawaguchi, Masahito Togami, Yasunari Obuchi
2010Turn-alignment using eye-gaze and speech in conversational interaction.
Kristiina Jokinen, Kazuaki Harada, Masafumi Nishida, Seiichi Yamamoto
2010Two new estimation methods for a superpositional intonation model.
Humberto M. Torres, Hansjörg Mixdorff, Jorge A. Gurlekian, Hartmut R. Pfitzinger
2010Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots.
Takami Yoshida, Kazuhiro Nakadai
2010Ungrounded independent non-negative factor analysis.
Bhiksha Raj, Kevin W. Wilson, Alexander Krueger, Reinhold Haeb-Umbach
2010Unscented transform with online distortion estimation for HMM adaptation.
Jinyu Li, Dong Yu, Yifan Gong, Li Deng
2010Unsupervised acoustic model adaptation for multi-origin non native ASR.
Sethserey Sam, Eric Castelli, Laurent Besacier
2010Unsupervised discovery and training of maximally dissimilar cluster models.
Françoise Beaufays, Vincent Vanhoucke, Brian Strope
2010Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model.
Kouki Miyazawa, Hideaki Kikuchi, Reiko Mazuka
2010Unsupervised model adaptation on targeted speech segments for LVCSR system combination.
Richard Dufour, Fethi Bougares, Yannick Estève, Paul Deléglise
2010Unsupervised sequential organization for cochannel speech separation.
Ke Hu, DeLiang Wang
2010Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping.
Chun-an Chan, Lin-Shan Lee
2010Unvoiced speech segregation based on CASA and spectral subtraction.
Ke Hu, DeLiang Wang
2010Using a DBN to integrate sparse classification and GMM-based ASR.
Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves
2010Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition.
Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Germán Bordel
2010Using dependency parsing and machine learning for factoid question answering on spoken documents.
Pere Comas, Jordi Turmo, Lluís Màrquez
2010Using harmonic phase information to improve ASR rate.
Ibon Saratxaga, Inma Hernáez, Igor Odriozola, Eva Navas, Iker Luengo, Daniel Erro
2010Using high-level information to detect key audio events in a tennis game.
Qiang Huang, Stephen J. Cox
2010Using non-native error patterns to improve pronunciation verification.
Joost van Doremalen, Catia Cucchiarini, Helmer Strik
2010Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech.
Gang Wang, Xiaojun Wu, Thomas Fang Zheng
2010Using prosody to improve Mandarin automatic speech recognition.
Chong-Jia Ni, Wenju Liu, Bo Xu
2010Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor quality.
Hanna Silén, Elina Helander, Jani Nurminen, Konsta Koppinen, Moncef Gabbouj
2010Using spectro-temporal features to improve AFE feature extraction for ASR.
Suman V. Ravuri, Nelson Morgan
2010Utilizing a noisy-channel approach for Korean LVCSR.
Sakriani Sakti, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010Utterance selection for speech acts in a cognitive tourguide scenario.
Felix Putze, Tanja Schultz
2010VAD-measure-embedded decoder with online model adaptation.
Tasuku Oonishi, Koji Iwano, Sadaoki Furui
2010Validation of a training method for L2 continuous-speech segmentation.
Anne Cutler, Janise Shanley
2010Variant time-frequency cepstral features for speaker recognition.
Weiqiang Zhang, Yan Deng, Liang He, Jia Liu
2010Verifying pronunciation dictionaries using conflict analysis.
Marelie H. Davel, Febe de Wet
2010Viseme-dependent weight optimization for CHMM-based audio-visual speech recognition.
Alexey Karpov, Andrey Ronzhin, Konstantin Markov, Milos Zelezný
2010Vocabulary independent spoken query: a case for subword units.
Evandro B. Gouvêa, Tony Ezzat
2010Vocal tract contour analysis of emotional speech by the functional data curve representation.
Sungbok Lee, Shrikanth S. Narayanan
2010Voice activity detection based on conditional random fields using multiple features.
Akira Saito, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
2010Voice activity detection in a regularized reproducing kernel Hilbert space.
Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura
2010Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization.
Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani
2010Voice attributes affecting likability perception.
Benjamin Weiss, Felix Burkhardt
2010Voice quality evaluation of recent open source codecs.
Anssi Rämö, Henri Toukomaa
2010Voice search for development.
Etienne Barnard, Johan Schalkwyk, Charl Johannes van Heerden, Pedro J. Moreno
2010WFST compression for automatic speech recognition.
Diamantino Caseiro
2010What do you mean, you're uncertain?: the interpretation of cue words and rising intonation in dialogue.
Catherine Lai
2010What else is new than the Hamming window? robust MFCCs for speaker recognition via multitapering.
Tomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria Hansson-Sandsten
2010When is indexical information about speech activated? evidence from a cross-modal priming experiment.
Benjamin Munson, Renata Solum
2010Wiktionary as a source for automatic pronunciation extraction.
Tim Schlippe, Sebastian Ochs, Tanja Schultz
2010Within and across sentence boundary language model.
Saeedeh Momtazi, Friedrich Faubel, Dietrich Klakow