INTERSPEECH - RankMe

678 papers

Year	Title / Authors
2012	"Help Me, I Need More User Tests!" User Simulations as Supportive Tool in the Development Process of Spoken Dialogue Systems. Florian Kretzschmar, Sebastian Möller
2012	13th Annual Conference of the International Speech Communication Association, INTERSPEECH 2012, Portland, Oregon, USA, September 9-13, 2012
2012	A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures. Takafumi Hattori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
2012	A Case Study: Detecting Counselor Reflections in Psychotherapy for Addictions using Linguistic Features. Dogan Can, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan
2012	A Comparison of Classification Paradigms for Speaker Likeability Determination. Nicholas Cummins, Julien Epps, Jia Min Karen Kua
2012	A Continuous Prominence Score Based On Acoustic Features. Jean-Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Antoine Auchlin
2012	A Conversational Movie Search System Based on Conditional Random Fields. Jingjing Liu, Scott Cyphers, Panupong Pasupat, Ian McGraw, James R. Glass
2012	A Corpus-Based Study of Interruptions in Spoken Dialogue. Agustín Gravano, Julia Hirschberg
2012	A Correlational Discriminant Approach to Feature Extraction for Robust Speech Recognition. Vikrant Singh Tomar, Richard C. Rose
2012	A Data-driven Approach to Understanding Spoken Route Directions in Human-Robot Dialogue. Raveesh Meena, Gabriel Skantze, Joakim Gustafson
2012	A Discriminative Classification-Based Approach to Information State Updates for a Multi-Domain Dialog System. Dilek Hakkani-Tür, Gökhan Tür, Larry P. Heck, Ashley Fidler, Asli Celikyilmaz
2012	A Fast-Converging Adaptive Frequency-Domain MVDR Beamformer for Speech Enhancement. Shengkui Zhao, Douglas L. Jones
2012	A Feature Space Transformation Method for Personalization using Generalized I-Vector Clustering. Kaisheng Yao, Yifan Gong, Chaojun Liu
2012	A Frame Pruning Approach for Paralinguistic Recognition Tasks. Johannes Wagner, Florian Lingenfelser, Elisabeth André
2012	A HMM approach to residual estimation for high resolution voice conversion. Winston S. Percybrooks, Elliot Moore
2012	A Hierarchical Bayesian Approach for Semi-supervised Discriminative Language Modeling. Yik-Cheung Tam, Paul Vozila
2012	A Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition. Yeming Xiao, Zhen Zhang, Shang Cai, Jielin Pan, Yonghong Yan
2012	A Natural In-Car Speech Interface to Internet Services Using Hybrid ASR. Hansjörg Hofmann, Ute Ehrlich, Klaus Bader, Ilona Nothelfer, André Berton
2012	A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody. Chiao-Hua Hsieh, Chen-Yu Chiang, Yih-Ru Wang, Hsiu-Min Yu, Sin-Horng Chen
2012	A Non-Uniform Filterbank for Speaker Recognition. Jia Min Karen Kua, Tharmarajah Thiruvaran, Eliathamby Ambikairajah
2012	A Novel Confidence Measure Based on Context Consistency for Spoken Term Detection. Haiyang Li, Jiqing Han, Tieran Zheng, Guibin Zheng
2012	A Preliminary Study on Cross-Databases Emotion Recognition using the Glottal Features in Speech. Rui Sun, Elliot Moore II
2012	A Random, Semantically Appropriate Sentence Generator for Speaker Verification. Jason Lilley, Amanda Stent, Ilija Zeljkovic
2012	A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora Evaluation. Daniel Bone, Chi-Chun Lee, Shrikanth S. Narayanan
2012	A Rule Based Pronunciation Generator and Regional Accent Databank for Portuguese. Simone Ashby, Sílvia Barbosa, Silvia Brandão, José Pedro Ferreira, Maarten Janssen, Catarina Silva, Mário Eduardo Viaro
2012	A Self-Learning Assistive Vocal Interface Based on Vocabulary Learning and Grammar Induction. Jort F. Gemmeke, Janneke van de Loo, Guy De Pauw, Joris Driesen, Hugo Van hamme, Walter Daelemans
2012	A Sequential Bayesian Dialog Agent for Computational Ethnography. Abe Kazemzadeh, James Gibson, Juanchen Li, Sungbok Lee, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2012	A Simple Hybrid Acoustic / Morphologically-Constrained Technique for the Synthesis of Stop Consonants in Various Vocalic Contexts. Frédéric Berthommier, Laurent Girin, Louis-Jean Boë
2012	A Sparse Plus Low Rank Maximum Entropy Language Model. Brian Hutchinson, Mari Ostendorf, Maryam Fazel
2012	A Specialized WFST Approach for Class Models and Dynamic Vocabulary. Paul R. Dixon, Chiori Hori, Hideki Kashioka
2012	A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components. Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Kunio Kashino
2012	A Study of Mutual Information for GMM-Based Spectral Conversion. Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen
2012	A Study on Using Word-Level HMMs to Improve ASR Performance over State-of-the-Art Phone-Level Acoustic Modeling for LVCSR. I-Fan Chen, Chin-Hui Lee
2012	A Triple-Microphone Real-Time Speech Enhancement Algorithm Based on Approximate Array Analytical Solutions. Meng Yu, Ryan Ritch, Jack Xin
2012	A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition. Bo Li, Khe Chai Sim
2012	A Two-step NMF Based Algorithm for Single Channel Speech Separation. Shuo Wang, Wenjun Wu
2012	A Weighted Combination of Speech with Text-based Models for Arabic Diacritization. Aisha S. Azim, Xiaoxuan Wang, Khe Chai Sim
2012	A comparative study of adaptive, automatic recognition of disordered speech. Heidi Christensen, Stuart P. Cunningham, Charles Fox, Phil D. Green, Thomas Hain
2012	A factorized representation of FMLLR transform based on QR-decomposition. Shakti P. Rath, Martin Karafiát, Ondrej Glembek, Jan Cernocký
2012	A full-band adaptive harmonic representation of speech. Gilles Degottex, Yannis Stylianou
2012	A method of speaker identification based on phoneme mean F-ratio contribution. Songgun Hyon, Hongcui Wang, Chen Zhao, Jianguo Wei, Jianwu Dang
2012	A methodology for the study of rhythm in drummed forms of languages: application to Bora Manguaré of Amazon. Julien Meyer, Laure Dentel, Frank Seifart
2012	A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio masking. Shan Liang, Wei Jiang, Wenju Liu
2012	A signal-separation-based array postfilter for distant speech recognition. Rita Singh, Ken'ichi Kumatani, John W. McDonough, Chen Liu
2012	A simple and efficient method to align very long speech signals to acoustically imperfect transcriptions. Germán Bordel, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Amparo Varona
2012	A speaker-role based approach for detecting politicians in TV broadcast news. Delphine Charlet, Géraldine Damnati
2012	A speech parameter generation algorithm using local variance for HMM-based speech synthesis. Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
2012	A tutorial dialogue system with unrestricted spoken input. Peter Bell, Myroslava O. Dzikovska, Amy Isard
2012	Accelerated Batch Learning of Convex Log-linear Models for LVCSR. Simon Wiesler, Ralf Schlüter, Hermann Ney
2012	Accentual Transfer from Swiss-German to French. A Study of "Français Fédéral". Mathieu Avanzi, Pauline Dubosson, Sandra Schwab, Nicolas Obin
2012	Accounting for Speech Rate in Spoken Word Recognition. David Cheng-Huan Li, Elsi Kaiser
2012	Acoustic Cues of Vowel Quality to Coda Nasal Perception in Southern Min. Ying Chen, Vsevolod Kapatsinski, Susan Guion-Anderson
2012	Acoustic Feature-based Non-scorable Response Detection for an Automated Speaking Proficiency Assessment. Je Hun Jeon, Su-Youn Yoon
2012	Acoustic Features for Classification Based Speech Separation. Yuxuan Wang, Kun Han, DeLiang Wang
2012	Acoustic and Data-driven Features for Robust Speech Activity Detection. Samuel Thomas, Sri Harish Reddy Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab A. Shamma, Tim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas
2012	Acoustic and Perceptual Similarity in Coarticulatorily Nasalized Vowels. Rebecca Scarborough, Georgia Zellou
2012	Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition. Zixing Zhang, Björn W. Schuller
2012	Addressing Confusions in Spoken Language in ESL Pronunciation Tutors. Oscar Saz, Maxine Eskénazi
2012	Advances in combined electro-optical palatography. Peter Birkholz, Philippe Daechert, Christiane Neuschaefer-Rube
2012	Advances in noise robust digit recognition using hybrid exemplar-based techniques. Jort F. Gemmeke, Hugo Van hamme
2012	Age Estimation from Telephone Speech using i-vectors. Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van hamme, David A. van Leeuwen
2012	Aligning manifolds to model the earliest phonological abstraction in infant-caretaker vocal imitation. Andrew R. Plummer
2012	Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth? Niko Moritz, Jörn Anemüller, Birger Kollmeier
2012	Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis. Zhengqi Wen, Jianhua Tao
2012	An Auditory Inspired Multimodal Framework for Speech Enhancement. Majid Mirbagheri, Sahar Akram, Shihab A. Shamma
2012	An Automatic Child-Directed Speech Detector for the Study of Child Language Development. Soroush Vosoughi, Deb Roy
2012	An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis. Shinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai, Sakriani Sakti, Satoshi Nakamura
2012	An Information-Extraction Approach to Speech Analysis and Processing. Chin-Hui Lee
2012	An MRI study of the oral articulation of European Portuguese nasal vowels. Catarina Oliveira, Paula Martins, Samuel S. Silva, António J. S. Teixeira
2012	An On-Line, Cloud-Based Spanish-Spanish Sign Language Translation System. Javier Tejedor, Fernando J. López-Colino, Jordi Porta, José Colás
2012	An Online Generated Transducer to Increase Dialog Manager Coverage. Joaquin Planells, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra
2012	An alignment matching method to explore pseudosyllable properties across different corpora. Raymond W. M. Ng, Thomas Hain, Keikichi Hirose
2012	Analysis of Mimicry Speech. D. Gomathi, Sathya Adithya Thati, Karthik Venkat Sridaran, Bayya Yegnanarayana
2012	Analysis of Temporal Resolution in Frequency Domain Linear Prediction. Sriram Ganapathy, Hynek Hermansky
2012	Analysis of speaker clustering strategies for HMM-based speech synthesis. Rasmus Dall, Christophe Veaux, Junichi Yamagishi, Simon King
2012	Analysis of the Characteristics of Talk-show TV Programs. Fabio Brugnara, Daniele Falavigna, Diego Giuliani, Roberto Gretter
2012	Analysis of vocal tremor and jitter by empirical mode decomposition of glottal cycle length time series. Christophe Mertens, Francis Grenez, Jean Schoentgen
2012	Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech Synthesis. Ranniery Maia
2012	Analyzing and Interpreting Automatically Learned Rules Across Dialects. Nancy F. Chen, Wade Shen, Joseph P. Campbell
2012	Anchor Models and WCCN Normalization For Speaker Trait Classification. Yazid Attabi, Pierre Dumouchel
2012	Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus. Fabio Valente, Samuel Kim, Petr Motlícek
2012	Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition. Navdeep Jaitly, Patrick Nguyen, Andrew W. Senior, Vincent Vanhoucke
2012	Application of Structural Events Detected on ASR Outputs for Automated Speaking Assessment. Lei Chen, Su-Youn Yoon
2012	Applying multiview learning algorithms to human-human conversation classification. Sokol Koço, Cécile Capponi, Frédéric Béchet
2012	Arabic Dialect Identification - 'Is the Secret in the Silence?' and Other Observations. Hynek Boril, Abhijeet Sangwan, John H. L. Hansen
2012	Are Sparse Representations Rich Enough for Acoustic Modeling? Oriol Vinyals, Li Deng
2012	Articulatory Feature based Multilingual MLPs for Low-Resource Speech Recognition. Yanmin Qian, Jia Liu
2012	Articulatory Strategies in Obstruent Production in Mandarin Esophageal Speech. Fang Hu, Yungang Wu, Wen Xu, Demin Han
2012	Articulatory VCV Synthesis from EMA Data. Asterios Toutios, Shinji Maeda
2012	Articulatory differences between oral and nasal vowels based on the simulation of a speaker-adaptive articulatory model. Panying Rong, Ryan Shosted, David Kuehn
2012	Articulatory speaker normalisation based on MRI-data using three-way linear decomposition methods. Julián Andrés Valdés Vargas, Pierre Badin, Laurent Lamalle
2012	Assessing agreement level between forced alignment models with data from endangered language documentation corpora. Christian DiCanio, Hosung Nam, Douglas H. Whalen, H. Timothy Bunnell, Jonathan D. Amith, Rey Castillo García
2012	Assessment of Disordered Voices Using Empirical Mode Decomposition in the Log-Spectral Domain. Abdellah Kacha, Francis Grenez, Jean Schoentgen
2012	Assessment of user simulators for spoken dialogue systems by means of subspace multidimensional clustering. Zoraida Callejas, David Griol, Klaus-Peter Engelbrecht
2012	Asymmetries in the perception of synthesized speech. Anna C. Janska, Erich Schröger, Thomas Jacobsen, Robert A. J. Clark
2012	Audio and Contact Microphones for Cough Detection. Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Anne-Sophie Aubriot, Patrick Lebecque, Thierry Dutoit
2012	Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction Scenario. Martin Heckmann
2012	Audiovisual correlates of basic emotions in blind and sighted people. Marc Swerts, Kitty Leuverink, Madelène Munnik, Vera Nijveld
2012	Audiovisual discrimination of CV syllables: a simultaneous fMRI-EEG study. Cyril Dubois, Rudolph Sock
2012	Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations. Christos Koniaris, Olov Engwall, Giampiero Salvi
2012	Auditory-visual speech to infants and adults: signals and correlations. Jeesun Kim, Chris Davis, Christine Kitamura
2012	Automated Dysarthria Severity Classification for Improved Objective Intelligibility Assessment of Spastic Dysarthric Speech. Milton Orlando Sarria-Paja, Tiago H. Falk
2012	Automatic Detection of High Vocal Effort in Telephone Speech. Jouni Pohjalainen, Tuomo Raitio, Hannu Pulakka, Paavo Alku
2012	Automatic Error Recovery for Pronunciation Dictionaries. Tim Schlippe, Sebastian Ochs, Ngoc Thang Vu, Tanja Schultz
2012	Automatic Measurement of Positive and Negative Voice Onset Time. Katharine Henry, Morgan Sonderegger, Joseph Keshet
2012	Automatic Phoneme Segmentation Using Auditory Attention Features. Ozlem Kalinli
2012	Automatic Pronunciation Error Detection Based on Extended Pronunciation Space Using the Unsupervised Clustering of Pronunciation Errors. Long Zhang, Haifeng Li
2012	Automatic Speech Segmentation Using Probabilistic Latent Component Modeling. Sayan Ghosh, T. V. Sreenivas
2012	Automatic Tone Assessment of Non-Native Mandarin Speakers. Jian Cheng
2012	Automatic Topology Generation of Glottal Source HMM. Akira Sasou
2012	Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts. Yuya Akita, Makoto Watanabe, Tatsuya Kawahara
2012	Automatic Vocabulary Adaptation Based on Semantic Similarity and Speech Recognition Confidence Measure. Shoko Yamahata, Yoshikazu Yamaguchi, Atsunori Ogawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2012	Automatic detection of conflict escalation in spoken conversations. Samuel Kim, Sree Harsha Yella, Fabio Valente
2012	Automatic detection of hypernasal speech signals using nonlinear and entropy measurements. Juan Rafael Orozco-Arroyave, Julián D. Arias-Londoño, Jesús Francisco Vargas-Bonilla, Elmar Nöth
2012	Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions. Harish Arsikere, Gary K. F. Leung, Steven M. Lulich, Abeer Alwan
2012	Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations. Xinhui Zhou, Daniel Garcia-Romero, Nima Mesgarani, Maureen L. Stone, Carol Y. Espy-Wilson, Shihab A. Shamma
2012	Automatic transcription error recovery for Person Name Recognition. Richard Dufour, Géraldine Damnati, Delphine Charlet, Frédéric Béchet
2012	Automatic word naming recognition for treatment and assessment of aphasia. Alberto Abad, Anna Pompili, Ângela Costa, Isabel Trancoso
2012	Automating Crowd-supervised Learning for Spoken Language Systems. Ian McGraw, Scott Cyphers, Panupong Pasupat, Jingjing Liu, James R. Glass
2012	Average Spectrotemporal Structure of Continuous Speech Matches with the Frequency Resolution of Human Hearing. Okko Räsänen
2012	Bag-of-Audio-Words Approach for Multimedia Event Classification. Stephanie Pancoast, Murat Akbacak
2012	Based on Isolated Saliency or Causal Integration? Toward a Better Understanding of Human Annotation Process using Multiple Instance Learning and Sequential Probability Ratio Test. Chi-Chun Lee, Athanasios Katsamanis, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2012	Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data. Alexander Krueger, Oliver Walter, Volker Leutnant, Reinhold Haeb-Umbach
2012	Bayesian Group Sparse Learning for Nonnegative Matrix Factorization. Jen-Tzung Chien, Hsin-Lung Hsieh
2012	Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion. Na Li, Yu Qiao
2012	Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenarios. Hannes Pessentheiner, Stefan Petrik, Harald Romsdorfer
2012	Bilinear Factor Analysis for iVector Based Speaker Verification. Yun Lei, Lukás Burget, Nicolas Scheffer
2012	Binary Mask Estimation for Improved Speech Intelligibility in Reverberant Environments. Oldooz Hazrati, Jaewook Lee, Philipos C. Loizou
2012	Binaural Noise Reduction Using Frequency-Warped FIR Filters. Jorge I. Marin-Hurtado, David V. Anderson
2012	Boosting Classification Based Speech Separation Using Temporal Dynamics. Yuxuan Wang, DeLiang Wang
2012	C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic Speech. Mauro Nicolao, Javier Latorre, Roger K. Moore
2012	CRF-based Diacritisation of Colloquial Arabic for Automatic Speech Recognition. Sarah Al-Shareef, Thomas Hain
2012	Calibration of probabilistic age recognition. David A. van Leeuwen, Mohamad Hasan Bahari
2012	Caller Response Timing Patterns in Spoken Dialog Systems. Silke M. Witt
2012	Can litheners retune native categories acroth a thoneme boundary? Michael D. Tyler, Mona Faris
2012	Can modified casual speech reach the intelligibility of clear speech? Maria Koutsogiannaki, Michèle Pettinato, Cassie Mayo, Varvara Kandia, Yannis Stylianou
2012	Characterizing Covert Articulation in Apraxic Speech Using real-time MRI. Christina Hagedorn, Michael I. Proctor, Louis Goldstein, Maria Luisa Gorno-Tempini, Shrikanth S. Narayanan
2012	Children's Productions of Multi-Syllabic Lexical Stress Patterns in Different Prosodic Positions. Irina Shport
2012	Classification of Stressed Speech Using Physical Parameters Derived from Two-Mass Model. Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda
2012	Classifying Skewed Data: Importance Weighting to Optimize Average Recall. Andrew Rosenberg
2012	ClippyScript: A Programming Language for Multi-Domain Dialogue Systems. Frank Seide, Sean McDirmid
2012	Co-occurrence of reduced word forms in natural speech. Malte C. Viebahn, Mirjam Ernestus, James M. McQueen
2012	Coherent Topic Transition in a Conversational Agent. Daniel Macías-Galindo, Wilson Wong, Lawrence Cavedon, John Thangarajah
2012	Combination of Multiple Speech Dimensions for Automatic Assessment of Dysarthric Speech Intelligibility. Myung Jong Kim, Hoirin Kim
2012	Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR. Yang Sun, Mathew M. Doss, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves
2012	Combining Acoustic Data Driven G2P and Letter-to-Sound Rules for Under Resource Lexicon Generation. Ramya Rasipuram, Mathew Magimai-Doss
2012	Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise. Felix Weninger, Martin Wöllmer, Björn W. Schuller
2012	Combining Ranking and Classification to Improve Emotion Recognition in Spontaneous Speech. Houwei Cao, Ragini Verma, Ani Nenkova
2012	Combining frame and segment based models for environmental sound classification. Pengfei Hu, Wenju Liu, Wei Jiang
2012	Combining multiple high quality corpora for improving HMM-TTS. Vincent Wan, Javier Latorre, K. K. Chin, Langzhou Chen, Mark J. F. Gales, Heiga Zen, Kate M. Knill, Masami Akamine
2012	Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech. Ali Alpan, Jean Schoentgen, Francis Grenez
2012	Compact Audio Representation for Event Detection in Consumer Media. Xiaodan Zhuang, Stavros Tsakalidis, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan
2012	Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English. Tomoko Nariai, Kazuyo Tanaka, Tatsuya Kawahara
2012	Comparing different acoustic modeling techniques for multilingual boosting. David Imseng, John Dines, Petr Motlícek, Philip N. Garner, Hervé Bourlard
2012	Comparing transcription agreement on non-native English speech corpus between native and non-native annotators. Hyuksu Ryu, Sunhee Kim, Minhwa Chung
2012	Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks. Stefan Hahn, Paul Vozila, Maximilian Bisani
2012	Compensating for Ageing and Quality variation in Speaker Verification. Finnian Kelly, Andrzej Drygajlo, Naomi Harte
2012	Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification. Sheng Chen, Mingxing Xu
2012	Complementary Phone Error Training. Frank Diehl, Philip C. Woodland
2012	Computational Modelling of the Recognition of Foreign-Accented Speech. Odette Scharenborg, Marijt J. Witteman, Andrea Weber
2012	Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning. Jun Deng, Björn W. Schuller
2012	Confidence for Speaker Diarization using PCA Spectral Ratio. Orith Toledo-Ronen, Hagai Aronowitz
2012	Confidence measure for speech indexing based on Latent Dirichlet Allocation. Grégory Senay, Georges Linarès
2012	Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis. Xiang Yin, Zhen-Hua Ling, Ming Lei, Li-Rong Dai
2012	Consonantal space area in Children with a Cleft Palate: An acoustic Study. Marion Bechet, Fabrice Hirsch, Camille Fauth, Rudolph Sock
2012	Constrained Maximum Mutual Information Dimensionality Reduction for Language Identification. Shuai Huang, Glen A. Coppersmith, Damianos G. Karakos
2012	Constrained Multichannel Speech Dereverberation. Meng Yu, Frank K. Soong
2012	Consumer-level multimedia event detection through unsupervised audio signal modeling. Byungki Byun, Ilseo Kim, Sabato Marco Siniscalchi, Chin-Hui Lee
2012	Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both? Zoltán Tüske, Ralf Schlüter, Hermann Ney, Martin Sundermeyer
2012	Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface. Thomas Hueber, Gérard Bailly, Bruce Denby
2012	Continuous Digit Recognition in Noise: Reservoirs can do an excellent job! Azarakhsh Jalalvand, Fabian Triefenbach, Jean-Pierre Martens
2012	Contrasting Cues to Verbal and Non-Verbal Backchannels in Multi-lingual Dyadic Rapport. Gina-Anne Levow, Susan Duncan
2012	Contrastive intonation in autism: The effect of speaker- and listener-perspective. Constantijn Kaland, Emiel Krahmer, Marc Swerts
2012	Contribution of Spectral Shapes to Tone Perception. Natthawut Kertkeidkachorn, Surapol Vorapatratorn, Sirinart Tangruamsub, Proadpran Punyabukkana, Atiwong Suchato
2012	Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition. Gwénolé Lecorvé, Petr Motlícek
2012	Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization. Jürgen T. Geiger, Ravichander Vipperla, Simon Bozonnet, Nicholas W. D. Evans, Björn W. Schuller, Gerhard Rigoll
2012	Correlation Between Model-based Approximations of Grounding-related Cognition and User Judgments. Klaus-Peter Engelbrecht, Sebastian Möller
2012	Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers. Hiroaki Hatano, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Kiyoshi Honda, Shinobu Masaki
2012	Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition. Ning Ma, Jon Barker
2012	Cries and Whispers - Classification of Vocal Effort in Expressive Speech. Nicolas Obin
2012	Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data. Sheng Li, Lan Wang
2012	Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition. Yanmin Qian, Jia Liu
2012	Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation. Viviane de Franca Oliveira, Sayaka Shiota, Yoshihiko Nankaku, Keiichi Tokuda
2012	Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training. Thomas Hueber, Atef Ben Youssef, Gérard Bailly, Pierre Badin, Frédéric Elisei
2012	Data-driven Posterior Features for Low Resource Speech Recognition Applications. Samuel Thomas, Sriram Ganapathy, Aren Jansen, Hynek Hermansky
2012	Decoding of Uncertain Features Using the Posterior Distribution of the Clean Data for Robust Speech Recognition. Ahmed Hussen Abdelaziz, Dorothea Kolossa
2012	Deep Architectures for Articulatory Inversion. Benigno Uria, Iain Murray, Steve Renals, Korin Richmond
2012	Demonstration of Advanced Multi-Modal, Network-Centric Communication Management Suite. Victor S. Finomore
2012	Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition. Tatsuya Kawahara, Randy Gomez
2012	Deriving conversation-based features from unlabeled speech for discriminative language modeling. Damianos G. Karakos, Brian Roark, Izhak Shafran, Kenji Sagae, Maider Lehr, Emily Tucker Prud'hommeaux, Puyang Xu, Nathan Glenn, Sanjeev Khudanpur, Murat Saraclar, Daniel M. Bikel, Mark Dredze, Chris Callison-Burch, Yuan Cao, Keith B. Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley
2012	Describing the development of intonational categories using a target-oriented parametric approach. Britta Lintfert, Bernd Möbius
2012	Descriptive Vocabulary Development for Degraded Speech. Dushyant Sharma, Gaston Hilkhuysen, Patrick A. Naylor, Nikolay D. Gaubitch, Mark A. Huckvale, Mike Brookes
2012	Designing a spoken language interface for a tutorial dialogue system. Peter Bell, Myroslava O. Dzikovska, Amy Isard
2012	Detecting Acronyms from Capital Letter Sequences in Spanish. Rubén San Segundo, Juan Manuel Montero, Verónica López-Ludeña, Simon King
2012	Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition. Zhizheng Wu, Chng Eng Siong, Haizhou Li
2012	Detecting Intelligibility by Linear Dimensionality Reduction and Normalized Voice Quality Hierarchical Features. Dong-Yan Huang, Yongwei Zhu, Dajun Wu, Rongshan Yu
2012	Detecting OOV Named-Entities in Conversational Speech. Rohit Kumar, Rohit Prasad, Sankaranarayanan Ananthakrishnan, Aravind Namandi Vembu, David Stallard, Stavros Tsakalidis, Prem Natarajan
2012	Detecting System-directed Utterances using Dialogue-level Features. Kazunori Komatani, Akira Hirano, Mikio Nakano
2012	Detection and Positioning of Overlapped Sounds in a Room Environment. Rupayan Chakraborty, Climent Nadeu, Taras Butko
2012	Detection of Transition Segments in VCV Utterances for Estimation of the Place of Closure of Oral Stops for Speech Training. K. S. Nataraj, Prem C. Pandey
2012	Developing a Speech Activity Detection System for the DARPA RATS Program. Tim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas, Xinhui Zhou, Nima Mesgarani, Karel Veselý, Pavel Matejka
2012	Development and Evaluation of Automatic Punctuation for French and English Speech-to-Text. Jáchym Kolár, Lori Lamel
2012	Deviation measure of waveform symmetry and its application to high-speed and temporally-fine F0 extraction for vocal sound texture manipulation. Hideki Kawahara, Masanori Morise, Ryuichi Nisimura, Toshio Irino
2012	Diagnostic Prediction of Transmitted Speech Quality: A New Framework for Signal-based and Parametric Models. Sebastian Möller, Marcel Wältermann, Nicolas Côté
2012	Dialectal and generational variations in vowels in spontaneous speech. Robert Allen Fox, Ewa Jacewicz
2012	DiarTk: An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings. Deepu Vijayasenan, Fabio Valente
2012	Direction of Arrival Estimation Based on Subband Weighting for Noisy Conditions. Wei Xue, Wenju Liu
2012	Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation. Tomoki Koriyama, Takashi Nose, Takao Kobayashi
2012	Discrimination of Linguistic and Non-Linguistic Vocalizations in Spontaneous Speech: Intra- and Inter-Corpus Perspectives. Felix Weninger, Björn W. Schuller
2012	Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker Verification. Chunyan Liang, Xiang Zhang, Lin Yang, Yonghong Yan
2012	Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation. Ting-Yao Hu, Yu Tsao, Lin-Shan Lee
2012	Discriminative Reranking for LVCSR Leveraging Invariant Structure. Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu
2012	Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech. Chao Weng, Biing-Hwang Juang, Daniel Povey
2012	Discriminative feature-space transforms using deep neural networks. George Saon, Brian Kingsbury
2012	Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networks. Preethi Jyothi, Eric Fosler-Lussier, Karen Livescu
2012	Discriminatively trained phoneme confusion model for keyword spotting. Panagiota Karanasou, Lukás Burget, Dimitra Vergyri, Murat Akbacak, Arindam Mandal
2012	Disentangling lexical, morphological, syntactic and semantic influences on German prominence - Evidence from a production study. Barbara Samlowski, Petra Wagner, Bernd Möbius
2012	Distance-Dependent Noise Reduction for Two-Channel Microphones. Thomas Fehér, Dietmar Richter, Oliver Jokisch, Rüdiger Hoffmann
2012	Duration of ambulatory monitoring needed to accurately estimate voice use. Daryush D. Mehta, Rebecca Woodbury Listfield, Harold A. Cheyne II, James T. Heaton, Shengran W. Feng, Matías Zanartu, Robert E. Hillman
2012	Dutch Automatic Speech Recognition on the Web: Towards a General Purpose System. Joris Pelemans, Kris Demuynck, Patrick Wambacq
2012	Dynamic Conditional Random Fields for Joint Sentence Boundary and Punctuation Prediction. Xuancong Wang, Hwee Tou Ng, Khe Chai Sim
2012	Dynamic Grammars with Lookahead Composition for WFST-based Speech Recognition. Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose
2012	EFL Conversational Triads: Foreigner-directed Speech and Hyperarticulation. Hua-Li Jian, Richard Konopka
2012	Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition. Changhuai You, Haizhou Li, Bin Ma, Kong-Aik Lee
2012	Effect of Tongue Tip Trilling on the Glottal Excitation Source. Vinay Kumar Mittal, N. Dhananjaya, Bayya Yegnanarayana
2012	Effect of being seen on the production of visible speech cues. A pilot study on Lombard speech. Maeva Garnier, Lucie Ménard, Gabrielle Richard
2012	Effect of noise type and level on focus related fundamental frequency changes. Martti Vainio, Daniel Aalto, Antti Suni, Anja Arnhold, Tuomo Raitio, Henri Seijo, Juhani Järvikivi, Paavo Alku
2012	Effect of prosodic changes on speech intelligibility. Catherine Mayo, Vincent Aubanel, Martin Cooke
2012	Effect of speech priors in single-channel speech-music separation for ASR. Cemil Demir, Ali Taylan Cemgil, Murat Saraçlar
2012	Effects of Dialectal Origin on Articulation Rate in French. Mathieu Avanzi, Pauline Dubosson, Sandra Schwab
2012	Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion. Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose
2012	Effects of stress and speech rate on vowel quality in Catalan and Spanish. Marianna Nadeu
2012	Effects of the availability of visual information and presence of competing conversations on speech production. Vincent Aubanel, Martin Cooke, Emma Foster, María Luisa García Lecumberri, Cassie Mayo
2012	Effects of visual speech information on native listener judgments of L2 consonant intelligibility. Saya Kawase, Yue Wang
2012	Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range Normalization. Satoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi
2012	Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-based Large Vocabulary Continuous Speech Recognition Engine. Jungsuk Kim, Jike Chong, Ian R. Lane
2012	Efficient Segmental Conditional Random Fields for One-Pass Phone Recognition. Yanzhang He, Eric Fosler-Lussier
2012	Efficient Structured Language Modeling for Speech Recognition. Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur
2012	Efficient VTS Adaptation Using Jacobian Approximation. Jinyu Li, Michael L. Seltzer, Yifan Gong
2012	Efficient multipulse approximation of speech excitation using the most singular manifold. Vahid Khanagha, Khalid Daoudi
2012	Emotion Recognition using Acoustic and Lexical Features. Viktor Rozgic, Sankaranarayanan Ananthakrishnan, Shirin Saleem, Rohit Kumar, Aravind Namandi Vembu, Rohit Prasad
2012	Emotional Speech: A Spectral Analysis. Pouria Fewzee, Fakhri Karray
2012	Emphatic segments and emphasis spread in Lebanese Arabic: a Real-time Magnetic Resonance Imaging Study. Assaf Israel, Michael I. Proctor, Louis Goldstein, Khalil Iskarous, Shrikanth S. Narayanan
2012	Employing Sentence Structure: Syntax Trees as Prosody Generators. Sarah Hoffmann, Beat Pfister
2012	Enhanced Polyphone Decision Tree Adaptation for Accented Speech Recognition. Udhyakumar Nallasamy, Florian Metze, Tanja Schultz
2012	Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks. Tara N. Sainath, David Nahamoo, Dimitri Kanevsky, Bhuvana Ramabhadran
2012	Enhancing Speech Understanding in Spoken Dialogue Systems by Means of a New Frame-Correction Technique. Ramón López-Cózar, Zoraida Callejas, David Griol
2012	Enhancing Speech by Reconstruction from Robust Acoustic Features. Philip Harding, Ben Milner
2012	Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech. Petko Nikolov Petkov, W. Bastiaan Kleijn, Gustav Eje Henter
2012	Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech Recognition. Florian Müller, Alfred Mertins
2012	Ensemble Classifiers Using Unsupervised Data Selection for Speaker Recognition. Chien-Lin Huang, Chiori Hori, Hideki Kashioka, Bin Ma
2012	Enumerating Differences Between Various Communicative Functions for Purposes of Czech Expressive Speech Synthesis in Limited Domain. Martin Gruber
2012	Enumerative Algebraic Coding for ACELP. Tom Bäckström
2012	Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training. Yow-Bang Wang, Lin-Shan Lee
2012	Estimating Classifier Performance in Unknown Noise. Ehsan Variani, Hynek Hermansky
2012	Estimating Word-Stability During Incremental Speech Recognition. Ian McGraw, Alexander Gruenstein
2012	Estimating the Vocal-Tract Area Function From Formants Using a Sensitivity Function and Least Square. Tokihiko Kaburagi, Tetsuro Takano, Yuki Sakamoto
2012	Estimating the voice source in noise. Gang Chen, Yen-Liang Shue, Jody Kreiman, Abeer Alwan
2012	Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients. Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
2012	Estimation of the vocal tract shape of nasals using a Bayesian scheme. Christian H. Kasess, Wolfgang Kreuzer, Ewald Enzinger, Nadja Kerschhofer-Puhalo
2012	EuskoParl: a speech and text Spanish-Basque parallel corpus. Alicia Pérez, José M. Alcaide, M. Inés Torres
2012	Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts. Khairun-nisa Hassanali, Yang Liu, Thamar Solorio
2012	Evaluating Prosodic Processing for Incremental Speech Synthesis. Timo Baumann, David Schlangen
2012	Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text Mining. Keigo Kubo, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
2012	Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions. Lee Ngee Tan, Kantapon Kaewtip, Martin L. Cody, Charles E. Taylor, Abeer Alwan
2012	Evaluation of a formant-based speech-driven lip motion generation. Carlos Toshinori Ishi, Chaoran Liu, Hiroshi Ishiguro, Norihiro Hagita
2012	Event-based Video Retrieval Using Audio. Qin Jin, Peter Franz Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, Florian Metze
2012	Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise. Keisuke Kinoshita, Marc Delcroix, Mehrez Souden, Tomohiro Nakatani
2012	Exemplar-Based Sparse Representation for Language Recognition on I-Vectors. Bing Jiang, Yan Song, Wu Guo, Li-Rong Dai
2012	Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction. Jian Luan
2012	Exploiting Discriminative Point Process Models for Spoken Term Detection. Atta Norouzian, Aren Jansen, Richard C. Rose, Samuel Thomas
2012	Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia. Sourish Chaudhuri, Rita Singh, Bhiksha Raj
2012	Exploiting the Semantic Web for Unsupervised Natural Language Semantic Parsing. Gökhan Tür, Minwoo Jeong, Ye-Yi Wang, Dilek Hakkani-Tür, Larry P. Heck
2012	Exploring Discriminative Speech Trajectory Structures. Heyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves
2012	Exploring Joint Equalization of Spatial-Temporal Contextual Statistics of Speech Features for Robust Speech Recognition. Hsin-Ju Hsieh, Jeih-weih Hung, Berlin Chen
2012	Exploring Off Time Nature for Speech Enhancement. Meng Yu, Jack Xin
2012	Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training. Langzhou Chen, Mark J. F. Gales, Vincent Wan, Javier Latorre, Masami Akamine
2012	Expressing Speaker's Intentions through Sentence-Final Intonations for Japanese Conversational Speech Synthesis. Kazuhiko Iwata, Tetsunori Kobayashi
2012	Extrinsic normalization for vocal tracts depends on the signal, not on attention. Matthias J. Sjerps, James M. McQueen, Holger Mitterer
2012	F0 and the Perception of Prominence. Tim Mahrt, Jennifer Cole, Margaret M. Fleck, Mark Hasegawa-Johnson
2012	Factor Analysis and Nuisance Attribute Projection Revisited. Lukás Machlica, Zbynek Zajíc
2012	Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS. June Sig Sung, Doo Hwa Hong, Hyun Woo Koo, Nam Soo Kim
2012	Factored adaptation using a combination of feature-space and model-space transforms. Michael L. Seltzer, Alex Acero
2012	Feature Selection for Speaker Traits. Jouni Pohjalainen, Serdar Kadioglu, Okko Räsänen
2012	Feature extraction based on hearing system signal processing for robust large vocabulary speech recognition. Peter Li, Xie Sun
2012	Foreground Speech Segmentation using Zero Frequency Filtered Signal. Deepak K. T., Biswajit Dev Sarma, S. R. Mahadeva Prasanna
2012	From PVI to Perception: A Return to the Roots of Rhythm in Broadcast News. Matthew Benton
2012	Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition. Taufiq Hasan, John H. L. Hansen
2012	Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive Impairment. Maider Lehr, Emily Tucker Prud'hommeaux, Izhak Shafran, Brian Roark
2012	Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model. Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi
2012	GCC-PHAT based Head Orientation Estimation. Carlos Segura, Javier Hernando
2012	Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments. Wooil Kim, John H. L. Hansen
2012	Gaussian Mixture Gain Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation. Emad M. Grais, Hakan Erdogan
2012	Gaze Patterns in Turn-Taking. Catharine Oertel, Marcin Wlodarczak, Jens Edlund, Petra Wagner, Joakim Gustafson
2012	Gendered sound symbolism and masking effects in speech processing. Molly Babel, Grant McGuire
2012	Genetic Algorithm Based Feature Selection for Speaker Trait Classification. Dongrui Wu
2012	Glottal Waveform Analysis of Physical Task Stress Speech. Keith W. Godin, Taufiq Hasan, John H. L. Hansen
2012	Glottal source shape parameter estimation using phase minimization variants. Stefan Huber, Axel Röbel, Gilles Degottex
2012	Goal-Oriented Auditory Scene Recognition. Kailash Patil, Mounya Elhilali
2012	Group Sparse Hidden Markov Models for Speech Recognition. Jen-Tzung Chien, Cheng-Chun Chiang
2012	Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition. Antti Hurmalainen, Rahim Saeidi, Tuomas Virtanen
2012	HMM Based Continuous EOG Recognition for Eye-input Speech Interface. Fuming Fang, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa, Sadaoki Furui, Toshimitsu Musha
2012	HMM-based speech synthesis using sub-band basis spectrum model. Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine
2012	Hearing Loss and the Use of Acoustic Cues in Phonetic Categorisation of Fricatives. Odette Scharenborg, Esther Janse
2012	Hermitian based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models. Sabato Marco Siniscalchi, Jinyu Li, Chin-Hui Lee
2012	Heterogeneous Convolutive Non-Negative Sparse Coding. Dong Wang, Javier Tejedor
2012	Hidden Conditional Random Fields with M-to-N Alignments for Grapheme-to-Phoneme Conversion. Patrick Lehnen, Stefan Hahn, Vlad-Andrei Guta, Hermann Ney
2012	Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech. Kota Yoshizato, Hirokazu Kameoka, Daisuke Saito, Shigeki Sagayama
2012	Hidden Markov Models as Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation. Emad M. Grais, Hakan Erdogan
2012	Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data. Fanbo Meng, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai
2012	Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP. Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine
2012	Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition. Bernd T. Meyer, Constantin Spille, Birger Kollmeier, Nelson Morgan
2012	How Marni Helps English Language Learners Acquire Oral Reading Fluency. Ronald A. Cole, Daniel Bolaños, Wayne H. Ward, J. T. Carmer, Eric Borts, Edward Svirsky
2012	How consonants, dialect and speech rate affect vowel devoicing? Masako Fujimoto, Seiya Funatsu, Ichiro Fujimoto
2012	I-vectors and ILP clustering adapted to cross-show speaker diarization. Grégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève
2012	IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition. Jun Du, Qiang Huo
2012	Implementation of Computationally Efficient Real-Time Voice Conversion. Tomoki Toda, Takashi Muramatsu, Hideki Banno
2012	Implementation of Simple Spectral Techniques to Enhance the Intelligibility of Speech using a Harmonic Model. Daniel Erro, Yannis Stylianou, Eva Navas, Inma Hernáez
2012	Improve the Implementation of Pitch Features for Mandarin Digit String Recognition Task. Pei Ding, Liqiang He
2012	Improved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis. Hiroya Hashimoto, Keikichi Hirose, Nobuaki Minematsu
2012	Improved Model Selection for the ASR-Driven Binary Mask. William Hartmann, Eric Fosler-Lussier
2012	Improved Prediction of Japanese Word Accent Sandhi Using CRF. Nobuaki Minematsu, Shumpei Kobayashi, Shinya Shimizu, Keikichi Hirose
2012	Improved Speech Intelligibility with a Chimaera Hearing Aid Algorithm. Andrew Hines, Naomi Harte
2012	Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction. Paavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, Brad H. Story
2012	Improvement in Automatic Pronunciation Scoring using Additional Basic Scores and Learning to Rank. Liang-Yu Chen, Jyh-Shing Roger Jang
2012	Improvements in Japanese Voice Search. Ken-ichi Iso, Edward Whittaker, Tadashi Emori, Junpei Miyake
2012	Improvements of the Beta-Order Minimum Mean-Square Error (MMSE) Spectral Amplitude Estimator using Chi Priors. Marek B. Trawicki, Michael T. Johnson
2012	Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition. Janne Pylkkönen, Mikko Kurimo
2012	Improving L1-Specific Phonological Error Diagnosis in Computer Assisted Pronunciation Training. Theban Stanley, Kadri Hacioglu
2012	Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and Gender. Felix Weninger, Erik Marchi, Björn W. Schuller
2012	Improving WFST-based G2P Conversion with Alignment Constraints and RNNLM N-best Rescoring. Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose, Chiori Hori, Hideki Kashioka, Paul R. Dixon
2012	Improving the Entropy Estimate of Neuronal Firings of Modeled Cochlear Nucleus Neurons. Andrea Grigorescu, Marek Rudnicki, Michael Isik, Werner Hemmert, Stefano Rini
2012	Indexing Raw Acoustic Features for Scalable Zero Resource Search. Aren Jansen, Benjamin Van Durme
2012	Inference of Critical Articulator Position for Fricative Consonants. Alexander Sepúlveda, Rodrigo Capobianco Guido, Germán Castellanos-Domínguez
2012	Initialization Schemes for Multilayer Perceptron Training and their Impact on ASR Performance using Multilingual Data. Ngoc Thang Vu, Wojtek Breiter, Florian Metze, Tanja Schultz
2012	Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis. Taufiq Hasan, John H. L. Hansen
2012	Integrating Adaptive Beam-forming and Auditory Features for Robust Large Vocabulary Speech Recognition. Xie Sun, Peter Li, Manli Zhu, Qiru Zhou
2012	Integrating Deep Neural Networks into Structural Classification Approach based on Weighted Finite-State Transducers. Yotaro Kubo, Takaaki Hori, Atsushi Nakamura
2012	Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization. Yun-Nung Chen, Florian Metze
2012	Integrating Stress Information in Large Vocabulary Continuous Speech Recognition. Bogdan Ludusan, Stefan Ziegler, Guillaume Gravier
2012	Intelligibility classification of pathological speech using fusion of multiple high level descriptors. Jangwon Kim, Naveen Kumar, Andreas Tsiartas, Ming Li, Shrikanth S. Narayanan
2012	Intelligibility of speech spoken in noise/reverberation for older adults in reverberant environments. Nao Hodoshima, Takayuki Arai, Kiyohiro Kurisu
2012	Inter-gestural timing in French nasal vowels: A comparative study of (Liège, Tournai) Northern French vs. (Marseille, Toulouse) Southern French. Véronique Delvaux, Kathy Huet, Myriam Piccaluga, Bernard Harmegnies
2012	Interactions Between Turn-taking Gaps, Disfluencies and Social Obligation. Rebecca Lunsford, Peter A. Heeman, Jan P. H. van Santen
2012	Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process. Tsung-Hsien Wen, Hung-yi Lee, Lin-Shan Lee
2012	Interplay between verbal response latency and physiology of children with autism during ECA interactions. Theodora Chaspari, Chi-Chun Lee, Shrikanth S. Narayanan
2012	Interspeech Pathology Challenge: Investigations into Speaker and Sentence Specific Effects. Anthony P. Stark, Alireza Bayestehtashk, Meysam Asgari, Izhak Shafran
2012	Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition. Aren Jansen, Samuel Thomas, Hynek Hermansky
2012	Intrinsic velocity differences of lip and jaw movements: preliminary results. Peter Birkholz, Phil Hoole
2012	Inventory-Based Audio-Visual Speech Enhancement. Dorothea Kolossa, Robert M. Nickel, Steffen Zeiler, Rainer Martin
2012	Inverting the Point Process Model for Fast Phonetic Keyword Search. Keith Kintzley, Aren Jansen, Kenneth Church, Hynek Hermansky
2012	Investigating Performance of the Discriminative Methods for Long-Term Speaker Adaptation. Danning Jiang, Dimitri Kanevsky, Vaibhava Goel, Yong Qin
2012	Investigating syllabic prominence with Conditional Random Fields and Latent-Dynamic Conditional Random Fields. Francesco Cutugno, Enrico Leone, Bogdan Ludusan, Antonio Origlia
2012	Investigation of Maximum Entropy Hybrid Language Models for Open Vocabulary German and Polish LVCSR. M. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney
2012	Is 'not bad' good enough? Aspects of unknown voices' likability. Benjamin Weiss, Felix Burkhardt
2012	Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation. Daniel Erro, Eva Navas, Inma Hernáez
2012	Joint Decoding for Speech Recognition and Semantic Tagging. Anoop Deoras, Ruhi Sarikaya, Gökhan Tür, Dilek Hakkani-Tür
2012	Joint Pitch-Analysis Formant-Synthesis framework for CS recovery of speech. Srikanth Raj Chetupally, Thippur V. Sreenivas
2012	Judging temporal onset differences for concurrent vowels: Results for young, middle-aged, and older adults. Daniel Fogerty, Diane Kewley-Port, Larry E. Humes
2012	KNNDIST: A Non-Parametric Distance Measure for Speaker Segmentation. Seyed Hamidreza Mohammadi, Hossein Sameti, Mahsa Sadat Elyasi Langarani, Amirhossein Tavanaei
2012	Knowledge-Based Word Lattice Rescoring in a Dynamic Context. Todd Shore, Friedrich Faubel, Hartmut Helmke, Dietrich Klakow
2012	LSTM Neural Networks for Language Modeling. Martin Sundermeyer, Ralf Schlüter, Hermann Ney
2012	Language Modeling for Voice-Enabled Social TV Using Tweets. Junlan Feng, Bernard Renger
2012	Language differences in the perceptual weight of prominence-lending properties. Bistra Andreeva, William J. Barry, Magdalena Wolska
2012	Large Scale Hierarchical Neural Network Language Models. Hong-Kwang Kuo, Ebru Arisoy, Ahmad Emami, Paul Vozila
2012	Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks. Dong Yu, Li Deng, Frank Seide
2012	Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog. Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, Larry P. Heck
2012	Learning an Artificial F0-Contour for ALT Speech. Anna Katharina Fuchs, Martin Hagmüller
2012	Lenition of /d/ in spontaneous Spanish and Catalan. Miguel Simonet, José Ignacio Hualde, Marianna Nadeu
2012	Less errors with TTS? A dictation experiment with foreign language learners. Thomas Pellegrini, Ângela Costa, Isabel Trancoso
2012	Leveraging Social Annotation for Topic Language Model Adaptation. Youzheng Wu, Kazuhiko Abe, Paul R. Dixon, Chiori Hori, Hideki Kashioka
2012	Lexical Story Co-Segmentation of Chinese Broadcast News. Wei Feng, Xuecheng Nie, Liang Wan, Lei Xie, Jianmin Jiang
2012	Lexical-phonetic automata for spoken utterance indexing and retrieval. Julien Fayolle, Murat Saraçlar, Fabienne Moreau, Christian Raymond, Guillaume Gravier
2012	Likability Classification - A Not so Deep Neural Network Approach. Raymond Brueckner, Björn W. Schuller
2012	Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification. Toru Nakashika, Christophe Garcia, Tetsuya Takiguchi
2012	Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition. José A. González, Antonio M. Peinado, Angel M. Gomez, Ning Ma
2012	Longer Features: They do a speech detector good. T. J. Tsai, Nelson Morgan
2012	Low latency combination of parallelized single-pass LVCSR systems. Fethi Bougares, Mickael Rouvier, Yannick Estève, Georges Linarès
2012	Low-SNR, Speaker-Dependent Speech Enhancement using GMMs and MFCCs. Laura E. Boucheron, Phillip L. De Leon
2012	Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints. Ziqiang Shi, Tieran Zheng, Jiqing Han, Shiwen Deng
2012	MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors. Keith Kintzley, Aren Jansen, Hynek Hermansky
2012	Making Conversational Vowels More Clear. Seyed Hamidreza Mohammadi, Alexander Kain, Jan P. H. van Santen
2012	Mask Estimation and Refinement for MFT-based Robust Speaker Verification. Yali Zhao, Lei Xie, Zhonghua Fu
2012	Maximising objective speech intelligibility by local f0 modulation. Julián Villegas, Martin Cooke
2012	Maximum Entropy Language Model Adaptation for Mobile Speech Input. Tanel Alumäe, Kaarel Kaljurand
2012	Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning. Huang Hao, Jianming Wang, Halidan Abudureyimu
2012	Mean Hilbert Envelope Coefficients (MHEC) for Robust Speaker Recognition. Seyed Omid Sadjadi, Taufiq Hasan, John H. L. Hansen
2012	Meaning inhibition and sentence processing in Chinese: Evidence from negative priming. Michael C. W. Yip
2012	Measuring prosodic alignment in cooperative task-based conversations. Khiet P. Truong, Dirk Heylen
2012	Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise. Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King
2012	Methodological Issues in Assessing Perceptual Representation of Consonant Sounds in Thai. Charturong Tantibundhit, Chutamanee Onsuwan, P. Phienphanich, Chai Wutiwiwatchai
2012	Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition. Ken'ichi Kumatani, Bhiksha Raj, Rita Singh, John W. McDonough
2012	Mitigating Effects of Recording Condition Mismatch in Speaker Recognition Using Partial Least Squares. Jeremiah Remus, Jenniffer Estrada, Stephanie A. C. Schuckers
2012	Mixed probabilistic and deterministic dependency parsing. Christophe Cerisara, Alejandra Lorenzo
2012	Mixture Component Clustering for Efficient Speaker Verification. Richard D. McClanahan, Phillip L. De Leon
2012	Model-Based Approaches for Degraded Channel Modelling in Robust ASR. Mark J. F. Gales, Federico Flego
2012	Model-based Duration-difference Approach on Accent Evaluation of L2 Learner. Chatchawarn Hansakunbuntheung, Ananlada Chotimongkol, Sumonmas Thatphithakkul, Patcharika Chootrakool
2012	Model-based Single-Channel Dereverberation in Noisy Acoustical Environments. Xulei Bao, Jie Zhu
2012	Model-based approaches to adaptive training in reverberant environments. Yongqiang Wang, Mark J. F. Gales
2012	Modeling Cue Trading in Human Word Recognition. Louis ten Bosch, Odette Scharenborg
2012	Modeling Pause-Duration for Style-Specific Speech Synthesis. Alok Parlikar, Alan W. Black
2012	Modeling source-tract interaction in speech production: Voicing onset vs. vowel height after a voiceless obstruent. Jorge C. Lucero, Laura L. Koenig, Susanne Fuchs
2012	Modeling spoken language acquisition with a generic cognitive architecture for associative learning. Okko Räsänen, Heikki Rasilo, Unto K. Laine
2012	Modeling the Creaky Excitation for Parametric Speech Synthesis. Thomas Drugman, John Kane, Christer Gobl
2012	Modelling a Noisy-channel for Voice Conversion Using Articulatory Features. Bajibabu Bollepalli, Alan W. Black, Kishore Prahallad
2012	Modelling pause duration as a function of contextual length. David Doukhan, Albert Rilliard, Sophie Rosset, Christophe d'Alessandro
2012	Modulation Spectrum Analysis for Speaker Personality Trait Recognition. Alexei Ivanov, Xin Chen
2012	Modulation domain blind source separation for noisy speech mixture. Yi Zhang, Yunxin Zhao
2012	More on the Normalization of Syllable Prominence Ratings. Christopher Sappok, Denis Arnold
2012	Morpheme Level Feature-based Language Models for German LVCSR. Amr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney
2012	Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification. Michelle Hewlett Sanchez, Aaron Lawson, Dimitra Vergyri, Harry Bratt
2012	N-gram FST Indexing for Spoken Term Detection. Chao Liu, Dong Wang, Javier Tejedor
2012	Nasal Coarticulation and Contrastive Stress. Georgia Zellou, Rebecca Scarborough
2012	Nasality from Moroccan Arabic Nasal and Pharyngeal Consonants: Patterns of Airflow and Nasalance. Georgia Zellou
2012	Nativeness Classification with Suprasegmental Features on the Accent Group Level. Mahnoosh Mehrabani, Joseph Tepperman, Emily Nava
2012	Naturalness Judgement of Prosodic Variation of Japanese Utterances with Prosody Modified Stimuli. Chiharu Tsurutani, Shunichi Ishihara
2012	Noise Compensation for Subspace Gaussian Mixture Models. Liang Lu, K. K. Chin, Arnab Ghoshal, Steve Renals
2012	Noise Robust Pitch Tracking by Subband Autocorrelation Classification. Byung Suk Lee, Daniel P. W. Ellis
2012	Non-auditory cognitive capabilities in computational modeling of early language acquisition. Okko Räsänen
2012	Normalization of Text Messages Using Character- and Phone-based Machine Translation Approaches. Chen Li, Yang Liu
2012	Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systems. Marc René Schädler, Birger Kollmeier
2012	Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs. Ales Prazák, Zdenek Loose, Jan Trmal, Josef V. Psutka, Josef Psutka
2012	Novel Metrics of Speech Rhythm for the Assessment of Emotion. Fabien Ringeval, Mohamed Chetouani, Björn W. Schuller
2012	OOV Word Detection using Hybrid Models with Mixed Types of Fragments. Long Qin, Alexander I. Rudnicky
2012	Objective Child Vocal Development Measurement with Naturalistic Daylong Audio Recording. Dongxin Xu, Jill Gilkerson, Jeffery Richards
2012	Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability. Linfang Wang, Lijuan Wang, Yan Teng, Zhe Geng, Frank K. Soong
2012	Objective, Subjective and Linguistic Roads to Perceptual Prominence - How are they compared and why? Petra Wagner, Fabio Tamburini, Andreas Windmann
2012	Obtaining prominence judgments from naïve listeners - Influence of rating scales, linguistic levels and normalisation. Denis Arnold, Petra Wagner, Bernd Möbius
2012	On Speaker-Independent Personality Perception and Prediction from Speech. Tim Polzehl, Katrin Schoenenberg, Sebastian Möller, Florian Metze, Gelareh Mohammadi, Alessandro Vinciarelli
2012	On the Dynamics of Overlap in Multi-Party Conversation. Kornel Laskowski, Mattias Heldner, Jens Edlund
2012	On the Modeling of Voiceless Stop Sounds of Speech using Adaptive Quasi-Harmonic Models. George P. Kafentzis, Olivier Rosec, Yannis Stylianou
2012	On the Role of Binary Mask Pattern in Automatic Speech Recognition. Arun Narayanan, DeLiang Wang
2012	On the Use of Non-Linear Polynomial Kernel SVMs in Language Recognition. Sibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar
2012	On the Use of Spectral and Iterative Methods for Speaker Diarization. Stephen Shum, Najim Dehak, Jim Glass
2012	On the acoustics of overlapping laughter in conversational speech. Khiet P. Truong, Jürgen Trouvain
2012	On the assessment of audiovisual cues to speaker confidence by preteens with typical development (TD) and a-typical development (AD). Marc Swerts, Cees de Bie
2012	On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues alone. Jens Edlund, Mattias Heldner, Joakim Gustafson
2012	On the use of Machine Learning Methods for Speech and Voicing Classification. Philip Harding, Ben Milner
2012	On-the-fly Topic Adaptation for YouTube Video Transcription. Kapil Thadani, Fadi Biadsy, Daniel M. Bikel
2012	Online Story Segmentation of Multilingual Streaming Broadcast News. Amit Srivastava, Saurabh Khanwalkar, Gretchen Markiewicz, Guruprasad Saikumar
2012	Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity. Hung-yi Lee, Po-Wei Chou, Lin-Shan Lee
2012	Optimised spectral weightings for noise-dependent speech intelligibility enhancement. Yan Tang, Martin Cooke
2012	Optimization of Dialog Strategies using Automatic Dialog Simulation and Statistical Dialog Management Techniques. Zoraida Callejas, Ramón López-Cózar
2012	Optimization-Based Control for the Extended Baum-Welch Algorithm. Janne Pylkkönen, Mikko Kurimo
2012	Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum Similarity. Ryo Yokoyama, Yu Nasu, Koichi Shinoda, Koji Iwano
2012	Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough Transform. Jonathan William Dennis, Tran Huy Dat, Engsiong Chng
2012	PLDA Modeling in I-Vector and Supervector Space for Speaker Verification. Ye Jiang, Kong-Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher, Haizhou Li
2012	PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker Verification. Themos Stafylakis, Patrick Kenny, Mohammed Senoussaoui, Pierre Dumouchel
2012	Parallel Training for Deep Stacking Networks. Li Deng, Brian Hutchinson, Dong Yu
2012	Parallel combination of multilingual speech streams for improved ASR. João Miranda, João Paulo Neto, Alan W. Black
2012	Paraphrastic Language Models. Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2012	Patrol Team Language Identification System for DARPA RATS P1 Evaluation. Pavel Matejka, Oldrich Plchot, Mehdi Soufifar, Ondrej Glembek, Luis Fernando D'Haro, Karel Veselý, Frantisek Grézl, Jeff Z. Ma, Spyros Matsoukas, Najim Dehak
2012	Pauses and respiratory markers of the structure of book reading. Gérard Bailly, Cécilia Gouvernayre
2012	Perceived prosodic boundaries in Taiwanese and their acoustic correlates. Grace Kuo
2012	Perception of Pitch Contours among Native Tone Listeners. Ratree Wayland, Donruethai Laphasradakul, Edith Kaan, Cao Rui
2012	Perception of Synthetic Speech in Adult Users of Cochlear Implants. Kyoko Nagao, Mark Paullin, James B. Polikoff, Jason Lilley, H. Timothy Bunnell
2012	Perception of the moraic obstruent /Q/: a cross-linguistic study. Makiko Sadakata, Mizuki Shingai, Alex Brandmeyer, Kaoru Sekiyama
2012	Perceptual Assimilation of Arabic Voiceless Fricatives by English Monolinguals. Michael D. Tyler, Sarah Fenwick
2012	Perceptual Foundations for Naturalistic Variability in the Prosody of Synthetic Speech. Nanette Veilleux, Jonathan Barnes, Alejna Brugos, Stefanie Shattuck-Hufnagel
2012	Perceptual Importance of the Phase Related Information in Speech. Ibon Saratxaga, Inma Hernáez, Michael Pucher, Eva Navas, Iñaki Sainz
2012	Perceptual Learning of /f/-/s/ by Older Listeners. Odette Scharenborg, Esther Janse, Andrea Weber
2012	Perceptual compensation for the effects of reverberation on consonant identification: A comparison of human and machine performance. Guy J. Brown, Amy V. Beeston, Kalle J. Palomäki
2012	Performance Comparison of Intrusive Objective Speech Intelligibility and Quality Metrics for Cochlear Implant Users. João Felipe Santos, Stefano Cosentino, Oldooz Hazrati, Philipos C. Loizou, Tiago H. Falk
2012	Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling. Erinç Dikici, Arda Çelebi, Murat Saraclar
2012	PermA and Balloon: Tools for string alignment and text processing. Uwe D. Reichel
2012	Personality traits detection using a parallelized modified SFFS algorithm. Clément Chastagnol, Laurence Devillers
2012	Phase estimation for signal reconstruction in single-channel source separation. Pejman Mowlaee, Rahim Saeidi, Rainer Martin
2012	Phone Adaptive Training for Speaker Diarization. Simon Bozonnet, Ravichander Vipperla, Nicholas W. D. Evans
2012	Phone recognition in critical bands using sub-band temporal modulations. Feipeng Li, Sri Harish Reddy Mallidi, Hynek Hermansky
2012	Phoneme Class Based Adaptation for Mismatch Acoustic Modeling of Distant Noisy Speech. Seçkin Uluskan, John H. L. Hansen
2012	Phoneme resistance during speech-in-speech comprehension. Léo Varnet, Julien Meyer, Michel Hoen, Fanny Meunier
2012	Phonetic Foreignization of Mandarin for Dubbing in Imported Western Movies. Luying Hou, Yuan Jia, Aijun Li
2012	Phonological complexity and vocabulary size in 30-month-old Swedish children. Ulrika Marklund, Ulla Sundberg, Iris-Corinna Schwarz, Francisco Lacerda
2012	Phonology & the Interpretation of Fine Phonetic Detail in Berlin German. Stefanie Jannedy, Melanie Weirich
2012	Phonotactic Language Recognition Using MLP Features. Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel
2012	Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts. Luis Fernando D'Haro, Ondrej Glembek, Oldrich Plchot, Pavel Matejka, Mehdi Soufifar, Ricardo de Córdoba, Jan Cernocký
2012	Phrasal Cohort Based Unsupervised Discriminative Language Modeling. Puyang Xu, Brian Roark, Sanjeev Khudanpur
2012	Phrase Boundary Assignment from Text in Multiple Domains. Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran
2012	Physiological and acoustic study of word initial post-lexical gemination in Moroccan Arabic. Chakir Zeroual, Diamantis Gafos, Phil Hoole, John H. Esling
2012	Pipelined Back-Propagation for Context-Dependent Deep Neural Networks. Xie Chen, Adam Eversole, Gang Li, Dong Yu, Frank Seide
2012	Pitch Estimation Based on Long Frame Harmonic Model and Short Frame Average Correlation Coefficient. Dongmei Wang, Philipos C. Loizou
2012	Pitch and Intonation Contribution to Speakers' Traits Classification. Claude Montacié, Marie-José Caraty
2012	Pitch and phonological perception of tone in the Suruí language of Rondônia (Brazil): identification task of LHL and LHH tonal patterns. Julien Meyer
2012	Pitch range control of Japanese boundary pitch movements. Yosuke Igarashi, Hanae Koiso
2012	Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis. Zhengqi Wen, Hideki Kawahara, Jianhua Tao
2012	Plagiarism Detection in Polyphonic Music using Monaural Signal Separation. Soham De, Indradyumna Roy, Tarunima Prabhakar, Kriti Suneja, Sourish Chaudhuri, Rita Singh, Bhiksha Raj
2012	PodCastle: Collaborative Training of Language Models on the Basis of Wisdom of Crowds. Jun Ogata, Masataka Goto
2012	Pooling Robust Shift-Invariant Sparse Representations of Acoustic Signals. Po-Sen Huang, Jianchao Yang, Mark Hasegawa-Johnson, Feng Liang, Thomas S. Huang
2012	Portability of Semantic Annotations for Fast Development of Dialogue Corpora. Bassam Jabaian, Fabrice Lefèvre, Laurent Besacier
2012	Posterior-Scaled MPE: Novel Discriminative Training Criteria. Markus Nußbaum-Thom, Zoltán Tüske, Georg Heigold, Ralf Schlüter, Hermann Ney
2012	Power Mean Pyramid Scores for Summarization Evaluation. Sameer Maskey, Andrew Rosenberg
2012	Practice and feedback in L2 speaking: an evaluation of the DISCO CALL system. Catia Cucchiarini, Joost van Doremalen, Helmer Strik
2012	Predictability affects vowel dispersion and dynamics in the Buckeye Corpus. Michael McAuliffe, Molly Babel
2012	Predicting Character-Appropriate Voices for a TTS-based Storyteller System. Erica Greene, Taniya Mishra, Patrick Haffner, Alistair Conkie
2012	Predicting Likability of Speakers with Gaussian Processes. Dingchao Lu, Fei Sha
2012	Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations. Tatsuya Kawahara, Takuma Iwatate, Katsuya Takanashi
2012	Preference-learning based Inverse Reinforcement Learning for Dialog Control. Hiroaki Sugiyama, Toyomi Meguro, Yasuhiro Minami
2012	ProTK: An Improved Prosody Toolkit. Jacob Okamoto, Serguei V. S. Pakhomov, Elizabeth Shriberg, Andreas Stolcke
2012	Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition. Xiangang Li, Dan Su, Zaihu Pang, Xihong Wu
2012	Production and Perception of Focus in PFC and non-PFC Languages: Comparing Beijing Mandarin and Hainan Tsat. Bei Wang, Chenxia Li, Qian Wu, Xiaxia Zhang, Baofeng Wang, Yi Xu
2012	Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams. Ali Ziaei, Abhijeet Sangwan, John H. L. Hansen
2012	Pronunciation quality evaluation of sentences by combining word based scores. Jorge Wuth, Néstor Becerra Yoma, Leopoldo Benavides, Hiram Vivanco
2012	Proper Name Splicing in Computer Games with TTS. Blaise Potard, Matthew P. Aylett, Christopher J. Pidcock
2012	Prosodic Cues to Disengagement and Uncertainty in Physics Tutorial Dialogues. Diane J. Litman, Heather Friedberg, Katherine Forbes-Riley
2012	Prosodic Entrainment in an Information-Driven Dialog System. Andrew Fandrianto, Maxine Eskénazi
2012	Prosodic Marking of Continuation versus Completion in Children's Narratives. Melissa A. Redford, Laura Dilley, Jessica Gamache, Elizabeth Wieland
2012	Prosodic Realization of Focus in Statement and Question in Tibetan (Lhasa Dialect). Xiaxia Zhang, Bei Wang, Qian Wu, Yi Xu
2012	Prosodic context-based analysis of disfluencies. Helena Moniz, Fernando Batista, Isabel Trancoso, Ana Isabel Mata
2012	Prosodic measurements and question types in the Spontal corpus of Swedish dialogues. Sofia Strömbergsson, Jens Edlund, David House
2012	Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis. Alexander Sorin, Slava Shechtman, Vincent Pollet
2012	Q-Gaussian based spectral subtraction for robust speech recognition. Hilman Ferdinandus Pardede, Koichi Shinoda, Koji Iwano
2012	Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals. Christoph Norrenbrock, Florian Hinterleitner, Ulrich Heute, Sebastian Möller
2012	Quantitative Analysis of Pitch in Speech of Children with Neurodevelopmental Disorders. Géza Kiss, Jan P. H. van Santen, Emily Tucker Prud'hommeaux, Lois M. Black
2012	Query-by-Example using Speaker Content Graphs. William M. Campbell, Elliot Singer
2012	RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases. Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li
2012	Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech Recognition. Zoi Roupakia, Anton Ragni, Mark J. F. Gales
2012	Real-Time Lecture Transcription using ASR for Czech Hearing Impaired or Deaf Students. Petr Cerva, Jan Silovský, Jindrich Zdánský, Jan Nouza, Jirí Málek
2012	Real-time Implementation of Multi-band Frequency Compression for Listeners with Moderate Sensorineural Impairment. Nitya Tiwari, Prem C. Pandey, Pandurangarao N. Kulkarni
2012	Real-time Visualization of English Pronunciation on an IPA Chart Based on Articulatory Feature Extraction. Yurie Iribe, Takurou Mori, Kouichi Katsurada, Goh Kawai, Tsuneo Nitta
2012	Recurrent Neural Networks for Noise Reduction in Robust ASR. Andrew L. Maas, Quoc V. Le, Tyler M. O'Neil, Oriol Vinyals, Patrick Nguyen, Andrew Y. Ng
2012	Relative Importance of Temporal Envelope and Fine Structure Cues in Low- and High-Order Harmonic Regions for Mandarin Lexical-tone Recognition. Guangting Mai
2012	Residual Phase Cepstrum Coefficients with Application to Cross-lingual Speaker Verification. Michael T. Johnson, Jianglin Wang
2012	Resonator-based creaky voice detection. Thomas Drugman, John Kane, Christer Gobl
2012	Rethinking The Corpus: Moving towards Dynamic Linguistic Resources. Andrew Rosenberg
2012	Robust Event Detection From Spoken Content In Consumer Domain Videos. Stavros Tsakalidis, Xiaodan Zhuang, Roger Hsiao, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan
2012	Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum. Md. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy
2012	Robust Pitch Estimation Using l1-regularized Maximum Likelihood Estimation. Feng Huang, Tan Lee
2012	Robust Tracking for Automatic Reading Tutors. Emre Yilmaz, Dirk Van Compernolle, Hugo Van hamme
2012	Robust phoneme recognition based on biomimetic speech contours. Michael A. Carlin, Kailash Patil, Sridhar Krishna Nemala, Mounya Elhilali
2012	Robust triphone mapping for acoustic modeling. Milos Cernak, David Imseng, Hervé Bourlard
2012	Role of Prosody in Automatic Modality Recognition of Bangla Speech. Anal Warsi, Tulika Basu, Debasis Mazumdar
2012	Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization. Brian Kingsbury, Tara N. Sainath, Hagen Soltau
2012	Search Space Pruning Based on Anticipated Path Recombination in LVCSR. David Nolden, Ralf Schlüter, Hermann Ney
2012	Selection of TDOA Parameters for MDM Speaker Diarization. Beatriz Martínez-González, José Manuel Pardo, Julián D. Echeverry-Correa, José A. Vallejo-Pinto, Roberto Barra-Chicote
2012	Semi-Blind Model Adaptation using Piece-wise Energy Decay Curve for Large Reverberant Environments. Abdul Waheed Mohammed, Marco Matassoni, Hari Krishna Maganti, Maurizio Omologo
2012	Semi-Supervised Methods for Improving Keyword Search of Unseen Terms. Scott Novotney, Ivan Bulyko, Richard M. Schwartz, Sanjeev Khudanpur, Owen Kimball
2012	Sentence Detection Using Multiple Annotations. Ann Lee, James R. Glass
2012	Sibilant Speech Detection in Noise. Sira Gonzalez, Mike Brookes
2012	Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality Similarity. Yusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno
2012	Similarities in fundamental frequency in infant speech segmentation models. Ellen Marklund, Francisco Lacerda, Iris-Corinna Schwarz, Ulla Sundberg
2012	Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition. Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney
2012	Smile with a smile. Hugo Quené, Will Schuerman
2012	Sparse Bayesian Factor Analysis for Stereo-based Stochastic Mapping. Xiaodong Cui, Mohamed Afify, George Saon, Vaibhava Goel
2012	Sparse Probabilistic Linear Discriminant Analysis for Speaker Verification. Hai Yang, Chunyan Liang, Yunfei Xu, Lin Yang, Yonghong Yan
2012	Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space. Seong-Jun Hahm, Atsunori Ogawa, Masakiyo Fujimoto, Takaaki Hori, Atsushi Nakamura
2012	Speaker Clustering for a Mixture of Singing and Reading. Mahnoosh Mehrabani, John H. L. Hansen
2012	Speaker Clustering in Emotion Recognition. Ni Ding, Julien Epps
2012	Speaker Discrimination Ability of Glottal Waveform Features. Juan F. Torres, Elliot Moore
2012	Speaker Independent Single Channel Source Separation using Sinusoidal Features. Shivesh Ranjan, Karen L. Payton, Pejman Mowlaee
2012	Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network. Kartik Audhkhasi, Angeliki Metallinou, Ming Li, Shrikanth S. Narayanan
2012	Speaker Recognition for Children's Speech. Saeid Safavi, Maryam Najafian, Abualsoud Hanani, Martin J. Russell, Peter Jancovic, Michael J. Carey
2012	Speaker Verification Using Neighborhood Preserving Embedding. Chunyan Liang, Jinchao Yang, Lin Yang, Yonghong Yan
2012	Speaker diarization of overlapping speech based on silence distribution in meeting recordings. Sree Harsha Yella, Fabio Valente
2012	Speaker idiosyncratic rhythmic features in the speech signal. Volker Dellwo, Adrian Leemann, Marie-José Kolly
2012	Speaker-Dependent Voice Activity Detection Robust to Background Speech Noise. Shigeki Matsuda, Naoya Ito, Kosuke Tsujino, Hideki Kashioka, Shigeki Sagayama
2012	Speaker-adaptive visual speech synthesis in the HMM-framework. Dietmar Schabus, Michael Pucher, Gregor Hofer
2012	Spectral Intersections for Non-Stationary Signal Separation. Trausti T. Kristjansson, Thad Hughes
2012	Speech Activity Detection for Noisy Data Using Adaptation Techniques. Mohamed Omar
2012	Speech Data Clustering Based on Phoneme Error Trend for Unsupervised Acoustic Model Adaptation. Taichi Asami, Satoshi Kobashikawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2012	Speech Enhancement Using Sparse Convolutive Non-negative Matrix Factorization with Basis Adaptation. Michael Carlin, Nicolas Malyska, Thomas F. Quatieri
2012	Speech Enhancement With Bivariate Gamma Model. Atanu Saha, Tetsuya Shimamura
2012	Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments. Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
2012	Speech Enhancement for Android (SEA): A Speech Processing Demonstration Tool for Android Based Smart Phones and Tablets. Roger Chappel, Kuldip K. Paliwal
2012	Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis. Lei Xie, Yinqing Xu, Lilei Zheng, Qiang Huang, Bingfeng Li
2012	Speech Production-Perception Relationships in Children with Speech Delay. Kyoko Nagao, Mark Paullin, Vilena Livinsky, James B. Polikoff, Linda D. Vallino, Thierry G. Morlet, N. Carolyn Schanen, H. Timothy Bunnell
2012	Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment. Kyohei Odani, Longbiao Wang, Atsuhiko Kai
2012	Speech and speaker separation in human auditory cortex. Nima Mesgarani, Edward Chang
2012	Speech factorization for HMM-TTS based on cluster adaptive training. Javier Latorre, Vincent Wan, Mark J. F. Gales, Langzhou Chen, K. K. Chin, Kate M. Knill, Masami Akamine
2012	Speech modeling and processing by low-dimensional dynamic glottal models. Carlo Drioli, Andrea Calanca
2012	Speech restoration based on deep learning autoencoder with layer-wised pretraining. Xugang Lu, Shigeki Matsuda, Chiori Hori, Hideki Kashioka
2012	Speech synthesis using a non-maximally decimated filter bank for embedded systems. Nobuyuki Nishizawa, Tsuneo Kato
2012	Speech-in-noise intelligibility improvement based on spectral shaping and dynamic range compression. Tudor-Catalin Zorila, Varvara Kandia, Yannis Stylianou
2012	Speech/Nonspeech Segmentation in Web Videos. Ananya Misra
2012	SpeechMark: Landmark Detection Tool for Speech Analysis. Suzanne Boyce, Harriet J. Fell, Joel MacAuslan
2012	Spelling as a Complementary Strategy for Speech Recognition. Keith Vertanen, Per Ola Kristensson
2012	Spoken Dialogs With a Virtual Science Tutor. Wayne H. Ward, Daniel Bolaños, Ronald A. Cole
2012	Spoken Document Clustering Using Word Confusion Networks. Shajith Ikbal, Sachindra Joshi, Ashish Verma, Om D. Deshmukh
2012	Spoken Inquiry Discrimination Using Bag-of-Words for Speech-Oriented Guidance System. Haruka Majima, Rafael Torres, Yoko Fujita, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano
2012	Spontaneous-Speech Acoustic-Prosodic Features of Children with Autism and the Interacting Psychologist. Daniel Bone, Matthew P. Black, Chi-Chun Lee, Marian E. Williams, Pat Levitt, Sungbok Lee, Shrikanth S. Narayanan
2012	Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signals. Federico Alegre, Ravichander Vipperla, Nicholas W. D. Evans
2012	Study of Different Backends in a State-Of-the-Art Language Recognition System. Mikel Peñagarikano, Amparo Varona, Mireia Díez, Luis Javier Rodríguez-Fuentes, Germán Bordel
2012	Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification. Achintya Kumar Sarkar, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre
2012	Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription. Jan Silovský, Petr Cerva, Jindrich Zdánský, Jan Nouza
2012	Sub-band based Log-energy and Its Dynamic Range Stretching for Robust In-car Speech Recognition. Weifeng Li, Hervé Bourlard
2012	Subspace Gaussian Mixture Models Based on Noise Compensation for Speech Recognition. Mohamed Bouallegue, Driss Matrouf, Georges Linarès, Mickael Rouvier
2012	Subspace-Based Feature Representation and Learning for Language Recognition. Yu-Chin Shih, Hung-Shin Lee, Hsin-Min Wang, Shyh-Kang Jeng
2012	Subword speech recognition for detection of unseen words. Ivan Bulyko, Jose Herrero, Chris Mihelich, Owen Kimball
2012	Supervector LDA: A New Approach to Reduced-Complexity I-vector Language Recognition. Alan McCree, Bengt J. Borgstrom
2012	Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine. Hung-yi Lee, Yu-Yu Chou, Yow-Bang Wang, Lin-Shan Lee
2012	Supervised and unsupervised Web-based language model domain adaptation. Gwénolé Lecorvé, John Dines, Thomas Hain, Petr Motlícek
2012	Supervized Mixture of PLDA Models for Cross-Channel Speaker Verification. Konstantin Simonchik, Timur Pekhovsky, Andrey Shulipa, Anton Afanasyev
2012	Syllable perception depends on tone perception. Iris Chuoying Ouyang, Khalil Iskarous
2012	Synthetic F0 Can Effectively Convey Speaker ID in Delexicalized Speech. Eric Morley, Esther Klabbers, Jan P. H. van Santen, Alexander Kain, Seyed Hamidreza Mohammadi
2012	Synthetic References for Template-based ASR using posterior features. Serena Soldo, Mathew Magimai-Doss, Hervé Bourlard
2012	Synthetic Speech Discrimination using Pitch Pattern Statistics Derived from Image Analysis. Phillip L. De Leon, Bryan Stewart, Junichi Yamagishi
2012	Synthetic correction of deviant speech - children's perception of phonologically modified recordings of their own speech. Sofia Strömbergsson
2012	TDOA Estimation for Multiple Speakers in Underdetermined Case. Mariem Bouafif, Zied Lachiri
2012	Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings. Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll
2012	Temporal entrainment in overlapped speech: Cross-linguistic study. Marcin Wlodarczak, Juraj Simko, Petra Wagner
2012	Text-To-Speech Intelligibility Across Speech Rates. Ann K. Syrdal, H. Timothy Bunnell, Susan R. Hertz, Taniya Mishra, Murray F. Spiegel, Corine A. Bickley, Deborah Rekart, Matthew J. Makashay
2012	Text-dependent pathological voice detection. Gopala Krishna Anumanchipalli, Hugo Meinedo, Miguel M. F. Bugalho, Isabel Trancoso, Luís C. Oliveira, Alan W. Black
2012	The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear. Natalie Fecher
2012	The 2011 NIST Language Recognition Evaluation. Craig S. Greenberg, Alvin F. Martin, Mark A. Przybocki
2012	The Automatic Assessment of Non-native Prosody: Combining Classical Prosodic Analysis with Acoustic Modelling. Florian Hönig, Tobias Bocklet, Korbinian Riedhammer, Anton Batliner, Elmar Nöth
2012	The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance. Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel, Alberto Abad, David Martínez González, Jesús Antonio Villalba López, Alfonso Ortega, Eduardo Lleida
2012	The EHU Systems for the NIST 2011 Language Recognition Evaluation. Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Mireia Díez, Germán Bordel
2012	The Effect of Spectral Estimator on Common Spectral Measures for Sibilant Fricatives. Patrick Reidy, Mary E. Beckman
2012	The Effect of Use of Drugs on Speaker's Fundamental Frequency and Formants. Andrey N. Raev, Yuri Matveev, Tatiana Goloshchapova
2012	The Effects of Lexical Tones and Nasal Coda /-n/ to Sadness in Taiwan Hakka. Shao-Ren Lyu
2012	The F0 fall delay of lexical pitch accent in Japanese Infant-directed speech. Yoko Saikachi, Mafuyu Kitahara, Ken'ya Nishikawa, Ai Kanato, Reiko Mazuka
2012	The IIIT-H Indic Speech Databases. Kishore Prahallad, Naresh Kumar Elluru, Venkatesh Keri, Rajendran S, Alan W. Black
2012	The INTERSPEECH 2012 Speaker Trait Challenge. Björn W. Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Rob van Son, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi, Benjamin Weiss
2012	The Intelligibility of Lombard Speech: Communicative setting matters. Michael Fitzpatrick, Jeesun Kim, Chris Davis
2012	The Role of Creaky Voice in Mandarin Tone 2 and Tone 3 Perception. Rui Cao, Ratree Wayland, Edith Kaan
2012	The Role of Score Calibration in Speaker Recognition. George R. Doddington
2012	The Speech Recognition Virtual Kitchen: An Initial Prototype. Florian Metze, Eric Fosler-Lussier
2012	The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training. Xiaojun Qian, Helen M. Meng, Frank K. Soong
2012	The effect of dichotic processing on the perception of binaural cues. Akiko Amano-Kusumoto, Justin M. Aronoff, Motokuni Itoh, Sigfrid D. Soli
2012	The entropy of intoxicated speech - lexical creativity and heavy tongues. Uwe D. Reichel
2012	The log-Gabor method: speech classification using spectrogram image analysis. Harm Buisman, Eric O. Postma
2012	The processes underlying two frequent casual speech phenomena in Dutch: A production experiment. Iris Hanique, Mirjam Ernestus
2012	The production and perception of Estonian quantity degrees by native and non-native speakers. Lya Meister, Einar Meister
2012	Tied-State Mixture Language Model for WFST-based Speech Recognition. Hitoshi Yamamoto, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka
2012	Time Delay Estimation for Speech Signal Based on FOC-Spectrum. Hong Liu, Xiaofei Li
2012	Toward an Optimum Feature Set and HMM Model Parameters for Automatic Phonetic Alignment of Spontaneous Speech. Montri Karnjanadecha, Stephen A. Zahorian
2012	Towards Automated Annotation of Audio and Video Recordings by Application of Advanced Web-services. Przemyslaw Lenkiewicz, Dieter Van Uytvanck, Peter Wittenburg, Sebastian Drude
2012	Towards Empirical Dialog-State Modeling and its Use in Language Modeling. Nigel G. Ward, Alejandro Vega
2012	Towards Glottal Source Controllability in Expressive Speech Synthesis. Jaime Lorenzo-Trueba, Roberto Barra-Chicote, Tuomo Raitio, Nicolas Obin, Paavo Alku, Junichi Yamagishi, Juan Manuel Montero
2012	Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis. Leonardo Badino, Robert A. J. Clark
2012	Towards Recurrent Neural Networks Language Models with Linguistic and Contextual Features. Yangyang Shi, Pascal Wiggers, Catholijn M. Jonker
2012	Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization. Jaime Lorenzo-Trueba, Beatriz Martínez-González, Roberto Barra-Chicote, Verónica López-Ludeña, Javier Ferreiros, Junichi Yamagishi, Juan Manuel Montero
2012	Training Deep Nets with Imbalanced and Unlabeled Data. Jeffrey Berry, Ian R. Fasel, Luciano Fadiga, Diana Archangeli
2012	Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS. Ji He, Yao Qian, Frank K. Soong, Sheng Zhao
2012	Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy. Korin Richmond, Steve Renals
2012	Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR. Ramón Fernandez Astudillo, Alberto Abad, João Paulo Neto
2012	Unconstrained Speech Separation by Composition of Longest Segments. Ji Ming, Ramji Srinivasan, Danny Crookes
2012	Unsupervised Acoustic Analyses of Normal and Lombard Speech, with Spectral Envelope Transformation to Improve Intelligibility. Elizabeth Godoy, Yannis Stylianou
2012	Unsupervised Deep Belief Features for Speech Translation. Sameer Maskey, Bowen Zhou
2012	Unsupervised NAP Training Data Design for Speaker Recognition. Hanwu Sun, Bin Ma
2012	Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast. Johann Poignant, Hervé Bredin, Viet Bac Le, Laurent Besacier, Claude Barras, Georges Quénot
2012	Unveiling the Acoustic Properties that Describe the Valence Dimension. Carlos Busso, Tauhidur Rahman
2012	Using Bayesian Networks to find relevant context features for HMM-based speech synthesis. Heng Lu, Simon King
2012	Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition. Yi Ren Leng, Tran Huy Dat
2012	Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders. Christophe Veaux, Junichi Yamagishi, Simon King
2012	Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation Models. Andrew Rosenberg
2012	Using Quality Ratings to Predict Modality Choice in Multimodal Systems. Ina Wechsung, Klaus-Peter Engelbrecht, Sebastian Möller
2012	Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. Peter A. Heeman, Jordan Fryer, Rebecca Lunsford, Andrew Rueckert, Ethan Selfridge
2012	Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR. Yang Sun, Bert Cranen, Jort F. Gemmeke, Louis ten Bosch, Lou Boves, Mathew M. Doss
2012	Using Sub-word-level Information for Confidence Estimation with Conditional Random Field Models. Matthew Stephen Seigel, Philip C. Woodland
2012	Using Time-Synchronous Phone Co-occurrences in a SVM-Phonotactic Dialect Recognition System. Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel, Mireia Díez
2012	Using broad phonetic classes to guide search in automatic speech recognition. Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier
2012	Using context-free grammars for embedded speech recognition with Weighted Finite-State Transducers. Frank Duckhorn, Rüdiger Hoffmann
2012	Using i-Vector Space Model for Emotion Recognition. Rui Xia, Yang Liu
2012	Using magnetic resonance to image the pharynx during Arabic speech: Static and dynamic aspects. Ryan Shosted, Bradley P. Sutton, Abbas Benmamoun
2012	Using spectral measures to differentiate Mandarin and Korean sibilant fricatives. Jeffrey Kallay, Jeffrey J. Holliday
2012	Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech. Emma Jokinen, Paavo Alku, Martti Vainio
2012	Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse Filtering. Harri Auvinen, Tuomo Raitio, Samuli Siltanen, Paavo Alku
2012	Verifying Session Level Pronunciation Accuracy in a Speech Therapy Application. Shou-Chun Yin, Richard C. Rose, Yun Tang
2012	VisArtico: a visualization tool for articulatory data. Slim Ouni, Loic Mangeonjean, Ingmar Steiner
2012	Visualizing tool for evaluating inter-label similarity in prosodic labeling experiments. David Escudero Mancebo, Eva Estebas-Vilaplana
2012	Vocal Tremor Measurement Based on Autocorrelation of Contours. Markus Brückl
2012	Vocal-Source Biomarkers for Depression: A Link to Psychomotor Activity. Thomas F. Quatieri, Nicolas Malyska
2012	Voice Activity Detection Using Speech Recognizer Feedback. Kit Thambiratnam, Weiwu Zhu, Frank Seide
2012	Voice Production Mechanisms of Vibrato in Noh. Ikuyo Yoshinaga, Jiangping Kong
2012	Voice Query Refinement. Cyril Allauzen, Edward Benson, Ciprian Chelba, Michael Riley, Johan Schalkwyk
2012	Voice source analysis using biomechanical modeling and glottal inverse filtering. Alan Pinheiro, Tuomo Raitio, Danyane Gomes, Paavo Alku
2012	Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis. Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi
2012	Vowels Produced by Sliding Three-tube Model with Different Lengths. Takayuki Arai
2012	Ways to Implement Global Variance in Statistical Speech Synthesis. Hanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj
2012	Where did I go wrong?: Identifying troublesome segments for speaker diarization systems. Mary Tai Knox, Nikki Mirghafori, Gerald Friedland
2012	Where to associate stressed additive particles? Evidence from speech prosody. Bettina Braun
2012	White Listing and Score Normalization for Keyword Spotting of Noisy Speech. Bing Zhang, Richard M. Schwartz, Stavros Tsakalidis, Long Nguyen, Spyros Matsoukas
2012	Whole-Word Recognition from Articulatory Movements for Silent Speech Interfaces. Jun Wang, Ashok Samal, Jordan R. Green, Frank Rudzicz
2012	Wideband Parametric Speech Synthesis Using Warped Linear Prediction. Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku
2012	Word Discovery with Beta Process Factor Analysis. Niklas Vanhainen, Giampiero Salvi
2012	Word Prominence Detection using Robust yet Simple Prosodic Features. Taniya Mishra, Vivek Kumar Rangarajan Sridhar, Alistair Conkie
2012	Word Relevance Modeling for Speech Recognition. Kuan-Yu Chen, Hao-Chin Chang, Berlin Chen, Hsin-Min Wang
2012	Sparse banded precision matrices for low resource speech recognition. Weibin Zhang, Pascale Fung