INTERSPEECH A

678 papers

Year  Title / Authors
2012"Help Me, I Need More User Tests!" User Simulations as Supportive Tool in the Development Process of Spoken Dialogue Systems.
Florian Kretzschmar, Sebastian Möller
2012 13th Annual Conference of the International Speech Communication Association, INTERSPEECH 2012, Portland, Oregon, USA, September 9-13, 2012
2012A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures.
Takafumi Hattori, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda
2012A Case Study: Detecting Counselor Reflections in Psychotherapy for Addictions using Linguistic Features.
Dogan Can, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan
2012A Comparison of Classification Paradigms for Speaker Likeability Determination.
Nicholas Cummins, Julien Epps, Jia Min Karen Kua
2012A Continuous Prominence Score Based On Acoustic Features.
Jean-Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Antoine Auchlin
2012A Conversational Movie Search System Based on Conditional Random Fields.
Jingjing Liu, Scott Cyphers, Panupong Pasupat, Ian McGraw, James R. Glass
2012A Corpus-Based Study of Interruptions in Spoken Dialogue.
Agustín Gravano, Julia Hirschberg
2012A Correlational Discriminant Approach to Feature Extraction for Robust Speech Recognition.
Vikrant Singh Tomar, Richard C. Rose
2012A Data-driven Approach to Understanding Spoken Route Directions in Human-Robot Dialogue.
Raveesh Meena, Gabriel Skantze, Joakim Gustafson
2012A Discriminative Classification-Based Approach to Information State Updates for a Multi-Domain Dialog System.
Dilek Hakkani-Tür, Gökhan Tür, Larry P. Heck, Ashley Fidler, Asli Celikyilmaz
2012A Fast-Converging Adaptive Frequency-Domain MVDR Beamformer for Speech Enhancement.
Shengkui Zhao, Douglas L. Jones
2012A Feature Space Transformation Method for Personalization using Generalized I-Vector Clustering.
Kaisheng Yao, Yifan Gong, Chaojun Liu
2012A Frame Pruning Approach for Paralinguistic Recognition Tasks.
Johannes Wagner, Florian Lingenfelser, Elisabeth André
2012A HMM approach to residual estimation for high resolution voice conversion.
Winston S. Percybrooks, Elliot Moore
2012A Hierarchical Bayesian Approach for Semi-supervised Discriminative Language Modeling.
Yik-Cheung Tam, Paul Vozila
2012A Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition.
Yeming Xiao, Zhen Zhang, Shang Cai, Jielin Pan, Yonghong Yan
2012A Natural In-Car Speech Interface to Internet Services Using Hybrid ASR.
Hansjörg Hofmann, Ute Ehrlich, Klaus Bader, Ilona Nothelfer, André Berton
2012A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody.
Chiao-Hua Hsieh, Chen-Yu Chiang, Yih-Ru Wang, Hsiu-Min Yu, Sin-Horng Chen
2012A Non-Uniform Filterbank for Speaker Recognition.
Jia Min Karen Kua, Tharmarajah Thiruvaran, Eliathamby Ambikairajah
2012A Novel Confidence Measure Based on Context Consistency for Spoken Term Detection.
Haiyang Li, Jiqing Han, Tieran Zheng, Guibin Zheng
2012A Preliminary Study on Cross-Databases Emotion Recognition using the Glottal Features in Speech.
Rui Sun, Elliot Moore II
2012A Random, Semantically Appropriate Sentence Generator for Speaker Verification.
Jason Lilley, Amanda Stent, Ilija Zeljkovic
2012A Robust Unsupervised Arousal Rating Framework using Prosody with Cross-Corpora Evaluation.
Daniel Bone, Chi-Chun Lee, Shrikanth S. Narayanan
2012A Rule Based Pronunciation Generator and Regional Accent Databank for Portuguese.
Simone Ashby, Sílvia Barbosa, Silvia Brandão, José Pedro Ferreira, Maarten Janssen, Catarina Silva, Mário Eduardo Viaro
2012A Self-Learning Assistive Vocal Interface Based on Vocabulary Learning and Grammar Induction.
Jort F. Gemmeke, Janneke van de Loo, Guy De Pauw, Joris Driesen, Hugo Van hamme, Walter Daelemans
2012A Sequential Bayesian Dialog Agent for Computational Ethnography.
Abe Kazemzadeh, James Gibson, Juanchen Li, Sungbok Lee, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2012A Simple Hybrid Acoustic / Morphologically-Constrained Technique for the Synthesis of Stop Consonants in Various Vocalic Contexts.
Frédéric Berthommier, Laurent Girin, Louis-Jean Boë
2012A Sparse Plus Low Rank Maximum Entropy Language Model.
Brian Hutchinson, Mari Ostendorf, Maryam Fazel
2012A Specialized WFST Approach for Class Models and Dynamic Vocabulary.
Paul R. Dixon, Chiori Hori, Hideki Kashioka
2012A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components.
Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Kunio Kashino
2012A Study of Mutual Information for GMM-Based Spectral Conversion.
Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen
2012A Study on Using Word-Level HMMs to Improve ASR Performance over State-of-the-Art Phone-Level Acoustic Modeling for LVCSR.
I-Fan Chen, Chin-Hui Lee
2012A Triple-Microphone Real-Time Speech Enhancement Algorithm Based on Approximate Array Analytical Solutions.
Meng Yu, Ryan Ritch, Jack Xin
2012A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition.
Bo Li, Khe Chai Sim
2012A Two-step NMF Based Algorithm for Single Channel Speech Separation.
Shuo Wang, Wenjun Wu
2012A Weighted Combination of Speech with Text-based Models for Arabic Diacritization.
Aisha S. Azim, Xiaoxuan Wang, Khe Chai Sim
2012A comparative study of adaptive, automatic recognition of disordered speech.
Heidi Christensen, Stuart P. Cunningham, Charles Fox, Phil D. Green, Thomas Hain
2012A factorized representation of FMLLR transform based on QR-decomposition.
Shakti P. Rath, Martin Karafiát, Ondrej Glembek, Jan Cernocký
2012A full-band adaptive harmonic representation of speech.
Gilles Degottex, Yannis Stylianou
2012A method of speaker identification based on phoneme mean F-ratio contribution.
Songgun Hyon, Hongcui Wang, Chen Zhao, Jianguo Wei, Jianwu Dang
2012A methodology for the study of rhythm in drummed forms of languages: application to Bora Manguaré of Amazon.
Julien Meyer, Laure Dentel, Frank Seifart
2012A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio masking.
Shan Liang, Wei Jiang, Wenju Liu
2012A signal-separation-based array postfilter for distant speech recognition.
Rita Singh, Ken'ichi Kumatani, John W. McDonough, Chen Liu
2012A simple and efficient method to align very long speech signals to acoustically imperfect transcriptions.
Germán Bordel, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Amparo Varona
2012A speaker-role based approach for detecting politicians in TV broadcast news.
Delphine Charlet, Géraldine Damnati
2012A speech parameter generation algorithm using local variance for HMM-based speech synthesis.
Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
2012A tutorial dialogue system with unrestricted spoken input.
Peter Bell, Myroslava O. Dzikovska, Amy Isard
2012Accelerated Batch Learning of Convex Log-linear Models for LVCSR.
Simon Wiesler, Ralf Schlüter, Hermann Ney
2012Accentual Transfer from Swiss-German to French. A Study of "Français Fédéral".
Mathieu Avanzi, Pauline Dubosson, Sandra Schwab, Nicolas Obin
2012Accounting for Speech Rate in Spoken Word Recognition.
David Cheng-Huan Li, Elsi Kaiser
2012Acoustic Cues of Vowel Quality to Coda Nasal Perception in Southern Min.
Ying Chen, Vsevolod Kapatsinski, Susan Guion-Anderson
2012Acoustic Feature-based Non-scorable Response Detection for an Automated Speaking Proficiency Assessment.
Je Hun Jeon, Su-Youn Yoon
2012Acoustic Features for Classification Based Speech Separation.
Yuxuan Wang, Kun Han, DeLiang Wang
2012Acoustic and Data-driven Features for Robust Speech Activity Detection.
Samuel Thomas, Sri Harish Reddy Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab A. Shamma, Tim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas
2012Acoustic and Perceptual Similarity in Coarticulatorily Nasalized Vowels.
Rebecca Scarborough, Georgia Zellou
2012Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition.
Zixing Zhang, Björn W. Schuller
2012Addressing Confusions in Spoken Language in ESL Pronunciation Tutors.
Oscar Saz, Maxine Eskénazi
2012Advances in combined electro-optical palatography.
Peter Birkholz, Philippe Daechert, Christiane Neuschaefer-Rube
2012Advances in noise robust digit recognition using hybrid exemplar-based techniques.
Jort F. Gemmeke, Hugo Van hamme
2012Age Estimation from Telephone Speech using i-vectors.
Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van hamme, David A. van Leeuwen
2012Aligning manifolds to model the earliest phonological abstraction in infant-caretaker vocal imitation.
Andrew R. Plummer
2012Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?
Niko Moritz, Jörn Anemüller, Birger Kollmeier
2012Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis.
Zhengqi Wen, Jianhua Tao
2012An Auditory Inspired Multimodal Framework for Speech Enhancement.
Majid Mirbagheri, Sahar Akram, Shihab A. Shamma
2012An Automatic Child-Directed Speech Detector for the Study of Child Language Development.
Soroush Vosoughi, Deb Roy
2012An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis.
Shinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai, Sakriani Sakti, Satoshi Nakamura
2012An Information-Extraction Approach to Speech Analysis and Processing.
Chin-Hui Lee
2012An MRI study of the oral articulation of European Portuguese nasal vowels.
Catarina Oliveira, Paula Martins, Samuel S. Silva, António J. S. Teixeira
2012An On-Line, Cloud-Based Spanish-Spanish Sign Language Translation System.
Javier Tejedor, Fernando J. López-Colino, Jordi Porta, José Colás
2012An Online Generated Transducer to Increase Dialog Manager Coverage.
Joaquin Planells, Lluís F. Hurtado, Emilio Sanchis, Encarna Segarra
2012An alignment matching method to explore pseudosyllable properties across different corpora.
Raymond W. M. Ng, Thomas Hain, Keikichi Hirose
2012Analysis of Mimicry Speech.
D. Gomathi, Sathya Adithya Thati, Karthik Venkat Sridaran, Bayya Yegnanarayana
2012Analysis of Temporal Resolution in Frequency Domain Linear Prediction.
Sriram Ganapathy, Hynek Hermansky
2012Analysis of speaker clustering strategies for HMM-based speech synthesis.
Rasmus Dall, Christophe Veaux, Junichi Yamagishi, Simon King
2012Analysis of the Characteristics of Talk-show TV Programs.
Fabio Brugnara, Daniele Falavigna, Diego Giuliani, Roberto Gretter
2012Analysis of vocal tremor and jitter by empirical mode decomposition of glottal cycle length time series.
Christophe Mertens, Francis Grenez, Jean Schoentgen
2012Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech Synthesis.
Ranniery Maia
2012Analyzing and Interpreting Automatically Learned Rules Across Dialects.
Nancy F. Chen, Wade Shen, Joseph P. Campbell
2012Anchor Models and WCCN Normalization For Speaker Trait Classification.
Yazid Attabi, Pierre Dumouchel
2012Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus.
Fabio Valente, Samuel Kim, Petr Motlícek
2012Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition.
Navdeep Jaitly, Patrick Nguyen, Andrew W. Senior, Vincent Vanhoucke
2012Application of Structural Events Detected on ASR Outputs for Automated Speaking Assessment.
Lei Chen, Su-Youn Yoon
2012Applying multiview learning algorithms to human-human conversation classification.
Sokol Koço, Cécile Capponi, Frédéric Béchet
2012Arabic Dialect Identification - 'Is the Secret in the Silence?' and Other Observations.
Hynek Boril, Abhijeet Sangwan, John H. L. Hansen
2012Are Sparse Representations Rich Enough for Acoustic Modeling?
Oriol Vinyals, Li Deng
2012Articulatory Feature based Multilingual MLPs for Low-Resource Speech Recognition.
Yanmin Qian, Jia Liu
2012Articulatory Strategies in Obstruent Production in Mandarin Esophageal Speech.
Fang Hu, Yungang Wu, Wen Xu, Demin Han
2012Articulatory VCV Synthesis from EMA Data.
Asterios Toutios, Shinji Maeda
2012Articulatory differences between oral and nasal vowels based on the simulation of a speaker-adaptive articulatory model.
Panying Rong, Ryan Shosted, David Kuehn
2012Articulatory speaker normalisation based on MRI-data using three-way linear decomposition methods.
Julián Andrés Valdés Vargas, Pierre Badin, Laurent Lamalle
2012Assessing agreement level between forced alignment models with data from endangered language documentation corpora.
Christian DiCanio, Hosung Nam, Douglas H. Whalen, H. Timothy Bunnell, Jonathan D. Amith, Rey Castillo García
2012Assessment of Disordered Voices Using Empirical Mode Decomposition in the Log-Spectral Domain.
Abdellah Kacha, Francis Grenez, Jean Schoentgen
2012Assessment of user simulators for spoken dialogue systems by means of subspace multidimensional clustering.
Zoraida Callejas, David Griol, Klaus-Peter Engelbrecht
2012Asymmetries in the perception of synthesized speech.
Anna C. Janska, Erich Schröger, Thomas Jacobsen, Robert A. J. Clark
2012Audio and Contact Microphones for Cough Detection.
Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Anne-Sophie Aubriot, Patrick Lebecque, Thierry Dutoit
2012Audio-visual Evaluation and Detection of Word Prominence in a Human-Machine Interaction Scenario.
Martin Heckmann
2012Audiovisual correlates of basic emotions in blind and sighted people.
Marc Swerts, Kitty Leuverink, Madelène Munnik, Vera Nijveld
2012Audiovisual discrimination of CV syllables: a simultaneous fMRI-EEG study.
Cyril Dubois, Rudolph Sock
2012Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations.
Christos Koniaris, Olov Engwall, Giampiero Salvi
2012Auditory-visual speech to infants and adults: signals and correlations.
Jeesun Kim, Chris Davis, Christine Kitamura
2012Automated Dysarthria Severity Classification for Improved Objective Intelligibility Assessment of Spastic Dysarthric Speech.
Milton Orlando Sarria-Paja, Tiago H. Falk
2012Automatic Detection of High Vocal Effort in Telephone Speech.
Jouni Pohjalainen, Tuomo Raitio, Hannu Pulakka, Paavo Alku
2012Automatic Error Recovery for Pronunciation Dictionaries.
Tim Schlippe, Sebastian Ochs, Ngoc Thang Vu, Tanja Schultz
2012Automatic Measurement of Positive and Negative Voice Onset Time.
Katharine Henry, Morgan Sonderegger, Joseph Keshet
2012Automatic Phoneme Segmentation Using Auditory Attention Features.
Ozlem Kalinli
2012Automatic Pronunciation Error Detection Based on Extended Pronunciation Space Using the Unsupervised Clustering of Pronunciation Errors.
Long Zhang, Haifeng Li
2012Automatic Speech Segmentation Using Probabilistic Latent Component Modeling.
Sayan Ghosh, T. V. Sreenivas
2012Automatic Tone Assessment of Non-Native Mandarin Speakers.
Jian Cheng
2012Automatic Topology Generation of Glottal Source HMM.
Akira Sasou
2012Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts.
Yuya Akita, Makoto Watanabe, Tatsuya Kawahara
2012Automatic Vocabulary Adaptation Based on Semantic Similarity and Speech Recognition Confidence Measure.
Shoko Yamahata, Yoshikazu Yamaguchi, Atsunori Ogawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2012Automatic detection of conflict escalation in spoken conversations.
Samuel Kim, Sree Harsha Yella, Fabio Valente
2012Automatic detection of hypernasal speech signals using nonlinear and entropy measurements.
Juan Rafael Orozco-Arroyave, Julián D. Arias-Londoño, Jesús Francisco Vargas-Bonilla, Elmar Nöth
2012Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions.
Harish Arsikere, Gary K. F. Leung, Steven M. Lulich, Abeer Alwan
2012Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations.
Xinhui Zhou, Daniel Garcia-Romero, Nima Mesgarani, Maureen L. Stone, Carol Y. Espy-Wilson, Shihab A. Shamma
2012Automatic transcription error recovery for Person Name Recognition.
Richard Dufour, Géraldine Damnati, Delphine Charlet, Frédéric Béchet
2012Automatic word naming recognition for treatment and assessment of aphasia.
Alberto Abad, Anna Pompili, Ângela Costa, Isabel Trancoso
2012Automating Crowd-supervised Learning for Spoken Language Systems.
Ian McGraw, Scott Cyphers, Panupong Pasupat, Jingjing Liu, James R. Glass
2012Average Spectrotemporal Structure of Continuous Speech Matches with the Frequency Resolution of Human Hearing.
Okko Räsänen
2012Bag-of-Audio-Words Approach for Multimedia Event Classification.
Stephanie Pancoast, Murat Akbacak
2012Based on Isolated Saliency or Causal Integration? Toward a Better Understanding of Human Annotation Process using Multiple Instance Learning and Sequential Probability Ratio Test.
Chi-Chun Lee, Athanasios Katsamanis, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2012Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data.
Alexander Krueger, Oliver Walter, Volker Leutnant, Reinhold Haeb-Umbach
2012Bayesian Group Sparse Learning for Nonnegative Matrix Factorization.
Jen-Tzung Chien, Hsin-Lung Hsieh
2012Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion.
Na Li, Yu Qiao
2012Beamforming using uniform circular arrays for distant speech recognition in reverberant environments and double talk scenarios.
Hannes Pessentheiner, Stefan Petrik, Harald Romsdorfer
2012Bilinear Factor Analysis for iVector Based Speaker Verification.
Yun Lei, Lukás Burget, Nicolas Scheffer
2012Binary Mask Estimation for Improved Speech Intelligibility in Reverberant Environments.
Oldooz Hazrati, Jaewook Lee, Philipos C. Loizou
2012Binaural Noise Reduction Using Frequency-Warped FIR Filters.
Jorge I. Marin-Hurtado, David V. Anderson
2012Boosting Classification Based Speech Separation Using Temporal Dynamics.
Yuxuan Wang, DeLiang Wang
2012C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic Speech.
Mauro Nicolao, Javier Latorre, Roger K. Moore
2012CRF-based Diacritisation of Colloquial Arabic for Automatic Speech Recognition.
Sarah Al-Shareef, Thomas Hain
2012Calibration of probabilistic age recognition.
David A. van Leeuwen, Mohamad Hasan Bahari
2012Caller Response Timing Patterns in Spoken Dialog Systems.
Silke M. Witt
2012Can litheners retune native categories acroth a thoneme boundary?
Michael D. Tyler, Mona Faris
2012Can modified casual speech reach the intelligibility of clear speech?
Maria Koutsogiannaki, Michèle Pettinato, Cassie Mayo, Varvara Kandia, Yannis Stylianou
2012Characterizing Covert Articulation in Apraxic Speech Using real-time MRI.
Christina Hagedorn, Michael I. Proctor, Louis Goldstein, Maria Luisa Gorno-Tempini, Shrikanth S. Narayanan
2012Children's Productions of Multi-Syllabic Lexical Stress Patterns in Different Prosodic Positions.
Irina Shport
2012Classification of Stressed Speech Using Physical Parameters Derived from Two-Mass Model.
Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda
2012Classifying Skewed Data: Importance Weighting to Optimize Average Recall.
Andrew Rosenberg
2012ClippyScript: A Programming Language for Multi-Domain Dialogue Systems.
Frank Seide, Sean McDirmid
2012Co-occurrence of reduced word forms in natural speech.
Malte C. Viebahn, Mirjam Ernestus, James M. McQueen
2012Coherent Topic Transition in a Conversational Agent.
Daniel Macías-Galindo, Wilson Wong, Lawrence Cavedon, John Thangarajah
2012Combination of Multiple Speech Dimensions for Automatic Assessment of Dysarthric Speech Intelligibility.
Myung Jong Kim, Hoirin Kim
2012Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR.
Yang Sun, Mathew M. Doss, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves
2012Combining Acoustic Data Driven G2P and Letter-to-Sound Rules for Under Resource Lexicon Generation.
Ramya Rasipuram, Mathew Magimai-Doss
2012Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise.
Felix Weninger, Martin Wöllmer, Björn W. Schuller
2012Combining Ranking and Classification to Improve Emotion Recognition in Spontaneous Speech.
Houwei Cao, Ragini Verma, Ani Nenkova
2012Combining frame and segment based models for environmental sound classification.
Pengfei Hu, Wenju Liu, Wei Jiang
2012Combining multiple high quality corpora for improving HMM-TTS.
Vincent Wan, Javier Latorre, K. K. Chin, Langzhou Chen, Mark J. F. Gales, Heiga Zen, Kate M. Knill, Masami Akamine
2012Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech.
Ali Alpan, Jean Schoentgen, Francis Grenez
2012Compact Audio Representation for Event Detection in Consumer Media.
Xiaodan Zhuang, Stavros Tsakalidis, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan
2012Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English.
Tomoko Nariai, Kazuyo Tanaka, Tatsuya Kawahara
2012Comparing different acoustic modeling techniques for multilingual boosting.
David Imseng, John Dines, Petr Motlícek, Philip N. Garner, Hervé Bourlard
2012Comparing transcription agreement on non-native English speech corpus between native and non-native annotators.
Hyuksu Ryu, Sunhee Kim, Minhwa Chung
2012Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks.
Stefan Hahn, Paul Vozila, Maximilian Bisani
2012Compensating for Ageing and Quality variation in Speaker Verification.
Finnian Kelly, Andrzej Drygajlo, Naomi Harte
2012Compensation of Intrinsic Variability with Factor Analysis Modeling for Robust Speaker Verification.
Sheng Chen, Mingxing Xu
2012Complementary Phone Error Training.
Frank Diehl, Philip C. Woodland
2012Computational Modelling of the Recognition of Foreign-Accented Speech.
Odette Scharenborg, Marijt J. Witteman, Andrea Weber
2012Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning.
Jun Deng, Björn W. Schuller
2012Confidence for Speaker Diarization using PCA Spectral Ratio.
Orith Toledo-Ronen, Hagai Aronowitz
2012Confidence measure for speech indexing based on Latent Dirichlet Allocation.
Grégory Senay, Georges Linarès
2012Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis.
Xiang Yin, Zhen-Hua Ling, Ming Lei, Li-Rong Dai
2012Consonantal space area in Children with a Cleft Palate An acoustic Study.
Marion Bechet, Fabrice Hirsch, Camille Fauth, Rudolph Sock
2012Constrained Maximum Mutual Information Dimensionality Reduction for Language Identification.
Shuai Huang, Glen A. Coppersmith, Damianos G. Karakos
2012Constrained Multichannel Speech Dereverberation.
Meng Yu, Frank K. Soong
2012Consumer-level multimedia event detection through unsupervised audio signal modeling.
Byungki Byun, Ilseo Kim, Sabato Marco Siniscalchi, Chin-Hui Lee
2012Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both?
Zoltán Tüske, Ralf Schlüter, Hermann Ney, Martin Sundermeyer
2012Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface.
Thomas Hueber, Gérard Bailly, Bruce Denby
2012Continuous Digit Recognition in Noise: Reservoirs can do an excellent job!
Azarakhsh Jalalvand, Fabian Triefenbach, Jean-Pierre Martens
2012Contrasting Cues to Verbal and Non-Verbal Backchannels in Multi-lingual Dyadic Rapport.
Gina-Anne Levow, Susan Duncan
2012Contrastive intonation in autism: The effect of speaker- and listener-perspective.
Constantijn Kaland, Emiel Krahmer, Marc Swerts
2012Contribution of Spectral Shapes to Tone Perception.
Natthawut Kertkeidkachorn, Surapol Vorapatratorn, Sirinart Tangruamsub, Proadpran Punyabukkana, Atiwong Suchato
2012Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition.
Gwénolé Lecorvé, Petr Motlícek
2012Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization.
Jürgen T. Geiger, Ravichander Vipperla, Simon Bozonnet, Nicholas W. D. Evans, Björn W. Schuller, Gerhard Rigoll
2012Correlation Between Model-based Approximations of Grounding-related Cognition and User Judgments.
Klaus-Peter Engelbrecht, Sebastian Möller
2012Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers.
Hiroaki Hatano, Tatsuya Kitamura, Hironori Takemoto, Parham Mokhtari, Kiyoshi Honda, Shinobu Masaki
2012Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition.
Ning Ma, Jon Barker
2012Cries and Whispers - Classification of Vocal Effort in Expressive Speech.
Nicolas Obin
2012Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data.
Sheng Li, Lan Wang
2012Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition.
Yanmin Qian, Jia Liu
2012Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation.
Viviane de Franca Oliveira, Sayaka Shiota, Yoshihiko Nankaku, Keiichi Tokuda
2012Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training.
Thomas Hueber, Atef Ben Youssef, Gérard Bailly, Pierre Badin, Frédéric Elisei
2012Data-driven Posterior Features for Low Resource Speech Recognition Applications.
Samuel Thomas, Sriram Ganapathy, Aren Jansen, Hynek Hermansky
2012Decoding of Uncertain Features Using the Posterior Distribution of the Clean Data for Robust Speech Recognition.
Ahmed Hussen Abdelaziz, Dorothea Kolossa
2012Deep Architectures for Articulatory Inversion.
Benigno Uria, Iain Murray, Steve Renals, Korin Richmond
2012Demonstration of Advanced Multi-Modal, Network-Centric Communication Management Suite.
Victor S. Finomore
2012Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition.
Tatsuya Kawahara, Randy Gomez
2012Deriving conversation-based features from unlabeled speech for discriminative language modeling.
Damianos G. Karakos, Brian Roark, Izhak Shafran, Kenji Sagae, Maider Lehr, Emily Tucker Prud'hommeaux, Puyang Xu, Nathan Glenn, Sanjeev Khudanpur, Murat Saraclar, Daniel M. Bikel, Mark Dredze, Chris Callison-Burch, Yuan Cao, Keith B. Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley
2012Describing the development of intonational categories using a target-oriented parametric approach.
Britta Lintfert, Bernd Möbius
2012Descriptive Vocabulary Development for Degraded Speech.
Dushyant Sharma, Gaston Hilkhuysen, Patrick A. Naylor, Nikolay D. Gaubitch, Mark A. Huckvale, Mike Brookes
2012Designing a spoken language interface for a tutorial dialogue system.
Peter Bell, Myroslava O. Dzikovska, Amy Isard
2012Detecting Acronyms from Capital Letter Sequences in Spanish.
Rubén San Segundo, Juan Manuel Montero, Verónica López-Ludeña, Simon King
2012Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition.
Zhizheng Wu, Chng Eng Siong, Haizhou Li
2012Detecting Intelligibility by Linear Dimensionality Reduction and Normalized Voice Quality Hierarchical Features.
Dong-Yan Huang, Yongwei Zhu, Dajun Wu, Rongshan Yu
2012Detecting OOV Named-Entities in Conversational Speech.
Rohit Kumar, Rohit Prasad, Sankaranarayanan Ananthakrishnan, Aravind Namandi Vembu, David Stallard, Stavros Tsakalidis, Prem Natarajan
2012Detecting System-directed Utterances using Dialogue-level Features.
Kazunori Komatani, Akira Hirano, Mikio Nakano
2012Detection and Positioning of Overlapped Sounds in a Room Environment.
Rupayan Chakraborty, Climent Nadeu, Taras Butko
2012Detection of Transition Segments in VCV Utterances for Estimation of the Place of Closure of Oral Stops for Speech Training.
K. S. Nataraj, Prem C. Pandey
2012Developing a Speech Activity Detection System for the DARPA RATS Program.
Tim Ng, Bing Zhang, Long Nguyen, Spyros Matsoukas, Xinhui Zhou, Nima Mesgarani, Karel Veselý, Pavel Matejka
2012Development and Evaluation of Automatic Punctuation for French and English Speech-to-Text.
Jáchym Kolár, Lori Lamel
2012Deviation measure of waveform symmetry and its application to high-speed and temporally-fine F0 extraction for vocal sound texture manipulation.
Hideki Kawahara, Masanori Morise, Ryuichi Nisimura, Toshio Irino
2012Diagnostic Prediction of Transmitted Speech Quality: A New Framework for Signal-based and Parametric Models.
Sebastian Möller, Marcel Wältermann, Nicolas Côté
2012Dialectal and generational variations in vowels in spontaneous speech.
Robert Allen Fox, Ewa Jacewicz
2012DiarTk : An Open Source Toolkit for Research in Multistream Speaker Diarization and its Application to Meetings Recordings.
Deepu Vijayasenan, Fabio Valente
2012Direction of Arrival Estimation Based on Subband Weighting for Noisy Conditions.
Wei Xue, Wenju Liu
2012Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation.
Tomoki Koriyama, Takashi Nose, Takao Kobayashi
2012Discrimination of Linguistic and Non-Linguistic Vocalizations in Spontaneous Speech: Intra- and Inter-Corpus Perspectives.
Felix Weninger, Björn W. Schuller
2012Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker Verification.
Chunyan Liang, Xiang Zhang, Lin Yang, Yonghong Yan
2012Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation.
Ting-Yao Hu, Yu Tsao, Lin-Shan Lee
2012Discriminative Reranking for LVCSR Leveraging Invariant Structure.
Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu
2012Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech.
Chao Weng, Biing-Hwang Juang, Daniel Povey
2012Discriminative feature-space transforms using deep neural networks.
George Saon, Brian Kingsbury
2012Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networks.
Preethi Jyothi, Eric Fosler-Lussier, Karen Livescu
2012Discriminatively trained phoneme confusion model for keyword spotting.
Panagiota Karanasou, Lukás Burget, Dimitra Vergyri, Murat Akbacak, Arindam Mandal
2012Disentangling lexical, morphological, syntactic and semantic influences on German prominence - Evidence from a production study.
Barbara Samlowski, Petra Wagner, Bernd Möbius
2012Distance-Dependent Noise Reduction for Two-Channel Microphones.
Thomas Fehér, Dietmar Richter, Oliver Jokisch, Rüdiger Hoffmann
2012Duration of ambulatory monitoring needed to accurately estimate voice use.
Daryush D. Mehta, Rebecca Woodbury Listfield, Harold A. Cheyne II, James T. Heaton, Shengran W. Feng, Matías Zanartu, Robert E. Hillman
2012Dutch Automatic Speech Recognition on the Web: Towards a General Purpose System.
Joris Pelemans, Kris Demuynck, Patrick Wambacq
2012Dynamic Conditional Random Fields for Joint Sentence Boundary and Punctuation Prediction.
Xuancong Wang, Hwee Tou Ng, Khe Chai Sim
2012Dynamic Grammars with Lookahead Composition for WFST-based Speech Recognition.
Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose
2012EFL Conversational Triads: Foreigner-directed Speech and Hyperarticulation.
Hua-Li Jian, Richard Konopka
2012Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition.
Changhuai You, Haizhou Li, Bin Ma, Kong-Aik Lee
2012Effect of Tongue Tip Trilling on the Glottal Excitation Source.
Vinay Kumar Mittal, N. Dhananjaya, Bayya Yegnanarayana
2012Effect of being seen on the production of visible speech cues. A pilot study on Lombard speech.
Maeva Garnier, Lucie Ménard, Gabrielle Richard
2012Effect of noise type and level on focus related fundamental frequency changes.
Martti Vainio, Daniel Aalto, Antti Suni, Anja Arnhold, Tuomo Raitio, Henri Seijo, Juhani Järvikivi, Paavo Alku
2012Effect of prosodic changes on speech intelligibility.
Catherine Mayo, Vincent Aubanel, Martin Cooke
2012Effect of speech priors in single-channel speech-music separation for ASR.
Cemil Demir, Ali Taylan Cemgil, Murat Saraçlar
2012Effects of Dialectal Origin on Articulation Rate in French.
Mathieu Avanzi, Pauline Dubosson, Sandra Schwab
2012Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion.
Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose
2012Effects of stress and speech rate on vowel quality in Catalan and Spanish.
Marianna Nadeu
2012Effects of the availability of visual information and presence of competing conversations on speech production.
Vincent Aubanel, Martin Cooke, Emma Foster, María Luisa García Lecumberri, Cassie Mayo
2012Effects of visual speech information on native listener judgments of L2 consonant intelligibility.
Saya Kawase, Yue Wang
2012Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range Normalization.
Satoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi
2012Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-based Large Vocabulary Continuous Speech Recognition Engine.
Jungsuk Kim, Jike Chong, Ian R. Lane
2012Efficient Segmental Conditional Random Fields for One-Pass Phone Recognition.
Yanzhang He, Eric Fosler-Lussier
2012Efficient Structured Language Modeling for Speech Recognition.
Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur
2012Efficient VTS Adaptation Using Jacobian Approximation.
Jinyu Li, Michael L. Seltzer, Yifan Gong
2012Efficient multipulse approximation of speech excitation using the most singular manifold.
Vahid Khanagha, Khalid Daoudi
2012Emotion Recognition using Acoustic and Lexical Features.
Viktor Rozgic, Sankaranarayanan Ananthakrishnan, Shirin Saleem, Rohit Kumar, Aravind Namandi Vembu, Rohit Prasad
2012Emotional Speech: A Spectral Analysis.
Pouria Fewzee, Fakhri Karray
2012Emphatic segments and emphasis spread in Lebanese Arabic: a Real-time Magnetic Resonance Imaging Study.
Assaf Israel, Michael I. Proctor, Louis Goldstein, Khalil Iskarous, Shrikanth S. Narayanan
2012Employing Sentence Structure: Syntax Trees as Prosody Generators.
Sarah Hoffmann, Beat Pfister
2012Enhanced Polyphone Decision Tree Adaptation for Accented Speech Recognition.
Udhyakumar Nallasamy, Florian Metze, Tanja Schultz
2012Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks.
Tara N. Sainath, David Nahamoo, Dimitri Kanevsky, Bhuvana Ramabhadran
2012Enhancing Speech Understanding in Spoken Dialogue Systems by Means of a New Frame-Correction Technique.
Ramón López-Cózar, Zoraida Callejas, David Griol
2012Enhancing Speech by Reconstruction from Robust Acoustic Features.
Philip Harding, Ben Milner
2012Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech.
Petko Nikolov Petkov, W. Bastiaan Kleijn, Gustav Eje Henter
2012Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech Recognition.
Florian Müller, Alfred Mertins
2012Ensemble Classifiers Using Unsupervised Data Selection for Speaker Recognition.
Chien-Lin Huang, Chiori Hori, Hideki Kashioka, Bin Ma
2012Enumerating Differences Between Various Communicative Functions for Purposes of Czech Expressive Speech Synthesis in Limited Domain.
Martin Gruber
2012Enumerative Algebraic Coding for ACELP.
Tom Bäckström
2012Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training.
Yow-Bang Wang, Lin-Shan Lee
2012Estimating Classifier Performance in Unknown Noise.
Ehsan Variani, Hynek Hermansky
2012Estimating Word-Stability During Incremental Speech Recognition.
Ian McGraw, Alexander Gruenstein
2012Estimating the Vocal-Tract Area Function From Formants Using a Sensitivity Function and Least Square.
Tokihiko Kaburagi, Tetsuro Takano, Yuki Sakamoto
2012Estimating the voice source in noise.
Gang Chen, Yen-Liang Shue, Jody Kreiman, Abeer Alwan
2012Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients.
Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
2012Estimation of the vocal tract shape of nasals using a Bayesian scheme.
Christian H. Kasess, Wolfgang Kreuzer, Ewald Enzinger, Nadja Kerschhofer-Puhalo
2012EuskoParl: a speech and text Spanish-Basque parallel corpus.
Alicia Pérez, José M. Alcaide, M. Inés Torres
2012Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts.
Khairun-nisa Hassanali, Yang Liu, Thamar Solorio
2012Evaluating Prosodic Processing for Incremental Speech Synthesis.
Timo Baumann, David Schlangen
2012Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text Mining.
Keigo Kubo, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
2012Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions.
Lee Ngee Tan, Kantapon Kaewtip, Martin L. Cody, Charles E. Taylor, Abeer Alwan
2012Evaluation of a formant-based speech-driven lip motion generation.
Carlos Toshinori Ishi, Chaoran Liu, Hiroshi Ishiguro, Norihiro Hagita
2012Event-based Video Retrieval Using Audio.
Qin Jin, Peter Franz Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, Florian Metze
2012Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise.
Keisuke Kinoshita, Marc Delcroix, Mehrez Souden, Tomohiro Nakatani
2012Exemplar-Based Sparse Representation for Language Recognition on I-Vectors.
Bing Jiang, Yan Song, Wu Guo, Li-Rong Dai
2012Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction.
Jian Luan
2012Exploiting Discriminative Point Process Models for Spoken Term Detection.
Atta Norouzian, Aren Jansen, Richard C. Rose, Samuel Thomas
2012Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia.
Sourish Chaudhuri, Rita Singh, Bhiksha Raj
2012Exploiting the Semantic Web for Unsupervised Natural Language Semantic Parsing.
Gökhan Tür, Minwoo Jeong, Ye-Yi Wang, Dilek Hakkani-Tür, Larry P. Heck
2012Exploring Discriminative Speech Trajectory Structures.
Heyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves
2012Exploring Joint Equalization of Spatial-Temporal Contextual Statistics of Speech Features for Robust Speech Recognition.
Hsin-Ju Hsieh, Jeih-weih Hung, Berlin Chen
2012Exploring Off Time Nature for Speech Enhancement.
Meng Yu, Jack Xin
2012Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training.
Langzhou Chen, Mark J. F. Gales, Vincent Wan, Javier Latorre, Masami Akamine
2012Expressing Speaker's Intentions through Sentence-Final Intonations for Japanese Conversational Speech Synthesis.
Kazuhiko Iwata, Tetsunori Kobayashi
2012Extrinsic normalization for vocal tracts depends on the signal, not on attention.
Matthias J. Sjerps, James M. McQueen, Holger Mitterer
2012F0 and the Perception of Prominence.
Tim Mahrt, Jennifer Cole, Margaret M. Fleck, Mark Hasegawa-Johnson
2012Factor Analysis and Nuisance Attribute Projection Revisited.
Lukás Machlica, Zbynek Zajíc
2012Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS.
June Sig Sung, Doo Hwa Hong, Hyun Woo Koo, Nam Soo Kim
2012Factored adaptation using a combination of feature-space and model-space transforms.
Michael L. Seltzer, Alex Acero
2012Feature Selection for Speaker Traits.
Jouni Pohjalainen, Serdar Kadioglu, Okko Räsänen
2012Feature extraction based on hearing system signal processing for robust large vocabulary speech recognition.
Peter Li, Xie Sun
2012Foreground Speech Segmentation using Zero Frequency Filtered Signal.
Deepak K. T., Biswajit Dev Sarma, S. R. Mahadeva Prasanna
2012From PVI to Perception: A Return to the Roots of Rhythm in Broadcast News.
Matthew Benton
2012Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition.
Taufiq Hasan, John H. L. Hansen
2012Fully Automated Neuropsychological Assessment for Detecting Mild Cognitive Impairment.
Maider Lehr, Emily Tucker Prud'hommeaux, Izhak Shafran, Brian Roark
2012Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model.
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi
2012GCC-PHAT based Head Orientation Estimation.
Carlos Segura, Javier Hernando
2012Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments.
Wooil Kim, John H. L. Hansen
2012Gaussian Mixture Gain Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation.
Emad M. Grais, Hakan Erdogan
2012Gaze Patterns in Turn-Taking.
Catharine Oertel, Marcin Wlodarczak, Jens Edlund, Petra Wagner, Joakim Gustafson
2012Gendered sound symbolism and masking effects in speech processing.
Molly Babel, Grant McGuire
2012Genetic Algorithm Based Feature Selection for Speaker Trait Classification.
Dongrui Wu
2012Glottal Waveform Analysis of Physical Task Stress Speech.
Keith W. Godin, Taufiq Hasan, John H. L. Hansen
2012Glottal source shape parameter estimation using phase minimization variants.
Stefan Huber, Axel Röbel, Gilles Degottex
2012Goal-Oriented Auditory Scene Recognition.
Kailash Patil, Mounya Elhilali
2012Group Sparse Hidden Markov Models for Speech Recognition.
Jen-Tzung Chien, Cheng-Chun Chiang
2012Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition.
Antti Hurmalainen, Rahim Saeidi, Tuomas Virtanen
2012HMM Based Continuous EOG Recognition for Eye-input Speech Interface.
Fuming Fang, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa, Sadaoki Furui, Toshimitsu Musha
2012HMM-based speech synthesis using sub-band basis spectrum model.
Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine
2012Hearing Loss and the Use of Acoustic Cues in Phonetic Categorisation of Fricatives.
Odette Scharenborg, Esther Janse
2012Hermitian based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models.
Sabato Marco Siniscalchi, Jinyu Li, Chin-Hui Lee
2012Heterogeneous Convolutive Non-Negative Sparse Coding.
Dong Wang, Javier Tejedor
2012Hidden Conditional Random Fields with M-to-N Alignments for Grapheme-to-Phoneme Conversion.
Patrick Lehnen, Stefan Hahn, Vlad-Andrei Guta, Hermann Ney
2012Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech.
Kota Yoshizato, Hirokazu Kameoka, Daisuke Saito, Shigeki Sagayama
2012Hidden Markov Models as Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation.
Emad M. Grais, Hakan Erdogan
2012Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data.
Fanbo Meng, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai
2012Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP.
Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima, Masami Akamine
2012Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition.
Bernd T. Meyer, Constantin Spille, Birger Kollmeier, Nelson Morgan
2012How Marni Helps English Language Learners Acquire Oral Reading Fluency.
Ronald A. Cole, Daniel Bolaños, Wayne H. Ward, J. T. Carmer, Eric Borts, Edward Svirsky
2012How consonants, dialect and speech rate affect vowel devoicing?
Masako Fujimoto, Seiya Funatsu, Ichiro Fujimoto
2012I-vectors and ILP clustering adapted to cross-show speaker diarization.
Grégor Dupuy, Mickael Rouvier, Sylvain Meignier, Yannick Estève
2012IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition.
Jun Du, Qiang Huo
2012Implementation of Computationally Efficient Real-Time Voice Conversion.
Tomoki Toda, Takashi Muramatsu, Hideki Banno
2012Implementation of Simple Spectral Techniques to Enhance the Intelligibility of Speech using a Harmonic Model.
Daniel Erro, Yannis Stylianou, Eva Navas, Inma Hernáez
2012Improve the Implementation of Pitch Features for Mandarin Digit String Recognition Task.
Pei Ding, Liqiang He
2012Improved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis.
Hiroya Hashimoto, Keikichi Hirose, Nobuaki Minematsu
2012Improved Model Selection for the ASR-Driven Binary Mask.
William Hartmann, Eric Fosler-Lussier
2012Improved Prediction of Japanese Word Accent Sandhi Using CRF.
Nobuaki Minematsu, Shumpei Kobayashi, Shinya Shimizu, Keikichi Hirose
2012Improved Speech Intelligibility with a Chimaera Hearing Aid Algorithm.
Andrew Hines, Naomi Harte
2012Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction.
Paavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, Brad H. Story
2012Improvement in Automatic Pronunciation Scoring using Additional Basic Scores and Learning to Rank.
Liang-Yu Chen, Jyh-Shing Roger Jang
2012Improvements in Japanese Voice Search.
Ken-ichi Iso, Edward Whittaker, Tadashi Emori, Junpei Miyake
2012Improvements of the Beta-Order Minimum Mean-Square Error (MMSE) Spectral Amplitude Estimator using Chi Priors.
Marek B. Trawicki, Michael T. Johnson
2012Improving Discriminative Training for Robust Acoustic Models in Large Vocabulary Continuous Speech Recognition.
Janne Pylkkönen, Mikko Kurimo
2012Improving L1-Specific Phonological Error Diagnosis in Computer Assisted Pronunciation Training.
Theban Stanley, Kadri Hacioglu
2012Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and Gender.
Felix Weninger, Erik Marchi, Björn W. Schuller
2012Improving WFST-based G2P Conversion with Alignment Constraints and RNNLM N-best Rescoring.
Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose, Chiori Hori, Hideki Kashioka, Paul R. Dixon
2012Improving the Entropy Estimate of Neuronal Firings of Modeled Cochlear Nucleus Neurons.
Andrea Grigorescu, Marek Rudnicki, Michael Isik, Werner Hemmert, Stefano Rini
2012Indexing Raw Acoustic Features for Scalable Zero Resource Search.
Aren Jansen, Benjamin Van Durme
2012Inference of Critical Articulator Position for Fricative Consonants.
Alexander Sepúlveda, Rodrigo Capobianco Guido, Germán Castellanos-Domínguez
2012Initialization Schemes for Multilayer Perceptron Training and their Impact on ASR Performance using Multilingual Data.
Ngoc Thang Vu, Wojtek Breiter, Florian Metze, Tanja Schultz
2012Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis.
Taufiq Hasan, John H. L. Hansen
2012Integrating Adaptive Beam-forming and Auditory Features for Robust Large Vocabulary Speech Recognition.
Xie Sun, Peter Li, Manli Zhu, Qiru Zhou
2012Integrating Deep Neural Networks into Structural Classification Approach based on Weighted Finite-State Transducers.
Yotaro Kubo, Takaaki Hori, Atsushi Nakamura
2012Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization.
Yun-Nung Chen, Florian Metze
2012Integrating Stress Information in Large Vocabulary Continuous Speech Recognition.
Bogdan Ludusan, Stefan Ziegler, Guillaume Gravier
2012Intelligibility classification of pathological speech using fusion of multiple high level descriptors.
Jangwon Kim, Naveen Kumar, Andreas Tsiartas, Ming Li, Shrikanth S. Narayanan
2012Intelligibility of speech spoken in noise/reverberation for older adults in reverberant environments.
Nao Hodoshima, Takayuki Arai, Kiyohiro Kurisu
2012Inter-gestural timing in French nasal vowels: A comparative study of (Liège, Tournai) Northern French vs. (Marseille, Toulouse) Southern French.
Véronique Delvaux, Kathy Huet, Myriam Piccaluga, Bernard Harmegnies
2012Interactions Between Turn-taking Gaps, Disfluencies and Social Obligation.
Rebecca Lunsford, Peter A. Heeman, Jan P. H. van Santen
2012Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process.
Tsung-Hsien Wen, Hung-yi Lee, Lin-Shan Lee
2012Interplay between verbal response latency and physiology of children with autism during ECA interactions.
Theodora Chaspari, Chi-Chun Lee, Shrikanth S. Narayanan
2012Interspeech Pathology Challenge: Investigations into Speaker and Sentence Specific Effects.
Anthony P. Stark, Alireza Bayestehtashk, Meysam Asgari, Izhak Shafran
2012Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition.
Aren Jansen, Samuel Thomas, Hynek Hermansky
2012Intrinsic velocity differences of lip and jaw movements: preliminary results.
Peter Birkholz, Phil Hoole
2012Inventory-Based Audio-Visual Speech Enhancement.
Dorothea Kolossa, Robert M. Nickel, Steffen Zeiler, Rainer Martin
2012Inverting the Point Process Model for Fast Phonetic Keyword Search.
Keith Kintzley, Aren Jansen, Kenneth Church, Hynek Hermansky
2012Investigating Performance of the Discriminative Methods for Long-Term Speaker Adaptation.
Danning Jiang, Dimitri Kanevsky, Vaibhava Goel, Yong Qin
2012Investigating syllabic prominence with Conditional Random Fields and Latent-Dynamic Conditional Random Fields.
Francesco Cutugno, Enrico Leone, Bogdan Ludusan, Antonio Origlia
2012Investigation of Maximum Entropy Hybrid Language Models for Open Vocabulary German and Polish LVCSR.
M. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney
2012Is 'not bad' good enough? Aspects of unknown voices' likability.
Benjamin Weiss, Felix Burkhardt
2012Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation.
Daniel Erro, Eva Navas, Inma Hernáez
2012Joint Decoding for Speech Recognition and Semantic Tagging.
Anoop Deoras, Ruhi Sarikaya, Gökhan Tür, Dilek Hakkani-Tür
2012Joint Pitch-Analysis Formant-Synthesis framework for CS recovery of speech.
Srikanth Raj Chetupally, Thippur V. Sreenivas
2012Judging temporal onset differences for concurrent vowels: Results for young, middle-aged, and older adults.
Daniel Fogerty, Diane Kewley-Port, Larry E. Humes
2012KNNDIST: A Non-Parametric Distance Measure for Speaker Segmentation.
Seyed Hamidreza Mohammadi, Hossein Sameti, Mahsa Sadat Elyasi Langarani, Amirhossein Tavanaei
2012Knowledge-Based Word Lattice Rescoring in a Dynamic Context.
Todd Shore, Friedrich Faubel, Hartmut Helmke, Dietrich Klakow
2012LSTM Neural Networks for Language Modeling.
Martin Sundermeyer, Ralf Schlüter, Hermann Ney
2012Language Modeling for Voice-Enabled Social TV Using Tweets.
Junlan Feng, Bernard Renger
2012Language differences in the perceptual weight of prominence-lending properties.
Bistra Andreeva, William J. Barry, Magdalena Wolska
2012Large Scale Hierarchical Neural Network Language Models.
Hong-Kwang Kuo, Ebru Arisoy, Ahmad Emami, Paul Vozila
2012Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks.
Dong Yu, Li Deng, Frank Seide
2012Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog.
Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, Larry P. Heck
2012Learning an Artificial F0-Contour for ALT Speech.
Anna Katharina Fuchs, Martin Hagmüller
2012Lenition of /d/ in spontaneous Spanish and Catalan.
Miguel Simonet, José Ignacio Hualde, Marianna Nadeu
2012Less errors with TTS? A dictation experiment with foreign language learners.
Thomas Pellegrini, Ângela Costa, Isabel Trancoso
2012Leveraging Social Annotation for Topic Language Model Adaptation.
Youzheng Wu, Kazuhiko Abe, Paul R. Dixon, Chiori Hori, Hideki Kashioka
2012Lexical Story Co-Segmentation of Chinese Broadcast News.
Wei Feng, Xuecheng Nie, Liang Wan, Lei Xie, Jianmin Jiang
2012Lexical-phonetic automata for spoken utterance indexing and retrieval.
Julien Fayolle, Murat Saraclar, Fabienne Moreau, Christian Raymond, Guillaume Gravier
2012Likability Classification - A Not so Deep Neural Network Approach.
Raymond Brueckner, Björn W. Schuller
2012Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification.
Toru Nakashika, Christophe Garcia, Tetsuya Takiguchi
2012Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition.
José A. González, Antonio M. Peinado, Angel M. Gomez, Ning Ma
2012Longer Features: They do a speech detector good.
T. J. Tsai, Nelson Morgan
2012Low latency combination of parallelized single-pass LVCSR systems.
Fethi Bougares, Mickael Rouvier, Yannick Estève, Georges Linarès
2012Low-SNR, Speaker-Dependent Speech Enhancement using GMMs and MFCCs.
Laura E. Boucheron, Phillip L. De Leon
2012Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints.
Ziqiang Shi, Tieran Zheng, Jiqing Han, Shiwen Deng
2012MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors.
Keith Kintzley, Aren Jansen, Hynek Hermansky
2012Making Conversational Vowels More Clear.
Seyed Hamidreza Mohammadi, Alexander Kain, Jan P. H. van Santen
2012Mask Estimation and Refinement for MFT-based Robust Speaker Verification.
Yali Zhao, Lei Xie, Zhonghua Fu
2012Maximising objective speech intelligibility by local f0 modulation.
Julián Villegas, Martin Cooke
2012Maximum Entropy Language Model Adaptation for Mobile Speech Input.
Tanel Alumäe, Kaarel Kaljurand
2012Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning.
Huang Hao, Jianming Wang, Halidan Abudureyimu
2012Mean Hilbert Envelope Coefficients (MHEC) for Robust Speaker Recognition.
Seyed Omid Sadjadi, Taufiq Hasan, John H. L. Hansen
2012Meaning inhibition and sentence processing in Chinese: Evidence from negative priming.
Michael C. W. Yip
2012Measuring prosodic alignment in cooperative task-based conversations.
Khiet P. Truong, Dirk Heylen
2012Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise.
Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King
2012Methodological Issues in Assessing Perceptual Representation of Consonant Sounds in Thai.
Charturong Tantibundhit, Chutamanee Onsuwan, P. Phienphanich, Chai Wutiwiwatchai
2012Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition.
Ken'ichi Kumatani, Bhiksha Raj, Rita Singh, John W. McDonough
2012Mitigating Effects of Recording Condition Mismatch in Speaker Recognition Using Partial Least Squares.
Jeremiah Remus, Jenniffer Estrada, Stephanie A. C. Schuckers
2012Mixed probabilistic and deterministic dependency parsing.
Christophe Cerisara, Alejandra Lorenzo
2012Mixture Component Clustering for Efficient Speaker Verification.
Richard D. McClanahan, Phillip L. De Leon
2012Model-Based Approaches for Degraded Channel Modelling in Robust ASR.
Mark J. F. Gales, Federico Flego
2012Model-based Duration-difference Approach on Accent Evaluation of L2 Learner.
Chatchawarn Hansakunbuntheung, Ananlada Chotimongkol, Sumonmas Thatphithakkul, Patcharika Chootrakool
2012Model-based Single-Channel Dereverberation in Noisy Acoustical Environments.
Xulei Bao, Jie Zhu
2012Model-based approaches to adaptive training in reverberant environments.
Yongqiang Wang, Mark J. F. Gales
2012Modeling Cue Trading in Human Word Recognition.
Louis ten Bosch, Odette Scharenborg
2012Modeling Pause-Duration for Style-Specific Speech Synthesis.
Alok Parlikar, Alan W. Black
2012Modeling source-tract interaction in speech production: Voicing onset vs. vowel height after a voiceless obstruent.
Jorge C. Lucero, Laura L. Koenig, Susanne Fuchs
2012Modeling spoken language acquisition with a generic cognitive architecture for associative learning.
Okko Räsänen, Heikki Rasilo, Unto K. Laine
2012Modeling the Creaky Excitation for Parametric Speech Synthesis.
Thomas Drugman, John Kane, Christer Gobl
2012Modelling a Noisy-channel for Voice Conversion Using Articulatory Features.
Bajibabu Bollepalli, Alan W. Black, Kishore Prahallad
2012Modelling pause duration as a function of contextual length.
David Doukhan, Albert Rilliard, Sophie Rosset, Christophe d'Alessandro
2012Modulation Spectrum Analysis for Speaker Personality Trait Recognition.
Alexei Ivanov, Xin Chen
2012Modulation domain blind source separation for noisy speech mixture.
Yi Zhang, Yunxin Zhao
2012More on the Normalization of Syllable Prominence Ratings.
Christopher Sappok, Denis Arnold
2012Morpheme Level Feature-based Language Models for German LVCSR.
Amr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney
2012Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification.
Michelle Hewlett Sanchez, Aaron Lawson, Dimitra Vergyri, Harry Bratt
2012N-gram FST Indexing for Spoken Term Detection.
Chao Liu, Dong Wang, Javier Tejedor
2012Nasal Coarticulation and Contrastive Stress.
Georgia Zellou, Rebecca Scarborough
2012Nasality from Moroccan Arabic Nasal and Pharyngeal Consonants: Patterns of Airflow and Nasalance.
Georgia Zellou
2012Nativeness Classification with Suprasegmental Features on the Accent Group Level.
Mahnoosh Mehrabani, Joseph Tepperman, Emily Nava
2012Naturalness Judgement of Prosodic Variation of Japanese Utterances with Prosody Modified Stimuli.
Chiharu Tsurutani, Shunichi Ishihara
2012Noise Compensation for Subspace Gaussian Mixture Models.
Liang Lu, K. K. Chin, Arnab Ghoshal, Steve Renals
2012Noise Robust Pitch Tracking by Subband Autocorrelation Classification.
Byung Suk Lee, Daniel P. W. Ellis
2012Non-auditory cognitive capabilities in computational modeling of early language acquisition.
Okko Räsänen
2012Normalization of Text Messages Using Character- and Phone-based Machine Translation Approaches.
Chen Li, Yang Liu
2012Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systems.
Marc René Schädler, Birger Kollmeier
2012Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs.
Ales Prazák, Zdenek Loose, Jan Trmal, Josef V. Psutka, Josef Psutka
2012Novel Metrics of Speech Rhythm for the Assessment of Emotion.
Fabien Ringeval, Mohamed Chetouani, Björn W. Schuller
2012OOV Word Detection using Hybrid Models with Mixed Types of Fragments.
Long Qin, Alexander I. Rudnicky
2012Objective Child Vocal Development Measurement with Naturalistic Daylong Audio Recording.
Dongxin Xu, Jill Gilkerson, Jeffery Richards
2012Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability.
Linfang Wang, Lijuan Wang, Yan Teng, Zhe Geng, Frank K. Soong
2012Objective, Subjective and Linguistic Roads to Perceptual Prominence - How are they compared and why?
Petra Wagner, Fabio Tamburini, Andreas Windmann
2012Obtaining prominence judgments from naïve listeners - Influence of rating scales, linguistic levels and normalisation.
Denis Arnold, Petra Wagner, Bernd Möbius
2012On Speaker-Independent Personality Perception and Prediction from Speech.
Tim Polzehl, Katrin Schoenenberg, Sebastian Möller, Florian Metze, Gelareh Mohammadi, Alessandro Vinciarelli
2012On the Dynamics of Overlap in Multi-Party Conversation.
Kornel Laskowski, Mattias Heldner, Jens Edlund
2012On the Modeling of Voiceless Stop Sounds of Speech using Adaptive Quasi-Harmonic Models.
George P. Kafentzis, Olivier Rosec, Yannis Stylianou
2012On the Role of Binary Mask Pattern in Automatic Speech Recognition.
Arun Narayanan, DeLiang Wang
2012On the Use of Non-Linear Polynomial Kernel SVMs in Language Recognition.
Sibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar
2012On the Use of Spectral and Iterative Methods for Speaker Diarization.
Stephen Shum, Najim Dehak, Jim Glass
2012On the acoustics of overlapping laughter in conversational speech.
Khiet P. Truong, Jürgen Trouvain
2012On the assessment of audiovisual cues to speaker confidence by preteens with typical development (TD) and a-typical development (AD).
Marc Swerts, Cees de Bie
2012On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues alone.
Jens Edlund, Mattias Heldner, Joakim Gustafson
2012On the use of Machine Learning Methods for Speech and Voicing Classification.
Philip Harding, Ben Milner
2012On-the-fly Topic Adaptation for YouTube Video Transcription.
Kapil Thadani, Fadi Biadsy, Daniel M. Bikel
2012Online Story Segmentation of Multilingual Streaming Broadcast News.
Amit Srivastava, Saurabh Khanwalkar, Gretchen Markiewicz, Guruprasad Saikumar
2012Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity.
Hung-yi Lee, Po-Wei Chou, Lin-Shan Lee
2012Optimised spectral weightings for noise-dependent speech intelligibility enhancement.
Yan Tang, Martin Cooke
2012Optimization of Dialog Strategies using Automatic Dialog Simulation and Statistical Dialog Management Techniques.
Zoraida Callejas, Ramón López-Cózar
2012Optimization-Based Control for the Extended Baum-Welch Algorithm.
Janne Pylkkönen, Mikko Kurimo
2012Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum Similarity.
Ryo Yokoyama, Yu Nasu, Koichi Shinoda, Koji Iwano
2012Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough Transform.
Jonathan William Dennis, Tran Huy Dat, Engsiong Chng
2012PLDA Modeling in I-Vector and Supervector Space for Speaker Verification.
Ye Jiang, Kong-Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher, Haizhou Li
2012PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker Verification.
Themos Stafylakis, Patrick Kenny, Mohammed Senoussaoui, Pierre Dumouchel
2012Parallel Training for Deep Stacking Networks.
Li Deng, Brian Hutchinson, Dong Yu
2012Parallel combination of multilingual speech streams for improved ASR.
João Miranda, João Paulo Neto, Alan W. Black
2012Paraphrastic Language Models.
Xunying Liu, Mark J. F. Gales, Philip C. Woodland
2012Patrol Team Language Identification System for DARPA RATS P1 Evaluation.
Pavel Matejka, Oldrich Plchot, Mehdi Soufifar, Ondrej Glembek, Luis Fernando D'Haro, Karel Veselý, Frantisek Grézl, Jeff Z. Ma, Spyros Matsoukas, Najim Dehak
2012Pauses and respiratory markers of the structure of book reading.
Gérard Bailly, Cécilia Gouvernayre
2012Perceived prosodic boundaries in Taiwanese and their acoustic correlates.
Grace Kuo
2012Perception of Pitch Contours among Native Tone Listeners.
Ratree Wayland, Donruethai Laphasradakul, Edith Kaan, Cao Rui
2012Perception of Synthetic Speech in Adult Users of Cochlear Implants.
Kyoko Nagao, Mark Paullin, James B. Polikoff, Jason Lilley, H. Timothy Bunnell
2012Perception of the moraic obstruent /Q/: a cross-linguistic study.
Makiko Sadakata, Mizuki Shingai, Alex Brandmeyer, Kaoru Sekiyama
2012Perceptual Assimilation of Arabic Voiceless Fricatives by English Monolinguals.
Michael D. Tyler, Sarah Fenwick
2012Perceptual Foundations for Naturalistic Variability in the Prosody of Synthetic Speech.
Nanette Veilleux, Jonathan Barnes, Alejna Brugos, Stefanie Shattuck-Hufnagel
2012Perceptual Importance of the Phase Related Information in Speech.
Ibon Saratxaga, Inma Hernáez, Michael Pucher, Eva Navas, Iñaki Sainz
2012Perceptual Learning of /f/-/s/ by Older Listeners.
Odette Scharenborg, Esther Janse, Andrea Weber
2012Perceptual compensation for the effects of reverberation on consonant identification: A comparison of human and machine performance.
Guy J. Brown, Amy V. Beeston, Kalle J. Palomäki
2012Performance Comparison of Intrusive Objective Speech Intelligibility and Quality Metrics for Cochlear Implant Users.
João Felipe Santos, Stefano Cosentino, Oldooz Hazrati, Philipos C. Loizou, Tiago H. Falk
2012Performance Comparison of Training Algorithms for Semi-Supervised Discriminative Language Modeling.
Erinç Dikici, Arda Çelebi, Murat Saraclar
2012PermA and Balloon: Tools for string alignment and text processing.
Uwe D. Reichel
2012Personality traits detection using a parallelized modified SFFS algorithm.
Clément Chastagnol, Laurence Devillers
2012Phase estimation for signal reconstruction in single-channel source separation.
Pejman Mowlaee, Rahim Saeidi, Rainer Martin
2012Phone Adaptive Training for Speaker Diarization.
Simon Bozonnet, Ravichander Vipperla, Nicholas W. D. Evans
2012Phone recognition in critical bands using sub-band temporal modulations.
Feipeng Li, Sri Harish Reddy Mallidi, Hynek Hermansky
2012Phoneme Class Based Adaptation for Mismatch Acoustic Modeling of Distant Noisy Speech.
Seçkin Uluskan, John H. L. Hansen
2012Phoneme resistance during speech-in-speech comprehension.
Léo Varnet, Julien Meyer, Michel Hoen, Fanny Meunier
2012Phonetic Foreignization of Mandarin for Dubbing in Imported Western Movies.
Luying Hou, Yuan Jia, Aijun Li
2012Phonological complexity and vocabulary size in 30-month-old Swedish children.
Ulrika Marklund, Ulla Sundberg, Iris-Corinna Schwarz, Francisco Lacerda
2012Phonology & the Interpretation of Fine Phonetic Detail in Berlin German.
Stefanie Jannedy, Melanie Weirich
2012Phonotactic Language Recognition Using MLP Features.
Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori Lamel
2012Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts.
Luis Fernando D'Haro, Ondrej Glembek, Oldrich Plchot, Pavel Matejka, Mehdi Soufifar, Ricardo de Córdoba, Jan Cernocký
2012Phrasal Cohort Based Unsupervised Discriminative Language Modeling.
Puyang Xu, Brian Roark, Sanjeev Khudanpur
2012Phrase Boundary Assignment from Text in Multiple Domains.
Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran
2012Physiological and acoustic study of word initial post-lexical gemination in Moroccan Arabic.
Chakir Zeroual, Diamantis Gafos, Phil Hoole, John H. Esling
2012Pipelined Back-Propagation for Context-Dependent Deep Neural Networks.
Xie Chen, Adam Eversole, Gang Li, Dong Yu, Frank Seide
2012Pitch Estimation Based on Long Frame Harmonic Model and Short Frame Average Correlation Coefficient.
Dongmei Wang, Philipos C. Loizou
2012Pitch and Intonation Contribution to Speakers' Traits Classification.
Claude Montacié, Marie-José Caraty
2012Pitch and phonological perception of tone in the Suruí language of Rondônia (Brazil): identification task of LHL and LHH tonal patterns.
Julien Meyer
2012Pitch range control of Japanese boundary pitch movements.
Yosuke Igarashi, Hanae Koiso
2012Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis.
Zhengqi Wen, Hideki Kawahara, Jianhua Tao
2012Plagiarism Detection in Polyphonic Music using Monaural Signal Separation.
Soham De, Indradyumna Roy, Tarunima Prabhakar, Kriti Suneja, Sourish Chaudhuri, Rita Singh, Bhiksha Raj
2012PodCastle: Collaborative Training of Language Models on the Basis of Wisdom of Crowds.
Jun Ogata, Masataka Goto
2012Pooling Robust Shift-Invariant Sparse Representations of Acoustic Signals.
Po-Sen Huang, Jianchao Yang, Mark Hasegawa-Johnson, Feng Liang, Thomas S. Huang
2012Portability of Semantic Annotations for Fast Development of Dialogue Corpora.
Bassam Jabaian, Fabrice Lefèvre, Laurent Besacier
2012Posterior-Scaled MPE: Novel Discriminative Training Criteria.
Markus Nußbaum-Thom, Zoltán Tüske, Georg Heigold, Ralf Schlüter, Hermann Ney
2012Power Mean Pyramid Scores for Summarization Evaluation.
Sameer Maskey, Andrew Rosenberg
2012Practice and feedback in L2 speaking: an evaluation of the DISCO CALL system.
Catia Cucchiarini, Joost van Doremalen, Helmer Strik
2012Predictability affects vowel dispersion and dynamics in the Buckeye Corpus.
Michael McAuliffe, Molly Babel
2012Predicting Character-Appropriate Voices for a TTS-based Storyteller System.
Erica Greene, Taniya Mishra, Patrick Haffner, Alistair Conkie
2012Predicting Likability of Speakers with Gaussian Processes.
Dingchao Lu, Fei Sha
2012Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations.
Tatsuya Kawahara, Takuma Iwatate, Katsuya Takanashi
2012Preference-learning based Inverse Reinforcement Learning for Dialog Control.
Hiroaki Sugiyama, Toyomi Meguro, Yasuhiro Minami
2012ProTK: An Improved Prosody Toolkit.
Jacob Okamoto, Serguei V. S. Pakhomov, Elizabeth Shriberg, Andreas Stolcke
2012Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition.
Xiangang Li, Dan Su, Zaihu Pang, Xihong Wu
2012Production and Perception of Focus in PFC and non-PFC Languages: Comparing Beijing Mandarin and Hainan Tsat.
Bei Wang, Chenxia Li, Qian Wu, Xiaxia Zhang, Baofeng Wang, Yi Xu
2012Prof-Life-Log: Audio Environment Detection for Naturalistic Audio Streams.
Ali Ziaei, Abhijeet Sangwan, John H. L. Hansen
2012Pronunciation quality evaluation of sentences by combining word based scores.
Jorge Wuth, Néstor Becerra Yoma, Leopoldo Benavides, Hiram Vivanco
2012Proper Name Splicing in Computer Games with TTS.
Blaise Potard, Matthew P. Aylett, Christopher J. Pidcock
2012Prosodic Cues to Disengagement and Uncertainty in Physics Tutorial Dialogues.
Diane J. Litman, Heather Friedberg, Katherine Forbes-Riley
2012Prosodic Entrainment in an Information-Driven Dialog System.
Andrew Fandrianto, Maxine Eskénazi
2012Prosodic Marking of Continuation versus Completion in Children's Narratives.
Melissa A. Redford, Laura Dilley, Jessica Gamache, Elizabeth Wieland
2012Prosodic Realization of Focus in Statement and Question in Tibetan (Lhasa Dialect).
Xiaxia Zhang, Bei Wang, Qian Wu, Yi Xu
2012Prosodic context-based analysis of disfluencies.
Helena Moniz, Fernando Batista, Isabel Trancoso, Ana Isabel Mata
2012Prosodic measurements and question types in the Spontal corpus of Swedish dialogues.
Sofia Strömbergsson, Jens Edlund, David House
2012Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis.
Alexander Sorin, Slava Shechtman, Vincent Pollet
2012Q-Gaussian based spectral subtraction for robust speech recognition.
Hilman Ferdinandus Pardede, Koichi Shinoda, Koji Iwano
2012Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals.
Christoph Norrenbrock, Florian Hinterleitner, Ulrich Heute, Sebastian Möller
2012Quantitative Analysis of Pitch in Speech of Children with Neurodevelopmental Disorders.
Géza Kiss, Jan P. H. van Santen, Emily Tucker Prud'hommeaux, Lois M. Black
2012Query-by-Example using Speaker Content Graphs.
William M. Campbell, Elliot Singer
2012RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases.
Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li
2012Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech Recognition.
Zoi Roupakia, Anton Ragni, Mark J. F. Gales
2012Real-Time Lecture Transcription using ASR for Czech Hearing Impaired or Deaf Students.
Petr Cerva, Jan Silovský, Jindrich Zdánský, Jan Nouza, Jirí Málek
2012Real-time Implementation of Multi-band Frequency Compression for Listeners with Moderate Sensorineural Impairment.
Nitya Tiwari, Prem C. Pandey, Pandurangarao N. Kulkarni
2012Real-time Visualization of English Pronunciation on an IPA Chart Based on Articulatory Feature Extraction.
Yurie Iribe, Takurou Mori, Kouichi Katsurada, Goh Kawai, Tsuneo Nitta
2012Recurrent Neural Networks for Noise Reduction in Robust ASR.
Andrew L. Maas, Quoc V. Le, Tyler M. O'Neil, Oriol Vinyals, Patrick Nguyen, Andrew Y. Ng
2012Relative Importance of Temporal Envelope and Fine Structure Cues in Low- and High-Order Harmonic Regions for Mandarin Lexical-tone Recognition.
Guangting Mai
2012Residual Phase Cepstrum Coefficients with Application to Cross-lingual Speaker Verification.
Michael T. Johnson, Jianglin Wang
2012Resonator-based creaky voice detection.
Thomas Drugman, John Kane, Christer Gobl
2012Rethinking The Corpus: Moving towards Dynamic Linguistic Resources.
Andrew Rosenberg
2012Robust Event Detection From Spoken Content In Consumer Domain Videos.
Stavros Tsakalidis, Xiaodan Zhuang, Roger Hsiao, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan
2012Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum.
Md. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy
2012Robust Pitch Estimation Using l1-regularized Maximum Likelihood Estimation.
Feng Huang, Tan Lee
2012Robust Tracking for Automatic Reading Tutors.
Emre Yilmaz, Dirk Van Compernolle, Hugo Van hamme
2012Robust phoneme recognition based on biomimetic speech contours.
Michael A. Carlin, Kailash Patil, Sridhar Krishna Nemala, Mounya Elhilali
2012Robust triphone mapping for acoustic modeling.
Milos Cernak, David Imseng, Hervé Bourlard
2012Role of Prosody in Automatic Modality Recognition of Bangla Speech.
Anal Warsi, Tulika Basu, Debasis Mazumdar
2012Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization.
Brian Kingsbury, Tara N. Sainath, Hagen Soltau
2012Search Space Pruning Based on Anticipated Path Recombination in LVCSR.
David Nolden, Ralf Schlüter, Hermann Ney
2012Selection of TDOA Parameters for MDM Speaker Diarization.
Beatriz Martínez-González, José Manuel Pardo, Julián D. Echeverry-Correa, José A. Vallejo-Pinto, Roberto Barra-Chicote
2012Semi-Blind Model Adaptation using Piece-wise Energy Decay Curve for Large Reverberant Environments.
Abdul Waheed Mohammed, Marco Matassoni, Hari Krishna Maganti, Maurizio Omologo
2012Semi-Supervised Methods for Improving Keyword Search of Unseen Terms.
Scott Novotney, Ivan Bulyko, Richard M. Schwartz, Sanjeev Khudanpur, Owen Kimball
2012Sentence Detection Using Multiple Annotations.
Ann Lee, James R. Glass
2012Sibilant Speech Detection in Noise.
Sira Gonzalez, Mike Brookes
2012Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality Similarity.
Yusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno
2012Similarities in fundamental frequency in infant speech segmentation models.
Ellen Marklund, Francisco Lacerda, Iris-Corinna Schwarz, Ulla Sundberg
2012Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition.
Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney
2012Smile with a smile.
Hugo Quené, Will Schuerman
2012Sparse Bayesian Factor Analysis for Stereo-based Stochastic Mapping.
Xiaodong Cui, Mohamed Afify, George Saon, Vaibhava Goel
2012Sparse Probabilistic Linear Discriminant Analysis for Speaker Verification.
Hai Yang, Chunyan Liang, Yunfei Xu, Lin Yang, Yonghong Yan
2012Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space.
Seong-Jun Hahm, Atsunori Ogawa, Masakiyo Fujimoto, Takaaki Hori, Atsushi Nakamura
2012Speaker Clustering for a Mixture of Singing and Reading.
Mahnoosh Mehrabani, John H. L. Hansen
2012Speaker Clustering in Emotion Recognition.
Ni Ding, Julien Epps
2012Speaker Discrimination Ability of Glottal Waveform Features.
Juan F. Torres, Elliot Moore
2012Speaker Independent Single Channel Source Separation using Sinusoidal Features.
Shivesh Ranjan, Karen L. Payton, Pejman Mowlaee
2012Speaker Personality Classification Using Systems Based on Acoustic-Lexical Cues and an Optimal Tree-Structured Bayesian Network.
Kartik Audhkhasi, Angeliki Metallinou, Ming Li, Shrikanth S. Narayanan
2012Speaker Recognition for Children's Speech.
Saeid Safavi, Maryam Najafian, Abualsoud Hanani, Martin J. Russell, Peter Jancovic, Michael J. Carey
2012Speaker Verification Using Neighborhood Preserving Embedding.
Chunyan Liang, Jinchao Yang, Lin Yang, Yonghong Yan
2012Speaker diarization of overlapping speech based on silence distribution in meeting recordings.
Sree Harsha Yella, Fabio Valente
2012Speaker idiosyncratic rhythmic features in the speech signal.
Volker Dellwo, Adrian Leemann, Marie-José Kolly
2012Speaker-Dependent Voice Activity Detection Robust to Background Speech Noise.
Shigeki Matsuda, Naoya Ito, Kosuke Tsujino, Hideki Kashioka, Shigeki Sagayama
2012Speaker-adaptive visual speech synthesis in the HMM-framework.
Dietmar Schabus, Michael Pucher, Gregor Hofer
2012Spectral Intersections for Non-Stationary Signal Separation.
Trausti T. Kristjansson, Thad Hughes
2012Speech Activity Detection for Noisy Data Using Adaptation Techniques.
Mohamed Omar
2012Speech Data Clustering Based on Phoneme Error Trend for Unsupervised Acoustic Model Adaptation.
Taichi Asami, Satoshi Kobashikawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi
2012Speech Enhancement Using Sparse Convolutive Non-negative Matrix Factorization with Basis Adaptation.
Michael Carlin, Nicolas Malyska, Thomas F. Quatieri
2012Speech Enhancement With Bivariate Gamma Model.
Atanu Saha, Tetsuya Shimamura
2012Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments.
Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
2012Speech Enhancement for Android (SEA): A Speech Processing Demonstration Tool for Android Based Smart Phones and Tablets.
Roger Chappel, Kuldip K. Paliwal
2012Speech Pattern Discovery using Audio-Visual Fusion and Canonical Correlation Analysis.
Lei Xie, Yinqing Xu, Lilei Zheng, Qiang Huang, Bingfeng Li
2012Speech Production-Perception Relationships in Children with Speech Delay.
Kyoko Nagao, Mark Paullin, Vilena Livinsky, James B. Polikoff, Linda D. Vallino, Thierry G. Morlet, N. Carolyn Schanen, H. Timothy Bunnell
2012Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment.
Kyohei Odani, Longbiao Wang, Atsuhiko Kai
2012Speech and speaker separation in human auditory cortex.
Nima Mesgarani, Edward Chang
2012Speech factorization for HMM-TTS based on cluster adaptive training.
Javier Latorre, Vincent Wan, Mark J. F. Gales, Langzhou Chen, K. K. Chin, Kate M. Knill, Masami Akamine
2012Speech modeling and processing by low-dimensional dynamic glottal models.
Carlo Drioli, Andrea Calanca
2012Speech restoration based on deep learning autoencoder with layer-wised pretraining.
Xugang Lu, Shigeki Matsuda, Chiori Hori, Hideki Kashioka
2012Speech synthesis using a non-maximally decimated filter bank for embedded systems.
Nobuyuki Nishizawa, Tsuneo Kato
2012Speech-in-noise intelligibility improvement based on spectral shaping and dynamic range compression.
Tudor-Catalin Zorila, Varvara Kandia, Yannis Stylianou
2012Speech/Nonspeech Segmentation in Web Videos.
Ananya Misra
2012SpeechMark: Landmark Detection Tool for Speech Analysis.
Suzanne Boyce, Harriet J. Fell, Joel MacAuslan
2012Spelling as a Complementary Strategy for Speech Recognition.
Keith Vertanen, Per Ola Kristensson
2012Spoken Dialogs With a Virtual Science Tutor.
Wayne H. Ward, Daniel Bolaños, Ronald A. Cole
2012Spoken Document Clustering Using Word Confusion Networks.
Shajith Ikbal, Sachindra Joshi, Ashish Verma, Om D. Deshmukh
2012Spoken Inquiry Discrimination Using Bag-of-Words for Speech-Oriented Guidance System.
Haruka Majima, Rafael Torres, Yoko Fujita, Hiromichi Kawanami, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano
2012Spontaneous-Speech Acoustic-Prosodic Features of Children with Autism and the Interacting Psychologist.
Daniel Bone, Matthew P. Black, Chi-Chun Lee, Marian E. Williams, Pat Levitt, Sungbok Lee, Shrikanth S. Narayanan
2012Spoofing countermeasures for the protection of automatic speaker recognition systems against attacks with artificial signals.
Federico Alegre, Ravichander Vipperla, Nicholas W. D. Evans
2012Study of Different Backends in a State-Of-the-Art Language Recognition System.
Mikel Peñagarikano, Amparo Varona, Mireia Díez, Luis Javier Rodríguez-Fuentes, Germán Bordel
2012Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification.
Achintya Kumar Sarkar, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre
2012Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription.
Jan Silovský, Petr Cerva, Jindrich Zdánský, Jan Nouza
2012Sub-band based Log-energy and Its Dynamic Range Stretching for Robust In-car Speech Recognition.
Weifeng Li, Hervé Bourlard
2012Subspace Gaussian Mixture Models Based on Noise Compensation for Speech Recognition.
Mohamed Bouallegue, Driss Matrouf, Georges Linarès, Mickael Rouvier
2012Subspace-Based Feature Representation and Learning for Language Recognition.
Yu-Chin Shih, Hung-Shin Lee, Hsin-Min Wang, Shyh-Kang Jeng
2012Subword speech recognition for detection of unseen words.
Ivan Bulyko, Jose Herrero, Chris Mihelich, Owen Kimball
2012Supervector LDA: A New Approach to Reduced-Complexity I-vector Language Recognition.
Alan McCree, Bengt J. Borgstrom
2012Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine.
Hung-yi Lee, Yu-Yu Chou, Yow-Bang Wang, Lin-Shan Lee
2012Supervised and unsupervised Web-based language model domain adaptation.
Gwénolé Lecorvé, John Dines, Thomas Hain, Petr Motlícek
2012Supervized Mixture of PLDA Models for Cross-Channel Speaker Verification.
Konstantin Simonchik, Timur Pekhovsky, Andrey Shulipa, Anton Afanasyev
2012Syllable perception depends on tone perception.
Iris Chuoying Ouyang, Khalil Iskarous
2012Synthetic F0 Can Effectively Convey Speaker ID in Delexicalized Speech.
Eric Morley, Esther Klabbers, Jan P. H. van Santen, Alexander Kain, Seyed Hamidreza Mohammadi
2012Synthetic References for Template-based ASR using posterior features.
Serena Soldo, Mathew Magimai-Doss, Hervé Bourlard
2012Synthetic Speech Discrimination using Pitch Pattern Statistics Derived from Image Analysis.
Phillip L. De Leon, Bryan Stewart, Junichi Yamagishi
2012Synthetic correction of deviant speech - children's perception of phonologically modified recordings of their own speech.
Sofia Strömbergsson
2012TDOA Estimation for Multiple Speakers in Underdetermined Case.
Mariem Bouafif, Zied Lachiri
2012Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings.
Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll
2012Temporal entrainment in overlapped speech: Cross-linguistic study.
Marcin Wlodarczak, Juraj Simko, Petra Wagner
2012Text-To-Speech Intelligibility Across Speech Rates.
Ann K. Syrdal, H. Timothy Bunnell, Susan R. Hertz, Taniya Mishra, Murray F. Spiegel, Corine A. Bickley, Deborah Rekart, Matthew J. Makashay
2012Text-dependent pathological voice detection.
Gopala Krishna Anumanchipalli, Hugo Meinedo, Miguel M. F. Bugalho, Isabel Trancoso, Luís C. Oliveira, Alan W. Black
2012The 'Audio-Visual Face Cover Corpus': Investigations into audio-visual speech and speaker recognition when the speaker's face is occluded by facewear.
Natalie Fecher
2012The 2011 NIST Language Recognition Evaluation.
Craig S. Greenberg, Alvin F. Martin, Mark A. Przybocki
2012The Automatic Assessment of Non-native Prosody: Combining Classical Prosodic Analysis with Acoustic Modelling.
Florian Hönig, Tobias Bocklet, Korbinian Riedhammer, Anton Batliner, Elmar Nöth
2012The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance.
Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel, Alberto Abad, David Martínez González, Jesús Antonio Villalba López, Alfonso Ortega, Eduardo Lleida
2012The EHU Systems for the NIST 2011 Language Recognition Evaluation.
Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez-Fuentes, Mireia Díez, Germán Bordel
2012The Effect of Spectral Estimator on Common Spectral Measures for Sibilant Fricatives.
Patrick Reidy, Mary E. Beckman
2012The Effect of Use of Drugs on Speaker's Fundamental Frequency and Formants.
Andrey N. Raev, Yuri Matveev, Tatiana Goloshchapova
2012The Effects of Lexical Tones and Nasal Coda /-n/ to Sadness in Taiwan Hakka.
Shao-Ren Lyu
2012The F0 fall delay of lexical pitch accent in Japanese Infant-directed speech.
Yoko Saikachi, Mafuyu Kitahara, Ken'ya Nishikawa, Ai Kanato, Reiko Mazuka
2012The IIIT-H Indic Speech Databases.
Kishore Prahallad, Naresh Kumar Elluru, Venkatesh Keri, Rajendran S, Alan W. Black
2012The INTERSPEECH 2012 Speaker Trait Challenge.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Elmar Nöth, Alessandro Vinciarelli, Felix Burkhardt, Rob van Son, Felix Weninger, Florian Eyben, Tobias Bocklet, Gelareh Mohammadi, Benjamin Weiss
2012The Intelligibility of Lombard Speech: Communicative setting matters.
Michael Fitzpatrick, Jeesun Kim, Chris Davis
2012The Role of Creaky Voice in Mandarin Tone 2 and Tone 3 Perception.
Rui Cao, Ratree Wayland, Edith Kaan
2012The Role of Score Calibration in Speaker Recognition.
George R. Doddington
2012The Speech Recognition Virtual Kitchen: An Initial Prototype.
Florian Metze, Eric Fosler-Lussier
2012The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training.
Xiaojun Qian, Helen M. Meng, Frank K. Soong
2012The effect of dichotic processing on the perception of binaural cues.
Akiko Amano-Kusumoto, Justin M. Aronoff, Motokuni Itoh, Sigfrid D. Soli
2012The entropy of intoxicated speech - lexical creativity and heavy tongues.
Uwe D. Reichel
2012The log-Gabor method: speech classification using spectrogram image analysis.
Harm Buisman, Eric O. Postma
2012The processes underlying two frequent casual speech phenomena in Dutch: A production experiment.
Iris Hanique, Mirjam Ernestus
2012The production and perception of Estonian quantity degrees by native and non-native speakers.
Lya Meister, Einar Meister
2012Tied-State Mixture Language Model for WFST-based Speech Recognition.
Hitoshi Yamamoto, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka
2012Time Delay Estimation for Speech Signal Based on FOC-Spectrum.
Hong Liu, Xiaofei Li
2012Toward an Optimum Feature Set and HMM Model Parameters for Automatic Phonetic Alignment of Spontaneous Speech.
Montri Karnjanadecha, Stephen A. Zahorian
2012Towards Automated Annotation of Audio and Video Recordings by Application of Advanced Web-services.
Przemyslaw Lenkiewicz, Dieter Van Uytvanck, Peter Wittenburg, Sebastian Drude
2012Towards Empirical Dialog-State Modeling and its Use in Language Modeling.
Nigel G. Ward, Alejandro Vega
2012Towards Glottal Source Controllability in Expressive Speech Synthesis.
Jaime Lorenzo-Trueba, Roberto Barra-Chicote, Tuomo Raitio, Nicolas Obin, Paavo Alku, Junichi Yamagishi, Juan Manuel Montero
2012Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis.
Leonardo Badino, Robert A. J. Clark
2012Towards Recurrent Neural Networks Language Models with Linguistic and Contextual Features.
Yangyang Shi, Pascal Wiggers, Catholijn M. Jonker
2012Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization.
Jaime Lorenzo-Trueba, Beatriz Martínez-González, Roberto Barra-Chicote, Verónica López-Ludeña, Javier Ferreiros, Junichi Yamagishi, Juan Manuel Montero
2012Training Deep Nets with Imbalanced and Unlabeled Data.
Jeffrey Berry, Ian R. Fasel, Luciano Fadiga, Diana Archangeli
2012Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS.
Ji He, Yao Qian, Frank K. Soong, Sheng Zhao
2012Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy.
Korin Richmond, Steve Renals
2012Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR.
Ramón Fernandez Astudillo, Alberto Abad, João Paulo Neto
2012Unconstrained Speech Separation by Composition of Longest Segments.
Ji Ming, Ramji Srinivasan, Danny Crookes
2012Unsupervised Acoustic Analyses of Normal and Lombard Speech, with Spectral Envelope Transformation to Improve Intelligibility.
Elizabeth Godoy, Yannis Stylianou
2012Unsupervised Deep Belief Features for Speech Translation.
Sameer Maskey, Bowen Zhou
2012Unsupervised NAP Training Data Design for Speaker Recognition.
Hanwu Sun, Bin Ma
2012Unsupervised Speaker Identification using Overlaid Texts in TV Broadcast.
Johann Poignant, Hervé Bredin, Viet Bac Le, Laurent Besacier, Claude Barras, Georges Quénot
2012Unveiling the Acoustic Properties that Describe the Valence Dimension.
Carlos Busso, Tauhidur Rahman
2012Using Bayesian Networks to find relevant context features for HMM-based speech synthesis.
Heng Lu, Simon King
2012Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition.
Yi Ren Leng, Tran Huy Dat
2012Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders.
Christophe Veaux, Junichi Yamagishi, Simon King
2012Using Prominence and Phrasing Predictions to Improve Weighted Dictionary Pronunciation Models.
Andrew Rosenberg
2012Using Quality Ratings to Predict Modality Choice in Multimodal Systems.
Ina Wechsung, Klaus-Peter Engelbrecht, Sebastian Möller
2012Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence.
Peter A. Heeman, Jordan Fryer, Rebecca Lunsford, Andrew Rueckert, Ethan Selfridge
2012Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR.
Yang Sun, Bert Cranen, Jort F. Gemmeke, Louis ten Bosch, Lou Boves, Mathew M. Doss
2012Using Sub-word-level Information for Confidence Estimation with Conditional Random Field Models.
Matthew Stephen Seigel, Philip C. Woodland
2012Using Time-Synchronous Phone Co-occurrences in a SVM-Phonotactic Dialect Recognition System.
Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel, Mireia Díez
2012Using broad phonetic classes to guide search in automatic speech recognition.
Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier
2012Using context-free grammars for embedded speech recognition with Weighted Finite-State Transducers.
Frank Duckhorn, Rüdiger Hoffmann
2012Using i-Vector Space Model for Emotion Recognition.
Rui Xia, Yang Liu
2012Using magnetic resonance to image the pharynx during Arabic speech: Static and dynamic aspects.
Ryan Shosted, Bradley P. Sutton, Abbas Benmamoun
2012Using spectral measures to differentiate Mandarin and Korean sibilant fricatives.
Jeffrey Kallay, Jeffrey J. Holliday
2012Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech.
Emma Jokinen, Paavo Alku, Martti Vainio
2012Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse Filtering.
Harri Auvinen, Tuomo Raitio, Samuli Siltanen, Paavo Alku
2012Verifying Session Level Pronunciation Accuracy in a Speech Therapy Application.
Shou-Chun Yin, Richard C. Rose, Yun Tang
2012VisArtico: a visualization tool for articulatory data.
Slim Ouni, Loic Mangeonjean, Ingmar Steiner
2012Visualizing tool for evaluating inter-label similarity in prosodic labeling experiments.
David Escudero Mancebo, Eva Estebas-Vilaplana
2012Vocal Tremor Measurement Based on Autocorrelation of Contours.
Markus Brückl
2012Vocal-Source Biomarkers for Depression: A Link to Psychomotor Activity.
Thomas F. Quatieri, Nicolas Malyska
2012Voice Activity Detection Using Speech Recognizer Feedback.
Kit Thambiratnam, Weiwu Zhu, Frank Seide
2012Voice Production Mechanisms of Vibrato in Noh.
Ikuyo Yoshinaga, Jiangping Kong
2012Voice Query Refinement.
Cyril Allauzen, Edward Benson, Ciprian Chelba, Michael Riley, Johan Schalkwyk
2012Voice source analysis using biomechanical modeling and glottal inverse filtering.
Alan Pinheiro, Tuomo Raitio, Danyane Gomes, Paavo Alku
2012Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis.
Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi
2012Vowels Produced by Sliding Three-tube Model with Different Lengths.
Takayuki Arai
2012Ways to Implement Global Variance in Statistical Speech Synthesis.
Hanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj
2012Where did I go wrong?: Identifying troublesome segments for speaker diarization systems.
Mary Tai Knox, Nikki Mirghafori, Gerald Friedland
2012Where to associate stressed additive particles? Evidence from speech prosody.
Bettina Braun
2012White Listing and Score Normalization for Keyword Spotting of Noisy Speech.
Bing Zhang, Richard M. Schwartz, Stavros Tsakalidis, Long Nguyen, Spyros Matsoukas
2012Whole-Word Recognition from Articulatory Movements for Silent Speech Interfaces.
Jun Wang, Ashok Samal, Jordan R. Green, Frank Rudzicz
2012Wideband Parametric Speech Synthesis Using Warped Linear Prediction.
Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku
2012Word Discovery with Beta Process Factor Analysis.
Niklas Vanhainen, Giampiero Salvi
2012Word Prominence Detection using Robust yet Simple Prosodic Features.
Taniya Mishra, Vivek Kumar Rangarajan Sridhar, Alistair Conkie
2012Word Relevance Modeling for Speech Recognition.
Kuan-Yu Chen, Hao-Chin Chang, Berlin Chen, Hsin-Min Wang
2012Sparse banded precision matrices for low resource speech recognition.
Weibin Zhang, Pascale Fung