INTERSPEECH - RankMe

819 papers

Year	Title / Authors
2016	/r/ as Language Marker in Bilingual Speech Production and Perception. Constantijn Kaland, Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti
2016	17th Annual Conference of the International Speech Communication Association, Interspeech 2016, San Francisco, CA, USA, September 8-12, 2016. Nelson Morgan
2016	A 50-Year Retrospective on Speech and Language Processing. John Makhoul
2016	A Class-Specific Speech Enhancement for Phoneme Recognition: A Dictionary Learning Approach. Nazreen P. M., A. G. Ramakrishnan, Prasanta Kumar Ghosh
2016	A Convex Model for Linguistic Influence in Group Conversations. Kan Kawabata, Visar Berisha, Anna Scaglione, Amy LaCross
2016	A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement. Ziteng Wang, Xu Li, Xiaofei Wang, Qiang Fu, Yonghong Yan
2016	A DNN-HMM Approach to Story Segmentation. Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li
2016	A Deep Learning Approach to Modeling Empathy in Addiction Counseling. James Gibson, Dogan Can, Bo Xiao, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016	A Divide-and-Conquer Approach for Language Identification Based on Recurrent Neural Networks. Gregory Gelly, Jean-Luc Gauvain, Viet Bac Le, Abdelkhalek Messaoudi
2016	A Fast and Accurate Fundamental Frequency Estimator Using Recursive Moving Average Filters. Ryunosuke Daido, Yuji Hisaminato
2016	A Feature Normalisation Technique for PLLR Based Language Identification Systems. Sarith Fernando, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016	A Feature Study for Masking-Based Reverberant Speech Separation. Masood Delfarah, DeLiang Wang
2016	A Framework for Automated Marmoset Vocalization Detection and Classification. Alan Wisler, Laura J. Brattain, Rogier Landman, Thomas F. Quatieri
2016	A Framework for Practical Multistream ASR. Sri Harish Reddy Mallidi, Hynek Hermansky
2016	A French Corpus for Distant-Microphone Speech Processing in Real Homes. Nancy Bertin, Ewen Camberlein, Emmanuel Vincent, Romain Lebarbenchon, Stéphane Peillon, Éric Lamande, Sunit Sivasankaran, Frédéric Bimbot, Irina Illina, Ariane Tom, Sylvain Fleury, Éric Jamet
2016	A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks. Takenori Yoshimura, Gustav Eje Henter, Oliver Watts, Mirjam Wester, Junichi Yamagishi, Keiichi Tokuda
2016	A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training. Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura
2016	A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences. Feng-Long Xie, Frank K. Soong, Haifeng Li
2016	A Longitudinal Study of Children's Intonation in Narrative Speech. Jeffrey Kallay, Melissa A. Redford
2016	A Low Cost Desktop Robot and Tele-Presence Device for Interactive Speech Research. Michael C. Brady
2016	A Multimodal Dialogue System for Air Traffic Control Trainees Based on Discrete-Event Simulation. Lubos Smídl, Adam Chýlek, Jan Svec
2016	A New Model for Acoustic Wave Propagation and Scattering in the Vocal Tract. Jianguo Wei, Wendan Guan, Darcy Q. Hou, Dingyi Pan, Wenhuan Lu, Jianwu Dang
2016	A New Model of Speech Motor Control Based on Task Dynamics and State Feedback. Vikram Ramanarayanan, Benjamin Parrell, Louis Goldstein, Srikantan S. Nagarajan, John F. Houde
2016	A New Pre-Training Method for Training Deep Learning Models with Application to Spoken Language Understanding. Asli Celikyilmaz, Ruhi Sarikaya, Dilek Hakkani-Tür, Xiaohu Liu, Nikhil Ramesh, Gökhan Tür
2016	A Nonparametric Bayesian Approach for Spoken Term Detection by Example Query. Amir Hossein Harati Nejad Torbati, Joseph Picone
2016	A Novel Discriminative Score Calibration Method for Keyword Search. Zhiqiang Lv, Meng Cai, Wei-Qiang Zhang, Jia Liu
2016	A Novel Research to Artificial Bandwidth Extension Based on Deep BLSTM Recurrent Neural Networks and Exemplar-Based Sparse Representation. Bin Liu, Jianhua Tao
2016	A Novel Risk-Estimation-Theoretic Framework for Speech Enhancement in Nonstationary and Non-Gaussian Noise Conditions. Jishnu Sadasivan, Chandra Sekhar Seelamantula
2016	A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments. Alessio Brutti, Antigoni Tsiami, Athanasios Katsamanis, Petros Maragos
2016	A Portable Automatic PA-TA-KA Syllable Detection System to Derive Biomarkers for Neurological Disorders. Fei Tao, Louis Daudet, Christian Poellabauer, Sandra L. Schneider, Carlos Busso
2016	A Praat-Based Algorithm to Extract the Amplitude Envelope and Temporal Fine Structure Using the Hilbert Transform. Lei He, Volker Dellwo
2016	A Preliminary Ultrasound Study of Nasal and Lateral Coronals in Arrernte. Marija Tabain, Richard Beare
2016	A Real-Time Framework for Visual Feedback of Articulatory Data Using Statistical Shape Models. Kristy James, Alexander Hewer, Ingmar Steiner, Stefanie Wuhrer
2016	A Real-Time Parametric General-Purpose Mammalian Vocal Synthesiser. Roger K. Moore
2016	A Robust Dual-Microphone Speech Source Localization Algorithm for Reverberant Environments. Yanmeng Guo, Xiaofei Wang, Chao Wu, Qiang Fu, Ning Ma, Guy J. Brown
2016	A Robust Non-Parametric and Filtering Based Approach for Glottal Closure Instant Detection. Pradeep Rengaswamy, Gurunath Reddy M., K. Sreenivasa Rao, Pallab Dasgupta
2016	A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems. Layla El Asri, Jing He, Kaheer Suleman
2016	A Sparse Spherical Harmonic-Based Model in Subbands for Head-Related Transfer Functions. Xiaoke Qi, Jianhua Tao
2016	A Speaker Diarization System for Studying Peer-Led Team Learning Groups. Harishchandra Dubey, Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen
2016	A Speaker Recognition System for the SITW Challenge. Oleg Kudashev, Sergey Novoselov, Konstantin Simonchik, Alexander Kozlov
2016	A Spectral Modulation Sensitivity Weighted Pre-Emphasis Filter for Active Noise Control System. Kah-Meng Cheong, Yuh-Yuan Wang, Tai-Shih Chi
2016	A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding. Vedran Vukotic, Christian Raymond, Guillaume Gravier
2016	A Stochastic Model for Computer-Aided Human-Human Dialogue. Merwan Barlier, Romain Laroche, Olivier Pietquin
2016	A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs. Srikanth Ronanki, Gustav Eje Henter, Zhizheng Wu, Simon King
2016	A Voice Conversion Mapping Function Based on a Stacked Joint-Autoencoder. Seyed Hamidreza Mohammadi, Alexander Kain
2016	A WFST Framework for Single-Pass Multi-Stream Decoding. Sirui Xu, Eric Fosler-Lussier
2016	A priori SNR Estimation Using a Generalized Decision Directed Approach. Aleksej Chinaev, Reinhold Haeb-Umbach
2016	ARET - Automatic Reading of Educational Texts for Visually Impaired Students. Martin Gruber, Jindrich Matousek, Zdenek Hanzlícek, Zdenek Krnoul, Zbynek Zajíc
2016	ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks. Miguel Ángel del Agua, Santiago Piqueras, Adrià Giménez, Alberto Sanchís, Jorge Civera, Alfons Juan
2016	ASR for South Slavic Languages Developed in Almost Automated Way. Jan Nouza, Radek Safarík, Petr Cerva
2016	AUT System for SITW Speaker Recognition Challenge. Abbas Khosravani, Mohammad Mehdi Homayounpour
2016	Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features. Yishan Jiao, Ming Tu, Visar Berisha, Julie M. Liss
2016	Acoustic Analysis of Syllables Across Indian Languages. Anusha Prakash, Jeena J. Prakash, Hema A. Murthy
2016	Acoustic Differences Between English /t/ Glottalization and Phrasal Creak. Marc Garellek, Scott Seyfarth
2016	Acoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units. Markus Nußbaum-Thom, Jia Cui, Bhuvana Ramabhadran, Vaibhava Goel
2016	Acoustic Modelling from the Signal Domain Using CNNs. Pegah Ghahremani, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur
2016	Acoustic Properties of Formality in Conversational Japanese. Ethan Sherr-Ziarko
2016	Acoustic Word Embeddings for ASR Error Detection. Sahar Ghannay, Yannick Estève, Nathalie Camelin, Paul Deléglise
2016	Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech. Slim Ouni, Vincent Colotte, Sara Dahmani, Soumaya Azzi
2016	Acoustic-Prosodic and Turn-Taking Features in Interactions with Children with Neurodevelopmental Disorders. Daniel Bone, Somer Bishop, Rahul Gupta, Sungbok Lee, Shrikanth S. Narayanan
2016	Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model. Patrick Lumban Tobing, Tomoki Toda, Hirokazu Kameoka, Satoshi Nakamura
2016	Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models. Thomas Drugman, Janne Pylkkönen, Reinhard Kneser
2016	Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations. Tasha Nagamine, Zhuo Chen, Nima Mesgarani
2016	Adaptive Group Sparsity for Non-Negative Matrix Factorization with Application to Unsupervised Source Separation. Xu Li, Ziteng Wang, Xiaofei Wang, Qiang Fu, Yonghong Yan
2016	Adaptive Latency for Part-of-Speech Tagging in Incremental Text-to-Speech Synthesis. Maël Pouget, Olha Nahorna, Thomas Hueber, Gérard Bailly
2016	Advances in Very Deep Convolutional Neural Networks for LVCSR. Tom Sercu, Vaibhava Goel
2016	Adversarial Multi-Task Learning of Deep Neural Networks for Robust Speech Recognition. Yusuke Shinohara
2016	An Acoustic Analysis of /r/ in Tyrolean. Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti, Constantijn Kaland
2016	An Acoustic Analysis of Child-Child and Child-Robot Interactions for Understanding Engagement during Speech-Controlled Computer Games. Theodora Chaspari, Jill Fain Lehman
2016	An Adaptive Multi-Band System for Low Power Voice Command Recognition. Qing He, Gregory W. Wornell, Wei Ma
2016	An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging. Aurore Jaumard-Hakoun, Kele Xu, Clémence Leboullenger, Pierre Roussel-Ragot, Bruce Denby
2016	An Automatic Training Tool for Air Traffic Control Training. Petr Stanislav, Lubos Smídl, Jan Svec
2016	An Engine for Online Video Search in Large Archives of the Holocaust Testimonies. Petr Stanislav, Jan Svec, Pavel Ircing
2016	An Expectation Maximization Approach to Joint Modeling of Multidimensional Ratings Derived from Multiple Annotators. Anil Ramakrishna, Rahul Gupta, Ruth B. Grossman, Shrikanth S. Narayanan
2016	An Improved 3D Geometric Tongue Model. Qiang Fang, Yun Chen, Haibo Wang, Jianguo Wei, Jianrong Wang, Xiyu Wu, Aijun Li
2016	An Interaural Magnification Algorithm for Enhancement of Naturally-Occurring Level Differences. Shadi Pirhosseinloo, Kostas Kokkinakis
2016	An Investigation of DNN-Based Speech Synthesis Using Speaker Codes. Nobukatsu Hojo, Yusuke Ijima, Hideyuki Mizuno
2016	An Investigation of Deep Neural Network Architectures for Language Recognition in Indian Languages. Mounika K. V., Sivanand Achanta, Lakshmi H. R., Suryakanth V. Gangashetty, Anil Kumar Vuppala
2016	An Investigation of Emotional Speech in Depression Classification. Brian Stasak, Julien Epps, Nicholas Cummins, Roland Goecke
2016	An Investigation of Recurrent Neural Network Architectures Using Word Embeddings for Phrase Break Prediction. Anandaswarup Vadapalli, Suryakanth V. Gangashetty
2016	An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions. Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li
2016	An Investigation on Training Deep Neural Networks Using Probabilistic Transcriptions. Amit Das, Mark Hasegawa-Johnson
2016	An Investigation on the Use of i-Vectors for Robust ASR. Dimitrios Dimitriadis, Samuel Thomas, Sriram Ganapathy
2016	An Iterative Phase Recovery Framework with Phase Mask for Spectral Mapping with an Application to Speech Enhancement. Kehuang Li, Bo Wu, Chin-Hui Lee
2016	An Objective Evaluation Methodology for Blind Bandwidth Extension. Stéphane Villette, Sen Li, Pravin Ramadas, Daniel J. Sinder
2016	Analysis of Chinese Syllable Durations in Running Speech of Japanese L2 Learners. Yue Sun, Shudon Hsiao, Yoshinori Sagisaka, Jinsong Zhang
2016	Analysis of Face Mask Effect on Speaker Recognition. Rahim Saeidi, Ilkka Huhtakallio, Paavo Alku
2016	Analysis of Glottal Stop in Assam Sora Language. Sishir Kalita, Luke Horo, Priyankoo Sarmah, S. R. Mahadeva Prasanna, Samarendra Dandapat
2016	Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages. Van Hai Do, Nancy F. Chen, Boon Pang Lim, Mark Hasegawa-Johnson
2016	Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features. Ozlem Kalinli
2016	Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge. Ondrej Novotný, Pavel Matejka, Oldrich Plchot, Ondrej Glembek, Lukás Burget, Jan Cernocký
2016	Analysis of the Voice Conversion Challenge 2016 Evaluation Results. Mirjam Wester, Zhizheng Wu, Junichi Yamagishi
2016	Analysis on Gated Recurrent Unit Based Question Detection Approach. Yaodong Tang, Zhiyong Wu, Helen M. Meng, Mingxing Xu, Lianhong Cai
2016	Analytical Assessment of Dual-Stream Merging for Noise-Robust ASR. Louis ten Bosch, Bert Cranen, Yang Sun
2016	Analyzing Temporal Dynamics of Dyadic Synchrony in Affective Interactions. Zhaojun Yang, Shrikanth S. Narayanan
2016	Analyzing the Contribution of Top-Down Lexical and Bottom-Up Acoustic Cues in the Detection of Sentence Prominence. Sofoklis Kakouros, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, Okko Räsänen
2016	Analyzing the Relation Between Overall Quality and the Quality of Individual Phases in a Telephone Conversation. Friedemann Köster, Sebastian Möller
2016	Anchored Speech Detection. Roland Maas, Sree Hari Krishnan Parthasarathi, Brian John King, Ruitong Huang, Björn Hoffmeister
2016	Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016. Fernando Villavicencio, Junichi Yamagishi, Jordi Bonada, Felipe Espic
2016	Articulation Rate Filtering of CQCC Features for Automatic Speaker Verification. Massimiliano Todisco, Héctor Delgado, Nicholas W. D. Evans
2016	Articulation Rate in Adverse Listening Conditions in Younger and Older Adults. Outi Tuomainen, Valérie Hazan
2016	Articulatory Feature Extraction Using CTC to Build Articulatory Classifiers Without Forced Frame Alignments for Speech Recognition. Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy
2016	Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data. Asterios Toutios, Tanner Sorensen, Krishna Somandepalli, Rachel Alexander, Shrikanth S. Narayanan
2016	Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks. Zheng-Chen Liu, Zhen-Hua Ling, Li-Rong Dai
2016	Artificial Neural Network-Based Feature Combination for Spatial Voice Activity Detection. Stefan Meier, Walter Kellermann
2016	Assessing Idiosyncrasies in a Bayesian Model of Speech Communication. Marie-Lou Barnaud, Julien Diard, Pierre Bessière, Jean-Luc Schwartz
2016	Assessing Level-Dependent Segmental Contribution to the Intelligibility of Speech Processed by Single-Channel Noise-Suppression Algorithms. Tian Guan, Guangxing Chu, Fei Chen, Feng Yang
2016	Assessing Speech Quality in Speech-Aware Hearing Aids Based on Phoneme Posteriorgrams. Constantin Spille, Hendrik Kayser, Hynek Hermansky, Bernd T. Meyer
2016	At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech. Maximilian Schmitt, Fabien Ringeval, Björn W. Schuller
2016	Attention Assisted Discovery of Sub-Utterance Structure in Speech Emotion Recognition. Che-Wei Huang, Shrikanth S. Narayanan
2016	Attention-Based Convolutional Neural Networks for Sentence Classification. Zhiwei Zhao, Youzheng Wu
2016	Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling. Bing Liu, Ian R. Lane
2016	Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder. Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee
2016	Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings. Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, Alexandros Potamianos
2016	Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss. Yuki Takashima, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
2016	Audio-to-Visual Speech Conversion Using Deep Neural Networks. Sarah Taylor, Akihiro Kato, Iain A. Matthews, Ben P. Milner
2016	Audiovisual Speech Scene Analysis in the Context of Competing Sources. Attigodu C. Ganesh, Frédéric Berthommier, Jean-Luc Schwartz
2016	Audiovisual Training Effects for Japanese Children Learning English /r/-/l/. Yasuaki Shinohara
2016	Auditory Processing Impairments Under Background Noise in Children with Non-Syndromic Cleft Lip and/or Palate. Yang Feng, Zhang Lu
2016	Auditory-Visual Lexical Tone Perception in Thai Elderly Listeners with and without Hearing Impairment. Benjawan Kasisopa, Chutamanee Onsuwan, Charturong Tantibundhit, Nittayapa Klangpornkun, Suparak Techacharoenrungrueang, Sudaporn Luksaneeyanawin, Denis Burnham
2016	Auditory-Visual Perception of VCVs Produced by People with Down Syndrome: Preliminary Results. Alexandre Hennequin, Amélie Rochet-Capellan, Marion Dohen
2016	Automated Pause Insertion for Improved Intelligibility Under Reverberation. Petko Nikolov Petkov, Norbert Braunschweiler, Yannis Stylianou
2016	Automated Screening of Speech Development Issues in Children by Identifying Phonological Error Patterns. Lauren Ward, Alessandro Stefani, Daniel V. Smith, Andreas Duenser, Jill Freyne, Barbara Dodd, Angela Morgan
2016	Automatic Analysis of Phonetic Speech Style Dimensions. Neville Ryant, Mark Y. Liberman
2016	Automatic Analysis of Typical and Atypical Encoding of Spontaneous Emotion in the Voice of Children. Fabien Ringeval, Erik Marchi, Charline Grossard, Jean Xavier, Mohamed Chetouani, David Cohen, Björn W. Schuller
2016	Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners. Shuju Shi, Yosuke Kashiwagi, Shohei Toyama, Junwei Yue, Yutaka Yamauchi, Daisuke Saito, Nobuaki Minematsu
2016	Automatic Classification of Lexical Stress in English and Arabic Languages Using Deep Learning. Mostafa Ali Shahin, Julien Epps, Beena Ahmed
2016	Automatic Classification of Phonation Modes in Singing Voice: Towards Singing Style Characterisation and Application to Ethnomusicological Recordings. Jean-Luc Rouas, Leonidas Ioannidis
2016	Automatic Correction of ASR Outputs by Using Machine Translation. Luis Fernando D'Haro, Rafael E. Banchs
2016	Automatic Detection of Parkinson's Disease Based on Modulated Vowels. Daria Hemmerling, Juan Rafael Orozco-Arroyave, Andrzej Skalski, Janusz Gajda, Elmar Nöth
2016	Automatic Dialect Detection in Arabic Broadcast Speech. Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James R. Glass, Peter Bell, Steve Renals
2016	Automatic Discrimination of Soft Voice Onset Using Acoustic Features of Breathy Voicing. Keiko Ochi, Koichi Mori, Naomi Sakai, Nobutaka Ono
2016	Automatic Estimation of Perceived Sincerity from Spoken Language. Brandon M. Booth, Rahul Gupta, Pavlos Papadopoulos, Ruchir Travadi, Shrikanth S. Narayanan
2016	Automatic Genre and Show Identification of Broadcast Media. Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain
2016	Automatic Glottal Inverse Filtering with Non-Negative Matrix Factorization. Manu Airaksinen, Lauri Juvela, Tom Bäckström, Paavo Alku
2016	Automatic Measurement of Voice Onset Time and Prevoicing Using Recurrent Neural Networks. Yossi Adi, Joseph Keshet, Olga Dmitrieva, Matthew Goldrick
2016	Automatic Paragraph Segmentation with Lexical and Prosodic Features. Catherine Lai, Mireia Farrús, Johanna D. Moore
2016	Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures. Ju Lin, Yanlu Xie, Jinsong Zhang
2016	Automatic Pronunciation Generation by Utilizing a Semi-Supervised Deep Neural Networks. Naoya Takahashi, Tofigh Naghibi, Beat Pfister
2016	Automatic Recognition of Social Roles Using Long Term Role Transitions in Small Group Interactions. Gaurav Fotedar, Aditya Gaonkar P., Saikat Chatterjee, Prasanta Kumar Ghosh
2016	Automatic Scoring of Monologue Video Interviews Using Multimodal Cues. Lei Chen, Gary Feng, Michelle P. Martin-Raugh, Chee Wee Leong, Christopher Kitchen, Su-Youn Yoon, Blair Lehman, Harrison Kell, Chong Min Lee
2016	Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka. Amit Das, Preethi Jyothi, Mark Hasegawa-Johnson
2016	Automatic Speech Transcription for Low-Resource Languages - The Case of Yoloxóchitl Mixtec (Mexico). Vikramjit Mitra, Andreas Kathol, Jonathan D. Amith, Rey Castillo García
2016	Automatically Classifying Self-Rated Personality Scores from Speech. Guozhen An, Sarah Ita Levitan, Rivka Levitan, Andrew Rosenberg, Michelle Levine, Julia Hirschberg
2016	Bayesian Modeling in Speech Motor Control: A Principled Structure for the Integration of Various Constraints. Jean-François Patri, Pascal Perrier, Julien Diard
2016	Behavioral Coding of Therapist Language in Addiction Counseling Using Recurrent Neural Networks. Bo Xiao, Dogan Can, James Gibson, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016	Bertsokantari: a TTS Based Singing Synthesis System. Eder del Blanco, Inma Hernáez, Eva Navas, Xabier Sarasola, Daniel Erro
2016	Better Evaluation of ASR in Speech Translation Context Using Word Embeddings. Ngoc-Tien Le, Christophe Servan, Benjamin Lecouteux, Laurent Besacier
2016	Between- and Within-Speaker Effects of Bilingualism on F0 Variation. Rob Voigt, Dan Jurafsky, Meghan Sumner
2016	Beyond Utterance Extraction: Summary Recombination for Speech Summarization. Jérémy Trione, Benoît Favre, Frédéric Béchet
2016	Bidirectional Recurrent Neural Network with Attention Mechanism for Punctuation Restoration. Ottokar Tilk, Tanel Alumäe
2016	Bird Song Synthesis Based on Hidden Markov Models. Jordi Bonada, Robert Lachlan, Merlijn Blaauw
2016	Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs. Mahdie Karbasi, Ahmed Hussen Abdelaziz, Hendrik Meutzner, Dorothea Kolossa
2016	Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding. Tom Bäckström, Florin Ghido, Johannes Fischer
2016	Blind Speech Separation with GCC-NMF. Sean U. N. Wood, Jean Rouat
2016	CNN-Based Phone Segmentation Experiments in a Less-Represented Language. Céline Manenti, Thomas Pellegrini, Julien Pinquier
2016	Call Alternation Between Specific Pairs of Male Frogs Revealed by a Sound-Imaging Method in Their Natural Habitat. Ikkyu Aihara, Takeshi Mizumoto, Hiromitsu Awano, Hiroshi G. Okuno
2016	Can Intensive Exposure to Foreign Language Sounds Affect the Perception of Native Sounds? Jian Gong, María Luisa García Lecumberri, Martin Cooke
2016	Categorization of Natural Spanish Whistled Vowels by Naïve Spanish Listeners. Julien Meyer, Laure Dentel, Fanny Meunier
2016	Causal Speech Enhancement Combining Data-Driven Learning and Suppression Rule Estimation. Seyedmahdad Mirsamadi, Ivan Tashev
2016	Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance. Cristina Guerrero, Georgina Tryfou, Maurizio Omologo
2016	Characterization of Audiovisual Dramatic Attitudes. Adela Barbulescu, Rémi Ronfard, Gérard Bailly
2016	Characterizing Vocal Tract Dynamics Across Speakers Using Real-Time MRI. Tanner Sorensen, Asterios Toutios, Louis Goldstein, Shrikanth S. Narayanan
2016	Classification of Voice Modality Using Electroglottogram Waveforms. Michal Borsky, Daryush D. Mehta, Julius P. Gudjohnsen, Jón Guðnason
2016	Closing Remarks. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016	CloudCAST - Remote Speech Technology for Speech Professionals. Phil D. Green, Ricard Marxer, Stuart P. Cunningham, Heidi Christensen, Frank Rudzicz, Maria Yancheva, André Coy, Massimiliano Malavasi, Lorenzo Desideri, Fabio Tamburini
2016	Coda Stop and Taiwan Min Checked Tone Sound Changes. Ho-Hsien Pan, Hsiao-tung Huang, Shao-Ren Lyu
2016	Colloquialising Modern Standard Arabic Text for Improved Speech Recognition. Sarah Al-Shareef, Thomas Hain
2016	Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection. Sarah Ita Levitan, Guozhen An, Min Ma, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg
2016	Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition. Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai
2016	Combining Data-Oriented and Process-Oriented Approaches to Modeling Reaction Time Data. Louis ten Bosch, Lou Boves, Mirjam Ernestus
2016	Combining Energy and Cross-Entropy Analysis for Nuclear Segments Detection. Antonio Origlia, Francesco Cutugno
2016	Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition. Salil Deena, Madina Hasan, Mortaza Doulaty, Oscar Saz, Thomas Hain
2016	Combining Mask Estimates for Single Channel Audio Source Separation Using Deep Neural Networks. Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley
2016	Combining Non-Pathological Data of Different Language Varieties to Improve DNN-HMM Performance on Pathological Speech. Emre Yilmaz, Mario Ganzeboom, Catia Cucchiarini, Helmer Strik
2016	Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection. Axel Horndasch, Anton Batliner, Caroline Kaufhold, Elmar Nöth
2016	Combining State-Level Spotting and Posterior-Based Acoustic Match for Improved Query-by-Example Spoken Term Detection. Shuji Oishi, Tatsuya Matsuba, Mitsuaki Makino, Atsuhiko Kai
2016	Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting. Raymond W. M. Ng, Bhusan Chettri, Thomas Hain
2016	Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition. Shiliang Zhang, Hui Jiang, Shifu Xiong, Si Wei, Li-Rong Dai
2016	Comparing Articulatory and Acoustic Strategies for Reducing Non-Native Accents. Sandesh Aryal, Ricardo Gutierrez-Osuna
2016	Comparing Different Methods for Analyzing ERP Signals. Kimberley Mulder, Louis ten Bosch, Lou Boves
2016	Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-Based Speech Synthesis Model. Fei Chen, Benson C. L. Chiao
2016	Comparing the Influence of Spectro-Temporal Integration in Computational Speech Segregation. Thomas Bentsen, Tobias May, Abigail A. Kressner, Torsten Dau
2016	Comparison of Multiple System Combination Techniques for Keyword Spotting. William Hartmann, Le Zhang, Kerri Barnes, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz
2016	Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling. Ehsan Variani, Tara N. Sainath, Izhak Shafran, Michiel Bacchiani
2016	Complexity in Prosody: A Nonlinear Dynamical Systems Approach for Dyadic Conversations; Behavior and Outcomes in Couples Therapy. Md. Nasir, Brian R. Baucom, Shrikanth S. Narayanan, Panayiotis G. Georgiou
2016	Compositional Neural Network Language Models for Agglutinative Languages. Ebru Arisoy, Murat Saraclar
2016	Computational Approaches to Linguistic Code Switching. Mona T. Diab, Pascale Fung, Julia Hirschberg, Thamar Solorio
2016	Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion. Abir Masmoudi, Mariem Ellouze, Fethi Bougares, Yannick Estève, Lamia Hadrich Belguith
2016	Congruency Effect Between Articulation and Grasping in Native English Speakers. Mikko Tiainen, Fatima M. Felisberti, Kaisa Tiippana, Martti Vainio, Juraj Simko, Jirí Lukavský, Lari Vainio
2016	Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models. Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Yoshioka, Dung T. Tran, Tomohiro Nakatani
2016	Context Aware Mispronunciation Detection for Mandarin Pronunciation Training. Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li
2016	Context-Aware Restaurant Recommendation for Natural Language Queries: A Formative User Study in the Automotive Domain. Philipp Fischer, Cornelius Styp von Rekowski, Andreas Nürnberger
2016	Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs. Chiori Hori, Takaaki Hori, Shinji Watanabe, John R. Hershey
2016	Contextual Prediction Models for Speech Recognition. Yoni Halpern, Keith B. Hall, Vlad Schogol, Michael Riley, Brian Roark, Gleb Skobeltsyn, Martin Bäuml
2016	Conversational Engagement Recognition Using Auditory and Visual Cues. Yuyun Huang, Emer Gilmartin, Nick Campbell
2016	Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data. Colin Vaz, Asterios Toutios, Shrikanth S. Narayanan
2016	Convolutional Neural Networks with Data Augmentation for Classifying Speakers' Native Language. Gil Keren, Jun Deng, Jouni Pohjalainen, Björn W. Schuller
2016	Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition. Vikramjit Mitra, Horacio Franco
2016	Corpora for the Evaluation of Robust Speaker Recognition Systems. Douglas E. Sturim, Pedro A. Torres-Carrasquillo, Joseph P. Campbell
2016	Cost Effective Acoustic Monitoring of Bird Species. Ciira Wa Maina
2016	Couples Behavior Modeling and Annotation Using Low-Resource LSTM Language Models. Shao-Yen Tseng, Sandeep Nallan Chakravarthula, Brian R. Baucom, Panayiotis G. Georgiou
2016	Cross-Cultural Depression Recognition from Vocal Biomarkers. Sharifa Alghowinem, Roland Goecke, Julien Epps, Michael Wagner, Jeffrey F. Cohn
2016	Cross-Database Evaluation of Audio-Based Spoofing Detection Systems. Pavel Korshunov, Sébastien Marcel
2016	Cross-Gender and Cross-Dialect Tone Recognition for Vietnamese. Antje Schweitzer, Ngoc Thang Vu
2016	Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data. Seyyed Saeed Sarfjoo, Cenk Demiroglu
2016	DBN-ivector Framework for Acoustic Emotion Recognition. Rui Xia, Yang Liu
2016	DNN Online with iVectors Acoustic Modeling and Doc2Vec Distributed Representations for Improving Automated Speech Scoring. Jidong Tao, Lei Chen, Chong Min Lee
2016	DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification. Zeyan Oo, Yuta Kawakami, Longbiao Wang, Seiichi Nakagawa, Xiong Xiao, Masahiro Iwahashi
2016	DNN-Based Automatic Speech Recognition as a Model for Human Phoneme Perception. Mats Exter, Bernd T. Meyer
2016	DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition. Kang Hyun Lee, Tae Gyoon Kang, Woo Hyun Kang, Nam Soo Kim
2016	DNN-Based Speaker Clustering for Speaker Diarisation. Rosanna Milner, Thomas Hain
2016	DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data. Neethu Mariam Joy, Murali Karthick Baskar, Srinivasan Umesh, Basil Abraham
2016	Data Augmentation Using Multi-Input Multi-Output Source Separation for Deep Neural Network Based Acoustic Modeling. Yusuke Fujita, Ryoichi Takashima, Takeshi Homma, Masahito Togami
2016	Data Selection and Adaptation for Naturalness in HMM-Based Speech Synthesis. Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg
2016	Data Selection by Sequence Summarizing Neural Network in Mismatch Condition Training. Katerina Zmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukás Burget, Jan Cernocký
2016	Data Selection for Within-Class Covariance Estimation. Elliot Singer, Tyler Campbell, Douglas A. Reynolds
2016	Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion. Huaiping Ming, Dong-Yan Huang, Lei Xie, Jie Wu, Minghui Dong, Haizhou Li
2016	Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments. Amr El-Desoky Mousa, Björn W. Schuller
2016	Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition. Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool
2016	Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention. Dong Yu, Wayne Xiong, Jasha Droppo, Andreas Stolcke, Guoli Ye, Jinyu Li, Geoffrey Zweig
2016	Deep Neural Network Based Acoustic-to-Articulatory Inversion Using Phone Sequence Information. Xurong Xie, Xunying Liu, Lan Wang
2016	Deep Neural Network Bottleneck Features for Acoustic Event Recognition. Seongkyu Mun, Suwon Shon, Wooil Kim, Hanseok Ko
2016	Deep Neural Network Frontend for Continuous EMG-Based Speech Recognition. Michael Wand, Jürgen Schmidhuber
2016	Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale. Simin Xie, Nan Yan, Ping Yu, Manwa L. Ng, Lan Wang, Zhuanzhuan Ji
2016	Deep Neural Networks for i-Vector Language Identification of Short Utterances in Cars. Omid Ghahabi, Antonio Bonafonte, Javier Hernando, Asunción Moreno
2016	Deep Stacked Autoencoders for Spoken Language Understanding. Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori
2016	Defining Emotionally Salient Regions Using Qualitative Agreement Method. Srinivas Parthasarathy, Carlos Busso
2016	Deriving Phonetic Transcriptions and Discovering Word Segmentations for Speech-to-Speech Translation in Low-Resource Settings. Andrew Wilkinson, Tiancheng Zhao, Alan W. Black
2016	Detecting Mild Cognitive Impairment from Spontaneous Speech by Correlation-Based Phonetic Feature Selection. Gábor Gosztolya, László Tóth, Tamás Grósz, Veronika Vincze, Ildikó Hoffmann, Gréta Szatlóczki, Magdolna Pákáski, János Kálmán
2016	Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees. Wei Li, Kehuang Li, Sabato Marco Siniscalchi, Nancy F. Chen, Chin-Hui Lee
2016	Detection of Total Syllables and Canonical Syllables in Infant Vocalizations. Anne S. Warlaumont, Heather L. Ramsdell-Hudock
2016	Detection of User Escalation in Human-Computer Interactions. Ian Beaver, Cynthia Freeman
2016	Determining Native Language and Deception Using Phonetic Features and Classifier Combination. Gábor Gosztolya, Tamás Grósz, Róbert Busa-Fekete, László Tóth
2016	Development of Mandarin Onset-Rime Detection in Relation to Age and Pinyin Instruction. Fei Chen, Nan Yan, Xunan Huang, Hao Zhang, Lan Wang, Gang Peng
2016	Diagnosing People with Dementia Using Automatic Conversation Analysis. Bahman Mirheidari, Daniel Blackburn, Markus Reuber, Traci Walker, Heidi Christensen
2016	Dialogue Session Segmentation by Embedding-Enhanced TextTiling. Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang
2016	Differential Effects of Velopharyngeal Dysfunction on Speech Intelligibility During Early and Late Stages of Amyotrophic Lateral Sclerosis. Panying Rong, Yana Yunusova, Jordan R. Green
2016	Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination. Reima Karhila, Aku Rouhe, Peter Smit, André Mansikkaniemi, Heini Kallio, Erik Lindroos, Raili Hildén, Martti Vainio, Mikko Kurimo
2016	Diphthongization of Nuclear Vowels and the Emergence of a Tetraphthong in Hetang Cantonese. Wenqi Hu, Fang Hu, Jian Jin
2016	Direct Expressive Voice Training Based on Semantic Selection. Igor Jauk, Antonio Bonafonte
2016	Directly Comparing the Listening Strategies of Humans and Machines. Michael I. Mandel
2016	Discriminative Layered Nonnegative Matrix Factorization for Speech Separation. Chung-Chien Hsu, Tai-Shih Chi, Jen-Tzung Chien
2016	Discussion. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	Discussion. Dayana Ribas, Emmanuel Vincent, John H. L. Hansen, Emma Jokinen, Mirco Ravanelli, Hannes Gamper, Fred Richardson
2016	Discussion. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016	Disentrainment may be a Positive Thing: A Novel Measure of Unsigned Acoustic-Prosodic Synchrony, and its Relation to Speaker Engagement. Juan Manuel Pérez, Ramiro H. Gálvez, Agustín Gravano
2016	Disfluency Detection Using a Bidirectional LSTM. Vicky Zayats, Mari Ostendorf, Hannaneh Hajishirzi
2016	Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition. Yevgen Chebotar, Austin Waters
2016	Do GMM Phoneme Classifiers Perceive Synthetic Sibilants as Humans Do? Gábor Pintér, Hiroki Watanabe
2016	Do Listeners Learn Better from Natural Speech? Michael McAuliffe, Molly Babel, Charlotte Vaughn
2016	Does Auditory-Motor Learning of Speech Transfer from the CV Syllable to the CVCV Word? Tiphaine Caudrelier, Pascal Perrier, Jean-Luc Schwartz, Amélie Rochet-Capellan
2016	Does She Speak RTT? Towards an Earlier Identification of Rett Syndrome Through Intelligent Pre-Linguistic Vocalisation Analysis. Florian B. Pokorny, Peter B. Marschik, Christa Einspieler, Björn W. Schuller
2016	Does the Importance of Word-Initial and Word-Final Information Differ in Native versus Non-Native Spoken-Word Recognition? Odette Scharenborg, Juul Coumans, Sofoklis Kakouros, Roeland van Hout
2016	Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings. Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon
2016	Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding. Aaron Jaech, Larry P. Heck, Mari Ostendorf
2016	Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR. Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Robert M. Nickel, Dorothea Kolossa
2016	Dynamic Transcription for Low-Latency Speech Translation. Jan Niehues, Thai Son Nguyen, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Müller, Matthias Sperber, Sebastian Stüker, Alex Waibel
2016	Dysarthric Speech Recognition Using Kullback-Leibler Divergence-Based Hidden Markov Model. Myung Jong Kim, Jun Wang, Hoirin Kim
2016	EVS Channel Aware Mode Robustness to Frame Erasures. Anssi Rämö, Antti Kurittu, Henri Toukomaa
2016	Effect of Noise on Lexical Tone Perception in Cantonese-Speaking Amusics. Jing Shao, Caicai Zhang, Gang Peng, Yike Yang, William S.-Y. Wang
2016	Effectiveness of Near-End Speech Enhancement Under Equal-Loudness and Equal-Level Constraints. Tudor-Catalin Zorila, Sheila Flanagan, Brian C. J. Moore, Yannis Stylianou
2016	Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking. Vahid Montazeri, Shaikat Hossain, Peter F. Assmann
2016	Effects of L1 Phonotactic Constraints on L2 Word Segmentation Strategies. Tamami Katayama
2016	Effects of Stress on Fricatives: Evidence from Standard Modern Greek. Charalambos Themistocleous, Angelandria Savva, Andrie Aristodemou
2016	Effects of Subglottal-Coupling and Interdental-Space on Formant Trajectories During Front-to-Back Vowel Transitions in Chinese. Shuanglin Fan, Kiyoshi Honda, Jianwu Dang, Hui Feng
2016	Effects of Urgent Speech and Preceding Sounds on Speech Intelligibility in Noisy and Reverberant Environments. Nao Hodoshima
2016	Efficient Segmental Cascades for Speech Recognition. Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
2016	Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling. Sittipong Saychum, Sarawoot Kongyoung, Anocha Rugchatjaroen, Patcharika Chootrakool, Sawit Kasuriya, Chai Wutiwiwatchai
2016	Emergence of Vocal Developmental Sequences in a Predictive Coding Model of Speech Acquisition. Shamima Najnin, Bonny Banerjee
2016	End-to-End Language Identification Using Attention-Based Recurrent Neural Networks. Wang Geng, Wenfu Wang, Yuanyuan Zhao, Xinyuan Cai, Bo Xu
2016	End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding. Yun-Nung Chen, Dilek Hakkani-Tür, Gökhan Tür, Jianfeng Gao, Li Deng
2016	English Language Speech Assistant. Xavier Anguera, Vu Van
2016	Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System. Xin Wang, Shinji Takaki, Junichi Yamagishi
2016	Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals. Gurunath Reddy M., K. Sreenivasa Rao
2016	Enhancement of Automatic Oral Presentation Assessment System Using Latent N-Grams Word Representation and Part-of-Speech Information. Wen-Yu Huang, Shan-Wen Hsiao, Hung-Ching Sun, Ming-Chuan Hsieh, Ming-Hsueh Tsai, Chi-Chun Lee
2016	Enhancing Data-Driven Phone Confusions Using Restricted Recognition. Mark Kane, Julie Carson-Berndsen
2016	Enhancing Multilingual Recognition of Emotion in Speech by Language Identification. Hesam Sagha, Pavel Matejka, Maryna Gavryukova, Filip Povolný, Erik Marchi, Björn W. Schuller
2016	Entropy Based Pruning for Non-Negative Matrix Based Language Models with Contextual Features. Barlas Oguz, Issac Alphonso, Shuangyu Chang
2016	Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization. Srikanth Korse, Tobias Jähnel, Tom Bäckström
2016	Error Correction in Lightly Supervised Alignment of Broadcast Subtitles. Julia Olcoz, Oscar Saz, Thomas Hain
2016	Estimating the Sincerity of Apologies in Speech by DNN Rank Learning and Prosodic Analysis. Gábor Gosztolya, Tamás Grósz, György Szaszák, László Tóth
2016	Estimation of Children's Physical Characteristics from Their Voices. Jill Fain Lehman, Rita Singh
2016	Evaluation of Phonatory Behavior of German and French Speakers in Native and Non-Native Speech. Manfred Pützer, Frank Zimmerer, Wolfgang Wokurek, Jeanin Jügler
2016	Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems. Lionel Feugère, Christophe d'Alessandro, Samuel Delalez, Luc Ardaillon, Axel Roebel
2016	Evaluation of a Phone-Based Anomaly Detection Approach for Dysarthric Speech. Imed Laaridh, Corinne Fredouille, Christine Meunier
2016	Exemplar Dynamics in Phonetic Convergence of Speech Rate. Antje Schweitzer, Michael Walsh
2016	Experiences with Shared Resources for Research and Education in Speech and Language Processing. Rebecca Bates, Eric Fosler-Lussier, Florian Metze, Martha A. Larson, Gina-Anne Levow, Emily Mower Provost
2016	Experimental Validation of Sound Generated from Flow in Simplified Vocal Tract Model of Sibilant /s/. Tsukasa Yoshinaga, Kazunori Nozaki, Shigeo Wada
2016	Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition. Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass
2016	Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition. Ruizhi Li, Sri Harish Reddy Mallidi, Lukás Burget, Oldrich Plchot, Najim Dehak
2016	Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers. Alberto Abad, Eugénio Ribeiro, Fábio N. Kepler, Ramón Fernandez Astudillo, Isabel Trancoso
2016	Exploring Collections of Multimedia Archives Through Innovative Interfaces in the Context of Digital Humanities. Géraldine Damnati, Delphine Charlet, Marc Denjean
2016	Exploring Session Variability and Template Aging in Speaker Verification for Fixed Phrase Short Utterances. Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna
2016	Exploring Word Mover's Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization. Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu
2016	Exploring the Correlation of Pitch Accents and Semantic Slots for Spoken Language Understanding. Sabrina Stehwien, Ngoc Thang Vu
2016	Expressive Control of Singing Voice Synthesis Using Musical Contexts and a Parametric F0 Model. Luc Ardaillon, Celine Chabot-Canet, Axel Roebel
2016	Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016. Jordi Bonada, Martí Umbert, Merlijn Blaauw
2016	Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data. Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai
2016	F Xiaoyun Wang, Xugang Lu, Hisashi Kawai, Seiichi Yamamoto
2016	F0 Development in Acquiring Korean Stop Distinction. Gayeon Son
2016	Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks. Zixing Zhang, Fabien Ringeval, Jing Han, Jun Deng, Erik Marchi, Björn W. Schuller
2016	Factor Analysis Based Speaker Normalisation for Continuous Emotion Prediction. Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016	Factor Analysis Based Speaker Verification Using ASR. Hang Su, Steven Wegmann
2016	Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions. Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani
2016	Factors Affecting the Intelligibility of Sine-Wave Speech. Fei Chen, Daniel Fogerty
2016	Far-Field ASR Without Parallel Data. Vijayaditya Peddinti, Vimal Manohar, Yiming Wang, Daniel Povey, Sanjeev Khudanpur
2016	Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices. Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemyslaw Szczepaniak
2016	Feature Learning and Automatic Segmentation for Dolphin Communication Analysis. Daniel Kohlsdorf, Denise Herzing, Thad Starner
2016	Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection. Rubén Zazo, Tara N. Sainath, Gabor Simko, Carolina Parada
2016	Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection. Martin Heckmann
2016	First Step Towards End-to-End Parametric TTS Synthesis: Generating Spectral Parameters with Neural Attention. Wenfu Wang, Shuang Xu, Bo Xu
2016	Flexible, Rapid Authoring of Goal-Orientated, Multi-Turn Dialogues Using the Task Completion Platform. Alex Marin, Paul A. Crook, Omar Zia Khan, Vasiliy Radostev, Khushboo Aggarwal, Ruhi Sarikaya
2016	Formant Estimation and Tracking Using Deep Learning. Yehoshua Dissen, Joseph Keshet
2016	Frequency Estimation from Waveforms Using Multi-Layered Neural Networks. Prateek Verma, Ronald W. Schafer
2016	Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks. Heysem Kaya, Alexey A. Karpov
2016	Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech. Vikramjit Mitra, Julien van Hout, Wen Wang, Chris Bartels, Horacio Franco, Dimitra Vergyri, Abeer Alwan, Adam Janin, John H. L. Hansen, Richard M. Stern, Abhijeet Sangwan, Nelson Morgan
2016	Future Context Attention for Unidirectional LSTM Based Acoustic Model. Jian Tang, Shiliang Zhang, Si Wei, Li-Rong Dai
2016	GMM-Free Flat Start Sequence-Discriminative DNN Training. Gábor Gosztolya, Tamás Grósz, László Tóth
2016	Gating Recurrent Enhanced Memory Neural Networks on Language Identification. Wang Geng, Yuanyuan Zhao, Wenfu Wang, Xinyuan Cai, Bo Xu
2016	Generalized Discriminant Analysis (GDA) for Improved i-Vector Based Speaker Recognition. Fahimeh Bahmaninezhad, John H. L. Hansen
2016	Generalizing Steady State Suppression for Enhanced Intelligibility Under Reverberation. Petko Nikolov Petkov, Yannis Stylianou
2016	Generating Complementary Acoustic Model Spaces in DNN-Based Sequence-to-Frame DTW Scheme for Out-of-Vocabulary Spoken Term Detection. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh
2016	Generating Gestural Scores from Acoustics Through a Sparse Anchor-Based Representation of Speech. Christopher Liberatore, Ricardo Gutierrez-Osuna
2016	Generating Natural Video Descriptions via Multimodal Processing. Qin Jin, Junwei Liang, Xiaozhu Lin
2016	Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy. Zhenhao Ge, Aravind Ganapathiraju, Ananth N. Iyer, Scott A. Randal, Felix I. Wyss
2016	Generation of Emotion Control Vector Using MDS-Based Space Transformation for Expressive Speech Synthesis. Yan-You Chen, Chung-Hsien Wu, Yu-Fong Huang
2016	Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine. Toru Nakashika, Yasuhiro Minami
2016	Glimpse-Based Metrics for Predicting Speech Intelligibility in Additive Noise Conditions. Yan Tang, Martin Cooke
2016	Glimpsing Predictions for Natural and Vocoded Sentence Intelligibility During Modulation Masking: Effect of the Glimpse Cutoff Criterion. Bobby Gibbs II, Daniel Fogerty
2016	GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis. Manu Airaksinen, Bajibabu Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, Paavo Alku
2016	Glottal Squeaks in VC Sequences. Mísa Hejná, Pertti Palo, Scott Moisik
2016	HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors. Tomi Kinnunen, Alexey Sholokhov, Elie Khoury, Dennis Alexander Lehmann Thomsen, Md. Sahidullah, Zheng-Hua Tan
2016	HMM-Based Non-Native Accent Assessment Using Posterior Features. Ramya Rasipuram, Milos Cernak, Mathew Magimai-Doss
2016	HMM-Based Speech Enhancement Using Sub-Word Models and Noise Adaptation. Akihiro Kato, Ben P. Milner
2016	Head Motion Generation with Synthetic Speech: A Data Driven Approach. Najmeh Sadoughi, Carlos Busso
2016	Hierarchical Classification of Speaker and Background Noise and Estimation of SNR Using Sparse Representation. K. V. Vijay Girish, A. G. Ramakrishnan, T. V. Ananthapadmanabha
2016	Highlighting Psychological Features for Predicting Child Interjections During Story Telling. Gaël Lejeune, François Rioult, Bruno Crémilleux
2016	How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models. Suman V. Ravuri, Steven Wegmann
2016	Hybrid Accelerated Optimization for Speech Recognition. Jen-Tzung Chien, Pei-Wen Huang, Tan Lee
2016	Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, Kai Yu
2016	Hyperarticulated Production of Korean Glides by Age Group. Seung-Eun Chang, Minsook Kim
2016	Identifying Hearing Loss from Learned Speech Kernels. Shamima Najnin, Bonny Banerjee, Lisa Lucks Mendel, Masoumeh Heidari Kapourchali, Jayanta Kumar Dutta, Sungmin Lee, Chhayakanta Patro, Monique Pousson
2016	Identifying Perceptually Similar Voices with a Speaker Recognition System Using Auto-Phonetic Features. Finnian Kelly, Anil Alexander, Oscar Forth, Samuel Kent, Jonas Lindh, Joel Åkesson
2016	Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN. Blaise Potard, Matthew P. Aylett, David A. Baude, Petr Motlícek
2016	Illustrating the Production of the International Phonetic Alphabet Sounds Using Fast Real-Time Magnetic Resonance Imaging. Asterios Toutios, Sajan Goud Lingala, Colin Vaz, Jangwon Kim, John H. Esling, Patricia A. Keating, Matthew Gordon, Dani Byrd, Louis Goldstein, Krishna S. Nayak, Shrikanth S. Narayanan
2016	Impaired Categorical Perception of Mandarin Tones and its Relationship to Language Ability in Autism Spectrum Disorders. Fei Chen, Nan Yan, Xiaojie Pan, Feng Yang, Zhuanzhuan Ji, Lan Wang, Gang Peng
2016	Implementing Acoustic-Prosodic Entrainment in a Conversational Avatar. Rivka Levitan, Stefan Benus, Ramiro H. Gálvez, Agustín Gravano, Florencia Savoretti, Marián Trnka, Andreas Weise, Julia Hirschberg
2016	Improved Depiction of Tissue Boundaries in Vocal Tract Real-Time MRI Using Automatic Off-Resonance Correction. Yongwan Lim, Sajan Goud Lingala, Asterios Toutios, Shrikanth S. Narayanan, Krishna S. Nayak
2016	Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks. Hakan Erdogan, John R. Hershey, Shinji Watanabe, Michael I. Mandel, Jonathan Le Roux
2016	Improved Multilingual Training of Stacked Neural Network Acoustic Models for Low Resource Languages. Tanel Alumäe, Stavros Tsakalidis, Richard M. Schwartz
2016	Improved Music Genre Classification with Convolutional Neural Networks. Weibin Zhang, Wenkang Lei, Xiangmin Xu, Xiaofeng Xing
2016	Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition. Imran A. Sheikh, Irina Illina, Dominique Fohr, Georges Linarès
2016	Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling. Gakuto Kurata, Brian Kingsbury
2016	Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis. Eunwoo Song, Frank K. Soong, Hong-Goo Kang
2016	Improved a priori SAP Estimator in Complex Noisy Environment for Dual Channel Microphone System. Youna Ji, Young-Cheol Park
2016	Improving Automatic Recognition of Aphasic Speech with AphasiaBank. Duc Le, Emily Mower Provost
2016	Improving Boundary Estimation in Audiovisual Speech Activity Detection Using Bayesian Information Criterion. Fei Tao, John H. L. Hansen, Carlos Busso
2016	Improving Children's Speech Recognition Through Out-of-Domain Data Augmentation. Joachim Fainberg, Peter Bell, Mike Lincoln, Steve Renals
2016	Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data. Yao Tian, Meng Cai, Liang He, Wei-Qiang Zhang, Jia Liu
2016	Improving English Conversational Telephone Speech Recognition. Ivan Medennikov, Alexey Prudnikov, Alexander Zatvornitskiy
2016	Improving Generalisation to New Speakers in Spoken Dialogue State Tracking. Iñigo Casanueva, Thomas Hain, Phil D. Green
2016	Improving Large Vocabulary Accented Mandarin Speech Recognition with Attribute-Based I-Vectors. Hao Zheng, Shanshan Zhang, Liwei Qiao, Jianping Li, Wenju Liu
2016	Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach. Yibin Zheng, Ya Li, Zhengqi Wen, Xingguang Ding, Jianhua Tao
2016	Improving TTS with Corpus-Specific Pronunciation Adaptation. Marie Tahon, Raheel Qader, Gwénolé Lecorvé, Damien Lolive
2016	Improving Under-Resourced Language ASR Through Latent Subword Unit Space Discovery. Marzieh Razavi, Mathew Magimai-Doss
2016	Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features. Abraham Woubie, Jordi Luque, Javier Hernando
2016	Improving the Lwazi ASR Baseline. Charl Johannes van Heerden, Neil Kleynhans, Marelie H. Davel
2016	Improving the Probabilistic Framework for Representing Dialogue Systems with User Response Model. Miao Li, Zhipeng Chen, Ji Wu
2016	Incorporating a Generative Front-End Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition. Souvik Kundu, Khe Chai Sim, Mark J. F. Gales
2016	Individual Identity in Songbirds: Signal Representations and Metric Learning for Locating the Information in Complex Corvid Calls. Dan Stowell, Veronica Morfi, Lisa F. Gill
2016	Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques. Thomas Pellegrini, Sandrine Mouysset
2016	Integrated Spoofing Countermeasures and Automatic Speaker Verification: An Evaluation on ASVspoof 2015. Md. Sahidullah, Héctor Delgado, Massimiliano Todisco, Hong Yu, Tomi Kinnunen, Nicholas W. D. Evans, Zheng-Hua Tan
2016	Intelligibility Enhancement at the Receiving End of the Speech Transmission System - Effects of Far-End Noise Reduction. Emma Jokinen, Paavo Alku
2016	Intelligibility of Disordered Speech: Global and Detailed Scores. Mario Ganzeboom, Marjoke Bakker, Catia Cucchiarini, Helmer Strik
2016	Inter-Speech Clicks in an Interspeech Keynote. Jürgen Trouvain, Zofia Malisz
2016	Inter-Task System Fusion for Speaker Recognition. Marc Ferras, Srikanth R. Madikeri, Subhadeep Dey, Petr Motlícek, Hervé Bourlard
2016	Interaction Between Lexical Tone and Intonation: An EMA Study. Hao Yi, Sam Tilsen
2016	Interactive Spoken Content Retrieval by Deep Reinforcement Learning. Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-yi Lee, Lin-Shan Lee
2016	Interpretation of Low Dimensional Neural Network Bottleneck Features in Terms of Human Perception and Production. Philip Weber, Linxue Bai, Martin J. Russell, Peter Jancovic, Stephen M. Houghton
2016	Introducing Temporal Rate Coding for Speech in Cochlear Implants: A Microscopic Evaluation in Humans and Models. Anja Eichenauer, Mathias Dietz, Bernd T. Meyer, Tim Jürgens
2016	Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement. Steffen Zeiler, Hendrik Meutzner, Ahmed Hussen Abdelaziz, Dorothea Kolossa
2016	Introduction to Poster Presentation of Part II. Jeesun Kim, Gérard Bailly
2016	Introduction. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016	Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge. Yi Liu, Yao Tian, Liang He, Jia Liu
2016	Investigating the Impact of Dialect Prestige on Lexical Decision. Mairym Lloréns Monteserín, Jason D. Zevin
2016	Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks. Naoyuki Kanda, Shoji Harada, Xugang Lu, Hisashi Kawai
2016	Investigation of Speed-Accuracy Tradeoffs in Speech Production Using Real-Time Magnetic Resonance Imaging. Adam C. Lammert, Christine H. Shadle, Shrikanth S. Narayanan, Thomas F. Quatieri
2016	Investigation of Sub-Band Discriminative Information Between Spoofed and Genuine Speech. Kaavya Sriskandaraja, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah
2016	Is Deception Emotional? An Emotion-Driven Predictive Approach. Shahin Amiriparian, Jouni Pohjalainen, Erik Marchi, Sergey Pugachevskiy, Björn W. Schuller
2016	Iterative PLDA Adaptation for Speaker Diarization. Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier
2016	Joint Effect of Dialect and Mandarin on English Vowel Production: A Case Study in Changsha EFL Learners. Xinyi Wen, Yuan Jia
2016	Joint Enhancement and Coding of Speech by Incorporating Wiener Filtering in a CELP Codec. Johannes Fischer, Tom Bäckström
2016	Joint Learning of Speaker and Phonetic Similarities with Siamese Networks. Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux
2016	Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
2016	Joint Sound Source Separation and Speaker Recognition. Jeroen Zegers, Hugo Van hamme
2016	Joint Speaker and Lexical Modeling for Short-Term Characterization of Speaker. Guangsen Wang, Kong-Aik Lee, Trung Hieu Nguyen, Hanwu Sun, Bin Ma
2016	Joint Syntactic and Semantic Analysis with a Multitask Deep Learning Framework for Spoken Language Understanding. Jérémie Tafforeau, Frédéric Béchet, Thierry Artières, Benoît Favre
2016	Jointly Learning to Locate and Classify Words Using Convolutional Networks. Dimitri Palaz, Gabriel Synnaeve, Ronan Collobert
2016	Jointly Optimizing Activation Coefficients of Convolutive NMF Using DNN for Speech Separation. Hao Li, Shuai Nie, Xueliang Zhang, Hui Zhang
2016	Ketchup, Interdisciplinarity, and the Spread of Innovation in Speech and Language Processing. Dan Jurafsky
2016	Kulning (Swedish Cattle Calls): Acoustic, EGG, Stroboscopic and High-Speed Video Analyses of an Unusual Singing Style. Ahmed Geneid, Anne-Maria Laukkanen, Anita McAllister, Robert Eklund
2016	L1-L2 Interference: The Case of Final Devoicing of French Voiced Fricatives in Final Position by German Learners. Sucheta Ghosh, Camille Fauth, Aghilas Sini, Yves Laprie
2016	L2 Acquisition and Production of the English Rhotic Pharyngeal Gesture. Sarah Harper, Louis Goldstein, Shrikanth S. Narayanan
2016	L2 English Rhythm in Read Speech by Chinese Students. Hongwei Ding, Xinping Xu
2016	LIA System for the SITW Speaker Recognition Challenge. Waad Ben Kheder, Moez Ajili, Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre
2016	LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition. Kazuki Irie, Zoltán Tüske, Tamer Alkhouli, Ralf Schlüter, Hermann Ney
2016	LSTM-Based NeuroCRFs for Named Entity Recognition. Marc-Antoine Rondeau, Yi Su
2016	Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling. Gakuto Kurata, Bing Xiang, Bowen Zhou
2016	Language Adaptive DNNs for Improved Low Resource Speech Recognition. Markus Müller, Sebastian Stüker, Alex Waibel
2016	Language Effects in Noise-Induced Word Misperceptions. María Luisa García Lecumberri, Jon Barker, Ricard Marxer, Martin Cooke
2016	Language Identification Based on Generative Modeling of Posteriorgram Sequences Extracted from Frame-by-Frame DNNs and LSTM-RNNs. Ryo Masumura, Taichi Asami, Hirokazu Masataki, Yushi Aono, Sumitaka Sakauchi
2016	Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions. Arseniy Gorin, Rasa Lileikyte, Guangpu Huang, Lori Lamel, Jean-Luc Gauvain, Antoine Laurent
2016	Language Recognition via Sparse Coding. Youngjune L. Gwon, William M. Campbell, Douglas E. Sturim, H. T. Kung
2016	LatticeRnn: Recurrent Neural Networks Over Lattices. Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Lambert Mathias, Ariya Rastrow, Björn Hoffmeister
2016	Laughter Valence Prediction in Motivational Interviewing Based on Lexical and Acoustic Cues. Rahul Gupta, Nishant Nath, Taruna Agrawal, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan
2016	Learning Document Representations Using Subspace Multinomial Model. Santosh Kesiraju, Lukás Burget, Igor Szöke, Jan Cernocký
2016	Learning Multiscale Features Directly from Waveforms. Zhenyao Zhu, Jesse H. Engel, Awni Y. Hannun
2016	Learning N-Gram Language Models from Uncertain Data. Vitaly Kuznetsov, Hank Liao, Mehryar Mohri, Michael Riley, Brian Roark
2016	Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information. Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li
2016	Learning Personalized Pronunciations for Contact Name Recognition. Antoine Bruguier, Fuchun Peng, Françoise Beaufays
2016	Learning a Translation Model from Word Lattices. Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird
2016	Lig-Aikuma: A Mobile App to Collect Parallel Speech for Under-Resourced Language Studies. Elodie Gauthier, David Blachon, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, Annie Rialland, Gilles Adda, Grégoire Bachman
2016	Likelihood Ratio Calculation in Acoustic-Phonetic Forensic Voice Comparison: Comparison of Three Statistical Modelling Approaches. Ewald Enzinger
2016	Local Sparsity Based Online Dictionary Learning for Environment-Adaptive Speech Enhancement with Nonnegative Matrix Factorization. Kwang Myung Jeon, Hong Kook Kim
2016	Localizing Bird Songs Using an Open Source Robot Audition System with a Microphone Array. Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno
2016	Locally Linear Embedding for Exemplar-Based Spectral Conversion. Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang
2016	Log-Linear System Combination Using Structured Support Vector Machines. Jingzhou Yang, Anton Ragni, Mark J. F. Gales, Kate M. Knill
2016	Long Short-Term Memory for Speaker Generalization in Supervised Speech Separation. Jitong Chen, DeLiang Wang
2016	Long-Term Stability of Tracheoesophageal Voices. Klaske E. van Sluis, Michiel W. M. van den Brekel, Frans J. M. Hilgers, Rob J. J. H. van Son
2016	Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling. Gil Luyet, Pranay Dighe, Afsaneh Asaei, Hervé Bourlard
2016	Lower Frame Rate Neural Network Acoustic Models. Golan Pundak, Tara N. Sainath
2016	MIVOQ-PTTS - A Revolutionary New Way of Thinking TTS. Piero Cosi, Giulio Paci, Giacomo Sommavilla, Fabio Tesser
2016	ML Parameter Generation with a Reformulated MGE Training Criterion - Participation in the Voice Conversion Challenge 2016. Daniel Erro, Agustín Alonso, Luis Serrano, David Tavarez, Igor Odriozola, Xabier Sarasola, Eder del Blanco, Jon Sánchez, Ibon Saratxaga, Eva Navas, Inma Hernáez
2016	Mahalanobis Metric Scoring Learned from Weighted Pairwise Constraints in I-Vector Speaker Recognition System. Zhenchun Lei, Yanhong Wan, Jian Luo, Yingen Yang
2016	Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering. Lauri Juvela, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku
2016	Making Personal Digital Assistants Aware of What They Do Not Know. Omar Zia Khan, Ruhi Sarikaya
2016	Manipulating Word Lattices to Incorporate Human Corrections. Yashesh Gaur, Florian Metze, Jeffrey P. Bigham
2016	Manual versus Automated: The Challenging Routine of Infant Vocalisation Segmentation in Home Videos to Study Neuro(mal)development. Florian B. Pokorny, Robert Peharz, Wolfgang Roth, Matthias Zöhrer, Franz Pernkopf, Peter B. Marschik, Björn W. Schuller
2016	Marginal Contrast Among Romanian Vowels: Evidence from ASR and Functional Load. Margaret E. L. Renwick, Ioana Vasilescu, Camille Dutrey, Lori Lamel, Bianca Vieru
2016	Maximum a posteriori Based Decoding for CTC Acoustic Models. Naoyuki Kanda, Xugang Lu, Hisashi Kawai
2016	Measuring Pronunciation Improvement in Users of CAPT Tool TipTopTalk! Cristian Tejedor García, David Escudero Mancebo, Enrique Cámara Arenas, César González Ferreras, Valentín Cardeñoso-Payo
2016	Measuring Turn-Taking Offsets in Human-Human Dialogues. Rebecca Lunsford, Peter A. Heeman, Emma Rennie
2016	Mechanical Production of [b], [m] and [w] Using Controlled Labial and Velopharyngeal Gestures. Takayuki Arai
2016	Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders. Michael Price, Anantha P. Chandrakasan, James R. Glass
2016	Microphone Distance Adaptation Using Cluster Adaptive Training for Robust Far Field Speech Recognition. Animesh Prasad, Khe Chai Sim
2016	Microscopic Multilingual Matrix Test Predictions Using an ASR-Based Speech Recognition Model. Marc René Schädler, David Hülsmeier, Anna Warzybok, Sabine Hochmuth, Birger Kollmeier
2016	Mindfulness Special Event. Nikki Mirghafori
2016	Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation. Hung-Shin Lee, Yu Tsao, Chi-Chun Lee, Hsin-Min Wang, Wei-Cheng Lin, Wei-Chen Chen, Shan-Wen Hsiao, Shyh-Kang Jeng
2016	Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems. Luciana Ferrer, Martin Graciarena
2016	Misperceptions Arising from Speech-in-Babble Interactions. Máté Attila Tóth, Martin Cooke, Jon Barker
2016	Mispronunciation Detection Leveraging Maximum Performance Criterion Training of Acoustic Models and Decision Functions. Yao-Chi Hsu, Ming-Han Yang, Hsiao-Tsung Hung, Berlin Chen
2016	Model Adaptation and Active Learning in the BBN Speech Activity Detection System for the DARPA RATS Program. Damianos G. Karakos, Scott Novotney, Le Zhang, Richard M. Schwartz
2016	Model Compression Applied to Small-Footprint Keyword Spotting. George Tucker, Minhua Wu, Ming Sun, Sankaran Panchapagesan, Gengshen Fu, Shiv Vitaladevuni
2016	Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework. Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai
2016	Model-Based Parametric Prosody Synthesis with Deep Neural Network. Hao Liu, Heng Lu, Xu Shao, Yi Xu
2016	Modeling Noise Influence to Speech Intelligibility Non-Intrusively by Reduced Speech Dynamic Range. Fei Chen
2016	Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks. Tara N. Sainath, Bo Li
2016	Modeling and Transforming Speech Using Variational Autoencoders. Merlijn Blaauw, Jordi Bonada
2016	Modulation Enhancement of Temporal Envelopes for Increasing Speech Intelligibility in Noise. Maria Koutsogiannaki, Yannis Stylianou
2016	Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants. Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki
2016	Monaural Source Separation Using a Random Forest Classifier. Cosimo Riday, Saurabh Bhargava, Richard H. R. Hahnloser, Shih-Chii Liu
2016	Multi-Attribute Factorized Hidden Layer Adaptation for DNN Acoustic Models. Lahiru Samarakoon, Khe Chai Sim
2016	Multi-Channel Linear Prediction Based on Binaural Coherence for Speech Dereverberation. Hong Liu, Xiuling Wang, Miao Sun, Cheng Pang
2016	Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM. Dilek Hakkani-Tür, Gökhan Tür, Asli Celikyilmaz, Yun-Nung Chen, Jianfeng Gao, Li Deng, Ye-Yi Wang
2016	Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis. Bo Li, Heiga Zen
2016	Multi-Language Neural Network Language Models. Anton Ragni, Edgar Dakin, Xie Chen, Mark J. F. Gales, Kate M. Knill
2016	Multi-Talker Speech Recognition Based on Blind Source Separation with ad hoc Microphone Array Using Smartphones and Cloud Storage. Keiko Ochi, Nobutaka Ono, Shigeki Miyabe, Shoji Makino
2016	Multi-Task Learning and Weighted Cross-Entropy for DNN-Based Keyword Spotting. Sankaran Panchapagesan, Ming Sun, Aparna Khare, Spyros Matsoukas, Arindam Mandal, Björn Hoffmeister, Shiv Vitaladevuni
2016	Multichannel Spatial Clustering for Robust Far-Field Automatic Speech Recognition in Mismatched Conditions. Michael I. Mandel, Jon Barker
2016	Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling. Yuanyuan Zhao, Shuang Xu, Bo Xu
2016	Multilingual Data Selection for Low Resource Speech Recognition. Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran
2016	Multilingual Speech Emotion Recognition System Based on a Three-Layer Model. Xingfeng Li, Masato Akagi
2016	Multimodal Fusion of Multirate Acoustic, Prosodic, and Lexical Speaker Characteristics for Native Language Identification. Prashanth Gurunath Shivakumar, Sandeep Nallan Chakravarthula, Panayiotis G. Georgiou
2016	Multiple Influences on Vocabulary Acquisition: Parental Input Dominates. Dominic W. Massaro
2016	Multiplicity of the Acoustic Correlates of the Fortis-Lenis Contrast: Plosives in Aberystwyth English. Mísa Hejná
2016	My-Own-Voice: A Web Service That Allows You to Create a Text-to-Speech Voice From Your Own Voice. Fabrice Malfrère, Olivier Deroo, Emmanuelle Franques, Jonathan Hourez, Nicolas Mazars, Vincent Pagel, Geoffrey Wilfart
2016	NN-Grams: Unifying Neural Network and n-Gram Language Models for Speech Recognition. Babak Damavandi, Shankar Kumar, Noam Shazeer, Antoine Bruguier
2016	Native Language Detection Using the I-Vector Framework. Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro L. Koerich
2016	Native Language Identification Using Spectral and Source-Based Features. Avni Rajpal, Tanvina B. Patel, Hardik B. Sailor, Maulik C. Madhavi, Hemant A. Patil, Hiroya Fujisaki
2016	Naturalness Judgement of L2 English Through Dubbing Practice. Dean Luo, Ruxin Luo, Lixin Wang
2016	Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection. Sheng-syun Shen, Hung-yi Lee
2016	Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition. Bo Li, Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Michiel Bacchiani
2016	Neural Responses to Speech-Specific Modulations Derived from a Spectro-Temporal Filter Bank. Marina Frye, Cristiano Micheli, Inga M. Schepers, Gerwin Schalk, Jochem W. Rieger, Bernd T. Meyer
2016	Neurophysiological Vocal Source Modeling for Biomarkers of Disease. Gregory A. Ciccarelli, Thomas F. Quatieri, Satrajit S. Ghosh
2016	Noise Aware and Combined Noise Models for Speech Denoising in Unknown Noise Conditions. Pavlos Papadopoulos, Colin Vaz, Shrikanth S. Narayanan
2016	Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-Native Speech Input. Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan
2016	Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification. Kantapon Kaewtip, Charles E. Taylor, Abeer Alwan
2016	Non-Iterative Parameter Estimation for Total Variability Model Using Randomized Singular Value Decomposition. Ruchir Travadi, Shrikanth S. Narayanan
2016	Non-Uniform Boosted MCE Training of Deep Neural Networks for Keyword Spotting. Zhong Meng, Biing-Hwang Juang
2016	Novel Front-End Features Based on Neural Graph Embeddings for DNN-HMM and LSTM-CTC Acoustic Modeling. Yuzong Liu, Katrin Kirchhoff
2016	Novel Nonlinear Prediction Based Features for Spoofed Speech Detection. Himanshu N. Bhavsar, Tanvina B. Patel, Hemant A. Patil
2016	Novel Subband Autoencoder Features for Detection of Spoofed Speech. Meet H. Soni, Tanvina B. Patel, Hemant A. Patil
2016	Novel Subband Autoencoder Features for Non-Intrusive Quality Assessment of Noise Suppressed Speech. Meet H. Soni, Hemant A. Patil
2016	Objective Evaluation Methods for Chinese Text-To-Speech Systems. Teng Zhang, Zhipeng Chen, Ji Wu, Sam Lai, Wenhui Lei, Carsten Isert
2016	Objective Evaluation Using Association Between Dimensions Within Spectral Features for Statistical Parametric Speech Synthesis. Yusuke Ijima, Taichi Asami, Hideyuki Mizuno
2016	Objective Language Feature Analysis in Children with Neurodevelopmental Disorders During Autism Assessment. Manoj Kumar, Rahul Gupta, Daniel Bone, Nikolaos Malandrakis, Somer Bishop, Shrikanth S. Narayanan
2016	On Discriminative Framework for Single Channel Audio Source Separation. Arpita Gang, Pravesh Biyani
2016	On Employing a Highly Mismatched Crowd for Speech Transcription. Purushotam G. Radadia, Rahul Kumar, Kanika Kalra, Shirish Karande, Sachin Lodha
2016	On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training. William Chan, Ian R. Lane
2016	On Smoothing and Enhancing Dynamics of Pitch Contours Represented by Discrete Orthogonal Polynomials for Prosody Generation. Chen-Yu Chiang
2016	On the Appropriateness of Complex-Valued Neural Networks for Speech Enhancement. Lukas Drude, Bhiksha Raj, Reinhold Haeb-Umbach
2016	On the Correlation and Transferability of Features Between Automatic Speech Recognition and Speech Emotion Recognition. Haytham M. Fayek, Margaret Lech, Lawrence Cavedon
2016	On the Efficient Representation and Execution of Deep Acoustic Models. Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin
2016	On the Importance of Efficient Transition Modeling for Speaker Diarization. Itshak Lapidot, Jean-François Bonastre
2016	On the Influence of Gender on Interruptions in Multiparty Dialogue. Paul Van Eecke, Raquel Fernández
2016	On the Influence of Text Content on Pass-Phrase Strength for Short-Duration Text-Dependent Automatic Speaker Authentication. Giacomo Valenti, Adrien Daniel, Nicholas W. D. Evans
2016	On the Issue of Calibration in DNN-Based Speaker Recognition Systems. Mitchell McLaren, Diego Castán, Luciana Ferrer, Aaron Lawson
2016	On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models. Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani
2016	On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine. David Guennec, Damien Lolive
2016	On the Use of Gaussian Mixture Model Framework to Improve Speaker Adaptation of Deep Neural Network Acoustic Models. Natalia A. Tomashenko, Yuri Y. Khokhlov, Yannick Estève
2016	Open Language Interface for Voice Exploitation (OLIVE). Aaron Lawson, Mitchell McLaren, Harry Bratt, Martin Graciarena, Horacio Franco, Christopher George, Allen R. Stauffer, Chris Bartels, Julien van Hout
2016	Open Source Speech and Language Resources for Frisian. Emre Yilmaz, Henk van den Heuvel, Jelske Dijkstra, Hans Van de Velde, Frederik Kampstra, Jouke Algra, David A. van Leeuwen
2016	Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach. Yajie Miao, Florian Metze
2016	Optimal Unit Stitching in a Unit Selection Singing Synthesis System. Marius Cotescu
2016	Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion. Takuya Higuchi, Takuya Yoshioka, Tomohiro Nakatani
2016	Optimizing Speech Recognition Evaluation Using Stratified Sampling. Janne Pylkkönen, Thomas Drugman, Max Bisani
2016	Organizing Syllables into Sandhi Domains - Evidence from F0 and Duration Patterns in Shanghai Chinese. Bijun Ling, Jie Liang
2016	Out of Set Language Modelling in Hierarchical Language Identification. Saad Irtza, Vidhyasaharan Sethu, Sarith Fernando, Eliathamby Ambikairajah, Haizhou Li
2016	Overcoming Data Sparsity in Acoustic Modeling of Low-Resource Language by Borrowing Data and Model Parameters from High-Resource Languages. Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy
2016	Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification. Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai
2016	Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization. Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
2016	Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification. Jianbo Ma, Saad Irtza, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016	Parkinson's Disease Progression Assessment from Speech Using GMM-UBM. Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Jesús Francisco Vargas-Bonilla, Elmar Nöth
2016	Part-of-Speech Tagging and Chunking in Text-to-Speech Synthesis for South African Languages. Georg I. Schlünz, Nkosikhona Dlamini, Rynhardt P. Kruger
2016	Pause Prediction from Text for Speech Synthesis with User-Definable Pause Insertion Likelihood Threshold. Norbert Braunschweiler, Ranniery Maia
2016	Perceived Naturalness of Electrolaryngeal Speech Produced Using sEMG-Controlled vs. Manual Pitch Modulation. Kathleen F. Nagle, James T. Heaton
2016	Perceived Usability and Cognitive Demand of Secondary Tasks in Spoken Versus Visual-Manual Automotive Interaction. Annika Silvervarg, Sofia Lindvall, Jonatan Andersson, Ida Esberg, Christian Jernberg, Filip Frumerie, Arne Jönsson
2016	Perception Optimized Deep Denoising AutoEncoders for Speech Enhancement. Prashanth Gurunath Shivakumar, Panayiotis G. Georgiou
2016	Perception of Tone in Whispered Mandarin Sentences: The Case for Singapore Mandarin. Yuling Gu, Boon Pang Lim, Nancy F. Chen
2016	Perceptual Lateralization of Coda Rhotic Production in Puerto Rican Spanish. Mairym Lloréns Monteserín, Shrikanth S. Narayanan, Louis Goldstein
2016	Perceptual Salience of Voice Source Parameters in Signaling Focal Prominence. Irena Yanushevskaya, Andy Murphy, Christer Gobl, Ailbhe Ní Chasaide
2016	Personalized Natural Language Understanding. Xiaohu Liu, Ruhi Sarikaya, Liang Zhao, Yong Ni, Yi-Cheng Pan
2016	Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams. Lifa Sun, Hao Wang, Shiyin Kang, Kun Li, Helen M. Meng
2016	Phase-Aware Signal Processing for Automatic Speech Recognition. Johannes Fahringer, Tobias Schrank, Johannes Stahl, Pejman Mowlaee, Franz Pernkopf
2016	Phase-Encoded Speech Spectrograms. Chandra Sekhar Seelamantula
2016	PhonVoc: A Phonetic and Phonological Vocoding Toolkit. Milos Cernak, Philip N. Garner
2016	Phone Synchronous Decoding with CTC Lattice. Zhehuai Chen, Wei Deng, Tao Xu, Kai Yu
2016	Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis. Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai
2016	Phoneme Set Design Considering Integrated Acoustic and Linguistic Features of Second Language Speech. Xiaoyun Wang, Tsuneo Kato, Seiichi Yamamoto
2016	Phoneme, Phone Boundary, and Tone in Automatic Scoring of Mandarin Proficiency. Jiahong Yuan, Mark Y. Liberman
2016	Phonetic Context Embeddings for DNN-HMM Phone Recognition. Leonardo Badino
2016	Phonetic Reduction Can Lead to Lengthening, and Enhancement Can Lead to Shortening. Clara Cohen, Matt Carlson
2016	Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures. Afsaneh Asaei, Gil Luyet, Milos Cernak, Hervé Bourlard
2016	Phonotactic Language Identification for Singing. Anna M. Kruspe
2016	Pitch-Adaptive Front-End Features for Robust Children's ASR. Syed Shahnawazuddin, Abhishek Dey, Rohit Sinha
2016	Pitch-Range Perception: The Dynamic Interaction Between Voice Quality and Fundamental Frequency. Jianjing Kuang, Mark Y. Liberman
2016	Poster Overview Presentations. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016	Predicting Affective Dimensions Based on Self Assessed Depression Severity. Rahul Gupta, Shrikanth S. Narayanan
2016	Predicting Binaural Speech Intelligibility from Signals Estimated by a Blind Source Separation Algorithm. Qingju Liu, Yan Tang, Philip J. B. Jackson, Wenwu Wang
2016	Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks. Daan van Esch, Mason Chua, Kanishka Rao
2016	Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors. Tan Lee, Yuanyuan Liu, Yu Ting Yeung, Thomas K. T. Law, Kathy Y. S. Lee
2016	Predicting User Satisfaction from Turn-Taking in Spoken Conversations. Shammur Absar Chowdhury, Evgeny A. Stepanov, Giuseppe Riccardi
2016	Prediction and Generation of Backchannel Form for Attentive Listening Systems. Tatsuya Kawahara, Takashi Yamaguchi, Koji Inoue, Katsuya Takanashi, Nigel G. Ward
2016	Prediction of Deception and Sincerity from Speech Using Automatic Phone Recognition-Based Features. Robert Herms
2016	Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker. Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu
2016	Preliminary Experiments on Unsupervised Word Discovery in Mboshi. Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Hélène Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, François Yvon
2016	Priors for Speaker Counting and Diarization with AHC. Gregory Sell, Alan McCree, Daniel Garcia-Romero
2016	Privacy-Preserving Speech Analytics for Automatic Assessment of Student Collaboration. Nikoletta Bassiou, Andreas Tsiartas, Jennifer Smith, Harry Bratt, Colleen Richey, Elizabeth Shriberg, Cynthia M. D'Angelo, Nonye Alozie
2016	Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody. Alexandros Lazaridis, Milos Cernak, Philip N. Garner
2016	Probabilistic Approach Using Joint Clean and Noisy i-Vectors Modeling for Speaker Recognition. Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre
2016	Probabilistic Approach Using Joint Long and Short Session i-Vectors Modeling to Deal with Short Utterances for Speaker Recognition. Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre
2016	Probabilistic Spatial Filter Estimation for Signal Enhancement in Multi-Channel Automatic Speech Recognition. Hendrik Kayser, Niko Moritz, Jörn Anemüller
2016	Processing and Adaptation to Ambiguous Sounds during the Course of Perceptual Learning. Polina Drozdova, Roeland van Hout, Odette Scharenborg
2016	Progress and Prospects for Spoken Language Technology: Results from Four Sexennial Surveys. Roger K. Moore, Ricard Marxer
2016	Progress and Prospects for Spoken Language Technology: What Ordinary People Think. Roger K. Moore, Hui Li, Shih-Hao Liao
2016	Pronunciation Assessment of Japanese Learners of French with GOP Scores and Phonetic Information. Vincent Laborde, Thomas Pellegrini, Lionel Fontan, Julie Mauclair, Halima Sahraoui, Jérôme Farinas
2016	Pronunciation Error Detection for New Language Learners. Sean Robertson, Cosmin Munteanu, Gerald Penn
2016	Prosodic Convergence with Spoken Stimuli in Laboratory Data. Margaret Zellers
2016	Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge. Claude Montacié, Marie-José Caraty
2016	Prosodic and Linguistic Analysis of Semantic Fluency Data: A Window into Speech Production and Cognition. Maria K. Wolters, Najoung Kim, Jung-Ho Kim, Sarah E. MacPherson, Jong-Chan Park
2016	Prosody Modification Using Allpass Residual of Speech Signals. Karthika Vijayan, K. Sri Rama Murty
2016	Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI. Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahremani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur
2016	Putting German [ʃ] and [ç] in Two Different Boxes: Native German vs L2 German of French Learners. Jane Wottawa, Martine Adda-Decker, Frédéric Isel
2016	Quantitative Analysis of Backchannels Uttered by an Interviewer During Neuropsychological Tests. Gérard Bailly, Frédéric Elisei, Alexandra Juphard, Olivier Moreaud
2016	RNN-BLSTM Based Multi-Pitch Estimation. Jianshu Zhang, Jian Tang, Li-Rong Dai
2016	Rapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search. Chongjia Ni, Lei Wang, Cheung-Chi Leung, Feng Rao, Li Lu, Bin Ma, Haizhou Li
2016	Real-Time Presentation Tracking Using Semantic Keyword Spotting. Reza Asadi, Harriet J. Fell, Timothy W. Bickmore, Ha Trinh
2016	Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms. Erik Marchi, Florian Eyben, Gerhard Hagerer, Björn W. Schuller
2016	Realistic Multi-Microphone Data Simulation for Distant Speech Recognition. Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo
2016	Recent Advances in Google Real-Time HMM-Driven Unit Selection Synthesizer. Xavi Gonzalvo, Siamak Tazari, Chun-an Chan, Markus Becker, Alexander Gutkin, Hanna Silén
2016	Recognition of Depression in Bipolar Disorder: Leveraging Cohort and Person-Specific Knowledge. Soheil Khorram, John Gideon, Melvin G. McInnis, Emily Mower Provost
2016	Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation. Chitralekha Bhat, Bhavik Vachhani, Sunil Kumar Kopparapu
2016	Recognition of Multiple Bird Species Based on Penalised Maximum Likelihood and HMM-Based Modelling of Individual Vocalisation Elements. Peter Jancovic, Münevver Köküer
2016	Recurrent Models for Auditory Attention in Multi-Microphone Distant Speech Recognition. Suyoun Kim, Ian R. Lane
2016	Recurrent Neural Network Language Model with Incremental Updated Context Information Generated Using Bag-of-Words Representation. Md. Akmal Haidar, Mikko Kurimo
2016	Recurrent Neural Network-Based Phoneme Sequence Estimation Using Multiple ASR Systems' Outputs for Spoken Term Detection. Naoki Sawada, Hiromitsu Nishizaki
2016	Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features. Taichi Asami, Ryo Masumura, Yushi Aono, Koichi Shinoda
2016	Redefining the Linguistic Context Feature Set for HMM and DNN TTS Through Position and Parsing. Rasmus Dall, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016	Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction. Tara N. Sainath, Arun Narayanan, Ron J. Weiss, Ehsan Variani, Kevin W. Wilson, Michiel Bacchiani, Izhak Shafran
2016	Relating Estimated Cyclic Spectral Peak Frequency to Measured Epilarynx Length Using Magnetic Resonance Imaging. Elizabeth Godoy, Andrew Dumas, Jennifer Melot, Nicolas Malyska, Thomas F. Quatieri
2016	Relation of Automatically Extracted Formant Trajectories with Intelligibility Loss and Speaking Rate Decline in Amyotrophic Lateral Sclerosis. Rachelle L. Horwitz-Martin, Thomas F. Quatieri, Adam C. Lammert, James R. Williamson, Yana Yunusova, Elizabeth Godoy, Daryush D. Mehta, Jordan R. Green
2016	Relationships Between Functional Load and Auditory Confusability Under Different Speech Environments. Shinae Kang, Clara Cohen
2016	Relative Contributions of Amplitude and Phase to the Intelligibility Advantage of Ideal Binary Masked Sentences. Lei Wang, Shufeng Zhu, Diliang Chen, Yong Feng, Fei Chen
2016	Release from Energetic Masking Caused by Repeated Patterns of Glimpsing Windows. Maury Lander-Portnoy
2016	Remeeting - Deep Insights to Conversations. Allen Guo, Arlo Faria, Korbinian Riedhammer
2016	Representation Learning for Speech Emotion Recognition. Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer
2016	Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples. Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li
2016	Rescoring by Combination of Posteriorgram Score and Subword-Matching Score for Use in Query-by-Example. Masato Obara, Kazunori Kojima, Kazuyo Tanaka, Shi-wook Lee, Yoshiaki Itoh
2016	Respiratory Belts and Whistles: A Preliminary Study of Breathing Acoustics for Turn-Taking. Marcin Wlodarczak, Mattias Heldner
2016	Respiratory Turn-Taking Cues. Marcin Wlodarczak, Mattias Heldner
2016	Results of The 2015 NIST Language Recognition Evaluation. Hui Zhao, Désiré Bansé, George R. Doddington, Craig S. Greenberg, Jaime Hernandez-Cordero, John M. Howard, Lisa P. Mason, Alvin F. Martin, Douglas A. Reynolds, Elliot Singer, Audrey Tong
2016	Retrieval of Textual Song Lyrics from Sung Inputs. Anna M. Kruspe
2016	Retrieving Categorical Emotions Using a Probabilistic Framework to Define Preference Learning Samples. Reza Lotfian, Carlos Busso
2016	Reverberation-Robust One-Bit TDOA Based Moving Source Localization for Automatic Camera Steering. Sundar Harshavardhan, Gokul Deepak Manavalan, T. V. Sreenivas, Chandra Sekhar Seelamantula
2016	Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks. Huy Phan, Lars Hertel, Marco Maaß, Alfred Mertins
2016	Robust DNN-Based VAD Augmented with Phone Entropy Based Rejection of Background Speech. Yuya Fujita, Ken-ichi Iso
2016	Robust Detection of Multiple Bioacoustic Events with Repetitive Structures. Frank Kurth
2016	Robust Estimation of Fundamental Frequency Using Single Frequency Filtering Approach. Vishala Pannala, G. Aneeja, Sudarsana Reddy Kadiri, B. Yegnanarayana
2016	Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement. Atsunori Ogawa, Shogo Seki, Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Kazuya Takeda
2016	Robust Multichannel Gender Classification from Speech in Movie Audio. Naveen Kumar, Md. Nasir, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016	Robust Sound Event Detection in Continuous Audio Environments. Haomin Zhang, Ian McLoughlin, Yan Song
2016	Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech. Md. Sahidullah, Rosa González Hautamäki, Dennis Alexander Lehmann Thomsen, Tomi Kinnunen, Zheng-Hua Tan, Ville Hautamäki, Robert Parts, Martti Pitkänen
2016	Robust Speech Recognition Using Generalized Distillation Framework. Konstantin Markov, Tomoko Matsui
2016	Robust Vowel Landmark Detection Using Epoch-Based Features. Sri Harsha Dumpala, Bhanu Teja Nellore, Raghu Ram Nevali, Suryakanth V. Gangashetty, B. Yegnanarayana
2016	Robustness in Speech, Speaker, and Language Recognition: "You've Got to Know Your Limitations". John H. L. Hansen, Hynek Boril
2016	Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems. Spiros Georgiladakis, Georgia Athanasopoulou, Raveesh Meena, José Lopes, Arodami Chorianopoulou, Elisavet Palogiannidi, Elias Iosif, Gabriel Skantze, Alexandros Potamianos
2016	SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer. Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li
2016	SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms. Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li
2016	SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement. Szu-Wei Fu, Yu Tsao, Xugang Lu
2016	SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement. Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee
2016	STON: Efficient Subtitling in Dutch Using State-of-the-Art Tools. Lyan Verwimp, Brecht Desplanques, Kris Demuynck, Joris Pelemans, Marieke Lycke, Patrick Wambacq
2016	Sage: The New BBN Speech Processing Platform. Roger Hsiao, Ralf Meermeier, Tim Ng, Zhongqiang Huang, Maxwell Jordan, Enoch Kan, Tanel Alumäe, Jan Silovský, William Hartmann, Francis Keith, Omer Lang, Man-Hung Siu, Owen Kimball
2016	Segmental Recurrent Neural Networks for End-to-End Speech Recognition. Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals
2016	Segmented Dynamic Time Warping for Spoken Query-by-Example Search. Jorge Proença, Fernando Perdigão
2016	Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems. Pierre Lanchantin, Mark J. F. Gales, Penny Karanasou, Xunying Liu, Yanmin Qian, Linlin Wang, Philip C. Woodland, Chao Zhang
2016	Self-Adaptive DNN for Improving Spoken Language Proficiency Assessment. Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft
2016	Semi-Coupled Dictionary Based Automatic Bandwidth Extension Approach for Enhancing Children's ASR. Ganji Sreeram, Rohit Sinha
2016	Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech. Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari
2016	Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks. Wonkyum Lee, Kyu Jeong Han, Ian R. Lane
2016	Semi-Supervised Training in Deep Learning Acoustic Model. Yan Huang, Yongqiang Wang, Yifan Gong
2016	Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions. Haihua Xu, Hang Su, Chongjia Ni, Xiong Xiao, Hao Huang, Eng Siong Chng, Haizhou Li
2016	Sensitivity of Quantitative RT-MRI Metrics of Vocal Tract Dynamics to Image Reconstruction Settings. Johannes Töger, Yongwan Lim, Sajan Goud Lingala, Shrikanth S. Narayanan, Krishna S. Nayak
2016	Sensorimotor Response to Visual Imagery of Tongue Displacement. William F. Katz, Divya Prabhakaran
2016	Sentence Boundary Detection Based on Parallel Lexical and Acoustic Models. Xiaoyin Che, Sheng Luo, Haojin Yang, Christoph Meinel
2016	Sequence Student-Teacher Training of Deep Neural Networks. Jeremy Heng Meng Wong, Mark J. F. Gales
2016	Sequence Summarizing Neural Networks for Spoken Language Recognition. Jan Pesán, Lukás Burget, Jan Cernocký
2016	Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding. Ngoc Thang Vu
2016	Sequential Recurrent Neural Networks for Language Modeling. Youssef Oualil, Clayton Greenberg, Mittul Singh, Dietrich Klakow
2016	Sharing Speech Synthesis Software for Research and Education Within Low-Tech and Low-Resource Communities. Andrew R. Plummer, Mary E. Beckman
2016	Short Utterance Variance Modelling and Utterance Partitioning for PLDA Speaker Verification. Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes, Ivan Himawan
2016	Silent-Speech Command Word Recognition Using Electro-Optical Stomatography. Simon Stone, Peter Birkholz
2016	Sincerity and Deception in Speech: Two Sides of the Same Coin? A Transfer- and Multi-Task Learning Perspective. Yue Zhang, Felix Weninger, Zhao Ren, Björn W. Schuller
2016	SingaKids-Mandarin: Speech Corpus of Singaporean Children Speaking Mandarin Chinese. Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li
2016	Singing Voice Synthesis Based on Deep Neural Networks. Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016	Single-Channel Multi-Speaker Separation Using Deep Clustering. Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey
2016	Single-Channel Speech Enhancement Using Double Spectrum. Martin Blass, Pejman Mowlaee, W. Bastiaan Kleijn
2016	Sinusoidal Modelling for Ecoacoustics. Patrice Guyot, Alice Eldridge, Ying Chen Eyre-Walker, Alison Johnston, Thomas Pellegrini, Mika Peck
2016	Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition. Liang Lu, Steve Renals
2016	Sound Pattern Matching for Automatic Prosodic Event Detection. Milos Cernak, Afsaneh Asaei, Pierre-Edouard Honnet, Philip N. Garner, Hervé Bourlard
2016	SparkNG: Interactive MATLAB Tools for Introduction to Speech Production, Perception and Processing Fundamentals and Application of the Aliasing-Free L-F Model Component. Hideki Kawahara
2016	Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples' Therapy. Haoqi Li, Brian R. Baucom, Panayiotis G. Georgiou
2016	Speaker Age Classification and Regression Using i-Vectors. Joanna Grzybowska, Stanislaw Kacprzak
2016	Speaker Comparison for Forensic and Investigative Applications II. Jean-François Bonastre, Joseph P. Campbell, Anders P. Eriksson, Hirotaka Nakasone, Reva Schwartz
2016	Speaker Identity and Voice Quality: Modeling Human Responses and Automatic Speaker Recognition. Soo Jin Park, Caroline Sigouin, Jody Kreiman, Patricia A. Keating, Jinxi Guo, Gary Yeung, Fang-Yu Kuo, Abeer Alwan
2016	Speaker Linking and Applications Using Non-Parametric Hashing Methods. Douglas E. Sturim, William M. Campbell
2016	Speaker Normalization Through Feature Shifting of Linearly Transformed i-Vector. Jahyun Goo, Younggwan Kim, Hyungjun Lim, Hoirin Kim
2016	Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation. Fred Richardson, Michael S. Brandstein, Jennifer Melot, Douglas A. Reynolds
2016	Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis. Yi Zhao, Daisuke Saito, Nobuaki Minematsu
2016	Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features. Jinxi Guo, Gary Yeung, Deepak Muralidharan, Harish Arsikere, Amber Afshan, Abeer Alwan
2016	Speaker-Dependent Dictionary-Based Speech Enhancement for Text-Dependent Speaker Verification. Nicolai Bæk Thomsen, Dennis Alexander Lehmann Thomsen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen
2016	Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments. Guan-Lin Chao, William Chan, Ian R. Lane
2016	Speakers In The Wild (SITW): The QUT Speaker Recognition System. Houman Ghaemmaghami, Md. Hafizur Rahman, Ivan Himawan, David Dean, Ahilan Kanagasundaram, Sridha Sridharan, Clinton Fookes
2016	Spectral Enhancement of Cleft Lip and Palate Speech. Vikram C. M., Nagaraj Adiga, S. R. Mahadeva Prasanna
2016	Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks. Yu Gu, Zhen-Hua Ling, Li-Rong Dai
2016	Speech Emotion Recognition Using Affective Saliency. Arodami Chorianopoulou, Polychronis Koutsakis, Alexandros Potamianos
2016	Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi
2016	Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks. Anurag Kumar, Dinei A. F. Florêncio
2016	Speech Features for Depression Detection. Saurabh Sahu, Carol Y. Espy-Wilson
2016	Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank. Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani
2016	Speech Likability and Personality-Based Social Relations: A Round-Robin Analysis over Communication Channels. Laura Fernández Gallardo, Benjamin Weiss
2016	Speech Localisation in a Multitalker Mixture by Humans and Machines. Ning Ma, Guy J. Brown
2016	Speech Recognition in Alzheimer's Disease and in its Assessment. Luke Zhou, Kathleen C. Fraser, Frank Rudzicz
2016	Speech Reductions Cause a De-Weighting of Secondary Acoustic Cues. Léo Varnet, Fanny Meunier, Michel Hoen
2016	Speech Rhythm in Parkinson's Disease: A Study on Italian. Massimo Pettorino, Maria Grazia Busà, Elisa Pellegrino
2016	Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence. Bidisha Sharma, S. R. Mahadeva Prasanna
2016	Speech Ventures. Nicolas Scheffer, Korbinian Riedhammer, Alexandre Lebrun, David Suendermann-Oeft
2016	Speech-Based Detection of Alzheimer's Disease in Conversational German. Jochen Weiner, Christian Herff, Tanja Schultz
2016	Speed Perturbation and Vowel Duration Modeling for ASR in Hausa and Wolof Languages. Elodie Gauthier, Laurent Besacier, Sylvie Voisin
2016	Spoken Language Understanding in a Latent Topic-Based Subspace. Mohamed Morchid, Mohamed Bouaziz, Waad Ben Kheder, Killian Janod, Pierre-Michel Bousquet, Richard Dufour, Georges Linarès
2016	Stacked Long-Term TDNN for Spoken Language Recognition. Daniel Garcia-Romero, Alan McCree
2016	State-of-the-Art MRI Protocol for Comprehensive Assessment of Vocal Tract Structure and Function. Sajan Goud Lingala, Asterios Toutios, Johannes Töger, Yongwan Lim, Yinghua Zhu, Yoon-Chul Kim, Colin Vaz, Shrikanth S. Narayanan, Krishna S. Nayak
2016	Statistical Modeling of Speaker's Voice with Temporal Co-Location for Active Voice Authentication. Zhong Meng, Biing-Hwang Juang
2016	Stimulated Deep Neural Network for Speech Recognition. Chunyang Wu, Penny Karanasou, Mark J. F. Gales, Khe Chai Sim
2016	Subspace Detection of DNN Posterior Probabilities via Sparse Representation for Query by Example Spoken Term Detection. Dhananjay Ram, Afsaneh Asaei, Hervé Bourlard
2016	Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models. Lahiru Samarakoon, Khe Chai Sim
2016	Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering. Michael Heck, Sakriani Sakti, Satoshi Nakamura
2016	Supplementary Motor Area Activation in Disfluency Perception: An fMRI Study of Listener Neural Responses to Spontaneously Produced Unfilled and Filled Pauses. Robert Eklund, Martin Ingvar
2016	Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis. Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi
2016	Synthesis of Device-Independent Noise Corpora for Realistic ASR Evaluation. Hannes Gamper, Mark R. P. Thomas, Lyle Corbin, Ivan Tashev
2016	THU-EE System Description for NIST LRE 2015. Liang He, Yao Tian, Yi Liu, Jiaming Xu, Weiwei Liu, Cai Meng, Jia Liu
2016	TUSK: A Framework for Overviewing the Performance of F0 Estimators. Masanori Morise, Hideki Kawahara
2016	Talking to a System and Talking to a Human: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task. Hayakawa Akira, Saturnino Luz, Nick Campbell
2016	Talking with Kids Really Matters: Early Language Experience Shapes Later Life Chances. Anne Fernald
2016	Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus. Md. Jahangir Alam, Patrick Kenny, Vishwa Gupta
2016	Target-Based State and Tracking Algorithm for Spoken Dialogue System. Miao Li, Zhiyang He, Ji Wu
2016	Teaming Up: Making the Most of Diverse Representations for a Novel Personalized Speech Retrieval Application. Stephanie Pancoast, Murat Akbacak
2016	Temporal Envelopes in Sine-Wave Speech Recognition. Li Xu
2016	Text Dependent Speaker Verification Using Un-Supervised HMM-UBM and Temporal GMM-UBM. Achintya Kumar Sarkar, Zheng-Hua Tan
2016	Text-Available Speaker Recognition System for Forensic Applications. Chengzhu Yu, Chunlei Zhang, Finnian Kelly, Abhijeet Sangwan, John H. L. Hansen
2016	Text-Dependent Audiovisual Synchrony Detection for Spoofing Detection in Mobile Person Recognition. Amit Aides, Hagai Aronowitz
2016	Text-to-Speech for Individuals with Vision Loss: A User Study. Monika Podsiadlo, Shweta Chahar
2016	The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS. Kong-Aik Lee, Haizhou Li, Li Deng, Ville Hautamäki, Wei Rao, Xiong Xiao, Anthony Larcher, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Jianshu Chen, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Chenglin Xu, Haihua Xu, Bin Ma, Eng Siong Chng, Sylvain Meignier
2016	The 2016 Speakers in the Wild Speaker Recognition Evaluation. Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson
2016	The Acoustic Manifestation of Prominence in Stressless Languages. Angeliki Athanasopoulou, Irene Vogel
2016	The Acoustics of Lexical Stress in Italian as a Function of Stress Level and Speaking Style. Anders Eriksson, Pier Marco Bertinetto, Mattias Heldner, Rosalba Nodari, Giovanna Lenoci
2016	The Berkeley Phonetics Machine. Ronald L. Sprouse, Keith Johnson
2016	The Consistency and Stability of Acoustic and Visual Cues for Different Prosodic Attitudes. Jeesun Kim, Chris Davis
2016	The Deception Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	The Discourse Marker "so" in Turn-Taking and Turn-Releasing Behavior. Emma Rennie, Rebecca Lunsford, Peter A. Heeman
2016	The Effect of Background Noise on the Activation of Phonological and Semantic Information During Spoken-Word Recognition. Florian Hintz, Odette Scharenborg
2016	The Effect of Postlexical Deletion on Automatic Speech Recognition in Fast Spontaneously Spoken Zulu. Ewald van der Westhuizen, Thomas Niesler
2016	The Effect of Sentence Accent on Non-Native Speech Perception in Noise. Odette Scharenborg, Elea Kolkman, Sofoklis Kakouros, Brechtje Post
2016	The Effects of Modified Speech Styles on Intelligibility for Non-Native Listeners. Martin Cooke, María Luisa García Lecumberri
2016	The Effects of Prosody on French V-to-V Coarticulation: A Corpus-Based Study. Giuseppina Turco, Cécile Fougeron, Nicolas Audibert
2016	The Human Speech Cortex. Edward Chang
2016	The IBM 2016 English Conversational Telephone Speech Recognition System. George Saon, Tom Sercu, Steven J. Rennie, Hong-Kwang Jeff Kuo
2016	The IBM Speaker Recognition System: Recent Advances and Error Analysis. Seyed Omid Sadjadi, Jason W. Pelecanos, Sriram Ganapathy
2016	The INTERSPEECH 2016 Computational Paralinguistics Challenge: A Summary of Results. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	The Impact of Manner of Articulation on the Intelligibility of Voicing Contrast in Noise: Cross-Linguistic Implications. Mayuki Matsui
2016	The Influence of Language Experience on the Categorical Perception of Vowels: Evidence from Mandarin and Korean. Hao Zhang, Fei Chen, Nan Yan, Lan Wang, Feng Shi, Manwa L. Ng
2016	The Influence of Modality and Speaking Style on the Assimilation Type and Categorization Consistency of Non-Native Speech. Sarah E. Fenwick, Catherine T. Best, Chris Davis, Michael D. Tyler
2016	The Magic Stone: A Video Game to Improve Communication Skills of People with Intellectual Disabilities. Mario Corrales-Astorgano, David Escudero Mancebo, César González Ferreras, Yurena Gutiérrez-González, Valle Flores-Lucas, Valentín Cardeñoso-Payo, Lourdes Aguilar-Cuevas
2016	The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016. Kazuhiro Kobayashi, Shinnosuke Takamichi, Satoshi Nakamura, Tomoki Toda
2016	The Native Language Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis. Zhengqi Wen, Ya Li, Jianhua Tao
2016	The Perception of Overlapping Speech: Effects of Speaker Prosody and Listener Attitudes. Katherine Hilton
2016	The Perceptual Effect of L1 Prosody Transplantation on L2 Speech: The Case of French Accented German. Jeanin Jügler, Frank Zimmerer, Jürgen Trouvain, Bernd Möbius
2016	The Production of Intervocalic Glides in Non Dysarthric Parkinsonian Speech. Véronique Delvaux, Virginie Roland, Kathy Huet, Myriam Piccaluga, Marie-Claire Haelewyck, Bernard Harmegnies
2016	The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception. Wei Lai, Jiahong Yuan, Ya Li, Xiaoying Xu, Mark Y. Liberman
2016	The Role of Pitch in Punjabi Word Identification. Jasmeen Kanwal, Amanda Ritchart
2016	The Role of Spectral Resolution in Foreign-Accented Speech Perception. Michelle R. Kapolowicz, Vahid Montazeri, Peter F. Assmann
2016	The SIWIS Database: A Multilingual Speech Database with Acted Emphasis. Jean-Philippe Goldman, Pierre-Edouard Honnet, Robert A. J. Clark, Philip N. Garner, Maria Ivanova, Alexandros Lazaridis, Hui Liang, Tiago Macedo, Beat Pfister, Manuel Sam Ribeiro, Eric Wehrli, Junichi Yamagishi
2016	The SRI CLEO Speaker-State Corpus. Andreas Kathol, Elizabeth Shriberg, Massimiliano de Zambotti
2016	The SRI Speech-Based Collaborative Learning Corpus. Colleen Richey, Cynthia M. D'Angelo, Nonye Alozie, Harry Bratt, Elizabeth Shriberg
2016	The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation. Martin Graciarena, Luciana Ferrer, Vikramjit Mitra
2016	The Sheffield Wargame Corpus - Day Two and Day Three. Yulan Liu, Charles Fox, Madina Hasan, Thomas Hain
2016	The Sincerity Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016	The Sound of Disgust: How Facial Expression May Influence Speech Production. Chee Seng Chong, Jeesun Kim, Chris Davis
2016	The Speakers in the Wild (SITW) Speaker Recognition Database. Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson
2016	The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion. Ling-Hui Chen, Li-Juan Liu, Zhen-Hua Ling, Yuan Jiang, Li-Rong Dai
2016	The Unit of Speech Encoding: The Case of Romanian. Irene Vogel, Laura Spinu
2016	The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms. Víctor Poblete, Juan Pablo Escudero, Josué Fredes, José Novoa, Richard M. Stern, Simon King, Néstor Becerra Yoma
2016	The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions. Emma Jokinen, Ulpu Remes, Paavo Alku
2016	The Voice Conversion Challenge 2016. Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, Junichi Yamagishi
2016	TheanoLM - An Extensible Toolkit for Neural Network Language Modeling. Seppo Enarvi, Mikko Kurimo
2016	Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking. Dhananjaya N. Gowda, Paavo Alku
2016	Today's Most Frequently Used F0 Estimation Methods, and Their Accuracy in Estimating Male and Female Pitch in Clean Speech. Sofia Strömbergsson
2016	Tone Classification in Mandarin Chinese Using Convolutional Neural Networks. Charles Chen, Razvan C. Bunescu, Li Xu, Chang Liu
2016	Toward Development and Evaluation of Pain Level-Rating Scale for Emergency Triage based on Vocal Characteristics and Facial Expressions. Fu-Sheng Tsai, Ya-Ling Hsu, Wei-Chen Chen, Yi-Ming Weng, Chip-Jin Ng, Chi-Chun Lee
2016	Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis. Cheung-Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng, Haizhou Li
2016	Towards Automatic Detection of Amyotrophic Lateral Sclerosis from Speech Acoustic and Articulatory Samples. Jun Wang, Prasanna V. Kothalkar, Beiming Cao, Daragh Heitzman
2016	Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances. Catharine Oertel, Joakim Gustafson, Alan W. Black
2016	Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks. Ying Zhang, Mohammad Pezeshki, Philémon Brakel, Saizheng Zhang, César Laurent, Yoshua Bengio, Aaron C. Courville
2016	Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine. Bo-Hsiang Tseng, Sheng-syun Shen, Hung-yi Lee, Lin-Shan Lee
2016	Towards Minimally Invasive Velar State Detection in Normal and Silent Speech. Peter Birkholz, Petko Bakardjiev, Steffen Kürbis, Rico Petrick
2016	Towards Online-Recognition with Deep Bidirectional LSTM Acoustic Models. Albert Zeyer, Ralf Schlüter, Hermann Ney
2016	Towards Smart-Cars That Can Listen: Abnormal Acoustic Event Detection on the Road. Mahesh Kumar Nandwana, Taufiq Hasan
2016	Towards an Automated Screening Tool for Developmental Speech and Language Impairments. Jen J. Gong, Maryann Gong, Dina Levy-Lambert, Jordan R. Green, Tiffany P. Hogan, John V. Guttag
2016	Tracking Contours of Orofacial Articulators from Real-Time MRI of Speech. Mathieu Labrunie, Pierre Badin, Dirk Voit, Arun A. Joseph, Laurent Lamalle, Coriandre Vilain, Louis-Jean Boë, Jens Frahm
2016	Transfer Learning for Speaker Verification on Short Utterances. Qingyang Hong, Lin Li, Lihong Wan, Jun Zhang, Feng Tong
2016	Transfer Learning with Bottleneck Feature Networks for Whispered Speech Recognition. Boon Pang Lim, Faith Wong, Yuyao Li, Jia Wei Bay
2016	Transferring Emphasis in Speech Translation Using Hard-Attentional Neural Network Models. Quoc Truong Do, Sakriani Sakti, Graham Neubig, Satoshi Nakamura
2016	Triphone State-Tying via Deep Canonical Correlation Analysis. Weiran Wang, Hao Tang, Karen Livescu
2016	Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification. Jianbo Ma, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Kong-Aik Lee
2016	Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features. Nauman Dawalatabad, Srikanth R. Madikeri, C. Chandra Sekhar, Hema A. Murthy
2016	Two-Stage Data Augmentation for Low-Resourced Speech Recognition. William Hartmann, Tim Ng, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz
2016	Two-Stage Temporal Processing for Single-Channel Speech Enhancement. Suman Samui, Indrajit Chakrabarti, Soumya Kanti Ghosh
2016	Uncontrolled Manifolds in Vowel Production: Assessment with a Biomechanical Model of the Tongue. Andrew Szabados, Pascal Perrier
2016	Understanding Periodically Interrupted Mandarin Speech. Jing Liu, Rosanna H. N. Tong, Fei Chen
2016	Undoing Misperceptions: A Microscopic Analysis of Consistent Confusions Through Signal Modifications. Máté Attila Tóth, Martin Cooke
2016	Unipolar Depression vs. Bipolar Disorder: An Elicitation-Based Approach to Short-Term Detection of Mood Disorder. Kun-Yi Huang, Chung-Hsien Wu, Yu-Ting Kuo, Fong-Lin Jang
2016	Unit-Selection Attack Detection Based on Unfiltered Frequency-Domain Features. Ulrich Scherhag, Andreas Nautsch, Christian Rathgeb, Christoph Busch
2016	Universal Background Sparse Coding and Multilayer Bootstrap Network for Speaker Clustering. Xiao-Lei Zhang
2016	Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC. Yimeng Zhuang, Xuankai Chang, Yanmin Qian, Kai Yu
2016	Unsupervised Adaptation of Recurrent Neural Network Language Models. Siva Reddy Gangireddy, Pawel Swietojanski, Peter Bell, Steve Renals
2016	Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection. Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li
2016	Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs for Speech Recognition. Hardik B. Sailor, Hemant A. Patil
2016	Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition. Satoshi Tsujioka, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura
2016	Unsupervised Learning of Acoustic Units Using Autoencoders and Kohonen Nets. Vikramjit Mitra, Dimitra Vergyri, Horacio Franco
2016	Unsupervised Phoneme Segmentation of Previously Unseen Languages. Marco Vetter, Markus Müller, Fatima Hamlaoui, Graham Neubig, Satoshi Nakamura, Sebastian Stüker, Alex Waibel
2016	Unsupervised Stress Information Labeling Using Gaussian Process Latent Variable Model for Statistical Speech Synthesis. Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
2016	Use of Agreement/Disagreement Classification in Dyadic Interactions for Continuous Emotion Recognition. Hossein Khaki, Engin Erzin
2016	Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition. Erfan Loweimi, Jon Barker, Thomas Hain
2016	Use of Vowels in Discriminating Speech-Laugh from Laughter and Neutral Speech. Sri Harsha Dumpala, P. Gangamohan, Suryakanth V. Gangashetty, B. Yegnanarayana
2016	Using Clinician Annotations to Improve Automatic Speech Recognition of Stuttered Speech. Peter A. Heeman, Rebecca Lunsford, Andy McMillin, J. Scott Yaruss
2016	Using Past Speaker Behavior to Better Predict Turn Transitions. Tomer Meshorer, Peter A. Heeman
2016	Using Phonologically Weighted Levenshtein Distances for the Prediction of Microscopic Intelligibility. Lionel Fontan, Isabelle Ferrané, Jérôme Farinas, Julien Pinquier, Xavier Aumont
2016	Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks. Lauri Juvela, Xin Wang, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi, Paavo Alku
2016	Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure. Jinfu Ni, Yoshinori Shiga, Hisashi Kawai
2016	Using a Biomechanical Model and Articulatory Data for the Numerical Production of Vowels. Saeed Dabbaghchian, Marc Arnela, Olov Engwall, Oriol Guasch, Ian Stavness, Pierre Badin
2016	Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus. Tomi Kinnunen, Md. Sahidullah, Ivan Kukanov, Héctor Delgado, Massimiliano Todisco, Achintya Kumar Sarkar, Nicolai Bæk Thomsen, Ville Hautamäki, Nicholas W. D. Evans, Zheng-Hua Tan
2016	Variation in Spoken North Sami Language. Kristiina Jokinen, Trung Ngo Trong, Ville Hautamäki
2016	Velum Control for Oral Sounds. Reed Blaylock, Louis Goldstein, Shrikanth S. Narayanan
2016	Virtual Adversarial Training Applied to Neural Higher-Order Factors for Phone Classification. Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf
2016	Virtual Machines and Containers as a Platform for Experimentation. Florian Metze, Eric Riebling, Anne S. Warlaumont, Elika Bergelson
2016	Visual Speech Synthesis Using Dynamic Visemes, Contextual Features and DNNs. Ausdang Thangthai, Ben Milner, Sarah Taylor
2016	Vocal Effort Modification for Singing Synthesis. Olivier Perrotin, Christophe d'Alessandro
2016	Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion. Ganesh Sivaraman, Vikramjit Mitra, Hosung Nam, Mark K. Tiede, Carol Y. Espy-Wilson
2016	Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features. Yi Yang, Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu
2016	Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance. Naoki Hosaka, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016	Voice Quality Control Using Perceptual Expressions for Statistical Parametric Speech Synthesis Based on Cluster Adaptive Training. Yamato Ohtani, Koichiro Mori, Masahiro Morita
2016	Voice-Quality Difference Between the Vowels in Filled Pauses and Ordinary Lexical Items. Kikuo Maekawa, Hiroki Mori
2016	Voting Detector: A Combination of Anomaly Detectors to Reveal Annotation Errors in TTS Corpora. Jindrich Matousek, Daniel Tihelka
2016	Vowel Characteristics in the Assessment of L2 English Pronunciation. Calbert Graham, Paula Buttery, Francis Nolan
2016	Vowel Fundamental and Formant Frequency Contributions to English and Mandarin Sentence Intelligibility. Daniel Fogerty, Fei Chen
2016	Vowels and Diphthongs in Cangnan Southern Min Chinese Dialect. Fang Hu, Chunyu Ge
2016	Vowels and Diphthongs in the Taiyuan Jin Chinese Dialect. Liping Xia, Fang Hu
2016	Waveform Generation Based on Signal Reshaping for Statistical Parametric Speech Synthesis. Felipe Espic, Cassia Valentini-Botinhao, Zhizheng Wu, Simon King
2016	Web Data Selection Based on Word Embedding for Low-Resource Speech Recognition. Chuandong Xie, Wu Guo, Guoping Hu, Junhua Liu
2016	Who Do You Think Will Speak Next? Perception of Turn-Taking Cues in Slovak and Argentine Spanish. Agustín Gravano, Pablo Brusco, Stefan Benus
2016	Why do ASR Systems Despite Neural Nets Still Depend on Robust Features. Angel Mario Castro Martinez, Marc René Schädler
2016	Within-Speaker Features for Native Language Recognition in the Interspeech 2016 Computational Paralinguistics Challenge. Mark A. Huckvale
2016	Word-Phrase-Entity Recurrent Neural Networks for Language Modeling. Michael Levit, Sarangarajan Parthasarathy, Shuangyu Chang
2016	YIN-Bird: Improved Pitch Tracking for Bird Vocalisations. Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte
2016	Zara: An Empathetic Interactive Virtual Agent. Pascale Fung, Anik Dey, Farhad Bin Siddique, Ruixi Lin, Yang Yang, Yan Wan, Ricky Ho Yin Chan
2016	i-Vector/HMM Based Text-Dependent Speaker Verification System for RedDots Challenge. Hossein Zeinali, Hossein Sameti, Lukás Burget, Jan Cernocký, Nooshin Maghsoodi, Pavel Matejka
2016	webASR 2 - Improved Cloud Based Speech Technology. Thomas Hain, Jeremy Christian, Oscar Saz, Salil Deena, Madina Hasan, Raymond W. M. Ng, Rosanna Milner, Mortaza Doulaty, Yulan Liu