| 2016 | /r/ as Language Marker in Bilingual Speech Production and Perception. Constantijn Kaland, Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti |
| 2016 | 17th Annual Conference of the International Speech Communication Association, Interspeech 2016, San Francisco, CA, USA, September 8-12, 2016. Nelson Morgan |
| 2016 | A 50-Year Retrospective on Speech and Language Processing. John Makhoul |
| 2016 | A Class-Specific Speech Enhancement for Phoneme Recognition: A Dictionary Learning Approach. Nazreen P. M., A. G. Ramakrishnan, Prasanta Kumar Ghosh |
| 2016 | A Convex Model for Linguistic Influence in Group Conversations. Kan Kawabata, Visar Berisha, Anna Scaglione, Amy LaCross |
| 2016 | A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement. Ziteng Wang, Xu Li, Xiaofei Wang, Qiang Fu, Yonghong Yan |
| 2016 | A DNN-HMM Approach to Story Segmentation. Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li |
| 2016 | A Deep Learning Approach to Modeling Empathy in Addiction Counseling. James Gibson, Dogan Can, Bo Xiao, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2016 | A Divide-and-Conquer Approach for Language Identification Based on Recurrent Neural Networks. Gregory Gelly, Jean-Luc Gauvain, Viet Bac Le, Abdelkhalek Messaoudi |
| 2016 | A Fast and Accurate Fundamental Frequency Estimator Using Recursive Moving Average Filters. Ryunosuke Daido, Yuji Hisaminato |
| 2016 | A Feature Normalisation Technique for PLLR Based Language Identification Systems. Sarith Fernando, Vidhyasaharan Sethu, Eliathamby Ambikairajah |
| 2016 | A Feature Study for Masking-Based Reverberant Speech Separation. Masood Delfarah, DeLiang Wang |
| 2016 | A Framework for Automated Marmoset Vocalization Detection and Classification. Alan Wisler, Laura J. Brattain, Rogier Landman, Thomas F. Quatieri |
| 2016 | A Framework for Practical Multistream ASR. Sri Harish Reddy Mallidi, Hynek Hermansky |
| 2016 | A French Corpus for Distant-Microphone Speech Processing in Real Homes. Nancy Bertin, Ewen Camberlein, Emmanuel Vincent, Romain Lebarbenchon, Stéphane Peillon, Éric Lamande, Sunit Sivasankaran, Frédéric Bimbot, Irina Illina, Ariane Tom, Sylvain Fleury, Éric Jamet |
| 2016 | A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks. Takenori Yoshimura, Gustav Eje Henter, Oliver Watts, Mirjam Wester, Junichi Yamagishi, Keiichi Tokuda |
| 2016 | A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training. Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2016 | A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences. Feng-Long Xie, Frank K. Soong, Haifeng Li |
| 2016 | A Longitudinal Study of Children's Intonation in Narrative Speech. Jeffrey Kallay, Melissa A. Redford |
| 2016 | A Low Cost Desktop Robot and Tele-Presence Device for Interactive Speech Research. Michael C. Brady |
| 2016 | A Multimodal Dialogue System for Air Traffic Control Trainees Based on Discrete-Event Simulation. Lubos Smídl, Adam Chýlek, Jan Svec |
| 2016 | A New Model for Acoustic Wave Propagation and Scattering in the Vocal Tract. Jianguo Wei, Wendan Guan, Darcy Q. Hou, Dingyi Pan, Wenhuan Lu, Jianwu Dang |
| 2016 | A New Model of Speech Motor Control Based on Task Dynamics and State Feedback. Vikram Ramanarayanan, Benjamin Parrell, Louis Goldstein, Srikantan S. Nagarajan, John F. Houde |
| 2016 | A New Pre-Training Method for Training Deep Learning Models with Application to Spoken Language Understanding. Asli Celikyilmaz, Ruhi Sarikaya, Dilek Hakkani-Tür, Xiaohu Liu, Nikhil Ramesh, Gökhan Tür |
| 2016 | A Nonparametric Bayesian Approach for Spoken Term Detection by Example Query. Amir Hossein Harati Nejad Torbati, Joseph Picone |
| 2016 | A Novel Discriminative Score Calibration Method for Keyword Search. Zhiqiang Lv, Meng Cai, Wei-Qiang Zhang, Jia Liu |
| 2016 | A Novel Research to Artificial Bandwidth Extension Based on Deep BLSTM Recurrent Neural Networks and Exemplar-Based Sparse Representation. Bin Liu, Jianhua Tao |
| 2016 | A Novel Risk-Estimation-Theoretic Framework for Speech Enhancement in Nonstationary and Non-Gaussian Noise Conditions. Jishnu Sadasivan, Chandra Sekhar Seelamantula |
| 2016 | A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments. Alessio Brutti, Antigoni Tsiami, Athanasios Katsamanis, Petros Maragos |
| 2016 | A Portable Automatic PA-TA-KA Syllable Detection System to Derive Biomarkers for Neurological Disorders. Fei Tao, Louis Daudet, Christian Poellabauer, Sandra L. Schneider, Carlos Busso |
| 2016 | A Praat-Based Algorithm to Extract the Amplitude Envelope and Temporal Fine Structure Using the Hilbert Transform. Lei He, Volker Dellwo |
| 2016 | A Preliminary Ultrasound Study of Nasal and Lateral Coronals in Arrernte. Marija Tabain, Richard Beare |
| 2016 | A Real-Time Framework for Visual Feedback of Articulatory Data Using Statistical Shape Models. Kristy James, Alexander Hewer, Ingmar Steiner, Stefanie Wuhrer |
| 2016 | A Real-Time Parametric General-Purpose Mammalian Vocal Synthesiser. Roger K. Moore |
| 2016 | A Robust Dual-Microphone Speech Source Localization Algorithm for Reverberant Environments. Yanmeng Guo, Xiaofei Wang, Chao Wu, Qiang Fu, Ning Ma, Guy J. Brown |
| 2016 | A Robust Non-Parametric and Filtering Based Approach for Glottal Closure Instant Detection. Pradeep Rengaswamy, Gurunath Reddy M., K. Sreenivasa Rao, Pallab Dasgupta |
| 2016 | A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems. Layla El Asri, Jing He, Kaheer Suleman |
| 2016 | A Sparse Spherical Harmonic-Based Model in Subbands for Head-Related Transfer Functions. Xiaoke Qi, Jianhua Tao |
| 2016 | A Speaker Diarization System for Studying Peer-Led Team Learning Groups. Harishchandra Dubey, Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen |
| 2016 | A Speaker Recognition System for the SITW Challenge. Oleg Kudashev, Sergey Novoselov, Konstantin Simonchik, Alexander Kozlov |
| 2016 | A Spectral Modulation Sensitivity Weighted Pre-Emphasis Filter for Active Noise Control System. Kah-Meng Cheong, Yuh-Yuan Wang, Tai-Shih Chi |
| 2016 | A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding. Vedran Vukotic, Christian Raymond, Guillaume Gravier |
| 2016 | A Stochastic Model for Computer-Aided Human-Human Dialogue. Merwan Barlier, Romain Laroche, Olivier Pietquin |
| 2016 | A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs. Srikanth Ronanki, Gustav Eje Henter, Zhizheng Wu, Simon King |
| 2016 | A Voice Conversion Mapping Function Based on a Stacked Joint-Autoencoder. Seyed Hamidreza Mohammadi, Alexander Kain |
| 2016 | A WFST Framework for Single-Pass Multi-Stream Decoding. Sirui Xu, Eric Fosler-Lussier |
| 2016 | A priori SNR Estimation Using a Generalized Decision Directed Approach. Aleksej Chinaev, Reinhold Haeb-Umbach |
| 2016 | ARET - Automatic Reading of Educational Texts for Visually Impaired Students. Martin Gruber, Jindrich Matousek, Zdenek Hanzlícek, Zdenek Krnoul, Zbynek Zajíc |
| 2016 | ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks. Miguel Ángel del Agua, Santiago Piqueras, Adrià Giménez, Alberto Sanchís, Jorge Civera, Alfons Juan |
| 2016 | ASR for South Slavic Languages Developed in Almost Automated Way. Jan Nouza, Radek Safarík, Petr Cerva |
| 2016 | AUT System for SITW Speaker Recognition Challenge. Abbas Khosravani, Mohammad Mehdi Homayounpour |
| 2016 | Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features. Yishan Jiao, Ming Tu, Visar Berisha, Julie M. Liss |
| 2016 | Acoustic Analysis of Syllables Across Indian Languages. Anusha Prakash, Jeena J. Prakash, Hema A. Murthy |
| 2016 | Acoustic Differences Between English /t/ Glottalization and Phrasal Creak. Marc Garellek, Scott Seyfarth |
| 2016 | Acoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units. Markus Nußbaum-Thom, Jia Cui, Bhuvana Ramabhadran, Vaibhava Goel |
| 2016 | Acoustic Modelling from the Signal Domain Using CNNs. Pegah Ghahremani, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur |
| 2016 | Acoustic Properties of Formality in Conversational Japanese. Ethan Sherr-Ziarko |
| 2016 | Acoustic Word Embeddings for ASR Error Detection. Sahar Ghannay, Yannick Estève, Nathalie Camelin, Paul Deléglise |
| 2016 | Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech. Slim Ouni, Vincent Colotte, Sara Dahmani, Soumaya Azzi |
| 2016 | Acoustic-Prosodic and Turn-Taking Features in Interactions with Children with Neurodevelopmental Disorders. Daniel Bone, Somer Bishop, Rahul Gupta, Sungbok Lee, Shrikanth S. Narayanan |
| 2016 | Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model. Patrick Lumban Tobing, Tomoki Toda, Hirokazu Kameoka, Satoshi Nakamura |
| 2016 | Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models. Thomas Drugman, Janne Pylkkönen, Reinhard Kneser |
| 2016 | Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations. Tasha Nagamine, Zhuo Chen, Nima Mesgarani |
| 2016 | Adaptive Group Sparsity for Non-Negative Matrix Factorization with Application to Unsupervised Source Separation. Xu Li, Ziteng Wang, Xiaofei Wang, Qiang Fu, Yonghong Yan |
| 2016 | Adaptive Latency for Part-of-Speech Tagging in Incremental Text-to-Speech Synthesis. Maël Pouget, Olha Nahorna, Thomas Hueber, Gérard Bailly |
| 2016 | Advances in Very Deep Convolutional Neural Networks for LVCSR. Tom Sercu, Vaibhava Goel |
| 2016 | Adversarial Multi-Task Learning of Deep Neural Networks for Robust Speech Recognition. Yusuke Shinohara |
| 2016 | An Acoustic Analysis of /r/ in Tyrolean. Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti, Constantijn Kaland |
| 2016 | An Acoustic Analysis of Child-Child and Child-Robot Interactions for Understanding Engagement during Speech-Controlled Computer Games. Theodora Chaspari, Jill Fain Lehman |
| 2016 | An Adaptive Multi-Band System for Low Power Voice Command Recognition. Qing He, Gregory W. Wornell, Wei Ma |
| 2016 | An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging. Aurore Jaumard-Hakoun, Kele Xu, Clémence Leboullenger, Pierre Roussel-Ragot, Bruce Denby |
| 2016 | An Automatic Training Tool for Air Traffic Control Training. Petr Stanislav, Lubos Smídl, Jan Svec |
| 2016 | An Engine for Online Video Search in Large Archives of the Holocaust Testimonies. Petr Stanislav, Jan Svec, Pavel Ircing |
| 2016 | An Expectation Maximization Approach to Joint Modeling of Multidimensional Ratings Derived from Multiple Annotators. Anil Ramakrishna, Rahul Gupta, Ruth B. Grossman, Shrikanth S. Narayanan |
| 2016 | An Improved 3D Geometric Tongue Model. Qiang Fang, Yun Chen, Haibo Wang, Jianguo Wei, Jianrong Wang, Xiyu Wu, Aijun Li |
| 2016 | An Interaural Magnification Algorithm for Enhancement of Naturally-Occurring Level Differences. Shadi Pirhosseinloo, Kostas Kokkinakis |
| 2016 | An Investigation of DNN-Based Speech Synthesis Using Speaker Codes. Nobukatsu Hojo, Yusuke Ijima, Hideyuki Mizuno |
| 2016 | An Investigation of Deep Neural Network Architectures for Language Recognition in Indian Languages. Mounika K. V., Sivanand Achanta, Lakshmi H. R., Suryakanth V. Gangashetty, Anil Kumar Vuppala |
| 2016 | An Investigation of Emotional Speech in Depression Classification. Brian Stasak, Julien Epps, Nicholas Cummins, Roland Goecke |
| 2016 | An Investigation of Recurrent Neural Network Architectures Using Word Embeddings for Phrase Break Prediction. Anandaswarup Vadapalli, Suryakanth V. Gangashetty |
| 2016 | An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions. Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li |
| 2016 | An Investigation on Training Deep Neural Networks Using Probabilistic Transcriptions. Amit Das, Mark Hasegawa-Johnson |
| 2016 | An Investigation on the Use of i-Vectors for Robust ASR. Dimitrios Dimitriadis, Samuel Thomas, Sriram Ganapathy |
| 2016 | An Iterative Phase Recovery Framework with Phase Mask for Spectral Mapping with an Application to Speech Enhancement. Kehuang Li, Bo Wu, Chin-Hui Lee |
| 2016 | An Objective Evaluation Methodology for Blind Bandwidth Extension. Stéphane Villette, Sen Li, Pravin Ramadas, Daniel J. Sinder |
| 2016 | Analysis of Chinese Syllable Durations in Running Speech of Japanese L2 Learners. Yue Sun, Shudon Hsiao, Yoshinori Sagisaka, Jinsong Zhang |
| 2016 | Analysis of Face Mask Effect on Speaker Recognition. Rahim Saeidi, Ilkka Huhtakallio, Paavo Alku |
| 2016 | Analysis of Glottal Stop in Assam Sora Language. Sishir Kalita, Luke Horo, Priyankoo Sarmah, S. R. Mahadeva Prasanna, Samarendra Dandapat |
| 2016 | Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages. Van Hai Do, Nancy F. Chen, Boon Pang Lim, Mark Hasegawa-Johnson |
| 2016 | Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features. Ozlem Kalinli |
| 2016 | Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge. Ondrej Novotný, Pavel Matejka, Oldrich Plchot, Ondrej Glembek, Lukás Burget, Jan Cernocký |
| 2016 | Analysis of the Voice Conversion Challenge 2016 Evaluation Results. Mirjam Wester, Zhizheng Wu, Junichi Yamagishi |
| 2016 | Analysis on Gated Recurrent Unit Based Question Detection Approach. Yaodong Tang, Zhiyong Wu, Helen M. Meng, Mingxing Xu, Lianhong Cai |
| 2016 | Analytical Assessment of Dual-Stream Merging for Noise-Robust ASR. Louis ten Bosch, Bert Cranen, Yang Sun |
| 2016 | Analyzing Temporal Dynamics of Dyadic Synchrony in Affective Interactions. Zhaojun Yang, Shrikanth S. Narayanan |
| 2016 | Analyzing the Contribution of Top-Down Lexical and Bottom-Up Acoustic Cues in the Detection of Sentence Prominence. Sofoklis Kakouros, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, Okko Räsänen |
| 2016 | Analyzing the Relation Between Overall Quality and the Quality of Individual Phases in a Telephone Conversation. Friedemann Köster, Sebastian Möller |
| 2016 | Anchored Speech Detection. Roland Maas, Sree Hari Krishnan Parthasarathi, Brian John King, Ruitong Huang, Björn Hoffmeister |
| 2016 | Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016. Fernando Villavicencio, Junichi Yamagishi, Jordi Bonada, Felipe Espic |
| 2016 | Articulation Rate Filtering of CQCC Features for Automatic Speaker Verification. Massimiliano Todisco, Héctor Delgado, Nicholas W. D. Evans |
| 2016 | Articulation Rate in Adverse Listening Conditions in Younger and Older Adults. Outi Tuomainen, Valérie Hazan |
| 2016 | Articulatory Feature Extraction Using CTC to Build Articulatory Classifiers Without Forced Frame Alignments for Speech Recognition. Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy |
| 2016 | Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data. Asterios Toutios, Tanner Sorensen, Krishna Somandepalli, Rachel Alexander, Shrikanth S. Narayanan |
| 2016 | Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks. Zheng-Chen Liu, Zhen-Hua Ling, Li-Rong Dai |
| 2016 | Artificial Neural Network-Based Feature Combination for Spatial Voice Activity Detection. Stefan Meier, Walter Kellermann |
| 2016 | Assessing Idiosyncrasies in a Bayesian Model of Speech Communication. Marie-Lou Barnaud, Julien Diard, Pierre Bessière, Jean-Luc Schwartz |
| 2016 | Assessing Level-Dependent Segmental Contribution to the Intelligibility of Speech Processed by Single-Channel Noise-Suppression Algorithms. Tian Guan, Guangxing Chu, Fei Chen, Feng Yang |
| 2016 | Assessing Speech Quality in Speech-Aware Hearing Aids Based on Phoneme Posteriorgrams. Constantin Spille, Hendrik Kayser, Hynek Hermansky, Bernd T. Meyer |
| 2016 | At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech. Maximilian Schmitt, Fabien Ringeval, Björn W. Schuller |
| 2016 | Attention Assisted Discovery of Sub-Utterance Structure in Speech Emotion Recognition. Che-Wei Huang, Shrikanth S. Narayanan |
| 2016 | Attention-Based Convolutional Neural Networks for Sentence Classification. Zhiwei Zhao, Youzheng Wu |
| 2016 | Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling. Bing Liu, Ian R. Lane |
| 2016 | Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder. Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee |
| 2016 | Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings. Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, Alexandros Potamianos |
| 2016 | Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss. Yuki Takashima, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono |
| 2016 | Audio-to-Visual Speech Conversion Using Deep Neural Networks. Sarah Taylor, Akihiro Kato, Iain A. Matthews, Ben P. Milner |
| 2016 | Audiovisual Speech Scene Analysis in the Context of Competing Sources. Attigodu C. Ganesh, Frédéric Berthommier, Jean-Luc Schwartz |
| 2016 | Audiovisual Training Effects for Japanese Children Learning English /r/-/l/. Yasuaki Shinohara |
| 2016 | Auditory Processing Impairments Under Background Noise in Children with Non-Syndromic Cleft Lip and/or Palate. Yang Feng, Zhang Lu |
| 2016 | Auditory-Visual Lexical Tone Perception in Thai Elderly Listeners with and without Hearing Impairment. Benjawan Kasisopa, Chutamanee Onsuwan, Charturong Tantibundhit, Nittayapa Klangpornkun, Suparak Techacharoenrungrueang, Sudaporn Luksaneeyanawin, Denis Burnham |
| 2016 | Auditory-Visual Perception of VCVs Produced by People with Down Syndrome: Preliminary Results. Alexandre Hennequin, Amélie Rochet-Capellan, Marion Dohen |
| 2016 | Automated Pause Insertion for Improved Intelligibility Under Reverberation. Petko Nikolov Petkov, Norbert Braunschweiler, Yannis Stylianou |
| 2016 | Automated Screening of Speech Development Issues in Children by Identifying Phonological Error Patterns. Lauren Ward, Alessandro Stefani, Daniel V. Smith, Andreas Duenser, Jill Freyne, Barbara Dodd, Angela Morgan |
| 2016 | Automatic Analysis of Phonetic Speech Style Dimensions. Neville Ryant, Mark Y. Liberman |
| 2016 | Automatic Analysis of Typical and Atypical Encoding of Spontaneous Emotion in the Voice of Children. Fabien Ringeval, Erik Marchi, Charline Grossard, Jean Xavier, Mohamed Chetouani, David Cohen, Björn W. Schuller |
| 2016 | Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners. Shuju Shi, Yosuke Kashiwagi, Shohei Toyama, Junwei Yue, Yutaka Yamauchi, Daisuke Saito, Nobuaki Minematsu |
| 2016 | Automatic Classification of Lexical Stress in English and Arabic Languages Using Deep Learning. Mostafa Ali Shahin, Julien Epps, Beena Ahmed |
| 2016 | Automatic Classification of Phonation Modes in Singing Voice: Towards Singing Style Characterisation and Application to Ethnomusicological Recordings. Jean-Luc Rouas, Leonidas Ioannidis |
| 2016 | Automatic Correction of ASR Outputs by Using Machine Translation. Luis Fernando D'Haro, Rafael E. Banchs |
| 2016 | Automatic Detection of Parkinson's Disease Based on Modulated Vowels. Daria Hemmerling, Juan Rafael Orozco-Arroyave, Andrzej Skalski, Janusz Gajda, Elmar Nöth |
| 2016 | Automatic Dialect Detection in Arabic Broadcast Speech. Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James R. Glass, Peter Bell, Steve Renals |
| 2016 | Automatic Discrimination of Soft Voice Onset Using Acoustic Features of Breathy Voicing. Keiko Ochi, Koichi Mori, Naomi Sakai, Nobutaka Ono |
| 2016 | Automatic Estimation of Perceived Sincerity from Spoken Language. Brandon M. Booth, Rahul Gupta, Pavlos Papadopoulos, Ruchir Travadi, Shrikanth S. Narayanan |
| 2016 | Automatic Genre and Show Identification of Broadcast Media. Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain |
| 2016 | Automatic Glottal Inverse Filtering with Non-Negative Matrix Factorization. Manu Airaksinen, Lauri Juvela, Tom Bäckström, Paavo Alku |
| 2016 | Automatic Measurement of Voice Onset Time and Prevoicing Using Recurrent Neural Networks. Yossi Adi, Joseph Keshet, Olga Dmitrieva, Matthew Goldrick |
| 2016 | Automatic Paragraph Segmentation with Lexical and Prosodic Features. Catherine Lai, Mireia Farrús, Johanna D. Moore |
| 2016 | Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures. Ju Lin, Yanlu Xie, Jinsong Zhang |
| 2016 | Automatic Pronunciation Generation by Utilizing a Semi-Supervised Deep Neural Networks. Naoya Takahashi, Tofigh Naghibi, Beat Pfister |
| 2016 | Automatic Recognition of Social Roles Using Long Term Role Transitions in Small Group Interactions. Gaurav Fotedar, Aditya Gaonkar P., Saikat Chatterjee, Prasanta Kumar Ghosh |
| 2016 | Automatic Scoring of Monologue Video Interviews Using Multimodal Cues. Lei Chen, Gary Feng, Michelle P. Martin-Raugh, Chee Wee Leong, Christopher Kitchen, Su-Youn Yoon, Blair Lehman, Harrison Kell, Chong Min Lee |
| 2016 | Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka. Amit Das, Preethi Jyothi, Mark Hasegawa-Johnson |
| 2016 | Automatic Speech Transcription for Low-Resource Languages - The Case of Yoloxóchitl Mixtec (Mexico). Vikramjit Mitra, Andreas Kathol, Jonathan D. Amith, Rey Castillo García |
| 2016 | Automatically Classifying Self-Rated Personality Scores from Speech. Guozhen An, Sarah Ita Levitan, Rivka Levitan, Andrew Rosenberg, Michelle Levine, Julia Hirschberg |
| 2016 | Bayesian Modeling in Speech Motor Control: A Principled Structure for the Integration of Various Constraints. Jean-François Patri, Pascal Perrier, Julien Diard |
| 2016 | Behavioral Coding of Therapist Language in Addiction Counseling Using Recurrent Neural Networks. Bo Xiao, Dogan Can, James Gibson, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2016 | Bertsokantari: a TTS Based Singing Synthesis System. Eder del Blanco, Inma Hernáez, Eva Navas, Xabier Sarasola, Daniel Erro |
| 2016 | Better Evaluation of ASR in Speech Translation Context Using Word Embeddings. Ngoc-Tien Le, Christophe Servan, Benjamin Lecouteux, Laurent Besacier |
| 2016 | Between- and Within-Speaker Effects of Bilingualism on F0 Variation. Rob Voigt, Dan Jurafsky, Meghan Sumner |
| 2016 | Beyond Utterance Extraction: Summary Recombination for Speech Summarization. Jérémy Trione, Benoît Favre, Frédéric Béchet |
| 2016 | Bidirectional Recurrent Neural Network with Attention Mechanism for Punctuation Restoration. Ottokar Tilk, Tanel Alumäe |
| 2016 | Bird Song Synthesis Based on Hidden Markov Models. Jordi Bonada, Robert Lachlan, Merlijn Blaauw |
| 2016 | Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs. Mahdie Karbasi, Ahmed Hussen Abdelaziz, Hendrik Meutzner, Dorothea Kolossa |
| 2016 | Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding. Tom Bäckström, Florin Ghido, Johannes Fischer |
| 2016 | Blind Speech Separation with GCC-NMF. Sean U. N. Wood, Jean Rouat |
| 2016 | CNN-Based Phone Segmentation Experiments in a Less-Represented Language. Céline Manenti, Thomas Pellegrini, Julien Pinquier |
| 2016 | Call Alternation Between Specific Pairs of Male Frogs Revealed by a Sound-Imaging Method in Their Natural Habitat. Ikkyu Aihara, Takeshi Mizumoto, Hiromitsu Awano, Hiroshi G. Okuno |
| 2016 | Can Intensive Exposure to Foreign Language Sounds Affect the Perception of Native Sounds? Jian Gong, María Luisa García Lecumberri, Martin Cooke |
| 2016 | Categorization of Natural Spanish Whistled Vowels by Naïve Spanish Listeners. Julien Meyer, Laure Dentel, Fanny Meunier |
| 2016 | Causal Speech Enhancement Combining Data-Driven Learning and Suppression Rule Estimation. Seyedmahdad Mirsamadi, Ivan Tashev |
| 2016 | Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance. Cristina Guerrero, Georgina Tryfou, Maurizio Omologo |
| 2016 | Characterization of Audiovisual Dramatic Attitudes. Adela Barbulescu, Rémi Ronfard, Gérard Bailly |
| 2016 | Characterizing Vocal Tract Dynamics Across Speakers Using Real-Time MRI. Tanner Sorensen, Asterios Toutios, Louis Goldstein, Shrikanth S. Narayanan |
| 2016 | Classification of Voice Modality Using Electroglottogram Waveforms. Michal Borsky, Daryush D. Mehta, Julius P. Gudjohnsen, Jón Guðnason |
| 2016 | Closing Remarks. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann |
| 2016 | CloudCAST - Remote Speech Technology for Speech Professionals. Phil D. Green, Ricard Marxer, Stuart P. Cunningham, Heidi Christensen, Frank Rudzicz, Maria Yancheva, André Coy, Massimiliano Malavasi, Lorenzo Desideri, Fabio Tamburini |
| 2016 | Coda Stop and Taiwan Min Checked Tone Sound Changes. Ho-Hsien Pan, Hsiao-tung Huang, Shao-Ren Lyu |
| 2016 | Colloquialising Modern Standard Arabic Text for Improved Speech Recognition. Sarah Al-Shareef, Thomas Hain |
| 2016 | Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection. Sarah Ita Levitan, Guozhen An, Min Ma, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg |
| 2016 | Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition. Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai |
| 2016 | Combining Data-Oriented and Process-Oriented Approaches to Modeling Reaction Time Data. Louis ten Bosch, Lou Boves, Mirjam Ernestus |
| 2016 | Combining Energy and Cross-Entropy Analysis for Nuclear Segments Detection. Antonio Origlia, Francesco Cutugno |
| 2016 | Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition. Salil Deena, Madina Hasan, Mortaza Doulaty, Oscar Saz, Thomas Hain |
| 2016 | Combining Mask Estimates for Single Channel Audio Source Separation Using Deep Neural Networks. Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley |
| 2016 | Combining Non-Pathological Data of Different Language Varieties to Improve DNN-HMM Performance on Pathological Speech. Emre Yilmaz, Mario Ganzeboom, Catia Cucchiarini, Helmer Strik |
| 2016 | Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection. Axel Horndasch, Anton Batliner, Caroline Kaufhold, Elmar Nöth |
| 2016 | Combining State-Level Spotting and Posterior-Based Acoustic Match for Improved Query-by-Example Spoken Term Detection. Shuji Oishi, Tatsuya Matsuba, Mitsuaki Makino, Atsuhiko Kai |
| 2016 | Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting. Raymond W. M. Ng, Bhusan Chettri, Thomas Hain |
| 2016 | Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition. Shiliang Zhang, Hui Jiang, Shifu Xiong, Si Wei, Li-Rong Dai |
| 2016 | Comparing Articulatory and Acoustic Strategies for Reducing Non-Native Accents. Sandesh Aryal, Ricardo Gutierrez-Osuna |
| 2016 | Comparing Different Methods for Analyzing ERP Signals. Kimberley Mulder, Louis ten Bosch, Lou Boves |
| 2016 | Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-Based Speech Synthesis Model. Fei Chen, Benson C. L. Chiao |
| 2016 | Comparing the Influence of Spectro-Temporal Integration in Computational Speech Segregation. Thomas Bentsen, Tobias May, Abigail A. Kressner, Torsten Dau |
| 2016 | Comparison of Multiple System Combination Techniques for Keyword Spotting. William Hartmann, Le Zhang, Kerri Barnes, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz |
| 2016 | Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling. Ehsan Variani, Tara N. Sainath, Izhak Shafran, Michiel Bacchiani |
| 2016 | Complexity in Prosody: A Nonlinear Dynamical Systems Approach for Dyadic Conversations; Behavior and Outcomes in Couples Therapy. Md. Nasir, Brian R. Baucom, Shrikanth S. Narayanan, Panayiotis G. Georgiou |
| 2016 | Compositional Neural Network Language Models for Agglutinative Languages. Ebru Arisoy, Murat Saraclar |
| 2016 | Computational Approaches to Linguistic Code Switching. Mona T. Diab, Pascale Fung, Julia Hirschberg, Thamar Solorio |
| 2016 | Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion. Abir Masmoudi, Mariem Ellouze, Fethi Bougares, Yannick Estève, Lamia Hadrich Belguith |
| 2016 | Congruency Effect Between Articulation and Grasping in Native English Speakers. Mikko Tiainen, Fatima M. Felisberti, Kaisa Tiippana, Martti Vainio, Juraj Simko, Jirí Lukavský, Lari Vainio |
| 2016 | Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models. Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Yoshioka, Dung T. Tran, Tomohiro Nakatani |
| 2016 | Context Aware Mispronunciation Detection for Mandarin Pronunciation Training. Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li |
| 2016 | Context-Aware Restaurant Recommendation for Natural Language Queries: A Formative User Study in the Automotive Domain. Philipp Fischer, Cornelius Styp von Rekowski, Andreas Nürnberger |
| 2016 | Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs. Chiori Hori, Takaaki Hori, Shinji Watanabe, John R. Hershey |
| 2016 | Contextual Prediction Models for Speech Recognition. Yoni Halpern, Keith B. Hall, Vlad Schogol, Michael Riley, Brian Roark, Gleb Skobeltsyn, Martin Bäuml |
| 2016 | Conversational Engagement Recognition Using Auditory and Visual Cues. Yuyun Huang, Emer Gilmartin, Nick Campbell |
| 2016 | Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data. Colin Vaz, Asterios Toutios, Shrikanth S. Narayanan |
| 2016 | Convolutional Neural Networks with Data Augmentation for Classifying Speakers' Native Language. Gil Keren, Jun Deng, Jouni Pohjalainen, Björn W. Schuller |
| 2016 | Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition. Vikramjit Mitra, Horacio Franco |
| 2016 | Corpora for the Evaluation of Robust Speaker Recognition Systems. Douglas E. Sturim, Pedro A. Torres-Carrasquillo, Joseph P. Campbell |
| 2016 | Cost Effective Acoustic Monitoring of Bird Species. Ciira Wa Maina |
| 2016 | Couples Behavior Modeling and Annotation Using Low-Resource LSTM Language Models. Shao-Yen Tseng, Sandeep Nallan Chakravarthula, Brian R. Baucom, Panayiotis G. Georgiou |
| 2016 | Cross-Cultural Depression Recognition from Vocal Biomarkers. Sharifa Alghowinem, Roland Goecke, Julien Epps, Michael Wagner, Jeffrey F. Cohn |
| 2016 | Cross-Database Evaluation of Audio-Based Spoofing Detection Systems. Pavel Korshunov, Sébastien Marcel |
| 2016 | Cross-Gender and Cross-Dialect Tone Recognition for Vietnamese. Antje Schweitzer, Ngoc Thang Vu |
| 2016 | Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data. Seyyed Saeed Sarfjoo, Cenk Demiroglu |
| 2016 | DBN-ivector Framework for Acoustic Emotion Recognition. Rui Xia, Yang Liu |
| 2016 | DNN Online with iVectors Acoustic Modeling and Doc2Vec Distributed Representations for Improving Automated Speech Scoring. Jidong Tao, Lei Chen, Chong Min Lee |
| 2016 | DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification. Zeyan Oo, Yuta Kawakami, Longbiao Wang, Seiichi Nakagawa, Xiong Xiao, Masahiro Iwahashi |
| 2016 | DNN-Based Automatic Speech Recognition as a Model for Human Phoneme Perception. Mats Exter, Bernd T. Meyer |
| 2016 | DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition. Kang Hyun Lee, Tae Gyoon Kang, Woo Hyun Kang, Nam Soo Kim |
| 2016 | DNN-Based Speaker Clustering for Speaker Diarisation. Rosanna Milner, Thomas Hain |
| 2016 | DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data. Neethu Mariam Joy, Murali Karthick Baskar, Srinivasan Umesh, Basil Abraham |
| 2016 | Data Augmentation Using Multi-Input Multi-Output Source Separation for Deep Neural Network Based Acoustic Modeling. Yusuke Fujita, Ryoichi Takashima, Takeshi Homma, Masahito Togami |
| 2016 | Data Selection and Adaptation for Naturalness in HMM-Based Speech Synthesis. Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg |
| 2016 | Data Selection by Sequence Summarizing Neural Network in Mismatch Condition Training. Katerina Zmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukás Burget, Jan Cernocký |
| 2016 | Data Selection for Within-Class Covariance Estimation. Elliot Singer, Tyler Campbell, Douglas A. Reynolds |
| 2016 | Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion. Huaiping Ming, Dong-Yan Huang, Lei Xie, Jie Wu, Minghui Dong, Haizhou Li |
| 2016 | Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments. Amr El-Desoky Mousa, Björn W. Schuller |
| 2016 | Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition. Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool |
| 2016 | Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention. Dong Yu, Wayne Xiong, Jasha Droppo, Andreas Stolcke, Guoli Ye, Jinyu Li, Geoffrey Zweig |
| 2016 | Deep Neural Network Based Acoustic-to-Articulatory Inversion Using Phone Sequence Information. Xurong Xie, Xunying Liu, Lan Wang |
| 2016 | Deep Neural Network Bottleneck Features for Acoustic Event Recognition. Seongkyu Mun, Suwon Shon, Wooil Kim, Hanseok Ko |
| 2016 | Deep Neural Network Frontend for Continuous EMG-Based Speech Recognition. Michael Wand, Jürgen Schmidhuber |
| 2016 | Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale. Simin Xie, Nan Yan, Ping Yu, Manwa L. Ng, Lan Wang, Zhuanzhuan Ji |
| 2016 | Deep Neural Networks for i-Vector Language Identification of Short Utterances in Cars. Omid Ghahabi, Antonio Bonafonte, Javier Hernando, Asunción Moreno |
| 2016 | Deep Stacked Autoencoders for Spoken Language Understanding. Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori |
| 2016 | Defining Emotionally Salient Regions Using Qualitative Agreement Method. Srinivas Parthasarathy, Carlos Busso |
| 2016 | Deriving Phonetic Transcriptions and Discovering Word Segmentations for Speech-to-Speech Translation in Low-Resource Settings. Andrew Wilkinson, Tiancheng Zhao, Alan W. Black |
| 2016 | Detecting Mild Cognitive Impairment from Spontaneous Speech by Correlation-Based Phonetic Feature Selection. Gábor Gosztolya, László Tóth, Tamás Grósz, Veronika Vincze, Ildikó Hoffmann, Gréta Szatlóczki, Magdolna Pákáski, János Kálmán |
| 2016 | Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees. Wei Li, Kehuang Li, Sabato Marco Siniscalchi, Nancy F. Chen, Chin-Hui Lee |
| 2016 | Detection of Total Syllables and Canonical Syllables in Infant Vocalizations. Anne S. Warlaumont, Heather L. Ramsdell-Hudock |
| 2016 | Detection of User Escalation in Human-Computer Interactions. Ian Beaver, Cynthia Freeman |
| 2016 | Determining Native Language and Deception Using Phonetic Features and Classifier Combination. Gábor Gosztolya, Tamás Grósz, Róbert Busa-Fekete, László Tóth |
| 2016 | Development of Mandarin Onset-Rime Detection in Relation to Age and Pinyin Instruction. Fei Chen, Nan Yan, Xunan Huang, Hao Zhang, Lan Wang, Gang Peng |
| 2016 | Diagnosing People with Dementia Using Automatic Conversation Analysis. Bahman Mirheidari, Daniel Blackburn, Markus Reuber, Traci Walker, Heidi Christensen |
| 2016 | Dialogue Session Segmentation by Embedding-Enhanced TextTiling. Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang |
| 2016 | Differential Effects of Velopharyngeal Dysfunction on Speech Intelligibility During Early and Late Stages of Amyotrophic Lateral Sclerosis. Panying Rong, Yana Yunusova, Jordan R. Green |
| 2016 | Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination. Reima Karhila, Aku Rouhe, Peter Smit, André Mansikkaniemi, Heini Kallio, Erik Lindroos, Raili Hildén, Martti Vainio, Mikko Kurimo |
| 2016 | Diphthongization of Nuclear Vowels and the Emergence of a Tetraphthong in Hetang Cantonese. Wenqi Hu, Fang Hu, Jian Jin |
| 2016 | Direct Expressive Voice Training Based on Semantic Selection. Igor Jauk, Antonio Bonafonte |
| 2016 | Directly Comparing the Listening Strategies of Humans and Machines. Michael I. Mandel |
| 2016 | Discriminative Layered Nonnegative Matrix Factorization for Speech Separation. Chung-Chien Hsu, Tai-Shih Chi, Jen-Tzung Chien |
| 2016 | Discussion. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | Discussion. Dayana Ribas, Emmanuel Vincent, John H. L. Hansen, Emma Jokinen, Mirco Ravanelli, Hannes Gamper, Fred Richardson |
| 2016 | Discussion. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann |
| 2016 | Disentrainment may be a Positive Thing: A Novel Measure of Unsigned Acoustic-Prosodic Synchrony, and its Relation to Speaker Engagement. Juan Manuel Pérez, Ramiro H. Gálvez, Agustín Gravano |
| 2016 | Disfluency Detection Using a Bidirectional LSTM. Vicky Zayats, Mari Ostendorf, Hannaneh Hajishirzi |
| 2016 | Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition. Yevgen Chebotar, Austin Waters |
| 2016 | Do GMM Phoneme Classifiers Perceive Synthetic Sibilants as Humans Do? Gábor Pintér, Hiroki Watanabe |
| 2016 | Do Listeners Learn Better from Natural Speech? Michael McAuliffe, Molly Babel, Charlotte Vaughn |
| 2016 | Does Auditory-Motor Learning of Speech Transfer from the CV Syllable to the CVCV Word? Tiphaine Caudrelier, Pascal Perrier, Jean-Luc Schwartz, Amélie Rochet-Capellan |
| 2016 | Does She Speak RTT? Towards an Earlier Identification of Rett Syndrome Through Intelligent Pre-Linguistic Vocalisation Analysis. Florian B. Pokorny, Peter B. Marschik, Christa Einspieler, Björn W. Schuller |
| 2016 | Does the Importance of Word-Initial and Word-Final Information Differ in Native versus Non-Native Spoken-Word Recognition? Odette Scharenborg, Juul Coumans, Sofoklis Kakouros, Roeland van Hout |
| 2016 | Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings. Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon |
| 2016 | Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding. Aaron Jaech, Larry P. Heck, Mari Ostendorf |
| 2016 | Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR. Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Robert M. Nickel, Dorothea Kolossa |
| 2016 | Dynamic Transcription for Low-Latency Speech Translation. Jan Niehues, Thai Son Nguyen, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Müller, Matthias Sperber, Sebastian Stüker, Alex Waibel |
| 2016 | Dysarthric Speech Recognition Using Kullback-Leibler Divergence-Based Hidden Markov Model. Myung Jong Kim, Jun Wang, Hoirin Kim |
| 2016 | EVS Channel Aware Mode Robustness to Frame Erasures. Anssi Rämö, Antti Kurittu, Henri Toukomaa |
| 2016 | Effect of Noise on Lexical Tone Perception in Cantonese-Speaking Amusics. Jing Shao, Caicai Zhang, Gang Peng, Yike Yang, William S.-Y. Wang |
| 2016 | Effectiveness of Near-End Speech Enhancement Under Equal-Loudness and Equal-Level Constraints. Tudor-Catalin Zorila, Sheila Flanagan, Brian C. J. Moore, Yannis Stylianou |
| 2016 | Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking. Vahid Montazeri, Shaikat Hossain, Peter F. Assmann |
| 2016 | Effects of L1 Phonotactic Constraints on L2 Word Segmentation Strategies. Tamami Katayama |
| 2016 | Effects of Stress on Fricatives: Evidence from Standard Modern Greek. Charalambos Themistocleous, Angelandria Savva, Andrie Aristodemou |
| 2016 | Effects of Subglottal-Coupling and Interdental-Space on Formant Trajectories During Front-to-Back Vowel Transitions in Chinese. Shuanglin Fan, Kiyoshi Honda, Jianwu Dang, Hui Feng |
| 2016 | Effects of Urgent Speech and Preceding Sounds on Speech Intelligibility in Noisy and Reverberant Environments. Nao Hodoshima |
| 2016 | Efficient Segmental Cascades for Speech Recognition. Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu |
| 2016 | Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling. Sittipong Saychum, Sarawoot Kongyoung, Anocha Rugchatjaroen, Patcharika Chootrakool, Sawit Kasuriya, Chai Wutiwiwatchai |
| 2016 | Emergence of Vocal Developmental Sequences in a Predictive Coding Model of Speech Acquisition. Shamima Najnin, Bonny Banerjee |
| 2016 | End-to-End Language Identification Using Attention-Based Recurrent Neural Networks. Wang Geng, Wenfu Wang, Yuanyuan Zhao, Xinyuan Cai, Bo Xu |
| 2016 | End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding. Yun-Nung Chen, Dilek Hakkani-Tür, Gökhan Tür, Jianfeng Gao, Li Deng |
| 2016 | English Language Speech Assistant. Xavier Anguera, Vu Van |
| 2016 | Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System. Xin Wang, Shinji Takaki, Junichi Yamagishi |
| 2016 | Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals. Gurunath Reddy M., K. Sreenivasa Rao |
| 2016 | Enhancement of Automatic Oral Presentation Assessment System Using Latent N-Grams Word Representation and Part-of-Speech Information. Wen-Yu Huang, Shan-Wen Hsiao, Hung-Ching Sun, Ming-Chuan Hsieh, Ming-Hsueh Tsai, Chi-Chun Lee |
| 2016 | Enhancing Data-Driven Phone Confusions Using Restricted Recognition. Mark Kane, Julie Carson-Berndsen |
| 2016 | Enhancing Multilingual Recognition of Emotion in Speech by Language Identification. Hesam Sagha, Pavel Matejka, Maryna Gavryukova, Filip Povolný, Erik Marchi, Björn W. Schuller |
| 2016 | Entropy Based Pruning for Non-Negative Matrix Based Language Models with Contextual Features. Barlas Oguz, Issac Alphonso, Shuangyu Chang |
| 2016 | Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization. Srikanth Korse, Tobias Jähnel, Tom Bäckström |
| 2016 | Error Correction in Lightly Supervised Alignment of Broadcast Subtitles. Julia Olcoz, Oscar Saz, Thomas Hain |
| 2016 | Estimating the Sincerity of Apologies in Speech by DNN Rank Learning and Prosodic Analysis. Gábor Gosztolya, Tamás Grósz, György Szaszák, László Tóth |
| 2016 | Estimation of Children's Physical Characteristics from Their Voices. Jill Fain Lehman, Rita Singh |
| 2016 | Evaluation of Phonatory Behavior of German and French Speakers in Native and Non-Native Speech. Manfred Pützer, Frank Zimmerer, Wolfgang Wokurek, Jeanin Jügler |
| 2016 | Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems. Lionel Feugère, Christophe d'Alessandro, Samuel Delalez, Luc Ardaillon, Axel Roebel |
| 2016 | Evaluation of a Phone-Based Anomaly Detection Approach for Dysarthric Speech. Imed Laaridh, Corinne Fredouille, Christine Meunier |
| 2016 | Exemplar Dynamics in Phonetic Convergence of Speech Rate. Antje Schweitzer, Michael Walsh |
| 2016 | Experiences with Shared Resources for Research and Education in Speech and Language Processing. Rebecca Bates, Eric Fosler-Lussier, Florian Metze, Martha A. Larson, Gina-Anne Levow, Emily Mower Provost |
| 2016 | Experimental Validation of Sound Generated from Flow in Simplified Vocal Tract Model of Sibilant /s/. Tsukasa Yoshinaga, Kazunori Nozaki, Shigeo Wada |
| 2016 | Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition. Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass |
| 2016 | Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition. Ruizhi Li, Sri Harish Reddy Mallidi, Lukás Burget, Oldrich Plchot, Najim Dehak |
| 2016 | Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers. Alberto Abad, Eugénio Ribeiro, Fábio N. Kepler, Ramón Fernandez Astudillo, Isabel Trancoso |
| 2016 | Exploring Collections of Multimedia Archives Through Innovative Interfaces in the Context of Digital Humanities. Géraldine Damnati, Delphine Charlet, Marc Denjean |
| 2016 | Exploring Session Variability and Template Aging in Speaker Verification for Fixed Phrase Short Utterances. Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna |
| 2016 | Exploring Word Mover's Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization. Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu |
| 2016 | Exploring the Correlation of Pitch Accents and Semantic Slots for Spoken Language Understanding. Sabrina Stehwien, Ngoc Thang Vu |
| 2016 | Expressive Control of Singing Voice Synthesis Using Musical Contexts and a Parametric F0 Model. Luc Ardaillon, Celine Chabot-Canet, Axel Roebel |
| 2016 | Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016. Jordi Bonada, Martí Umbert, Merlijn Blaauw |
| 2016 | Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data. Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai |
| 2016 | F Xiaoyun Wang, Xugang Lu, Hisashi Kawai, Seiichi Yamamoto |
| 2016 | F0 Development in Acquiring Korean Stop Distinction. Gayeon Son |
| 2016 | Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks. Zixing Zhang, Fabien Ringeval, Jing Han, Jun Deng, Erik Marchi, Björn W. Schuller |
| 2016 | Factor Analysis Based Speaker Normalisation for Continuous Emotion Prediction. Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah |
| 2016 | Factor Analysis Based Speaker Verification Using ASR. Hang Su, Steven Wegmann |
| 2016 | Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions. Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani |
| 2016 | Factors Affecting the Intelligibility of Sine-Wave Speech. Fei Chen, Daniel Fogerty |
| 2016 | Far-Field ASR Without Parallel Data. Vijayaditya Peddinti, Vimal Manohar, Yiming Wang, Daniel Povey, Sanjeev Khudanpur |
| 2016 | Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices. Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemyslaw Szczepaniak |
| 2016 | Feature Learning and Automatic Segmentation for Dolphin Communication Analysis. Daniel Kohlsdorf, Denise Herzing, Thad Starner |
| 2016 | Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection. Rubén Zazo, Tara N. Sainath, Gabor Simko, Carolina Parada |
| 2016 | Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection. Martin Heckmann |
| 2016 | First Step Towards End-to-End Parametric TTS Synthesis: Generating Spectral Parameters with Neural Attention. Wenfu Wang, Shuang Xu, Bo Xu |
| 2016 | Flexible, Rapid Authoring of Goal-Orientated, Multi-Turn Dialogues Using the Task Completion Platform. Alex Marin, Paul A. Crook, Omar Zia Khan, Vasiliy Radostev, Khushboo Aggarwal, Ruhi Sarikaya |
| 2016 | Formant Estimation and Tracking Using Deep Learning. Yehoshua Dissen, Joseph Keshet |
| 2016 | Frequency Estimation from Waveforms Using Multi-Layered Neural Networks. Prateek Verma, Ronald W. Schafer |
| 2016 | Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks. Heysem Kaya, Alexey A. Karpov |
| 2016 | Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech. Vikramjit Mitra, Julien van Hout, Wen Wang, Chris Bartels, Horacio Franco, Dimitra Vergyri, Abeer Alwan, Adam Janin, John H. L. Hansen, Richard M. Stern, Abhijeet Sangwan, Nelson Morgan |
| 2016 | Future Context Attention for Unidirectional LSTM Based Acoustic Model. Jian Tang, Shiliang Zhang, Si Wei, Li-Rong Dai |
| 2016 | GMM-Free Flat Start Sequence-Discriminative DNN Training. Gábor Gosztolya, Tamás Grósz, László Tóth |
| 2016 | Gating Recurrent Enhanced Memory Neural Networks on Language Identification. Wang Geng, Yuanyuan Zhao, Wenfu Wang, Xinyuan Cai, Bo Xu |
| 2016 | Generalized Discriminant Analysis (GDA) for Improved i-Vector Based Speaker Recognition. Fahimeh Bahmaninezhad, John H. L. Hansen |
| 2016 | Generalizing Steady State Suppression for Enhanced Intelligibility Under Reverberation. Petko Nikolov Petkov, Yannis Stylianou |
| 2016 | Generating Complementary Acoustic Model Spaces in DNN-Based Sequence-to-Frame DTW Scheme for Out-of-Vocabulary Spoken Term Detection. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh |
| 2016 | Generating Gestural Scores from Acoustics Through a Sparse Anchor-Based Representation of Speech. Christopher Liberatore, Ricardo Gutierrez-Osuna |
| 2016 | Generating Natural Video Descriptions via Multimodal Processing. Qin Jin, Junwei Liang, Xiaozhu Lin |
| 2016 | Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy. Zhenhao Ge, Aravind Ganapathiraju, Ananth N. Iyer, Scott A. Randal, Felix I. Wyss |
| 2016 | Generation of Emotion Control Vector Using MDS-Based Space Transformation for Expressive Speech Synthesis. Yan-You Chen, Chung-Hsien Wu, Yu-Fong Huang |
| 2016 | Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine. Toru Nakashika, Yasuhiro Minami |
| 2016 | Glimpse-Based Metrics for Predicting Speech Intelligibility in Additive Noise Conditions. Yan Tang, Martin Cooke |
| 2016 | Glimpsing Predictions for Natural and Vocoded Sentence Intelligibility During Modulation Masking: Effect of the Glimpse Cutoff Criterion. Bobby Gibbs II, Daniel Fogerty |
| 2016 | GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis. Manu Airaksinen, Bajibabu Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, Paavo Alku |
| 2016 | Glottal Squeaks in VC Sequences. Mísa Hejná, Pertti Palo, Scott Moisik |
| 2016 | HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors. Tomi Kinnunen, Alexey Sholokhov, Elie Khoury, Dennis Alexander Lehmann Thomsen, Md. Sahidullah, Zheng-Hua Tan |
| 2016 | HMM-Based Non-Native Accent Assessment Using Posterior Features. Ramya Rasipuram, Milos Cernak, Mathew Magimai-Doss |
| 2016 | HMM-Based Speech Enhancement Using Sub-Word Models and Noise Adaptation. Akihiro Kato, Ben P. Milner |
| 2016 | Head Motion Generation with Synthetic Speech: A Data Driven Approach. Najmeh Sadoughi, Carlos Busso |
| 2016 | Hierarchical Classification of Speaker and Background Noise and Estimation of SNR Using Sparse Representation. K. V. Vijay Girish, A. G. Ramakrishnan, T. V. Ananthapadmanabha |
| 2016 | Highlighting Psychological Features for Predicting Child Interjections During Story Telling. Gaël Lejeune, François Rioult, Bruno Crémilleux |
| 2016 | How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models. Suman V. Ravuri, Steven Wegmann |
| 2016 | Hybrid Accelerated Optimization for Speech Recognition. Jen-Tzung Chien, Pei-Wen Huang, Tan Lee |
| 2016 | Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, Kai Yu |
| 2016 | Hyperarticulated Production of Korean Glides by Age Group. Seung-Eun Chang, Minsook Kim |
| 2016 | Identifying Hearing Loss from Learned Speech Kernels. Shamima Najnin, Bonny Banerjee, Lisa Lucks Mendel, Masoumeh Heidari Kapourchali, Jayanta Kumar Dutta, Sungmin Lee, Chhayakanta Patro, Monique Pousson |
| 2016 | Identifying Perceptually Similar Voices with a Speaker Recognition System Using Auto-Phonetic Features. Finnian Kelly, Anil Alexander, Oscar Forth, Samuel Kent, Jonas Lindh, Joel Åkesson |
| 2016 | Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN. Blaise Potard, Matthew P. Aylett, David A. Baude, Petr Motlícek |
| 2016 | Illustrating the Production of the International Phonetic Alphabet Sounds Using Fast Real-Time Magnetic Resonance Imaging. Asterios Toutios, Sajan Goud Lingala, Colin Vaz, Jangwon Kim, John H. Esling, Patricia A. Keating, Matthew Gordon, Dani Byrd, Louis Goldstein, Krishna S. Nayak, Shrikanth S. Narayanan |
| 2016 | Impaired Categorical Perception of Mandarin Tones and its Relationship to Language Ability in Autism Spectrum Disorders. Fei Chen, Nan Yan, Xiaojie Pan, Feng Yang, Zhuanzhuan Ji, Lan Wang, Gang Peng |
| 2016 | Implementing Acoustic-Prosodic Entrainment in a Conversational Avatar. Rivka Levitan, Stefan Benus, Ramiro H. Gálvez, Agustín Gravano, Florencia Savoretti, Marián Trnka, Andreas Weise, Julia Hirschberg |
| 2016 | Improved Depiction of Tissue Boundaries in Vocal Tract Real-Time MRI Using Automatic Off-Resonance Correction. Yongwan Lim, Sajan Goud Lingala, Asterios Toutios, Shrikanth S. Narayanan, Krishna S. Nayak |
| 2016 | Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks. Hakan Erdogan, John R. Hershey, Shinji Watanabe, Michael I. Mandel, Jonathan Le Roux |
| 2016 | Improved Multilingual Training of Stacked Neural Network Acoustic Models for Low Resource Languages. Tanel Alumäe, Stavros Tsakalidis, Richard M. Schwartz |
| 2016 | Improved Music Genre Classification with Convolutional Neural Networks. Weibin Zhang, Wenkang Lei, Xiangmin Xu, Xiaofeng Xing |
| 2016 | Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition. Imran A. Sheikh, Irina Illina, Dominique Fohr, Georges Linarès |
| 2016 | Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling. Gakuto Kurata, Brian Kingsbury |
| 2016 | Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis. Eunwoo Song, Frank K. Soong, Hong-Goo Kang |
| 2016 | Improved a priori SAP Estimator in Complex Noisy Environment for Dual Channel Microphone System. Youna Ji, Young-Cheol Park |
| 2016 | Improving Automatic Recognition of Aphasic Speech with AphasiaBank. Duc Le, Emily Mower Provost |
| 2016 | Improving Boundary Estimation in Audiovisual Speech Activity Detection Using Bayesian Information Criterion. Fei Tao, John H. L. Hansen, Carlos Busso |
| 2016 | Improving Children's Speech Recognition Through Out-of-Domain Data Augmentation. Joachim Fainberg, Peter Bell, Mike Lincoln, Steve Renals |
| 2016 | Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data. Yao Tian, Meng Cai, Liang He, Wei-Qiang Zhang, Jia Liu |
| 2016 | Improving English Conversational Telephone Speech Recognition. Ivan Medennikov, Alexey Prudnikov, Alexander Zatvornitskiy |
| 2016 | Improving Generalisation to New Speakers in Spoken Dialogue State Tracking. Iñigo Casanueva, Thomas Hain, Phil D. Green |
| 2016 | Improving Large Vocabulary Accented Mandarin Speech Recognition with Attribute-Based I-Vectors. Hao Zheng, Shanshan Zhang, Liwei Qiao, Jianping Li, Wenju Liu |
| 2016 | Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach. Yibin Zheng, Ya Li, Zhengqi Wen, Xingguang Ding, Jianhua Tao |
| 2016 | Improving TTS with Corpus-Specific Pronunciation Adaptation. Marie Tahon, Raheel Qader, Gwénolé Lecorvé, Damien Lolive |
| 2016 | Improving Under-Resourced Language ASR Through Latent Subword Unit Space Discovery. Marzieh Razavi, Mathew Magimai-Doss |
| 2016 | Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features. Abraham Woubie, Jordi Luque, Javier Hernando |
| 2016 | Improving the Lwazi ASR Baseline. Charl Johannes van Heerden, Neil Kleynhans, Marelie H. Davel |
| 2016 | Improving the Probabilistic Framework for Representing Dialogue Systems with User Response Model. Miao Li, Zhipeng Chen, Ji Wu |
| 2016 | Incorporating a Generative Front-End Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition. Souvik Kundu, Khe Chai Sim, Mark J. F. Gales |
| 2016 | Individual Identity in Songbirds: Signal Representations and Metric Learning for Locating the Information in Complex Corvid Calls. Dan Stowell, Veronica Morfi, Lisa F. Gill |
| 2016 | Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques. Thomas Pellegrini, Sandrine Mouysset |
| 2016 | Integrated Spoofing Countermeasures and Automatic Speaker Verification: An Evaluation on ASVspoof 2015. Md. Sahidullah, Héctor Delgado, Massimiliano Todisco, Hong Yu, Tomi Kinnunen, Nicholas W. D. Evans, Zheng-Hua Tan |
| 2016 | Intelligibility Enhancement at the Receiving End of the Speech Transmission System - Effects of Far-End Noise Reduction. Emma Jokinen, Paavo Alku |
| 2016 | Intelligibility of Disordered Speech: Global and Detailed Scores. Mario Ganzeboom, Marjoke Bakker, Catia Cucchiarini, Helmer Strik |
| 2016 | Inter-Speech Clicks in an Interspeech Keynote. Jürgen Trouvain, Zofia Malisz |
| 2016 | Inter-Task System Fusion for Speaker Recognition. Marc Ferras, Srikanth R. Madikeri, Subhadeep Dey, Petr Motlícek, Hervé Bourlard |
| 2016 | Interaction Between Lexical Tone and Intonation: An EMA Study. Hao Yi, Sam Tilsen |
| 2016 | Interactive Spoken Content Retrieval by Deep Reinforcement Learning. Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-yi Lee, Lin-Shan Lee |
| 2016 | Interpretation of Low Dimensional Neural Network Bottleneck Features in Terms of Human Perception and Production. Philip Weber, Linxue Bai, Martin J. Russell, Peter Jancovic, Stephen M. Houghton |
| 2016 | Introducing Temporal Rate Coding for Speech in Cochlear Implants: A Microscopic Evaluation in Humans and Models. Anja Eichenauer, Mathias Dietz, Bernd T. Meyer, Tim Jürgens |
| 2016 | Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement. Steffen Zeiler, Hendrik Meutzner, Ahmed Hussen Abdelaziz, Dorothea Kolossa |
| 2016 | Introduction to Poster Presentation of Part II. Jeesun Kim, Gérard Bailly |
| 2016 | Introduction. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann |
| 2016 | Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge. Yi Liu, Yao Tian, Liang He, Jia Liu |
| 2016 | Investigating the Impact of Dialect Prestige on Lexical Decision. Mairym Lloréns Monteserín, Jason D. Zevin |
| 2016 | Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks. Naoyuki Kanda, Shoji Harada, Xugang Lu, Hisashi Kawai |
| 2016 | Investigation of Speed-Accuracy Tradeoffs in Speech Production Using Real-Time Magnetic Resonance Imaging. Adam C. Lammert, Christine H. Shadle, Shrikanth S. Narayanan, Thomas F. Quatieri |
| 2016 | Investigation of Sub-Band Discriminative Information Between Spoofed and Genuine Speech. Kaavya Sriskandaraja, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah |
| 2016 | Is Deception Emotional? An Emotion-Driven Predictive Approach. Shahin Amiriparian, Jouni Pohjalainen, Erik Marchi, Sergey Pugachevskiy, Björn W. Schuller |
| 2016 | Iterative PLDA Adaptation for Speaker Diarization. Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier |
| 2016 | Joint Effect of Dialect and Mandarin on English Vowel Production: A Case Study in Changsha EFL Learners. Xinyi Wen, Yuan Jia |
| 2016 | Joint Enhancement and Coding of Speech by Incorporating Wiener Filtering in a CELP Codec. Johannes Fischer, Tom Bäckström |
| 2016 | Joint Learning of Speaker and Phonetic Similarities with Siamese Networks. Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux |
| 2016 | Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara |
| 2016 | Joint Sound Source Separation and Speaker Recognition. Jeroen Zegers, Hugo Van hamme |
| 2016 | Joint Speaker and Lexical Modeling for Short-Term Characterization of Speaker. Guangsen Wang, Kong-Aik Lee, Trung Hieu Nguyen, Hanwu Sun, Bin Ma |
| 2016 | Joint Syntactic and Semantic Analysis with a Multitask Deep Learning Framework for Spoken Language Understanding. Jérémie Tafforeau, Frédéric Béchet, Thierry Artières, Benoît Favre |
| 2016 | Jointly Learning to Locate and Classify Words Using Convolutional Networks. Dimitri Palaz, Gabriel Synnaeve, Ronan Collobert |
| 2016 | Jointly Optimizing Activation Coefficients of Convolutive NMF Using DNN for Speech Separation. Hao Li, Shuai Nie, Xueliang Zhang, Hui Zhang |
| 2016 | Ketchup, Interdisciplinarity, and the Spread of Innovation in Speech and Language Processing. Dan Jurafsky |
| 2016 | Kulning (Swedish Cattle Calls): Acoustic, EGG, Stroboscopic and High-Speed Video Analyses of an Unusual Singing Style. Ahmed Geneid, Anne-Maria Laukkanen, Anita McAllister, Robert Eklund |
| 2016 | L1-L2 Interference: The Case of Final Devoicing of French Voiced Fricatives in Final Position by German Learners. Sucheta Ghosh, Camille Fauth, Aghilas Sini, Yves Laprie |
| 2016 | L2 Acquisition and Production of the English Rhotic Pharyngeal Gesture. Sarah Harper, Louis Goldstein, Shrikanth S. Narayanan |
| 2016 | L2 English Rhythm in Read Speech by Chinese Students. Hongwei Ding, Xinping Xu |
| 2016 | LIA System for the SITW Speaker Recognition Challenge. Waad Ben Kheder, Moez Ajili, Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre |
| 2016 | LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition. Kazuki Irie, Zoltán Tüske, Tamer Alkhouli, Ralf Schlüter, Hermann Ney |
| 2016 | LSTM-Based NeuroCRFs for Named Entity Recognition. Marc-Antoine Rondeau, Yi Su |
| 2016 | Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling. Gakuto Kurata, Bing Xiang, Bowen Zhou |
| 2016 | Language Adaptive DNNs for Improved Low Resource Speech Recognition. Markus Müller, Sebastian Stüker, Alex Waibel |
| 2016 | Language Effects in Noise-Induced Word Misperceptions. María Luisa García Lecumberri, Jon Barker, Ricard Marxer, Martin Cooke |
| 2016 | Language Identification Based on Generative Modeling of Posteriorgram Sequences Extracted from Frame-by-Frame DNNs and LSTM-RNNs. Ryo Masumura, Taichi Asami, Hirokazu Masataki, Yushi Aono, Sumitaka Sakauchi |
| 2016 | Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions. Arseniy Gorin, Rasa Lileikyte, Guangpu Huang, Lori Lamel, Jean-Luc Gauvain, Antoine Laurent |
| 2016 | Language Recognition via Sparse Coding. Youngjune L. Gwon, William M. Campbell, Douglas E. Sturim, H. T. Kung |
| 2016 | LatticeRnn: Recurrent Neural Networks Over Lattices. Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Lambert Mathias, Ariya Rastrow, Björn Hoffmeister |
| 2016 | Laughter Valence Prediction in Motivational Interviewing Based on Lexical and Acoustic Cues. Rahul Gupta, Nishant Nath, Taruna Agrawal, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan |
| 2016 | Learning Document Representations Using Subspace Multinomial Model. Santosh Kesiraju, Lukás Burget, Igor Szöke, Jan Cernocký |
| 2016 | Learning Multiscale Features Directly from Waveforms. Zhenyao Zhu, Jesse H. Engel, Awni Y. Hannun |
| 2016 | Learning N-Gram Language Models from Uncertain Data. Vitaly Kuznetsov, Hank Liao, Mehryar Mohri, Michael Riley, Brian Roark |
| 2016 | Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information. Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li |
| 2016 | Learning Personalized Pronunciations for Contact Name Recognition. Antoine Bruguier, Fuchun Peng, Françoise Beaufays |
| 2016 | Learning a Translation Model from Word Lattices. Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird |
| 2016 | Lig-Aikuma: A Mobile App to Collect Parallel Speech for Under-Resourced Language Studies. Elodie Gauthier, David Blachon, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, Annie Rialland, Gilles Adda, Grégoire Bachman |
| 2016 | Likelihood Ratio Calculation in Acoustic-Phonetic Forensic Voice Comparison: Comparison of Three Statistical Modelling Approaches. Ewald Enzinger |
| 2016 | Local Sparsity Based Online Dictionary Learning for Environment-Adaptive Speech Enhancement with Nonnegative Matrix Factorization. Kwang Myung Jeon, Hong Kook Kim |
| 2016 | Localizing Bird Songs Using an Open Source Robot Audition System with a Microphone Array. Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno |
| 2016 | Locally Linear Embedding for Exemplar-Based Spectral Conversion. Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang |
| 2016 | Log-Linear System Combination Using Structured Support Vector Machines. Jingzhou Yang, Anton Ragni, Mark J. F. Gales, Kate M. Knill |
| 2016 | Long Short-Term Memory for Speaker Generalization in Supervised Speech Separation. Jitong Chen, DeLiang Wang |
| 2016 | Long-Term Stability of Tracheoesophageal Voices. Klaske E. van Sluis, Michiel W. M. van den Brekel, Frans J. M. Hilgers, Rob J. J. H. van Son |
| 2016 | Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling. Gil Luyet, Pranay Dighe, Afsaneh Asaei, Hervé Bourlard |
| 2016 | Lower Frame Rate Neural Network Acoustic Models. Golan Pundak, Tara N. Sainath |
| 2016 | MIVOQ-PTTS - A Revolutionary New Way of Thinking TTS. Piero Cosi, Giulio Paci, Giacomo Sommavilla, Fabio Tesser |
| 2016 | ML Parameter Generation with a Reformulated MGE Training Criterion - Participation in the Voice Conversion Challenge 2016. Daniel Erro, Agustín Alonso, Luis Serrano, David Tavarez, Igor Odriozola, Xabier Sarasola, Eder del Blanco, Jon Sánchez, Ibon Saratxaga, Eva Navas, Inma Hernáez |
| 2016 | Mahalanobis Metric Scoring Learned from Weighted Pairwise Constraints in I-Vector Speaker Recognition System. Zhenchun Lei, Yanhong Wan, Jian Luo, Yingen Yang |
| 2016 | Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering. Lauri Juvela, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku |
| 2016 | Making Personal Digital Assistants Aware of What They Do Not Know. Omar Zia Khan, Ruhi Sarikaya |
| 2016 | Manipulating Word Lattices to Incorporate Human Corrections. Yashesh Gaur, Florian Metze, Jeffrey P. Bigham |
| 2016 | Manual versus Automated: The Challenging Routine of Infant Vocalisation Segmentation in Home Videos to Study Neuro(mal)development. Florian B. Pokorny, Robert Peharz, Wolfgang Roth, Matthias Zöhrer, Franz Pernkopf, Peter B. Marschik, Björn W. Schuller |
| 2016 | Marginal Contrast Among Romanian Vowels: Evidence from ASR and Functional Load. Margaret E. L. Renwick, Ioana Vasilescu, Camille Dutrey, Lori Lamel, Bianca Vieru |
| 2016 | Maximum a posteriori Based Decoding for CTC Acoustic Models. Naoyuki Kanda, Xugang Lu, Hisashi Kawai |
| 2016 | Measuring Pronunciation Improvement in Users of CAPT Tool TipTopTalk! Cristian Tejedor García, David Escudero Mancebo, Enrique Cámara Arenas, César González Ferreras, Valentín Cardeñoso-Payo |
| 2016 | Measuring Turn-Taking Offsets in Human-Human Dialogues. Rebecca Lunsford, Peter A. Heeman, Emma Rennie |
| 2016 | Mechanical Production of [b], [m] and [w] Using Controlled Labial and Velopharyngeal Gestures. Takayuki Arai |
| 2016 | Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders. Michael Price, Anantha P. Chandrakasan, James R. Glass |
| 2016 | Microphone Distance Adaptation Using Cluster Adaptive Training for Robust Far Field Speech Recognition. Animesh Prasad, Khe Chai Sim |
| 2016 | Microscopic Multilingual Matrix Test Predictions Using an ASR-Based Speech Recognition Model. Marc René Schädler, David Hülsmeier, Anna Warzybok, Sabine Hochmuth, Birger Kollmeier |
| 2016 | Mindfulness Special Event. Nikki Mirghafori |
| 2016 | Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation. Hung-Shin Lee, Yu Tsao, Chi-Chun Lee, Hsin-Min Wang, Wei-Cheng Lin, Wei-Chen Chen, Shan-Wen Hsiao, Shyh-Kang Jeng |
| 2016 | Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems. Luciana Ferrer, Martin Graciarena |
| 2016 | Misperceptions Arising from Speech-in-Babble Interactions. Máté Attila Tóth, Martin Cooke, Jon Barker |
| 2016 | Mispronunciation Detection Leveraging Maximum Performance Criterion Training of Acoustic Models and Decision Functions. Yao-Chi Hsu, Ming-Han Yang, Hsiao-Tsung Hung, Berlin Chen |
| 2016 | Model Adaptation and Active Learning in the BBN Speech Activity Detection System for the DARPA RATS Program. Damianos G. Karakos, Scott Novotney, Le Zhang, Richard M. Schwartz |
| 2016 | Model Compression Applied to Small-Footprint Keyword Spotting. George Tucker, Minhua Wu, Ming Sun, Sankaran Panchapagesan, Gengshen Fu, Shiv Vitaladevuni |
| 2016 | Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework. Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai |
| 2016 | Model-Based Parametric Prosody Synthesis with Deep Neural Network. Hao Liu, Heng Lu, Xu Shao, Yi Xu |
| 2016 | Modeling Noise Influence to Speech Intelligibility Non-Intrusively by Reduced Speech Dynamic Range. Fei Chen |
| 2016 | Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks. Tara N. Sainath, Bo Li |
| 2016 | Modeling and Transforming Speech Using Variational Autoencoders. Merlijn Blaauw, Jordi Bonada |
| 2016 | Modulation Enhancement of Temporal Envelopes for Increasing Speech Intelligibility in Noise. Maria Koutsogiannaki, Yannis Stylianou |
| 2016 | Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants. Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki |
| 2016 | Monaural Source Separation Using a Random Forest Classifier. Cosimo Riday, Saurabh Bhargava, Richard H. R. Hahnloser, Shih-Chii Liu |
| 2016 | Multi-Attribute Factorized Hidden Layer Adaptation for DNN Acoustic Models. Lahiru Samarakoon, Khe Chai Sim |
| 2016 | Multi-Channel Linear Prediction Based on Binaural Coherence for Speech Dereverberation. Hong Liu, Xiuling Wang, Miao Sun, Cheng Pang |
| 2016 | Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM. Dilek Hakkani-Tür, Gökhan Tür, Asli Celikyilmaz, Yun-Nung Chen, Jianfeng Gao, Li Deng, Ye-Yi Wang |
| 2016 | Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis. Bo Li, Heiga Zen |
| 2016 | Multi-Language Neural Network Language Models. Anton Ragni, Edgar Dakin, Xie Chen, Mark J. F. Gales, Kate M. Knill |
| 2016 | Multi-Talker Speech Recognition Based on Blind Source Separation with ad hoc Microphone Array Using Smartphones and Cloud Storage. Keiko Ochi, Nobutaka Ono, Shigeki Miyabe, Shoji Makino |
| 2016 | Multi-Task Learning and Weighted Cross-Entropy for DNN-Based Keyword Spotting. Sankaran Panchapagesan, Ming Sun, Aparna Khare, Spyros Matsoukas, Arindam Mandal, Björn Hoffmeister, Shiv Vitaladevuni |
| 2016 | Multichannel Spatial Clustering for Robust Far-Field Automatic Speech Recognition in Mismatched Conditions. Michael I. Mandel, Jon Barker |
| 2016 | Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling. Yuanyuan Zhao, Shuang Xu, Bo Xu |
| 2016 | Multilingual Data Selection for Low Resource Speech Recognition. Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran |
| 2016 | Multilingual Speech Emotion Recognition System Based on a Three-Layer Model. Xingfeng Li, Masato Akagi |
| 2016 | Multimodal Fusion of Multirate Acoustic, Prosodic, and Lexical Speaker Characteristics for Native Language Identification. Prashanth Gurunath Shivakumar, Sandeep Nallan Chakravarthula, Panayiotis G. Georgiou |
| 2016 | Multiple Influences on Vocabulary Acquisition: Parental Input Dominates. Dominic W. Massaro |
| 2016 | Multiplicity of the Acoustic Correlates of the Fortis-Lenis Contrast: Plosives in Aberystwyth English. Mísa Hejná |
| 2016 | My-Own-Voice: A Web Service That Allows You to Create a Text-to-Speech Voice From Your Own Voice. Fabrice Malfrère, Olivier Deroo, Emmanuelle Franques, Jonathan Hourez, Nicolas Mazars, Vincent Pagel, Geoffrey Wilfart |
| 2016 | NN-Grams: Unifying Neural Network and n-Gram Language Models for Speech Recognition. Babak Damavandi, Shankar Kumar, Noam Shazeer, Antoine Bruguier |
| 2016 | Native Language Detection Using the I-Vector Framework. Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro L. Koerich |
| 2016 | Native Language Identification Using Spectral and Source-Based Features. Avni Rajpal, Tanvina B. Patel, Hardik B. Sailor, Maulik C. Madhavi, Hemant A. Patil, Hiroya Fujisaki |
| 2016 | Naturalness Judgement of L2 English Through Dubbing Practice. Dean Luo, Ruxin Luo, Lixin Wang |
| 2016 | Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection. Sheng-syun Shen, Hung-yi Lee |
| 2016 | Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition. Bo Li, Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Michiel Bacchiani |
| 2016 | Neural Responses to Speech-Specific Modulations Derived from a Spectro-Temporal Filter Bank. Marina Frye, Cristiano Micheli, Inga M. Schepers, Gerwin Schalk, Jochem W. Rieger, Bernd T. Meyer |
| 2016 | Neurophysiological Vocal Source Modeling for Biomarkers of Disease. Gregory A. Ciccarelli, Thomas F. Quatieri, Satrajit S. Ghosh |
| 2016 | Noise Aware and Combined Noise Models for Speech Denoising in Unknown Noise Conditions. Pavlos Papadopoulos, Colin Vaz, Shrikanth S. Narayanan |
| 2016 | Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-Native Speech Input. Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan |
| 2016 | Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification. Kantapon Kaewtip, Charles E. Taylor, Abeer Alwan |
| 2016 | Non-Iterative Parameter Estimation for Total Variability Model Using Randomized Singular Value Decomposition. Ruchir Travadi, Shrikanth S. Narayanan |
| 2016 | Non-Uniform Boosted MCE Training of Deep Neural Networks for Keyword Spotting. Zhong Meng, Biing-Hwang Juang |
| 2016 | Novel Front-End Features Based on Neural Graph Embeddings for DNN-HMM and LSTM-CTC Acoustic Modeling. Yuzong Liu, Katrin Kirchhoff |
| 2016 | Novel Nonlinear Prediction Based Features for Spoofed Speech Detection. Himanshu N. Bhavsar, Tanvina B. Patel, Hemant A. Patil |
| 2016 | Novel Subband Autoencoder Features for Detection of Spoofed Speech. Meet H. Soni, Tanvina B. Patel, Hemant A. Patil |
| 2016 | Novel Subband Autoencoder Features for Non-Intrusive Quality Assessment of Noise Suppressed Speech. Meet H. Soni, Hemant A. Patil |
| 2016 | Objective Evaluation Methods for Chinese Text-To-Speech Systems. Teng Zhang, Zhipeng Chen, Ji Wu, Sam Lai, Wenhui Lei, Carsten Isert |
| 2016 | Objective Evaluation Using Association Between Dimensions Within Spectral Features for Statistical Parametric Speech Synthesis. Yusuke Ijima, Taichi Asami, Hideyuki Mizuno |
| 2016 | Objective Language Feature Analysis in Children with Neurodevelopmental Disorders During Autism Assessment. Manoj Kumar, Rahul Gupta, Daniel Bone, Nikolaos Malandrakis, Somer Bishop, Shrikanth S. Narayanan |
| 2016 | On Discriminative Framework for Single Channel Audio Source Separation. Arpita Gang, Pravesh Biyani |
| 2016 | On Employing a Highly Mismatched Crowd for Speech Transcription. Purushotam G. Radadia, Rahul Kumar, Kanika Kalra, Shirish Karande, Sachin Lodha |
| 2016 | On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training. William Chan, Ian R. Lane |
| 2016 | On Smoothing and Enhancing Dynamics of Pitch Contours Represented by Discrete Orthogonal Polynomials for Prosody Generation. Chen-Yu Chiang |
| 2016 | On the Appropriateness of Complex-Valued Neural Networks for Speech Enhancement. Lukas Drude, Bhiksha Raj, Reinhold Haeb-Umbach |
| 2016 | On the Correlation and Transferability of Features Between Automatic Speech Recognition and Speech Emotion Recognition. Haytham M. Fayek, Margaret Lech, Lawrence Cavedon |
| 2016 | On the Efficient Representation and Execution of Deep Acoustic Models. Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin |
| 2016 | On the Importance of Efficient Transition Modeling for Speaker Diarization. Itshak Lapidot, Jean-François Bonastre |
| 2016 | On the Influence of Gender on Interruptions in Multiparty Dialogue. Paul Van Eecke, Raquel Fernández |
| 2016 | On the Influence of Text Content on Pass-Phrase Strength for Short-Duration Text-Dependent Automatic Speaker Authentication. Giacomo Valenti, Adrien Daniel, Nicholas W. D. Evans |
| 2016 | On the Issue of Calibration in DNN-Based Speaker Recognition Systems. Mitchell McLaren, Diego Castán, Luciana Ferrer, Aaron Lawson |
| 2016 | On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models. Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani |
| 2016 | On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine. David Guennec, Damien Lolive |
| 2016 | On the Use of Gaussian Mixture Model Framework to Improve Speaker Adaptation of Deep Neural Network Acoustic Models. Natalia A. Tomashenko, Yuri Y. Khokhlov, Yannick Estève |
| 2016 | Open Language Interface for Voice Exploitation (OLIVE). Aaron Lawson, Mitchell McLaren, Harry Bratt, Martin Graciarena, Horacio Franco, Christopher George, Allen R. Stauffer, Chris Bartels, Julien van Hout |
| 2016 | Open Source Speech and Language Resources for Frisian. Emre Yilmaz, Henk van den Heuvel, Jelske Dijkstra, Hans Van de Velde, Frederik Kampstra, Jouke Algra, David A. van Leeuwen |
| 2016 | Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach. Yajie Miao, Florian Metze |
| 2016 | Optimal Unit Stitching in a Unit Selection Singing Synthesis System. Marius Cotescu |
| 2016 | Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion. Takuya Higuchi, Takuya Yoshioka, Tomohiro Nakatani |
| 2016 | Optimizing Speech Recognition Evaluation Using Stratified Sampling. Janne Pylkkönen, Thomas Drugman, Max Bisani |
| 2016 | Organizing Syllables into Sandhi Domains - Evidence from F0 and Duration Patterns in Shanghai Chinese. Bijun Ling, Jie Liang |
| 2016 | Out of Set Language Modelling in Hierarchical Language Identification. Saad Irtza, Vidhyasaharan Sethu, Sarith Fernando, Eliathamby Ambikairajah, Haizhou Li |
| 2016 | Overcoming Data Sparsity in Acoustic Modeling of Low-Resource Language by Borrowing Data and Model Parameters from High-Resource Languages. Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy |
| 2016 | Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification. Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai |
| 2016 | Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization. Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki |
| 2016 | Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification. Jianbo Ma, Saad Irtza, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah |
| 2016 | Parkinson's Disease Progression Assessment from Speech Using GMM-UBM. Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Jesús Francisco Vargas-Bonilla, Elmar Nöth |
| 2016 | Part-of-Speech Tagging and Chunking in Text-to-Speech Synthesis for South African Languages. Georg I. Schlünz, Nkosikhona Dlamini, Rynhardt P. Kruger |
| 2016 | Pause Prediction from Text for Speech Synthesis with User-Definable Pause Insertion Likelihood Threshold. Norbert Braunschweiler, Ranniery Maia |
| 2016 | Perceived Naturalness of Electrolaryngeal Speech Produced Using sEMG-Controlled vs. Manual Pitch Modulation. Kathleen F. Nagle, James T. Heaton |
| 2016 | Perceived Usability and Cognitive Demand of Secondary Tasks in Spoken Versus Visual-Manual Automotive Interaction. Annika Silvervarg, Sofia Lindvall, Jonatan Andersson, Ida Esberg, Christian Jernberg, Filip Frumerie, Arne Jönsson |
| 2016 | Perception Optimized Deep Denoising AutoEncoders for Speech Enhancement. Prashanth Gurunath Shivakumar, Panayiotis G. Georgiou |
| 2016 | Perception of Tone in Whispered Mandarin Sentences: The Case for Singapore Mandarin. Yuling Gu, Boon Pang Lim, Nancy F. Chen |
| 2016 | Perceptual Lateralization of Coda Rhotic Production in Puerto Rican Spanish. Mairym Lloréns Monteserín, Shrikanth S. Narayanan, Louis Goldstein |
| 2016 | Perceptual Salience of Voice Source Parameters in Signaling Focal Prominence. Irena Yanushevskaya, Andy Murphy, Christer Gobl, Ailbhe Ní Chasaide |
| 2016 | Personalized Natural Language Understanding. Xiaohu Liu, Ruhi Sarikaya, Liang Zhao, Yong Ni, Yi-Cheng Pan |
| 2016 | Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams. Lifa Sun, Hao Wang, Shiyin Kang, Kun Li, Helen M. Meng |
| 2016 | Phase-Aware Signal Processing for Automatic Speech Recognition. Johannes Fahringer, Tobias Schrank, Johannes Stahl, Pejman Mowlaee, Franz Pernkopf |
| 2016 | Phase-Encoded Speech Spectrograms. Chandra Sekhar Seelamantula |
| 2016 | PhonVoc: A Phonetic and Phonological Vocoding Toolkit. Milos Cernak, Philip N. Garner |
| 2016 | Phone Synchronous Decoding with CTC Lattice. Zhehuai Chen, Wei Deng, Tao Xu, Kai Yu |
| 2016 | Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis. Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai |
| 2016 | Phoneme Set Design Considering Integrated Acoustic and Linguistic Features of Second Language Speech. Xiaoyun Wang, Tsuneo Kato, Seiichi Yamamoto |
| 2016 | Phoneme, Phone Boundary, and Tone in Automatic Scoring of Mandarin Proficiency. Jiahong Yuan, Mark Y. Liberman |
| 2016 | Phonetic Context Embeddings for DNN-HMM Phone Recognition. Leonardo Badino |
| 2016 | Phonetic Reduction Can Lead to Lengthening, and Enhancement Can Lead to Shortening. Clara Cohen, Matt Carlson |
| 2016 | Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures. Afsaneh Asaei, Gil Luyet, Milos Cernak, Hervé Bourlard |
| 2016 | Phonotactic Language Identification for Singing. Anna M. Kruspe |
| 2016 | Pitch-Adaptive Front-End Features for Robust Children's ASR. Syed Shahnawazuddin, Abhishek Dey, Rohit Sinha |
| 2016 | Pitch-Range Perception: The Dynamic Interaction Between Voice Quality and Fundamental Frequency. Jianjing Kuang, Mark Y. Liberman |
| 2016 | Poster Overview Presentations. Naomi Harte, Peter Jancovic, Karl-L. Schuchmann |
| 2016 | Predicting Affective Dimensions Based on Self Assessed Depression Severity. Rahul Gupta, Shrikanth S. Narayanan |
| 2016 | Predicting Binaural Speech Intelligibility from Signals Estimated by a Blind Source Separation Algorithm. Qingju Liu, Yan Tang, Philip J. B. Jackson, Wenwu Wang |
| 2016 | Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks. Daan van Esch, Mason Chua, Kanishka Rao |
| 2016 | Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors. Tan Lee, Yuanyuan Liu, Yu Ting Yeung, Thomas K. T. Law, Kathy Y. S. Lee |
| 2016 | Predicting User Satisfaction from Turn-Taking in Spoken Conversations. Shammur Absar Chowdhury, Evgeny A. Stepanov, Giuseppe Riccardi |
| 2016 | Prediction and Generation of Backchannel Form for Attentive Listening Systems. Tatsuya Kawahara, Takashi Yamaguchi, Koji Inoue, Katsuya Takanashi, Nigel G. Ward |
| 2016 | Prediction of Deception and Sincerity from Speech Using Automatic Phone Recognition-Based Features. Robert Herms |
| 2016 | Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker. Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu |
| 2016 | Preliminary Experiments on Unsupervised Word Discovery in Mboshi. Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Hélène Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, François Yvon |
| 2016 | Priors for Speaker Counting and Diarization with AHC. Gregory Sell, Alan McCree, Daniel Garcia-Romero |
| 2016 | Privacy-Preserving Speech Analytics for Automatic Assessment of Student Collaboration. Nikoletta Bassiou, Andreas Tsiartas, Jennifer Smith, Harry Bratt, Colleen Richey, Elizabeth Shriberg, Cynthia M. D'Angelo, Nonye Alozie |
| 2016 | Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody. Alexandros Lazaridis, Milos Cernak, Philip N. Garner |
| 2016 | Probabilistic Approach Using Joint Clean and Noisy i-Vectors Modeling for Speaker Recognition. Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre |
| 2016 | Probabilistic Approach Using Joint Long and Short Session i-Vectors Modeling to Deal with Short Utterances for Speaker Recognition. Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre |
| 2016 | Probabilistic Spatial Filter Estimation for Signal Enhancement in Multi-Channel Automatic Speech Recognition. Hendrik Kayser, Niko Moritz, Jörn Anemüller |
| 2016 | Processing and Adaptation to Ambiguous Sounds during the Course of Perceptual Learning. Polina Drozdova, Roeland van Hout, Odette Scharenborg |
| 2016 | Progress and Prospects for Spoken Language Technology: Results from Four Sexennial Surveys. Roger K. Moore, Ricard Marxer |
| 2016 | Progress and Prospects for Spoken Language Technology: What Ordinary People Think. Roger K. Moore, Hui Li, Shih-Hao Liao |
| 2016 | Pronunciation Assessment of Japanese Learners of French with GOP Scores and Phonetic Information. Vincent Laborde, Thomas Pellegrini, Lionel Fontan, Julie Mauclair, Halima Sahraoui, Jérôme Farinas |
| 2016 | Pronunciation Error Detection for New Language Learners. Sean Robertson, Cosmin Munteanu, Gerald Penn |
| 2016 | Prosodic Convergence with Spoken Stimuli in Laboratory Data. Margaret Zellers |
| 2016 | Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge. Claude Montacié, Marie-José Caraty |
| 2016 | Prosodic and Linguistic Analysis of Semantic Fluency Data: A Window into Speech Production and Cognition. Maria K. Wolters, Najoung Kim, Jung-Ho Kim, Sarah E. MacPherson, Jong-Chan Park |
| 2016 | Prosody Modification Using Allpass Residual of Speech Signals. Karthika Vijayan, K. Sri Rama Murty |
| 2016 | Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI. Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahremani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur |
| 2016 | Putting German [ʃ] and [ç] in Two Different Boxes: Native German vs L2 German of French Learners. Jane Wottawa, Martine Adda-Decker, Frédéric Isel |
| 2016 | Quantitative Analysis of Backchannels Uttered by an Interviewer During Neuropsychological Tests. Gérard Bailly, Frédéric Elisei, Alexandra Juphard, Olivier Moreaud |
| 2016 | RNN-BLSTM Based Multi-Pitch Estimation. Jianshu Zhang, Jian Tang, Li-Rong Dai |
| 2016 | Rapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search. Chongjia Ni, Lei Wang, Cheung-Chi Leung, Feng Rao, Li Lu, Bin Ma, Haizhou Li |
| 2016 | Real-Time Presentation Tracking Using Semantic Keyword Spotting. Reza Asadi, Harriet J. Fell, Timothy W. Bickmore, Ha Trinh |
| 2016 | Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms. Erik Marchi, Florian Eyben, Gerhard Hagerer, Björn W. Schuller |
| 2016 | Realistic Multi-Microphone Data Simulation for Distant Speech Recognition. Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo |
| 2016 | Recent Advances in Google Real-Time HMM-Driven Unit Selection Synthesizer. Xavi Gonzalvo, Siamak Tazari, Chun-an Chan, Markus Becker, Alexander Gutkin, Hanna Silén |
| 2016 | Recognition of Depression in Bipolar Disorder: Leveraging Cohort and Person-Specific Knowledge. Soheil Khorram, John Gideon, Melvin G. McInnis, Emily Mower Provost |
| 2016 | Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation. Chitralekha Bhat, Bhavik Vachhani, Sunil Kumar Kopparapu |
| 2016 | Recognition of Multiple Bird Species Based on Penalised Maximum Likelihood and HMM-Based Modelling of Individual Vocalisation Elements. Peter Jancovic, Münevver Köküer |
| 2016 | Recurrent Models for Auditory Attention in Multi-Microphone Distant Speech Recognition. Suyoun Kim, Ian R. Lane |
| 2016 | Recurrent Neural Network Language Model with Incremental Updated Context Information Generated Using Bag-of-Words Representation. Md. Akmal Haidar, Mikko Kurimo |
| 2016 | Recurrent Neural Network-Based Phoneme Sequence Estimation Using Multiple ASR Systems' Outputs for Spoken Term Detection. Naoki Sawada, Hiromitsu Nishizaki |
| 2016 | Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features. Taichi Asami, Ryo Masumura, Yushi Aono, Koichi Shinoda |
| 2016 | Redefining the Linguistic Context Feature Set for HMM and DNN TTS Through Position and Parsing. Rasmus Dall, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda |
| 2016 | Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction. Tara N. Sainath, Arun Narayanan, Ron J. Weiss, Ehsan Variani, Kevin W. Wilson, Michiel Bacchiani, Izhak Shafran |
| 2016 | Relating Estimated Cyclic Spectral Peak Frequency to Measured Epilarynx Length Using Magnetic Resonance Imaging. Elizabeth Godoy, Andrew Dumas, Jennifer Melot, Nicolas Malyska, Thomas F. Quatieri |
| 2016 | Relation of Automatically Extracted Formant Trajectories with Intelligibility Loss and Speaking Rate Decline in Amyotrophic Lateral Sclerosis. Rachelle L. Horwitz-Martin, Thomas F. Quatieri, Adam C. Lammert, James R. Williamson, Yana Yunusova, Elizabeth Godoy, Daryush D. Mehta, Jordan R. Green |
| 2016 | Relationships Between Functional Load and Auditory Confusability Under Different Speech Environments. Shinae Kang, Clara Cohen |
| 2016 | Relative Contributions of Amplitude and Phase to the Intelligibility Advantage of Ideal Binary Masked Sentences. Lei Wang, Shufeng Zhu, Diliang Chen, Yong Feng, Fei Chen |
| 2016 | Release from Energetic Masking Caused by Repeated Patterns of Glimpsing Windows. Maury Lander-Portnoy |
| 2016 | Remeeting - Deep Insights to Conversations. Allen Guo, Arlo Faria, Korbinian Riedhammer |
| 2016 | Representation Learning for Speech Emotion Recognition. Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer |
| 2016 | Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples. Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li |
| 2016 | Rescoring by Combination of Posteriorgram Score and Subword-Matching Score for Use in Query-by-Example. Masato Obara, Kazunori Kojima, Kazuyo Tanaka, Shi-wook Lee, Yoshiaki Itoh |
| 2016 | Respiratory Belts and Whistles: A Preliminary Study of Breathing Acoustics for Turn-Taking. Marcin Wlodarczak, Mattias Heldner |
| 2016 | Respiratory Turn-Taking Cues. Marcin Wlodarczak, Mattias Heldner |
| 2016 | Results of The 2015 NIST Language Recognition Evaluation. Hui Zhao, Désiré Bansé, George R. Doddington, Craig S. Greenberg, Jaime Hernandez-Cordero, John M. Howard, Lisa P. Mason, Alvin F. Martin, Douglas A. Reynolds, Elliot Singer, Audrey Tong |
| 2016 | Retrieval of Textual Song Lyrics from Sung Inputs. Anna M. Kruspe |
| 2016 | Retrieving Categorical Emotions Using a Probabilistic Framework to Define Preference Learning Samples. Reza Lotfian, Carlos Busso |
| 2016 | Reverberation-Robust One-Bit TDOA Based Moving Source Localization for Automatic Camera Steering. Sundar Harshavardhan, Gokul Deepak Manavalan, T. V. Sreenivas, Chandra Sekhar Seelamantula |
| 2016 | Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks. Huy Phan, Lars Hertel, Marco Maaß, Alfred Mertins |
| 2016 | Robust DNN-Based VAD Augmented with Phone Entropy Based Rejection of Background Speech. Yuya Fujita, Ken-ichi Iso |
| 2016 | Robust Detection of Multiple Bioacoustic Events with Repetitive Structures. Frank Kurth |
| 2016 | Robust Estimation of Fundamental Frequency Using Single Frequency Filtering Approach. Vishala Pannala, G. Aneeja, Sudarsana Reddy Kadiri, B. Yegnanarayana |
| 2016 | Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement. Atsunori Ogawa, Shogo Seki, Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Kazuya Takeda |
| 2016 | Robust Multichannel Gender Classification from Speech in Movie Audio. Naveen Kumar, Md. Nasir, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2016 | Robust Sound Event Detection in Continuous Audio Environments. Haomin Zhang, Ian McLoughlin, Yan Song |
| 2016 | Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech. Md. Sahidullah, Rosa González Hautamäki, Dennis Alexander Lehmann Thomsen, Tomi Kinnunen, Zheng-Hua Tan, Ville Hautamäki, Robert Parts, Martti Pitkänen |
| 2016 | Robust Speech Recognition Using Generalized Distillation Framework. Konstantin Markov, Tomoko Matsui |
| 2016 | Robust Vowel Landmark Detection Using Epoch-Based Features. Sri Harsha Dumpala, Bhanu Teja Nellore, Raghu Ram Nevali, Suryakanth V. Gangashetty, B. Yegnanarayana |
| 2016 | Robustness in Speech, Speaker, and Language Recognition: "You've Got to Know Your Limitations". John H. L. Hansen, Hynek Boril |
| 2016 | Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems. Spiros Georgiladakis, Georgia Athanasopoulou, Raveesh Meena, José Lopes, Arodami Chorianopoulou, Elisavet Palogiannidi, Elias Iosif, Gabriel Skantze, Alexandros Potamianos |
| 2016 | SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer. Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li |
| 2016 | SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms. Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li |
| 2016 | SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement. Szu-Wei Fu, Yu Tsao, Xugang Lu |
| 2016 | SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement. Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee |
| 2016 | STON: Efficient Subtitling in Dutch Using State-of-the-Art Tools. Lyan Verwimp, Brecht Desplanques, Kris Demuynck, Joris Pelemans, Marieke Lycke, Patrick Wambacq |
| 2016 | Sage: The New BBN Speech Processing Platform. Roger Hsiao, Ralf Meermeier, Tim Ng, Zhongqiang Huang, Maxwell Jordan, Enoch Kan, Tanel Alumäe, Jan Silovský, William Hartmann, Francis Keith, Omer Lang, Man-Hung Siu, Owen Kimball |
| 2016 | Segmental Recurrent Neural Networks for End-to-End Speech Recognition. Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals |
| 2016 | Segmented Dynamic Time Warping for Spoken Query-by-Example Search. Jorge Proença, Fernando Perdigão |
| 2016 | Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems. Pierre Lanchantin, Mark J. F. Gales, Penny Karanasou, Xunying Liu, Yanman Qian, Linlin Wang, Philip C. Woodland, Chao Zhang |
| 2016 | Self-Adaptive DNN for Improving Spoken Language Proficiency Assessment. Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft |
| 2016 | Semi-Coupled Dictionary Based Automatic Bandwidth Extension Approach for Enhancing Children's ASR. Ganji Sreeram, Rohit Sinha |
| 2016 | Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech. Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari |
| 2016 | Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks. Wonkyum Lee, Kyu Jeong Han, Ian R. Lane |
| 2016 | Semi-Supervised Training in Deep Learning Acoustic Model. Yan Huang, Yongqiang Wang, Yifan Gong |
| 2016 | Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions. Haihua Xu, Hang Su, Chongjia Ni, Xiong Xiao, Hao Huang, Eng Siong Chng, Haizhou Li |
| 2016 | Sensitivity of Quantitative RT-MRI Metrics of Vocal Tract Dynamics to Image Reconstruction Settings. Johannes Töger, Yongwan Lim, Sajan Goud Lingala, Shrikanth S. Narayanan, Krishna S. Nayak |
| 2016 | Sensorimotor Response to Visual Imagery of Tongue Displacement. William F. Katz, Divya Prabhakaran |
| 2016 | Sentence Boundary Detection Based on Parallel Lexical and Acoustic Models. Xiaoyin Che, Sheng Luo, Haojin Yang, Christoph Meinel |
| 2016 | Sequence Student-Teacher Training of Deep Neural Networks. Jeremy Heng Meng Wong, Mark J. F. Gales |
| 2016 | Sequence Summarizing Neural Networks for Spoken Language Recognition. Jan Pesán, Lukás Burget, Jan Cernocký |
| 2016 | Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding. Ngoc Thang Vu |
| 2016 | Sequential Recurrent Neural Networks for Language Modeling. Youssef Oualil, Clayton Greenberg, Mittul Singh, Dietrich Klakow |
| 2016 | Sharing Speech Synthesis Software for Research and Education Within Low-Tech and Low-Resource Communities. Andrew R. Plummer, Mary E. Beckman |
| 2016 | Short Utterance Variance Modelling and Utterance Partitioning for PLDA Speaker Verification. Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes, Ivan Himawan |
| 2016 | Silent-Speech Command Word Recognition Using Electro-Optical Stomatography. Simon Stone, Peter Birkholz |
| 2016 | Sincerity and Deception in Speech: Two Sides of the Same Coin? A Transfer- and Multi-Task Learning Perspective. Yue Zhang, Felix Weninger, Zhao Ren, Björn W. Schuller |
| 2016 | SingaKids-Mandarin: Speech Corpus of Singaporean Children Speaking Mandarin Chinese. Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li |
| 2016 | Singing Voice Synthesis Based on Deep Neural Networks. Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda |
| 2016 | Single-Channel Multi-Speaker Separation Using Deep Clustering. Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey |
| 2016 | Single-Channel Speech Enhancement Using Double Spectrum. Martin Blass, Pejman Mowlaee, W. Bastiaan Kleijn |
| 2016 | Sinusoidal Modelling for Ecoacoustics. Patrice Guyot, Alice Eldridge, Ying Chen Eyre-Walker, Alison Johnston, Thomas Pellegrini, Mika Peck |
| 2016 | Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition. Liang Lu, Steve Renals |
| 2016 | Sound Pattern Matching for Automatic Prosodic Event Detection. Milos Cernak, Afsaneh Asaei, Pierre-Edouard Honnet, Philip N. Garner, Hervé Bourlard |
| 2016 | SparkNG: Interactive MATLAB Tools for Introduction to Speech Production, Perception and Processing Fundamentals and Application of the Aliasing-Free L-F Model Component. Hideki Kawahara |
| 2016 | Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples' Therapy. Haoqi Li, Brian R. Baucom, Panayiotis G. Georgiou |
| 2016 | Speaker Age Classification and Regression Using i-Vectors. Joanna Grzybowska, Stanislaw Kacprzak |
| 2016 | Speaker Comparison for Forensic and Investigative Applications II. Jean-François Bonastre, Joseph P. Campbell, Anders P. Eriksson, Hirotaka Nakasone, Reva Schwartz |
| 2016 | Speaker Identity and Voice Quality: Modeling Human Responses and Automatic Speaker Recognition. Soo Jin Park, Caroline Sigouin, Jody Kreiman, Patricia A. Keating, Jinxi Guo, Gary Yeung, Fang-Yu Kuo, Abeer Alwan |
| 2016 | Speaker Linking and Applications Using Non-Parametric Hashing Methods. Douglas E. Sturim, William M. Campbell |
| 2016 | Speaker Normalization Through Feature Shifting of Linearly Transformed i-Vector. Jahyun Goo, Younggwan Kim, Hyungjun Lim, Hoirin Kim |
| 2016 | Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation. Fred Richardson, Michael S. Brandstein, Jennifer Melot, Douglas A. Reynolds |
| 2016 | Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis. Yi Zhao, Daisuke Saito, Nobuaki Minematsu |
| 2016 | Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features. Jinxi Guo, Gary Yeung, Deepak Muralidharan, Harish Arsikere, Amber Afshan, Abeer Alwan |
| 2016 | Speaker-Dependent Dictionary-Based Speech Enhancement for Text-Dependent Speaker Verification. Nicolai Bæk Thomsen, Dennis Alexander Lehmann Thomsen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen |
| 2016 | Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments. Guan-Lin Chao, William Chan, Ian R. Lane |
| 2016 | Speakers In The Wild (SITW): The QUT Speaker Recognition System. Houman Ghaemmaghami, Md. Hafizur Rahman, Ivan Himawan, David Dean, Ahilan Kanagasundaram, Sridha Sridharan, Clinton Fookes |
| 2016 | Spectral Enhancement of Cleft Lip and Palate Speech. Vikram C. M., Nagaraj Adiga, S. R. Mahadeva Prasanna |
| 2016 | Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks. Yu Gu, Zhen-Hua Ling, Li-Rong Dai |
| 2016 | Speech Emotion Recognition Using Affective Saliency. Arodami Chorianopoulou, Polychronis Koutsakis, Alexandros Potamianos |
| 2016 | Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi |
| 2016 | Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks. Anurag Kumar, Dinei A. F. Florêncio |
| 2016 | Speech Features for Depression Detection. Saurabh Sahu, Carol Y. Espy-Wilson |
| 2016 | Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank. Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani |
| 2016 | Speech Likability and Personality-Based Social Relations: A Round-Robin Analysis over Communication Channels. Laura Fernández Gallardo, Benjamin Weiss |
| 2016 | Speech Localisation in a Multitalker Mixture by Humans and Machines. Ning Ma, Guy J. Brown |
| 2016 | Speech Recognition in Alzheimer's Disease and in its Assessment. Luke Zhou, Kathleen C. Fraser, Frank Rudzicz |
| 2016 | Speech Reductions Cause a De-Weighting of Secondary Acoustic Cues. Léo Varnet, Fanny Meunier, Michel Hoen |
| 2016 | Speech Rhythm in Parkinson's Disease: A Study on Italian. Massimo Pettorino, Maria Grazia Busà, Elisa Pellegrino |
| 2016 | Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence. Bidisha Sharma, S. R. Mahadeva Prasanna |
| 2016 | Speech Ventures. Nicolas Scheffer, Korbinian Riedhammer, Alexandre Lebrun, David Suendermann-Oeft |
| 2016 | Speech-Based Detection of Alzheimer's Disease in Conversational German. Jochen Weiner, Christian Herff, Tanja Schultz |
| 2016 | Speed Perturbation and Vowel Duration Modeling for ASR in Hausa and Wolof Languages. Elodie Gauthier, Laurent Besacier, Sylvie Voisin |
| 2016 | Spoken Language Understanding in a Latent Topic-Based Subspace. Mohamed Morchid, Mohamed Bouaziz, Waad Ben Kheder, Killian Janod, Pierre-Michel Bousquet, Richard Dufour, Georges Linarès |
| 2016 | Stacked Long-Term TDNN for Spoken Language Recognition. Daniel Garcia-Romero, Alan McCree |
| 2016 | State-of-the-Art MRI Protocol for Comprehensive Assessment of Vocal Tract Structure and Function. Sajan Goud Lingala, Asterios Toutios, Johannes Töger, Yongwan Lim, Yinghua Zhu, Yoon-Chul Kim, Colin Vaz, Shrikanth S. Narayanan, Krishna S. Nayak |
| 2016 | Statistical Modeling of Speaker's Voice with Temporal Co-Location for Active Voice Authentication. Zhong Meng, Biing-Hwang Juang |
| 2016 | Stimulated Deep Neural Network for Speech Recognition. Chunyang Wu, Penny Karanasou, Mark J. F. Gales, Khe Chai Sim |
| 2016 | Subspace Detection of DNN Posterior Probabilities via Sparse Representation for Query by Example Spoken Term Detection. Dhananjay Ram, Afsaneh Asaei, Hervé Bourlard |
| 2016 | Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models. Lahiru Samarakoon, Khe Chai Sim |
| 2016 | Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering. Michael Heck, Sakriani Sakti, Satoshi Nakamura |
| 2016 | Supplementary Motor Area Activation in Disfluency Perception: An fMRI Study of Listener Neural Responses to Spontaneously Produced Unfilled and Filled Pauses. Robert Eklund, Martin Ingvar |
| 2016 | Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis. Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi |
| 2016 | Synthesis of Device-Independent Noise Corpora for Realistic ASR Evaluation. Hannes Gamper, Mark R. P. Thomas, Lyle Corbin, Ivan Tashev |
| 2016 | THU-EE System Description for NIST LRE 2015. Liang He, Yao Tian, Yi Liu, Jiaming Xu, Weiwei Liu, Cai Meng, Jia Liu |
| 2016 | TUSK: A Framework for Overviewing the Performance of F0 Estimators. Masanori Morise, Hideki Kawahara |
| 2016 | Talking to a System and Talking to a Human: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task. Hayakawa Akira, Saturnino Luz, Nick Campbell |
| 2016 | Talking with Kids Really Matters: Early Language Experience Shapes Later Life Chances. Anne Fernald |
| 2016 | Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus. Md. Jahangir Alam, Patrick Kenny, Vishwa Gupta |
| 2016 | Target-Based State and Tracking Algorithm for Spoken Dialogue System. Miao Li, Zhiyang He, Ji Wu |
| 2016 | Teaming Up: Making the Most of Diverse Representations for a Novel Personalized Speech Retrieval Application. Stephanie Pancoast, Murat Akbacak |
| 2016 | Temporal Envelopes in Sine-Wave Speech Recognition. Li Xu |
| 2016 | Text Dependent Speaker Verification Using Un-Supervised HMM-UBM and Temporal GMM-UBM. Achintya Kumar Sarkar, Zheng-Hua Tan |
| 2016 | Text-Available Speaker Recognition System for Forensic Applications. Chengzhu Yu, Chunlei Zhang, Finnian Kelly, Abhijeet Sangwan, John H. L. Hansen |
| 2016 | Text-Dependent Audiovisual Synchrony Detection for Spoofing Detection in Mobile Person Recognition. Amit Aides, Hagai Aronowitz |
| 2016 | Text-to-Speech for Individuals with Vision Loss: A User Study. Monika Podsiadlo, Shweta Chahar |
| 2016 | The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS. Kong-Aik Lee, Haizhou Li, Li Deng, Ville Hautamäki, Wei Rao, Xiong Xiao, Anthony Larcher, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Jianshu Chen, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Chenglin Xu, Haihua Xu, Bin Ma, Eng Siong Chng, Sylvain Meignier |
| 2016 | The 2016 Speakers in the Wild Speaker Recognition Evaluation. Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson |
| 2016 | The Acoustic Manifestation of Prominence in Stressless Languages. Angeliki Athanasopoulou, Irene Vogel |
| 2016 | The Acoustics of Lexical Stress in Italian as a Function of Stress Level and Speaking Style. Anders Eriksson, Pier Marco Bertinetto, Mattias Heldner, Rosalba Nodari, Giovanna Lenoci |
| 2016 | The Berkeley Phonetics Machine. Ronald L. Sprouse, Keith Johnson |
| 2016 | The Consistency and Stability of Acoustic and Visual Cues for Different Prosodic Attitudes. Jeesun Kim, Chris Davis |
| 2016 | The Deception Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | The Discourse Marker "so" in Turn-Taking and Turn-Releasing Behavior. Emma Rennie, Rebecca Lunsford, Peter A. Heeman |
| 2016 | The Effect of Background Noise on the Activation of Phonological and Semantic Information During Spoken-Word Recognition. Florian Hintz, Odette Scharenborg |
| 2016 | The Effect of Postlexical Deletion on Automatic Speech Recognition in Fast Spontaneously Spoken Zulu. Ewald van der Westhuizen, Thomas Niesler |
| 2016 | The Effect of Sentence Accent on Non-Native Speech Perception in Noise. Odette Scharenborg, Elea Kolkman, Sofoklis Kakouros, Brechtje Post |
| 2016 | The Effects of Modified Speech Styles on Intelligibility for Non-Native Listeners. Martin Cooke, María Luisa García Lecumberri |
| 2016 | The Effects of Prosody on French V-to-V Coarticulation: A Corpus-Based Study. Giuseppina Turco, Cécile Fougeron, Nicolas Audibert |
| 2016 | The Human Speech Cortex. Edward Chang |
| 2016 | The IBM 2016 English Conversational Telephone Speech Recognition System. George Saon, Tom Sercu, Steven J. Rennie, Hong-Kwang Jeff Kuo |
| 2016 | The IBM Speaker Recognition System: Recent Advances and Error Analysis. Seyed Omid Sadjadi, Jason W. Pelecanos, Sriram Ganapathy |
| 2016 | The INTERSPEECH 2016 Computational Paralinguistics Challenge: A Summary of Results. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | The Impact of Manner of Articulation on the Intelligibility of Voicing Contrast in Noise: Cross-Linguistic Implications. Mayuki Matsui |
| 2016 | The Influence of Language Experience on the Categorical Perception of Vowels: Evidence from Mandarin and Korean. Hao Zhang, Fei Chen, Nan Yan, Lan Wang, Feng Shi, Manwa L. Ng |
| 2016 | The Influence of Modality and Speaking Style on the Assimilation Type and Categorization Consistency of Non-Native Speech. Sarah E. Fenwick, Catherine T. Best, Chris Davis, Michael D. Tyler |
| 2016 | The Magic Stone: A Video Game to Improve Communication Skills of People with Intellectual Disabilities. Mario Corrales-Astorgano, David Escudero Mancebo, César González Ferreras, Yurena Gutiérrez-González, Valle Flores-Lucas, Valentín Cardeñoso-Payo, Lourdes Aguilar-Cuevas |
| 2016 | The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016. Kazuhiro Kobayashi, Shinnosuke Takamichi, Satoshi Nakamura, Tomoki Toda |
| 2016 | The Native Language Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis. Zhengqi Wen, Ya Li, Jianhua Tao |
| 2016 | The Perception of Overlapping Speech: Effects of Speaker Prosody and Listener Attitudes. Katherine Hilton |
| 2016 | The Perceptual Effect of L1 Prosody Transplantation on L2 Speech: The Case of French Accented German. Jeanin Jügler, Frank Zimmerer, Jürgen Trouvain, Bernd Möbius |
| 2016 | The Production of Intervocalic Glides in Non Dysarthric Parkinsonian Speech. Véronique Delvaux, Virginie Roland, Kathy Huet, Myriam Piccaluga, Marie-Claire Haelewyck, Bernard Harmegnies |
| 2016 | The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception. Wei Lai, Jiahong Yuan, Ya Li, Xiaoying Xu, Mark Y. Liberman |
| 2016 | The Role of Pitch in Punjabi Word Identification. Jasmeen Kanwal, Amanda Ritchart |
| 2016 | The Role of Spectral Resolution in Foreign-Accented Speech Perception. Michelle R. Kapolowicz, Vahid Montazeri, Peter F. Assmann |
| 2016 | The SIWIS Database: A Multilingual Speech Database with Acted Emphasis. Jean-Philippe Goldman, Pierre-Edouard Honnet, Robert A. J. Clark, Philip N. Garner, Maria Ivanova, Alexandros Lazaridis, Hui Liang, Tiago Macedo, Beat Pfister, Manuel Sam Ribeiro, Eric Wehrli, Junichi Yamagishi |
| 2016 | The SRI CLEO Speaker-State Corpus. Andreas Kathol, Elizabeth Shriberg, Massimiliano de Zambotti |
| 2016 | The SRI Speech-Based Collaborative Learning Corpus. Colleen Richey, Cynthia M. D'Angelo, Nonye Alozie, Harry Bratt, Elizabeth Shriberg |
| 2016 | The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation. Martin Graciarena, Luciana Ferrer, Vikramjit Mitra |
| 2016 | The Sheffield Wargame Corpus - Day Two and Day Three. Yulan Liu, Charles Fox, Madina Hasan, Thomas Hain |
| 2016 | The Sincerity Sub-Challenge: The Data. Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini |
| 2016 | The Sound of Disgust: How Facial Expression May Influence Speech Production. Chee Seng Chong, Jeesun Kim, Chris Davis |
| 2016 | The Speakers in the Wild (SITW) Speaker Recognition Database. Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson |
| 2016 | The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion. Ling-Hui Chen, Li-Juan Liu, Zhen-Hua Ling, Yuan Jiang, Li-Rong Dai |
| 2016 | The Unit of Speech Encoding: The Case of Romanian. Irene Vogel, Laura Spinu |
| 2016 | The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms. Víctor Poblete, Juan Pablo Escudero, Josué Fredes, José Novoa, Richard M. Stern, Simon King, Néstor Becerra Yoma |
| 2016 | The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions. Emma Jokinen, Ulpu Remes, Paavo Alku |
| 2016 | The Voice Conversion Challenge 2016. Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, Junichi Yamagishi |
| 2016 | TheanoLM - An Extensible Toolkit for Neural Network Language Modeling. Seppo Enarvi, Mikko Kurimo |
| 2016 | Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking. Dhananjaya N. Gowda, Paavo Alku |
| 2016 | Today's Most Frequently Used F0 Estimation Methods, and Their Accuracy in Estimating Male and Female Pitch in Clean Speech. Sofia Strömbergsson |
| 2016 | Tone Classification in Mandarin Chinese Using Convolutional Neural Networks. Charles Chen, Razvan C. Bunescu, Li Xu, Chang Liu |
| 2016 | Toward Development and Evaluation of Pain Level-Rating Scale for Emergency Triage based on Vocal Characteristics and Facial Expressions. Fu-Sheng Tsai, Ya-Ling Hsu, Wei-Chen Chen, Yi-Ming Weng, Chip-Jin Ng, Chi-Chun Lee |
| 2016 | Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis. Cheung-Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng, Haizhou Li |
| 2016 | Towards Automatic Detection of Amyotrophic Lateral Sclerosis from Speech Acoustic and Articulatory Samples. Jun Wang, Prasanna V. Kothalkar, Beiming Cao, Daragh Heitzman |
| 2016 | Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances. Catharine Oertel, Joakim Gustafson, Alan W. Black |
| 2016 | Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks. Ying Zhang, Mohammad Pezeshki, Philémon Brakel, Saizheng Zhang, César Laurent, Yoshua Bengio, Aaron C. Courville |
| 2016 | Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine. Bo-Hsiang Tseng, Sheng-syun Shen, Hung-yi Lee, Lin-Shan Lee |
| 2016 | Towards Minimally Invasive Velar State Detection in Normal and Silent Speech. Peter Birkholz, Petko Bakardjiev, Steffen Kürbis, Rico Petrick |
| 2016 | Towards Online-Recognition with Deep Bidirectional LSTM Acoustic Models. Albert Zeyer, Ralf Schlüter, Hermann Ney |
| 2016 | Towards Smart-Cars That Can Listen: Abnormal Acoustic Event Detection on the Road. Mahesh Kumar Nandwana, Taufiq Hasan |
| 2016 | Towards an Automated Screening Tool for Developmental Speech and Language Impairments. Jen J. Gong, Maryann Gong, Dina Levy-Lambert, Jordan R. Green, Tiffany P. Hogan, John V. Guttag |
| 2016 | Tracking Contours of Orofacial Articulators from Real-Time MRI of Speech. Mathieu Labrunie, Pierre Badin, Dirk Voit, Arun A. Joseph, Laurent Lamalle, Coriandre Vilain, Louis-Jean Boë, Jens Frahm |
| 2016 | Transfer Learning for Speaker Verification on Short Utterances. Qingyang Hong, Lin Li, Lihong Wan, Jun Zhang, Feng Tong |
| 2016 | Transfer Learning with Bottleneck Feature Networks for Whispered Speech Recognition. Boon Pang Lim, Faith Wong, Yuyao Li, Jia Wei Bay |
| 2016 | Transferring Emphasis in Speech Translation Using Hard-Attentional Neural Network Models. Quoc Truong Do, Sakriani Sakti, Graham Neubig, Satoshi Nakamura |
| 2016 | Triphone State-Tying via Deep Canonical Correlation Analysis. Weiran Wang, Hao Tang, Karen Livescu |
| 2016 | Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification. Jianbo Ma, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Kong-Aik Lee |
| 2016 | Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features. Nauman Dawalatabad, Srikanth R. Madikeri, C. Chandra Sekhar, Hema A. Murthy |
| 2016 | Two-Stage Data Augmentation for Low-Resourced Speech Recognition. William Hartmann, Tim Ng, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz |
| 2016 | Two-Stage Temporal Processing for Single-Channel Speech Enhancement. Suman Samui, Indrajit Chakrabarti, Soumya Kanti Ghosh |
| 2016 | Uncontrolled Manifolds in Vowel Production: Assessment with a Biomechanical Model of the Tongue. Andrew Szabados, Pascal Perrier |
| 2016 | Understanding Periodically Interrupted Mandarin Speech. Jing Liu, Rosanna H. N. Tong, Fei Chen |
| 2016 | Undoing Misperceptions: A Microscopic Analysis of Consistent Confusions Through Signal Modifications. Máté Attila Tóth, Martin Cooke |
| 2016 | Unipolar Depression vs. Bipolar Disorder: An Elicitation-Based Approach to Short-Term Detection of Mood Disorder. Kun-Yi Huang, Chung-Hsien Wu, Yu-Ting Kuo, Fong-Lin Jang |
| 2016 | Unit-Selection Attack Detection Based on Unfiltered Frequency-Domain Features. Ulrich Scherhag, Andreas Nautsch, Christian Rathgeb, Christoph Busch |
| 2016 | Universal Background Sparse Coding and Multilayer Bootstrap Network for Speaker Clustering. Xiao-Lei Zhang |
| 2016 | Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC. Yimeng Zhuang, Xuankai Chang, Yanmin Qian, Kai Yu |
| 2016 | Unsupervised Adaptation of Recurrent Neural Network Language Models. Siva Reddy Gangireddy, Pawel Swietojanski, Peter Bell, Steve Renals |
| 2016 | Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection. Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li |
| 2016 | Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs for Speech Recognition. Hardik B. Sailor, Hemant A. Patil |
| 2016 | Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition. Satoshi Tsujioka, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura |
| 2016 | Unsupervised Learning of Acoustic Units Using Autoencoders and Kohonen Nets. Vikramjit Mitra, Dimitra Vergyri, Horacio Franco |
| 2016 | Unsupervised Phoneme Segmentation of Previously Unseen Languages. Marco Vetter, Markus Müller, Fatima Hamlaoui, Graham Neubig, Satoshi Nakamura, Sebastian Stüker, Alex Waibel |
| 2016 | Unsupervised Stress Information Labeling Using Gaussian Process Latent Variable Model for Statistical Speech Synthesis. Decha Moungsri, Tomoki Koriyama, Takao Kobayashi |
| 2016 | Use of Agreement/Disagreement Classification in Dyadic Interactions for Continuous Emotion Recognition. Hossein Khaki, Engin Erzin |
| 2016 | Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition. Erfan Loweimi, Jon Barker, Thomas Hain |
| 2016 | Use of Vowels in Discriminating Speech-Laugh from Laughter and Neutral Speech. Sri Harsha Dumpala, P. Gangamohan, Suryakanth V. Gangashetty, B. Yegnanarayana |
| 2016 | Using Clinician Annotations to Improve Automatic Speech Recognition of Stuttered Speech. Peter A. Heeman, Rebecca Lunsford, Andy McMillin, J. Scott Yaruss |
| 2016 | Using Past Speaker Behavior to Better Predict Turn Transitions. Tomer Meshorer, Peter A. Heeman |
| 2016 | Using Phonologically Weighted Levenshtein Distances for the Prediction of Microscopic Intelligibility. Lionel Fontan, Isabelle Ferrané, Jérôme Farinas, Julien Pinquier, Xavier Aumont |
| 2016 | Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks. Lauri Juvela, Xin Wang, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi, Paavo Alku |
| 2016 | Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure. Jinfu Ni, Yoshinori Shiga, Hisashi Kawai |
| 2016 | Using a Biomechanical Model and Articulatory Data for the Numerical Production of Vowels. Saeed Dabbaghchian, Marc Arnela, Olov Engwall, Oriol Guasch, Ian Stavness, Pierre Badin |
| 2016 | Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus. Tomi Kinnunen, Md. Sahidullah, Ivan Kukanov, Héctor Delgado, Massimiliano Todisco, Achintya Kumar Sarkar, Nicolai Bæk Thomsen, Ville Hautamäki, Nicholas W. D. Evans, Zheng-Hua Tan |
| 2016 | Variation in Spoken North Sami Language. Kristiina Jokinen, Trung Ngo Trong, Ville Hautamäki |
| 2016 | Velum Control for Oral Sounds. Reed Blaylock, Louis Goldstein, Shrikanth S. Narayanan |
| 2016 | Virtual Adversarial Training Applied to Neural Higher-Order Factors for Phone Classification. Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf |
| 2016 | Virtual Machines and Containers as a Platform for Experimentation. Florian Metze, Eric Riebling, Anne S. Warlaumont, Elika Bergelson |
| 2016 | Visual Speech Synthesis Using Dynamic Visemes, Contextual Features and DNNs. Ausdang Thangthai, Ben Milner, Sarah Taylor |
| 2016 | Vocal Effort Modification for Singing Synthesis. Olivier Perrotin, Christophe d'Alessandro |
| 2016 | Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion. Ganesh Sivaraman, Vikramjit Mitra, Hosung Nam, Mark K. Tiede, Carol Y. Espy-Wilson |
| 2016 | Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features. Yi Yang, Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu |
| 2016 | Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance. Naoki Hosaka, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda |
| 2016 | Voice Quality Control Using Perceptual Expressions for Statistical Parametric Speech Synthesis Based on Cluster Adaptive Training. Yamato Ohtani, Koichiro Mori, Masahiro Morita |
| 2016 | Voice-Quality Difference Between the Vowels in Filled Pauses and Ordinary Lexical Items. Kikuo Maekawa, Hiroki Mori |
| 2016 | Voting Detector: A Combination of Anomaly Detectors to Reveal Annotation Errors in TTS Corpora. Jindrich Matousek, Daniel Tihelka |
| 2016 | Vowel Characteristics in the Assessment of L2 English Pronunciation. Calbert Graham, Paula Buttery, Francis Nolan |
| 2016 | Vowel Fundamental and Formant Frequency Contributions to English and Mandarin Sentence Intelligibility. Daniel Fogerty, Fei Chen |
| 2016 | Vowels and Diphthongs in Cangnan Southern Min Chinese Dialect. Fang Hu, Chunyu Ge |
| 2016 | Vowels and Diphthongs in the Taiyuan Jin Chinese Dialect. Liping Xia, Fang Hu |
| 2016 | Waveform Generation Based on Signal Reshaping for Statistical Parametric Speech Synthesis. Felipe Espic, Cassia Valentini-Botinhao, Zhizheng Wu, Simon King |
| 2016 | Web Data Selection Based on Word Embedding for Low-Resource Speech Recognition. Chuandong Xie, Wu Guo, Guoping Hu, Junhua Liu |
| 2016 | Who Do You Think Will Speak Next? Perception of Turn-Taking Cues in Slovak and Argentine Spanish. Agustín Gravano, Pablo Brusco, Stefan Benus |
| 2016 | Why do ASR Systems Despite Neural Nets Still Depend on Robust Features. Angel Mario Castro Martinez, Marc René Schädler |
| 2016 | Within-Speaker Features for Native Language Recognition in the Interspeech 2016 Computational Paralinguistics Challenge. Mark A. Huckvale |
| 2016 | Word-Phrase-Entity Recurrent Neural Networks for Language Modeling. Michael Levit, Sarangarajan Parthasarathy, Shuangyu Chang |
| 2016 | YIN-Bird: Improved Pitch Tracking for Bird Vocalisations. Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte |
| 2016 | Zara: An Empathetic Interactive Virtual Agent. Pascale Fung, Anik Dey, Farhad Bin Siddique, Ruixi Lin, Yang Yang, Yan Wan, Ricky Ho Yin Chan |
| 2016 | i-Vector/HMM Based Text-Dependent Speaker Verification System for RedDots Challenge. Hossein Zeinali, Hossein Sameti, Lukás Burget, Jan Cernocký, Nooshin Maghsoodi, Pavel Matejka |
| 2016 | webASR 2 - Improved Cloud Based Speech Technology. Thomas Hain, Jeremy Christian, Oscar Saz, Salil Deena, Madina Hasan, Raymond W. M. Ng, Rosanna Milner, Mortaza Doulaty, Yulan Liu |