INTERSPEECH A

819 papers

Year Title / Authors
2016/r/ as Language Marker in Bilingual Speech Production and Perception.
Constantijn Kaland, Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti
2016 17th Annual Conference of the International Speech Communication Association, Interspeech 2016, San Francisco, CA, USA, September 8-12, 2016
Nelson Morgan
2016A 50-Year Retrospective on Speech and Language Processing.
John Makhoul
2016A Class-Specific Speech Enhancement for Phoneme Recognition: A Dictionary Learning Approach.
Nazreen P. M., A. G. Ramakrishnan, Prasanta Kumar Ghosh
2016A Convex Model for Linguistic Influence in Group Conversations.
Kan Kawabata, Visar Berisha, Anna Scaglione, Amy LaCross
2016A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement.
Ziteng Wang, Xu Li, Xiaofei Wang, Qiang Fu, Yonghong Yan
2016A DNN-HMM Approach to Story Segmentation.
Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li
2016A Deep Learning Approach to Modeling Empathy in Addiction Counseling.
James Gibson, Dogan Can, Bo Xiao, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016A Divide-and-Conquer Approach for Language Identification Based on Recurrent Neural Networks.
Gregory Gelly, Jean-Luc Gauvain, Viet Bac Le, Abdelkhalek Messaoudi
2016A Fast and Accurate Fundamental Frequency Estimator Using Recursive Moving Average Filters.
Ryunosuke Daido, Yuji Hisaminato
2016A Feature Normalisation Technique for PLLR Based Language Identification Systems.
Sarith Fernando, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016A Feature Study for Masking-Based Reverberant Speech Separation.
Masood Delfarah, DeLiang Wang
2016A Framework for Automated Marmoset Vocalization Detection and Classification.
Alan Wisler, Laura J. Brattain, Rogier Landman, Thomas F. Quatieri
2016A Framework for Practical Multistream ASR.
Sri Harish Reddy Mallidi, Hynek Hermansky
2016A French Corpus for Distant-Microphone Speech Processing in Real Homes.
Nancy Bertin, Ewen Camberlein, Emmanuel Vincent, Romain Lebarbenchon, Stéphane Peillon, Éric Lamande, Sunit Sivasankaran, Frédéric Bimbot, Irina Illina, Ariane Tom, Sylvain Fleury, Éric Jamet
2016A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks.
Takenori Yoshimura, Gustav Eje Henter, Oliver Watts, Mirjam Wester, Junichi Yamagishi, Keiichi Tokuda
2016A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training.
Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura
2016A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences.
Feng-Long Xie, Frank K. Soong, Haifeng Li
2016A Longitudinal Study of Children's Intonation in Narrative Speech.
Jeffrey Kallay, Melissa A. Redford
2016A Low Cost Desktop Robot and Tele-Presence Device for Interactive Speech Research.
Michael C. Brady
2016A Multimodal Dialogue System for Air Traffic Control Trainees Based on Discrete-Event Simulation.
Lubos Smídl, Adam Chýlek, Jan Svec
2016A New Model for Acoustic Wave Propagation and Scattering in the Vocal Tract.
Jianguo Wei, Wendan Guan, Darcy Q. Hou, Dingyi Pan, Wenhuan Lu, Jianwu Dang
2016A New Model of Speech Motor Control Based on Task Dynamics and State Feedback.
Vikram Ramanarayanan, Benjamin Parrell, Louis Goldstein, Srikantan S. Nagarajan, John F. Houde
2016A New Pre-Training Method for Training Deep Learning Models with Application to Spoken Language Understanding.
Asli Celikyilmaz, Ruhi Sarikaya, Dilek Hakkani-Tür, Xiaohu Liu, Nikhil Ramesh, Gökhan Tür
2016A Nonparametric Bayesian Approach for Spoken Term Detection by Example Query.
Amir Hossein Harati Nejad Torbati, Joseph Picone
2016A Novel Discriminative Score Calibration Method for Keyword Search.
Zhiqiang Lv, Meng Cai, Wei-Qiang Zhang, Jia Liu
2016A Novel Research to Artificial Bandwidth Extension Based on Deep BLSTM Recurrent Neural Networks and Exemplar-Based Sparse Representation.
Bin Liu, Jianhua Tao
2016A Novel Risk-Estimation-Theoretic Framework for Speech Enhancement in Nonstationary and Non-Gaussian Noise Conditions.
Jishnu Sadasivan, Chandra Sekhar Seelamantula
2016A Phase-Based Time-Frequency Masking for Multi-Channel Speech Enhancement in Domestic Environments.
Alessio Brutti, Antigoni Tsiami, Athanasios Katsamanis, Petros Maragos
2016A Portable Automatic PA-TA-KA Syllable Detection System to Derive Biomarkers for Neurological Disorders.
Fei Tao, Louis Daudet, Christian Poellabauer, Sandra L. Schneider, Carlos Busso
2016A Praat-Based Algorithm to Extract the Amplitude Envelope and Temporal Fine Structure Using the Hilbert Transform.
Lei He, Volker Dellwo
2016A Preliminary Ultrasound Study of Nasal and Lateral Coronals in Arrernte.
Marija Tabain, Richard Beare
2016A Real-Time Framework for Visual Feedback of Articulatory Data Using Statistical Shape Models.
Kristy James, Alexander Hewer, Ingmar Steiner, Stefanie Wuhrer
2016A Real-Time Parametric General-Purpose Mammalian Vocal Synthesiser.
Roger K. Moore
2016A Robust Dual-Microphone Speech Source Localization Algorithm for Reverberant Environments.
Yanmeng Guo, Xiaofei Wang, Chao Wu, Qiang Fu, Ning Ma, Guy J. Brown
2016A Robust Non-Parametric and Filtering Based Approach for Glottal Closure Instant Detection.
Pradeep Rengaswamy, Gurunath Reddy M., K. Sreenivasa Rao, Pallab Dasgupta
2016A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems.
Layla El Asri, Jing He, Kaheer Suleman
2016A Sparse Spherical Harmonic-Based Model in Subbands for Head-Related Transfer Functions.
Xiaoke Qi, Jianhua Tao
2016A Speaker Diarization System for Studying Peer-Led Team Learning Groups.
Harishchandra Dubey, Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen
2016A Speaker Recognition System for the SITW Challenge.
Oleg Kudashev, Sergey Novoselov, Konstantin Simonchik, Alexander Kozlov
2016A Spectral Modulation Sensitivity Weighted Pre-Emphasis Filter for Active Noise Control System.
Kah-Meng Cheong, Yuh-Yuan Wang, Tai-Shih Chi
2016A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding.
Vedran Vukotic, Christian Raymond, Guillaume Gravier
2016A Stochastic Model for Computer-Aided Human-Human Dialogue.
Merwan Barlier, Romain Laroche, Olivier Pietquin
2016A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs.
Srikanth Ronanki, Gustav Eje Henter, Zhizheng Wu, Simon King
2016A Voice Conversion Mapping Function Based on a Stacked Joint-Autoencoder.
Seyed Hamidreza Mohammadi, Alexander Kain
2016A WFST Framework for Single-Pass Multi-Stream Decoding.
Sirui Xu, Eric Fosler-Lussier
2016A priori SNR Estimation Using a Generalized Decision Directed Approach.
Aleksej Chinaev, Reinhold Haeb-Umbach
2016ARET - Automatic Reading of Educational Texts for Visually Impaired Students.
Martin Gruber, Jindrich Matousek, Zdenek Hanzlícek, Zdenek Krnoul, Zbynek Zajíc
2016ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks.
Miguel Ángel del Agua, Santiago Piqueras, Adrià Giménez, Alberto Sanchís, Jorge Civera, Alfons Juan
2016ASR for South Slavic Languages Developed in Almost Automated Way.
Jan Nouza, Radek Safarík, Petr Cerva
2016AUT System for SITW Speaker Recognition Challenge.
Abbas Khosravani, Mohammad Mehdi Homayounpour
2016Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features.
Yishan Jiao, Ming Tu, Visar Berisha, Julie M. Liss
2016Acoustic Analysis of Syllables Across Indian Languages.
Anusha Prakash, Jeena J. Prakash, Hema A. Murthy
2016Acoustic Differences Between English /t/ Glottalization and Phrasal Creak.
Marc Garellek, Scott Seyfarth
2016Acoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units.
Markus Nußbaum-Thom, Jia Cui, Bhuvana Ramabhadran, Vaibhava Goel
2016Acoustic Modelling from the Signal Domain Using CNNs.
Pegah Ghahremani, Vimal Manohar, Daniel Povey, Sanjeev Khudanpur
2016Acoustic Properties of Formality in Conversational Japanese.
Ethan Sherr-Ziarko
2016Acoustic Word Embeddings for ASR Error Detection.
Sahar Ghannay, Yannick Estève, Nathalie Camelin, Paul Deléglise
2016Acoustic and Visual Analysis of Expressive Speech: A Case Study of French Acted Speech.
Slim Ouni, Vincent Colotte, Sara Dahmani, Soumaya Azzi
2016Acoustic-Prosodic and Turn-Taking Features in Interactions with Children with Neurodevelopmental Disorders.
Daniel Bone, Somer Bishop, Rahul Gupta, Sungbok Lee, Shrikanth S. Narayanan
2016Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model.
Patrick Lumban Tobing, Tomoki Toda, Hirokazu Kameoka, Satoshi Nakamura
2016Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models.
Thomas Drugman, Janne Pylkkönen, Reinhard Kneser
2016Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations.
Tasha Nagamine, Zhuo Chen, Nima Mesgarani
2016Adaptive Group Sparsity for Non-Negative Matrix Factorization with Application to Unsupervised Source Separation.
Xu Li, Ziteng Wang, Xiaofei Wang, Qiang Fu, Yonghong Yan
2016Adaptive Latency for Part-of-Speech Tagging in Incremental Text-to-Speech Synthesis.
Maël Pouget, Olha Nahorna, Thomas Hueber, Gérard Bailly
2016Advances in Very Deep Convolutional Neural Networks for LVCSR.
Tom Sercu, Vaibhava Goel
2016Adversarial Multi-Task Learning of Deep Neural Networks for Robust Speech Recognition.
Yusuke Shinohara
2016An Acoustic Analysis of /r/ in Tyrolean.
Vincenzo Galatà, Lorenzo Spreafico, Alessandro Vietti, Constantijn Kaland
2016An Acoustic Analysis of Child-Child and Child-Robot Interactions for Understanding Engagement during Speech-Controlled Computer Games.
Theodora Chaspari, Jill Fain Lehman
2016An Adaptive Multi-Band System for Low Power Voice Command Recognition.
Qing He, Gregory W. Wornell, Wei Ma
2016An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging.
Aurore Jaumard-Hakoun, Kele Xu, Clémence Leboullenger, Pierre Roussel-Ragot, Bruce Denby
2016An Automatic Training Tool for Air Traffic Control Training.
Petr Stanislav, Lubos Smídl, Jan Svec
2016An Engine for Online Video Search in Large Archives of the Holocaust Testimonies.
Petr Stanislav, Jan Svec, Pavel Ircing
2016An Expectation Maximization Approach to Joint Modeling of Multidimensional Ratings Derived from Multiple Annotators.
Anil Ramakrishna, Rahul Gupta, Ruth B. Grossman, Shrikanth S. Narayanan
2016An Improved 3D Geometric Tongue Model.
Qiang Fang, Yun Chen, Haibo Wang, Jianguo Wei, Jianrong Wang, Xiyu Wu, Aijun Li
2016An Interaural Magnification Algorithm for Enhancement of Naturally-Occurring Level Differences.
Shadi Pirhosseinloo, Kostas Kokkinakis
2016An Investigation of DNN-Based Speech Synthesis Using Speaker Codes.
Nobukatsu Hojo, Yusuke Ijima, Hideyuki Mizuno
2016An Investigation of Deep Neural Network Architectures for Language Recognition in Indian Languages.
Mounika K. V., Sivanand Achanta, Lakshmi H. R., Suryakanth V. Gangashetty, Anil Kumar Vuppala
2016An Investigation of Emotional Speech in Depression Classification.
Brian Stasak, Julien Epps, Nicholas Cummins, Roland Goecke
2016An Investigation of Recurrent Neural Network Architectures Using Word Embeddings for Phrase Break Prediction.
Anandaswarup Vadapalli, Suryakanth V. Gangashetty
2016An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions.
Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li
2016An Investigation on Training Deep Neural Networks Using Probabilistic Transcriptions.
Amit Das, Mark Hasegawa-Johnson
2016An Investigation on the Use of i-Vectors for Robust ASR.
Dimitrios Dimitriadis, Samuel Thomas, Sriram Ganapathy
2016An Iterative Phase Recovery Framework with Phase Mask for Spectral Mapping with an Application to Speech Enhancement.
Kehuang Li, Bo Wu, Chin-Hui Lee
2016An Objective Evaluation Methodology for Blind Bandwidth Extension.
Stéphane Villette, Sen Li, Pravin Ramadas, Daniel J. Sinder
2016Analysis of Chinese Syllable Durations in Running Speech of Japanese L2 Learners.
Yue Sun, Shudon Hsiao, Yoshinori Sagisaka, Jinsong Zhang
2016Analysis of Face Mask Effect on Speaker Recognition.
Rahim Saeidi, Ilkka Huhtakallio, Paavo Alku
2016Analysis of Glottal Stop in Assam Sora Language.
Sishir Kalita, Luke Horo, Priyankoo Sarmah, S. R. Mahadeva Prasanna, Samarendra Dandapat
2016Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages.
Van Hai Do, Nancy F. Chen, Boon Pang Lim, Mark Hasegawa-Johnson
2016Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features.
Ozlem Kalinli
2016Analysis of Speaker Recognition Systems in Realistic Scenarios of the SITW 2016 Challenge.
Ondrej Novotný, Pavel Matejka, Oldrich Plchot, Ondrej Glembek, Lukás Burget, Jan Cernocký
2016Analysis of the Voice Conversion Challenge 2016 Evaluation Results.
Mirjam Wester, Zhizheng Wu, Junichi Yamagishi
2016Analysis on Gated Recurrent Unit Based Question Detection Approach.
Yaodong Tang, Zhiyong Wu, Helen M. Meng, Mingxing Xu, Lianhong Cai
2016Analytical Assessment of Dual-Stream Merging for Noise-Robust ASR.
Louis ten Bosch, Bert Cranen, Yang Sun
2016Analyzing Temporal Dynamics of Dyadic Synchrony in Affective Interactions.
Zhaojun Yang, Shrikanth S. Narayanan
2016Analyzing the Contribution of Top-Down Lexical and Bottom-Up Acoustic Cues in the Detection of Sentence Prominence.
Sofoklis Kakouros, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, Okko Räsänen
2016Analyzing the Relation Between Overall Quality and the Quality of Individual Phases in a Telephone Conversation.
Friedemann Köster, Sebastian Möller
2016Anchored Speech Detection.
Roland Maas, Sree Hari Krishnan Parthasarathi, Brian John King, Ruitong Huang, Björn Hoffmeister
2016Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016.
Fernando Villavicencio, Junichi Yamagishi, Jordi Bonada, Felipe Espic
2016Articulation Rate Filtering of CQCC Features for Automatic Speaker Verification.
Massimiliano Todisco, Héctor Delgado, Nicholas W. D. Evans
2016Articulation Rate in Adverse Listening Conditions in Younger and Older Adults.
Outi Tuomainen, Valérie Hazan
2016Articulatory Feature Extraction Using CTC to Build Articulatory Classifiers Without Forced Frame Alignments for Speech Recognition.
Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy
2016Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data.
Asterios Toutios, Tanner Sorensen, Krishna Somandepalli, Rachel Alexander, Shrikanth S. Narayanan
2016Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks.
Zheng-Chen Liu, Zhen-Hua Ling, Li-Rong Dai
2016Artificial Neural Network-Based Feature Combination for Spatial Voice Activity Detection.
Stefan Meier, Walter Kellermann
2016Assessing Idiosyncrasies in a Bayesian Model of Speech Communication.
Marie-Lou Barnaud, Julien Diard, Pierre Bessière, Jean-Luc Schwartz
2016Assessing Level-Dependent Segmental Contribution to the Intelligibility of Speech Processed by Single-Channel Noise-Suppression Algorithms.
Tian Guan, Guangxing Chu, Fei Chen, Feng Yang
2016Assessing Speech Quality in Speech-Aware Hearing Aids Based on Phoneme Posteriorgrams.
Constantin Spille, Hendrik Kayser, Hynek Hermansky, Bernd T. Meyer
2016At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech.
Maximilian Schmitt, Fabien Ringeval, Björn W. Schuller
2016Attention Assisted Discovery of Sub-Utterance Structure in Speech Emotion Recognition.
Che-Wei Huang, Shrikanth S. Narayanan
2016Attention-Based Convolutional Neural Networks for Sentence Classification.
Zhiwei Zhao, Youzheng Wu
2016Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling.
Bing Liu, Ian R. Lane
2016Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder.
Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee
2016Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings.
Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, Alexandros Potamianos
2016Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss.
Yuki Takashima, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
2016Audio-to-Visual Speech Conversion Using Deep Neural Networks.
Sarah Taylor, Akihiro Kato, Iain A. Matthews, Ben P. Milner
2016Audiovisual Speech Scene Analysis in the Context of Competing Sources.
Attigodu C. Ganesh, Frédéric Berthommier, Jean-Luc Schwartz
2016Audiovisual Training Effects for Japanese Children Learning English /r/-/l/.
Yasuaki Shinohara
2016Auditory Processing Impairments Under Background Noise in Children with Non-Syndromic Cleft Lip and/or Palate.
Yang Feng, Zhang Lu
2016Auditory-Visual Lexical Tone Perception in Thai Elderly Listeners with and without Hearing Impairment.
Benjawan Kasisopa, Chutamanee Onsuwan, Charturong Tantibundhit, Nittayapa Klangpornkun, Suparak Techacharoenrungrueang, Sudaporn Luksaneeyanawin, Denis Burnham
2016Auditory-Visual Perception of VCVs Produced by People with Down Syndrome: Preliminary Results.
Alexandre Hennequin, Amélie Rochet-Capellan, Marion Dohen
2016Automated Pause Insertion for Improved Intelligibility Under Reverberation.
Petko Nikolov Petkov, Norbert Braunschweiler, Yannis Stylianou
2016Automated Screening of Speech Development Issues in Children by Identifying Phonological Error Patterns.
Lauren Ward, Alessandro Stefani, Daniel V. Smith, Andreas Duenser, Jill Freyne, Barbara Dodd, Angela Morgan
2016Automatic Analysis of Phonetic Speech Style Dimensions.
Neville Ryant, Mark Y. Liberman
2016Automatic Analysis of Typical and Atypical Encoding of Spontaneous Emotion in the Voice of Children.
Fabien Ringeval, Erik Marchi, Charline Grossard, Jean Xavier, Mohamed Chetouani, David Cohen, Björn W. Schuller
2016Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners.
Shuju Shi, Yosuke Kashiwagi, Shohei Toyama, Junwei Yue, Yutaka Yamauchi, Daisuke Saito, Nobuaki Minematsu
2016Automatic Classification of Lexical Stress in English and Arabic Languages Using Deep Learning.
Mostafa Ali Shahin, Julien Epps, Beena Ahmed
2016Automatic Classification of Phonation Modes in Singing Voice: Towards Singing Style Characterisation and Application to Ethnomusicological Recordings.
Jean-Luc Rouas, Leonidas Ioannidis
2016Automatic Correction of ASR Outputs by Using Machine Translation.
Luis Fernando D'Haro, Rafael E. Banchs
2016Automatic Detection of Parkinson's Disease Based on Modulated Vowels.
Daria Hemmerling, Juan Rafael Orozco-Arroyave, Andrzej Skalski, Janusz Gajda, Elmar Nöth
2016Automatic Dialect Detection in Arabic Broadcast Speech.
Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James R. Glass, Peter Bell, Steve Renals
2016Automatic Discrimination of Soft Voice Onset Using Acoustic Features of Breathy Voicing.
Keiko Ochi, Koichi Mori, Naomi Sakai, Nobutaka Ono
2016Automatic Estimation of Perceived Sincerity from Spoken Language.
Brandon M. Booth, Rahul Gupta, Pavlos Papadopoulos, Ruchir Travadi, Shrikanth S. Narayanan
2016Automatic Genre and Show Identification of Broadcast Media.
Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain
2016Automatic Glottal Inverse Filtering with Non-Negative Matrix Factorization.
Manu Airaksinen, Lauri Juvela, Tom Bäckström, Paavo Alku
2016Automatic Measurement of Voice Onset Time and Prevoicing Using Recurrent Neural Networks.
Yossi Adi, Joseph Keshet, Olga Dmitrieva, Matthew Goldrick
2016Automatic Paragraph Segmentation with Lexical and Prosodic Features.
Catherine Lai, Mireia Farrús, Johanna D. Moore
2016Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures.
Ju Lin, Yanlu Xie, Jinsong Zhang
2016Automatic Pronunciation Generation by Utilizing a Semi-Supervised Deep Neural Networks.
Naoya Takahashi, Tofigh Naghibi, Beat Pfister
2016Automatic Recognition of Social Roles Using Long Term Role Transitions in Small Group Interactions.
Gaurav Fotedar, Aditya Gaonkar P., Saikat Chatterjee, Prasanta Kumar Ghosh
2016Automatic Scoring of Monologue Video Interviews Using Multimodal Cues.
Lei Chen, Gary Feng, Michelle P. Martin-Raugh, Chee Wee Leong, Christopher Kitchen, Su-Youn Yoon, Blair Lehman, Harrison Kell, Chong Min Lee
2016Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka.
Amit Das, Preethi Jyothi, Mark Hasegawa-Johnson
2016Automatic Speech Transcription for Low-Resource Languages - The Case of Yoloxóchitl Mixtec (Mexico).
Vikramjit Mitra, Andreas Kathol, Jonathan D. Amith, Rey Castillo García
2016Automatically Classifying Self-Rated Personality Scores from Speech.
Guozhen An, Sarah Ita Levitan, Rivka Levitan, Andrew Rosenberg, Michelle Levine, Julia Hirschberg
2016Bayesian Modeling in Speech Motor Control: A Principled Structure for the Integration of Various Constraints.
Jean-François Patri, Pascal Perrier, Julien Diard
2016Behavioral Coding of Therapist Language in Addiction Counseling Using Recurrent Neural Networks.
Bo Xiao, Dogan Can, James Gibson, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016Bertsokantari: a TTS Based Singing Synthesis System.
Eder del Blanco, Inma Hernáez, Eva Navas, Xabier Sarasola, Daniel Erro
2016Better Evaluation of ASR in Speech Translation Context Using Word Embeddings.
Ngoc-Tien Le, Christophe Servan, Benjamin Lecouteux, Laurent Besacier
2016Between- and Within-Speaker Effects of Bilingualism on F0 Variation.
Rob Voigt, Dan Jurafsky, Meghan Sumner
2016Beyond Utterance Extraction: Summary Recombination for Speech Summarization.
Jérémy Trione, Benoît Favre, Frédéric Béchet
2016Bidirectional Recurrent Neural Network with Attention Mechanism for Punctuation Restoration.
Ottokar Tilk, Tanel Alumäe
2016Bird Song Synthesis Based on Hidden Markov Models.
Jordi Bonada, Robert Lachlan, Merlijn Blaauw
2016Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs.
Mahdie Karbasi, Ahmed Hussen Abdelaziz, Hendrik Meutzner, Dorothea Kolossa
2016Blind Recovery of Perceptual Models in Distributed Speech and Audio Coding.
Tom Bäckström, Florin Ghido, Johannes Fischer
2016Blind Speech Separation with GCC-NMF.
Sean U. N. Wood, Jean Rouat
2016CNN-Based Phone Segmentation Experiments in a Less-Represented Language.
Céline Manenti, Thomas Pellegrini, Julien Pinquier
2016Call Alternation Between Specific Pairs of Male Frogs Revealed by a Sound-Imaging Method in Their Natural Habitat.
Ikkyu Aihara, Takeshi Mizumoto, Hiromitsu Awano, Hiroshi G. Okuno
2016Can Intensive Exposure to Foreign Language Sounds Affect the Perception of Native Sounds?
Jian Gong, María Luisa García Lecumberri, Martin Cooke
2016Categorization of Natural Spanish Whistled Vowels by Naïve Spanish Listeners.
Julien Meyer, Laure Dentel, Fanny Meunier
2016Causal Speech Enhancement Combining Data-Driven Learning and Suppression Rule Estimation.
Seyedmahdad Mirsamadi, Ivan Tashev
2016Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance.
Cristina Guerrero, Georgina Tryfou, Maurizio Omologo
2016Characterization of Audiovisual Dramatic Attitudes.
Adela Barbulescu, Rémi Ronfard, Gérard Bailly
2016Characterizing Vocal Tract Dynamics Across Speakers Using Real-Time MRI.
Tanner Sorensen, Asterios Toutios, Louis Goldstein, Shrikanth S. Narayanan
2016Classification of Voice Modality Using Electroglottogram Waveforms.
Michal Borsky, Daryush D. Mehta, Julius P. Gudjohnsen, Jón Guðnason
2016Closing Remarks.
Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016CloudCAST - Remote Speech Technology for Speech Professionals.
Phil D. Green, Ricard Marxer, Stuart P. Cunningham, Heidi Christensen, Frank Rudzicz, Maria Yancheva, André Coy, Massimiliano Malavasi, Lorenzo Desideri, Fabio Tamburini
2016Coda Stop and Taiwan Min Checked Tone Sound Changes.
Ho-Hsien Pan, Hsiao-tung Huang, Shao-Ren Lyu
2016Colloquialising Modern Standard Arabic Text for Improved Speech Recognition.
Sarah Al-Shareef, Thomas Hain
2016Combining Acoustic-Prosodic, Lexical, and Phonotactic Features for Automatic Deception Detection.
Sarah Ita Levitan, Guozhen An, Min Ma, Rivka Levitan, Andrew Rosenberg, Julia Hirschberg
2016Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition.
Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai
2016Combining Data-Oriented and Process-Oriented Approaches to Modeling Reaction Time Data.
Louis ten Bosch, Lou Boves, Mirjam Ernestus
2016Combining Energy and Cross-Entropy Analysis for Nuclear Segments Detection.
Antonio Origlia, Francesco Cutugno
2016Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition.
Salil Deena, Madina Hasan, Mortaza Doulaty, Oscar Saz, Thomas Hain
2016Combining Mask Estimates for Single Channel Audio Source Separation Using Deep Neural Networks.
Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley
2016Combining Non-Pathological Data of Different Language Varieties to Improve DNN-HMM Performance on Pathological Speech.
Emre Yilmaz, Mario Ganzeboom, Catia Cucchiarini, Helmer Strik
2016Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection.
Axel Horndasch, Anton Batliner, Caroline Kaufhold, Elmar Nöth
2016Combining State-Level Spotting and Posterior-Based Acoustic Match for Improved Query-by-Example Spoken Term Detection.
Shuji Oishi, Tatsuya Matsuba, Mitsuaki Makino, Atsuhiko Kai
2016Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting.
Raymond W. M. Ng, Bhusan Chettri, Thomas Hain
2016Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition.
Shiliang Zhang, Hui Jiang, Shifu Xiong, Si Wei, Li-Rong Dai
2016Comparing Articulatory and Acoustic Strategies for Reducing Non-Native Accents.
Sandesh Aryal, Ricardo Gutierrez-Osuna
2016Comparing Different Methods for Analyzing ERP Signals.
Kimberley Mulder, Louis ten Bosch, Lou Boves
2016Comparing the Contributions of Amplitude and Phase to Speech Intelligibility in a Vocoder-Based Speech Synthesis Model.
Fei Chen, Benson C. L. Chiao
2016Comparing the Influence of Spectro-Temporal Integration in Computational Speech Segregation.
Thomas Bentsen, Tobias May, Abigail A. Kressner, Torsten Dau
2016Comparison of Multiple System Combination Techniques for Keyword Spotting.
William Hartmann, Le Zhang, Kerri Barnes, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz
2016Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling.
Ehsan Variani, Tara N. Sainath, Izhak Shafran, Michiel Bacchiani
2016Complexity in Prosody: A Nonlinear Dynamical Systems Approach for Dyadic Conversations; Behavior and Outcomes in Couples Therapy.
Md. Nasir, Brian R. Baucom, Shrikanth S. Narayanan, Panayiotis G. Georgiou
2016Compositional Neural Network Language Models for Agglutinative Languages.
Ebru Arisoy, Murat Saraclar
2016Computational Approaches to Linguistic Code Switching.
Mona T. Diab, Pascale Fung, Julia Hirschberg, Thamar Solorio
2016Conditional Random Fields for the Tunisian Dialect Grapheme-to-Phoneme Conversion.
Abir Masmoudi, Mariem Ellouze, Fethi Bougares, Yannick Estève, Lamia Hadrich Belguith
2016Congruency Effect Between Articulation and Grasping in Native English Speakers.
Mikko Tiainen, Fatima M. Felisberti, Kaisa Tiippana, Martti Vainio, Juraj Simko, Jirí Lukavský, Lari Vainio
2016Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models.
Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Yoshioka, Dung T. Tran, Tomohiro Nakatani
2016Context Aware Mispronunciation Detection for Mandarin Pronunciation Training.
Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li
2016Context-Aware Restaurant Recommendation for Natural Language Queries: A Formative User Study in the Automotive Domain.
Philipp Fischer, Cornelius Styp von Rekowski, Andreas Nürnberger
2016Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs.
Chiori Hori, Takaaki Hori, Shinji Watanabe, John R. Hershey
2016Contextual Prediction Models for Speech Recognition.
Yoni Halpern, Keith B. Hall, Vlad Schogol, Michael Riley, Brian Roark, Gleb Skobeltsyn, Martin Bäuml
2016Conversational Engagement Recognition Using Auditory and Visual Cues.
Yuyun Huang, Emer Gilmartin, Nick Campbell
2016Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data.
Colin Vaz, Asterios Toutios, Shrikanth S. Narayanan
2016Convolutional Neural Networks with Data Augmentation for Classifying Speakers' Native Language.
Gil Keren, Jun Deng, Jouni Pohjalainen, Björn W. Schuller
2016Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition.
Vikramjit Mitra, Horacio Franco
2016Corpora for the Evaluation of Robust Speaker Recognition Systems.
Douglas E. Sturim, Pedro A. Torres-Carrasquillo, Joseph P. Campbell
2016Cost Effective Acoustic Monitoring of Bird Species.
Ciira Wa Maina
2016Couples Behavior Modeling and Annotation Using Low-Resource LSTM Language Models.
Shao-Yen Tseng, Sandeep Nallan Chakravarthula, Brian R. Baucom, Panayiotis G. Georgiou
2016Cross-Cultural Depression Recognition from Vocal Biomarkers.
Sharifa Alghowinem, Roland Goecke, Julien Epps, Michael Wagner, Jeffrey F. Cohn
2016Cross-Database Evaluation of Audio-Based Spoofing Detection Systems.
Pavel Korshunov, Sébastien Marcel
2016Cross-Gender and Cross-Dialect Tone Recognition for Vietnamese.
Antje Schweitzer, Ngoc Thang Vu
2016Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data.
Seyyed Saeed Sarfjoo, Cenk Demiroglu
2016DBN-ivector Framework for Acoustic Emotion Recognition.
Rui Xia, Yang Liu
2016DNN Online with iVectors Acoustic Modeling and Doc2Vec Distributed Representations for Improving Automated Speech Scoring.
Jidong Tao, Lei Chen, Chong Min Lee
2016DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification.
Zeyan Oo, Yuta Kawakami, Longbiao Wang, Seiichi Nakagawa, Xiong Xiao, Masahiro Iwahashi
2016DNN-Based Automatic Speech Recognition as a Model for Human Phoneme Perception.
Mats Exter, Bernd T. Meyer
2016DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition.
Kang Hyun Lee, Tae Gyoon Kang, Woo Hyun Kang, Nam Soo Kim
2016DNN-Based Speaker Clustering for Speaker Diarisation.
Rosanna Milner, Thomas Hain
2016DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data.
Neethu Mariam Joy, Murali Karthick Baskar, Srinivasan Umesh, Basil Abraham
2016Data Augmentation Using Multi-Input Multi-Output Source Separation for Deep Neural Network Based Acoustic Modeling.
Yusuke Fujita, Ryoichi Takashima, Takeshi Homma, Masahito Togami
2016Data Selection and Adaptation for Naturalness in HMM-Based Speech Synthesis.
Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg
2016Data Selection by Sequence Summarizing Neural Network in Mismatch Condition Training.
Katerina Zmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukás Burget, Jan Cernocký
2016Data Selection for Within-Class Covariance Estimation.
Elliot Singer, Tyler Campbell, Douglas A. Reynolds
2016Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Huaiping Ming, Dong-Yan Huang, Lei Xie, Jie Wu, Minghui Dong, Haizhou Li
2016Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments.
Amr El-Desoky Mousa, Björn W. Schuller
2016Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition.
Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool
2016Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention.
Dong Yu, Wayne Xiong, Jasha Droppo, Andreas Stolcke, Guoli Ye, Jinyu Li, Geoffrey Zweig
2016Deep Neural Network Based Acoustic-to-Articulatory Inversion Using Phone Sequence Information.
Xurong Xie, Xunying Liu, Lan Wang
2016Deep Neural Network Bottleneck Features for Acoustic Event Recognition.
Seongkyu Mun, Suwon Shon, Wooil Kim, Hanseok Ko
2016Deep Neural Network Frontend for Continuous EMG-Based Speech Recognition.
Michael Wand, Jürgen Schmidhuber
2016Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale.
Simin Xie, Nan Yan, Ping Yu, Manwa L. Ng, Lan Wang, Zhuanzhuan Ji
2016Deep Neural Networks for i-Vector Language Identification of Short Utterances in Cars.
Omid Ghahabi, Antonio Bonafonte, Javier Hernando, Asunción Moreno
2016Deep Stacked Autoencoders for Spoken Language Understanding.
Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori
2016Defining Emotionally Salient Regions Using Qualitative Agreement Method.
Srinivas Parthasarathy, Carlos Busso
2016Deriving Phonetic Transcriptions and Discovering Word Segmentations for Speech-to-Speech Translation in Low-Resource Settings.
Andrew Wilkinson, Tiancheng Zhao, Alan W. Black
2016Detecting Mild Cognitive Impairment from Spontaneous Speech by Correlation-Based Phonetic Feature Selection.
Gábor Gosztolya, László Tóth, Tamás Grósz, Veronika Vincze, Ildikó Hoffmann, Gréta Szatlóczki, Magdolna Pákáski, János Kálmán
2016Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees.
Wei Li, Kehuang Li, Sabato Marco Siniscalchi, Nancy F. Chen, Chin-Hui Lee
2016Detection of Total Syllables and Canonical Syllables in Infant Vocalizations.
Anne S. Warlaumont, Heather L. Ramsdell-Hudock
2016Detection of User Escalation in Human-Computer Interactions.
Ian Beaver, Cynthia Freeman
2016Determining Native Language and Deception Using Phonetic Features and Classifier Combination.
Gábor Gosztolya, Tamás Grósz, Róbert Busa-Fekete, László Tóth
2016Development of Mandarin Onset-Rime Detection in Relation to Age and Pinyin Instruction.
Fei Chen, Nan Yan, Xunan Huang, Hao Zhang, Lan Wang, Gang Peng
2016Diagnosing People with Dementia Using Automatic Conversation Analysis.
Bahman Mirheidari, Daniel Blackburn, Markus Reuber, Traci Walker, Heidi Christensen
2016Dialogue Session Segmentation by Embedding-Enhanced TextTiling.
Yiping Song, Lili Mou, Rui Yan, Li Yi, Zinan Zhu, Xiaohua Hu, Ming Zhang
2016Differential Effects of Velopharyngeal Dysfunction on Speech Intelligibility During Early and Late Stages of Amyotrophic Lateral Sclerosis.
Panying Rong, Yana Yunusova, Jordan R. Green
2016Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination.
Reima Karhila, Aku Rouhe, Peter Smit, André Mansikkaniemi, Heini Kallio, Erik Lindroos, Raili Hildén, Martti Vainio, Mikko Kurimo
2016Diphthongization of Nuclear Vowels and the Emergence of a Tetraphthong in Hetang Cantonese.
Wenqi Hu, Fang Hu, Jian Jin
2016Direct Expressive Voice Training Based on Semantic Selection.
Igor Jauk, Antonio Bonafonte
2016Directly Comparing the Listening Strategies of Humans and Machines.
Michael I. Mandel
2016Discriminative Layered Nonnegative Matrix Factorization for Speech Separation.
Chung-Chien Hsu, Tai-Shih Chi, Jen-Tzung Chien
2016Discussion.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016Discussion.
Dayana Ribas, Emmanuel Vincent, John H. L. Hansen, Emma Jokinen, Mirco Ravanelli, Hannes Gamper, Fred Richardson
2016Discussion.
Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016Disentrainment may be a Positive Thing: A Novel Measure of Unsigned Acoustic-Prosodic Synchrony, and its Relation to Speaker Engagement.
Juan Manuel Pérez, Ramiro H. Gálvez, Agustín Gravano
2016Disfluency Detection Using a Bidirectional LSTM.
Vicky Zayats, Mari Ostendorf, Hannaneh Hajishirzi
2016Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition.
Yevgen Chebotar, Austin Waters
2016Do GMM Phoneme Classifiers Perceive Synthetic Sibilants as Humans Do?
Gábor Pintér, Hiroki Watanabe
2016Do Listeners Learn Better from Natural Speech?
Michael McAuliffe, Molly Babel, Charlotte Vaughn
2016Does Auditory-Motor Learning of Speech Transfer from the CV Syllable to the CVCV Word?
Tiphaine Caudrelier, Pascal Perrier, Jean-Luc Schwartz, Amélie Rochet-Capellan
2016Does She Speak RTT? Towards an Earlier Identification of Rett Syndrome Through Intelligent Pre-Linguistic Vocalisation Analysis.
Florian B. Pokorny, Peter B. Marschik, Christa Einspieler, Björn W. Schuller
2016Does the Importance of Word-Initial and Word-Final Information Differ in Native versus Non-Native Spoken-Word Recognition?
Odette Scharenborg, Juul Coumans, Sofoklis Kakouros, Roeland van Hout
2016Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings.
Masayuki Suzuki, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran, George Saon
2016Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding.
Aaron Jaech, Larry P. Heck, Mari Ostendorf
2016Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR.
Sebastian Gergen, Steffen Zeiler, Ahmed Hussen Abdelaziz, Robert M. Nickel, Dorothea Kolossa
2016Dynamic Transcription for Low-Latency Speech Translation.
Jan Niehues, Thai Son Nguyen, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Müller, Matthias Sperber, Sebastian Stüker, Alex Waibel
2016Dysarthric Speech Recognition Using Kullback-Leibler Divergence-Based Hidden Markov Model.
Myung Jong Kim, Jun Wang, Hoirin Kim
2016EVS Channel Aware Mode Robustness to Frame Erasures.
Anssi Rämö, Antti Kurittu, Henri Toukomaa
2016Effect of Noise on Lexical Tone Perception in Cantonese-Speaking Amusics.
Jing Shao, Caicai Zhang, Gang Peng, Yike Yang, William S.-Y. Wang
2016Effectiveness of Near-End Speech Enhancement Under Equal-Loudness and Equal-Level Constraints.
Tudor-Catalin Zorila, Sheila Flanagan, Brian C. J. Moore, Yannis Stylianou
2016Effects of Cochlear Hearing Loss on the Benefits of Ideal Binary Masking.
Vahid Montazeri, Shaikat Hossain, Peter F. Assmann
2016Effects of L1 Phonotactic Constraints on L2 Word Segmentation Strategies.
Tamami Katayama
2016Effects of Stress on Fricatives: Evidence from Standard Modern Greek.
Charalambos Themistocleous, Angelandria Savva, Andrie Aristodemou
2016Effects of Subglottal-Coupling and Interdental-Space on Formant Trajectories During Front-to-Back Vowel Transitions in Chinese.
Shuanglin Fan, Kiyoshi Honda, Jianwu Dang, Hui Feng
2016Effects of Urgent Speech and Preceding Sounds on Speech Intelligibility in Noisy and Reverberant Environments.
Nao Hodoshima
2016Efficient Segmental Cascades for Speech Recognition.
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
2016Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling.
Sittipong Saychum, Sarawoot Kongyoung, Anocha Rugchatjaroen, Patcharika Chootrakool, Sawit Kasuriya, Chai Wutiwiwatchai
2016Emergence of Vocal Developmental Sequences in a Predictive Coding Model of Speech Acquisition.
Shamima Najnin, Bonny Banerjee
2016End-to-End Language Identification Using Attention-Based Recurrent Neural Networks.
Wang Geng, Wenfu Wang, Yuanyuan Zhao, Xinyuan Cai, Bo Xu
2016End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding.
Yun-Nung Chen, Dilek Hakkani-Tür, Gökhan Tür, Jianfeng Gao, Li Deng
2016English Language Speech Assistant.
Xavier Anguera, Vu Van
2016Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.
Xin Wang, Shinji Takaki, Junichi Yamagishi
2016Enhanced Harmonic Content and Vocal Note Based Predominant Melody Extraction from Vocal Polyphonic Music Signals.
Gurunath Reddy M., K. Sreenivasa Rao
2016Enhancement of Automatic Oral Presentation Assessment System Using Latent N-Grams Word Representation and Part-of-Speech Information.
Wen-Yu Huang, Shan-Wen Hsiao, Hung-Ching Sun, Ming-Chuan Hsieh, Ming-Hsueh Tsai, Chi-Chun Lee
2016Enhancing Data-Driven Phone Confusions Using Restricted Recognition.
Mark Kane, Julie Carson-Berndsen
2016Enhancing Multilingual Recognition of Emotion in Speech by Language Identification.
Hesam Sagha, Pavel Matejka, Maryna Gavryukova, Filip Povolný, Erik Marchi, Björn W. Schuller
2016Entropy Based Pruning for Non-Negative Matrix Based Language Models with Contextual Features.
Barlas Oguz, Issac Alphonso, Shuangyu Chang
2016Entropy Coding of Spectral Envelopes for Speech and Audio Coding Using Distribution Quantization.
Srikanth Korse, Tobias Jähnel, Tom Bäckström
2016Error Correction in Lightly Supervised Alignment of Broadcast Subtitles.
Julia Olcoz, Oscar Saz, Thomas Hain
2016Estimating the Sincerity of Apologies in Speech by DNN Rank Learning and Prosodic Analysis.
Gábor Gosztolya, Tamás Grósz, György Szaszák, László Tóth
2016Estimation of Children's Physical Characteristics from Their Voices.
Jill Fain Lehman, Rita Singh
2016Evaluation of Phonatory Behavior of German and French Speakers in Native and Non-Native Speech.
Manfred Pützer, Frank Zimmerer, Wolfgang Wokurek, Jeanin Jügler
2016Evaluation of Singing Synthesis: Methodology and Case Study with Concatenative and Performative Systems.
Lionel Feugère, Christophe d'Alessandro, Samuel Delalez, Luc Ardaillon, Axel Roebel
2016Evaluation of a Phone-Based Anomaly Detection Approach for Dysarthric Speech.
Imed Laaridh, Corinne Fredouille, Christine Meunier
2016Exemplar Dynamics in Phonetic Convergence of Speech Rate.
Antje Schweitzer, Michael Walsh
2016Experiences with Shared Resources for Research and Education in Speech and Language Processing.
Rebecca Bates, Eric Fosler-Lussier, Florian Metze, Martha A. Larson, Gina-Anne Levow, Emily Mower Provost
2016Experimental Validation of Sound Generated from Flow in Simplified Vocal Tract Model of Sibilant /s/.
Tsukasa Yoshinaga, Kazunori Nozaki, Shigeo Wada
2016Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition.
Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass
2016Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition.
Ruizhi Li, Sri Harish Reddy Mallidi, Lukás Burget, Oldrich Plchot, Najim Dehak
2016Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers.
Alberto Abad, Eugénio Ribeiro, Fábio N. Kepler, Ramón Fernandez Astudillo, Isabel Trancoso
2016Exploring Collections of Multimedia Archives Through Innovative Interfaces in the Context of Digital Humanities.
Géraldine Damnati, Delphine Charlet, Marc Denjean
2016Exploring Session Variability and Template Aging in Speaker Verification for Fixed Phrase Short Utterances.
Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna
2016Exploring Word Mover's Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization.
Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu
2016Exploring the Correlation of Pitch Accents and Semantic Slots for Spoken Language Understanding.
Sabrina Stehwien, Ngoc Thang Vu
2016Expressive Control of Singing Voice Synthesis Using Musical Contexts and a Parametric F0 Model.
Luc Ardaillon, Celine Chabot-Canet, Axel Roebel
2016Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016.
Jordi Bonada, Martí Umbert, Merlijn Blaauw
2016Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data.
Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai
2016F
Xiaoyun Wang, Xugang Lu, Hisashi Kawai, Seiichi Yamamoto
2016F0 Development in Acquiring Korean Stop Distinction.
Gayeon Son
2016Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks.
Zixing Zhang, Fabien Ringeval, Jing Han, Jun Deng, Erik Marchi, Björn W. Schuller
2016Factor Analysis Based Speaker Normalisation for Continuous Emotion Prediction.
Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016Factor Analysis Based Speaker Verification Using ASR.
Hang Su, Steven Wegmann
2016Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions.
Dung T. Tran, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani
2016Factors Affecting the Intelligibility of Sine-Wave Speech.
Fei Chen, Daniel Fogerty
2016Far-Field ASR Without Parallel Data.
Vijayaditya Peddinti, Vimal Manohar, Yiming Wang, Daniel Povey, Sanjeev Khudanpur
2016Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices.
Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemyslaw Szczepaniak
2016Feature Learning and Automatic Segmentation for Dolphin Communication Analysis.
Daniel Kohlsdorf, Denise Herzing, Thad Starner
2016Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection.
Rubén Zazo, Tara N. Sainath, Gabor Simko, Carolina Parada
2016Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection.
Martin Heckmann
2016First Step Towards End-to-End Parametric TTS Synthesis: Generating Spectral Parameters with Neural Attention.
Wenfu Wang, Shuang Xu, Bo Xu
2016Flexible, Rapid Authoring of Goal-Orientated, Multi-Turn Dialogues Using the Task Completion Platform.
Alex Marin, Paul A. Crook, Omar Zia Khan, Vasiliy Radostev, Khushboo Aggarwal, Ruhi Sarikaya
2016Formant Estimation and Tracking Using Deep Learning.
Yehoshua Dissen, Joseph Keshet
2016Frequency Estimation from Waveforms Using Multi-Layered Neural Networks.
Prateek Verma, Ronald W. Schafer
2016Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks.
Heysem Kaya, Alexey A. Karpov
2016Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech.
Vikramjit Mitra, Julien van Hout, Wen Wang, Chris Bartels, Horacio Franco, Dimitra Vergyri, Abeer Alwan, Adam Janin, John H. L. Hansen, Richard M. Stern, Abhijeet Sangwan, Nelson Morgan
2016Future Context Attention for Unidirectional LSTM Based Acoustic Model.
Jian Tang, Shiliang Zhang, Si Wei, Li-Rong Dai
2016GMM-Free Flat Start Sequence-Discriminative DNN Training.
Gábor Gosztolya, Tamás Grósz, László Tóth
2016Gating Recurrent Enhanced Memory Neural Networks on Language Identification.
Wang Geng, Yuanyuan Zhao, Wenfu Wang, Xinyuan Cai, Bo Xu
2016Generalized Discriminant Analysis (GDA) for Improved i-Vector Based Speaker Recognition.
Fahimeh Bahmaninezhad, John H. L. Hansen
2016Generalizing Steady State Suppression for Enhanced Intelligibility Under Reverberation.
Petko Nikolov Petkov, Yannis Stylianou
2016Generating Complementary Acoustic Model Spaces in DNN-Based Sequence-to-Frame DTW Scheme for Out-of-Vocabulary Spoken Term Detection.
Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh
2016Generating Gestural Scores from Acoustics Through a Sparse Anchor-Based Representation of Speech.
Christopher Liberatore, Ricardo Gutierrez-Osuna
2016Generating Natural Video Descriptions via Multimodal Processing.
Qin Jin, Junwei Liang, Xiaozhu Lin
2016Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy.
Zhenhao Ge, Aravind Ganapathiraju, Ananth N. Iyer, Scott A. Randal, Felix I. Wyss
2016Generation of Emotion Control Vector Using MDS-Based Space Transformation for Expressive Speech Synthesis.
Yan-You Chen, Chung-Hsien Wu, Yu-Fong Huang
2016Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine.
Toru Nakashika, Yasuhiro Minami
2016Glimpse-Based Metrics for Predicting Speech Intelligibility in Additive Noise Conditions.
Yan Tang, Martin Cooke
2016Glimpsing Predictions for Natural and Vocoded Sentence Intelligibility During Modulation Masking: Effect of the Glimpse Cutoff Criterion.
Bobby Gibbs II, Daniel Fogerty
2016GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis.
Manu Airaksinen, Bajibabu Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, Paavo Alku
2016Glottal Squeaks in VC Sequences.
Mísa Hejná, Pertti Palo, Scott Moisik
2016HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors.
Tomi Kinnunen, Alexey Sholokhov, Elie Khoury, Dennis Alexander Lehmann Thomsen, Md. Sahidullah, Zheng-Hua Tan
2016HMM-Based Non-Native Accent Assessment Using Posterior Features.
Ramya Rasipuram, Milos Cernak, Mathew Magimai-Doss
2016HMM-Based Speech Enhancement Using Sub-Word Models and Noise Adaptation.
Akihiro Kato, Ben P. Milner
2016Head Motion Generation with Synthetic Speech: A Data Driven Approach.
Najmeh Sadoughi, Carlos Busso
2016Hierarchical Classification of Speaker and Background Noise and Estimation of SNR Using Sparse Representation.
K. V. Vijay Girish, A. G. Ramakrishnan, T. V. Ananthapadmanabha
2016Highlighting Psychological Features for Predicting Child Interjections During Story Telling.
Gaël Lejeune, François Rioult, Bruno Crémilleux
2016How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models.
Suman V. Ravuri, Steven Wegmann
2016Hybrid Accelerated Optimization for Speech Recognition.
Jen-Tzung Chien, Pei-Wen Huang, Tan Lee
2016Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues.
Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, Kai Yu
2016Hyperarticulated Production of Korean Glides by Age Group.
Seung-Eun Chang, Minsook Kim
2016Identifying Hearing Loss from Learned Speech Kernels.
Shamima Najnin, Bonny Banerjee, Lisa Lucks Mendel, Masoumeh Heidari Kapourchali, Jayanta Kumar Dutta, Sungmin Lee, Chhayakanta Patro, Monique Pousson
2016Identifying Perceptually Similar Voices with a Speaker Recognition System Using Auto-Phonetic Features.
Finnian Kelly, Anil Alexander, Oscar Forth, Samuel Kent, Jonas Lindh, Joel Åkesson
2016Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN.
Blaise Potard, Matthew P. Aylett, David A. Baude, Petr Motlícek
2016Illustrating the Production of the International Phonetic Alphabet Sounds Using Fast Real-Time Magnetic Resonance Imaging.
Asterios Toutios, Sajan Goud Lingala, Colin Vaz, Jangwon Kim, John H. Esling, Patricia A. Keating, Matthew Gordon, Dani Byrd, Louis Goldstein, Krishna S. Nayak, Shrikanth S. Narayanan
2016Impaired Categorical Perception of Mandarin Tones and its Relationship to Language Ability in Autism Spectrum Disorders.
Fei Chen, Nan Yan, Xiaojie Pan, Feng Yang, Zhuanzhuan Ji, Lan Wang, Gang Peng
2016Implementing Acoustic-Prosodic Entrainment in a Conversational Avatar.
Rivka Levitan, Stefan Benus, Ramiro H. Gálvez, Agustín Gravano, Florencia Savoretti, Marián Trnka, Andreas Weise, Julia Hirschberg
2016Improved Depiction of Tissue Boundaries in Vocal Tract Real-Time MRI Using Automatic Off-Resonance Correction.
Yongwan Lim, Sajan Goud Lingala, Asterios Toutios, Shrikanth S. Narayanan, Krishna S. Nayak
2016Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks.
Hakan Erdogan, John R. Hershey, Shinji Watanabe, Michael I. Mandel, Jonathan Le Roux
2016Improved Multilingual Training of Stacked Neural Network Acoustic Models for Low Resource Languages.
Tanel Alumäe, Stavros Tsakalidis, Richard M. Schwartz
2016Improved Music Genre Classification with Convolutional Neural Networks.
Weibin Zhang, Wenkang Lei, Xiangmin Xu, Xiaofeng Xing
2016Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition.
Imran A. Sheikh, Irina Illina, Dominique Fohr, Georges Linarès
2016Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling.
Gakuto Kurata, Brian Kingsbury
2016Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis.
Eunwoo Song, Frank K. Soong, Hong-Goo Kang
2016Improved a priori SAP Estimator in Complex Noisy Environment for Dual Channel Microphone System.
Youna Ji, Young-Cheol Park
2016Improving Automatic Recognition of Aphasic Speech with AphasiaBank.
Duc Le, Emily Mower Provost
2016Improving Boundary Estimation in Audiovisual Speech Activity Detection Using Bayesian Information Criterion.
Fei Tao, John H. L. Hansen, Carlos Busso
2016Improving Children's Speech Recognition Through Out-of-Domain Data Augmentation.
Joachim Fainberg, Peter Bell, Mike Lincoln, Steve Renals
2016Improving Deep Neural Networks Based Speaker Verification Using Unlabeled Data.
Yao Tian, Meng Cai, Liang He, Wei-Qiang Zhang, Jia Liu
2016Improving English Conversational Telephone Speech Recognition.
Ivan Medennikov, Alexey Prudnikov, Alexander Zatvornitskiy
2016Improving Generalisation to New Speakers in Spoken Dialogue State Tracking.
Iñigo Casanueva, Thomas Hain, Phil D. Green
2016Improving Large Vocabulary Accented Mandarin Speech Recognition with Attribute-Based I-Vectors.
Hao Zheng, Shanshan Zhang, Liwei Qiao, Jianping Li, Wenju Liu
2016Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach.
Yibin Zheng, Ya Li, Zhengqi Wen, Xingguang Ding, Jianhua Tao
2016Improving TTS with Corpus-Specific Pronunciation Adaptation.
Marie Tahon, Raheel Qader, Gwénolé Lecorvé, Damien Lolive
2016Improving Under-Resourced Language ASR Through Latent Subword Unit Space Discovery.
Marzieh Razavi, Mathew Magimai-Doss
2016Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features.
Abraham Woubie, Jordi Luque, Javier Hernando
2016Improving the Lwazi ASR Baseline.
Charl Johannes van Heerden, Neil Kleynhans, Marelie H. Davel
2016Improving the Probabilistic Framework for Representing Dialogue Systems with User Response Model.
Miao Li, Zhipeng Chen, Ji Wu
2016Incorporating a Generative Front-End Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition.
Souvik Kundu, Khe Chai Sim, Mark J. F. Gales
2016Individual Identity in Songbirds: Signal Representations and Metric Learning for Locating the Information in Complex Corvid Calls.
Dan Stowell, Veronica Morfi, Lisa F. Gill
2016Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques.
Thomas Pellegrini, Sandrine Mouysset
2016Integrated Spoofing Countermeasures and Automatic Speaker Verification: An Evaluation on ASVspoof 2015.
Md. Sahidullah, Héctor Delgado, Massimiliano Todisco, Hong Yu, Tomi Kinnunen, Nicholas W. D. Evans, Zheng-Hua Tan
2016Intelligibility Enhancement at the Receiving End of the Speech Transmission System - Effects of Far-End Noise Reduction.
Emma Jokinen, Paavo Alku
2016Intelligibility of Disordered Speech: Global and Detailed Scores.
Mario Ganzeboom, Marjoke Bakker, Catia Cucchiarini, Helmer Strik
2016Inter-Speech Clicks in an Interspeech Keynote.
Jürgen Trouvain, Zofia Malisz
2016Inter-Task System Fusion for Speaker Recognition.
Marc Ferras, Srikanth R. Madikeri, Subhadeep Dey, Petr Motlícek, Hervé Bourlard
2016Interaction Between Lexical Tone and Intonation: An EMA Study.
Hao Yi, Sam Tilsen
2016Interactive Spoken Content Retrieval by Deep Reinforcement Learning.
Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-yi Lee, Lin-Shan Lee
2016Interpretation of Low Dimensional Neural Network Bottleneck Features in Terms of Human Perception and Production.
Philip Weber, Linxue Bai, Martin J. Russell, Peter Jancovic, Stephen M. Houghton
2016Introducing Temporal Rate Coding for Speech in Cochlear Implants: A Microscopic Evaluation in Humans and Models.
Anja Eichenauer, Mathias Dietz, Bernd T. Meyer, Tim Jürgens
2016Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement.
Steffen Zeiler, Hendrik Meutzner, Ahmed Hussen Abdelaziz, Dorothea Kolossa
2016Introduction to Poster Presentation of Part II.
Jeesun Kim, Gérard Bailly
2016Introduction.
Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge.
Yi Liu, Yao Tian, Liang He, Jia Liu
2016Investigating the Impact of Dialect Prestige on Lexical Decision.
Mairym Lloréns Monteserín, Jason D. Zevin
2016Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks.
Naoyuki Kanda, Shoji Harada, Xugang Lu, Hisashi Kawai
2016Investigation of Speed-Accuracy Tradeoffs in Speech Production Using Real-Time Magnetic Resonance Imaging.
Adam C. Lammert, Christine H. Shadle, Shrikanth S. Narayanan, Thomas F. Quatieri
2016Investigation of Sub-Band Discriminative Information Between Spoofed and Genuine Speech.
Kaavya Sriskandaraja, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah
2016Is Deception Emotional? An Emotion-Driven Predictive Approach.
Shahin Amiriparian, Jouni Pohjalainen, Erik Marchi, Sergey Pugachevskiy, Björn W. Schuller
2016Iterative PLDA Adaptation for Speaker Diarization.
Gaël Le Lan, Delphine Charlet, Anthony Larcher, Sylvain Meignier
2016Joint Effect of Dialect and Mandarin on English Vowel Production: A Case Study in Changsha EFL Learners.
Xinyi Wen, Yuan Jia
2016Joint Enhancement and Coding of Speech by Incorporating Wiener Filtering in a CELP Codec.
Johannes Fischer, Tom Bäckström
2016Joint Learning of Speaker and Phonetic Similarities with Siamese Networks.
Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux
2016Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition.
Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
2016Joint Sound Source Separation and Speaker Recognition.
Jeroen Zegers, Hugo Van hamme
2016Joint Speaker and Lexical Modeling for Short-Term Characterization of Speaker.
Guangsen Wang, Kong-Aik Lee, Trung Hieu Nguyen, Hanwu Sun, Bin Ma
2016Joint Syntactic and Semantic Analysis with a Multitask Deep Learning Framework for Spoken Language Understanding.
Jérémie Tafforeau, Frédéric Béchet, Thierry Artières, Benoît Favre
2016Jointly Learning to Locate and Classify Words Using Convolutional Networks.
Dimitri Palaz, Gabriel Synnaeve, Ronan Collobert
2016Jointly Optimizing Activation Coefficients of Convolutive NMF Using DNN for Speech Separation.
Hao Li, Shuai Nie, Xueliang Zhang, Hui Zhang
2016Ketchup, Interdisciplinarity, and the Spread of Innovation in Speech and Language Processing.
Dan Jurafsky
2016Kulning (Swedish Cattle Calls): Acoustic, EGG, Stroboscopic and High-Speed Video Analyses of an Unusual Singing Style.
Ahmed Geneid, Anne-Maria Laukkanen, Anita McAllister, Robert Eklund
2016L1-L2 Interference: The Case of Final Devoicing of French Voiced Fricatives in Final Position by German Learners.
Sucheta Ghosh, Camille Fauth, Aghilas Sini, Yves Laprie
2016L2 Acquisition and Production of the English Rhotic Pharyngeal Gesture.
Sarah Harper, Louis Goldstein, Shrikanth S. Narayanan
2016L2 English Rhythm in Read Speech by Chinese Students.
Hongwei Ding, Xinping Xu
2016LIA System for the SITW Speaker Recognition Challenge.
Waad Ben Kheder, Moez Ajili, Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre
2016LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition.
Kazuki Irie, Zoltán Tüske, Tamer Alkhouli, Ralf Schlüter, Hermann Ney
2016LSTM-Based NeuroCRFs for Named Entity Recognition.
Marc-Antoine Rondeau, Yi Su
2016Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling.
Gakuto Kurata, Bing Xiang, Bowen Zhou
2016Language Adaptive DNNs for Improved Low Resource Speech Recognition.
Markus Müller, Sebastian Stüker, Alex Waibel
2016Language Effects in Noise-Induced Word Misperceptions.
María Luisa García Lecumberri, Jon Barker, Ricard Marxer, Martin Cooke
2016Language Identification Based on Generative Modeling of Posteriorgram Sequences Extracted from Frame-by-Frame DNNs and LSTM-RNNs.
Ryo Masumura, Taichi Asami, Hirokazu Masataki, Yushi Aono, Sumitaka Sakauchi
2016Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions.
Arseniy Gorin, Rasa Lileikyte, Guangpu Huang, Lori Lamel, Jean-Luc Gauvain, Antoine Laurent
2016Language Recognition via Sparse Coding.
Youngjune L. Gwon, William M. Campbell, Douglas E. Sturim, H. T. Kung
2016LatticeRnn: Recurrent Neural Networks Over Lattices.
Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Lambert Mathias, Ariya Rastrow, Björn Hoffmeister
2016Laughter Valence Prediction in Motivational Interviewing Based on Lexical and Acoustic Cues.
Rahul Gupta, Nishant Nath, Taruna Agrawal, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan
2016Learning Document Representations Using Subspace Multinomial Model.
Santosh Kesiraju, Lukás Burget, Igor Szöke, Jan Cernocký
2016Learning Multiscale Features Directly from Waveforms.
Zhenyao Zhu, Jesse H. Engel, Awni Y. Hannun
2016Learning N-Gram Language Models from Uncertain Data.
Vitaly Kuznetsov, Hank Liao, Mehryar Mohri, Michael Riley, Brian Roark
2016Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information.
Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li
2016Learning Personalized Pronunciations for Contact Name Recognition.
Antoine Bruguier, Fuchun Peng, Françoise Beaufays
2016Learning a Translation Model from Word Lattices.
Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird
2016Lig-Aikuma: A Mobile App to Collect Parallel Speech for Under-Resourced Language Studies.
Elodie Gauthier, David Blachon, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker, Annie Rialland, Gilles Adda, Grégoire Bachman
2016Likelihood Ratio Calculation in Acoustic-Phonetic Forensic Voice Comparison: Comparison of Three Statistical Modelling Approaches.
Ewald Enzinger
2016Local Sparsity Based Online Dictionary Learning for Environment-Adaptive Speech Enhancement with Nonnegative Matrix Factorization.
Kwang Myung Jeon, Hong Kook Kim
2016Localizing Bird Songs Using an Open Source Robot Audition System with a Microphone Array.
Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, Hiroshi G. Okuno
2016Locally Linear Embedding for Exemplar-Based Spectral Conversion.
Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang
2016Log-Linear System Combination Using Structured Support Vector Machines.
Jingzhou Yang, Anton Ragni, Mark J. F. Gales, Kate M. Knill
2016Long Short-Term Memory for Speaker Generalization in Supervised Speech Separation.
Jitong Chen, DeLiang Wang
2016Long-Term Stability of Tracheoesophageal Voices.
Klaske E. van Sluis, Michiel W. M. van den Brekel, Frans J. M. Hilgers, Rob J. J. H. van Son
2016Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling.
Gil Luyet, Pranay Dighe, Afsaneh Asaei, Hervé Bourlard
2016Lower Frame Rate Neural Network Acoustic Models.
Golan Pundak, Tara N. Sainath
2016MIVOQ-PTTS - A Revolutionary New Way of Thinking TTS.
Piero Cosi, Giulio Paci, Giacomo Sommavilla, Fabio Tesser
2016ML Parameter Generation with a Reformulated MGE Training Criterion - Participation in the Voice Conversion Challenge 2016.
Daniel Erro, Agustín Alonso, Luis Serrano, David Tavarez, Igor Odriozola, Xabier Sarasola, Eder del Blanco, Jon Sánchez, Ibon Saratxaga, Eva Navas, Inma Hernáez
2016Mahalanobis Metric Scoring Learned from Weighted Pairwise Constraints in I-Vector Speaker Recognition System.
Zhenchun Lei, Yanhong Wan, Jian Luo, Yingen Yang
2016Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering.
Lauri Juvela, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku
2016Making Personal Digital Assistants Aware of What They Do Not Know.
Omar Zia Khan, Ruhi Sarikaya
2016Manipulating Word Lattices to Incorporate Human Corrections.
Yashesh Gaur, Florian Metze, Jeffrey P. Bigham
2016Manual versus Automated: The Challenging Routine of Infant Vocalisation Segmentation in Home Videos to Study Neuro(mal)development.
Florian B. Pokorny, Robert Peharz, Wolfgang Roth, Matthias Zöhrer, Franz Pernkopf, Peter B. Marschik, Björn W. Schuller
2016Marginal Contrast Among Romanian Vowels: Evidence from ASR and Functional Load.
Margaret E. L. Renwick, Ioana Vasilescu, Camille Dutrey, Lori Lamel, Bianca Vieru
2016Maximum a posteriori Based Decoding for CTC Acoustic Models.
Naoyuki Kanda, Xugang Lu, Hisashi Kawai
2016Measuring Pronunciation Improvement in Users of CAPT Tool TipTopTalk!
Cristian Tejedor García, David Escudero Mancebo, Enrique Cámara Arenas, César González Ferreras, Valentín Cardeñoso-Payo
2016Measuring Turn-Taking Offsets in Human-Human Dialogues.
Rebecca Lunsford, Peter A. Heeman, Emma Rennie
2016Mechanical Production of [b], [m] and [w] Using Controlled Labial and Velopharyngeal Gestures.
Takayuki Arai
2016Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders.
Michael Price, Anantha P. Chandrakasan, James R. Glass
2016Microphone Distance Adaptation Using Cluster Adaptive Training for Robust Far Field Speech Recognition.
Animesh Prasad, Khe Chai Sim
2016Microscopic Multilingual Matrix Test Predictions Using an ASR-Based Speech Recognition Model.
Marc René Schädler, David Hülsmeier, Anna Warzybok, Sabine Hochmuth, Birger Kollmeier
2016Mindfulness Special Event.
Nikki Mirghafori
2016Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation.
Hung-Shin Lee, Yu Tsao, Chi-Chun Lee, Hsin-Min Wang, Wei-Cheng Lin, Wei-Chen Chen, Shan-Wen Hsiao, Shyh-Kang Jeng
2016Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems.
Luciana Ferrer, Martin Graciarena
2016Misperceptions Arising from Speech-in-Babble Interactions.
Máté Attila Tóth, Martin Cooke, Jon Barker
2016Mispronunciation Detection Leveraging Maximum Performance Criterion Training of Acoustic Models and Decision Functions.
Yao-Chi Hsu, Ming-Han Yang, Hsiao-Tsung Hung, Berlin Chen
2016Model Adaptation and Active Learning in the BBN Speech Activity Detection System for the DARPA RATS Program.
Damianos G. Karakos, Scott Novotney, Le Zhang, Richard M. Schwartz
2016Model Compression Applied to Small-Footprint Keyword Spotting.
George Tucker, Minhua Wu, Ming Sun, Sankaran Panchapagesan, Gengshen Fu, Shiv Vitaladevuni
2016Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework.
Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai
2016Model-Based Parametric Prosody Synthesis with Deep Neural Network.
Hao Liu, Heng Lu, Xu Shao, Yi Xu
2016Modeling Noise Influence to Speech Intelligibility Non-Intrusively by Reduced Speech Dynamic Range.
Fei Chen
2016Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks.
Tara N. Sainath, Bo Li
2016Modeling and Transforming Speech Using Variational Autoencoders.
Merlijn Blaauw, Jordi Bonada
2016Modulation Enhancement of Temporal Envelopes for Increasing Speech Intelligibility in Noise.
Maria Koutsogiannaki, Yannis Stylianou
2016Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants.
Zhi Zhu, Ryota Miyauchi, Yukiko Araki, Masashi Unoki
2016Monaural Source Separation Using a Random Forest Classifier.
Cosimo Riday, Saurabh Bhargava, Richard H. R. Hahnloser, Shih-Chii Liu
2016Multi-Attribute Factorized Hidden Layer Adaptation for DNN Acoustic Models.
Lahiru Samarakoon, Khe Chai Sim
2016Multi-Channel Linear Prediction Based on Binaural Coherence for Speech Dereverberation.
Hong Liu, Xiuling Wang, Miao Sun, Cheng Pang
2016Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM.
Dilek Hakkani-Tür, Gökhan Tür, Asli Celikyilmaz, Yun-Nung Chen, Jianfeng Gao, Li Deng, Ye-Yi Wang
2016Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis.
Bo Li, Heiga Zen
2016Multi-Language Neural Network Language Models.
Anton Ragni, Edgar Dakin, Xie Chen, Mark J. F. Gales, Kate M. Knill
2016Multi-Talker Speech Recognition Based on Blind Source Separation with ad hoc Microphone Array Using Smartphones and Cloud Storage.
Keiko Ochi, Nobutaka Ono, Shigeki Miyabe, Shoji Makino
2016Multi-Task Learning and Weighted Cross-Entropy for DNN-Based Keyword Spotting.
Sankaran Panchapagesan, Ming Sun, Aparna Khare, Spyros Matsoukas, Arindam Mandal, Björn Hoffmeister, Shiv Vitaladevuni
2016Multichannel Spatial Clustering for Robust Far-Field Automatic Speech Recognition in Mismatched Conditions.
Michael I. Mandel, Jon Barker
2016Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling.
Yuanyuan Zhao, Shuang Xu, Bo Xu
2016Multilingual Data Selection for Low Resource Speech Recognition.
Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran
2016Multilingual Speech Emotion Recognition System Based on a Three-Layer Model.
Xingfeng Li, Masato Akagi
2016Multimodal Fusion of Multirate Acoustic, Prosodic, and Lexical Speaker Characteristics for Native Language Identification.
Prashanth Gurunath Shivakumar, Sandeep Nallan Chakravarthula, Panayiotis G. Georgiou
2016Multiple Influences on Vocabulary Acquisition: Parental Input Dominates.
Dominic W. Massaro
2016Multiplicity of the Acoustic Correlates of the Fortis-Lenis Contrast: Plosives in Aberystwyth English.
Mísa Hejná
2016My-Own-Voice: A Web Service That Allows You to Create a Text-to-Speech Voice From Your Own Voice.
Fabrice Malfrère, Olivier Deroo, Emmanuelle Franques, Jonathan Hourez, Nicolas Mazars, Vincent Pagel, Geoffrey Wilfart
2016NN-Grams: Unifying Neural Network and n-Gram Language Models for Speech Recognition.
Babak Damavandi, Shankar Kumar, Noam Shazeer, Antoine Bruguier
2016Native Language Detection Using the I-Vector Framework.
Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro L. Koerich
2016Native Language Identification Using Spectral and Source-Based Features.
Avni Rajpal, Tanvina B. Patel, Hardik B. Sailor, Maulik C. Madhavi, Hemant A. Patil, Hiroya Fujisaki
2016Naturalness Judgement of L2 English Through Dubbing Practice.
Dean Luo, Ruxin Luo, Lixin Wang
2016Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection.
Sheng-syun Shen, Hung-yi Lee
2016Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition.
Bo Li, Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Michiel Bacchiani
2016Neural Responses to Speech-Specific Modulations Derived from a Spectro-Temporal Filter Bank.
Marina Frye, Cristiano Micheli, Inga M. Schepers, Gerwin Schalk, Jochem W. Rieger, Bernd T. Meyer
2016Neurophysiological Vocal Source Modeling for Biomarkers of Disease.
Gregory A. Ciccarelli, Thomas F. Quatieri, Satrajit S. Ghosh
2016Noise Aware and Combined Noise Models for Speech Denoising in Unknown Noise Conditions.
Pavlos Papadopoulos, Colin Vaz, Shrikanth S. Narayanan
2016Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-Native Speech Input.
Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan
2016Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification.
Kantapon Kaewtip, Charles E. Taylor, Abeer Alwan
2016Non-Iterative Parameter Estimation for Total Variability Model Using Randomized Singular Value Decomposition.
Ruchir Travadi, Shrikanth S. Narayanan
2016Non-Uniform Boosted MCE Training of Deep Neural Networks for Keyword Spotting.
Zhong Meng, Biing-Hwang Juang
2016Novel Front-End Features Based on Neural Graph Embeddings for DNN-HMM and LSTM-CTC Acoustic Modeling.
Yuzong Liu, Katrin Kirchhoff
2016Novel Nonlinear Prediction Based Features for Spoofed Speech Detection.
Himanshu N. Bhavsar, Tanvina B. Patel, Hemant A. Patil
2016Novel Subband Autoencoder Features for Detection of Spoofed Speech.
Meet H. Soni, Tanvina B. Patel, Hemant A. Patil
2016Novel Subband Autoencoder Features for Non-Intrusive Quality Assessment of Noise Suppressed Speech.
Meet H. Soni, Hemant A. Patil
2016Objective Evaluation Methods for Chinese Text-To-Speech Systems.
Teng Zhang, Zhipeng Chen, Ji Wu, Sam Lai, Wenhui Lei, Carsten Isert
2016Objective Evaluation Using Association Between Dimensions Within Spectral Features for Statistical Parametric Speech Synthesis.
Yusuke Ijima, Taichi Asami, Hideyuki Mizuno
2016Objective Language Feature Analysis in Children with Neurodevelopmental Disorders During Autism Assessment.
Manoj Kumar, Rahul Gupta, Daniel Bone, Nikolaos Malandrakis, Somer Bishop, Shrikanth S. Narayanan
2016On Discriminative Framework for Single Channel Audio Source Separation.
Arpita Gang, Pravesh Biyani
2016On Employing a Highly Mismatched Crowd for Speech Transcription.
Purushotam G. Radadia, Rahul Kumar, Kanika Kalra, Shirish Karande, Sachin Lodha
2016On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training.
William Chan, Ian R. Lane
2016On Smoothing and Enhancing Dynamics of Pitch Contours Represented by Discrete Orthogonal Polynomials for Prosody Generation.
Chen-Yu Chiang
2016On the Appropriateness of Complex-Valued Neural Networks for Speech Enhancement.
Lukas Drude, Bhiksha Raj, Reinhold Haeb-Umbach
2016On the Correlation and Transferability of Features Between Automatic Speech Recognition and Speech Emotion Recognition.
Haytham M. Fayek, Margaret Lech, Lawrence Cavedon
2016On the Efficient Representation and Execution of Deep Acoustic Models.
Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin
2016On the Importance of Efficient Transition Modeling for Speaker Diarization.
Itshak Lapidot, Jean-François Bonastre
2016On the Influence of Gender on Interruptions in Multiparty Dialogue.
Paul Van Eecke, Raquel Fernández
2016On the Influence of Text Content on Pass-Phrase Strength for Short-Duration Text-Dependent Automatic Speaker Authentication.
Giacomo Valenti, Adrien Daniel, Nicholas W. D. Evans
2016On the Issue of Calibration in DNN-Based Speaker Recognition Systems.
Mitchell McLaren, Diego Castán, Luciana Ferrer, Aaron Lawson
2016On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models.
Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani
2016On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine.
David Guennec, Damien Lolive
2016On the Use of Gaussian Mixture Model Framework to Improve Speaker Adaptation of Deep Neural Network Acoustic Models.
Natalia A. Tomashenko, Yuri Y. Khokhlov, Yannick Estève
2016Open Language Interface for Voice Exploitation (OLIVE).
Aaron Lawson, Mitchell McLaren, Harry Bratt, Martin Graciarena, Horacio Franco, Christopher George, Allen R. Stauffer, Chris Bartels, Julien van Hout
2016Open Source Speech and Language Resources for Frisian.
Emre Yilmaz, Henk van den Heuvel, Jelske Dijkstra, Hans Van de Velde, Frederik Kampstra, Jouke Algra, David A. van Leeuwen
2016Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach.
Yajie Miao, Florian Metze
2016Optimal Unit Stitching in a Unit Selection Singing Synthesis System.
Marius Cotescu
2016Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion.
Takuya Higuchi, Takuya Yoshioka, Tomohiro Nakatani
2016Optimizing Speech Recognition Evaluation Using Stratified Sampling.
Janne Pylkkönen, Thomas Drugman, Max Bisani
2016Organizing Syllables into Sandhi Domains - Evidence from F0 and Duration Patterns in Shanghai Chinese.
Bijun Ling, Jie Liang
2016Out of Set Language Modelling in Hierarchical Language Identification.
Saad Irtza, Vidhyasaharan Sethu, Sarith Fernando, Eliathamby Ambikairajah, Haizhou Li
2016Overcoming Data Sparsity in Acoustic Modeling of Low-Resource Language by Borrowing Data and Model Parameters from High-Resource Languages.
Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy
2016Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification.
Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai
2016Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization.
Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
2016Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification.
Jianbo Ma, Saad Irtza, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2016Parkinson's Disease Progression Assessment from Speech Using GMM-UBM.
Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Jesús Francisco Vargas-Bonilla, Elmar Nöth
2016Part-of-Speech Tagging and Chunking in Text-to-Speech Synthesis for South African Languages.
Georg I. Schlünz, Nkosikhona Dlamini, Rynhardt P. Kruger
2016Pause Prediction from Text for Speech Synthesis with User-Definable Pause Insertion Likelihood Threshold.
Norbert Braunschweiler, Ranniery Maia
2016Perceived Naturalness of Electrolaryngeal Speech Produced Using sEMG-Controlled vs. Manual Pitch Modulation.
Kathleen F. Nagle, James T. Heaton
2016Perceived Usability and Cognitive Demand of Secondary Tasks in Spoken Versus Visual-Manual Automotive Interaction.
Annika Silvervarg, Sofia Lindvall, Jonatan Andersson, Ida Esberg, Christian Jernberg, Filip Frumerie, Arne Jönsson
2016Perception Optimized Deep Denoising AutoEncoders for Speech Enhancement.
Prashanth Gurunath Shivakumar, Panayiotis G. Georgiou
2016Perception of Tone in Whispered Mandarin Sentences: The Case for Singapore Mandarin.
Yuling Gu, Boon Pang Lim, Nancy F. Chen
2016Perceptual Lateralization of Coda Rhotic Production in Puerto Rican Spanish.
Mairym Lloréns Monteserín, Shrikanth S. Narayanan, Louis Goldstein
2016Perceptual Salience of Voice Source Parameters in Signaling Focal Prominence.
Irena Yanushevskaya, Andy Murphy, Christer Gobl, Ailbhe Ní Chasaide
2016Personalized Natural Language Understanding.
Xiaohu Liu, Ruhi Sarikaya, Liang Zhao, Yong Ni, Yi-Cheng Pan
2016Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams.
Lifa Sun, Hao Wang, Shiyin Kang, Kun Li, Helen M. Meng
2016Phase-Aware Signal Processing for Automatic Speech Recognition.
Johannes Fahringer, Tobias Schrank, Johannes Stahl, Pejman Mowlaee, Franz Pernkopf
2016Phase-Encoded Speech Spectrograms.
Chandra Sekhar Seelamantula
2016PhonVoc: A Phonetic and Phonological Vocoding Toolkit.
Milos Cernak, Philip N. Garner
2016Phone Synchronous Decoding with CTC Lattice.
Zhehuai Chen, Wei Deng, Tao Xu, Kai Yu
2016Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis.
Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai
2016Phoneme Set Design Considering Integrated Acoustic and Linguistic Features of Second Language Speech.
Xiaoyun Wang, Tsuneo Kato, Seiichi Yamamoto
2016Phoneme, Phone Boundary, and Tone in Automatic Scoring of Mandarin Proficiency.
Jiahong Yuan, Mark Y. Liberman
2016Phonetic Context Embeddings for DNN-HMM Phone Recognition.
Leonardo Badino
2016Phonetic Reduction Can Lead to Lengthening, and Enhancement Can Lead to Shortening.
Clara Cohen, Matt Carlson
2016Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures.
Afsaneh Asaei, Gil Luyet, Milos Cernak, Hervé Bourlard
2016Phonotactic Language Identification for Singing.
Anna M. Kruspe
2016Pitch-Adaptive Front-End Features for Robust Children's ASR.
Syed Shahnawazuddin, Abhishek Dey, Rohit Sinha
2016Pitch-Range Perception: The Dynamic Interaction Between Voice Quality and Fundamental Frequency.
Jianjing Kuang, Mark Y. Liberman
2016Poster Overview Presentations.
Naomi Harte, Peter Jancovic, Karl-L. Schuchmann
2016Predicting Affective Dimensions Based on Self Assessed Depression Severity.
Rahul Gupta, Shrikanth S. Narayanan
2016Predicting Binaural Speech Intelligibility from Signals Estimated by a Blind Source Separation Algorithm.
Qingju Liu, Yan Tang, Philip J. B. Jackson, Wenwu Wang
2016Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks.
Daan van Esch, Mason Chua, Kanishka Rao
2016Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors.
Tan Lee, Yuanyuan Liu, Yu Ting Yeung, Thomas K. T. Law, Kathy Y. S. Lee
2016Predicting User Satisfaction from Turn-Taking in Spoken Conversations.
Shammur Absar Chowdhury, Evgeny A. Stepanov, Giuseppe Riccardi
2016Prediction and Generation of Backchannel Form for Attentive Listening Systems.
Tatsuya Kawahara, Takashi Yamaguchi, Koji Inoue, Katsuya Takanashi, Nigel G. Ward
2016Prediction of Deception and Sincerity from Speech Using Automatic Phone Recognition-Based Features.
Robert Herms
2016Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker.
Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu
2016Preliminary Experiments on Unsupervised Word Discovery in Mboshi.
Pierre Godard, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Laurent Besacier, Hélène Bonneau-Maynard, Guy-Noël Kouarata, Kevin Löser, Annie Rialland, François Yvon
2016Priors for Speaker Counting and Diarization with AHC.
Gregory Sell, Alan McCree, Daniel Garcia-Romero
2016Privacy-Preserving Speech Analytics for Automatic Assessment of Student Collaboration.
Nikoletta Bassiou, Andreas Tsiartas, Jennifer Smith, Harry Bratt, Colleen Richey, Elizabeth Shriberg, Cynthia M. D'Angelo, Nonye Alozie
2016Probabilistic Amplitude Demodulation Features in Speech Synthesis for Improving Prosody.
Alexandros Lazaridis, Milos Cernak, Philip N. Garner
2016Probabilistic Approach Using Joint Clean and Noisy i-Vectors Modeling for Speaker Recognition.
Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre
2016Probabilistic Approach Using Joint Long and Short Session i-Vectors Modeling to Deal with Short Utterances for Speaker Recognition.
Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-François Bonastre
2016Probabilistic Spatial Filter Estimation for Signal Enhancement in Multi-Channel Automatic Speech Recognition.
Hendrik Kayser, Niko Moritz, Jörn Anemüller
2016Processing and Adaptation to Ambiguous Sounds during the Course of Perceptual Learning.
Polina Drozdova, Roeland van Hout, Odette Scharenborg
2016Progress and Prospects for Spoken Language Technology: Results from Four Sexennial Surveys.
Roger K. Moore, Ricard Marxer
2016Progress and Prospects for Spoken Language Technology: What Ordinary People Think.
Roger K. Moore, Hui Li, Shih-Hao Liao
2016Pronunciation Assessment of Japanese Learners of French with GOP Scores and Phonetic Information.
Vincent Laborde, Thomas Pellegrini, Lionel Fontan, Julie Mauclair, Halima Sahraoui, Jérôme Farinas
2016Pronunciation Error Detection for New Language Learners.
Sean Robertson, Cosmin Munteanu, Gerald Penn
2016Prosodic Convergence with Spoken Stimuli in Laboratory Data.
Margaret Zellers
2016Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge.
Claude Montacié, Marie-José Caraty
2016Prosodic and Linguistic Analysis of Semantic Fluency Data: A Window into Speech Production and Cognition.
Maria K. Wolters, Najoung Kim, Jung-Ho Kim, Sarah E. MacPherson, Jong-Chan Park
2016Prosody Modification Using Allpass Residual of Speech Signals.
Karthika Vijayan, K. Sri Rama Murty
2016Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI.
Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahremani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur
2016Putting German [ʃ] and [ç] in Two Different Boxes: Native German vs L2 German of French Learners.
Jane Wottawa, Martine Adda-Decker, Frédéric Isel
2016Quantitative Analysis of Backchannels Uttered by an Interviewer During Neuropsychological Tests.
Gérard Bailly, Frédéric Elisei, Alexandra Juphard, Olivier Moreaud
2016RNN-BLSTM Based Multi-Pitch Estimation.
Jianshu Zhang, Jian Tang, Li-Rong Dai
2016Rapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search.
Chongjia Ni, Lei Wang, Cheung-Chi Leung, Feng Rao, Li Lu, Bin Ma, Haizhou Li
2016Real-Time Presentation Tracking Using Semantic Keyword Spotting.
Reza Asadi, Harriet J. Fell, Timothy W. Bickmore, Ha Trinh
2016Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms.
Erik Marchi, Florian Eyben, Gerhard Hagerer, Björn W. Schuller
2016Realistic Multi-Microphone Data Simulation for Distant Speech Recognition.
Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo
2016Recent Advances in Google Real-Time HMM-Driven Unit Selection Synthesizer.
Xavi Gonzalvo, Siamak Tazari, Chun-an Chan, Markus Becker, Alexander Gutkin, Hanna Silén
2016Recognition of Depression in Bipolar Disorder: Leveraging Cohort and Person-Specific Knowledge.
Soheil Khorram, John Gideon, Melvin G. McInnis, Emily Mower Provost
2016Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation.
Chitralekha Bhat, Bhavik Vachhani, Sunil Kumar Kopparapu
2016Recognition of Multiple Bird Species Based on Penalised Maximum Likelihood and HMM-Based Modelling of Individual Vocalisation Elements.
Peter Jancovic, Münevver Köküer
2016Recurrent Models for Auditory Attention in Multi-Microphone Distant Speech Recognition.
Suyoun Kim, Ian R. Lane
2016Recurrent Neural Network Language Model with Incremental Updated Context Information Generated Using Bag-of-Words Representation.
Md. Akmal Haidar, Mikko Kurimo
2016Recurrent Neural Network-Based Phoneme Sequence Estimation Using Multiple ASR Systems' Outputs for Spoken Term Detection.
Naoki Sawada, Hiromitsu Nishizaki
2016Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features.
Taichi Asami, Ryo Masumura, Yushi Aono, Koichi Shinoda
2016Redefining the Linguistic Context Feature Set for HMM and DNN TTS Through Position and Parsing.
Rasmus Dall, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction.
Tara N. Sainath, Arun Narayanan, Ron J. Weiss, Ehsan Variani, Kevin W. Wilson, Michiel Bacchiani, Izhak Shafran
2016Relating Estimated Cyclic Spectral Peak Frequency to Measured Epilarynx Length Using Magnetic Resonance Imaging.
Elizabeth Godoy, Andrew Dumas, Jennifer Melot, Nicolas Malyska, Thomas F. Quatieri
2016Relation of Automatically Extracted Formant Trajectories with Intelligibility Loss and Speaking Rate Decline in Amyotrophic Lateral Sclerosis.
Rachelle L. Horwitz-Martin, Thomas F. Quatieri, Adam C. Lammert, James R. Williamson, Yana Yunusova, Elizabeth Godoy, Daryush D. Mehta, Jordan R. Green
2016Relationships Between Functional Load and Auditory Confusability Under Different Speech Environments.
Shinae Kang, Clara Cohen
2016Relative Contributions of Amplitude and Phase to the Intelligibility Advantage of Ideal Binary Masked Sentences.
Lei Wang, Shufeng Zhu, Diliang Chen, Yong Feng, Fei Chen
2016Release from Energetic Masking Caused by Repeated Patterns of Glimpsing Windows.
Maury Lander-Portnoy
2016Remeeting - Deep Insights to Conversations.
Allen Guo, Arlo Faria, Korbinian Riedhammer
2016Representation Learning for Speech Emotion Recognition.
Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer
2016Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples.
Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li
2016Rescoring by Combination of Posteriorgram Score and Subword-Matching Score for Use in Query-by-Example.
Masato Obara, Kazunori Kojima, Kazuyo Tanaka, Shi-wook Lee, Yoshiaki Itoh
2016Respiratory Belts and Whistles: A Preliminary Study of Breathing Acoustics for Turn-Taking.
Marcin Wlodarczak, Mattias Heldner
2016Respiratory Turn-Taking Cues.
Marcin Wlodarczak, Mattias Heldner
2016Results of The 2015 NIST Language Recognition Evaluation.
Hui Zhao, Désiré Bansé, George R. Doddington, Craig S. Greenberg, Jaime Hernandez-Cordero, John M. Howard, Lisa P. Mason, Alvin F. Martin, Douglas A. Reynolds, Elliot Singer, Audrey Tong
2016Retrieval of Textual Song Lyrics from Sung Inputs.
Anna M. Kruspe
2016Retrieving Categorical Emotions Using a Probabilistic Framework to Define Preference Learning Samples.
Reza Lotfian, Carlos Busso
2016Reverberation-Robust One-Bit TDOA Based Moving Source Localization for Automatic Camera Steering.
Sundar Harshavardhan, Gokul Deepak Manavalan, T. V. Sreenivas, Chandra Sekhar Seelamantula
2016Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks.
Huy Phan, Lars Hertel, Marco Maaß, Alfred Mertins
2016Robust DNN-Based VAD Augmented with Phone Entropy Based Rejection of Background Speech.
Yuya Fujita, Ken-ichi Iso
2016Robust Detection of Multiple Bioacoustic Events with Repetitive Structures.
Frank Kurth
2016Robust Estimation of Fundamental Frequency Using Single Frequency Filtering Approach.
Vishala Pannala, G. Aneeja, Sudarsana Reddy Kadiri, B. Yegnanarayana
2016Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement.
Atsunori Ogawa, Shogo Seki, Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Kazuya Takeda
2016Robust Multichannel Gender Classification from Speech in Movie Audio.
Naveen Kumar, Md. Nasir, Panayiotis G. Georgiou, Shrikanth S. Narayanan
2016Robust Sound Event Detection in Continuous Audio Environments.
Haomin Zhang, Ian McLoughlin, Yan Song
2016Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech.
Md. Sahidullah, Rosa González Hautamäki, Dennis Alexander Lehmann Thomsen, Tomi Kinnunen, Zheng-Hua Tan, Ville Hautamäki, Robert Parts, Martti Pitkänen
2016Robust Speech Recognition Using Generalized Distillation Framework.
Konstantin Markov, Tomoko Matsui
2016Robust Vowel Landmark Detection Using Epoch-Based Features.
Sri Harsha Dumpala, Bhanu Teja Nellore, Raghu Ram Nevali, Suryakanth V. Gangashetty, B. Yegnanarayana
2016Robustness in Speech, Speaker, and Language Recognition: "You've Got to Know Your Limitations".
John H. L. Hansen, Hynek Boril
2016Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems.
Spiros Georgiladakis, Georgia Athanasopoulou, Raveesh Meena, José Lopes, Arodami Chorianopoulou, Elisavet Palogiannidi, Elias Iosif, Gabriel Skantze, Alexandros Potamianos
2016SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer.
Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li
2016SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms.
Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li
2016SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement.
Szu-Wei Fu, Yu Tsao, Xugang Lu
2016SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement.
Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee
2016STON: Efficient Subtitling in Dutch Using State-of-the-Art Tools.
Lyan Verwimp, Brecht Desplanques, Kris Demuynck, Joris Pelemans, Marieke Lycke, Patrick Wambacq
2016Sage: The New BBN Speech Processing Platform.
Roger Hsiao, Ralf Meermeier, Tim Ng, Zhongqiang Huang, Maxwell Jordan, Enoch Kan, Tanel Alumäe, Jan Silovský, William Hartmann, Francis Keith, Omer Lang, Man-Hung Siu, Owen Kimball
2016Segmental Recurrent Neural Networks for End-to-End Speech Recognition.
Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals
2016Segmented Dynamic Time Warping for Spoken Query-by-Example Search.
Jorge Proença, Fernando Perdigão
2016Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.
Pierre Lanchantin, Mark J. F. Gales, Penny Karanasou, Xunying Liu, Yanmin Qian, Linlin Wang, Philip C. Woodland, Chao Zhang
2016Self-Adaptive DNN for Improving Spoken Language Proficiency Assessment.
Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft
2016Semi-Coupled Dictionary Based Automatic Bandwidth Extension Approach for Enhancing Children's ASR.
Ganji Sreeram, Rohit Sinha
2016Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech.
Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari
2016Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks.
Wonkyum Lee, Kyu Jeong Han, Ian R. Lane
2016Semi-Supervised Training in Deep Learning Acoustic Model.
Yan Huang, Yongqiang Wang, Yifan Gong
2016Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions.
Haihua Xu, Hang Su, Chongjia Ni, Xiong Xiao, Hao Huang, Eng Siong Chng, Haizhou Li
2016Sensitivity of Quantitative RT-MRI Metrics of Vocal Tract Dynamics to Image Reconstruction Settings.
Johannes Töger, Yongwan Lim, Sajan Goud Lingala, Shrikanth S. Narayanan, Krishna S. Nayak
2016Sensorimotor Response to Visual Imagery of Tongue Displacement.
William F. Katz, Divya Prabhakaran
2016Sentence Boundary Detection Based on Parallel Lexical and Acoustic Models.
Xiaoyin Che, Sheng Luo, Haojin Yang, Christoph Meinel
2016Sequence Student-Teacher Training of Deep Neural Networks.
Jeremy Heng Meng Wong, Mark J. F. Gales
2016Sequence Summarizing Neural Networks for Spoken Language Recognition.
Jan Pesán, Lukás Burget, Jan Cernocký
2016Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding.
Ngoc Thang Vu
2016Sequential Recurrent Neural Networks for Language Modeling.
Youssef Oualil, Clayton Greenberg, Mittul Singh, Dietrich Klakow
2016Sharing Speech Synthesis Software for Research and Education Within Low-Tech and Low-Resource Communities.
Andrew R. Plummer, Mary E. Beckman
2016Short Utterance Variance Modelling and Utterance Partitioning for PLDA Speaker Verification.
Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes, Ivan Himawan
2016Silent-Speech Command Word Recognition Using Electro-Optical Stomatography.
Simon Stone, Peter Birkholz
2016Sincerity and Deception in Speech: Two Sides of the Same Coin? A Transfer- and Multi-Task Learning Perspective.
Yue Zhang, Felix Weninger, Zhao Ren, Björn W. Schuller
2016SingaKids-Mandarin: Speech Corpus of Singaporean Children Speaking Mandarin Chinese.
Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li
2016Singing Voice Synthesis Based on Deep Neural Networks.
Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016Single-Channel Multi-Speaker Separation Using Deep Clustering.
Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey
2016Single-Channel Speech Enhancement Using Double Spectrum.
Martin Blass, Pejman Mowlaee, W. Bastiaan Kleijn
2016Sinusoidal Modelling for Ecoacoustics.
Patrice Guyot, Alice Eldridge, Ying Chen Eyre-Walker, Alison Johnston, Thomas Pellegrini, Mika Peck
2016Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition.
Liang Lu, Steve Renals
2016Sound Pattern Matching for Automatic Prosodic Event Detection.
Milos Cernak, Afsaneh Asaei, Pierre-Edouard Honnet, Philip N. Garner, Hervé Bourlard
2016SparkNG: Interactive MATLAB Tools for Introduction to Speech Production, Perception and Processing Fundamentals and Application of the Aliasing-Free L-F Model Component.
Hideki Kawahara
2016Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples' Therapy.
Haoqi Li, Brian R. Baucom, Panayiotis G. Georgiou
2016Speaker Age Classification and Regression Using i-Vectors.
Joanna Grzybowska, Stanislaw Kacprzak
2016Speaker Comparison for Forensic and Investigative Applications II.
Jean-François Bonastre, Joseph P. Campbell, Anders P. Eriksson, Hirotaka Nakasone, Reva Schwartz
2016Speaker Identity and Voice Quality: Modeling Human Responses and Automatic Speaker Recognition.
Soo Jin Park, Caroline Sigouin, Jody Kreiman, Patricia A. Keating, Jinxi Guo, Gary Yeung, Fang-Yu Kuo, Abeer Alwan
2016Speaker Linking and Applications Using Non-Parametric Hashing Methods.
Douglas E. Sturim, William M. Campbell
2016Speaker Normalization Through Feature Shifting of Linearly Transformed i-Vector.
Jahyun Goo, Younggwan Kim, Hyungjun Lim, Hoirin Kim
2016Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation.
Fred Richardson, Michael S. Brandstein, Jennifer Melot, Douglas A. Reynolds
2016Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis.
Yi Zhao, Daisuke Saito, Nobuaki Minematsu
2016Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features.
Jinxi Guo, Gary Yeung, Deepak Muralidharan, Harish Arsikere, Amber Afshan, Abeer Alwan
2016Speaker-Dependent Dictionary-Based Speech Enhancement for Text-Dependent Speaker Verification.
Nicolai Bæk Thomsen, Dennis Alexander Lehmann Thomsen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen
2016Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments.
Guan-Lin Chao, William Chan, Ian R. Lane
2016Speakers In The Wild (SITW): The QUT Speaker Recognition System.
Houman Ghaemmaghami, Md. Hafizur Rahman, Ivan Himawan, David Dean, Ahilan Kanagasundaram, Sridha Sridharan, Clinton Fookes
2016Spectral Enhancement of Cleft Lip and Palate Speech.
Vikram C. M., Nagaraj Adiga, S. R. Mahadeva Prasanna
2016Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks.
Yu Gu, Zhen-Hua Ling, Li-Rong Dai
2016Speech Emotion Recognition Using Affective Saliency.
Arodami Chorianopoulou, Polychronis Koutsakis, Alexandros Potamianos
2016Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks.
Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi
2016Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks.
Anurag Kumar, Dinei A. F. Florêncio
2016Speech Features for Depression Detection.
Saurabh Sahu, Carol Y. Espy-Wilson
2016Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank.
Katsuhiko Yamamoto, Toshio Irino, Toshie Matsui, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani
2016Speech Likability and Personality-Based Social Relations: A Round-Robin Analysis over Communication Channels.
Laura Fernández Gallardo, Benjamin Weiss
2016Speech Localisation in a Multitalker Mixture by Humans and Machines.
Ning Ma, Guy J. Brown
2016Speech Recognition in Alzheimer's Disease and in its Assessment.
Luke Zhou, Kathleen C. Fraser, Frank Rudzicz
2016Speech Reductions Cause a De-Weighting of Secondary Acoustic Cues.
Léo Varnet, Fanny Meunier, Michel Hoen
2016Speech Rhythm in Parkinson's Disease: A Study on Italian.
Massimo Pettorino, Maria Grazia Busà, Elisa Pellegrino
2016Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence.
Bidisha Sharma, S. R. Mahadeva Prasanna
2016Speech Ventures.
Nicolas Scheffer, Korbinian Riedhammer, Alexandre Lebrun, David Suendermann-Oeft
2016Speech-Based Detection of Alzheimer's Disease in Conversational German.
Jochen Weiner, Christian Herff, Tanja Schultz
2016Speed Perturbation and Vowel Duration Modeling for ASR in Hausa and Wolof Languages.
Elodie Gauthier, Laurent Besacier, Sylvie Voisin
2016Spoken Language Understanding in a Latent Topic-Based Subspace.
Mohamed Morchid, Mohamed Bouaziz, Waad Ben Kheder, Killian Janod, Pierre-Michel Bousquet, Richard Dufour, Georges Linarès
2016Stacked Long-Term TDNN for Spoken Language Recognition.
Daniel Garcia-Romero, Alan McCree
2016State-of-the-Art MRI Protocol for Comprehensive Assessment of Vocal Tract Structure and Function.
Sajan Goud Lingala, Asterios Toutios, Johannes Töger, Yongwan Lim, Yinghua Zhu, Yoon-Chul Kim, Colin Vaz, Shrikanth S. Narayanan, Krishna S. Nayak
2016Statistical Modeling of Speaker's Voice with Temporal Co-Location for Active Voice Authentication.
Zhong Meng, Biing-Hwang Juang
2016Stimulated Deep Neural Network for Speech Recognition.
Chunyang Wu, Penny Karanasou, Mark J. F. Gales, Khe Chai Sim
2016Subspace Detection of DNN Posterior Probabilities via Sparse Representation for Query by Example Spoken Term Detection.
Dhananjay Ram, Afsaneh Asaei, Hervé Bourlard
2016Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models.
Lahiru Samarakoon, Khe Chai Sim
2016Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering.
Michael Heck, Sakriani Sakti, Satoshi Nakamura
2016Supplementary Motor Area Activation in Disfluency Perception: An fMRI Study of Listener Neural Responses to Spontaneously Produced Unfilled and Filled Pauses.
Robert Eklund, Martin Ingvar
2016Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis.
Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi
2016Synthesis of Device-Independent Noise Corpora for Realistic ASR Evaluation.
Hannes Gamper, Mark R. P. Thomas, Lyle Corbin, Ivan Tashev
2016THU-EE System Description for NIST LRE 2015.
Liang He, Yao Tian, Yi Liu, Jiaming Xu, Weiwei Liu, Cai Meng, Jia Liu
2016TUSK: A Framework for Overviewing the Performance of F0 Estimators.
Masanori Morise, Hideki Kawahara
2016Talking to a System and Talking to a Human: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task.
Hayakawa Akira, Saturnino Luz, Nick Campbell
2016Talking with Kids Really Matters: Early Language Experience Shapes Later Life Chances.
Anne Fernald
2016Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus.
Md. Jahangir Alam, Patrick Kenny, Vishwa Gupta
2016Target-Based State and Tracking Algorithm for Spoken Dialogue System.
Miao Li, Zhiyang He, Ji Wu
2016Teaming Up: Making the Most of Diverse Representations for a Novel Personalized Speech Retrieval Application.
Stephanie Pancoast, Murat Akbacak
2016Temporal Envelopes in Sine-Wave Speech Recognition.
Li Xu
2016Text Dependent Speaker Verification Using Un-Supervised HMM-UBM and Temporal GMM-UBM.
Achintya Kumar Sarkar, Zheng-Hua Tan
2016Text-Available Speaker Recognition System for Forensic Applications.
Chengzhu Yu, Chunlei Zhang, Finnian Kelly, Abhijeet Sangwan, John H. L. Hansen
2016Text-Dependent Audiovisual Synchrony Detection for Spoofing Detection in Mobile Person Recognition.
Amit Aides, Hagai Aronowitz
2016Text-to-Speech for Individuals with Vision Loss: A User Study.
Monika Podsiadlo, Shweta Chahar
2016The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS.
Kong-Aik Lee, Haizhou Li, Li Deng, Ville Hautamäki, Wei Rao, Xiong Xiao, Anthony Larcher, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Jianshu Chen, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Chenglin Xu, Haihua Xu, Bin Ma, Eng Siong Chng, Sylvain Meignier
2016The 2016 Speakers in the Wild Speaker Recognition Evaluation.
Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson
2016The Acoustic Manifestation of Prominence in Stressless Languages.
Angeliki Athanasopoulou, Irene Vogel
2016The Acoustics of Lexical Stress in Italian as a Function of Stress Level and Speaking Style.
Anders Eriksson, Pier Marco Bertinetto, Mattias Heldner, Rosalba Nodari, Giovanna Lenoci
2016The Berkeley Phonetics Machine.
Ronald L. Sprouse, Keith Johnson
2016The Consistency and Stability of Acoustic and Visual Cues for Different Prosodic Attitudes.
Jeesun Kim, Chris Davis
2016The Deception Sub-Challenge: The Data.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016The Discourse Marker "so" in Turn-Taking and Turn-Releasing Behavior.
Emma Rennie, Rebecca Lunsford, Peter A. Heeman
2016The Effect of Background Noise on the Activation of Phonological and Semantic Information During Spoken-Word Recognition.
Florian Hintz, Odette Scharenborg
2016The Effect of Postlexical Deletion on Automatic Speech Recognition in Fast Spontaneously Spoken Zulu.
Ewald van der Westhuizen, Thomas Niesler
2016The Effect of Sentence Accent on Non-Native Speech Perception in Noise.
Odette Scharenborg, Elea Kolkman, Sofoklis Kakouros, Brechtje Post
2016The Effects of Modified Speech Styles on Intelligibility for Non-Native Listeners.
Martin Cooke, María Luisa García Lecumberri
2016The Effects of Prosody on French V-to-V Coarticulation: A Corpus-Based Study.
Giuseppina Turco, Cécile Fougeron, Nicolas Audibert
2016The Human Speech Cortex.
Edward Chang
2016The IBM 2016 English Conversational Telephone Speech Recognition System.
George Saon, Tom Sercu, Steven J. Rennie, Hong-Kwang Jeff Kuo
2016The IBM Speaker Recognition System: Recent Advances and Error Analysis.
Seyed Omid Sadjadi, Jason W. Pelecanos, Sriram Ganapathy
2016The INTERSPEECH 2016 Computational Paralinguistics Challenge: A Summary of Results.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron C. Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016The Impact of Manner of Articulation on the Intelligibility of Voicing Contrast in Noise: Cross-Linguistic Implications.
Mayuki Matsui
2016The Influence of Language Experience on the Categorical Perception of Vowels: Evidence from Mandarin and Korean.
Hao Zhang, Fei Chen, Nan Yan, Lan Wang, Feng Shi, Manwa L. Ng
2016The Influence of Modality and Speaking Style on the Assimilation Type and Categorization Consistency of Non-Native Speech.
Sarah E. Fenwick, Catherine T. Best, Chris Davis, Michael D. Tyler
2016The Magic Stone: A Video Game to Improve Communication Skills of People with Intellectual Disabilities.
Mario Corrales-Astorgano, David Escudero Mancebo, César González Ferreras, Yurena Gutiérrez-González, Valle Flores-Lucas, Valentín Cardeñoso-Payo, Lourdes Aguilar-Cuevas
2016The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016.
Kazuhiro Kobayashi, Shinnosuke Takamichi, Satoshi Nakamura, Tomoki Toda
2016The Native Language Sub-Challenge: The Data.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis.
Zhengqi Wen, Ya Li, Jianhua Tao
2016The Perception of Overlapping Speech: Effects of Speaker Prosody and Listener Attitudes.
Katherine Hilton
2016The Perceptual Effect of L1 Prosody Transplantation on L2 Speech: The Case of French Accented German.
Jeanin Jügler, Frank Zimmerer, Jürgen Trouvain, Bernd Möbius
2016The Production of Intervocalic Glides in Non Dysarthric Parkinsonian Speech.
Véronique Delvaux, Virginie Roland, Kathy Huet, Myriam Piccaluga, Marie-Claire Haelewyck, Bernard Harmegnies
2016The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception.
Wei Lai, Jiahong Yuan, Ya Li, Xiaoying Xu, Mark Y. Liberman
2016The Role of Pitch in Punjabi Word Identification.
Jasmeen Kanwal, Amanda Ritchart
2016The Role of Spectral Resolution in Foreign-Accented Speech Perception.
Michelle R. Kapolowicz, Vahid Montazeri, Peter F. Assmann
2016The SIWIS Database: A Multilingual Speech Database with Acted Emphasis.
Jean-Philippe Goldman, Pierre-Edouard Honnet, Robert A. J. Clark, Philip N. Garner, Maria Ivanova, Alexandros Lazaridis, Hui Liang, Tiago Macedo, Beat Pfister, Manuel Sam Ribeiro, Eric Wehrli, Junichi Yamagishi
2016The SRI CLEO Speaker-State Corpus.
Andreas Kathol, Elizabeth Shriberg, Massimiliano de Zambotti
2016The SRI Speech-Based Collaborative Learning Corpus.
Colleen Richey, Cynthia M. D'Angelo, Nonye Alozie, Harry Bratt, Elizabeth Shriberg
2016The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation.
Martin Graciarena, Luciana Ferrer, Vikramjit Mitra
2016The Sheffield Wargame Corpus - Day Two and Day Three.
Yulan Liu, Charles Fox, Madina Hasan, Thomas Hain
2016The Sincerity Sub-Challenge: The Data.
Björn W. Schuller, Stefan Steidl, Anton Batliner, Julia Hirschberg, Judee K. Burgoon, Alice Baird, Aaron Elkins, Yue Zhang, Eduardo Coutinho, Keelan Evanini
2016The Sound of Disgust: How Facial Expression May Influence Speech Production.
Chee Seng Chong, Jeesun Kim, Chris Davis
2016The Speakers in the Wild (SITW) Speaker Recognition Database.
Mitchell McLaren, Luciana Ferrer, Diego Castán, Aaron Lawson
2016The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion.
Ling-Hui Chen, Li-Juan Liu, Zhen-Hua Ling, Yuan Jiang, Li-Rong Dai
2016The Unit of Speech Encoding: The Case of Romanian.
Irene Vogel, Laura Spinu
2016The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms.
Víctor Poblete, Juan Pablo Escudero, Josué Fredes, José Novoa, Richard M. Stern, Simon King, Néstor Becerra Yoma
2016The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions.
Emma Jokinen, Ulpu Remes, Paavo Alku
2016The Voice Conversion Challenge 2016.
Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, Junichi Yamagishi
2016TheanoLM - An Extensible Toolkit for Neural Network Language Modeling.
Seppo Enarvi, Mikko Kurimo
2016Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking.
Dhananjaya N. Gowda, Paavo Alku
2016Today's Most Frequently Used F0 Estimation Methods, and Their Accuracy in Estimating Male and Female Pitch in Clean Speech.
Sofia Strömbergsson
2016Tone Classification in Mandarin Chinese Using Convolutional Neural Networks.
Charles Chen, Razvan C. Bunescu, Li Xu, Chang Liu
2016Toward Development and Evaluation of Pain Level-Rating Scale for Emergency Triage based on Vocal Characteristics and Facial Expressions.
Fu-Sheng Tsai, Ya-Ling Hsu, Wei-Chen Chen, Yi-Ming Weng, Chip-Jin Ng, Chi-Chun Lee
2016Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis.
Cheung-Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng, Haizhou Li
2016Towards Automatic Detection of Amyotrophic Lateral Sclerosis from Speech Acoustic and Articulatory Samples.
Jun Wang, Prasanna V. Kothalkar, Beiming Cao, Daragh Heitzman
2016Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances.
Catharine Oertel, Joakim Gustafson, Alan W. Black
2016Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks.
Ying Zhang, Mohammad Pezeshki, Philémon Brakel, Saizheng Zhang, César Laurent, Yoshua Bengio, Aaron C. Courville
2016Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine.
Bo-Hsiang Tseng, Sheng-syun Shen, Hung-yi Lee, Lin-Shan Lee
2016Towards Minimally Invasive Velar State Detection in Normal and Silent Speech.
Peter Birkholz, Petko Bakardjiev, Steffen Kürbis, Rico Petrick
2016Towards Online-Recognition with Deep Bidirectional LSTM Acoustic Models.
Albert Zeyer, Ralf Schlüter, Hermann Ney
2016Towards Smart-Cars That Can Listen: Abnormal Acoustic Event Detection on the Road.
Mahesh Kumar Nandwana, Taufiq Hasan
2016Towards an Automated Screening Tool for Developmental Speech and Language Impairments.
Jen J. Gong, Maryann Gong, Dina Levy-Lambert, Jordan R. Green, Tiffany P. Hogan, John V. Guttag
2016Tracking Contours of Orofacial Articulators from Real-Time MRI of Speech.
Mathieu Labrunie, Pierre Badin, Dirk Voit, Arun A. Joseph, Laurent Lamalle, Coriandre Vilain, Louis-Jean Boë, Jens Frahm
2016Transfer Learning for Speaker Verification on Short Utterances.
Qingyang Hong, Lin Li, Lihong Wan, Jun Zhang, Feng Tong
2016Transfer Learning with Bottleneck Feature Networks for Whispered Speech Recognition.
Boon Pang Lim, Faith Wong, Yuyao Li, Jia Wei Bay
2016Transferring Emphasis in Speech Translation Using Hard-Attentional Neural Network Models.
Quoc Truong Do, Sakriani Sakti, Graham Neubig, Satoshi Nakamura
2016Triphone State-Tying via Deep Canonical Correlation Analysis.
Weiran Wang, Hao Tang, Karen Livescu
2016Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification.
Jianbo Ma, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Kong-Aik Lee
2016Two-Pass IB Based Speaker Diarization System Using Meeting-Specific ANN Based Features.
Nauman Dawalatabad, Srikanth R. Madikeri, C. Chandra Sekhar, Hema A. Murthy
2016Two-Stage Data Augmentation for Low-Resourced Speech Recognition.
William Hartmann, Tim Ng, Roger Hsiao, Stavros Tsakalidis, Richard M. Schwartz
2016Two-Stage Temporal Processing for Single-Channel Speech Enhancement.
Suman Samui, Indrajit Chakrabarti, Soumya Kanti Ghosh
2016Uncontrolled Manifolds in Vowel Production: Assessment with a Biomechanical Model of the Tongue.
Andrew Szabados, Pascal Perrier
2016Understanding Periodically Interrupted Mandarin Speech.
Jing Liu, Rosanna H. N. Tong, Fei Chen
2016Undoing Misperceptions: A Microscopic Analysis of Consistent Confusions Through Signal Modifications.
Máté Attila Tóth, Martin Cooke
2016Unipolar Depression vs. Bipolar Disorder: An Elicitation-Based Approach to Short-Term Detection of Mood Disorder.
Kun-Yi Huang, Chung-Hsien Wu, Yu-Ting Kuo, Fong-Lin Jang
2016Unit-Selection Attack Detection Based on Unfiltered Frequency-Domain Features.
Ulrich Scherhag, Andreas Nautsch, Christian Rathgeb, Christoph Busch
2016Universal Background Sparse Coding and Multilayer Bootstrap Network for Speaker Clustering.
Xiao-Lei Zhang
2016Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC.
Yimeng Zhuang, Xuankai Chang, Yanmin Qian, Kai Yu
2016Unsupervised Adaptation of Recurrent Neural Network Language Models.
Siva Reddy Gangireddy, Pawel Swietojanski, Peter Bell, Steve Renals
2016Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection.
Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li
2016Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs for Speech Recognition.
Hardik B. Sailor, Hemant A. Patil
2016Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition.
Satoshi Tsujioka, Sakriani Sakti, Koichiro Yoshino, Graham Neubig, Satoshi Nakamura
2016Unsupervised Learning of Acoustic Units Using Autoencoders and Kohonen Nets.
Vikramjit Mitra, Dimitra Vergyri, Horacio Franco
2016Unsupervised Phoneme Segmentation of Previously Unseen Languages.
Marco Vetter, Markus Müller, Fatima Hamlaoui, Graham Neubig, Satoshi Nakamura, Sebastian Stüker, Alex Waibel
2016Unsupervised Stress Information Labeling Using Gaussian Process Latent Variable Model for Statistical Speech Synthesis.
Decha Moungsri, Tomoki Koriyama, Takao Kobayashi
2016Use of Agreement/Disagreement Classification in Dyadic Interactions for Continuous Emotion Recognition.
Hossein Khaki, Engin Erzin
2016Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition.
Erfan Loweimi, Jon Barker, Thomas Hain
2016Use of Vowels in Discriminating Speech-Laugh from Laughter and Neutral Speech.
Sri Harsha Dumpala, P. Gangamohan, Suryakanth V. Gangashetty, B. Yegnanarayana
2016Using Clinician Annotations to Improve Automatic Speech Recognition of Stuttered Speech.
Peter A. Heeman, Rebecca Lunsford, Andy McMillin, J. Scott Yaruss
2016Using Past Speaker Behavior to Better Predict Turn Transitions.
Tomer Meshorer, Peter A. Heeman
2016Using Phonologically Weighted Levenshtein Distances for the Prediction of Microscopic Intelligibility.
Lionel Fontan, Isabelle Ferrané, Jérôme Farinas, Julien Pinquier, Xavier Aumont
2016Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Lauri Juvela, Xin Wang, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi, Paavo Alku
2016Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure.
Jinfu Ni, Yoshinori Shiga, Hisashi Kawai
2016Using a Biomechanical Model and Articulatory Data for the Numerical Production of Vowels.
Saeed Dabbaghchian, Marc Arnela, Olov Engwall, Oriol Guasch, Ian Stavness, Pierre Badin
2016Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus.
Tomi Kinnunen, Md. Sahidullah, Ivan Kukanov, Héctor Delgado, Massimiliano Todisco, Achintya Kumar Sarkar, Nicolai Bæk Thomsen, Ville Hautamäki, Nicholas W. D. Evans, Zheng-Hua Tan
2016Variation in Spoken North Sami Language.
Kristiina Jokinen, Trung Ngo Trong, Ville Hautamäki
2016Velum Control for Oral Sounds.
Reed Blaylock, Louis Goldstein, Shrikanth S. Narayanan
2016Virtual Adversarial Training Applied to Neural Higher-Order Factors for Phone Classification.
Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf
2016Virtual Machines and Containers as a Platform for Experimentation.
Florian Metze, Eric Riebling, Anne S. Warlaumont, Elika Bergelson
2016Visual Speech Synthesis Using Dynamic Visemes, Contextual Features and DNNs.
Ausdang Thangthai, Ben Milner, Sarah Taylor
2016Vocal Effort Modification for Singing Synthesis.
Olivier Perrotin, Christophe d'Alessandro
2016Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion.
Ganesh Sivaraman, Vikramjit Mitra, Hosung Nam, Mark K. Tiede, Carol Y. Espy-Wilson
2016Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features.
Yi Yang, Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu
2016Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance.
Naoki Hosaka, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
2016Voice Quality Control Using Perceptual Expressions for Statistical Parametric Speech Synthesis Based on Cluster Adaptive Training.
Yamato Ohtani, Koichiro Mori, Masahiro Morita
2016Voice-Quality Difference Between the Vowels in Filled Pauses and Ordinary Lexical Items.
Kikuo Maekawa, Hiroki Mori
2016Voting Detector: A Combination of Anomaly Detectors to Reveal Annotation Errors in TTS Corpora.
Jindrich Matousek, Daniel Tihelka
2016Vowel Characteristics in the Assessment of L2 English Pronunciation.
Calbert Graham, Paula Buttery, Francis Nolan
2016Vowel Fundamental and Formant Frequency Contributions to English and Mandarin Sentence Intelligibility.
Daniel Fogerty, Fei Chen
2016Vowels and Diphthongs in Cangnan Southern Min Chinese Dialect.
Fang Hu, Chunyu Ge
2016Vowels and Diphthongs in the Taiyuan Jin Chinese Dialect.
Liping Xia, Fang Hu
2016Waveform Generation Based on Signal Reshaping for Statistical Parametric Speech Synthesis.
Felipe Espic, Cassia Valentini-Botinhao, Zhizheng Wu, Simon King
2016Web Data Selection Based on Word Embedding for Low-Resource Speech Recognition.
Chuandong Xie, Wu Guo, Guoping Hu, Junhua Liu
2016Who Do You Think Will Speak Next? Perception of Turn-Taking Cues in Slovak and Argentine Spanish.
Agustín Gravano, Pablo Brusco, Stefan Benus
2016Why do ASR Systems Despite Neural Nets Still Depend on Robust Features.
Angel Mario Castro Martinez, Marc René Schädler
2016Within-Speaker Features for Native Language Recognition in the Interspeech 2016 Computational Paralinguistics Challenge.
Mark A. Huckvale
2016Word-Phrase-Entity Recurrent Neural Networks for Language Modeling.
Michael Levit, Sarangarajan Parthasarathy, Shuangyu Chang
2016YIN-Bird: Improved Pitch Tracking for Bird Vocalisations.
Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte
2016Zara: An Empathetic Interactive Virtual Agent.
Pascale Fung, Anik Dey, Farhad Bin Siddique, Ruixi Lin, Yang Yang, Yan Wan, Ricky Ho Yin Chan
2016i-Vector/HMM Based Text-Dependent Speaker Verification System for RedDots Challenge.
Hossein Zeinali, Hossein Sameti, Lukás Burget, Jan Cernocký, Nooshin Maghsoodi, Pavel Matejka
2016webASR 2 - Improved Cloud Based Speech Technology.
Thomas Hain, Jeremy Christian, Oscar Saz, Salil Deena, Madina Hasan, Raymond W. M. Ng, Rosanna Milner, Mortaza Doulaty, Yulan Liu