| 2019 | "Gra[f] e!" Word-Final Devoicing of Obstruents in Standard French: An Acoustic Study Based on Large Corpora. Adèle Jatteau, Ioana Vasilescu, Lori Lamel, Martine Adda-Decker, Nicolas Audibert |
| 2019 | "Computer, Test My Hearing": Accurate Speech Audiometry with Smart Speakers. Jasper Ooster, Pia Nancy Porysek Moreta, Jörg-Hendrik Bach, Inga Holube, Bernd T. Meyer |
| 2019 | 20th Annual Conference of the International Speech Communication Association, Interspeech 2019, Graz, Austria, September 15-19, 2019. Gernot Kubin, Zdravko Kacic |
| 2019 | A Chinese Dataset for Identifying Speakers in Novels. Jia-Xiang Chen, Zhen-Hua Ling, Li-Rong Dai |
| 2019 | A Combination of Model-Based and Feature-Based Strategy for Speech-to-Singing Alignment. Bidisha Sharma, Haizhou Li |
| 2019 | A Comparison of Deep Learning Methods for Language Understanding. Mandy Korpusik, Zoe Liu, James R. Glass |
| 2019 | A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation. Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu |
| 2019 | A Computational Model of Early Language Acquisition from Audiovisual Experiences of Young Infants. Okko Räsänen, Khazar Khorrami |
| 2019 | A Convolutional Neural Network with Non-Local Module for Speech Enhancement. Xiaoqi Li, Yaxing Li, Meng Li, Shan Xu, Yuanjie Dong, Xinrong Sun, Shengwu Xiong |
| 2019 | A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition. Li Chai, Jun Du, Chin-Hui Lee |
| 2019 | A Deep Learning Approach to Automatic Characterisation of Rhythm in Non-Native English Speech. Konstantinos Kyriakopoulos, Kate M. Knill, Mark J. F. Gales |
| 2019 | A Deep Neural Network for Short-Segment Speaker Recognition. Amirhossein Hajavi, Ali Etemad |
| 2019 | A Deep Residual Network for Large-Scale Acoustic Scene Analysis. Logan Ford, Hao Tang, François Grondin, James R. Glass |
| 2019 | A Frequency Normalization Technique for Kindergarten Speech Recognition Inspired by the Role of f_o in Vowel Perception. Gary Yeung, Abeer Alwan |
| 2019 | A Hierarchical Attention Network-Based Approach for Depression Detection from Transcribed Clinical Interviews. Adria Mallol-Ragolta, Ziping Zhao, Lukas Stappen, Nicholas Cummins, Björn W. Schuller |
| 2019 | A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition. Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny |
| 2019 | A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models. Xue Bai, Jun Du, Zi-Rui Wang, Chin-Hui Lee |
| 2019 | A Joint End-to-End and DNN-HMM Hybrid Automatic Speech Recognition System with Transferring Sharable Knowledge. Tomohiro Tanaka, Ryo Masumura, Takafumi Moriya, Takanobu Oba, Yushi Aono |
| 2019 | A Light Convolutional GRU-RNN Deep Feature Extractor for ASV Spoofing Detection. Alejandro Gómez Alanís, Antonio M. Peinado, José A. González, Angel M. Gomez |
| 2019 | A Machine Learning Based Clustering Protocol for Determining Hearing Aid Initial Configurations from Pure-Tone Audiograms. Chelzy Belitz, Hussnain Ali, John H. L. Hansen |
| 2019 | A Mandarin Prosodic Boundary Prediction Model Based on Multi-Task Learning. Huashan Pan, Xiulin Li, Zhiqiang Huang |
| 2019 | A Modified Algorithm for Multiple Input Spectrogram Inversion. DongXiao Wang, Hirokazu Kameoka, Koichi Shinoda |
| 2019 | A Multi-Accent Acoustic Model Using Mixture of Experts for Speech Recognition. Abhinav Jain, Vishwanath P. Singh, Shakti P. Rath |
| 2019 | A Multi-Speaker Emotion Morphing Model Using Highway Networks and Maximum Likelihood Objective. Ravi Shankar, Jacob Sager, Archana Venkataraman |
| 2019 | A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research. Ioannis K. Douros, Jacques Felblinger, Jens Frahm, Karyna Isaieva, Arun A. Joseph, Yves Laprie, Freddy Odille, Anastasiia Tsukanova, Dirk Voit, Pierre-André Vuissoz |
| 2019 | A Neural Turn-Taking Model without RNN. Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro |
| 2019 | A New Approach for Automating Analysis of Responses on Verbal Fluency Tests from Subjects At-Risk for Schizophrenia. Mary Pietrowicz, Carla Agurto, Raquel Norel, Elif Eyigöz, Guillermo A. Cecchi, Zarina R. Bilgrami, Cheryl Corcoran |
| 2019 | A New GAN-Based End-to-End TTS Training Algorithm. Haohan Guo, Frank K. Soong, Lei He, Lei Xie |
| 2019 | A New Time-Frequency Attention Mechanism for TDNN and CNN-LSTM-TDNN, with Application to Language Identification. Xiaoxiao Miao, Ian McLoughlin, Yonghong Yan |
| 2019 | A Non-Causal FFTNet Architecture for Speech Enhancement. P. V. Muhammed Shifas, Nagaraj Adiga, Vassilis Tsiaras, Yannis Stylianou |
| 2019 | A Novel Method to Correct Steering Vectors in MVDR Beamformer for Noise Robust ASR. Suliang Bu, Yunxin Zhao, Mei-Yuh Hwang |
| 2019 | A Path Signature Approach for Speech Emotion Recognition. Bo Wang, Maria Liakata, Hao Ni, Terry J. Lyons, Alejo J. Nevado-Holgado, Kate Saunders |
| 2019 | A Perceptual Study of CV Syllables in Both Spoken and Whistled Speech: A Tashlhiyt Berber Perspective. Julien Meyer, Laure Dentel, Silvain Gerber, Rachid Ridouane |
| 2019 | A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion. Abdolreza Sabzi Shahrebabaki, Negar Olfati, Ali Shariq Imran, Sabato Marco Siniscalchi, Torbjørn Svendsen |
| 2019 | A Preliminary Study of Charismatic Speech on YouTube: Correlating Prosodic Variation with Counts of Subscribers, Views and Likes. Stephanie Berger, Oliver Niebuhr, Margaret Zellers |
| 2019 | A Real-Time Wideband Neural Vocoder at 1.6kb/s Using LPCNet. Jean-Marc Valin, Jan Skoglund |
| 2019 | A Robust Framework for Acoustic Scene Classification. Lam Dang Pham, Ian McLoughlin, Huy Phan, Ramaswamy Palaniappan |
| 2019 | A Saliency-Based Attention LSTM Model for Cognitive Load Classification from Speech. Ascensión Gallardo-Antolín, Juan Manuel Montero |
| 2019 | A Scalable Noisy Speech Dataset and Online Subjective Test Framework. Chandan K. A. Reddy, Ebrahim Beyrami, Jamie Pool, Ross Cutler, Sriram Srinivasan, Johannes Gehrke |
| 2019 | A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data. Xiaohai Tian, Eng Siong Chng, Haizhou Li |
| 2019 | A Statistically Principled and Computationally Efficient Approach to Speech Enhancement Using Variational Autoencoders. Manuel Pariente, Antoine Deleforge, Emmanuel Vincent |
| 2019 | A Storyteller's Tale: Literature Audiobooks Genre Classification Using CNN and RNN Architectures. Nehory Carmi, Azaria Cohen, Mireille Avigal, Anat Lerner |
| 2019 | A Strategy for Improved Phone-Level Lyrics-to-Audio Alignment for Speech-to-Singing Synthesis. David Ayllón, Fernando Villavicencio, Pierre Lanchantin |
| 2019 | A Study for Improving Device-Directed Speech Detection Toward Frictionless Human-Machine Interaction. Che-Wei Huang, Roland Maas, Sri Harish Mallidi, Björn Hoffmeister |
| 2019 | A Study of Soprano Singing in Light of the Source-Filter Interaction. Tokihiko Kaburagi |
| 2019 | A Study of a Cross-Language Perception Based on Cortical Analysis Using Biomimetic STRFs. Sangwook Park, David K. Han, Mounya Elhilali |
| 2019 | A Study of x-Vector Based Speaker Recognition on Short Utterances. Ahilan Kanagasundaram, Sridha Sridharan, Sriram Ganapathy, Prachi Singh, Clinton Fookes |
| 2019 | A System for Real-Time Privacy Preserving Data Collection for Ambient Assisted Living. Fasih Haider, Saturnino Luz |
| 2019 | A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting. Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Chenghao Zhao, Cunhang Fan |
| 2019 | A Unified Bayesian Source Modelling for Determined Blind Source Separation. Chaitanya Narisetty |
| 2019 | A Unified Framework for Speaker and Utterance Verification. Tianchi Liu, Maulik C. Madhavi, Rohan Kumar Das, Haizhou Li |
| 2019 | A User-Friendly and Adaptable Re-Implementation of an Acoustic Prominence Detection and Annotation Tool. Jana Voße, Petra Wagner |
| 2019 | ASR Inspired Syllable Stress Detection for Pronunciation Evaluation Without Using a Supervised Classifier and Syllable Level Features. Manoj Kumar Ramanathi, Chiranjeevi Yarra, Prasanta Kumar Ghosh |
| 2019 | ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual Networks. Cheng-I Lai, Nanxin Chen, Jesús Villalba, Najim Dehak |
| 2019 | ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection. Massimiliano Todisco, Xin Wang, Ville Vestman, Md. Sahidullah, Héctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas W. D. Evans, Tomi H. Kinnunen, Kong Aik Lee |
| 2019 | Acoustic Characteristics of Lexical Tone Disruption in Mandarin Speakers After Brain Damage. Wenjun Chen, Jeroen van de Weijer, Shuangshuang Zhu, Qian Qian, Manna Wang |
| 2019 | Acoustic Correlates of Phonation Type in Chichimec. Anneliese Kelterer, Barbara Schuppler |
| 2019 | Acoustic Cues to Topic and Narrow Focus in Egyptian Arabic. Dina El Zarka, Barbara Schuppler, Francesco Cangemi |
| 2019 | Acoustic Indicators of Deception in Mandarin Daily Conversations Recorded from an Interactive Game. Chih-Hsiang Huang, Huang-Cheng Chou, Yi-Tong Wu, Chi-Chun Lee, Yi-Wen Liu |
| 2019 | Acoustic Model Bootstrapping Using Semi-Supervised Learning. Langzhou Chen, Volker Leutnant |
| 2019 | Acoustic Model Ensembling Using Effective Data Augmentation for CHiME-5 Challenge. Feng Ma, Li Chai, Jun Du, Diyuan Liu, Zhongfu Ye, Chin-Hui Lee |
| 2019 | Acoustic Model Optimization Based on Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition. Xiaodong Cui, Michael Picheny |
| 2019 | Acoustic Modeling for Automatic Lyrics-to-Audio Alignment. Chitralekha Gupta, Emre Yilmaz, Haizhou Li |
| 2019 | Acoustic Scene Classification Using Teacher-Student Learning with Soft-Labels. Hee-Soo Heo, Jee-weon Jung, Hye-jin Shim, Ha-Jin Yu |
| 2019 | Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events. Hongwei Song, Jiqing Han, Shiwen Deng, Zhihao Du |
| 2019 | Acoustic Scene Classification with Mismatched Devices Using CliqueNets and Mixup Data Augmentation. Truc Nguyen, Franz Pernkopf |
| 2019 | Acoustic and Articulatory Feature Based Speech Rate Estimation Using a Convolutional Dense Neural Network. Renuka Mannem, Jhansi Mallela, Aravind Illa, Prasanta Kumar Ghosh |
| 2019 | Acoustic and Articulatory Study of Ewe Vowels: A Comparative Study of Male and Female. Kowovi Comivi Alowonou, Jianguo Wei, Wenhuan Lu, Zhicheng Liu, Kiyoshi Honda, Jianwu Dang |
| 2019 | Acoustic-to-Phrase Models for Speech Recognition. Yashesh Gaur, Jinyu Li, Zhong Meng, Yifan Gong |
| 2019 | Active Annotation: Bootstrapping Annotation Lexicon and Guidelines for Supervised NLU Learning. Federico Marinelli, Alessandra Cervone, Giuliano Tortoreto, Evgeny A. Stepanov, Giuseppe Di Fabbrizio, Giuseppe Riccardi |
| 2019 | Active Learning Methods for Low Resource End-to-End Speech Recognition. Karan Malhotra, Shubham Bansal, Sriram Ganapathy |
| 2019 | Active Learning for Domain Classification in a Commercial Spoken Personal Assistant. Xi C. Chen, Adithya Sagar, Justine T. Kao, Tony Y. Li, Christopher Klein, Stephen Pulman, Ashish Garg, Jason D. Williams |
| 2019 | Adapting Transformer to End-to-End Spoken Language Translation. Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi |
| 2019 | Adapting a FrameNet Semantic Parser for Spoken Language Understanding Using Adversarial Learning. Gabriel Marzinotto, Géraldine Damnati, Frédéric Béchet |
| 2019 | Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-Speech Synthesizer. Azam Rabiee, Tae-Ho Kim, Soo-Young Lee |
| 2019 | Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network. Fei Wu, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur |
| 2019 | Advancing Sequence-to-Sequence Based Speech Recognition. Zoltán Tüske, Kartik Audhkhasi, George Saon |
| 2019 | Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization. Shreya Khare, Rahul Aralikatte, Senthil Mani |
| 2019 | Adversarial Optimization for Dictionary Attacks on Speaker Verification. Mirko Marras, Pawel Korus, Nasir D. Memon, Gianni Fenu |
| 2019 | Adversarial Regularization for End-to-End Robust Speaker Verification. Qing Wang, Pengcheng Guo, Sining Sun, Lei Xie, John H. L. Hansen |
| 2019 | Adversarially Trained End-to-End Korean Singing Voice Synthesis System. Juheon Lee, Hyeong-Seok Choi, Chang-Bin Jeon, Junghyun Koo, Kyogu Lee |
| 2019 | Aerodynamics and Lumped-Masses Combined with Delay Lines for Modeling Vertical and Anterior-Posterior Phase Differences in Pathological Vocal Fold Vibration. Carlo Drioli, Philipp Aichinger |
| 2019 | Age-Related Changes in European Portuguese Vowel Acoustics. Luciana Albuquerque, Catarina Oliveira, António J. S. Teixeira, Pedro Sá-Couto, Daniela Figueiredo |
| 2019 | All Together Now: The Living Audio Dataset. David A. Braude, Matthew P. Aylett, Caoimhín Laoide-Kemp, Simone Ashby, Kristen M. Scott, Brian Ó Raghallaigh, Anna Braudo, Alex Brouwer, Adriana Stan |
| 2019 | An Acoustic Study of Vowel Undershoot in a System with Several Degrees of Prominence. Janina Molczanow, Beata Lukaszewicz, Anna Lukaszewicz |
| 2019 | An Acoustic and Lexical Analysis of Emotional Valence in Spontaneous Speech: Autobiographical Memory Recall in Older Adults. Deniece S. Nazareth, Ellen Tournier, Sarah Leimkötter, Esther Janse, Dirk Heylen, Gerben J. Westerhof, Khiet P. Truong |
| 2019 | An Adaptive-Q Cochlear Model for Replay Spoofing Detection. Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps, Haizhou Li |
| 2019 | An Analysis of Local Monotonic Attention Variants. André Merboldt, Albert Zeyer, Ralf Schlüter, Hermann Ney |
| 2019 | An Approach to Online Speaker Change Point Detection Using DNNs and WFSTs. Lukás Mateju, Petr Cerva, Jindrich Zdánský |
| 2019 | An Articulatory-Acoustic Investigation into GOOSE-Fronting in German-English Bilinguals Residing in London, UK. Scott Lewis, Adib Mehrabi, Esther de Leeuw |
| 2019 | An Attention-Based Hybrid Network for Automatic Detection of Alzheimer's Disease from Narrative Speech. Jun Chen, Ji Zhu, Jieping Ye |
| 2019 | An Effective Deep Embedding Learning Architecture for Speaker Verification. Yiheng Jiang, Yan Song, Ian McLoughlin, Zhifu Gao, Li-Rong Dai |
| 2019 | An Empirical Evaluation of DTW Subsampling Methods for Keyword Search. Bolaji Yusuf, Murat Saraclar |
| 2019 | An End-to-End Audio Classification System Based on Raw Waveforms and Mix-Training Strategy. Jiaxu Chen, Jing Hao, Kai Chen, Di Xie, Shicai Yang, Shiliang Pu |
| 2019 | An End-to-End Text-Independent Speaker Verification Framework with a Keyword Adversarial Network. Sungrack Yun, Janghoon Cho, Jungyun Eum, Wonil Chang, Kyuwoong Hwang |
| 2019 | An Extended Two-Dimensional Vocal Tract Model for Fast Acoustic Simulation of Single-Axis Symmetric Three-Dimensional Tubes. Debasish Ray Mohapatra, Victor Zappi, Sidney S. Fels |
| 2019 | An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities. Sweekar Sudhakara, Manoj Kumar Ramanathi, Chiranjeevi Yarra, Prasanta Kumar Ghosh |
| 2019 | An Incremental Turn-Taking Model for Task-Oriented Dialog Systems. Andrei Catalin Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi |
| 2019 | An Investigation into On-Device Personalization of End-to-End Automatic Speech Recognition Models. Khe Chai Sim, Petr Zadrazil, Françoise Beaufays |
| 2019 | An Investigation of Therapeutic Rapport Through Prosody in Brief Psychodynamic Psychotherapy. Carolina De Pasquale, Charlie Cullen, Brian Vaughan |
| 2019 | An Investigation on Speaker Specific Articulatory Synthesis with Speaker Independent Articulatory Inversion. Aravind Illa, Prasanta Kumar Ghosh |
| 2019 | An Online Attention-Based Model for Speech Recognition. Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu |
| 2019 | An Unsupervised Autoregressive Model for Speech Representation Learning. Yu-An Chung, Wei-Ning Hsu, Hao Tang, James R. Glass |
| 2019 | Analysis and Synthesis of Vocal Flutter and Vocal Jitter. Jean Schoentgen, Philipp Aichinger |
| 2019 | Analysis by Adversarial Synthesis - A Novel Approach for Speech Vocoding. Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas K. Maier |
| 2019 | Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge. Pavel Matejka, Oldrich Plchot, Hossein Zeinali, Ladislav Mosner, Anna Silnova, Lukás Burget, Ondrej Novotný, Ondrej Glembek |
| 2019 | Analysis of Critical Metadata Factors for the Calibration of Speaker Recognition Systems. Mahesh Kumar Nandwana, Luciana Ferrer, Mitchell McLaren, Diego Castán, Aaron Lawson |
| 2019 | Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech. Tobias Menne, Ilya Sklyar, Ralf Schlüter, Hermann Ney |
| 2019 | Analysis of Deep Learning Architectures for Cross-Corpus Speech Emotion Recognition. Jack Parry, Dimitri Palaz, Georgia Clarke, Pauline Lecomte, Rebecca Mead, Michael Berger, Gregor Hofer |
| 2019 | Analysis of Effect and Timing of Fillers in Natural Turn-Taking. Divesh Lala, Shizuka Nakamura, Tatsuya Kawahara |
| 2019 | Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems. Martin Karafiát, Murali Karthick Baskar, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, Jan Cernocký |
| 2019 | Analysis of Native Listeners' Facial Microexpressions While Shadowing Non-Native Speech - Potential of Shadowers' Facial Expressions for Comprehensibility Prediction. Tasavat Trisitichoke, Shintaro Ando, Daisuke Saito, Nobuaki Minematsu |
| 2019 | Analysis of Pronunciation Learning in End-to-End Speech Synthesis. Jason Taylor, Korin Richmond |
| 2019 | Analyzing Intra-Speaker and Inter-Speaker Vocal Tract Impedance Characteristics in a Low-Dimensional Feature Space Using t-SNE. Balamurali B. T., Jer-Ming Chen |
| 2019 | Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition. Yonatan Belinkov, Ahmed Ali, James R. Glass |
| 2019 | Analyzing Reaction Time and Error Sequences in Lexical Decision Experiments. Louis ten Bosch, Lou Boves, Kimberley Mulder |
| 2019 | Analyzing Verbal and Nonverbal Features for Predicting Group Performance. Uliyana Kubasova, Gabriel Murray, McKenzie Braley |
| 2019 | Anti-Spoofing Speaker Verification System with Multi-Feature Integration and Multi-Task Learning. Rongjin Li, Miao Zhao, Zheng Li, Lin Li, Qingyang Hong |
| 2019 | Apkinson: A Mobile Solution for Multimodal Assessment of Patients with Parkinson's Disease. Juan Camilo Vásquez-Correa, Tomás Arias-Vergara, Philipp Klumpp, M. Strauss, Arne Küderle, Nils Roth, Sebastian P. Bayerl, Nicanor García-Ospina, Paula Andrea Pérez-Toro, L. Felipe Parra-Gallego, Cristian David Ríos-Urrego, Daniel Escobar-Grisales, Juan Rafael Orozco-Arroyave, Björn M. Eskofier, Elmar Nöth |
| 2019 | Are IP Initial Vowels Acoustically More Distinct? Results from LDA and CNN Classifications. Fanny Guitard-Ivent, Gabriele Chignoli, Cécile Fougeron, Laurianne Georgeton |
| 2019 | Articulation Rate as a Metric in Spoken Language Assessment. Calbert Graham, Francis Nolan |
| 2019 | Articulation of Vowel Length Contrasts in Australian English. Louise Ratko, Michael I. Proctor, Felicity Cox |
| 2019 | Articulatory Analysis of Transparent Vowel /iː/ in Harmonic and Antiharmonic Hungarian Stems: Is There a Difference? Alexandra Markó, Márton Bartók, Tamás Gábor Csapó, Tekla Etelka Gráczi, Andrea Deme |
| 2019 | Articulatory Characteristics of Secondary Palatalization in Romanian Fricatives. Laura Spinu, Maida Percival, Alexei Kochetov |
| 2019 | Articulatory Copy Synthesis Based on a Genetic Algorithm. Yingming Gao, Simon Stone, Peter Birkholz |
| 2019 | Artificial Bandwidth Extension Using H∞ Optimization. Deepika Gupta, Hanumant Singh Shekhawat |
| 2019 | Assessing Acoustic and Articulatory Dimensions of Speech Motor Adaptation with Random Forests. Eugen Klein, Jana Brunner, Phil Hoole |
| 2019 | Assessing Neuromotor Coordination in Depression Using Inverted Vocal Tract Variables. Carol Y. Espy-Wilson, Adam C. Lammert, Nadee Seneviratne, Thomas F. Quatieri |
| 2019 | Assessing Parkinson's Disease from Speech Using Fisher Vectors. José Vicente Egas López, Juan Rafael Orozco-Arroyave, Gábor Gosztolya |
| 2019 | Assessing the Semantic Space Bias Caused by ASR Error Propagation and its Effect on Spoken Document Summarization. Máté Ákos Tündik, Valér Kaszás, György Szaszák |
| 2019 | Attention Based Hybrid i-Vector BLSTM Model for Language Recognition. Bharat Padi, Anand Mohan, Sriram Ganapathy |
| 2019 | Attention Model for Articulatory Features Detection. Ievgen Karaulov, Dmytro Tkanov |
| 2019 | Attention-Based Word Vector Prediction with LSTMs and its Application to the OOV Problem in ASR. Alejandro Coucheiro-Limeres, Fernando Fernández Martínez, Rubén San Segundo, Javier Ferreiros López |
| 2019 | Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition. Ziping Zhao, Zhongtian Bao, Zixing Zhang, Nicholas Cummins, Haishuai Wang, Björn W. Schuller |
| 2019 | Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile. Jeng-Lin Li, Chi-Chun Lee |
| 2019 | Audio Classification of Bit-Representation Waveform. Masaki Okawa, Takuya Saito, Naoki Sawada, Hiromitsu Nishizaki |
| 2019 | Audio Tagging with Compact Feedforward Sequential Memory Network and Audio-to-Audio Ratio Based Data Augmentation. Zhiying Huang, Shiliang Zhang, Ming Lei |
| 2019 | Augmented CycleGANs for Continuous Scale Normal-to-Lombard Speaking Style Conversion. Shreyas Seshadri, Lauri Juvela, Paavo Alku, Okko Räsänen |
| 2019 | Auto-Encoding Nearest Neighbor i-Vectors for Speaker Verification. Umair Khan, Miquel India, Javier Hernando |
| 2019 | Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification. Siqi Zheng, Gang Liu, Hongbin Suo, Yun Lei |
| 2019 | Automated Emotion Morphing in Speech Based on Diffeomorphic Curve Registration and Highway Networks. Ravi Shankar, Hsi-Wei Hsieh, Nicolas Charon, Archana Venkataraman |
| 2019 | Automated Estimation of Oral Reading Fluency During Summer Camp e-Book Reading with MyTurnToRead. Anastassia Loukina, Beata Beigman Klebanov, Patrick L. Lange, Yao Qian, Binod Gyawali, Nitin Madnani, Abhinav Misra, Klaus Zechner, Zuowei Wang, John Sabatini |
| 2019 | Automatic Assessment of Language Impairment Based on Raw ASR Output. Ying Qin, Tan Lee, Anthony Pak-Hin Kong |
| 2019 | Automatic Compression of Subtitles with Neural Networks and its Effect on User Experience. Katrin Angerbauer, Heike Adel, Ngoc Thang Vu |
| 2019 | Automatic Depression Level Detection via ℓp-Norm Pooling. Mingyue Niu, Jianhua Tao, Bin Liu, Cunhang Fan |
| 2019 | Automatic Detection of Autism Spectrum Disorder in Children Using Acoustic and Text Features from Brief Natural Conversations. Sunghye Cho, Mark Y. Liberman, Neville Ryant, Meredith Cola, Robert T. Schultz, Julia Parish-Morris |
| 2019 | Automatic Detection of Breath Using Voice Activity Detection and SVM Classifier with Application on News Reports. Mohamed Ismail Yasar Arafath K, Aurobinda Routray |
| 2019 | Automatic Detection of Off-Topic Spoken Responses Using Very Deep Convolutional Neural Networks. Xinhao Wang, Su-Youn Yoon, Keelan Evanini, Klaus Zechner, Yao Qian |
| 2019 | Automatic Detection of Prosodic Focus in American English. Sunghye Cho, Mark Y. Liberman, Yong-Cheol Lee |
| 2019 | Automatic Detection of the Temporal Segmentation of Hand Movements in British English Cued Speech. Li Liu, Jianze Li, Gang Feng, Xiao-Ping (Steven) Zhang |
| 2019 | Automatic Hierarchical Attention Neural Network for Detecting AD. Yilin Pan, Bahman Mirheidari, Markus Reuber, Annalena Venneri, Daniel Blackburn, Heidi Christensen |
| 2019 | Automatic Lyric Transcription from Karaoke Vocal Tracks: Resources and a Baseline System. Gerardo Roa Dabike, Jon Barker |
| 2019 | Autonomous Emotion Learning in Speech: A View of Zero-Shot Speech Emotion Recognition. Xinzhou Xu, Jun Deng, Nicholas Cummins, Zixing Zhang, Li Zhao, Björn W. Schuller |
| 2019 | Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition. Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, Shinji Watanabe |
| 2019 | Avaya Conversational Intelligence: A Real-Time System for Spoken Language Understanding in Human-Human Call Center Conversations. Jan Mizgajski, Adrian Szymczak, Robert Glowski, Piotr Szymanski, Piotr Zelasko, Lukasz Augustyniak, Mikolaj Morzy, Yishay Carmiel, Jeff Hodson, Lukasz Wójciak, Daniel Smoczyk, Adam Wróbel, Bartosz Borowik, Adam Artajew, Marcin Baran, Cezary Kwiatkowski, Marzena Zyla-Hoppe |
| 2019 | BAS Web Services for Automatic Subtitle Creation and Anonymization. Florian Schiel, Thomas Kisler |
| 2019 | BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer. Guan-Lin Chao, Ian R. Lane |
| 2019 | Bag-of-Acoustic-Words for Mental Health Assessment: A Deep Autoencoding Approach. Wenchao Du, Louis-Philippe Morency, Jeffrey F. Cohn, Alan W. Black |
| 2019 | Bandwidth Embeddings for Mixed-Bandwidth Speech Recognition. Gautam Mantena, Ozlem Kalinli, Ossama Abdel-Hamid, Don McAllaster |
| 2019 | Bayesian HMM Based x-Vector Clustering for Speaker Diarization. Mireia Díez, Lukás Burget, Shuai Wang, Johan Rohdin, Jan Cernocký |
| 2019 | Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery. Lucas Ondel, Hari Krishna Vydana, Lukás Burget, Jan Cernocký |
| 2019 | Benchmarking Benchmarks: Introducing New Automatic Indicators for Benchmarking Spoken Language Understanding Corpora. Frédéric Béchet, Christian Raymond |
| 2019 | Better Morphology Prediction for Better Speech Systems. Dravyansh Sharma, Melissa Wilson, Antoine Bruguier |
| 2019 | Binary Speech Features for Keyword Spotting Tasks. Alexandre Riviello, Jean-Pierre David |
| 2019 | Biologically Inspired Adaptive-Q Filterbanks for Replay Spoofing Attack Detection. Buddhi Wickramasinghe, Eliathamby Ambikairajah, Julien Epps |
| 2019 | Biosignal Processing for Human-Machine Interaction. Tanja Schultz |
| 2019 | Blind Channel Response Estimation for Replay Attack Detection. Anderson R. Avila, Jahangir Alam, Douglas D. O'Shaughnessy, Tiago H. Falk |
| 2019 | Boosting Character-Based Chinese Speech Synthesis via Multi-Task Learning and Dictionary Tutoring. Yuxiang Zou, Linhao Dong, Bo Xu |
| 2019 | Bootstrapping a Text Normalization System for an Inflected Language. Numbers as a Test Case. Anna Björk Nikulásdóttir, Jón Guðnason |
| 2019 | Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling. Peidong Wang, Ke Tan, DeLiang Wang |
| 2019 | Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data. Manasa Prasad, Daan van Esch, Sandy Ritchie, Jonas Fromseier Mortensen |
| 2019 | Building a Mixed-Lingual Neural TTS System with Only Monolingual Data. Liumeng Xue, Wei Song, Guanghui Xu, Lei Xie, Zhizheng Wu |
| 2019 | Building the Singapore English National Speech Corpus. Jia Xin Koh, Aqilah Mislan, Kevin Khoo, Brian Ang, Wilson Ang, Charmaine Ng, Ying-Ying Tan |
| 2019 | CNN-BLSTM Based Question Detection from Dialogs Considering Phase and Context Information. Yuke Si, Longbiao Wang, Jianwu Dang, Mengfei Wu, Aijun Li |
| 2019 | CNN-Based Phoneme Classifier from Vocal Tract MRI Learns Embedding Consistent with Articulatory Topology. Kicky G. van Leeuwen, Paula Bos, Stefano Trebeschi, Maarten J. A. van Alphen, Luuk Voskuilen, Ludi E. Smeele, Ferdi van der Heijden, R. J. J. H. van Son |
| 2019 | CNN-LSTM Models for Multi-Speaker Source Separation Using Bayesian Hyper Parameter Optimization. Jeroen Zegers, Hugo Van hamme |
| 2019 | CRIM's Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge Task. Vishwa Gupta, Lise Rebout, Gilles Boulianne, Pierre André Ménard, Jahangir Alam |
| 2019 | CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages. Kyubyong Park, Thomas Mulc |
| 2019 | Calibrating DNN Posterior Probability Estimates of HMM/DNN Models to Improve Social Signal Detection from Audio Data. Gábor Gosztolya, László Tóth |
| 2019 | CaptionAI: A Real-Time Multilingual Captioning Application. Nagendra Kumar Goel, Mousmita Sarma, Saikiran Valluri, Dharmeshkumar Agrawal, Steve Braich, Tejendra Singh Kuswah, Zikra Iqbal, Surbhi Chauhan, Raj Karbar |
| 2019 | Capturing L1 Influence on L2 Pronunciation by Simulating Perceptual Space Using Acoustic Features. Shuju Shi, Chilin Shih, Jinsong Zhang |
| 2019 | Cascaded Cross-Module Residual Learning Towards Lightweight End-to-End Speech Coding. Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim |
| 2019 | Challenging the Boundaries of Speech Recognition: The MALACH Corpus. Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon |
| 2019 | Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning. Abdelwahab Heba, Thomas Pellegrini, Jean-Pierre Lorré, Régine André-Obrecht |
| 2019 | Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR. Chang Liu, Zhen Zhang, Pengyuan Zhang, Yonghong Yan |
| 2019 | Child Speech Disorder Detection with Siamese Recurrent Network Using Speech Attribute Features. Jiarui Wang, Ying Qin, Zhiyuan Peng, Tan Lee |
| 2019 | Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection. Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai |
| 2019 | Coarse-to-Fine Optimization for Speech Enhancement. Jian Yao, Ahmad Al-Dahle |
| 2019 | Code-Switching Detection Using ASR-Generated Language Posteriors. Qinyi Wang, Emre Yilmaz, Adem Derinel, Haizhou Li |
| 2019 | Code-Switching Sentence Generation by Bert and Generative Adversarial Networks. Yingying Gao, Junlan Feng, Ying Liu, Leijing Hou, Xin Pan, Yong Ma |
| 2019 | Code-Switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. Ching-Ting Chang, Shun-Po Chuang, Hung-yi Lee |
| 2019 | Cognitive Factors in Thai-Naïve Mandarin Speakers' Imitation of Thai Lexical Tones. Juqiang Chen, Catherine T. Best, Mark Antoniou |
| 2019 | Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling. Siyuan Feng, Tan Lee, Zhiyuan Peng |
| 2019 | Combining Speaker Recognition and Metric Learning for Speaker-Dependent Representation Learning. João Monteiro, Jahangir Alam, Tiago H. Falk |
| 2019 | Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings. Antti Suni, Marcin Wlodarczak, Martti Vainio, Juraj Simko |
| 2019 | Comparative Analysis of Think-Aloud Methods for Everyday Activities in the Context of Cognitive Robotics. Moritz Meier, Celeste Mason, Felix Putze, Tanja Schultz |
| 2019 | Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models. Jianwei Yu, Max W. Y. Lam, Shoukang Hu, Xixin Wu, Xu Li, Yuewen Cao, Xunying Liu, Helen Meng |
| 2019 | Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR. Wilfried Michel, Ralf Schlüter, Hermann Ney |
| 2019 | Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis. Suhas B. N., Deep Patel, Nithin Rao Koluguri, Yamini Belur, Pradeep Reddy, Atchayaram Nalini, Ravi Yadav, Dipanjan Gope, Prasanta Kumar Ghosh |
| 2019 | Comparison of Telephone Recordings and Professional Microphone Recordings for Early Detection of Parkinson's Disease, Using Mel-Frequency Cepstral Coefficients with Gaussian Mixture Models. Laetitia Jeancolas, Graziella Mangone, Jean-Christophe Corvol, Marie Vidailhet, Stéphane Lehéricy, Badr-Eddine Benkelfat, Habib Benali, Dijana Petrovska-Delacrétaz |
| 2019 | Compensation for French Liquid Deletion During Auditory Sentence Processing. Sharon Peperkamp, Alvaro Martin Iturralde Zurita |
| 2019 | Completely Unsupervised Phoneme Recognition by a Generative Adversarial Network Harmonized with Iteratively Refined Hidden Markov Models. Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-yi Lee, Lin-Shan Lee |
| 2019 | Compression of Acoustic Event Detection Models with Quantized Distillation. Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang |
| 2019 | Compression of CTC-Trained Acoustic Models by Dynamic Frame-Wise Distillation or Segment-Wise N-Best Hypotheses Imitation. Haisong Ding, Kai Chen, Qiang Huo |
| 2019 | Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis. Sara Dahmani, Vincent Colotte, Valérian Girard, Slim Ouni |
| 2019 | Connecting and Comparing Language Model Interpolation Techniques. Ernest Pusateri, Christophe Van Gysel, Rami Botros, Sameer Badaskar, Mirko Hannemann, Youssef Oualil, Ilya Oparin |
| 2019 | Consonant Classification in Mandarin Based on the Depth Image Feature: A Pilot Study. Han-Chi Hsieh, Wei-Zhong Zheng, Ko-Chiang Chen, Ying-Hui Lai |
| 2019 | Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data. Yerbolat Khassanov, Haihua Xu, Van Tung Pham, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma |
| 2019 | Contextual Recovery of Out-of-Lattice Named Entities in Automatic Speech Recognition. Jack Serrino, Leonid Velikovich, Petar S. Aleksic, Cyril Allauzen |
| 2019 | Continuous Emotion Recognition in Speech - Do We Need Recurrence? Maximilian Schmitt, Nicholas Cummins, Björn W. Schuller |
| 2019 | Contributions of Consonant-Vowel Transitions to Mandarin Tone Identification in Simulated Electric-Acoustic Hearing. Fei Chen |
| 2019 | Conversational Emotion Analysis via Attention Mechanisms. Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang |
| 2019 | Conversational and Social Laughter Synthesis with WaveNet. Hiroki Mori, Tomohiro Nagata, Yoshiko Arimoto |
| 2019 | Convolutional Neural Network-Based Speech Enhancement for Cochlear Implant Recipients. Nursadul Mamun, Soheil Khorram, John H. L. Hansen |
| 2019 | Corpus Design Using Convolutional Auto-Encoder Embeddings for Audio-Book Synthesis. Meysam Shamsi, Damien Lolive, Nelly Barbot, Jonathan Chevelu |
| 2019 | Cross-Attention End-to-End ASR for Two-Party Conversations. Suyoun Kim, Siddharth Dalmia, Florian Metze |
| 2019 | Cross-Corpus Speech Emotion Recognition Using Semi-Supervised Transfer Non-Negative Matrix Factorization with Adaptation Regularization. Hui Luo, Jiqing Han |
| 2019 | Cross-Domain Replay Spoofing Attack Detection Using Domain Adversarial Training. Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian, Kai Yu |
| 2019 | Cross-Lingual Consistency of Phonological Features: An Empirical Study. Cibu Johny, Alexander Gutkin, Martin Jansche |
| 2019 | Cross-Lingual Transfer Learning for Affective Spoken Dialogue Systems. Kristijan Gjoreski, Aleksandar Gjoreski, Ivan Kraljevski, Diane Hirschfeld |
| 2019 | Cross-Lingual, Multi-Speaker Text-To-Speech Synthesis Using Neural Speaker Embedding. Mengnan Chen, Minchuan Chen, Shuang Liang, Jun Ma, Lei Chen, Shaojun Wang, Jing Xiao |
| 2019 | Cumulative Adaptation for BLSTM Acoustic Models. Markus Kitza, Pavel Golik, Ralf Schlüter, Hermann Ney |
| 2019 | Curriculum-Based Transfer Learning for an Effective End-to-End Spoken Language Understanding and Domain Portability. Antoine Caubrière, Natalia A. Tomashenko, Antoine Laurent, Emmanuel Morin, Nathalie Camelin, Yannick Estève |
| 2019 | CycleGAN-Based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition. Fang Bao, Michael Neumann, Ngoc Thang Vu |
| 2019 | Data Augmentation Using GANs for Speech Emotion Recognition. Aggelina Chatziagapi, Georgios Paraskevopoulos, Dimitris Sgouropoulos, Georgios Pantazopoulos, Malvina Nikandrou, Theodoros Giannakopoulos, Athanasios Katsamanis, Alexandros Potamianos, Shrikanth Narayanan |
| 2019 | Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification. Zhanghao Wu, Shuai Wang, Yanmin Qian, Kai Yu |
| 2019 | Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation. Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Jiqing Han, Anyan Shi |
| 2019 | Deep Hashing for Speaker Identification and Retrieval. Lei Fan, Qing-Yuan Jiang, Ya-Qi Yu, Wu-Jun Li |
| 2019 | Deep Hierarchical Fusion with Application in Sentiment Analysis. Efthymios Georgiou, Charilaos Papaioannou, Alexandros Potamianos |
| 2019 | Deep Learning Based Mandarin Accent Identification for Accent Robust ASR. Felix Weninger, Yang Sun, Junho Park, Daniel Willett, Puming Zhan |
| 2019 | Deep Learning Based Multi-Channel Speaker Recognition in Noisy and Reverberant Environments. Hassan Taherian, Zhong-Qiu Wang, DeLiang Wang |
| 2019 | Deep Learning for Joint Acoustic Echo and Noise Cancellation with Nonlinear Distortions. Hao Zhang, Ke Tan, DeLiang Wang |
| 2019 | Deep Learning for Orca Call Type Identification - A Fully Unsupervised Approach. Christian Bergler, Manuel Schmitt, Rachael Xi Cheng, Andreas K. Maier, Volker Barth, Elmar Nöth |
| 2019 | Deep Learning of Segment-Level Feature Representation with Multiple Instance Learning for Utterance-Level Speech Emotion Recognition. Shuiyang Mao, P. C. Ching, Tan Lee |
| 2019 | Deep Multitask Acoustic Echo Cancellation. Amin Fazel, Mostafa El-Khamy, Jungwon Lee |
| 2019 | Deep Neural Baselines for Computational Paralinguistics. Daniel Elsner, Stefan Langer, Fabian Ritz, Robert Müller, Steffen Illium |
| 2019 | Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification. Lanhua You, Wu Guo, Li-Rong Dai, Jun Du |
| 2019 | Deep Residual Neural Networks for Audio Spoofing Detection. Moustafa Alzantot, Ziqi Wang, Mani B. Srivastava |
| 2019 | Deep Sensing of Breathing Signal During Conversational Speech. Venkata Srikanth Nallanthighal, Aki Härmä, Helmer Strik |
| 2019 | Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function. Jianfeng Zhou, Tao Jiang, Zheng Li, Lin Li, Qingyang Hong |
| 2019 | Deep Speaker Recognition: Modular or Monolithic? Gautam Bhattacharya, Md. Jahangir Alam, Patrick Kenny |
| 2019 | DeepLung: Smartphone Convolutional Neural Network-Based Inference of Lung Anomalies for Pulmonary Patients. Mohsin Y. Ahmed, Md. Mahbubur Rahman, Jilong Kuang |
| 2019 | Depression State Assessment: Application for Detection of Depression by Speech. Gábor Kiss, Dávid Sztahó, Klára Vicsi |
| 2019 | Design and Development of a Multi-Lingual Speech Corpora (TaMaR-EmoDB) for Emotion Analysis. Rajeev Rajan, Haritha U. G., Sujitha A. C., Rejisha T. M. |
| 2019 | Detecting Depression with Word-Level Multimodal Fusion. Morteza Rohanian, Julian Hough, Matthew Purver |
| 2019 | Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention. Qiang Huang, Thomas Hain |
| 2019 | Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge. Hossein Zeinali, Themos Stafylakis, Georgia Athanasopoulou, Johan Rohdin, Ioannis Gkinis, Lukáš Burget, Jan Černocký |
| 2019 | Detecting Topic-Oriented Speaker Stance in Conversational Speech. Catherine Lai, Beatrice Alex, Johanna D. Moore, Leimin Tian, Tatsuro Hori, Gianpiero Francesca |
| 2019 | Detection and Recovery of OOVs for Improved English Broadcast News Captioning. Samuel Thomas, Kartik Audhkhasi, Zoltán Tüske, Yinghui Huang, Michael Picheny |
| 2019 | Detection of Glottal Closure Instants from Raw Speech Using Convolutional Neural Networks. Mohit Goyal, Varun Srivastava, Prathosh A. P. |
| 2019 | Developing Pronunciation Models in New Languages Faster by Exploiting Common Grapheme-to-Phoneme Correspondences Across Languages. Harry Bleyan, Sandy Ritchie, Jonas Fromseier Mortensen, Daan van Esch |
| 2019 | Development of Emotion Rankers Based on Intended and Perceived Emotion Labels. Zhenghao Jin, Houwei Cao |
| 2019 | Development of Robust Automated Scoring Models Using Adversarial Input for Oral Proficiency Assessment. Su-Youn Yoon, Chong Min Lee, Klaus Zechner, Keelan Evanini |
| 2019 | Device Feature Extractor for Replay Spoofing Detection. Chang Huai You, Jichen Yang, Huy Dat Tran |
| 2019 | Diagnosing Dysarthria with Long Short-Term Memory Networks. Alex Mayle, Zhiwei Mou, Razvan C. Bunescu, Sadegh Mirshekarian, Li Xu, Chang Liu |
| 2019 | Dimensions of Prosodic Prominence in an Attractor Model. Simon Roessig, Doris Mücke, Lena Pagel |
| 2019 | Direct F0 Estimation with Neural-Network-Based Regression. Shuzhuang Xu, Hiroshi Shimodaira |
| 2019 | Direct Modelling of Speech Emotion from Raw Speech. Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Julien Epps |
| 2019 | Direct Neuron-Wise Fusion of Cognate Neural Networks. Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata |
| 2019 | Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model. Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu |
| 2019 | Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation. Wei Xue, Ying Tong, Guohong Ding, Chao Zhang, Tao Ma, Xiaodong He, Bowen Zhou |
| 2019 | Direction-Aware Speaker Beam for Multi-Channel Speaker Extraction. Guanjun Li, Shan Liang, Shuai Nie, Wenju Liu, Meng Yu, Lianwu Chen, Shouye Peng, Changliang Li |
| 2019 | Directional Audio Rendering Using a Neural Network Based Personalized HRTF. Geon Woo Lee, Jung Hyuk Lee, Seong Ju Kim, Hong Kook Kim |
| 2019 | Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng |
| 2019 | Discovering Dialog Rules by Means of an Evolutionary Approach. David Griol, Zoraida Callejas |
| 2019 | Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen |
| 2019 | Disentangling Style Factors from Speaker Representations. Jennifer Williams, Simon King |
| 2019 | Disfluencies and Human Speech Transcription Errors. Vicky Zayats, Trang Tran, Richard A. Wright, Courtney Mansfield, Mari Ostendorf |
| 2019 | Do Conversational Partners Entrain on Articulatory Precision? Nichola Lubold, Stephanie A. Borrie, Tyson S. Barrett, Megan M. Willi, Visar Berisha |
| 2019 | Do Hesitations Facilitate Processing of Partially Defective System Utterances? An Exploratory Eye Tracking Study. Kristin Haake, Sarah Schimke, Simon Betz, Sina Zarrieß |
| 2019 | Do not Hesitate! - Unless You Do it Shortly or Nasally: How the Phonetics of Filled Pauses Determine Their Subjective Frequency and Perceived Speaker Performance. Oliver Niebuhr, Kerstin Fischer |
| 2019 | Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise. Yi Zhao, Atsushi Ando, Shinji Takaki, Junichi Yamagishi, Satoshi Kobashikawa |
| 2019 | Dr.VOT: Measuring Positive and Negative Voice Onset Time in the Wild. Yosi Shrem, Matthew Goldrick, Joseph Keshet |
| 2019 | Dual Encoder Classifier Models as Constraints in Neural Text Normalization. Ajda Gokcen, Hao Zhang, Richard Sproat |
| 2019 | Duration Modeling with Global Phoneme-Duration Vectors. Jinfu Ni, Yoshinori Shiga, Hisashi Kawai |
| 2019 | ERP Signal Analysis with Temporal Resolution Using a Time Window Bank. Annika Nijveld, Louis ten Bosch, Mirjam Ernestus |
| 2019 | Early Identification of Speech Changes Due to Amyotrophic Lateral Sclerosis Using Machine Classification. Sarah E. Gutz, Jun Wang, Yana Yunusova, Jordan R. Green |
| 2019 | Ectc-Docd: An End-to-End Structure with CTC Encoder and OCD Decoder for Speech Recognition. Cheng Yi, Feng Wang, Bo Xu |
| 2019 | Effects of Base-Frequency and Spectral Envelope on Deep-Learning Speech Separation and Recognition Models. Jun Hui, Yue Wei, Shutao Chen, Richard Hau Yue So |
| 2019 | Effects of Natural Variability in Cross-Modal Temporal Correlations on Audiovisual Speech Recognition Benefit. Kaylah Lalonde |
| 2019 | Effects of Spectral and Temporal Cues to Mandarin Concurrent-Vowels Identification for Normal-Hearing and Hearing-Impaired Listeners. Zhen Fu, Xihong Wu, Jing Chen |
| 2019 | Effects of Urgent Speech and Congruent/Incongruent Text on Speech Intelligibility in Noise and Reverberation. Nao Hodoshima |
| 2019 | Effects of Waveform PMF on Anti-Spoofing Detection. Itshak Lapidot, Jean-François Bonastre |
| 2019 | Elpis, an Accessible Speech-to-Text Tool. Ben Foley, Alina Rakhi, Nicholas Lambourne, Nicholas Buckeridge, Janet Wiles |
| 2019 | Emotion Recognition from Natural Phone Conversations in Individuals with and without Recent Suicidal Ideation. John Gideon, Heather T. Schatten, Melvin G. McInnis, Emily Mower Provost |
| 2019 | Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-Resource Settings. Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier |
| 2019 | Employing Bottleneck and Convolutional Features for Speech-Based Physical Load Detection on Limited Data Amounts. Olga Egorow, Tarik Mrech, Norman Weißkirchen, Andreas Wendemuth |
| 2019 | End-to-End Accented Speech Recognition. Thibault Viglino, Petr Motlíček, Milos Cernak |
| 2019 | End-to-End Adaptation with Backpropagation Through WFST for On-Device Speech Recognition System. Emiru Tsunoo, Yosuke Kashiwagi, Satoshi Asakawa, Toshiyuki Kumakura |
| 2019 | End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition. Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai |
| 2019 | End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders. Ryo Masumura, Hiroshi Sato, Tomohiro Tanaka, Takafumi Moriya, Yusuke Ijima, Takanobu Oba |
| 2019 | End-to-End Convolutional Sequence Learning for ASL Fingerspelling Recognition. Katerina Papadimitriou, Gerasimos Potamianos |
| 2019 | End-to-End Losses Based on Speaker Basis Vectors and All-Speaker Hard Negative Mining for Speaker Verification. Hee-Soo Heo, Jee-weon Jung, Il-Ho Yang, Sung-Hyun Yoon, Hye-jin Shim, Ha-Jin Yu |
| 2019 | End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network. Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Shoji Hayakawa, Shouji Harada, Jiqing Han |
| 2019 | End-to-End Multi-Channel Speech Enhancement Using Inter-Channel Time-Restricted Attention on Raw Waveform. Hyeon Seung Lee, Hyung Yong Kim, Woo Hyun Kang, Jeunghun Kim, Nam Soo Kim |
| 2019 | End-to-End Multi-Speaker Speech Recognition Using Speaker Embeddings and Transfer Learning. Pavel Denisov, Ngoc Thang Vu |
| 2019 | End-to-End Multilingual Multi-Speaker Speech Recognition. Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey |
| 2019 | End-to-End Music Source Separation: Is it Possible in the Waveform Domain? Francesc Lluís, Jordi Pons, Xavier Serra |
| 2019 | End-to-End Neural Speaker Diarization with Permutation-Free Objectives. Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe |
| 2019 | End-to-End Optimization of Source Models for Speech and Audio Coding Using a Machine Learning Framework. Tom Bäckström |
| 2019 | End-to-End Speaker Identification in Noisy and Reverberant Environments Using Raw Waveform Convolutional Neural Networks. Daniele Salvati, Carlo Drioli, Gian Luca Foresti |
| 2019 | End-to-End SpeakerBeam for Single Channel Target Speech Recognition. Marc Delcroix, Shinji Watanabe, Tsubasa Ochiai, Keisuke Kinoshita, Shigeki Karita, Atsunori Ogawa, Tomohiro Nakatani |
| 2019 | End-to-End Speech Translation with Knowledge Distillation. Yuchen Liu, Hao Xiong, Jiajun Zhang, Zhongjun He, Hua Wu, Haifeng Wang, Chengqing Zong |
| 2019 | End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios. Swapnil Bhosale, Imran A. Sheikh, Sri Harsha Dumpala, Sunil Kumar Kopparapu |
| 2019 | End-to-End Text-to-Speech for Low-Resource Languages by Cross-Lingual Transfer Learning. Yuan-Jui Chen, Tao Tu, Cheng-chieh Yeh, Hung-yi Lee |
| 2019 | Energy Separation-Based Instantaneous Frequency Estimation for Cochlear Cepstral Feature for Replay Spoof Detection. Ankur T. Patil, Rajul Acharya, Pulikonda Krishna Aditya Sai, Hemant A. Patil |
| 2019 | Enforcing Semantic Consistency for Cross Corpus Valence Regression from Speech Using Adversarial Discrepancy Learning. Gao-Yi Chao, Yun-Shao Lin, Chun-Min Chang, Chi-Chun Lee |
| 2019 | Enhanced Spectral Features for Distortion-Independent Acoustic Modeling. Peidong Wang, DeLiang Wang |
| 2019 | Enriching Rare Word Representations in Neural Language Models by Embedding Matrix Augmentation. Yerbolat Khassanov, Zhiping Zeng, Van Tung Pham, Haihua Xu, Eng Siong Chng |
| 2019 | Ensemble Models for Spoofing Detection in Automatic Speaker Verification. Bhusan Chettri, Daniel Stoller, Veronica Morfi, Marco A. Martínez Ramírez, Emmanouil Benetos, Bob L. T. Sturm |
| 2019 | Environment-Dependent Attention-Driven Recurrent Convolutional Neural Network for Robust Speech Enhancement. Meng Ge, Longbiao Wang, Nan Li, Hao Shi, Jianwu Dang, Xiangang Li |
| 2019 | EpaDB: A Database for Development of Pronunciation Assessment Systems. Jazmín Vidal, Luciana Ferrer, Leonardo Brambilla |
| 2019 | Evaluating Audiovisual Source Separation in the Context of Video Conferencing. Berkay Inan, Milos Cernak, Helmut Grabner, Helena Peic Tukuljac, Rodrigo C. G. Pena, Benjamin Ricaud |
| 2019 | Evaluating Intention Communication by TTS Using Explicit Definitions of Illocutionary Act Performance. Nobukatsu Hojo, Noboru Miyazaki |
| 2019 | Evaluating Near End Listening Enhancement Algorithms in Realistic Environments. Carol Chermaz, Cassia Valentini-Botinhao, Henning F. Schepker, Simon King |
| 2019 | Examining the Combination of Multi-Band Processing and Channel Dropout for Robust Speech Recognition. György Kovács, László Tóth, Dirk Van Compernolle, Marcus Liwicki |
| 2019 | Excitation Source and Vocal Tract System Based Acoustic Features for Detection of Nasals in Continuous Speech. Bhanu Teja Nellore, Sri Harsha Dumpala, Karan Nathwani, Suryakanth V. Gangashetty |
| 2019 | Expediting TTS Synthesis with Adversarial Vocoding. Paarth Neekhara, Chris Donahue, Miller S. Puckette, Shlomo Dubnov, Julian J. McAuley |
| 2019 | Explaining Sentiment Classification. Marvin Rajwadi, Cornelius Glackin, Julie A. Wall, Gérard Chollet, Nigel Cannings |
| 2019 | Exploiting Monolingual Speech Corpora for Code-Mixed Speech Recognition. Karan Taneja, Satarupa Guha, Preethi Jyothi, Basil Abraham |
| 2019 | Exploiting Multi-Channel Speech Presence Probability in Parametric Multi-Channel Wiener Filter. Saeed Bagheri, Daniele Giacobello |
| 2019 | Exploiting Semi-Supervised Training Through a Dropout Regularization in End-to-End Speech Recognition. Subhadeep Dey, Petr Motlíček, Trung Bui, Franck Dernoncourt |
| 2019 | Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS. Haohan Guo, Frank K. Soong, Lei He, Lei Xie |
| 2019 | Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition. Shansong Liu, Shoukang Hu, Yi Wang, Jianwei Yu, Rongfeng Su, Xunying Liu, Helen Meng |
| 2019 | Exploring Critical Articulator Identification from 50Hz RT-MRI Data of the Vocal Tract. Samuel S. Silva, António J. S. Teixeira, Conceição Cunha, Nuno Almeida, Arun A. Joseph, Jens Frahm |
| 2019 | Exploring Methods for the Automatic Detection of Errors in Manual Transcription. Xiaofei Wang, Jinyi Yang, Ruizhi Li, Samik Sadhu, Hynek Hermansky |
| 2019 | Exploring the Encoder Layers of Discriminative Autoencoders for LVCSR. Pin-Tuan Huang, Hung-Shin Lee, Syu-Siang Wang, Kuan-Yu Chen, Yu Tsao, Hsin-Min Wang |
| 2019 | Expressiveness Influences Human Vocal Alignment Toward voice-AI. Michelle Cohn, Georgia Zellou |
| 2019 | Extending an Acoustic Data-Driven Phone Set for Spontaneous Speech Recognition. Jeong-Uk Bang, Mu-Yeol Choi, Sang-Hun Kim, Oh-Wook Kwon |
| 2019 | Extending the E-Model Towards Super-Wideband and Fullband Speech Communication Scenarios. Sebastian Möller, Gabriel Mittag, Thilo Michael, Vincent Barriac, Hitoshi Aoki |
| 2019 | Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition. Max W. Y. Lam, Jun Wang, Xunying Liu, Helen Meng, Dan Su, Dong Yu |
| 2019 | Extracting Mel-Frequency and Bark-Frequency Cepstral Coefficients from Encrypted Signals. Patricia Thaine, Gerald Penn |
| 2019 | F0 Variability Measures Based on Glottal Closure Instants. Yu-Ren Chien, Michal Borský, Jón Guðnason |
| 2019 | Factorization of Discriminatively Trained i-Vector Extractor for Speaker Recognition. Ondřej Novotný, Oldřich Plchot, Ondřej Glembek, Lukáš Burget |
| 2019 | Far-Field End-to-End Text-Dependent Speaker Verification Based on Mixed Training Data with Transfer Learning and Enrollment Data Augmentation. Xiaoyi Qin, Danwei Cai, Ming Li |
| 2019 | Far-Field Speech Enhancement Using Heteroscedastic Autoencoder for Improved Speech Recognition. Shashi Kumar, Shakti P. Rath |
| 2019 | FarSpeech: Arabic Natural Language Processing for Live Arabic Speech. Mohamed Eldesouki, Naassih Gopee, Ahmed Ali, Kareem Darwish |
| 2019 | Fast DNN Acoustic Model Speaker Adaptation by Learning Hidden Unit Contribution Features. Xurong Xie, Xunying Liu, Tan Lee, Lan Wang |
| 2019 | Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star Generative Adversarial Networks. Shengkui Zhao, Trung Hieu Nguyen, Hao Wang, Bin Ma |
| 2019 | Feature Exploration for Almost Zero-Resource ASR-Free Keyword Spotting Using a Multilingual Bottleneck Extractor and Correspondence Autoencoders. Raghav Menon, Herman Kamper, Ewald van der Westhuizen, John A. Quinn, Thomas Niesler |
| 2019 | Feature Representation of Pathophysiology of Parkinsonian Dysarthria. Alice Rueda, Juan Camilo Vásquez-Correa, Cristian David Ríos-Urrego, Juan Rafael Orozco-Arroyave, Sridhar Krishnan, Elmar Nöth |
| 2019 | Feature Space Visualization with Spatial Similarity Maps for Pathological Speech Data. Philipp Klumpp, Juan Camilo Vásquez-Correa, Tino Haderlein, Elmar Nöth |
| 2019 | Few-Shot Audio Classification with Attentional Graph Neural Networks. Shilei Zhang, Yong Qin, Kewei Sun, Yonghua Lin |
| 2019 | Fine-Grained Robust Prosody Transfer for Single-Speaker Neural Text-To-Speech. Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman |
| 2019 | Follow-Up Question Generation Using Neural Tensor Network-Based Domain Ontology Population in an Interview Coaching System. Ming-Hsiang Su, Chung-Hsien Wu, Yi Chang |
| 2019 | Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams. Guanlong Zhao, Shaojin Ding, Ricardo Gutierrez-Osuna |
| 2019 | Foreign-Language Knowledge Enhances Artificial-Language Segmentation. Annie Tremblay, Mirjam Broersma |
| 2019 | Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition. Kartik Audhkhasi, George Saon, Zoltán Tüske, Brian Kingsbury, Michael Picheny |
| 2019 | Formant Pattern and Spectral Shape Ambiguity of Vowel Sounds, and Related Phenomena of Vowel Acoustics - Exemplary Evidence. Dieter Maurer, Heidy Suter, Christian d'Hereuse, Volker Dellwo |
| 2019 | Forward-Backward Decoding for Regularizing End-to-End TTS. Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao |
| 2019 | Framewise Supervised Training Towards End-to-End Speech Recognition Models: First Results. Mohan Li, Yuanjiang Cao, Weicong Zhou, Min Liu |
| 2019 | Framework for Conducting Tasks Requiring Human Assessment. Martin Gruber, Adam Chýlek, Jindřich Matoušek |
| 2019 | Frication as a Vowel Feature? - Evidence from the Rui'an Wu Chinese Dialect. Fang Hu, Youjue He |
| 2019 | Front-End Feature Compensation and Denoising for Noise Robust Speech Emotion Recognition. Rupayan Chakraborty, Ashish Panda, Meghna Pandharipande, Sonal Joshi, Sunil Kumar Kopparapu |
| 2019 | Fréchet Audio Distance: A Reference-Free Metric for Evaluating Music Enhancement Algorithms. Kevin Kilgour, Mauricio Zuluaga, Dominik Roblek, Matthew Sharifi |
| 2019 | Full-Sentence Correlation: A Method to Handle Unpredictable Noise for Robust Speech Recognition. Ji Ming, Danny Crookes |
| 2019 | Fully-Convolutional Network for Pitch Estimation of Speech Signals. Luc Ardaillon, Axel Roebel |
| 2019 | Fundamental Frequency Accommodation in Multi-Party Human-Robot Game Interactions: The Effect of Winning or Losing. Omnia Ibrahim, Gabriel Skantze, Sabine Stoll, Volker Dellwo |
| 2019 | Fusion Strategy for Prosodic and Lexical Representations of Word Importance. Sushant Kafle, Cecilia Ovesdotter Alm, Matt Huenerfauth |
| 2019 | Fusion Techniques for Utterance-Level Emotion Recognition Combining Speech and Transcripts. Jilt Sebastian, Piero Pierucci |
| 2019 | GECKO - A Tool for Effective Annotation of Human Conversations. Golan Levy, Raquel Sitman, Ido Amir, Eduard Golshtein, Ran Mochary, Eilon Reshef, Roi Reichart, Omri Allouche |
| 2019 | GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram. Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku |
| 2019 | GFM-Voc: A Real-Time Voice Quality Modification System. Olivier Perrotin, Ian McLoughlin |
| 2019 | GPU-Based WFST Decoding with Extra Large Language Model. Daisuke Fukunaga, Yoshiki Tanaka, Yuichi Kageyama |
| 2019 | Gender De-Biasing in Speech Emotion Recognition. Cristina Gorrostieta, Reza Lotfian, Kye Taylor, Richard Brutti, John Kane |
| 2019 | Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech. Li-Wei Chen, Hung-yi Lee, Yu Tsao |
| 2019 | Generative Noise Modeling and Channel Simulation for Robust Speech Recognition in Unseen Conditions. Meet H. Soni, Sonal Joshi, Ashish Panda |
| 2019 | Glottal Closure Instants Detection from Speech Signal by Deep Features Extracted from Raw Speech and Linear Prediction Residual. Gurunath Reddy M., K. Sreenivasa Rao, Partha Pratim Das |
| 2019 | God as Interlocutor - Real or Imaginary? Prosodic Markers of Dialogue Speech and Expected Efficacy in Spoken Prayer. Oliver Niebuhr, Uffe Schjoedt |
| 2019 | Group Latent Embedding for Vector Quantized Variational Autoencoder in Non-Parallel Voice Conversion. Shaojin Ding, Ricardo Gutierrez-Osuna |
| 2019 | Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR. Naoyuki Kanda, Christoph Böddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, Kenji Nagamatsu, Reinhold Haeb-Umbach |
| 2019 | Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation. Gakuto Kurata, Kartik Audhkhasi |
| 2019 | Harmonic Beamformers for Non-Intrusive Speech Intelligibility Prediction. Charlotte Sørensen, Jesper Bünsow Boldt, Mads Græsbøll Christensen |
| 2019 | Harmonic-Aligned Frame Mask Based on Non-Stationary Gabor Transform with Application to Content-Dependent Speaker Comparison. Feng Huang, Péter Balázs |
| 2019 | Hierarchical Pooling Structure for Weakly Labeled Sound Event Detection. Ke-Xin He, Yu-Han Shen, Wei-Qiang Zhang |
| 2019 | High Quality, Lightweight and Adaptable TTS Using LPCNet. Zvi Kons, Slava Shechtman, Alexander Sorin, Carmel Rabinovitz, Ron Hoory |
| 2019 | How to Annotate 100 Hours in 45 Minutes. Per Fallgren, Zofia Malisz, Jens Edlund |
| 2019 | Hush-Hush Speak: Speech Reconstruction Using Silent Videos. Shashwat Uttam, Yaman Kumar, Dhruva Sahrawat, Mansi Aggarwal, Rajiv Ratn Shah, Debanjan Mahata, Amanda Stent |
| 2019 | HyST: A Hybrid Approach for Flexible and Accurate Dialogue State Tracking. Rahul Goel, Shachi Paul, Dilek Hakkani-Tür |
| 2019 | Hybrid Arbitration Using Raw ASR String and NLU Information - Taking the Best of Both Embedded World and Cloud World. Min Tang |
| 2019 | Hypernasality Severity Detection Using Constant Q Cepstral Coefficients. Akhilesh Kumar Dubey, S. R. Mahadeva Prasanna, Samarendra Dandapat |
| 2019 | I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences. Kong Aik Lee, Ville Hautamäki, Tomi H. Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado, Massimiliano Todisco |
| 2019 | IA-NET: Acceleration and Compression of Speech Enhancement Using Integer-Adder Deep Neural Network. Yu-Chen Lin, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo |
| 2019 | IIIT-H Spoofing Countermeasures for Automatic Speaker Verification Spoofing and Countermeasures Challenge 2019. K. N. R. K. Raju Alluri, Anil Kumar Vuppala |
| 2019 | Identifying Distinctive Acoustic and Spectral Features in Parkinson's Disease. Yermiyahu Hauptman, Ruth Aloni-Lavi, Itshak Lapidot, Tanya Gurevich, Yael Manor, Stav Naor, Noa Diamant, Irit Opher |
| 2019 | Identifying Input Features for Development of Real-Time Translation of Neural Signals to Text. Janaki Sheth, Ariel Tankus, Michelle Tran, Lindy Comstock, Itzhak Fried, William Speier |
| 2019 | Identifying Mood Episodes Using Dialogue Features from Clinical Interviews. Zakaria Aldeneh, Mimansa Jaiswal, Michael Picheny, Melvin G. McInnis, Emily Mower Provost |
| 2019 | Identifying Personality Traits Using Overlap Dynamics in Multiparty Dialogue. Mingzhi Yu, Emer Gilmartin, Diane J. Litman |
| 2019 | Identifying Therapist and Client Personae for Therapeutic Alliance Estimation. Victor R. Martinez, Nikolaos Flemotomos, Victor Ardulov, Krishna Somandepalli, Simon B. Goldberg, Zac E. Imel, David C. Atkins, Shrikanth Narayanan |
| 2019 | Impact of ASR Performance on Spoken Grammatical Error Detection. Yiting Lu, Mark J. F. Gales, Kate M. Knill, P. P. Manakul, Linlin Wang, Yu Wang |
| 2019 | Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders. Atsunori Ogawa, Marc Delcroix, Shigeki Karita, Tomohiro Nakatani |
| 2019 | Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning. Yuanchao Li, Tianyu Zhao, Tatsuya Kawahara |
| 2019 | Improved Low-Resource Somali Speech Recognition by Semi-Supervised Acoustic and Language Model Training. Astik Biswas, Raghav Menon, Ewald van der Westhuizen, Thomas Niesler |
| 2019 | Improved Speaker-Dependent Separation for CHiME-5 Challenge. Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu |
| 2019 | Improved Speech Separation with Time-and-Frequency Cross-Domain Joint Embedding and Clustering. Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee, Lin-Shan Lee |
| 2019 | Improved Vocal Tract Length Perturbation for a State-of-the-Art End-to-End Speech Recognition System. Chanwoo Kim, Minkyu Shin, Abhinav Garg, Dhananjaya Gowda |
| 2019 | Improvement and Assessment of Spectro-Temporal Modulation Analysis for Speech Intelligibility Estimation. Amin Edraki, Wai-Yip Chan, Jesper Jensen, Daniel Fogerty |
| 2019 | Improving ASR Confidence Scores for Alexa Using Acoustic and Hypothesis Embeddings. Prakhar Swarup, Roland Maas, Sri Garimella, Sri Harish Mallidi, Björn Hoffmeister |
| 2019 | Improving ASR Systems for Children with Autism and Language Impairment Using Domain-Focused DNN Transfer Techniques. Robert Gale, Liu Chen, Jill Dolata, Jan P. H. van Santen, Meysam Asgari |
| 2019 | Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System. Zhifu Gao, Yan Song, Ian McLoughlin, Pengcheng Li, Yiheng Jiang, Li-Rong Dai |
| 2019 | Improving Automatically Induced Lexicons for Highly Agglutinating Languages Using Data-Driven Morphological Segmentation. Wiehan Agenbag, Thomas Niesler |
| 2019 | Improving Code-Switched Language Modeling Performance Using Cognate Features. Victor Soto, Julia Hirschberg |
| 2019 | Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models. Ryo Masumura, Tomohiro Tanaka, Atsushi Ando, Hosana Kamiyama, Takanobu Oba, Satoshi Kobashikawa, Yushi Aono |
| 2019 | Improving Emotion Identification Using Phone Posteriors in Raw Speech Waveform Based DNN. Mousmita Sarma, Pegah Ghahremani, Daniel Povey, Nagendra Kumar Goel, Kandarpa Kumar Sarma, Najim Dehak |
| 2019 | Improving Keyword Spotting and Language Identification via Neural Architecture Search at Scale. Hanna Mazzawi, Xavi Gonzalvo, Aleks Kracun, Prashant Sridhar, Niranjan Subrahmanya, Ignacio López-Moreno, Hyun-Jin Park, Patrick Violette |
| 2019 | Improving Large Vocabulary Urdu Speech Recognition System Using Deep Neural Networks. Muhammad Umar Farooq, Farah Adeeba, Sahar Rauf, Sarmad Hussain |
| 2019 | Improving Performance of End-to-End ASR on Numeric Sequences. Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu |
| 2019 | Improving Speech Synthesis with Discourse Relations. Adèle Aubin, Alessandra Cervone, Oliver Watts, Simon King |
| 2019 | Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration. Shigeki Karita, Nelson Enrique Yalta Soplin, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani |
| 2019 | Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation. Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai |
| 2019 | Improving Unsupervised Subword Modeling via Disentangled Speech Representation Learning and Transformation. Siyuan Feng, Tan Lee |
| 2019 | Incorporating Symbolic Sequential Modeling for Speech Enhancement. Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai |
| 2019 | Individual Difference of Relative Tongue Size and its Acoustic Effects. Xiaohan Zhang, Chongke Bi, Kiyoshi Honda, Wenhuan Lu, Jianguo Wei |
| 2019 | Individual Differences in Implicit Attention to Phonetic Detail in Speech Perception. Natalie Lewandowski, Daniel Duran |
| 2019 | Individual Differences of Airflow and Sound Generation in the Vocal Tract of Sibilant /s/. Tsukasa Yoshinaga, Kazunori Nozaki, Shigeo Wada |
| 2019 | Individual Variation in Cognitive Processing Style Predicts Differences in Phonetic Imitation of Device and Human Voices. Cathryn Snyder, Michelle Cohn, Georgia Zellou |
| 2019 | Influence of Contextuality on Prosodic Realization of Information Structure in Chinese Dialogues. Bin Li, Yuan Jia |
| 2019 | Influence of Speaker-Specific Parameters on Speech Separation Systems. David Ditter, Timo Gerkmann |
| 2019 | Instantaneous Phase and Long-Term Acoustic Cues for Orca Activity Detection. Rohan Kumar Das, Haizhou Li |
| 2019 | Integrating Video Retrieval and Moment Detection in a Unified Corpus for Video Question Answering. Hongyin Luo, Mitra Mohtarami, James R. Glass, Karthik Krishnamurthy, Brigitte Richardson |
| 2019 | Intel Far-Field Speaker Recognition System for VOiCES Challenge 2019. Jonathan Huang, Tobias Bocklet |
| 2019 | Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech. Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak |
| 2019 | Interpreting and Improving Deep Neural SLU Models via Vocabulary Importance. Yilin Shen, Wenhu Chen, Hongxia Jin |
| 2019 | Into the Wild: Transitioning from Recognizing Mood in Clinical Interactions to Personal Conversations for Individuals with Bipolar Disorder. Katie Matton, Melvin G. McInnis, Emily Mower Provost |
| 2019 | Intragestural Variation in Natural Sentence Production: Essential Tremor Patients Treated with DBS. Anne Hermes, Doris Mücke, Tabea Thies, Michael T. Barbe |
| 2019 | Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech. Natalia A. Tomashenko, Antoine Caubrière, Yannick Estève |
| 2019 | Investigating Linguistic and Semantic Features for Turn-Taking Prediction in Open-Domain Human-Computer Conversation. Seyedeh Zahra Razavi, Benjamin Kane, Lenhart K. Schubert |
| 2019 | Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese. Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai |
| 2019 | Investigating the Effects of Noisy and Reverberant Speech in Text-to-Speech Systems. David Ayllón, Héctor A. Sánchez-Hevia, Carol Figueroa, Pierre Lanchantin |
| 2019 | Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition. Pingchuan Ma, Stavros Petridis, Maja Pantic |
| 2019 | Investigating the Physiological and Acoustic Contrasts Between Choral and Operatic Singing. Hiroko Terasawa, Kenta Wakasa, Hideki Kawahara, Ken-Ichi Sakakibara |
| 2019 | Investigating the Robustness of Sequence-to-Sequence Text-to-Speech Models to Imperfectly-Transcribed Training Data. Jason Fong, Pilar Oplustil Gallegos, Zack Hodari, Simon King |
| 2019 | Investigating the Variability of Voice Quality and Pain Levels as a Function of Multiple Clinical Parameters. Hui-Ting Hong, Jeng-Lin Li, Yi-Ming Weng, Chip-Jin Ng, Chi-Chun Lee |
| 2019 | Investigation of Cost Function for Supervised Monaural Speech Separation. Yun Liu, Hui Zhang, Xueliang Zhang, Yuhang Cao |
| 2019 | Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion. Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang |
| 2019 | Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition. Shiliang Zhang, Ming Lei, Zhijie Yan |
| 2019 | Investigation on Blind Bandwidth Extension with a Non-Linear Function and its Evaluation of x-Vector-Based Speaker Verification. Ryota Kaminishi, Haruna Miyamoto, Sayaka Shiota, Hitoshi Kiya |
| 2019 | Iterative Delexicalization for Improved Spoken Language Understanding. Avik Ray, Yilin Shen, Hongxia Jin |
| 2019 | Jasper: An End-to-End Convolutional Neural Acoustic Model. Jason Li, Vitaly Lavrukhin, Boris Ginsburg, Ryan Leary, Oleksii Kuchaiev, Jonathan M. Cohen, Huyen Nguyen, Ravi Teja Gadde |
| 2019 | Joint Decoding of CTC Based Systems for Speech Recognition. Jiaqi Guo, Yongbin You, Yanmin Qian, Kai Yu |
| 2019 | Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR. Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen |
| 2019 | Joint Maximization Decoder with Neural Converters for Fully Neural Network-Based Japanese Speech Recognition. Takafumi Moriya, Jian Wang, Tomohiro Tanaka, Ryo Masumura, Yusuke Shinohara, Yoshikazu Yamaguchi, Yushi Aono |
| 2019 | Joint Optimization of Neural Acoustic Beamforming and Dereverberation with x-Vectors for Robust Speaker Verification. Joon-Young Yang, Joon-Hyuk Chang |
| 2019 | Joint Speech Recognition and Speaker Diarization via Sequence Transduction. Laurent El Shafey, Hagen Soltau, Izhak Shafran |
| 2019 | Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog. Chiori Hori, Anoop Cherian, Tim K. Marks, Takaaki Hori |
| 2019 | Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and WaveNet. Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi |
| 2019 | Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition. Bin Liu, Shuai Nie, Shan Liang, Wenju Liu, Meng Yu, Lianwu Chen, Shouye Peng, Changliang Li |
| 2019 | Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams. Songxiang Liu, Yuewen Cao, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng |
| 2019 | KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement. Li Chai, Jun Du, Chin-Hui Lee |
| 2019 | Kernel Machines Beat Deep Neural Networks on Mask-Based Single-Channel Speech Enhancement. Like Hui, Siyuan Ma, Mikhail Belkin |
| 2019 | Keyword Spotting for Hearing Assistive Devices Robust to External Speakers. Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen |
| 2019 | Kite: Automatic Speech Recognition for Unmanned Aerial Vehicles. Dan Oneata, Horia Cucu |
| 2019 | Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System. Wangyou Zhang, Xuankai Chang, Yanmin Qian |
| 2019 | Knowledge Distillation for Throat Microphone Speech Recognition. Takahito Suzuki, Jun Ogata, Takashi Tsunakawa, Masafumi Nishida, Masafumi Nishimura |
| 2019 | Knowledge-Based Linguistic Encoding for End-to-End Mandarin Text-to-Speech Synthesis. Jingbei Li, Zhiyong Wu, Runnan Li, Pengpeng Zhi, Song Yang, Helen Meng |
| 2019 | L2 Pronunciation Accuracy and Context: A Pilot Study on the Realization of Geminates in Italian as L2 by French Learners. Sonia D'Apolito, Barbara Gili Fivela |
| 2019 | LEAP Diarization System for the Second DIHARD Challenge. Prachi Singh, Harsha Vardhan, Sriram Ganapathy, Ahilan Kanagasundaram |
| 2019 | LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition. Shoukang Hu, Xurong Xie, Shansong Liu, Max W. Y. Lam, Jianwei Yu, Xixin Wu, Xunying Liu, Helen Meng |
| 2019 | LSTM Based Similarity Measurement with Spectral Clustering for Speaker Diarization. Qingjian Lin, Ruiqing Yin, Ming Li, Hervé Bredin, Claude Barras |
| 2019 | Label Driven Time-Frequency Masking for Robust Continuous Speech Recognition. Meet H. Soni, Ashish Panda |
| 2019 | Language Learning Using Speech to Image Retrieval. Danny Merkx, Stefan L. Frank, Mirjam Ernestus |
| 2019 | Language Modeling with Deep Transformers. Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney |
| 2019 | Language Recognition Using Triplet Neural Networks. Victoria Mingote, Diego Castán, Mitchell McLaren, Mahesh Kumar Nandwana, Alfonso Ortega Giménez, Eduardo Lleida, Antonio Miguel |
| 2019 | Large Margin Softmax Loss for Speaker Verification. Yi Liu, Liang He, Jia Liu |
| 2019 | Large Margin Training for Attention Based End-to-End Speech Recognition. Peidong Wang, Jia Cui, Chao Weng, Dong Yu |
| 2019 | Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition. Khoi-Nguyen C. Mac, Xiaodong Cui, Wei Zhang, Michael Picheny |
| 2019 | Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model. Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee |
| 2019 | Large-Scale Speaker Diarization of Radio Broadcast Archives. Emre Yilmaz, Adem Derinel, Kun Zhou, Henk van den Heuvel, Niko Brummer, Haizhou Li, David A. van Leeuwen |
| 2019 | Large-Scale Speaker Retrieval on Random Speaker Variability Subspace. Suwon Shon, Younggun Lee, Taesu Kim |
| 2019 | Large-Scale Visual Speech Recognition. Brendan Shillingford, Yannis M. Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Misha Denil, Ben Coppin, Ben Laurie, Andrew W. Senior, Nando de Freitas |
| 2019 | Latent Dirichlet Allocation Based Acoustic Data Selection for Automatic Speech Recognition. Mortaza Doulaty, Thomas Hain |
| 2019 | Latent Topic Attention for Domain Classification. Peisong Huang, Peijie Huang, Wencheng Ai, Jiande Ding, Jinchuan Zhang |
| 2019 | Lattice Generation in Attention-Based Speech Recognition Models. Michal Zapotoczny, Piotr Pietrzak, Adrian Lancucki, Jan Chorowski |
| 2019 | Lattice Re-Scoring During Manual Editing for Automatic Error Correction of ASR Transcripts. Anna V. Rúnarsdóttir, Inga Rún Helgadóttir, Jón Guðnason |
| 2019 | Lattice-Based Lightly-Supervised Acoustic Model Training. Joachim Fainberg, Ondrej Klejch, Steve Renals, Peter Bell |
| 2019 | Laughter Dynamics in Dyadic Conversations. Bogdan Ludusan, Petra Wagner |
| 2019 | Layer Trajectory BLSTM. Eric Sun, Jinyu Li, Yifan Gong |
| 2019 | Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen |
| 2019 | Learning Alignment for Multimodal Emotion Recognition from Speech. Haiyang Xu, Hui Zhang, Kun Han, Yun Wang, Yiping Peng, Xiangang Li |
| 2019 | Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection. Yu-Han Shen, Ke-Xin He, Wei-Qiang Zhang |
| 2019 | Learning Natural Language Interfaces with Neural Models. Mirella Lapata |
| 2019 | Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks. Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio |
| 2019 | Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks. Leda Sari, Samuel Thomas, Mark A. Hasegawa-Johnson |
| 2019 | Learning Speaker Representations with Mutual Information. Mirco Ravanelli, Yoshua Bengio |
| 2019 | Learning Temporal Clusters Using Capsule Routing for Speech Emotion Recognition. Md Asif Jalal, Erfan Loweimi, Roger K. Moore, Thomas Hain |
| 2019 | Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran |
| 2019 | Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice. Vikramjit Mitra, Sue Booker, Erik Marchi, David Scott Farrar, Ute Dorothea Peitz, Bridget Cheng, Ermine Teves, Anuj Mehta, Devang Naik |
| 2019 | Leveraging a Character, Word and Prosody Triplet for an ASR Error Robust and Agglutination Friendly Punctuation Approach. György Szaszák, Máté Ákos Tündik |
| 2019 | Lexically Guided Perceptual Learning of a Vowel Shift in an Interactive L2 Listening Context. Emily Felker, Mirjam Ernestus, Mirjam Broersma |
| 2019 | LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu |
| 2019 | Linear Discriminant Differential Evolution for Feature Selection in Emotional Speech Recognition. Soumaya Gharsellaoui, Sid-Ahmed Selouani, Mohammed Sidi Yakoub |
| 2019 | Linguistically Motivated Parallel Data Augmentation for Code-Switch Language Modeling. Grandee Lee, Xianghu Yue, Haizhou Li |
| 2019 | Linguistically-Informed Training of Acoustic Word Embeddings for Low-Resource Languages. Zixiaofan Yang, Julia Hirschberg |
| 2019 | LipSound: Neural Mel-Spectrogram Reconstruction for Lip Reading. Leyuan Qu, Cornelius Weber, Stefan Wermter |
| 2019 | Liquid Deletion in French Child-Directed Speech. Sharon Peperkamp, Monica Hegde, Maria Julia Carbajal |
| 2019 | Listen, Attend, Spell and Adapt: Speaker Adapted Sequence-to-Sequence ASR. Felix Weninger, Jesús Andrés-Ferrer, Xinwei Li, Puming Zhan |
| 2019 | Listener Preference on the Local Criterion for Ideal Binary-Masked Speech. Zhuohuang Zhang, Yi Shen |
| 2019 | Listeners' Ability to Identify the Gender of Preadolescent Children in Different Linguistic Contexts. Shawn L. Nissen, Sharalee Blunck, Anita Dromey, Christopher Dromey |
| 2019 | Listening with Great Expectations: An Investigation of Word Form Anticipations in Naturalistic Speech. Martijn Bentum, Louis ten Bosch, Antal van den Bosch, Mirjam Ernestus |
| 2019 | Locality-Constrained Linear Coding Based Fused Visual Features for Robust Acoustic Event Classification. Manjunath Mulimani, Shashidhar G. Koolagudi |
| 2019 | Lombard Speech Synthesis Using Transfer Learning in a Tacotron Text-to-Speech System. Bajibabu Bollepalli, Lauri Juvela, Paavo Alku |
| 2019 | Long Range Acoustic Features for Spoofed Speech Detection. Rohan Kumar Das, Jichen Yang, Haizhou Li |
| 2019 | Low Resource Automatic Intonation Classification Using Gated Recurrent Unit (GRU) Networks Pre-Trained with Synthesized Pitch Patterns. Atreyee Saha, Chiranjeevi Yarra, Prasanta Kumar Ghosh |
| 2019 | Low-Dimensional Bottleneck Features for On-Device Continuous Speech Recognition. David B. Ramsay, Kevin Kilgour, Dominik Roblek, Matthew Sharifi |
| 2019 | Lyrics Recognition from Singing Voice Focused on Correspondence Between Voice and Notes. Motoyuki Suzuki, Sho Tomita, Tomoki Morita |
| 2019 | M2H-GAN: A GAN-Based Mapping from Machine to Human Transcripts for Speech Understanding. Titouan Parcollet, Mohamed Morchid, Xavier Bost, Georges Linarès |
| 2019 | MCE 2018: The 1st Multi-Target Speaker Detection and Identification Challenge Evaluation. Suwon Shon, Najim Dehak, Douglas A. Reynolds, James R. Glass |
| 2019 | MOSNet: Deep Learning-Based Objective Assessment for Voice Conversion. Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang |
| 2019 | Masking Estimation with Phase Restoration of Clean Speech for Monaural Speech Enhancement. Xianyun Wang, Changchun Bao |
| 2019 | Maximum a posteriori Speech Enhancement Based on Double Spectrum. Pejman Mowlaee, Daniel Scheran, Johannes Stahl, Sean U. N. Wood, W. Bastiaan Kleijn |
| 2019 | Meeting Transcription Using Asynchronous Distant Microphones. Takuya Yoshioka, Dimitrios Dimitriadis, Andreas Stolcke, William Hinthorn, Zhuo Chen, Michael Zeng, Xuedong Huang |
| 2019 | Mel-Frequency Cepstral Coefficients of Voice Source Waveforms for Classification of Phonation Types in Speech. Sudarsana Reddy Kadiri, Paavo Alku |
| 2019 | Meta Learning for Hyperparameter Optimization in Dialogue System. Jen-Tzung Chien, Wei Xiang Lieow |
| 2019 | Mining Polysemous Triplets with Recurrent Neural Networks for Spoken Language Understanding. Vedran Vukotic, Christian Raymond |
| 2019 | Mirroring to Build Trust in Digital Assistants. Katherine Metcalf, Barry-John Theobald, Garrett Weinberg, Robert Lee, Ing-Marie Jonsson, Russ Webb, Nicholas Apostoloff |
| 2019 | Mitigating Gender and L1 Differences to Improve State and Trait Recognition. Guozhen An, Rivka Levitan |
| 2019 | Mitigating Noisy Inputs for Question Answering. Denis Peskov, Joe Barrow, Pedro Rodriguez, Graham Neubig, Jordan L. Boyd-Graber |
| 2019 | Mixup Learning Strategies for Text-Independent Speaker Verification. Yingke Zhu, Tom Ko, Brian Mak |
| 2019 | MobiLipNet: Resource-Efficient Deep Learning Based Lipreading. Alexandros Koumparoulis, Gerasimos Potamianos |
| 2019 | MobiVSR: Efficient and Light-Weight Neural Network for Visual Speech Recognition on Mobile Devices. Nilay Shrivastava, Astitwa Saxena, Yaman Kumar, Rajiv Ratn Shah, Amanda Stent, Debanjan Mahata, Preeti Kaur, Roger Zimmermann |
| 2019 | Modeling Interpersonal Linguistic Coordination in Conversations Using Word Mover's Distance. Md. Nasir, Sandeep Nallan Chakravarthula, Brian R. W. Baucom, David C. Atkins, Panayiotis G. Georgiou, Shrikanth Narayanan |
| 2019 | Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning. Théo Biasutto-Lervat, Sara Dahmani, Slim Ouni |
| 2019 | Modeling User Context for Valence Prediction from Narratives. Aniruddha Tammewar, Alessandra Cervone, Eva-Maria Messner, Giuseppe Riccardi |
| 2019 | Modification of Devoicing Error in Cleft Lip and Palate Speech. Protima Nomo Sudro, S. R. Mahadeva Prasanna |
| 2019 | Modulation Vectors as Robust Feature Representation for ASR in Domain Mismatched Conditions. Samik Sadhu, Hynek Hermansky |
| 2019 | Monaural Speech Enhancement with Dilated Convolutions. Shadi Pirhosseinloo, Jonathan S. Brumberg |
| 2019 | Multi-Accent Adaptation Based on Gate Mechanism. Han Zhu, Li Wang, Pengyuan Zhang, Yonghong Yan |
| 2019 | Multi-Channel Block-Online Source Extraction Based on Utterance Adaptation. Juan M. Martín-Doñas, Jens Heitkaemper, Reinhold Haeb-Umbach, Angel M. Gomez, Antonio M. Peinado |
| 2019 | Multi-Channel Speech Enhancement Using Time-Domain Convolutional Denoising Autoencoder. Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa |
| 2019 | Multi-Channel Training for End-to-End Speaker Recognition Under Reverberant and Noisy Environment. Danwei Cai, Xiaoyi Qin, Ming Li |
| 2019 | Multi-Corpus Acoustic-to-Articulatory Speech Inversion. Nadee Seneviratne, Ganesh Sivaraman, Carol Y. Espy-Wilson |
| 2019 | Multi-Dialect Acoustic Modeling Using Phone Mapping and Online i-Vectors. Harish Arsikere, Ashtosh Sapru, Sri Garimella |
| 2019 | Multi-Graph Decoding for Code-Switching ASR. Emre Yilmaz, Samuel Cohen, Xianghu Yue, David A. van Leeuwen, Haizhou Li |
| 2019 | Multi-Level Adaptive Speech Activity Detector for Speech in Naturalistic Environments. Bidisha Sharma, Rohan Kumar Das, Haizhou Li |
| 2019 | Multi-Lingual Dialogue Act Recognition with Deep Learning Methods. Jiří Martínek, Pavel Král, Ladislav Lenc, Christophe Cerisara |
| 2019 | Multi-Microphone Adaptive Noise Cancellation for Robust Hotword Detection. Yiteng Huang, Turaj Zakizadeh Shabestary, Alexander Gruenstein, Li Wan |
| 2019 | Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription. Saurabh Sahu, Vikramjit Mitra, Nadee Seneviratne, Carol Y. Espy-Wilson |
| 2019 | Multi-Modal Sentiment Analysis Using Deep Canonical Correlation Analysis. Zhongkai Sun, Prathusha Kameswara Sarma, William A. Sethares, Erik P. Bucy |
| 2019 | Multi-PLDA Diarization on Children's Speech. Jiamin Xie, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur |
| 2019 | Multi-Scale Time-Frequency Attention for Acoustic Event Detection. Jingyang Zhang, Wenhao Ding, Jintao Kang, Liang He |
| 2019 | Multi-Span Acoustic Modelling Using Raw Waveform Signals. Patrick von Platen, Chao Zhang, Philip C. Woodland |
| 2019 | Multi-Stream Network with Temporal Attention for Environmental Sound Classification. Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff |
| 2019 | Multi-Stride Self-Attention for Speech Recognition. Kyu Jeong Han, Jing Huang, Yun Tang, Xiaodong He, Bowen Zhou |
| 2019 | Multi-Task CTC Training with Auxiliary Feature Reconstruction for End-to-End Speech Recognition. Gakuto Kurata, Kartik Audhkhasi |
| 2019 | Multi-Task Discriminative Training of Hybrid DNN-TVM Model for Speaker Verification with Noisy and Far-Field Speech. Arindam Jati, Raghuveer Peri, Monisankha Pal, Tae Jin Park, Naveen Kumar, Ruchir Travadi, Panayiotis G. Georgiou, Shrikanth Narayanan |
| 2019 | Multi-Task Learning with High-Order Statistics for x-Vector Based Text-Independent Speaker Verification. Lanhua You, Wu Guo, Li-Rong Dai, Jun Du |
| 2019 | Multi-Task Multi-Network Joint-Learning of Deep Residual Networks and Cycle-Consistency Generative Adversarial Networks for Robust Speech Recognition. Shengkui Zhao, Chongjia Ni, Rong Tong, Bin Ma |
| 2019 | Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition. Dhananjaya Gowda, Abhinav Garg, Kwangyoun Kim, Mehul Kumar, Chanwoo Kim |
| 2019 | Multichannel Loss Function for Supervised Speech Source Separation by Mask-Based Beamforming. Yoshiki Masuyama, Masahito Togami, Tatsuya Komatsu |
| 2019 | Multilingual Speech Recognition with Corpus Relatedness Sampling. Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze |
| 2019 | Multimedia Simultaneous Translation System for Minority Language Communication with Mandarin. Shen Huang, Bojie Hu, Shan Huang, Pengfei Hu, Jian Kang, Zhiqiang Lv, Jinghao Yan, Qi Ju, Shiyin Kang, Deyi Tuo, Guangzhi Li, Nurmemet Yolwas |
| 2019 | Multimodal Articulation-Based Pronunciation Error Detection with Spectrogram and Acoustic Features. Sabrina Jenne, Ngoc Thang Vu |
| 2019 | Multimodal Dialog with the MALACH Audiovisual Archive. Adam Chýlek, Lubos Smídl, Jan Svec |
| 2019 | Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation. Shota Horiguchi, Naoyuki Kanda, Kenji Nagamatsu |
| 2019 | Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues. Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Tomohiro Nakatani |
| 2019 | Multimodal Word Discovery and Retrieval with Phone Sequence and Image Concepts. Liming Wang, Mark A. Hasegawa-Johnson |
| 2019 | Multiple Sound Source Localization with SVD-PHAT. François Grondin, James R. Glass |
| 2019 | Multiview Shared Subspace Learning Across Speakers and Speech Commands. Krishna Somandepalli, Naveen Kumar, Arindam Jati, Panayiotis G. Georgiou, Shrikanth Narayanan |
| 2019 | Music Genre Classification Using Duplicated Convolutional Layers in Neural Networks. Hansi Yang, Wei-Qiang Zhang |
| 2019 | My Lips Are Concealed: Audio-Visual Speech Enhancement Through Obstructions. Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman |
| 2019 | NIESR: Nuisance Invariant End-to-End Speech Recognition. I-Hung Hsu, Ayush Jaiswal, Premkumar Natarajan |
| 2019 | NITK Kids' Speech Corpus. Pravin Bhaskar Ramteke, Sujata Supanekar, Pradyoth Hegde, Hanna Nelson, Venkataraja Aithal, Shashidhar G. Koolagudi |
| 2019 | NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion. Chitralekha Gupta, Karthika Vijayan, Bidisha Sharma, Xiaoxue Gao, Haizhou Li |
| 2019 | Nasal Air Emission in Sibilant Fricatives of Cleft Lip and Palate Speech. Sishir Kalita, Protima Nomo Sudro, S. R. Mahadeva Prasanna, Samarendra Dandapat |
| 2019 | Nasal Consonant Discrimination in Infant- and Adult-Directed Speech. Bogdan Ludusan, Annett Jorschick, Reiko Mazuka |
| 2019 | Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion. Alex Sokolov, Tracy Rohlin, Ariya Rastrow |
| 2019 | Neural Named Entity Recognition from Subword Units. Abdalghani Abujabal, Judith Gaspers |
| 2019 | Neural Network Distillation on IoT Platforms for Sound Event Detection. Gianmarco Cerutti, Rahul Prasad, Alessio Brutti, Elisabetta Farella |
| 2019 | Neural Network-Based Modeling of Phonetic Durations. Xizi Wei, Melvyn Hunt, Adrian Skilling |
| 2019 | Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu |
| 2019 | Neural Text Clustering with Document-Level Attention Based on Dynamic Soft Labels. Zhi Chen, Wu Guo, Li-Rong Dai, Zhen-Hua Ling, Jun Du |
| 2019 | Neural Transfer Learning for Cry-Based Diagnosis of Perinatal Asphyxia. Charles C. Onu, Jonathan Lebensold, William L. Hamilton, Doina Precup |
| 2019 | Neural Transition Systems for Modeling Hierarchical Semantic Representations. Riyaz Ahmad Bhat, John Chen, Rashmi Prasad, Srinivas Bangalore |
| 2019 | Neural Whispered Speech Detection with Imbalanced Learning. Takanori Ashihara, Yusuke Shinohara, Hiroshi Sato, Takafumi Moriya, Kiyoaki Matsui, Takaaki Fukutomi, Yoshikazu Yamaguchi, Yushi Aono |
| 2019 | No Distributional Learning in Adults from Attended Listening to Non-Speech. Ellen Marklund, Johan Sjons, Lisa Gustavsson, Elísabet Eir Cortes |
| 2019 | Noise Adaptive Speech Enhancement Using Domain Adversarial Training. Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang |
| 2019 | Noisy BiLSTM-Based Models for Disfluency Detection. Nguyen Bach, Fei Huang |
| 2019 | Non-Parallel Voice Conversion Using Weighted Generative Adversarial Networks. Dipjyoti Paul, Yannis Pantazis, Yannis Stylianou |
| 2019 | Non-Parallel Voice Conversion with Cyclic Variational Autoencoder. Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda |
| 2019 | Nonparallel Emotional Speech Conversion. Jian Gao, Deep Chakraborty, Hamidou Tembine, Olaitan Olaleye |
| 2019 | Normal Variance-Mean Mixtures for Unsupervised Score Calibration. Sandro Cumani |
| 2019 | Objective Assessment of Social Skills Using Automated Language Analysis for Identification of Schizophrenia and Bipolar Disorder. Rohit Voleti, Stephanie Woolridge, Julie M. Liss, Melissa Milanovic, Christopher R. Bowie, Visar Berisha |
| 2019 | Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS. Éva Székely, Gustav Eje Henter, Jonas Beskow, Joakim Gustafson |
| 2019 | On Learning Interpretable CNNs with Parametric Modulated Kernel-Based Filters. Erfan Loweimi, Peter Bell, Steve Renals |
| 2019 | On Mitigating Acoustic Feedback in Hearing Aids with Frequency Warping by All-Pass Networks. Ching Hua Lee, Kuan-Lin Chen, Fredric J. Harris, Bhaskar D. Rao, Harinath Garudadri |
| 2019 | On Nonlinear Spatial Filtering in Multichannel Speech Enhancement. Kristina Tesch, Robert Rehr, Timo Gerkmann |
| 2019 | On Robustness of Unsupervised Domain Adaptation for Speaker Recognition. Pierre-Michel Bousquet, Mickael Rouvier |
| 2019 | On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition. Kazuki Irie, Rohit Prabhavalkar, Anjuli Kannan, Antoine Bruguier, David Rybach, Patrick Nguyen |
| 2019 | On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval. Ankita Pasad, Bowen Shi, Herman Kamper, Karen Livescu |
| 2019 | On the End-to-End Solution to Mandarin-English Code-Switching Speech Recognition. Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng, Haizhou Li |
| 2019 | On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music. Bidisha Sharma, Rohan Kumar Das, Haizhou Li |
| 2019 | On the Role of Oral Configurations in European Portuguese Nasal Vowels. Conceição Cunha, Samuel S. Silva, António J. S. Teixeira, Catarina Oliveira, Paula Martins, Arun A. Joseph, Jens Frahm |
| 2019 | On the Role of Style in Parsing Speech with Neural Models. Trang Tran, Jiahong Yuan, Yang Liu, Mari Ostendorf |
| 2019 | On the Suitability of the Riesz Spectro-Temporal Envelope for WaveNet Based Speech Synthesis. Jitendra Kumar Dhiman, Nagaraj Adiga, Chandra Sekhar Seelamantula |
| 2019 | On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction. Shuai Wang, Johan Rohdin, Lukáš Burget, Oldřich Plchot, Yanmin Qian, Kai Yu, Jan Černocký |
| 2019 | On the Use of Pitch Features for Disordered Speech Recognition. Shansong Liu, Shoukang Hu, Xunying Liu, Helen Meng |
| 2019 | On the Use/Misuse of the Term 'Phoneme'. Roger K. Moore, Lucy Skidmore |
| 2019 | One-Pass Single-Channel Noisy Speech Recognition Using a Combination of Noisy and Enhanced Features. Masakiyo Fujimoto, Hisashi Kawai |
| 2019 | One-Shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization. Ju-Chieh Chou, Hung-yi Lee |
| 2019 | One-Shot Voice Conversion with Disentangled Representations by Leveraging Phonetic Posteriorgrams. Seyed Hamidreza Mohammadi, Taehwan Kim |
| 2019 | One-Shot Voice Conversion with Global Speaker Embeddings. Hui Lu, Zhiyong Wu, Dongyang Dai, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng |
| 2019 | One-vs-All Models for Asynchronous Training: An Empirical Analysis. Rahul Gupta, Aman Alok, Shankar Ananthakrishnan |
| 2019 | Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Ta Li, Yonghong Yan |
| 2019 | Online Speech Processing and Analysis Suite. Wikus Pienaar, Daan Wissing |
| 2019 | Open-Vocabulary Keyword Spotting with Audio and Text Embeddings. Niccolò Sacchi, Alexandre Nanchen, Martin Jaggi, Milos Cernak |
| 2019 | Optimization of False Acceptance/Rejection Rates and Decision Threshold for End-to-End Text-Dependent Speaker Verification Systems. Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega Giménez, Eduardo Lleida |
| 2019 | Optimizing Speech-Input Length for Speaker-Independent Depression Classification. Tomasz Rutowski, Amir Harati, Yang Lu, Elizabeth Shriberg |
| 2019 | Optimizing Voice Activity Detection for Noisy Conditions. Ruixi Lin, Charles Costello, Charles Jankowski, Vishwas Mruthyunjaya |
| 2019 | Optimizing a Speaker Embedding Extractor Through Backend-Driven Regularization. Luciana Ferrer, Mitchell McLaren |
| 2019 | Ordinal Triplet Loss: Investigating Sleepiness Detection from Speech. Peter Wu, Sai Krishna Rallabandi, Alan W. Black, Eric Nyberg |
| 2019 | PASCAL and DPA: A Pilot Study on Using Prosodic Competence Scores to Predict Communicative Skills for Team Working and Public Speaking. Oliver Niebuhr, Jan Michalsky |
| 2019 | Parallel vs. Non-Parallel Voice Conversion for Esophageal Speech. Luis Serrano, Sneha Raman, David Tavarez, Eva Navas, Inma Hernáez |
| 2019 | Parameter Enhancement for MELP Speech Codec in Noisy Communication Environment. Min-Jae Hwang, Hong-Goo Kang |
| 2019 | Parameter-Transfer Learning for Low-Resource Individualization of Head-Related Transfer Functions. Xiaoke Qi, Lu Wang |
| 2019 | Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation. Fadi Biadsy, Ron J. Weiss, Pedro J. Moreno, Dimitri Kanevsky, Ye Jia |
| 2019 | Perceiving Older Adults Producing Clear and Lombard Speech. Chris Davis, Jeesun Kim |
| 2019 | Perception of Pitch Contours in Speech and Nonspeech. Daniel R. Turner, Ann R. Bradlow, Jennifer S. Cole |
| 2019 | Perceptual Adaptation to Device and Human Voices: Learning and Generalization of a Phonetic Shift Across Real and Voice-AI Talkers. Bruno Ferenc Segedin, Michelle Cohn, Georgia Zellou |
| 2019 | Perceptual Evaluation of Early versus Late F0 Peaks in the Intonation Structure of Czech Question-Word Questions. Pavel Sturm, Jan Volín |
| 2019 | Perceptual Optimization of an Enhanced Geometric Vocal Fold Model for Articulatory Speech Synthesis. Peter Birkholz, Susanne Drechsel, Simon Stone |
| 2019 | Performance Monitoring for End-to-End Speech Recognition. Ruizhi Li, Gregory Sell, Hynek Hermansky |
| 2019 | Personalized Dialogue Response Generation Learned from Monologues. Feng-Guang Su, Aliyah R. Hsu, Yi-Lin Tuan, Hung-yi Lee |
| 2019 | Personalizing ASR for Dysarthric and Accented Speech with Limited Data. Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael P. Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias |
| 2019 | Phase Synchronization Between EEG Signals as a Function of Differences Between Stimuli Characteristics. Louis ten Bosch, Kimberley Mulder, Louis Boves |
| 2019 | Phone Aware Nearest Neighbor Technique Using Spectral Transition Measure for Non-Parallel Voice Conversion. Nirmesh J. Shah, Hemant A. Patil |
| 2019 | Phone-Attribute Posteriors to Evaluate the Speech of Cochlear Implant Users. Tomás Arias-Vergara, Juan Rafael Orozco-Arroyave, Milos Cernak, Sandra Gollwitzer, Maria Schuster, Elmar Nöth |
| 2019 | Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models. Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak |
| 2019 | Phonet: A Tool Based on Gated Recurrent Neural Networks to Extract Phonological Posteriors from Speech. Juan Camilo Vásquez-Correa, Philipp Klumpp, Juan Rafael Orozco-Arroyave, Elmar Nöth |
| 2019 | Phonetic Accommodation in a Wizard-of-Oz Experiment: Intonation and Segments. Iona Gessinger, Bernd Möbius, Bistra Andreeva, Eran Raveh, Ingmar Steiner |
| 2019 | Phonetic Detail Encoding in Explaining the Size of Speech Planning Window. Shan Luo |
| 2019 | Phonetically-Aware Embeddings, Wide Residual Networks with Time-Delay Neural Networks and Self Attention Models for the 2018 NIST Speaker Recognition Evaluation. Ignacio Viñals, Dayana Ribas, Victoria Mingote, Jorge Llombart, Pablo Gimeno, Antonio Miguel, Alfonso Ortega Giménez, Eduardo Lleida |
| 2019 | Phonological Awareness of French Rising Contours in Japanese Learners. Rachel Albar, Hiyon Yoo |
| 2019 | Physiology and Physics of Voice Production. Manfred Kaltenbacher |
| 2019 | Pindrop Labs' Submission to the First Multi-Target Speaker Detection and Identification Challenge. Elie Khoury, Khaled Lakhdhar, Andrew Vaughan, Ganesh Sivaraman, Parav Nagarsheth |
| 2019 | Pitch Accent Trajectories Across Different Conditions of Visibility and Information Structure - Evidence from Spontaneous Dyadic Interaction. Petra Wagner, Nataliya Bryhadyr, Marin Schröer |
| 2019 | Place Shift as an Autonomous Process: Evidence from Japanese Listeners. Yuriko Yokoe |
| 2019 | Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-Level Embedding Features. Zexin Cai, Yaogen Yang, Chuxiong Zhang, Xiaoyi Qin, Ming Li |
| 2019 | Practical Applicability of Deep Neural Networks for Overlapping Speaker Separation. Pieter Appeltans, Jeroen Zegers, Hugo Van hamme |
| 2019 | Pre-Trained Text Embeddings for Enhanced Text-to-Speech Synthesis. Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Shubham Toshniwal, Karen Livescu |
| 2019 | Pre-Trained Text Representations for Improving Front-End Text Processing in Mandarin Text-to-Speech Synthesis. Bing Yang, Jiaqi Zhong, Shan Liu |
| 2019 | Predicting Behavior in Cancer-Afflicted Patient and Spouse Interactions Using Speech and Language. Sandeep Nallan Chakravarthula, Haoqi Li, Shao-Yen Tseng, Maija Reblin, Panayiotis G. Georgiou |
| 2019 | Predicting Group Performances Using a Personality Composite-Network Architecture During Collaborative Task. Shun-Chang Zhong, Yun-Shao Lin, Chun-Min Chang, Yi-Ching Liu, Chi-Chun Lee |
| 2019 | Predicting Group-Level Skin Attention to Short Movies from Audio-Based LSTM-Mixture of Experts Models. Ricardo Kleinlein, Cristina Luna Jiménez, Juan Manuel Montero, Zoraida Callejas, Fernando Fernández Martínez |
| 2019 | Predicting Humor by Learning from Time-Aligned Comments. Zixiaofan Yang, Bingyan Hu, Julia Hirschberg |
| 2019 | Predicting Speech Intelligibility of Enhanced Speech Using Phone Accuracy of DNN-Based ASR System. Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani, Katsuhiko Yamamoto, Toshio Irino |
| 2019 | Predicting the Leading Political Ideology of YouTube Channels Using Acoustic, Textual, and Metadata Information. Yoan Dinkov, Ahmed Ali, Ivan Koychev, Preslav Nakov |
| 2019 | Predictive Auxiliary Variational Autoencoder for Representation Learning of Global Speech Characteristics. Sebastian Springenberg, Egor Lakomkin, Cornelius Weber, Stefan Wermter |
| 2019 | Pretraining by Backtranslation for End-to-End ASR in Low-Resource Settings. Matthew Wiesner, Adithya Renduchintala, Shinji Watanabe, Chunxi Liu, Najim Dehak, Sanjeev Khudanpur |
| 2019 | Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion? Brij Mohan Lal Srivastava, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent |
| 2019 | Privacy-Preserving Siamese Feature Extraction for Gender Recognition versus Speaker Identification. Alexandru Nelus, Silas Rech, Timm Koppelmann, Henrik Biermann, Rainer Martin |
| 2019 | Privacy-Preserving Speaker Recognition with Cohort Score Normalisation. Andreas Nautsch, Jose Patino, Amos Treiber, Themos Stafylakis, Petr Mizera, Massimiliano Todisco, Thomas Schneider, Nicholas W. D. Evans |
| 2019 | Privacy-Preserving Variational Information Feature Extraction for Domestic Activity Monitoring versus Speaker Identification. Alexandru Nelus, Janek Ebbers, Reinhold Haeb-Umbach, Rainer Martin |
| 2019 | Probabilistic Permutation Invariant Training for Speech Separation. Midia Yousefi, Soheil Khorram, John H. L. Hansen |
| 2019 | Probability Density Distillation with Generative Adversarial Networks for High-Quality Parallel Waveform Generation. Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim |
| 2019 | Profiling Speech Motor Impairments in Persons with Amyotrophic Lateral Sclerosis: An Acoustic-Based Approach. Hannah P. Rowe, Jordan R. Green |
| 2019 | Progressive Speech Enhancement with Residual Connections. Jorge Llombart, Dayana Ribas, Antonio Miguel, Luis Vicente, Alfonso Ortega Giménez, Eduardo Lleida |
| 2019 | Prosodic Characteristics of Mandarin Declarative and Interrogative Utterances in Parkinson's Disease. Lei Liu, Meng Jian, Wentao Gu |
| 2019 | Prosodic Effects on Plosive Duration in German and Austrian German. Barbara Schuppler, Margaret Zellers |
| 2019 | Prosodic Factors Influencing Vowel Reduction in Russian. Daniil Kocharov, Tatiana Kachkovskaia, Pavel A. Skrelin |
| 2019 | Prosodic Phrase Alignment for Machine Dubbing. Alp Öktem, Mireia Farrús, Antonio Bonafonte |
| 2019 | Prosodic Representations of Prominence Classification Neural Networks and Autoencoders Using Bottleneck Features. Sofoklis Kakouros, Antti Suni, Juraj Simko, Martti Vainio |
| 2019 | Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech. Chenda Li, Yanmin Qian |
| 2019 | PyToBI: A Toolkit for ToBI Labeling Under Python. Mónica Domínguez, Patrick Louis Rohrer, Juan Soler Company |
| 2019 | Pyramid Memory Block and Timestep Attention for Speech Emotion Recognition. Miao Cao, Chun Yang, Fang Zhou, Xu-Cheng Yin |
| 2019 | Qualitative Evaluation of ASR Adaptation in a Lecture Context: Application to the PASTEL Corpus. Salima Mdhaffar, Yannick Estève, Nicolas Hernandez, Antoine Laurent, Richard Dufour, Solen Quiniou |
| 2019 | Quality Degradation Diagnosis for Voice Networks - Estimating the Perceived Noisiness, Coloration, and Discontinuity of Transmitted Speech. Gabriel Mittag, Sebastian Möller |
| 2019 | Quantifying Cochlear Implant Users' Ability for Speaker Identification Using CI Auditory Stimuli. Nursadul Mamun, Ria Ghosh, John H. L. Hansen |
| 2019 | Quantifying Expectation Modulation in Human Speech Processing. Martijn Bentum, Louis ten Bosch, Antal van den Bosch, Mirjam Ernestus |
| 2019 | Quantifying Fundamental Frequency Modulation as a Function of Language, Speaking Style and Speaker. Pablo Arantes, Anders Eriksson |
| 2019 | Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation. Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda |
| 2019 | R Lauren Ward, Catherine Robinson, Matthew Paradis, Katherine M. Tucker, Ben G. Shirley |
| 2019 | R-Vectors: New Technique for Adaptation to Room Acoustics. Yuri Y. Khokhlov, Alexander Zatvornitskiy, Ivan Medennikov, Ivan Sorokin, Tatiana Prisyach, Aleksei Romanenko, Anton Mitrofanov, Vladimir Bataev, Andrei Andrusenko, Mariya Korenevskaya, Oleg Petrov |
| 2019 | RWTH ASR Systems for LibriSpeech: Hybrid vs Attention. Christoph Lüscher, Eugen Beck, Kazuki Irie, Markus Kitza, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney |
| 2019 | RadioTalk: A Large-Scale Corpus of Talk Radio Transcripts. Doug Beeferman, William Brannon, Deb Roy |
| 2019 | Rare Sound Event Detection Using Deep Learning and Data Augmentation. Yanping Chen, Hongxia Jin |
| 2019 | RawNet: Advanced End-to-End Deep Neural Network Using Raw Waveforms for Text-Independent Speaker Verification. Jee-weon Jung, Hee-Soo Heo, Ju-ho Kim, Hye-jin Shim, Ha-Jin Yu |
| 2019 | ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems. Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer |
| 2019 | Real Time Online Visual End Point Detection Using Unidirectional LSTM. Tanay Sharma, Rohith Chandrashekar Aralikatti, Dilip Kumar Margam, Abhinav Thanda, Sharad Roy, Pujitha Appan Kandala, Shankar M. Venkatesan |
| 2019 | Real to H-Space Encoder for Speech Recognition. Titouan Parcollet, Mohamed Morchid, Georges Linarès, Renato De Mori |
| 2019 | Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai |
| 2019 | Real-Time One-Pass Decoder for Speech Recognition Using LSTM Language Models. Javier Jorge, Adrià Giménez, Javier Iranzo-Sánchez, Jorge Civera, Albert Sanchís, Alfons Juan |
| 2019 | Recognition of Creaky Voice from Emergency Calls. Lauri Tavi, Tanel Alumäe, Stefan Werner |
| 2019 | Recognition of Intentions of Users' Short Responses for Conversational News Delivery System. Hiroaki Takatsu, Katsuya Yokoyama, Yoichi Matsuyama, Hiroshi Honda, Shinya Fujie, Tetsunori Kobayashi |
| 2019 | Recognition of Latin American Spanish Using Multi-Task Learning. Carlos Mendes, Alberto Abad, João Paulo Neto, Isabel Trancoso |
| 2019 | Recursive Speech Separation for Unknown Number of Speakers. Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji |
| 2019 | Reduced Task Adaptation in Alternating Motion Rate Tasks as an Early Marker of Bulbar Involvement in Amyotrophic Lateral Sclerosis. Marziye Eshghi, Panying Rong, Antje S. Mefferd, Kaila L. Stipancic, Yana Yunusova, Jordan R. Green |
| 2019 | Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks. Zhenyu Tang, John D. Kanu, Kevin Hogan, Dinesh Manocha |
| 2019 | Relevance-Based Feature Masking: Improving Neural Network Based Whale Classification Through Explainable Artificial Intelligence. Dominik Schiller, Tobias Huber, Florian Lingenfelser, Michael Dietz, Andreas Seiderer, Elisabeth André |
| 2019 | Reliability of Clinical Voice Parameters Captured with Smartphones - Measurements of Added Noise and Spectral Tilt. Felix Schaeffler, Stephen Jannetts, Janet Beck |
| 2019 | Replay Attack Detection with Complementary High-Resolution Information Using End-to-End DNN for the ASVspoof 2019 Challenge. Jee-weon Jung, Hye-jin Shim, Hee-Soo Heo, Ha-Jin Yu |
| 2019 | Rescoring Keyword Search Confidence Estimates with Graph-Based Re-Ranking Using Acoustic Word Embeddings. Anna Piunova, Eugen Beck, Ralf Schlüter, Hermann Ney |
| 2019 | Residual + Capsule Networks (ResCap) for Simultaneous Single-Channel Overlapped Keyword Recognition. Yan Xiong, Visar Berisha, Chaitali Chakrabarti |
| 2019 | Reverse Transfer Learning: Can Word Embeddings Trained for Different NLP Tasks Improve Neural Language Models? Lyan Verwimp, Jerome R. Bellegarda |
| 2019 | Robust Bayesian and Light Neural Networks for Voice Spoofing Detection. Radoslaw Bialobrzeski, Michal Kosmider, Mateusz Matuszewski, Marcin Plata, Alexander Rakowski |
| 2019 | Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking. Wangyou Zhang, Ying Zhou, Yanmin Qian |
| 2019 | Robust Keyword Spotting via Recycle-Pooling for Mobile Game. Shounan An, Youngsoo Kim, Hu Xu, Jinwoo Lee, Myungwoo Lee, Insoo Oh |
| 2019 | Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS. Mutian He, Yan Deng, Lei He |
| 2019 | Robust Sound Recognition: A Neuromorphic Approach. Jibin Wu, Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua, Haizhou Li |
| 2019 | Robust Speech Emotion Recognition Under Different Encoding Conditions. Christopher Oates, Andreas Triantafyllopoulos, Ingmar Steiner, Björn W. Schuller |
| 2019 | Robustness of Statistical Voice Conversion Based on Direct Waveform Modification Against Background Sounds. Yusuke Kurita, Kazuhiro Kobayashi, Kazuya Takeda, Tomoki Toda |
| 2019 | SANTLR: Speech Annotation Toolkit for Low Resource Languages. Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze |
| 2019 | SLP-AA: Tools for Sign Language Phonetic and Phonological Research. Roger Yu-Hsiang Lo, Kathleen Currie Hall |
| 2019 | SPEAK YOUR MIND! Towards Imagined Speech Recognition with Hierarchical Deep Learning. Pramit Saha, Muhammad Abdul-Mageed, Sidney S. Fels |
| 2019 | SPIRE-fluent: A Self-Learning App for Tutoring Oral Fluency to Second Language English Learners. Chiranjeevi Yarra, Aparna Srinivasan, Sravani Gottimukkala, Prasanta Kumar Ghosh |
| 2019 | STC Antispoofing Systems for the ASVspoof2019 Challenge. Galina Lavrentyeva, Sergey Novoselov, Andzhukaev Tseren, Marina Volkova, Artem Gorlanov, Alexandr Kozlov |
| 2019 | STC Speaker Recognition Systems for the VOiCES from a Distance Challenge. Sergey Novoselov, Aleksei Gusev, Artem Ivanov, Timur Pekhovsky, Andrey Shulipa, Galina Lavrentyeva, Vladimir Volokhov, Alexandr Kozlov |
| 2019 | Salient Speech Representations Based on Cloned Networks. W. Bastiaan Kleijn, Felicia S. C. Lim, Michael Chinen, Jan Skoglund |
| 2019 | Sampling from Stochastic Finite Automata with Applications to CTC Decoding. Martin Jansche, Alexander Gutkin |
| 2019 | Say What? A Dataset for Exploring the Error Patterns That Two ASR Engines Make. Meredith Moore, Michael Saxon, Hemanth Venkateswara, Visar Berisha, Sethuraman Panchanathan |
| 2019 | Scalable Multi Corpora Neural Language Models for ASR. Anirudh Raju, Denis Filimonov, Gautam Tiwari, Guitang Lan, Ariya Rastrow |
| 2019 | Selection and Training Schemes for Improving TTS Voice Built on Found Data. Fang-Yu Kuo, Iris Chuoying Ouyang, Sandesh Aryal, Pierre Lanchantin |
| 2019 | Self Attention in Variational Sequential Learning for Summarization. Jen-Tzung Chien, Chun-Wei Wang |
| 2019 | Self Multi-Head Attention for Speaker Recognition. Miquel India, Pooyan Safari, Javier Hernando |
| 2019 | Self-Attention Transducers for End-to-End Speech Recognition. Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen |
| 2019 | Self-Attention for Speech Emotion Recognition. Lorenzo Tarantino, Philip N. Garner, Alexandros Lazaridis |
| 2019 | Self-Imitating Feedback Generation Using GAN for Computer-Assisted Pronunciation Training. Seung Hee Yang, Minhwa Chung |
| 2019 | Self-Supervised Speaker Embeddings. Themos Stafylakis, Johan Rohdin, Oldrich Plchot, Petr Mizera, Lukás Burget |
| 2019 | Self-Teaching Networks. Liang Lu, Eric Sun, Yifan Gong |
| 2019 | Semi-Supervised Acoustic Model Training for Five-Lingual Code-Switched ASR. Astik Biswas, Emre Yilmaz, Febe de Wet, Ewald van der Westhuizen, Thomas Niesler |
| 2019 | Semi-Supervised Audio Classification with Consistency-Based Regularization. Kangkang Lu, Chuan-Sheng Foo, Kah Kuan Teh, Huy Dat Tran, Vijay Ramaseshan Chandrasekhar |
| 2019 | Semi-Supervised Prosody Modeling Using Deep Gaussian Process Latent Variable Model. Tomoki Koriyama, Takao Kobayashi |
| 2019 | Semi-Supervised Sequence-to-Sequence ASR Using Unpaired Speech and Text. Murali Karthick Baskar, Shinji Watanabe, Ramón Fernandez Astudillo, Takaaki Hori, Lukás Burget, Jan Cernocký |
| 2019 | Semi-Supervised Voice Conversion with Amortized Variational Inference. Cory Stephenson, Gokce Keskin, Anil Thomas, Oguz H. Elibol |
| 2019 | Sentence Prosody and Wh-Indeterminates in Taiwan Mandarin. Yu-Yin Hsu, Anqi Xu |
| 2019 | Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition. Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura |
| 2019 | Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions. Awni Y. Hannun, Ann Lee, Qiantong Xu, Ronan Collobert |
| 2019 | Shallow-Fusion End-to-End Contextual Biasing. Ding Zhao, Tara N. Sainath, David Rybach, Pat Rondon, Deepti Bhatia, Bo Li, Ruoming Pang |
| 2019 | Shortcut Connections Based Deep Speaker Embeddings for End-to-End Speaker Verification System. Soonshin Seo, Daniel Jun Rim, Minkyu Lim, Donghyun Lee, Hosung Park, Junseok Oh, Changmin Kim, Ji-Hwan Kim |
| 2019 | ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning. Lukasz Dudziak, Mohamed S. Abdelfattah, Ravichander Vipperla, Stefanos Laskaridis, Nicholas D. Lane |
| 2019 | Sibilant Variation in New Englishes: A Comparative Sociophonetic Study of Trinidadian and American English /s(tr)/-Retraction. Wiebke Ahlers, Philipp Meer |
| 2019 | Simultaneous Denoising and Dereverberation for Low-Latency Applications Using Frame-by-Frame Online Unified Convolutional Beamformer. Tomohiro Nakatani, Keisuke Kinoshita |
| 2019 | Simultaneous Detection and Localization of a Wake-Up Word Using Multi-Task Learning of the Duration and Endpoint. Takashi Maekaku, Yusuke Kida, Akihiko Sugiyama |
| 2019 | Sincerity in Acted Speech: Presenting the Sincere Apology Corpus and Results. Alice Baird, Eduardo Coutinho, Julia Hirschberg, Björn W. Schuller |
| 2019 | Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling. Yuan-Hao Yi, Yang Ai, Zhen-Hua Ling, Li-Rong Dai |
| 2019 | Slot Filling with Weighted Multi-Encoders for Out-of-Domain Values. Yuka Kobayashi, Takami Yoshida, Kenji Iwata, Hiroshi Fujimura |
| 2019 | Small-Footprint Magic Word Detection Method Using Convolutional LSTM Neural Network. Taiki Yamamoto, Ryota Nishimura, Masayuki Misaki, Norihide Kitaoka |
| 2019 | Sound Event Detection in Multichannel Audio Using Convolutional Time-Frequency-Channel Squeeze and Excitation. Wei Xia, Kazuhito Koishida |
| 2019 | Sound Privacy: A Conversational Speech Corpus for Quantifying the Experience of Privacy. Pablo Pérez Zarazaga, Sneha Das, Tom Bäckström, Vishnu Vidyadhara Raju Vegesna, Anil Kumar Vuppala |
| 2019 | Sound Tools eXtended (STx) 5.0 - A Powerful Sound Analysis Tool Optimized for Speech. Anton Noll, Jonathan Stuefer, Nicola Klingler, Hannah Leykum, Carina Lozo, Jan Luttenberger, Michael Pucher, Carolin Schmid |
| 2019 | SparseSpeech: Unsupervised Acoustic Unit Discovery with Memory-Augmented Sequence Autoencoders. Benjamin Milde, Chris Biemann |
| 2019 | Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification. Youngmoon Jung, Younggwan Kim, Hyungjun Lim, Yeunju Choi, Hoirin Kim |
| 2019 | Spatial and Spectral Fingerprint in the Brain: Speaker Identification from Single Trial MEG Signals. Debadatta Dash, Paul Ferrari, Jun Wang |
| 2019 | Spatial, Temporal and Spectral Multiresolution Analysis for the INTERSPEECH 2019 ComParE Challenge. Marie-José Caraty, Claude Montacié |
| 2019 | Spatio-Temporal Attention Pooling for Audio Scene Classification. Huy Phan, Oliver Y. Chén, Lam Dang Pham, Philipp Koch, Maarten De Vos, Ian McLoughlin, Alfred Mertins |
| 2019 | Speaker Adaptation for Attention-Based End-to-End Speech Recognition. Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong |
| 2019 | Speaker Adaptation for Lip-Reading Using Visual Identity Vectors. Pujitha Appan Kandala, Abhinav Thanda, Dilip Kumar Margam, Rohith Chandrashekar Aralikatti, Tanay Sharma, Sharad Roy, Shankar M. Venkatesan |
| 2019 | Speaker Adversarial Training of DPGMM-Based Feature Extractor for Zero-Resource Languages. Yosuke Higuchi, Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa |
| 2019 | Speaker Augmentation and Bandwidth Extension for Deep Speaker Embedding. Hitoshi Yamamoto, Kong Aik Lee, Koji Okabe, Takafumi Koshinaka |
| 2019 | Speaker Diarization Using Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings. Alan McCree, Gregory Sell, Daniel Garcia-Romero |
| 2019 | Speaker Diarization with Deep Speaker Embeddings for DIHARD Challenge II. Sergey Novoselov, Aleksei Gusev, Artem Ivanov, Timur Pekhovsky, Andrey Shulipa, Anastasia Avdeeva, Artem Gorlanov, Alexandr Kozlov |
| 2019 | Speaker Diarization with Lexical Information. Tae Jin Park, Kyu Jeong Han, Jing Huang, Xiaodong He, Bowen Zhou, Panayiotis G. Georgiou, Shrikanth Narayanan |
| 2019 | Speaker Recognition Benchmark Using the CHiME-5 Corpus. Daniel Garcia-Romero, David Snyder, Shinji Watanabe, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur |
| 2019 | Speaker-Aware Deep Denoising Autoencoder with Embedded Speaker Identity for Speech Enhancement. Fu-Kai Chuang, Syu-Siang Wang, Jeih-weih Hung, Yu Tsao, Shih-Hau Fang |
| 2019 | Speaker-Corrupted Embeddings for Online Speaker Diarization. Omid Ghahabi, Volker Fischer |
| 2019 | Speaker-Invariant Feature-Mapping for Distant Speech Recognition via Adversarial Teacher-Student Learning. Long Wu, Hangting Chen, Li Wang, Pengyuan Zhang, Yonghong Yan |
| 2019 | Speaking Rate, Information Density, and Information Rate in First-Language and Second-Language Speech. Ann R. Bradlow |
| 2019 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le |
| 2019 | Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric. Ryandhimas E. Zezario, Szu-Wei Fu, Xugang Lu, Hsin-Min Wang, Yu Tsao |
| 2019 | Spectral Subspace Analysis for Automatic Assessment of Pathological Speech Intelligibility. Parvaneh Janbakhshi, Ina Kodrasi, Hervé Bourlard |
| 2019 | Speech Audio Super-Resolution for Speech Recognition. Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff |
| 2019 | Speech Augmentation via Speaker-Specific Noise in Unseen Environment. Ya'nan Guo, Ziping Zhao, Yide Ma, Björn W. Schuller |
| 2019 | Speech Based Emotion Prediction: Can a Linear Model Work? Anda Ouyang, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah |
| 2019 | Speech Denoising with Deep Feature Losses. François G. Germain, Qifeng Chen, Vladlen Koltun |
| 2019 | Speech Driven Backchannel Generation Using Deep Q-Network for Enhancing Engagement in Human-Robot Interaction. Nusrah Hussain, Engin Erzin, T. Metin Sezgin, Yücel Yemez |
| 2019 | Speech Emotion Recognition Based on Multi-Label Emotion Existence Model. Atsushi Ando, Ryo Masumura, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono |
| 2019 | Speech Emotion Recognition in Dyadic Dialogues with Attentive Interaction Modeling. Jinming Zhao, Shizhe Chen, Jingjun Liang, Qin Jin |
| 2019 | Speech Emotion Recognition with a Reject Option. Kusha Sridhar, Carlos Busso |
| 2019 | Speech Enhancement Using Forked Generative Adversarial Networks with Spectral Subtraction. Ju Lin, Sufeng Niu, Zice Wei, Xiang Lan, Adriaan J. de Lind van Wijngaarden, Melissa C. Smith, Kuang-Ching Wang |
| 2019 | Speech Enhancement for Noise-Robust Speech Synthesis Using Wasserstein GAN. Nagaraj Adiga, Yannis Pantazis, Vassilis Tsiaras, Yannis Stylianou |
| 2019 | Speech Enhancement with Variance Constrained Autoencoders. Daniel T. Braithwaite, W. Bastiaan Kleijn |
| 2019 | Speech Enhancement with Wide Residual Networks in Reverberant Environments. Jorge Llombart, Dayana Ribas, Antonio Miguel, Luis Vicente, Alfonso Ortega Giménez, Eduardo Lleida |
| 2019 | Speech Model Pre-Training for End-to-End Spoken Language Understanding. Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, Yoshua Bengio |
| 2019 | Speech Organ Contour Extraction Using Real-Time MRI and Machine Learning Method. Hironori Takemoto, Tsubasa Goto, Yuya Hagihara, Sayaka Hamanaka, Tatsuya Kitamura, Yukiko Nota, Kikuo Maekawa |
| 2019 | Speech Quality Evaluation of Synthesized Japanese Speech Using EEG. Ivan Halim Parmonangan, Hiroki Tanaka, Sakriani Sakti, Shinnosuke Takamichi, Satoshi Nakamura |
| 2019 | Speech Replay Detection with x-Vector Attack Embeddings and Spectral Features. Jennifer Williams, Joanna Rownicka |
| 2019 | Speech Separation Using Independent Vector Analysis with an Amplitude Variable Gaussian Mixture Model. Zhaoyi Gu, Jing Lu, Kai Chen |
| 2019 | Speech-Based Web Navigation for Limited Mobility Users. Vasiliy Radostev, Serge Berger, Justin Tabrizi, Pasha Kamyshev, Hisami Suzuki |
| 2019 | SpeechMarker: A Voice Based Multi-Level Attendance Application. Sarfaraz Jelil, Abhishek Shrivastava, Rohan Kumar Das, S. R. Mahadeva Prasanna, Rohit Sinha |
| 2019 | SpeechYOLO: Detection and Localization of Speech Objects. Yael Segal, Tzeviya Sylvia Fuchs, Joseph Keshet |
| 2019 | Splash: Speech and Language Assessment in Schools and Homes. Avin Miwardelli, Ian Gallagher, Jenny Gibson, Napoleon Katsos, Kate M. Knill, Helena Wood |
| 2019 | Spoken Language Intent Detection Using Confusion2Vec. Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis G. Georgiou |
| 2019 | Spontaneous Conversational Speech Synthesis from Found Data. Éva Székely, Gustav Eje Henter, Jonas Beskow, Joakim Gustafson |
| 2019 | Spot the Pleasant People! Navigating the Cocktail Party Buzz. Christina Tånnander, Per Fallgren, Jens Edlund, Joakim Gustafson |
| 2019 | StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion. Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo |
| 2019 | State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak |
| 2019 | Statistical Approach to Speech Synthesis: Past, Present and Future. Keiichi Tokuda |
| 2019 | Strength and Structure: Coupling Tones with Oral Constriction Gestures. Doris Mücke, Anne Hermes, Sam Tilsen |
| 2019 | Study of the Performance of Automatic Speech Recognition Systems in Speakers with Parkinson's Disease. Laureano Moro-Velázquez, Jaejin Cho, Shinji Watanabe, Mark A. Hasegawa-Johnson, Odette Scharenborg, Heejin Kim, Najim Dehak |
| 2019 | Styrian Dialect Classification: Comparing and Fusing Classifiers Based on a Feature Selection Using a Genetic Algorithm. Thomas Kisler, Raphael Winkelmann, Florian Schiel |
| 2019 | Sub-Band Convolutional Neural Networks for Small-Footprint Spoken Term Classification. Chieh-Chi Kao, Ming Sun, Yixin Gao, Shiv Vitaladevuni, Chao Wang |
| 2019 | Subjective Evaluation of Communicative Effort for Younger and Older Adults in Interactive Tasks with Energetic and Informational Masking. Valérie Hazan, Outi Tuomainen, Linda Taschenberger |
| 2019 | Subspace Pooling Based Temporal Features Extraction for Audio Event Recognition. Qiuying Shi, Hui Luo, Jiqing Han |
| 2019 | Subword RNNLM Approximations for Out-Of-Vocabulary Keyword Search. Mittul Singh, Sami Virpioja, Peter Smit, Mikko Kurimo |
| 2019 | Super-Wideband Spectral Envelope Modeling for Speech Coding. Guillaume Fuchs, Chamran Ashour, Tom Bäckström |
| 2019 | Supervised Classifiers for Audio Impairments with Noisy Labels. Chandan K. A. Reddy, Ross Cutler, Johannes Gehrke |
| 2019 | Survey Talk: A Survey on Speech Translation. Jan Niehues |
| 2019 | Survey Talk: End-to-End Deep Neural Network Based Speaker and Language Recognition. Ming Li, Weicheng Cai, Danwei Cai |
| 2019 | Survey Talk: Modeling in Automatic Speech Recognition: Beyond Hidden Markov Models. Ralf Schlüter |
| 2019 | Survey Talk: Multimodal Processing of Speech and Language. Florian Metze |
| 2019 | Survey Talk: Preserving Privacy in Speaker and Speech Characterisation. Andreas Nautsch |
| 2019 | Survey Talk: Prosody Research and Applications: The State of the Art. Nigel G. Ward |
| 2019 | Survey Talk: Reaching Over the Gap: Cross- and Interdisciplinary Research on Human and Automatic Speech Processing. Odette Scharenborg |
| 2019 | Survey Talk: Realistic Physics-Based Computational Voice Production. Oriol Guasch |
| 2019 | Survey Talk: Recognition of Foreign-Accented Speech: Challenges and Opportunities for Human and Computer Speech Communication. Ann R. Bradlow |
| 2019 | Survey Talk: When Attention Meets Speech Applications: Speech & Speaker Recognition Perspective. Kyu Jeong Han, Ramon Prieto, Tao Ma |
| 2019 | Sustained Vowel Game: A Computer Therapy Game for Children with Dysphonia. Vanessa Lopes, João Magalhães, Sofia Cavaco |
| 2019 | Synchronising Audio and Ultrasound by Learning Cross-Modal Embeddings. Aciel Eshky, Manuel Sam Ribeiro, Korin Richmond, Steve Renals |
| 2019 | Synthesized Spoken Names: Biases Impacting Perception. Lucas Kessler, Cecilia Ovesdotter Alm, Reynold Bailey |
| 2019 | Talker Intelligibility and Listening Effort with Temporally Modified Speech. Maximillian Paulus, Valérie Hazan, Patti Adank |
| 2019 | Target Speaker Extraction for Multi-Talker Speaker Verification. Wei Rao, Chenglin Xu, Eng Siong Chng, Haizhou Li |
| 2019 | Target Speaker Recovery and Recognition Network with Average x-Vector and Global Training. Wenjie Li, Pengyuan Zhang, Yonghong Yan |
| 2019 | Temporal Convolution for Real-Time Keyword Spotting on Mobile Devices. Seungwoo Choi, Seokjun Seo, Beomjun Shin, Hyeongmin Byun, Martin Kersner, Beomsu Kim, Dongyoung Kim, Sungjoo Ha |
| 2019 | Temporal Coordination of Articulatory and Respiratory Events Prior to Speech Initiation. Oksana Rasskazova, Christine Mooshammer, Susanne Fuchs |
| 2019 | Temporally-Aware Acoustic Unit Discovery for Zerospeech 2019 Challenge. Bolaji Yusuf, Alican Gök, Batuhan Gündogdu, Oyku Deniz Kose, Murat Saraclar |
| 2019 | Testing the Distinctiveness of Intonational Tunes: Evidence from Imitative Productions in American English. Eleanor Chodroff, Jennifer S. Cole |
| 2019 | The 2018 NIST Speaker Recognition Evaluation. Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Douglas A. Reynolds, Lisa P. Mason, Jaime Hernandez-Cordero |
| 2019 | The 2019 Inaugural Fearless Steps Challenge: A Giant Leap for Naturalistic Audio. John H. L. Hansen, Aditya Joglekar, Meena Chandra Shekhar, Vinay Kothapally, Chengzhu Yu, Lakshmish Kaushik, Abhijeet Sangwan |
| 2019 | The Airbus Air Traffic Control Speech Recognition 2018 Challenge: Towards ATC Automatic Transcription and Call Sign Detection. Thomas Pellegrini, Jérôme Farinas, Estelle Delpech, François Lancelot |
| 2019 | The Althingi ASR System. Inga Rún Helgadóttir, Anna Björk Nikulásdóttir, Michal Borský, Judy Y. Fong, Róbert Kjaran, Jón Guðnason |
| 2019 | The CUHK Dysarthric Speech Recognition Systems for English and Cantonese. Shoukang Hu, Shansong Liu, Heng Fai Chang, Mengzhe Geng, Jiani Chen, Lau Wing Chung, To Ka Hei, Jianwei Yu, Ka Ho Wong, Xunying Liu, Helen Meng |
| 2019 | The Contribution of Acoustic Features Analysis to Model Emotion Perceptual Process for Language Diversity. Xingfeng Li, Masato Akagi |
| 2019 | The Contribution of Lip Protrusion to Anglo-English /r/: Evidence from Hyper- and Non-Hyperarticulated Speech. Hannah King, Emmanuel Ferragne |
| 2019 | The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion. Weicheng Cai, Haiwei Wu, Danwei Cai, Ming Li |
| 2019 | The DKU System for the Speaker Recognition Task of the 2019 VOiCES from a Distance Challenge. Danwei Cai, Xiaoyi Qin, Weicheng Cai, Ming Li |
| 2019 | The DKU-LENOVO Systems for the INTERSPEECH 2019 Computational Paralinguistic Challenge. Haiwei Wu, Weiqing Wang, Ming Li |
| 2019 | The DKU-SMIIP System for NIST 2018 Speaker Recognition Evaluation. Danwei Cai, Weicheng Cai, Ming Li |
| 2019 | The Dependability of Voice on Elders' Acceptance of Humanoid Agents. Anna Esposito, Terry Amorese, Marialucia Cuciniello, Maria Teresa Riviello, Antonietta Maria Esposito, Alda Troncone, Gennaro Cordasco |
| 2019 | The Different Roles of Expectations in Phonetic and Lexical Processing. Shiri Lev-Ari, Robin Dodsworth, Jeff Mielke, Sharon Peperkamp |
| 2019 | The Effect of Phoneme Distribution on Perceptual Similarity in English. Emma O'Neill, Julie Carson-Berndsen |
| 2019 | The Effects of Time Expansion on English as a Second Language Individuals. John S. Novak III, Daniel Bunn, Robert V. Kenyon |
| 2019 | The GDPR & Speech Data: Reflections of Legal and Technology Communities, First Steps Towards a Common Understanding. Andreas Nautsch, Catherine Jasserand, Els Kindt, Massimiliano Todisco, Isabel Trancoso, Nicholas W. D. Evans |
| 2019 | The Greennn Tree - Lengthening Position Influences Uncertainty Perception. Simon Betz, Sina Zarrieß, Éva Székely, Petra Wagner |
| 2019 | The I2R's ASR System for the VOiCES from a Distance Challenge 2019. Tze Yuang Chong, Kye Min Tan, Kah Kuan Teh, Chang Huai You, Hanwu Sun, Tran Huy Dat |
| 2019 | The I2R's Submission to VOiCES Distance Speaker Recognition Challenge 2019. Hanwu Sun, Kah Kuan Teh, Ivan Kukanov, Tran Huy Dat |
| 2019 | The INTERSPEECH 2019 Computational Paralinguistics Challenge: Styrian Dialects, Continuous Sleepiness, Baby Sounds & Orca Activity. Björn W. Schuller, Anton Batliner, Christian Bergler, Florian B. Pokorny, Jarek Krajewski, Margaret Cychosz, Ralf Vollmann, Sonja-Dana Roelen, Sebastian Schnieder, Elika Bergelson, Alejandrina Cristià, Amanda Seidl, Anne S. Warlaumont, Lisa Yankowitz, Elmar Nöth, Shahin Amiriparian, Simone Hantke, Maximilian Schmitt |
| 2019 | The Influence of Distraction on Speech Processing: How Selective is Selective Attention? Sandra I. Parhammer, Miriam Ebersberg, Jenny Tippmann, Katja Stärk, Andreas Opitz, Barbara Hinger, Sonja Rossi |
| 2019 | The JHU ASR System for VOiCES from a Distance Challenge 2019. Yiming Wang, David Snyder, Hainan Xu, Vimal Manohar, Phani Sankar Nidadavolu, Daniel Povey, Sanjeev Khudanpur |
| 2019 | The JHU Speaker Recognition System for the VOiCES 2019 Challenge. David Snyder, Jesús Villalba, Nanxin Chen, Daniel Povey, Gregory Sell, Najim Dehak, Sanjeev Khudanpur |
| 2019 | The LeVoice Far-Field Speech Recognition System for VOiCES from a Distance Challenge 2019. Yulong Liang, Lin Yang, Xuyang Wang, Yingjie Li, Chen Jia, Junjie Wang |
| 2019 | The Monophthongs of Formal Nigerian English: An Acoustic Analysis. Nisad Jamakovic, Robert Fuchs |
| 2019 | The NEC-TT 2018 Speaker Verification System. Kong Aik Lee, Hitoshi Yamamoto, Koji Okabe, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda |
| 2019 | The Neural Correlates Underlying Lexically-Guided Perceptual Learning. Odette Scharenborg, Jiska Koemans, Cybelle Smith, Mark A. Hasegawa-Johnson, Kara D. Federmeier |
| 2019 | The Processing of Prosodic Cues to Rhetorical Question Interpretation: Psycholinguistic and Neurolinguistics Evidence. Mariya Kharaman, Manluolan Xu, Carsten Eulitz, Bettina Braun |
| 2019 | The Production of Chinese Affricates /ts/ and /ts Dan Du, Jinsong Zhang |
| 2019 | The Role of Musical Experience in the Perceptual Weighting of Acoustic Cues for the Obstruent Coda Voicing Contrast in American English. Michelle Cohn, Georgia Zellou, Santiago Barreda |
| 2019 | The Role of Voice Quality in the Perception of Prominence in Synthetic Speech. Andy Murphy, Irena Yanushevskaya, Ailbhe Ní Chasaide, Christer Gobl |
| 2019 | The SAIL LABS Media Mining Indexer and the CAVA Framework. Erinç Dikici, Gerhard Backfried, Jürgen Riedler |
| 2019 | The SJTU Robust Anti-Spoofing System for the ASVspoof 2019 Challenge. Yexin Yang, Hongji Wang, Heinrich Dinkel, Zhengyang Chen, Shuai Wang, Yanmin Qian, Kai Yu |
| 2019 | The STC ASR System for the VOiCES from a Distance Challenge 2019. Ivan Medennikov, Yuri Y. Khokhlov, Aleksei Romanenko, Ivan Sorokin, Anton Mitrofanov, Vladimir Bataev, Andrei Andrusenko, Tatiana Prisyach, Mariya Korenevskaya, Oleg Petrov, Alexander Zatvornitskiy |
| 2019 | The Second DIHARD Challenge: System Description for USC-SAIL Team. Tae Jin Park, Manoj Kumar, Nikolaos Flemotomos, Monisankha Pal, Raghuveer Peri, Rimita Lahiri, Panayiotis G. Georgiou, Shrikanth Narayanan |
| 2019 | The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines. Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Mark Y. Liberman |
| 2019 | The VOiCES from a Distance Challenge 2019. Mahesh Kumar Nandwana, Julien van Hout, Colleen Richey, Mitchell McLaren, Maria Alejandra Barrios, Aaron Lawson |
| 2019 | The Voicing Contrast in Stops and Affricates in the Western Armenian of Lebanon. Niamh E. Kelly, Lara Keshishian |
| 2019 | The Vowel System of Korebaju. Jenifer Vega Rodríguez |
| 2019 | The Zero Resource Speech Challenge 2019: TTS Without T. Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux |
| 2019 | Three's a Crowd? Effects of a Second Human on Vocal Accommodation with a Voice Assistant. Eran Raveh, Ingo Siegert, Ingmar Steiner, Iona Gessinger, Bernd Möbius |
| 2019 | Tied Mixture of Factor Analyzers Layer to Combine Frame Level Representations in Neural Speaker Embeddings. Nanxin Chen, Jesús Villalba, Najim Dehak |
| 2019 | Time to Frequency Domain Mapping of the Voice Source: The Influence of Open Quotient and Glottal Skew on the Low End of the Source Spectrum. Christer Gobl, Ailbhe Ní Chasaide |
| 2019 | Toeplitz Inverse Covariance Based Robust Speaker Clustering for Naturalistic Audio Streams. Harishchandra Dubey, Abhijeet Sangwan, John H. L. Hansen |
| 2019 | Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion. Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu |
| 2019 | ToneNet: A CNN Model of Tone Classification of Mandarin Chinese. Qiang Gao, Shutao Sun, Yaping Yang |
| 2019 | Topic-Aware Dialogue Speech Recognition with Transfer Learning. Yuanfeng Song, Di Jiang, Xueyang Wu, Qian Xu, Raymond Chi-Wing Wong, Qiang Yang |
| 2019 | Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations. Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anu Venkatesh, Raefer Gabriel, Dilek Hakkani-Tür |
| 2019 | Towards Achieving Robust Universal Neural Vocoding. Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal |
| 2019 | Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio. Emmanuel Azuh, David Harwath, James R. Glass |
| 2019 | Towards Debugging Deep Neural Networks by Generating Speech Utterances. Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki |
| 2019 | Towards Detection of Canonical Babbling by Citizen Scientists: Performance as a Function of Clip Length. Amanda Seidl, Anne S. Warlaumont, Alejandrina Cristià |
| 2019 | Towards Discriminative Representations and Unbiased Predictions: Class-Specific Angular Softmax for Speech Emotion Recognition. Zhixuan Li, Liang He, Jingyang Li, Li Wang, Wei-Qiang Zhang |
| 2019 | Towards Generalized Speech Enhancement with Generative Adversarial Networks. Santiago Pascual, Joan Serrà, Antonio Bonafonte |
| 2019 | Towards Joint Sound Scene and Polyphonic Sound Event Recognition. Helen L. Bear, Inês Nolasco, Emmanouil Benetos |
| 2019 | Towards Language-Universal Mandarin-English Speech Recognition. Shiliang Zhang, Yuan Liu, Ming Lei, Bin Ma, Lei Xie |
| 2019 | Towards Robust Speech Emotion Recognition Using Deep Residual Networks for Speech Enhancement. Andreas Triantafyllopoulos, Gil Keren, Johannes Wagner, Ingmar Steiner, Björn W. Schuller |
| 2019 | Towards Universal Dialogue Act Tagging for Task-Oriented Dialogues. Shachi Paul, Rahul Goel, Dilek Hakkani-Tür |
| 2019 | Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees. Jan Chorowski, Adrian Lancucki, Bartosz Kostka, Michal Zapotoczny |
| 2019 | Towards Variability Resistant Dialectal Speech Evaluation. Ahmed Ali, Salam Khalifa, Nizar Habash |
| 2019 | Towards a Fault-Tolerant Speaker Verification System: A Regularization Approach to Reduce the Condition Number. Siqi Zheng, Gang Liu, Hongbin Suo, Yun Lei |
| 2019 | Towards a Method of Dynamic Vocal Tract Shapes Generation by Combining Static 3D and Dynamic 2D MRI Speech Data. Ioannis K. Douros, Anastasiia Tsukanova, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie |
| 2019 | Towards a Speaker Independent Speech-BCI Using Speaker Adaptation. Debadatta Dash, Alan Wisler, Paul Ferrari, Jun Wang |
| 2019 | Towards an Annotation Scheme for Complex Laughter in Speech Corpora. Khiet P. Truong, Jürgen Trouvain, Michel-Pierre Jansen |
| 2019 | Towards the Prosody of Persuasion in Competitive Negotiation. The Relationship Between f0 and Negotiation Success in Same Sex Sales Tasks. Jan Michalsky, Heike Schoormann, Thomas Schultze |
| 2019 | Towards the Speech Features of Early-Stage Dementia: Design and Application of the Mandarin Elderly Cognitive Speech Database. Tianqi Wang, Quanlei Yan, Jingshen Pan, Feiqi Zhu, Rongfeng Su, Yi Guo, Lan Wang, Nan Yan |
| 2019 | Towards the Speech Features of Mild Cognitive Impairment: Universal Evidence from Structured and Unstructured Connected Speech of Chinese. Tianqi Wang, Chongyuan Lian, Jingshen Pan, Quanlei Yan, Feiqi Zhu, Manwa L. Ng, Lan Wang, Nan Yan |
| 2019 | Tracking the New Zealand English NEAR/SQUARE Merger Using Functional Principal Components Analysis. Michele Gubian, Jonathan Harrington, Mary Stevens, Florian Schiel, Paul Warren |
| 2019 | Trainable Dynamic Subsampling for End-to-End Speech Recognition. Shucong Zhang, Erfan Loweimi, Yumo Xu, Peter Bell, Steve Renals |
| 2019 | Training Multi-Speaker Neural Text-to-Speech Systems Using Speaker-Imbalanced Speech Corpora. Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa |
| 2019 | Transfer Learning from Audio-Visual Grounding to Speech Recognition. Wei-Ning Hsu, David Harwath, James R. Glass |
| 2019 | Transfer-Representation Learning for Detecting Spoofing Attacks with Converted and Synthesized Speech in Automatic Speaker Verification System. Su-Yu Chang, Kai-Cheng Wu, Chia-Ping Chen |
| 2019 | Transformer Based Grapheme-to-Phoneme Conversion. Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth |
| 2019 | Transparent Pronunciation Scoring Using Articulatorily Weighted Phoneme Edit Distance. Reima Karhila, Anna-Riikka Smolander, Sari Ylinen, Mikko Kurimo |
| 2019 | Turn-Taking Prediction Based on Detection of Transition Relevance Place. Kohei Hara, Koji Inoue, Katsuya Takanashi, Tatsuya Kawahara |
| 2019 | Two Tiered Distributed Training Algorithm for Acoustic Modeling. Pranav Ladkat, Oleg Rybakov, Radhika Arava, Sree Hari Krishnan Parthasarathi, I-Fan Chen, Nikko Strom |
| 2019 | Two-Dimensional Convolutional Recurrent Neural Networks for Speech Activity Detection. Anastasios Vafeiadis, Eleftherios Fanioudakis, Ilyas Potamitis, Konstantinos Votis, Dimitrios Giakoumis, Dimitrios Tzovaras, Liming Chen, Raouf Hamzaoui |
| 2019 | Two-Pass End-to-End Speech Recognition. Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu |
| 2019 | Two-Stage Training for Chinese Dialect Recognition. Zongze Ren, Guofu Yang, Shugong Xu |
| 2019 | UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-Noise Ratio Condition. Xiang Hao, Xiangdong Su, Zhiyu Wang, Hui Zhang, Batushiren |
| 2019 | UWB-NTIS Speaker Diarization System for the DIHARD II 2019 Challenge. Zbynek Zajíc, Marie Kunesová, Marek Hrúz, Jan Vanek |
| 2019 | Ultra-Compact NLU: Neuronal Network Binarization as Regularization. Munir Georges, Krzysztof Czarnowski, Tobias Bocklet |
| 2019 | Ultrasound Tongue Imaging for Diarization and Alignment of Child Speech Therapy Sessions. Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals |
| 2019 | Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder. Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó |
| 2019 | Unbabel Talk - Human Verified Translations for Voice Instant Messaging. Luís Bernardo, Mathieu Giquel, Sebastião Quintas, Paulo Dimas, Helena Moniz, Isabel Trancoso |
| 2019 | Unbiased Semi-Supervised LF-MMI Training Using Dropout. Sibo Tong, Apoorv Vyas, Philip N. Garner, Hervé Bourlard |
| 2019 | Understanding and Visualizing Raw Waveform-Based CNNs. Hannah Muckenhirn, Vinayak Abrol, Mathew Magimai-Doss, Sébastien Marcel |
| 2019 | Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition. Niko Moritz, Takaaki Hori, Jonathan Le Roux |
| 2019 | Unified Language-Independent DNN-Based G2P Converter. Markéta Juzová, Daniel Tihelka, Jakub Vít |
| 2019 | Unified Verbalization for Speech Recognition & Synthesis Across Languages. Sandy Ritchie, Richard Sproat, Kyle Gorman, Daan van Esch, Christian Schallhart, Nikos Bampounis, Benoît Brard, Jonas Fromseier Mortensen, Millie Holt, Eoin Mahon |
| 2019 | Universal Adversarial Perturbations for Speech Recognition Systems. Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian J. McAuley, Farinaz Koushanfar |
| 2019 | Unleashing the Unused Potential of i-Vectors Enabled by GPU Acceleration. Ville Vestman, Kong Aik Lee, Tomi H. Kinnunen, Takafumi Koshinaka |
| 2019 | Unsupervised Acoustic Segmentation and Clustering Using Siamese Network Embeddings. Saurabhchand Bhati, Shekhar Nayak, K. Sri Rama Murty, Najim Dehak |
| 2019 | Unsupervised Acoustic Unit Discovery for Speech Synthesis Using Discrete Latent-Variable Neural Networks. Ryan Eloff, André Nortje, Benjamin van Niekerk, Avashna Govender, Leanne Nortje, Arnu Pretorius, Elan Van Biljon, Ewald van der Westhuizen, Lisa van Staden, Herman Kamper |
| 2019 | Unsupervised Adaptation with Adversarial Dropout Regularization for Robust Speech Recognition. Pengcheng Guo, Sining Sun, Lei Xie |
| 2019 | Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion. Andy T. Liu, Po-Chun Hsu, Hung-yi Lee |
| 2019 | Unsupervised Low-Rank Representations for Speech Emotion Recognition. Georgios Paraskevopoulos, Efthymios Tzinis, Nikolaos Ellinas, Theodoros Giannakopoulos, Alexandros Potamianos |
| 2019 | Unsupervised Methods for Audio Classification from Lecture Discussion Recordings. Hang Su, Borislav Dzodzo, Xixin Wu, Xunying Liu, Helen Meng |
| 2019 | Unsupervised Phonetic and Word Level Discovery for Speech to Speech Translation for Unwritten Languages. Steven Hillis, Anushree Prasanna Kumar, Alan W. Black |
| 2019 | Unsupervised Raw Waveform Representation Learning for ASR. Purvi Agrawal, Sriram Ganapathy |
| 2019 | Unsupervised Representation Learning with Future Observation Prediction for Speech Emotion Recognition. Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang |
| 2019 | Unsupervised Singing Voice Conversion. Eliya Nachmani, Lior Wolf |
| 2019 | Unsupervised Training of Neural Mask-Based Beamforming. Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach |
| 2019 | Untranscribed Web Audio for Low Resource Speech Recognition. Andrea Carmantini, Peter Bell, Steve Renals |
| 2019 | Use of Beiwe Smartphone App to Identify and Track Speech Decline in Amyotrophic Lateral Sclerosis (ALS). Kathryn P. Connaghan, Jordan R. Green, Sabrina Paganoni, James Chan, Harli Weber, Ella Collins, Brian Richburg, Marziye Eshghi, Jukka-Pekka Onnela, James D. Berry |
| 2019 | Using Alexa for Flashcard-Based Learning. Lucy Skidmore, Roger K. Moore |
| 2019 | Using Attention Networks and Adversarial Augmentation for Styrian Dialect Continuous Sleepiness and Baby Sound Recognition. Sung-Lin Yeh, Gao-Yi Chao, Bo-Hao Su, Yu-Lin Huang, Meng-Han Lin, Yin-Chun Tsai, Yu-Wen Tai, Zheng-Chi Lu, Chieh-Yu Chen, Tsung-Ming Tai, Chiu-Wang Tseng, Cheng-Kuang Lee, Chi-Chun Lee |
| 2019 | Using Fisher Vector and Bag-of-Audio-Words Representations to Identify Styrian Dialects, Sleepiness, Baby & Orca Sounds. Gábor Gosztolya |
| 2019 | Using Prosody to Discover Word Order Alternations in a Novel Language. Anouschka Foltz, Sarah Cooper, Tamsin M. McKelvey |
| 2019 | Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise. Avashna Govender, Anita E. Wagner, Simon King |
| 2019 | Using Real-Time Visual Biofeedback for Second Language Instruction. Shawn L. Nissen, Rebecca Nissen |
| 2019 | Using Speech Production Knowledge for Raw Waveform Modelling Based Styrian Dialect Identification. S. Pavankumar Dubagunta, Mathew Magimai-Doss |
| 2019 | Using Speech to Predict Sequentially Measured Cortisol Levels During a Trier Social Stress Test. Alice Baird, Shahin Amiriparian, Nicholas Cummins, Sarah Sturmbauer, Johanna Janson, Eva-Maria Meßner, Harald Baumeister, Nicolas Rohleder, Björn W. Schuller |
| 2019 | Using Ultrasound Imaging to Create Augmented Visual Biofeedback for Articulatory Practice. Colin T. Annand, Maurice Lamb, Sarah Dugan, Sarah R. Li, Hannah M. Woeste, T. Douglas Mast, Michael A. Riley, Jack A. Masterson, Neeraja Mahalingam, Kathryn J. Eary, Caroline Spencer, Suzanne Boyce, Stephanie Jackson, Anoosha Baxi, Reneé Seward |
| 2019 | Using a Manifold Vocoder for Spectral Voice and Style Conversion. Tuan Dinh, Alexander Kain, Kris Tjaden |
| 2019 | Using the Bag-of-Audio-Word Feature Representation of ASR DNN Posteriors for Paralinguistic Classification. Gábor Gosztolya |
| 2019 | V-to-V Coarticulation Induced Acoustic and Articulatory Variability of Vowels: The Effect of Pitch-Accent. Andrea Deme, Márton Bartók, Tekla Etelka Gráczi, Tamás Gábor Csapó, Alexandra Markó |
| 2019 | VAE-Based Regularization for Deep Speaker Embedding. Yang Zhang, Lantian Li, Dong Wang |
| 2019 | VESUS: A Crowd-Annotated Database to Study Emotion Production and Perception in Spoken English. Jacob Sager, Ravi Shankar, Jacob Reinhold, Archana Venkataraman |
| 2019 | VQVAE Unsupervised Unit Discovery and Multi-Scale Code2Spec Inverter for Zerospeech Challenge 2019. Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura |
| 2019 | Validation of the Non-Intrusive Codebook-Based Short Time Objective Intelligibility Metric for Processed Speech. Charlotte Sørensen, Jesper Bünsow Boldt, Mads Græsbøll Christensen |
| 2019 | Variational Attention Using Articulatory Priors for Generating Code Mixed Speech Using Monolingual Corpora. Sai Krishna Rallabandi, Alan W. Black |
| 2019 | Variational Bayesian Multi-Channel Speech Dereverberation Under Noisy Environments with Probabilistic Convolutive Transfer Function. Masahito Togami, Tatsuya Komatsu |
| 2019 | Variational Domain Adversarial Learning for Speaker Verification. Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien |
| 2019 | Vectorized Beam Search for CTC-Attention-Based Speech Recognition. Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Niko Moritz, Jonathan Le Roux |
| 2019 | Very Deep Self-Attention Networks for End-to-End Speech Recognition. Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Alex Waibel |
| 2019 | ViVoLAB Speaker Diarization System for the DIHARD 2019 Challenge. Ignacio Viñals, Pablo Gimeno, Alfonso Ortega Giménez, Antonio Miguel, Eduardo Lleida |
| 2019 | Video-Driven Speech Reconstruction Using Generative Adversarial Networks. Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic |
| 2019 | Vietnamese Learners Tackling the German /ʃt/ in Perception. Anke Sennema, Silke Hamann |
| 2019 | Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis Through Audio Analysis. Noé Tits, Fengna Wang, Kevin El Haddad, Vincent Pagel, Thierry Dutoit |
| 2019 | Vocal Biomarker Assessment Following Pediatric Traumatic Brain Injury: A Retrospective Cohort Study. Camille Noufi, Adam C. Lammert, Daryush D. Mehta, James R. Williamson, Gregory A. Ciccarelli, Douglas E. Sturim, Jordan R. Green, Thomas F. Campbell, Thomas F. Quatieri |
| 2019 | Vocal Pitch Extraction in Polyphonic Music Using Convolutional Residual Network. Mingye Dong, Jie Wu, Jian Luan |
| 2019 | Voice Quality and Between-Frame Entropy for Sleepiness Estimation. Vijay Ravi, Soo Jin Park, Amber Afshan, Abeer Alwan |
| 2019 | Voice Quality as a Turn-Taking Cue. Mattias Heldner, Marcin Wlodarczak, Stefan Benus, Agustín Gravano |
| 2019 | VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. Quan Wang, Hannah Muckenhirn, Kevin W. Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio López-Moreno |
| 2019 | VoiceID Loss: Speech Enhancement for Speaker Verification. Suwon Shon, Hao Tang, James R. Glass |
| 2019 | Vowel-Tone Interaction in Two Tibeto-Burman Languages. Wendy Lalhminghlui, Viyazonuo Terhiija, Priyankoo Sarmah |
| 2019 | Vowels and Diphthongs in the Xupu Xiang Chinese Dialect. Zhenrui Zhang, Fang Hu |
| 2019 | WHAM!: Extending Speech Separation to Noisy Environments. Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux |
| 2019 | Weakly Supervised Syllable Segmentation by Vowel-Consonant Peak Classification. Ravi Shankar, Archana Venkataraman |
| 2019 | Web-Based Speech Synthesis Editor. Martin Gruber, Jakub Vít, Jindrich Matousek |
| 2019 | Whether to Pretrain DNN or not?: An Empirical Analysis for Voice Conversion. Nirmesh J. Shah, Hardik B. Sailor, Hemant A. Patil |
| 2019 | Which Ones Are Speaking? Speaker-Inferred Model for Multi-Talker Speech Separation. Jing Shi, Jiaming Xu, Bo Xu |
| 2019 | Whisper to Neutral Mapping Using Cosine Similarity Maximization in i-Vector Space for Speaker Verification. Abinay Reddy Naini, Achuth Rao M. V, Prasanta Kumar Ghosh |
| 2019 | Who Needs Words? Lexicon-Free Speech Recognition. Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert |
| 2019 | Who Said That?: Audio-Visual Speaker Diarisation of Real-World Meetings. Joon Son Chung, Bong-Jin Lee, Icksang Han |
| 2019 | Zero Resource Speech Synthesis Using Transcripts Derived from Perceptual Acoustic Units. Karthik Pandia D. S, Hema A. Murthy |
| 2019 | Zero Shot Intent Classification Using Long-Short Term Memory Networks. Kyle Williams |
| 2019 | Zooming in on Spatiotemporal V-to-C Coarticulation with Functional PCA. Michele Gubian, Manfred Pastätter, Marianne Pouplier |
| 2019 | wav2vec: Unsupervised Pre-Training for Speech Recognition. Steffen Schneider, Alexei Baevski, Ronan Collobert, Michael Auli |
| 2019 | x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition. Daniel Garcia-Romero, David Snyder, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur |