| 2015 | "multilingual" deep neural network for music genre classification. Jia Dai, Wenju Liu, Chongjia Ni, Like Dong, Hong Yang |
| 2015 | "speech is silver, but silence is golden": improving speech-to-speech translation performance by slashing users input. Frédéric Béchet, Benoît Favre, Mickael Rouvier |
| 2015 | 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015, Dresden, Germany, September 6-10, 2015 |
| 2015 | A binaural short time objective intelligibility measure for noisy and enhanced speech. Asger Heidemann Andersen, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen |
| 2015 | A comparative study of BNF and DNN multilingual training on cross-lingual low-resource speech recognition. Haihua Xu, Van Hai Do, Xiong Xiao, Engsiong Chng |
| 2015 | A comparison between a DNN and a CRF disfluency detection and reconstruction system. Dario Bertero, Linlin Wang, Ho Yin Chan, Pascale Fung |
| 2015 | A comparison of features for synthetic speech detection. Md. Sahidullah, Tomi Kinnunen, Cemal Hanilçi |
| 2015 | A comparison of neural network feature transforms for speaker diarization. Sree Harsha Yella, Andreas Stolcke |
| 2015 | A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge. Daniel Renshaw, Herman Kamper, Aren Jansen, Sharon Goldwater |
| 2015 | A comparison of normalization techniques applied to latent space representations for speech analytics. Mohamed Morchid, Richard Dufour, Driss Matrouf |
| 2015 | A comparison of speech synthesis systems based on GPR, HMM, and DNN with a small amount of training data. Tomoki Koriyama, Takao Kobayashi |
| 2015 | A comprehensive 3D biomechanically-driven vocal tract model including inverse dynamics for speech research. Peter Anderson, Negar M. Harandi, Scott Moisik, Ian Stavness, Sidney S. Fels |
| 2015 | A data-driven speech enhancement method based on modeled long-range temporal dynamics. Yue Hao, Changchun Bao, Feng Bao, Feng Deng |
| 2015 | A database for analysis of speech under physical stress: detection of exercise intensity while running and talking. Khiet P. Truong, Arne Nieuwenhuys, Peter Beek, Vanessa Evers |
| 2015 | A dialog act tagging approach to behavioral coding: a case study of addiction counseling conversations. Dogan Can, David C. Atkins, Shrikanth S. Narayanan |
| 2015 | A discriminative analysis within and across voiced and unvoiced consonants in neutral and whispered speech in multiple Indian languages. G. Nisha Meenakshi, Prasanta Kumar Ghosh |
| 2015 | A discriminative reliability-aware classification model with applications to intelligibility classification in pathological speech. Naveen Kumar, Shrikanth S. Narayanan |
| 2015 | A diversity-penalizing ensemble training method for deep learning. Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur |
| 2015 | A dynamic model for behavioral analysis of couple interactions using acoustic features. Wei Xia, James Gibson, Bo Xiao, Brian R. Baucom, Panayiotis G. Georgiou |
| 2015 | A fast algorithm for improved intelligibility of speech-in-noise based on frequency and time domain energy reallocation. Tudor-Catalin Zorila, Yannis Stylianou |
| 2015 | A fast approach to psychoacoustic model compensation for robust speaker recognition in additive noise. Ashish Panda |
| 2015 | A framework for the evaluation of microscopic intelligibility models. Ricard Marxer, Martin Cooke, Jon Barker |
| 2015 | A framework to develop context-aware adaptive dialogue system. David Griol, Zoraida Callejas, Ramón López-Cózar |
| 2015 | A general artificial neural network extension for HTK. Chao Zhang, Philip C. Woodland |
| 2015 | A glimpse-based approach for predicting binaural intelligibility with single and multiple maskers in anechoic conditions. Yan Tang, Martin Cooke, Bruno M. Fazenda, Trevor J. Cox |
| 2015 | A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling. Roland Thiollière, Ewan Dunbar, Gabriel Synnaeve, Maarten Versteegh, Emmanuel Dupoux |
| 2015 | A latent variable model for joint pause prediction and dependency parsing. The Tung Nguyen, Graham Neubig, Hiroyuki Shindo, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura |
| 2015 | A likelihood ratio-based forensic voice comparison in microphone vs. mobile mismatched conditions using Japanese /ai/. Michael J. Carne |
| 2015 | A maximum likelihood approach to the detection of moments of maximum excitation and its application to high-quality speech parameterization. Ranniery Maia, Yannis Stylianou, Masami Akamine |
| 2015 | A metric for evaluating speech recognizer output based on human-perception model. Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana, Masafumi Nishimura |
| 2015 | A model based voice activity detector for noisy environments. Kaavya Sriskandaraja, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah |
| 2015 | A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users. Gert Dekkers, Toon van Waterschoot, Bart Vanrumste, Bert Van Den Broeck, Jort F. Gemmeke, Hugo Van hamme, Peter Karsmakers |
| 2015 | A multi-layer F0 model for singing voice synthesis using a B-spline representation with intuitive controls. Luc Ardaillon, Gilles Degottex, Axel Roebel |
| 2015 | A multi-region deep neural network model in speech recognition. Jia Cui, George Saon, Bhuvana Ramabhadran, Brian Kingsbury |
| 2015 | A multimodal approach for automatic assessment of school principals' oral presentation during pre-service training program. Shan-Wen Hsiao, Hung-Ching Sun, Ming-Chuan Hsieh, Ming-Hsueh Tsai, Hsin-Chih Lin, Chi-Chun Lee |
| 2015 | A new Italian dataset of parallel acoustic and articulatory data. Claudia Canevari, Leonardo Badino, Luciano Fadiga |
| 2015 | A new front-end for classification of non-speech sounds: a study on human whistle. Mahesh Kumar Nandwana, Hynek Boril, John H. L. Hansen |
| 2015 | A new technique for assessing glottal dynamics in speech and singing by means of optical-flow computation. Gustavo Andrade-Miranda, Nathalie Henrich Bernardoni, Juan Ignacio Godino-Llorente |
| 2015 | A novel method of artificial bandwidth extension using deep architecture. Bin Liu, Jianhua Tao, Zhengqi Wen, Ya Li, Danish Bukhari |
| 2015 | A perceptual investigation of wavelet-based decomposition of f0 for text-to-speech synthesis. Manuel Sam Ribeiro, Junichi Yamagishi, Robert A. J. Clark |
| 2015 | A polyglot domain optimised text-to-speech system for railway station announcements. Csaba Zainkó, Mátyás Bartalis, Géza Németh, Gábor Olaszy |
| 2015 | A proposal to develop domain and subtask-adaptive dialog management models. David Griol, Zoraida Callejas |
| 2015 | A real-time variable-Q non-stationary Gabor transform for pitch shifting. Dong-Yan Huang, Minghui Dong, Haizhou Li |
| 2015 | A statistical model-based voice activity detection using multiple DNNs and noise awareness. Inyoung Hwang, Jaeseong Sim, Sang-Hyeon Kim, Kwang-Sub Song, Joon-Hyuk Chang |
| 2015 | A study of speaker adaptation for DNN-based speech synthesis. Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King |
| 2015 | A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition. Liang Lu, Xingxing Zhang, Kyunghyun Cho, Steve Renals |
| 2015 | A study on deep neural network acoustic model adaptation for robust far-field speech recognition. Seyedmahdad Mirsamadi, John H. L. Hansen |
| 2015 | A study on robust detection of pronunciation erroneous tendency based on deep neural network. Yingming Gao, Yanlu Xie, Wen Cao, Jinsong Zhang |
| 2015 | A study on the stability and effectiveness of features in quality estimation for spoken language translation. Raymond W. M. Ng, Kashif Shah, Lucia Specia, Thomas Hain |
| 2015 | A syllable-based analysis of speech temporal organization: a comparison between speaking styles in dysarthric and healthy populations. Brigitte Bigi, Katarzyna Klessa, Laurianne Georgeton, Christine Meunier |
| 2015 | A system for automatic broadcast news summarisation, geolocation and translation. Peter Bell, Catherine Lai, Clare Llewellyn, Alexandra Birch, Mark Sinclair |
| 2015 | A time delay neural network architecture for efficient modeling of long temporal contexts. Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur |
| 2015 | A two-stage singing voice separation algorithm using spectro-temporal modulation features. Frederick Z. Yen, Mao-Chang Huang, Tai-Shih Chi |
| 2015 | A unified deep neural network for speaker and language recognition. Fred Richardson, Douglas A. Reynolds, Najim Dehak |
| 2015 | A universal VAD based on jointly trained deep neural networks. Qing Wang, Jun Du, Xiao Bao, Zi-Rui Wang, Li-Rong Dai, Chin-Hui Lee |
| 2015 | ABIMS - auditory bewildered interaction measurement system. Lisa Lange, Bartholomäus Pfeiffer, Daniel Duran |
| 2015 | AM-FM based filter bank analysis for estimation of spectro-temporal envelopes and its application for speaker recognition in noisy reverberant environments. Dhananjaya N. Gowda, Rahim Saeidi, Paavo Alku |
| 2015 | ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge. Zhizheng Wu, Tomi Kinnunen, Nicholas W. D. Evans, Junichi Yamagishi, Cemal Hanilçi, Md. Sahidullah, Aleksandr Sizov |
| 2015 | Accounting for uncertainty of i-vectors in speaker recognition using uncertainty propagation and modified imputation. Rahim Saeidi, Paavo Alku |
| 2015 | Accuracy of a markerless acquisition technique for studying speech articulators. Andrea Bandini, Slim Ouni, Piero Cosi, Silvia Orlandi, Claudia Manfredi |
| 2015 | Accurate endpointing with expected pause duration. Baiyang Liu, Björn Hoffmeister, Ariya Rastrow |
| 2015 | Acoustic analysis of Mandarin affricates. Shanpeng Li, Wentao Gu |
| 2015 | Acoustic correlates for perceived effort levels in expressive speech. Mary Pietrowicz, Mark Hasegawa-Johnson, Karrie Karahalios |
| 2015 | Acoustic correlates of perceived syllable prominence in German. Hansjörg Mixdorff, Christian G. Cossio Mercado, Angelika Hönemann, Jorge A. Gurlekian, Diego A. Evin, Humberto M. Torres |
| 2015 | Acoustic event recognition using dominant spectral basis vectors. Woohyun Choi, Sangwook Park, David K. Han, Hanseok Ko |
| 2015 | Acoustic group feature selection using wrapper method for automatic eating condition recognition. Dara Pir, Theodore Brown |
| 2015 | Acoustic stress detection for improved navigation of educational videos. Sonal Patil, Harish Arsikere, Om Deshmukh |
| 2015 | Acoustic-prosodic analysis of attitudinal expressions in German. Hansjörg Mixdorff, Angelika Hönemann, Albert Rilliard |
| 2015 | Acoustic-prosodic correlates of 'awkward' prosody in story retellings from adolescents with autism. Daniel Bone, Matthew P. Black, Anil Ramakrishna, Ruth B. Grossman, Shrikanth S. Narayanan |
| 2015 | Acoustics of articulatory constraints: vowel classification and nasalization. Indranil Dutta, Ayushi Pandey |
| 2015 | Acquisition of English speech rhythm by monolingual children. Mikhail Ordin, Leona Polyanskaya |
| 2015 | Action planning and congruency effect between articulation and grasping. Mikko Tiainen, Lari Vainio, Kaisa Tiippana, Naeem Komeilipoor, Martti Vainio |
| 2015 | Active learning based data selection for limited resource STT and KWS. Thiago Fraga-Silva, Jean-Luc Gauvain, Lori Lamel, Antoine Laurent, Viet Bac Le, Abdelkhalek Messaoudi |
| 2015 | Adapting lexical representation and OOV handling from written to spoken language with word embedding. Jérémie Tafforeau, Thierry Artières, Benoît Favre, Frédéric Béchet |
| 2015 | Adapting machine translation models toward misrecognized speech with text-to-speech pronunciation rules and acoustic confusability. Nicholas Ruiz, Qin Gao, William Lewis, Marcello Federico |
| 2015 | Advanced crowdsourcing for speech and beyond: introduction by the organizers. Tim Polzehl, Gina-Anne Levow |
| 2015 | Advanced time shrinking using a drop classifier based on codec features. Jochen Issing, Nikolaus Färber, Reinhard German |
| 2015 | Age-dependent height estimation and speaker normalization for children's speech using the first three subglottal resonances. Jinxi Guo, Rohit Paturi, Gary Yeung, Steven M. Lulich, Harish Arsikere, Abeer Alwan |
| 2015 | Agreement and disagreement utterance detection in conversational speech by extracting and integrating local features. Atsushi Ando, Taichi Asami, Manabu Okamoto, Hirokazu Masataki, Sumitaka Sakauchi |
| 2015 | Aligning meeting recordings via adaptive fingerprinting. T. J. Tsai, Andreas Stolcke |
| 2015 | An acoustic event detection framework and evaluation metric for surveillance in cars. Peter Transfeld, Simon Receveur, Tim Fingscheidt |
| 2015 | An acoustic examination of the three-way sibilant contrast in Lower Sorbian. Phil Howson |
| 2015 | An alternating optimization approach for phase retrieval. Huaiping Ming, Dong-Yan Huang, Lei Xie, Haizhou Li, Minghui Dong |
| 2015 | An analysis of the relationship between signal-derived vocal arousal score and human emotion production and perception. Chi-Chun Lee, Daniel Bone, Shrikanth S. Narayanan |
| 2015 | An analysis of time-aggregated and time-series features for scoring different aspects of multimodal presentation data. Vikram Ramanarayanan, Lei Chen, Chee Wee Leong, Gary Feng, David Suendermann-Oeft |
| 2015 | An empirical model of emphatic word detection. Milos Cernak, Pierre-Edouard Honnet |
| 2015 | An end-to-end approach to language identification in short utterances using convolutional neural networks. Alicia Lozano-Diez, Rubén Zazo-Candil, Javier Gonzalez-Dominguez, Doroteo T. Toledano, Joaquín González-Rodríguez |
| 2015 | An entropy minimization framework for goal-driven dialogue management. Ji Wu, Miao Li, Chin-Hui Lee |
| 2015 | An error correction scheme for GCI detection algorithms using pitch smoothness criterion. P. Sujith, A. P. Prathosh, A. G. Ramakrishnan, Prasanta Kumar Ghosh |
| 2015 | An evaluation of graph clustering methods for unsupervised term discovery. Vince Lyzinski, Gregory Sell, Aren Jansen |
| 2015 | An i-vector backend for speaker verification. Patrick Kenny, Themos Stafylakis, Md. Jahangir Alam, Marcel Kockmann |
| 2015 | An information theory based data-homogeneity measure for voice comparison. Moez Ajili, Jean-François Bonastre, Solange Rossato, Juliette Kahn, Itshak Lapidot |
| 2015 | An investigation of MDVP parameters for voice pathology detection on three different databases. Ahmed Y. Al-nasheri, Zulfiqar Ali, Ghulam Muhammad, Mansour Alsulaiman |
| 2015 | An investigation of context clustering for statistical speech synthesis with deep neural network. Bo Chen, Zhehuai Chen, Jiachen Xu, Kai Yu |
| 2015 | An investigation of emotion change detection from speech. Zhaocheng Huang, Julien Epps, Eliathamby Ambikairajah |
| 2015 | An investigation of recurrent neural network architectures for statistical parametric speech synthesis. Sivanand Achanta, Tejas Godambe, Suryakanth V. Gangashetty |
| 2015 | An iterative speech model-based a priori SNR estimator. Samy Elshamy, Nilesh Madhu, Wouter Tirry, Tim Fingscheidt |
| 2015 | An unsupervised visual-only voice activity detection approach using temporal orofacial features. Fei Tao, John H. L. Hansen, Carlos Busso |
| 2015 | Analysing automatic descriptions of intonation with ICARUS. Katrin Schweitzer, Markus Gärtner, Arndt Riester, Ina Rösiger, Kerstin Eckart, Jonas Kuhn, Grzegorz Dogil |
| 2015 | Analysing rhythm in ritual discourse in Yucatec Maya using automatic speech alignment. Valentina Vapnarsky, Claude Barras, Cédric Becquey, David Doukhan, Martine Adda-Decker, Lori Lamel |
| 2015 | Analysis and classification of cooperative and competitive dialogs. Uwe D. Reichel, Nina Pörner, Dianne Nowack, Jennifer Cole |
| 2015 | Analysis and modeling of the role of laughter in motivational interviewing based psychotherapy conversations. Rahul Gupta, Theodora Chaspari, Panayiotis G. Georgiou, David C. Atkins, Shrikanth S. Narayanan |
| 2015 | Analysis of CNN-based speech recognition system using raw speech as input. Dimitri Palaz, Mathew Magimai-Doss, Ronan Collobert |
| 2015 | Analysis of a low-dimensional bottleneck neural network representation of speech for modelling speech dynamics. Linxue Bai, Peter Jancovic, Martin J. Russell, Philip Weber |
| 2015 | Analysis of coarticulated speech using estimated articulatory trajectories. Ganesh Sivaraman, Vikramjit Mitra, Mark K. Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson |
| 2015 | Analysis of excitation source features of speech for emotion recognition. Sudarsana Reddy Kadiri, P. Gangamohan, Suryakanth V. Gangashetty, Bayya Yegnanarayana |
| 2015 | Analysis of features from analytic representation of speech using MP-ABX measures. Raghavendra Reddy Pappagari, Karthika Vijayan, K. Sri Rama Murty |
| 2015 | Analysis of mutual duration and noise effects in speaker recognition: benefits of condition-matched cohort selection in score normalization. Andreas Nautsch, Rahim Saeidi, Christian Rathgeb, Christoph Busch |
| 2015 | Analysis of spatial variation with app-based crowdsourced audio data. Marie-José Kolly, Adrian Leemann, Florian Matter |
| 2015 | Analysis of the second phase of the 2013-2014 i-vector machine learning challenge. Désiré Bansé, George R. Doddington, Daniel Garcia-Romero, John J. Godfrey, Craig S. Greenberg, Jaime Hernandez-Cordero, John M. Howard, Alvin F. Martin, Lisa P. Mason, Alan McCree, Douglas A. Reynolds |
| 2015 | Analyzing speech rate entrainment and its relation to therapist empathy in drug addiction counseling. Bo Xiao, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2015 | Annotating large lattices with the exact word error. Rogier C. van Dalen, Mark J. F. Gales |
| 2015 | Annotators' agreement and spontaneous emotion classification performance. Bogdan Vlasenko, Andreas Wendemuth |
| 2015 | Anomaly-based annotation errors detection in TTS corpora. Jindrich Matousek, Daniel Tihelka |
| 2015 | Ant colony algorithm applied to automatic speech recognition graph decoding. Benjamin Lecouteux, Didier Schwab |
| 2015 | Anti-spoofing system: an investigation of measures to detect synthetic and human speech. Abhinav Misra, Shivesh Ranjan, Chunlei Zhang, John H. L. Hansen |
| 2015 | Applying GPGPU to recurrent neural network language model based fast network search in the real-time LVCSR. Kyungmin Lee, Chiyoun Park, Ilhwan Kim, Namhoon Kim, Jaewon Lee |
| 2015 | Architectures for deep neural network based acoustic models defined over windowed speech waveforms. Mayank Bhargava, Richard Rose |
| 2015 | Are we using enough listeners? no! - an empirically-supported critique of Interspeech 2014 TTS evaluations. Mirjam Wester, Cassia Valentini-Botinhao, Gustav Eje Henter |
| 2015 | Are you TED talk material? comparing prosody in professors and TED speakers. T. J. Tsai |
| 2015 | Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential. Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2015 | Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings. Pengcheng Zhu, Lei Xie, Yunlin Chen |
| 2015 | Articulatory-based conversion of foreign accents with deep neural networks. Sandesh Aryal, Ricardo Gutierrez-Osuna |
| 2015 | Artificial personality and disfluency. Mirjam Wester, Matthew P. Aylett, Marcus Tomalin, Rasmus Dall |
| 2015 | Assessing empathy using static and dynamic behavior models based on therapist's language in addiction counseling. Sandeep Nallan Chakravarthula, Bo Xiao, Zac E. Imel, David C. Atkins, Panayiotis G. Georgiou |
| 2015 | Assessing the degree of nativeness and Parkinson's condition using Gaussian processes and deep rectifier neural networks. Tamás Grósz, Róbert Busa-Fekete, Gábor Gosztolya, László Tóth |
| 2015 | Attribute knowledge integration for speech recognition based on multi-task learning neural networks. Hao Zheng, Zhanlei Yang, Liwei Qiao, Jianping Li, Wenju Liu |
| 2015 | Audio augmentation for speech recognition. Tom Ko, Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur |
| 2015 | Audio quotation marks for natural language understanding. Simon Boutin, Réal Tremblay, Patrick Cardinal, Doug Peters, Pierre Dumouchel |
| 2015 | Auditory-visual tone perception in hearing impaired Thai listeners. Benjawan Kasisopa, Nittayapa Klangpornkun, Denis Burnham |
| 2015 | Auris populi: crowdsourced native transcriptions of Dutch vowels spoken by adult Spanish learners. Pepi Burgos, Eric Sanders, Catia Cucchiarini, Roeland van Hout, Helmer Strik |
| 2015 | Auto-imputing radial basis functions for neural-network turn-taking models. Kornel Laskowski |
| 2015 | Autoencoder based multi-stream combination for noise robust speech recognition. Sri Harish Reddy Mallidi, Tetsuji Ogawa, Karel Veselý, Phani S. Nidadavolu, Hynek Hermansky |
| 2015 | Automated evaluation of non-native English pronunciation quality: combining knowledge- and data-driven features at multiple time scales. Matthew P. Black, Daniel Bone, Zisis Iason Skordilis, Rahul Gupta, Wei Xia, Pavlos Papadopoulos, Sandeep Nallan Chakravarthula, Bo Xiao, Maarten Van Segbroeck, Jangwon Kim, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2015 | Automatic accentedness evaluation of non-native speech using phonetic and sub-phonetic posterior probabilities. Ramya Rasipuram, Milos Cernak, Alexandre Nanchen, Mathew Magimai-Doss |
| 2015 | Automatic age detection in normal and pathological voice. Jorge Andrés Gómez García, Laureano Moro-Velázquez, Juan Ignacio Godino-Llorente, Germán Castellanos-Domínguez |
| 2015 | Automatic audio sentiment extraction using keyword spotting. Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen |
| 2015 | Automatic classification of eating conditions from speech using acoustic feature selection and a set of hierarchical support vector machine classifiers. Abhay Prasad, Prasanta Kumar Ghosh |
| 2015 | Automatic detection and annotation of disfluencies in spoken French corpora. George Christodoulides, Mathieu Avanzi |
| 2015 | Automatic detection of creaky voice using epoch parameters. N. P. Narendra, K. Sreenivasa Rao |
| 2015 | Automatic detection of equipment alarms in a neonatal intensive care unit environment: a knowledge-based approach. Ganna Raboshchuk, Peter Jancovic, Climent Nadeu, Alex Peiró Lilja, Münevver Köküer, Blanca Muñoz Mahamud, Ana Riverola de Veciana |
| 2015 | Automatic detection of mild cognitive impairment from spontaneous speech using ASR. László Tóth, Gábor Gosztolya, Veronika Vincze, Ildikó Hoffmann, Gréta Szatlóczki, Edit Biró, Fruzsina Zsura, Magdolna Pákáski, János Kálmán |
| 2015 | Automatic detection of Parkinson's disease from continuous speech recorded in non-controlled noise conditions. Juan Camilo Vásquez-Correa, Tomás Arias-Vergara, Juan Rafael Orozco-Arroyave, Jesús Francisco Vargas-Bonilla, Julián D. Arias-Londoño, Elmar Nöth |
| 2015 | Automatic detection of sentence prominence in speech using predictability of word-level acoustic features. Sofoklis Kakouros, Okko Räsänen |
| 2015 | Automatic detection of uncertainty in spontaneous German dialogue. Tobias Schrank, Barbara Schuppler |
| 2015 | Automatic estimation of Parkinson's disease severity from diverse speech tasks. Jangwon Kim, Md. Nasir, Rahul Gupta, Maarten Van Segbroeck, Daniel Bone, Matthew P. Black, Zisis Iason Skordilis, Zhaojun Yang, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2015 | Automatic formatted transcripts for videos. Aasish Pappu, Amanda Stent |
| 2015 | Automatic identification of received language in MEG. Emilio Parisotto, Youness Aliyari Ghassabeh, Matt J. MacDonald, Adelina Cozma, Elizabeth W. Pang, Frank Rudzicz |
| 2015 | Automatic intelligibility measures applied to speech signals simulating age-related hearing loss. Lionel Fontan, Jérôme Farinas, Isabelle Ferrané, Julien Pinquier, Xavier Aumont |
| 2015 | Automatic phrase boundary labeling of speech synthesis database using context-dependent HMMs and n-gram prior distributions. Qian Chen, Zhen-Hua Ling, Chen-Yu Yang, Li-Rong Dai |
| 2015 | Automatic recognition of unified Parkinson's disease rating from speech with acoustic, i-vector and phonotactic features. Guozhen An, David Guy Brizan, Min Ma, Michelle Morales, Ali Raza Syed, Andrew Rosenberg |
| 2015 | Automatic segmentation and clustering of speech using sparse coding and metaheuristic search. Wiehan Agenbag, Thomas Niesler |
| 2015 | Automatic speaker verification spoofing and countermeasures (ASVspoof 2015): introductory talk by the organizers. Zhizheng Wu, Tomi Kinnunen |
| 2015 | Automatic speaker verification spoofing and countermeasures (ASVspoof 2015): open discussion and future plans. Junichi Yamagishi, Nicholas W. D. Evans |
| 2015 | Automatic transformation of irregular to regular voice by residual analysis and synthesis. Tamás Gábor Csapó, Géza Németh |
| 2015 | Autonomous measurement of speech intelligibility utilizing automatic speech recognition. Bernd T. Meyer, Birger Kollmeier, Jasper Ooster |
| 2015 | BLSTM neural networks for speech driven head motion synthesis. Chuang Ding, Pengcheng Zhu, Lei Xie |
| 2015 | Backward mimicry and forward influence in prosodic contour choice in standard American English. Agustín Gravano, Stefan Benus, Rivka Levitan, Julia Hirschberg |
| 2015 | Bag-of-words input for long history representation in neural network-based language models for speech recognition. Kazuki Irie, Ralf Schlüter, Hermann Ney |
| 2015 | Bayesian integration of sound source separation and speech recognition: a new approach to simultaneous speech recognition. Kousuke Itakura, Izaya Nishimuta, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii |
| 2015 | Bilinear map of filter-bank outputs for DNN-based speech recognition. Tetsuji Ogawa, Kenshiro Ueda, Kouichi Katsurada, Tetsunori Kobayashi, Tsuneo Nitta |
| 2015 | Binaural sound source localisation and tracking using a dynamic spherical head model. Christopher Schymura, Fiete Winter, Dorothea Kolossa, Sascha Spors |
| 2015 | Biosignal-based spoken communication: panel and discussion. Matthias Janke, Michael Wand |
| 2015 | Biosignal-based spoken communication: welcome and introduction. Matthias Janke, Michael Wand |
| 2015 | Blind score normalization method for PLDA based speaker recognition. Danila Doroshin, Nikolay Lubimov, Marina Nastasenko, Mikhail Kotov |
| 2015 | Boosting universal speech attributes classification with deep neural network for foreign accent characterization. Ville Hautamäki, Sabato Marco Siniscalchi, Hamid Behravan, Valerio Mario Salerno, Ivan Kukanov |
| 2015 | Bringing contextual information to Google speech recognition. Petar S. Aleksic, Mohammadreza Ghodsi, Assaf Hurwitz Michaely, Cyril Allauzen, Keith B. Hall, Brian Roark, David Rybach, Pedro J. Moreno |
| 2015 | Can you hear me? acoustic modifications in speech directed to foreigners and hearing-impaired people. Monja Angelika Knoll, Melissa Johnstone, Charlene Blakely |
| 2015 | Capcap: an output-agreement game for video captioning. Hernisa Kacorri, Kaoru Shinkawa, Shin Saito |
| 2015 | Channel selection in the short-time modulation domain for distant speech recognition. Ivan Himawan, Petr Motlícek, Sridha Sridharan, David Dean, Dian Tjondronegoro |
| 2015 | Children's reading aloud performance: a database and automatic detection of disfluencies. Jorge Proença, Dirce Celorico, Sara Candeias, Carla Lopes, Fernando Perdigão |
| 2015 | Classification of place-of-articulation of stop consonants using temporal analysis. A. P. Prathosh, A. G. Ramakrishnan, T. V. Ananthapadmanabha |
| 2015 | Classifiers for synthetic speech detection: a comparison. Cemal Hanilçi, Tomi Kinnunen, Md. Sahidullah, Aleksandr Sizov |
| 2015 | Clustering novel intents in a conversational interaction system with semantic parsing. Dilek Hakkani-Tür, Yun-Cheng Ju, Geoffrey Zweig, Gökhan Tür |
| 2015 | Clustering short push-to-talk segments. Ilya Shapiro, Neta Rabin, Irit Opher, Itshak Lapidot |
| 2015 | Codebook clustering for unit selection based EMG-to-speech conversion. Lorenz Diener, Matthias Janke, Tanja Schultz |
| 2015 | Codebook-based speech enhancement using Markov process and speech-presence probability. Qi He, Changchun Bao, Feng Bao |
| 2015 | Cognitive impairment prediction in the elderly based on vocal biomarkers. Bea Yu, Thomas F. Quatieri, James R. Williamson, James C. Mundt |
| 2015 | Cognitive workload and vocabulary sparseness: theory and practice. Ron M. Hecht, Aharon Bar-Hillel, Stas Tiomkin, Hadar Levi, Omer Tsimhoni, Naftali Tishby |
| 2015 | Collaborative annotation for person identification in TV shows. Mateusz Budnik, Laurent Besacier, Johann Poignant, Hervé Bredin, Claude Barras, Mickaël Stefas, Pierrick Bruneau, Thomas Tamisier |
| 2015 | Combating reverberation in large vocabulary continuous speech recognition. Vikramjit Mitra, Julien van Hout, Mitchell McLaren, Wen Wang, Martin Graciarena, Dimitra Vergyri, Horacio Franco |
| 2015 | Combination of NN and CRF models for joint detection of punctuation and disfluencies. Eunah Cho, Kevin Kilgour, Jan Niehues, Alex Waibel |
| 2015 | Combination of diverse subword units in spoken term detection. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh |
| 2015 | Combinations of various language model technologies including data expansion and adaptation in spontaneous speech recognition. Ryo Masumura, Taichi Asami, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi, Akinori Ito |
| 2015 | Combined cine- and tagged-MRI for tracking landmarks on the tongue surface. Honghao Bao, Wenhuan Lu, Kiyoshi Honda, Jianguo Wei, Qiang Fang, Jianwu Dang |
| 2015 | Combining amplitude and phase-based features for speaker verification with short duration utterances. Md. Jahangir Alam, Patrick Kenny, Themos Stafylakis |
| 2015 | Combining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech. Tanvina B. Patel, Hemant A. Patil |
| 2015 | Combining extreme learning machine and decision tree for duration prediction in HMM based speech synthesis. Yang Wang, Minghao Yang, Zhengqi Wen, Jianhua Tao |
| 2015 | Combining hierarchical classification with frequency weighting for the recognition of eating conditions. Johannes Wagner, Andreas Seiderer, Florian Lingenfelser, Elisabeth André |
| 2015 | Combining multiple approaches to predict the degree of nativeness. Eugénio Ribeiro, Jaime Ferreira, Julia Olcoz, Alberto Abad, Helena Moniz, Fernando Batista, Isabel Trancoso |
| 2015 | Combining multiple-type input units using recurrent neural network for LVCSR language modeling. Vataya Chunwijitra, Ananlada Chotimongkol, Chai Wutiwiwatchai |
| 2015 | Communicative needs and respiratory constraints. Marcin Wlodarczak, Mattias Heldner, Jens Edlund |
| 2015 | Community detection with manifold learning on speaker i-vector space for Chinese. Hongcui Wang, Di Jin, Lantian Li, Jianwu Dang |
| 2015 | Comparing SVM, softmax, and shallow neural networks for eating condition classification. Thomas Pellegrini |
| 2015 | Comparing journalistic and spontaneous speech: prosodic and spectral analysis. Cédric Gendrot, Martine Adda-Decker, Yaru Wu |
| 2015 | Comparison of Gaussian process regression and Gaussian mixture models in spectral tilt modelling for intelligibility enhancement of telephone speech. Emma Jokinen, Ulpu Remes, Paavo Alku |
| 2015 | Comparison of chironomic stylization versus statistical modeling of prosody for expressive speech synthesis. Marc Evrard, Samuel Delalez, Christophe d'Alessandro, Albert Rilliard |
| 2015 | Comparison of forced-alignment speech recognition and humans for generating reference VAD. Ivan Kraljevski, Zheng-Hua Tan, Maria Paola Bissiri |
| 2015 | Complementary tasks for context-dependent deep neural network acoustic models. Peter Bell, Steve Renals |
| 2015 | Complete-linkage clustering for voice activity detection in audio and visual speech. Houman Ghaemmaghami, David Dean, Shahram Kalantari, Sridha Sridharan, Clinton Fookes |
| 2015 | Complex tensor factorization in modulation frequency domain for single-channel speech enhancement. Shogo Masaya, Masashi Unoki |
| 2015 | Composition-based on-the-fly rescoring for salient n-gram biasing. Keith B. Hall, Eunjoon Cho, Cyril Allauzen, Françoise Beaufays, Noah Coccaro, Kaisuke Nakajima, Michael Riley, Brian Roark, David Rybach, Linda Zhang |
| 2015 | Compressing deep neural networks using a rank-constrained topology. Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, Carolina Parada |
| 2015 | Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation. Kshitiz Kumar, Ziad Al Bawab, Yong Zhao, Chaojun Liu, Benoît Dumoulin, Yifan Gong |
| 2015 | Conflict intensity estimation from speech using greedy forward-backward feature selection. Gábor Gosztolya |
| 2015 | Confusability in L2 vowels: analyzing the role of different features. Mátyás Jani, Catia Cucchiarini, Roeland van Hout, Helmer Strik |
| 2015 | Consonant duration and VOT as a function of syllable complexity and voicing in a sub-set of Spanish clusters. Mark Gibson, Ana María Fernández Planas, Adamantios I. Gafos, Emily Remirez |
| 2015 | Consonant recognition with continuous-state hidden Markov models and perceptually-motivated features. Philip Weber, Colin J. Champion, S. M. Houghton, Peter Jancovic, Martin J. Russell |
| 2015 | Constructive feedback, thinking process and cooperation: assessing the quality of classroom interaction. Tahir Sousa, Lucie Flekova, Margot Mieskes, Iryna Gurevych |
| 2015 | Contaminated speech training methods for robust DNN-HMM distant speech recognition. Mirco Ravanelli, Maurizio Omologo |
| 2015 | Contemporary stochastic feature selection algorithms for speech-based emotion recognition. Maxim Sidorov, Christina Brester, Alexander Schmitt |
| 2015 | Context-dependent error correction of spoken referring expressions. Ingrid Zukerman, Andisheh Partovi, Su Nam Kim |
| 2015 | Contextual variation of tones in Mizo. Priyankoo Sarmah, Leena Dihingia, Wendy Lalhminghlui |
| 2015 | Continuous emotion tracking using total variability space. Hossein Khaki, Engin Erzin |
| 2015 | Continuous speech recognition from ECoG. Dominic Heger, Christian Herff, Adriana de Pesters, Dominic Telaar, Peter Brunner, Gerwin Schalk, Tanja Schultz |
| 2015 | Continuous word representation using neural networks for proper name retrieval from diachronic documents. Dominique Fohr, Irina Illina |
| 2015 | Controlling quality and handling fraud in large scale crowdsourcing speech data collections. Spencer Rothwell, Ahmad Elshenawy, Steele Carter, Daniela Braga, Faraz Romani, Michael Kennewick, Bob Kennewick |
| 2015 | Conversational agent and management tools for conference and tourism domain. Luis Fernando D'Haro, Seokhwan Kim, Rafael E. Banchs |
| 2015 | Convolutional neural networks for acoustic modeling of raw time signal in LVCSR. Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney |
| 2015 | Convolutional neural networks for small-footprint keyword spotting. Tara N. Sainath, Carolina Parada |
| 2015 | Cosine distance features for robust speaker verification. Kuruvachan K. George, C. Santhosh Kumar, K. I. Ramachandran, Ashish Panda |
| 2015 | Counting competing speakers in a timeframe - human versus computer. Valentin Andrei, Horia Cucu, Andi Buzo, Corneliu Burileanu |
| 2015 | Creating expressive synthetic voices by unsupervised clustering of audiobooks. Igor Jauk, Antonio Bonafonte, Paula Lopez-Otero, Laura Docío Fernández |
| 2015 | Cross database training of audio-visual hidden Markov models for phone recognition. Shahram Kalantari, David Dean, Houman Ghaemmaghami, Sridha Sridharan, Clinton Fookes |
| 2015 | Cross-lingual transfer learning during supervised training in low resource scenarios. Amit Das, Mark Hasegawa-Johnson |
| 2015 | Cross-modality matching of linguistic and emotional prosody. Simone Simonetti, Jeesun Kim, Chris Davis |
| 2015 | Crosslinguistic comparison on the perception of Mandarin attitudinal speech. Wentao Gu, Ping Tang, Keikichi Hirose, Véronique Aubergé |
| 2015 | Crowdsource a little to label a lot: labeling a speech corpus of dialectal Arabic. Samantha Wray, Ahmed Ali |
| 2015 | DIANA: towards computational modeling reaction times in lexical decision in North American English. Louis ten Bosch, Lou Boves, Benjamin V. Tucker, Mirjam Ernestus |
| 2015 | DNN derived filters for processing of modulation spectrum of speech. Jan Pesán, Lukás Burget, Hynek Hermansky, Karel Veselý |
| 2015 | DNN senone MAP multinomial i-vectors for phonotactic language recognition. Alan McCree, Daniel Garcia-Romero |
| 2015 | DNN-based residual echo suppression. Chul Min Lee, Jong Won Shin, Nam Soo Kim |
| 2015 | DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech. Kehuang Li, Zhen Huang, Yong Xu, Chin-Hui Lee |
| 2015 | Data collection and annotation for state-of-the-art NER using unmanaged crowds. Spencer Rothwell, Steele Carter, Ahmad Elshenawy, Vladislavs Dovgalecs, Safiyyah Saleem, Daniela Braga, Bob Kennewick |
| 2015 | Data-driven foot-based intonation generator for text-to-speech synthesis. Mahsa Sadat Elyasi Langarani, Jan P. H. van Santen, Seyed Hamidreza Mohammadi, Alexander Kain |
| 2015 | Data-selective transfer learning for multi-domain speech recognition. Mortaza Doulaty, Oscar Saz, Thomas Hain |
| 2015 | Dataset-invariant covariance normalization for out-domain PLDA speaker verification. Md. Hafizur Rahman, Ahilan Kanagasundaram, David Dean, Sridha Sridharan |
| 2015 | Declination, peak height and pitch level in declaratives and questions of South Connaught Irish. Maria O'Reilly, Ailbhe Ní Chasaide |
| 2015 | Deep bottleneck network based i-vector representation for language identification. Yan Song, Xinhai Hong, Bing Jiang, Ruilian Cui, Ian McLoughlin, Li-Rong Dai |
| 2015 | Deep contextual language understanding in spoken dialogue systems. Chunxi Liu, Puyang Xu, Ruhi Sarikaya |
| 2015 | Deep neural network based spectral feature mapping for robust speech recognition. Kun Han, Yanzhang He, Deblin Bagchi, Eric Fosler-Lussier, DeLiang Wang |
| 2015 | Deep neural network context embeddings for model selection in rich-context HMM synthesis. Thomas Merritt, Junichi Yamagishi, Zhizheng Wu, Oliver Watts, Simon King |
| 2015 | Deep neural network training emphasizing central frames. Gakuto Kurata, Daniel Willett |
| 2015 | Deep neural network-based statistical parametric speech synthesis system using improved time-frequency trajectory excitation model. Eunwoo Song, Hong-Goo Kang |
| 2015 | Deep semantic encodings for language modeling. Ali Orkan Bayer, Giuseppe Riccardi |
| 2015 | Delta-melspectra features for noise robustness to DNN-based ASR systems. Kshitiz Kumar, Chaojun Liu, Yifan Gong |
| 2015 | Denoising autoencoder-based speaker feature restoration for utterances of short duration. Hitoshi Yamamoto, Takafumi Koshinaka |
| 2015 | Dereverberation for active human-robot communication robust to speaker's face orientation. Randy Gomez, Levko Ivanchuk, Keisuke Nakamura, Takeshi Mizumoto, Kazuhiro Nakadai |
| 2015 | Detecting audio-visual synchrony using deep neural networks. Etienne Marcheret, Gerasimos Potamianos, Josef Vopicka, Vaibhava Goel |
| 2015 | Detecting repetitions in spoken dialogue systems using phonetic distances. José Lopes, Giampiero Salvi, Gabriel Skantze, Alberto Abad, Joakim Gustafson, Fernando Batista, Raveesh Meena, Isabel Trancoso |
| 2015 | Detection of cardiovascular reactivity in speech. Laurens van der Werff, Jón Guðnason, Kamilla Rún Jóhannsdóttir |
| 2015 | Detection of cognitive states and their correlation to speech recognition performance in speech-to-speech machine translation systems. Hayakawa Akira, Fasih Haider, Loredana Cerrato, Nick Campbell, Saturnino Luz |
| 2015 | Detection of Mizo tones. Biswajit Dev Sarma, Priyankoo Sarmah, Wendy Lalhminghlui, S. R. Mahadeva Prasanna |
| 2015 | Development of CRIM system for the automatic speaker verification spoofing and countermeasures challenge 2015. Md. Jahangir Alam, Patrick Kenny, Gautam Bhattacharya, Themos Stafylakis |
| 2015 | Development of a Cantonese dysarthric speech corpus. Ka-Ho Wong, Yu Ting Yeung, Edwin H. Y. Chan, Patrick C. M. Wong, Gina-Anne Levow, Helen M. Meng |
| 2015 | Development of Hindi speech recognition system of agricultural commodities using deep neural network. Partho Mandal, Shalini Jain, Gaurav Ojha, Anupam Shukla |
| 2015 | Diachronic semantic cohesion for topic segmentation of TV broadcast news. Abdessalam Bouchekif, Géraldine Damnati, Yannick Estève, Delphine Charlet, Nathalie Camelin |
| 2015 | Dialog act modeling for virtual personal assistant applications using a small volume of labeled data and domain knowledge. Donghyeon Lee, Jinsik Lee, Eun-Kyoung Kim, Jaewon Lee |
| 2015 | Dialog state tracking using long short-term memory neural networks. Xiaohao Yang, Jia Liu |
| 2015 | Dimensionality reduction for speech emotion features by multiscale kernels. Xinzhou Xu, Jun Deng, Wenming Zheng, Li Zhao, Björn W. Schuller |
| 2015 | Direction of arrival estimation based on reverberation weighting and noise error estimator. Cheng Pang, Jie Zhang, Hong Liu |
| 2015 | Discovering discrete subword units with binarized autoencoders and hidden-Markov-model encoders. Leonardo Badino, Alessio Mereta, Lorenzo Rosasco |
| 2015 | Discriminative bilinear language modeling for broadcast transcriptions. Akio Kobayashi, Manon Ichiki, Takahiro Oku, Kazuo Onoe, Shoei Sato |
| 2015 | Discriminative data selection for lightly supervised training of acoustic model using closed caption texts. Sheng Li, Yuya Akita, Tatsuya Kawahara |
| 2015 | Discriminative nonnegative matrix factorization using cross-reconstruction error for source separation. Kisoo Kwon, Jong Won Shin, Hyung Yong Kim, Nam Soo Kim |
| 2015 | Discriminative template learning in group-convolutional networks for invariant speech representations. Chiyuan Zhang, Stephen Voinea, Georgios Evangelopoulos, Lorenzo Rosasco, Tomaso A. Poggio |
| 2015 | Distance-aware DNNs for robust speech recognition. Yajie Miao, Florian Metze |
| 2015 | Distinct triphone acoustic modeling using deep neural networks. Dongpeng Chen, Brian Mak |
| 2015 | Distinctive feature based representation of speech for query-by-example spoken term detection. Abhijeet Saxena, B. Yegnanarayana |
| 2015 | Distributed representation-based spoken word sense induction. Justin T. Chiu, Yajie Miao, Alan W. Black, Alexander I. Rudnicky |
| 2015 | Does my speech rock? automatic assessment of public speaking skills. Lucas Azaïs, Adrien Payan, Tianjiao Sun, Guillaume Vidal, Tina Zhang, Eduardo Coutinho, Florian Eyben, Björn W. Schuller |
| 2015 | Does voice anthropomorphism affect lexical alignment in speech-based human-computer dialogue? Benjamin R. Cowan, Holly P. Branigan |
| 2015 | Double-ended prediction of the naturalness ratings of the Blizzard Challenge 2008-2013. Lukas Latacz, Werner Verhelst |
| 2015 | Duration dependent covariance regularization in PLDA modeling for speaker verification. Weicheng Cai, Ming Li, Lin Li, Qingyang Hong |
| 2015 | Duration prediction using multi-level model for GPR-based speech synthesis. Decha Moungsri, Tomoki Koriyama, Takao Kobayashi |
| 2015 | Durational characteristics and timing patterns of Russian onset clusters at two speaking rates. Marianne Pouplier, Stefania Marin, Alexei Kochetov |
| 2015 | Durational information in word-initial lexical embeddings in spoken Dutch. Odette Scharenborg |
| 2015 | E-commu-book: an assistive technology for users with speech impairments. Ka-Ho Wong, Wai-Kim Leung, Helen M. Meng |
| 2015 | Effect of different jitter-induced glottal pulse shape changes in periodicity perturbation measures. Carlos A. Ferrer-Riesgo, Diana Torres, Eduardo González-Moreira, José Ramón Calvo de Lara, Eduardo Castillo |
| 2015 | Effect of gender and call duration on customer satisfaction in call center big data. Quim Llimona, Jordi Luque, Xavier Anguera, Zoraida Hidalgo, Souneil Park, Nuria Oliver |
| 2015 | Effect of trapping questions on the reliability of speech quality judgments in a crowdsourcing paradigm. Babak Naderi, Tim Polzehl, Ina Wechsung, Friedemann Köster, Sebastian Möller |
| 2015 | Efficient GPU implementation of convolutional neural networks for speech recognition. Ewout van den Berg, Daniel Brand, Rajesh Bordawekar, Leonid Rachevsky, Bhuvana Ramabhadran |
| 2015 | Efficient language model adaptation for automatic speech recognition of spoken translations. Joris Pelemans, Tom Vanallemeersch, Kris Demuynck, Hugo Van hamme, Patrick Wambacq |
| 2015 | Efficient learning for spoken language understanding tasks with word embedding based pre-training. Yi Luan, Shinji Watanabe, Bret Harsham |
| 2015 | Efficient machine translation decoding with slow language models. Ahmad Emami |
| 2015 | Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition. Rongfeng Su, Xurong Xie, Xunying Liu, Lan Wang |
| 2015 | Emotion clustering based on probabilistic linear discriminant analysis. Mahnoosh Mehrabani, Ozlem Kalinli, Ruxin Chen |
| 2015 | Emotional transplant in statistical speech synthesis based on emotion additive model. Yamato Ohtani, Yu Nasu, Masahiro Morita, Masami Akamine |
| 2015 | Energy distribution analysis and nonlinear dynamical analysis of adductor spasmodic dysphonia. Jiantao Wu, Ping Yu, Nan Yan, Lan Wang, Xiaohui Yang, Manwa L. Ng |
| 2015 | Enhanced processing of a lost language: linguistic knowledge or linguistic skill? Jiyoun Choi, Mirjam Broersma, Anne Cutler |
| 2015 | Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations. Koji Inoue, Yukoh Wakabayashi, Hiromasa Yoshimoto, Katsuya Takanashi, Tatsuya Kawahara |
| 2015 | Enhanced videokymographic data analysis based on vocal folds dynamics modeling. Carlo Drioli, Gian Luca Foresti |
| 2015 | Enhancement of non-stationary speech using harmonic chirp filters. Sidsel Marie Nørholm, Jesper Rindom Jensen, Mads Græsbøll Christensen |
| 2015 | Enhancing low resource keyword spotting with automatically retrieved web documents. Le Zhang, Damianos G. Karakos, William Hartmann, Roger Hsiao, Richard M. Schwartz, Stavros Tsakalidis |
| 2015 | Ensemble of Gaussian mixture localized neural networks with application to phone recognition. Ruchir Travadi, Shrikanth S. Narayanan |
| 2015 | Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation. Sheng Li, Xugang Lu, Yuya Akita, Tatsuya Kawahara |
| 2015 | Entropy-based sentence selection for speech synthesis using phonetic and prosodic contexts. Takashi Nose, Yusuke Arao, Takao Kobayashi, Komei Sugiura, Yoshinori Shiga, Akinori Ito |
| 2015 | Error analysis of extracted tongue contours from 2d ultrasound images. Tamás Gábor Csapó, Steven M. Lulich |
| 2015 | Error bounds for context reduction and feature omission. Eugen Beck, Ralf Schlüter, Hermann Ney |
| 2015 | Estimating lower vocal tract features with closed-open phase spectral analyses. Elizabeth Godoy, Nicolas Malyska, Thomas F. Quatieri |
| 2015 | Estimating the severity of Parkinson's disease from speech using linear regression and database partitioning. Dávid Sztahó, Gábor Kiss, Klára Vicsi |
| 2015 | Estimation of glottal closure instants from telephone speech using a group delay-based approach that considers speech signal as a spectrum. Rachel G. Anushiya, P. Vijayalakshmi, T. Nagarajan |
| 2015 | Estimation of the air-tissue boundaries of the vocal tract in the mid-sagittal plane from electromagnetic articulograph data. Satyabrata Parida, Ashok Kumar Pattem, Prasanta Kumar Ghosh |
| 2015 | Evaluation and calibration of short-term aging effects in speaker verification. Finnian Kelly, John H. L. Hansen |
| 2015 | Evaluation of re-ranking by prioritizing highly ranked documents in spoken term detection. Kazuki Oouchi, Ryota Kon'no, Takahiro Akyu, Kazuma Konno, Kazunori Kojima, Kazuyo Tanaka, Shi-wook Lee, Yoshiaki Itoh |
| 2015 | Evaluation of state mapping based foreign accent conversion. Markus Toman, Michael Pucher |
| 2015 | Evidence of phonological processes in automatic recognition of children's speech. Eva Fringi, Jill Fain Lehman, Martin J. Russell |
| 2015 | Experiences with and new application ideas for the Interspeech app. Sebastian Möller, Tilo Westermann |
| 2015 | Experimental assessment of the tongue incompressibility hypothesis during speech production. Zisis Iason Skordilis, Vikram Ramanarayanan, Louis Goldstein, Shrikanth S. Narayanan |
| 2015 | Expert and crowdsourced annotation of pronunciation errors for automatic scoring systems. Anastassia Loukina, Melissa Lopez, Keelan Evanini, David Suendermann-Oeft, Klaus Zechner |
| 2015 | Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions. Ning Ma, Guy J. Brown, Tobias May |
| 2015 | Exploiting i-vector posterior covariances for short-duration language recognition. Sandro Cumani, Oldrich Plchot, Radek Fér |
| 2015 | Exploiting supervector structure for speaker recognition trained on a small development set. Hagai Aronowitz |
| 2015 | Exploiting top-down source models to improve binaural localisation of multiple sources in reverberant environments. Ning Ma, Guy J. Brown, José A. González |
| 2015 | Exploring ANN back-ends for i-vector based speaker age estimation. Anna Fedorova, Ondrej Glembek, Tomi Kinnunen, Pavel Matejka |
| 2015 | Exploring acoustic differences between Cantonese (tonal) and English (non-tonal) spoken expressions of emotions. Chee Seng Chong, Jeesun Kim, Chris Davis |
| 2015 | Exploring how deep neural networks form phonemic categories. Tasha Nagamine, Michael L. Seltzer, Nima Mesgarani |
| 2015 | Exploring minimal pronunciation modeling for low resource languages. Marelie H. Davel, Etienne Barnard, Charl Johannes van Heerden, William Hartmann, Damianos G. Karakos, Richard M. Schwartz, Stavros Tsakalidis |
| 2015 | Exploring robustness of DNN/RNN for extracting speaker Baum-Welch statistics in mismatched conditions. Hao Zheng, Shanshan Zhang, Wenju Liu |
| 2015 | Extractive meeting summarization through speaker zone detection. Mohammad Hadi Bokaei, Hossein Sameti, Yang Liu |
| 2015 | F0 discontinuity as a marker of prosodic boundary strength in Lombard speech. Stefan Benus, Uwe D. Reichel, Juraj Simko |
| 2015 | F0 parameterization of glottalized tones for HMM-based Vietnamese TTS. Duy Khanh Ninh, Yoichi Yamashita |
| 2015 | Face reading from speech - predicting facial action units from audio cues. Fabien Ringeval, Erik Marchi, Marc Mehu, Klaus R. Scherer, Björn W. Schuller |
| 2015 | Factor analysis for speaker segmentation and improved speaker diarization. Brecht Desplanques, Kris Demuynck, Jean-Pierre Martens |
| 2015 | Fast and accurate phase unwrapping. Thomas Drugman, Yannis Stylianou |
| 2015 | Fast and accurate recurrent neural network acoustic models for speech recognition. Hasim Sak, Andrew W. Senior, Kanishka Rao, Françoise Beaufays |
| 2015 | Feature extraction strategies in deep learning based acoustic event detection. Miquel Espi, Masakiyo Fujimoto, Keisuke Kinoshita, Tomohiro Nakatani |
| 2015 | Feature-space speaker adaptation for probabilistic linear discriminant analysis acoustic models. Liang Lu, Steve Renals |
| 2015 | Fisher vectors with cascaded normalization for paralinguistic analysis. Heysem Kaya, Alexey Karpov, Albert Ali Salah |
| 2015 | Flexible tracking of auditory attention. Majid Mirbagheri, Bradley Ekin, Les Atlas, Adrian K. C. Lee |
| 2015 | Fluent personalized speech synthesis with prosodic word-level spontaneous speech generation. Yi-Chin Huang, Chung-Hsien Wu, Ming-Ge Shie |
| 2015 | Frequency map selection using a RBFN-based classifier in the MVDR beamformer for speaker localization in reverberant rooms. Daniele Salvati, Carlo Drioli, Gian Luca Foresti |
| 2015 | Frequency offset correction in single sideband (SSB) speech by deep neural network for speaker verification. Hua Xing, Gang Liu, John H. L. Hansen |
| 2015 | From Newcastle MOUTH to Aussie ears: Australians' perceptual assimilation and adaptation for Newcastle UK vowels. Catherine T. Best, Jason A. Shaw, Gerard Docherty, Bronwen G. Evans, Paul Foulkes, Jennifer Hay, Jalal Al-Tamimi, Katharine Mair, Karen E. Mulak, Sophie Wood |
| 2015 | From text to formants - indirect model for trajectory prediction based on a multi-speaker parallel speech database. Kálmán Abari, Tamás Gábor Csapó, Bálint Pál Tóth, Gábor Olaszy |
| 2015 | Full multicondition training for robust i-vector based speaker recognition. Dayana Ribas, Emmanuel Vincent, José Ramón Calvo de Lara |
| 2015 | Fully unsupervised small-vocabulary speech recognition using a segmental Bayesian model. Herman Kamper, Aren Jansen, Sharon Goldwater |
| 2015 | Fusion of LVCSR and posteriorgram based keyword search. Leda Sari, Batuhan Gündogdu, Murat Saraçlar |
| 2015 | Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning. Qiong Hu, Zhizheng Wu, Korin Richmond, Junichi Yamagishi, Yannis Stylianou, Ranniery Maia |
| 2015 | GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models. Natalia A. Tomashenko, Yuri Y. Khokhlov |
| 2015 | Garbage modeling for on-device speech recognition. Christophe Van Gysel, Leonid Velikovich, Ian McGraw, Françoise Beaufays |
| 2015 | Gaussian free cluster tree construction using deep neural network. Linchen Zhu, Kevin Kilgour, Sebastian Stüker, Alex Waibel |
| 2015 | Generalized variable parameter HMMs based acoustic-to-articulatory inversion. Xurong Xie, Xunying Liu, Lan Wang, Rongfeng Su |
| 2015 | Geo-location for voice search language modeling. Ciprian Chelba, Xuedong Zhang, Keith B. Hall |
| 2015 | German non-native realizations of French voiced fricatives in final position of a group of words. Anne Bonneau, Martine Cadot |
| 2015 | Glottal inverse filtering based on quadratic programming. Manu Airaksinen, Tom Bäckström, Paavo Alku |
| 2015 | Goodness of tone (GOT) for non-native Mandarin tone recognition. Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li |
| 2015 | HMM adaptation for child speech synthesis. Avashna Govender, Febe de Wet, Jules-Raymond Tapamo |
| 2015 | HMM based Myanmar text to speech system. Ye Kyaw Thu, Win Pa Pa, Jinfu Ni, Yoshinori Shiga, Andrew M. Finch, Chiori Hori, Hisashi Kawai, Eiichiro Sumita |
| 2015 | HMM training strategy for incremental speech synthesis. Maël Pouget, Thomas Hueber, Gérard Bailly, Timo Baumann |
| 2015 | Handling derivative filterbank features in bounded-marginalization-based missing data automatic speech recognition. Marco Kühne |
| 2015 | Hands-on tool producing front vowels for phonetic education: aiming for pronunciation training with tactile sensation. Takayuki Arai |
| 2015 | Hierarchical discriminative model for spoken language understanding based on convolutional neural network. Jan Svec, Adam Chýlek, Lubos Smídl |
| 2015 | High-level feature representation using recurrent neural network for speech emotion recognition. Jinkyu Lee, Ivan Tashev |
| 2015 | High-resolution acoustic modeling and compact language modeling of language-universal speech attributes for spoken language identification. Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee |
| 2015 | Homophonous phonotactic and morphonotactic consonant clusters in word-final position. Hannah Leykum, Sylvia Moosmüller, Wolfgang U. Dressler |
| 2015 | How the slope of the speech spectrum affects the perception of speaker size. Kodai Yamamoto, Toshio Irino, Ryuichi Nisimura, Hideki Kawahara, Roy D. Patterson |
| 2015 | How to compare TTS systems: a new subjective evaluation methodology focused on differences. Jonathan Chevelu, Damien Lolive, Sébastien Le Maguer, David Guennec |
| 2015 | How to evaluate ASR output for named entity recognition? Mohamed Ameur Ben Jannet, Olivier Galibert, Martine Adda-Decker, Sophie Rosset |
| 2015 | Human vocal tract growth: a longitudinal study of the development of various anatomical structures. Guillaume Barbier, Louis-Jean Boë, Guillaume Captier, Rafael Laboissière |
| 2015 | Human vs machine spoofing detection on wideband and narrowband data. Mirjam Wester, Zhizheng Wu, Junichi Yamagishi |
| 2015 | Hypotheses ranking and state tracking for a multi-domain dialog system using multiple ASR alternates. Omar Zia Khan, Jean-Philippe Robichaud, Paul A. Crook, Ruhi Sarikaya |
| 2015 | I-vector based physical task stress detection with different fusion strategies. Chunlei Zhang, Gang Liu, Chengzhu Yu, John H. L. Hansen |
| 2015 | I-vector dependent feature space transformations for adaptive speech recognition. Xiangang Li, Xihong Wu |
| 2015 | I-vector estimation using informative priors for adaptation of deep neural networks. Penny Karanasou, Mark J. F. Gales, Philip C. Woodland |
| 2015 | Immediately postverbal questions in Urdu. Farhat Jabeen, Tina Bögel, Miriam Butt |
| 2015 | Implementation of a live dialectal media subtitling system. Michael Stadtschnitzer, Christoph Schmidt |
| 2015 | Importance of intelligible phonemes for human speaker recognition in different channel bandwidths. Laura Fernández Gallardo, Sebastian Möller, Michael Wagner |
| 2015 | Improved Hindi broadcast ASR by adapting the language model and pronunciation model using a priori syntactic and morphophonemic knowledge. Preethi Jyothi, Mark Hasegawa-Johnson |
| 2015 | Improved phase reconstruction in single-channel speech separation. Florian Mayer, Pejman Mowlaee |
| 2015 | Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic. M. Ali Basha Shaik, Zoltán Tüske, Muhammad Ali Tahir, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney |
| 2015 | Improvements to the pruning behavior of DNN acoustic models. Matthias Paulik |
| 2015 | Improving G2P from Wiktionary and other (web) resources. Steffen Eger |
| 2015 | Improving PLDA speaker verification using WMFD and linear-weighted approaches in limited microphone data conditions. Ahilan Kanagasundaram, David Dean, Sridha Sridharan |
| 2015 | Improving automatic forced alignment for dysarthric speech transcription. Yu Ting Yeung, Ka-Ho Wong, Helen M. Meng |
| 2015 | Improving automatic speech recognition in spatially-aware hearing aids. Hendrik Kayser, Constantin Spille, Daniel Marquardt, Bernd T. Meyer |
| 2015 | Improving deep neural networks based multi-accent Mandarin speech recognition using i-vectors and accent-specific top layer. Mingming Chen, Zhanlei Yang, Jizhong Liang, Yanpeng Li, Wenju Liu |
| 2015 | Improving speech recognition and keyword search for low resource languages using web data. Gideon Mendels, Erica Cooper, Victor Soto, Julia Hirschberg, Mark J. F. Gales, Kate M. Knill, Anton Ragni, Haipeng Wang |
| 2015 | Improving the prediction power of the speech transmission index to account for non-linear distortions introduced by noise-reduction algorithms. Fei Chen |
| 2015 | Improving voice activity detection in movies. Bernhard Lehner, Gerhard Widmer, Reinhard Sonnleitner |
| 2015 | Incorporating prosodic prominence evidence into term weights for spoken content retrieval. David Nicolas Racca, Gareth J. F. Jones |
| 2015 | Incorporating visual information for spoken term detection. Shahram Kalantari, David Dean, Sridha Sridharan |
| 2015 | Inductive implementation of segmental HMMs as CS-HMMs. S. M. Houghton, Colin J. Champion |
| 2015 | Influence of speaker familiarity on blind and visually impaired children's perception of synthetic voices in audio games. Michael Pucher, Markus Toman, Dietmar Schabus, Cassia Valentini-Botinhao, Junichi Yamagishi, Bettina Zillinger, Erich Schmid |
| 2015 | Insights into deep neural networks for speaker recognition. Daniel Garcia-Romero, Alan McCree |
| 2015 | Integrating online i-vector extractor with information bottleneck based speaker diarization system. Srikanth R. Madikeri, Ivan Himawan, Petr Motlícek, Marc Ferras |
| 2015 | Integration of DNN based speech enhancement and ASR. Ramón Fernandez Astudillo, Maria Joana Correia, Isabel Trancoso |
| 2015 | Integration of deep bottleneck features for audio-visual speech recognition. Hiroshi Ninomiya, Norihide Kitaoka, Satoshi Tamura, Yurie Iribe, Kazuya Takeda |
| 2015 | Intelligibility enhancement of casual speech for reverberant environments inspired by clear speech properties. Maria Koutsogiannaki, Petko Nikolov Petkov, Yannis Stylianou |
| 2015 | Intelligibility enhancement of vocal announcements for public address systems: a design for all through a presbycusis pre-compensation filter. Amira Ben Jemaa, Nader Mechergui, G. Courtois, A. Mudry, Sonia Djaziri Larbi, Monia Turki, Hervé Lissek, Meriem Jaïdane |
| 2015 | Interactivity-aware playout adaptation. Jochen Issing, Nikolaus Färber, Reinhard German |
| 2015 | Intermediate-layer DNN adaptation for offline and session-based iterative speaker adaptation. Kshitiz Kumar, Chaojun Liu, Kaisheng Yao, Yifan Gong |
| 2015 | Interpolation of tongue fleshpoint kinematics from combined EMA position and orientation data. Andrew J. Kolb, Michael T. Johnson, Jeffrey Berry |
| 2015 | Investigating consonant reduction in Mandarin Chinese with improved forced alignment. Jiahong Yuan, Mark Y. Liberman |
| 2015 | Investigating factor analysis features for deep neural networks in noisy speech recognition. Sriram Ganapathy, Samuel Thomas, Dimitrios Dimitriadis, Steven J. Rennie |
| 2015 | Investigating in-domain data requirements for PLDA training. Md. Hafizur Rahman, David Dean, Ahilan Kanagasundaram, Sridha Sridharan |
| 2015 | Investigating modulation spectrogram features for deep neural network-based automatic speech recognition. Deepak Baby, Hugo Van hamme |
| 2015 | Investigating the role of 'yeah' in stance-dense conversation. Valerie Freeman, Gina-Anne Levow, Richard A. Wright, Mari Ostendorf |
| 2015 | Investigation of bottleneck features and multilingual deep neural networks for speaker verification. Yao Tian, Meng Cai, Liang He, Jia Liu |
| 2015 | Investigation of parametric rectified linear units for noise robust speech recognition. Sunil Sivadas, Zhenzhou Wu, Bin Ma |
| 2015 | Is it time to switch to word embedding and recurrent neural networks for spoken language understanding? Vedran Vukotic, Christian Raymond, Guillaume Gravier |
| 2015 | JFA for speaker recognition with random digit strings. Themos Stafylakis, Patrick Kenny, Md. Jahangir Alam, Marcel Kockmann |
| 2015 | Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages. Haipeng Wang, Anton Ragni, Mark J. F. Gales, Kate M. Knill, Philip C. Woodland, Chao Zhang |
| 2015 | Joint environment and speaker normalization using factored front-end CMLLR. Shakti Rath, Sunil Sivadas, Bin Ma |
| 2015 | Joint optimization of recurrent networks exploiting source auto-regression for source separation. Shuai Nie, Wei Xue, Shan Liang, Xueliang Zhang, Wenju Liu, Liwei Qiao, Jianping Li |
| 2015 | Joint source localization and separation in spherical harmonic domain using a sparsity based method. Sachin N. Kalkur, Sandeep Reddy C, Rajesh M. Hegde |
| 2015 | Joint training of speech separation, filterbank and acoustic model for robust automatic speech recognition. Zhong-Qiu Wang, DeLiang Wang |
| 2015 | Keyword spotting in multi-player voice driven games for children. Sundar Harshavardhan, Jill Fain Lehman, Rita Singh |
| 2015 | Knowledge versus data in TTS: evaluation of a continuum of synthesis systems. Rosie Kay, Oliver Watts, Roberto Barra-Chicote, Cassie Mayo |
| 2015 | LSTM for punctuation restoration in speech transcripts. Ottokar Tilk, Tanel Alumäe |
| 2015 | Language-independent method for analysis of German stuttering recordings. Tomas Lustyk, Petr Bergl, Tino Haderlein, Elmar Nöth, Roman Cmejla |
| 2015 | Large scale speech-to-text translation with out-of-domain corpora using better context-based models and domain adaptation. Marcin Junczys-Dowmunt, Pawel Przybysz, Arleta Staszuk, Eun-Kyoung Kim, Jaewon Lee |
| 2015 | Large vocabulary automatic speech recognition for children. Hank Liao, Golan Pundak, Olivier Siohan, Melissa K. Carroll, Noah Coccaro, Qi-Ming Jiang, Tara N. Sainath, Andrew W. Senior, Françoise Beaufays, Michiel Bacchiani |
| 2015 | Large vocabulary children's speech recognition with DNN-HMM and SGMM acoustic modeling. Diego Giuliani, Bagher BabaAli |
| 2015 | Large-scale, sequence-discriminative, joint adaptive training for masking-based robust ASR. Arun Narayanan, Ananya Misra, Kean K. Chin |
| 2015 | Latency analysis of speech shadowing reveals processing differences in Japanese adults who do and do not stutter. Rong Na A, Koichi Mori, Naomi Sakai |
| 2015 | Latent words recurrent neural network language models. Ryo Masumura, Taichi Asami, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi, Akinori Ito |
| 2015 | Lateralization in emotional speech perception following transcranial direct current stimulation. Alex Francois-Nienaber, Jed A. Meltzer, Frank Rudzicz |
| 2015 | Latvian speech-to-text transcription service. Askars Salimbajevs, Jevgenijs Strigins |
| 2015 | Laughter and filler detection in naturalistic audio. Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen |
| 2015 | Layered nonnegative matrix factorization for speech separation. Chung-Chien Hsu, Jen-Tzung Chien, Tai-Shih Chi |
| 2015 | Learning OOV through semantic relatedness in spoken dialog systems. Ming Sun, Yun-Nung Chen, Alexander I. Rudnicky |
| 2015 | Learning a speech manifold for signal subspace speech denoising. Colin Vaz, Shrikanth S. Narayanan |
| 2015 | Learning from real users: rating dialogue success with neural networks for reinforcement learning in spoken dialogue systems. Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, Steve J. Young |
| 2015 | Learning phrase patterns for ASR name error detection using semantic similarity. Alex Marin, Mari Ostendorf, Ji He |
| 2015 | Learning semantic hierarchy with distributed representations for unsupervised spoken language understanding. Yun-Nung Chen, William Yang Wang, Alexander I. Rudnicky |
| 2015 | Learning speech rate in speech recognition. Xiangyu Zeng, Shi Yin, Dong Wang |
| 2015 | Learning the speech front-end with raw waveform CLDNNs. Tara N. Sainath, Ron J. Weiss, Andrew W. Senior, Kevin W. Wilson, Oriol Vinyals |
| 2015 | Learning to estimate reverberation time in noisy and reverberant rooms. Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L. Jones, Engsiong Chng, Haizhou Li |
| 2015 | Least squares estimate of the initial phases in STFT based speech enhancement. Sidsel Marie Nørholm, Martin Krawczyk-Becker, Timo Gerkmann, Steven van de Par, Jesper Rindom Jensen, Mads Græsbøll Christensen |
| 2015 | Leveraging word embeddings for spoken document summarization. Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen |
| 2015 | Linguistic measures of pitch range in Slavic and Germanic languages. Bistra Andreeva, Bernd Möbius, Grazyna Demenko, Frank Zimmerer, Jeanin Jügler |
| 2015 | Locality constrained transitive distance clustering on speech data. Wenbo Liu, Zhiding Yu, Bhiksha Raj, Ming Li |
| 2015 | Locally-connected and convolutional neural networks for small footprint speaker recognition. Yu-Hsin Chen, Ignacio López-Moreno, Tara N. Sainath, Mirkó Visontai, Raziel Alvarez, Carolina Parada |
| 2015 | Long short-term memory based convolutional recurrent neural networks for large vocabulary speech recognition. Xiangang Li, Xihong Wu |
| 2015 | Low frequency ultrasonic voice activity detection using convolutional neural networks. Ian McLoughlin, Yan Song |
| 2015 | Low-frequency components analysis in running speech for the automatic detection of Parkinson's disease. Tatiana Villa-Cañas, Julián D. Arias-Londoño, Juan Rafael Orozco-Arroyave, Jesús Francisco Vargas-Bonilla, Elmar Nöth |
| 2015 | Low-memory fast on-line adaptation for acoustically mismatched children's speech recognition. Syed Shahnawazuddin, Rohit Sinha |
| 2015 | Managing speech databases with emuR and the EMU-webApp. Raphael Winkelmann |
| 2015 | Many-to-many voice conversion based on multiple non-negative matrix factorization. Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki |
| 2015 | Maximum a posteriori adaptation of network parameters in deep models. Zhen Huang, Sabato Marco Siniscalchi, I-Fan Chen, Jinyu Li, Jiadong Wu, Chin-Hui Lee |
| 2015 | Measuring and monitoring speech quality for voice over IP with POLQA, viSQOL and P.563. Andrew Hines, Eoin Gillen, Naomi Harte |
| 2015 | Measuring mimicry in task-oriented conversations: degree of mimicry is related to task difficulty. Vijay Solanki, Alessandro Vinciarelli, Jane Stuart-Smith, Rachel Smith |
| 2015 | Measuring oral and nasal airflow in production of Chinese plosive. Yujie Chi, Kiyoshi Honda, Jianguo Wei, Hui Feng, Jianwu Dang |
| 2015 | Media monitoring system for Latvian radio and TV broadcasts. Arturs Znotins, Kaspars Polis, Roberts Dargis |
| 2015 | Meeting assistant application. Michel Assayag, Jonathan Huang, Jonathan Mamou, Oren Pereg, Saurav Sahay, Oren Shamir, Georg Stemmer, Moshe Wasserblat |
| 2015 | Micro-structure of disfluencies: basics for conversational speech synthesis. Simon Betz, Petra Wagner, David Schlangen |
| 2015 | Migrating i-vectors between speaker recognition systems using regression neural networks. Ondrej Glembek, Pavel Matejka, Oldrich Plchot, Jan Pesán, Lukás Burget, Petr Schwarz |
| 2015 | Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features. Zhizheng Wu, Simon King |
| 2015 | Minimum word error training of RNN-based voice activity detection. Gregory Gelly, Jean-Luc Gauvain |
| 2015 | Mispronunciation detection without nonnative training data. Ann Lee, James R. Glass |
| 2015 | Mitigating the effects of non-stationary unseen noises on language recognition performance. Luciana Ferrer, Mitchell McLaren, Aaron Lawson, Martin Graciarena |
| 2015 | Model-based adaptive pre-processing of speech for enhanced intelligibility in noise and reverberation. Jan Rennies, Andreas Volgenandt, Henning F. Schepker, Simon Doclo |
| 2015 | Model-based integration of reverberation for noise-adaptive near-end listening enhancement. Henning F. Schepker, David Hülsmeier, Jan Rennies, Simon Doclo |
| 2015 | Modeling phonetic context with non-random forests for speech recognition. Hainan Xu, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur |
| 2015 | Modeling phrasing and prominence using deep recurrent learning. Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran |
| 2015 | Modeling speaker variability using long short-term memory networks for speech recognition. Xiangang Li, Xihong Wu |
| 2015 | Modeling temporal dependency for robust estimation of LP model parameters in speech enhancement. Chun Hoy Wong, Tan Lee, Yu Ting Yeung, Pak-Chung Ching |
| 2015 | Modified-prior PLDA and score calibration for duration mismatch compensation in speaker recognition system. Qingyang Hong, Lin Li, Ming Li, Ling Huang, Lihong Wan, Jun Zhang |
| 2015 | Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis. Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, Satoshi Nakamura |
| 2015 | Morphological and acoustic analysis of the vocal tract using a multi-speaker volumetric MRI dataset. Tokihiko Kaburagi |
| 2015 | Morphology of vocal affect bursts: exploring expressive interjections in Japanese conversation. Hiroki Mori |
| 2015 | Multi-channel speaker verification based on total variability modelling. Maria Joana Correia, Alessio Brutti, Alberto Abad |
| 2015 | Multi-language hypotheses ranking and domain tracking for open domain dialogue systems. Paul A. Crook, Jean-Philippe Robichaud, Ruhi Sarikaya |
| 2015 | Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement. Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee |
| 2015 | Multi-resolution stacking for speech separation based on boosted DNN. Xiao-Lei Zhang, DeLiang Wang |
| 2015 | Multi-softmax deep neural network for semi-supervised training. Hang Su, Haihua Xu |
| 2015 | Multi-stream long short-term memory neural network language model. Ebru Arisoy, Murat Saraçlar |
| 2015 | Multi-task learning deep neural networks for speech feature denoising. Bin Huang, Dengfeng Ke, Hao Zheng, Bo Xu, Yanyan Xu, Kaile Su |
| 2015 | Multi-task learning for text-dependent speaker verification. Nanxin Chen, Yanmin Qian, Kai Yu |
| 2015 | Multidimensional evaluation and predicting overall speech quality. Jens Berger, Anna Llagostera |
| 2015 | Multilingual bottleneck features for language recognition. Radek Fér, Pavel Matejka, Frantisek Grézl, Oldrich Plchot, Jan Cernocký |
| 2015 | Multilingual features based keyword search for very low-resource languages. Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney |
| 2015 | Multilingual tandem bottleneck feature for language identification. Wang Geng, Jie Li, Shanshan Zhang, Xinyuan Cai, Bo Xu |
| 2015 | Multimodal read-aloud ebooks for language learning. Xavier Anguera |
| 2015 | Multiple feed-forward deep neural networks for statistical parametric speech synthesis. Shinji Takaki, Sangjin Kim, Junichi Yamagishi, JongJin Kim |
| 2015 | Multiscale recurrent neural network based language model. Tsuyoshi Morioka, Tomoharu Iwata, Takaaki Hori, Tetsunori Kobayashi |
| 2015 | Mutually exclusive grounding for weakly supervised non-negative matrix factorisation. Vincent Renkens, Hugo Van hamme |
| 2015 | NIST language recognition evaluation - plans for 2015. Alvin F. Martin, Craig S. Greenberg, John M. Howard, Désiré Bansé, George R. Doddington, Jaime Hernandez-Cordero, Lisa P. Mason |
| 2015 | Nao is doing humour in the CHIST-ERA joker project. Guillaume Dubuisson Duplessis, Lucile Bechade, Mohamed A. Sehili, Agnès Delaborde, Vincent Letard, Anne-Laure Ligozat, Paul Deléglise, Yannick Estève, Sophie Rosset, Laurence Devillers |
| 2015 | Neolexon - a therapy app for patients with aphasia. Jakob Pfab, Hanna Jakob, Mona Späth, Christoph Draxler |
| 2015 | Neural higher-order factors in conditional random fields for phoneme classification. Martin Ratajczak, Sebastian Tschiatschek, Franz Pernkopf |
| 2015 | Neuromorphic based oscillatory device for incremental syllable boundary detection. Alexandre Hyafil, Milos Cernak |
| 2015 | News talk-show chaptering with journalistic genres. Delphine Charlet, Géraldine Damnati, Jérémy Trione |
| 2015 | Noise robust exemplar matching for speech enhancement: applications to automatic speech recognition. Emre Yilmaz, Deepak Baby, Hugo Van hamme |
| 2015 | Noise robust speaker recognition with convolutive sparse coding. Antti Hurmalainen, Rahim Saeidi, Tuomas Virtanen |
| 2015 | Noise-matched training of CRF based sentence end detection models. Madina Hasan, Rama Doddipatla, Thomas Hain |
| 2015 | Noise-robust speaker recognition based on morphological component analysis. Yongjun He, Chen Chen, Jiqing Han |
| 2015 | Non-audible murmur enhancement based on statistical conversion using air- and body-conductive microphones in noisy environments. Yusuke Tajiri, Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2015 | Non-linear PLDA for i-vector speaker verification. Sergey Novoselov, Timur Pekhovsky, Oleg Kudashev, Valentin S. Mendelev, Alexey Prudnikov |
| 2015 | Non-native speech synthesis preserving speaker individuality based on partial correction of prosodic and phonetic characteristics. Yuji Oshima, Shinnosuke Takamichi, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2015 | Novel clustering selection criterion for fast binary key speaker diarization. Héctor Delgado, Xavier Anguera, Corinne Fredouille, Javier Serrano |
| 2015 | Objective intelligibility assessment of text-to-speech systems through utterance verification. Raphael Ullmann, Ramya Rasipuram, Mathew Magimai-Doss, Hervé Bourlard |
| 2015 | Objective measures for predicting the intelligibility of spectrally smoothed speech with artificial excitation. Danny Websdale, Thomas Le Cornu, Ben Milner |
| 2015 | Objective study of the performance degradation in emotion recognition through the AMR-WB+ codec. Aaron Albin, Elliot Moore |
| 2015 | On compressibility of neural network phonological features for low bit rate speech coding. Afsaneh Asaei, Milos Cernak, Hervé Bourlard |
| 2015 | On efficient training of word classes and their application to recurrent neural network language models. Rami Botros, Kazuki Irie, Martin Sundermeyer, Hermann Ney |
| 2015 | On evaluation metrics for social signal detection. Gábor Gosztolya |
| 2015 | On glottal source shape parameter transformation using a novel deterministic and stochastic speech analysis and synthesis system. Stefan Huber, Axel Roebel |
| 2015 | On optimal smoothing in minimum statistics based noise tracking. Aleksej Chinaev, Reinhold Haeb-Umbach |
| 2015 | On representation learning for artificial bandwidth extension. Matthias Zöhrer, Robert Peharz, Franz Pernkopf |
| 2015 | On speaker adaptation of long short-term memory recurrent neural networks. Yajie Miao, Florian Metze |
| 2015 | On speech intelligibility estimation of phase-aware single-channel speech enhancement. Andreas Gaich, Pejman Mowlaee |
| 2015 | On the incompatibility of trilling and palatalization: a single-subject study of sustained apical and uvular trills. Alexei Kochetov, Phil Howson |
| 2015 | On the nature of the features generated in the human auditory pathway for phone recognition. Harald Höge |
| 2015 | On the need of template protection for voice authentication. Carlos Vaquero, Patricia Rodríguez |
| 2015 | Online Lombard adaptation in incremental speech synthesis. Sebastian Rottschäfer, Hendrik Buschmeier, Herwin van Welbergen, Stefan Kopp |
| 2015 | Optical sensor calibration for electro-optical stomatography. Simon Preuß, Peter Birkholz |
| 2015 | PATSY - it's all about pronunciation! Caroline Kaufhold, Vadim Gamidov, Andreas Kießling, Klaus Reinhard, Elmar Nöth |
| 2015 | Paragraph vector based topic model for language model adaptation. Wengong Jin, Tianxing He, Yanmin Qian, Kai Yu |
| 2015 | Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study. Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li |
| 2015 | Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling. Chao Zhang, Philip C. Woodland |
| 2015 | Parameterization of prosodic headedness. Uwe D. Reichel, Katalin Mády, Stefan Benus |
| 2015 | Parkinson's condition estimation using speech acoustic and inversely mapped articulatory data. Seongjun Hahm, Jun Wang |
| 2015 | Perception and production of vowel contrasts in German learners of English. Helena Levy |
| 2015 | Perception of French speakers' German vowels. Frank Zimmerer, Jürgen Trouvain |
| 2015 | Perception of Italian liquids by Japanese listeners: comparisons to Spanish liquids. Tomohiko Ooigawa |
| 2015 | Perception of Mandarin tones by native Tibetan speakers. Wenfu Bao, Hui Feng, Jianwu Dang, Zhilei Liu, Yang Yu, Siyu Wang |
| 2015 | Perception of an existing and non-existing L2 English phoneme behind noise by Japanese native speakers. Mako Ishida, Takayuki Arai |
| 2015 | Perception of voicing in the absence of native voicing experience. Rikke Louise Bundgaard-Nielsen, Brett Baker |
| 2015 | Perceptual cues of whispered tones: are they really special? Li Jiao, Qiuwu Ma, Ting Wang, Yi Xu |
| 2015 | Perceptual speech quality dimensions in a conversational situation. Friedemann Köster, Sebastian Möller |
| 2015 | Personalization of word-phrase-entity language models. Michael Levit, Andreas Stolcke, R. Subba, Sarangarajan Parthasarathy, Shuangyu Chang, S. Xie, T. Anastasakos, Benoît Dumoulin |
| 2015 | Personalized speech recognizer with keyword-based personalized lexicon and language model using word vector representations. Ching-Feng Yeh, Yuan-ming Liou, Hung-yi Lee, Lin-Shan Lee |
| 2015 | Personalized synthetic voices for speaking impaired: website and app. Daniel Erro, Inma Hernáez, Agustín Alonso, D. García-Lorenzo, Eva Navas, Jianpei Ye, Haritz Arzelus, Igor Jauk, Nguyen Quy Hy, Carmen Magariños, R. Pérez-Ramón, M. Sulír, Xiaohai Tian, X. Wang |
| 2015 | Phase perception of the glottal excitation of vocoded speech. Tuomo Raitio, Lauri Juvela, Antti Suni, Martti Vainio, Paavo Alku |
| 2015 | Phone-centric local variability vector for text-constrained speaker verification. Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai |
| 2015 | Phonemes frequency based PLLR dimensionality reduction for language recognition. Saad Irtza, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah, Haizhou Li |
| 2015 | Phonetic-phonological feature emerges by associating phonetic with semantic information - a GSOM-based modeling study. Mengxue Cao, Aijun Li, Qiang Fang, Bernd J. Kröger |
| 2015 | Phonetic/linguistic web services at BAS. Thomas Kisler, Florian Schiel, Uwe D. Reichel, Christoph Draxler |
| 2015 | Phonology-augmented statistical transliteration for low-resource languages. Hoang Gia Ngo, Nancy F. Chen, Binh Minh Nguyen, Bin Ma, Haizhou Li |
| 2015 | Phontasia - a game for training German orthography. Kay Berkling, Nadine Pflaumer, Alexei Coyplove |
| 2015 | Phrase accentuation verification and phonetic variation measurement for the degree of nativeness sub-challenge. Claude Montacié, Marie-José Caraty |
| 2015 | Pitch accent distribution in German infant-directed speech. Katharina Zahner, Muna Pohl, Bettina Braun |
| 2015 | Pitch declination and reset as a function of utterance duration in conversational speech data. Céline De Looze, Irena Yanushevskaya, Andy Murphy, Eoghan O'Connor, Christer Gobl |
| 2015 | Pitch scaling as a perceptual cue for questions in German. Jan Michalsky |
| 2015 | Pitch-based speech perturbation measures using a novel GCI detection algorithm: application to pathological voice classification. Khalid Daoudi, Ashwini Jaya Kumar |
| 2015 | Polysyllabic shortening and word-final lengthening in English. Andreas Windmann, Juraj Simko, Petra Wagner |
| 2015 | Positional language modeling for extractive broadcast news speech summarization. Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu |
| 2015 | Predicting therapist empathy in motivational interviews using language features inspired by psycholinguistic norms. James Gibson, Nikolaos Malandrakis, Francisco Romero, David C. Atkins, Shrikanth S. Narayanan |
| 2015 | Prediction of heart rate changes from speech features during interaction with a misbehaving dialog system. Andreas Tsiartas, Andreas Kathol, Elizabeth Shriberg, Massimiliano de Zambotti, Adrian Willoughby |
| 2015 | Prediction of speech recognition accuracy for utterance classification. Maxim L. Korenevsky, Andrey B. Smirnov, Valentin S. Mendelev |
| 2015 | Preserving word-level emphasis in speech-to-speech translation using linear regression HSMMs. Quoc Truong Do, Shinnosuke Takamichi, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura |
| 2015 | Probabilistic linear discriminant analysis for robust speaker identification in co-channel speech. Navid Shokouhi, John H. L. Hansen |
| 2015 | Production inconsistencies delay adaptation to foreign accents. Ann-Kathrin Grohe, Gregory J. Poarch, Adriana Hanulíková, Andrea Weber |
| 2015 | Productions of /h/ in German: French vs. German speakers. Frank Zimmerer, Jürgen Trouvain |
| 2015 | Pronunciation accuracy and intelligibility of non-native speech. Anastassia Loukina, Melissa Lopez, Keelan Evanini, David Suendermann-Oeft, Alexei V. Ivanov, Klaus Zechner |
| 2015 | Pronunciation and silence probability modeling for ASR. Guoguo Chen, Hainan Xu, Minhua Wu, Daniel Povey, Sanjeev Khudanpur |
| 2015 | Prosodic (non-)realisation of broad, narrow and contrastive focus in Hungarian: a production and a perception study. Katalin Mády |
| 2015 | Prosodic characteristics of read speech before and after treadmill running. Jürgen Trouvain, Khiet P. Truong |
| 2015 | Prosodic phrasing unique to the acquisition of L2 intonation - an analysis of L2 Japanese intonation by L1 Swedish learners. Yasuko Nagano-Madsen |
| 2015 | Prosodically-enhanced recurrent neural network language models. Siva Reddy Gangireddy, Steve Renals, Yoshihiko Nankaku, Akinobu Lee |
| 2015 | Providing objective metrics of team communication skills via interpersonal coordination mechanisms. Céline De Looze, Brian Vaughan, Finnian Kelly, Alison M. Kay |
| 2015 | Pruning redundant synthesis units based on static and delta unit appearance frequency. Heng Lu, Wei Zhang, Xu Shao, Quan Zhou, Wenhui Lei, Hongbin Zhou, Andrew P. Breen |
| 2015 | Pruning sparse non-negative matrix n-gram language models. Joris Pelemans, Noam Shazeer, Ciprian Chelba |
| 2015 | QAT. Ahmed Abdelali, Ahmed M. Ali, Francisco Guzmán, Felix Stahlberg, Stephan Vogel, Yifan Zhang |
| 2015 | Quantifying difference in vocalizations of bird populations. Colm O'Reilly, Nicola M. Marples, David J. Kelly, Naomi Harte |
| 2015 | RNN-based labeled data generation for spoken language understanding. Yik-Cheung Tam, Yangyang Shi, Hunk Chen, Mei-Yuh Hwang |
| 2015 | Random forest-based prediction of Parkinson's disease progression using acoustic, ASR and intelligibility features. Alexander Zlotnik, Juan Manuel Montero, Rubén San Segundo, Ascensión Gallardo-Antolín |
| 2015 | Random forests for statistical speech synthesis. Alan W. Black, Prasanna Kumar Muthukumar |
| 2015 | Rapid adaptation for deep neural networks through multi-task learning. Zhen Huang, Jinyu Li, Sabato Marco Siniscalchi, I-Fan Chen, Ji Wu, Chin-Hui Lee |
| 2015 | Rapid vocabulary addition to context-dependent decoder graphs. Cyril Allauzen, Michael Riley |
| 2015 | Real-time audio signal enhancement for hands-free speech applications. Thomas Fehér, Michael Freitag, Christian Gruber |
| 2015 | Real-time audio-to-score alignment of singing voice based on melody and lyric information. Rong Gong, Philippe Cuvillier, Nicolas Obin, Arshia Cont |
| 2015 | Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study. Florent Bocquelet, Thomas Hueber, Laurent Girin, Christophe Savariaux, Blaise Yvert |
| 2015 | Real-time integration of dynamic context information for improving automatic speech recognition. Youssef Oualil, Marc Schulder, Hartmut Helmke, Anna Schmidt, Dietrich Klakow |
| 2015 | Real-time pitch modification system for speech and singing voice. Elias Azarov, Maxim Vashkevich, Denis Likhachov, Alexander A. Petrovsky |
| 2015 | Recognition of voiced sounds with a continuous state HMM. S. M. Houghton, Colin J. Champion, Philip Weber |
| 2015 | Recognize foreign low-frequency words with similar pairs. Xi Ma, Xiaoxi Wang, Dong Wang, Zhiyong Zhang |
| 2015 | Reconstructing intelligible audio speech from visual speech features. Thomas Le Cornu, Ben Milner |
| 2015 | Reconstructing voices within the multiple-average-voice-model framework. Pierre Lanchantin, Christophe Veaux, Mark J. F. Gales, Simon King, Junichi Yamagishi |
| 2015 | Rectified linear neural networks with tied-scalar regularization for LVCSR. Shiliang Zhang, Hui Jiang, Si Wei, Li-Rong Dai |
| 2015 | Recurrent neural network and LSTM models for lexical utterance classification. Suman V. Ravuri, Andreas Stolcke |
| 2015 | Recurrent neural network language model adaptation for multi-genre broadcast speech recognition. Xie Chen, Tian Tan, Xunying Liu, Pierre Lanchantin, M. Wan, Mark J. F. Gales, Philip C. Woodland |
| 2015 | Recurrent neural networks for incremental disfluency detection. Julian Hough, David Schlangen |
| 2015 | Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals. Sebastian Gergen, Anil M. Nagathil, Rainer Martin |
| 2015 | Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation. Shaofei Zhang, Dong-Yan Huang, Lei Xie, Engsiong Chng, Haizhou Li, Minghui Dong |
| 2015 | Regularized sequence-level deep neural network model adaptation. Yan Huang, Yifan Gong |
| 2015 | Relative phase information for detecting human speech and spoofed speech. Longbiao Wang, Yohei Yoshida, Yuta Kawakami, Seiichi Nakagawa |
| 2015 | Relevance vector machine for depression prediction. Nicholas Cummins, Vidhyasaharan Sethu, Julien Epps, Jarek Krajewski |
| 2015 | Relieving mental stress of speakers using a tele-operated robot in foreign language speech education. Shizuka Nakamura, Miki Watanabe, Yuichiro Yoshikawa, Kohei Ogawa, Hiroshi Ishiguro |
| 2015 | Remeeting - get more out of meetings. Arlo Faria, Korbinian Riedhammer |
| 2015 | Representing nonspeech audio signals through speech classification models. Huy Phan, Lars Hertel, Marco Maaß, Radoslaw Mazur, Alfred Mertins |
| 2015 | Reverberation robust acoustic modeling using i-vectors with time delay neural networks. Vijayaditya Peddinti, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur |
| 2015 | Reverberation-robust acoustic indoor localization. Jae Choi, Jeunghun Kim, Shin Jae Kang, Nam Soo Kim |
| 2015 | Rhythm influences the tonal realisation of focus. Nadja Schauffler, Katrin Schweitzer |
| 2015 | Robust and accurate LSF location with Laguerre method. Michal Lenarczyk |
| 2015 | Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge. Nanxin Chen, Yanmin Qian, Heinrich Dinkel, Bo Chen, Kai Yu |
| 2015 | Robust features for sonorant segmentation in continuous speech. Sri Harsha Dumpala, Bhanu Teja Nellore, Raghu Ram Nevali, Suryakanth V. Gangashetty, Bayya Yegnanarayana |
| 2015 | Robust i-vector based adaptation of DNN acoustic model for speech recognition. Sri Garimella, Arindam Mandal, Nikko Strom, Björn Hoffmeister, Spyros Matsoukas, Sree Hari Krishnan Parthasarathi |
| 2015 | Robust i-vector extraction for neural network adaptation in noisy environment. Chengzhu Yu, Atsunori Ogawa, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, John H. L. Hansen |
| 2015 | Robust localization of single sound source based on phase difference regression. Zhaoqiong Huang, Ge Zhan, Dongwen Ying, Yonghong Yan |
| 2015 | Robust parameter estimation for audio declipping in noise. Mark J. Harvilla, Richard M. Stern |
| 2015 | Robust pitch estimation in noisy speech using ZTW and group delay function. RaviShankar Prasad, Bayya Yegnanarayana |
| 2015 | Robust sound event classification using LBP-HOG based bag-of-audio-words feature representation. Hyungjun Lim, Myung Jong Kim, Hoirin Kim |
| 2015 | Robust speech processing using observation uncertainty and uncertainty propagation: session and paper overview. Ramón Fernandez Astudillo, Shinji Watanabe, Ahmed Hussen Abdelaziz, Dorothea Kolossa |
| 2015 | Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction. Akihiro Abe, Kazumasa Yamamoto, Seiichi Nakagawa |
| 2015 | Robust tongue tracking in ultrasound images: a multi-hypothesis approach. Catherine Laporte, Lucie Ménard |
| 2015 | Robustness in speech quality assessment and temporal training expiry in mobile crowdsourcing environments. Tim Polzehl, Babak Naderi, Friedemann Köster, Sebastian Möller |
| 2015 | Robustness to additive noise of locally-normalized cepstral coefficients in speaker verification. Josué Fredes, José Novoa, Víctor Poblete, Simon King, Richard M. Stern, Néstor Becerra Yoma |
| 2015 | SABR: sparse, anchor-based representation of the speech signal. Christopher Liberatore, Sandesh Aryal, Zelun Wang, Seth Polsley, Ricardo Gutierrez-Osuna |
| 2015 | SARMATA 2.0 automatic Polish language speech recognition system. Bartosz Ziólko, Tomasz Jadczyk, Dawid Skurzok, Piotr Zelasko, Jakub Galka, Tomasz Pedzimaz, Ireneusz Gawlik, Szymon Piotr Palka |
| 2015 | SNR-invariant PLDA modeling for robust speaker verification. Na Li, Man-Wai Mak |
| 2015 | SVD-based universal DNN modeling for multiple scenarios. Changliang Liu, Jinyu Li, Yifan Gong |
| 2015 | SVitchboard II and FiSVer I: high-quality limited-complexity corpora of conversational English speech. Yuzong Liu, Rishabh K. Iyer, Katrin Kirchhoff, Jeff A. Bilmes |
| 2015 | Salient dimensions in implicit phonotactic learning. Elise Michon, Emmanuel Dupoux, Alejandrina Cristià |
| 2015 | Scalable distributed DNN training using commodity GPU cloud computing. Nikko Strom |
| 2015 | Score stabilization for speaker recognition trained on a small development set. Hagai Aronowitz |
| 2015 | Second language speech recognition using multiple-pass decoding with lexicon represented by multiple reduced phoneme sets. Xiaoyun Wang, Seiichi Yamamoto |
| 2015 | Segment-dependent dynamics in predicting Parkinson's disease. James R. Williamson, Thomas F. Quatieri, Brian S. Helfer, Joseph Perricone, Satrajit S. Ghosh, Gregory A. Ciccarelli, Daryush D. Mehta |
| 2015 | Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition. Yanzhang He, Eric Fosler-Lussier |
| 2015 | Segmental contribution to the intelligibility of ideal binary-masked sentences. Fei Chen, Alexander Siu Tai Kwok |
| 2015 | Selection and aggregation techniques for crowdsourced semantic annotation task. Shammur Absar Chowdhury, Marcos Calvo, Arindam Ghosh, Evgeny A. Stepanov, Ali Orkan Bayer, Giuseppe Riccardi, Fernando García, Emilio Sanchis Arnal |
| 2015 | Semantic analysis of spoken input using Markov logic networks. Vladimir Despotovic, Oliver Walter, Reinhold Haeb-Umbach |
| 2015 | Semantic retrieval of personal photos using a deep autoencoder fusing visual features with speech annotations represented as word/paragraph vectors. Hung-tsung Lu, Yuan-ming Liou, Hung-yi Lee, Lin-Shan Lee |
| 2015 | Semi-supervised maximum mutual information training of deep neural network acoustic models. Vimal Manohar, Daniel Povey, Sanjeev Khudanpur |
| 2015 | Semi-supervised training of a voice conversion mapping function using a joint-autoencoder. Seyed Hamidreza Mohammadi, Alexander Kain |
| 2015 | Sentence-level control vectors for deep neural network speech synthesis. Oliver Watts, Zhizheng Wu, Simon King |
| 2015 | Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis. Yuchen Fan, Yao Qian, Frank K. Soong, Lei He |
| 2015 | Sequence-based class tagging for robust transcription in ASR. Lucy Vasserman, Vlad Schogol, Keith B. Hall |
| 2015 | Sequence-to-sequence neural net models for grapheme-to-phoneme conversion. Kaisheng Yao, Geoffrey Zweig |
| 2015 | Simultaneous optimization of multiple tree structures for factor analyzed HMM-based speech synthesis. Takenori Yoshimura, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda |
| 2015 | Simultaneous utilization of spectral magnitude and phase information to extract supervectors for speaker verification anti-spoofing. Yi Liu, Yao Tian, Liang He, Jia Liu, Michael T. Johnson |
| 2015 | Smarter driving with IDA, the intelligent driving assistant for Singapore. Andreea I. Niculescu, Ngoc Thuy Huong Thai, Chongjia Ni, Boon Pang Lim, Kheng Hui Yeo, Rafael E. Banchs |
| 2015 | Sound source separation algorithm using phase difference and angle distribution modeling near the target. Chanwoo Kim, Kean K. Chin |
| 2015 | Source-filter separation of speech signal in the phase domain. Erfan Loweimi, Jon Barker, Thomas Hain |
| 2015 | Sparse coding based features for speech units classification. Pulkit Sharma, Vinayak Abrol, Aroor Dinesh Dileep, Anil Kumar Sao |
| 2015 | Sparse coding of total variability matrix. Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang |
| 2015 | Sparse modeling of posterior exemplars for keyword detection. Dhananjay Ram, Afsaneh Asaei, Pranay Dighe, Hervé Bourlard |
| 2015 | Sparse non-negative matrix language modeling for skip-grams. Noam Shazeer, Joris Pelemans, Ciprian Chelba |
| 2015 | Sparse representation with temporal max-smoothing for acoustic event detection. Xugang Lu, Peng Shen, Yu Tsao, Chiori Hori, Hisashi Kawai |
| 2015 | Speaker adaptation of convolutional neural network using speaker specific subspace vectors of SGMM. Murali Karthick B, Prateek Kolhar, Srinivasan Umesh |
| 2015 | Speaker adaptation using only vocalic segments via frequency warping. Agustín Alonso, Daniel Erro, Eva Navas, Inma Hernáez |
| 2015 | Speaker adaptation using relevance vector regression for HMM-based expressive TTS. Doo Hwa Hong, Joun Yeop Lee, Se Young Jang, Nam Soo Kim |
| 2015 | Speaker adaptation using the i-vector technique for bottleneck features. Patrick Cardinal, Najim Dehak, Yu Zhang, James R. Glass |
| 2015 | Speaker diarization with i-vectors from DNN senone posteriors. Gregory Sell, Daniel Garcia-Romero, Alan McCree |
| 2015 | Speaker recognition by means of acoustic and phonetically informed GMMs. Sandro Cumani, Pietro Laface, Farzana Kulsoom |
| 2015 | Speaker recognition for speech under face cover. Rahim Saeidi, Tuija Niemi, Hanna Karppelin, Jouni Pohjalainen, Tomi Kinnunen, Paavo Alku |
| 2015 | Speaker verification using Gaussian posteriorgrams on fixed phrase short utterances. Sarfaraz Jelil, Rohan Kumar Das, Rohit Sinha, S. R. Mahadeva Prasanna |
| 2015 | Speaker-dependent multipitch tracking using deep neural networks. Yuzhou Liu, DeLiang Wang |
| 2015 | Speaker-independent silent speech recognition with across-speaker articulatory normalization and speaker adaptive training. Jun Wang, Seongjun Hahm |
| 2015 | Spectrally selective dithering for distorted speech recognition. Michal Borsky, Petr Mizera, Petr Pollák |
| 2015 | Spectrographic speech mask estimation using the time-frequency correlation of speech presence. Ge Zhan, Zhaoqiong Huang, Dongwen Ying, Jielin Pan, Yonghong Yan |
| 2015 | Speech bandwidth expansion based on deep neural networks. Yingxue Wang, Shenghui Zhao, Wenbo Liu, Ming Li, Jingming Kuang |
| 2015 | Speech dereverberation using long short-term memory. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara |
| 2015 | Speech emotion classification using tree-structured sparse logistic regression. Myung Jong Kim, Joohong Yoo, Younggwan Kim, Hoirin Kim |
| 2015 | Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks. Zhuo Chen, Shinji Watanabe, Hakan Erdogan, John R. Hershey |
| 2015 | Speech intelligibility decline in individuals with fast and slow rates of ALS progression. Panying Rong, Yana Yunusova, Jordan R. Green |
| 2015 | Speech planning in 4-year-old children versus adults: acoustic and articulatory analyses. Guillaume Barbier, Pascal Perrier, Lucie Ménard, Yohan Payan, Mark K. Tiede, Joseph S. Perkell |
| 2015 | Speech quality evaluation of artificial bandwidth extension: comparing subjective judgments and instrumental predictions. Hannu Pulakka, Ville Myllylä, Anssi Rämö, Paavo Alku |
| 2015 | Speech recognition with temporal neural networks. Payton Lin, Dau-Cheng Lyu, Yun-Fan Chang, Yu Tsao |
| 2015 | Speech reconstruction from human auditory cortex with deep neural networks. Minda Yang, Sameer A. Sheth, Catherine A. Schevon, Guy M. McKhann II, Nima Mesgarani |
| 2015 | Speech technologies for African languages: example of a multilingual calculator for education. Laurent Besacier, Elodie Gauthier, Mathieu Mangeot, Philippe Bretier, Paul C. Bagshaw, Olivier Rosec, Thierry Moudenc, François Pellegrino, Sylvie Voisin, Egidio Marsico, Pascal Nocera |
| 2015 | Speech-based assessment of PTSD in a military population using diverse feature classes. Dimitra Vergyri, Bruce Knoth, Elizabeth Shriberg, Vikramjit Mitra, Mitchell McLaren, Luciana Ferrer, Pablo Garcia, Charles Marmar |
| 2015 | Speech-based location estimation of first responders in a simulated search and rescue scenario. Saeid Mokaram, Roger K. Moore |
| 2015 | Speed or accuracy? a study in evaluation of simultaneous speech translation. Takashi Mieno, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura |
| 2015 | Spiking neural networks and the generalised Hough transform for speech pattern detection. Jonathan William Dennis, Tran Huy Dat, Haizhou Li |
| 2015 | Spoofing countermeasure based on analysis of linear prediction error. Artur Janicki |
| 2015 | Spoofing detection with DNN and one-class SVM for the ASVspoof 2015 challenge. Jesús Antonio Villalba López, Antonio Miguel, Alfonso Ortega, Eduardo Lleida |
| 2015 | Spoofing speech detection using high dimensional magnitude and phase features: the NTU approach for ASVspoof 2015 challenge. Xiong Xiao, Xiaohai Tian, Steven Du, Haihua Xu, Engsiong Chng, Haizhou Li |
| 2015 | Stable and unstable intervals as a basic segmentation procedure of the speech signal. Ulrike Glavitsch, Lei He, Volker Dellwo |
| 2015 | Stacked auto-encoder for ASR error detection and word error rate prediction. Shahab Jalalvand, Daniele Falavigna |
| 2015 | Statistical acoustic-to-articulatory mapping unified with speaker normalization based on voice conversion. Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose |
| 2015 | Statistical singing voice conversion based on direct waveform modification with global variance. Kazuhiro Kobayashi, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2015 | Still together?: the role of acoustic features in predicting marital outcome. Md. Nasir, Wei Xia, Bo Xiao, Brian R. Baucom, Shrikanth S. Narayanan, Panayiotis G. Georgiou |
| 2015 | Stress level detection using double-layer subband filter. Tin Lay Nwe, Qianli Xu, Cuntai Guan, Bin Ma |
| 2015 | Stressed out: what speech tells us about stress. Will Paul, Cecilia Ovesdotter Alm, Reynold J. Bailey, Joe Geigel, Linwei Wang |
| 2015 | Structured output layer with auxiliary targets for context-dependent acoustic modelling. Pawel Swietojanski, Peter Bell, Steve Renals |
| 2015 | Structured prediction for speaker identification in TV series. Elena Knyazeva, Guillaume Wisniewski, Hervé Bredin, François Yvon |
| 2015 | Structuring lectures in massive open online courses (MOOCs) for efficient learning by linking similar sections and predicting prerequisites. Sheng-syun Shen, Hung-yi Lee, Shang-wen Li, Victor Zue, Lin-Shan Lee |
| 2015 | Study of acoustic correlates of English lexical stress produced by native (L1) Bengali speakers compared to native (L1) English speakers. Shambhu Nath Saha, Shyamal Kr. Das Mandal |
| 2015 | Study of entity-topic models for OOV proper name retrieval. Imran A. Sheikh, Irina Illina, Dominique Fohr |
| 2015 | Stylex: a corpus of educational videos for research on speaking styles and their impact on engagement and learning. Harish Arsikere, Sonal Patil, Ranjeet Kumar, Kundan Shrivastava, Om Deshmukh |
| 2015 | Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum. Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki, Hideyuki Mizuno |
| 2015 | Swiss graphogame: concept and design presentation of a computerised reading intervention for children with high risk for poor reading outcomes. Martina Röthlisberger, Iliana I. Karipidis, Georgette Pleisch, Volker Dellwo, Ulla Richardson, Silvia Brem |
| 2015 | Synchronous overlap and add of spectra for enhancement of excitation in artificial bandwidth extension of speech. M. A. Tugtekin Turan, Engin Erzin |
| 2015 | System fusion for high-performance voice conversion. Xiaohai Tian, Zhizheng Wu, Siu Wa Lee, Nguyen Quy Hy, Minghui Dong, Engsiong Chng |
| 2015 | System supporting speaker identification in emergency call center. Jakub Galka, Joanna Grzybowska, Magdalena Igras, Pawel Jaciów, Kamil Wajda, Marcin Witkowski, Mariusz Ziólko |
| 2015 | Systematic integration of acoustic echo canceller and noise reduction modules for voice communication systems. Hyeonjoo Kang, JeeSok Lee, Soonho Baek, Hong-Goo Kang |
| 2015 | TDTO language modeling with feedforward neural networks. Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li |
| 2015 | Talk it out: adding speech interaction to support informational and transactional applications on public touch-screen kiosks. Kheng Hui Yeo, Rafael E. Banchs |
| 2015 | Temporal dynamics of the speech readiness potential, and its use in a neural decoder of speech-motor intention. Jonathan S. Brumberg, Nichol Castro, Akshatha Rao |
| 2015 | Text-informed speech enhancement with deep neural networks. Keisuke Kinoshita, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani |
| 2015 | The AHOLAB RPS SSD spoofing challenge 2015 submission. Jon Sánchez, Ibon Saratxaga, Inma Hernáez, Eva Navas, Daniel Erro |
| 2015 | The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation. Xunying Liu, Federico Flego, Linlin Wang, Chao Zhang, Mark J. F. Gales, Philip C. Woodland |
| 2015 | The HBP-atlas - concept, perspectives, and application for language and speech research. Katrin Amunts |
| 2015 | The IBM 2015 English conversational telephone speech recognition system. George Saon, Hong-Kwang Jeff Kuo, Steven J. Rennie, Michael Picheny |
| 2015 | The IBM BOLT speech transcription system. Samuel Thomas, George Saon, Hong-Kwang Jeff Kuo, Lidia Mangu |
| 2015 | The INTERSPEECH 2015 computational paralinguistics challenge: a summary of results. Stefan Steidl |
| 2015 | The INTERSPEECH 2015 computational paralinguistics challenge: nativeness, Parkinson's & eating condition. Björn W. Schuller, Stefan Steidl, Anton Batliner, Simone Hantke, Florian Hönig, Juan Rafael Orozco-Arroyave, Elmar Nöth, Yue Zhang, Felix Weninger |
| 2015 | The QUT-NOISE-SRE protocol for the evaluation of noisy speaker recognition. David Dean, Ahilan Kanagasundaram, Houman Ghaemmaghami, Md. Hafizur Rahman, Sridha Sridharan |
| 2015 | The acoustics of word stress in English as a function of stress level and speaking style. Anders Eriksson, Mattias Heldner |
| 2015 | The degree of nativeness sub-challenge: the data. Florian Hönig |
| 2015 | The development of categorical perception of lexical tones in Mandarin-speaking preschoolers. Fei Chen, Nan Yan, Lan Wang, Tao Yang, Jiantao Wu, Han Zhao, Gang Peng |
| 2015 | The discourse value of social signals at topic change moments. Francesca Bonin, Nick Campbell, Carl Vogel |
| 2015 | The eating condition sub-challenge: the data. Anton Batliner |
| 2015 | The effect of cochlear implant processing on speaker intelligibility: a perceptual study and computer model. Lin Lin, Jon Barker, Guy J. Brown |
| 2015 | The effect of high-variability training on the perception and production of French stops by German native speakers. Jeanin Jügler, Frank Zimmerer, Bernd Möbius, Christoph Draxler |
| 2015 | The effect of soft, modal and loud voice levels on entrainment in noisy conditions. Éva Székely, Mark T. Keane, Julie Carson-Berndsen |
| 2015 | The effect of speakers' regional varieties on listeners' decision-making. Adrian Leemann, Camilla Bernardasci, Francis Nolan |
| 2015 | The effect of spectral slope on pitch perception. Jianjing Kuang, Mark Y. Liberman |
| 2015 | The effect of stress on vowel space in Daxi Hakka Chinese. Chunan Qiu, Jie Liang |
| 2015 | The emergence of compositional structure in language evolution and development. Mary E. Beckman |
| 2015 | The emergence of nasal velar codas in Brazilian Portuguese: an rt-MRI study. Marissa S. Barlaz, Maojing Fu, Zhi-Pei Liang, Ryan Shosted, Bradley P. Sutton |
| 2015 | The intonation of echo wh-questions. Sophie Repp, Lena Rosin |
| 2015 | The Parkinson's condition sub-challenge: the data. Juan Rafael Orozco-Arroyave |
| 2015 | The prosodic marking of rhetorical questions in German. Daniela Wochner, Jana Schlegel, Nicole Dehé, Bettina Braun |
| 2015 | The RedDots data collection for speaker recognition. Kong-Aik Lee, Anthony Larcher, Guangsen Wang, Patrick Kenny, Niko Brümmer, David A. van Leeuwen, Hagai Aronowitz, Marcel Kockmann, Carlos Vaquero, Bin Ma, Haizhou Li, Themos Stafylakis, Md. Jahangir Alam, Albert Swart, Javier Perez |
| 2015 | The RedDots platform for mobile crowd-sourcing of speech data. Kong-Aik Lee, Guangsen Wang, Kam Pheng Ng, Hanwu Sun, Trung Hieu Nguyen, Ngoc Thuy Huong Thai, Bin Ma, Haizhou Li |
| 2015 | The relationship between acoustic and perceived intraspeaker variability in voice quality. Jody Kreiman, Soo Jin Park, Patricia A. Keating, Abeer Alwan |
| 2015 | The relationship between voice source parameters and the maxima dispersion quotient (MDQ). Christer Gobl, Irena Yanushevskaya, Ailbhe Ní Chasaide |
| 2015 | The role of prosody and voice quality in text-dependent categories of storytelling across languages. Raúl Montaño, Francesc Alías |
| 2015 | The role of speakers and context in classifying competition in overlapping speech. Shammur Absar Chowdhury, Morena Danieli, Giuseppe Riccardi |
| 2015 | The role of temporal resolution in modulation-based speech segregation. Tobias May, Thomas Bentsen, Torsten Dau |
| 2015 | The speech recognition virtual kitchen turns one. Florian Metze, Eric Riebling, Eric Fosler-Lussier, Andrew R. Plummer, Rebecca Bates |
| 2015 | The technology powering personal digital assistants. Ruhi Sarikaya |
| 2015 | The zero resource speech challenge 2015. Maarten Versteegh, Roland Thiollière, Thomas Schatz, Xuan-Nga Cao, Xavier Anguera, Aren Jansen, Emmanuel Dupoux |
| 2015 | Therapy language analysis using automatically generated psycholinguistic norms. Nikolaos Malandrakis, Shrikanth S. Narayanan |
| 2015 | Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge. Martin Karafiát, Frantisek Grézl, Lukás Burget, Igor Szöke, Jan Cernocký |
| 2015 | Time-frequency kernel-based CNN for speech recognition. Tuo Zhao, Yunxin Zhao, Xin Chen |
| 2015 | Time-frequency masking for large scale robust speech recognition. Yuxuan Wang, Ananya Misra, Kean K. Chin |
| 2015 | Tongue tracking in ultrasound images using eigentongue decomposition and artificial neural networks. Diandra Fabre, Thomas Hueber, Florent Bocquelet, Pierre Badin |
| 2015 | Tools for rapid customization of S2S systems for emergent domains. Rohit Kumar, Matthew E. Roy, Sanjika Hewavitharana, Dennis N. Mehay, Nina Zinovieva |
| 2015 | Topic modeling for conference analytics. Pengfei Liu, Shoaib Jameel, Wai Lam, Bin Ma, Helen M. Meng |
| 2015 | Towards a linear dynamical model based speech synthesizer. Vassilios Tsiaras, Ranniery Maia, Vassilios Diakoloukas, Yannis Stylianou, Vassilios Digalakis |
| 2015 | Towards an automated screening tool for pediatric speech delay. Roozbeh Sadeghian, Stephen A. Zahorian |
| 2015 | Towards automatic detection of reported speech in dialogue using prosodic cues. Alessandra Cervone, Catherine Lai, Silvia Pareti, Peter Bell |
| 2015 | Towards end-to-end speech recognition for Chinese Mandarin using long short-term memory recurrent neural networks. Jie Li, Heng Zhang, Xinyuan Cai, Bo Xu |
| 2015 | Towards minimum perceptual error training for DNN-based speech synthesis. Cassia Valentini-Botinhao, Zhizheng Wu, Simon King |
| 2015 | Towards the prediction of human speaker identification performance from measured speech quality. Laura Fernández Gallardo, Sebastian Möller |
| 2015 | Traditional IVR and visual IVR - killing two birds with one stone. Dmitry Sityaev, Praphul Kumar, Rajesh Ramchander |
| 2015 | Training data selection for acoustic modeling via submodular optimization of joint Kullback-Leibler divergence. Taichi Asami, Ryo Masumura, Hirokazu Masataki, Manabu Okamoto, Sumitaka Sakauchi |
| 2015 | Training deep bidirectional LSTM acoustic model for LVCSR by a context-sensitive-chunk BPTT approach. Kai Chen, Zhi-Jie Yan, Qiang Huo |
| 2015 | Transcribing continuous speech using mismatched crowdsourcing. Preethi Jyothi, Mark Hasegawa-Johnson |
| 2015 | Transferring knowledge from a RNN to a DNN. William Chan, Nan Rosemary Ke, Ian R. Lane |
| 2015 | Tunable keyword-aware language modeling and context dependent fillers for LVCSR-based spoken keyword search. Tze Siong Lau, I-Fan Chen, Chin-Hui Lee |
| 2015 | Two extensions of Umeda and Teranishi's physical models of the human vocal tract. Takayuki Arai |
| 2015 | Two-stage multi-target joint learning for monaural speech separation. Shuai Nie, Shan Liang, Wei Xue, Xueliang Zhang, Wenju Liu, Like Dong, Hong Yang |
| 2015 | Two-step spoken term detection using SVM classifier trained with pre-indexed keywords based on ASR result. Kentaro Domoto, Takehito Utsuro, Naoki Sawada, Hiromitsu Nishizaki |
| 2015 | Typicality and emotion in the voice of children with autism spectrum condition: evidence across three languages. Erik Marchi, Björn W. Schuller, Simon Baron-Cohen, Ofer Golan, Sven Bölte, Prerna Arora, Reinhold Häb-Umbach |
| 2015 | Uncertainty decoding for DNN-HMM hybrid systems based on numerical sampling. Christian Huemmer, Roland Maas, Andreas Schwarz, Ramón Fernandez Astudillo, Walter Kellermann |
| 2015 | Uncertainty propagation for noise robust speaker recognition: the case of NIST-SRE. Dayana Ribas González, Emmanuel Vincent, José Ramón Calvo de Lara |
| 2015 | Uncertainty propagation through deep neural networks. Ahmed Hussen Abdelaziz, Shinji Watanabe, John R. Hershey, Emmanuel Vincent, Dorothea Kolossa |
| 2015 | Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features. Yuuki Tachioka, Shinji Watanabe |
| 2015 | Under-resourced speech recognition based on the speech manifold. Reza Sahraeian, Dirk Van Compernolle, Febe de Wet |
| 2015 | Unintuitive phonetic behavior in Tswana post-nasal stops. Jagoda Bruni, Daniel Duran, Grzegorz Dogil |
| 2015 | Universal grapheme-based speech synthesis. Sunayana Sitaram, Alok Parlikar, Gopala Krishna Anumanchipalli, Alan W. Black |
| 2015 | Unsupervised adaptation for deep neural network using linear least square method. Roger Hsiao, Tim Ng, Stavros Tsakalidis, Long Nguyen, Richard M. Schwartz |
| 2015 | Unsupervised domain discovery using latent Dirichlet allocation for acoustic modelling in speech recognition. Mortaza Doulaty, Oscar Saz, Thomas Hain |
| 2015 | Unsupervised relation detection using automatic alignment of query patterns extracted from knowledge graphs and query click logs. Panupong Pasupat, Dilek Hakkani-Tür |
| 2015 | Unsupervised word discovery from speech using automatic segmentation into syllable-like units. Okko Räsänen, Gabriel Doyle, Michael C. Frank |
| 2015 | Using F0 contours to assess nativeness in a sentence repeat task. Min Ma, Keelan Evanini, Anastassia Loukina, Xinhao Wang, Klaus Zechner |
| 2015 | Using acoustics to improve pronunciation for synthesis of low resource languages. Sunayana Sitaram, Serena Jeblee, Alan W. Black |
| 2015 | Using articulatory features and inferred phonological segments in zero resource speech processing. Pallavi Baljekar, Sunayana Sitaram, Prasanna Kumar Muthukumar, Alan W. Black |
| 2015 | Using audio and visual information for single channel speaker separation. Faheem Khan, Ben Milner |
| 2015 | Using automatic stress extraction from audio for improved prosody modelling in speech synthesis. György Szaszák, András Beke, Gábor Olaszy, Bálint Pál Tóth |
| 2015 | Using deep bidirectional recurrent neural networks for prosodic-target prediction in a unit-selection text-to-speech system. Raul Fernandez, Asaf Rendel, Bhuvana Ramabhadran, Ron Hoory |
| 2015 | Using keyword spotting to help humans correct captioning faster. Yashesh Gaur, Florian Metze, Yajie Miao, Jeffrey P. Bigham |
| 2015 | Using linguistic indicators of difficulty to identify mild cognitive impairment. Rebecca Lunsford, Peter A. Heeman |
| 2015 | Using melody metrics to compare English speech read by native speakers and by L2 Chinese speakers from Shanghai. Daniel Hirst, Hongwei Ding |
| 2015 | Using profile similarity to measure agreement in personality perception. Zoraida Callejas, David Griol |
| 2015 | Using representation learning and out-of-domain data for a paralinguistic speech task. Benjamin Milde, Chris Biemann |
| 2015 | Using resources from a closely-related language to develop ASR for a very under-resourced language: a case study for Iban. Sarah Flora Samson Juan, Laurent Besacier, Benjamin Lecouteux, Mohamed Dyab |
| 2015 | Using semantic maps for robust natural language interaction with robots. Emanuele Bastianelli, Danilo Croce, Roberto Basili, Daniele Nardi |
| 2015 | Using the beat histogram for speech rhythm description and language identification. Athanasios Lykartsis, Stefan Weinzierl |
| 2015 | Using tilt for automatic emphasis detection with Bayesian networks. Yishuang Ning, Zhiyong Wu, Xiaoyan Lou, Helen M. Meng, Jia Jia, Lianhong Cai |
| 2015 | Using voice-quality measurements with prosodic and spectral features for speaker diarization. Abraham Woubie, Jordi Luque, Javier Hernando |
| 2015 | Using word confusion networks for slot filling in spoken language understanding. Xiaohao Yang, Jia Liu |
| 2015 | Valence, arousal and dominance estimation for English, German, Greek, Portuguese and Spanish lexica using semantic models. Elisavet Palogiannidi, Elias Iosif, Polychronis Koutsakis, Alexandros Potamianos |
| 2015 | Validating and optimizing a crowdsourced method for gradient measures of child speech. Tara McAllister Byun, Elaine Hitchcock, Daphna Harel |
| 2015 | Verbal intelligence identification based on text classification. Roman B. Sergienko, Alexander Schmitt |
| 2015 | Very deep convolutional neural networks for LVCSR. Mengxiao Bi, Yanmin Qian, Kai Yu |
| 2015 | Viseme comparison based on phonetic cues for varying speech accents. Chitralekha Bhat, Sunil Kumar Kopparapu |
| 2015 | Visual comparison of speaker groups. Sebastian Wankerl, Florian Hönig, Anton Batliner, Juan Rafael Orozco-Arroyave, Elmar Nöth |
| 2015 | Vocal biomarkers to discriminate cognitive load in a working memory task. Thomas F. Quatieri, James R. Williamson, Christopher J. Smalt, Tejash Patel, Joseph Perricone, Daryush D. Mehta, Brian S. Helfer, Gregory A. Ciccarelli, Darrell O. Ricke, Nicolas Malyska, Jeff Palmer, Kristin Heaton, Marianna Eddy, Joseph Moran |
| 2015 | Vocal separation from monaural music using adaptive auditory filtering based on kernel back-fitting. Jun-Yong Lee, Hye-Seung Cho, Hyoung-Gook Kim |
| 2015 | Vocal tremor analysis via AM-FM decomposition of empirical modes of the glottal cycle length time series. Christophe Mertens, Francis Grenez, François Viallet, Alain Ghio, Sabine Skodda, Jean Schoentgen |
| 2015 | Vocal turn-taking patterns in groups of children performing collaborative tasks: an exploratory study. Jaebok Kim, Khiet P. Truong, Vicky Charisi, Cristina Zaga, Manja Lohse, Dirk Heylen, Vanessa Evers |
| 2015 | Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification. Sayaka Shiota, Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen, Tomoko Matsui |
| 2015 | Voice Äpp: a mobile app for crowdsourcing Swiss German dialect data. Adrian Leemann, Marie-José Kolly, Jean-Philippe Goldman, Volker Dellwo, Ingrid Hove, Ibrahim Almajai, Sarah Grimm, Sylvain Robert, Daniel Wanitsch |
| 2015 | Voice-conditioned allophones of MOUTH and PRICE in Bahamian Creole. Janina Kraus |
| 2015 | Voiced/unvoiced transitions in speech as a potential bio-marker to detect Parkinson's disease. Juan Rafael Orozco-Arroyave, Florian Hönig, Julián D. Arias-Londoño, Jesús Francisco Vargas-Bonilla, Sabine Skodda, Jan Rusz, Elmar Nöth |
| 2015 | Voices of power, passion, and personality. Klaus R. Scherer |
| 2015 | Vowel mispronunciation detection using DNN acoustic models with cross-lingual training. Shrikant Joshi, Nachiket Deo, Preeti Rao |
| 2015 | Weakly-supervised word learning is improved by an active online algorithm. Heikki Rasilo, Okko Räsänen |
| 2015 | Web application system for pronunciation practice by children with disabilities and to support cooperation of teachers and medical workers. Ikuyo Masuda-Katsuse |
| 2015 | Weighted correlation based atom decomposition intonation modelling. Branislav Gerazov, Pierre-Edouard Honnet, Aleksandar Gjoreski, Philip N. Garner |
| 2015 | Word-initial glottal stop insertion, hiatus resolution and linking in British English. Robert Fuchs |
| 2015 | Wrapping up: the story of the ComParE challenges, what we learned and where to go. Anton Batliner |
| 2015 | Wubuy coronal stop perception by speakers of three dialects of Bangla. Rikke Louise Bundgaard-Nielsen, Brett Baker, Olga Maxwell, Janet Fletcher |
| 2015 | Zero-shot semantic parser for spoken language understanding. Emmanuel Ferreira, Bassam Jabaian, Fabrice Lefèvre |
| 2015 | fMLLR based feature-space speaker adaptation of DNN acoustic models. Sree Hari Krishnan Parthasarathi, Björn Hoffmeister, Spyros Matsoukas, Arindam Mandal, Nikko Strom, Sri Garimella |
| 2015 | iCALL corpus: Mandarin Chinese spoken by non-native speakers of European descent. Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li |