| 2013 | "Sure, I did the right thing": a system for sarcasm detection in speech. Rachel Rakov, Andrew Rosenberg |
| 2013 | 'Houston, we have a solution': using NASA Apollo program to advance speech and language processing technology. Abhijeet Sangwan, Lakshmish Kaushik, Chengzhu Yu, John H. L. Hansen, Douglas W. Oard |
| 2013 | 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013, Lyon, France, August 25-29, 2013. Frédéric Bimbot, Christophe Cerisara, Cécile Fougeron, Guillaume Gravier, Lori Lamel, François Pellegrino, Pascal Perrier |
| 2013 | A blind segmentation approach to acoustic event detection based on i-vector. Zhen Huang, You-Chi Cheng, Kehuang Li, Ville Hautamäki, Chin-Hui Lee |
| 2013 | A comparative study of glottal open quotient estimation techniques. John Kane, Stefan Scherer, Louis-Philippe Morency, Christer Gobl |
| 2013 | A computational model of perceptuo-motor processing in speech perception: learning to imitate and categorize synthetic CV syllables. Raphaël Laurent, Jean-Luc Schwartz, Pierre Bessière, Julien Diard |
| 2013 | A corpus-based study of elderly and young speakers of European Portuguese: acoustic correlates and their impact on speech recognition performance. Thomas Pellegrini, Annika Hämäläinen, Philippe Boula de Mareüil, Michael Tjalve, Isabel Trancoso, Sara Candeias, Miguel Sales Dias, Daniela Braga |
| 2013 | A cross-linguistic study on turn-taking and temporal alignment in verbal interaction. Spyros Kousidis, David Schlangen, Stavros Skopeteas |
| 2013 | A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversion. Takuto Moriguchi, Tomoki Toda, Motoaki Sano, Hiroshi Sato, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2013 | A distributed system for recognizing home automation commands and distress calls in the Italian language. Emanuele Principi, Stefano Squartini, Francesco Piazza, Danilo Fuselli, Maurizio Bonifazi |
| 2013 | A dynamic programming framework for neural network-based automatic speech segmentation. Van Zyl Van Vuuren, Louis ten Bosch, Thomas Niesler |
| 2013 | A free online accent and intonation dictionary for teachers and learners of Japanese. Hiroko Hirano, Ibuki Nakamura, Nobuaki Minematsu, Masayuki Suzuki, Chieko Nakagawa, Noriko Nakamura, Yukinori Tagawa, Keikichi Hirose, Hiroya Hashimoto |
| 2013 | A hybrid HMM/DNN approach to keyword spotting of short words. I-Fan Chen, Chin-Hui Lee |
| 2013 | A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion. Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2013 | A hybrid language model for open-vocabulary Thai LVCSR. Kwanchiva Thangthai, Ananlada Chotimongkol, Chai Wutiwiwatchai |
| 2013 | A lecture transcription system combining neural network acoustic and language models. Peter Bell, Hitoshi Yamamoto, Pawel Swietojanski, Youzheng Wu, Fergus McInnes, Chiori Hori, Steve Renals |
| 2013 | A low-complexity voice activity detector for smart hearing protection of hyperacusic persons. Narimene Lezzoum, Ghyslain Gagnon, Jérémie Voix |
| 2013 | A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion. Yotaro Kubo, Takaaki Hori, Atsushi Nakamura |
| 2013 | A multi-domain dialog system to integrate heterogeneous spoken dialog systems. Joaquin Planells, Lluís F. Hurtado, Encarna Segarra, Emilio Sanchis |
| 2013 | A neural oscillator model of speech timing and rhythm. Erin Rusaw |
| 2013 | A new Bayesian network to assess the reliability of speaker verification decisions. Jesús Antonio Villalba López, Eduardo Lleida, Alfonso Ortega, Antonio Miguel |
| 2013 | A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL). Wenping Hu, Yao Qian, Frank K. Soong |
| 2013 | A new language independent, photo-realistic talking head driven by voice only. Xinjian Zhang, Lijuan Wang, Gang Li, Frank Seide, Frank K. Soong |
| 2013 | A new prosody annotation protocol for live sports commentaries. Sandrine Brognaux, Benjamin Picart, Thomas Drugman |
| 2013 | A new speaker verification spoofing countermeasure based on local binary patterns. Federico Alegre, Ravichander Vipperla, Asmaa Amehraye, Nicholas W. D. Evans |
| 2013 | A new statistical excitation mapping for enhancement of throat microphone recordings. M. A. Tugtekin Turan, Engin Erzin |
| 2013 | A noise-robust system for NIST 2012 speaker recognition evaluation. Luciana Ferrer, Mitchell McLaren, Nicolas Scheffer, Yun Lei, Martin Graciarena, Vikramjit Mitra |
| 2013 | A non-experts user interface for obtaining automatic diagnostic spelling evaluations for learners of the German writing system. Kay Berkling |
| 2013 | A perceptually and physiologically motivated voice source model. Gang Chen, Marc Garellek, Jody Kreiman, Bruce R. Gerratt, Abeer Alwan |
| 2013 | A phase-modified approach for TDE-based acoustic localization. Georgios Athanasopoulos, Werner Verhelst |
| 2013 | A physiological analysis of the tense/lax vowel contrast in two varieties of German. Conceição Cunha, Jonathan Harrington, Phil Hoole |
| 2013 | A pitch-based spectral enhancement technique for robust speech processing. Kantapon Kaewtip, Lee Ngee Tan, Abeer Alwan |
| 2013 | A preliminary spectral analysis of palatal and velar stop bursts in Pitjantjatjara. Marija Tabain, Richard Beare, Andrew Butcher |
| 2013 | A preliminary study of child vocalization on a parallel corpus of US and Shanghainese toddlers. Hynek Boril, Qian Zhang, Pongtep Angkititrakul, John H. L. Hansen, Dongxin Xu, Jill Gilkerson, Jeffrey A. Richards |
| 2013 | A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perception. Je Hun Jeon, Duc Le, Rui Xia, Yang Liu |
| 2013 | A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing sounds. Onur Babacan, Thomas Drugman, Nicolas D'Alessandro, Nathalie Henrich, Thierry Dutoit |
| 2013 | A real-world system for simultaneous translation of German lectures. Eunah Cho, Christian Fügen, Teresa Herrmann, Kevin Kilgour, Mohammed Mediani, Christian Mohr, Jan Niehues, Kay Rottmann, Christian Saam, Sebastian Stüker, Alex Waibel |
| 2013 | A recursive dialogue game framework with optimal policy offering personalized computer-assisted language learning. Pei-Hao Su, Yow-Bang Wang, Tsung-Hsien Wen, Tien-han Yu, Lin-Shan Lee |
| 2013 | A region-specific feature-space transformation for speaker adaptation and singularity analysis of Jacobian matrix. Shakti P. Rath, Lukás Burget, Martin Karafiát, Ondrej Glembek, Jan Cernocký |
| 2013 | A resource-dependent approach to word modeling for keyword spotting. I-Fan Chen, Chin-Hui Lee |
| 2013 | A robust frontend for VAD: exploiting contextual, discriminative and spectral cues of human voice. Maarten Van Segbroeck, Andreas Tsiartas, Shrikanth S. Narayanan |
| 2013 | A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR. Zhi-Jie Yan, Qiang Huo, Jian Xu |
| 2013 | A sequential repetition model for improved disfluency detection. Mari Ostendorf, Sangyun Hahn |
| 2013 | A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning. Hung-Wei Tseng, Srikanth Vishnubhotla, Mingyi Hong, Xiangfeng Wang, Jinjun Xiao, Zhi-Quan Luo, Tao Zhang |
| 2013 | A source-filter based adaptive harmonic model and its application to speech prosody modification. JeeSok Lee, Frank K. Soong, Hong-Goo Kang |
| 2013 | A source-filter separation algorithm for voiced sounds based on an exact anticausal/causal pole decomposition for the class of periodic signals. Thomas Hézard, Thomas Hélie, Boris Doval |
| 2013 | A speech enhancement method by coupling speech detection and spectral amplitude estimation. Feng Deng, Changchun Bao, Feng Bao |
| 2013 | A study on LVCSR and keyword search for Tagalog. Korbinian Riedhammer, Van Hai Do, James Hieronymus |
| 2013 | A style control technique for singing voice synthesis based on multiple-regression HSMM. Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi |
| 2013 | A survey about databases of children's speech. Felix Claus, Hamurabi Gamboa-Rosales, Rico Petrick, Horst-Udo Hain, Rüdiger Hoffmann |
| 2013 | A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis. Jinfu Ni, Yoshinori Shiga, Chiori Hori, Yutaka Kidawara |
| 2013 | A tool to elicit and collect multicultural and multimodal laughter. Mariette Soury, Clément Gossart, Martine Adda-Decker, Laurence Devillers |
| 2013 | A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis. Colin Vaz, Vikram Ramanarayanan, Shrikanth S. Narayanan |
| 2013 | A weakly-supervised approach for discovering new user intents from search query logs. Dilek Hakkani-Tür, Asli Celikyilmaz, Larry P. Heck, Gökhan Tür |
| 2013 | ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition. Anthony Larcher, Jean-François Bonastre, Benoit G. B. Fauve, Kong-Aik Lee, Christophe Lévy, Haizhou Li, John S. D. Mason, Jean-Yves Parfait |
| 2013 | Acceleration of spoken term detection using a suffix array by assigning optimal threshold values to sub-keywords. Kouichi Katsurada, Seiichi Miura, Kheang Seng, Yurie Iribe, Tsuneo Nitta |
| 2013 | Accent- and speaker-specific polyphone decision trees for non-native speech recognition. Dominic Telaar, Mark C. Fuhs |
| 2013 | Accurate and compact large vocabulary speech recognition on mobile devices. Xin Lei, Andrew W. Senior, Alexander Gruenstein, Jeffrey Sorensen |
| 2013 | Acoustic and perceptual analysis of vocal tremor. Christophe Mertens, Jean Schoentgen, Francis Grenez, Sabine Skodda |
| 2013 | Acoustic and visual phonetic features in the McGurk effect - an audiovisual speech illusion. Kaisa Tiippana, Mikko Tiainen, Lari Vainio, Martti Vainio |
| 2013 | Acoustic development of vowel production in American English children. Jing Yang, Robert Allen Fox |
| 2013 | Acoustic factor analysis based universal background model for robust speaker verification in noise. Taufiq Hasan, John H. L. Hansen |
| 2013 | Acoustic features for detection of phonemic aspiration in voiced plosives. Vaishali Patil, Preeti Rao |
| 2013 | Acoustic segmentation of speech using zero time liftering (ZTL). RaviShankar Prasad, B. Yegnanarayana |
| 2013 | Acoustic-prosodic, turn-taking, and language cues in child-psychologist interactions for varying social demand. Daniel Bone, Chi-Chun Lee, Theodora Chaspari, Matthew P. Black, Marian E. Williams, Sungbok Lee, Pat Levitt, Shrikanth S. Narayanan |
| 2013 | Active learning by label uncertainty for acoustic emotion recognition. Zixing Zhang, Jun Deng, Erik Marchi, Björn W. Schuller |
| 2013 | Active learning for dimensional speech emotion recognition. Wenjing Han, Haifeng Li, Huabin Ruan, Lin Ma, Jiayin Sun, Björn W. Schuller |
| 2013 | Adaptation of respiratory patterns in collaborative reading. Gérard Bailly, Amélie Rochet-Capellan, Coriandre Vilain |
| 2013 | Adaptation to natural fast speech and time-compressed speech in children. Hélène Guiraud, Emmanuel Ferragne, Nathalie Bedoin, Véronique Boulenger |
| 2013 | Adapting a speech into sign language translation system to a new domain. Verónica López-Ludeña, Rubén San Segundo, Carlos González-Morcillo, Juan Carlos López, E. Ferreiro |
| 2013 | Adaptive Gaussian backend for robust language identification. Mitchell McLaren, Aaron Lawson, Yun Lei, Nicolas Scheffer |
| 2013 | Adaptive stereo-based stochastic mapping. Shay Maymon, Pierre L. Dognin, Xiaodong Cui, Vaibhava Goel |
| 2013 | Addressee detection for dialog systems using temporal and spectral dimensions of speaking style. Elizabeth Shriberg, Andreas Stolcke, Suman V. Ravuri |
| 2013 | Aerodynamic and durational cues of phonological voicing in whisper. Yohann Meynadier, Yulia Gaydina |
| 2013 | Affect recognition in real-life acoustic conditions - a new perspective on feature selection. Florian Eyben, Felix Weninger, Björn W. Schuller |
| 2013 | Affective classification of generic audio clips using regression models. Nikos Malandrakis, Shiva Sundaram, Alexandros Potamianos |
| 2013 | Affective evaluation of multimodal dialogue games for preschoolers using physiological signals. Vassiliki Kouloumenta, Manolis Perakakis, Alexandros Potamianos |
| 2013 | All for one: feature combination for highly channel-degraded speech activity detection. Martin Graciarena, Abeer Alwan, Dan Ellis, Horacio Franco, Luciana Ferrer, John H. L. Hansen, Adam Janin, Byung Suk Lee, Yun Lei, Vikramjit Mitra, Nelson Morgan, Seyed Omid Sadjadi, T. J. Tsai, Nicolas Scheffer, Lee Ngee Tan, Benjamin Williams |
| 2013 | Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training. Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen |
| 2013 | Amplitude modulation features for emotion recognition from speech. Md. Jahangir Alam, Yazid Attabi, Pierre Dumouchel, Patrick Kenny, Douglas D. O'Shaughnessy |
| 2013 | An MRI-based acoustic study of Mandarin vowels. Yuguang Wang, Jianwu Dang, Xi Chen, Jianguo Wei, Hongcui Wang, Kiyoshi Honda |
| 2013 | An anisotropic diffusion filter based on multidirectional separability. Shen Liu, Jianguo Wei, Xin Wang, Wenhuan Lu, Qiang Fang, Jianwu Dang |
| 2013 | An early case of "VOT". Angelika Braun |
| 2013 | An efficient method to estimate pronunciation from multiple utterances. Tofigh Naghibi, Sarah Hoffmann, Beat Pfister |
| 2013 | An empirical comparison of joint optimization techniques for speech translation. Masaya Ohgushi, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura |
| 2013 | An explicit independence constraint for factorised adaptation in speech recognition. Yongqiang Wang, Mark J. F. Gales |
| 2013 | An inter- and cross-disciplinary perspective of spoken language processing. Hiroya Fujisaki |
| 2013 | An investigation of acoustic features for singing voice conversion based on perceptual age. Kazuhiro Kobayashi, Hironori Doi, Tomoki Toda, Tomoyasu Nakano, Masataka Goto, Graham Neubig, Sakriani Sakti, Satoshi Nakamura |
| 2013 | An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognition. Bo Li, Yu Tsao, Khe Chai Sim |
| 2013 | An investigation of temporally varying weight regression for noise robust speech recognition. Shilin Liu, Khe Chai Sim |
| 2013 | An investigation of vowel epenthesis in Chinese learners' production of German consonants. Hongwei Ding, Rüdiger Hoffmann |
| 2013 | An on-line incremental speaker adaptation technique for audio stream transcription. Diego Giuliani, Fabio Brugnara |
| 2013 | An open-source state-of-the-art toolbox for broadcast news diarization. Mickael Rouvier, Grégor Dupuy, Paul Gay, Elie Khoury, Téva Merlin, Sylvain Meignier |
| 2013 | An overview of the VUB entry for the 2013 Hurricane Challenge. Henk Brouckxon, Werner Verhelst |
| 2013 | An unsupervised Bayesian classifier for multiple speaker detection and localization. Youssef Oualil, Friedrich Faubel, Dietrich Klakow |
| 2013 | Analysis and modeling of "focus" in context. Dirk Hovy, Gopala Krishna Anumanchipalli, Alok Parlikar, Caroline Vaughn, Adam C. Lammert, Eduard H. Hovy, Alan W. Black |
| 2013 | Analysis and synthesis of shouted speech. Tuomo Raitio, Antti Suni, Jouni Pohjalainen, Manu Airaksinen, Martti Vainio, Paavo Alku |
| 2013 | Analysis of breathy, modal and pressed phonation based on low frequency spectral density. Dhananjaya N. Gowda, Mikko Kurimo |
| 2013 | Analysis of emotional speech at subsegmental level. P. Gangamohan, Sudarsana Reddy Kadiri, B. Yegnanarayana |
| 2013 | Analysis of factors involved in the choice of rising or non-rising intonation in question utterances appearing in conversational speech. Hiroaki Hatano, Miyako Kiso, Carlos Toshinori Ishi |
| 2013 | Analysis of gaze and speech patterns in three-party quiz game interaction. Samer Al Moubayed, Jens Edlund, Joakim Gustafson |
| 2013 | Analyzing eye-voice coordination in rapid automatized naming. Daniel Bone, Chi-Chun Lee, Vikram Ramanarayanan, Shrikanth S. Narayanan, Renske S. Hoedemaker, Peter C. Gordon |
| 2013 | Analyzing the structure of parent-moderated narratives from children with ASD using an entity-based approach. Theodora Chaspari, Emily Mower Provost, Shrikanth S. Narayanan |
| 2013 | Anchor and UBM-based multi-class MLLR m-vector system for speaker verification. Achintya Kumar Sarkar, Claude Barras |
| 2013 | Annotation and classification of political advertisements. Samuel Kim, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2013 | Annotation and detection of conflict escalation in political debates. Samuel Kim, Fabio Valente, Alessandro Vinciarelli |
| 2013 | Annotation errors detection in TTS corpora. Jindrich Matousek, Daniel Tihelka |
| 2013 | Application of the NAO humanoid robot in the treatment of bone marrow-transplanted children (demo). E. Csala, Géza Németh, Csaba Zainkó |
| 2013 | Architekt or archtekt? perception of devoiced vowels produced by Japanese speakers of German. Frank Zimmerer, Rei Yasuda, Henning Reetz |
| 2013 | Articulatory copy synthesis from cine x-ray films. Yves Laprie, Matthieu Loosvelt, Shinji Maeda, Rudolph Sock, Fabrice Hirsch |
| 2013 | Articulatory features for speech-driven head motion synthesis. Atef Ben Youssef, Hiroshi Shimodaira, David Adam Braude |
| 2013 | Articulatory settings facilitate mechanically advantageous motor control of vocal tract articulators. Vikram Ramanarayanan, Adam C. Lammert, Louis Goldstein, Shrikanth S. Narayanan |
| 2013 | Articulatory synthesis of French connected speech from EMA data. Asterios Toutios, Shrikanth S. Narayanan |
| 2013 | Artificial bandwidth extension based on regularized piecewise linear mapping with discriminative region weighting and long-span features. Nguyen Duc Duy, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose |
| 2013 | Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping. Elizabeth Godoy, Maria Koutsogiannaki, Yannis Stylianou |
| 2013 | Assessing the utility of judgments of children's speech production made by untrained listeners in uncontrolled listening environments. Benjamin Munson |
| 2013 | Assimilation of word-final nasals to following word-initial place of articulation in UK English. Margaret E. L. Renwick, Ladan Baghai-Ravary, Rosalind Temple, John S. Coleman |
| 2013 | Asynchronous factorisation of speaker and background with feature transforms in speech recognition. Oscar Saz, Thomas Hain |
| 2013 | Attribute-based histogram equalization (HEQ) and its adaptation for robust speech recognition. Xiong Xiao, Engsiong Chng, Haizhou Li |
| 2013 | Audio classification using dominant spatial patterns in time-frequency space. Md. Khademul Islam Molla, Keikichi Hirose |
| 2013 | Audio event classification using deep neural networks. Zvi Kons, Orith Toledo-Ronen |
| 2013 | Audio self organized units for high-level event detection. Xiaodan Zhuang, Shuang Wu, Pradeep Natarajan, Rohit Prasad, Prem Natarajan |
| 2013 | Audition: the most important sense for humanoid robots? Rodolphe Gelin, Gabriele Barbieri |
| 2013 | Auditory detectability of vocal ageing and its effect on forensic automatic speaker recognition. Finnian Kelly, Naomi Harte |
| 2013 | Augmented conditional random fields modeling based on discriminatively trained features. Yasser Hifny |
| 2013 | Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data. Cong-Thanh Do, Claude Barras, Viet Bac Le, Achintya Kumar Sarkar |
| 2013 | Automated speech scoring for non-native middle school students with multiple task types. Keelan Evanini, Xinhao Wang |
| 2013 | Automatic accent quantification of Indian speakers of English. Jian Cheng, Nikhil Bojja, Xin Chen |
| 2013 | Automatic estimation of dialect mixing ratio for dialect speech recognition. Naoki Hirayama, Koichiro Yoshino, Katsutoshi Itoyama, Shinsuke Mori, Hiroshi G. Okuno |
| 2013 | Automatic evaluation of Parkinson's speech - acoustic, prosodic and voice related cues. Tobias Bocklet, Stefan Steidl, Elmar Nöth, Sabine Skodda |
| 2013 | Automatic gender recognition in normal and pathological speech. Jorge Andrés Gómez García, Juan Ignacio Godino-Llorente, Germán Castellanos-Domínguez |
| 2013 | Automatic glottal tracking from high-speed digital images using a continuous normalized cross correlation. Gustavo Andrade-Miranda, Juan Ignacio Godino-Llorente |
| 2013 | Automatic human utility evaluation of ASR systems: does WER really predict performance? Benoît Favre, Kyla Cheung, Siavash Kazemian, Adam Lee, Yang Liu, Cosmin Munteanu, Ani Nenkova, Dennis Ochei, Gerald Penn, Stephen Tratz, Clare R. Voss, Frauke Zeller |
| 2013 | Automatic phonetic segmentation using boundary models. Jiahong Yuan, Neville Ryant, Mark Y. Liberman, Andreas Stolcke, Vikramjit Mitra, Wen Wang |
| 2013 | Automatic regularization of cross-entropy cost for speaker recognition fusion. Ville Hautamäki, Kong-Aik Lee, David A. van Leeuwen, Rahim Saeidi, Anthony Larcher, Tomi Kinnunen, Taufiq Hasan, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, John H. L. Hansen, Benoit G. B. Fauve |
| 2013 | Automatic self-supervised learning of associations between speech and text. Juha Knuuttila, Okko Räsänen, Unto K. Laine |
| 2013 | Automatic social role recognition in professional meetings using conditional random fields. Ashtosh Sapru, Hervé Bourlard |
| 2013 | Automatic tracheoesophageal voice typing using acoustic parameters. Renee Peje Clapham, Corina J. van As-Brooks, Michiel W. M. van den Brekel, Frans J. M. Hilgers, R. J. J. H. van Son |
| 2013 | Avatar therapy: an audio-visual dialogue system for treating auditory hallucinations. Mark A. Huckvale, Julian Leff, Geoff Williams |
| 2013 | BUT BABEL system for spontaneous Cantonese. Martin Karafiát, Frantisek Grézl, Mirko Hannemann, Karel Veselý, Jan Cernocký |
| 2013 | Balancing word lists in speech audiometry through large spoken language corpora. Annemiek Hammer, Bart Vaerenberg, Wojtek Kowalczyk, Louis ten Bosch, Martine Coene, Paul J. Govaerts |
| 2013 | Bayesian distance metric learning on i-vector for speaker verification. Xiao Fang, Najim Dehak, James R. Glass |
| 2013 | Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds. Hideki Kawahara, Masanori Morise, Tomoki Toda, Ryuichi Nisimura, Toshio Irino |
| 2013 | Bhattacharyya distance based emotional dissimilarity measure in multi-dimensional space for emotion classification. Tin Lay Nwe, Trung Hieu Nguyen, Dilip Kumar Limbu |
| 2013 | Bidirectional truncated recurrent neural networks for efficient speech denoising. Philemon Brakel, Dirk Stroobandt, Benjamin Schrauwen |
| 2013 | Binocular photometric stereo acquisition and reconstruction for 3D talking head applications. Chaoyang Wang, Lijuan Wang, Yasuyuki Matsushita, Bojun Huang, Magnetro Chen, Frank K. Soong |
| 2013 | Blind source separation using spatially distributed microphones based on microphone-location dependent source activities. Keisuke Kinoshita, Mehrez Souden, Tomohiro Nakatani |
| 2013 | Bottleneck features based on gammatone frequency cepstral coefficients. Jun Qi, Dong Wang, Ji Xu, Javier Tejedor |
| 2013 | Bounded conditional mean imputation with an approximate posterior. Ulpu Remes |
| 2013 | Brain activations in speech recovery process after intra-oral surgery: an fMRI study. Audrey Acher, Marc Sato, Laurent Lamalle, Coriandre Vilain, Arnaud Attye, Alexandre Krainik, Georges Bettega, Christian Adrien Righini, Brice Carlot, Muriel Brix, Pascal Perrier |
| 2013 | Burst-based features for the classification of pathological voices. Julie Mauclair, Lionel Koenig, Marina Robert, Peggy Gatignol |
| 2013 | CSLM - a modular open-source continuous space language modeling toolkit. Holger Schwenk |
| 2013 | Calibration of distance measures for unsupervised query-by-example. Michele Gubian, Lou Boves, Maarten Versteegh |
| 2013 | Categorization of speech in early auditory evoked responses. Ludovic Bellier, Michel Mazzuca, Hung Thai-Van, Anne Caclin, Rafael Laboissière |
| 2013 | Category-based phoneme-to-grapheme transliteration. Willem D. Basson, Marelie H. Davel |
| 2013 | Changes in the role of intensity as a cue for fricative categorisation. Odette Scharenborg, Esther Janse |
| 2013 | Channel selection using n-best hypothesis for multi-microphone ASR. Martin Wolf, Climent Nadeu |
| 2013 | Characterising depressed speech for classification. Sharifa Alghowinem, Roland Goecke, Michael Wagner, Julien Epps, Gordon Parker, Michael Breakspear |
| 2013 | Characteristic contours of syllabic-level units in laughter. Jieun Oh, Eunjoon Cho, Malcolm Slaney |
| 2013 | Children's timing and repair strategies for communication in adverse listening conditions. Valérie Hazan, Michèle Pettinato |
| 2013 | Classification based binaural dereverberation. Nicoleta Roman, Michael I. Mandel |
| 2013 | Classification of cooperative and competitive overlaps in speech using cues from the context, overlapper, and overlappee. Khiet P. Truong |
| 2013 | Classification of depression state based on articulatory precision. Brian S. Helfer, Thomas F. Quatieri, James R. Williamson, Daryush D. Mehta, Rachelle Horwitz, Bea Yu |
| 2013 | Classification of developmental disorders from speech signals using submodular feature selection. Katrin Kirchhoff, Yuzong Liu, Jeff A. Bilmes |
| 2013 | Classification of speech under stress by modeling the aerodynamics of the laryngeal ventricle. Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda |
| 2013 | Classifying language-related developmental disorders from speech cues: the promise and the potential confounds. Daniel Bone, Theodora Chaspari, Kartik Audhkhasi, James Gibson, Andreas Tsiartas, Maarten Van Segbroeck, Ming Li, Sungbok Lee, Shrikanth S. Narayanan |
| 2013 | Cluster adaptive training with factorized decision trees for speech recognition. Kai Yu, Hainan Xu |
| 2013 | Code-switching event detection based on delta-BIC using phonetic eigenvoice models. Wei-Bin Liang, Chung-Hsien Wu, Chun-Shan Hsu |
| 2013 | Combination of auditory attention features with phone posteriors for better automatic phoneme segmentation. Ozlem Kalinli |
| 2013 | Combination of random indexing based language model and n-gram language model for speech recognition. Dominique Fohr, Odile Mella |
| 2013 | Combining acoustic name spotting and continuous context models to improve spoken person name recognition in speech. Benjamin Bigot, Grégory Senay, Georges Linarès, Corinne Fredouille, Richard Dufour |
| 2013 | Combining deep speaker specific representations with GMM-SVM for speaker verification. Ryan Price, Sangeeta Biswas, Koichi Shinoda |
| 2013 | Combining forward-based and backward-based decoders for improved speech recognition performance. Denis Jouvet, Dominique Fohr |
| 2013 | Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech. Heidi Christensen, Magda B. Aniol, Peter Bell, Phil D. Green, Thomas Hain, Simon King, Pawel Swietojanski |
| 2013 | Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise. Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King, Yannis Stylianou |
| 2013 | Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese. Junfeng Li, Fei Chen, Masato Akagi, Yonghong Yan |
| 2013 | Comparative study of speaker personality traits recognition in conversational and broadcast news speech. Firoj Alam, Giuseppe Riccardi |
| 2013 | Comparing computation in Gaussian mixture and neural network based large-vocabulary speech recognition. Vishwa Gupta, Gilles Boulianne |
| 2013 | Comparing vowel category response surfaces over age-varying maximal vowel spaces within and across language communities. Andrew R. Plummer, Lucie Ménard, Benjamin Munson, Mary E. Beckman |
| 2013 | Comparison of approaches for an efficient phonetic decoding. Luiza Orosanu, Denis Jouvet |
| 2013 | Comparison of spectral analysis methods for automatic speech recognition. Venkata Neelima Parinam, Chandra Sekhar Vootkuri, Stephen A. Zahorian |
| 2013 | Comparison of spectrum estimators in speaker verification: mismatch conditions induced by vocal effort. Cemal Hanilçi, Tomi Kinnunen, Padmanabhan Rajan, Jouni Pohjalainen, Paavo Alku, Figen Ertas |
| 2013 | Compensatory speech response to time-scale altered auditory feedback. Rintaro Ogane, Masaaki Honda |
| 2013 | Composing auditory ERPs: cross-linguistic comparison of auditory change complex for Japanese fricative consonants. Makiko Sadakata, Loukianos Spyrou, Mizuki Shingai, Kaoru Sekiyama |
| 2013 | Computationally efficient objective function for algebraic codebook optimization in ACELP. Tom Bäckström |
| 2013 | Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis. Yasuaki Kanai, Shota Morita, Masashi Unoki |
| 2013 | Conditional emission densities for combining speech enhancement and recognition systems. Armin Sehr, Takuya Yoshioka, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Roland Maas, Walter Kellermann |
| 2013 | Confidence-based scoring: a useful diagnostic tool for detection tasks. T. J. Tsai, Adam Janin |
| 2013 | Consonant distortions in dysarthria due to Parkinson's disease, amyotrophic lateral sclerosis and cerebellar ataxia. Tanja Kocjancic Antolík, Cécile Fougeron |
| 2013 | Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition. Fabian Triefenbach, Azarakhsh Jalalvand, Kris Demuynck, Jean-Pierre Martens |
| 2013 | Context-dependent phone mapping for LVCSR of under-resourced languages. Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li |
| 2013 | Controlling "shout" expression in a Japanese pop singing performance: analysis and suppression study. Yuri Nishigaki, Ken-Ichi Sakakibara, Masanori Morise, Ryuichi Nisimura, Toshio Irino, Hideki Kawahara |
| 2013 | Convergence of articulation rate in spontaneous speech. Antje Schweitzer, Natalie Lewandowski |
| 2013 | Convolutional deep rectifier neural nets for phone recognition. László Tóth |
| 2013 | Corpus analysis of simultaneous interpretation data for improving real time speech translation. Vivek Kumar Rangarajan Sridhar, John Chen, Srinivas Bangalore |
| 2013 | Correlates of contrastive focus in congenitally blind adults and sighted adults. Lucie Ménard, Annie Leclerc, Mark K. Tiede, Amélie Prémont, Christine Turgeon, Paméla Trudeau-Fisette, Dominique Côté |
| 2013 | Correlates to intelligibility in deviant child speech - comparing clinical evaluations to audience response system-based evaluations by untrained listeners. Sofia Strömbergsson, Christina Tånnander |
| 2013 | Cross-domain paraphrasing for improving language modelling using out-of-domain data. Xunying Liu, Mark J. F. Gales, Philip C. Woodland |
| 2013 | Cross-entropy vs. squared error training: a theoretical and experimental comparison. Pavel Golik, Patrick Doetsch, Hermann Ney |
| 2013 | Cross-language comparison of functional load for vowels, consonants, and tones. Yoon Mi Oh, François Pellegrino, Christophe Coupé, Egidio Marsico |
| 2013 | Cross-lingual acoustic model adaptation based on transfer vector field smoothing with MAP. Masahiro Saiko, Shigeki Matsuda, Ken Hanazawa, Ryosuke Isotani, Chiori Hori |
| 2013 | Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation. Petr Motlícek, David Imseng, Philip N. Garner |
| 2013 | Crosslinguistic corpus of hesitation phenomena: a corpus for investigating first and second language speech performance. Ralph L. Rose |
| 2013 | Crosslinguistic priming in interactive reference: evidence for conceptual alignment in speech production. Anne Vullinghs, Martijn Goudbeek, Emiel Krahmer |
| 2013 | DTW-distance-ordered spoken term detection. Teppei Ohno, Tomoyosi Akiba |
| 2013 | Damped oscillator cepstral coefficients for robust speech recognition. Vikramjit Mitra, Horacio Franco, Martin Graciarena |
| 2013 | Data driven methods for utterance semantic tagging. Ding Liu, Anthea Cheung, Anna Margolis, Patrick Redmond, Jun-Won Suh, Chao Wang |
| 2013 | Data-driven design of a sentence list for an articulatory speech corpus. Jeffrey Berry, Luciano Fadiga |
| 2013 | Deep belief network based semantic taggers for spoken language understanding. Anoop Deoras, Ruhi Sarikaya |
| 2013 | Deep segmental neural networks for speech recognition. Ossama Abdel-Hamid, Li Deng, Dong Yu, Hui Jiang |
| 2013 | Deep vs. wide: depth on a budget for robust speech recognition. Oriol Vinyals, Nelson Morgan |
| 2013 | Demographic recommendation by means of group profile elicitation using speaker age and gender recognition. Sven Ewan Shepstone, Zheng-Hua Tan, Søren Holdt Jensen |
| 2013 | Demonstration of LAPSyd: Lyon-Albuquerque phonological systems database. Ian Maddieson, Sébastien Flavier, Egidio Marsico, François Pellegrino |
| 2013 | Design of a mobile app for Interspeech conferences: towards an open tool for the spoken language community. Robert Schleicher, Tilo Westermann, Jinjin Li, Moritz Lawitschka, Benjamin Mateev, Ralf Reichmuth, Sebastian Möller |
| 2013 | Detecting autism, emotions and social signals using AdaBoost. Gábor Gosztolya, Róbert Busa-Fekete, László Tóth |
| 2013 | Detecting laughter and filled pauses using syllable-based features. Gouzhen An, David Guy Brizan, Andrew Rosenberg |
| 2013 | Detecting overlapping speech with long short-term memory recurrent neural networks. Jürgen T. Geiger, Florian Eyben, Björn W. Schuller, Gerhard Rigoll |
| 2013 | Detecting summarization hot spots in meetings using group level involvement and turn-taking features. Catherine Lai, Jean Carletta, Steve Renals |
| 2013 | Detecting words in speech using linear separability in a bag-of-events vector space. Maarten Versteegh, Louis ten Bosch |
| 2013 | Detection of glottal opening instants using Hilbert envelope. K. Ramesh, S. R. Mahadeva Prasanna, D. Govind |
| 2013 | Detection of laughter in children's speech using spectral and prosodic acoustic features. Hrishikesh Rao, Jonathan C. Kim, Agata Rozga, Mark A. Clements |
| 2013 | Detection of nonverbal vocalizations using Gaussian mixture models: looking for fillers and laughter in conversational speech. Teun F. Krikke, Khiet P. Truong |
| 2013 | Developing an information system for deaf. Verónica López-Ludeña, Rubén San Segundo, Javier Ferreiros, José M. Pardo, E. Ferreiro |
| 2013 | Development and evaluation of spoken dialog systems with one or two agents. Yuki Todo, Ryota Nishimura, Kazumasa Yamamoto, Seiichi Nakagawa |
| 2013 | Development and implementation of fiducial markers for vocal tract MRI imaging and speech articulatory modelling. Pierre Badin, Julián Andrés Valdés Vargas, Arielle Koncki, Laurent Lamalle, Christophe Savariaux |
| 2013 | Development and validation of the conversational agents scale (CAS). Ina Wechsung, Benjamin Weiss, Christine Kühnel, Patrick Ehrenbrink, Sebastian Möller |
| 2013 | Development of a pronunciation training system based on auditory-visual elements. Haruko Miyakoda |
| 2013 | Development of a web framework for teaching and learning Japanese prosody: OJAD (online Japanese accent dictionary). Ibuki Nakamura, Nobuaki Minematsu, Masayuki Suzuki, Hiroko Hirano, Chieko Nakagawa, Noriko Nakamura, Yukinori Tagawa, Keikichi Hirose, Hiroya Hashimoto |
| 2013 | Development of central auditory processes and their links with language skills in typically developing children. Marie Dekerle, Fanny Meunier, Marie-Ange N'Guyen, Estelle Gillet-Perret, Delphine Lassus-Sangosse, Sophie Donnadieu |
| 2013 | Development of the RWTH transcription system for Slovenian. Pavel Golik, Zoltán Tüske, Ralf Schlüter, Hermann Ney |
| 2013 | Devoicing of vowels in German, a comparison of Japanese and German speakers. Rei Yasuda, Frank Zimmerer |
| 2013 | Diacritics restoration for Arabic dialect texts. Salima Harrat, Mourad Abbas, Karima Meftouh, Kamel Smaïli |
| 2013 | Dimensionality analysis of singing speech based on locality preserving projections. Mahnoosh Mehrabani, John H. L. Hansen |
| 2013 | Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition. Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel |
| 2013 | Discrimination between fricative and affricate in Japanese using time and spectral domain variables. Kimiko Yamakawa, Shigeaki Amano |
| 2013 | Discriminative nonnegative dictionary learning using cross-coherence penalties for single channel source separation. Emad M. Grais, Hakan Erdogan |
| 2013 | Discriminative pronunciation modeling based on minimum phone error training. Meixu Song, Qingqing Zhang, Jielin Pan, Yonghong Yan |
| 2013 | Discriminative training of WFST factors with application to pronunciation modeling. Preethi Jyothi, Eric Fosler-Lussier, Karen Livescu |
| 2013 | Discriminative training of a phoneme confusion model for a dynamic lexicon in ASR. Penny Karanasou, François Yvon, Thomas Lavergne, Lori Lamel |
| 2013 | Discriminative training of acoustic models for system combination. Yuuki Tachioka, Shinji Watanabe |
| 2013 | Discriminatively trained dependency language modeling for conversational speech recognition. Benjamin Lambert, Bhiksha Raj, Rita Singh |
| 2013 | Discriminatively trained sparse inverse covariance matrices for low resource acoustic modeling. Weibin Zhang, Pascale Fung |
| 2013 | Disfluency detection based on prosodic features for university lectures. Henrique Medeiros, Helena Moniz, Fernando Batista, Isabel Trancoso, Luís Nunes |
| 2013 | Distribution-based feature normalization for robust speech recognition leveraging context and dynamics cues. Yu-Chen Kao, Berlin Chen |
| 2013 | Double contrast is signalled by prenuclear and nuclear accent types alone, not by f0-plateaux. Bettina Braun, Yuki Asano |
| 2013 | Duration as a secondary cue for perception of voicing and tone in Shanghai Chinese. Jiayin Gao, Pierre A. Hallé |
| 2013 | Duration of early vocalisations. Adele Gregory, Marija Tabain, Michael Robb |
| 2013 | Dysarthria intelligibility assessment in a factor analysis total variability space. David Martínez González, Phil D. Green, Heidi Christensen |
| 2013 | Dysarthric speech recognition using dysarthria-severity-dependent and speaker-adaptive models. Myung Jong Kim, Joohong Yoo, Hoirin Kim |
| 2013 | Effect of MPEG audio compression on HMM-based speech synthesis. Bajibabu Bollepalli, Tuomo Raitio, Paavo Alku |
| 2013 | Effect of context, rebinding and noise, on audiovisual speech fusion. Ganesh Attigodu Chandrashekara, Frédéric Berthommier, Olha Nahorna, Jean-Luc Schwartz |
| 2013 | Effect of linguistic masker on the intelligibility of Mandarin sentences. Fei Chen, Junfeng Li, Lena L. N. Wong, Yonghong Yan |
| 2013 | Effect of multicondition training on i-vector PLDA configurations for speaker recognition. Padmanabhan Rajan, Tomi Kinnunen, Ville Hautamäki |
| 2013 | Effective estimation of a multi-session speaker model using information on signal parameters. Konstantin Simonchik, Andrey Shulipa, Timur Pekhovsky |
| 2013 | Effects of envelope filter cutoff frequency on the intelligibility of Mandarin noise-vocoded speech in babble noise: implications for cochlear implants. Guangting Mai, James W. Minett, William S.-Y. Wang |
| 2013 | Effects of lexical class and lemma frequency on German homographs. Barbara Samlowski, Petra Wagner, Bernd Möbius |
| 2013 | Effects of mouth-only and whole-face displays on audio-visual speech perception in noise: is the vision of a talker's full face truly the most efficient solution? Grozdana Erjavec, Denis Legros |
| 2013 | Effects of talk-spurt silence boundary thresholds on distribution of gaps and overlaps. Marcin Wlodarczak, Petra Wagner |
| 2013 | Efficient speech transcription through respeaking. Matthias Sperber, Graham Neubig, Christian Fügen, Satoshi Nakamura, Alex Waibel |
| 2013 | Eigenageing compensation for speaker verification. Finnian Kelly, Niko Brümmer, Naomi Harte |
| 2013 | Electromagnetic articulography with AG500 and AG501. Massimo Stella, Antonio Stella, Francesco Sigona, Paolo Bernardini, Mirko Grimaldi, Barbara Gili Fivela |
| 2013 | Electrophysiological evidence for benefits of imitation during the processing of spoken words embedded in sentential contexts. Angèle Brunellière, Sophie Dufour |
| 2013 | Elicitation and analysis of a corpus of robust noise-induced word misperceptions in Spanish. María Luisa García Lecumberri, Máté Attila Tóth, Yan Tang, Martin Cooke |
| 2013 | Eliciting speech with sentence lists - a critical evaluation with special emphasis on segmental anchoring. Lea S. Kohtz, Oliver Niebuhr |
| 2013 | Embedding speech recognition to control lights. Alessandro Sosi, Fabio Brugnara, Luca Cristoforetti, Marco Matassoni, Mirco Ravanelli, Maurizio Omologo |
| 2013 | Emotion recognition of conversational affective speech using temporal course modeling. Jen-Chun Lin, Chung-Hsien Wu, Wen-Li Wei |
| 2013 | Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems. Kartik Audhkhasi, Andreas M. Zavou, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2013 | Empirical mode decomposition-based spectral acoustic cues for disordered voices analysis. Abdellah Kacha, Francis Grenez, Jean Schoentgen |
| 2013 | Endpoint detection using weighted finite state transducer. Hoon Chung, Sung Joo Lee, Yunkeun Lee |
| 2013 | Energy and F0 contour modeling with functional data analysis for emotional speech detection. Juan Pablo Arias, Carlos Busso, Néstor Becerra Yoma |
| 2013 | Enhanced muting method in packet loss concealment of ITU-T G.722 employing optimized sigmoid function. Bong-Ki Lee, Chungsoo Lim, Jihwan Park, Joon-Hyuk Chang |
| 2013 | Ensemble approach in speaker verification. Leibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores |
| 2013 | Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition. Hung-yi Lee, Ting-Yao Hu, How Jing, Yun-Fan Chang, Yu Tsao, Yu-Cheng Kao, Tsang-Long Pao |
| 2013 | Error-corrective discriminative joint decoding of automatic spoken language transcription and understanding. Bassam Jabaian, Fabrice Lefèvre |
| 2013 | Estimating callers' levels of knowledge in call center dialogues. Chiaki Miyazaki, Ryuichiro Higashinaka, Toshiro Makino, Yoshihiro Matsuo |
| 2013 | Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks. Dimitri Palaz, Ronan Collobert, Mathew Magimai-Doss |
| 2013 | Estimating speaker-specific intonation patterns using the linear alignment model. Géza Kiss, Jan P. H. van Santen |
| 2013 | Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations. Tatsuya Kawahara, Soichiro Hayashi, Katsuya Takanashi |
| 2013 | Estimation of multiple-branch vocal tract models: the influence of prior assumptions. Christian H. Kasess, Wolfgang Kreuzer |
| 2013 | Evaluating an adaptive dialog system for the public. Benjamin Weiss, Simon Willkomm, Sebastian Möller |
| 2013 | Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline. Thomas Schatz, Vijayaditya Peddinti, Francis R. Bach, Aren Jansen, Hynek Hermansky, Emmanuel Dupoux |
| 2013 | Evaluating spoken dialogue models under the interactive pattern recognition framework. Fabrizio Ghigi, M. Inés Torres, Raquel Justo, José-Miguel Benedí |
| 2013 | Evaluation of a bone-conducted ultrasonic hearing aid in vocal emotion transmission. Takayuki Kagomiya, Seiji Nakagawa |
| 2013 | Evaluation of a real-time voice order recognition system from multiple audio channels in a home. Michel Vacher, Benjamin Lecouteux, Dan Istrate, Thierry Joubert, François Portet, Mohamed A. Sehili, Pedro Chahuara |
| 2013 | Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion. Hironori Doi, Tomoki Toda, Tomoyasu Nakano, Masataka Goto, Satoshi Nakamura |
| 2013 | Evaluation of fundamental validity in applying AR-HMM with automatic topology generation to pathology voice analysis. Akira Sasou |
| 2013 | Evaluation of speech-based protocol for detection of early-stage dementia. Aharon Satt, Alexander Sorin, Orith Toledo-Ronen, Oren Barkan, Ioannis Kompatsiaris, Athina Kokonozi, Magda Tsolaki |
| 2013 | Exemplar-based individuality-preserving voice conversion for articulation disorders in noisy environments. Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki |
| 2013 | Exemplar-based pitch accent categorisation using the generalized context model. Michael Walsh, Katrin Schweitzer, Nadja Schauffler |
| 2013 | Exemplar-based unit selection for voice conversion utilizing temporal information. Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Engsiong Chng, Haizhou Li |
| 2013 | Experiments towards a better LVCSR system for Tamil. Melvin Jose Johnson Premkumar, Ngoc Thang Vu, Tanja Schultz |
| 2013 | Exploiting shared information for multi-intent natural language sentence classification. Puyang Xu, Ruhi Sarikaya |
| 2013 | Exploiting the succeeding words in recurrent neural network language models. Yangyang Shi, Martha A. Larson, Pascal Wiggers, Catholijn M. Jonker |
| 2013 | Exploring convolutional neural network structures and optimization techniques for speech recognition. Ossama Abdel-Hamid, Li Deng, Dong Yu |
| 2013 | Exploring methods of improving speaker accuracy for speaker diarization. Mary Tai Knox, Nikki Mirghafori, Gerald Friedland |
| 2013 | Exploring the connection of acoustic and distinctive features. Thomas Kisler, Uwe D. Reichel |
| 2013 | Expressive speech synthesis in MARY TTS using audiobook data and EmotionML. Marcela Charfuelan, Ingmar Steiner |
| 2013 | Extended weighted linear prediction using the autocorrelation snapshot - a robust speech analysis method and its application to recognition of vocal emotions. Jouni Pohjalainen, Paavo Alku |
| 2013 | Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis. June Sig Sung, Doo Hwa Hong, Hyun Woo Koo, Nam Soo Kim |
| 2013 | Failure transitions for joint n-gram models and G2P conversion. Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose |
| 2013 | Fast and memory effective i-vector extraction using a factorized sub-space. Sandro Cumani, Pietro Laface |
| 2013 | Faster 3D vocal tract real-time MRI using constrained reconstruction. Yinghua Zhu, Asterios Toutios, Shrikanth S. Narayanan, Krishna S. Nayak |
| 2013 | Feature space generalized variable parameter HMMs for noise robust recognition. Yang Li, Xunying Liu, Lan Wang |
| 2013 | Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR. M. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney |
| 2013 | Final lengthening in Russian: a corpus-based study. Tatiana Kachkovskaia, Nina B. Volskaya, Pavel A. Skrelin |
| 2013 | Finding recurrent out-of-vocabulary words. Long Qin, Alexander I. Rudnicky |
| 2013 | Fine-grain voice strength estimation from vowel spectral cues. Jean-Sylvain Liénard, Claude Barras |
| 2013 | Fitting long-range information using interpolated distanced n-grams and cache models into a latent Dirichlet language model for speech recognition. Md. Akmal Haidar, Douglas D. O'Shaughnessy |
| 2013 | Foreign accent conversion through voice morphing. Sandesh Aryal, Daniel Felps, Ricardo Gutierrez-Osuna |
| 2013 | Foreign accent detection from spoken Finnish using i-vectors. Hamid Behravan, Ville Hautamäki, Tomi Kinnunen |
| 2013 | Formalizing expert knowledge for developing accurate speech recognizers. Anuj Kumar, Florian Metze, Wenyi Wang, Matthew Kam |
| 2013 | Formant contours in Czech vowels: speaker-discriminating potential. Dita Fejlová, David Lukes, Radek Skarnitzl |
| 2013 | Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation. Jonathan C. Kim, Hrishikesh Rao, Mark A. Clements |
| 2013 | Freestyle: a challenge-response system for hip hop lyrics via unsupervised induction of stochastic transduction grammars. Dekai Wu, Karteek Addanki, Markus Saers |
| 2013 | Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations. Tomi Kinnunen, Md. Jahangir Alam, Pavel Matejka, Patrick Kenny, Jan Cernocký, Douglas D. O'Shaughnessy |
| 2013 | Frequency-adaptive post-filtering for intelligibility enhancement of narrowband telephone speech. Emma Jokinen, Marko Takanen, Paavo Alku |
| 2013 | From segmentation bootstrapping to transcription-to-word conversion. Uwe D. Reichel |
| 2013 | Functional data analysis of tongue articulation in palatal vowels: Gothenburg and Malmöhus Swedish /iː, yː, ʉ̟ː/. Susanne Schötz, Johan Frid, Lars Gustafsson, Anders Löfqvist |
| 2013 | G2P variant prediction techniques for ASR and STD. Marelie H. Davel, Charl Johannes van Heerden, Etienne Barnard |
| 2013 | GMM based speaker variability compensated system for INTERSPEECH 2013 ComParE emotion challenge. Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah, Haizhou Li |
| 2013 | Generalizing continuous-space translation of paralinguistic information. Takatomo Kano, Shinnosuke Takamichi, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura |
| 2013 | Generating fundamental frequency contours for speech synthesis in Yorùbá. Daniel R. van Niekerk, Etienne Barnard |
| 2013 | Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model. Oraphan Krityakien, Keikichi Hirose, Nobuaki Minematsu |
| 2013 | Generative modeling of speech F0 contours. Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Yasunori Ohishi, Kunio Kashino, Shigeki Sagayama |
| 2013 | Geometric contamination for GMM/UBM speaker verification in reverberant environments. Alessio Brutti, Maurizio Omologo |
| 2013 | Graph-based semi-supervised learning for phone and segment classification. Yuzong Liu, Katrin Kirchhoff |
| 2013 | Grapheme-to-phoneme conversion based on adaptive regularization of weight vectors. Keigo Kubo, Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura |
| 2013 | HMM-based TTS for Hanoi Vietnamese: issues in design and evaluation. Thi Thu Trang Nguyen, Christophe d'Alessandro, Albert Rilliard, Do Dat Tran |
| 2013 | HMM-based synthesis of creaky voice. Tuomo Raitio, John Kane, Thomas Drugman, Christer Gobl |
| 2013 | Handling recordings acquired simultaneously over multiple channels with PLDA. Jesús Antonio Villalba López, Mireia Díez, Amparo Varona, Eduardo Lleida |
| 2013 | Hardware/software codesign for mobile speech recognition. David Sheffield, Michael J. Anderson, Yunsup Lee, Kurt Keutzer |
| 2013 | Heuristic selection of training sentences from historical TV guide for semi-supervised LM adaptation. Harry M. Chang |
| 2013 | Hierarchical models based on a continuous acoustic space to identify phonological features. Javier Mikel Olaso, M. Inés Torres |
| 2013 | Hierarchical Pitman-Yor and Dirichlet process for language model. Jen-Tzung Chien, Ying-Lan Chang |
| 2013 | Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognition. Hsin-Ju Hsieh, Berlin Chen, Jeih-weih Hung |
| 2013 | How are word-final schwas different in the north and south of France? Rena Nemoto, Martine Adda-Decker |
| 2013 | How did it work? historic phonetic devices explained by coeval photographs. Rüdiger Hoffmann, Dieter Mehnert, Rolf Dietzel |
| 2013 | How do multiple sublexical cues converge in lexical segmentation? an artificial language learning study. Odile Bagou, Ulrich H. Frauenfelder |
| 2013 | How voicing, place and manner of articulation differently modulate event-related potentials associated with response inhibition. Nathalie Bedoin, Jennifer Krzonowski, Emmanuel Ferragne |
| 2013 | Human mouth state detection using low frequency ultrasound. Farzaneh Ahmadi, Mousa Ahmadi, Ian McLoughlin |
| 2013 | Human perception of alcoholic intoxication in speech. Barbara Baumeister, Florian Schiel |
| 2013 | Hybrid nearest-neighbor/cluster adaptive training for rapid speaker adaptation in statistical speech synthesis systems. Amir Mohammadi, Cenk Demiroglu |
| 2013 | I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry. Rosa González Hautamäki, Tomi Kinnunen, Ville Hautamäki, Timo Leino, Anne-Maria Laukkanen |
| 2013 | I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification. Rahim Saeidi, Kong-Aik Lee, Tomi Kinnunen, Tawfik Hasan, Benoit G. B. Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Jia Min Karen Kua, Changhuai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilçi, Billy Braithwaite, Rosa González Hautamäki, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, Navid Shokouhi, Driss Matrouf, Laurent El Shafey, Pejman Mowlaee, Julien Epps, Tharmarajah Thiruvaran, David A. van Leeuwen, Bin Ma, Haizhou Li, John H. L. Hansen, Jean-François Bonastre, Sébastien Marcel, John S. D. Mason, Eliathamby Ambikairajah |
| 2013 | Identification of gender from children's speech by computers and humans. Saeid Safavi, Peter Jancovic, Martin J. Russell, Michael J. Carey |
| 2013 | Identifying consonantal tasks via measures of tongue shaping: a real-time MRI investigation of the production of vocalized syllabic /l/ in American English. Caitlin Smith, Adam C. Lammert |
| 2013 | Identifying new bird species from differences in birdsong. Naomi Harte, Sadhbh Murphy, David J. Kelly, Nicola M. Marples |
| 2013 | Imitation interacts with one's second-language phonology but it does not operate cross-linguistically. Václav Jonás Podlipský, Sárka Simácková, Katerina Chládková |
| 2013 | Impact of noise reduction and spectrum estimation on noise robust speaker identification. Keith W. Godin, Seyed Omid Sadjadi, John H. L. Hansen |
| 2013 | Implicit learning leads to familiarity effects for intonation but not for voice. Ann-Kathrin Grohe, Bettina Braun |
| 2013 | Improved feature processing for deep neural networks. Shakti P. Rath, Daniel Povey, Karel Veselý, Jan Cernocký |
| 2013 | Improved models for automatic punctuation prediction for spoken and written text. Nicola Ueffing, Maximilian Bisani, Paul Vozila |
| 2013 | Improved unsupervised NAP training dataset design for speaker recognition. Hanwu Sun, Bin Ma |
| 2013 | Improvement of distant-talking speaker identification using bottleneck features of DNN. Takanori Yamada, Longbiao Wang, Atsuhiko Kai |
| 2013 | Improvement of speech intelligibility by reallocation of spectral energy. Reiko Takou, Nobumasa Seiyama, Atsushi Imai |
| 2013 | Improvements in language identification on the RATS noisy speech corpus. Jeff Z. Ma, Bing Zhang, Spyros Matsoukas, Sri Harish Reddy Mallidi, Feipeng Li, Hynek Hermansky |
| 2013 | Improvements to HMM-based speech synthesis based on parameter generation with rich context models. Shinnosuke Takamichi, Tomoki Toda, Yoshinori Shiga, Sakriani Sakti, Graham Neubig, Satoshi Nakamura |
| 2013 | Improving LVCSR with hidden conditional random fields for grapheme-to-phoneme conversion. Stefan Hahn, Patrick Lehnen, Simon Wiesler, Ralf Schlüter, Hermann Ney |
| 2013 | Improving grapheme-based ASR by probabilistic lexical modeling approach. Ramya Rasipuram, Mathew Magimai-Doss |
| 2013 | Improving language identification robustness to highly channel-degraded speech through multiple system fusion. Aaron Lawson, Mitchell McLaren, Yun Lei, Vikramjit Mitra, Nicolas Scheffer, Luciana Ferrer, Martin Graciarena |
| 2013 | Improving lightly supervised training for broadcast transcription. Yanhua Long, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Matthew Stephen Seigel, Philip C. Woodland |
| 2013 | Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training. Yajie Miao, Florian Metze |
| 2013 | Improving robustness to compressed speech in speaker recognition. Mitchell McLaren, Victor Abrash, Martin Graciarena, Yun Lei, Jan Pesán |
| 2013 | Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques. Ahilan Kanagasundaram, David Dean, Javier Gonzalez-Dominguez, Sridha Sridharan, Daniel Ramos, Joaquin Gonzalez-Rodriguez |
| 2013 | Improving speaker identification in TV-shows using person name detection in overlaid text and speech. Delphine Charlet, Corinne Fredouille, Géraldine Damnati, Grégory Senay |
| 2013 | Improving speech intelligibility in noise by SII-dependent preprocessing using frequency-dependent amplification and dynamic range compression. Henning F. Schepker, Jan Rennies, Simon Doclo |
| 2013 | Improving the PLDA based speaker verification in limited microphone data conditions. Ahilan Kanagasundaram, David Dean, Javier Gonzalez-Dominguez, Sridha Sridharan, Daniel Ramos, Joaquin Gonzalez-Rodriguez |
| 2013 | Improving the accuracy and the robustness of harmonic model for pitch estimation. Meysam Asgari, Izhak Shafran |
| 2013 | Improving unsupervised language model adaptation with discriminative data filtering. Shuangyu Chang, Michael Levit, Partha Parthasarathy, Benoît Dumoulin |
| 2013 | In-home detection of distress calls: the case of aged users. Frédéric Aman, Michel Vacher, Solange Rossato, François Portet |
| 2013 | In-vehicle destination entry by voice: practical aspects. Bart D'hoore, Alfred Wiesen |
| 2013 | Incorporating named entity recognition into the speech transcription process. Mohamed Hatmi, Christine Jacquin, Emmanuel Morin, Sylvain Meignier |
| 2013 | Incorporating proximity information for relevance language modeling in speech recognition. Yi-Wen Chen, Bo-Han Hao, Kuan-Yu Chen, Berlin Chen |
| 2013 | Increasing speech intelligibility via spectral shaping with frequency warping and dynamic range compression plus transient enhancement. Elizabeth Godoy, Yannis Stylianou |
| 2013 | Incremental acoustic subspace learning for voice activity detection using harmonicity-based features. Jiaxing Ye, Takumi Kobayashi, Masahiro Murakawa, Tetsuya Higuchi |
| 2013 | Incremental emotion recognition. Taniya Mishra, Dimitrios Dimitriadis |
| 2013 | Indexing multimedia documents with acoustic concept recognition lattices. Diego Castán, Murat Akbacak |
| 2013 | Individual differences of emotional expression in speaker's behavioral and autonomic responses. Yoshiko Arimoto, Kazuo Okanoya |
| 2013 | Inferring actor communities from videos. Sumit Negi, Ramnath Balasubramanyan, Santanu Chaudhury |
| 2013 | Infinite support vector machines in speech recognition. Jingzhou Yang, Rogier C. van Dalen, Mark J. F. Gales |
| 2013 | Information retrieval-based dynamic time warping. Xavier Anguera |
| 2013 | Information theoretic acoustic feature selection for acoustic-to-articulatory inversion. Prasanta Kumar Ghosh, Shrikanth S. Narayanan |
| 2013 | Information theoretic syllable structure and its relation to the c-center effect. Uwe D. Reichel |
| 2013 | Information-preserving temporal reallocation of speech in the presence of fluctuating maskers. Vincent Aubanel, Martin Cooke |
| 2013 | Informative spectro-temporal bottleneck features for noise-robust speech recognition. Shuo-Yiin Chang, Nelson Morgan |
| 2013 | Instance-based on-line language model adaptation. Ali Orkan Bayer, Giuseppe Riccardi |
| 2013 | Instantaneous harmonic representation of speech using multicomponent sinusoidal excitation. Elias Azarov, Maxim Vashkevich, Alexander A. Petrovsky |
| 2013 | Integer linear programming for speaker diarization and cross-modal identification in TV broadcast. Hervé Bredin, Johann Poignant |
| 2013 | Integrating conditional random fields and joint multi-gram model with syllabic features for grapheme-to-phone conversion. Xiaoxuan Wang, Khe Chai Sim |
| 2013 | Intelligibility at a multilingual cocktail party: effect of concurrent language knowledge. Aurore Gautreau, Michel Hoen, Fanny Meunier |
| 2013 | Intelligibility-enhancing speech modifications: the Hurricane Challenge. Martin Cooke, Catherine Mayo, Cassia Valentini-Botinhao |
| 2013 | Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection. Shiro Narumi, Kazuma Konno, Takuya Nakano, Yoshiaki Itoh, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee |
| 2013 | Inter-speaker variability in audio-visual classification of word prominence. Martin Heckmann |
| 2013 | Interacting with robots via speech and gestures, an integrated architecture. Francesco Cutugno, Alberto Finzi, Michelangelo Fiore, Enrico Leone, Silvia Rossi |
| 2013 | Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation. Wei Xue, Shan Liang, Wenju Liu |
| 2013 | Interpolation of acoustic models for speech recognition. Thiago Fraga-Silva, Jean-Luc Gauvain, Lori Lamel |
| 2013 | Intonational contrasts encode speaker's certainty in neutral vs. incredulity declarative questions in French. Amandine Michelas, Cristel Portes, Maud Champagne-Lavau |
| 2013 | Investigating fine temporal dynamics of prosodic and lexical accommodation. Francesca Bonin, Céline De Looze, Sucheta Ghosh, Emer Gilmartin, Carl Vogel, Anna Polychroniou, Hugues Salamin, Alessandro Vinciarelli, Nick Campbell |
| 2013 | Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopy. Gang Chen, Robin A. Samlan, Jody Kreiman, Abeer Alwan |
| 2013 | Investigating voice quality as a speaker-independent indicator of depression and PTSD. Stefan Scherer, Giota Stratou, Jonathan Gratch, Louis-Philippe Morency |
| 2013 | Investigation of MT-based ASR confusion models for semi-supervised discriminative language modeling. Erinç Dikici, Emily Tucker Prud'hommeaux, Brian Roark, Murat Saraçlar |
| 2013 | Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. Grégoire Mesnil, Xiaodong He, Li Deng, Yoshua Bengio |
| 2013 | Investigations on Hessian-free optimization for cross-entropy training of deep neural networks. Simon Wiesler, Jinyu Li, Jian Xue |
| 2013 | Is protrusion of French rounded vowels affected by prosodic positions? Laurianne Georgeton, Nicolas Audibert |
| 2013 | Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling? Marc Delcroix, Yotaro Kubo, Tomohiro Nakatani, Atsushi Nakamura |
| 2013 | Is the vowel length contrast in Japanese exaggerated in infant-directed speech? Keiichi Tajima, Kuniyoshi Tanaka, Andrew Martin, Reiko Mazuka |
| 2013 | IsNL? a discriminative approach to detect natural language like queries for conversational understanding. Asli Celikyilmaz, Gökhan Tür, Dilek Hakkani-Tür |
| 2013 | Iterative sinusoidal-based partial phase reconstruction in single-channel source separation. Mario Kaoru Watanabe, Pejman Mowlaee |
| 2013 | Joint noise cancellation and dereverberation using multi-channel linearly constrained minimum variance filter. Karan Nathwani, Rajesh M. Hegde |
| 2013 | Joint recognition and direction-of-arrival estimation of simultaneous meeting-room acoustic events. Rupayan Chakraborty, Climent Nadeu |
| 2013 | Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion. Ling-Hui Chen, Zhen-Hua Ling, Yan Song, Li-Rong Dai |
| 2013 | Joint stochastic-deterministic Wiener filtering with recursive Bayesian estimation of deterministic speech. Matthew C. McCallum, Bernard J. Guillemin |
| 2013 | KPCatcher - a keyphrase extraction system for enterprise videos. Yongxin Taylor Xi, Matthias Paulik, Venkata Ramana Rao Gadde, Ananth Sankar |
| 2013 | Knowledge integration for improving performance in LVCSR. Chen-Yu Chiang, Sabato Marco Siniscalchi, Sin-Horng Chen, Chin-Hui Lee |
| 2013 | L2 English learners' recognition of words spoken in familiar versus unfamiliar English accents. Jia Ying, Jason A. Shaw, Catherine T. Best |
| 2013 | L2 syntax acquisition: the effect of oral and written computer assisted practice. Polina Drozdova, Catia Cucchiarini, Helmer Strik |
| 2013 | LAPSyd: lyon-albuquerque phonological systems database. Ian Maddieson, Sébastien Flavier, Egidio Marsico, Christophe Coupé, François Pellegrino |
| 2013 | Language background affects the strength of the pitch bias in a duration discrimination task. Daniel Aalto, Juraj Simko, Martti Vainio |
| 2013 | Language modeling for mixed language speech recognition using weighted phrase extraction. Ying Li, Pascale Fung |
| 2013 | Language-independent call routing using the large margin estimation principle. Moataz El Ayadi, Mohamed Afify |
| 2013 | Language-universal speech audiometry with automated scoring. Bart Vaerenberg, Louis ten Bosch, Wojtek Kowalczyk, Martine Coene, Herwig De Smet, Paul J. Govaerts |
| 2013 | Large-scale characterization of Mandarin pronunciation errors made by native speakers of European languages. Nancy F. Chen, Vivaek Shivakumar, Mahesh Harikumar, Bin Ma, Haizhou Li |
| 2013 | Large-scale personal assistant technology deployment: the siri experience. Jerome R. Bellegarda |
| 2013 | Late reverberation suppression using MMSE modulation spectral estimation. Chenxi Zheng, Wai-Yip Chan |
| 2013 | Lattice-based training of bottleneck feature extraction neural networks. Matthias Paulik |
| 2013 | Laughter modulation: from speech to speech-laugh. Jieun Oh, Ge Wang |
| 2013 | Learning binaural spectrogram features for azimuthal speaker localization. Wiktor Mlynarski |
| 2013 | Learning speaker-specific pronunciations of disordered speech. Heidi Christensen, Phil D. Green, Thomas Hain |
| 2013 | Learning to imitate adult speech with the KLAIR virtual infant. Mark A. Huckvale, Amrita Sharma |
| 2013 | Let me finish: automatic conflict detection using speaker overlap. Félix Grèzes, Justin Richards, Andrew Rosenberg |
| 2013 | Leveraging knowledge graphs for web-scale unsupervised semantic parsing. Larry P. Heck, Dilek Hakkani-Tür, Gökhan Tür |
| 2013 | Leveraging locality for topic identification of conversational speech. Jonathan Wintrode |
| 2013 | Lexee: a cloud-based platform for building and deploying voice-enabled mobile applications. Dmitry Sityaev, Jonathan Hotz, Vadim Snitkovsky |
| 2013 | Lexical stress detection for L2 English speech using deep belief networks. Kun Li, Xiaojun Qian, Shiyin Kang, Helen Meng |
| 2013 | Lexical tone perception in Thai normal-hearing adults and those using hearing aids: a case study. Charturong Tantibundhit, Chutamanee Onsuwan, Nittayapa Klangpornkun, P. Phienphanich, Tanawan Saimai, Nantaporn Saimai, P. Pitathawatchai, Chai Wutiwiwatchai |
| 2013 | Lightly supervised discriminative training of grapheme models for improved sentence-level alignment of speech and text data. Adriana Stan, Peter Bell, Junichi Yamagishi, Simon King |
| 2013 | Lightly supervised training for risk-based discriminative language models. Akio Kobayashi, Takahiro Oku, Yuya Fujita, Shoei Sato |
| 2013 | Likelihood-ratio calibration using prior-weighted proper scoring rules. Niko Brümmer, George R. Doddington |
| 2013 | Linguistic disfluency in narrative speech: evidence from story-telling in 6-year olds. Ingrida Balciuniene |
| 2013 | Linking loudness increases in normal and lombard speech to decreasing vowel formant separation. Elizabeth Godoy, Catherine Mayo, Yannis Stylianou |
| 2013 | Locality sensitive hashing for fast computation of correlational manifold learning based feature space transformations. Vikrant Singh Tomar, Richard C. Rose |
| 2013 | Lombard modified text-to-speech synthesis for improved intelligibility: submission for the hurricane challenge 2013. Antti Suni, Reima Karhila, Tuomo Raitio, Mikko Kurimo, Martti Vainio, Paavo Alku |
| 2013 | Looking for lexical feedback effects in /tl/→/kl/ repairs. Pierre A. Hallé, Natalia Kartushina, Juan Segui, Ulrich H. Frauenfelder |
| 2013 | MINT.tools: tools and adaptors supporting acquisition, annotation and analysis of multimodal corpora. Spyros Kousidis, Thies Pfeiffer, David Schlangen |
| 2013 | MLP-HMM two-stage unsupervised training for low-resource languages on conversational telephone speech recognition. Yanmin Qian, Jia Liu |
| 2013 | MODIS: an audio motif discovery software. Laurence Catanese, Nathan Souviraà-Labastie, Bingqing Qu, Sébastien Campion, Guillaume Gravier, Emmanuel Vincent, Frédéric Bimbot |
| 2013 | Machine learning of probabilistic phonological pronunciation rules from the Italian CLIPS corpus. Florian Schiel, Mary Stevens, Uwe D. Reichel, Francesco Cutugno |
| 2013 | Manual and automatic tone annotation: the case of an endangered language from north vietnam "mo piu". Geneviève Caelen-Haumont, Katarina Bartkova |
| 2013 | Markers of confidence and correctness in spoken medical narratives. Kathryn Womack, Cecilia Ovesdotter Alm, Cara Calvelli, Jeff B. Pelz, Pengcheng Shi, Anne R. Haake |
| 2013 | Measuring laryngealization in running speech: interaction with contrastive tones in yalálag zapotec. Leonardo Lancia, Heriberto Avelino, Daniel Voigt |
| 2013 | Melody metrics for prosodic typology: comparing English, French and Chinese. Daniel Hirst |
| 2013 | Merging human and automatic system decisions to improve speaker recognition performance. Rosa González Hautamäki, Ville Hautamäki, Padmanabhan Rajan, Tomi Kinnunen |
| 2013 | Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speech. Olivier Galibert |
| 2013 | Minimax i-vector extractor for short duration speaker verification. Ville Hautamäki, You-Chi Cheng, Padmanabhan Rajan, Chin-Hui Lee |
| 2013 | Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis. Ranniery Maia, Mark J. F. Gales, Yannis Stylianou, Masami Akamine |
| 2013 | Mixtures of Bayesian joint factor analyzers for noise robust automatic speech recognition. Xiaodong Cui, Vaibhava Goel, Brian Kingsbury |
| 2013 | Model order estimation using Bayesian NMF for discovering phone patterns in spoken utterances. Sayeh Mirzaei, Hugo Van hamme, Yaser Norouzi |
| 2013 | Model-based Bayesian reinforcement learning for dialogue management. Pierre Lison |
| 2013 | Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noise. Masakiyo Fujimoto, Tomohiro Nakatani |
| 2013 | Modeling durational incompressibility. Andreas Windmann, Juraj Simko, Britta Wrede, Petra Wagner |
| 2013 | Modeling postcolonial language varieties: challenges and lessons learned from mozambican Portuguese. Simone Ashby, Sílvia Barbosa, Catarina Silva, Paulino Fumo, José Pedro Ferreira |
| 2013 | Modeling prosodic sequences with k-means and dirichlet process GMMs. Andrew Rosenberg |
| 2013 | Modeling spectral variability for the classification of depressed speech. Nicholas Cummins, Julien Epps, Vidhyasaharan Sethu, Michael Breakspear, Roland Goecke |
| 2013 | Modeling therapist empathy and vocal entrainment in drug addiction counseling. Bo Xiao, Panayiotis G. Georgiou, Zac E. Imel, David C. Atkins, Shrikanth S. Narayanan |
| 2013 | Modelling and estimation of the fundamental frequency of speech using a hidden Markov model. John H. Taylor, Ben Milner |
| 2013 | Modified cepstral mean normalization - transforming to utterance specific non-zero mean. Vikas Joshi, N. Vishnu Prasad, Srinivasan Umesh |
| 2013 | Modular combination of deep neural networks for acoustic modeling. Jonas Gehring, Wonkyum Lee, Kevin Kilgour, Ian R. Lane, Yajie Miao, Alex Waibel |
| 2013 | Modulation features for noise robust speaker identification. Vikramjit Mitra, Mitchell McLaren, Horacio Franco, Martin Graciarena, Nicolas Scheffer |
| 2013 | Monaural speech segregation based on pitch track correction using an ensemble kalman filter. Han-Gyu Kim, Gil-Jin Jang, Jeong-Sik Park, Yung-Hwan Oh |
| 2013 | Monitoring the effects of temporal clipping on VoIP speech quality. Andrew Hines, Jan Skoglund, Anil C. Kokaram, Naomi Harte |
| 2013 | Mora-based pre-low raising in Japanese pitch accent. Albert Lee, Yi Xu, Santitham Prom-on |
| 2013 | Morpheme level hierarchical pitman-yor class-based language models for LVCSR of morphologically rich languages. Amr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney |
| 2013 | Motivational feedback in crowdsourcing: a case study in speech transcription. Giuseppe Riccardi, Arindam Ghosh, S. A. Chowdhury, Ali Orkan Bayer |
| 2013 | Multi-band long-term signal variability features for robust voice activity detection. Andreas Tsiartas, Theodora Chaspari, Nassos Katsamanis, Prasanta Kumar Ghosh, Ming Li, Maarten Van Segbroeck, Alexandros Potamianos, Shrikanth S. Narayanan |
| 2013 | Multi-centroidal duration generation algorithm for HMM-based TTS. Yongguo Kang, Jian Li, Yan Deng, Miaomiao Wang |
| 2013 | Multi-domain neural network language model. Tanel Alumäe |
| 2013 | Multi-layer mutually reinforced random walk with hidden parameters for improved multi-party meeting summarization. Yun-Nung Chen, Florian Metze |
| 2013 | Multi-session PLDA scoring of i-vector for partially open-set speaker detection. Kong-Aik Lee, Anthony Larcher, Chang Huai You, Bin Ma, Haizhou Li |
| 2013 | Multi-stream recognition of noisy speech with performance monitoring. Ehsan Variani, Feipeng Li, Hynek Hermansky |
| 2013 | Multilingual hierarchical MRASTA features for ASR. Zoltán Tüske, Ralf Schlüter, Hermann Ney |
| 2013 | Multilingual multilayer perceptron for rapid language adaptation between and across language families. Ngoc Thang Vu, Tanja Schultz |
| 2013 | Multilingual web conferencing using speech-to-speech translation. John Chen, Shufei Wen, Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore |
| 2013 | Multiple topic identification in telephone conversations. Xavier Bost, Marc El-Bèze, Renato De Mori |
| 2013 | Musical noise analysis for Bayesian minimum mean-square error speech amplitude estimators based on higher-order statistics. Hiroshi Saruwatari, Suzumi Kanehara, Ryoichi Miyazaki, Kiyohiro Shikano, Kazunobu Kondo |
| 2013 | Mutual intelligibility of American, Chinese and Dutch-accented speakers of English tested by SUS and SPIN sentences. Hongyan Wang, Vincent J. van Heuven |
| 2013 | N-best rescoring by phoneme classifiers using subclass adaboost algorithm. Hiroshi Fujimura, Yusuke Shinohara, Takashi Masuko |
| 2013 | NMF-based temporal feature integration for acoustic event classification. Jimmy Ludeña-Choez, Ascensión Gallardo-Antolín |
| 2013 | Native English listeners' perceptions of prosody in L1 and L2 reading. Caroline L. Smith, Paul Edmunds |
| 2013 | Native accent classification via i-vectors and speaker compensation fusion. Andrea DeMarco, Stephen J. Cox |
| 2013 | Naturalness judgement of L2 Mandarin Chinese - does timing matter? Chiharu Tsurutani, Dean Luo |
| 2013 | Neural network acoustic models for the DARPA RATS program. Hagen Soltau, Hong-Kwang Kuo, Lidia Mangu, George Saon, Tomás Beran |
| 2013 | New cosine similarity scorings to implement gender-independent speaker verification. Mohammed Senoussaoui, Patrick Kenny, Pierre Dumouchel, Najim Dehak |
| 2013 | New parameters for automatic speech recognition based on the mammalian cochlea model using resonance analysis. José Luis Oropeza Rodríguez |
| 2013 | Noise adaptive training for subspace Gaussian mixture models. Liang Lu, Arnab Ghoshal, Steve Renals |
| 2013 | Noise robust speaker verification with delta cepstrum normalization. Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi |
| 2013 | Non-canonical syntactic structures in discourse: tonality, tonicity and tones in English (semi-)spontaneous speech. Laetitia Leonarduzzi, Sophie Herment |
| 2013 | Non-linguistic vocalisation recognition based on hybrid GMM-SVM approach. Artur Janicki |
| 2013 | Non-negative matrix factorization with linear constraints for single-channel speech enhancement. Nikolay Lyubimov, Mikhail Kotov |
| 2013 | Non-negative tensor factorisation of modulation spectrograms for monaural sound source separation. Tom Barker, Tuomas Virtanen |
| 2013 | Nonlinear prediction of speech signal using volterra-wiener series. Hemant A. Patil, Tanvina B. Patel |
| 2013 | Notes on so-called inter-speaker difference in spontaneous speech: the case of Japanese voiced obstruent. Kikuo Maekawa |
| 2013 | Nuance - Politecnico di torino's 2012 NIST speaker recognition evaluation system. Daniele Colibro, Claudio Vair, Kevin Farrell, Nir Krause, Gennady Karvitsky, Sandro Cumani, Pietro Laface |
| 2013 | Observations of perseverative coarticulation in lateral approximants using MRI. Nicole Wong, Maojing Fu, Zhi-Pei Liang, Ryan Shosted, Bradley P. Sutton |
| 2013 | On leveraging conversational data for building a text dependent speaker verification system. Hagai Aronowitz, Oren Barkan |
| 2013 | On the calibration and fusion of heterogeneous spoken term detection systems. Alberto Abad, Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Germán Bordel |
| 2013 | On the computation of document frequency statistics from spoken corpora using factor automata. Dogan Can, Shrikanth S. Narayanan |
| 2013 | On the enhancement of dereverberation algorithms based on a perceptual evaluation criterion. Thiago de M. Prego, Amaro A. de Lima, Sergio L. Netto |
| 2013 | On the evaluation of inversion mapping performance in the acoustic domain. Korin Richmond, Zhen-Hua Ling, Junichi Yamagishi, Benigno Uria |
| 2013 | On the feasibility of using pupil diameter to estimate cognitive load changes for in-vehicle spoken dialogues. Andrew L. Kun, Oskar Palinko, Zeljko Medenica, Peter A. Heeman |
| 2013 | On the improvement of multimodal voice activity detection. Matt Burlick, Dimitrios Dimitriadis, Eric Zavesky |
| 2013 | On the relation between intonational phrasing and pitch accent distribution. evidence from European Portuguese varieties. Marisa Cruz, Sónia Frota |
| 2013 | On the robustness of distributed EM based BSS in asynchronous distributed microphone array scenarios. Yasufumi Uezu, Keisuke Kinoshita, Mehrez Souden, Tomohiro Nakatani |
| 2013 | On the robustness of some acoustic parameters for signalling word stress across styles in Brazilian Portuguese. Plínio A. Barbosa, Anders Eriksson, Joel Åkesson |
| 2013 | On the role of L1 speech production in L2 perception: evidence from Spanish learners of French. Natalia Kartushina, Ulrich H. Frauenfelder |
| 2013 | On von-mises fisher mixture model in text-independent speaker identification. Jalil Taghia, Zhanyu Ma, Arne Leijon |
| 2013 | On why Japanese /r/ sounds are difficult for children to acquire. Takayuki Arai |
| 2013 | On-line audio dilation for human interaction. John S. Novak III, Jason Archer, Valeriy Shafiro, Robert V. Kenyon, Jason Leigh |
| 2013 | On-line learning of lexical items and grammatical constructions via speech, gaze and action-based human-robot interaction. Grégoire Pointeau, Maxime Petit, Xavier Hinaut, Guillaume Gibert, Peter Ford Dominey |
| 2013 | Optimization of sigmoidal rate-level function based on acoustic features. Víctor Poblete, Néstor Becerra Yoma, Richard M. Stern |
| 2013 | Optimizations and fitting procedures for the liljencrants-fant model for statistical parametric speech synthesis. Prasanna Kumar Muthukumar, Alan W. Black, H. Timothy Bunnell |
| 2013 | Paralinguistic event detection from speech using probabilistic time-series smoothing and masking. Rahul Gupta, Kartik Audhkhasi, Sungbok Lee, Shrikanth S. Narayanan |
| 2013 | Parallel absolute-relative feature based phonotactic language recognition. Weiwei Liu, Wei-Qiang Zhang, Zhiyi Li, Jia Liu |
| 2013 | Parameter clustering for temporally varying weight regression for automatic speech recognition. Shilin Liu, Khe Chai Sim |
| 2013 | Paraphrase features to improve natural language understanding. Xiaohu Liu, Ruhi Sarikaya, Chris Brockett, Chris Quirk, William B. Dolan |
| 2013 | Particle swarm optimisation of spoken dialogue system strategies. Lucie Daubigney, Matthieu Geist, Olivier Pietquin |
| 2013 | Perceived prosodic correlates of smiled speech in spontaneous data. Caroline Émond, Lucie Ménard, Marty Laforest |
| 2013 | Perceived vocal attractiveness across dialects is similar but not uniform. Molly Babel, Grant McGuire |
| 2013 | Perceiving speech rate differences between natural and time-scale modified utterances. Hartmut R. Pfitzinger, Hansjörg Mixdorff |
| 2013 | Perception and production of Italian vowels: an ERP study. Anna Dora Manca, Mirko Grimaldi |
| 2013 | Perception of English minimal pairs in noise by Japanese listeners: does clear speech for L2 listeners help? Shinichi Tokuma, Won Tokuma |
| 2013 | Perception of glottalization in varying pitch contexts across languages. Maria Paola Bissiri, Margaret Zellers |
| 2013 | Perceptual interference between regional accent and voice/speech disorders. Alain Ghio, Médéric Gasquet-Cyrus, Juliette Roquel, Antoine Giovanni |
| 2013 | Perceptual, acoustic and electroglottographic correlates of 3 aggressive attitudes in French: a pilot study. Charlotte Kouklia, Nicolas Audibert |
| 2013 | Performance of the MVOCA silent speech interface across multiple speakers. Robin Hofe, Jie Bai, Lam Aun Cheah, Stephen R. Ell, James M. Gilbert, Roger K. Moore, Phil D. Green |
| 2013 | Periodicity extraction for voiced sounds with multiple periodicity. Masanori Morise, Hideki Kawahara, Kenji Ozawa |
| 2013 | Person identification using biometric markers from footsteps sound. M. Umair Bin Altaf, Taras Butko, Biing-Hwang Juang |
| 2013 | Person name spotting by combining acoustic matching and LDA topic models. Grégory Senay, Benjamin Bigot, Richard Dufour, Georges Linarès, Corinne Fredouille |
| 2013 | Phase-aware single-channel speech enhancement. Pejman Mowlaee, Mario Kaoru Watanabe, Rahim Saeidi |
| 2013 | Phone duration modeling using clustering of rich contexts. Tanel Alumäe, Rena Nemoto |
| 2013 | Phonetic convergence in shadowed speech: a comparison of perceptual and acoustic measures. Jennifer S. Pardo |
| 2013 | Phonetic manifestation and influence of zero anaphora in Chinese reading texts. Luying Hou, Yuan Jia, Aijun Li |
| 2013 | Photo-realistic expressive text to talking head synthesis. Vincent Wan, Robert Anderson, Art Blokland, Norbert Braunschweiler, Langzhou Chen, BalaKrishna Kolluru, Javier Latorre, Ranniery Maia, Björn Stenger, Kayoko Yanagisawa, Yannis Stylianou, Masami Akamine, Mark J. F. Gales, Roberto Cipolla |
| 2013 | Physical models of the vocal tract with a flapping tongue for flap and liquid sounds. Takayuki Arai |
| 2013 | Physics-based synthesis of disordered voices. Jorge C. Lucero, Jean Schoentgen, Mara Behlau |
| 2013 | Pitch and duration as a basis for entrainment of overlapped speech onsets. Marcin Wlodarczak, Juraj Simko, Petra Wagner |
| 2013 | Pitch and lengthening as cues to turn transition in Swedish. Margaret Zellers |
| 2013 | Pitch pattern variations in three regional varieties of American English. Robert Allen Fox, Ewa Jacewicz, Jessica Hart |
| 2013 | Pitch synchronous spectral analysis for a pitch dependent recognition of voiced phonemes - PISAR. Hans-Günter Hirsch |
| 2013 | Pitch-gesture modeling using subband autocorrelation change detection. Malcolm Slaney, Elizabeth Shriberg, Jui-Ting Huang |
| 2013 | Place assimilation and articulatory strategies: the case of sibilant sequences in French as L1 and L2. Sonia D'Apolito, Barbara Gili Fivela |
| 2013 | Pre-initialized composition for large-vocabulary speech recognition. Cyril Allauzen, Michael Riley |
| 2013 | Predicting speech quality based on interactivity and delay. Alexander Raake, Katrin Schoenenberg, Janto Skowronek, Sebastian Egger |
| 2013 | Predicting the bilateral advantage in cochlear implantees using a non-intrusive speech intelligibility measure. Stefano Cosentino, Tiago H. Falk, David McAlpine |
| 2013 | Predicting the quality of text-to-speech systems from a large-scale feature set. Florian Hinterleitner, Christoph Norrenbrock, Sebastian Möller, Ulrich Heute |
| 2013 | Prediction of intelligibility of noisy and time-frequency weighted speech based on mutual information between amplitude envelopes. Jesper Jensen, Cees H. Taal |
| 2013 | Prediction of strategy and outcome as negotiation unfolds by using basic verbal and behavioral features. Elnaz Nouri, Sunghyun Park, Stefan Scherer, Jonathan Gratch, Peter J. Carnevale, Louis-Philippe Morency, David R. Traum |
| 2013 | Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system. Yujing Si, Qingqing Zhang, Ta Li, Jielin Pan, Yonghong Yan |
| 2013 | Presentational focus realisation in nalbaria variety of assamese. Shakuntala Mahanta, A. I. Twaha |
| 2013 | Preservation of speech spectral dynamics enhances intelligibility. Petko Nikolov Petkov, W. Bastiaan Kleijn |
| 2013 | Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequence. Tatsuma Ishihara, Hirokazu Kameoka, Kota Yoshizato, Daisuke Saito, Shigeki Sagayama |
| 2013 | Probabilistic trainable segmenter for call center audio using multiple features. Nina Zinovieva, Xiaodan Zhuang, Pat Peterson, Joe Alwan, Rohit Prasad |
| 2013 | Processing of /i/ and /u/ in Italian cochlear-implant children: a behavioral and neurophysiologic study. Luigia Garrapa, Davide Bottari, Mirko Grimaldi, Francesco Pavani, Andrea Calabrese, Michele De Benedetto, Silvano Vitale |
| 2013 | Production and perception of pseudo-V1CV2 outside the vowel triangle: speech illusion effects. Thi Anh Xuan Tran, Viet Son Nguyen, Eric Castelli, René Carré |
| 2013 | Production of estonian quantity contrasts by native speakers of Finnish. Einar Meister, Lya Meister |
| 2013 | Production training in second language acquisition: a comparison between objective measures and subjective judgments. Véronique Delvaux, Kathy Huet, Myriam Piccaluga, Bernard Harmegnies |
| 2013 | Progress and prospects for speech technology: what ordinary people think. Roger K. Moore |
| 2013 | Pronunciation errors by Spanish learners of Dutch: a data-driven study for ASR-based pronunciation training. Pepi Burgos, Catia Cucchiarini, Roeland van Hout, Helmer Strik |
| 2013 | Prosodic changes pre-announcing a syntactic completion point in Japanese utterance. Yuichi Ishimoto, Mika Enomoto, Hitoshi Iida |
| 2013 | Prosodic cues of sarcastic speech in French: slower, higher, wider. Hélène Loevenbruck, Mohamed Ameur Ben Jannet, Mariapaola D'Imperio, Mathilde Spini, Maud Champagne-Lavau |
| 2013 | Prosodic encoding of declarative, interrogative and imperative sentences in jaminjung, a language of australia. Candide Simard |
| 2013 | Prosodic markings of semantic predictability in taiwan Mandarin. Po-jen Hsieh |
| 2013 | Prosody of contrastive focus in estonian. Heete Sahkai, Mari-Liis Kalvik, Meelis Mihkla |
| 2013 | Quality assessment of asymmetric multiparty telephone conferences: a systematic method from technical degradations to perceived impairments. Janto Skowronek, Julian Herlinghaus, Alexander Raake |
| 2013 | Quantifying cross-linguistic variation in grapheme-to-phoneme mapping. Martine Coene, Annemiek Hammer, Wojtek Kowalczyk, Louis ten Bosch, Bart Vaerenberg, Paul Govaerts |
| 2013 | Quasi closed phase analysis for glottal inverse filtering. Manu Airaksinen, Brad H. Story, Paavo Alku |
| 2013 | R-norm: improving inter-speaker variability modelling at the score level via regression score normalisation. David Vandyke, Michael Wagner, Roland Goecke |
| 2013 | ROCme! software for the recording and management of speech corpora. Emmanuel Ferragne, Sébastien Flavier, Christian Fressard |
| 2013 | Random subset feature selection in automatic recognition of developmental disorders, affective states, and level of conflict from speech. Okko Räsänen, Jouni Pohjalainen |
| 2013 | Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition. Ossama Abdel-Hamid, Hui Jiang |
| 2013 | Reactive accent interpolation through an interactive map application. Maria Astrinaki, Junichi Yamagishi, Simon King, Nicolas D'Alessandro, Thierry Dutoit |
| 2013 | Real-time and non-real-time voice conversion systems with web interfaces. Elias Azarov, Maxim Vashkevich, Denis Likhachov, Alexander A. Petrovsky |
| 2013 | Real-time control of a 2D animation model of the vocal tract using optopalatography. Simon Preuß, Christiane Neuschaefer-Rube, Peter Birkholz |
| 2013 | Real-time voice conversion using artificial neural networks with rectified linear units. Elias Azarov, Maxim Vashkevich, Denis Likhachov, Alexander A. Petrovsky |
| 2013 | Realisation of tonal alignment in the English of Japanese-English late bilinguals. Calbert Graham, Brechtje Post |
| 2013 | Recent evolution of non-standard consonantal variants in French broadcast news. Maria Candea, Martine Adda-Decker, Lori Lamel |
| 2013 | Recognizing words across regional accents: the role of perceptual assimilation in lexical competition. Catherine T. Best, Jason A. Shaw, Elizabeth Clancy |
| 2013 | Reconstruction of continuous voiced speech from whispers. Ian Vince McLoughlin, Jingjie Li, Yan Song |
| 2013 | Recurrent neural network based language model personalization by social network crowdsourcing. Tsung-Hsien Wen, Aaron Heidel, Hung-yi Lee, Yu Tsao, Lin-Shan Lee |
| 2013 | Recurrent neural networks for language understanding. Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi, Dong Yu |
| 2013 | Rediscovering 25 years of discoveries in spoken language processing: a preliminary ISCA archive analysis. Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Marine Delaborde |
| 2013 | Reexamine the sandhi rules and the merging tones in hakka language. Shao-Ren Lyu, Ho-Hsien Pan |
| 2013 | Refining sentence similarity with discourse information in dialog system. Sangkeun Jung, Seung-Hoon Na |
| 2013 | Refr: an open-source reranker framework. Daniel M. Bikel, Keith B. Hall |
| 2013 | Regional accents affect speech intelligibility in a multitalker environment. Ewa Jacewicz, Robert Allen Fox |
| 2013 | Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition. Md. Jahangir Alam, Patrick Kenny, Douglas D. O'Shaughnessy |
| 2013 | Regularized subspace n-gram model for phonotactic ivector extraction. Mehdi Soufifar, Lukás Burget, Oldrich Plchot, Sandro Cumani, Jan Cernocký |
| 2013 | Relative error bounds for statistical classifiers based on the f-divergence. Markus Nußbaum-Thom, Eugen Beck, Tamer Alkhouli, Ralf Schlüter, Hermann Ney |
| 2013 | Relevance-weighted-reconstruction of articulatory features in deep-neural-network-based acoustic-to-articulatory mapping. Claudia Canevari, Leonardo Badino, Luciano Fadiga, Giorgio Metta |
| 2013 | Rephrasing-based speech intelligibility enhancement. Mengqiu Zhang, Petko Nikolov Petkov, W. Bastiaan Kleijn |
| 2013 | Resistance is futile - the intonation between continuation rise and calling contour in German. Oliver Niebuhr |
| 2013 | Restoration of clipped signals with application to speech recognition. Shay Maymon, Etienne Marcheret, Vaibhava Goel |
| 2013 | Restructuring of deep neural network acoustic models with singular value decomposition. Jian Xue, Jinyu Li, Yifan Gong |
| 2013 | Reverberant speech recognition based on denoising autoencoder. Takaaki Ishii, Hiroki Komiyama, Takahiro Shinozaki, Yasuo Horiuchi, Shingo Kuroiwa |
| 2013 | Revisiting pitch slope and height effects on perceived duration. Carlos Gussenhoven, Wencui Zhou |
| 2013 | Rhythm analysis of second-language speech through low-frequency auditory features. Jin Jin, Joseph Tepperman |
| 2013 | Robust and accurate features for detecting and diagnosing autism spectrum disorders. Meysam Asgari, Alireza Bayestehtashk, Izhak Shafran |
| 2013 | Robust audio-codebooks for large-scale event detection in consumer videos. Shourabh Rawat, Peter F. Schulam, Susanne Burger, Duo Ding, Yipei Wang, Florian Metze |
| 2013 | Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis. Tomohiro Nagata, Hiroki Mori, Takashi Nose |
| 2013 | Robust formant detection using group delay function and stabilized weighted linear prediction. Dhananjaya N. Gowda, Jouni Pohjalainen, Mikko Kurimo, Paavo Alku |
| 2013 | Robust speaker recognition using spectro-temporal autoregressive models. Sri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky |
| 2013 | Robust speech enhancement techniques for ASR in non-stationary noise and dynamic environments. Gang Liu, Dimitrios Dimitriadis, Enrico Bocchieri |
| 2013 | SII-based speech preprocessing for intelligibility improvement in noise. Cees H. Taal, Jesper Jensen |
| 2013 | SMASH: a tool for articulatory data processing and analysis. Jordan R. Green, Jun Wang, David L. Wilson |
| 2013 | Salento Italian listeners' perception of American English vowels. Bianca Sisinni, Paola Escudero, Mirko Grimaldi |
| 2013 | Same same but different - an acoustical comparison of the automatic segmentation of high quality and mobile telephone speech. Christoph Draxler, Hanna S. Feiser |
| 2013 | Secure binary embeddings of front-end factor analysis for privacy preserving speaker verification. José Portelo, Alberto Abad, Bhiksha Raj, Isabel Trancoso |
| 2013 | Security evaluation of i-vector based speaker verification systems against hill-climbing attacks. Marta Gomez-Barrero, Javier Gonzalez-Dominguez, Javier Galbally, Joaquin Gonzalez-Rodriguez |
| 2013 | Selective use of gaze information to improve ASR performance in noisy environments by cache-based class language model adaptation. Ao Shen, Neil Cooke, Martin J. Russell |
| 2013 | Self-taught assistive vocal interfaces: an overview of the ALADIN project. Jort F. Gemmeke, Bart Ons, Netsanet M. Tessema, Hugo Van hamme, Janneke van de Loo, Guy De Pauw, Walter Daelemans, Jonathan Huyghe, Jan Derboven, Lode Vuegen, Bert Van Den Broeck, Peter Karsmakers, Bart Vanrumste |
| 2013 | Semantic parsing using word confusion networks with conditional random fields. Gökhan Tür, Anoop Deoras, Dilek Hakkani-Tür |
| 2013 | Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration. Yan Huang, Dong Yu, Yifan Gong, Chaojun Liu |
| 2013 | Semi-supervised manifold learning approaches for spoken term verification. Atta Norouzian, Richard C. Rose, Aren Jansen |
| 2013 | Sentiment analysis of online spoken reviews. Verónica Pérez-Rosas, Rada Mihalcea |
| 2013 | Sequence-discriminative training of deep neural networks. Karel Veselý, Arnab Ghoshal, Lukás Burget, Daniel Povey |
| 2013 | Sequential model adaptation for speaker verification. Jun Wang, Dong Wang, Xiaojun Wu, Thomas Fang Zheng, Javier Tejedor |
| 2013 | Show me what you listen to! auditory classification images can reveal the processing of fine acoustic cues during speech categorization. Léo Varnet, Kenneth Knoblauch, Fanny Meunier, Michel Hoen |
| 2013 | Significance of instants of significant excitation for source modeling. Nagaraj Adiga, S. R. M. Prasanna |
| 2013 | Significance of variable height-bandwidth group delay filters in the spectral reconstruction of speech. Devanshu Arya, Anant Raj, Rajesh M. Hegde |
| 2013 | Simple4All. Robert A. J. Clark |
| 2013 | Simple, lexicalized choice of translation timing for simultaneous speech translation. Tomoki Fujita, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura |
| 2013 | Simultaneous perturbation stochastic approximation for automatic speech recognition. Daniel Stein, Jochen Schwenninger, Michael Stadtschnitzer |
| 2013 | Social face to face communication - American English attitudinal prosody. Albert Rilliard, Donna Erickson, Takaaki Shochi, João Antônio de Moraes |
| 2013 | Some issues affecting the transcription of Hungarian broadcast audio. Anindya Roy, Lori Lamel, Thiago Fraga-Silva, Jean-Luc Gauvain, Ilya Oparin |
| 2013 | Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions. Thomas Hueber, Gérard Bailly, Pierre Badin, Frédéric Elisei |
| 2013 | Speaker and noise independent voice activity detection. François G. Germain, Dennis L. Sun, Gautham J. Mysore |
| 2013 | Speaker dependent activation keyword detector based on GMM-UBM. Evelyn Kurniawati, Sapna George |
| 2013 | Speaker separation using visual speech features and single-channel audio. Faheem Khan, Ben Milner |
| 2013 | Speaker verification based on fusion of acoustic and articulatory information. Ming Li, Jangwon Kim, Prasanta Kumar Ghosh, Vikram Ramanarayanan, Shrikanth S. Narayanan |
| 2013 | Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases. Jani Nurminen, Hanna Silén, Moncef Gabbouj |
| 2013 | Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices. Ching-Feng Yeh, Hung-yi Lee, Lin-Shan Lee |
| 2013 | Spectral modulation sensitivity based perceptual acoustic echo cancellation. Wei-Lun Chuang, Kah-Meng Cheong, Chung-Chien Hsu, Tai-Shih Chi |
| 2013 | Spectro-temporal directional derivative features for automatic speech recognition. James Gibson, Maarten Van Segbroeck, Antonio Ortega, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2013 | Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation. Tse-En Lin, Chung-Chien Hsu, Yi-Cheng Chen, Jian-Hueng Chen, Tai-Shih Chi |
| 2013 | Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separation. Emad M. Grais, Hakan Erdogan |
| 2013 | Speech acoustic unit segmentation using hierarchical Dirichlet processes. Amir Hossein Harati Nejad Torbati, Joseph Picone, Marc Sobel |
| 2013 | Speech activity detection on YouTube using deep neural networks. Neville Ryant, Mark Y. Liberman, Jiahong Yuan |
| 2013 | Speech enhancement based on deep denoising autoencoder. Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori |
| 2013 | Speech enhancement using compressed sensing. Vinayak Abrol, Pulkit Sharma, Anil Kumar Sao |
| 2013 | Speech enhancement using convolutive nonnegative matrix factorization with cosparsity regularization. Majid Mirbagheri, Yanbo Xu, Sahar Akram, Shihab A. Shamma |
| 2013 | Speech enhancement with weighted denoising auto-encoder. Bingyin Xia, Changchun Bao |
| 2013 | Speech planning as an index of speech motor control maturity. Guillaume Barbier, Pascal Perrier, Lucie Ménard, Yohan Payan, Mark K. Tiede, Joseph S. Perkell |
| 2013 | Speech quality prediction for artificial bandwidth extension algorithms. Sebastian Möller, Emilia Kelaidi, Friedemann Köster, Nicolas Côté, Patrick Bauer, Tim Fingscheidt, Thomas Schlien, Hannu Pulakka, Paavo Alku |
| 2013 | Speech spectrum restoration based on conditional restricted Boltzmann machine. Xugang Lu, Shigeki Matsuda, Chiori Hori |
| 2013 | Speechmark acoustic landmark tool: application to voice pathology. Suzanne Boyce, Marisha Speights, Keiko Ishikawa, Joel MacAuslan |
| 2013 | Speed up of recurrent neural network language models with sentence independent subsampling stochastic gradient descent. Yangyang Shi, Mei-Yuh Hwang, Kaisheng Yao, Martha A. Larson |
| 2013 | Spontaneous and explicit speech imitation. Jeesun Kim, Ruben Demirdjian, Chris Davis |
| 2013 | Spoofing and countermeasures for automatic speaker verification. Nicholas W. D. Evans, Tomi Kinnunen, Junichi Yamagishi |
| 2013 | Stable articulatory tasks and their variable formation: Tamil retroflex consonants. Caitlin Smith, Michael I. Proctor, Khalil Iskarous, Louis Goldstein, Shrikanth S. Narayanan |
| 2013 | Standoff speaker recognition: effects of recording distance mismatch on speaker recognition system performance. Mike Fowler, Mark McCurry, Jonathan Bramsen, Kehinde Dunsin, Jeremiah Remus |
| 2013 | Statistical nonparametric speech synthesis using sparse Gaussian processes. Tomoki Koriyama, Takashi Nose, Takao Kobayashi |
| 2013 | Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noise. Daniel Erro, Tudor-Catalin Zorila, Yannis Stylianou, Eva Navas, Inma Hernáez |
| 2013 | Stochastic-deterministic signal modelling for the tracking of pitch in noise and speech mixtures using factorial HMMs. Matthew C. McCallum, Bernard J. Guillemin |
| 2013 | Strategies for high accuracy keyword detection in noisy channels. Arindam Mandal, Julien van Hout, Yik-Cheung Tam, Vikramjit Mitra, Yun Lei, Jing Zheng, Dimitra Vergyri, Luciana Ferrer, Martin Graciarena, Andreas Kathol, Horacio Franco |
| 2013 | Stream selection and integration in multistream ASR using GMM-based performance monitoring. Tetsuji Ogawa, Feipeng Li, Hynek Hermansky |
| 2013 | Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion. Patrick Lehnen, Alexandre Allauzen, Thomas Lavergne, François Yvon, Stefan Hahn, Hermann Ney |
| 2013 | Study of coarticulation and F2 transitions in French and Italian adult stutterers. Marine Verdurand, Solange Rossato, Lionel Granjon, Daria Balbo, Claudio Zmarich |
| 2013 | Subspace models for bottleneck features. Jun Qi, Dong Wang, Javier Tejedor |
| 2013 | Subspace-constrained supervector PLDA for speaker verification. Daniel Garcia-Romero, Alan McCree |
| 2013 | Superposed speech localisation using frequency tracking. Maxime Le Coz, Julien Pinquier, Régine André-Obrecht |
| 2013 | Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variables. Sz-Rung Shiang, Hung-yi Lee, Lin-Shan Lee |
| 2013 | Suprasegmental information modelling for autism disorder spectrum and specific language impairment classification. David Martínez González, Dayana Ribas, Eduardo Lleida, Alfonso Ortega, Antonio Miguel |
| 2013 | Syllable nuclei detection using perceptually significant features. Apoorv Reddy Arrabothu, Nivedita Chennupati, B. Yegnanarayana |
| 2013 | Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture. Milos Cernak, Xingyu Na, Philip N. Garner |
| 2013 | Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASR. D. Rama Sanand, Torbjørn Svendsen |
| 2013 | THU-EE system fusion for the NIST 2012 speaker recognition evaluation. Wei-Qiang Zhang, Zhiyi Li, Weiwei Liu, Jia Liu |
| 2013 | TP 3.1 software: a tool for designing audio, visual, and audiovisual perceptual training tasks and perception tests. Andréia Schurt Rauber, Anabela Rato, Denise Cristina Kluge, Giane Rodrigues dos Santos |
| 2013 | TRAP language identification system for RATS phase II evaluation. Kyu Jeong Han, Sriram Ganapathy, Ming Li, Mohamed Kamal Omar, Shrikanth S. Narayanan |
| 2013 | TUNDRA: a multilingual corpus of found data for TTS research created with light supervision. Adriana Stan, Oliver Watts, Yoshitaka Mamiya, Mircea Giurgiu, Robert A. J. Clark, Junichi Yamagishi, Simon King |
| 2013 | Talker-specific perceptual processing: influences on internal category structure. Rachel M. Theodore |
| 2013 | Target-to-non-target directional ratio estimation based on dual-microphone phase differences for target-directional speech enhancement. Seon Man Kim, Hong Kook Kim |
| 2013 | Technique for automatic sentence level alignment of long speech and transcripts. Imran Ahmed, Sunil Kumar Kopparapu |
| 2013 | Template-warping based speech driven head motion synthesis. David Adam Braude, Hiroshi Shimodaira, Atef Ben Youssef |
| 2013 | Text-dependent speaker recognition using PLDA with uncertainty propagation. Themos Stafylakis, Patrick Kenny, Pierre Ouellet, Javier Perez, Marcel Kockmann, Pierre Dumouchel |
| 2013 | Text-to-speech alignment of long recordings using universal phone models. Sarah Hoffmann, Beat Pfister |
| 2013 | Text-to-speech inspired duration modeling for improved whole-word acoustic models. Keith Kintzley, Aren Jansen, Hynek Hermansky |
| 2013 | The 2012 NIST speaker recognition evaluation. Craig S. Greenberg, Vincent M. Stanford, Alvin F. Martin, Meghana Yadagiri, George R. Doddington, John J. Godfrey, Jaime Hernandez-Cordero |
| 2013 | The AT&T speech API: a study on practical challenges for customized speech to text service. Evandro Gouvêa, Antonio Moreno-Daniel, A. Reddy, Rathinavelu Chengalvarayan, David L. Thomson, Andrej Ljolje |
| 2013 | The I3A speaker recognition system for NIST SRE12: post-evaluation analysis. Jesús Antonio Villalba López, Eduardo Lleida, Alfonso Ortega, Antonio Miguel |
| 2013 | The IBM RATS phase II speaker recognition system: overview and analysis. Weizhong Zhu, Sibel Yaman, Jason W. Pelecanos |
| 2013 | The IBM speech activity detection system for the DARPA RATS program. George Saon, Samuel Thomas, Hagen Soltau, Sriram Ganapathy, Brian Kingsbury |
| 2013 | The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism. Björn W. Schuller, Stefan Steidl, Anton Batliner, Alessandro Vinciarelli, Klaus R. Scherer, Fabien Ringeval, Mohamed Chetouani, Felix Weninger, Florian Eyben, Erik Marchi, Marcello Mortillaro, Hugues Salamin, Anna Polychroniou, Fabio Valente, Samuel Kim |
| 2013 | The MUTE silent speech recognition system. Geoffrey S. Meltzner, James T. Heaton, Yunbin Deng |
| 2013 | The acoustics of word stress in Swedish: a function of stress level, speaking style and word accent. Anders Eriksson, Plínio A. Barbosa, Joel Åkesson |
| 2013 | The Albayzin 2012 language recognition evaluation. Luis Javier Rodríguez-Fuentes, Niko Brümmer, Mikel Peñagarikano, Amparo Varona, Germán Bordel, Mireia Díez |
| 2013 | The Bulgarian stressed and unstressed vowel system: a corpus study. Bistra Andreeva, William J. Barry, Jacques C. Koreman |
| 2013 | The distribution of calibrated likelihood-ratios in speaker recognition. David A. van Leeuwen, Niko Brümmer |
| 2013 | The duration compensation issue revisited. Plínio A. Barbosa |
| 2013 | The Edinburgh speech production facility doubletalk corpus. James M. Scobbie, Alice Turk, Christian Geng, Simon King, Robin J. Lickley, Korin Richmond |
| 2013 | The effect of visual speech timing and form cues on the processing of speech and nonspeech. Chris Davis, Jeesun Kim |
| 2013 | The effect of word frequency and lexical class on articulatory-acoustic coupling. Zhaojun Yang, Vikram Ramanarayanan, Dani Byrd, Shrikanth S. Narayanan |
| 2013 | The effects of perceptual and/or productive training on the perception and production of English vowels /ɪ/ and /iː/ by Cantonese ESL learners. Janice Wing Sze Wong |
| 2013 | The furhat social companion talking head. Samer Al Moubayed, Jonas Beskow, Gabriel Skantze |
| 2013 | The influence of F0 contour continuity on prominence perception. Hansjörg Mixdorff, Oliver Niebuhr |
| 2013 | The influence of accentuation and polysyllabicity on compensatory shortening in German. Jessica Siddins, Jonathan Harrington, Felicitas Kleber, Ulrich Reubold |
| 2013 | The influence of language and speech task upon creaky voice use among six young American women learning French. Agathe Benoist-Lucy, Claire Pillot-Loiseau |
| 2013 | The interplay of intonation and complex lexical tones: how speaker attitudes affect the realization of glottalization on Vietnamese sentence-final particles. Thi Lan Nguyen, Alexis Michaud, Do Dat Tran, Dang-Khoa Mac |
| 2013 | The interplay of linguistic structure and breathing in German spontaneous speech. Amélie Rochet-Capellan, Susanne Fuchs |
| 2013 | The learning and generalization of contrasts consistent or inconsistent with native biases. Kyuwon Moon, Meghan Sumner |
| 2013 | The organ stop "vox humana" as a model for a vowel synthesiser. Fabian Brackhane, Jürgen Trouvain |
| 2013 | The phonological voicing contrast in Czech: an EPG study of phonated and whispered fricatives. Radek Skarnitzl, Pavel Sturm, Pavel Machac |
| 2013 | The physiological use of the charismatic voice in political speech. Rosario Signorello, Didier Demolin |
| 2013 | The production and perception of voice onset time in English-speaking children enrolled in a French immersion program. Nicole Netelenbos, Fangfang Li |
| 2013 | The relationship between gender-differentiated productions of /s/ and gender role behaviour in young children. Melissa Kinsman, Fangfang Li |
| 2013 | The role of empathy in the recognition of vocal emotions. Rene Altrov, Hille Pajupuu, Jaan Pajupuu |
| 2013 | The role of intrinsic motivations in learning sensorimotor vocal mappings: a developmental robotics study. Clément Moulin-Frier, Pierre-Yves Oudeyer |
| 2013 | The role of the pharynx and tongue in enhancement of vowel nasalization: a real-time MRI investigation of French nasal vowels. Christopher Carignan, Ryan Shosted, Maojing Fu, Zhi-Pei Liang, Bradley P. Sutton |
| 2013 | The Sheffield wargames corpus. Charles Fox, Yulan Liu, Erich Zwyssig, Thomas Hain |
| 2013 | The spectral dynamics of vowels in Mandarin Chinese. Jiahong Yuan |
| 2013 | The speech recognition virtual kitchen. Florian Metze, Eric Fosler-Lussier, Rebecca Bates |
| 2013 | The voice prominence hypothesis: the interplay of F0 and voice source features in accentuation. Ailbhe Ní Chasaide, Irena Yanushevskaya, John Kane, Christer Gobl |
| 2013 | Theme identification in telephone service conversations using quaternions of speech features. Mohamed Morchid, Georges Linarès, Marc El-Bèze, Renato De Mori |
| 2013 | Three-dimensional rectangular vocal-tract model with asymmetric wall impedances. Kunitoshi Motoki |
| 2013 | Timing differences in articulation between voiced and voiceless stop consonants: an analysis of cine-MRI data. Masako Fujimoto, Tatsuya Kitamura, Hiroaki Hatano, Ichiro Fujimoto |
| 2013 | Timing responses to questions in dialogue. Sofia Strömbergsson, Anna Hjalmarsson, Jens Edlund, David House |
| 2013 | Toward transfer of acoustic cues of emphasis across languages. Andreas Tsiartas, Panayiotis G. Georgiou, Shrikanth S. Narayanan |
| 2013 | Towards a more efficient SVM supervector speaker verification system using Gaussian reduction and a tree-structured hash. Richard D. McClanahan, Phillip L. De Leon |
| 2013 | Towards a systematic and quantitative analysis of vocal tract data. Samuel S. Silva, António J. S. Teixeira, Catarina Oliveira, Paula Martins |
| 2013 | Towards an end-to-end computational model of speech comprehension: simulating a lexical decision task. Louis ten Bosch, Lou Boves, Mirjam Ernestus |
| 2013 | Training an articulatory synthesizer with continuous acoustic data. Santitham Prom-on, Peter Birkholz, Yi Xu |
| 2013 | Training log-linear acoustic models in higher-order polynomial feature space for speech recognition. Muhammad Ali Tahir, Heyun Huang, Ralf Schlüter, Hermann Ney, Louis ten Bosch, Bert Cranen, Lou Boves |
| 2013 | Transducer-based speech recognition with dynamic language models. Munir Georges, Stephan Kanthak, Dietrich Klakow |
| 2013 | Truncation of pharyngeal gesture in English diphthong [aɪ]. Fang-Ying Hsieh, Louis Goldstein, Dani Byrd, Shrikanth S. Narayanan |
| 2013 | Two-step correction of speech recognition errors based on n-gram and long contextual information. Ryohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki |
| 2013 | Ultraspeech-player: intuitive visualization of ultrasound articulatory data for speech therapy and pronunciation training. Thomas Hueber |
| 2013 | Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification. João P. Cabral |
| 2013 | Unsupervised confidence calibration using examples of recognized words and their contexts. Taichi Asami, Satoshi Kobashikawa, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi |
| 2013 | Unsupervised discriminative language modeling using error rate estimator. Takanobu Oba, Atsunori Ogawa, Takaaki Hori, Hirokazu Masataki, Atsushi Nakamura |
| 2013 | Unsupervised language model adaptation for automatic speech recognition of broadcast news using web 2.0. Tim Schlippe, Lukasz Gren, Ngoc Thang Vu, Tanja Schultz |
| 2013 | Unsupervised mining of acoustic subword units with segment-level Gaussian posteriorgrams. Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li |
| 2013 | Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both? Johann Poignant, Laurent Besacier, Viet Bac Le, Sophie Rosset, Georges Quénot |
| 2013 | Unsupervised prominence prediction for speech synthesis. Mahnoosh Mehrabani, Taniya Mishra, Alistair Conkie |
| 2013 | Unsupervised speaker and expression factorization for multi-speaker expressive synthesis of ebooks. Langzhou Chen, Norbert Braunschweiler |
| 2013 | Unsupervised topic adaptation for morph-based speech recognition. André Mansikkaniemi, Mikko Kurimo |
| 2013 | Unsupervised vocal-tract length estimation through model-based acoustic-to-articulatory inversion. Shanqing Cai, H. Timothy Bunnell, Rupal Patel |
| 2013 | User activity estimation method based on probabilistic generative model of acoustic event sequence with user activity and its subordinate categories. Keisuke Imoto, Suehiro Shimauchi, Hisashi Uematsu, Hitoshi Ohmuro |
| 2013 | User feedback in human-robot interaction: prosody, gaze and timing. Gabriel Skantze, Catharine Oertel, Anna Hjalmarsson |
| 2013 | Using an autoencoder with deformable templates to discover features for automated speech recognition. Navdeep Jaitly, Geoffrey E. Hinton |
| 2013 | Using conversational word bursts in spoken term detection. Justin T. Chiu, Alexander I. Rudnicky |
| 2013 | Using denoising autoencoder for emotion recognition. Rui Xia, Yang Liu |
| 2013 | Using dialog-activity similarity for spoken information retrieval. Nigel G. Ward, Steven D. Werner |
| 2013 | Using generalized additive models and random forests to model prosodic prominence in German. Denis Arnold, Petra Wagner, R. Harald Baayen |
| 2013 | Using group delay functions from all-pole models for speaker recognition. Padmanabhan Rajan, Tomi Kinnunen, Cemal Hanilçi, Jouni Pohjalainen, Paavo Alku |
| 2013 | Using linguistic analysis to characterize conceptual units of thought in spoken medical narratives. Kathryn Womack, Cecilia Ovesdotter Alm, Cara Calvelli, Jeff B. Pelz, Pengcheng Shi, Anne R. Haake |
| 2013 | Using linguistic information to detect overlapping speech. Jürgen T. Geiger, Florian Eyben, Nicholas W. D. Evans, Björn W. Schuller, Gerhard Rigoll |
| 2013 | Using phone log-likelihood ratios as features for speaker recognition. Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel |
| 2013 | Using phonetic feature extraction to determine optimal speech regions for maximising the effectiveness of glottal source analysis. John Kane, Irena Yanushevskaya, John Dalton, Christer Gobl, Ailbhe Ní Chasaide |
| 2013 | Using phonetic patterns for detecting social cues in natural conversations. Johannes Wagner, Florian Lingenfelser, Elisabeth André |
| 2013 | Using phonological phrase segmentation to improve automatic keyword spotting for the highly agglutinating Hungarian language. György Szaszák, András Beke |
| 2013 | Using role play for collecting question-answer pairs for dialogue agents. Ryuichiro Higashinaka, Kohji Dohsaka, Hideki Isozaki |
| 2013 | Using spectral moments as a speaker specific feature in nasals and fricatives. Carola Schindler, Christoph Draxler |
| 2013 | Using text and acoustic features to diagnose progressive aphasia and its subtypes. Kathleen C. Fraser, Frank Rudzicz, Elizabeth Rochon |
| 2013 | Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition. Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa |
| 2013 | VTLN based on the linear interpolation of contiguous mel filter-bank energies. Néstor Becerra Yoma, Claudio Garretón, Fernando Huenupán, Ignacio Catalan, Jorge Wuth |
| 2013 | Variable-span out-of-vocabulary named entity detection. Wei Chen, Sankaranarayanan Ananthakrishnan, Rohit Prasad, Prem Natarajan |
| 2013 | Velic coordination in French nasals: a real-time magnetic resonance imaging study. Michael I. Proctor, Louis Goldstein, Adam C. Lammert, Dani Byrd, Asterios Toutios, Shrikanth S. Narayanan |
| 2013 | Visualizing articulatory data with VisArtico. Slim Ouni |
| 2013 | Viterbi decoding for latent words language models using Gibbs sampling. Ryo Masumura, Hirokazu Masataki, Takanobu Oba, Osamu Yoshioka, Satoshi Takahashi |
| 2013 | Vocabulary structure and spoken-word recognition: evidence from French reveals the source of embedding asymmetry. Anne Cutler, Laurence Bruggeman |
| 2013 | Vocal tract cross-distance estimation from real-time MRI using region-of-interest analysis. Adam C. Lammert, Vikram Ramanarayanan, Michael I. Proctor, Shrikanth S. Narayanan |
| 2013 | Voice activity classification for automatic bi-speaker adaptive beamforming in speech separation. Ngoc Thuy Tran, William G. Cowley, André Pollok |
| 2013 | Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression. Hanna Silén, Jani Nurminen, Elina Helander, Moncef Gabbouj |
| 2013 | Voice conversion in high-order eigen space using deep belief nets. Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki |
| 2013 | Voice pathology detection and classification using MPEG-7 audio low-level features. Ghulam Muhammad, Moutasem Melhem |
| 2013 | Voice search in mobile applications and the use of linked open data. Felix Burkhardt, Hans Ulrich Nägeli |
| 2013 | Voice search in mobile applications with the rootvole framework. Felix Burkhardt |
| 2013 | Voice transformation-based spoofing of text-dependent speaker verification systems. Zvi Kons, Hagai Aronowitz |
| 2013 | Vowel and prosodic factor dependent variations of vocal-tract length. Shinji Maeda, Yves Laprie |
| 2013 | Vowel identity conditions the time course of tone recognition. Jason A. Shaw, Michael D. Tyler, Benjawan Kasisopa, Yuan Ma, Michael I. Proctor, Chong Han, Donald Derrick, Denis K. Burnham |
| 2013 | Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. Zhizheng Wu, Anthony Larcher, Kong-Aik Lee, Engsiong Chng, Tomi Kinnunen, Haizhou Li |
| 2013 | Weakly supervised parsing with rules. Christophe Cerisara, Alejandra Lorenzo, Pavel Král |
| 2013 | Web data harvesting for speech understanding grammar induction. Ioannis Klasinas, Alexandros Potamianos, Elias Iosif, Spiros Georgiladakis, Gianluca Mameli |
| 2013 | Weighting of acoustic cues shifts to frication duration in identification of fricatives/affricates when auditory properties are degraded due to aging. Keiichi Yasu, Takayuki Arai, Kei Kobayashi, Mitsuko Shindo |
| 2013 | What's the difference? comparing humans and machines on the Aurora 2 speech recognition task. Bernd T. Meyer |
| 2013 | Which resemblance is useful to predict phrase boundary rise labels for Japanese expressive text-to-speech synthesis, numerically-expressed stylistic or distribution-based semantic? Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka, Satoshi Takahashi |
| 2013 | Word frequency, vowel length and vowel quality in speech production: an EMA study of the importance of experience. Fabian Tomaschek, Martijn Wieling, Denis Arnold, R. Harald Baayen |
| 2013 | Word identification using phonetic features: towards a method to support multivariate fMRI speech decoding. Tijl Grootswagers, Karen Dijkstra, Louis ten Bosch, Alex Brandmeyer, Makiko Sadakata |
| 2013 | Word stress perception in European Portuguese. Susana Correia, Sónia Frota, Joseph Butler, Marina Vigário |
| 2013 | Written-domain language modeling for automatic speech recognition. Hasim Sak, Yun-Hsuan Sung, Françoise Beaufays, Cyril Allauzen |
| 2013 | ivector-based acoustic data selection. Olivier Siohan, Michiel Bacchiani |