Publications | Eric Battenberg

Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, Soroosh Mariooryad, Matt Shannon, Julian Salazar, David Kao. Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech. NAACL, 2025.

Preprint PDF Project Slides Ref Audio Examples

Matt Shannon, Ben Poole, Soroosh Mariooryad, Tom Bagby, Eric Battenberg, David Kao, Daisy Stanton, RJ Skerry-Ryan. Learning the joint distribution of two sequences using little or no paired data. ICML SpiGM Workshop, 2023.

Preprint PDF

Daisy Stanton, Matt Shannon, Soroosh Mariooryad, RJ Skerry-Ryan, Eric Battenberg, Tom Bagby. Speaker Generation. ICASSP, 2022.

Preprint PDF Project Slides Ref Audio Examples

Ron J. Weiss, RJ Skerry-Ryan, Eric Battenberg, Soroosh Mariooryad, Diederik P. Kingma. Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis. ICASSP, 2021.

Preprint PDF Project Slides Ref Audio Examples

Matt Shannon, Ben Poole, Soroosh Mariooryad, Tom Bagby, Eric Battenberg, David Kao, Daisy Stanton, RJ Skerry-Ryan. Non-Saturating GAN Training as Divergence Minimization. arXiv, 2020.

Preprint PDF

Eric Battenberg, RJ Skerry-Ryan, Soroosh Mariooryad, Daisy Stanton, David Kao, Matt Shannon, Tom Bagby. Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis. ICASSP, 2019.

Preprint PDF Project Ref Audio Examples

Raza Habib, Soroosh Mariooryad, Matt Shannon, Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, David Kao, Tom Bagby. Semi-Supervised Generative Modeling for Controllable Speech Synthesis. ICLR, 2019.

Preprint PDF Project Ref Audio Examples

Eric Battenberg, Soroosh Mariooryad, Daisy Stanton, RJ Skerry-Ryan, Matt Shannon, David Kao, Tom Bagby. Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis. arXiv, 2019.

Preprint PDF Project Audio Examples

RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. ICML, 2018.

Preprint PDF Project Poster Slides Video Ref Audio Examples Blog Post

Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML, 2018.

Preprint PDF Project Source Document Ref Audio Examples Blog Post

Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous. Uncovering Latent Style Factors for Expressive Speech Synthesis. NIPS ML4Audio Workshop, 2017.

Preprint PDF Project Poster Audio Examples Workshop

Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu. Exploring Neural Transducers for End-to-End Speech Recognition. ASRU, 2017.

Preprint PDF Project Ref

Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu. Reducing Bias in Production Speech Models. arXiv, 2017.

Preprint PDF Project

Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Vaino Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, Zhenyao Zhu. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. ICML, 2016.

Preprint PDF Project Slides Ref

Sander Dieleman, Jan Schlüter, Colin Raffel, Eben Olson, Søren Kaae Sønderby, Daniel Nouri, Daniel Maturana, Martin Thoma, Eric Battenberg, Jack Kelly, Jeffrey De Fauw, Michael Heilman, Diogo Moitinho de Almeida, Brian McFee, Hendrik Weideman, Gábor Takács, Peter de Rivaz, Jon Crall, Gregory Sanders, Kashif Rasul, Cong Liu, Geoffrey French, Jonas Degrave. Lasagne: First Release. GitHub, 2015.

Code Project 0.1 Ref

Brian McFee, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto. LibROSA: Audio and Music Signal Analysis in Python. SciPy, 2015.

PDF Code Project 0.5.0 Ref

Ekaterina Gonina, Gerald Friedland, Eric Battenberg, Penporn Koanantakool, Michael Driscoll, Evangelos Georganas, Kurt Keutzer. Scalable Multimedia Content Analysis on Parallel Platforms Using Python. TOMCCAP, 2014.

PDF Project Ref

Eric Battenberg. Techniques for Machine Understanding of Live Drum Performances. PhD Thesis, UC Berkeley, 2012.

PDF Project Ref

Eric Battenberg, David Wessel. Analyzing Drum Patterns Using Conditional Deep Belief Networks. ISMIR, 2012.

PDF Code Project

Eric Battenberg, Victor Huang, David Wessel. Toward Live Drum Separation Using Probabilistic Spectral Clustering Based on the Itakura-Saito Divergence. AES 45, 2012.

PDF Project Slides Ref

Eric Battenberg, Rimas Avizienis. Implementing Real-Time Partitioned Convolution Algorithms on Conventional Operating Systems. DAFX, 2011.

PDF Project

Juan Colmenares, Ian Saxton, Eric Battenberg, Rimas Avizienis, Nils Peters, Krste Asanovic, John D. Kubiatowicz, David Wessel. Real-Time Musical Applications on an Experimental Operating System for Multi-Core Processors. ICMC, 2011.

PDF Project

Eric Battenberg, Adrian Freed, David Wessel. Advances in the Parallelization of Music and Audio Applications. ICMC, 2010.

PDF Project

Eric Battenberg, David Wessel. Accelerating Non-Negative Matrix Factorization for Audio Source Separation on Multi-Core and Many-Core Architectures. ISMIR, 2009.

PDF Code Project Project Poster

Eric Battenberg. Improvements to Percussive Component Extraction Using Non-Negative Matrix Factorization and Support Vector Machines. Master's Thesis, 2008.

PDF Project

David Wessel, Kelly Fitz, Eric Battenberg, Andrew Schmeder, Brent Edwards. Optimizing Hearing Aids for Music Listening. ICA, 2007.

PDF