Eric's Page

Eric Battenberg

Last updated: Apr 2, 2017

I'm currently a Research Scientist at Baidu's Silicon Valley Artificial Intelligence Lab led by Adam Coates and Andrew Ng. At Baidu, I work on end-to-end speech and language understanding systems.

Previously, I worked on audio content analysis applications at Gracenote in Emeryville, CA. Before that, I received my PhD in Electrical Engineering and Computer Sciences from UC Berkeley, where I did research on the application of signal processing and machine learning techniques to music and audio analysis and processing.

I was advised by David Wessel at the Center for New Music and Audio Technologies (CNMAT) and co-advised by Nelson Morgan at the International Computer Science Institute (ICSI). I worked with the Parallel Computing Laboratory (Par Lab) on parallel music applications. My research interests include machine perception, music information retrieval, deep learning / neural networks, audio signal processing, speech and language understanding, and parallel/scalable machine learning.

| News/Updates | Publications | Talks | Conference Activities | Projects | Software |



Publications

Google Scholar Profile
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
This paper describes work done at Baidu's Silicon Valley AI Lab to train end-to-end deep recurrent neural networks for both English and Mandarin speech recognition.
Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, et al. (34 members of SVAIL, alphabetical by last name)
ICML, New York, New York, 2016
LibROSA: Audio and Music Signal Analysis in Python
Description of the open source audio processing framework LibROSA.
Brian McFee, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto
SciPy, Austin, Texas, 2015
Well-Defined Tasks and Good Datasets for MIR
Extended abstract summarizing the discussion at a late-break unconference meeting at ISMIR 2013.
Eric Battenberg
ISMIR Late-Break, Curitiba, Brazil, 2013
Scalable Multimedia Content Analysis on Parallel Platforms Using Python
Presents a framework that automatically maps audio content analysis code written in Python onto a variety of parallel platforms.
Ekaterina Gonina, Gerald Friedland, Eric Battenberg, Penporn Koanantakool, Michael Driscoll, Evangelos Georganas, and Kurt Keutzer
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2013
Techniques for Machine Understanding of Live Drum Performances
Techniques for drum detection, multi-hypothesis beat tracking, and drum pattern analysis are presented as a complete system for drum understanding.
Eric Battenberg
PhD Dissertation, EECS, UC Berkeley, Dec 2012
Analyzing Drum Patterns Using Conditional Deep Belief Networks
Applies multi-layer neural networks to the analysis of drum patterns.
Eric Battenberg and David Wessel
ISMIR, Porto, Portugal, 2012
Toward Live Drum Separation Using Probabilistic Spectral Clustering Based on the Itakura-Saito Divergence
This paper introduces techniques for decomposing drum audio onto spectral templates which are learned using a probabilistic Gamma Mixture Model.
Eric Battenberg, Victor Huang, and David Wessel
AES 45th Conference: Applications of Time-Frequency Processing in Audio, Helsinki, Finland, March 2012
Implementing Real-Time Partitioned Convolution Algorithms on Conventional Operating Systems
Compares the performance, plugin compatibility, and required programmer effort of preemptive and time-distributed implementations of non-uniform partitioned convolution.
Eric Battenberg and Rimas Avizienis
Digital Audio Effects Conference, Paris, France, Sept 2011
Real-Time Musical Applications on an Experimental Operating System for Multi-Core Processors
Describes the operating system needs of real-time music software and a current approach to meeting these needs.
Juan Colmenares, Ian Saxton, Eric Battenberg, Rimas Avizienis, Nils Peters, Krste Asanovic, John D. Kubiatowicz, and David Wessel
International Computer Music Conference, Huddersfield, UK, Aug 2011
Advances in the Parallelization of Music and Audio Applications
A survey of parallel computer music work being done at CNMAT and the Par Lab.
Eric Battenberg, Adrian Freed, and David Wessel, June 2010
International Computer Music Conference, New York City/Stony Brook, New York, Jun 2010
Accelerating Non-Negative Matrix Factorization for Audio Source Separation on Multi-Core and Many-Core Architectures
| poster |
OpenMP and CUDA implementations of NMF to speed up drum track extraction.
Eric Battenberg and David Wessel, May 2009
International Society for Music Information Retrieval Conference, Kobe, Japan, 2009
Improvements to Percussive Component Extraction Using Non-Negative Matrix Factorization and Support Vector Machines
Perceptual dimensionality reduction and new features are used to improve the speed and performance of automatic drum track extraction.
Eric Battenberg
Masters Thesis, EECS, UC Berkeley, Dec 2008
Optimizing Hearing Aids for Music Listening
A subspace technique for optimal hearing aid fitting.
David Wessel, Kelly Fitz, Eric Battenberg, Andrew Schmeder, and Brent Edwards
19th International Congress on Acoustics, Madrid, Spain, Sep 2007


Talks

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
An overview of Baidu's Deep Speech 2 speech recognition system.
| slides |
Jun 20, 2016
ICML, New York, NY
Teaching Computers to Listen to Music
Condensed version of my previous talk of the same title.
| slides |
Nov 15, 2013
ML Conf, San Francisco, CA
Teaching Computers to Listen to Music
| slides |
Introduction to research in content-based music information retrieval. Exciting new research directions, including feature learning and recurrent neural networks. Drum transcription and drum pattern analysis are covered in depth.
Aug 14, 2013
SF Bay Area Machine Learning Meetup, Adobe Systems, San Francisco, CA
The Breadth of Applications for Music
| slides |
The range of music apps being pursued at CNMAT and what they need from parallel computing. Also, a case study on parallelizing audio source separation on OpenMP and CUDA.
May 29, 2009
UPCRC Applications Workshop, Microsoft Research, Redmond, WA

Conference Activities

Deep Learning for Music
| Schedule |
Organized and chaired an ICASSP 2014 special session entitled Deep Learning for Music
Eric Battenberg, Erik Schmidt, Juan Bello
ICASSP, Florence, Italy, 2014


Projects

A Theoretical and Experimental Analysis of the Acoustic Guitar
| slides |
Automatically recognizing picking location and a look into natural harmonics and tuning systems.
Eric Battenberg, May 2009.
An Interior-Point Newton Algorithm for Non-negative Matrix Factorization
A Newton-step barrier method for non-negative matrix factorization is proposed and applied to audio source separation.
Eric Battenberg, Dec. 2008.
Parallelizing Audio Feature Extraction Using an Automatically-Partitioned Streaming Dataflow Language
| poster | | slides |
An attempt to use StreamIt for spectral feature extraction.
Eric Battenberg and Mark Murphy, May 2008.
Calculating Musical Rhythm Similarity
| poster |
A method for comparing rhythms using self-similarity.
Eric Battenberg, Dec. 2007.
Optimizing the Hearing Aid Musical Experience
Parameters of a multi-band compressor are tuned using a user-calibrated subjective parameter space.
Eric Battenberg, May 2007
A New Method for Calculating Music Similarity
Hidden Markov models and spectral fluctuation patterns are used to calculate a distance measure between songs.
Eric Battenberg and Vijay Ullal, Dec. 2006
A System for Automatic Cell Segmentation of Bacterial Microscopy Images
Various image processing and computer vision techniques are used to segment individual bacteria cells in a microscope image.
Eric Battenberg and Ilka Bischofs-Pfeifer, Aug. 2006
Sparse Signal Representation
Image Compression Using Sparse Bayesian Learning.
Eric Battenberg, Vijay Ullal, and Galen Reeves, May 2006.


Software

If you use any of this code, please send me an email to let me know how you plan on using it; I'd love to hear. Also, your feedback will help determine where the code needs to be improved. Feel free to send me any questions you have about its use. If you use any of this code for any published work, please cite the appropriate paper.
A Python module for training Gaussian Mixture Models (GMMs) on a GPU using CUDA.

A CUDA implementation of non-negative matrix factorization for GPUs as described in the ISMIR 2009 publication above.

A Python implementation of conditional restricted Boltzmann machine training and generation as applied to drum pattern generation.