End-to-End Speech Synthesis

Jul 1, 2017

At Google, I am now a member of the team that brought you Tacotron, an end-to-end speech synthesis system that uses neural networks to convert text directly to audio. Check out the audio samples from the recently released Tacotron 2 system, which combines Tacotron with a WaveNet-based vocoder.

Deep Learning

Publications

Eric Battenberg, RJ Skerry-Ryan, Soroosh Mariooryad, Daisy Stanton, David Kao, Matt Shannon, Tom Bagby. Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis. ICASSP, 2019.

Raza Habib, Soroosh Mariooryad, Matt Shannon, Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, David Kao, Tom Bagby. Semi-Supervised Generative Modeling for Controllable Speech Synthesis. arXiv, 2019.

Eric Battenberg, Soroosh Mariooryad, Daisy Stanton, RJ Skerry-Ryan, Matt Shannon, David Kao, Tom Bagby. Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis. arXiv, 2019.

Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML, 2018.

RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. ICML, 2018.

Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous. Uncovering Latent Style Factors for Expressive Speech Synthesis. NIPS ML4Audio Workshop, 2017.

Eric Battenberg

Software Engineer
