End-to-End Speech Synthesis


At Google, I am now a member of the team that brought you Tacotron, an end-to-end speech synthesis system that uses neural networks to convert text directly to audio. Check out the audio samples from the recently released Tacotron 2 system, which combines Tacotron with a WaveNet-based vocoder.

Publications I contributed to are listed below.


Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron. ICML, 2018.

Preprint PDF Project Poster Slides Source Audio Examples Blog Post

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML, 2018.

Preprint PDF Project Source Audio Examples Blog Post

Uncovering Latent Style Factors for Expressive Speech Synthesis. NIPS ML4Audio Workshop, 2017.

Preprint PDF Project Poster Audio Examples Workshop