MRMI-TTS: Multi-Reference Audios and Mutual Information Driven Zero-Shot Voice Cloning
Information
Publisher
Association for Computing Machinery, New York, NY, United States
Qualifiers
- Short-paper