Just finished testing Index TTS2, and the ceiling for open-source voice cloning has been raised again.



What sets it apart from other TTS systems is emotional control. Instead of reading text mechanically, it lets you specify a tone such as sadness, anger, or extreme excitement, and even control speaking rate and pauses. With its zero-shot mechanism, you can clone a voice directly from 10 seconds of input audio.

However, deploying the original GitHub project is extremely user-unfriendly. It runs into conflicts among CUDA versions, Python dependencies, and C++ build environments. Running the stock code on an ordinary computer can leave you chasing errors all day.
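Before attempting a manual install, it can help to see what the machine actually has. The sketch below is only a rough, standard-library check of the three areas mentioned above (Python, CUDA toolkit, C++ compiler). The function name `environment_report` is made up for illustration, the `torch` lookup assumes the project depends on PyTorch, and the script does not know which versions the project actually requires.

```python
import importlib.util
import platform
import shutil
import sys


def environment_report():
    """Collect basic facts that commonly matter for a local deep-learning setup.

    Hypothetical helper: it only reports what is present on this machine,
    not whether it matches the project's requirements.
    """
    # Look for any common C++ compiler on PATH (MSVC, GCC, Clang).
    compilers = ("cl", "g++", "clang++")
    found_compiler = next((c for c in compilers if shutil.which(c)), None)

    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "os": platform.system(),
        "machine": platform.machine(),
        # nvcc ships with the CUDA toolkit; absent means no toolkit on PATH.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        # Assumption: the project uses PyTorch, so check whether it is importable.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cxx_compiler": found_compiler,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `nvcc_on_path` or `cxx_compiler` comes back empty, that points to the kind of missing piece that tends to cause the build errors described above.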

To make things easier for my team's own use, I've bundled all the environment dependencies into a one-click integrated package for both Windows and Mac.

· Unzip (note: do not include Chinese characters in the path)
· Double-click the one-click startup.bat
· Use it directly through the WebUI in your browser
Runs locally and offline, no tokens needed. If you want it, feel free to message me!
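
The path note in the unzip step can be checked before launching. This is a minimal sketch, not part of the package: `non_ascii_parts` is a hypothetical helper, and it flags any non-ASCII folder name, which is stricter than "no Chinese characters" but covers that case.

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the path components that contain non-ASCII characters."""
    return [part for part in Path(path).parts if not part.isascii()]


if __name__ == "__main__":
    # Check the folder this script is run from, e.g. the unzipped package folder.
    here = Path.cwd()
    bad = non_ascii_parts(here)
    if bad:
        print(f"Rename or move these folders before starting: {bad}")
    else:
        print(f"Path looks safe: {here}")
```

Run it from inside the unzipped folder. If it lists any folder names, move the package somewhere like a plain-English directory on the same drive first.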
Project open-source link: