VoxCPM2

Description

#AI Voice #Voice Large Model #Open Source Project #Voice Cloning #Text to Speech #Multiple Dialects

I recently ran VoxCPM2 locally. What surprised me most was not that it could speak Cantonese and Henan dialect, but that it made me feel that voice is becoming editable content.
In the past, everyone focused on whether AI sounded like a real person. Now the question is moving in another direction: can it "perform" according to your requirements? With just a prompt, you can control the age, tone, emotion, speech rate, and other aspects of the delivery; if you upload a reference audio clip, it tries to retain the original timbre while changing the expression style.

Software Features

Text to Speech Generation: Input text to generate natural speech, supporting rich expression control.
Prompt Control of Voice Style: You can specify the age, tone, emotion, speech rate, and other generation effects through prompts.
Reference Audio Timbre Cloning: After uploading a reference audio clip, you can adjust the expression style while retaining the speaker's timbre.
Multi-Dialect Support: In addition to Mandarin, it can also generate speech in different dialects such as Cantonese and Henan dialect.
Enhanced Voice Expressiveness: It pursues not only timbre similarity but also control over tone, emotion, and expressiveness, making speech more controllable.
Rich Application Scenarios: Can be used in AI dubbing, audio content production, digital humans, voice assistants, education and training, and other scenarios.
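
To make the prompt-control idea more concrete, here is a minimal sketch of how a request combining text, style attributes, and an optional reference clip might be organized before it is handed to a speech model. Everything in it is an assumption for illustration: `VoiceRequest`, its fields, and the comma-separated prompt format are hypothetical and are not VoxCPM2's actual API or prompt syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceRequest:
    """Hypothetical container for one synthesis request (not a VoxCPM2 type)."""

    text: str
    age: Optional[str] = None
    tone: Optional[str] = None
    emotion: Optional[str] = None
    speech_rate: Optional[str] = None
    dialect: Optional[str] = None
    # Path to a clip whose timbre should be kept; None means no cloning.
    reference_audio: Optional[str] = None

    def style_prompt(self) -> str:
        """Join the style attributes that were set into one prompt string."""
        parts = [
            ("age", self.age),
            ("tone", self.tone),
            ("emotion", self.emotion),
            ("speech rate", self.speech_rate),
            ("dialect", self.dialect),
        ]
        # Skip attributes left unset so the prompt only states real choices.
        return ", ".join(f"{name}: {value}" for name, value in parts if value)


request = VoiceRequest(
    text="Welcome back. Shall we continue the lesson?",
    age="elderly",
    emotion="cheerful",
    speech_rate="slow",
    dialect="Cantonese",
    reference_audio="speaker.wav",  # hypothetical file name
)
print(request.style_prompt())
```

Keeping the style attributes separate from the text and the reference clip mirrors the split described in the feature list: the clip determines who is speaking, and the prompt determines how they speak.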

Screenshots

