Tips for the best voice clone quality

Recording environment, speech style, file requirements and fixes for common voice cloning problems.

Written By Umakhan Magomedov

Last updated 3 days ago

Voice clone quality depends on the recording you provide. These tips help you get a natural-sounding result that closely matches your voice.

Recording environment

Record in a quiet room with no background noise, echo or music
Avoid rooms with hard surfaces that create echo (bathrooms, empty rooms)
A small room with soft furnishings (carpet, curtains, sofa) works well
Keep your phone or microphone at a consistent distance from your mouth (15-30 cm)

How to speak

Speak at your natural pace, as you would in normal conversation
Pronounce words clearly without exaggerating
Read the reference text provided in the app — it is designed to capture a wide range of your voice characteristics
Do not whisper or speak unusually slowly: the model learns your natural voice
Aim for 30-60 seconds or more of clean, uninterrupted speech

ℹ️ The reference text in the app is specifically chosen to include a variety of sounds, intonations and sentence structures. Reading it fully gives the model more to work with.

If you upload a file instead of recording

Use a file with a single speaker and no background music or effects
Minimum: 10 seconds. Recommended: 30-60 seconds or more
Maximum file size: 20 MB
Supported formats: MP3, WAV, AAC, OGG
Avoid phone call recordings, heavily compressed audio or recordings with multiple speakers
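
If you want to check a file before uploading it, the limits above can be sketched as a small Python script. This is only an illustration, not the app's actual validation: the function name check_upload is hypothetical, "20 MB" is assumed to mean 20 × 1024 × 1024 bytes, the format is judged by file extension only, and duration is measured only for uncompressed WAV files because that is all the Python standard library can read.

```python
import wave
from pathlib import Path

# Limits taken from the file requirements listed above.
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" means 20 MiB
MIN_SECONDS = 10
RECOMMENDED_SECONDS = 30
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".aac", ".ogg"}


def check_upload(path):
    """Return a list of messages about the file; an empty list means nothing was flagged."""
    path = Path(path)
    messages = []

    # Format check by extension only; the file contents are not inspected.
    ext = path.suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        messages.append(f"unsupported format: {ext or '(no extension)'}")

    size = path.stat().st_size
    if size > MAX_FILE_BYTES:
        messages.append(f"file is larger than 20 MB ({size} bytes)")

    # Duration is only checked for WAV, using the standard-library wave module.
    if ext == ".wav":
        try:
            with wave.open(str(path), "rb") as wav:
                seconds = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            messages.append("file could not be read as uncompressed WAV")
        else:
            if seconds < MIN_SECONDS:
                messages.append(
                    f"audio is {seconds:.1f} s, shorter than the 10-second minimum"
                )
            elif seconds < RECOMMENDED_SECONDS:
                messages.append(
                    f"audio is {seconds:.1f} s, 30-60 seconds or more is recommended"
                )

    return messages
```

Calling check_upload("my_voice.wav") returns a list you can print line by line. The script cannot detect background noise, multiple speakers or heavy compression, so those still need to be checked by listening.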

Common problems and fixes

Problem	Fix
Voice sounds robotic or unnatural	Re-record in a quieter space and provide a longer sample (60+ seconds)
Voice does not sound like you	Make sure the recording has no background noise and that you read the full reference text
Muffled or unclear sound	Move the microphone closer and avoid recording through fabric
Cloning failed	Check that the file is not too short, has clear speech and is within the size limit

Frequently asked questions

How long should my recording be?

Can I use a recording from a phone call?

My voice clone sounds good in one language but not another. Why?

Can I improve a voice clone without deleting and recreating it?