Translate Audio settings
Speech recognition, translation model and voiceover providers in Translate Audio: options, pricing and behavior.
Written By Umakhan Magomedov
Last updated 4 days ago
Open the Settings sheet in Translate Audio to control speech recognition, translation quality and voiceover. This article explains every option and when it applies.
Where to find settings
Open Translate Audio from the Tools tab.
Tap the Settings icon in the top right corner.
Change recognition, translation or voiceover options. Token estimates update immediately.
ℹ️ Speech recognition and translation model changes apply on the next file upload, not to the current result. Voiceover settings apply the next time you generate audio.
Recognition (speech-to-text)
Choose which engine transcribes the uploaded audio. The default is ElevenLabs Scribe.
Translation
Pick the AI model for re-translations when you change the target language or edit the source text.
⚠️ The automatic pipeline on first upload always uses Gemini 3 on the backend, regardless of the model selected here. Settings only affect re-translations.
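The rule in the warning above can be sketched in a few lines of Python. This is an illustrative model only, not the app's actual code: the function name `model_for`, the request kinds `"first_upload"` and `"retranslation"`, and the string `"gemini-3"` are assumptions chosen for the example.

```python
# Hypothetical label for the backend pipeline model; the real identifier is not documented here.
PIPELINE_MODEL = "gemini-3"


def model_for(request_kind: str, selected_model: str) -> str:
    """Return the model that handles a translation request.

    request_kind is a made-up label: "first_upload" for the automatic
    pipeline, "retranslation" for a language change or source-text edit.
    """
    if request_kind == "first_upload":
        # The automatic pipeline ignores the model picked in Settings.
        return PIPELINE_MODEL
    if request_kind == "retranslation":
        # Only re-translations honor the Settings choice.
        return selected_model
    raise ValueError(f"unknown request kind: {request_kind!r}")
```

For example, `model_for("first_upload", "model-b")` returns `"gemini-3"`, while `model_for("retranslation", "model-b")` returns `"model-b"`.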
Voiceover without cloning
These providers use standard synthetic voices. No voice sample from the original audio is used.
If ElevenLabs does not support your target language, the app falls back to OpenAI automatically.
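As a rough sketch, the fallback described above amounts to a single language check. The function name, the provider strings and the language codes below are assumptions for illustration; the set passed in is a placeholder, not the real ElevenLabs language list.

```python
def pick_standard_provider(selected: str, target_language: str,
                           elevenlabs_languages: frozenset) -> str:
    """Return the provider that voices the translation without cloning.

    elevenlabs_languages is a caller-supplied placeholder set of language
    codes; the actual supported list is not given in this article.
    """
    if selected == "elevenlabs" and target_language not in elevenlabs_languages:
        # Automatic fallback when ElevenLabs lacks the target language.
        return "openai"
    return selected
```

With a placeholder set such as `frozenset({"en", "es"})`, a request for `"en"` stays on `"elevenlabs"`, and a request for a code outside the set returns `"openai"`.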
Voiceover with cloning
These providers clone the speaker's voice from your uploaded audio or a saved Custom Voice.
MiniMax (recommended)
Qwen
HeyGen
TTS behavior
Edit translation: changing the translated text clears the current voiceover. Tap play to regenerate.
Pending or completed jobs: MiniMax, Qwen and HeyGen jobs continue in the background. Reopening the result from History resumes playback for a completed job or polling for a pending one.
Language change: if the current cloning provider does not support the new language or the audio is too short, the app auto-switches to ElevenLabs.
Settings change: switching provider, speed, emotion or style clears cached audio for the current result.
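The cache and auto-switch rules above can be modeled as a small state object. This is a hedged sketch, not the app's implementation: the `Voiceover` class, its field names and defaults, the `supported` mapping and the `min_clone_seconds` threshold are all hypothetical, and the article does not state the real minimum audio length. The sketch also assumes that setting an option to its current value leaves the cache alone.

```python
from dataclasses import dataclass, replace
from typing import Optional

CLONING_PROVIDERS = {"minimax", "qwen", "heygen"}  # provider names from this article


@dataclass(frozen=True)
class Voiceover:
    """Hypothetical voiceover state for one result."""
    provider: str
    speed: float = 1.0
    emotion: str = "neutral"
    style: str = "default"
    cached_audio: Optional[bytes] = None


def edit_translation(v: Voiceover) -> Voiceover:
    # Editing the translated text clears the current voiceover.
    return replace(v, cached_audio=None)


def change_settings(v: Voiceover, **changes) -> Voiceover:
    # Changing provider, speed, emotion or style clears cached audio.
    updated = replace(v, **changes)
    before = (v.provider, v.speed, v.emotion, v.style)
    after = (updated.provider, updated.speed, updated.emotion, updated.style)
    return replace(updated, cached_audio=None) if after != before else updated


def change_language(v: Voiceover, language: str, supported: dict,
                    audio_seconds: float, min_clone_seconds: float) -> Voiceover:
    # A cloning provider is swapped for ElevenLabs when it cannot handle the
    # new language or the source audio is too short to clone from.
    if v.provider in CLONING_PROVIDERS:
        unsupported = language not in supported.get(v.provider, set())
        too_short = audio_seconds < min_clone_seconds
        if unsupported or too_short:
            return replace(v, provider="elevenlabs", cached_audio=None)
    return v  # re-translation itself is not modeled in this sketch
```

In this model, `change_settings(v, speed=1.25)` drops the cached audio, and `change_language` returns a state with `provider == "elevenlabs"` whenever either auto-switch condition holds.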