How Translate Video works

Learn how Translate Video translates the audio in a video file, syncs the dubbed voice, and delivers the result, typically in 5 to 15 minutes.

Written By Umakhan Magomedov

Last updated 4 days ago

Translate Video translates the spoken audio in a video and produces a new video file where the dubbed voice matches the on-screen speaker. Processing runs on the server and typically takes 5 to 15 minutes.

When to use

  • Translate a TikTok, Instagram Reel or YouTube video into your language

  • Localize an educational or training video for a different audience

  • Watch a foreign-language documentary or interview without subtitles

  • Share a video with someone who speaks a different language


What you can upload

Formats: MP4, MOV, AVI, MKV, WebM, M4V

Max file size: 100 MB

Sources:

  • File from your device or gallery

  • Link to YouTube, Instagram or TikTok

  • Direct video URL (.mp4, .mov and others)
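The format and size limits above can be checked before uploading a local file. The following is a minimal sketch, not the app's actual validation: the function name `check_upload` is hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes and that the limit applies to the file as selected (before the app compresses it), neither of which this article confirms.

```python
from pathlib import Path

# Formats listed in this article, as lowercase file extensions.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".m4v"}

# Assumption: 100 MB is interpreted as 100 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"Unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("File is larger than 100 MB")
    return problems
```

For example, `check_upload("clip.MP4", 10_000_000)` returns an empty list, while a 200 MB `.gif` file would be reported for both its format and its size.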


How to run

  1. Open the Translate Video tool from the Tools tab.

  2. Tap Choose to pick a file, or tap Paste link to import from a URL.

  3. Select the target language.

  4. Tap Translate. The app compresses and uploads the video, then sends it for server processing.

  5. Once the upload finishes, you can close the app. When the video is ready, it appears in History and you receive an email notification.

ℹ️ Average processing time is 5 to 15 minutes, depending on video length. Keep the app open until the upload finishes. After that, you can safely minimize or close it.


What you get

The result is a video file with the dubbed audio in the target language. It appears in History with one of these statuses:

Status      Meaning                                   What to do
----------  ----------------------------------------  -----------------
Queued      Video is waiting in the processing queue  Wait
Processing  Translation is in progress                Wait
Completed   Video is ready                            Open and download
Failed      An error occurred                         Tap Retry

From the completed result you can play the video, download it to your device, save it to your gallery or share it.
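The four statuses above and their recommended actions can be expressed as a simple lookup. This is only an illustrative sketch: the names `NEXT_ACTION` and `next_action` are hypothetical, and the app does not necessarily expose statuses in this form.

```python
# Status names and actions are taken from this article.
NEXT_ACTION = {
    "Queued": "Wait",
    "Processing": "Wait",
    "Completed": "Open and download",
    "Failed": "Tap Retry",
}


def next_action(status: str) -> str:
    """Return the recommended action for a History status."""
    # Statuses not listed in the article fall back to a neutral message.
    return NEXT_ACTION.get(status, "Unknown status")
```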


How much it costs

The cost depends on the video duration and the quality mode selected:

Mode              Price          What it gives
----------------  -------------  ----------------------------------------
Standard          5 tokens/sec   Fast processing, good quality dubbing
Enhanced Cloning  10 tokens/sec  Higher quality, closer to original voice

The estimated token cost for your video is shown before you tap Translate. For the full pricing reference, see How tokens work.
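As a rough illustration of the arithmetic, the cost is the video duration in seconds multiplied by the per-second rate for the selected mode. The sketch below uses the two rates from this article. The function name `estimate_tokens` is hypothetical, and rounding fractional seconds up is an assumption; the estimate shown in the app is the authoritative figure.

```python
import math

# Rates from this article, in tokens per second of video.
RATES = {"standard": 5, "enhanced_cloning": 10}


def estimate_tokens(duration_seconds: float, mode: str = "standard") -> int:
    """Estimate the token cost of translating a video of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    if mode not in RATES:
        raise ValueError(f"Unknown mode: {mode}")
    # Assumption: partial tokens are rounded up to the next whole token.
    return math.ceil(duration_seconds * RATES[mode])
```

Under these assumptions, a 60-second video costs 300 tokens in Standard mode and 600 tokens with Enhanced Cloning.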


Settings

Tap the settings icon before translating to adjust the output:

  • Audio only: translates the audio track only, without lip sync. Use this when lip sync is not important or the face is not clearly visible.

  • Enhanced Cloning: doubles the cost (10 tokens/sec instead of 5) but produces a more natural-sounding result that is closer to the original voice.

