Token pricing for each tool

Approximate token cost for Translate Audio, Translate Video, Speech to Text, Text to Speech, AI Chat and more.

Written By Umakhan Magomedov

Last updated 4 days ago

This page lists the approximate token cost for each AI tool in VocaLingo. The app always shows an estimate before you start, so you can verify the cost for your specific input.

ℹ️ All prices are estimates. Actual charges may vary slightly based on final processing results. 1 token ≈ $0.01 USD.
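
As a rough illustration of the conversion above, the sketch below turns a token amount into an approximate USD value and back. The function names are hypothetical and not part of VocaLingo; the only figure taken from this page is the approximate rate of 1 token ≈ $0.01.

```python
# Approximate rate from this page: 1 token ≈ $0.01 USD, i.e. about 100 tokens per dollar.
TOKENS_PER_USD = 100


def tokens_to_usd(tokens: float) -> float:
    """Return the approximate USD value of a token amount."""
    return tokens / TOKENS_PER_USD


def usd_to_tokens(usd: float) -> float:
    """Return roughly how many tokens a USD amount corresponds to."""
    return usd * TOKENS_PER_USD


if __name__ == "__main__":
    print(tokens_to_usd(150))  # 1.5
    print(usd_to_tokens(2))    # 200
```

Because the rate is approximate, treat the result as a ballpark figure rather than a billing amount.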

Translate Audio

Mode | Approximate cost
Standard voice (OpenAI) | Depends on audio length and text
MiniMax voice cloning | Per character of translated text
Heygen voice cloning | Per second of audio

Translate Video

Mode | Price
Standard | 5 tokens per second of video
Enhanced Cloning | 10 tokens per second of video
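
The per-second rates above can be turned into a quick estimate. This is a minimal sketch, not VocaLingo's actual billing logic: the function name and mode keys are hypothetical, and it assumes partial seconds are billed proportionally, which this page does not state.

```python
# Rates taken from the Translate Video table (tokens per second of video).
VIDEO_TOKENS_PER_SECOND = {
    "standard": 5,
    "enhanced_cloning": 10,
}


def estimate_video_tokens(duration_seconds: float, mode: str = "standard") -> float:
    """Estimate the token cost of translating a video of the given length."""
    if mode not in VIDEO_TOKENS_PER_SECOND:
        raise ValueError(f"unknown mode: {mode!r}")
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Assumption: cost scales linearly with duration, with no rounding.
    return duration_seconds * VIDEO_TOKENS_PER_SECOND[mode]


if __name__ == "__main__":
    print(estimate_video_tokens(90))                      # 450
    print(estimate_video_tokens(90, "enhanced_cloning"))  # 900
```

For example, a 90-second video comes to about 450 tokens in Standard mode and about 900 tokens with Enhanced Cloning. The estimate shown in the app before you start remains the figure to rely on.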

Speech to Text

Charged per minute of audio. Generating a short summary (Essence) costs extra.

Video to Text

Same as Speech to Text: charged per minute of audio extracted from the video. A summary costs extra when generated.

Text Analysis

Charged for speech recognition (if audio is uploaded) plus the analysis step. The cost is visible before you start.

Text to Speech

Voice type | Billing basis
OpenAI voices | Per minute of estimated audio (minimum 1 minute)
Standard MiniMax voices (HD) | Per character, HD rate
Standard MiniMax voices (Turbo) | Per character, Turbo rate (lower cost)
Custom voice (Qwen) | Per character, minimum ~9 tokens
Custom voice (MiniMax, first use) | 150 tokens activation + per character
Custom voice (MiniMax, subsequent) | Per character, HD rate
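
The minimums and the one-time activation fee in the table can be sketched as below. This page does not publish the per-minute or per-character rates, so they are passed in as hypothetical parameters; the function names are also illustrative. Only the 1-minute minimum, the roughly 9-token minimum and the 150-token activation fee come from the table, and the rounding behaviour is an assumption.

```python
MINIMAX_ACTIVATION_TOKENS = 150  # one-time fee on first use of a custom MiniMax voice
QWEN_MINIMUM_TOKENS = 9          # approximate minimum charge for a custom Qwen voice


def openai_voice_tokens(estimated_minutes: float, tokens_per_minute: float) -> float:
    """Per-minute billing with a 1-minute minimum (rate is a hypothetical input)."""
    # Assumption: durations above 1 minute are billed proportionally, not rounded up.
    return max(1.0, estimated_minutes) * tokens_per_minute


def qwen_custom_voice_tokens(char_count: int, tokens_per_char: float) -> float:
    """Per-character billing with a minimum of about 9 tokens."""
    return max(QWEN_MINIMUM_TOKENS, char_count * tokens_per_char)


def minimax_custom_voice_tokens(
    char_count: int, tokens_per_char: float, first_use: bool
) -> float:
    """Per-character billing plus the activation fee on first use only."""
    activation = MINIMAX_ACTIVATION_TOKENS if first_use else 0
    return activation + char_count * tokens_per_char


if __name__ == "__main__":
    # The rates below are made-up placeholders, chosen only to show the arithmetic.
    print(openai_voice_tokens(0.5, tokens_per_minute=4))         # 4.0
    print(qwen_custom_voice_tokens(10, tokens_per_char=0.5))     # 9
    print(minimax_custom_voice_tokens(1000, 0.25, first_use=True))  # 400.0
```

The sketch shows why the first generation with a custom MiniMax voice costs noticeably more than later ones: the 150-token activation is charged once, after which only the per-character cost applies.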

AI Chat

Charged per message. Cost depends on the selected model, message length and conversation history. More capable models cost more per message.

