How Text to Speech works
Type or paste text, choose a voice and generate an audio file in seconds.
Written By Umakhan Magomedov
Last updated 4 days ago
Text to Speech converts written text into a spoken audio file. Type or paste your text, pick a voice and tap Generate to get the result in seconds.
When to use
Create a voiceover for a presentation, video or podcast
Listen to translated text as audio instead of reading it
Practice pronunciation by generating audio of a foreign-language phrase
Generate a narration track with your own cloned voice
How to run
Open Text to Speech from the Tools tab.
Go to the Text tab and type or paste your text. You can also tap Paste to insert text from the clipboard, or dictate it with voice input.
Go to the Voice tab and select a voice. You can choose from OpenAI voices, standard MiniMax voices or your own Custom Voices.
Tap Generate. The audio file appears in the Result tab.
ℹ️ For OpenAI voices, you can add voice instructions on the Voice tab to adjust tone, pace and style. For example: "Speak slowly and warmly" or "Use a formal news presenter style."
What you get
The Result tab shows an audio player with the generated file. From there you can:
Play the audio directly
Download it to your device
Share it through the system share sheet
Generate again with different settings if needed
Each result is saved to History automatically.
How much it costs
The cost depends on the length of your text and the type of voice you choose. OpenAI voices are billed per minute of estimated audio. Standard MiniMax voices and Custom Voices are billed per character. The first time you activate a Custom Voice, a one-time setup fee of 150 tokens is added. For the full pricing breakdown, see Token pricing for each tool.
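To make the billing rules above concrete, here is a minimal sketch of how a cost estimate could be computed. Only the 150-token setup fee comes from this article. The per-minute rate, the per-character rate, the speaking speed used to estimate audio length and the function and voice-type names are hypothetical placeholders, not real prices or part of the app.

```python
# Hypothetical placeholder rates. These are NOT the real prices;
# see "Token pricing for each tool" for the actual values.
OPENAI_TOKENS_PER_MINUTE = 10          # assumed rate per minute of estimated audio
MINIMAX_TOKENS_PER_1000_CHARS = 20     # assumed rate per 1000 characters
ASSUMED_CHARS_PER_MINUTE = 900         # assumed speaking speed for the estimate

# Stated in the article: one-time fee on first activation of a Custom Voice.
CUSTOM_VOICE_SETUP_FEE = 150


def estimate_tokens(text: str, voice_type: str, first_activation: bool = False) -> float:
    """Estimate the token cost for one generation (illustrative only)."""
    if voice_type == "openai":
        # Billed per minute of estimated audio.
        estimated_minutes = len(text) / ASSUMED_CHARS_PER_MINUTE
        return estimated_minutes * OPENAI_TOKENS_PER_MINUTE

    if voice_type in ("minimax_standard", "minimax_custom"):
        # Billed per character.
        cost = len(text) * MINIMAX_TOKENS_PER_1000_CHARS / 1000
        if voice_type == "minimax_custom" and first_activation:
            cost += CUSTOM_VOICE_SETUP_FEE
        return cost

    raise ValueError(f"Unknown voice type: {voice_type}")


if __name__ == "__main__":
    sample = "a" * 1000
    print(estimate_tokens(sample, "minimax_standard"))
    print(estimate_tokens(sample, "minimax_custom", first_activation=True))
```

With these placeholder rates, the only difference between a Standard MiniMax voice and a first-time Custom Voice for the same text is the 150-token setup fee. Later generations with that Custom Voice are charged per character only.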