Voice instructions and presets

Control tone, pace and style of OpenAI voices in Text to Speech using natural-language instructions and eight ready-made presets.

Written By Umakhan Magomedov

Last updated 4 days ago

When using OpenAI voices in Text to Speech, you can provide a natural-language instruction that tells the AI how the voice should sound. You can write your own instruction or choose one of the built-in presets.

ℹ️ Instructions only work with OpenAI voices (Alloy, Shimmer, Echo and others). They are not supported for MiniMax voices or Custom Voices.

How to set an instruction

  1. Open Text to Speech and select an OpenAI voice.

  2. Scroll down to the Instructions section.

  3. Tap the text field and type a description of how the voice should sound, or tap Presets and choose one of the ready-made styles.

  4. Tap Generate. The instruction is sent along with your text and influences the output.
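For readers curious what "sent along with your text" could look like behind the scenes, here is a minimal sketch. It assumes the app forwards the request to OpenAI's speech endpoint, whose public API accepts `model`, `voice`, `input` and `instructions` fields. The function name `build_speech_request` and the choice of the `gpt-4o-mini-tts` model are assumptions for illustration; this article does not document the app's actual request.

```python
import json


def build_speech_request(text, voice="alloy", instruction=None):
    """Build a request body for a text-to-speech call (hypothetical helper).

    Field names follow OpenAI's public speech API. The model name is an
    assumption; the app may use a different one.
    """
    payload = {
        "model": "gpt-4o-mini-tts",  # assumed model that accepts instructions
        "voice": voice,
        "input": text,
    }
    # Only attach the instruction when the user actually typed something.
    cleaned = (instruction or "").strip()
    if cleaned:
        payload["instructions"] = cleaned
    return payload


request = build_speech_request(
    "Welcome back. Take a deep breath and relax.",
    voice="alloy",
    instruction="Soft, measured pace with natural pauses.",
)
print(json.dumps(request, indent=2))
```

The instruction travels as a separate field rather than being mixed into the spoken text, which is why it shapes the delivery without being read aloud.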


Available presets

| Preset | What it does |
| --- | --- |
| Calm | Soft, measured pace with natural pauses. Good for meditation, ambient or relaxing content. |
| Cheerful | Upbeat and enthusiastic with lively intonation. Good for marketing, promos or upbeat content. |
| News | Professional news anchor: clear articulation, neutral intonation, confident pace. Good for announcements and reports. |
| Whisper | Soft, intimate whisper. Good for ASMR, bedtime stories or private-feeling content. |
| Story | Warm audiobook narrator with atmospheric emphasis. Good for stories and long-form content. |
| Teacher | Clear and instructive: pauses after key points, emphasizes important words. Good for educational content. |
| Dramatic | Movie trailer voice-over: deep tone, slow pace, powerful pauses. Good for cinematic or high-impact content. |
| Kids | Friendly, playful and warm. Slow pace with highlighted fun words. Good for children's content. |
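A preset can be thought of as a saved instruction that is looked up by name. The sketch below models that idea with a plain dictionary. The instruction strings are taken from the descriptions in the table above and are stand-ins only: the exact text the app sends for each preset is not documented here, and `PRESETS` and `preset_instruction` are hypothetical names.

```python
# Stand-in instruction text per preset, copied from the table descriptions.
# Assumption: the app's real preset wording may differ.
PRESETS = {
    "Calm": "Soft, measured pace with natural pauses.",
    "Cheerful": "Upbeat and enthusiastic with lively intonation.",
    "News": "Professional news anchor: clear articulation, neutral intonation, confident pace.",
    "Whisper": "Soft, intimate whisper.",
    "Story": "Warm audiobook narrator with atmospheric emphasis.",
    "Teacher": "Clear and instructive: pauses after key points, emphasizes important words.",
    "Dramatic": "Movie trailer voice-over: deep tone, slow pace, powerful pauses.",
    "Kids": "Friendly, playful and warm. Slow pace with highlighted fun words.",
}


def preset_instruction(name):
    """Return the instruction text for a preset, ignoring case and padding."""
    key = name.strip().capitalize()
    if key not in PRESETS:
        available = ", ".join(PRESETS)
        raise KeyError(f"Unknown preset {name!r}. Available: {available}")
    return PRESETS[key]


print(preset_instruction("whisper"))  # prints: Soft, intimate whisper.
```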


Writing custom instructions

You can write any instruction in plain language. The model understands descriptions of tone, pace, emotion and style. Examples:

  • "Speak with a slight British accent. Slow pace. Thoughtful pauses."

  • "Sound like a friendly customer service agent. Warm and helpful."

  • "Energetic sports commentator voice. Fast pace, excitement."
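The examples above share a pattern: a short description of the style, then the pace, then any extra details, each as its own brief sentence. If you generate instructions in bulk, a small helper can assemble them the same way. This is only an illustrative sketch; `compose_instruction` is a hypothetical name and not part of the app.

```python
def compose_instruction(style, pace=None, details=()):
    """Join short descriptive fragments into one instruction string."""
    parts = [style]
    if pace:
        parts.append(f"{pace} pace")
    parts.extend(details)

    sentences = []
    for part in parts:
        # Normalize each fragment: trim it, capitalize it, end with one period.
        part = part.strip().rstrip(".")
        if part:
            sentences.append(part[0].upper() + part[1:] + ".")
    return " ".join(sentences)


print(compose_instruction(
    "Speak with a slight British accent",
    pace="slow",
    details=["Thoughtful pauses"],
))
# prints: Speak with a slight British accent. Slow pace. Thoughtful pauses.
```

Keeping each trait in its own short sentence makes it easy to change one aspect, such as the pace, when the first result is not what you wanted.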

ℹ️ Instructions are a prompt to the AI, not a guaranteed transformation. Results may vary between voices and languages. If the output does not match what you described, try rephrasing the instruction.

