Music, sound effects & dialogue
Build a soundtrack on PonPon beyond voiceover: generate music, design sound effects, re-voice clips with the voice changer, script multi-voice dialogue, and mix the layers.
Voiceover and dubbing are the speech side of the audio studio. This page covers the rest of the soundtrack — music, sound effects, the voice changer, and multi-voice dialogue. For the full studio overview, start with Voiceover and audio basics.

Music
Generate a backing track in the music mode (also available as a standalone AI music generator). Prompt for a style and an energy level, not a specific song:
Warm lo-fi, relaxed, mellow keys — for a calm product montage.
Driving electronic, upbeat, punchy bass — for a sneaker ad.
- Toggle instrumental when you don't want vocals competing with a voiceover.
- Set the length to match your edit.
Sound effects
Describe a sound in the sound effects mode (or the AI sound effect generator) — "heavy rain on a tin roof," "sci-fi door whoosh," "distant city traffic." You can set the clip length and how strictly it follows the prompt.
Layer effects under a clip to make a silent render feel alive — footsteps, ambience, a single accent hit on a cut.
Voice changer
Already have a recording? The voice changer re-voices it in a different voice while keeping your timing and delivery — useful for anonymizing or restyling narration. There's a denoise option to clean up a rough source first.
Dialogue
The dialogue mode generates a multi-voice conversation: write the script line by line and assign a different voice to each speaker (there are 37 voices to choose from). Good for skits, explainer back-and-forths, and character scenes.
Mixing the layers
A finished mix usually stacks three things, loudest to quietest:
- Voice — a voiceover or dialogue track, the loudest element and the one viewers follow.
- Music — a bed underneath, instrumental and noticeably quieter so it never fights the voice.
- Effects — accents and ambience for texture, lowest of all.
Generate each, then assemble and balance them against your visuals in Flow or Studio.
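To make the loudest-to-quietest ordering concrete, here is a minimal Python sketch of how three layers could be balanced outside the app. It is an illustration only: the decibel levels, the `LAYER_LEVELS_DB` table, and the `mix_layers` helper are assumptions chosen for the example, not PonPon defaults or a PonPon API. It assumes each layer is a mono list of samples in the range -1.0 to 1.0 at the same sample rate.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


# Hypothetical levels for illustration: voice on top, music underneath,
# effects lowest. Tune these by ear for a real mix.
LAYER_LEVELS_DB = {"voice": 0.0, "music": -12.0, "effects": -18.0}


def mix_layers(layers: dict[str, list[float]]) -> list[float]:
    """Apply each layer's gain, sum the layers, and clamp the result."""
    length = max(len(samples) for samples in layers.values())
    mixed = [0.0] * length
    for name, samples in layers.items():
        gain = db_to_gain(LAYER_LEVELS_DB[name])
        for i, sample in enumerate(samples):
            mixed[i] += sample * gain
    # Hard-clamp to the valid sample range so the sum never overflows.
    return [max(-1.0, min(1.0, s)) for s in mixed]
```

Calling `mix_layers({"voice": voice, "music": music, "effects": effects})` returns one combined track as long as the longest layer. Hard clamping is the simplest safeguard against overflow; in practice you would lower the music or effects level rather than rely on it.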
Related articles
- Voiceover & audio
  The PonPon audio studio: text-to-speech, voice changer, dubbing into 31 languages, sound effects, music, and multi-voice dialogue, powered by ElevenLabs and MiniMax.
- AI dubbing
  Dub a video or audio clip into another language with AI on PonPon: 31 target languages, how dubbing differs from voiceover, a worked example, source prep, and pairing with lip-sync.
- Text-to-video basics
  How video generation works on PonPon: text-to-video vs image-to-video, choosing models like Veo 3.1, Sora 2 and Kling 3.0, and the Edit and Motion Control tabs.