Audio

Updated: 2026-05

1. About This Page

This guide covers Runway’s audio generation features: text-to-speech, sound effects, and lip-syncing. Even simple audio makes a video prototype noticeably more convincing.

2. The Three Types of Sound

Video prototypes use three types of sound:

Type	Runway Features	Use
Dialogue & Narration	Text-to-Speech, Lip-Sync	Character dialogue, commentary
Sound Effects (SFX)	Generative Audio	Ambient sounds, specific sounds
Background Music (BGM)	(Runway alone is limited)	Atmosphere, mood

Runway excels at the first two tasks, but it’s often better to create the background music using a separate tool (such as Suno, AIVA, or Adobe Firefly Audio).

3. Text-to-Speech

Text-to-Speech converts typed text into spoken audio using a voice you select.

3.1 How to Use

Left navigation bar → Generative Audio → Text to Speech
Enter text (up to 600 characters per speaker)
Select Voice — Choose from a wide range of preset voices
Click Generate → An audio file is created
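
Narration longer than the 600-character limit has to be split before it is pasted in. The following is a minimal sketch of one way to do that at sentence boundaries. The function name `split_for_tts` is hypothetical, and it assumes the limit is counted the way Python's `len()` counts characters, which may differ from how Runway counts them.

```python
import re

MAX_CHARS = 600  # per-speaker limit stated in the steps above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking at sentence ends."""
    # Split after ., ! or ? followed by whitespace; punctuation stays with its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be generated as a separate audio file and placed back to back on the timeline.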

3.2 Types of Voices

Preset Voices — Runway’s standard presets (free)
Custom Voice — You can also clone your own voice through the ElevenLabs integration (subject to plan restrictions)

3.3 Examples of Narration Use

A narration at the beginning of the work, such as “This is a story set in a city of the future”
A character’s inner thoughts (monologue)
A voiceover reading explanatory subtitles

4. Lip-syncing

Lip Sync makes a person in a video move their lips in time with any audio track you choose.

4.1 Basic Workflow

Left navigation → Generative Audio → Lip Sync
Video Source: Upload a video (showing a person’s face)
Audio Source: Specify the audio in one of three ways
- Text input (Text-to-Speech runs in the background)
- Upload an audio file
- Record on the spot
Generate → A video with lip movements synchronized to the audio

4.2 Constraints

Up to 600 characters per dialogue
Up to 10 dialogues per video (supports multiple speakers)
Videos where faces are shown from the front to a three-quarter front view have a higher success rate
Accuracy decreases for side profiles and poorly lit faces
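
The two numeric constraints above can be checked on a script before any credits are spent. This is a small sketch, not part of Runway: the `Dialogue` class and `check_lip_sync_script` function are hypothetical names, and only the limits of 600 characters and 10 dialogues come from this page.

```python
from dataclasses import dataclass

MAX_CHARS_PER_DIALOGUE = 600   # limit listed in the constraints above
MAX_DIALOGUES_PER_VIDEO = 10   # limit listed in the constraints above


@dataclass
class Dialogue:
    speaker: str
    text: str


def check_lip_sync_script(dialogues: list[Dialogue]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the script fits."""
    problems: list[str] = []
    if len(dialogues) > MAX_DIALOGUES_PER_VIDEO:
        problems.append(
            f"{len(dialogues)} dialogues exceed the limit of {MAX_DIALOGUES_PER_VIDEO}"
        )
    for index, dialogue in enumerate(dialogues, start=1):
        length = len(dialogue.text)
        if length > MAX_CHARS_PER_DIALOGUE:
            problems.append(
                f"dialogue {index} ({dialogue.speaker}) has {length} characters, "
                f"limit is {MAX_CHARS_PER_DIALOGUE}"
            )
    return problems
```

The face-angle and lighting conditions cannot be checked this way and still need a visual review of the source video.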

4.3 Relationship with Act-Two

Act-Two transfers the whole performance (facial expressions and body movement), while Lipsync only synchronizes lip movements with audio.

Purpose	Recommendation
Want to include facial expressions and body language	Act-Two
Want to add dialogue to existing video	Lipsync
Only need the mouth to move	Lipsync (Light)

5. Sound Effects (SFX)

You can also generate sound effects using Runway’s Generative Audio.

5.1 How to Use

Generative Audio → Sound Effects (or Generate Sound)
Describe the sound you want as a text prompt

Examples:

“Footsteps on wet pavement, slow pace”
“Distant thunder, low rumble”
“Birds chirping on a forest morning”

5.2 Using Existing Sound Libraries

You don’t have to generate everything with AI. Make use of free sound effect libraries as well:

Freesound.org — A large library of Creative Commons audio files
Pixabay Audio — Available for commercial use
BBC Sound Effects — Over 50,000 high-quality sound effects

For class projects, a hybrid approach combining AI-generated content and existing libraries is a practical option.

6. Background Music

Runway alone isn’t very good at generating background music. The standard approach is to get it from one of the following instead:

Suno — Generates music from text (free plan available)
AIVA — Focuses on film music
Adobe Firefly Audio — By Adobe
YouTube Audio Library — Royalty-free, for commercial use

Add the generated or selected background music to the Runway Editor timeline.

7. Volume Balance

After adding the clips to the timeline, set their levels roughly as follows:

Background music: -20 to -25 dB (low)
Sound effects: -15 to -20 dB
Dialogue: -5 to -10 dB (highest)

The exact dB values matter less than the principle: the dialogue should always be the loudest element in the mix.
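
To get a feel for how far apart these levels are, decibels can be converted to a linear amplitude factor with the standard formula gain = 10^(dB/20). The sketch below assumes the values above are amplitude gains in dB; the function names and the choice of the midpoint of each range are illustrative only.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in decibels to a linear amplitude factor (0 dB -> 1.0)."""
    return 10 ** (db / 20)


def dialogue_is_loudest(levels_db: dict[str, float], dialogue_key: str = "dialogue") -> bool:
    """Check the principle above: dialogue must be louder than every other track."""
    dialogue_level = levels_db[dialogue_key]
    return all(
        level < dialogue_level
        for name, level in levels_db.items()
        if name != dialogue_key
    )


# Midpoints of the ranges listed above (an illustrative choice).
mix = {"bgm": -22.5, "sfx": -17.5, "dialogue": -7.5}

for name, level in mix.items():
    print(f"{name}: {level} dB -> amplitude x{db_to_gain(level):.3f}")
print("dialogue loudest:", dialogue_is_loudest(mix))
```

A difference of 20 dB corresponds to a factor of 10 in amplitude, so background music at -25 dB sits far below dialogue at -5 dB.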

8. Fade In and Fade Out

Fade the audio at both ends of the clip:

At a minimum, fade the first and last second of the BGM
Sounds that start or end abruptly sound very amateurish
Use the Editor to drag the edges of each audio clip to set the fade
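
For readers curious what a fade does to the signal, here is a minimal sketch of a linear fade-in and fade-out applied to raw samples. It is not how the Runway Editor implements fades (that is not documented here); `apply_fades` and its parameters are hypothetical, and samples are assumed to be plain floats.

```python
def apply_fades(samples: list[float], sample_rate: int, fade_seconds: float = 1.0) -> list[float]:
    """Return a copy of `samples` with a linear fade-in and fade-out of `fade_seconds` each."""
    n = len(samples)
    # Never let the two fades overlap on very short clips.
    fade_len = min(int(sample_rate * fade_seconds), n // 2)
    out = list(samples)
    for i in range(fade_len):
        ramp = i / fade_len      # rises from 0.0 toward 1.0 across the fade
        out[i] *= ramp           # fade-in at the start of the clip
        out[n - 1 - i] *= ramp   # mirrored fade-out at the end of the clip
    return out
```

With a one-second fade, the first and last second ramp smoothly between silence and full level, which removes the abrupt start and stop described above.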

9. Copyright Notice

The following rules must be strictly observed even for class projects:

✗ Use of commercial music (e.g., songs from Apple Music)
✗ Extracting audio from YouTube videos
✓ AI-generated voice and sound effects
✓ Creative Commons materials (check licenses such as CC BY)
✓ Materials declared to be copyright-free

Since the video may be made public outside the university during the final presentation, we will use only royalty-free materials from the start.

10. Priorities in Sound Design

When time is limited, work on the sound for a video prototype in this order:

Voice lines, if any (Lipsync or Text-to-Speech)
1–2 main sound effects (distinctive sounds such as footsteps or ambient noise)
Background music (mixed in quietly once the above are in place)
Other minor sound effects

Aiming for perfection can consume unlimited time, so decide in advance to stop once the sound is about 80% of the way there.

11. Example Workflow for the Practical Training Session

Sound Design for a 30-Second Short Film:

Time	Sound
0:00-0:05	BGM fades in, snow sound effects
0:05-0:15	Protagonist’s narration (lip-sync), BGM continues
0:15-0:25	Ambient sounds, footsteps, BGM emphasized
0:25-0:30	BGM fades out

A 30-second clip can be composed of 3 to 4 sound clips. Expect this to take one group 30 to 60 minutes.
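
The plan above can also be written down as data, which makes it easy to confirm that every second of the film has some sound assigned. This is an optional planning sketch; the `TIMELINE` structure and `find_gaps` function are hypothetical and unrelated to any Runway feature.

```python
# (start_second, end_second, sounds) taken from the table above
TIMELINE = [
    (0, 5, ["BGM fade-in", "snow SFX"]),
    (5, 15, ["narration (lip-sync)", "BGM"]),
    (15, 25, ["ambient", "footsteps", "BGM emphasized"]),
    (25, 30, ["BGM fade-out"]),
]


def find_gaps(timeline: list[tuple[int, int, list[str]]], total_seconds: int) -> list[tuple[int, int]]:
    """Return (start, end) ranges within the film that have no sound planned."""
    gaps: list[tuple[int, int]] = []
    cursor = 0  # end of the time range covered so far
    for start, end, _sounds in sorted(timeline, key=lambda segment: segment[0]):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_seconds:
        gaps.append((cursor, total_seconds))
    return gaps


print(find_gaps(TIMELINE, total_seconds=30))
```

An empty list means the 30 seconds are fully covered; any tuple in the result is a silent stretch to fill or to keep silent on purpose.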

12. What’s Next

Video Prototyping Mindset — A mindset for defining your goals
Limits and Next — Runway’s limitations and other models
