Sound That Doesn't Get in Its Own Way
The goal of a film audio mix is deceptively simple: every element in the soundtrack should be clearly audible and contribute to the emotional experience without fighting for space with everything else. When a mix works, you don't notice it. When it doesn't, you're adjusting your volume constantly, straining to hear dialogue, or wincing at unexpected loudness spikes.
For indie filmmakers doing their own mix — which is the reality for most shorts and microbudget features — understanding the fundamentals of audio mixing will get you to a professional result faster than any amount of plug-in spending.
The Signal Chain: What Happens to Audio
Understanding the order of operations in audio processing prevents the most common mistakes.
A typical signal chain for a dialogue track goes:
- EQ (equalization): Shaping the frequency content of the signal — cutting unwanted frequencies, boosting useful ones
- Dynamics processing (noise gate, de-esser, compressor): Controlling the dynamic range — reducing extreme peaks, evening out inconsistent levels
- Reverb/room treatment (if needed): Adding or matching room tone
- Fader/level: The final output level of the track
Order matters. EQ before compression means you're shaping the tonal character before you stabilize the dynamics. EQ after compression means you're shaping the stabilized signal. Both have applications, but EQ-before-compression is the standard starting point for dialogue.
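To make the order-of-operations point concrete, here is a minimal Python sketch (not any real DAW's API) where the EQ stage is stood in for by a flat 6 dB cut and the compressor by a hard-knee gain curve. Swapping the order changes the output:

```python
def eq_cut(x):
    """Stand-in for an EQ stage: a flat 6 dB attenuation."""
    return x * 10 ** (-6 / 20)

def compress(x, threshold=0.5, ratio=4.0):
    """Hard-knee compressor applied to one sample's absolute level."""
    level = abs(x)
    if level <= threshold:
        return x
    squeezed = threshold + (level - threshold) / ratio
    return squeezed if x >= 0 else -squeezed

loud = 0.9
a = compress(eq_cut(loud))  # EQ first: 0.9 is cut below the threshold,
                            # so the compressor never engages
b = eq_cut(compress(loud))  # compressor first: 0.9 is squeezed to 0.6,
                            # then the EQ cut is applied on top
print(a > b)  # → True: same two stages, different order, different result
```

In a real chain the EQ is frequency-dependent rather than a flat cut, which makes the ordering matter even more: cutting rumble before the compressor stops low-frequency energy from triggering gain reduction.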
Understanding the Mix Hierarchy
Every film mix has a hierarchy: some elements are more important than others, and the mix should reflect this.
Dialogue always sits at the top. If dialogue is fighting with anything — music, effects, ambience — the dialogue wins. Every other element exists to support the story, and the story is told through what characters say. Audiences are remarkably forgiving about effects levels and music; they are not forgiving about not being able to hear the dialogue.
Sound effects and Foley sit in the middle. They define the physical reality of the scene but should never compete with spoken words.
Music sits underneath. Score and licensed tracks should support the emotional register of the scene without drowning its action. Under dialogue, music should typically be significantly lower than its solo level — what feels "right" as a musical statement often needs to drop by 6–12 dB to sit correctly under speech.
Ambience and room tone sit at the bottom. These are continuous background layers that fill the space and keep scenes from sounding "dead." They should be felt more than heard.
Setting Dialogue Levels
Dialogue should typically peak around -6 to -3 dBFS on your meter for streaming and online delivery, with average conversational levels around -18 to -12 dBFS. These numbers vary slightly by delivery format — broadcast has specific loudness requirements (typically -24 LKFS per ATSC A/85, the standard referenced by the US CALM Act; -23 LUFS per EBU R128 in Europe).
What to look and listen for:
- Conversational speech should be consistently audible without straining
- Lines delivered at different volumes (quiet intimate moment vs. shouted argument) will need different level adjustments — automate faders or use compression to manage this
- Inconsistencies between takes (different recording days, different microphone positions) need to be manually matched
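To relate those meter numbers to actual sample values, here is a small sketch in plain Python (assuming float samples in the -1.0 to 1.0 range, as most DAWs and audio libraries use internally) that measures peak and average (RMS) levels in dBFS:

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS for float samples in [-1.0, 1.0]."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """Average (RMS) level in dBFS."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

# A -6 dBFS peak corresponds to a linear amplitude of about 0.501
tone = [0.501187 * math.sin(2 * math.pi * 440 * n / 48000)
        for n in range(48000)]
print(round(peak_dbfs(tone), 1))  # → -6.0
print(round(rms_dbfs(tone), 1))   # → -9.0 (a sine's RMS sits 3 dB below its peak)
```

Note that RMS is only a rough proxy for perceived loudness; broadcast LUFS/LKFS measurement adds frequency weighting and gating on top of this idea.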
EQ for Dialogue: What Actually Helps
Dialogue EQ is primarily subtractive — you cut problems more than you boost qualities.
Common EQ tasks for dialogue:
- High-pass filter (low-cut) at 80–120 Hz: Removes low-frequency rumble, HVAC hum, and handling noise that the human voice doesn't need to communicate clearly
- Cut room resonances (200–400 Hz range): Boxy, hollow-sounding room artifacts often live here. Narrow cuts in this range can clean up muddy-sounding recordings significantly
- Presence and intelligibility (2–5 kHz): Gentle boosts here can improve speech clarity and help dialogue cut through a mix
- Harshness (5–8 kHz): Sibilance and harsh artifacts from certain microphone and room combinations can be tamed with moderate cuts in this range
Listen for what's wrong more than what's missing. The goal is a clean, natural-sounding voice — not a sculpted radio voice.
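As a rough illustration of what the high-pass filter in the list above does, here is a first-order high-pass in plain Python. It is a much cruder filter than any DAW's low-cut (which is usually steeper), but it shows low-frequency rumble being attenuated while speech-range content passes through:

```python
import math

def high_pass(samples, cutoff_hz=100.0, sample_rate=48000):
    """First-order (6 dB/octave) high-pass filter -- a crude stand-in
    for a DAW's low-cut, which typically rolls off more steeply."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

def tone(freq, n=48000, rate=48000):
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

rumble = high_pass(tone(50))    # 50 Hz hum: heavily attenuated
speech = high_pass(tone(1000))  # 1 kHz content: passes nearly untouched
print(max(abs(s) for s in rumble) < 0.5)   # → True
print(max(abs(s) for s in speech) > 0.95)  # → True
```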
Compression: Managing Dynamic Range
Compression reduces the volume of loud sounds to make the dynamic range more manageable in the mix. For dialogue, this means evening out the difference between a whisper and a shout so both sit in an audible range without the mix constantly chasing the loudest moment.
A gentle dialogue compressor setting as a starting point:
- Attack: 5–15 ms (fast enough to catch peaks, slow enough to let transients through naturally)
- Release: 60–120 ms
- Ratio: 2:1 to 4:1
- Threshold: Set so the compressor is engaging on loud lines but barely touching quiet ones
- Gain reduction: 3–6 dB on average, more on extreme variations
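Ignoring attack and release smoothing, the threshold and ratio settings above reduce to a simple static gain curve. This sketch (plain Python, with illustrative threshold and ratio values) shows how much reduction a line receives as it rises past the threshold:

```python
def gain_reduction_db(level_db, threshold_db=-20.0, ratio=3.0):
    """Static curve of a downward compressor: dB of gain reduction
    applied at a given input level (attack/release smoothing omitted)."""
    if level_db <= threshold_db:
        return 0.0
    # The input exceeds the threshold by some amount; the output only
    # rises 1/ratio as fast, so the reduction is the excess * (1 - 1/ratio).
    return (level_db - threshold_db) * (1 - 1 / ratio)

print(gain_reduction_db(-25.0))            # → 0.0 (quiet line: untouched)
print(round(gain_reduction_db(-11.0), 1))  # → 6.0 (9 dB over at 3:1)
```

This matches the guidance above: quiet lines pass through with no reduction, while a shouted line 9 dB over the threshold lands at about 6 dB of gain reduction.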
Over-compression makes dialogue sound pumped and unnatural — the level appears stable but the words lose their natural emphasis and weight. Use it as a tool, not a fix for fundamentally bad audio.
Music in the Mix: The Most Common Mistake
Inexperienced mixers consistently set music too loud relative to dialogue. This happens because the music is judged in isolation, at a level that sounds right on its own, rather than against the dialogue it has to sit under.
The right approach:
- Start with music at a level that feels correct in isolation
- Now drop it by 6 dB
- Now bring up your dialogue track
- Adjust until the dialogue is clearly intelligible over the music
- Automate music levels: bring music up in sections without dialogue; duck it significantly under speech
Automation — manually drawing level changes in your DAW across the timeline — is the single most powerful tool in film mixing. Rather than setting a static level for music and hoping it works everywhere, automation lets you precisely control the level moment by moment. Invest the time to automate your music track through the whole film.
Building a Basic Mix Session
For an indie film mix without a dedicated facility, here is a practical session structure:
Routing architecture:
- All dialogue tracks → a Dialogue Bus
- All music tracks → a Music Bus
- All effects and Foley → an Effects Bus
- All ambience → an Ambience Bus
- All buses → Master Fader
Working with buses lets you adjust the relative balance of each category with a single fader rather than chasing individual track levels. It also gives you clean stems for delivery (each bus can be exported as a separate file).
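The routing above can be thought of as summing each category's tracks and then applying one group fader to the sum. A minimal sketch of a bus (plain Python, float samples, illustrative gain values):

```python
def db_to_linear(db):
    """Convert a fader value in dB to a linear gain multiplier."""
    return 10 ** (db / 20)

def mix_bus(tracks, bus_gain_db=0.0):
    """Sum equal-length tracks sample by sample, then apply the bus fader."""
    g = db_to_linear(bus_gain_db)
    return [g * sum(samples) for samples in zip(*tracks)]

# Two dialogue tracks summed into a dialogue bus trimmed by 3 dB --
# one fader move now affects all dialogue at once
boom = [0.10, 0.20, 0.15]
lav  = [0.05, 0.00, 0.05]
dialogue_bus = mix_bus([boom, lav], bus_gain_db=-3.0)
```

Exporting each bus's output to its own file is exactly what produces the delivery stems mentioned above.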
Order of operations for the mix:
- Balance and EQ dialogue tracks
- Set ambience levels under dialogue
- Place effects and Foley in proportion to dialogue
- Place music under everything else
- Automate level changes across the timeline
- Final check: listen on different playback systems (speakers, headphones, earbuds, laptop speakers)
Checking Your Mix: The Multi-System Test
A mix that sounds great on your studio monitors may be inaudible on a laptop or too loud on earbuds. Check your final mix on:
- Studio monitors or good headphones (your reference)
- Consumer earbuds (the most common listening environment)
- Built-in laptop speakers (the least forgiving)
- A home stereo system if available
The specifics will vary across these systems, but the dialogue should remain intelligible and the hierarchy (dialogue over effects over music) should hold on all of them. If the dialogue gets buried on laptop speakers, the music is too loud.
Loudness Normalization and Delivery
Most streaming platforms (YouTube, Vimeo) apply loudness normalization to content — they measure the average loudness of your file and adjust its playback level to meet their target (-14 LUFS on YouTube, for example).
This means an overly loud master won't sound louder than a properly mixed one — it will sound identical in level but worse in quality, because over-compressed files that are normalized down lose their dynamic feel. Mix to proper loudness targets for your intended delivery platform rather than making everything as loud as possible.
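The normalization arithmetic itself is simple: the platform measures your file's integrated loudness and offsets playback by the difference to its target. A sketch (simplified; real measurement follows ITU-R BS.1770, and some platforms only turn content down, never up):

```python
def normalization_gain_db(measured_lufs, target_lufs=-14.0):
    """Playback offset a platform applies to reach its loudness target.
    Simplified: real services measure integrated loudness per
    ITU-R BS.1770, and some only attenuate (never boost)."""
    return target_lufs - measured_lufs

print(normalization_gain_db(-9.0))   # → -5.0: a crushed, hot master is turned DOWN
print(normalization_gain_db(-18.0))  # → 4.0: a quieter master may be turned up
print(normalization_gain_db(-14.0))  # → 0.0: a master at target plays as-is
```

This is why mastering louder than the target buys nothing: the extra level is removed on playback, while the dynamic damage from over-compression remains.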
A simple pre-delivery checklist: dialogue is clear, no peaks above -1 dBFS (for digital delivery), average loudness is in the target range for your primary platform, and stems are exported and labeled.