Why I REFUSE to Record Podcasts Without Headphones (And You Should Too)

Hot take: If you're not wearing headphones while podcasting in 2025, you're doing it wrong. I don't care how good Riverside's echo cancellation is or what magic AI audio cleanup you're using - I will literally refuse to record with guests who won't wear headphones.

In this episode, I explain why "fixing it in post" with apps like Descript and Adobe Podcast AI is actually making your audio WORSE, not better. I've been on shows where my studio-quality audio gets destroyed by automated filters that were never needed in the first place.

The truth is, these AI audio tools make assumptions about your recording environment and apply blanket fixes that can create weird artifacts, dropped sounds, and unpredictable results. When you wear headphones, you get clean, "unopinionated" audio that gives you or your editor full control over the final sound.

🎙️ What you'll learn:

Why automated audio cleanup often makes good audio worse
The difference between surgical editing and sledgehammer approaches
How echo cancellation can hurt when it's not needed
My hardware setup for clean audio (without over-processing)
The guest who got floored when I cancelled our interview

🎧 Are you team headphones or team "the software will fix it"? Let me know at streamlinedfeedback.com!

00:00:00 --> 00:00:02 Today, I want to talk to you about headphones.
00:00:03 --> 00:00:06 When I first started podcasting all the way back
00:00:06 --> 00:00:11 in 2012, headphones were absolutely a requirement.
00:00:11 --> 00:00:14 Also a requirement: having all of your guests
00:00:14 --> 00:00:17 record their audio separately or using an app
00:00:17 --> 00:00:21 like Ecamm Call Recorder on Skype, which was
00:00:21 --> 00:00:25 not great. It definitely led to low-quality shows,
00:00:25 --> 00:00:30 which is why I feel at least my show that I launched
00:00:30 --> 00:00:36 in 2016 took off because I was a stickler for
00:00:36 --> 00:00:42 quality. Now in 2025, I've been getting more
00:00:42 --> 00:00:46 pushback from podcasters and guests about wearing
00:00:46 --> 00:00:49 headphones. And I will tell you straight up,
00:00:49 --> 00:00:53 I'm not burying the lede here. If a guest is
00:00:53 --> 00:00:56 not wearing headphones, I will not record with
00:00:56 --> 00:01:02 them. I don't care that Riverside or SquadCast
00:01:02 --> 00:01:07 or whatever has the echo cancellation. I don't
00:01:07 --> 00:01:09 care if we're recording on Zoom, which I don't
00:01:09 --> 00:01:12 record on Zoom, but that they do the echo cancellation
00:01:12 --> 00:01:17 thing. Headphones are still a requirement for
00:01:17 --> 00:01:20 me. And I think if you care about quality, which
00:01:20 --> 00:01:24 you should, especially now, then headphones should
00:01:24 --> 00:01:26 be your requirement. So first of all, why should
00:01:26 --> 00:01:28 you care about quality now? We have Riverside,
00:01:28 --> 00:01:33 we have Descript, we have apps like Adobe Podcast
00:01:33 --> 00:01:36 and Descript that can make crappy microphones
00:01:36 --> 00:01:42 sound like good studio microphones. But here's
00:01:42 --> 00:01:47 the thing. Fixing audio in software is not as
00:01:47 --> 00:01:53 good as getting the cleanest audio possible.
00:01:54 --> 00:01:57 And I know this because I've gone on podcasts
00:01:57 --> 00:02:01 as a guest where we've recorded with Riverside
00:02:01 --> 00:02:05 and you hear how I sound right now. Not to toot
00:02:05 --> 00:02:10 my own horn or anything, but my audio is amazing.
00:02:11 --> 00:02:14 I have a great microphone going into a great
00:02:14 --> 00:02:16 interface. I have a great recording environment.
00:02:17 --> 00:02:21 I always wear headphones. But the end result
00:02:21 --> 00:02:24 for some of these podcasts is me sounding worse.
00:02:25 --> 00:02:29 And the New Yorker in me, who always assumes
00:02:29 --> 00:02:32 malice, figures, oh, well, they just want the
00:02:32 --> 00:02:35 guest to sound worse than the host, which is
00:02:35 --> 00:02:39 insane. That's too much effort, right? What's
00:02:39 --> 00:02:42 actually happening is they're running it through
00:02:42 --> 00:02:45 Descript or whatever magic editing thing they
00:02:45 --> 00:02:51 do. The effects that they apply actually make
00:02:51 --> 00:02:53 my audio worse, not better, because they're not
00:02:53 --> 00:02:57 giving it to an audio engineer. They're just
00:02:57 --> 00:03:01 throwing it through some app, right? Or just
00:03:01 --> 00:03:04 combining and cleaning up, quote unquote. And
00:03:04 --> 00:03:07 so you'll have like dropped sounds or you'll
00:03:07 --> 00:03:11 have like this weird artifact that shows up sometimes
00:03:11 --> 00:03:13 because they didn't properly do noise removal,
00:03:14 --> 00:03:16 or they did noise removal on noise that wasn't
00:03:16 --> 00:03:19 actually there. So why am I telling you all of
00:03:19 --> 00:03:25 this? Because when you use something like echo
00:03:25 --> 00:03:26 cancellation, and that's the other thing that
00:03:26 --> 00:03:28 they could have done, right? They could have
00:03:28 --> 00:03:31 been using echo cancellation in Riverside even
00:03:31 --> 00:03:36 though I'm wearing headphones. So the echo cancellation
00:03:36 --> 00:03:39 was not necessary. The software is then looking
00:03:39 --> 00:03:43 for stuff to remove. And when it can't find anything,
00:03:43 --> 00:03:47 it does non-deterministic things. Non-deterministic
00:03:47 --> 00:03:50 is a programming term for you can't predict what
00:03:50 --> 00:03:53 it does. Large language models are non-deterministic.
00:03:53 --> 00:03:55 I don't care what the AI quote unquote experts
00:03:55 --> 00:04:00 will tell you. You cannot predict how an AI will
00:04:00 --> 00:04:03 respond to you. Just go ask Elon Musk and Grok.
00:04:04 --> 00:04:10 So when you apply those filters, if they are
00:04:10 --> 00:04:13 not necessary, they will make the audio worse.
00:04:13 --> 00:04:16 If they are necessary, they're going to do things
00:04:16 --> 00:04:18 to the audio that you may not predict or want.
00:04:19 --> 00:04:23 And so when you record your podcast, you should
00:04:23 --> 00:04:26 always wear headphones because you don't want
00:04:26 --> 00:04:28 the guest's audio creeping into your microphone
00:04:28 --> 00:04:34 and vice versa. Right? You want your guests to
00:04:34 --> 00:04:36 wear headphones even if you're recording over
00:04:36 --> 00:04:39 Riverside or whatever and it has that echo cancellation,
00:04:39 --> 00:04:45 because nothing is better than the raw unaffected
00:04:45 --> 00:04:50 analog sound. You can take that and you can fix
00:04:50 --> 00:04:53 it in an app like Logic Pro or you can give it
00:04:53 --> 00:04:56 to an editor or an audio engineer and they can
00:04:56 --> 00:04:59 pull all the right levers, the correct levers
00:04:59 --> 00:05:02 to actually fix the thing that you're trying
00:05:02 --> 00:05:06 to fix. But if you're just kind of wholesale
00:05:06 --> 00:05:10 applying, you know, it's like, it's like if you
00:05:10 --> 00:05:14 decide, oh, we're going to paint the entire house
00:05:14 --> 00:05:18 gray even if, like, the sunroom should be light
00:05:18 --> 00:05:23 blue. Or we're just going to, we're gonna make
00:05:23 --> 00:05:25 a bunch of different lunches for all the kids,
00:05:25 --> 00:05:28 but we're gonna spray ketchup on all of it. Right?
00:05:28 --> 00:05:33 Like, great, ketchup on hamburgers is fine. Ketchup
00:05:33 --> 00:05:37 on pizza is an abomination. Don't @ me on that.
00:05:38 --> 00:05:45 So, like, you're, you're doing with a sledgehammer
00:05:45 --> 00:05:48 what you should do with something more surgical,
00:05:48 --> 00:05:51 right? I think that you're using a, you're using
00:05:51 --> 00:05:54 a hacksaw when you should be using a surgical
00:05:54 --> 00:05:57 knife or whatever. So headphones prevent that.
00:05:57 --> 00:05:59 Headphones will ensure that you get the best
00:05:59 --> 00:06:04 possible quality from your audio that is not
00:06:04 --> 00:06:07 affected by any software that you don't have
00:06:07 --> 00:06:11 a direct hand in fixing. And I'm not saying don't
00:06:11 --> 00:06:14 apply fixes, right? I'm recording this in Logic
00:06:14 --> 00:06:19 Pro and I do have a compressor, but it's a hardware-
00:06:19 --> 00:06:22 based compressor, right? And which is like a
00:06:22 --> 00:06:24 noise gate. It's like the opposite of a noise
00:06:24 --> 00:06:27 gate. I'm not an audio engineer, so I'm not going
00:06:27 --> 00:06:30 to be able to tactfully describe this, but it's
00:06:30 --> 00:06:34 basically like, if there is a sound below a certain
00:06:34 --> 00:06:37 decibel, it's going to ignore it, essentially.
00:06:37 --> 00:06:41 And I am doing that in hardware. I'm not doing
00:06:41 --> 00:06:44 it in software where you can get false positives.
00:06:44 --> 00:06:49 What I'm doing in software is I'm using audio
00:06:49 --> 00:06:55 effects from iZotope. I'll link it in the description.
00:06:55 --> 00:06:59 For breath control and mouth sounds. Because
00:06:59 --> 00:07:03 I can't stand mouth sounds. So I don't like listening
00:07:03 --> 00:07:06 back to my audio with mouth sounds and so, you
00:07:06 --> 00:07:09 know, I have like a de-clicking filter on there.
00:07:10 --> 00:07:14 But again, I'm very surgical about how it's applied
00:07:14 --> 00:07:17 and it's only applied to my audio. When I have
00:07:17 --> 00:07:21 a guest, I don't touch that. I give my editor
00:07:21 --> 00:07:25 both and he handles it because he knows what
00:07:25 --> 00:07:27 is a light touch and what's too heavy-handed.
00:07:30 --> 00:07:34 The point is in 2025, this is not an editing
00:07:34 --> 00:07:36 episode because I, you know, I hate editing.
00:07:37 --> 00:07:41 I don't do a lot of editing myself. I do what
00:07:41 --> 00:07:44 I have to, but I don't like doing a lot of editing
00:07:44 --> 00:07:50 myself. This is about headphones. And so should
00:07:50 --> 00:07:55 you as a podcaster use headphones in 2025? Yes.
00:07:56 --> 00:08:00 Should your guests? Yes. That is going to ensure
00:08:00 --> 00:08:06 that you get the most clear, unopinionated audio
00:08:06 --> 00:08:10 you can possibly get. Because then Riverside
00:08:10 --> 00:08:15 or Descript or Zoom or whatever is not applying
00:08:15 --> 00:08:19 their filters, which have been applied for a
00:08:19 --> 00:08:24 very specific reason. Right? They have made assumptions
00:08:24 --> 00:08:29 about how people are using their software, and
00:08:29 --> 00:08:32 their filters are going to execute those assumptions.
00:08:33 --> 00:08:36 Whereas if you're not, if you're using headphones
00:08:36 --> 00:08:38 and you turn off echo cancellation or whatever
00:08:38 --> 00:08:42 audio filters are in Zoom, there are no assumptions.
00:08:42 --> 00:08:45 So you can understand the environment the person
00:08:45 --> 00:08:49 is recording in. You can hear the issues and
00:08:49 --> 00:08:53 you can fix them later. But also headphones ensure
00:08:53 --> 00:08:55 that you don't have to fix as much, right? When
00:08:55 --> 00:08:58 I recorded in person, this was a problem. Obviously,
00:08:59 --> 00:09:01 we had two microphones, but we were too close
00:09:01 --> 00:09:04 to each other. And we weren't using studio, like,
00:09:04 --> 00:09:09 headphone monitors. So I could hear myself on
00:09:09 --> 00:09:12 my guest's microphone and vice versa. That's
00:09:12 --> 00:09:13 just the name of the game when you're recording
00:09:13 --> 00:09:16 in person, I assume. I don't, I don't have, well,
00:09:17 --> 00:09:19 I shouldn't say I assume. I don't record in person
00:09:19 --> 00:09:21 that often, but we were using Kit Studios, which
00:09:21 --> 00:09:24 was great. But like I didn't understand any of
00:09:24 --> 00:09:26 that going in. And so we did just use the combined
00:09:26 --> 00:09:28 audio there. But again, we were in person. It
00:09:28 --> 00:09:31 was the same environment. We were using the same,
00:09:31 --> 00:09:35 we were each using the same microphone. Like,
00:09:35 --> 00:09:37 separate microphones, but they were the same.
00:09:37 --> 00:09:42 And so, you know, the environment's
00:09:42 --> 00:09:47 going to matter. And getting the most unopinionated
00:09:47 --> 00:09:50 audio is going to ensure that you can get the
00:09:50 --> 00:09:53 best edit possible. Alright, that's it for this
00:09:53 --> 00:09:55 episode of Streamlined Podcaster. Let me know
00:09:55 --> 00:09:58 right over at StreamlinedFeedback.com if you
00:09:58 --> 00:10:01 use headphones or have strong opinions about
00:10:01 --> 00:10:04 not using headphones. I will tell you, like somebody,
00:10:05 --> 00:10:08 there was, this happened one time, a dude got
00:10:08 --> 00:10:12 onto Riverside, was not using, he was using the
00:10:12 --> 00:10:15 built-in microphone, he was not wearing headphones,
00:10:15 --> 00:10:18 and I said you need headphones, and he said I
00:10:18 --> 00:10:21 don't have headphones. And I said, I find that
00:10:21 --> 00:10:23 hard to believe, but if that is true, we cannot
00:10:23 --> 00:10:27 record because on the form that you filled out
00:10:27 --> 00:10:29 to come on this show, you said you were going
00:10:29 --> 00:10:33 to record in a quiet place, use the best microphone
00:10:33 --> 00:10:37 you can and wear headphones. And he was floored
00:10:37 --> 00:10:39 that I said this interview wasn't happening.
00:10:41 --> 00:10:44 But it's that important to me. So let me know.
00:10:44 --> 00:10:47 Tell me I'm wrong. Tell me I'm right. StreamlinedFeedback.com.
00:10:47 --> 00:10:50 Thanks so much for listening. And until
00:10:50 --> 00:10:53 next time, I hope you find some space in your
00:10:53 --> 00:10:53 week.
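
Note on the noise gate described around 00:06:30 ("if there is a sound below a certain decibel, it's going to ignore it"): the idea can be sketched in a few lines of Python. This is only a rough illustration, not how the hardware unit mentioned in the episode works. The function names, the -40 dB threshold, and the 4-sample window are all assumptions made for the example, and real gates smooth their opening and closing with attack and release times rather than muting whole windows.

```python
import math


def amplitude_to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to decibels (dBFS)."""
    if amplitude <= 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(amplitude)


def noise_gate(samples, threshold_db=-40.0, window=4):
    """Mute every window whose peak level is below threshold_db.

    Hypothetical helper for illustration: samples is a list of floats
    in the range -1.0 to 1.0, processed in fixed-size windows.
    """
    gated = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peak = max(abs(s) for s in chunk)
        if amplitude_to_db(peak) < threshold_db:
            # Below the threshold: the gate stays closed and outputs silence.
            gated.extend(0.0 for _ in chunk)
        else:
            # At or above the threshold: the gate opens and audio passes untouched.
            gated.extend(chunk)
    return gated


# Four very quiet samples (room noise) followed by four loud ones (speech).
audio = [0.001, -0.002, 0.001, 0.0, 0.5, -0.4, 0.3, 0.2]
print(noise_gate(audio))
```

The sketch also shows where false positives come from: the gate only knows about level, so a quiet but intentional sound that falls under the threshold is muted just like room noise.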