Experiment with 192 kHz...

Almost scared myself

Got my new E-MU 1820 interface two days ago (thanks to everybody who contributed their opinions), installed XP Pro sp2, got everything running, and I’m starting to get some idea of how to use the PatchMix software, which is somewhat more complicated than the old Isis mixer applet… :O

Anyway, after trying the unit’s TfPro mic pre’s as mono with Oktava MK219 (gorgeous, even with my singing voice) I just had to do a little stereo experiment. Like this:

I ran two long mic leads to the living room and put two ECM8000 omni condensers on a stereo bar by an easy chair (so that the backrest works as a baffle towards the aquarium and refrigerator in adjacent rooms). The distance between the capsules was something like half a meter. Back in the studio, I fired up nTS recording at 192 kHz / 24 bit, took my Regal dobro and a tambourine, and walked to the living room from behind the easy chair. (Creaky wooden floors, so a lot of footstep noise.) Sat on a stool in front of the chair, played some dobro, sang some, hit the tambourine a few times. Walked back to the studio, switched the recording off.

Then I put the headphones (AKG240M) on and listened to the results. Now the scary part. I was alone in the apartment and I definitely knew I was listening to the recording, but I just had to check over my shoulders and see there was nobody behind me. The footsteps sounded so real.

(The guitar part and the tambourine sounded pretty good, too.)

Saved the stereo file, opened it with WaveLab. Listened. Same scary effect. Downsampled the file to 48 kHz and listened. A good recording, but not as real sounding.

Now, I’ve experimented with test tones and I know my hearing stops somewhere around 16.5 kHz (pretty good for a 42 year old), so the extra resolution shouldn’t matter much. But this decidedly un-scientific experiment suggests it does. Or I’m pretty good at fooling myself, which is entirely possible. :p

I still buy the explanation that says that those higher frequencies play a role in timbre, and hence in perceived quality of the sound, so higher would be better. I know that is not consistent with that article we read, but…what’s the real test? Ears! :)

Tom, I doubt there’s much audio over 20k being recorded, due to the mike, preamp, and soundcard. (Do 192kHz soundcards actually say they record over 20k? I’ve yet to see any post specs.)

What did you use to downsample? Try recording it in 44.1, and then set up a blind test so you don’t know which version you’re listening to and see if you can hear the difference. It’s very difficult to judge if you know what you’re listening to. We fool ourselves much too easily.

For example, I remember listening to something and I’d click a checkbox on something and swear I could hear a difference. But then I realized I knew whether the checkbox was checked or not, so I set up a blind test. I couldn’t tell the two apart. BTW, you need at least 4 and probably more listening trials to minimize the chance that you get results that seem significant but were just a matter of chance.
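To put a rough number on the “at least 4 trials” point, here’s a little sketch. It assumes an ABX-style test where a pure guess is right half the time; the function name `chance_probability` is just made up for illustration.

```python
from math import comb

def chance_probability(trials, correct, p_guess=0.5):
    """Probability of scoring at least `correct` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# How often does pure guessing produce a perfect score?
for n in (4, 8, 16):
    print(n, "trials, all correct by luck:", chance_probability(n, n))
```

With only 4 trials, guessing gets a perfect score about 6% of the time (1 in 16), which is why more trials are safer.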

Truth IS stranger than fiction? Maybe I’ll try 192K. I have not gone above 96K with my EMU because I could not really tell much difference between 44.1 and 96. Maybe 192 is a marked improvement regardless of what the science/math says. ???

Nah!

There MAY be something to this. I know Walter Sear wrote a long diatribe about analog versus digital, and his big beef was that at 16/44.1 digital recordings lack “warmth” because digital cannot capture all the harmonics present, ones that may not be heard on their own but add to the overall sound. ???

Crap. To heck with it! 24/48 sounds darn good with my gear so 24/48 it shall be!

TG

PS MWah, that EMU 1820M is sweet isn’t it! I have been extremely happy with mine.

I used WaveLab to downsample the file.

Apparently, the EMU does sample something over the 20 kHz range. Here’s a frequency plot from the beginning of the file (when I was walking towards the chair). Actually, I thought initially that the mics had more severe limits…

TG, you’re right: the EMU 1820 (I have the plain jane version) seems to be very sweet! Anyway, I seem to remember reading an article from an American mag (Electronic Musician? Mix? Don’t remember) where the author said he couldn’t hear much difference between 48K and 96K, but he could hear a difference between 96K and 192K: the sense of space was much more natural. I think he compared piano recordings.

Apparently, the jury’s still out, and probably will be for a while. And anyway, what we’re trying to achieve with multitracking is usually nothing that even tries to pose as a natural stereo field. We want something that sounds “good” instead of “correct” or “natural”.

RE: The freq plot you posted. An EMU guy told us that the main difference between the plain cards and the “M” cards is the converters and filtering. Apparently, the non “M” cards have better filtering to get rid of that ultrasonic energy your plot shows. I’m not sure I understand why… I mean cats and dogs can’t even hear 60-70k can they?

TG

Mwah - Might I ask what you used to do the frequency plot? It looks like a nice tool. Wow - material at 60 kHz - You could record music that only dogs could hear… Not sure what the dog would think of it though…

.-=gp=-.

The freq plot was made with Cool Edit.

Just saw a documentary about bats last week. When the summer comes, I’m planning to leave an omni mic outside for a few hours on some warm, beautiful night to check if there’s any bat activity near our house… their sonars use the 40 to 50 kHz range…

Right, learjeff, I need to do that. A controlled experiment is needed. Never was good at science… :)

I bet all that stuff above 60k is noise and would be there regardless of what the mike is picking up. But it does indeed prove that the soundcard is recording it.

Build yourself an XLR short plug (connect the two signal leads together inside a spare male XLR connector, and no cable at all) and plug that into your mike preamp. Record that at your normal recording levels and see what you get.

Then, try it with your mike connected, but buried inside a mound of all the pillows and mattresses you can find. Compare those two plots, and you’ll see the noise coming from the preamp versus the noise added by the mike.

Of course, you can also make a 1/4" short plug and measure the line input, and then you’ll see the mike preamp noise versus the rest of the chain.
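If you’d rather compare numbers than eyeball plots, a single RMS level per recording is a start. A minimal sketch, assuming the samples have already been loaded as floats between -1.0 and 1.0 (the loading step is left out, and `rms_dbfs` is just an illustrative name):

```python
import math

def rms_dbfs(samples):
    """RMS level of `samples` (floats in -1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)

# Made-up example: a constant signal at one tenth of full scale
print(rms_dbfs([0.1] * 1000))
```

The difference between the short-plug figure and the buried-mike figure gives a rough idea of how much noise the mike adds. Note that with this definition a full-scale sine wave reads about -3 dB, not 0; some tools use the other convention.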


Finally, run RightMark Audio Analyser on your card and post the results. If you’re going to keep the results on-line, let me post a link to them from my RightMark results page. If you’re not, let me copy them and keep them. Please measure at 24/44 and 24/192. Thanks!

(My page is down at the moment, btw.)

Another thing to try is different resamplers. Try r8brain and do double blinds with the 192 kHz file. I bet you will hear slight differences based on the downsampler used too.

The other thing to think about is what frequencies your headphones can reproduce. Do you really think that the headphones are pumping out that much stuff? Also, try recording the same sort of thing in 44.1 and compare. I bet you get more loss in quality from downsampling than from recording in the target format to begin with. Just my guess. Gotta double blind like Learjeff said.

All this talk of frequency response leads to an interesting question: what is the frequency response of air? I mean, as a medium of sound transmission, what is the audio bandwidth of air? Does it change with different weather, location, altitude, etc.? If we know the upper limit of air’s response, then maybe we’ll know at what point over-sampling will be over-kill…

Doesn’t 192 kHz sampling create HUGE wave files? I mean 96 creates big files that nTrack can barely handle, even with a fast machine. I can’t imagine myself going to 192 - 96 works for me.

Mr Soul
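For a rough sense of scale: uncompressed PCM size is just sample rate x bytes per sample x channels x seconds. A quick sketch (WAV header overhead ignored; `pcm_bytes` is a made-up helper name):

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Raw PCM payload size in bytes (WAV header overhead ignored)."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

# One minute of 24 bit stereo at the usual rates
for rate in (44_100, 48_000, 96_000, 192_000):
    mb_per_min = pcm_bytes(rate, 24, 2, 60) / 1_000_000
    print(f"{rate} Hz, 24 bit stereo: {mb_per_min:.1f} MB per minute")
```

So a 24 bit stereo track at 192 kHz comes to roughly 69 MB per minute, exactly twice the 96 kHz figure.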

You could also do a blind test with two different recordings (instead of using a tool to downsample it) - you might be harder pressed to hear a quality difference if both clips are original recordings. Of course then you have to take into account that the recordings aren’t the “exact” same program material… but if you still hear the “space”, then I would blame the downsampling for the quality loss.

Of course, you’re all right about conclusive results needing better double blindfold testing. Maybe one day I’ll have enough time to assemble a test group and do a decent paper about the results…

Until then I’m still left thinking there’s something in the idea that we’re somehow able to perceive frequencies higher than the usual audible range.

I was also interested in what the file actually contains in its ultrasound range. So I dropped the pitch by two octaves (effectively scaling the original ultrasound frequencies to the 0-24 kHz range). There was more than noise there, and the audio information didn’t sound muffled at all. Apparently, the mic managed to gather real sounds there (but hardly in any linear way, freq response-wise).
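One simple way to get a two-octave drop (not necessarily how it was done here) is to leave the samples untouched and just rewrite the header with a quarter of the sample rate, so a 192 kHz file plays back as 48 kHz. A sketch, assuming a plain PCM WAV that Python’s `wave` module can read; the function name and file names are hypothetical:

```python
import wave

def drop_two_octaves(src_path, dst_path):
    """Copy a WAV file unchanged except for a header sample rate divided by 4.

    On playback, content at frequency f comes out at f/4 (two octaves down)
    and the file plays four times slower.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    new_rate = params.framerate // 4
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(new_rate)
        dst.writeframes(frames)
    return new_rate

# Example call (file names are placeholders):
# drop_two_octaves("take_192k.wav", "take_two_octaves_down.wav")
```

Since nothing is resampled, whatever sat between 0 and 96 kHz in the original ends up between 0 and 24 kHz.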

BTW, I now believe that in some cases, recording in 44k, upsampling to 192k, mixing, and then downsampling would produce better results than working in 44k all along. However, it’s only true if the up/downsampling code is very high fidelity, and I have no idea how good it really is. I understand there’s a big difference in quality between different implementations.

The improvement would be due to the way certain plugins like chorus work – ones that use a modulated delay line.

To add even more to my already dazed and confused mind… I found another paper discussing the different ways DAW programs approach digital summing of signals. The link is on the PC at home and right now I can’t seem to find it from here… hmmm…

Anyway, the big beef was about the differences between analog summing say in a console or summat versus the digital approach in software. Interesting read…if you can locate it…

TG

nergle, that’s a very intuitively appealing argument, one we’ve heard quite a lot. However, according to the theory of sound and also human physiology, transient response is equivalent to frequency response. If you can’t detect/record/play back a frequency, you can’t do so for that frequency’s component in a transient either. Nyquist proved this mathematically, and human hearing experiments support it. If you know of any serious scientific studies that show otherwise, I’d be very interested.
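The sampling side of this can be seen in a few lines. A sketch with numbers picked purely for illustration: at a 48 kHz sample rate, a 30 kHz cosine (above the 24 kHz half-rate limit) produces the very same samples as an 18 kHz one, so the sampled data simply can’t carry the 30 kHz component as such.

```python
import math

def sampled_cosine(freq_hz, sample_rate_hz, count):
    """First `count` samples of a unit cosine at freq_hz, sampled at sample_rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]

fs = 48_000
above = sampled_cosine(30_000, fs, 64)  # above fs/2 = 24 kHz
alias = sampled_cosine(18_000, fs, 64)  # fs - 30 kHz = 18 kHz

# The two sample lists agree up to floating-point rounding
largest_gap = max(abs(a - b) for a, b in zip(above, alias))
print("largest sample difference:", largest_gap)
```

Real converters filter out content above half the sample rate before sampling, which is why it normally goes missing instead of folding down into the audible band.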

Bottom line: I don’t think there are any benefits to higher sample rates for a simple record/playback scenario. However, for a record/process/mix/playback scenario, I believe there are benefits. (Though not enough for me to bother with it at this time.)