Originally Posted by Optimus_Fine
This is completely false! It is actually a blasphemy!
85 dBC is 85 dBC no matter the distance of the source, as long as, at the listening position, the SPL meter reads 85 dBC.
I see I am now a heretical enthusiast.
That 85 dBC is measured with a particular pink noise test signal playing. Just because two systems hit 85 dBC with that signal doesn't mean they'll hit the same SPL on other test signals or on real-world content.
Substantial SPL differences can arise for two main reasons: First, the spectral balance of the speakers may be very different. Second, the acoustics of the rooms may be different.
There is pretty good evidence that X-curve calibrated dub-stages have diminished output in the treble *and* bass compared to an accurate studio monitor or an "average" top-tier home theater. So if a home theater and a cinema calibrate to equal SPL over the 500-2000 Hz (midrange) band, the accurate monitor will be putting out a lot more bass and treble than the cinema for the same content. That extra output will increase SPL and may have other side-effects via Fletcher-Munson effects, which could account for the apparent exaggeration of dynamics when cinema mixes are played on a more neutral system.
The acoustics of the rooms and relative placement of speaker(s) and listener(s) are also important. Larger rooms often have longer decay time, which increases the contribution of late arriving sound. Greater distances between speakers and listener(s) also increases the proportion of late arriving sound vs. direct and early arriving sound.
The consequences of these factors on SPL depend substantially on the content. Real-world movie content is dynamic. The sounds which determine our perception of the loudness of the track are likely to be momentary and dispersed, as opposed to a pink noise signal, which is continuous. If we use medium-term SPL (i.e., an SPL meter operating in "fast" mode) to assess approximate loudness, we are likely to observe that the momentary dynamics are not enhanced much by late reflections; however, the SPL of continuous sounds like pink noise will be enhanced substantially by late reflections.
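To make the effect concrete, here's a rough numerical sketch. All the numbers are invented for illustration (the room response, tail energy, and meter model are simplified stand-ins, not measurements of any real venue), but the mechanism is the same: a late tail adds to a continuous signal's steady-state level far more than to a short transient's peak "fast" reading.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 8000  # modest sample rate keeps the sketch fast

# Hypothetical room impulse response: a direct path plus an exponentially
# decaying late tail starting ~80 ms later, carrying 3x the direct energy.
n = int(1.2 * fs)
t = np.arange(n) / fs
direct = np.zeros(n)
direct[0] = 1.0
tail = (t > 0.08) * np.exp(-t / 0.5) * rng.standard_normal(n)
tail *= np.sqrt(3.0 / np.sum(tail ** 2))   # scale tail energy to 3x direct
h_room = direct + tail

def fast_spl_max(x, fs, win=0.125):
    """Highest reading of an SPL meter in 'fast' mode (125 ms RMS window), dB."""
    w = int(win * fs)
    cs = np.concatenate(([0.0], np.cumsum(x ** 2)))
    ms = (cs[w:] - cs[:-w]) / w            # sliding mean-square
    return 10 * np.log10(ms.max())

signals = {
    "continuous noise": rng.standard_normal(2 * fs),   # stand-in for pink noise
    "20 ms transient": np.pad(rng.standard_normal(int(0.02 * fs)), (0, fs)),
}

lifts = {}
for name, sig in signals.items():
    lift = fast_spl_max(np.convolve(sig, h_room), fs) \
         - fast_spl_max(np.convolve(sig, direct), fs)
    lifts[name] = lift
    print(f"{name}: late energy lifts the fast-SPL reading by {lift:+.1f} dB")
```

With these made-up numbers, the continuous noise reading is lifted by several dB while the transient's is lifted much less, which is exactly the gap between a pink-noise calibration and real-world dynamics.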
So basically, what's happening with calibration and acoustics is that cinemas, where listening distances are greater, have a much higher proportion of late arriving energy than small rooms, and that energy inflates the SPL of the pink noise measurement but not the SPL of the sounds that affect loudness in real-world content. So if a home theater calibrates to the same pink noise SPL target, the SPL of the momentary sounds will actually be louder from acoustics alone. And to reiterate, the spectral balance differences are likely to contribute even more to SPL.
Originally Posted by Optimus_Fine
Maybe, for small rooms, people should also try a calibration with band-limited noise, after the broadband one, and measure a -20 dBFS RMS white noise signal (equal energy per frequency) from 500-2000 Hz at 85 dBA, rather than dBC, and see if there's a measurable difference.
I did that and there was a large discrepancy of about 5 dB SPL.
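For anyone who wants to reproduce this kind of test, here's a sketch of how the band-limited test signal could be generated. The sample rate, duration, and brick-wall FFT filtering are my own arbitrary choices; any competent bandpass filter would do:

```python
import numpy as np

fs = 48000       # sample rate (arbitrary choice)
dur = 10.0       # seconds (arbitrary choice)
rng = np.random.default_rng(1)

# White noise (equal energy per Hz), band-limited to 500-2000 Hz via a
# brick-wall FFT filter, then scaled to -20 dBFS RMS.
n = int(fs * dur)
x = rng.standard_normal(n)
X = np.fft.rfft(x)
f = np.fft.rfftfreq(n, 1 / fs)
X[(f < 500) | (f > 2000)] = 0.0
x = np.fft.irfft(X, n)
x *= 10 ** (-20 / 20) / np.sqrt(np.mean(x ** 2))   # set RMS to -20 dBFS

rms_dbfs = 20 * np.log10(np.sqrt(np.mean(x ** 2)))
print(f"RMS: {rms_dbfs:.2f} dBFS")   # -20.00
```

Write `x` out as a WAV file, play it through the calibrated system, and compare the meter reading against the broadband result.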
I would argue that *no one* should be using broadband pink noise to calibrate level in a home theater. The broadband signal only really works as consistently as it does for cinemas (which isn't saying much) because the cinema is *also* being calibrated to the X-curve target using the same signal. The X-curve target is rolled-off at both extremes, and this effectively "band-limits" the full-band signal consistently between venues.
Technically speaking, a finite duration full-band pink noise signal does not exist! Pink noise signals contain equal energy in each octave, and there are an infinite number of octaves down to DC. In digital audio, pink noise must also be band-limited to below the Nyquist frequency to avoid aliasing, but the Nyquist frequency can vary between systems, depending on chosen sample rate. So really, "full-band pink noise" must always be band-limited at some high and low limits, and the choices aren't arbitrary. They affect the distribution of energy between the mid-bands and the extremes.
If anyone insists upon working around this limitation by performing X-curve spectral balance calibration, realize that you are not following any known standard. X-curve is expressly intended for cinema and even specifies different target curves and levels for different size rooms. To my knowledge, X-curve doesn't specify what to do for "tiny" home theater rooms, and even if it did, I still wouldn't recommend using it if the best sound quality is desired.
Originally Posted by Optimus_Fine
That's even more blasphemous than the first sentence you wrote.
Setting playback level by ear?
We are not machines!
Our hearing varies in sensitivity from day to day!
What you suggest is exactly what brought about the loudness war in the music industry and, recently, in the movie industry.
If Reference Level in your room is too harsh, either your gear sucks or your room treatment sucks.
You are right that we are not machines. And likewise, machines are not people, and at the end of the day, audio is for people, not machines, to enjoy. Machines cannot tell how good or how loud something sounds. At best, machines can guess on the basis of physical and/or psychoacoustic models applied to whatever data is available. The level calibration methods we are discussing here rely on poor quality data with no phase or time-domain information and typically poor frequency resolution as well. They also completely ignore physical and psychoacoustic concerns, which they can't address anyway without time-domain information. We also haven't even discussed spatial variations and what to do about them. (Average them? How? How do we fairly sample the seating area? Where trade-offs exist, how do we prioritize sound quality across different seats? Etc.)
Psychoacoustics is another major dimension that I haven't addressed yet. My experiments with thousands of different EQ configurations indicate to me that various spectral balance characteristics of both the playback system and the soundtrack have a profound impact on perceived loudness. For example, I can make a small EQ change, and afterwards, one movie might sound a lot louder (i.e., a few dB) while another sounds a lot quieter. I suspect that even environmental changes like increased humidity can have an impact. Every soundtrack has nuances which combine with the nuances of the playback system, and all of these interact in complex and unpredictable ways. And let me not forget to mention individual listener differences too!
On the other hand, practically every playback system has a volume control that is usually very easy to operate. I set it by ear, usually based on how the dialog sounds. Often all it takes is one spoken line for me to get it right for the rest of the film.
Dialog is often called the "anchor" element because mixers typically use its loudness as a reference for everything else. So if you can find the right level for the dialog, chances are the rest of the mix will sound close to right.
Dialog mixing styles do vary. Some movies mix dialog at a fairly constant loudness, usually very prominently. I like to imagine that this style makes the voices sound as big as they look close up on a big screen. Other films go for a more "realistic" style, which tends to be a lot quieter on average but varies a lot more in general. For me, most films seem to fall into one of these two categories, though there are always exceptions.
For people doing audio production, a good method for calibration may be to set the monitor level by ear while listening to a dialog clip with a standardized loudness. For example, the ATSC provides the following sample to use as a reference for TV mixing:
The sample is standardized to -24 LKFS, which is the standard specified loudness for broadcast TV. (Note: computing LKFS involves a sophisticated analysis of the content, taking into account various spectral balance and temporal characteristics to estimate how loud something will sound.)
Before doing a TV mix, a mixer can use the reference to adjust the monitor level until it sounds comfortable to them and then *just mix* everything relative to that level. Because it already sounds "right", the mixer won't be tempted to mix something too loud or too soft, and the final mix is more likely to comply with broadcast standards without the need for normalization and other unwanted processing.
For cinema and home theater tracks, -24 LKFS is usually too loud, and mixers need not adhere to a single, standardized loudness. The test sample can be attenuated by (e.g.) between -3 dB and -7 dB in order to calibrate for a film that follows one particular dialog style or another. Then everything could be mixed relative to that anchor loudness.
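As a back-of-the-envelope illustration of mixing relative to an attenuated anchor. This is *not* a real LKFS meter (no K-weighting, channel weighting, or gating), and the signals are just placeholder noise standing in for the speech sample and a dialog stem:

```python
import numpy as np

def rough_loudness_db(x):
    """Crude mean-square level in dB relative to full scale.
    A real LKFS meter (ITU-R BS.1770) adds K-weighting and gating."""
    return 10 * np.log10(np.mean(x ** 2))

rng = np.random.default_rng(3)
anchor = rng.standard_normal(48000) * 0.1   # stand-in for the -24 LKFS speech sample
style_offset_db = -5.0                      # somewhere in the -3 to -7 dB range above
target = rough_loudness_db(anchor) + style_offset_db

mix = rng.standard_normal(48000) * 0.04     # stand-in for a dialog stem
trim_db = target - rough_loudness_db(mix)   # gain needed to sit at the anchor level
mix_trimmed = mix * 10 ** (trim_db / 20)

print(f"trim applied: {trim_db:+.1f} dB")
print(f"mix now sits {rough_loudness_db(mix_trimmed) - target:+.2f} dB from target")
```

The point isn't the arithmetic, which is trivial; it's that once the anchor is attenuated to suit the chosen dialog style, everything else can be mixed by ear relative to it.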
A likely counter-argument to my suggestion to set levels by ear is that it's too subjective. What if people don't really agree on how loud is "right"? And what about Fletcher-Munson effects?
To answer the first question, let me turn it around. Why should we listen at a volume level other than what sounds best to our ears? Why should mixers work at a level other than what is most effective for them? The main thing we want to avoid is inconsistency in loudness between tracks, apart from that which arises from stylistic differences. Hence, if we create a "Speech Sample" with standardized loudness for each style, mixers can do the rest themselves. I really doubt a mixer will have trouble deciding whether the "prominent" or the "realistic" style is best for their work. It's all based on *precedent*. Do I want my mix to sound like those with louder, flatter dialog or those with quieter, more dynamic dialog? Choose the appropriate "Speech Sample" and then just mix naturally.
What about Fletcher-Munson effects? I think they are overblown. Even the most extreme level differences we're talking about are around +/- 5 dB, and FM effects are pretty subtle over that range at levels near "reference". In fact, the differences in broad spectral balance between different systems calibrated with X-curve and across the wide variety of home systems are likely far, far more important. General differences in treble-and-bass vs. mids balance on X-curve stages compared to accurate near-field monitors may be the primary reason home remixes are needed.