Originally Posted by dfa973
Spatial Coding is the technique that is employed only when Cinema Atmos must be delivered to Home.
There is no need for Cinema Atmos to be spatially coded because the medium for delivering Cinema Atmos is not as limited as the medium for delivering Atmos to Home.
So, in other words, "spatial coding" may have been a poor choice of words on Dolby's part, as what they really refer to is their clustering/grouping code. That means combining objects so they use a new combined spatial location rather than each individual object's coordinates (the original objects would have to contain their coordinate positions for that system to function, as they are not channels). What Dolby seems to be referring to by "Spatial Encoding" is not those original coordinates but the newly combined locations (and possibly the conversion to the home format at the same time). In any case, I don't care for their use of the term, as it does not describe what is actually happening, which is more like a "Cluster Fudge" than "Spatial Encoding" of the data. Of course, that sounds negative in connotation. But is lossy coding of any type not a negative thing in some respects?
The cinema version has up to 128 simultaneous audio streams and 64 speaker locations. (Does that imply the objects can optionally be stored with "stereo" waveforms? I don't know a lot about the renderer, but I know the final panning would have to be channel-based level+phase differences in the final render. Stereo waveforms in a given object could allow even more objects to be pre-combined to create more individual sounds than the 128 limit, as the bed channels normally would have in their mixes, and that could probably explain the ratio choice.) The home version has up to 16 simultaneous audio streams and up to 34 speaker locations. That's going from a 2:1 ratio of streams to speakers to just under 1:2. Many review sites say home Atmos reproduces all 128 possible objects, but retaining the "sounds" within the original objects and storing every object on the home medium (Blu-ray or streaming) are two different things. I can't see how all of them could be stored while still saving data. Reducing the overall number of waveforms to cut the storage and processing requirements would seem to be the entire point of "spatial coding". But loss is loss. The question is what spatial accuracy is actually lost under certain conditions (such as the bird example in my post above), and I have yet to see a single reply in that regard.
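To make the "loss is loss" point concrete, here is a minimal sketch of what a clustering step could look like. This is NOT Dolby's actual algorithm; I'm assuming a simple greedy scheme that keeps merging the two closest objects into one with an energy-weighted position until the count fits the limit. The names (AudioObject, merge, cluster), the coordinates and the energy values are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass
class AudioObject:
    name: str
    position: tuple  # hypothetical (x, y, z) room coordinates in 0..1
    energy: float    # hypothetical loudness weight, must be > 0


def merge(a, b):
    """Combine two objects into one at their energy-weighted position."""
    total = a.energy + b.energy
    position = tuple(
        (a.energy * pa + b.energy * pb) / total
        for pa, pb in zip(a.position, b.position)
    )
    return AudioObject(f"{a.name}+{b.name}", position, total)


def cluster(objects, max_clusters):
    """Greedily merge the two closest objects until max_clusters remain."""
    clusters = list(objects)
    while len(clusters) > max_clusters:
        # Find the pair of clusters that are closest together in space.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: math.dist(clusters[p[0]].position,
                                    clusters[p[1]].position),
        )
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up scene: three objects squeezed into two slots.
objects = [
    AudioObject("bird", (0.0, 0.5, 1.0), 1.0),
    AudioObject("wind", (0.2, 0.5, 1.0), 3.0),
    AudioObject("dialog", (0.5, 1.0, 0.0), 5.0),
]
result = cluster(objects, 2)
for c in result:
    print(c.name, c.position, c.energy)
```

With these made-up numbers the bird and the wind are the closest pair, so they get merged, and the combined position is pulled toward the louder wind. The bird ends up rendered about 0.15 (in these room units) away from where it was authored. Whether an error like that is audible is exactly the question.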
I have attached some screenshots from the "Dolby Atmos Immersive Audio From the Cinema to the Home" where it is very clear that Cinema Atmos has very few limits - in contrast with Home Atmos.
I'm aware of the cinema Atmos capabilities. They are fairly straightforward to understand. It's the home version I'm interested in: what does it do in situations with a limited number of objects? I don't mean a block diagram with a circle around other circles, hence the bird example I gave that has been ignored.
Perhaps an object of a bird ends up going back and forth between surround #2 and side surround as an object becomes available and then unavailable from one second to the next. Something like that would seem to have the potential to sound more objectionable to the ear than a more stable configuration (like Disney's 7.1.4 locked objects), even if the latter uses fewer speakers, particularly if you were seated close enough to surround #2 to hear it turning on and off with the sound (from certain locations, it could seem to get weaker/stronger as a phantom image versus the discrete speaker image).
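Here is a toy version of that bird scenario, purely as a thought experiment. I'm assuming (and it is only an assumption) that when no object slot is free the bird gets merged with a louder neighbouring sound at an energy-weighted position, and I'm snapping the result to the nearest speaker instead of panning between speakers the way a real renderer would. The speaker coordinates, the energy values and the function names are all invented.

```python
import math

# Hypothetical 2-D positions (x, y) for two speakers on the left wall.
SPEAKERS = {
    "side surround": (0.0, 0.5),
    "surround #2": (0.0, 0.7),
}


def nearest_speaker(position):
    """Return the name of the speaker closest to a rendered position."""
    return min(SPEAKERS, key=lambda name: math.dist(SPEAKERS[name], position))


def rendered_position(obj_pos, obj_energy, other_pos, other_energy, slot_free):
    """Keep the authored position if a slot is free, else merge with a neighbour."""
    if slot_free:
        return obj_pos
    total = obj_energy + other_energy
    return tuple(
        (obj_energy * a + other_energy * b) / total
        for a, b in zip(obj_pos, other_pos)
    )


bird = (0.0, 0.72)      # authored right next to surround #2
ambience = (0.0, 0.45)  # louder neighbouring sound the bird may be merged with

# Assumed slot availability, flipping from one frame to the next.
slot_free_per_frame = [True, False, True, False]

trace = [
    nearest_speaker(rendered_position(bird, 1.0, ambience, 3.0, free))
    for free in slot_free_per_frame
]
print(trace)
```

Under those assumptions the bird alternates between surround #2 (slot free) and side surround (slot taken) from frame to frame, which is the kind of instability I'm asking about. A locked 7.1.4 print-through could never do this, because nothing competes for slots.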
This would be the type of thing I could imagine Disney wanting to avoid as it could be disconcerting. Perhaps Disney chose absolute accuracy and stability with fewer speakers over the possibility of spatial error. IF that is the reason, I can see why they (if not Dolby) wouldn't want to tell us and/or have someone like FilmMixer tell us as it implies Atmos has an accuracy issue they'd rather have you believe "spatial coding" can deal with in a non-noticeable way every time. But is it noticeable? It seems like with more speakers it certainly at least could be possible to notice issues. But the question is even with a 7.1.4 render compared to a 7.1.4 print-through, were there enough inaccuracies that Disney felt more comfortable limiting their Atmos mixes to 7.1.4 rather than let the "spatial coding" create inaccurate renditions of their cinema soundtracks? We may never know as apparently it's still a secret and mere speculation.
Like most conspiracy theories, however, something that might be relatively innocuous like that sounds all the more sinister when they refuse to tell you what's really going on. Certainly, a print-through 7.1.4 would be utterly stable in the speakers it's using as it could never run out of objects as it's only using 11 plus the LFE as stationary objects. Something must give in a lossy situation. Perhaps it's inaudible or mostly inaudible most, if not all, of the time. Given we have no basis for comparison, it's hard to know what we might be missing from the original soundtrack positions of objects. Disney could easily tell, however. All they have to do is compare a 7.1.4 print-through locked mix to the straight conversion of their cinema Atmos track and compare various busy surround scenes. If they found there were a lot of relative positional errors of individual sounds in the process compared to limiting the combination to a pre-combined 11.1 channel output, that would explain WHY they are using locked 7.1.4 tracks, even if some of us might find those differences not worth losing the extra speakers. That would explain FilmMixer's comment that he understood their reasoning (absolute accuracy compared to the cinema soundtrack in relative imaging terms of all sounds), but disagreed with it (not significant enough error to warrant killing the scaling ability of Atmos to use up to 34 speakers).
DTS:X Pro by comparison would be like taking a Disney-style 7.1.4 mix and extracting more speakers for hard sources and larger rooms without changing the actual mix itself. This could result in more accuracy, though it might suffer some "bleed through" losses or what have you in the extraction process (things are rarely perfect, but are the differences audible?).
The main objective for Atmos was: SCALABLE - so even if delivered to Home, there will be no downmix or down conversion - all the objects are carried but within the limits of the transport medium.
This sounds like splitting hairs. The home version is definitely down-scaled in object count and speaker count as well. "Conversion" of that cinema track to the home format is aptly described by "downmix" in my opinion. Downmix isn't a patented term or anything. It simply implies something is scaled down. HOW it's done is not implied. Call it a conversion or call it "spatial coding" if you like (I like "Cluster Fudge"), but if it looks like a duck and quacks like a duck, it's probably still a duck. Cinema Atmos goes from 128 waveforms/objects and 64 speakers to 12 or 16 waveforms/objects and as many as 34 speakers. How is that not a "downmix"?