The results are very, very impressive. But the price paid (besides a $4K up-front cost) is massive complexity in setup and capturing PRIRs. I've been in the computing field for 40+ years, and I'm a massive tech and A/V geek who rolls his own active crossovers and designs & builds speakers, but this thing stretched me.
Yeah, that's outside of my price range, and it also sounds like it's outside of what I might have enough patience to set up.
Next closest is the Apple AirPods Max, which do a good job, considering they use a generic Head-Related Transfer Function (HRTF). I hope Apple provides a means of creating a custom HRTF to further increase positional accuracy.
I have a Windows 10 PC and an Xbox One S, and I've installed Dolby Access and DTS Sound Unbound on both of them. Dolby Access provides Dolby Atmos for Home Theater and Dolby Atmos for Headphones, while DTS Sound Unbound provides DTS:X and DTS Headphone:X. They use generic HRTF models, though DTS allows users to select a profile for their headphones, so it might customize the sound based on what headphones the user is using. I use a pair of Sony MDR-7506 headphones, and DTS has a profile for those because they're a very common pair of headphones. In addition, DTS Sound Unbound allows the user to select one of two modes: 'balanced' or 'spacious' (spacious makes things sound further away). I've found that for gaming, DTS Headphone:X in 'spacious' mode is better. I don't want to side-track this thread with discussion of games, because the topic of this thread is music, but I just remembered that off of the top of my head.
One advantage that Windows 10 and Xbox have over iOS, iPad OS, and tvOS is that Windows 10 and Xbox have DTS (and also Windows Sonic, which I haven't even mentioned), with customizations. Of course, for apps that don't support bitstream passthrough (a new feature that many apps don't have yet), you can only play content in Atmos if you choose Dolby; otherwise it falls back to 5.1, because Dolby doesn't allow Atmos signals to be processed by DTS. Likewise, you can probably only play content in DTS:X from those apps if you choose DTS. I suppose that for apps that do support bitstream passthrough, their Atmos or DTS:X bitstreams will pass through regardless. I actually wrote a pseudo-guide about it somewhere on AVSforum.
You can choose which spatial audio renderer (Dolby, DTS, or Windows Sonic, and possibly others that might be in the Microsoft Store) you want to use. iOS, iPad OS, and tvOS, on the other hand, only have Dolby.
But I've found that Dolby Atmos for Headphones and DTS Headphone:X both place the center speaker very close to my head, as if my nose were pressed right up against my screen, while the front-left and front-right channels are placed far off to my left and right, respectively. I think that the audio is rendered from the point of view of the camera, whose boundary IS my TV screen's boundary, and that's why the speaker channels are placed where they're placed. The other channels are also placed as if my head were where the virtual camera is. Neither Microsoft nor the game's developers know how far away from my TV my head actually is, so they have to render the audio from the POV of the virtual camera.
Obviously, a real home theater setup is better than all of the virtual spacious sound over headphone setups that I just described. However, the real world has practical limitations--space limits and financial limits--and some of us can't install speakers in the ceiling and also might not have ceilings that are optimal for ceiling bounce speakers. Headphones aren't affected by the characteristics of our ceilings, or whether or not we have speakers there, so that is one advantage of headphones over home theater speakers.
Since we only have two ear canals, a 'stereo' headset does fine, as the immersive positional aspects are in the temporal and frequency cues that a device like the A16 or the Apple Spatial Audio Atmos decoder generates to emulate what happens with physical sound sources.
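To put a rough number on the "temporal" part of those cues, here is a minimal Python sketch of the interaural time difference (the arrival-time gap between the two ears) using the textbook Woodworth spherical-head approximation. This is only an illustration, not what the A16 or Apple's decoder actually does; the head radius, the speed of sound, and both function names are my own assumptions, and a real HRTF also carries frequency-dependent level and pinna cues that this ignores.

```python
import math

# Assumed values for illustration only (not taken from any product).
HEAD_RADIUS_M = 0.0875        # commonly quoted average head radius
SPEED_OF_SOUND_M_S = 343.0    # air at roughly room temperature


def woodworth_itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    Uses the Woodworth spherical-head model: ITD = (a / c) * (theta + sin(theta)).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this sketch only covers azimuths from 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


def itd_in_samples(azimuth_deg, sample_rate_hz=48000):
    """Same delay expressed in audio samples at the given sample rate."""
    return woodworth_itd_seconds(azimuth_deg) * sample_rate_hz


if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        microseconds = woodworth_itd_seconds(az) * 1e6
        print(f"{az:>2} deg: {microseconds:6.1f} us, {itd_in_samples(az):5.2f} samples @ 48 kHz")
```

The delays involved are well under a millisecond, which two ordinary headphone drivers can reproduce easily once the signal has been processed; that is the gist of why a 'stereo' headset is enough.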
I will consider any headphones that place multiple drivers around my head to be snake oil, until someone proves that they are actually better than stereo headphones being fed sound that has been HRTF-processed.
The listening habits you describe shouldn't be projected onto how Atmos music is mixed. Surround music might not have listeners facing a screen, but it is still mixed with a front soundstage in mind AND a listener (mixer) sitting in a sweet spot. To that end, music mixed in Atmos is as practical as music mixed in 5.1 or 7.1. YOU can treat it as background music, but that is not the intent (even 2 channel music wouldn't have carefully crafted soundstage and imaging if the intent was background music).
Your focus on the "sweet spot" seems to miss the distinction I made between ideal and acceptable listening in the two scenarios: a space (such as a room) and headphones.
In the space, the sweet spot is the ideal position for every type of signal from stereo through all the multi-channel options to object audio, but for casual listening it doesn't matter. When I get up from my middle seat in the home theater and go to the adjacent room I am still enjoying listening to the music.
So, I get that music in Atmos--and also music in stereo--is optimized for critical listening. I just wonder how many people can sit in a listening room or in a home theater and listen critically, and out of those people, how many of them spend much time doing that. I'm not sure that I'm able to do that without having something else to do, such as doing something on my phone or on my tablet. I also get that I can listen non-critically by doing other things and moving around the room or around the house while listening, and I'll still be hearing the music; it's just that the sound won't be optimized. What I wonder is whether music in stereo would be BETTER for multitasking while moving around the room or around the house than music in Atmos. Will I miss MORE acoustic information by not sitting in the sweet spot with Atmos than I would with stereo?