Last week, SMPTE (the Society of Motion Picture and Television Engineers) held its 2013 Annual Technical Conference and Exhibition at the Loews Hollywood Hotel next door to the TCL (formerly Grauman's, then Mann's) Chinese Theater in Hollywood, CA. Most of the conference was aimed squarely at the technical side of making movies and TV shows, with seminars like "Analysis of PTP Locking Time on non-PTP Networks for Genlock over IP" and "Playout Automation in a Virtual Environment."
But before the conference officially started, there was a day-long symposium called "Next Generation Imaging Formats: More, Faster, and Better Pixels"—in other words, UHD/4K. In fact, there were two simultaneous tracks: technical and business. Naturally, I was eager to attend the technical track to see what the professional creative community had to say about UHD.
The presenters included many high-level technocrats, including (L-R) Hanno Basse (Fox), Jim Helman (MovieLabs), David Brooks (Dolby), Wendy Aylsworth (Warner Bros.), Hans Hoffmann (EBU), Annie Chang (Disney), Peter Putman (Kramer Electronics), and Patrick Griffis (Dolby)
And man, did they have a lot to say! First of all, it's clear that increased resolution is only part of the story—and probably not the most important part, though a few presenters stressed its importance in heightening the sense of "being there," especially when the screen occupies a larger field of view. However, many presenters agreed that increasing the resolution has less impact than other factors on the viewing experience. What's more important—what produces a real "wow" factor—is increasing the dynamic range, color gamut, color subsampling, color bit depth, and frame rate, none of which have been settled in terms of content creation or display.
Of course, humans can perceive a much wider dynamic and color range than is reproduced on any current screen. In terms of dynamic range, most of the discussion was aimed at increasing peak brightness from the currently common 100 nits (about 30 footlamberts) to as much as 10,000 nits (nearly 3000 fL)! Granted, a full-screen white field at that brightness would sear your retinas, but that's not what the presenters meant—they were talking about small "specular" reflections, such as the sun reflected in a wine glass, as opposed to diffuse reflections from most surfaces. If specular reflections on a video display could be that bright, the entire image would look much closer to what we see in the real world.
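Those nit-to-footlambert conversions are straightforward arithmetic (1 fL is 1/π candelas per square foot, which works out to about 3.426 nits); a two-line sketch:

```python
import math

# 1 fL = (1/pi) cd/ft^2; with 1 ft^2 = 0.0929 m^2, that's
# 10.7639/pi ~= 3.426 cd/m^2 (nits) per footlambert.
NITS_PER_FL = 10.7639 / math.pi

def nits_to_fl(nits):
    return nits / NITS_PER_FL

print(round(nits_to_fl(100), 1))   # ~29.2 fL ("about 30 footlamberts")
print(round(nits_to_fl(10000)))    # ~2919 fL ("nearly 3000 fL")
```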
Expanding the color gamut of UHD was another hot topic. Much has already been said about the gamut known as Rec.2020, which encompasses a significantly wider range of colors than HDTV's Rec.709 and even the digital-cinema P3 gamut. (Actually, color gamut is only one part of Rec.2020, which is more formally known as ITU-R Recommendation BT.2020 and also includes parameters such as display resolution, frame rate, color bit depth, and color subsampling.)
Rec.2020 encompasses a much wider range of colors than HDTV's Rec.709.
Some presenters advocated going much further by using the XYZ color gamut, which extends well beyond the entire range of colors visible to the human eye. By using XYZ, the system would be entirely future-proof, accommodating any display technology that might be developed without having to create a new system all over again. In this case, metadata could be used to represent the gamut of the content and the display's capabilities, allowing the source device to adjust its output accordingly.
The XYZ color gamut includes the entire range of visible colors and much more.
Another aspect of color is called subsampling. As you probably know, full-color images are reproduced on virtually all displays by combining red, green, and blue (RGB) elements. RGB can be transformed into another representation called YCbCr, which consists of a black-and-white brightness channel (Y) and two so-called color-difference channels (Cb and Cr). After this transformation, the YCbCr signal is also known as 4:4:4, because for every four Y pixels on each horizontal line of the video signal, there are also four Cb and Cr pixels on the even-numbered lines and four Cb and Cr pixels on the odd-numbered lines.
RGB can be converted to YCbCr 4:4:4. (Graphic courtesy of Spears & Munsil)
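The transform in the graphic is a simple linear change of representation; here's a minimal sketch of my own using the Rec.709 luma coefficients (normalized 0-to-1 values, ignoring quantization and the offset/range details of real video signals):

```python
# Rec.709 RGB -> YCbCr (full-range, normalized 0..1 inputs).
# Y carries brightness; Cb and Cr carry blue- and red-difference color.
KR, KG, KB = 0.2126, 0.7152, 0.0722  # Rec.709 luma coefficients

def rgb_to_ycbcr(r, g, b):
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))  # scaled so Cb spans -0.5..0.5
    cr = (r - y) / (2 * (1 - KR))  # likewise for Cr
    return y, cb, cr

# White carries all its information in Y; the color channels go to zero,
# which is why Cb and Cr can tolerate subsampling.
print(rgb_to_ycbcr(1.0, 1.0, 1.0))
```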
That's a lot of data, so the number of Cb and Cr pixels is often reduced, which works fairly well because the human visual system is far more sensitive to brightness than it is to color. For example, 4:2:2 indicates that for every four Y pixels on each horizontal line, there are two Cb and Cr pixels on the even lines and two on the odd lines. This is equivalent to cutting the horizontal resolution of the color information in half.
4:2:2 cuts the horizontal resolution of the two color channels in half. (Graphic courtesy of Spears & Munsil)
Even more color pixels are removed in 4:2:0, which indicates that for every four Y pixels on each line, there are two Cb and Cr pixels on the even lines and no color pixels on the odd lines. This is equivalent to cutting the horizontal and vertical color resolution in half. In either case, the full-color information is reconstructed using interpolation, in which a video processor re-creates the missing pixels with varying degrees of accuracy. DVDs and Blu-rays store video data as 4:2:0, and it's not yet clear which subsampling scheme UHD will use—probably not 4:4:4, but hopefully 4:2:2, which looks better than 4:2:0 because more of the real image information is in the signal.
4:2:0 cuts the horizontal and vertical resolution of the two color channels in half. (Graphic courtesy of Spears & Munsil)
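The three schemes differ only in how many Cb/Cr samples survive per frame; a toy sketch of the arithmetic (assuming naive sample discard, whereas real encoders filter the chroma before subsampling):

```python
# Chroma (Cb or Cr) samples kept per frame for each subsampling scheme,
# relative to the full luma resolution.
def chroma_samples(width, height, scheme):
    if scheme == "4:4:4":
        return width * height                 # every pixel keeps chroma
    if scheme == "4:2:2":
        return (width // 2) * height          # half horizontal resolution
    if scheme == "4:2:0":
        return (width // 2) * (height // 2)   # half in both directions
    raise ValueError(scheme)

for s in ("4:4:4", "4:2:2", "4:2:0"):
    print(s, chroma_samples(1920, 1080, s))
```

So 4:2:0 discards three-quarters of the color information before compression even begins, which is why 4:2:2 retains visibly more of the real image.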
Then there's color bit depth—that is, the number of bits used to represent the brightness of red, green, and blue. The current standard is 8 bits per color, which represents 256 steps of brightness. Everyone hopes that UHD will use at least 10 or 12 bits per color for smoother color gradients, especially with higher peak levels. For example, with a peak luminance of 100 nits, people with normal vision can see banding below 50 nits with 10-bit color; with a peak luminance of 10,000 nits, banding is evident below 100 nits with 12-bit color.
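As a quick sanity check on those step counts, a minimal sketch (assuming, unrealistically, that code values are spread linearly over the peak luminance; real systems use a nonlinear transfer function, which the next paragraphs discuss):

```python
# Code values per channel at each bit depth, and the average luminance
# step if those codes were spread linearly over a 10,000-nit peak.
for bits in (8, 10, 12):
    steps = 2 ** bits
    print(f"{bits}-bit: {steps} steps, "
          f"{10000 / steps:.2f} nits per step at a 10,000-nit peak")
```

Even 12 bits spread linearly gives steps of more than 2 nits, easily visible in dark scenes, which is why the transfer function matters as much as the bit count.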
Part of the problem here is the opto-electronic transfer function (OETF) used to convert optical images to electronic signals in the camera and the electro-optical transfer function (EOTF) used to convert those signals back into an image in the display. Both are currently based on the behavior of CRTs, which gives rise to the gamma function used in all TVs. The issues of bit depth, luminance, and banding cited above are all related to gamma. Interestingly, neither Rec.709 nor Rec.2020 specifies an EOTF—a gamma of 2.4 is specified in Rec.1886, and Rec.2020 uses it for now, though a clause allows Rec.2020 to adopt a new, better EOTF if one is developed.
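To see why the gamma curve matters for banding, here's an illustration of my own (assuming a Rec.1886-style 2.4 power law, a 100-nit peak, and 10-bit quantization) comparing the luminance jump between adjacent code values in the shadows and near peak white:

```python
# Luminance of a 10-bit code value under a 2.4-gamma EOTF
# (Rec.1886-style power law) on a 100-nit-peak display.
PEAK, GAMMA, LEVELS = 100.0, 2.4, 2 ** 10 - 1

def luminance(code):
    return PEAK * (code / LEVELS) ** GAMMA

# Adjacent code values sit much closer together in luminance near black
# than near white, which is how gamma hides banding in the shadows.
print(luminance(33) - luminance(32))      # shadow step: tiny
print(luminance(1023) - luminance(1022))  # highlight step: far larger
```

Raise PEAK to 10,000 nits and the same 10 bits must stretch across a hundred times the luminance range, which is what motivates both more bits and a perceptually based curve like Dolby's PQ.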
In order to reduce banding with high peak-luminance values, Dolby is developing a new perceptually based EOTF called Perceptual Quantization (PQ). Dolby claims the new EOTF can handle bright highlights, such as specular reflections and light sources like the sun, lights, and fireworks, as well as low-level details better than a gamma-based EOTF. In fact, the company claims that 12-bit PQ equals the banding performance of 14-bit gamma.
Frame rate is a hotly debated issue. Of course, higher frame rates result in sharper motion detail, especially if the camera's shutter aperture—the fraction of the entire frame duration that the shutter is open—is low. (The longer the shutter is open during each frame, the blurrier moving objects appear.) However, a low shutter aperture also increases visible judder, a stuttering in what should be smooth motion. Also, higher frame rates look less like film and more like video, which many cinephiles object to. Still, many presenters said, in effect, "Get over it, this is the future we're talking about!"
Richard Salmon of the BBC brought some demo material comparing standard and high video frame rates—50 vs. 100 frames per second for Europe and 60 vs. 120 fps for the US. In the European clips, the 50 fps material was shot with a 50% shutter aperture, while the 100 fps footage was shot with a 33% aperture; in the American clips, both 60 and 120 fps were shot with a 50% aperture. In all but one case, the objects in motion were much sharper at the higher frame rate, and I did not see any judder. The only exception was a side shot of a woman juggling three bowling pins, and in that case, there was no improvement in the motion sharpness because, we were told, the human visual system cannot resolve rotating motion very well.
There are quite a few concerns about higher frame rates other than removing the "filmic look" of 24 fps. For one thing, the color and editing tools are not widely available yet. Also, there's the difference between multiples of 24 for commercial movies and multiples of 60 for home video (not to mention the difference between these American standards and the 25 and 50 fps frame rates used in Europe and other countries with the PAL video system). Finally, there's the issue of fractional frame rates (23.976 and 59.94 fps), which are a legacy from the days when television transitioned from black-and-white to color. Most pros agree that, for the American market, 120 fps is ideal (it's a multiple of both 24 and 60), while 100 fps is best for the PAL market.
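The 120 fps and 100 fps recommendations fall out of simple arithmetic, which a couple of lines make explicit:

```python
import math

# 120 is the least common multiple of film's 24 fps and US video's
# 60 fps, so a 120 fps master can derive both by dropping frames evenly.
print(math.lcm(24, 60))
# 100 fps is likewise an even multiple of the PAL-market 25 and 50 fps.
print(100 % 25 == 0, 100 % 50 == 0)
```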
Jim Kutzner of PBS talked about a new broadcast standard for UHD called ATSC 3.0, which is being developed by the Advanced Television Systems Committee, the organization that brought us HDTV broadcasting. Kutzner said that ATSC 3.0 will most likely be incompatible with the current broadcasting system, which means it had better provide a significant improvement if it's going to be worth doing. He expects a proposed standard to be ready for evaluation by the end of 2015.
Surprisingly, two items were not discussed much—data compression and delivery from the source device to the display. In terms of data compression, most people seem to be hanging their hat on H.265 (aka HEVC, or High-Efficiency Video Coding), which is roughly twice as efficient as H.264, though if parameters like frame rate, color gamut, bit depth, and color subsampling are all increased, that means way more than twice the amount of data will have to be compressed. Plus, there are other codecs that could offer better performance, such as eyeIO, which Sony has licensed to use with its online UHD delivery service, and the one developed by Red Digital Cinema for its RedRay 4K media player.
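To put "way more than twice the data" in rough numbers, here's a back-of-the-envelope sketch of my own (comparing uncompressed rates before the codec, assuming the parameter bumps discussed above):

```python
# Uncompressed video data rate in Gbps: pixels * fps * bits per sample
# * samples per pixel (4:2:0 = 1.5, 4:2:2 = 2, 4:4:4 = 3).
def raw_gbps(w, h, fps, bits, samples_per_pixel):
    return w * h * fps * bits * samples_per_pixel / 1e9

today = raw_gbps(3840, 2160, 30, 8, 1.5)     # 2160p30, 8-bit, 4:2:0
future = raw_gbps(3840, 2160, 120, 12, 2.0)  # 2160p120, 12-bit, 4:2:2
print(today, future, future / today)         # roughly an 8x increase
```

So even with HEVC's 2x efficiency gain over H.264, the stream carries several times more information than today's UHD demos.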
Richard Salmon shared the results of some preference tests performed by the BBC with 60 fps content compressed using HEVC at 5, 7, and 9 megabits per second and 120 fps at 7, 9, and 12 Mbps. As expected, viewers' preferences increased along with bit and frame rate. The only exception was a clip of a carnival ride with very complex motion—the preference rating dropped sharply at 120 fps/7 Mbps, then rose again with bit rate. Clearly, that particular clip required less compression to look good.
HDMI 2.0 was mentioned by Peter Putman, a well-known industry analyst and journalist whose presentation focused on the consumer-display side of the equation. According to his calculations based on using the RGB color space, HDMI 2.0 with 18 Gbps of available bandwidth can convey UHD at 60 fps with 8-bit color but not 10-bit, while DisplayPort 1.2 with 21.6 Gbps of bandwidth can handle 2160p/60 with 10-bit color using RGB coding.
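Putman's figures can be sanity-checked with standard TMDS arithmetic, sketched below (assuming the CTA-861 total raster of 4400x2250 pixels, blanking included, for 2160p60, and HDMI's 8b/10b line coding):

```python
# HDMI TMDS link rate in Gbps: total pixels (including blanking) * fps
# * 3 channels * bits per channel * 10/8 (8b/10b coding overhead).
def hdmi_gbps(total_w, total_h, fps, bits):
    return total_w * total_h * fps * 3 * bits * 10 / 8 / 1e9

print(hdmi_gbps(4400, 2250, 60, 8))    # ~17.8 Gbps: fits under 18
print(hdmi_gbps(4400, 2250, 60, 10))   # ~22.3 Gbps: exceeds 18
```

The 8-bit case lands just under HDMI 2.0's 18 Gbps ceiling while 10-bit RGB blows past it, matching Putman's conclusion.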
Although it wasn't discussed at the symposium, I want to clarify what manufacturers mean when they claim their current displays have "HDMI 2.0" capabilities. Some of these capabilities can be implemented by updating the firmware of existing chipsets, but other capabilities can't be implemented without new hardware, which we won't see in consumer products until next year. For example, current HDMI hardware, which has a bandwidth of 10.2 Gbps, can support UHD (2160p) at 60 fps, but only with 8-bit color and 4:2:0 subsampling.
Anything above the double line in this table can be conveyed with current HDMI hardware; anything below the double line requires new hardware that isn't available yet.
Most of the presenters advocated for UHD to adopt the entire Rec.2020 suite of parameters, including a resolution of 3840x2160, the specified color gamut (if not XYZ), frame rates up to 120 fps, at least 10-bit color (preferably 12-bit), and 4:2:2 subsampling (if not 4:4:4). However, HDMI 2.0 at 18 Gbps can't accommodate all these upgrades, so we face a dilemma—increase HDMI 2.0's data rate, abandon HDMI for DisplayPort, or accept lower standards for UHD. In my view, HDMI is too entrenched in the consumer-electronics market to be supplanted by anything else, so I'm afraid we may have to settle for less than the best possible UHD experience, at least until HDMI is upgraded yet again, which I can't imagine happening any time soon.
In any event, it's clear to me that high frame rates, high dynamic range, expanded color gamut, high color bit depth, and less-aggressive color subsampling all make more difference in the picture quality than higher resolution, yet resolution is the only settled issue. All the other improvements are still being discussed and will not be finalized for some time to come, which means the TV manufacturers have jumped the gun by introducing UHDTVs with 8-bit color, Rec.709 gamut, and the ability to accept a maximum frame rate of 30 or, in some cases, 60 fps. (Many UHDTVs can reproduce a larger gamut, but it is not well-defined, varying from one set to another.) Of course, they want to sell TVs, but the models they sell now will be obsolete in a couple of years as these other issues are settled and content is created using the new standards.