@janos: nice work. I have some interesting measurements of real-time gamma vs. APL on my Samsung that are probably applicable to the pannys and may also be related to how one calibrates for luminance relative to white.
One question: why do you calibrate color luminance using 75% luminance and not something lower? The histograms I looked at for typical movie scenes are heavily weighted toward much lower luminance, closer to 30%.
Here is one example of the gamma vs. APL measurements I'm currently working on. Note that this set was calibrated for gamma using AVHSHD windowed patterns. The results for fixed APL patterns look very similar on average, but there are distinct differences in the details of how each input level behaves.
Gamma calibration with target = 2.3, fairly flat using window patterns:

I then measured the gamma response for each of 7 input levels relative to peak white as a function of APL. I did this by using small (1% size) windows against a variable gray 100% size background. In a perfect world, each level would be a flat line at gamma = 2.3 as we vary the picture content level.
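For anyone who wants to reproduce the per-level numbers, here is a minimal Python sketch of the usual point-gamma calculation relative to peak white. The function name and the sample luminance values are made up for illustration, and it assumes no black-level offset correction, which may matter at the lowest levels.

```python
import math

def point_gamma(level, luminance, white_luminance):
    """Point gamma of one input level relative to peak white.

    level           -- input stimulus as a fraction, strictly between 0 and 1
    luminance       -- measured luminance of the small window (e.g. cd/m^2)
    white_luminance -- measured luminance of 100% white under the same
                       background (APL) condition
    Assumption: black level is ignored (no offset subtraction).
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    if luminance <= 0 or white_luminance <= 0:
        raise ValueError("luminance values must be positive")
    # Y / Y_white = level ** gamma  =>  gamma = log(Y / Y_white) / log(level)
    return math.log(luminance / white_luminance) / math.log(level)

# Hypothetical reading: a 50% window that measures exactly on a 2.3 curve.
example = point_gamma(0.5, 100.0 * 0.5 ** 2.3, 100.0)
print(round(example, 3))
```

Repeating this for each input level at each background gray level gives one gamma-vs-APL curve per level.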

The average of all levels is not too far off from the desired behavior and shows the typical non-defeatable dynamic contrast effect at low APL (gamma is too low) and ABL effects at high APL (again, gamma is too low). But the interesting behavior is the spread in the response of the various levels. This is not noise in the measurement; all of the structure seen here is repeatable to the 0.02 gamma level.
So, to take an extreme example, the 5% input level has a fairly flat response, but it is far too bright relative to white (gamma = 2.15), while the 15% level is generally too dark relative to white (gamma = 2.4). The 10% level most closely approximates our target value of 2.3.
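To put those gamma offsets into luminance terms, here is a rough Python sketch. The helper name is hypothetical, and it assumes a pure power-law response with no black-level offset, so treat the numbers as approximate.

```python
def luminance_ratio(level, measured_gamma, target_gamma=2.3):
    """Ratio of actual to intended luminance (both relative to white).

    Assumes a pure power law: Y / Y_white = level ** gamma.
    level ** measured / level ** target = level ** (measured - target)
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    return level ** (measured_gamma - target_gamma)

# The two extreme levels quoted above, against the 2.3 target.
print(round(luminance_ratio(0.05, 2.15), 3))  # about 1.567
print(round(luminance_ratio(0.15, 2.40), 3))  # about 0.827
```

Under those assumptions, the 5% level comes out roughly 57% brighter than intended and the 15% level roughly 17% darker.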