I feel like there is some misunderstanding of HDR10+, and of dynamic metadata in general, so I'm going to try to summarize my understanding of how it works in the simplest way possible. I will ignore color volume transforms in this discussion and focus solely on tone-mapping as it relates to brightness. I will also ignore the "Active HDR" pseudo-dynamic metadata approaches.
For a display that can track the EOTF perfectly up to its max brightness of 1,000 nits...
Case 1: The static metadata informs the display that the content does not contain any highlights in excess of 1,000 nits. The display does not need to do any tone-mapping, as it can perfectly track the EOTF for every scene in the content.
Case 2: The static metadata informs the display that the content does not contain any highlights in excess of 4,000 nits. The display needs to perform tone-mapping (or clip all content above 1,000 nits), so say it starts deviating from the EOTF at 600 nits and rolls off everything above so that 4,000 nits in the content corresponds to 1,000 nits on the display. The display is now showing everything in the content that falls in the 600-1,000 nit range dimmer than it should be, despite being fully capable of tracking the EOTF perfectly for that range. It has to do this for all scenes, since the metadata is static and only informs the display of the brightest highlight in the entire content, even though that may have been a single bright highlight for an instant in one scene. Scenes that only contain content from 0-600 nits should still look identical to Case 1, since no tone-mapping occurs in this range.
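To put rough numbers on Cases 1 and 2, here is a minimal sketch in Python. It assumes a purely linear roll-off above the 600 nit knee, which is my simplification for illustration; real displays use smoother, manufacturer-specific curves. The function name and parameters are hypothetical.

```python
def tone_map(nits, content_max, display_max=1000.0, knee=600.0):
    """Map a content luminance (nits) to the luminance the display shows.

    Assumption: a linear roll-off above the knee. Real displays use
    smoother curves, so treat the numbers as illustrative only.
    """
    # Case 1: the metadata says nothing exceeds the display's peak,
    # so the display can track the EOTF with no tone-mapping.
    if content_max <= display_max:
        return min(nits, display_max)
    # Below the knee the display still tracks the EOTF exactly.
    if nits <= knee:
        return nits
    # Case 2: squeeze [knee, content_max] into [knee, display_max].
    scale = (display_max - knee) / (content_max - knee)
    return knee + (min(nits, content_max) - knee) * scale


if __name__ == "__main__":
    for level in (300, 600, 800, 1000, 4000):
        case1 = tone_map(min(level, 1000), content_max=1000)
        case2 = tone_map(level, content_max=4000)
        print(f"{level:>5} nits in content -> "
              f"Case 1: {case1:7.1f}  Case 2: {case2:7.1f}")
```

With this toy curve, an 800 nit highlight comes out at roughly 624 nits in Case 2, even though the display could have shown it at 800.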
Case 3: At the beginning of each scene, the display receives dynamic metadata telling it the maximum nits in the scene. Now all scenes that do not surpass the display's 1,000 nit capabilities do not need to be tone-mapped at all. Scenes that do surpass the capabilities still need to be tone-mapped, but a further benefit is that the aggressiveness of the tone-mapping can be adjusted as necessary for each scene. For example, a scene that maxes out at 1,100 nits needs hardly any tone-mapping at all on this display, while a scene that maxes out at 4,000 nits can get the heavy tone-mapping that it needs.
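Here is a small Python sketch comparing static and per-scene metadata for the same 800 nit highlight. The scene peaks (900, 1,100 and 4,000 nits), the fixed 600 nit knee and the linear roll-off are all assumptions of mine for illustration; an actual display would likely also move the knee per scene and use a smoother curve.

```python
DISPLAY_MAX = 1000.0  # display peak from the example above
KNEE = 600.0          # assumed point where roll-off starts


def roll_off(nits, signalled_max):
    """Linear roll-off above KNEE (an illustrative simplification)."""
    if signalled_max <= DISPLAY_MAX or nits <= KNEE:
        return min(nits, DISPLAY_MAX)  # no tone-mapping needed
    scale = (DISPLAY_MAX - KNEE) / (signalled_max - KNEE)
    return KNEE + (min(nits, signalled_max) - KNEE) * scale


def compare(scene_peaks, probe):
    """Return (scene_peak, static_result, dynamic_result) per scene."""
    content_peak = max(scene_peaks)  # all that static metadata conveys
    return [
        (peak, roll_off(probe, content_peak), roll_off(probe, peak))
        for peak in scene_peaks
    ]


if __name__ == "__main__":
    # Hypothetical per-scene maxima, in nits.
    for peak, static, dynamic in compare([900.0, 1100.0, 4000.0], 800.0):
        print(f"scene peak {peak:6.0f}: static {static:6.1f} nits, "
              f"dynamic {dynamic:6.1f} nits")
```

Under these assumptions the 900 nit scene is left untouched with dynamic metadata, the 1,100 nit scene is only mildly compressed, and the 4,000 nit scene gets the same heavy roll-off either way.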
This is why dynamic metadata matters so much more for lower-nit displays. The content needs to exceed the display's capabilities in order for dynamic metadata to have an impact, and the further the content exceeds them, the more impact it will have.