When I think about this from a hardware perspective, I don't see how it could be an HDMI protocol issue. Let's take my example of I1->O1, I2->O2. Then the user selects I3->O1. What does the hardware have to do? It first has to disconnect I1 from O1, then connect I3->O1. At this point there will be an HDMI handshake between I3 and O1. In this case the switch doesn't need to do any proxying. Why would it care what is on O2? O2 is displaying I2, and obviously O2 is NOT communicating with I1 or I3.
OK, so now we have I3->O1 and I2->O2.
Next, consider the user switching I3 to connect to O2 as well. HDMI is a single-source, single-sink protocol, so at this point I understand why the switch needs to intervene: it needs to proxy the EDID information and return only the resolution(s) common to both displays. But in the simple case where it only needs to connect one input to one output, how could it possibly be an issue with the HDMI standard? The hardware could be a simple buffer in this case, right? I could see how a silly hardware implementation could choose to always proxy EDID info. But that is implementation, not protocol. A better implementation does not need to proxy EDID unless it is displaying one input on multiple outputs.
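
To make the distinction concrete, here is a rough Python sketch of the decision I'm describing. It is only a model, not how any real switch firmware works: the function names `route` and `modes_seen_by` are made up for illustration, and an EDID is reduced to a plain set of (width, height, refresh) tuples, whereas a real EDID is a binary block carrying much more than a mode list.

```python
from typing import Dict, Set, Tuple

# Simplified stand-in for an EDID: just the modes a display advertises.
Mode = Tuple[int, int, int]  # (width, height, refresh_hz)


def route(routes: Dict[str, str], output: str, new_input: str) -> Dict[str, str]:
    """Return a new routing table with only `output` re-pointed at `new_input`.

    Every other output keeps its current input untouched.
    """
    updated = dict(routes)
    updated[output] = new_input
    return updated


def modes_seen_by(
    routes: Dict[str, str],
    sink_modes: Dict[str, Set[Mode]],
    source: str,
) -> Tuple[str, Set[Mode]]:
    """Decide what mode list `source` should be shown (hypothetical policy).

    One sink  -> pass that sink's modes straight through (no proxying).
    Many sinks -> proxy: advertise only the modes common to all of them.
    """
    sinks = sorted(out for out, inp in routes.items() if inp == source)
    if not sinks:
        return "unrouted", set()
    if len(sinks) == 1:
        return "passthrough", set(sink_modes[sinks[0]])
    common = set.intersection(*(set(sink_modes[s]) for s in sinks))
    return "proxy", common


# Assumed example displays; the mode lists are invented for illustration.
sink_modes = {
    "O1": {(3840, 2160, 60), (1920, 1080, 60)},
    "O2": {(1920, 1080, 60), (1280, 720, 60)},
}

routes = {"O1": "I1", "O2": "I2"}       # I1->O1, I2->O2
routes = route(routes, "O1", "I3")      # user selects I3->O1
one_to_one = modes_seen_by(routes, sink_modes, "I3")
untouched = modes_seen_by(routes, sink_modes, "I2")

routes = route(routes, "O2", "I3")      # user also selects I3->O2
fan_out = modes_seen_by(routes, sink_modes, "I3")

print(one_to_one[0], untouched[0], fan_out)
```

In this model, re-pointing O1 at I3 never touches the O2 entry, and the intersection step only runs once I3 is feeding both outputs. That is the whole argument: the one-to-one case needs nothing beyond the single sink's own EDID.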