What you're describing means stripping the audio out of the signal on one HDMI cable and stripping the video out of the signal on the other, then combining what's left. Both signals are encrypted (typically with HDCP), so each has to be decrypted first, and after merging you have to re-encrypt the result before sending it on its way. Doing all of this in real time (or close to it) makes it harder still.
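To make those steps concrete, here is a toy Python sketch of the pipeline. Everything in it is made up for illustration: `Frame`, `decrypt`, `encrypt`, `merge`, and `remux` are hypothetical names, the single-byte XOR "cipher" is only a stand-in for the real encryption, and an actual device would do this in hardware on a live signal rather than on Python lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One time slice of an HDMI-like signal (hypothetical model)."""
    timestamp: int
    video: bytes
    audio: bytes


def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy cipher: XOR every byte with a one-byte key (0-255). Not real HDCP."""
    return bytes(b ^ key for b in data)


def decrypt(frames, key):
    """Return copies of the frames with video and audio decrypted."""
    return [
        Frame(f.timestamp, xor_bytes(f.video, key), xor_bytes(f.audio, key))
        for f in frames
    ]


# XOR is its own inverse, so the toy encrypt is the same operation.
encrypt = decrypt


def merge(video_source, audio_source):
    """Keep video from one signal and audio from the other, matched by timestamp."""
    audio_by_time = {f.timestamp: f.audio for f in audio_source}
    return [
        Frame(f.timestamp, f.video, audio_by_time[f.timestamp])
        for f in video_source
        if f.timestamp in audio_by_time
    ]


def remux(cable_a, key_a, cable_b, key_b, out_key):
    """Decrypt both inputs, merge video(A) with audio(B), re-encrypt the result."""
    clear_a = decrypt(cable_a, key_a)
    clear_b = decrypt(cable_b, key_b)
    return encrypt(merge(clear_a, clear_b), out_key)
```

Note that the sketch has both signals fully buffered and already sharing timestamps, which is exactly what a real-time box would not get for free.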
I'm not aware of anything that can do this. A matrix switch cannot do it digitally, since it does not (or should not) change the audio/video stream it receives. A good AVR can get close (watch video from one input while listening to audio from another), but its HDMI output would still carry the original audio and video from the first input; only the speakers would play the audio from the second.
Your other choice is to go to the dark side of HDMI conversion: convert to analog, run it through a true audio/video mixer, and then feed the result back into an HDMI encoder.