I agree with your first statement. I don't understand your second and third. How is the audio masking phenomenon not logical?
Audio masking is the very well established principle that if a sound is louder and especially lower in frequency than another sound, that other sound can be completely inaudible.
I think you know this, but I wanted to state it for the benefit of those who are unfamiliar with masking. Here is a Wikipedia link with more information for them:
The more complex the signal is, that is, the more frequency components it has, the more potential there is for it to mask much lower level sounds (the distortion).
In general, lower level tones cannot mask louder sounds. If the distortion is lower than the signal harmonics, it can't mask the signal.
For the violin example, violins are quite rich in harmonics. It is the harmonics that make a violin sound different from a flute. The violin harmonics are not distortion, they are the sound of the violin.
The picture I have attached shows the spectrum of a violin playing the A above middle C, 440Hz. It was taken from a violin sample recording on my Korg Kronos keyboard.
You can see the fundamental and a couple dozen of its harmonics. You can see that the 2nd, 4th and 6th harmonics are actually louder than the fundamental with this violin.
So let's say you add 2nd harmonic distortion to this at -90dB. Note how loud the violin's own 2nd harmonic already is.
The distortion would be completely swamped by the violin's own harmonics. The -90dB distortion would also apply to each of the violin's harmonics, landing 90dB below that harmonic's own signal level.
The masking curve for each of those harmonics would prevent the distortion from being audible.
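To put rough numbers on that, here is a minimal Python sketch. The harmonic levels in it are made-up placeholders, not values read from my attached spectrum; they only mimic the general shape (2nd, 4th and 6th harmonics louder than the fundamental). It also treats the distortion as acting on each harmonic independently and ignores intermodulation between harmonics, which is a simplification.

```python
# Where do 2nd harmonic distortion products land relative to the violin's own harmonics?
FUNDAMENTAL_HZ = 440.0
DISTORTION_DB = -90.0  # 2nd harmonic distortion, relative to the tone that produces it

# HYPOTHETICAL levels in dB relative to the fundamental (harmonic number -> level).
# These are illustrative placeholders, not measurements.
harmonic_levels_db = {1: 0.0, 2: 6.0, 3: -4.0, 4: 3.0,
                      5: -10.0, 6: 1.0, 7: -15.0, 8: -12.0}


def second_harmonic_products(levels_db, distortion_db):
    """Harmonic k produces a distortion product at harmonic 2k,
    distortion_db below harmonic k's own level."""
    return {2 * k: level + distortion_db for k, level in levels_db.items()}


products = second_harmonic_products(harmonic_levels_db, DISTORTION_DB)

for n, dist_level in sorted(products.items()):
    own_level = harmonic_levels_db.get(n)
    if own_level is None:
        continue  # no violin harmonic listed at this frequency in the placeholder table
    print(f"{n * FUNDAMENTAL_HZ:6.0f} Hz: distortion {dist_level:6.1f} dB, "
          f"violin's own harmonic {own_level:6.1f} dB, "
          f"gap {own_level - dist_level:5.1f} dB")
```

With these placeholder levels, every distortion product sits on top of a violin harmonic that is tens of dB louder at the very same frequency.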
If you look at the figure 4.18 from Toole's book that I referenced previously,
you can see that for the 2nd harmonic to become audible, that is, to rise above the masking threshold,
it would have to be above about -40dB relative to the signal level of the violin harmonic doing the masking.
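
The arithmetic behind that comparison can be sketched in a few lines of Python. The -40dB threshold is only my approximate reading of the figure, so treat it as an assumption rather than an exact value; the function names are just ones I made up for illustration.

```python
DISTORTION_DB = -90.0          # distortion level relative to the masking harmonic
MASKING_THRESHOLD_DB = -40.0   # ASSUMED: approximate threshold read from the figure


def db_to_amplitude_ratio(db):
    """Convert a level in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (db / 20)


def margin_below_threshold(distortion_db, threshold_db):
    """How many dB the distortion sits below the audibility threshold.
    A positive result means the distortion is masked."""
    return threshold_db - distortion_db


margin_db = margin_below_threshold(DISTORTION_DB, MASKING_THRESHOLD_DB)
ratio = db_to_amplitude_ratio(DISTORTION_DB)

print(f"Distortion is {margin_db:.0f} dB below the masking threshold")
print(f"-90 dB as an amplitude ratio: {ratio:.2e}")
```

So under that assumed threshold, the distortion would have to come up by roughly 50dB before it even reached the point where it could be heard.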