It's not the DC. It's the high frequency crud that's generated at FULL POWER when the amp clips. The crud goes straight to the tweeter and "poof".
That's what would seem logical, but it's not how it works. When you clip audio, as in music, the actual spectral distribution doesn't change a lot, but the RMS level does go up. Until you get into severe clipping, though, it doesn't go up much differently than if you had just raised the level without clipping. The effect is of course dependent on the music itself.
These are kind of hard to see, but it's a contemporary tune, which I chose because contemporary music is more dense and would clip more often. The lower plot pair is L/R without clipping, the upper plot pair is the same tune pushed 10dB past the clipping threshold. Notice the spectral distribution is nearly identical. This would not be true of a sine wave where no harmonics exist at all without distortion, but with the spectral distribution of music itself the actual contribution of the added harmonics is not easily seen. But you can see there is no high frequency content that is substantially different from the unclipped spectrum below it.
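If you want to try this kind of comparison yourself, here is a minimal sketch in plain Python. It is not the analyzer used for the plots above. Everything in it is an assumption made for illustration: low-pass filtered noise stands in for music, the clipper is an ideal clamp, and "high frequency" is taken to mean everything above one eighth of the sample rate (roughly 5.5 kHz at 44.1 kHz). It reports how much the high-frequency share of the total energy shifts when the signal is pushed 10 dB past the clipping threshold.

```python
import cmath
import math
import random

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def hf_share_db(samples):
    """Energy above fs/8 relative to total energy (DC excluded), in dB."""
    n = len(samples)
    # Hann window to limit leakage from the strong low-frequency bins.
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(samples)]
    power = [abs(c) ** 2 for c in fft(windowed)[: n // 2]]
    return 10.0 * math.log10(sum(power[n // 8:]) / sum(power[1:]))

# ASSUMPTION: one-pole low-passed Gaussian noise as a crude stand-in for music.
rng = random.Random(3)
state = 0.0
noise = []
for _ in range(4096):
    state = 0.9 * state + rng.gauss(0.0, 1.0)
    noise.append(state)

# Normalize so the loudest peak just touches the clipping threshold of 1.0.
peak = max(abs(s) for s in noise)
program = [s / peak for s in noise]

# Push 10 dB past the threshold and clamp (ideal hard clipper).
gain = 10.0 ** (10.0 / 20.0)
clipped = [max(-1.0, min(1.0, s * gain)) for s in program]

delta_db = hf_share_db(clipped) - hf_share_db(program)
print(f"change in high-frequency share after 10 dB of clipping: {delta_db:+.2f} dB")
```

Swapping the noise for samples read from a real recording is the obvious next step, since the result depends on the program material.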
The detector algorithm in the spectral analysis is not true RMS of course, it's an average, but the difference with this kind of signal won't be all that great.
The explanation is that clipping is a peak-related function. While it definitely adds harmonic content, that content is below the level of the unclipped waveform, and it is also distributed temporally, meaning it's only there while a peak is actually being clipped. So from the standpoint of RMS heating, it doesn't significantly change the spectrum of complex signals.
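To put a rough number on "only there while a peak is actually being clipped", the sketch below counts what fraction of samples would hit the rail at a given overdrive. The assumptions are mine: Gaussian noise normalized to a peak of 1.0 stands in for the program, the threshold is 1.0, and `clipped_fraction` is a hypothetical helper. Dense, heavily compressed material will spend more time at the rail than this, and dynamic material less.

```python
import random

def clipped_fraction(samples, overdrive_db, threshold=1.0):
    """Fraction of samples that exceed the threshold after applying gain."""
    gain = 10.0 ** (overdrive_db / 20.0)
    over = sum(1 for s in samples if abs(s * gain) > threshold)
    return over / len(samples)

# ASSUMPTION: Gaussian noise as a crude stand-in for program material.
rng = random.Random(7)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Normalize so the loudest peak just touches the threshold at 0 dB overdrive.
peak = max(abs(s) for s in noise)
program = [s / peak for s in noise]

fractions = {db: clipped_fraction(program, db) for db in (0.0, 3.0, 10.0)}
for db, frac in fractions.items():
    print(f"{db:4.1f} dB overdrive: {100.0 * frac:.2f}% of samples clipped")
```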
The other thing is, 10dB is a LOT of clipping, something nobody would actually tolerate for very long. 3dB is actually a lot too, still very nasty sounding. The typical peak clipping people worry about in terms of blowing speakers isn't even 3dB, at least not for more than a brief instant, because audio pushed that far into clipping is simply unlistenable.
The RMS change is what's interesting. With just a few dB of clipping, the difference between a clipped signal and an unclipped signal with the same level increase is very small, if any. At 10dB of clipping, though, the difference is noticeable, with the RMS level of the clipped signal becoming higher sooner. This effect is highly program dependent, so there's no real way to generalize.
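A quick way to check the RMS comparison on your own material is sketched below, again in plain Python and again under my own assumptions: Gaussian noise normalized to a peak of 1.0 as the program, an ideal clamp at 1.0 as the amplifier, and true RMS rather than an averaging detector. In this idealized model the clamp can only remove energy, so the sketch reports how far the clipped RMS falls short of the unclipped RMS at the same gain. A real amplifier and a real meter may behave differently.

```python
import math
import random

def rms_db(samples):
    """True RMS level in dB relative to a full-scale value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

def overdrive(samples, gain_db, threshold=1.0):
    """Return (gained, clipped): same gain, without and with an ideal clamp."""
    gain = 10.0 ** (gain_db / 20.0)
    gained = [s * gain for s in samples]
    clipped = [max(-threshold, min(threshold, s)) for s in gained]
    return gained, clipped

# ASSUMPTION: Gaussian noise as a crude stand-in for program material.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
peak = max(abs(s) for s in noise)
program = [s / peak for s in noise]

rms_gap_db = {}
for gain_db in (3.0, 10.0):
    gained, clipped = overdrive(program, gain_db)
    rms_gap_db[gain_db] = rms_db(gained) - rms_db(clipped)
    print(f"{gain_db:4.1f} dB over: unclipped {rms_db(gained):6.2f} dB, "
          f"clipped {rms_db(clipped):6.2f} dB, gap {rms_gap_db[gain_db]:.2f} dB")
```

With this kind of test signal the gap stays small for a few dB of overdrive and only opens up as the clipping gets severe, but the exact figures depend entirely on the program material.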
What we can say is that the heating caused by nominally clipped audio is no worse than unclipped audio of the same RMS level. In fact, the unclipped may be slightly more dangerous because it is not limited in maximum peak level like a clipped signal. The situation may change if there is gross and continuous clipping, or if the audio itself is pre processed for very high density.