For transparency reasons I'll start by saying I'm the CTO of Zektor (we make AV switches).
This has always been a big pet peeve of mine. It's just not that hard to create unbrickable firmware updates! The statement "Be sure not to unplug the unit during firmware updating" always makes me a little queasy.
In hopes that some engineer from some other company reads this and fixes their update issues, I'm going to describe how we handle firmware updates. (This article caught my attention, so maybe I'm not alone.)
First off, the firmware inside the unit is divided into two parts: the boot-loader and the main program. The boot-loader's job is to go into a loop waiting for new firmware and, once it is found, update the main program.
There is not a modern embedded processor sold that will not let you hardware-protect the boot-loader, so we make the boot-loader read-only. The boot-loader never gets updated, so it had better be well tested before shipping!
On power up, the first thing executed is the boot-loader. It does a full CRC32 check of the main program: if the check passes, it jumps into the main program; if it fails, it waits for a new firmware update.
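For anyone who wants to see the shape of that power-up check, here is a minimal sketch in C. It is not our actual boot-loader: the names `crc32_calc` and `boot_decide`, the choice of the common reflected CRC-32 polynomial (0xEDB88320), and the layout (image bytes followed by a 4-byte little-endian CRC32) are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320 (assumed; the same
 * variant used by zlib and Ethernet). Slow but tiny, which suits a
 * boot-loader that only runs it once per power up. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

typedef enum { BOOT_RUN_MAIN, BOOT_WAIT_FOR_UPDATE } boot_action_t;

/* Hypothetical layout: the main program occupies `region`, and its last
 * four bytes hold the CRC32 of everything before them, little-endian. */
static boot_action_t boot_decide(const uint8_t *region, size_t region_len)
{
    if (region_len < 4) {
        return BOOT_WAIT_FOR_UPDATE;    /* too small to hold a CRC */
    }
    size_t body_len = region_len - 4;
    uint32_t stored = (uint32_t)region[body_len]
                    | ((uint32_t)region[body_len + 1] << 8)
                    | ((uint32_t)region[body_len + 2] << 16)
                    | ((uint32_t)region[body_len + 3] << 24);

    /* A half-written or corrupted image fails here and never runs. */
    return (crc32_calc(region, body_len) == stored)
               ? BOOT_RUN_MAIN
               : BOOT_WAIT_FOR_UPDATE;
}
```

On real hardware `region` would point at the main program's flash area, and `BOOT_RUN_MAIN` would turn into a jump to its entry point.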
As simple as that! This takes care of bricking 99% of the time. Some scenarios:
User starts firmware update and power is lost.
-- In this case, the half-loaded firmware will not pass CRC32, and once the power is restored, the device will not jump to the main program, but will wait for another attempt at downloading the new firmware. Just start up the firmware updater and try again.
The firmware file is corrupt (we do a full CRC32 of the firmware before even starting an update, but who knows).
-- We CRC-check each individual packet being sent, and if a packet fails, the update is aborted. The switch is reset, the CRC32 check of the main program fails, and the device goes back into the boot-loader and waits for another attempt. If the CRC32 fails at the end of the entire update, we also jump back into the boot-loader. This allows you to download the firmware again and give it another try.
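The per-packet check in that second scenario could be sketched roughly like this. The packet layout, the names `packet_t` and `receive_update`, and the use of CRC-16/CCITT for packets are assumptions for the example; all I've said above is that each packet carries a CRC and a bad one aborts the update.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF); an assumed choice for
 * the per-packet check, picked only because it is small and common. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical packet: a payload plus the CRC the sender computed. */
typedef struct {
    const uint8_t *payload;
    size_t         len;
    uint16_t       crc;
} packet_t;

typedef enum { UPDATE_OK, UPDATE_ABORTED } update_result_t;

/* Copies packets into `dest` (standing in for flash) and stops at the
 * first packet whose CRC does not match or that would overflow `dest`.
 * `*written` reports how many bytes were stored before stopping. */
static update_result_t receive_update(const packet_t *pkts, size_t count,
                                      uint8_t *dest, size_t dest_cap,
                                      size_t *written)
{
    *written = 0;
    for (size_t i = 0; i < count; i++) {
        const packet_t *p = &pkts[i];
        if (crc16_ccitt(p->payload, p->len) != p->crc) {
            return UPDATE_ABORTED;      /* corrupt packet: give up */
        }
        if (p->len > dest_cap - *written) {
            return UPDATE_ABORTED;      /* image larger than the region */
        }
        memcpy(dest + *written, p->payload, p->len);
        *written += p->len;
    }
    return UPDATE_OK;
}
```

An aborted update leaves a partial image behind, which is fine: the full CRC32 at the next power up rejects it and the device waits for another download.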
Ok, how about that 1%? A scenario:
We send out a piece of firmware that passes CRC32, but because of a programming error the main program bricks the device.
-- In this case there's a problem: the CRC32 passes, so the device never goes into download mode. Each time you power up, it passes the CRC32 check, jumps into the main program, and bricks.
So we added one more test at power up. We pick a button somewhere (usually the power button). If you hold this button down when you plug in the unit, the boot-loader skips the CRC32 check and goes directly to waiting for new firmware. Combined with the fact that we *always* allow downgrading to a previous version, this gives you the chance to downgrade and call us up and tell us our new firmware sucks! But at least you're not without a switch.
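Put together, the power-up decision and the downgrade rule come down to something like the following sketch. The function names, the boolean inputs, and the version parameters are made up for illustration; a real boot-loader would read the button from a GPIO pin and only run the CRC32 when the button is not held.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { ENTER_MAIN_PROGRAM, ENTER_UPDATE_MODE } startup_mode_t;

/* Hypothetical decision function: `button_held` is the state of the
 * recovery button at power up, `main_crc_ok` the result of the full
 * CRC32 check of the main program. */
static startup_mode_t select_startup_mode(bool button_held, bool main_crc_ok)
{
    if (button_held) {
        /* User override: a valid CRC does not matter, wait for firmware. */
        return ENTER_UPDATE_MODE;
    }
    return main_crc_ok ? ENTER_MAIN_PROGRAM : ENTER_UPDATE_MODE;
}

/* Downgrades are always allowed, so the version numbers are deliberately
 * ignored: the only thing that can reject an image is a bad CRC. */
static bool accept_update(uint32_t installed_version, uint32_t new_version,
                          bool image_crc_ok)
{
    (void)installed_version;
    (void)new_version;
    return image_crc_ok;
}
```

The point of keeping `accept_update` this dumb is that a "newer only" rule would take away the escape route when the newest firmware is the broken one.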
Believe it or not, we dislike having switches sent back to us for unbricking as much as you do. (OK, maybe not quite as much, but you get my drift.)
So knowing how simple this is, why is it still possible for new devices to brick when you upgrade the firmware? Fix it, fellow engineers; it's not that hard!