Both words are Latin, meaning "I see" and "I hear", respectively.
But they belong to different conjugations - Latin verbs fall into four conjugations, each with its own set of endings. "Videre" ("to see") belongs to the second ("e"-based) conjugation, and "audire" ("to hear") to the fourth ("i"-based).
A bit of web research shows that "audio-" was first used as a prefix in 1913 (as in words like "audiophile", although that particular word came later). Latin and Greek words are often used unaltered in English as prefixes (such as "sub-", "an-", "in-", "inter-", "intra-"). In 1934 "audio" was first used as a word on its own, and then in 1935 "video" was coined to match.
There are much older English words based on the same Latin verbs - e.g. "vision" from the 13th century, and "audience" from the 14th century.
Anyway, the summary is: if you have a complaint, take it up with the ancient Romans.