Dear Professor Phil:
I’ve picked up several computer books at trade shows and have read wiki’s about mpeg compression. The explanations always seem to involve complex mathematics. Is there a simple way to explain how it works?
Sam, Hershey, PA
The algorithms used in mpeg compression are indeed what most of us would call advanced mathematics. Yet, the underlying concepts aren’t complex and can be explained through analogies.
All compression methods are based on reducing the size of the data set and removing redundant information. Suppose I create a message to indicate where something is hidden – under tree trunk. There are sixteen characters (counting spaces); eight of which are distinct. Conventionally, we would use the ASCII code chart to get the eight bits representing each character and transmit or store 16X8= 128 bits. However, using a common compression technique developed by Huffman, we could use the representations in Figure 1. (For more on how Huffman representations are developed search the web on Huffman Coding)
Figure 1
Space 100 n 001
e 01 t 101
r 111 d 1100
U 000 k 1101
Notice that the frequent characters in our message have short bit patterns while the less frequent characters have longer patterns. Now we can send our message using 49 bits. We have achieved 62% compression by reducing the data set. We could also drop the letter d and the spaces and the receiver will still understand the message from its context. That would yield 70% compression.
Mpeg takes 8X8 blocks of pixels from the scene and represents the intensity and colors as numerical values in an 8X8 table. Placing the average values for the table in the first cell and the differences from that average in the remaining cells, makes the entries into small numbers. Many are 0, +1 or -1. To reduce the data set, the lowest values are dropped. The resulting data set represents the scene and is much smaller than what was digitized by the camera. After a compression such as shown above, very little must be transmitted. However, when a block is corrupted or lost we see tiling on the screen.