How Low Can You Go?

Last night I went to the Denver IEEE Signal Processing Society meeting. I was particularly interested in this talk because it was given by Gary Sullivan, the co-chair of the recent international standardization effort that created the High Efficiency Video Coding (HEVC) standard. HEVC is the heir to the AVC (H.264) standard, which was itself the heir to the venerable, if long in the tooth, MPEG-2 standard of the early 90s.

We’ve been following HEVC from the sidelines here at Cardinal Peak, and I was looking forward to hearing Gary’s opinion on the gains of HEVC with respect to AVC. The HEVC effort set out with the goal of a 50% reduction in bitrate relative to AVC at the same perceived video quality. AVC itself is often touted as requiring only 50% of the bitrate of MPEG-2, so HEVC’s bitrate target works out to roughly 25% of MPEG-2’s for comparable video quality.
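Taking both of those 50% figures at face value, the compounding is straightforward (R here is just my shorthand for bitrate at equal perceived quality):

```latex
% Back-of-the-envelope compounding of the two "50%" claims
R_{\mathrm{AVC}} \approx 0.5\,R_{\mathrm{MPEG\text{-}2}}, \qquad
R_{\mathrm{HEVC}} \approx 0.5\,R_{\mathrm{AVC}} \approx 0.25\,R_{\mathrm{MPEG\text{-}2}}
```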

As a standard, HEVC can be characterized as “more of the same.” I don’t mean that to be derogatory; I just mean that it is not a new conceptual framework relative to either MPEG-2 or AVC. In other words, HEVC uses the same approach as the earlier compression standards: motion-compensated prediction, a transform to achieve spatial de-correlation, adaptive quantization, and entropy coding of the quantizer’s output.
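To make that pipeline concrete, here is a toy sketch of the hybrid predict/transform/quantize/entropy-code loop. This is my own illustration, not code from any of these standards: the 8x8 block size, the fixed quantizer step, the zero-motion prediction, and the zeroth-order entropy estimate (in place of a real entropy coder) are all simplifying assumptions, and the DCT comes from SciPy.

```python
# Toy sketch of the canonical hybrid coder -- NOT any real codec.
import numpy as np
from scipy.fft import dctn, idctn

BLOCK = 8       # transform block size (illustrative choice)
QSTEP = 16.0    # uniform quantizer step (illustrative choice)

def encode_block(cur_block, pred_block):
    """Predict, transform, and quantize one block; return quantized indices."""
    residual = cur_block.astype(np.float64) - pred_block  # motion-compensated prediction residual
    coeffs = dctn(residual, norm="ortho")                 # spatial de-correlation via 2-D DCT
    indices = np.round(coeffs / QSTEP).astype(np.int32)   # quantization (here: one fixed step)
    return indices                                        # these would feed the entropy coder

def decode_block(indices, pred_block):
    """Invert quantization and transform, then add the prediction back."""
    coeffs = indices * QSTEP
    residual = idctn(coeffs, norm="ortho")
    return residual + pred_block

def entropy_bits(indices):
    """Crude zeroth-order entropy estimate standing in for a real entropy coder."""
    _, counts = np.unique(indices, return_counts=True)
    p = counts / counts.sum()
    return counts.sum() * -(p * np.log2(p)).sum()

# Usage: "code" one block against a zero-motion prediction from the previous frame
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (BLOCK, BLOCK)).astype(np.float64)
cur = prev + rng.normal(0, 4, (BLOCK, BLOCK))   # current block ~ previous block + noise
idx = encode_block(cur, prev)
recon = decode_block(idx, prev)
print(f"~{entropy_bits(idx):.1f} bits, max error {np.abs(recon - cur).max():.1f}")
```

Everything the successive standards actually compete on (better predictions, larger and smarter transforms, adaptive quantization, a real arithmetic coder) lives inside those same four steps.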

Back in the day, I was involved in the MPEG-2 standardization effort. At the end of that time, if someone had asked me whether there was a factor-of-four savings to be had over MPEG-2, I would have said flat out “No.” If they had asked me whether there was a factor of two to be had, I would have said “Maybe, but it will be tough.” AVC got that factor of two, and yes, it was tough. What really surprises me is that the canonical video compression approach still had another factor of two left in it, which is what HEVC delivered.

Gary’s opinion at the talk was that HEVC didn’t leave much meat on the bone: perhaps some more sophisticated loop filtering, perhaps some gain from larger transform blocks. I think it’s fair to say that he was skeptical about another factor of two from a “more of the same” approach. Still, having been surprised once, by HEVC’s factor of two over AVC, I find myself wondering whether there might be yet another factor of two hiding in the canonical approach.

Obviously there is a limit! After all, 1/2^n goes to zero as n gets bigger, and rate-distortion theory makes that limit mathematically precise. But I’m impressed by the legs the old predict/transform/quantize approach has shown.
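As one textbook illustration of that limit (a standard result for a memoryless Gaussian source with variance σ², not anything specific to video), the rate-distortion function is:

```latex
% Rate-distortion function of a memoryless Gaussian source, variance \sigma^2
R(D) = \tfrac{1}{2}\log_2\!\frac{\sigma^2}{D}, \qquad 0 < D \le \sigma^2
\quad\Longleftrightarrow\quad
D(R) = \sigma^2\, 2^{-2R}
```

As long as the target distortion D stays below σ², R(D) is strictly positive, so the bitrate at a fixed quality cannot be halved indefinitely.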