I’m at the DSI conference in Las Vegas today, presenting a primer for law enforcement investigators on how video compression works and trying to answer the question of why “lossy” compression should be considered reliable for use in courtrooms. (My slides are available here, and I welcome comments on them.) I think I was invited to speak because of our CaseCracker product, which is used to record custodial interrogations, although what I’m discussing is only slightly related.
The lack of trust in digital media compression in a forensic setting is primarily a PR issue for the media compression industry, if such an industry can be said to exist. We use terms like “lossy compression” and “predicted blocks,” which have relatively precise technical meanings. But these terms mean something slightly different to laymen, and that everyday meaning isn’t exactly reassuring if you’re a judge relying on testimony compressed using a lossy compression algorithm. So it’s important for lawyers and investigators working in the criminal justice system to understand how image compression works.
The technical meaning of “lossy compression” is that the process of encoding followed by the process of decoding doesn’t output the exact same file as the source file you started out with.
When we say the output file isn’t the same as the source file, what we mean is that a byte-for-byte comparison of the two files will fail—not that a guy protesting his innocence will be turned into a different guy admitting his guilt. In fact, with a well-implemented codec, the mathematical lossiness shouldn’t be subjectively noticeable at all. Intuitively, everyone knows that: Nobody worries about using lossy media compression for recording videos of their kids’ birthdays or pictures of their vacations.
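To make the byte-for-byte point concrete, here is a minimal Python sketch of a lossy round trip. It is not how any real video codec works: the `encode` and `decode` functions, the `step` parameter, and the sample values are all made up for illustration, and the only "compression" is coarse quantization (each 8-bit sample is reduced to a 6-bit index). What it shows is that the decoded output can fail an exact comparison with the source while every individual value stays within a small, bounded distance of the original.

```python
def encode(samples: bytes, step: int = 4) -> bytes:
    """Toy lossy 'encode': keep only the quantization index of each sample."""
    return bytes(s // step for s in samples)


def decode(indices: bytes, step: int = 4) -> bytes:
    """Toy 'decode': map each index back to the midpoint of its bin."""
    return bytes(min(255, i * step + step // 2) for i in indices)


# Hypothetical 8-bit brightness samples standing in for a strip of pixels.
source = bytes([12, 13, 15, 200, 201, 203, 90, 91])
output = decode(encode(source))

# The byte-for-byte comparison fails...
identical = source == output

# ...but no sample has moved by more than half a quantization step.
max_error = max(abs(a - b) for a, b in zip(source, output))

print("byte-for-byte identical:", identical)
print("largest per-sample difference:", max_error)
```

With these values the comparison reports that the files differ, yet the largest change to any sample is 2 out of a 0 to 255 range. Real codecs are far more sophisticated, but the distinction is the same one: "not identical" is a statement about bytes, not about what the picture shows.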
Still, it’s worth asking how we can state with confidence that lossy compression algorithms are reliable enough for courtroom use.
In preparing for this talk, I tried to think of all the ways that video compression is lossy. I came up with four independent sub-processes that each contribute to a codec’s overall lossiness.
I’m not a lawyer, thank heaven, but I’m pretty sure the relevant legal issue is whether a piece of video evidence accurately reproduces the event it purports to record. And so in a law enforcement setting, the ultimate answer is that someone who is trusted needs to be able to testify that a particular video clip faithfully represents what happened.