Transforms for Video Compression, part 1: Vectors, the Dot Product, and Orthonormal Bases

Mike Perkins

The use of transforms in data compression algorithms is at least 40 years old. The goal of this three-part series of posts is to provide the mathematical background necessary for understanding transforms and to explain why they are a valuable part of many compression algorithms.

I’ll focus on video since that’s my particular interest. Part 1 (this post) will review vectors, the dot product, and orthonormal bases. Part 2 introduces the use of matrices for describing both one and two-dimensional transforms. Finally, Part 3 gives an example and explains intuitively why transforms are a valuable part of video compression algorithms.

Remember back in high school when you learned the “dot product” of two vectors? It turns out that this operation is at the heart of a powerful way of thinking about transforms such as the DFT, the DCT, and the integer transform used in H.264. In this post I review the key concepts we’ll need: vectors, the dot product, and orthonormal bases. So let’s get started!

An n-dimensional vector is just a collection of n numbers. By convention the numbers are usually written in a column, like this:

$$
\mathbf{x} = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}
$$

Vector concepts are valuable in signal processing because a discrete, or digital, signal is nothing more than a collection of numbers. In other words, a discrete signal (of finite length) is a vector. The numbers in a vector can be samples of a music waveform, pixels in an image, or samples from any signal that tickles your fancy. We call the numbers that make up a vector the components of the vector. Any operation that can be performed on a vector can be thought of as an operation performed on a discrete signal.

Now, an n-dimensional vector can be viewed in two ways:

  • as a point in n-dimensional space (i.e. the components of the vector are the coordinates of a point); or
  • as a directed line segment with its tail at the origin and its head at the point specified by its coordinates (this is the way we normally think of a vector—as an “arrow” with a definite direction).

The Pythagorean Theorem tells us how long a vector is as a function of its components:

$$
\|\mathbf{x}\| = \sqrt{x_0^2 + x_1^2 + \cdots + x_{n-1}^2}
$$

If you have two vectors, then the dot product between them is computed by multiplying their corresponding components together and summing up the products. Mathematically we express this as

$$
\mathbf{a} \cdot \mathbf{b} = \sum_{i=0}^{n-1} a_i b_i
$$

We see immediately that

$$
\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2
$$

It can also be shown that

$$
\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \, \|\mathbf{b}\| \cos\theta
$$

This is just the vector formulation of the famous law of cosines you learned in trigonometry class, where θ is the angle between the vectors a and b. Now, recall that the cosine of a 90-degree angle is zero, so from this equation it is clear that two nonzero vectors a and b are orthogonal (i.e. the angle between them is 90 degrees) exactly when their dot product is zero.
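The definitions so far can be sketched in a few lines of Python using only the standard library. The function names `dot`, `length`, and `angle_degrees` are illustrative choices for this post, not part of any particular library.

```python
import math

def dot(a, b):
    """Dot product: multiply corresponding components and sum the products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length from the Pythagorean Theorem; the same as sqrt(a . a)."""
    return math.sqrt(dot(a, a))

def angle_degrees(a, b):
    """Angle between two nonzero vectors, from a . b = |a| |b| cos(theta)."""
    cos_theta = dot(a, b) / (length(a) * length(b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

a = [3.0, 4.0, 0.0]
b = [-4.0, 3.0, 0.0]
print(length(a))            # 5.0
print(dot(a, b))            # 0.0, so a and b are orthogonal
print(angle_degrees(a, b))  # 90 degrees, up to round-off
```

Note that `angle_degrees` divides by the two lengths, so it is only meaningful when neither vector is the zero vector.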

If a vector has length one, we call it a unit vector. We can always form a unit vector that points in the same direction as an arbitrary nonzero vector b by applying the formula

$$
\mathbf{u} = \frac{\mathbf{b}}{\|\mathbf{b}\|}
$$

(Recall that multiplying or dividing a vector by a scalar means multiplying or dividing each component of the vector by the scalar.)

Finally, we can find the projection of a vector a in the direction of a unit vector u by computing the dot product between them:

$$
\text{projection of } \mathbf{a} \text{ onto } \mathbf{u} = \mathbf{a} \cdot \mathbf{u}
$$
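As a rough illustration of normalization and projection, here is a small Python sketch. The helper names `unit_vector` and `projection` are made up for this example, and the vectors are arbitrary.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def unit_vector(b):
    """Divide each component of b by the length of b."""
    norm = math.sqrt(dot(b, b))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return [bi / norm for bi in b]

def projection(a, u):
    """Signed length of a in the direction of the unit vector u."""
    return dot(a, u)

b = [3.0, 4.0]
u = unit_vector(b)      # b has length 5, so u is about [0.6, 0.8]
a = [2.0, 1.0]
p = projection(a, u)    # 2 * 0.6 + 1 * 0.8, about 2.0
print(u, p)
```

The projection is a single number, not a vector: it says how far a extends along the direction of u, and it is negative when a points away from u.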

Let’s consider an example for the case of three-dimensional vectors. We note that (1, 0, 0), (0, 1, 0), and (0, 0, 1) are all unit vectors; furthermore, they are all orthogonal to each other because the dot product between any pair of them is zero. Of course (1, 0, 0) points in the direction of the x-axis, (0, 1, 0) points in the direction of the y-axis, and (0, 0, 1) points in the direction of the z-axis. Because these three unit vectors are all orthogonal to each other, we say they are orthonormal (here “normal” refers to the fact that the vectors have a length of 1; they’ve been normalized).

Now, let $\mathbf{x}^T = (x_0, x_1, x_2)$ be an arbitrary vector in 3D space. (The superscript “T” means transpose; when we write a vector horizontally we refer to it as the transpose of the vertical vector. But I may be sloppy at times and let the context make clear whether a vector is horizontally or vertically oriented.) The dot product of x with (1, 0, 0) is $x_0$, the dot product of x with (0, 1, 0) is $x_1$, and the dot product of x with (0, 0, 1) is $x_2$. In other words, $x_0$, $x_1$, and $x_2$ are the projections of the vector x in the directions of the x, y, and z axes.
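These claims about the axis vectors are easy to check numerically. The following Python sketch uses an arbitrary example vector; the variable names are illustrative only.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Unit vectors along the x, y, and z axes.
axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Each vector has length one (dot product with itself is 1), and every
# pair of distinct vectors has a dot product of zero.
gram = [[dot(u, v) for v in axes] for u in axes]
print(gram)         # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

x = (5, -2, 7)      # arbitrary example vector

# Projecting x onto each axis direction returns the matching component.
projections = [dot(x, u) for u in axes]
print(projections)  # [5, -2, 7]
```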

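Furthermore, the unit vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) form an orthonormal basis for the set of all 3D vectors because any 3D vector can be written as a scaled sum of these three vectors. In particular:

$$
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix}
= x_0 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
+ x_1 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
+ x_2 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
$$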

Now we’re ready for a key insight into transforms. The vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) are not the only orthonormal basis vectors. For example, one easy way to get another set of orthonormal vectors is to rotate the unit vectors that point in the x-axis and y-axis directions by 45 degrees about the z-axis, while leaving the (0, 0, 1) vector unchanged. Doing this we get the three unit vectors (0.707, 0.707, 0), (-0.707, 0.707, 0), and (0, 0, 1), where 0.707 approximates 1/√2. It is easy to compute the dot product of every pair of these vectors and show that they are mutually orthogonal; it is also easy to see that all the vectors have a length of 1. What if we take the projection of $(x_0, x_1, x_2)$ onto each of these new vectors? We get

$$
\begin{aligned}
y_0 &= 0.707\,x_0 + 0.707\,x_1 \\
y_1 &= -0.707\,x_0 + 0.707\,x_1 \\
y_2 &= x_2
\end{aligned}
\qquad \text{(Eq. 1)}
$$

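Now we can express the vector $\mathbf{x}^T$ as a scaled sum of these three new unit vectors as follows:

$$
\mathbf{x}^T = y_0\,(0.707,\ 0.707,\ 0) + y_1\,(-0.707,\ 0.707,\ 0) + y_2\,(0,\ 0,\ 1)
\qquad \text{(Eq. 2)}
$$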

In other words, given the three projections $y_0$, $y_1$, and $y_2$, we can reconstruct the numbers $x_0$, $x_1$, and $x_2$ from them by using them to scale their corresponding unit vectors and then adding up the scaled vectors.
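To see the rotated basis in action, here is a short Python sketch that checks orthonormality, computes the three projections, and rebuilds the original vector. It uses the exact value 1/√2 rather than the rounded 0.707, and the example vector is arbitrary.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

c = math.sqrt(0.5)  # cos(45 deg) = sin(45 deg), about 0.707
basis = [(c, c, 0.0), (-c, c, 0.0), (0.0, 0.0, 1.0)]

# Orthonormality check: each vector has unit length, and the dot product
# of every pair of distinct vectors is zero (up to round-off).
for i, u in enumerate(basis):
    for j, v in enumerate(basis):
        expected = 1.0 if i == j else 0.0
        assert math.isclose(dot(u, v), expected, abs_tol=1e-12)

x = (1.0, 2.0, 3.0)  # arbitrary example vector

# Projections of x onto the new basis vectors (the role of Eq. 1).
y = [dot(x, u) for u in basis]

# Scale each basis vector by its projection and add (the role of Eq. 2).
x_rebuilt = [sum(y[k] * basis[k][i] for k in range(3)) for i in range(3)]

print(y)          # roughly [2.121, 0.707, 3.0]
print(x_rebuilt)  # roughly [1.0, 2.0, 3.0], up to round-off
```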

In fact, for any set of three orthonormal vectors, we can compute the projection of x onto each of the vectors, and given these projections, we can reconstruct x exactly as demonstrated above. Conceptually, we “transform” x into y by computing the projections onto the unit vectors of our orthonormal basis (eq. 1); given y, we “inverse transform” back to x by using the components of y to scale the basis vectors and then add them up (eq. 2). Every set of orthonormal basis vectors defines a transform.
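The same recipe works for any orthonormal basis, which the following Python sketch tries to make concrete. The function names `is_orthonormal`, `forward_transform`, and `inverse_transform` are hypothetical, and the basis used here is just one more orthonormal set picked for illustration; it is not claimed to be the transform used by any codec.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def is_orthonormal(basis, tol=1e-9):
    """True if every vector has length 1 and distinct pairs are orthogonal."""
    for i, u in enumerate(basis):
        for j, v in enumerate(basis):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, v) - expected) > tol:
                return False
    return True

def forward_transform(basis, x):
    """Transform x into y: project x onto each basis vector."""
    if not is_orthonormal(basis):
        raise ValueError("basis vectors must be orthonormal")
    return [dot(x, u) for u in basis]

def inverse_transform(basis, y):
    """Recover x from y: scale each basis vector by y[k] and add them up."""
    n = len(basis[0])
    return [sum(yk * u[i] for yk, u in zip(y, basis)) for i in range(n)]

# Another orthonormal basis for 3D space (an assumption for this example).
s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
basis = [
    (1 / s3, 1 / s3, 1 / s3),
    (1 / s2, -1 / s2, 0.0),
    (1 / s6, 1 / s6, -2 / s6),
]

x = [1.0, 2.0, 3.0]  # arbitrary example vector
y = forward_transform(basis, x)
x_back = inverse_transform(basis, y)
print(y)
print(x_back)  # matches x up to floating-point round-off
```

The orthonormality check matters: if the vectors are not orthonormal, scaling them by the projections and summing does not, in general, give back the original vector.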

When the goal is to compress the vector (i.e., the discrete signal) x, it may be easier to first transform x into a new vector y and compress y instead. Of course, given y, the receiver can apply the inverse transform to recover the original vector x. If the right orthonormal vectors are chosen (in other words, if the right transform is used), the benefit can be dramatic! But more on that in Part 3. First, we need to develop a more efficient formulation of the above concepts using matrix notation. That will be the topic of Part 2.

This entire series of blog posts is also available as a Cardinal Peak white paper.
