The Many Ways to Stream Video using RTP and RTSP

Howdy Pierce

Something I end up explaining relatively often has to do with all the various ways you can stream video encapsulated in the Real-time Transport Protocol, or RTP, and still claim to be standards compliant.

Some background: RTP is used primarily to stream either H.264 or MPEG-4 video. RTP is a system protocol that provides mechanisms to synchronize the presentation different streams – for instance audio and video. As such, it performs some of the same functions as an MPEG-2 transport or program stream.

RTP – which you can read about in great detail in RFC 3550 – is codec-agnostic. This means it is possible to carry a large number of codec types inside RTP; for each protocol, the IETF defines an RTP profile that specifies any codec-specific details of mapping data from the codec into RTP packets. Profiles are defined for H.264, MPEG-4 video and audio, and many more. Even VC-1 – the “standardized” form of Windows Media Video – has an RTP profile.

In my opinion, the standards are a mess in this area. It should be possible to meet all the various requirements on streaming video with one or at most two different methods for streaming. But ultimately standards bodies are committees: each person puts in a pretty color, and the result comes out grey.

In fact, the original standards situation around MPEG-4 video was so confused that a group of large companies formed the Internet Streaming Media Alliance, or ISMA. ISMA’s role is basically to wade into all the different options presented in the standards and create a meta-standard – currently ISMA 2.0 – that ties a number of other standards documents together and tells you how to build a working system that will interoperate with other systems.

In any event, there are a number of predominant ways to send MPEG-4 or H.264 video using RTP, all of which follow some relevant standards. If you’re writing a decoder, you’ll normally need to address all of them, so here’s a quick overview.

Multicast delivery: RTP over UDP

In an environment where there is one source of a video stream and many viewers, ideally each frame of video and audio would only transit the network once.  This is how multicast delivery works. In a multicast network, each viewer must retrieve an SDP file through some unspecified mechanism, which in practice is usually HTTP. Once retrieved, the SDP file gives enough information for the viewer to find the multicast streams on the network and begin playback.

In the Multicast delivery scenario, each individual stream is sent on a pair of different UDP ports – one for data and the second for the related RTP Control Protocol or RTCP. That means for a video program consisting of a video stream and two audio streams, you’ll actually see packets being delivered to six UDP ports:

  1. Video data delivered over RTP
  2. The related RTCP port for the video stream
  3. Primary audio data delivered over RTP
  4. The related RTCP port for the primary audio stream
  5. Secondary audio data delivered over RTP
  6. The related RTCP port for the secondary audio stream

Timestamps in the RTP headers can be used to synchronize the presentation of the various streams.

As a side note, RTCP is almost vestigial for most applications. It’s specified in RFC 3550 along with RTP.  If you’re implementing a decoder you’ll need to listen on the RTCP ports, but you can almost ignore any data sent to you. The exceptions are the sender report, which you’ll need in order to match up the timestamps between the streams, and the BYE, which some sources will send as they tear down a stream.

Multicast video delivery works best for live content. Because each viewer is viewing the same stream, it’s not possible for individual viewers to be able to pause, seek, rewind or fast-forward the stream.

Unicast delivery: RTP over UDP

It’s also possible to send unicast video over UDP, with one copy of the video transiting the network for each client. Unicast delivery can be used for both live and stored content. In the stored content case, additional control commands can be used to pause, seek, and enter fast forward and rewind modes.

Normally in this case, the player first establishes a control connection to a server using the Real Time Streaming Protocol, or RTSP. In theory RTSP can be used over either UDP or TCP, but in practice it is almost always used over TCP.

The player is normally started with an rtsp:// URL, and this causes it to connect over TCP to the RTSP server. After some back-and-forth between the player and the RTSP server, during which the server sends the client an SDP file describing the stream, the server begins sending video to the client over UDP. As with the multicast delivery case, a pair of UDP ports is used for each of the elementary streams.

For seekable streams, once the video is playing, the player has additional control using RTSP: It can cause playback to pause, or seek to a different position, or enter fast forward or rewind mode.

RTSP Interleaved mode: RTP and RTSP over TCP

I’m not a fan of streaming video over TCP. In the event a packet is lost in the network, it’s usually worse to wait for a retransmission (which is what happens with TCP’s guaranteed delivery) than it is just to allow the resulting video glitch to pass through to the user (which is what happens with UDP).

However, there are a handful of different networking configurations that would block UDP video; in particular, firewalls historically have interacted badly with the two modes of UDP delivery summarized above.

So the RTSP RFC, in section 10.12, briefly outlines a mode of interleaving the RTP and RTCP packets onto the existing TCP connection being used for RTSP. Each RTP and RTCP packet is given a four-byte prefix and dropped onto the TCP stream. The result is that the player connects to the RTSP server, and all communication flows over a single TCP connection between the two.

HTTP Tunneled mode: RTP and RTSP over HTTP over TCP

You would think RTSP Interleaved mode, being designed to transmit video across firewalls, would be the end, but it turns out that many firewalls aren’t configured to allow connections to TCP port 554, the well-known port for an RTSP server.

So Apple invented a method of mapping the entire RTSP Interleaved communication on top of HTTP, meaning the video ultimately flows across TCP port 80. To my knowledge, this HTTP Tunneled mode is not standardized in any official RFC, but it’s so widely implemented that it has become a de-facto standard.

Categories: Howdy, Video

Cardinal Peak
Learn more about our Audio & Video capabilities.

Dive deeper into our IoT portfolio

Take a look at the clients we have helped.

We’re always looking for top talent, check out our current openings. 

Contact Us

Please fill out the contact form below and our engineering services team will be in touch soon.

We rely on Cardinal Peak for their ability to bolster our patent licensing efforts with in-depth technical guidance. They have deep expertise and they’re easy to work with.
Diego deGarrido Sr. Manager, LSI
Cardinal Peak has a strong technology portfolio that has complemented our own expertise well. They are communicative, drive toward results quickly, and understand the appropriate level of documentation it takes to effectively convey their work. In…
Jason Damori Director of Engineering, Biamp Systems
We asked Cardinal Peak to take ownership for an important subsystem, and they completed a very high quality deliverable on time.
Matt Cowan Chief Scientific Officer, RealD
Cardinal Peak’s personnel worked side-by-side with our own engineers and engineers from other companies on several of our key projects. The Cardinal Peak staff has consistently provided a level of professionalism and technical expertise that we…
Sherisse Hawkins VP Software Development, Time Warner Cable
Cardinal Peak was a natural choice for us. They were able to develop a high-quality product, based in part on open source, and in part on intellectual property they had already developed, all for a very effective price.
Bruce Webber VP Engineering, VBrick
We completely trust Cardinal Peak to advise us on technology strategy, as well as to implement it. They are a dependable partner that ultimately makes us more competitive in the marketplace.
Brian Brown President and CEO, Decatur Electronics
The Cardinal Peak team started quickly and delivered high-quality results, and they worked really well with our own engineering team.
Charles Corbalis VP Engineering, RGB Networks
We found Cardinal Peak’s team to be very knowledgeable about embedded video delivery systems. Their ability to deliver working solutions on time—combined with excellent project management skills—helped bring success not only to the product…
Ralph Schmitt VP, Product Marketing and Engineering, Kustom Signals
Cardinal Peak has provided deep technical insights, and they’ve allowed us to complete some really hard projects quickly. We are big fans of their team.
Scott Garlington VP Engineering, xG Technology
We’ve used Cardinal Peak on several projects. They have a very capable engineering team. They’re a great resource.
Greg Read Senior Program Manager, Symmetricom
Cardinal Peak has proven to be a trusted and flexible partner who has helped Harmonic to deliver reliably on our commitments to our own customers. The team at Cardinal Peak was responsive to our needs and delivered high quality results.
Alex Derecho VP Professional Services, Harmonic
Yonder Music was an excellent collaboration with Cardinal Peak. Combining our experience with the music industry and target music market, with Cardinal Peak’s technical expertise, the product has made the mobile experience of Yonder as powerful as…
Adam Kidron founder and CEO, Yonder Music
The Cardinal Peak team played an invaluable role in helping us get our first Internet of Things product to market quickly. They were up to speed in no time and provided all of the technical expertise we lacked. They interfaced seamlessly with our i…
Kevin Leadford Vice President of Innovation, Acuity Brands Lighting
We asked Cardinal Peak to help us address a number of open items related to programming our systems in production. Their engineers have a wealth of experience in IoT and embedded fields, and they helped us quickly and diligently. I’d definitely…
Ryan Margoles Founder and CTO, notion