
RTP, RTSP, RTMP, and ONVIF: Why Streaming Has So Many Protocols (and Why That’s Actually a Good Thing)

If you’ve ever tried to pull a video stream from a camera, publish a livestream, or set up remote monitoring on an industrial site, you’ve probably wondered: why on earth are there so many video protocols? RTP, RTSP, RTMP, ONVIF — it’s like someone spilled alphabet soup on the networking stack.
The truth is less chaotic and more historical: each protocol solves a very specific problem, born in a very specific era, and the world never stopped needing any of them. Real-time video is messy, networks are imperfect, and “one protocol to rule them all” never quite happened.
Let’s walk through how these technologies differ, where they came from, and why they all still coexist today — plus take a look at the modern protocols competing for attention.

RTP: the delivery truck that actually carries your video packets

RTP (Real-time Transport Protocol) dates back to the mid-90s — the dawn of internet telephony and video conferencing. Its whole purpose is simple: carry audio and video with correct timing over the network.
Think of RTP as the UPS truck of streaming:
  • It delivers packets quickly.
  • It labels them with timestamps and sequence numbers.
  • It doesn’t ask what’s inside — codec, resolution, whatever, that’s someone else’s problem.
  • It uses UDP most of the time, because TCP's retransmissions and strict in-order delivery stall the whole stream whenever a packet is lost.
RTP is almost never used alone. It’s paired with RTCP, a side-channel that reports losses and delays. Together, they power:
  • SIP / VoIP calls
  • video conferencing
  • WebRTC media transport (via encrypted SRTP)
  • IP cameras inside surveillance systems
RTP’s job is low-level and utilitarian, which is exactly why it’s everywhere.
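To make the "labels" concrete, here is a minimal Python sketch that decodes the 12-byte fixed RTP header described in RFC 3550 and counts gaps in the sequence numbers. The names `RtpHeader`, `parse_rtp_header`, and `count_missing` are made up for this illustration, and the sketch ignores CSRC lists and header extensions.

```python
import struct
from typing import List, NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,           # top two bits, always 2 in practice
        marker=bool(second & 0x80),   # often marks the last packet of a video frame
        payload_type=second & 0x7F,   # identifies the payload format, not parsed here
        sequence=sequence,
        timestamp=timestamp,
        ssrc=ssrc,
    )


def count_missing(sequences: List[int]) -> int:
    """Count skipped sequence numbers, allowing for 16-bit wraparound.

    Assumes packets arrive in order and without duplicates.
    """
    missing = 0
    for prev, cur in zip(sequences, sequences[1:]):
        missing += (cur - prev - 1) % 65536
    return missing
```

A receiver does essentially this bookkeeping before it reports losses back over RTCP; real implementations also handle reordering and duplicates.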

RTSP: the remote control for media streams

If RTP is the truck, RTSP (Real-Time Streaming Protocol) is the remote control telling the truck where to drive.
Created by RealNetworks, Netscape, and Columbia University in the 90s (yes, that Netscape), RTSP is a control protocol — similar in style to HTTP, but designed specifically for media.
Its commands look like Netflix buttons for robots:
  • DESCRIBE — “what streams do you have?”
  • SETUP — “send video here, audio there”
  • PLAY — “go!”
  • PAUSE — “hold on”
  • TEARDOWN — “we’re done here”
RTSP doesn’t carry video itself. It negotiates and coordinates the RTP streams that carry it.
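On the wire, those commands are plain text requests that look a lot like HTTP. The sketch below formats RTSP 1.0 requests in Python without sending anything. The helper names, the `trackID=1` control URL, the client ports, and the session ID are all hypothetical; a real client takes the track URL from the SDP returned by DESCRIBE and the session ID from the SETUP response.

```python
from typing import Dict, List, Optional


def rtsp_request(method: str, url: str, cseq: int,
                 headers: Optional[Dict[str, str]] = None) -> bytes:
    """Format one RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def example_session(url: str) -> List[bytes]:
    """Build the DESCRIBE -> SETUP -> PLAY -> TEARDOWN sequence for one track."""
    track = url + "/trackID=1"   # hypothetical: real value comes from the SDP
    session = "12345678"         # hypothetical: real value comes from the SETUP reply
    return [
        rtsp_request("DESCRIBE", url, 1, {"Accept": "application/sdp"}),
        rtsp_request("SETUP", track, 2,
                     {"Transport": "RTP/AVP;unicast;client_port=5000-5001"}),
        rtsp_request("PLAY", url, 3, {"Session": session}),
        rtsp_request("TEARDOWN", url, 4, {"Session": session}),
    ]
```

Note that the SETUP request only names the UDP ports where the client wants RTP and RTCP delivered; the video itself never travels inside these messages.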
Why surveillance loves RTSP:
  • Precise real-time control
  • Low latency
  • Works with many codecs (H.264/H.265/AAC/etc.)
  • Universal in IP camera ecosystems
You’ve almost certainly seen RTSP URLs like:
rtsp://user:pass@192.168.1.10:554/Streaming/Channels/101
That's RTSP in its natural habitat.
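If you need to take such a URL apart, Python's standard `urllib.parse` does the job. This is a small sketch with made-up helper names (`split_rtsp_url`, `redact_credentials`); the path layout after the host is vendor-specific, so the code treats it as an opaque string.

```python
from urllib.parse import urlsplit


def split_rtsp_url(url: str) -> dict:
    """Break an rtsp:// URL into host, port, credentials, and path."""
    parts = urlsplit(url)
    if parts.scheme != "rtsp":
        raise ValueError(f"not an RTSP URL: {url!r}")
    return {
        "host": parts.hostname,
        "port": parts.port or 554,   # 554 is the registered default RTSP port
        "username": parts.username,
        "password": parts.password,
        "path": parts.path,
    }


def redact_credentials(url: str) -> str:
    """Return the URL without user:pass, safe to write to logs."""
    info = split_rtsp_url(url)
    query = urlsplit(url).query
    suffix = f"?{query}" if query else ""
    return f"rtsp://{info['host']}:{info['port']}{info['path']}{suffix}"
```

Because the credentials sit in the URL in plain text, it is worth running something like `redact_credentials` before the URL lands in a log file or a screenshot.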

RTMP: the Flash-era survivor still handling the front line

RTMP (Real-Time Messaging Protocol) comes from Macromedia — the pre-Adobe company behind Flash. Back in the 2000s, Flash was how the entire internet watched video, so RTMP became the go-to protocol for livestreaming.
Key traits:
  • Runs over TCP
  • Persistent low-latency connection
  • Multiplexed audio/video/control in one channel
  • Historically the fastest way to push live video to the web
Flash is dead (RIP 2020), but RTMP refuses to die. Today it’s used primarily as an ingest protocol for streamers:
  • OBS → RTMP → media server → viewers via HLS/WebRTC
It’s not fashionable anymore, but it’s reliable — like an old pickup truck that still starts on the first try.
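The "one channel" trait is easier to picture with a toy example. In RTMP every message carries a type ID: 8 for audio, 9 for video, 18 and 20 for AMF0 data and commands, and 1 through 6 for protocol control. The Python sketch below sorts already-reassembled messages back into separate streams. The `demultiplex` function and its `(type_id, payload)` input format are assumptions for illustration; a real implementation first has to reassemble messages from RTMP chunks.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Message type IDs from the RTMP specification.
MESSAGE_KINDS = {8: "audio", 9: "video", 18: "data", 20: "command"}


def demultiplex(messages: Iterable[Tuple[int, bytes]]) -> Dict[str, List[bytes]]:
    """Group (type_id, payload) pairs from one connection by kind."""
    streams: Dict[str, List[bytes]] = defaultdict(list)
    for type_id, payload in messages:
        if 1 <= type_id <= 6:
            kind = "control"   # chunk size, acknowledgements, bandwidth, and so on
        else:
            kind = MESSAGE_KINDS.get(type_id, "other")
        streams[kind].append(payload)
    return dict(streams)
```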

ONVIF: not a streaming protocol at all — but the reason IP cameras finally agree on something

ONVIF often gets lumped in with streaming protocols, but that’s a misunderstanding.
ONVIF is not a video transport protocol.
It’s a device interoperability standard used by IP cameras, NVRs, and video management software (VMS).
It covers:
  • device discovery
  • management of camera settings
  • pulling stream URLs
  • configuring video profiles
  • authentication & security
  • PTZ control
  • event handling
ONVIF uses SOAP, XML, and WS-Discovery under the hood — not exactly glamorous tech, but extremely practical.
And here’s the key reason it matters:
👉 ONVIF typically gives you the camera’s RTSP stream.
They’re teammates, not competitors.
Without ONVIF, every camera manufacturer would still be serving video via a secret proprietary protocol from 2008. No thanks.
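For a feel of what "SOAP, XML, and WS-Discovery" means in practice, here is a hedged Python sketch of the discovery step: it builds a WS-Discovery Probe for ONVIF video devices and multicasts it to 239.255.255.250 on UDP port 3702. The function names are invented for this example, and the reply handling is deliberately crude (raw XML per responding address). Asking a camera for its stream URL is a separate, authenticated SOAP call that is not shown.

```python
import socket
import uuid
import xml.etree.ElementTree as ET
from typing import List, Tuple

WS_DISCOVERY_ADDR = ("239.255.255.250", 3702)  # standard WS-Discovery multicast group

PROBE_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"
            xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"
            xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery"
            xmlns:dn="http://www.onvif.org/ver10/network/wsdl">
  <e:Header>
    <w:MessageID>uuid:{message_id}</w:MessageID>
    <w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>
    <w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>
  </e:Header>
  <e:Body>
    <d:Probe>
      <d:Types>dn:NetworkVideoTransmitter</d:Types>
    </d:Probe>
  </e:Body>
</e:Envelope>"""


def build_probe() -> bytes:
    """Return a Probe message with a fresh MessageID."""
    return PROBE_TEMPLATE.format(message_id=uuid.uuid4()).encode("utf-8")


def discover(timeout: float = 2.0) -> List[Tuple[str, bytes]]:
    """Multicast one probe and collect (sender_ip, raw_xml_reply) until timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    replies = []
    try:
        sock.sendto(build_probe(), WS_DISCOVERY_ADDR)
        while True:
            data, addr = sock.recvfrom(65535)
            replies.append((addr[0], data))
    except socket.timeout:
        pass  # no more replies within the timeout
    finally:
        sock.close()
    return replies
```

Cameras that answer include their ONVIF service address in the reply, which is where the follow-up management calls go.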

Why so many protocols? Because video streaming has wildly different requirements

Real-time video is a battlefield of conflicting needs:
  1. Latency
  • WebRTC: wants under 200 ms
  • Surveillance: 300–800 ms is fine
  • Streaming to millions: 5–30 seconds is normal
  2. Network conditions
  • UDP is fast, but lost packets stay lost unless a higher layer recovers them
  • TCP gets through almost everywhere, but its retransmissions add delay
  • Firewalls/NATs hate anything non-HTTP
  3. Scale
  • 1 operator watching 50 cameras → RTSP
  • 3 million viewers → HLS/DASH + CDN
  4. Ecosystem baggage
  • RTMP survived because streamers needed it
  • ONVIF survived because camera vendors needed standardization
  • RTP survived because nothing else delivers low-latency packets better
No single protocol solves all of this. So we stack them like Lego bricks.
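The trade-offs above can be boiled down to a toy rule of thumb in Python. The function name and both thresholds (1,000 viewers, 300 ms) are assumptions picked to match the rough numbers in this article, not industry constants, and real deployments usually combine several of these outputs.

```python
def suggest_delivery(latency_budget_ms: int, viewers: int) -> str:
    """Very rough first guess at a delivery protocol for one use case."""
    if viewers > 1000:            # assumed cutoff where a CDN becomes the main concern
        return "HLS/DASH + CDN"
    if latency_budget_ms < 300:   # assumed cutoff for interactive, conversational use
        return "WebRTC"
    return "RTSP/RTP"             # small audience, sub-second latency is enough
```

For example, an operator wall with 50 cameras and a 500 ms budget lands on RTSP/RTP, while a public broadcast to millions lands on HLS/DASH regardless of how low you would like the latency to be.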

Other major protocols for delivering video over the Internet

Welcome to the rest of the zoo.

HLS (HTTP Live Streaming)

  • Made by Apple
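  • Splits video into short MPEG-TS (.ts) or fragmented MP4 segments listed in an .m3u8 playlist
  • Plays natively in Safari and on Apple devices; other browsers often rely on a JavaScript player such as hls.js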
  • Stable and scalable, but high latency
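The latency comes straight from the segment model, which a few lines of Python can show. An HLS media playlist is a text file where each `#EXTINF` tag gives the duration of the segment named on the next line. The playlist below is a made-up sample, and `parse_media_playlist` is a minimal sketch that ignores every other tag.

```python
from typing import List, Tuple

# Hypothetical live playlist with three 6-second segments.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.0,
segment120.ts
#EXTINF:6.0,
segment121.ts
#EXTINF:6.0,
segment122.ts
"""


def parse_media_playlist(text: str) -> List[Tuple[str, float]]:
    """Return (segment_uri, duration_seconds) pairs in playlist order."""
    segments = []
    duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>".
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#") and duration is not None:
            segments.append((line, duration))
            duration = None
    return segments


def buffered_seconds(text: str) -> float:
    """Total media time covered by the segments in the playlist."""
    return sum(duration for _, duration in parse_media_playlist(text))
```

A player that downloads all three sample segments before it starts is holding 18 seconds of video, which is why classic HLS sits in the multi-second range.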

MPEG-DASH

  • Like HLS, but published as an international standard (ISO/IEC 23009-1)
  • Adaptive bitrate streaming
  • Works great with CDNs
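Adaptive bitrate means the player picks, segment by segment, which rendition it can afford. A minimal Python sketch of that choice follows; `pick_rendition`, the bitrate ladder in the usage note, and the 0.8 safety factor are assumptions for illustration, and production players use more elaborate heuristics that also look at buffer level.

```python
from typing import Sequence


def pick_rendition(bitrates_kbps: Sequence[int], measured_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a fraction of measured throughput.

    Falls back to the lowest rendition when nothing fits.
    """
    if not bitrates_kbps:
        raise ValueError("the bitrate ladder must not be empty")
    budget = measured_kbps * safety
    affordable = [b for b in bitrates_kbps if b <= budget]
    return max(affordable) if affordable else min(bitrates_kbps)
```

With a ladder of 400, 1200, 3000, and 6000 kbps and a measured 4000 kbps, the budget is 3200 kbps, so the player would request the 3000 kbps rendition for the next segment.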

WebRTC

  • Ultra-low latency (<200 ms)
  • Peer-to-peer or through servers
  • Mandatory encryption
  • Perfect for calls, support sessions, and real-time surveillance previews

SRT

  • Open-source protocol from Haivision
  • Designed for unreliable networks
  • Adds retransmissions, jitter correction, encryption
  • Used in broadcast and remote contribution

RIST

  • Similar to SRT
  • Telecom/enterprise focus
  • Specified by the Video Services Forum, an industry group from the broadcast engineering world

MPEG-TS over UDP (or over RTP)

  • The old-school IPTV backbone
  • Extremely stable in controlled networks
  • Still used by ISPs and TV operators

Proprietary P2P camera protocols

Consumer cameras often bundle their own encrypted P2P mechanisms over HTTPS/WebSocket to get through NAT: convenient for users, mysterious for developers.

Bottom line

  • RTP = the truck hauling your media packets
  • RTSP = the remote control managing playback and stream setup
  • RTMP = the Flash-era ingest protocol that refuses to retire
  • ONVIF = the universal language of IP cameras
They’re not rivals. They’re layers of the same ecosystem.
Real-time video is inherently messy: networks drop packets, firewalls misbehave, cameras misbehave even more, and users expect everything to “just work.” That’s why the ecosystem never converged on one protocol — each of these technologies fills a niche that still matters.
And until the laws of physics change, we’re going to keep stacking protocols like pancakes: ONVIF to discover the camera and hand over its RTSP URL, RTSP to set up the RTP streams, and a server on top that outputs HLS for viewers and WebRTC for low-latency operators.
The good news? It all works — and when it works, it feels like magic.