Open Sourcing our RTSP Server and video-rs

Posted on Sep 15, 2022




We are proud to announce that we have decided to open source two formerly internal Oddity projects that help us read, process and distribute video streams. Today, we're open sourcing a Rust library called video-rs that can read, write, encode and decode video. video-rs is built on top of ffmpeg. The library is currently focused on real-time video applications, but we intend to grow it into a more generic video library. One project we have been using internally already uses video-rs: Oddity's RTSP server. RTSP is a protocol that is at the heart of all video surveillance. We often found ourselves needing an RTSP server for redistribution, restreaming, or other purposes, and decided to build one ourselves. We are open sourcing both projects hoping to enrich the ecosystem of video libraries in Rust.

Why Video is Important

Oddity.ai is a worldwide leader in violence detection, using deep learning to create a safer world. Ever since we started in 2020, we have been focused on developing an algorithm that can detect aggression with the highest possible accuracy and the fewest possible false positives. Building the most accurate AI has always been our primary goal, and it is what we are best known for. But to make our technology work in actual, real-world camera surveillance systems, we require another important component. That component is what we call our "engine platform". It is responsible for reading security camera feeds, running them through our AI model, and sending alarms to the video management systems of our customers. The engine must be stable and robust. It must run unsupervised for years without maintenance and without failing. Some of our early installations have been running continuously ever since they were installed.

Usually, the security cameras themselves are the most error-prone part of a video surveillance installation. They can be down due to network issues or maintenance, or they might be broken, often due to the weather or vandalism. In some cases a camera resets itself and reverts to a non-compatible profile. Additionally, from time to time, a camera will appear to be working but not actually produce any video. In all cases, our engine must log the issue and continue analysis on the camera streams that are still up. Moreover, when the camera is repaired, our engine needs to pick up where it left off. From experience, we learned that having a well-built and well-tuned library for video retrieval is key to ensuring the stability of the engine. An added benefit is that we can use the same library for many other purposes that come up during experimentation, model validation, model debugging and video surveillance implementations.
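To make the failure modes above a bit more concrete, here is a minimal sketch of per-camera health tracking. It is not code from our engine: `CameraMonitor`, `Health`, the one-second base delay and the exponential backoff policy are all hypothetical choices made for illustration. It separates a camera that is down from one that is connected but has stopped delivering frames, and it computes how long to wait before trying to reconnect.

```rust
use std::time::{Duration, Instant};

/// Health of one camera stream (hypothetical states for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    /// Frames are arriving.
    Up,
    /// Connected, but no frame arrived within the stall timeout.
    Stalled,
    /// The last connection attempt or read failed.
    Down,
}

/// Tracks the health of a single camera (hypothetical helper type).
pub struct CameraMonitor {
    stall_timeout: Duration,
    max_backoff: Duration,
    last_frame: Instant,
    failures: u32,
    health: Health,
}

impl CameraMonitor {
    pub fn new(now: Instant, stall_timeout: Duration, max_backoff: Duration) -> Self {
        Self { stall_timeout, max_backoff, last_frame: now, failures: 0, health: Health::Up }
    }

    /// Call whenever a frame was decoded successfully.
    pub fn on_frame(&mut self, now: Instant) {
        self.last_frame = now;
        self.failures = 0;
        self.health = Health::Up;
    }

    /// Call whenever connecting to or reading from the camera failed.
    pub fn on_error(&mut self) {
        self.failures = self.failures.saturating_add(1);
        self.health = Health::Down;
    }

    /// Detects cameras that look alive but have stopped producing video.
    pub fn check(&mut self, now: Instant) -> Health {
        let silent_for = now.saturating_duration_since(self.last_frame);
        if self.health == Health::Up && silent_for >= self.stall_timeout {
            self.health = Health::Stalled;
        }
        self.health
    }

    /// Exponential backoff: 1s, 2s, 4s, ... capped at `max_backoff`.
    pub fn retry_delay(&self) -> Duration {
        if self.failures == 0 {
            return Duration::ZERO;
        }
        let exponent = (self.failures - 1).min(16);
        Duration::from_secs(1u64 << exponent).min(self.max_backoff)
    }
}
```

In a sketch like this, the engine would keep one monitor per camera, so a failing camera only affects its own state while analysis on the other streams continues.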

video-rs

Over time, the part of our engine that handles video input grew into its own separate component, and then into a library. In many of Oddity's projects, video-rs is now a dependency. Furthermore, we believe it is mature enough to be an open source project as well!

The video-rs library is a high-level wrapper over ffmpeg. The goal is to make video tasks, such as encoding or muxing, much easier than they would be with the ffmpeg API directly. For example, transcoding a video to H264 is as easy as this:

use video_rs::{self, Locator, Decoder, Encoder, EncoderSettings};

fn main() {
  let source = Locator::Path(std::path::PathBuf::from("input.mp4"));
  let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));
  let mut decoder = Decoder::new(&source)
    .expect("failed to create decoder");

  let (width, height) = decoder.size();
  let encoder_settings = EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
  let mut encoder = Encoder::new(&destination, encoder_settings)
    .expect("failed to create encoder");

  decoder
    .decode_iter()
    .for_each(|decoded| {
      let (timestamp, frame) = decoded
        .expect("failed to decode");
      encoder.encode(&frame, &timestamp)
        .expect("failed to encode");
    });
}

We designed the API to be simple and foolproof. For example, we opted to create certain encoding presets (like for_h264_yuv420p) that users will be likely to need. Using video-rs should lead to less code compared to ffmpeg. Of course, some custom or advanced use cases might still require using native ffmpeg.

The example below shows the power of ndarray combined with video-rs. In just a couple lines of code, we can draw an animation and produce an MP4 video file encoded by ffmpeg:

use video_rs::{
  self,
  Locator,
  Encoder,
  EncoderSettings,
  Time,
};

fn main() {
  let width = 800_i32;
  let height = 600_i32;
  let color_bg = [240, 240, 240];
  let color_cube = [197, 0, 0];
  let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));
  
  let encoder_settings = EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
  let mut encoder = Encoder::new(&destination, encoder_settings)
    .expect("failed to create encoder");
  
  let duration: Time = std::time::Duration::from_nanos(1_000_000_000 / 60).into();
  let mut position = Time::zero();
  
  let center_x = width / 2;
  let center_y = height / 2;
  for size in 4..520 {
    // Using some Pythagoras magic to draw a circle that grows bigger and bigger!
    let frame = ndarray::Array3::from_shape_fn(
      (height as usize, width as usize, 3),
      |(y, x, c)| {
        let dx = (x as i32 - center_x).abs();
        let dy = (y as i32 - center_y).abs();
        let d = ((dx.pow(2) + dy.pow(2)) as f64).sqrt();
        if d < size.into() { color_cube[c] } else { color_bg[c] }
      });
      
    encoder.encode(&frame, &position)
      .expect("failed to encode frame");
    
    position = position.aligned_with(&duration).add();
  }

  encoder.finish()
    .expect("failed to finish encoding");
}

Growing red circle produced by above code.

To learn more about video-rs, head over to GitHub. video-rs is also published on crates.io. To use it in your project, add the following to your dependencies in Cargo.toml:

video-rs = "0.1"

To use it with ndarray, enable the corresponding feature like so:

video-rs = { version = "0.1", features = ["ndarray"] }

Oddity's RTSP Server

Our first use for video-rs was building an in-house RTSP server. We had tried a couple of approaches to building an RTSP server before. Earlier implementations were in C or C++. Building the RTSP server in Rust turned out to be the perfect choice. Rust is well-suited for network programming, low-level enough for implementing a protocol like RTSP, and high-level enough that we were able to complete the first prototype in a couple of weeks. Also, Rust's strictness is particularly useful when building a protocol parser, where even the tiniest bugs can take hours to debug and solve.
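As a small illustration of how that strictness helps, here is a sketch of parsing an RTSP request line such as `DESCRIBE rtsp://10.0.0.100/stream/0 RTSP/1.0`. It is not taken from oddity-rtsp-server: the names `Method`, `RequestLine`, `ParseError` and `parse_request_line` are hypothetical, and only a subset of the methods from RFC 2326 is covered.

```rust
/// A subset of the RTSP methods defined in RFC 2326.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Options,
    Describe,
    Setup,
    Play,
    Pause,
    Teardown,
}

/// A parsed request line; the URI borrows from the input.
#[derive(Debug, PartialEq, Eq)]
pub struct RequestLine<'a> {
    pub method: Method,
    pub uri: &'a str,
}

/// Every way parsing can fail is spelled out in the type.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Malformed,
    UnknownMethod,
    UnsupportedVersion,
}

/// Parses `METHOD SP URI SP RTSP/1.0`, with an optional trailing CRLF.
pub fn parse_request_line(line: &str) -> Result<RequestLine<'_>, ParseError> {
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    let mut parts = line.split(' ');

    // Exactly three non-empty, space-separated parts are required.
    let (method, uri, version) =
        match (parts.next(), parts.next(), parts.next(), parts.next()) {
            (Some(m), Some(u), Some(v), None) if !m.is_empty() && !u.is_empty() => (m, u, v),
            _ => return Err(ParseError::Malformed),
        };

    let method = match method {
        "OPTIONS" => Method::Options,
        "DESCRIBE" => Method::Describe,
        "SETUP" => Method::Setup,
        "PLAY" => Method::Play,
        "PAUSE" => Method::Pause,
        "TEARDOWN" => Method::Teardown,
        _ => return Err(ParseError::UnknownMethod),
    };

    if version != "RTSP/1.0" {
        return Err(ParseError::UnsupportedVersion);
    }

    Ok(RequestLine { method, uri })
}
```

Because the result is a `Result` over explicit enums, the caller has to decide what to do with every failure case, and any `match` on `Method` stops compiling as soon as a new method is added but not handled.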

Let's look at some of the ways we are using oddity-rtsp-server.

Restreaming

Old or cheap surveillance cameras often have strict limits on the number of RTSP streams or on the maximum outbound bandwidth. Some scenarios require more consumers than the camera supports. Oddity's RTSP server can be used to consume one stream from a security camera, and then produce unlimited streams for any clients that need them.

To configure such a scenario, download and install Oddity's RTSP server using the instructions on GitHub.

This is an example of a configuration file that can be used for restreaming. In this example, the RTSP server reads a stream from an RTSP camera with IP 10.0.0.100 and relays the stream to clients on the path /relayed/camera/1.

server:
  host: 0.0.0.0
  # Note that using port 554 requires elevated permissions!
  port: 554
media:
  - name: "Camera 1"
    path: "/relayed/camera/1"
    kind: stream
    source: "rtsp://10.0.0.100/stream/0"

No matter how many clients connect to the server at the same time, from the perspective of the camera, there is only a single connection.
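The idea of one upstream connection feeding many clients can be sketched with nothing but standard library channels. This is a simplified illustration and not how oddity-rtsp-server is implemented: `Fanout`, `subscribe` and `publish` are hypothetical names, and packets are stand-in values rather than real RTP data.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

/// Distributes every packet from a single source to all subscribers
/// (hypothetical helper type for this sketch).
pub struct Fanout<T: Clone> {
    subscribers: Vec<Sender<T>>,
}

impl<T: Clone> Fanout<T> {
    pub fn new() -> Self {
        Self { subscribers: Vec::new() }
    }

    /// Registers a new client and returns the receiving end of its queue.
    pub fn subscribe(&mut self) -> Receiver<T> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends a copy of `packet` to every client and forgets clients whose
    /// receiver was dropped. Returns the number of clients still connected.
    pub fn publish(&mut self, packet: &T) -> usize {
        self.subscribers.retain(|tx| tx.send(packet.clone()).is_ok());
        self.subscribers.len()
    }
}
```

The source is read once and `publish` is called once per packet, regardless of how many clients have subscribed. Note that the channels in this sketch are unbounded, so a slow client would accumulate packets in memory; a real server would need to bound or drop them.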

Restreaming can be useful in certain network scenarios as well. Some networks are locked down in such a way that RTSP streams cannot get to their desired destination. For example, an inaccessible server might have bound itself to port 554, which can't be reached through the network firewall. In that case, you might want to configure Oddity's RTSP server to bind to an accessible port and relay streams as needed.

Integration Testing

Most video management systems have no method of inserting fake video streams for testing. We often find ourselves wanting to test our violence detection system in an end-to-end fashion. One option would be to load one of our office cameras into the VMS, and then do some pretend fighting. Of course, we got a bit tired of that at some point.

To get around this, our RTSP server can be configured to play a video on repeat and produce an RTSP stream from it that looks like a real camera stream to a VMS. To do this, we use a configuration similar to this:

server:
  host: 0.0.0.0
  port: 554
media:
  - name: "Fake Camera 1"
    path: "/always-violence"
    kind: file
    source: "/data/video/violence_01.mp4"

Debugging

While testing our system, we have run into hundreds of different camera models, from cheap $100 IP cameras to $10k military-grade PTZ cameras with 32x optical zoom. They do not always work as expected. After inspecting the logs, we often find some quirky or erroneous RTSP implementation detail that causes availability issues. Reproducing such issues is notoriously hard. Nowadays, we reimplement the buggy behavior in our own RTSP server to replicate the behavior of the buggy camera. Using this strategy, we have been able to fix issues much more quickly.

Contributions Welcome!

In the process of developing the best violence detection algorithm, we learned about the wonderful world of video and all its complexities. We ended up developing some very useful software to deal with it, and as those projects matured, open sourcing them felt like the right choice. We hope that by doing this, we can help enrich the Rust ecosystem, especially in the domain of video.

Oddity still is, and will stay, an AI company first. There are many people in the industry who know far more about the domain of video than we do. Hopefully, by open sourcing our code, we can learn from those people as well. We are open to contributions!