Posted on Sep 15, 2022
We are proud to announce that we have decided to open source two formerly internal Oddity projects that help us read, process, and distribute video streams. Today, we're open sourcing a Rust library called video-rs that can read, write, encode, and decode video. video-rs is built on top of ffmpeg. The library is currently focused on real-time video applications, but we intend to grow it into a more generic video library. One project that already uses video-rs internally is Oddity's RTSP server. RTSP is a protocol that is at the heart of video surveillance. We often found ourselves needing an RTSP server for redistribution, restreaming, or other purposes, and decided to build one ourselves. We are open sourcing both projects in the hope of enriching the ecosystem of video libraries in Rust.
Oddity.ai is a worldwide leader in violence detection, using deep learning to create a safer world. Ever since we started in 2020, we have been focused on developing an algorithm that detects aggression with the highest possible accuracy and the fewest possible false positives. Building the most accurate AI has always been our primary goal, and it is what we are best known for. But to make our technology work in actual, real-world camera surveillance systems, we need another important component. That component is what we call our "engine platform". It is responsible for reading security camera feeds, running them through our AI model, and sending alarms to our customers' video management systems. The engine must be stable and robust. It must run unsupervised for years, without maintenance and without failing. Some of our early installations have been running continuously ever since they were installed.
Usually, the security cameras themselves are the most error-prone part of a video surveillance installation. They can be down due to network issues or maintenance, or they might have broken, often due to weather or vandalism. A camera might also reset itself and revert to an incompatible profile. And from time to time, a camera will appear to be working but doesn't actually produce any video. In all these cases, our engine must log the issue and continue analysis on the camera streams that are still up. Moreover, when a camera is repaired, our engine needs to pick up where it left off. From experience, we learned that a well-built and well-tuned library for video retrieval is key to ensuring the stability of the engine. An added benefit is that we can use the same library for many other purposes that come up during experimentation, model validation, model debugging, and video surveillance implementations.
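The "pick up where it left off" behavior boils down to a retry loop with capped exponential backoff. The sketch below is purely illustrative, not our actual engine code; the starting delay, doubling strategy, and 60-second cap are hypothetical parameters.

```rust
use std::time::Duration;

/// Compute a capped exponential backoff delay for the given reconnect attempt.
/// Hypothetical policy: start at 1 second, double each attempt, cap at 60 seconds.
fn backoff(attempt: u32) -> Duration {
    let secs = 1u64.checked_shl(attempt).unwrap_or(u64::MAX).min(60);
    Duration::from_secs(secs)
}

fn main() {
    // A real engine would sleep `backoff(n)` between reconnect attempts,
    // resetting `n` to zero once frames start flowing again.
    for attempt in 0..8 {
        println!("attempt {}: wait {:?}", attempt, backoff(attempt));
    }
}
```

Capping the delay matters in practice: a camera can stay down for days, and without a cap the engine would effectively stop retrying.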
Over time, the part of our engine that handles video input grew into its own separate component, and then into a library. video-rs is now a dependency in many of Oddity's projects, and we believe it is mature enough to be an open source project as well!
The video-rs library is a high-level wrapper over ffmpeg. The goal is to make video tasks, such as encoding or muxing, much easier than they would be with the raw ffmpeg API. For example, transcoding a video to H.264 is as easy as this:
use video_rs::{self, Decoder, Encoder, EncoderSettings, Locator};

fn main() {
    let source = Locator::Path(std::path::PathBuf::from("input.mp4"));
    let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));

    let mut decoder = Decoder::new(&source).expect("failed to create decoder");

    let (width, height) = decoder.size();
    let encoder_settings =
        EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
    let mut encoder =
        Encoder::new(&destination, encoder_settings).expect("failed to create encoder");

    decoder.decode_iter().for_each(|decoded| {
        let (timestamp, frame) = decoded.expect("failed to decode");
        encoder
            .encode(&frame, &timestamp)
            .expect("failed to encode");
    });
}
We designed the API to be simple and foolproof. For example, we opted to create encoding presets (like for_h264_yuv420p) for the configurations users are most likely to need. Using video-rs should lead to less code compared to using the ffmpeg API directly. Of course, some custom or advanced use cases might still require native ffmpeg.
The example below shows the power of ndarray combined with video-rs. In just a couple of lines of code, we can draw an animation and produce an MP4 video file encoded by ffmpeg:
use video_rs::{self, Encoder, EncoderSettings, Locator, Time};

fn main() {
    let width = 800_i32;
    let height = 600_i32;
    let color_bg = [240, 240, 240];
    let color_cube = [197, 0, 0];

    let destination = Locator::Path(std::path::PathBuf::from("output.mp4"));
    let encoder_settings =
        EncoderSettings::for_h264_yuv420p(width as usize, height as usize, false);
    let mut encoder =
        Encoder::new(&destination, encoder_settings).expect("failed to create encoder");

    let duration: Time = std::time::Duration::from_nanos(1_000_000_000 / 60).into();
    let mut position = Time::zero();

    let center_x = width / 2;
    let center_y = height / 2;
    for size in 4..520 {
        // Using some Pythagoras magic to draw a circle that grows bigger and bigger!
        let frame = ndarray::Array3::from_shape_fn(
            (height as usize, width as usize, 3),
            |(y, x, c)| {
                let dx = (x as i32 - center_x).abs();
                let dy = (y as i32 - center_y).abs();
                let d = ((dx.pow(2) + dy.pow(2)) as f64).sqrt();
                if d < size.into() { color_cube[c] } else { color_bg[c] }
            },
        );

        encoder
            .encode(&frame, &position)
            .expect("failed to encode frame");

        position = position.aligned_with(&duration).add();
    }

    encoder.finish().expect("failed to finish encoding");
}
Growing red circle produced by the code above.
To learn more about video-rs, head over to GitHub. video-rs is also published on crates.io. To use it in your project, add the following to your dependencies in Cargo.toml:
video-rs = "0.1"
To use it with ndarray, enable the corresponding feature like so:
video-rs = { version = "0.1", features = ["ndarray"] }
Our first use for video-rs was building an in-house RTSP server. We had tried a couple of approaches to building an RTSP server before; earlier implementations were in C or C++. Building the RTSP server in Rust turned out to be the perfect choice. Rust is well-suited for network programming: low-level enough for implementing a protocol like RTSP, yet high-level enough that we were able to complete the first prototype in a couple of weeks. Also, Rust's strictness is particularly useful when building a protocol parser, where even the tiniest bugs can take hours to debug and solve.
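To illustrate why that strictness helps, here is a minimal sketch of parsing an RTSP request line. This is not code from oddity-rtsp-server; the types and error handling are simplified for illustration, and a real parser would accept more versions and methods.

```rust
/// A parsed RTSP request line, e.g. "DESCRIBE rtsp://10.0.0.100/stream/0 RTSP/1.0".
#[derive(Debug, PartialEq)]
struct RequestLine {
    method: String,
    uri: String,
    version: String,
}

fn parse_request_line(line: &str) -> Result<RequestLine, String> {
    let mut parts = line.trim_end().split(' ');
    // The compiler forces us to handle every missing-field case explicitly:
    // there is no way to "forget" that `next()` can return `None`.
    let method = parts.next().ok_or("missing method")?;
    let uri = parts.next().ok_or("missing URI")?;
    let version = parts.next().ok_or("missing version")?;
    if parts.next().is_some() {
        return Err("trailing data after version".to_string());
    }
    if version != "RTSP/1.0" {
        return Err(format!("unsupported version: {version}"));
    }
    Ok(RequestLine {
        method: method.to_string(),
        uri: uri.to_string(),
        version: version.to_string(),
    })
}

fn main() {
    let line = "DESCRIBE rtsp://10.0.0.100/stream/0 RTSP/1.0";
    let parsed = parse_request_line(line).expect("valid request line");
    println!("{:?}", parsed);
}
```

Because every malformed input becomes an explicit `Err` value rather than silently-wrong state, the kinds of parser bugs that take hours to track down in C tend to be caught at compile time or surface immediately.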
Let's look at some of the ways we are using oddity-rtsp-server.
Old or cheap surveillance cameras often have strict limits on the number of RTSP streams, or on the maximum outbound bandwidth. Some scenarios require more consumers than the camera supports. Oddity's RTSP server can be used to consume one stream from a security camera, and then produce an unlimited number of streams for any client that needs one.
To configure such a scenario, download and install Oddity's RTSP server using the instructions on GitHub.
This is an example of a configuration file that can be used for restreaming. In this example, the RTSP server reads a stream from an RTSP camera with IP 10.0.0.100 and relays the stream to clients on the path /relayed/camera/1.
server:
  host: 0.0.0.0
  # Note that using port 554 requires elevated permissions!
  port: 554

media:
  - name: "Camera 1"
    path: "/relayed/camera/1"
    kind: stream
    source: "rtsp://10.0.0.100/stream/0"
No matter how many clients connect to the server at the same time, from the perspective of the camera, there is only a single connection.
Restreaming can be useful in certain network scenarios as well. Some networks are locked down in such a way that RTSP streams cannot reach their desired destination. For example, an inaccessible server might have bound itself to port 554, which can't be reached through the network firewall. In that case, you might want to configure Oddity's RTSP server to bind to an accessible port and relay streams as needed.
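A hypothetical configuration for that scenario might bind the relay to an unprivileged, firewall-accessible port such as 8554 (the port number and camera address here are illustrative):

```yaml
server:
  host: 0.0.0.0
  # 8554 does not require elevated permissions and can be opened in the firewall.
  port: 8554

media:
  - name: "Relayed Camera"
    path: "/relayed/camera/1"
    kind: stream
    source: "rtsp://10.0.0.100/stream/0"
```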
Most video management systems have no method of inserting fake video streams for testing. We often find ourselves wanting to test our violence detection system in an end-to-end fashion. One option would be to load one of our office cameras into the VMS, and then do some pretend fighting. Of course, we got a bit tired of that at some point.
To get around this, our RTSP server can be configured to play a video on repeat and produce an RTSP stream from it that looks like a real camera stream to a VMS. To do this, we use a configuration similar to this:
server:
  host: 0.0.0.0
  port: 554

media:
  - name: "Fake Camera 1"
    path: "/always-violence"
    kind: file
    source: "/data/video/violence_01.mp4"
While testing our system, we have run into hundreds of different camera models, from cheap $100 IP cameras to $10k military-grade PTZ cameras with 32x optical zoom. They do not always work as expected. After inspecting the logs, we often find some quirky or erroneous RTSP implementation detail that causes availability issues. Reproducing such issues is notoriously hard. Nowadays, we reimplement the faulty behavior in our own RTSP server to replicate the buggy camera. Using this strategy, we have been able to fix issues much quicker.
In the process of developing the best violence detection algorithm, we learned about the wonderful world of video and all its complexities. We ended up developing some very useful software to deal with it, and as those projects matured, open sourcing them felt like the right choice. We hope that by doing this, we can help enrich the Rust ecosystem, especially in the domain of video.
Oddity still is, and will stay, an AI company first. There are many people in the industry who know far more about the domain of video than we do. Hopefully, by open sourcing our code, we can learn from them as well. We are open to contributions!