Copyright © 2003 John-Mark Gurney <>

Table of Contents

  1. Design Goals
  2. Problems to be solved
  3. Conventions
  4. Design Overview
  5. Hardware Overview
  6. Hardware Interface
    1. Encoder/Decoder
    2. Controller
    3. Device
      1. struct vid_port
      2. struct vid_param
      3. struct videobus_device
    4. VideoBus
      1. videobus_create()
      2. videobus_register()
    5. Example
  7. Internal Software Interface
  8. Software Interface
  9. Thoughts
  10. Glossary and Defines

Design Goals

  1. Generic multimedia support
  2. Simple API to program with
  3. Performance
  4. *BSD support?

Problems to be solved

  1. Limited hardware interfaces
  2. Different API's for different hardware
  3. Limited hardware support


Conventions

This section defines the conventions used by the VideoBSD framework.

  1. Code must conform to FreeBSD's style(9) guide.
  2. C99 will be used. This means all structs can/should use C99's sparse (designated) initialization when necessary.
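As a sketch of the second convention, C99's designated initializers let a driver set only the fields it cares about; the struct here is a made-up example, not part of the framework:

```c
#include <stdint.h>

/* Hypothetical struct for illustration; the real VideoBSD structs
 * are initialized the same way. */
struct vid_port_ex {
	const char	*vp_name;
	int32_t		vp_id;
	int32_t		vp_flags;
};

/* C99 designated initializers: name only the fields you set; the
 * compiler zero-fills the rest (vp_id here). */
static const struct vid_port_ex example_port = {
	.vp_name = "svideo",
	.vp_flags = 0x1,
};
```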

Design Overview

There are two logical parts to a video driver. The hardware part describes how a video device interacts with the system. The software part describes how end user programs process video data. This could mean using hardware to do accelerated MJPEG compression/decompression, or simply grabbing video data from a TV tuner.

The hardware side will only interest device driver writers. It will make it easier to connect devices together to get video data flowing between them, and out of the card. It will also help arbitrate Picture-in-Picture and the video bus. There will also be a userland implementation for devices that are implemented purely in userland, such as USB webcams and FireWire cameras.

The software side will link video sources to video sinks. This will include such things as an AVI file reader/writer along with hardware that can be used to capture/play video from the AVI.

Currently, I have not thought of how sound will integrate with this framework. Most capture cards have a separate interface to capture sound. There are already other APIs, such as the ASIO spec, that would be better followed for this. I am not sure how that will integrate with formats such as AVI which produce/consume an audio channel.

One idea I have for simple integration is a callback that will be called to get the current audio sample. This will be useful for capture, but for audio/video playback there needs to be a way to drop or duplicate a video frame (or drop/add audio frames) when getting out of sync. If you have suggestions, I am interested in hearing them.
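A minimal sketch of that callback idea, assuming an invented typedef name and signature (nothing here is settled design):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical callback: the framework would call this once per audio
 * sample during capture.  Name and signature are illustrative only. */
typedef int16_t (*vid_audio_cb)(void *arg);

/* Sketch of how capture code might pull samples alongside video frames. */
static size_t
pull_audio(vid_audio_cb cb, void *arg, int16_t *buf, size_t nsamples)
{
	size_t i;

	for (i = 0; i < nsamples; i++)
		buf[i] = cb(arg);
	return (i);
}

/* Trivial generator used only to exercise the sketch. */
static int16_t
saw_cb(void *arg)
{
	int16_t *state = arg;

	return ((*state)++);
}
```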

Hardware Overview

Video support is a lot more complex than most people think. Many people are most familiar with cards like the BT848 that have one source for video and one sink (or consumer) of video data. This is not always the case. Some cards, like MJPEG encoder/decoder cards, can have multiple sources and sinks on the same card. There is also the option of having two sources running (PIP), along with all of the sinks consuming the video data.

The first idea is to separate the device into its core parts. The source and sink parts of the card are really two separate devices that share a common video bus. A video bus connects a set of sources and sinks together.

As an example, the BT848 has a decoder chip that takes input from either the composite jack or the tuner and puts it on the video bus. The BT848 chip, or controller, then decodes the data on the video bus and puts it either into a buffer or into video memory. Each part of the BT848 (the tuner, decoder, and controller) has equivalents on other cards. This simplifies the API, as you program each part of the device instead of each device creating a macro set that the device supports. It also makes writing drivers easier, since you just define the capabilities of each part and tell the upper layer how to "program" each device.

The central part of the hardware interface is the video bus. This is a bus that all sources and sinks will be attached to. It will act as an arbiter to make sure that only the correct devices talk on the bus. It will also provide connection information for programs to control the device.

Hardware Interface

There are two types of devices. The first is the encoder/decoder. This converts video data from composite, S-Video, or another external source to the video bus. The second is the controller. The controller interfaces the computer with the video bus. This means capturing frames from the video bus, or putting frames on the video bus.

In the example of a Zoran MJPEG card, you have two encoder/decoder chips (one of each) and two controller chips. One controller is the MJPEG chip, and the other is the video controller that supports video overlay and frame capture. From a physical perspective the video overlay controller chip also controls the MJPEG chip, but from a stream point of view, they both produce separate streams of data. Each can be controlled individually and both can run to capture data at the same time. In the case of the BT848, the decoder chip has been integrated onto the controller chip. The BT848 still logically has a decoder and a controller.

Even though each has been given its own term, I am going to try to integrate both into the same API. This will mean that some of the settings for one will not make sense for the other, and vice versa. It will be documented which settings apply to each.

This interface should remain the same between modules for userland and modules for the kernel. There will be macros that do the correct thing on each side. Though the API will remain the same, this is mostly to avoid confusing the device writer. He will still have to be aware of what he can and cannot do when targeting the userland and kernel sides.


Encoder/Decoder

The encoder/decoder translates between an analog video signal and the digital video bus. It contains video port(s) and various conversion options. Because devices may have different options, there will be a user-definable set of options. We will define a set of well-known options that should be implemented if the device supports them. They include brightness and contrast, among others.


Controller

A controller is a device that translates video data between the video bus and the computer in some format. Most cards have at least one controller that can translate to/from standard RGB or YUV. This also includes MJPEG and MPEG encoder boards. It is conceivable that there is no controller on a videobus, if the bus is solely used for parameter adjustment, transformations, or picture-in-picture output.


Device

struct vid_port

Each device will have a set of possible inputs and outputs. Each device is required to present an array of struct vid_port's describing what input/output ports are present.

The struct vid_port is as follows:

struct vid_port {
	const char * const vp_name;
	int32_t	vp_id;
	const int32_t	vp_flags;
};

vp_name
A string containing the name of this port. It shall be either one of the well defined names: VID_PORT_SVIDEO, VID_PORT_COMPOSITE, VID_PORT_TUNER, VID_PORT_CAMERA, VID_PORT_BUFFER or a string for a custom defined type.

vp_id
This is a numeric id for the name. This is filled in by the videobus code when initialized. (Do we need this? Can we just pass around the pointers? I am implementing the standard names as const char *'s in the library, so a simple comparison in the driver between VID_PORT_SVIDEO and the one passed by the videobus code would work.)

vp_flags
This sets the options for the port. The defined flags are:

  1. This port is the default port when opened. The videobus code will do a back call to activate it. (Provide a way to override the default on open?)
  2. This port provides input to the videobus. A decoder device sets this as it generates input.
  3. This port provides output from the videobus. An encoder device sets this as it outputs videobus data.
  4. The buffer provided to this port must be contiguous on the bus. This may not be necessary on platforms that include an IOMMU (such as sparc64), which can make fragmented physical memory appear contiguous on subordinate buses.
  5. This video port is exclusive; no other video ports may be active while it is active.
  6. This flag is not to be set by the driver.
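A hypothetical driver might declare its port array like this. The VID_PORT_F_* flag names are placeholders I invented, since the draft describes the flags but does not name them, and string literals stand in for the well-known name constants:

```c
#include <stdint.h>
#include <stddef.h>

/* Placeholder flag names: invented for illustration only. */
#define	VID_PORT_F_INPUT	0x01	/* provides input to the videobus */
#define	VID_PORT_F_DEFAULT	0x02	/* default port on open */

struct vid_port {
	const char * const	vp_name;
	int32_t			vp_id;
	const int32_t		vp_flags;
};

/* A decoder with S-Video and composite inputs; a NULL vp_name
 * terminates the array, as required. */
static const struct vid_port decoder_ports[] = {
	{ .vp_name = "svideo",    .vp_flags = VID_PORT_F_INPUT | VID_PORT_F_DEFAULT },
	{ .vp_name = "composite", .vp_flags = VID_PORT_F_INPUT },
	{ .vp_name = NULL },
};

/* Count ports the way the videobus code would walk the array. */
static size_t
vid_port_count(const struct vid_port *vp)
{
	size_t n;

	for (n = 0; vp[n].vp_name != NULL; n++)
		;
	return (n);
}
```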

struct vid_param

There are a number of parameters that you can adjust for devices. This lets you enumerate each one and gives the valid ranges within which the user program can set them.

The struct vid_param is as follows:

struct vid_param {
	const char * const vp_name;
	vid_param_val const vp_min;
	vid_param_val const vp_max;
	vid_param_val const vp_dfl;
	vid_param_val	vp_val;
	int32_t	const vp_flags;
	int32_t	vp_id;
};

vp_name
A string containing the name of this parameter. It shall be either one of VID_PARAM_BRIGHTNESS, VID_PARAM_CONTRAST, VID_PARAM_SATURATION, VID_PARAM_HUE, VID_PARAM_CHANNEL or a string for a custom defined parameter.

vp_min
Minimum value this parameter can take, inclusive. (Do we want to use int64_t for these?)

vp_max
Maximum value this parameter can take, inclusive.

vp_dfl
Default value of this parameter.

vp_val
Current value of this parameter.

vp_flags
Flags for this parameter. The current valid flags are VID_PARAM_AUTO.

vp_id
Numeric id. The same comments as for vp_id in struct vid_port apply here.
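A sketch of a brightness parameter, assuming vid_param_val is an int32_t (the draft does not define the type) and using a string literal for the well-known name:

```c
#include <stdint.h>

typedef int32_t vid_param_val;	/* assumed: the draft does not define it */

#define	VID_PARAM_AUTO	0x01

struct vid_param {
	const char * const	vp_name;
	vid_param_val const	vp_min;
	vid_param_val const	vp_max;
	vid_param_val const	vp_dfl;
	vid_param_val		vp_val;
	int32_t	const		vp_flags;
	int32_t			vp_id;
};

/* Brightness ranging 0..255 with a mid-scale default. */
static struct vid_param brightness = {
	.vp_name = "brightness",
	.vp_min = 0,
	.vp_max = 255,
	.vp_dfl = 128,
	.vp_val = 128,
};

/* Clamp a requested value into the parameter's valid range, as a
 * driver might before programming the hardware. */
static vid_param_val
vid_param_clamp(const struct vid_param *p, vid_param_val v)
{
	if (v < p->vp_min)
		return (p->vp_min);
	if (v > p->vp_max)
		return (p->vp_max);
	return (v);
}
```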

struct videobus_device

This structure defines the parts that make up one of the devices on a VideoBus. The videobus_device structure is as follows:

struct videobus_device {
	struct vid_port * const vbd_ports;
	int32_t	vbd_portcnt;
	struct vid_param * const vbd_params;
	int32_t	vbd_paramcnt;
	const vid_port_cb vbd_portcb;
	const vid_param_cb vbd_paramcb;
};

vbd_ports
An array of struct vid_port's that provides a list of possible inputs and outputs. The last structure must have vp_name set equal to NULL.

vbd_portcnt
The number of elements in the vbd_ports array.

vbd_params
A list of parameters that can be changed for this device. The last structure must have vp_name set equal to NULL.

vbd_paramcnt
The number of elements in the vbd_params array.

vbd_portcb
The function that will be called to notify the device that a port is to be activated or deactivated.

vbd_paramcb
The function that will be called to modify the parameters of this device. The first argument is the const char * from one of the entries in the vbd_params array. The second argument is a pointer to a vid_param_val. The device will attempt to set the value pointed to, and the final result will be written back. The last parameter is the flags to set on the parameter.
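A sketch of filling in a videobus_device. The callback typedef signatures are inferred from the descriptions above and are assumptions; the const qualifiers and most struct fields are dropped for brevity:

```c
#include <stdint.h>
#include <stddef.h>

typedef int32_t vid_param_val;	/* assumed type */

/* Cut-down versions of the structs defined earlier. */
struct vid_port { const char *vp_name; int32_t vp_id; int32_t vp_flags; };
struct vid_param { const char *vp_name; vid_param_val vp_val; };

/* Assumed callback signatures, inferred from the field descriptions. */
typedef int (*vid_port_cb)(struct vid_port *, int activate);
typedef int (*vid_param_cb)(const char *name, vid_param_val *val, int32_t flags);

struct videobus_device {
	struct vid_port		*vbd_ports;
	int32_t			vbd_portcnt;
	struct vid_param	*vbd_params;
	int32_t			vbd_paramcnt;
	vid_port_cb		vbd_portcb;
	vid_param_cb		vbd_paramcb;
};

static int
mydev_portcb(struct vid_port *vp, int activate)
{
	vp->vp_flags = activate;	/* stand-in for touching hardware */
	return (0);
}

static int
mydev_paramcb(const char *name, vid_param_val *val, int32_t flags)
{
	(void)name; (void)flags;
	if (*val > 255)
		*val = 255;		/* write back what was actually set */
	return (0);
}

/* NULL vp_name terminates both arrays, per the descriptions above. */
static struct vid_port mydev_ports[] = {
	{ .vp_name = "svideo" }, { .vp_name = NULL },
};
static struct vid_param mydev_params[] = {
	{ .vp_name = "brightness" }, { .vp_name = NULL },
};

static struct videobus_device mydev = {
	.vbd_ports = mydev_ports,
	.vbd_portcnt = 1,
	.vbd_params = mydev_params,
	.vbd_paramcnt = 1,
	.vbd_portcb = mydev_portcb,
	.vbd_paramcb = mydev_paramcb,
};
```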


VideoBus

The VideoBus links together all the devices connected to a video bus. You first create a VideoBus, and then register each device with the bus. This lets different code modules be responsible for each device on the bus.


videobus_create()

struct videobus *videobus_create(void);

This creates a new Video Bus and initializes the necessary storage for it. If a NULL pointer is returned, the allocation failed.


videobus_register()

int videobus_register(struct videobus *, struct videobus_device *);

The function videobus_register registers the device with the VideoBus. If the registration was successful, a value of 0 is returned.
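The call sequence a driver would use can be shown against toy stub implementations of the two functions; the stubs are stand-ins for the real library, whose internals will differ:

```c
#include <stdlib.h>

struct videobus_device { int vbd_dummy; };	/* placeholder contents */

struct videobus {
	struct videobus_device	*vb_devs[8];	/* toy fixed-size table */
	int			vb_devcnt;
};

/* Stub: the real videobus_create() allocates whatever storage the
 * bus needs and returns NULL on failure. */
struct videobus *
videobus_create(void)
{
	return (calloc(1, sizeof(struct videobus)));
}

/* Stub: the real videobus_register() attaches the device to the bus
 * and returns 0 on success. */
int
videobus_register(struct videobus *vb, struct videobus_device *vbd)
{
	if (vb->vb_devcnt >= 8)
		return (-1);
	vb->vb_devs[vb->vb_devcnt++] = vbd;
	return (0);
}
```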


Example

Take a webcam that has an additional composite input on the back. There are two devices on the webcam. The first is the USB interface, which reads the video data from the video bus and captures the frames. The USB interface has only one buffer output port. The second is the video decoder, which has two ports: one is the CCD/CMOS sensor, and the other is the composite input. Depending upon how things are set up, the decoder will usually contain a number of parameters which are used to adjust the captured data. These usually include such things as brightness and contrast, among others.
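The webcam's two devices could declare their ports as follows, using the draft's well-known name strings; ids and flags are omitted to keep the sketch short:

```c
#include <stddef.h>

/* Cut-down port struct: only the name matters for this sketch. */
struct vid_port { const char *vp_name; };

/* Controller device: the USB interface, one buffer output port. */
static const struct vid_port usb_ctrl_ports[] = {
	{ "buffer" },
	{ NULL },
};

/* Decoder device: two input ports. */
static const struct vid_port cam_decoder_ports[] = {
	{ "camera" },		/* the CCD/CMOS sensor */
	{ "composite" },	/* the extra jack on the back */
	{ NULL },
};
```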

Internal Software Interface

This will describe the internal interface of the userland library. It will explain the overall architecture and design decisions.

The goals of the software library are:

  1. video device agnostic
  2. event driven (async behavior)
  3. high performance (no extra copies)
  4. loadable modules

Currently, event driven behavior is the most difficult.

Software Interface

The software interface is what applications use to talk with the video hardware. This includes selecting the ports on a device, reading/writing data from a buffer port, and adjusting the parameters of the device.


int vid_init(void);


int vid_getvideobuslist(struct vid_videobuslist **hwl);
struct vid_videobuslist {
	int vhl_cnt;
	struct {
		const char *vvb_name;
		uint32_t vvb_indx;
	} vhl_videobus[];
};

Returns a list of videobuses along with their descriptions. This is to be used by the user to choose which device s/he wants. There should be an option to hard code vvb_indx for specific videobuses so that config files can specify a device and the library is able to maintain a static ordering.

We should possibly assign a bit value to differentiate between dynamically discovered videobuses and statically configured videobuses. This way the application will know whether a videobus's index will be stable across runs.
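The trailing vhl_videobus[] above is a C99 flexible array member, so the whole list fits in a single allocation. A sketch of how the library might size it; the helper name is invented, not part of the API:

```c
#include <stdlib.h>
#include <stdint.h>

struct vid_videobuslist {
	int	vhl_cnt;
	struct {
		const char	*vvb_name;
		uint32_t	vvb_indx;
	} vhl_videobus[];		/* C99 flexible array member */
};

/* One calloc() covers the header plus n entries; entries start
 * zeroed, so vvb_name is NULL until filled in. */
static struct vid_videobuslist *
vid_videobuslist_alloc(int n)
{
	struct vid_videobuslist *vvl;

	vvl = calloc(1, sizeof(*vvl) + n * sizeof(vvl->vhl_videobus[0]));
	if (vvl != NULL)
		vvl->vhl_cnt = n;
	return (vvl);
}
```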


struct vid_vbhndl *vid_openvideobus(uint32_t indx);

This function opens the device specified by indx in the videobus list. It will return NULL on failure. (Document errno values when a failure occurs.) On success it returns a videobus handle for the device. The handle is an opaque type, and should be used for all future references to the device.


int vid_closevideobus(struct vid_vbhndl *);

This takes a videobus handle returned by vid_openvideobus and closes it. You must pay attention to the return value to make sure it closed properly. (Document the action to be taken on various failures.)
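The open/check/close discipline these two calls imply, shown against toy stand-ins (the real functions will do device work rather than bare allocation):

```c
#include <stdlib.h>
#include <stdint.h>

struct vid_vbhndl { uint32_t vh_indx; };	/* opaque in the real API */

static int vid_open_count;	/* tracks balance for the sketch only */

/* Stand-in: the real function looks up indx in the videobus list. */
struct vid_vbhndl *
vid_openvideobus(uint32_t indx)
{
	struct vid_vbhndl *vh = malloc(sizeof(*vh));

	if (vh != NULL) {
		vh->vh_indx = indx;
		vid_open_count++;
	}
	return (vh);
}

/* Stand-in: the real function releases the device; callers must
 * check the return value. */
int
vid_closevideobus(struct vid_vbhndl *vh)
{
	if (vh == NULL)
		return (-1);
	free(vh);
	vid_open_count--;
	return (0);
}
```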

int vid_getdevicelist(struct vid_vbhndl *,)


Returns the device list belonging to the videobus. (Same listing/struct as provided by drivers? If the same listing, no memory allocation by the user library; otherwise, do our own allocation?)




vid_getportlist(struct vid_vbhndl *, struct vid_device *);


Thoughts

This section will contain ramblings about current design issues that need to be solved. If you have any input, feel free to send in comments.


Do we want the library to be async? I currently think we do, as one passive goal is to make sure that X apps (or any apps) don't freeze. How do we handle it? On devices like USB we need to read from the device. I was originally thinking of launching a subprogram to handle this, but then I realized that we can watch for events on a kqueue. With a kqueue we can create a simple interface that I normally use, in which the data pointer in the knote points to a function pointer and a void * argument for the function. This would let the VideoBSD library integrate more easily into event-based programs.
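The knote-data idea reduces to a two-member struct and a dispatch step; the struct and helper names here are illustrative, and the actual kevent() plumbing is omitted:

```c
#include <stddef.h>

/* Stash a function pointer plus an argument where the knote's data
 * pointer would carry it. */
struct vid_event {
	void	(*ve_fn)(void *);
	void	*ve_arg;
};

/* In real code this runs when kevent() reports the knote ready;
 * here it is just the dispatch step in isolation. */
static void
vid_event_dispatch(struct vid_event *ve)
{
	ve->ve_fn(ve->ve_arg);
}

/* Trivial handler used only to exercise the sketch. */
static void
bump(void *arg)
{
	(*(int *)arg)++;
}
```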

The one disadvantage of this is that it may introduce an extra copy on USB devices. We need to find out whether USB copies the data from a buffer, and if it does, how hard it would be to do page mappings or something similar for USB access. If not, we may need to use async I/O, where we can submit the buffer to be written to, and then receive completion notification via a signal or kqueue.

Glossary and Defines

Device
This is not an entire capture card. A device is one interface on the video bus. This device may have multiple input/output ports, but can be operated on as a logically separate unit. This is used to isolate logically separate parts of the hardware.

Parameter
This is set on a per-device basis. It is usually something like brightness or contrast. A parameter is adjusted for all ports, so some might not be meaningful for all ports.

Port
This is an input/output part of a device. It represents the different ways data can be transferred to/from the video bus. Examples are an S-Video port, a composite port, or a video frame buffer.

VID_PORT_SVIDEO
S-Video device. Equivalent to the string "svideo".

VID_PORT_COMPOSITE
Composite device. Equivalent to the string "composite".

VID_PORT_TUNER
Tuner device, normally used to receive antenna or cable signals. Equivalent to the string "tuner".

VID_PORT_CAMERA
Camera device, normally a CCD or CMOS sensor in webcams. Equivalent to the string "camera".

VID_PORT_BUFFER
Memory buffer device, used to get data to/from the videobus. All devices should implement this port unless the device is only used for processing of video data and does not support capture/render to/from the computer. Equivalent to the string "buffer".

VideoBus
This is a collection of devices that are attached to a single videobus. It is assumed that any data put on the bus by any attached device will be received by all other attached devices.