Robust-Audio Tool (RAT)
~~~~~~~~~~~~~~~~~~~~~~~

Copyright (C) 1995-2001 University College London
All rights reserved. 

This software is distributed under license, see the file COPYRIGHT for full
terms and conditions.

Further information, and a list of frequently asked questions (with answers) 
is available from http://www-mice.cs.ucl.ac.uk/multimedia/software/rat/

See the file INSTALL.TXT for installation and compilation instructions.

Send comments, suggestions and bug-reports to <rat-trap@cs.ucl.ac.uk>. A
mailing list for general discussion on related issues is now available,
send mail to <rat-users-request@cs.ucl.ac.uk> to subscribe. All users of
rat are encouraged to join this list.

RAT is a network audio tool that allows users to particpate in audio
conferences over the internet. These can be between two participants
directly, or between a group of participants on a common multicast group.
No special features are required to use RAT in point-to-point mode, but to
use the multicast conferencing facilities of RAT, a connection to the
Mbone, or a similar multicast capable network, is required. RAT is based on
IETF standards, using RTP above UDP/IP as its transport protocol, and
conforming to the RTP profile for audio and video conferences with minimal
control. 

In addition to the features provided by other Mbone audio conferencing
tools RAT offers the following additional functionality: 

	- Sender based repair of damaged audio streams 
		- FEC in the form of redundant packet transmission 
		- Support for interleaved audio 
	- Received based repair of damaged audio streams 
	- Adaptive scheduling protection 
	- Secure conferencing 
	- Improved statistics and diagnostic features 
	- Conference coordination bus 
	- Transcoder operation 
	- Sound localisation and externalisation

This list is correct for RAT v3.2, earlier versions have a more limited feature
set. 

Sender based repair of damaged audio streams 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Unless some form of resource reservation protcol (eg: RSVP) is used, an IP
based network, such as the Internet or the Mbone, will occasionally lose
packets. These lost packets result in broken up audio, which rapidly
becomes unintelligible as the loss rate increases. RAT implements two
sender based repair schemes to recover from this problem: redundant
transmission and interleaving. 

Redundant transmission is the means by which a (more) heavily compressed
copy of a packet is piggy-backed onto the following packet. If the original
packet is lost, the redundant copy can be used in its place. Because the
redundant packet is very heavily compressed, sound quality suffers, but is
still better than having no audio to play out in the place of the lost
packet. Clearly, there exists a tradeoff between the amount of compression
used for the redundant packet (and hence stream bandwidth/overhead), and
the quality of the resultant audio. 

Redundant transmission was developed by UCL and INRIA Sophia-Antipolis, as
part of the MICE/MERCI multimedia conferencing projects. It is discussed
further in the following papers: 

	Vicky Hardman, Angela Sasse, Mark Handley and Anna Watson, 
	"Reliable Audio for Use over the Internet", in Proceedings of
	INET'95, June 1995, Honolulu, Hawaii. 

	Isidor Kouvelas, Orion Hodson, Vicky Hardman and Jon Crowcroft,
	"Redundancy Control in Real-Time Internet Audio Conferencing", in
	Proceedings of AVSPN 97, September 1997, Aberdeen, Scotland, UK. 

	Colin Perkins, Isidor Kouvelas, Orion Hodson, Vicky Hardman, Mark
	Handley, Jean-Chrysostome Bolot, Andres Vega-Garcia, Sacha
	Fosse-Parisis, "RTP Payload for Redundant Audio Data", IETF
	Audio/Video Transport Working Group, RFC2198, September 1997. 

	C. S. Perkins, O. Hodson & V. Hardman, "A Survey of Packet-Loss
	Recovery Techniques for Streaming Audio", IEEE Network Magazine,
	September/October 1998.

As an alternative to redundant transmission, recent versions of RAT provide
the option to send interleaved audio. Units of audio data are resequenced
before transmission, so that originally adjacent units are separated by a
guaranteed distance in the transmitted stream, and returned to their
original order at the receiver. Interleaving disperses the effect of packet
losses. If, for example, units are 5ms in length and packets 20ms (ie: 4
units per packet), then the first packet could contain units 1, 5, 9, 13;
the second packet would contain units 2, 6, 10, 14; and so on. It can be
seen that the loss of a single packet from an interleaved stream results in
multiple small gaps in the reconstructed stream, as opposed to the single
large gap which would occur in a non-interleaved stream. 

Although interleaving does not reduce the amount of loss observed, it does
significantly improve the perceived quality of an audio stream. The obvious
disadvantage of interleaving is that it increases latency. This limits the
use of this technique for interactive applications, although it performs
well for non-interactive use. The major advantage of interleaving is that
it does not increase the bandwidth requirements of a stream. 

Receiver based repair of damaged audio streams 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Receiver based recovery schemes rely on producing a replacement for a lost
packet which is similar to the original. This is possible since audio
signals, and in particular speech, exhibit large amounts of short-term self
similarity. As such, these techniques work for relatively small loss rates
(less than 15%), and for small packets (4-40ms). When the loss length
approaches the length of a phoneme (5-100ms) these techniques breakdown,
since whole phonemes may be missed by the listener. 

It is, therefore, clear that receiver based repair schemes are not a
substitute for sender-based repair, but rather work in tandem with it. A
sender-based scheme is used to repair most losses, leaving a small number
of isolated gaps to be repaired. Once the effective loss rate has been
reduced in this way, receiver based repair forms a cheap and effective
means of patching over the remaining loss. 

A number of receiver based repair schemes are implemented in RAT: 

	- Silence substituation 
	- Packet repetition 
	- Pattern matching repair 

A simple form of receiver based recovery is silence substitution. The gap
left by a lost packet is filled with silence, to maintain the timing
relationship between the surrounding packets. It is only effective with
short packet lengths (less than 4ms) and low loss rates (less than 2%),
making it suitable for striped audio with narrow and distributed stripes
over low loss paths. 

The performance of silence substitution degrades rapidly as packet sizes
increase, and quality is unacceptably bad for the 40ms packet size in
common use in network audio conferencing tools. Despite this, the use of
silence substitution is widespread, primarily because it is simple to
implement. 

Packet repetition replaces lost packets with copies of the packets that
arrived immediately before the loss.  It has low computational complexity
and performs reasonably well. The subjective quality of repetition is
improved by gradually fading repeated units. The GSM system, for example,
advocates the repetition of the first 20ms with the same amplitude and
followed by fading the repeated signal to zero amplitude over the next
320ms. 

The use of repetition with fading is a good compromise between the poor
performance of silence substitution, and the more complex pattern matching
scheme. 

Pattern matching repair uses audio before and after the loss to interpolate
a suitable signal to cover the loss. It performs somewhat better than
packet repetition, but is significantly more computationally intesive. 

Adaptive Scheduling Protection 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Current general purpose operating systems, such as Unix and Windows 95, do
not provide adequate support for real-time services in their scheduling
algorithms. RAT uses a novel adaptive algorithm, where the DMA driven audio
playout is used to `cushion' the system against scheduling anomolies. This
is described in the following paper: 

	Isidor Kouvelas and Vicky Hardman, " Overcoming Workstation
	Scheduling Problems in a Real-Time Audio Tool", in Proceedings of
	Usenix Annual Technical Conference, January 1997, Anaheim,
	California. 

Secure Conferencing 
~~~~~~~~~~~~~~~~~~~

RAT allows for secure conferencing, whereby media streams and participant
identity information can be encrypted using DES. Other encryption
algorithms can easily be added. 

Improved Statistics and diagnostic features 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like other RTP-based audio tools, RAT provides reception quality
statistics and user information for all participants in a
conference. In addition, it has a graphical display of the loss
to/from each participant, making diagnosis of problems a simple
matter.

Conference coordination bus 
~~~~~~~~~~~~~~~~~~~~~~~~~~~

RAT implements a conference coordination bus, whereby the user
interface and media engine are separated, and communicate via an IPC
mechanism. This allows for complete control of RAT by another process
operating on the same host. Advantages of this split approach include:

	- Customised user-interface: the existing RAT user interface can
	  easily be replaced, with no loss of functionality.   

	- Lip-synchronisation: RAT can communicate with a videa tool, to
	  synchronise audio and video.

	- Integration with wide area conference control: a separate
  	  conference control process may be run on the same host as the
	  audio/video tools. This can use the conference bus to control the
	  media tools, to provide, for example H.323 conference control. 

Transcoder operation 
~~~~~~~~~~~~~~~~~~~~

When the bandwidth available is not constant for all participants in a
conference, or when some participants do not have multicast capable
access, the RAT transcoder/gateway may be used. This connects two
multicast groups, or one multicast group and a single unicast
host. RTP packets received from either group are transcoded into the
format specified for the other group, multiple sources are mixed
together, and the resulting stream is transmitted to the other
group. This allows for different codecs to be used in each group,
meaning that the bandwidth requirements are different.

Future research 
~~~~~~~~~~~~~~~

Future developments planned include: 

	- Additional codecs, for low bandwidth speech and high quality music. 
	- Investigation of different error recovery techniques. 

Project Background
~~~~~~~~~~~~~~~~~~

UCL has a well-established multimedia and communications research
group.  One of the current research topics is multimedia conferencing
applications over the Internet Mbone. We are in the process of
developing a next generation audio tool, RAT, for use over the
Internet, as part of a suite of integrated multimedia tools. The
research is aimed at providing software systems on heterogeneous
workstations (Unix and Windows based). The group has existing
collaborative and industrially funded application piloting projects.

Sponsors
~~~~~~~~

The RAT project was funded by the EPSRC under the Multimedia and
Networked Applications Programme, Bristish Telecommunications plc, and
the European Commision (Telematics Applications Programme, Research
Sector).  It has benefited from hardware donations by Hewlett-Packard
and Sun Microsystems, and software donations by Microsoft.