<!DOCTYPE ARTICLE PUBLIC "-//Davenport//DTD DocBook V3.0//EN" [
]>

<article id="index">
 <artheader>
  <authorgroup>
   <author>
    <firstname>Carsten</firstname>
    <surname>Haitzler</surname>
    <affiliation>
     <orgname>Red Hat Software, Inc.</orgname>
     <address>
     <email>raster@redhat.com</email>
     </address>
     </affiliation>
   </author>
   </authorgroup>
   <copyright>
    <year>1999</year>
    <holder>Red Hat, Inc.</holder>
   </copyright>
  <abstract id="abstract">
    <para>
     Esound (also referred to as ESD) is a small sound daemon for both
     Linux and UNIX. ESD was created to provide a
     consistent and simple interface to the audio device, so
     applications do not need to have different driver support written
     per architecture. It was also designed to enhance the capabilities of
     audio devices, for example by allowing more than one application to share
     an open device. ESD accomplishes these things while remaining transparent
     to the application, meaning that the application developer can
     simply provide ESD support and let it do the rest. On top of
     this, the API is designed to be very similar to the current audio
     device API, making it easy to port to ESD.
    </para>
  </abstract>
 <title>Esound</title>
 </artheader>
 <sect1 id="overview">
  <title>Overview</title>
  <para> 
   Esound (ESD) is a stand-alone sound daemon which abstracts the system
   sound device to multiple clients.  Under Linux using the Open Sound
   System (OSS), as well as other UNIX systems, typically only one
   process may open the sound device. This is not acceptable in a desktop
   environment like GNOME, as it is expected that many applications will
   be making sounds (music decoders, event based sounds, video
   conferencing, etc). The ESD daemon connects to the sound device and
   accepts connections from multiple clients, mixing the incoming audio
   streams and sending the result to the sound device.  Connections are
   only allowed to clients which can authenticate successfully,
   alleviating the concern that unauthorized users can eavesdrop via the
   sound device. In addition to accepting client connections from the
   local machine, ESD can be configured to accept client connections from
   remote hosts which authenticate successfully.  
  </para>
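  <para>
   The mixing step described above can be illustrated with a minimal
   sketch. This is not ESD's actual mixing code: the function names
   <function>clamp16</function> and <function>mix_streams</function>
   are hypothetical, and the sketch assumes that both clients already
   deliver signed 16-bit samples at the same rate. It simply sums the
   two streams and clips the result to the 16-bit range.
  </para>
  <programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate sum into the signed 16-bit sample range. */
static int16_t clamp16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * Mix two client buffers of signed 16-bit samples into out.
 * All three buffers hold n samples; out may alias a or b.
 * (Hypothetical helper for illustration, not part of libesd.)
 */
void mix_streams(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* Sum in 32 bits so the addition itself cannot overflow. */
        out[i] = clamp16((int32_t)a[i] + (int32_t)b[i]);
    }
}
]]></programlisting>
  <para>
   Clipping is the simplest way to keep the sum in range; a real mixer
   might instead scale each stream down before adding.
  </para>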
  <para>
   Applications wanting to contact the ESD daemon do so using the libesd
   library.  Much like with file I/O, an ESD connection is first
   opened. The ESD daemon will be spawned automatically by libesd if a daemon
   is not already present. Data is then either read from or written to the
   ESD daemon.  For an ESD client local to the machine which the ESD
   daemon is running on, the data is transferred through a local socket,
   then written to the sound device by the ESD daemon.  For a client on a
   remote machine, the data is sent by libesd on the remote machine over
   the network to the ESD daemon. The process is completely transparent
   to the application using ESD.
  </para>
   <figure id="esd-diagram">
    <title>The ESD Process</title>
    <screenshot>
     <screeninfo>The ESD Process</screeninfo>
     <Graphic Format="gif" Fileref="./figs/diagram" srccredit="dcm">
     </graphic>
    </screenshot>
   </figure>
  </sect1>
  <sect1 id="bit-stream">
   <title>Bit Stream</title>
   <para>
    ESD will automatically convert an incoming stream from a client to the
    best format which is supported by the sound device.  Therefore, an ESD
    client does not need to be concerned with the actual format the
    device uses.
   </para>
   <para>
    This alleviates the common problem of having to write code for each
    different platform which determines the possible formats available.  A
    developer just selects a format to use and relies upon ESD to map that
    as best possible to the platform the application is running on.
   </para>
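   <para>
    As an illustration of the kind of mapping involved, the sketch
    below converts unsigned 8-bit mono samples to signed 16-bit
    stereo. It is only an assumption about what such a conversion
    could look like, not ESD's own code; the function name
    <function>u8_mono_to_s16_stereo</function> is hypothetical, and
    sample rate conversion is left out entirely.
   </para>
   <programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/*
 * Convert n unsigned 8-bit mono samples into signed 16-bit stereo.
 * out must have room for 2 * n samples (left and right interleaved).
 * (Hypothetical helper for illustration, not part of libesd.)
 */
void u8_mono_to_s16_stereo(const uint8_t *in, int16_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* Re-centre around zero (128 is silence), then scale to 16 bits. */
        int16_t s = (int16_t)(((int)in[i] - 128) * 256);
        out[2 * i]     = s;  /* left channel */
        out[2 * i + 1] = s;  /* right channel */
    }
}
]]></programlisting>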
   <para>
    ESD also supports recording from, as well as writing to, the audio
    device. The API allows different programs to record and write
    simultaneously if your audio device is <emphasis>full
    duplex</emphasis>, meaning that the device is able to digitize
    analog audio input and convert digital audio to analog output at
    the same time. Many common sound cards, such as Sound Blaster
    cards, are not full duplex. Such a device may be able to play in
    16 bits while recording in 8 bits, but cannot play and record in
    16 bits on both streams. For the purposes of this document, full
    duplex means being able to record and play at the same bit
    resolution, the same rate, and the same number of channels.
   </para>
   <para>
    In addition to streams, ESD also supports sample caching. The client can
    upload a sample of audio, tag it by a name, and receive an ID tag for
    that sample. At any point the client can ask for the sample to be freed from
    ESD's memory. The sample can be shared among several programs, which
    allows instant playback of sounds (for example, spot effects), and
    calls to the server do not block while long samples are played.
   </para>
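   <para>
    The upload, lookup, and free cycle described above can be pictured
    with a small in-memory table. This is a sketch under stated
    assumptions and does not reflect how ESD stores samples internally:
    the names <function>cache_upload</function>,
    <function>cache_find</function>, and
    <function>cache_free</function>, the fixed slot count, and the name
    length limit are all hypothetical.
   </para>
   <programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 16   /* assumed limit, for illustration only */
#define NAME_LEN    32   /* names longer than this are truncated */

/* One cached sample: a name tag plus a private copy of the audio data. */
struct cached_sample {
    int    in_use;
    char   name[NAME_LEN];
    void  *data;
    size_t length;
};

static struct cached_sample cache[CACHE_SLOTS];

/* Store a copy of the sample under name; return its ID, or -1 on failure. */
int cache_upload(const char *name, const void *data, size_t length)
{
    int id;
    for (id = 0; id < CACHE_SLOTS; id++) {
        if (!cache[id].in_use) {
            void *copy = malloc(length ? length : 1);
            if (copy == NULL)
                return -1;
            if (length > 0)
                memcpy(copy, data, length);
            strncpy(cache[id].name, name, NAME_LEN - 1);
            cache[id].name[NAME_LEN - 1] = '\0';
            cache[id].data = copy;
            cache[id].length = length;
            cache[id].in_use = 1;
            return id;
        }
    }
    return -1; /* cache is full */
}

/* Look up a sample by name so another client can share it; -1 if absent. */
int cache_find(const char *name)
{
    int id;
    for (id = 0; id < CACHE_SLOTS; id++) {
        if (cache[id].in_use && strcmp(cache[id].name, name) == 0)
            return id;
    }
    return -1;
}

/* Release the sample's memory; return 0 on success, -1 for a bad ID. */
int cache_free(int id)
{
    if (id < 0 || id >= CACHE_SLOTS || !cache[id].in_use)
        return -1;
    free(cache[id].data);
    cache[id].data = NULL;
    cache[id].in_use = 0;
    return 0;
}
]]></programlisting>
   <para>
    Because the table keeps its own copy of the data, a client could
    release its buffer right after uploading, and any other program
    that knows the name could obtain the same ID.
   </para>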
   <para>
    Furthermore, if the audio driver of the Linux kernel
    supports mixing at the driver level, e.g., ALSA, ESD can act as a
    simple front-end to ALSA. This allows mixing on older kernels and
    non-Linux platforms, as well as mixing via the device when
    available. 
   </para>
 </sect1>
 <sect1 id="advantages">
  <title>Instant Advantages</title>
   <para>
    ESD provides the application developer with some instant advantages: 
   </para>
   <ITEMIZEDLIST mark="bullet">
    <listitem>
     <para>
      ESD has the ability to do network transparent audio if desired.
     </para>
    </listitem>
    <listitem>
     <para>
      ESD can restrict ownership of the audio device to one user, such as
      <emphasis>audio</emphasis>, and then grant authentication keys
      to specific users for access. Removing users is as simple
      as changing the authentication key. 
     </para>
    </listitem>
    <listitem>
     <para>
      Programs that can only output 16-bit stereo 44.1 kHz audio, and so
      are unable to handle ``lesser'' audio devices, can still run, as ESD
      will mix down automatically and transparently for the application.
     </para>
    </listitem>
    <listitem>
     <para>
      With ESD more than one application can access the sound device at once.
     </para>
    </listitem>
    <listitem>
     <para>
      Applications that are not ESD-enabled can be fooled into acting as
      ESD applications by running them through ESD's
      <command>esddsp</command> hack:
     <programlisting>
     <command>esddsp app_name -parameters to -the application</command>
     </programlisting> 
     This will redirect the application to use ESD instead of
     <hardware>/dev/dsp</hardware>. 
     </para>
    </listitem>
    <listitem>
     <para>
      You can monitor all mixed output to the audio
      device. <command>esdmon</command> is a quick example of
      this. This is useful for being able to do waveform displays for
      audio output from your computer.
     </para>
    </listitem>    
   </itemizedlist>
 </sect1>  
 <sect1 id="problems">
  <title>Problems</title>
   <para>
    ESD is by no means perfect, but it is a small, manageable project and thus
    can easily be expanded and modified to meet the needs of
    applications.
   </para>
   <para>
    Several problems currently in ESD are:
   </para>
     <ITEMIZEDLIST mark="bullet">
      <listitem>
       <para>
        Lag could be reduced inside of ESD's own mixing
        routines. 
       </para>
      </listitem>
      <listitem>
       <para>
        ESD needs better audio client management support (similar
        to what window managers and the ICCCM provide for X). 
      </para>
     </listitem>
     <listitem>
      <para>
       ESD suffers from a lack of <emphasis>real-time</emphasis>
       processing. It is liable to "crackle" and become unable to keep
       up in piping and mixing audio to the device if it does not get
       sufficient CPU time-slices for a period of time. This
       problem is hard to overcome without
       making ESD an <command>SUID</command> root process so that it
       could usurp a higher priority. 
      </para>
     </listitem>
     <listitem>
      <para>
       Authentication is simplistic, as ESD only accepts a single
       authentication key. 
      </para>
     </listitem>
    </itemizedlist>
 </sect1> 
 <sect1 id="references">
  <title>References</title>
   <para>
    Websites:
   </para>
   <ITEMIZEDLIST mark="bullet">
    <listitem>
     <para>
      <ulink url="http://www.tux.org/~ricdude/EsounD.html"
      type="html">Esound</ulink>
     </para>
    </listitem>
    <listitem>
     <para>
      <ulink url="http://www.alsa-project.org/api.html"
      type="html">ALSA</ulink>
     </para>
    </listitem>
    <listitem>
     <para>
      <ulink url="http://www.4front-tech.com/pguide/" type="html">OSS
      </ulink>
     </para>
    </listitem>
   </itemizedlist>
 </sect1>
</article>


