The most direct way is to talk to the kernel sound drivers. Linux has two:
Open Sound System (OSS) comes in two versions: OSS/Free, which is free software maintained by the well-known kernel hacker Alan Cox, and 4Front Technologies' OSS (OSS/Linux, formerly known as VoxWare, USS, and TASD), which is a proprietary implementation based on OSS/Free. OSS is available not only for Linux but also for the BSDs and other Unixes. That may be its only advantage, because this system is not very powerful and was officially replaced by ALSA in the 2.5 kernels.
I'm not going to talk about programming for OSS, considering that it is deprecated, but it is not very difficult: open /dev/dsp, /dev/dspW, or /dev/audio depending on the format you want, read from and write to the resulting file descriptor to capture and play sound, and use ioctl calls to set parameters such as volume. You can learn about advanced OSS programming in 4Front's API specs.
Advanced Linux Sound Architecture (ALSA) is the new Linux sound hardware abstraction layer that replaces OSS. In fact, it's more than a simple HAL because it provides a user-space library named libasound. What's more, it's thread-safe, works well with SMP machines, and is backward-compatible with OSS/Free (through an OSS emulation module). Of course, it's also free and open source. A full description of its features and API can be found on ALSA's Web site, and I would also suggest reading Paul Davis' tutorial.
Let's take a look at ALSA's API with a little example that will show the good and bad points of ALSA:
/* Example stolen from Paul Davis' tutorial (don't worry, he won't sue me -- GPL privilege)
 * Have just omitted the error handling for concision and added comments */
#include <stdio.h>
#include <stdlib.h>
#include <alsa/asoundlib.h>

int main (int argc, char *argv[])
{
    int i;
    int err;
    int dir = 0;                /* no preference for rounding the rate up or down */
    unsigned int rate = 44100;  /* requested rate; ALSA stores the rate it actually chose here */
    short buf[256];             /* 128 frames x 2 channels (left uninitialized in this skeleton) */
    snd_pcm_t *playback_handle;
    snd_pcm_hw_params_t *hw_params;

    /* Open the device named on the command line (an ALSA name such as "plughw:0,0") */
    snd_pcm_open (&playback_handle, argv[1], SND_PCM_STREAM_PLAYBACK, 0);

    /* Allocate the hardware parameters structure and fill it with the full config space for PCM */
    snd_pcm_hw_params_malloc (&hw_params);
    snd_pcm_hw_params_any (playback_handle, hw_params);

    /* Set parameters: interleaved channels, 16 bits little endian, 44100Hz, 2 channels */
    snd_pcm_hw_params_set_access (playback_handle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED);
    snd_pcm_hw_params_set_format (playback_handle, hw_params, SND_PCM_FORMAT_S16_LE);
    /* Recent alsa-lib versions take pointers for the rate and direction arguments */
    snd_pcm_hw_params_set_rate_near (playback_handle, hw_params, &rate, &dir);
    snd_pcm_hw_params_set_channels (playback_handle, hw_params, 2);

    /* Assign them to the playback handle and free the parameters structure */
    snd_pcm_hw_params (playback_handle, hw_params);
    snd_pcm_hw_params_free (hw_params);

    /* Prepare & Play: each call writes 128 frames */
    snd_pcm_prepare (playback_handle);
    for (i = 0; i < 10; i++) {
        if ((err = snd_pcm_writei (playback_handle, buf, 128)) != 128) {
            /* (...) error handling omitted */
        }
    }

    /* Close the handle and exit */
    snd_pcm_close (playback_handle);
    exit (0);
}
As you can see, the API is quite clear and not very hard to understand, even if it's a bit long. ALSA acts at a level low enough for the programmer to be able to choose another design called interrupt-driven or callback-driven, which is fundamentally better because:
About the practical use of Kernel Drivers...
Besides the full-duplex difficulty, another problem for multimedia applications that talk to ALSA directly is what motivated the creation of sound servers: such programs need concurrent access to the sound card, and it is not acceptable for only one application at a time to be able to produce and capture sound. In practice, designers must decide at which level a program should act: if it needs low-level access, ALSA can be a good solution, but if sound is not the main part of the project or if high-level operations are needed, consider instead the sound systems we'll talk about next.
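A side note on the ALSA playback skeleton above: it hands snd_pcm_writei a buffer that was never filled, so what comes out is noise at best. As a rough sketch of what filling it could look like, the helper below writes a square wave into an interleaved 16-bit stereo buffer, which is the layout requested with SND_PCM_ACCESS_RW_INTERLEAVED and SND_PCM_FORMAT_S16_LE. The name fill_square_s16_stereo and its parameters are made up for this illustration and are not part of alsa-lib. The sketch assumes a little-endian machine (so that native int16_t samples match S16_LE) and sticks to integer arithmetic so that nothing beyond the C library is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an ALSA function): fill `frames` interleaved
 * stereo frames (left, right, left, right, ...) with a square wave.
 * `start_frame` is the index of the first frame in the overall stream, so
 * that successive calls continue the waveform without a jump.
 * `amplitude` is expected to be in the range 0..32767. */
void fill_square_s16_stereo(int16_t *buf, size_t frames,
                            unsigned int rate, unsigned int freq_hz,
                            int16_t amplitude, size_t start_frame)
{
    /* Frames per cycle; fall back to the shortest possible cycle when the
     * frequency is zero or too high for the sample rate. */
    size_t period = (freq_hz != 0) ? rate / freq_hz : 0;
    if (period < 2)
        period = 2;

    for (size_t i = 0; i < frames; i++) {
        size_t pos = (start_frame + i) % period;
        /* First half of each cycle is high, second half is low. */
        int16_t sample = (pos < period / 2) ? amplitude : (int16_t)-amplitude;
        buf[2 * i]     = sample;   /* left channel  */
        buf[2 * i + 1] = sample;   /* right channel */
    }
}
```

In the playback loop, calling fill_square_s16_stereo ((int16_t *) buf, 128, 44100, 441, 8000, i * 128) before each snd_pcm_writei should give an audible tone instead of garbage, provided buf has room for 256 samples (128 frames of 2 channels).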
Sound servers
Sound servers are programs that sit atop the kernel audio layer and put one more layer between the user and the hardware. This extra layer comes with a small performance hit, but it offers a simpler API and performs software-based sample mixing. Software mixing lets applications play multiple sounds at the same time on a single sound card, without needing a card that natively supports hardware mixing. With it, applications can share the sound hardware, because the sound server accepts multiple streams (the kernel sound drivers accept only one), multiplexes them, and streams the result to /dev/dsp. Some sound servers (esd, aRTs, NAS) are also built on a client/server model that enables sound to be played remotely and transparently over a network: this is called network transparency. If you want sound servers with such features, take a look at the Squeak homepage.
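To make the idea of software-based sample mixing a bit more concrete, here is a minimal sketch that mixes two signed 16-bit streams by adding them sample by sample and clamping the result. It is only an illustration of the principle: the function names are invented for this example, and real sound servers such as esd or aRTs do considerably more (resampling, per-stream volume, format conversion), so this should not be read as their actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate sum to the signed 16-bit sample range, so
 * that loud passages saturate instead of wrapping around. */
int16_t clamp_s16(int32_t value)
{
    if (value > INT16_MAX)
        return INT16_MAX;
    if (value < INT16_MIN)
        return INT16_MIN;
    return (int16_t)value;
}

/* Hypothetical mixer: add `count` samples of streams `a` and `b` into
 * `out`. All three buffers must hold at least `count` samples and use the
 * same rate and channel layout. */
void mix_s16(const int16_t *a, const int16_t *b, int16_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = clamp_s16((int32_t)a[i] + (int32_t)b[i]);
}
```

The sum is computed in 32 bits on purpose: adding two 16-bit samples directly in a 16-bit variable would overflow and produce loud clicks, whereas clamping only distorts the peaks.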
ESD, short for Enlightened Sound Daemon (also written EsounD), was originally developed for Enlightenment and is now part of the GNOME Project. ESD supports full duplex and network transparency, and is especially suited for sound effects and long unsynchronized music. You can extract the API from the source: esd.h and esdlib.c. Compile with gcc -o esdtest esdtest.c `esd-config --cflags --libs`.
/* Let's see a skeleton that a recording program can change */
#include <stdio.h>   /* for NULL */
#include <unistd.h>  /* for read() and close() */
#include "esd.h"

int main()
{
    char buf[ESD_BUF_SIZE];
    int sock = -1;

    /* Set format: 16 bits stereo stream for recording */
    esd_format_t format = ESD_BITS16 | ESD_STEREO | ESD_STREAM | ESD_RECORD;

    /* And only 1 command to open the recording :) with the format defined earlier,
     * ESD's default rate (ESD_DEFAULT_RATE -> 44100Hz),
     * on localhost:16001 (default -> NULL), and with "testprog" as internal name */
    sock = esd_record_stream_fallback(format, ESD_DEFAULT_RATE, NULL, "testprog");
    if (sock <= 0) return 1;

    /* And now treat that */
    while (read(sock, buf, ESD_BUF_SIZE) > 0)
    {
        /* (...) */
    }

    close(sock);
    return 0;
}
Piece of cake, isn't it? The esd_record_stream function is in fact a wrapper that calls esd_open_sound(hostname) to connect to the server, negotiates with it, and then sets the socket buffer sizes with esd_set_socket_buffers(sock, format, rate, 44100). The _fallback functions fall back to ALSA and OSS if ESD fails, which is quite useful.
The Analog RealTime Synthesizer (aRTs) is KDE's sound server. Support for it is progressively fading, and it will probably be abandoned in the future in favor of JACK. Nevertheless, various reports suggest that aRTs has better sound quality than ESD thanks to better sound processing routines (but also higher latency, due to their complexity). aRTs also supports full duplex (although it has been reported to be a bit buggy in this area) and network transparency, and it works on BSD operating systems. Documentation about the aRTs C API is quite rare (see the aRTs project Web site for a little page about it), so the best thing to do is to take a look at the source (artsc.h).
Here's a little example to compile with gcc -o artstest artstest.c `artsc-config --cflags` `artsc-config --libs`.
#include <stdio.h>
#include <artsc.h>

int main()
{
    arts_stream_t stream;
    char buffer[8192];
    int bytes;
    int errorcode;

    /* Initialise aRTs with arts_init() */
    if ((errorcode = arts_init()) < 0)
    {
        fprintf(stderr, "arts_init error: %s\n", arts_error_text(errorcode));
        return 1;
    }

    /* Open a stream for playback at 44100Hz, 16 bits, 2 channels as "aRTstest" */
    stream = arts_play_stream(44100, 16, 2, "aRTstest");

    /* example of treatment: read music from stdin and play it with arts_write */
    while ((bytes = fread(buffer, 1, 8192, stdin)) > 0)
    {
        if ((errorcode = arts_write(stream, buffer, bytes)) < 0)
        {
            fprintf(stderr, "arts_write error: %s\n", arts_error_text(errorcode));
            return 1;
        }
    }

    /* Does what it says */
    arts_close_stream(stream);
    arts_free();
    return 0;
}
The API is also very simple, as you can see. Some other useful commands include arts_suspend, to free the DSP device for aRTs-incapable programs to access it, and arts_stream_set, to configure some stream parameters.
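Since arts_write counts in bytes while latency is usually discussed in milliseconds, it can help to convert between the two. The sketch below does the arithmetic for the 8192-byte chunks used in the example above; the helper names are hypothetical and not part of artsc. Keep in mind that it only measures how much audio one chunk holds, not the total latency added by the server.

```c
#include <stddef.h>

/* Number of complete frames contained in `bytes` of PCM data, where one
 * frame is one sample for every channel. Returns 0 for a degenerate format. */
size_t pcm_bytes_to_frames(size_t bytes, unsigned int bits, unsigned int channels)
{
    size_t frame_size = (size_t)(bits / 8) * channels;
    return frame_size ? bytes / frame_size : 0;
}

/* Playing time of `frames` frames at `rate` frames per second, in
 * microseconds (rounded down). Returns 0 if the rate is 0. */
unsigned long pcm_frames_to_usec(size_t frames, unsigned int rate)
{
    if (rate == 0)
        return 0;
    return (unsigned long)(((unsigned long long)frames * 1000000ULL) / rate);
}
```

For the stream opened above (44100Hz, 16 bits, 2 channels), one frame takes 4 bytes, so an 8192-byte buffer holds 2048 frames, which is roughly 46 milliseconds of sound.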
JACK (also called JACKit) follows the long tradition of recursive acronyms -- in this case, Jack Audio Connection Kit. This project was created as an implementation of the Linux Audio Applications Glue API project, which aimed at creating a high-bandwidth, low-latency inter-application communication API. It is a real-time sound server written for POSIX systems (and currently available for Linux and OS X) that enables different applications to have synchronous connections to the audio hardware and to share audio among themselves via a ports system. Programs can run as normal independent applications or as plugins within the JACK server. It uses the callback method shown earlier, implements ringbuffers, and is, in my humble opinion, the most excellent and promising sound server. The only bad point is that it is not widely available at the moment, but that should be fixed soon. (Gentoo already includes it and there are some third-party RPMs for Fedora Core.) The API is well-documented and available on SourceForge. A fully documented example for a capture client is available on Berman Home Page.
Here we'll start with something softer and smaller. Compilation is done using gcc -o jacktest `pkg-config --cflags --libs jack` jacktest.c.
/* Lighter version of simple_client.c */
#include <stdio.h>
#include <stdlib.h>   /* for exit() */
#include <string.h>   /* for memcpy() */
#include <errno.h>
#include <unistd.h>
#include <jack/jack.h>

jack_port_t *input_port;
jack_port_t *output_port;

/* Processing thread: only transmit the data from input to output */
int process (jack_nframes_t nframes, void *arg)
{
    jack_default_audio_sample_t *out = (jack_default_audio_sample_t *) jack_port_get_buffer (output_port, nframes);
    jack_default_audio_sample_t *in = (jack_default_audio_sample_t *) jack_port_get_buffer (input_port, nframes);
    memcpy (out, in, sizeof (jack_default_audio_sample_t) * nframes);
    return 0;
}

void jack_shutdown (void *arg)
{
    exit (1);
}

int main ()
{
    jack_client_t *client;

    /* try to become a client of the JACK server */
    if ((client = jack_client_new ("test_client")) == 0) {
        fprintf (stderr, "jack server not running?\n");
        return 1;
    }

    /* tell the JACK server to call `process()' whenever there is work to be done. */
    jack_set_process_callback (client, process, 0);

    /* tell the JACK server to call `jack_shutdown()' if it ever shuts down, either entirely, or if it
       just decides to stop calling us. */
    jack_on_shutdown (client, jack_shutdown, 0);

    /* display the current sample rate */
    printf ("engine sample rate: %lu\n", (unsigned long) jack_get_sample_rate (client));

    /* create two ports: 1 input & 1 output */
    input_port = jack_port_register (client, "input", JACK_DEFAULT_AUDIO_TYPE, JackPortIsInput, 0);
    output_port = jack_port_register (client, "output", JACK_DEFAULT_AUDIO_TYPE, JackPortIsOutput, 0);

    /* tell the JACK server that we are ready to roll */
    if (jack_activate (client)) {
        fprintf (stderr, "cannot activate client\n");
        return 1;
    }

    /* connect the ports: input one to the first ALSA PCM input and output to the first ALSA PCM output */
    if (jack_connect (client, "alsa_pcm:in_1", jack_port_name (input_port))) {
        fprintf (stderr, "cannot connect input ports\n");
    }
    if (jack_connect (client, jack_port_name (output_port), "alsa_pcm:out_1")) {
        fprintf (stderr, "cannot connect output ports\n");
    }

    /* Since this is just a toy, run for a few seconds, then finish */
    sleep (10);
    jack_client_close (client);
    exit (0);
}
This code looks a bit more complex. To understand it, you must think of JACK as a big and complex switchboard with inputs and outputs and on which you can interconnect devices (microphone, sound card, programs, etc.) by plugging them into it. The program copies what's connected as input to what's connected as output (meaning, generally speaking, a wire or cable). This explanation is meant to be a simple example. If you want a complete analysis, go to dis-dot-dat.net.
All the handling is done in the callback and the main program flow is still running (that's why we've used sleep(10)). By the way, there are some real-time considerations when implementing the callback, like using non-blocking and deterministic calls only (malloc, printf, mutex_*, etc., must be banned). Here we're hard-coding a connection between our created output_port and alsa_pcm:out_1, but if you need something more flexible, an interesting function is jack_get_ports (client, NULL, NULL, JackPortIsPhysical|JackPortIsOutput), which for example gets a list of physical output ports available.
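Because the callback may not block, the usual way to get audio out of it (to write it to disk, for example) is to push the samples into a ringbuffer that another thread drains at its own pace. JACK ships its own implementation in <jack/ringbuffer.h>; the sketch below is not that code, only a minimal single-producer, single-consumer ringbuffer written with C11 atomics to show the principle. The type and function names (ringbuf_t, rb_init, rb_write, rb_read) and the fixed capacity are assumptions made for this example.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RB_SIZE 1024u   /* capacity in samples; must be a power of two */

/* Safe for exactly one writer thread and one reader thread. */
typedef struct {
    float data[RB_SIZE];
    atomic_size_t head;   /* total samples written; stored only by the writer */
    atomic_size_t tail;   /* total samples read; stored only by the reader */
} ringbuf_t;

void rb_init(ringbuf_t *rb)
{
    atomic_init(&rb->head, 0);
    atomic_init(&rb->tail, 0);
}

/* Writer side (for instance the process callback): copy up to `n` samples
 * into the buffer without ever blocking. Returns how many were stored;
 * samples that do not fit are dropped by the caller. */
size_t rb_write(ringbuf_t *rb, const float *src, size_t n)
{
    size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    size_t space = RB_SIZE - (head - tail);

    if (n > space)
        n = space;
    for (size_t i = 0; i < n; i++)
        rb->data[(head + i) & (RB_SIZE - 1)] = src[i];
    /* Publish the new samples only after they have been copied. */
    atomic_store_explicit(&rb->head, head + n, memory_order_release);
    return n;
}

/* Reader side (an ordinary thread): copy up to `n` samples out of the
 * buffer. Returns how many were actually available. */
size_t rb_read(ringbuf_t *rb, float *dst, size_t n)
{
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
    size_t avail = head - tail;

    if (n > avail)
        n = avail;
    for (size_t i = 0; i < n; i++)
        dst[i] = rb->data[(tail + i) & (RB_SIZE - 1)];
    /* Hand the space back to the writer only after the copy is done. */
    atomic_store_explicit(&rb->tail, tail + n, memory_order_release);
    return n;
}
```

Neither function allocates memory, takes a lock, or makes a system call, which is what makes this pattern usable from a real-time callback. When the buffer is full, rb_write simply stores less than requested, and it is up to the program to decide whether losing those samples is acceptable.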
Practical implementation
So many APIs -- what now? What should a programmer wanting to use sound choose?
To be continued....
Vincenot has been a Linux user for eight years, and is currently a student at University Louis Pasteur in Strasbourg.
Note: Comments are owned by the poster. We are not responsible for their content.
ALSA's dmix plugin is a pretty severe omission from this article.
OK, I omitted dmix (and dsnoop) because I considered it a plugin and not strictly part of ALSA. You're right, I should have at least mentioned it (but most howtos & docs I read weren't doing so either, except the ALSA one).
The fact that he hypes the very new Jack Server, but omits dmix, NAS, and MAS makes me think that he's just trying to promote Jack.
When it comes to NAS & Gstreamer, I explain my choice at the end of the second part (was supposed to be only one article). For the others, had to make choices for concision (the article was already much too big compared to what I was asked), and had to talk about the *obvious* ones like aRTs, esd and JACK. My goal wasn't to promote any system, but I must admit that JACK was my personal favourite one. SAI's MAS (http://www.mediaapplicationserver.net/indexframes.html) is, as you mentioned, also an interesting project with very interesting features (low latency, Network Transparency, X11 integration, bandwidth measurement, ...) especially for conferencing, but less popular for the moment, and if on the little space I had, I had talked about less known sound systems and not about the ones everybody would await, my mailbox would have exploded now :-P. It's difficult to write something that pleases everyone: I already received multiple mails of people accusing me of preferring one system (always a different one) or even one from a big company director saying I would have an agenda (!!) to take down its product.
So, just to say it once and for all, this article isn't supposed to be exhaustive at all, because it was meant to be an introduction and not a series of articles, and I'm just a student interested in technology and certainly not here to do any kind of advertisement for one sound system (I'd like to get money from one of them, would be an interesting way of paying my studies ;), but I'm not on sale). If enough readers show interest in a series of articles about this subject with more details about each solution, I'd be glad to write it (if the editor is ok with that).
I'm not going to talk about programming for OSS, considering that it is deprecated
Compilation must be done with "-lasound" to link the binary to alsa-lib. For example:
gcc test.c -o test -lasound
I tried all the devices in /dev/snd/, /dev/dsp, etc.
Last thing, your code modifications are right, BUT verify your sources before qualifying articles of being poor and saying code hasn't been checked !! Paul Davis is nothing less than one of the most active sound developers of the community, works on ALSA and has created the entire LAAGA project as well as JACK (for which he got an Open Source Award this year: http://builder.com.com/5100-6375-5136755.html?tag=tt). The point is that ALSA is still under heavy development and that its API changes a lot, and that's what happened for the snd_pcm_hw_params_set_rate_near function, whose prototype changed as you can verify on this linux-audio-dev message: http://www.music.columbia.edu/pipermail/linux-audio-dev/2003-December/005779.html. This change caused segfaults in many programs using ALSA (IceCast, aplay, etc AFAIK). So I checked my code and Paul Davis did too, and Google will show you that most docs/examples on the net are still using the old prototype (those include the howto you cite yourself !!). What's more, those pieces of code are skeletons, just to show what the API looks like, and not real practical examples (else I wouldn't have put those (...) everywhere). There are API references and tutorials to give more up-to-date and exhaustive explanations if you wish to write a real program. However, I'll consider your comments and I'd like to thank you for your interest in my article, hoping it helped you a bit.
I read the article and Paul Davis's. For someone like me, trying to test programming for the sound device for the first time, both articles need to include instructions on how to compile and use the sample programs. I managed to compile the playback code example, but I get an error message when I use it.
ALSA lib pcm.c:1972:(snd_pcm_open_noupdate) Unknown PCM /dev/snd/pcmC0D6c
cannot open audio device /dev/snd/pcmC0D6c (No such file or directory)
I tried all the devices in /dev/snd/, /dev/dsp, etc. with no luck. Finally, after finding this tutorial, http://www.suse.de/~mana/alsa090_howto.html, I learned that the device convention is "plughw:0,0". This is a totally different API than the usual /dev/xxxxxx convention.
I also spotted an error with the sample code: in the function snd_pcm_hw_params_set_rate_near, the third and fourth arguments must be pointers. Since man pages seem not to exist, I had to google the function in order to find its syntax (That doxygen crap is too hard to navigate, man function_name is simple and does not require a browser.)
To make the ALSA playback code work (and not produce a segmentation fault), you'll need to add/modify:
int dir = 0;
unsigned int rate = 44100;
snd_pcm_hw_params_set_rate_near (playback_handle, hw_params, &rate, &dir);
The program needs to execute with the argument "plughw:0,0", you'll get a click.
As introductory material, this and the referenced ALSA tutorial are really poor articles. Both need the code checked to make sure it actually works.
dmix plugin for ALSA allows concurrent access
Posted by: Administrator on August 10, 2004 01:51 PM
http://alsa.opensrc.org/index.php?page=DmixPlugin
So by making dmix plugin the default ALSA sound output device, all the sound servers will work fine concurrently (when using ALSA) and so will other apps.