Sunday, March 7, 2010

PulseAudio and why I do not use it.

[PulseAudio is a new Linux 'sound server']

Firstly, to make it clear, I think that there is nothing really wrong with PulseAudio itself. Some versions even work fine with OpenAL, which means that my game's sound works.

What's not good, however, is distributions enabling it by default, and worse yet, distributions [Ubuntu in particular] shipping badly outdated versions of PulseAudio which have more bugs. This, especially the old versions, creates an extremely difficult landscape for application developers, open source and commercial alike. Furthermore, turning PA on by default violates the "suck less" principle - the principle that after each new update the software or the system must suck *less* than it sucked before - and if it does not, you'll be losing users. What's even worse is distributions entirely ignoring frequent user complaints about PA.

The most important thing to understand about PulseAudio is that it is NOT a sound driver and is NOT an ALSA replacement. PulseAudio takes in sound from applications, processes it, and outputs the result through ALSA. It is a sound server. It adds new features, and inevitably, new bugs.
It so happens that the vast majority of software can work with ALSA directly; and it so happens that ALSA already includes a lot of the features which people expect and need - mixing sound from different applications (even when you do not have a hardware mixer), volume control, and so on. The role of PulseAudio is to add new features on top of that.
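
To make "mixing sound from different applications" concrete, here is a minimal sketch of what software mixing boils down to: add the samples together and clamp the result. This is only an illustration, not ALSA's or PulseAudio's actual code; the function name `mix_s16` and the sample values are made up for the example, and real mixers also have to deal with resampling, differing channel counts and timing.

```python
from array import array

# Range of a signed 16-bit PCM sample.
INT16_MIN, INT16_MAX = -32768, 32767

def mix_s16(streams):
    """Mix several signed 16-bit PCM streams into one by summing samples.

    Hypothetical helper for illustration only. Streams that are shorter
    than the longest one count as silence once they run out, and sums are
    clamped to the 16-bit range so loud passages clip instead of wrapping.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = array("h", [0] * length)
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        mixed[i] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed

# Two "applications" playing at once (made-up sample values).
music = array("h", [1000, -2000, 30000, 0])
game = array("h", [500, 500, 10000])
print(list(mix_s16([music, game])))  # [1500, -1500, 32767, 0]
```

The third sample shows the clamping: 30000 + 10000 does not fit in 16 bits, so it is limited to 32767.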

What features? Straight from the PulseAudio developer.

  • There's so much more a good audio system needs to provide than just the most basic mixing functionality. Per-application volumes, moving streams between devices during playback, positional event sounds (i.e. click on the left side of the screen, have the sound event come out through the left speakers), secure session-switching support, monitoring of sound playback levels, rescuing playback streams to other audio devices on hot unplug, automatic hotplug configuration, automatic up/downmixing stereo/surround, high-quality resampling, network transparency, sound effects, simultaneous output to multiple sound devices are all features PA provides right now, and what you don't get without it. It also provides the infrastructure for upcoming features like volume-follows-focus, automatic attenuation of music on signal on VoIP stream, UPnP media renderer support, Apple RAOP support, mixing/volume adjustments with dynamic range compression, adaptive volume of event sounds based on the volume of music streams, jack sensing, switching between stereo/surround/spdif during runtime, ...
  • And even for the most basic mixing functionality plain ALSA/dmix is not really everlasting happiness. Due to the way it works all clients are forced to use the same buffering metrics all the time, that means all clients are limited in their wakeup/latency settings. You will burn more CPU than necessary this way, keep the risk of drop-outs unnecessarily high and still not be able to make clients with low-latency requirements happy. 'Glitch-Free' PulseAudio fixes all this. Quite frankly I believe that 'glitch-free' PulseAudio is the single most important killer feature that should be enough to convince everyone why PulseAudio is the right thing to do. Maybe people actually don't know that they want this. But they absolutely do, especially the embedded people -- if used properly it is a must for power-saving during audio playback. It's a pity that how awesome this feature is you cannot directly see from the user interface.[1]
  • PulseAudio provides compatibility with a lot of sound systems/APIs that bare ALSA or bare OSS don't provide.
  • And last but not least, I love breaking Jeffrey's audio. It's just soo much fun, you really have to try it! ;-)
Those are the things that PA aspires to make work. It's all amazing - AFAIK many of those features are not supported by Windows or OS X. Well, that is all great, but you can imagine what sort of complexity PA needs with such a feature list.

I'm a simple man. All I want is to play music while I'm working, I want sound in Flash, I do not like it when some applications do not work, and I want sound in games (which use OpenAL). I need reliability. Complexity is the enemy of reliability, and the perfect is the enemy of the good.

I do not care about per-application volume sliders (guess what, my application has two volume sliders, for SFX and music), I do not care about moving sound streams between devices during playback, I DEFINITELY do not give a damn about positional event sounds (more than that, I would not mind if event sounds stopped working altogether, except for the time alarm sound and the new mail sound), I do not care about multiple sessions playing sound through different devices, and so on and so forth. I'm pretty sure that a typical user has even simpler interests. The primary thing he needs is a lack of regressions - everything that worked back when he decided to switch to Linux must still work - else he will switch back (!).

As a developer, what I want is a stable API: mature software which no longer changes much with each release, and which is not so buggy. Unfortunately, software maintenance is boring, and open source software is maintained by bright people who do not like boring tasks. Open source developers want challenges. They want to do epic stuff that no other system does. They prefer rewrites over maintenance, they prefer large sets of very challenging features (often, features that almost nobody asks for) over a basic set implemented to high reliability, and so on. They underestimate the importance of reliability to people, and overestimate the importance of new cool things (and keep doing that no matter how much they are flamed). The Linux environment has a long history of frequent, major, breaking rewrites of important subsystems - far more frequent than on either Windows or OS X - frequent to the point that subsystems get rewritten before the previous incarnation is polished and mature enough.


  1. I think this is correct. I pretty much have the same view. I like listening to music and playing games; I don't get too far into other things with sound. I find PulseAudio to be a hindrance to my Linux experience (and I'm a noob). Not enough to scare me, because I've seen things like it before, but why not give everyone something basic that can deal with basic sound, and if they want to tinker and get more into it, they can just upgrade what they want.

  2. Thumbs up to this post! It embodies everything I feel and my own experiences with pulseaudio as well as the power arrogance of ubuntu ignoring these considerations, which are clearly bothering quite a lot of people. That last bit about rewriting subsystems is so true it hurts.

  3. It would be good to have some options inside PulseAudio itself, such as a way to switch off those unnecessary computations when the user does not need the 'extra complexity' but does need 'zero latency'. E.g. I cannot use Creox with PulseAudio because of the delay, and there is no convenient way to switch it off (the easiest way I have found is to rename the executable for as long as it is not needed and kill the process - not elegant, of course).
    There is nothing wrong with complexity when the complexity is 'perfect', which includes a 'simple way' to make a 'simple choice'. Sure, PulseAudio has something to add in this respect (to be still 'more complex' and therefore more convenient to use).