[Development] Qt Multimedia: proposing behavior change to QAudioSink::resume

Mon Jan 30 16:38:48 CET 2023

Hi,

TL;DR: I’d like to change QAudioSink::resume() to always change the sink to Active state, no matter how the sink was start()’ed.

QAudioSink provides low-level access to an audio device, allowing applications to provide PCM data.

The class operates in one of two modes: in pull mode, the application provides a QIODevice when calling QAudioSink::start(QIODevice *), and the sink will pull data from that devices as needed. In push mode, the QAudioSink creates and returns a QIODevice from the other QAudioSink::start() overload. Applications write PCM data into that device.

The problem is with QAudioSink::resume. The function is documented to behavior differently depending on how start() was called:

/*!
    Resumes processing audio data after a suspend().

    Sets error() to QAudio::NoError.
    Sets state() to QAudio::ActiveState if you previously called start(QIODevice*).
    Sets state() to QAudio::IdleState if you previously called start().
    emits stateChanged() signal.
*/

QAudioSink::suspend() behaves the same way in both modes: audio stops immediately, already buffered data is preserved.

This means that on a sink that has 10 seconds worth of data, calling suspend() after 1 second, and then resume()’ing changes the state of a push-mode audio sink to Idle, audio will play for 9 seconds while the sink reports to be idle, and the sink doesn’t change to “really idle” when all the audio data has been played.

This means also that applications have no way of knowing when it’s time to write more data, as an underflow error is never reported - the sink is already idle, there are no state changes. This is pretty broken IMHO making the push-style API practically useless if suspend/resume are used. However, the behavior is documented, and verified in tests.

A sane behavior would be to move the sink to Active after resume (and to handle buffer under-runs as usual, resulting in Idle state with error).

I’m not sure yet that we can implement the sane behavior on all platforms. Pull-mode evidently implements it, but I don’t know yet if we can inspect the amount of data buffered by the underlying audio system. Before we start investigating that - can anyone think of any particular reason why the existing behavior is a good idea?

Volker