[RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

Fri Aug 9 17:19:58 AEST 2024

>>>> Then there's the issue of parameters, we chose to only add parameters
>>>> for standard encoders/decoders. Post-processing is highly specific and
>>>> the parameter definitions vary from one implementation to another -
>>>> and usually parameters are handled in an opaque way with binary
>>>> controls. This is best handled with a UUID that needs to be known only
>>>> to applications and low-level firmware/hardware, the kernel code should
>>>> not have to be modified for each and every processing and to add new
>>>> parameters. It just does not scale and it's unmaintainable.
>>>>
>>>> At the very least if you really want to use this compress API, extend
>>>> it to use a non-descript "UUID-defined" type and an opaque set of
>>>> parameters with this UUID passed in a header.
>>>
>>> We don't need to use UUID-defined scheme for simple (A)SRC
>>> implementation. As I noted, the specific runtime controls may use
>>> existing ALSA control API.
>>
>> "Simple (A)SRC" is an oxymoron. There are multiple ways to define the
>> performance, and how the drift estimator is handled. There's nothing
>> simple if you look under the hood. The SOF implementation has for
>> example those parameters:
>>
>> uint32_t source_rate;           /**< Define fixed source rate or */
>>                 /**< use 0 to indicate need to get */
>>                 /**< the rate from stream */
>> uint32_t sink_rate;             /**< Define fixed sink rate or */
>>                 /**< use 0 to indicate need to get */
>>                 /**< the rate from stream */
>> uint32_t asynchronous_mode;     /**< synchronous 0, asynchronous 1 */
>>                 /**< When 1 the ASRC tracks and */
>>                 /**< compensates for drift. */
>> uint32_t operation_mode;        /**< push 0, pull 1, In push mode the */
>>                 /**< ASRC consumes a defined number */
>>                 /**< of frames at input, with varying */
>>                 /**< number of frames at output. */
>>                 /**< In pull mode the ASRC outputs */
>>                 /**< a defined number of frames while */
>>                 /**< number of input frames varies. */
>>
>> They are clearly different from what is suggested above with a
>> 'ratio-mod'.
> 
> I don't think so. The proposed (A)SRC for compress-accel is just one
> case for the above configs where the input is known and output is
> controlled by the requested rate. The I/O mechanism is abstracted enough
> in this case and the driver/hardware/firmware must follow it.

ASRC is usually added when the nominal rates are known but the clock
sources differ, so the drift needs to be estimated at run-time and the
coefficients or interpolation modified dynamically.

If the ratio is known exactly and there's no clock drift, then it's a
different problem where the filter coefficients are constant.
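
To make the distinction concrete, here is a rough sketch of my own (it is
not the SOF code, and every name in it is hypothetical). It uses plain
linear interpolation instead of a real polyphase filter, and the drift
estimator itself is not shown. In the synchronous case the step is
computed once; in the asynchronous case it has to be refreshed whenever
the estimator reports a new drift value.

```c
#include <stddef.h>

/* Hypothetical resampler state, for illustration only. */
struct resampler {
    double step; /* input frames consumed per output frame */
    double pos;  /* fractional read position in the input block */
};

/* Synchronous SRC: the ratio is known exactly, so step never changes. */
void src_init(struct resampler *r, unsigned int in_rate,
              unsigned int out_rate)
{
    r->step = (double)in_rate / (double)out_rate;
    r->pos = 0.0;
}

/*
 * ASRC: a drift estimator (not shown) reports the deviation from the
 * nominal ratio in ppm, and the step is corrected at run-time.
 */
void asrc_update(struct resampler *r, unsigned int in_rate,
                 unsigned int out_rate, double drift_ppm)
{
    r->step = ((double)in_rate / (double)out_rate) *
              (1.0 + drift_ppm * 1e-6);
}

/*
 * Resample one mono block with linear interpolation, starting at r->pos.
 * Returns the number of output frames written. Carrying the read
 * position across block boundaries is left out to keep the sketch short.
 */
size_t resample_linear(struct resampler *r, const float *in, size_t in_len,
                       float *out, size_t out_max)
{
    size_t n = 0;

    while (n < out_max) {
        size_t i = (size_t)r->pos;
        double frac;

        if (i + 1 >= in_len)
            break; /* need two input frames to interpolate */
        frac = r->pos - (double)i;
        out[n++] = (float)(in[i] + frac * (in[i + 1] - in[i]));
        r->pos += r->step;
    }
    return n;
}
```

With a real filter bank the same split applies: constant coefficients for
the fixed-ratio case, and coefficients (or the interpolation phase)
recomputed on the fly for the drifting case.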

>> Same if you have a 'simple EQ'. There are dozens of ways to implement
>> the functionality with FIR, IIR or a combination of the two, and
>> multiple bands.
>>
>> The point is that you have to think upfront about a generic way to pass
>> parameters. We didn't have to do it for encoders/decoders because we
>> catered to well-documented standard solutions only. By choosing to
>> support PCM processing, a new can of worms is now open.
>>
>> I repeat: please do not make the mistake of listing all processing with
>> an enum and a new structure for parameters every time someone needs a
>> specific transform in their pipeline. We made that mistake with SOF and
>> had to backtrack rather quickly. The only way to scale is an identifier
>> that is NOT included in the kernel code but is known to higher and
>> lower-levels only.
> 
> There are two ways - black box (UUID - as you suggested) - or well
> defined purpose (abstraction). For your example 'simple EQ', the
> parameters should be the band (frequency range) volume values. It's
> abstract and the real filters (resp. implementation) used behind may
> depend on the hardware/driver capabilities.

Indeed there is a possibility that the parameters are high-level, but
that would require firmware or hardware to be able to generate actual
coefficients from those parameters. That usually requires some advanced
math which isn't necessarily obvious to implement with fixed-point hardware.

> From my view, the really special cases may be handled as black box, but
> others like (A)SRC should follow some well-defined abstraction IMHO to
> not force user space to handle all special cases.

I am not against the high-level abstractions, e.g. along the lines of
what Android defined:
https://developer.android.com/reference/android/media/audiofx/AudioEffect

That's not sufficient, however: we also need to make sure there's an
ability to provide pre-computed coefficients in an opaque manner for
processing that doesn't fit in the well-defined cases. In practice very
few 3rd-party IPs fit in well-defined cases; everyone has secret-sauce
parameters and options.
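
To illustrate what I mean by opaque, here is a sketch of a UUID-tagged
parameter blob. This layout is hypothetical and is not an existing
compress or SOF structure. The only things the transport layer would look
at are the size and the UUID; the payload is meaningful only to the
application and to the firmware/hardware module that owns that UUID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header, for illustration only (not an existing UAPI). */
struct opaque_params_hdr {
    uint8_t uuid[16]; /* identifies the processing module */
    uint32_t size;    /* number of payload bytes after the header */
};

/*
 * Check that the blob is well-formed and addressed to want_uuid.
 * Returns a pointer to the payload and stores its length in
 * *payload_len, or returns NULL if the blob is truncated or the UUID
 * does not match. The payload itself is never interpreted here.
 */
const uint8_t *opaque_params_payload(const void *blob, size_t blob_len,
                                     const uint8_t want_uuid[16],
                                     uint32_t *payload_len)
{
    struct opaque_params_hdr hdr;

    if (!blob || blob_len < sizeof(hdr))
        return NULL;

    /* Copy the header out to avoid unaligned accesses. */
    memcpy(&hdr, blob, sizeof(hdr));

    if (hdr.size > blob_len - sizeof(hdr))
        return NULL;
    if (memcmp(hdr.uuid, want_uuid, sizeof(hdr.uuid)) != 0)
        return NULL;

    *payload_len = hdr.size;
    return (const uint8_t *)blob + sizeof(hdr);
}
```

Adding a new processing type then means a new UUID known to user space
and to the firmware, with no new enum value or parameter structure in
the kernel.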