[RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

Fri Aug 9 23:51:39 AEST 2024

On 09. 08. 24 12:14, Shengjiu Wang wrote:
> On Fri, Aug 9, 2024 at 3:25 PM Pierre-Louis Bossart
> <pierre-louis.bossart at linux.intel.com> wrote:
>>
>>
>>>>>> Then there's the issue of parameters, we chose to only add parameters
>>>>>> for standard encoders/decoders. Post-processing is highly specific and
>>>>>> the parameter definitions vary from one implementation to another -
>>>>>> and usually parameters are handled in an opaque way with binary
>>>>>> controls. This is best handled with a UUID that needs to be known only
>>>>>> to applications and low-level firmware/hardware, the kernel code should
>>>>>> not have to be modified for each and every processing and to add new
>>>>>> parameters. It just does not scale and it's unmaintainable.
>>>>>>
>>>>>> At the very least if you really want to use this compress API,
>>>>>> extend it to use a nondescript "UUID-defined" type and an opaque set
>>>>>> of parameters with this UUID passed in a header.
>>>>>
>>>>> We don't need to use UUID-defined scheme for simple (A)SRC
>>>>> implementation. As I noted, the specific runtime controls may use
>>>>> existing ALSA control API.
>>>>
>>>> "Simple (A)SRC" is an oxymoron. There are multiple ways to define the
>>>> performance, and how the drift estimator is handled. There's nothing
>>>> simple if you look under the hood. The SOF implementation has for
>>>> example those parameters:
>>>>
>>>> uint32_t source_rate;           /**< Define fixed source rate or */
>>>>                  /**< use 0 to indicate need to get */
>>>>                  /**< the rate from stream */
>>>> uint32_t sink_rate;             /**< Define fixed sink rate or */
>>>>                  /**< use 0 to indicate need to get */
>>>>                  /**< the rate from stream */
>>>> uint32_t asynchronous_mode;     /**< synchronous 0, asynchronous 1 */
>>>>                  /**< When 1 the ASRC tracks and */
>>>>                  /**< compensates for drift. */
>>>> uint32_t operation_mode;        /**< push 0, pull 1, In push mode the */
>>>>                  /**< ASRC consumes a defined number */
>>>>                  /**< of frames at input, with varying */
>>>>                  /**< number of frames at output. */
>>>>                  /**< In pull mode the ASRC outputs */
>>>>                  /**< a defined number of frames while */
>>>>                  /**< number of input frames varies. */
>>>>
>>>> They are clearly different from what is suggested above with a
>>>> 'ratio-mod'.
>>>
>>> I don't think so. The proposed (A)SRC for compress-accel is just one
>>> case for the above configs where the input is known and output is
>>> controlled by the requested rate. The I/O mechanism is abstracted enough
>>> in this case and the driver/hardware/firmware must follow it.
>>
>> ASRC is usually added when the nominal rates are known but the clock
>> sources differ and the drift needs to be estimated at run-time and the
>> coefficients or interpolation modified dynamically.
>>
>> If the ratio is known exactly and there's no clock drift, then it's a
>> different problem where the filter coefficients are constant.
>>
>>>> Same if you have a 'simple EQ'. there are dozens of ways to implement
>>>> the functionality with FIR, IIR or a combination of the two, and
>>>> multiple bands.
>>>>
>>>> The point is that you have to think upfront about a generic way to pass
>>>> parameters. We didn't have to do it for encoders/decoders because we
>>>> only catered to well-documented standard solutions. By choosing to
>>>> support PCM processing, a new can of worms is now open.
>>>>
>>>> I repeat: please do not make the mistake of listing all processing with
>>>> an enum and a new structure for parameters every time someone needs a
>>>> specific transform in their pipeline. We made that mistake with SOF and
>>>> had to backtrack rather quickly. The only way to scale is an identifier
>>>> that is NOT included in the kernel code but is known to higher and
>>>> lower-levels only.
>>>
>>> There are two ways - black box (UUID - as you suggested) - or well
>>> defined purpose (abstraction). For your example 'simple EQ', the
>>> parameters should be the band (frequency range) volume values. It's
>>> abstract and the real filters (resp. implementation) used behind may
>>> depend on the hardware/driver capabilities.
>>
>> Indeed there is a possibility that the parameters are high-level, but
>> that would require firmware or hardware to be able to generate actual
>> coefficients from those parameters. That usually requires some advanced
>> math which isn't necessarily obvious to implement with fixed-point hardware.
>>
>>> From my view, the really special cases may be handled as black box, but
>>> others like (A)SRC should follow some well-defined abstraction IMHO to
>>> not force user space to handle all special cases.
>>
>> I am not against the high-level abstractions, e.g. along the lines of
>> what Android defined:
>> https://developer.android.com/reference/android/media/audiofx/AudioEffect
>>
>> That's not sufficient however, we also need to make sure there's an
>> ability to provide pre-computed coefficients in an opaque manner for
>> processing that doesn't fit in the well-defined cases. In practice
>> very few third-party IPs fit in well-defined cases; everyone has
>> secret-sauce parameters and options.
> 
> Appreciate the discussion.
> 
> Let me explain the reason for the change:
> 
> I use the metadata ioctl because ALSA controls are bound to the
> sound card. What I want is for the controls to be bound to
> snd_compr_stream, because the ASRC compress sound card can
> support multiple instances (the ASRC can run multiple conversions in
> parallel). ALSA controls can't be used for this case; the only
> choice in the current compress API is the metadata ioctl. The metadata
> ioctl can also be called many times, which meets the ratio modifier
> requirement (the ratio may drift on the fly).

This argument is not valid. The controls are bound to the card, but the
element identifiers already carry iface (interface), device and subdevice
numbers. We are using controls for PCM devices, for example. The binding is
straightforward.

Just add SNDRV_CTL_ELEM_IFACE_COMPRESS define and specify the compress device 
number in the 'struct snd_ctl_elem_id'.
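
As a rough sketch of what such an element identifier could look like: the
struct below is a local stand-in that mirrors the field layout of
'struct snd_ctl_elem_id' from <sound/asound.h>, so it builds without kernel
headers. The IFACE_COMPRESS value (7, the next free slot after SEQUENCER),
the helper compress_ctl_id() and the control name used in the note are
assumptions for illustration only; none of them exists in the tree today.

```c
#include <stdio.h>
#include <string.h>

/* Local stand-ins mirroring the UAPI definitions in <sound/asound.h>. */
typedef int ctl_elem_iface_t;

#define IFACE_PCM      3 /* same value as SNDRV_CTL_ELEM_IFACE_PCM */
#define IFACE_COMPRESS 7 /* HYPOTHETICAL: proposed define, value assumed */

struct ctl_elem_id {
	unsigned int numid;       /* numeric identifier, zero = unset */
	ctl_elem_iface_t iface;   /* interface the element belongs to */
	unsigned int device;      /* device number within that interface */
	unsigned int subdevice;   /* subdevice (substream) number */
	unsigned char name[44];   /* ASCII name of the element */
	unsigned int index;       /* index for elements sharing a name */
};

/*
 * Build an identifier that ties a control to one compress device
 * instead of to the card as a whole. Hypothetical helper.
 */
static struct ctl_elem_id compress_ctl_id(unsigned int device,
					  const char *name)
{
	struct ctl_elem_id id;

	memset(&id, 0, sizeof(id));
	id.iface = IFACE_COMPRESS;
	id.device = device;
	/* snprintf() truncates safely and always NUL-terminates */
	snprintf((char *)id.name, sizeof(id.name), "%s", name);
	return id;
}
```

With this, compress_ctl_id(2, "ASRC Ratio Modifier") would address a control
on compress device 2 only, in the same way PCM controls are addressed
through SNDRV_CTL_ELEM_IFACE_PCM and the PCM device number.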

					Jaroslav

-- 
Jaroslav Kysela <perex at perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.