[RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

Jaroslav Kysela perex at perex.cz
Fri Aug 9 01:51:36 AEST 2024


On 08. 08. 24 14:19, Pierre-Louis Bossart wrote:
>   files changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/uapi/sound/compress_offload.h b/include/uapi/
>>>>>> sound/compress_offload.h
>>>>>> index 98772b0cbcb7..8b2b72f94e26 100644
>>>>>> --- a/include/uapi/sound/compress_offload.h
>>>>>> +++ b/include/uapi/sound/compress_offload.h
>>>>>> @@ -112,10 +112,12 @@ struct snd_compr_codec_caps {
>>>>>>     * end of the track
>>>>>>     * @SNDRV_COMPRESS_ENCODER_DELAY: no of samples inserted by the
>>>>>> encoder at the
>>>>>>     * beginning of the track
>>>>>> + * @SNDRV_COMPRESS_SRC_RATIO_MOD: Resampling Ratio Modifier for
>>>>>> sample rate converter
>>>>>>     */
>>>>>>    enum sndrv_compress_encoder {
>>>>>>         SNDRV_COMPRESS_ENCODER_PADDING = 1,
>>>>>>         SNDRV_COMPRESS_ENCODER_DELAY = 2,
>>>>>> +     SNDRV_COMPRESS_SRC_RATIO_MOD = 3,
>>>>>>    };
>>>>>
>>>>> this sounds wrong to me. The sample rate converter is not an "encoder",
>>>>> and the properties for padding/delay are totally specific to an encoder
>>>>> function.
>>>>
>>>> There is only decoder and encoder definition for compress,  I know
>>>> it is difficult to add SRC to encoder or decoder classification,
>>>> SRC is a Post Processing.  I hope you can have a recommandation
>>>
>>> I don't. I think we're blurring layers in a really odd way.
>>>
>>> The main reason why the compress API was added is to remove the
>>> byte-to-time conversions. But that's clearly not useful for a
>>> post-processing of PCM data, where the bitrate is constant. It really
>>> feels like we're adding this memory-to-memory API to the compress
>>> framework because we don't have anything else available, not because it
>>> makes sense to do so.
>>
>> It makes sense to offload decoder/encoder tasks as batch processing for
>> standard compress stream and return back result (PCM stream or encoded
>> stream) to user space. So it makes sense to use the compress interface
>> (parameters handshake) for it. Let's talk about the proper SRC extension.
>>
>> For SRC and dynamic rate modification. I would just create an ALSA
>> control for this. We are already using the "PCM Rate Shift 100000"
>> control in the sound/drivers/aloop.c for this purpose (something like
>> pitch in MIDI) for example. There is no requirement to add this function
>> through metadata ioctls. As bonus, this control can be monitored with
>> multiple tasks.
> 
> this wouldn't work when the rate is estimated in firmware/hardware,
> which is precisely what the 'asynchronous' part of ASRC does.
> 
> 
>>> Then there's the issue of parameters, we chose to only add parameters
>>> for standard encoders/decoders. Post-processing is highly specific and
>>> the parameter definitions varies from one implementation to another -
>>> and usually parameters are handled in an opaque way with binary
>>> controls. This is best handled with a UUID that needs to be known only
>>> to applications and low-level firmware/hardware, the kernel code should
>>> not have to be modified for each and every processing and to add new
>>> parameters. It just does not scale and it's unmaintainable.
>>>
>>> At the very least if you really want to use this compress API, extend it
>>> to use a non-descript "UUID-defined" type and an opaque set of
>>> parameters with this UUID passed in a header.
>>
>> We don't need to use UUID-defined scheme for simple (A)SRC
>> implementation. As I noted, the specific runtime controls may use
>> existing ALSA control API.
> 
> "Simple (A)SRC" is an oxymoron. There are multiple ways to define the
> performance, and how the drift estimator is handled. There's nothing
> simple if you look under the hood. The SOF implementation has for
> example those parameters:
> 
> uint32_t source_rate;           /**< Define fixed source rate or */
> 				/**< use 0 to indicate need to get */
> 				/**< the rate from stream */
> uint32_t sink_rate;             /**< Define fixed sink rate or */
> 				/**< use 0 to indicate need to get */
> 				/**< the rate from stream */
> uint32_t asynchronous_mode;     /**< synchronous 0, asynchronous 1 */
> 				/**< When 1 the ASRC tracks and */
> 				/**< compensates for drift. */
> uint32_t operation_mode;        /**< push 0, pull 1, In push mode the */
> 				/**< ASRC consumes a defined number */
> 				/**< of frames at input, with varying */
> 				/**< number of frames at output. */
> 				/**< In pull mode the ASRC outputs */
> 				/**< a defined number of frames while */
> 				/**< number of input frames varies. */
> 
> They are clearly different from what is suggested above with a 'ratio-mod'.

I don't think so. The proposed (A)SRC for compress-accel is just one case for 
the above configs where the input is known and output is controlled by the 
requested rate. The I/O mechanism is abstracted enough in this case and the 
driver/hardware/firmware must follow it.

> Same if you have a 'simple EQ'. there are dozens of ways to implement
> the functionality with FIR, IIR or a combination of the two, and
> multiple bands.
> 
> The point is that you have to think upfront about a generic way to pass
> parameters. We didn't have to do it for encoders/decoders because we
> only catered to well-documented standard solutions only. By choosing to
> support PCM processing, a new can of worms is now open.
> 
> I repeat: please do not make the mistake of listing all processing with
> an enum and a new structure for parameters every time someone needs a
> specific transform in their pipeline. We made that mistake with SOF and
> had to backtrack rather quickly. The only way to scale is an identifier
> that is NOT included in the kernel code but is known to higher and
> lower-levels only.

There are two ways - black box (UUID - as you suggested) - or well defined 
purpose (abstraction). For your example 'simple EQ', the parameters should be 
the band (frequency range) volume values. It's abstract and the real filters 
(resp. implementation) used behind may depend on the hardware/driver capabilities.

 From my view, the really special cases may be handled as black box, but 
others like (A)SRC should follow some well-defined abstraction IMHO to not 
force user space to handle all special cases.

						Jaroslav

-- 
Jaroslav Kysela <perex at perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.



More information about the Linuxppc-dev mailing list