[PATCH v15 00/16] Add audio support in v4l2 framework
Jaroslav Kysela
perex at perex.cz
Fri May 17 00:50:39 AEST 2024
On 15. 05. 24 22:33, Nicolas Dufresne wrote:
> Hi,
>
> GStreamer hat on ...
>
> Le mercredi 15 mai 2024 à 12:46 +0200, Jaroslav Kysela a écrit :
>> On 15. 05. 24 12:19, Takashi Iwai wrote:
>>> On Wed, 15 May 2024 11:50:52 +0200,
>>> Jaroslav Kysela wrote:
>>>>
>>>> On 15. 05. 24 11:17, Hans Verkuil wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>> On 5/13/24 13:56, Jaroslav Kysela wrote:
>>>>>> On 09. 05. 24 13:13, Jaroslav Kysela wrote:
>>>>>>> On 09. 05. 24 12:44, Shengjiu Wang wrote:
>>>>>>>>>> mem2mem is just like the decoder in the compress pipeline. which is
>>>>>>>>>> one of the components in the pipeline.
>>>>>>>>>
>>>>>>>>> I was thinking of loopback with endpoints using compress streams,
>>>>>>>>> without physical endpoint, something like:
>>>>>>>>>
>>>>>>>>> compress playback (to feed data from userspace) -> DSP (processing) ->
>>>>>>>>> compress capture (send data back to userspace)
>>>>>>>>>
>>>>>>>>> Unless I'm missing something, you should be able to process data as fast
>>>>>>>>> as you can feed it and consume it in such case.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Actually in the beginning I tried this, but it did not work well.
>>>>>>>> ALSA needs time control for playback and capture, playback and capture
>>>>>>>> needs to synchronize. Usually the playback and capture pipeline is
>>>>>>>> independent in ALSA design, but in this case, the playback and capture
>>>>>>>> should synchronize, they are not independent.
>>>>>>>
>>>>>>> The core compress API core no strict timing constraints. You can eventually0
>>>>>>> have two half-duplex compress devices, if you like to have really independent
>>>>>>> mechanism. If something is missing in API, you can extend this API (like to
>>>>>>> inform the user space that it's a producer/consumer processing without any
>>>>>>> relation to the real time). I like this idea.
>>>>>>
>>>>>> I was thinking more about this. If I am right, the mentioned use in gstreamer
>>>>>> is supposed to run the conversion (DSP) job in "one shot" (can be handled
>>>>>> using one system call like blocking ioctl). The goal is just to offload the
>>>>>> CPU work to the DSP (co-processor). If there are no requirements for the
>>>>>> queuing, we can implement this ioctl in the compress ALSA API easily using the
>>>>>> data management through the dma-buf API. We can eventually define a new
>>>>>> direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
>>>>>> handle this new data scheme. The API may be extended later on real demand, of
>>>>>> course.
>>>>>>
>>>>>> Otherwise all pieces are already in the current ALSA compress API
>>>>>> (capabilities, params, enumeration). The realtime controls may be created
>>>>>> using ALSA control API.
>>>>>
>>>>> So does this mean that Shengjiu should attempt to use this ALSA approach first?
>>>>
>>>> I've not seen any argument to use v4l2 mem2mem buffer scheme for this
>>>> data conversion forcefully. It looks like a simple job and ALSA APIs
>>>> may be extended for this simple purpose.
>>>>
>>>> Shengjiu, what are your requirements for gstreamer support? Would be a
>>>> new blocking ioctl enough for the initial support in the compress ALSA
>>>> API?
>>>
>>> If it works with compress API, it'd be great, yeah.
>>> So, your idea is to open compress-offload devices for read and write,
>>> then and let them convert a la batch jobs without timing control?
>>>
>>> For full-duplex usages, we might need some more extensions, so that
>>> both read and write parameters can be synchronized. (So far the
>>> compress stream is a unidirectional, and the runtime buffer for a
>>> single stream.)
>>>
>>> And the buffer management is based on the fixed size fragments. I
>>> hope this doesn't matter much for the intended operation?
>>
>> It's a question, if the standard I/O is really required for this case. My
>> quick idea was to just implement a new "direction" for this job supporting
>> only one ioctl for the data processing which will execute the job in "one
>> shot" at the moment. The I/O may be handled through dma-buf API (which seems
>> to be standard nowadays for this purpose and allows future chaining).
>>
>> So something like:
>>
>> struct dsp_job {
>> int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
>> int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
>> ... maybe some extra data size members here ...
>> ... maybe some special parameters here ...
>> };
>>
>> #define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)
>>
>> This ioctl will be blocking (thus synced). My question is, if it's feasible
>> for gstreamer or not. For this particular case, if the rate conversion is
>> implemented in software, it will block the gstreamer data processing, too.
>
> Yes, GStreamer threading is using a push-back model, so blocking for the time of
> the processing is fine. Note that the extra simplicity will suffer from ioctl()
> latency.
>
> In GFX, they solve this issue with fences. That allow setting up the next
> operation in the chain before the data has been produced.
The fences look really nicely and seem more modern. It should be possible with
dma-buf/sync_file.c interface to handle multiple jobs simultaneously and share
the state between user space and kernel driver.
In this case, I think that two non-blocking ioctls should be enough - add a
new job with source/target dma buffers guarded by one fence and abort (flush)
all active jobs.
I'll try to propose an API extension for the ALSA's compress API in the
linux-sound mailing list soon.
Jaroslav
--
Jaroslav Kysela <perex at perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.
More information about the Linuxppc-dev
mailing list