Diffstat (limited to 'unstable/linux-dmabuf/feedback.rst')
-rw-r--r-- | unstable/linux-dmabuf/feedback.rst | 218 |
1 files changed, 218 insertions, 0 deletions
diff --git a/unstable/linux-dmabuf/feedback.rst b/unstable/linux-dmabuf/feedback.rst
new file mode 100644
index 0000000..0f6e1b5
--- /dev/null
+++ b/unstable/linux-dmabuf/feedback.rst
@@ -0,0 +1,218 @@
+.. Copyright 2021 Simon Ser
+
+.. contents::
+
+
+linux-dmabuf feedback introduction
+==================================
+
+linux-dmabuf feedback allows compositors and clients to negotiate optimal buffer
+allocation parameters. This document will assume that the compositor is using a
+rendering API such as OpenGL or Vulkan and KMS as the presentation API: even if
+linux-dmabuf feedback isn't restricted to this use-case, it's the most common.
+
+linux-dmabuf feedback introduces the following concepts:
+
+1. A main device. This is the render device that the compositor is using to
+   perform composition. Compositors should always be able to display a buffer
+   submitted by a client, so this device can be used as a fallback in case none
+   of the more optimized code-paths work. Clients should allocate buffers such
+   that they can be imported and textured from the main device.
+
+2. One or more tranches. Each tranche consists of a target device, allocation
+   flags and a set of format/modifier pairs. A tranche can be seen as a set of
+   format/modifier pairs that are compatible with the target device.
+
+   A tranche can have the ``scanout`` flag. It means that the target device is
+   a KMS device, and that buffers allocated with one of the format/modifier
+   pairs in the tranche are eligible for direct scan-out.
+
+   Clients should use the tranches in order to allocate buffers with the most
+   appropriate format/modifier and also to avoid allocating in private device
+   memory when cross-device operations are going to happen.
+
+linux-dmabuf feedback implementation notes
+==========================================
+
+This section contains recommendations for client and compositor implementations.
+
+For clients
+-----------
+
+Clients are expected to either pick a fixed DRM format beforehand, or
+perform the following steps repeatedly until they find a suitable format.
+
+Basic clients may only support static buffer allocation on startup. These
+clients should do the following:
+
+1. Send a ``get_default_feedback`` request to get global feedback.
+2. Select the device indicated by ``main_device`` for allocation.
+3. For each tranche:
+
+   1. If ``tranche_target_device`` doesn't match the allocation device, ignore
+      the tranche.
+   2. Accumulate allocation flags from ``tranche_flags``.
+   3. Accumulate format/modifier pairs received via ``tranche_formats`` in a
+      list.
+   4. When the ``tranche_done`` event is received, try to allocate the buffer
+      with the accumulated list of modifiers and allocation flags. If that
+      fails, proceed with the next tranche. If that succeeds, stop the loop.
+
+4. Destroy the feedback object.
+
+Tranches are ordered by preference: the more optimized tranches come first. As
+such, clients should use the first tranche that happens to work.
+
+Some clients may have already selected the device they want to use beforehand.
+These clients can ignore the ``main_device`` event, and ignore tranches whose
+``tranche_target_device`` doesn't match the selected device. Such clients need
+to be prepared for the ``wp_linux_buffer_params.create`` request to potentially
+fail.
+
+If the client allocates a buffer without specifying explicit modifiers on a
+device different from the one indicated by ``main_device``, then the client
+must force a linear layout.
+
+Some clients might support re-negotiating the buffer format/modifier on the
+fly. These clients should send a ``get_surface_feedback`` request and keep the
+feedback object alive after the initial allocation. Each time a new set of
+feedback parameters is received (ended by the ``done`` event), they should
+perform the same steps as basic clients described above.
+They should detect when the optimal allocation parameters didn't change (same
+format/modifier/flags) to avoid needlessly re-allocating their buffers.
+
+Some clients might additionally support switching the device used for
+allocations on the fly. Such clients should send a ``get_surface_feedback``
+request. For each tranche, select the device indicated by
+``tranche_target_device`` for allocation. Accumulate allocation flags (received
+via ``tranche_flags``) and format/modifier pairs (received via
+``tranche_formats``) as usual. When the ``tranche_done`` event is received, try
+to allocate the buffer with the accumulated list of modifiers and the
+allocation flags. Try to import the resulting buffer by sending a
+``wp_linux_buffer_params.create`` request (this might fail). Repeat with each
+tranche until both the allocation and the import succeed. Each time a new set
+of feedback parameters is received, they should perform these steps again. They
+should detect when the optimal allocation parameters didn't change (same
+device/format/modifier/flags) to avoid needlessly re-allocating their buffers.
+
+For compositors
+---------------
+
+Basic compositors may only support texturing the DMA-BUFs via a rendering API
+such as OpenGL or Vulkan. Such compositors can send a single tranche as a reply
+to both ``get_default_feedback`` and ``get_surface_feedback``. Set the
+``main_device`` to the rendering device. Send the tranche with
+``tranche_target_device`` set to the rendering device and all of the DRM
+format/modifier pairs supported by the rendering API. Do not set the
+``scanout`` flag in the ``tranche_flags`` event.
+
+Some compositors may support direct scan-out for full-screen surfaces. These
+compositors can re-send the feedback parameters when a surface becomes
+full-screen or leaves full-screen mode if the client has used the
+``get_surface_feedback`` request. The non-full-screen feedback parameters are
+the same as for the basic compositors described above.
+The full-screen feedback parameters have two tranches: one with the
+format/modifier pairs supported by the KMS plane, with the ``scanout`` flag set
+in the ``tranche_flags`` event and with ``tranche_target_device`` set to the
+KMS scan-out device; the other with the rest of the format/modifier pairs
+(supported for texturing, but not for scan-out), without the ``scanout`` flag
+set in the ``tranche_flags`` event, and with the ``tranche_target_device`` set
+to the rendering device.
+
+Some compositors may support direct scan-out for all surfaces. These
+compositors can send two tranches for surfaces that become candidates for
+direct scan-out, similarly to compositors supporting direct scan-out for
+full-screen surfaces. When a surface stops being a candidate for direct
+scan-out, compositors should re-send the feedback parameters optimized for
+texturing only. The way candidates for direct scan-out are selected is
+compositor policy; a possible implementation is to select as many surfaces as
+there are available hardware planes, starting from surfaces closer to the eye.
+
+Some compositors may support multiple devices at the same time. If the
+compositor supports rendering with a fixed device and direct scan-out on a
+secondary device, it may send a separate tranche for surfaces displayed on
+the secondary device that are candidates for direct scan-out. The
+``tranche_target_device`` for this tranche will be the secondary device and
+will not match the ``main_device``.
+
+Some compositors may support switching their rendering device at runtime or
+changing their rendering device depending on the surface. When the rendering
+device changes for a surface, such compositors may re-send the feedback
+parameters with a different ``main_device``. However, there is a risk that
+clients don't support switching their device at runtime and continue using the
+previous device.
+For this reason, compositors should always have a fallback rendering device
+that they initially send as ``main_device``, such that these clients use said
+fallback device.
+
+Compositors should not change the ``main_device`` on the fly when explicit
+modifiers are not supported, because there's a risk of importing buffers
+with an implicit non-linear modifier as a linear buffer, resulting in
+misinterpreted buffer contents.
+
+Compositors should not send feedback parameters if they don't have a fallback
+path. For instance, compositors shouldn't send a format/modifier supported for
+direct scan-out but not supported by the rendering API for texturing.
+
+Compositors can decide to use multiple tranches to describe the allocation
+parameters optimized for texturing. For example, if there are formats which
+have a fast texturing path and formats which have a slower texturing path, the
+compositor can decide to expose two separate tranches.
+
+Compositors can decide to use intermediate tranches to describe code-paths
+slower than direct scan-out but faster than texturing. For instance, a
+compositor could insert an intermediate tranche if it's possible to use a
+mem2mem device to convert buffers to be able to use scan-out.
+
+``dev_t`` encoding
+==================
+
+The protocol carries ``dev_t`` values on the wire using arrays. A compositor
+written in C can encode the values as follows:
+
+.. code-block:: c
+
+    /* Assumes drm_node_stat was already filled in by stat() on the DRM node. */
+    struct stat drm_node_stat;
+    struct wl_array dev_array = {
+        .size = sizeof(drm_node_stat.st_rdev),
+        .data = &drm_node_stat.st_rdev,
+    };
+
+A client can decode the values as follows:
+
+.. code-block:: c
+
+    dev_t dev;
+    assert(dev_array->size == sizeof(dev));
+    memcpy(&dev, dev_array->data, sizeof(dev));
+
+Because two DRM nodes can refer to the same DRM device while having different
+``dev_t`` values, clients should use ``drmDevicesEqual`` to compare two
+devices.
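The encode and decode snippets in this section can be combined into a round-trip check. The following is a minimal sketch, not part of the protocol: ``struct byte_array`` is a hypothetical stand-in for the ``size`` and ``data`` fields of ``struct wl_array`` so that the example builds without libwayland, and ``encode_dev`` and ``decode_dev`` are illustrative names. Unlike the ``assert`` in the decoding snippet, this decoder reports a size mismatch to its caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the size/data fields of struct wl_array. */
struct byte_array {
	size_t size;
	const void *data;
};

/* Expose a dev_t as an opaque byte array. The array borrows the caller's
 * storage, so *dev must outlive the returned value. */
struct byte_array encode_dev(const dev_t *dev)
{
	struct byte_array arr = { .size = sizeof(*dev), .data = dev };
	return arr;
}

/* Copy a dev_t out of a received array. Returns false when the array does
 * not have exactly the size of a dev_t, leaving *out untouched. */
bool decode_dev(const struct byte_array *arr, dev_t *out)
{
	if (arr->data == NULL || arr->size != sizeof(*out)) {
		return false;
	}
	memcpy(out, arr->data, sizeof(*out));
	return true;
}
```

The decoded value identifies a DRM node, not a device, so it would still need to go through ``drmDevicesEqual`` before two devices are treated as the same.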
+
+``format_table`` encoding
+=========================
+
+The ``format_table`` event carries a file descriptor containing a list of
+format/modifier pairs. The list is an array of pairs which can be accessed
+with this C structure definition:
+
+.. code-block:: c
+
+    struct dmabuf_format_modifier {
+        uint32_t format;
+        uint32_t pad; /* unused */
+        uint64_t modifier;
+    };
+
+Integration with other APIs
+===========================
+
+- libdrm: ``drmGetDeviceFromDevId`` returns a ``drmDevice`` from a device ID.
+- EGL: the `EGL_EXT_device_drm_render_node`_ extension may be used to query the
+  DRM device render node used by a given EGL display. When unavailable, the
+  older `EGL_EXT_device_drm`_ extension may be used as a fallback.
+- Vulkan: the `VK_EXT_physical_device_drm`_ extension may be used to query the
+  DRM device used by a given ``VkPhysicalDevice``.
+
+.. _EGL_EXT_device_drm: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm.txt
+.. _EGL_EXT_device_drm_render_node: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm_render_node.txt
+.. _VK_EXT_physical_device_drm: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VK_EXT_physical_device_drm.html
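As an illustration of the ``format_table`` encoding section, the sketch below reads entries by index from a table that has already been mapped into memory. It assumes the caller has mapped the file descriptor (for instance with ``mmap``) and knows the mapping size in bytes; ``format_table_len`` and ``format_table_get`` are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dmabuf_format_modifier {
	uint32_t format;
	uint32_t pad; /* unused */
	uint64_t modifier;
};

/* Each table entry is expected to occupy exactly 16 bytes. */
_Static_assert(sizeof(struct dmabuf_format_modifier) == 16,
	"unexpected padding in dmabuf_format_modifier");

/* Number of whole entries in a mapping of `size` bytes. */
size_t format_table_len(size_t size)
{
	return size / sizeof(struct dmabuf_format_modifier);
}

/* Return the entry at `index`, or NULL if it lies outside the mapping.
 * `map` must be suitably aligned, which holds for memory from mmap(). */
const struct dmabuf_format_modifier *
format_table_get(const void *map, size_t size, size_t index)
{
	const struct dmabuf_format_modifier *entries = map;
	if (map == NULL || index >= format_table_len(size)) {
		return NULL;
	}
	return &entries[index];
}
```

Checking the index against the mapping size keeps a client from reading past the end of the table if it is handed an out-of-range index.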