Linaro-mm-sig May 2020

linaro-mm-sig@lists.linaro.org

23 participants
33 discussions

Re: [Linaro-mm-sig] [PATCH] staging: android: ion: Skip sync if not mapped

by John Stultz

On Mon, Apr 20, 2020 at 1:22 AM Christian Brauner <christian.brauner(a)ubuntu.com> wrote:
> On Thu, Apr 16, 2020 at 12:25:08PM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Apr 14, 2020 at 09:41:31PM -0700, John Stultz wrote:
> > > But I do think we can mark it as deprecated and let folks know that
> > > around the end of the year it will be deleted.
> >
> > No one ever notices "depreciated" things, they only notice if the code
> > is no longer there :)
> >
> > So I'm all for just deleting it and seeing who even notices...
>
> Agreed.

I mean, I get there's not much love for ION in staging, and I too am eager to see it go, but I also feel like in the discussions around submitting the dmabuf heaps at talks, etc, that there was clear value in removing ION after a short time so that folks could transition being able to test both implementations against the same kernel so performance regressions, etc could be worked out.

I am actively getting many requests for help for vendors who are looking at dmabuf heaps and are starting the transition process, and I'm trying my best to motivate them to directly work within the community so their needed heap functionality can go upstream. But it's going to be a process, and their first attempts aren't going to magically land upstream. I think being able to really compare their implementations as they iterate and push things upstream will help in order to be able to have upstream solutions that are also properly functional for production usage.

The dmabuf heaps have been in an official kernel now for all of three weeks. So yea, we can "delete [ION] and see who even notices", but I worry that may seem a bit like contempt for the folks doing the work on transitioning over, which doesn't help getting them to participate within the community.

thanks
-john

5 years

[RFC 00/17] dma-fence lockdep annotations

by Daniel Vetter

Hi all,

I've dragged my feet for years on this, hoping that cross-release lockdep would do this for us, but well that never really happened unfortunately. So here we are.

Cc'ed quite a pile of people since this is about the cross-driver contract around dma_fences. Which is heavily used for dma_buf, and I'm hearing more noises that rdma folks are looking into this, hence also on cc.

There's a bunch of different parts to this RFC:

- The annotations themselves, in the 2nd patch after the prep patch to add might_sleep annotations. Commit message has all the motivation for what kind of deadlocks I want to catch, best you just read it.

  Since lockdep doesn't understand cross-release natively we need to cobble something together using rwlocks and a few more tricks, but from the test rollout in a few places in drm/vkms, amdgpu & i915 I think what I have now seems to actually work. Downside is that we have to explicitly annotate all code involved in eventual dma_fence signalling.

- Second important part is locking down the current dma-fence cross-driver contract, using lockdep priming like we already do for dma_resv_lock. I've just started with my own take on what we probably need to make the current code work (-ish), but both amdgpu and i915 have issues with that. So this needs some careful discussions, and also some thought on how we land it all eventually to not break lockdep completely for everyone.

  The important patch for that is "dma-fence: prime lockdep annotations" plus of course the various annotations patches and driver hacks to highlight some of the issues caught.

  Note that depending upon what exactly we end up deciding we might need to improve the annotations for fs_reclaim_acquire/release - for dma_fence_wait in mmu notifiers we can only allow GFP_NOWAIT (afaiui), and currently fs_reclaim_acquire/release only has a lockdep class for __GFP_FS only, we'd need to add another one for __GFP_DIRECT_RECLAIM in general maybe.

- Finally there's clearly some gaps in the current dma_fence driver interfaces: Amdgpu's hang recovery is essentially impossible to fix as-is - it needs to reset the display state and you can't get at modeset locks from tdr without deadlock potential. i915 has an internal trick (but it stops working once we involve real cross-driver fences) for this issue, but then for i915 modeset reset is only needed on very ancient gen2/3. Modern hw is a lot more reasonable.

  I'm kinda hoping that the annotations and priming for basic command submission and atomic modeset paths could be merged soonish, while the tdr side clearly needs a pile more work to get going. But since we have to explicitly annotate all code paths anyway we can hide bugs in e.g. tdr code by simply not yet annotating those functions.

  I'm trying to lay out at least one idea for solving the tdr issue in the patch titled "drm/scheduler: use dma-fence annotations in tdr work".

Finally, once we have some agreement on where we're going with all this, we also need some documentation. Currently that's missing because I don't want to re-edit the text all the time while we still figure out the details of the exact cross-driver semantics.

My goal here is that with this we can lock down the cross-driver contract for the last bit of the dma_buf/resv/fence story and make sure this stops being such a wobbly thing where everyone just does whatever they feel like.

Ideas, thoughts, reviews, testing (with specific annotations for that driver) on other drivers very much welcome.

Cheers, Daniel

Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: linux-rdma(a)vger.kernel.org
Cc: amd-gfx(a)lists.freedesktop.org
Cc: intel-gfx(a)lists.freedesktop.org
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Christian König <christian.koenig(a)amd.com>

Daniel Vetter (17):
  dma-fence: add might_sleep annotation to _wait()
  dma-fence: basic lockdep annotations
  dma-fence: prime lockdep annotations
  drm/vkms: Annotate vblank timer
  drm/vblank: Annotate with dma-fence signalling section
  drm/atomic-helper: Add dma-fence annotations
  drm/amdgpu: add dma-fence annotations to atomic commit path
  drm/scheduler: use dma-fence annotations in main thread
  drm/amdgpu: use dma-fence annotations in cs_submit()
  drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
  drm/amdgpu: DC also loves to allocate stuff where it shouldn't
  drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
  drm/scheduler: use dma-fence annotations in tdr work
  drm/amdgpu: use dma-fence annotations for gpu reset code
  Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
  drm/amdgpu: gpu recovery does full modesets
  drm/i915: Annotate dma_fence_work

 drivers/dma-buf/dma-fence.c                   | 56 +++++++++++++++++++
 drivers/dma-buf/dma-resv.c                    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 22 ++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/atom.c             |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 +++++-
 drivers/gpu/drm/amd/display/dc/core/dc.c      |  4 +-
 drivers/gpu/drm/drm_atomic_helper.c           | 16 ++++++
 drivers/gpu/drm/drm_vblank.c                  |  8 ++-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |  3 +
 drivers/gpu/drm/scheduler/sched_main.c        | 11 ++++
 drivers/gpu/drm/vkms/vkms_crtc.c              |  8 ++-
 include/linux/dma-fence.h                     | 13 +++++
 16 files changed, 160 insertions(+), 13 deletions(-)

-- 
2.26.2

5 years, 1 month

Re: [Linaro-mm-sig] [RFC 02/17] dma-fence: basic lockdep annotations

by Daniel Vetter

On Thu, May 28, 2020 at 11:54 PM Luben Tuikov <luben.tuikov(a)amd.com> wrote:
>
> On 2020-05-12 4:59 a.m., Daniel Vetter wrote:
> > Design is similar to the lockdep annotations for workers, but with
> > some twists:
> >
> > - We use a read-lock for the execution/worker/completion side, so that
> >   this explicit annotation can be more liberally sprinkled around.
> >   With read locks lockdep isn't going to complain if the read-side
> >   isn't nested the same way under all circumstances, so ABBA deadlocks
> >   are ok. Which they are, since this is an annotation only.
> >
> > - We're using non-recursive lockdep read lock mode, since in recursive
> >   read lock mode lockdep does not catch read side hazards. And we
> >   _very_ much want read side hazards to be caught. For full details of
> >   this limitation see
> >
> >   commit e91498589746065e3ae95d9a00b068e525eec34f
> >   Author: Peter Zijlstra <peterz(a)infradead.org>
> >   Date:   Wed Aug 23 13:13:11 2017 +0200
> >
> >       locking/lockdep/selftests: Add mixed read-write ABBA tests
> >
> > - To allow nesting of the read-side explicit annotations we explicitly
> >   keep track of the nesting. lock_is_held() allows us to do that.
> >
> > - The wait-side annotation is a write lock, and entirely done within
> >   dma_fence_wait() for everyone by default.
> >
> > - To be able to freely annotate helper functions I want to make it ok
> >   to call dma_fence_begin/end_signalling from soft/hardirq context.
> >   First attempt was using the hardirq locking context for the write
> >   side in lockdep, but this forces all normal spinlocks nested within
> >   dma_fence_begin/end_signalling to be spinlocks. That bollocks.
> >
> >   The approach now is to simple check in_atomic(), and for these cases
> >   entirely rely on the might_sleep() check in dma_fence_wait(). That
> >   will catch any wrong nesting against spinlocks from soft/hardirq
> >   contexts.
> >
> > The idea here is that every code path that's critical for eventually
> > signalling a dma_fence should be annotated with
> > dma_fence_begin/end_signalling. The annotation ideally starts right
> > after a dma_fence is published (added to a dma_resv, exposed as a
> > sync_file fd, attached to a drm_syncobj fd, or anything else that
> > makes the dma_fence visible to other kernel threads), up to and
> > including the dma_fence_wait(). Examples are irq handlers, the
> > scheduler rt threads, the tail of execbuf (after the corresponding
> > fences are visible), any workers that end up signalling dma_fences and
> > really anything else. Not annotated should be code paths that only
> > complete fences opportunistically as the gpu progresses, like e.g.
> > shrinker/eviction code.
> >
> > The main class of deadlocks this is supposed to catch are:
> >
> > Thread A:
> >
> > 	mutex_lock(A);
> > 	mutex_unlock(A);
> >
> > 	dma_fence_signal();
> >
> > Thread B:
> >
> > 	mutex_lock(A);
> > 	dma_fence_wait();
> > 	mutex_unlock(A);
> >
> > Thread B is blocked on A signalling the fence, but A never gets around
> > to that because it cannot acquire the lock A.
> >
> > Note that dma_fence_wait() is allowed to be nested within
> > dma_fence_begin/end_signalling sections. To allow this to happen the
> > read lock needs to be upgraded to a write lock, which means that any
> > other lock is acquired between the dma_fence_begin_signalling() call and
> > the call to dma_fence_wait(), and still held, this will result in an
> > immediate lockdep complaint. The only other option would be to not
> > annotate such calls, defeating the point. Therefore these annotations
> > cannot be sprinkled over the code entirely mindless to avoid false
> > positives.
> >
> > v2: handle soft/hardirq ctx better against write side and dont forget
> > EXPORT_SYMBOL, drivers can't use this otherwise.
> >
> > Cc: linux-media(a)vger.kernel.org
> > Cc: linaro-mm-sig(a)lists.linaro.org
> > Cc: linux-rdma(a)vger.kernel.org
> > Cc: amd-gfx(a)lists.freedesktop.org
> > Cc: intel-gfx(a)lists.freedesktop.org
> > Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
> > Cc: Christian König <christian.koenig(a)amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
> > ---
> >  drivers/dma-buf/dma-fence.c | 53 +++++++++++++++++++++++++++++++++++++
> >  include/linux/dma-fence.h   | 12 +++++++++
> >  2 files changed, 65 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 6802125349fb..d5c0fd2efc70 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -110,6 +110,52 @@ u64 dma_fence_context_alloc(unsigned num)
> >  }
> >  EXPORT_SYMBOL(dma_fence_context_alloc);
> >
> > +#ifdef CONFIG_LOCKDEP
> > +struct lockdep_map	dma_fence_lockdep_map = {
> > +	.name = "dma_fence_map"
> > +};
> > +
> > +bool dma_fence_begin_signalling(void)
> > +{
> > +	/* explicitly nesting ... */
> > +	if (lock_is_held_type(&dma_fence_lockdep_map, 1))
> > +		return true;
> > +
> > +	/* rely on might_sleep check for soft/hardirq locks */
> > +	if (in_atomic())
> > +		return true;
> > +
> > +	/* ... and non-recursive readlock */
> > +	lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _RET_IP_);
> > +
> > +	return false;
> > +}
> > +EXPORT_SYMBOL(dma_fence_begin_signalling);
>
> Hi Daniel,
>
> This is great work and could help a lot.
>
> If you invert the result of dma_fence_begin_signalling()
> then it would naturally mean "locked", i.e. whether we need to
> later release "dma_fence_lockedep_map". Then,
> in dma_fence_end_signalling(), you can call the "cookie"
> argument "locked" and simply do:
>
> void dma_fence_end_signalling(bool locked)
> {
> 	if (locked)
> 		lock_release(&dma_fence_lockdep_map, _RET_IP_);
> }
> EXPORT_SYMBOL(dma_fence_end_signalling);
>
> It'll be more natural to understand as well.

It's intentionally called cookie so callers don't start doing funny stuff with it. The thing is, after begin_signalling you are _always_ in the locked state. It's just that because of limitations with lockdep we need to play a few tricks, and in some cases we do not take the lockdep map. There's 2 cases:

- lockdep map already taken - we want recursive readlock semantics for this, but lockdep does not correctly check recursive read locks. Hence we only use readlock, and make sure we do not actually nest upon ourselves with this explicit check.

- when we're in atomic sections - lockdep gets pissed at us if we take the read lock in hard/softirq sections because of hard/softirq ctx mismatch (lockdep thinks it's a real lock, but we don't treat it as one). Simplest fix was to rely on the might_sleep check in patch 1 (already merged)

The commit message mentions this already a bit, but I'll try to explain this implementation detail tersely in the kerneldoc too in the next round.

Thanks, Daniel

>
> Regards,
> Luben
>
> > +
> > +void dma_fence_end_signalling(bool cookie)
> > +{
> > +	if (cookie)
> > +		return;
> > +
> > +	lock_release(&dma_fence_lockdep_map, _RET_IP_);
> > +}
> > +EXPORT_SYMBOL(dma_fence_end_signalling);
> > +
> > +void __dma_fence_might_wait(void)
> > +{
> > +	bool tmp;
> > +
> > +	tmp = lock_is_held_type(&dma_fence_lockdep_map, 1);
> > +	if (tmp)
> > +		lock_release(&dma_fence_lockdep_map, _THIS_IP_);
> > +	lock_map_acquire(&dma_fence_lockdep_map);
> > +	lock_map_release(&dma_fence_lockdep_map);
> > +	if (tmp)
> > +		lock_acquire(&dma_fence_lockdep_map, 0, 0, 1, 1, NULL, _THIS_IP_);
> > +}
> > +#endif
> > +
> > +
> >  /**
> >   * dma_fence_signal_locked - signal completion of a fence
> >   * @fence: the fence to signal
> > @@ -170,14 +216,19 @@ int dma_fence_signal(struct dma_fence *fence)
> >  {
> >  	unsigned long flags;
> >  	int ret;
> > +	bool tmp;
> >
> >  	if (!fence)
> >  		return -EINVAL;
> >
> > +	tmp = dma_fence_begin_signalling();
> > +
> >  	spin_lock_irqsave(fence->lock, flags);
> >  	ret = dma_fence_signal_locked(fence);
> >  	spin_unlock_irqrestore(fence->lock, flags);
> >
> > +	dma_fence_end_signalling(tmp);
> > +
> >  	return ret;
> >  }
> >  EXPORT_SYMBOL(dma_fence_signal);
> > @@ -211,6 +262,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
> >  	if (timeout > 0)
> >  		might_sleep();
> >
> > +	__dma_fence_might_wait();
> > +
> >  	trace_dma_fence_wait_start(fence);
> >  	if (fence->ops->wait)
> >  		ret = fence->ops->wait(fence, intr, timeout);
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index 3347c54f3a87..3f288f7db2ef 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -357,6 +357,18 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
> >  	} while (1);
> >  }
> >
> > +#ifdef CONFIG_LOCKDEP
> > +bool dma_fence_begin_signalling(void);
> > +void dma_fence_end_signalling(bool cookie);
> > +#else
> > +static inline bool dma_fence_begin_signalling(void)
> > +{
> > +	return true;
> > +}
> > +static inline void dma_fence_end_signalling(bool cookie) {}
> > +static inline void __dma_fence_might_wait(void) {}
> > +#endif
> > +
> >  int dma_fence_signal(struct dma_fence *fence);
> >  int dma_fence_signal_locked(struct dma_fence *fence);
> >  signed long dma_fence_default_wait(struct dma_fence *fence,
> >
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

5 years, 1 month
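
The cookie semantics discussed in the reply above can be illustrated with a minimal userspace sketch. This is not the kernel implementation: `model_map_held` and `model_in_atomic` are hypothetical stand-ins for `lock_is_held_type(&dma_fence_lockdep_map, 1)` and `in_atomic()`, and every `model_*` name is invented for illustration. It only shows why the return value is an opaque cookie ("nothing was acquired here, so do not release") instead of a "locked" flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; assumptions, not kernel API. */
bool model_map_held;   /* models lock_is_held_type(&dma_fence_lockdep_map, 1) */
bool model_in_atomic;  /* models in_atomic() */

/*
 * Returns a cookie. true means this call acquired nothing, so the
 * matching end call must not release. The caller is inside a
 * signalling section either way.
 */
bool model_begin_signalling(void)
{
	/* Explicit nesting: an outer section already holds the map. */
	if (model_map_held)
		return true;

	/* Atomic context: the patch relies on might_sleep() instead. */
	if (model_in_atomic)
		return true;

	/* Models the non-recursive read acquire. */
	model_map_held = true;
	return false;
}

void model_end_signalling(bool cookie)
{
	if (cookie)
		return;

	/* Models lock_release(): only the outermost section releases. */
	model_map_held = false;
}
```

In this model a nested begin/end pair leaves the outer section intact, and only the end call that receives the outermost cookie drops the map, which mirrors the two special cases listed in the reply.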

Re: [Linaro-mm-sig] [RFC 02/17] dma-fence: basic lockdep annotations

by Daniel Vetter

On Thu, May 28, 2020 at 3:37 PM Thomas Hellström (Intel) <thomas_os(a)shipmail.org> wrote:
>
> On 2020-05-12 10:59, Daniel Vetter wrote:
> > Design is similar to the lockdep annotations for workers, but with
> > some twists:
> >
> > - We use a read-lock for the execution/worker/completion side, so that
> >   this explicit annotation can be more liberally sprinkled around.
> >   With read locks lockdep isn't going to complain if the read-side
> >   isn't nested the same way under all circumstances, so ABBA deadlocks
> >   are ok. Which they are, since this is an annotation only.
> >
> > - We're using non-recursive lockdep read lock mode, since in recursive
> >   read lock mode lockdep does not catch read side hazards. And we
> >   _very_ much want read side hazards to be caught. For full details of
> >   this limitation see
> >
> >   commit e91498589746065e3ae95d9a00b068e525eec34f
> >   Author: Peter Zijlstra <peterz(a)infradead.org>
> >   Date:   Wed Aug 23 13:13:11 2017 +0200
> >
> >       locking/lockdep/selftests: Add mixed read-write ABBA tests
> >
> > - To allow nesting of the read-side explicit annotations we explicitly
> >   keep track of the nesting. lock_is_held() allows us to do that.
> >
> > - The wait-side annotation is a write lock, and entirely done within
> >   dma_fence_wait() for everyone by default.
> >
> > - To be able to freely annotate helper functions I want to make it ok
> >   to call dma_fence_begin/end_signalling from soft/hardirq context.
> >   First attempt was using the hardirq locking context for the write
> >   side in lockdep, but this forces all normal spinlocks nested within
> >   dma_fence_begin/end_signalling to be spinlocks. That bollocks.
> >
> >   The approach now is to simple check in_atomic(), and for these cases
> >   entirely rely on the might_sleep() check in dma_fence_wait(). That
> >   will catch any wrong nesting against spinlocks from soft/hardirq
> >   contexts.
> >
> > The idea here is that every code path that's critical for eventually
> > signalling a dma_fence should be annotated with
> > dma_fence_begin/end_signalling. The annotation ideally starts right
> > after a dma_fence is published (added to a dma_resv, exposed as a
> > sync_file fd, attached to a drm_syncobj fd, or anything else that
> > makes the dma_fence visible to other kernel threads), up to and
> > including the dma_fence_wait(). Examples are irq handlers, the
> > scheduler rt threads, the tail of execbuf (after the corresponding
> > fences are visible), any workers that end up signalling dma_fences and
> > really anything else. Not annotated should be code paths that only
> > complete fences opportunistically as the gpu progresses, like e.g.
> > shrinker/eviction code.
> >
> > The main class of deadlocks this is supposed to catch are:
> >
> > Thread A:
> >
> > 	mutex_lock(A);
> > 	mutex_unlock(A);
> >
> > 	dma_fence_signal();
> >
> > Thread B:
> >
> > 	mutex_lock(A);
> > 	dma_fence_wait();
> > 	mutex_unlock(A);
> >
> > Thread B is blocked on A signalling the fence, but A never gets around
> > to that because it cannot acquire the lock A.
> >
> > Note that dma_fence_wait() is allowed to be nested within
> > dma_fence_begin/end_signalling sections. To allow this to happen the
> > read lock needs to be upgraded to a write lock, which means that any
> > other lock is acquired between the dma_fence_begin_signalling() call and
> > the call to dma_fence_wait(), and still held, this will result in an
> > immediate lockdep complaint. The only other option would be to not
> > annotate such calls, defeating the point. Therefore these annotations
> > cannot be sprinkled over the code entirely mindless to avoid false
> > positives.
> >
> > v2: handle soft/hardirq ctx better against write side and dont forget
> > EXPORT_SYMBOL, drivers can't use this otherwise.
> >
> > Cc: linux-media(a)vger.kernel.org
> > Cc: linaro-mm-sig(a)lists.linaro.org
> > Cc: linux-rdma(a)vger.kernel.org
> > Cc: amd-gfx(a)lists.freedesktop.org
> > Cc: intel-gfx(a)lists.freedesktop.org
> > Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
> > Cc: Christian König <christian.koenig(a)amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
>
> LGTM. Perhaps some in-code documentation on how to use the new functions
> are called.

See cover letter, that's going to be done for next round. For this one here I just wanted to showcase a bit how it's used in a few different places, mostly selected to get as much feedback from across different drivers. Hence e.g. annotating drm/scheduler.

> Otherwise for patch 2 and 3,
>
> Reviewed-by: Thomas Hellstrom <thomas.hellstrom(a)intel.com>

I think I'll just cc you for the next round with docs, so you can make sure it looks ok :-)
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

5 years, 1 month
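
The Thread A / Thread B deadlock class quoted above can be modelled with a very small dependency tracker. This is a sketch of the general idea behind lock-order checking, not lockdep's actual algorithm or data structures; all `model_*` and `MODEL_*` names are hypothetical. It records "Y was taken while X was held" edges and refuses an edge that would close a cycle.

```c
#include <stdbool.h>

/* Hypothetical lock classes for the example in the commit message. */
enum { MODEL_MUTEX_A, MODEL_FENCE, MODEL_NCLASSES };

struct model_lockdep {
	/* dep[x][y] is true when y was acquired while x was held. */
	bool dep[MODEL_NCLASSES][MODEL_NCLASSES];
};

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool model_reachable(const struct model_lockdep *ld, int from, int to,
			    bool *seen)
{
	if (from == to)
		return true;

	seen[from] = true;
	for (int next = 0; next < MODEL_NCLASSES; next++) {
		if (ld->dep[from][next] && !seen[next] &&
		    model_reachable(ld, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquired while held". Returns false when the inverse order
 * is already known, i.e. when the new edge would close a cycle.
 */
bool model_record(struct model_lockdep *ld, int held, int acquired)
{
	bool seen[MODEL_NCLASSES] = { false };

	if (model_reachable(ld, acquired, held, seen))
		return false;

	ld->dep[held][acquired] = true;
	return true;
}
```

Under this model, Thread A taking mutex A inside a signalling section records the edge fence -> A, and Thread B calling the wait while holding A then attempts A -> fence, which is rejected. That is the kind of report the annotations are meant to produce without the deadlock having to occur at runtime.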

Re: [Linaro-mm-sig] [PATCH v4 0/3] Support virtio cross-device resources

by Gerd Hoffmann

On Tue, May 26, 2020 at 07:58:08PM +0900, David Stevens wrote:
> This patchset implements the current proposal for virtio cross-device
> resource sharing [1]. It will be used to import virtio resources into
> the virtio-video driver currently under discussion [2]. The patch
> under consideration to add support in the virtio-video driver is [3].
> It uses the APIs from v3 of this series, but the changes to update it
> are relatively minor.
>
> This patchset adds a new flavor of dma-bufs that supports querying the
> underlying virtio object UUID, as well as adding support for exporting
> resources from virtgpu.
>
> [1] https://markmail.org/thread/2ypjt5cfeu3m6lxu
> [2] https://markmail.org/thread/p5d3k566srtdtute
> [3] https://markmail.org/thread/j4xlqaaim266qpks
>
> v3 -> v4 changes:
>  - Replace dma-buf hooks with virtio dma-buf from v1.
>  - Remove virtio_attach callback, as the work that had been done
>    in that callback is now done on dma-buf export. The documented
>    requirement that get_uuid only be called on attached virtio
>    dma-bufs is also removed.
>  - Rebase and add call to virtio_gpu_notify for ASSIGN_UUID.
>
> David Stevens (3):
>   virtio: add dma-buf support for exported objects
>   virtio-gpu: add VIRTIO_GPU_F_RESOURCE_UUID feature
>   drm/virtio: Support virtgpu exported resources

Looks all sane to me. mst, have you looked at the virtio core changes?

How are we going to merge this? If you ack I can merge via drm-misc-next. Merging through virtio queue would be fine too.

thanks,
  Gerd

5 years, 1 month

[PATCH v5 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse

by Marek Szyprowski

Dear All,

During the Exynos DRM GEM rework and fixing the issues in the drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most drivers in the DRM framework incorrectly use the nents and orig_nents entries of the struct sg_table.

For most DMA-mapping implementations, exchanging those two entries or using nents for all loops on the scatterlist is harmless, because they both have the same value. There exist, however, DMA-mapping implementations for which such incorrect usage breaks things. The nents returned by dma_map_sg() might be lower than the nents passed as its parameter and this is perfectly fine. The DMA framework or IOMMU is allowed to join consecutive chunks while mapping if such an operation is supported by the underlying HW (bus, bridge, IOMMU, etc). An example of the case where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages is described here [2].

The DMA-mapping framework documentation [3] states that dma_map_sg() returns the number of the created entries in the DMA address space. However the subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be called with the original number of entries passed to dma_map_sg. The common pattern in DRM drivers was to assign the dma_map_sg() return value to sg_table->nents and use that value for the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also the code iterated over nents times to access the pages stored in the processed scatterlist, while it should use orig_nents as the number of the page entries.

I've tried to identify all such incorrect usage of sg_table->nents and this is a result of my research. It looks like the incorrect pattern has been copied over many drivers, mainly in the DRM subsystem. Too bad in most cases it even worked correctly if the system used a simple, linear DMA-mapping implementation, for which swapping nents and orig_nents doesn't make any difference.

To avoid similar issues in the future, I've introduced common wrappers for DMA-mapping calls, which operate directly on the sg_table objects. I've also added wrappers for iterating over the scatterlists stored in the sg_table objects and applied them where possible. This, together with some common DRM prime helpers, allowed me to almost get rid of all nents/orig_nents usage in the drivers. I hope that such a change makes the code robust, easier to follow and copy/paste safe.

The biggest TODO is the DRM/i915 driver and I don't feel brave enough to fix it fully. The driver creatively uses sg_table->orig_nents to store the size of the allocated scatterlist and ignores the number of the entries returned by the dma_map_sg function. In this patchset I only fixed the sg_table objects exported by dmabuf related functions. I hope that I didn't break anything there.

Patches are based on top of Linux next-20200512.

Christoph Hellwig already offered to take patches 1-3 into his immutable branch [4]. If possible I would like to ask for merging most of the remaining patches via the DRM tree (on top of that immutable branch).
Best regards,
Marek Szyprowski

References:

[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
[4] https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/#ma18c9…

Changelog:

v5:
- fixed some minor style issues and typos
- fixed lack of the attrs argument in ion, dmabuf, rapidio, fastrpc and vfio patches

v4: https://lore.kernel.org/linux-iommu/20200512121931.GD20393@lst.de/T/
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit

v3: https://lore.kernel.org/dri-devel/20200505083926.28503-1-m.szyprowski@samsu…
- introduce dma_*_sgtable_* wrappers and use them in all patches

v2: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch

v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@ar…
- initial version

Patch summary:

Marek Szyprowski (38):
  dma-mapping: add generic helpers for mapping sgtable objects
  scatterlist: add generic wrappers for iterating over sgtable objects
  iommu: add generic helper for mapping sgtable objects
  drm: prime: add common helper to check scatterlist contiguity
  drm: prime: use sgtable iterators in drm_prime_sg_to_page_addr_arrays()
  drm: core: fix common struct sg_table related issues
  drm: amdgpu: fix common struct sg_table related issues
  drm: armada: fix common struct sg_table related issues
  drm: etnaviv: fix common struct sg_table related issues
  drm: exynos: use common helper for a scatterlist contiguity check
  drm: exynos: fix common struct sg_table related issues
  drm: i915: fix common struct sg_table related issues
  drm: lima: fix common struct sg_table related issues
  drm: mediatek: use common helper for a scatterlist contiguity check
  drm: mediatek: use common helper for extracting pages array
  drm: msm: fix common struct sg_table related issues
  drm: omapdrm: use common helper for extracting pages array
  drm: omapdrm: fix common struct sg_table related issues
  drm: panfrost: fix common struct sg_table related issues
  drm: radeon: fix common struct sg_table related issues
  drm: rockchip: use common helper for a scatterlist contiguity check
  drm: rockchip: fix common struct sg_table related issues
  drm: tegra: fix common struct sg_table related issues
  drm: v3d: fix common struct sg_table related issues
  drm: virtio: fix common struct sg_table related issues
  drm: vmwgfx: fix common struct sg_table related issues
  xen: gntdev: fix common struct sg_table related issues
  drm: host1x: fix common struct sg_table related issues
  drm: rcar-du: fix common struct sg_table related issues
  dmabuf: fix common struct sg_table related issues
  staging: ion: remove dead code
  staging: ion: fix common struct sg_table related issues
  staging: tegra-vde: fix common struct sg_table related issues
  misc: fastrpc: fix common struct sg_table related issues
  rapidio: fix common struct sg_table related issues
  samples: vfio-mdev/mbochs: fix common struct sg_table related issues
  media: pci: fix common ALSA DMA-mapping related codes
  videobuf2: use sgtable-based scatterlist wrappers

 drivers/dma-buf/heaps/heap-helpers.c               | 13 ++--
 drivers/dma-buf/udmabuf.c                          |  7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c        |  6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c            |  9 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c       |  8 +-
 drivers/gpu/drm/armada/armada_gem.c                | 12 +--
 drivers/gpu/drm/drm_cache.c                        |  2 +-
 drivers/gpu/drm/drm_gem_cma_helper.c               | 23 +-----
 drivers/gpu/drm/drm_gem_shmem_helper.c             | 14 ++--
 drivers/gpu/drm/drm_prime.c                        | 86 ++++++++++++----------
 drivers/gpu/drm/etnaviv/etnaviv_gem.c              | 12 ++-
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c              | 13 +---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c            | 10 +--
 drivers/gpu/drm/exynos/exynos_drm_gem.c            | 23 +-----
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c         | 11 +--
 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c   |  7 +-
 drivers/gpu/drm/lima/lima_gem.c                    | 11 ++-
 drivers/gpu/drm/lima/lima_vm.c                     |  5 +-
 drivers/gpu/drm/mediatek/mtk_drm_gem.c             | 34 ++-------
 drivers/gpu/drm/msm/msm_gem.c                      | 13 ++--
 drivers/gpu/drm/msm/msm_gpummu.c                   | 14 ++--
 drivers/gpu/drm/msm/msm_iommu.c                    |  2 +-
 drivers/gpu/drm/omapdrm/omap_gem.c                 | 20 ++---
 drivers/gpu/drm/panfrost/panfrost_gem.c            |  4 +-
 drivers/gpu/drm/panfrost/panfrost_mmu.c            |  7 +-
 drivers/gpu/drm/radeon/radeon_ttm.c                | 11 ++-
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c              |  3 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c        | 42 +++--------
 drivers/gpu/drm/tegra/gem.c                        | 27 +++----
 drivers/gpu/drm/tegra/plane.c                      | 15 ++--
 drivers/gpu/drm/v3d/v3d_mmu.c                      | 17 ++---
 drivers/gpu/drm/virtio/virtgpu_object.c            | 36 +++++----
 drivers/gpu/drm/virtio/virtgpu_vq.c                | 12 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c         | 17 +----
 drivers/gpu/host1x/job.c                           | 22 ++----
 .../media/common/videobuf2/videobuf2-dma-contig.c  | 41 +++++------
 drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 32 +++-----
 drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +--
 drivers/media/pci/cx23885/cx23885-alsa.c           |  2 +-
 drivers/media/pci/cx25821/cx25821-alsa.c           |  2 +-
 drivers/media/pci/cx88/cx88-alsa.c                 |  2 +-
 drivers/media/pci/saa7134/saa7134-alsa.c           |  2 +-
 drivers/media/platform/vsp1/vsp1_drm.c             |  8 +-
 drivers/misc/fastrpc.c                             |  4 +-
 drivers/rapidio/devices/rio_mport_cdev.c           |  8 +-
 drivers/staging/android/ion/ion.c                  | 25 +++----
 drivers/staging/android/ion/ion.h                  |  1 -
 drivers/staging/android/ion/ion_heap.c             | 53 ++++---------
 drivers/staging/android/ion/ion_system_heap.c      |  2 +-
 drivers/staging/media/tegra-vde/iommu.c            |  4 +-
 drivers/xen/gntdev-dmabuf.c                        | 13 ++--
 include/drm/drm_prime.h                            |  2 +
 include/linux/dma-mapping.h                        | 78 ++++++++++++++++++++
 include/linux/iommu.h                              | 16 ++++
 include/linux/scatterlist.h                        | 50 ++++++++++++-
 samples/vfio-mdev/mbochs.c                         |  3 +-
 56 files changed, 451 insertions(+), 477 deletions(-)

-- 
1.9.1

5 years, 1 month

Re: [Linaro-mm-sig] [Intel-gfx] [PATCH 3/3] misc/habalabs: don't set default fence_ops->wait

by Daniel Vetter

On Thu, May 14, 2020 at 02:38:38PM +0300, Oded Gabbay wrote:
> On Tue, May 12, 2020 at 9:12 AM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> >
> > On Tue, May 12, 2020 at 4:14 AM Dave Airlie <airlied(a)gmail.com> wrote:
> > >
> > > On Mon, 11 May 2020 at 19:37, Oded Gabbay <oded.gabbay(a)gmail.com> wrote:
> > > >
> > > > On Mon, May 11, 2020 at 12:11 PM Daniel Vetter <daniel.vetter(a)ffwll.ch> wrote:
> > > > >
> > > > > It's the default.
> > > > Thanks for catching that.
> > > >
> > > > >
> > > > > Also so much for "we're not going to tell the graphics people how to
> > > > > review their code", dma_fence is a pretty core piece of gpu driver
> > > > > infrastructure. And it's very much uapi relevant, including piles of
> > > > > corresponding userspace protocols and libraries for how to pass these
> > > > > around.
> > > > >
> > > > > Would be great if habanalabs would not use this (from a quick look
> > > > > it's not needed at all), since open source the userspace and playing
> > > > > by the usual rules isn't on the table. If that's not possible (because
> > > > > it's actually using the uapi part of dma_fence to interact with gpu
> > > > > drivers) then we have exactly what everyone promised we'd want to
> > > > > avoid.
> > > >
> > > > We don't use the uapi parts, we currently only using the fencing and
> > > > signaling ability of this module inside our kernel code. But maybe I
> > > > didn't understand what you request. You want us *not* to use this
> > > > well-written piece of kernel code because it is only used by graphics
> > > > drivers ?
> > > > I'm sorry but I don't get this argument, if this is indeed what you meant.
> > >
> > > We would rather drivers using a feature that has requirements on
> > > correct userspace implementations of the feature have a userspace that
> > > is open source and auditable.
> > >
> > > Fencing is tricky, cross-device fencing is really tricky, and having
> > > the ability for a closed userspace component to mess up other people's
> > > drivers, think i915 shared with closed habana userspace and shared
> > > fences, decreases ability to debug things.
> > >
> > > Ideally we wouldn't offer users known untested/broken scenarios, so
> > > yes we'd prefer that drivers that intend to expose a userspace fencing
> > > api around dma-fence would adhere to the rules of the gpu drivers.
> > >
> > > I'm not say you have to drop using dma-fence, but if you move towards
> > > cross-device stuff I believe other drivers would be correct in
> > > refusing to interact with fences from here.
> >
> > The flip side is if you only used dma-fence.c "because it's there",
> > and not because it comes with an uapi attached and a cross-driver
> > kernel internal contract for how to interact with gpu drivers, then
> > there's really not much point in using it. It's a custom-rolled
> > wait_queue/event thing, that's all. Without the gpu uapi and gpu
> > cross-driver contract it would be much cleaner to just use wait_queue
> > directly, and that's a construct all kernel developers understand, not
> > just gpu folks. From a quick look at least habanalabs doesn't use any
> > of these uapi/cross-driver/gpu bits.
> > -Daniel
>
> Hi Daniel,
> I want to say explicitly that we don't use the dma-buf uapi parts, nor
> we intend to use them to communicate with any GPU device. We only use
> it as simple completion mechanism as it was convenient to use.
> I do understand I can exchange that mechanism with a simpler one, and
> I will add an internal task to do it (albeit not in a very high
> priority) and upstream it, its just that it is part of our data path
> so we need to thoroughly validate it first.

Sounds good. Wrt merging this patch here, can you include that in one
of your next pulls? Or should I toss it entirely, waiting for you to
remove dma_fence outright?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
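
For readers following the thread: the "simple completion mechanism" Oded describes, and the plain wait-queue style construct Daniel suggests in its place, can be pictured with a small userspace analogue. This is only a sketch under stated assumptions. It uses POSIX threads rather than kernel primitives, and every `toy_*` name is hypothetical; in the kernel the corresponding building blocks would be wait queues or `struct completion`, not this code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of a one-shot completion. */
struct toy_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void toy_completion_init(struct toy_completion *c)
{
	assert(c != NULL);
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Producer side: mark the work finished and wake every waiter. */
static void toy_complete(struct toy_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Non-blocking peek at the state. */
static bool toy_completion_done(struct toy_completion *c)
{
	bool done;

	pthread_mutex_lock(&c->lock);
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}

/* Consumer side: block until toy_complete() has been called. */
static void toy_wait_for_completion(struct toy_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}
```

Nothing in this sketch carries a cross-driver contract or a userspace-visible handle, which is the point made in the thread: a driver that only needs "signal once, wait for it" does not need the extra semantics that come with dma_fence.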

5 years, 1 month

[PATCH] dma-fence: add might_sleep annotation to _wait()

by Daniel Vetter

Do it unconditionally, there's a separate peek function with
dma_fence_is_signaled() which can be called from atomic context.

v2: Consensus calls for an unconditional might_sleep (Chris, Christian)

Full audit:
- dma-fence.h: Uses MAX_SCHEDULE_TIMEOUT, good chance this sleeps
- dma-resv.c: Timeout always at least 1
- st-dma-fence.c: Safe to sleep in testcases
- amdgpu_cs.c: Both callers are for variants of the wait ioctl
- amdgpu_device.c: Two callers in vram recover code, both right next to
  mutex_lock.
- amdgpu_vm.c: Use in the vm_wait ioctl, next to _reserve/unreserve
- remaining functions in amdgpu: All for test_ib implementations for
  various engines, caller for that looks all safe (debugfs, driver
  load, reset)
- etnaviv: another wait ioctl
- habanalabs: another wait ioctl
- nouveau_fence.c: hardcoded 15*HZ ... glorious
- nouveau_gem.c: hardcoded 2*HZ ... so not even super consistent, but
  this one does have a WARN_ON :-/ At least this one is only a
  fallback path for when kmalloc fails. Maybe this should be put onto
  some worker list instead of a work per unmap ...
- i915/selftests: Hardcoded HZ / 4 or HZ / 8
- i915/gt/selftests: Going up the callchain looks safe looking at
  nearby callers
- i915/gt/intel_gt_requests.c. Wrapped in a mutex_lock
- i915/gem_i915_gem_wait.c: The i915-version which is called instead
  for i915 fences already has a might_sleep() annotation, so all good

Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Lucas Stach <l.stach(a)pengutronix.de>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: "VMware Graphics" <linux-graphics-maintainer(a)vmware.com>
Cc: Oded Gabbay <oded.gabbay(a)gmail.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: linux-rdma(a)vger.kernel.org
Cc: amd-gfx(a)lists.freedesktop.org
Cc: intel-gfx(a)lists.freedesktop.org
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
 drivers/dma-buf/dma-fence.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 90edf2b281b0..656e9ac2d028 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -208,6 +208,8 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
 	if (WARN_ON(timeout < 0))
 		return -EINVAL;
 
+	might_sleep();
+
 	trace_dma_fence_wait_start(fence);
 	if (fence->ops->wait)
 		ret = fence->ops->wait(fence, intr, timeout);
-- 
2.26.2
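
The reason for annotating unconditionally, before any fast path, can be illustrated with a userspace toy model. This is a sketch under loud assumptions: none of the `toy_*` names are kernel APIs, "atomic context" is modelled by a plain flag, and the toy counts violations instead of printing a warning the way the kernel's debug checks would.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
struct toy_fence {
	bool signaled;
};

/* Models "we are in atomic context", e.g. while holding a spinlock. */
static bool toy_in_atomic;

/* Counts how often a sleeping call was attempted from atomic context. */
static int toy_atomic_sleep_warnings;

/* Toy analogue of might_sleep(): record the violation, never block. */
static void toy_might_sleep(void)
{
	if (toy_in_atomic)
		toy_atomic_sleep_warnings++;
}

/* Non-blocking peek: fine from any context, so it is not annotated. */
static bool toy_fence_is_signaled(const struct toy_fence *f)
{
	assert(f != NULL);
	return f->signaled;
}

/*
 * Blocking wait: annotated before the "already signaled" fast path, so
 * a caller in atomic context is flagged even when no sleep would have
 * happened on this particular call.
 */
static long toy_fence_wait_timeout(struct toy_fence *f, long timeout)
{
	assert(f != NULL);
	if (timeout < 0)
		return -EINVAL;

	toy_might_sleep();

	if (f->signaled)
		return timeout ? timeout : 1;
	/* A real implementation would sleep here; the toy just times out. */
	return 0;
}
```

With a fence that is already signaled, a wait from "atomic context" still bumps the warning counter, while the peek never does. That mirrors the commit's intent: the bug is reported on every run, not only on the rare run where the fence is still pending.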

5 years, 1 month

Re: [Linaro-mm-sig] [PATCH] ASoC: fsl: imx-pcm-dma: Don't request dma channel in probe

by Lucas Stach

On Wednesday, 20.05.2020 at 16:20 +0800, Shengjiu Wang wrote:
> Hi
>
> On Tue, May 19, 2020 at 6:04 PM Lucas Stach <l.stach(a)pengutronix.de> wrote:
> > On Tuesday, 19.05.2020 at 17:41 +0800, Shengjiu Wang wrote:
> > > There are two requirements that we need to move the request
> > > of dma channel from probe to open.
> >
> > How do you handle -EPROBE_DEFER return code from the channel request if
> > you don't do it in probe?
>
> I use the dma_request_slave_channel or dma_request_channel instead
> of dmaengine_pcm_request_chan_of. so there should be not -EPROBE_DEFER
> return code.

This is a pretty weak argument. The dmaengine device might probe after
you try to get the channel. Using a function to request the channel
that doesn't allow you to handle probe deferral is IMHO a bug and
should be fixed, instead of building even more assumptions on top of
it.

> > > - When dma device binds with power-domains, the power will
> > > be enabled when we request dma channel. If the request of dma
> > > channel happen on probe, then the power-domains will be always
> > > enabled after kernel boot up, which is not good for power
> > > saving, so we need to move the request of dma channel to .open();
> >
> > This is certainly something which could be fixed in the dmaengine
> > driver.
>
> Dma driver always call the pm_runtime_get_sync in
> device_alloc_chan_resources, the device_alloc_chan_resources is
> called when channel is requested. so power is enabled on channel
> request.

So why can't you fix the dmaengine driver to do that RPM call at a
later time when the channel is actually going to be used? This will
allow further power savings with other slave devices than the audio
PCM.

Regards,
Lucas

> > > - With FE-BE case, if the dma channel is requested in probe,
> > > then there will be below issue, which is caused by that the
> > > dma channel will be requested duplicately
> >
> > Why is this requested a second time? Is this just some missing cleanup
> > on a deferred probe path?
>
> Not relate with deferred probe. With DMA1->ASRC->DMA2->ESAI case,
> the DMA1->ASRC->DMA2 is in FE, ESAI is in BE. When ESAI drvier
> probe, DMA3 channel is created with ESAI's "dma:tx" (DMA3 channel
> is not used in this FE-BE case). When FE-BE startup, DMA2
> channel is created, it needs the ESAI's "dma:tx", so below warning
> comes out.
>
> > Regards,
> > Lucas
> >
> > > [ 638.906268] sysfs: cannot create duplicate filename '/devices/soc0/soc/2000000.bus/2000000.spba-bus/2024000.esai/dma:tx'
> > > [ 638.919061] CPU: 1 PID: 673 Comm: aplay Not tainted 5.7.0-rc1-12956-gfc64b2585593 #287
> > > [ 638.927113] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > > [ 638.933690] [<c0110dd8>] (unwind_backtrace) from [<c010b8ec>] (show_stack+0x10/0x14)
> > > [ 638.941464] [<c010b8ec>] (show_stack) from [<c0557fc0>] (dump_stack+0xe4/0x118)
> > > [ 638.948808] [<c0557fc0>] (dump_stack) from [<c032aeb4>] (sysfs_warn_dup+0x50/0x64)
> > > [ 638.956406] [<c032aeb4>] (sysfs_warn_dup) from [<c032b1a8>] (sysfs_do_create_link_sd+0xc8/0xd4)
> > > [ 638.965134] [<c032b1a8>] (sysfs_do_create_link_sd) from [<c05dc668>] (dma_request_chan+0xb0/0x210)
> > > [ 638.974120] [<c05dc668>] (dma_request_chan) from [<c05dc7d0>] (dma_request_slave_channel+0x8/0x14)
> > > [ 638.983111] [<c05dc7d0>] (dma_request_slave_channel) from [<c09d5548>] (fsl_asrc_dma_hw_params+0x1e0/0x438)
> > > [ 638.992881] [<c09d5548>] (fsl_asrc_dma_hw_params) from [<c09c1654>] (soc_pcm_hw_params+0x4a0/0x6a8)
> > > [ 639.001952] [<c09c1654>] (soc_pcm_hw_params) from [<c09c39d4>] (dpcm_fe_dai_hw_params+0x70/0xe4)
> > > [ 639.010765] [<c09c39d4>] (dpcm_fe_dai_hw_params) from [<c099b274>] (snd_pcm_hw_params+0x158/0x418)
> > > [ 639.019750] [<c099b274>] (snd_pcm_hw_params) from [<c099c5a0>] (snd_pcm_ioctl+0x734/0x183c)
> > > [ 639.028129] [<c099c5a0>] (snd_pcm_ioctl) from [<c029ff94>] (ksys_ioctl+0x2ac/0xb98)
> > > [ 639.035812] [<c029ff94>] (ksys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
> > > [ 639.043490] Exception stack(0xec529fa8 to 0xec529ff0)
> > > [ 639.048565] 9fa0: bee84650 01321870 00000004 c25c4111 bee84650 0002000f
> > > [ 639.056766] 9fc0: bee84650 01321870 01321820 00000036 00001f40 00000000 0002c2f8 00000003
> > > [ 639.064964] 9fe0: b6f483fc bee8451c b6ee2655 b6e1dcf8
> > > [ 639.070339] fsl-esai-dai 2024000.esai: Cannot create DMA dma:tx symlink
> > >
> > > Signed-off-by: Shengjiu Wang <shengjiu.wang(a)nxp.com>
> > > ---
> > >  sound/soc/fsl/imx-pcm-dma.c | 173 +++++++++++++++++++++++++++++++++---
> > >  1 file changed, 159 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/sound/soc/fsl/imx-pcm-dma.c b/sound/soc/fsl/imx-pcm-dma.c
> > > index 04a9bc749016..dae53b384df4 100644
> > > --- a/sound/soc/fsl/imx-pcm-dma.c
> > > +++ b/sound/soc/fsl/imx-pcm-dma.c
> > > @@ -11,6 +11,7 @@
> > >  #include <linux/dmaengine.h>
> > >  #include <linux/types.h>
> > >  #include <linux/module.h>
> > > +#include <linux/dma-mapping.h>
> > >
> > >  #include <sound/core.h>
> > >  #include <sound/pcm.h>
> > > @@ -29,24 +30,168 @@ static bool filter(struct dma_chan *chan, void *param)
> > >  	return true;
> > >  }
> > >
> > > -static const struct snd_dmaengine_pcm_config imx_dmaengine_pcm_config = {
> > > -	.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
> > > -	.compat_filter_fn = filter,
> > > -};
> > > +static int imx_pcm_hw_params(struct snd_soc_component *component,
> > > +			     struct snd_pcm_substream *substream,
> > > +			     struct snd_pcm_hw_params *params)
> > > +{
> > > +	struct snd_pcm_runtime *runtime = substream->runtime;
> > > +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> > > +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> > > +	struct snd_dmaengine_dai_dma_data *dma_data;
> > > +	struct dma_slave_config config;
> > > +	struct dma_chan *chan;
> > > +	int ret = 0;
> > >
> > > -int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> > > +	snd_pcm_set_runtime_buffer(substream, &substream->dma_buffer);
> > > +	runtime->dma_bytes = params_buffer_bytes(params);
> > > +
> > > +	chan = snd_dmaengine_pcm_get_chan(substream);
> > > +	if (!chan)
> > > +		return -EINVAL;
> > > +
> > > +	ret = snd_hwparams_to_dma_slave_config(substream, params, &config);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> > > +	if (!dma_data)
> > > +		return -EINVAL;
> > > +
> > > +	snd_dmaengine_pcm_set_config_from_dai_data(substream,
> > > +						   dma_data,
> > > +						   &config);
> > > +	return dmaengine_slave_config(chan, &config);
> > > +}
> > > +
> > > +static int imx_pcm_hw_free(struct snd_soc_component *component,
> > > +			   struct snd_pcm_substream *substream)
> > >  {
> > > -	struct snd_dmaengine_pcm_config *config;
> > > +	snd_pcm_set_runtime_buffer(substream, NULL);
> > > +	return 0;
> > > +}
> > > +
> > > +static snd_pcm_uframes_t imx_pcm_pointer(struct snd_soc_component *component,
> > > +					 struct snd_pcm_substream *substream)
> > > +{
> > > +	return snd_dmaengine_pcm_pointer(substream);
> > > +}
> > > +
> > > +static int imx_pcm_trigger(struct snd_soc_component *component,
> > > +			   struct snd_pcm_substream *substream, int cmd)
> > > +{
> > > +	return snd_dmaengine_pcm_trigger(substream, cmd);
> > > +}
> > > +
> > > +static int imx_pcm_open(struct snd_soc_component *component,
> > > +			struct snd_pcm_substream *substream)
> > > +{
> > > +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> > > +	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
> > > +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> > > +	struct snd_dmaengine_dai_dma_data *dma_data;
> > > +	struct device *dev = component->dev;
> > > +	struct snd_pcm_hardware hw;
> > > +	struct dma_chan *chan;
> > > +	int ret;
> > > +
> > > +	ret = snd_pcm_hw_constraint_integer(substream->runtime,
> > > +					    SNDRV_PCM_HW_PARAM_PERIODS);
> > > +	if (ret < 0) {
> > > +		dev_err(dev, "failed to set pcm hw params periods\n");
> > > +		return ret;
> > > +	}
> > > +
> > > +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> > > +	if (!dma_data)
> > > +		return -EINVAL;
> > > +
> > > +	chan = dma_request_slave_channel(cpu_dai->dev, tx ? "tx" : "rx");
> > > +	if (!chan) {
> > > +		/* Try to request channel using compat_filter_fn */
> > > +		chan = snd_dmaengine_pcm_request_channel(filter,
> > > +							 dma_data->filter_data);
> > > +		if (!chan)
> > > +			return -ENXIO;
> > > +	}
> > >
> > > -	config = devm_kzalloc(&pdev->dev,
> > > -			sizeof(struct snd_dmaengine_pcm_config), GFP_KERNEL);
> > > -	if (!config)
> > > -		return -ENOMEM;
> > > -	*config = imx_dmaengine_pcm_config;
> > > +	ret = snd_dmaengine_pcm_open(substream, chan);
> > > +	if (ret)
> > > +		goto pcm_open_fail;
> > >
> > > -	return devm_snd_dmaengine_pcm_register(&pdev->dev,
> > > -		config,
> > > -		SND_DMAENGINE_PCM_FLAG_COMPAT);
> > > +	memset(&hw, 0, sizeof(hw));
> > > +	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
> > > +			SNDRV_PCM_INFO_INTERLEAVED;
> > > +	hw.periods_min = 2;
> > > +	hw.periods_max = UINT_MAX;
> > > +	hw.period_bytes_min = 256;
> > > +	hw.period_bytes_max = dma_get_max_seg_size(chan->device->dev);
> > > +	hw.buffer_bytes_max = IMX_DEFAULT_DMABUF_SIZE;
> > > +	hw.fifo_size = dma_data->fifo_size;
> > > +
> > > +	/* Refine the hw according to caps of DMA. */
> > > +	ret = snd_dmaengine_pcm_refine_runtime_hwparams(substream,
> > > +							dma_data,
> > > +							&hw,
> > > +							chan);
> > > +	if (ret < 0)
> > > +		goto refine_runtime_hwparams_fail;
> > > +
> > > +	snd_soc_set_runtime_hwparams(substream, &hw);
> > > +
> > > +	/* Support allocate memory from IRAM */
> > > +	ret = snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_IRAM,
> > > +				  chan->device->dev,
> > > +				  hw.buffer_bytes_max,
> > > +				  &substream->dma_buffer);
> > > +	if (ret < 0)
> > > +		goto alloc_pagas_fail;
> > > +
> > > +	return 0;
> > > +
> > > +alloc_pagas_fail:
> > > +refine_runtime_hwparams_fail:
> > > +	snd_dmaengine_pcm_close(substream);
> > > +pcm_open_fail:
> > > +	dma_release_channel(chan);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int imx_pcm_close(struct snd_soc_component *component,
> > > +			 struct snd_pcm_substream *substream)
> > > +{
> > > +	if (substream) {
> > > +		snd_dma_free_pages(&substream->dma_buffer);
> > > +		substream->dma_buffer.area = NULL;
> > > +		substream->dma_buffer.addr = 0;
> > > +	}
> > > +
> > > +	return snd_dmaengine_pcm_close_release_chan(substream);
> > > +}
> > > +
> > > +static int imx_pcm_new(struct snd_soc_component *component,
> > > +		       struct snd_soc_pcm_runtime *rtd)
> > > +{
> > > +	struct snd_card *card = rtd->card->snd_card;
> > > +
> > > +	return dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
> > > +}
> > > +
> > > +static const struct snd_soc_component_driver imx_pcm_component = {
> > > +	.name = "imx-pcm-dma",
> > > +	.pcm_construct = imx_pcm_new,
> > > +	.open = imx_pcm_open,
> > > +	.close = imx_pcm_close,
> > > +	.hw_params = imx_pcm_hw_params,
> > > +	.hw_free = imx_pcm_hw_free,
> > > +	.trigger = imx_pcm_trigger,
> > > +	.pointer = imx_pcm_pointer,
> > > +};
> > > +
> > > +int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> > > +{
> > > +	return devm_snd_soc_register_component(&pdev->dev,
> > > +					       &imx_pcm_component, NULL, 0);
> > >  }
> > >  EXPORT_SYMBOL_GPL(imx_pcm_dma_init);
> > >
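
The probe-deferral objection raised in this reply can be made concrete with a small userspace mock. It is a sketch under stated assumptions, not driver code: all `toy_*` names are hypothetical, the "DMA provider" is a single flag, and `TOY_EPROBE_DEFER` merely mirrors the numeric value the kernel uses for `EPROBE_DEFER`, since that constant is not available in userspace `<errno.h>`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the kernel's EPROBE_DEFER value; not part of userspace errno.h. */
#define TOY_EPROBE_DEFER 517

struct toy_chan {
	int id;
};

struct toy_dev {
	struct toy_chan *chan;
};

/* Models whether the DMA controller driver has probed yet. */
static bool toy_dma_provider_ready;
static struct toy_chan toy_the_chan = { .id = 1 };

/* Hypothetical request that can say "provider not there yet, retry". */
static int toy_request_chan(struct toy_chan **out)
{
	assert(out != NULL);
	if (!toy_dma_provider_ready)
		return -TOY_EPROBE_DEFER;
	*out = &toy_the_chan;
	return 0;
}

/* Hypothetical NULL-on-failure variant: the reason for failing is lost. */
static struct toy_chan *toy_request_chan_or_null(void)
{
	struct toy_chan *chan = NULL;

	if (toy_request_chan(&chan) != 0)
		return NULL;
	return chan;
}

/* Requesting in probe(): the deferral is passed up so the core can retry. */
static int toy_probe(struct toy_dev *dev)
{
	return toy_request_chan(&dev->chan);
}

/* Requesting in open() via the NULL variant: it can only fail hard. */
static int toy_open(struct toy_dev *dev)
{
	dev->chan = toy_request_chan_or_null();
	return dev->chan ? 0 : -ENXIO;
}
```

In this model a request made before the provider is ready yields a retryable `-TOY_EPROBE_DEFER` from `toy_probe()`, but a plain `-ENXIO` from `toy_open()`, which the caller cannot distinguish from a permanent failure. That is the gap Lucas points at when the request moves out of probe and goes through a NULL-returning helper.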

5 years, 1 month

Re: [Linaro-mm-sig] [PATCH] ASoC: fsl: imx-pcm-dma: Don't request dma channel in probe

by Lucas Stach

On Tuesday, 19.05.2020 at 17:41 +0800, Shengjiu Wang wrote:
> There are two requirements that we need to move the request
> of dma channel from probe to open.

How do you handle -EPROBE_DEFER return code from the channel request if
you don't do it in probe?

> - When dma device binds with power-domains, the power will
> be enabled when we request dma channel. If the request of dma
> channel happen on probe, then the power-domains will be always
> enabled after kernel boot up, which is not good for power
> saving, so we need to move the request of dma channel to .open();

This is certainly something which could be fixed in the dmaengine
driver.

> - With FE-BE case, if the dma channel is requested in probe,
> then there will be below issue, which is caused by that the
> dma channel will be requested duplicately

Why is this requested a second time? Is this just some missing cleanup
on a deferred probe path?

Regards,
Lucas

> [ 638.906268] sysfs: cannot create duplicate filename '/devices/soc0/soc/2000000.bus/2000000.spba-bus/2024000.esai/dma:tx'
> [ 638.919061] CPU: 1 PID: 673 Comm: aplay Not tainted 5.7.0-rc1-12956-gfc64b2585593 #287
> [ 638.927113] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [ 638.933690] [<c0110dd8>] (unwind_backtrace) from [<c010b8ec>] (show_stack+0x10/0x14)
> [ 638.941464] [<c010b8ec>] (show_stack) from [<c0557fc0>] (dump_stack+0xe4/0x118)
> [ 638.948808] [<c0557fc0>] (dump_stack) from [<c032aeb4>] (sysfs_warn_dup+0x50/0x64)
> [ 638.956406] [<c032aeb4>] (sysfs_warn_dup) from [<c032b1a8>] (sysfs_do_create_link_sd+0xc8/0xd4)
> [ 638.965134] [<c032b1a8>] (sysfs_do_create_link_sd) from [<c05dc668>] (dma_request_chan+0xb0/0x210)
> [ 638.974120] [<c05dc668>] (dma_request_chan) from [<c05dc7d0>] (dma_request_slave_channel+0x8/0x14)
> [ 638.983111] [<c05dc7d0>] (dma_request_slave_channel) from [<c09d5548>] (fsl_asrc_dma_hw_params+0x1e0/0x438)
> [ 638.992881] [<c09d5548>] (fsl_asrc_dma_hw_params) from [<c09c1654>] (soc_pcm_hw_params+0x4a0/0x6a8)
> [ 639.001952] [<c09c1654>] (soc_pcm_hw_params) from [<c09c39d4>] (dpcm_fe_dai_hw_params+0x70/0xe4)
> [ 639.010765] [<c09c39d4>] (dpcm_fe_dai_hw_params) from [<c099b274>] (snd_pcm_hw_params+0x158/0x418)
> [ 639.019750] [<c099b274>] (snd_pcm_hw_params) from [<c099c5a0>] (snd_pcm_ioctl+0x734/0x183c)
> [ 639.028129] [<c099c5a0>] (snd_pcm_ioctl) from [<c029ff94>] (ksys_ioctl+0x2ac/0xb98)
> [ 639.035812] [<c029ff94>] (ksys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
> [ 639.043490] Exception stack(0xec529fa8 to 0xec529ff0)
> [ 639.048565] 9fa0: bee84650 01321870 00000004 c25c4111 bee84650 0002000f
> [ 639.056766] 9fc0: bee84650 01321870 01321820 00000036 00001f40 00000000 0002c2f8 00000003
> [ 639.064964] 9fe0: b6f483fc bee8451c b6ee2655 b6e1dcf8
> [ 639.070339] fsl-esai-dai 2024000.esai: Cannot create DMA dma:tx symlink
>
> Signed-off-by: Shengjiu Wang <shengjiu.wang(a)nxp.com>
> ---
>  sound/soc/fsl/imx-pcm-dma.c | 173 +++++++++++++++++++++++++++++++++---
>  1 file changed, 159 insertions(+), 14 deletions(-)
>
> diff --git a/sound/soc/fsl/imx-pcm-dma.c b/sound/soc/fsl/imx-pcm-dma.c
> index 04a9bc749016..dae53b384df4 100644
> --- a/sound/soc/fsl/imx-pcm-dma.c
> +++ b/sound/soc/fsl/imx-pcm-dma.c
> @@ -11,6 +11,7 @@
>  #include <linux/dmaengine.h>
>  #include <linux/types.h>
>  #include <linux/module.h>
> +#include <linux/dma-mapping.h>
>
>  #include <sound/core.h>
>  #include <sound/pcm.h>
> @@ -29,24 +30,168 @@ static bool filter(struct dma_chan *chan, void *param)
>  	return true;
>  }
>
> -static const struct snd_dmaengine_pcm_config imx_dmaengine_pcm_config = {
> -	.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
> -	.compat_filter_fn = filter,
> -};
> +static int imx_pcm_hw_params(struct snd_soc_component *component,
> +			     struct snd_pcm_substream *substream,
> +			     struct snd_pcm_hw_params *params)
> +{
> +	struct snd_pcm_runtime *runtime = substream->runtime;
> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> +	struct snd_dmaengine_dai_dma_data *dma_data;
> +	struct dma_slave_config config;
> +	struct dma_chan *chan;
> +	int ret = 0;
>
> -int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> +	snd_pcm_set_runtime_buffer(substream, &substream->dma_buffer);
> +	runtime->dma_bytes = params_buffer_bytes(params);
> +
> +	chan = snd_dmaengine_pcm_get_chan(substream);
> +	if (!chan)
> +		return -EINVAL;
> +
> +	ret = snd_hwparams_to_dma_slave_config(substream, params, &config);
> +	if (ret)
> +		return ret;
> +
> +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> +	if (!dma_data)
> +		return -EINVAL;
> +
> +	snd_dmaengine_pcm_set_config_from_dai_data(substream,
> +						   dma_data,
> +						   &config);
> +	return dmaengine_slave_config(chan, &config);
> +}
> +
> +static int imx_pcm_hw_free(struct snd_soc_component *component,
> +			   struct snd_pcm_substream *substream)
>  {
> -	struct snd_dmaengine_pcm_config *config;
> +	snd_pcm_set_runtime_buffer(substream, NULL);
> +	return 0;
> +}
> +
> +static snd_pcm_uframes_t imx_pcm_pointer(struct snd_soc_component *component,
> +					 struct snd_pcm_substream *substream)
> +{
> +	return snd_dmaengine_pcm_pointer(substream);
> +}
> +
> +static int imx_pcm_trigger(struct snd_soc_component *component,
> +			   struct snd_pcm_substream *substream, int cmd)
> +{
> +	return snd_dmaengine_pcm_trigger(substream, cmd);
> +}
> +
> +static int imx_pcm_open(struct snd_soc_component *component,
> +			struct snd_pcm_substream *substream)
> +{
> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
> +	struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> +	struct snd_dmaengine_dai_dma_data *dma_data;
> +	struct device *dev = component->dev;
> +	struct snd_pcm_hardware hw;
> +	struct dma_chan *chan;
> +	int ret;
> +
> +	ret = snd_pcm_hw_constraint_integer(substream->runtime,
> +					    SNDRV_PCM_HW_PARAM_PERIODS);
> +	if (ret < 0) {
> +		dev_err(dev, "failed to set pcm hw params periods\n");
> +		return ret;
> +	}
> +
> +	dma_data = snd_soc_dai_get_dma_data(cpu_dai, substream);
> +	if (!dma_data)
> +		return -EINVAL;
> +
> +	chan = dma_request_slave_channel(cpu_dai->dev, tx ? "tx" : "rx");
> +	if (!chan) {
> +		/* Try to request channel using compat_filter_fn */
> +		chan = snd_dmaengine_pcm_request_channel(filter,
> +							 dma_data->filter_data);
> +		if (!chan)
> +			return -ENXIO;
> +	}
>
> -	config = devm_kzalloc(&pdev->dev,
> -			sizeof(struct snd_dmaengine_pcm_config), GFP_KERNEL);
> -	if (!config)
> -		return -ENOMEM;
> -	*config = imx_dmaengine_pcm_config;
> +	ret = snd_dmaengine_pcm_open(substream, chan);
> +	if (ret)
> +		goto pcm_open_fail;
>
> -	return devm_snd_dmaengine_pcm_register(&pdev->dev,
> -		config,
> -		SND_DMAENGINE_PCM_FLAG_COMPAT);
> +	memset(&hw, 0, sizeof(hw));
> +	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
> +			SNDRV_PCM_INFO_INTERLEAVED;
> +	hw.periods_min = 2;
> +	hw.periods_max = UINT_MAX;
> +	hw.period_bytes_min = 256;
> +	hw.period_bytes_max = dma_get_max_seg_size(chan->device->dev);
> +	hw.buffer_bytes_max = IMX_DEFAULT_DMABUF_SIZE;
> +	hw.fifo_size = dma_data->fifo_size;
> +
> +	/* Refine the hw according to caps of DMA. */
> +	ret = snd_dmaengine_pcm_refine_runtime_hwparams(substream,
> +							dma_data,
> +							&hw,
> +							chan);
> +	if (ret < 0)
> +		goto refine_runtime_hwparams_fail;
> +
> +	snd_soc_set_runtime_hwparams(substream, &hw);
> +
> +	/* Support allocate memory from IRAM */
> +	ret = snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_IRAM,
> +				  chan->device->dev,
> +				  hw.buffer_bytes_max,
> +				  &substream->dma_buffer);
> +	if (ret < 0)
> +		goto alloc_pagas_fail;
> +
> +	return 0;
> +
> +alloc_pagas_fail:
> +refine_runtime_hwparams_fail:
> +	snd_dmaengine_pcm_close(substream);
> +pcm_open_fail:
> +	dma_release_channel(chan);
> +
> +	return ret;
> +}
> +
> +static int imx_pcm_close(struct snd_soc_component *component,
> +			 struct snd_pcm_substream *substream)
> +{
> +	if (substream) {
> +		snd_dma_free_pages(&substream->dma_buffer);
> +		substream->dma_buffer.area = NULL;
> +		substream->dma_buffer.addr = 0;
> +	}
> +
> +	return snd_dmaengine_pcm_close_release_chan(substream);
> +}
> +
> +static int imx_pcm_new(struct snd_soc_component *component,
> +		       struct snd_soc_pcm_runtime *rtd)
> +{
> +	struct snd_card *card = rtd->card->snd_card;
> +
> +	return dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
> +}
> +
> +static const struct snd_soc_component_driver imx_pcm_component = {
> +	.name = "imx-pcm-dma",
> +	.pcm_construct = imx_pcm_new,
> +	.open = imx_pcm_open,
> +	.close = imx_pcm_close,
> +	.hw_params = imx_pcm_hw_params,
> +	.hw_free = imx_pcm_hw_free,
> +	.trigger = imx_pcm_trigger,
> +	.pointer = imx_pcm_pointer,
> +};
> +
> +int imx_pcm_dma_init(struct platform_device *pdev, size_t size)
> +{
> +	return devm_snd_soc_register_component(&pdev->dev,
> +					       &imx_pcm_component, NULL, 0);
>  }
>  EXPORT_SYMBOL_GPL(imx_pcm_dma_init);
>

5 years, 2 months
