Hi Markus,
On Wednesday, 29 April 2026 at 19:53 +0000, Markus Fritsche wrote:
> Hi,
>
> This series proposes a small opt-in API in videobuf2-core that lets V4L2
> drivers populate a dma_resv exclusive write fence on the dmabufs they
> export to userspace, signalled when the buffer transitions to
> VB2_BUF_STATE_DONE. Two example drivers (hantro, rockchip-rga) opt in
> to demonstrate the call shape; the change is a no-op for every other
> driver.
Thanks for attempting this feat again. I see you went for implicit fencing, but
in the past we've been advised to stay away from it and adopt an explicit
fencing model instead. Is this something you have started to think about?
Have you reviewed the past proposals regarding fences?
>
> Why
> ---
> Modern Wayland compositors and any other userspace consumers that
> import V4L2-produced dmabufs and want to do implicit synchronization
> the spec-clean way (poll(POLLIN) on the dmabuf fd, or
> DMA_BUF_IOCTL_EXPORT_SYNC_FILE for a sync_file) currently get either:
>
> 1. A stub fence from dma_buf_export_sync_file(), because the dmabuf's
> dma_resv has no fences populated. The kernel substitutes
> dma_fence_get_stub() which is permanently signalled. The compositor
> "successfully" waits on a fence that represents nothing real about
> the producer's state.
> 2. A poll(POLLIN) on the dmabuf fd that returns immediately for the
> same reason — dma_buf_poll_add_cb finds zero fences in the resv,
> triggers the wake callback inline, and reports POLLIN ready before
> the producer has actually said anything.
>
> Today this works as a happy accident on most paths because clients
> attach buffers after VIDIOC_DQBUF, which the userspace V4L2 contract
> guarantees only returns a buffer after the producer is done. So the
> implicit "the kernel's stub fence is fine because the buffer is
> already complete by the time anyone polls it" assumption has held.
There is no accident here, just saying. Have you also studied the other side of
fences, the one that actually causes problems with Freedreno and Etnaviv? To me
those would be the higher priority, since they are known to cause "back flash"
kinds of bugs, especially for compositors that are not expecting the GL driver
to place implicit fences on imported (v4l2-allocated) buffers.
>
> But:
>
> - It's a contract gap. The kernel claims to expose implicit sync; it
> does not, for V4L2 producers.
> - It pays latency for nothing. Every Wayland frame from a V4L2
> producer pays a DMA_BUF_IOCTL_EXPORT_SYNC_FILE round-trip for a
> fence that's stub-signalled. On Mali-class hardware (RK3566 Wayland
> chrome video playback), this contributed to compositor stalls.
> Removing the wait at the compositor level is a workaround, not a
> fix.
> - It blocks downstream consumers from doing the right thing. A
> Wayland compositor that defensively waits on a sync_file gets a
> stub-fence pass-through with no actual gating; if the V4L2 driver
> ever has an out-of-band path that releases the buffer before
> finishing the write, there is no fence to gate on.
Some things don't add up here. I want to remind you that there is a contract
regarding delivering a fence to userspace. One of the most important properties
of fences is that they must be signalled in finite time, regardless of what
userspace decides to do next. For that reason, you shouldn't deliver a fence to
userspace if it's not armed. In my reading, you are delivering that fence at
QBUF(capture) time, just like what Gustavo was trying to do previously. It's
even worse if you deliver it to your compositor, allowing that compositor to
hang forever by not feeding any bitstream.
Let's take the Hantro driver as an example. The right moment to deliver the
fence is either right before we set the DEC bit on the control register, or
somewhere before that, once you have the bitstream, parameters and request
queued. At that moment, you are guaranteed that the decode will either finish
or fail (yes, it can fail, and it's extremely common with live streams, or when
the application calls streamoff, since in v4l2 we cancel work). Prior to that,
the user may starve the OUTPUT queue (the bitstream) and cause the fence to
hold forever. That would break the contract I mentioned earlier.
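As a sketch of that ordering (all helper names below are hypothetical, made up
purely for illustration, not actual hantro code):

```c
/* Hypothetical sketch: arm the fence only once the job is guaranteed
 * to finish or fail in finite time. Both helpers are made up for
 * illustration. */
static void hantro_job_run_sketch(struct hantro_ctx *ctx)
{
	/* Bitstream, parameters and request are all queued here, so
	 * the decode is now guaranteed to complete or fail... */
	arm_and_deliver_release_fence(ctx);

	/* ...and only then kick the hardware (set the DEC bit). */
	write_dec_enable_bit(ctx);
}
```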
Though, if you attach the fence at that moment, you will need to design how to
signal the fence readiness (rather than the data readiness). One idea would be
(with userspace opting in) to signal the queue at that moment. But then you
can't do the memory management operations you would normally do in DQBUF. This
of course doesn't apply to hantro, which has no device cache, but we can't
design something in vb2 that doesn't work for the old HW. So we'd need to move
memory management somewhere else, maybe buffer_done, though you have to be
careful about the context in which you do that; you can't sleep in an IRQ.
There is an obvious benefit to basing your solution on
DMA_BUF_IOCTL_EXPORT_SYNC_FILE: once you get there, you'll discover that there
is very little room left in v4l2_buffer, which caused a lot of headaches for
previous people attempting this. Though, looking forward, we could also
consider this a feature of the media_request. Queuing a request could perhaps
deliver a fence, assuming a few preconditions that guarantee execution (or
failure) are met. We've seen with DW100 recently that it's rather easy to
convert an existing m2m driver to requests. The media API is a much more open
canvas for designing new mechanisms. We could have a really simple ioctl that
attaches out-fences to a request, and in the future hook it up to our own
dependency manager.
I'm simply throwing out ideas; I could have missed a few things in your PoC,
let me know.
Nicolas
> What
> ----
> Patch 1 adds:
>
> - struct dma_fence *release_fence to struct vb2_buffer
> - u64 dma_resv_fence_context + atomic64_t dma_resv_fence_seqno +
> spinlock_t dma_resv_fence_lock to struct vb2_queue
> - vb2_buffer_attach_release_fence(vb) — drivers call this from their
> buf_queue callback. Allocates a dma_fence on the queue's fence
> context, attaches it as DMA_RESV_USAGE_WRITE on each plane's
> dmabuf->resv. No-op for buffers without exported dmabufs.
> - vb2_buffer_done() extended to signal+put the fence if attached,
> so the producer's completion signal lands in the resv synchronously
> with the userspace DQBUF wakeup.
>
> Patches 2 and 3 add a single call to the helper from hantro_buf_queue
> and rga_buf_queue respectively. Both are demonstration drivers; other
> vb2 drivers can opt in incrementally with the same one-line change.
>
> Tested on
> ---------
> PineTab2 (RK3566 / Mali-G52 panfrost / mainline 6.19.10, this series
> backported), playing 1080p30 H.264 in chromium under KDE Plasma 6.6.4
> Wayland. The test harness is the chromium-fourier patch series at
> https://github.com/marfrit/fourier — chromium plus a KWin patch
> that *previously bypassed* Transaction::watchDmaBuf because the
> kernel-side fence was stub-signalled. With this series applied, the
> bypass becomes unnecessary; KWin's fence wait completes correctly
> because the fence now signals when hantro completes the capture
> buffer write.
>
> End-to-end result before the kernel patch (chromium + Qt 6 patches +
> KWin watchDmaBuf bypass): 1080p30 H.264 plays through, ~81% combined
> chrome CPU, but the watchDmaBuf bypass weakens KWin's defenses against
> misbehaving clients.
>
> End-to-end result after the kernel patch (chromium + Qt 6 patches +
> plain unmodified KWin): 1080p30 H.264 plays through with the same CPU
> profile, KWin's watchDmaBuf wait completes within microseconds against
> the now-real producer fence, no defenses weakened.
>
> What's missing in this RFC
> --------------------------
> - Other vb2-using drivers don't opt in. Each maintainer should look
> at their driver and decide. The hantro + rga patches show the
> shape; copying it to other drivers should be straightforward.
> - For drivers that have intermediate image-processor stages (e.g.
> CSI -> ISP -> user), the fence semantics across stage boundaries
> are out of scope here. This series only addresses the producer-to-
> userspace edge.
> - No selftest. videobuf2 doesn't have a great in-tree selftest harness
> for dmabuf flows; the validation is end-to-end at the userspace
> consumer level (KWin, in our case).
>
> Reviews especially welcome on:
>
> - The decision to make this opt-in per driver vs. automatic for all
> vb2-CAPTURE queues. Auto-on would force every driver to be audited;
> opt-in is incremental and safer but leaves the contract gap for
> drivers nobody touches.
> - Whether vb2_buffer_done is the right place to signal vs. an earlier
> hook (e.g. immediately after DMA-from-device finishes). For hantro
> the two are effectively the same; for drivers with asynchronous
> post-processing they may differ.
> - The choice of DMA_RESV_USAGE_WRITE — we are emitting the producer's
> write completion, so WRITE matches dma-buf documentation, but a
> sanity check is welcome.
>
> Cheers,
> Markus
>
>
> Markus Fritsche (3):
> media: videobuf2: add dma_resv release-fence helper
> media: hantro: attach dma_resv release fence at buf_queue
> media: rockchip-rga: attach dma_resv release fence at buf_queue
>
> .../media/common/videobuf2/videobuf2-core.c | 95 +++++++++++++++++++
> drivers/media/platform/rockchip/rga/rga-buf.c | 10 ++
> .../media/platform/verisilicon/hantro_v4l2.c | 12 +++
> include/media/videobuf2-core.h | 29 ++++++
> 4 files changed, 146 insertions(+)
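Based purely on the description in the quoted cover letter, the one-line
opt-in in a driver's buf_queue callback would presumably look something like
this (a sketch; only vb2_buffer_attach_release_fence() comes from the series,
the surrounding m2m plumbing is illustrative and the actual patches may
differ):

```c
/* Sketch of the described opt-in; not the actual patch. */
static void hantro_buf_queue(struct vb2_buffer *vb)
{
	struct hantro_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);

	/* Attach a DMA_RESV_USAGE_WRITE fence to each exported plane's
	 * dma_resv; vb2_buffer_done() signals it at VB2_BUF_STATE_DONE. */
	vb2_buffer_attach_release_fence(vb);

	v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, to_vb2_v4l2_buffer(vb));
}
```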
On 4/29/26 17:25, Pavel Begunkov wrote:
> Introduce a new file callback that allows creating long-term dma
> mapping. All necessary information together with a dmabuf will be passed
> in the second argument of type struct io_dmabuf_token, which will be
> defined in following patches.
Well, first of all, the naming is probably not the best. Maybe rather call that a dma-buf attachment or context or mapping.
Then the patch should probably define the full interface and not just add the callback here and then the structure in a follow-up patch.
Regards,
Christian.
>
> Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
> ---
> include/linux/fs.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b5b01bb22d12..c5558aab4628 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1920,6 +1920,7 @@ struct dir_context {
>
> struct io_uring_cmd;
> struct offset_ctx;
> +struct io_dmabuf_token;
>
> typedef unsigned int __bitwise fop_flags_t;
>
> @@ -1967,6 +1968,7 @@ struct file_operations {
> int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
> unsigned int poll_flags);
> int (*mmap_prepare)(struct vm_area_desc *);
> + int (*create_dmabuf_token)(struct file *, struct io_dmabuf_token *);
> } __randomize_layout;
>
> /* Supports async buffered reads */
I'm happy to see that DEPT reported real problems in practice:
https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SA…
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.pa…
https://lore.kernel.org/all/b6e00e77-4a8c-4e05-ab79-266bf05fcc2d@igalia.com/
I’ve added documentation describing DEPT — this should help you
understand what DEPT is and how it works. You can use DEPT simply by
enabling CONFIG_DEPT and checking dmesg at runtime.
---
Hi Linus and folks,
I’ve been developing a tool to detect deadlock possibilities by tracking
waits/events — rather than lock acquisition order — to cover all the
synchronization mechanisms. To summarize the design rationale, starting
from the problem statement, through analysis, to the solution:
CURRENT STATUS
--------------
Lockdep tracks lock acquisition order to identify deadlock conditions.
Additionally, it tracks IRQ state changes — via {en,dis}able — to
detect cases where locks are acquired unintentionally during
interrupt handling.
PROBLEM
-------
Waits and their associated events that are never reachable can
eventually lead to deadlocks. However, since Lockdep focuses solely
on lock acquisition order, it has inherent limitations when handling
waits and events.
Moreover, by tracking only lock acquisition order, Lockdep cannot
properly handle read locks or cross-event scenarios — such as
wait_for_completion() and complete() — making it increasingly
inadequate as a general-purpose deadlock detection tool.
SOLUTION
--------
Once again, waits and their associated events that are never
reachable can eventually lead to deadlocks. The new solution, DEPT,
focuses directly on waits and events. DEPT monitors waits and events,
and reports them when any become unreachable.
DEPT provides:
* Correct handling of read locks.
* Support for general waits and events.
* Continuous operation, even after multiple reports.
* Simple, intuitive annotation APIs.
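For example, consider the classic wait/event cycle that pure lock-order
analysis cannot model (a minimal sketch with standard kernel primitives):

```c
/* A wait/event deadlock invisible to pure lock-order analysis: thread
 * A holds a mutex while waiting for a completion that thread B can
 * only signal after acquiring the same mutex. */
static DEFINE_MUTEX(m);
static DECLARE_COMPLETION(c);

static void thread_a(void)
{
	mutex_lock(&m);
	wait_for_completion(&c);	/* waits for B's complete()... */
	mutex_unlock(&m);
}

static void thread_b(void)
{
	mutex_lock(&m);		/* ...but B first waits for A's unlock */
	complete(&c);
	mutex_unlock(&m);
}
```

Lockdep sees only a single lock class here and no acquisition-order inversion;
a wait/event model can capture the dependency between wait_for_completion()
and complete() and report the cycle.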
There are still false positives, and some are already being worked on
for suppression. In particular, splitting the folio class into several
appropriate classes, e.g. a block device mapping class and a regular
file mapping class, is currently under active development by me and
Yeoreum Yun.
Anyway, these efforts will need to continue for a while, as we’ve seen
with lockdep over two decades. DEPT is tagged as EXPERIMENTAL in
Kconfig — meaning it’s not yet suitable for use as an automation tool.
However, for those who are interested in using DEPT to analyze complex
synchronization patterns and extract dependency insights, DEPT would be
a great tool for the purpose.
Thanks for the support and contributions from:
Harry Yoo <harry.yoo(a)oracle.com>
Gwan-gyeong Mun <gwan-gyeong.mun(a)intel.com>
Yunseong Kim <ysk(a)kzalloc.com>
Yeoreum Yun <yeoreum.yun(a)arm.com>
FAQ
---
Q. Is this the first attempt to solve this problem?
A. No. The cross-release feature (commit b09be676e0ff2) attempted to
address it — as a Lockdep extension. It was merged, but quickly
reverted, because:
While it uncovered valuable hidden issues, it also introduced false
positives. Since these false positives mask further real problems
with Lockdep — and developers strongly dislike them — the feature was
rolled back.
Q. Why wasn’t DEPT built as a Lockdep extension?
A. Lockdep is the result of years of work by kernel developers — and is
now very stable. But I chose to build DEPT separately, because:
While reusing BFS (breadth-first search) and Lockdep's hashing is
beneficial, the rest of the system must be rebuilt from scratch to
align with DEPT’s wait-event model — since Lockdep was originally
designed for tracking lock acquisition orders, not wait-event
dependencies.
Q. Do you plan to replace Lockdep entirely?
A. Not at all — Lockdep still plays a vital role in validating correct
lock usage. While its dependency-checking logic should eventually be
superseded by DEPT, the rest of its functionality should stay.
Q. Should we replace the dependency check immediately?
A. Absolutely not. Lockdep’s stability is the result of years of hard
work by kernel developers. Lockdep and DEPT should run side by side
until DEPT matures.
Q. Stronger detection often leads to more false positives — which was a
major pain point when cross-release was added. Is DEPT designed to
handle this?
A. Yes. DEPT’s simple, generalized design enables flexible reporting —
so while false positives still need fixing, they’re far less
disruptive than they were under the Lockdep extension, cross-release.
Q. Why not fix all false positives out-of-tree before merging?
A. The affected subsystems span the entire kernel. Like Lockdep,
which has relied on annotations to avoid false positives over the
last two decades, DEPT too will require annotation efforts.
Performing the annotation work in mainline will help us add
annotations more appropriately and will also make DEPT a useful
tool for a wider range of users more quickly.
CONFIG_DEPT is marked EXPERIMENTAL, so it’s opt-in. Some users are
already interested in using DEPT to analyze complex synchronization
patterns and extract dependency insights.
Byungchul
---
Changes from v17:
1. Rebase on the mainline as of 2025 Dec 5.
2. Convert the documents' format from txt to rst. (feedbacked
by Jonathan Corbet and Bagas Sanjaya)
3. Move the documents from 'Documentation/dependency' to
'Documentation/dev-tools'. (feedbacked by Jonathan Corbet)
4. Improve the documentation. (feedbacked by NeilBrown)
5. Use a common function, enter_from_user_mode(), instead of
arch specific code, to notice context switch from user mode.
(feedbacked by Dave Hansen, Mark Rutland, and Mark Brown)
6. Resolve the header dependency issue by using dept's internal
header, instead of relocating 'struct llist_{head,node}' to
another header. (feedbacked by Greg KH)
7. Improve page(or folio) usage type APIs.
8. Add rust helper for wait_for_completion(). (feedbacked by
Guangbo Cui, Boqun Feng, and Danilo Krummrich)
9. Refine some commit messages.
Changes from v16:
1. Rebase on v6.17.
2. Fix a false positive from rcu (by Yunseong Kim)
3. Introduce APIs to set page's usage, dept_set_page_usage() and
dept_reset_page_usage() to avoid false positives.
4. Consider lock_page() as a potential wait unconditionally.
5. Consider folio_lock_killable() as a potential wait
unconditionally.
6. Add support for tracking PG_writeback waits and events.
7. Fix two build errors due to the additional debug information
added by dept. (by Yunseong Kim)
Changes from v15:
1. Fix typo and improve comments and commit messages (feedbacked
by ALOK TIWARI, Waiman Long, and kernel test robot).
2. Do not stop dept on detection of a circular dependency of a
recover event, allowing it to keep reporting.
3. Add SK hynix to copyright.
4. Consider folio_lock() as a potential wait unconditionally.
5. Fix Kconfig dependency bug (feedbacked by kernel test robot).
6. Do not suppress reports involving classes that have already been
involved in other reports, allowing dept to keep reporting.
Changes from v14:
1. Rebase on the current latest, v6.15-rc6.
2. Refactor dept code.
3. With multiple event sites for a single wait, even if an event
forms a circular dependency, the event can be recovered by
other event (or wake-up) paths. Reporting the circular
dependency is worthwhile, but it should be suppressed once
reported if it doesn't lead to an actual deadlock. So
introduce APIs to annotate the relationship between an event
site and a recover site, namely event_site() and
dept_recover_event().
4. wait_for_completion() worked with a dept map embedded in struct
completion. However, it generated a few false positives since
all the waits using the same instance of struct completion
shared the map and key. To avoid those false positives, stop
sharing the map and key; instead, give each
wait_for_completion() caller its own key by default. Of
course, external maps can also be used if needed.
5. Fix a bug about hardirq on/off tracing.
6. Implement basic unit test for dept.
7. Add more supports for dma fence synchronization.
8. Add emergency stop of dept e.g. on panic().
9. Fix false positives by mmu_notifier_invalidate_*().
10. Fix recursive call bug by DEPT_WARN_*() and DEPT_STOP().
11. Fix trivial bugs in DEPT_WARN_*() and DEPT_STOP().
12. Fix a bug that a spin lock, dept_pool_spin, is used in
both contexts of irq disabled and enabled without irq
disabled.
13. Suppress reports involving classes any of which have already
been reported; even though they have different chains, such
reports are barely meaningful.
14. Print stacktrace of the wait that an event is now waking up,
not only stacktrace of the event.
15. Make dept aware of lockdep_cmp_fn() that is used to avoid
false positives in lockdep so that dept can also avoid them.
16. Do do_event() only if no ecxts have been delimited.
17. Fix a missing-synchronization bug for stage_m in struct
dept_task, using a spin lock, dept_task()->stage_lock.
18. Fix a bug where dept didn't handle multiple ttwus for a
single waiter being called at the same time, i.e. a race
issue.
19. Distinguish each kernel context from others, not only by
system call but also by user-oriented fault, so that dept can
work with more accurate information about the kernel context.
That helps avoid a few false positives.
20. Limit dept's working to x86_64 and arm64.
Changes from v13:
1. Rebase on the current latest version, v6.9-rc7.
2. Add 'dept' documentation describing dept APIs.
Changes from v12:
1. Refine the whole document for dept.
2. Add an 'Interpret dept report' section in the document, using a
deadlock report obtained in practice. Hopefully this version of
the document helps people understand dept better.
https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SA…
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.pa…
Changes from v11:
1. Add 'dept' documentation describing the concept of dept.
2. Rewrite the commit messages of the following commits for
using weaker lockdep annotation, for better description.
fs/jbd2: Use a weaker annotation in journal handling
cpu/hotplug: Use a weaker annotation in AP thread
(feedbacked by Thomas Gleixner)
Changes from v10:
1. Fix noinstr warning when building kernel source.
2. dept has been reporting some false positives due to the folio
lock's unfairness. Reflect that and make dept work based on
dept annotations instead of just wait and wake-up primitives.
3. Remove the support for PG_writeback while working on 2. I
will add the support later if needed.
4. dept didn't print a stacktrace for [S] if the participant in a
deadlock was not a lock mechanism but a general wait and event.
However, that made the report hard to interpret in that case.
So add support for printing the stacktrace of the requestor who
asked the event context to run - usually a waiter of the event
does that just before going to the wait state.
5. Give up tracking raw_local_irq_{disable,enable}() since it
totally messed up dept's irq tracking. So make it work the
same way lockdep does. I will reconsider it if any false
positives from those are observed again.
6. Change the manual rwsem_acquire_read(->j_trans_commit_map)
annotation in fs/jbd2/transaction.c to the try version so
that it works as much as it exactly needs.
7. Remove unnecessary 'inline' keyword in dept.c and add
'__maybe_unused' to a needed place.
Changes from v9:
1. Fix a bug: SDT tracking didn't work well because of my big
mistake - I should've used the waiter's map to identify its
class, but it had been working with the waker's. FYI,
PG_locked and PG_writeback weren't affected; they still
worked well. (reported by YoungJun)
Changes from v8:
1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and
EXPORT_SYMBOL(PG_writeback_map) for kernel module builds -
apologize for that. (reported by kernel test robot)
2. Fix build error by removing header files' circular dependency
that was caused by "atomic.h", "kernel.h" and "irqflags.h",
which I introduced - apologize for that. (reported by kernel
test robot)
Changes from v7:
1. Fix a bug that cannot track rwlock dependency properly,
introduced in v7. (reported by Boqun and lockdep selftest)
2. Track wait/event of PG_{locked,writeback} more aggressively
assuming that when a bit of PG_{locked,writeback} is cleared
there might be waits on the bit. (reported by Linus, Hillf
and syzbot)
3. Fix and clean up bad-style code, i.e. an unnecessarily
introduced random pattern and so on. (pointed out by Linus)
4. Clean code for applying dept to wait_for_completion().
Changes from v6:
1. Tie into the task scheduler code to track sleep and
try_to_wake_up(), assuming sleeps cause waits and
try_to_wake_up()s are the events those waits are waiting for,
of course with proper dept annotations,
sdt_might_sleep_weak(), sdt_might_sleep_strong() and so on.
For these cases, the class is classified at the sleep entrance
rather than in the synchronization initialization code, which
greatly reduces false alarms.
2. Remove the dept associated instance in each page struct for
tracking dependencies by PG_locked and PG_writeback thanks to
the 1. work above.
3. Introduce CONFIG_dept_AGGRESIVE_TIMEOUT_WAIT to suppress
reports in which waits with a timeout set are involved, for
those who don't like verbose reporting.
4. Add a mechanism to refill the internal memory pools on
running out so that dept could keep working as long as free
memory is available in the system.
5. Re-enable tracking hashed-waitqueue waits. That's no longer
going to generate false positives because the class is
classified at the sleep entrance rather than at waitqueue
initialization.
6. Refactor to make it easier to port onto each new version of
the kernel.
7. Apply dept to dma fence.
8. Do trivial optimizations.
Changes from v5:
1. Use just pr_warn_once() rather than WARN_ONCE() on the lack
of internal resources, because WARN_*() printing a stacktrace
is too much just to report the shortage. (feedback from Ted,
Hyeonggon)
2. Fix trivial bugs like missing initializing a struct before
using it.
3. Assign a different class per task when handling onstack
variables for waitqueues or the like, which makes dept
distinguish between onstack variables of different tasks so
as to prevent false positives. (reported by Hyeonggon)
4. Make dept aware of even raw_local_irq_*() to prevent false
positives. (reported by Hyeonggon)
5. Don't consider dependencies between the events that might be
triggered within __schedule() and the waits that require
__schedule() as real ones. (reported by Hyeonggon)
6. Unstage a staged wait that has done prepare_to_wait_event()
*and* has yet to get to __schedule(), if we encounter
__schedule() in-between for another sleep, which is possible
if e.g. a mutex_lock() exists in the 'condition' of
___wait_event().
7. Turn on CONFIG_PROVE_LOCKING when CONFIG_DEPT is on, to rely
on the hardirq and softirq entrance tracing to make dept more
portable for now.
Changes from v4:
1. Fix some bugs that produce false alarms.
2. Distinguish each syscall context from another *for arm64*.
3. Don't warn but just print a message in case the dept ring
buffer gets exhausted. (feedback from Hyeonggon)
4. Explicitly describe "EXPERIMENTAL" and "dept might produce
false positive reports" in Kconfig. (feedback from Ted)
Changes from v3:
1. dept shouldn't create dependencies between different depths
of a class that were indicated by *_lock_nested(). dept
normally doesn't but it does once another lock class comes
in. So fixed it. (feedback from Hyeonggon)
2. dept considered a wait as a real wait once getting to
__schedule() even if it has been set to TASK_RUNNING by wake
up sources in advance. Fixed it so that dept doesn't consider
the case as a real wait. (feedback from Jan Kara)
3. Stop tracking dependencies with a map once the event
associated with the map has been handled. dept will start to
work with the map again, on the next sleep.
Changes from v2:
1. Disable dept on bit_wait_table[] in sched/wait_bit.c, which
was reporting a lot of false positives - my fault. Wait/event
for bit_wait_table[] should've been tagged in a higher layer
to work better, which is future work. (feedback from Jan
Kara)
2. Disable dept on crypto_larval's completion to prevent a false
positive.
Changes from v1:
1. Fix coding style and typo. (feedback from Steven)
2. Distinguish each work context from another in workqueue.
3. Skip checking lock acquisition with nest_lock, which is about
correct lock usage that should be checked by lockdep.
Changes from RFC(v0):
1. Add the wait tag at __schedule() rather than at
prepare_to_wait(). (feedback from Linus and Matthew)
2. Use try version at lockdep_acquire_cpus_lock() annotation.
3. Distinguish each syscall context from another.
Byungchul Park (41):
dept: implement DEPT(DEPendency Tracker)
dept: add single event dependency tracker APIs
dept: add lock dependency tracker APIs
dept: tie to lockdep and IRQ tracing
dept: add proc knobs to show stats and dependency graph
dept: distinguish each kernel context from another
dept: distinguish each work from another
dept: add a mechanism to refill the internal memory pools on running
out
dept: record the latest one out of consecutive waits of the same class
dept: apply sdt_might_sleep_{start,end}() to
wait_for_completion()/complete()
dept: apply sdt_might_sleep_{start,end}() to swait
dept: apply sdt_might_sleep_{start,end}() to waitqueue wait
dept: apply sdt_might_sleep_{start,end}() to hashed-waitqueue wait
dept: apply sdt_might_sleep_{start,end}() to dma fence
dept: track timeout waits separately with a new Kconfig
dept: apply timeout consideration to wait_for_completion()/complete()
dept: apply timeout consideration to swait
dept: apply timeout consideration to waitqueue wait
dept: apply timeout consideration to hashed-waitqueue wait
dept: apply timeout consideration to dma fence wait
dept: make dept able to work with an external wgen
dept: track PG_locked with dept
dept: print staged wait's stacktrace on report
locking/lockdep: prevent various lockdep assertions when
lockdep_off()'ed
dept: add documents for dept
cpu/hotplug: use a weaker annotation in AP thread
dept: assign dept map to mmu notifier invalidation synchronization
dept: assign unique dept_key to each distinct dma fence caller
dept: make dept aware of lockdep_set_lock_cmp_fn() annotation
dept: make dept stop from working on debug_locks_off()
dept: assign unique dept_key to each distinct wait_for_completion()
caller
completion, dept: introduce init_completion_dmap() API
dept: introduce a new type of dependency tracking between multi event
sites
dept: add module support for struct dept_event_site and
dept_event_site_dep
dept: introduce event_site() to disable event tracking if it's
recoverable
dept: implement a basic unit test for dept
dept: call dept_hardirqs_off() in local_irq_*() regardless of irq
state
dept: introduce APIs to set page usage and use subclasses_evt for the
usage
dept: track PG_writeback with dept
SUNRPC: relocate struct rcu_head to the first field of struct rpc_xprt
mm: percpu: increase PERCPU_DYNAMIC_SIZE_SHIFT on DEPT and large
PAGE_SIZE
Yunseong Kim (1):
rcu/update: fix same dept key collision between various types of RCU
Documentation/dev-tools/dept.rst | 778 ++++++
Documentation/dev-tools/dept_api.rst | 125 +
drivers/dma-buf/dma-fence.c | 23 +-
include/asm-generic/vmlinux.lds.h | 13 +-
include/linux/completion.h | 124 +-
include/linux/dept.h | 402 +++
include/linux/dept_ldt.h | 78 +
include/linux/dept_sdt.h | 68 +
include/linux/dept_unit_test.h | 67 +
include/linux/dma-fence.h | 74 +-
include/linux/hardirq.h | 3 +
include/linux/irq-entry-common.h | 4 +
include/linux/irqflags.h | 21 +-
include/linux/local_lock_internal.h | 1 +
include/linux/lockdep.h | 105 +-
include/linux/lockdep_types.h | 3 +
include/linux/mm_types.h | 4 +
include/linux/mmu_notifier.h | 26 +
include/linux/module.h | 5 +
include/linux/mutex.h | 1 +
include/linux/page-flags.h | 217 +-
include/linux/pagemap.h | 37 +-
include/linux/percpu-rwsem.h | 2 +-
include/linux/percpu.h | 4 +
include/linux/rcupdate_wait.h | 13 +-
include/linux/rtmutex.h | 1 +
include/linux/rwlock_types.h | 1 +
include/linux/rwsem.h | 1 +
include/linux/sched.h | 118 +
include/linux/seqlock.h | 2 +-
include/linux/spinlock_types_raw.h | 3 +
include/linux/srcu.h | 2 +-
include/linux/sunrpc/xprt.h | 9 +-
include/linux/swait.h | 3 +
include/linux/wait.h | 3 +
include/linux/wait_bit.h | 3 +
init/init_task.c | 2 +
init/main.c | 2 +
kernel/Makefile | 1 +
kernel/cpu.c | 2 +-
kernel/dependency/Makefile | 5 +
kernel/dependency/dept.c | 3499 ++++++++++++++++++++++++++
kernel/dependency/dept_hash.h | 10 +
kernel/dependency/dept_internal.h | 314 +++
kernel/dependency/dept_object.h | 13 +
kernel/dependency/dept_proc.c | 94 +
kernel/dependency/dept_unit_test.c | 173 ++
kernel/exit.c | 1 +
kernel/fork.c | 2 +
kernel/locking/lockdep.c | 33 +
kernel/module/main.c | 19 +
kernel/rcu/rcu.h | 1 +
kernel/rcu/update.c | 5 +-
kernel/sched/completion.c | 62 +-
kernel/sched/core.c | 9 +
kernel/workqueue.c | 3 +
lib/Kconfig.debug | 48 +
lib/debug_locks.c | 2 +
lib/locking-selftest.c | 2 +
mm/filemap.c | 38 +
mm/mm_init.c | 3 +
mm/mmu_notifier.c | 31 +-
rust/helpers/completion.c | 5 +
63 files changed, 6602 insertions(+), 121 deletions(-)
create mode 100644 Documentation/dev-tools/dept.rst
create mode 100644 Documentation/dev-tools/dept_api.rst
create mode 100644 include/linux/dept.h
create mode 100644 include/linux/dept_ldt.h
create mode 100644 include/linux/dept_sdt.h
create mode 100644 include/linux/dept_unit_test.h
create mode 100644 kernel/dependency/Makefile
create mode 100644 kernel/dependency/dept.c
create mode 100644 kernel/dependency/dept_hash.h
create mode 100644 kernel/dependency/dept_internal.h
create mode 100644 kernel/dependency/dept_object.h
create mode 100644 kernel/dependency/dept_proc.c
create mode 100644 kernel/dependency/dept_unit_test.c
base-commit: 43dfc13ca972988e620a6edb72956981b75ab6b0
--
2.17.1