forked from torvalds/linux
riscv: irq: fix: check whether irq stack is in-use. #1
Open
woshiluo wants to merge 2 commits into shannmu:xenomai4/wip/dovetail-riscv/evl_port from woshiluo:woshiluo/evl_port/patch-1
This commit adds a flag to check whether an IRQ stack is currently in use. The EVL module's in-band hardirqs can cause context switches, leading to the same IRQ stack being used multiple times. The new flag guards against this by ensuring a stack is not re-entered until its current user has released it.

Additionally, this patch fixes a pr_info format string error.

Signed-off-by: Woshiluo Luo <[email protected]>
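The guard described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from this patch: `struct irq_stack`, its `in_use` field, `run_on_irq_stack()`, and the two test handlers are hypothetical names, and no real stack switching takes place. The model only shows the decision the flag enables: switch to the IRQ stack when it is free, otherwise stay on the current stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQ_STACK_SIZE 4096 /* hypothetical size, for the model only */

/* Hypothetical IRQ stack descriptor; not the kernel's actual layout. */
struct irq_stack {
    unsigned char mem[IRQ_STACK_SIZE];
    bool in_use; /* set while a handler is running on this stack */
};

typedef void (*irq_handler_t)(struct irq_stack *stk, void *arg);

/*
 * Run fn, using the IRQ stack only if nobody else is on it.
 * Returns true if the handler was given the IRQ stack, false if it
 * had to run on the caller's current stack because the IRQ stack
 * was already busy.
 */
static bool run_on_irq_stack(struct irq_stack *stk, irq_handler_t fn, void *arg)
{
    if (stk->in_use) {
        /* Busy: do not reset to the stack top, keep the current stack. */
        fn(stk, arg);
        return false;
    }

    stk->in_use = true;  /* claim the stack */
    fn(stk, arg);        /* a real kernel would switch sp around this call */
    stk->in_use = false; /* release it for the next interrupt */
    return true;
}

/* Innermost handler: does nothing, it only needs to be callable. */
static void inner_handler(struct irq_stack *stk, void *arg)
{
    (void)stk;
    (void)arg;
}

/*
 * Outer handler: simulates a second entry arriving while the first
 * one still occupies the IRQ stack, and records what the guard decided.
 */
static void outer_handler(struct irq_stack *stk, void *arg)
{
    bool *nested_got_irq_stack = arg;

    *nested_got_irq_stack = run_on_irq_stack(stk, inner_handler, NULL);
}
```

Calling `run_on_irq_stack(&stk, outer_handler, &flag)` on a free stack lets the outer handler claim it, while the nested call sees `in_use` set and falls back to the current stack instead of overwriting the frames already there.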
Han-Chen-BC pushed a commit to Han-Chen-BC/visionfive2_linux that referenced this pull request on Sep 26, 2025
Checking for PI/PP boosting mutex is not enough when dropping to in-band context: owning any mutex in this case would be wrong, since this would create a priority inversion.

Extend the logic of evl_detect_boost_drop() to encompass any owned mutex, renaming it to evl_check_no_mutex() for consistency. As a side-effect, the thread which attempts to switch in-band while owning mutex(es) now receives a single HMDIAG_LKDEPEND notification, instead of notifying all waiter(s) sleeping on those mutexes.

As a consequence, we can drop detect_inband_owner() which becomes redundant as it detects the same issue from the converse side without extending the test coverage (i.e. a contender would check whether the mutex owner is running in-band).

This change does affect the behavior for applications turning on T_WOLI on waiter threads explicitly. This said, the same issue would still be detected if CONFIG_EVL_DEBUG_WOLI is set globally though, which is the recommended configuration during the development stage.

This change also solves an ABBA issue which existed in the former implementation:

[ 40.976962] ======================================================
[ 40.976964] WARNING: possible circular locking dependency detected
[ 40.976965] 5.15.77-00716-g8390add2f766 #156 Not tainted
[ 40.976968] ------------------------------------------------------
[ 40.976969] monitor-pp-lazy/363 is trying to acquire lock:
[ 40.976971] ffff99c5c14e5588 (test363.0){....}-{0:0}, at: evl_detect_boost_drop+0x80/0x200
[ 40.976987]
[ 40.976987] but task is already holding lock:
[ 40.976988] ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[ 40.976996]
[ 40.976996] which lock already depends on the new lock.
[ 40.976996]
[ 40.976997]
[ 40.976997] the existing dependency chain (in reverse order) is:
[ 40.976998]
[ 40.976998] -> #1 (monitor-pp-lazy:363){....}-{0:0}:
[ 40.977003]        fast_grab_mutex+0xca/0x150
[ 40.977006]        evl_lock_mutex_timeout+0x60/0xa90
[ 40.977009]        monitor_oob_ioctl+0x226/0xed0
[ 40.977014]        EVL_ioctl+0x41/0xa0
[ 40.977017]        handle_pipelined_syscall+0x3d8/0x490
[ 40.977021]        __pipeline_syscall+0xcc/0x2e0
[ 40.977026]        pipeline_syscall+0x47/0x120
[ 40.977030]        syscall_enter_from_user_mode+0x40/0xa0
[ 40.977036]        do_syscall_64+0x15/0xf0
[ 40.977039]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 40.977044]
[ 40.977044] -> #0 (test363.0){....}-{0:0}:
[ 40.977048]        __lock_acquire+0x133a/0x2530
[ 40.977053]        lock_acquire+0xce/0x2d0
[ 40.977056]        evl_detect_boost_drop+0xb0/0x200
[ 40.977059]        evl_switch_inband+0x41e/0x540
[ 40.977064]        do_oob_syscall+0x1bc/0x3d0
[ 40.977067]        handle_pipelined_syscall+0xbe/0x490
[ 40.977071]        __pipeline_syscall+0xcc/0x2e0
[ 40.977075]        pipeline_syscall+0x47/0x120
[ 40.977079]        syscall_enter_from_user_mode+0x40/0xa0
[ 40.977083]        do_syscall_64+0x15/0xf0
[ 40.977086]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 40.977090]
[ 40.977090] other info that might help us debug this:
[ 40.977090]
[ 40.977091]  Possible unsafe locking scenario:
[ 40.977091]
[ 40.977092]        CPU0                    CPU1
[ 40.977093]        ----                    ----
[ 40.977094]   lock(monitor-pp-lazy:363);
[ 40.977096]                                lock(test363.0);
[ 40.977098]                                lock(monitor-pp-lazy:363);
[ 40.977100]   lock(test363.0);
[ 40.977102]
[ 40.977102]  *** DEADLOCK ***
[ 40.977102]
[ 40.977103] 1 lock held by monitor-pp-lazy/363:
[ 40.977105]  #0: ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[ 40.977113]

Signed-off-by: Philippe Gerum <[email protected]>
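The rule stated in this commit message (any owned mutex is an error when switching in-band, reported once to the switching thread) can be modeled in a few lines of userspace C. This is a rough sketch under assumptions, not the EVL implementation: `struct model_thread`, `struct model_mutex`, `model_acquire()`, and `model_check_no_mutex()` are hypothetical names, and the notification is reduced to a counter standing in for HMDIAG_LKDEPEND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mutex: only tracks its place in the owner's list. */
struct model_mutex {
    struct model_mutex *next_owned;
};

/* Hypothetical thread: owned-mutex list plus a notification counter. */
struct model_thread {
    struct model_mutex *owned;  /* head of the list of owned mutexes */
    int lkdepend_notifications; /* stands in for HMDIAG_LKDEPEND events */
};

/* Record that thread t now owns mutex m. */
static void model_acquire(struct model_thread *t, struct model_mutex *m)
{
    m->next_owned = t->owned;
    t->owned = m;
}

/*
 * Check performed when the thread is about to switch in-band.
 * Returns true if the thread owns no mutex. Otherwise the thread
 * itself is notified exactly once, however many mutexes it owns,
 * and false is returned.
 */
static bool model_check_no_mutex(struct model_thread *t)
{
    if (t->owned == NULL)
        return true;

    t->lkdepend_notifications++; /* one notification, not one per waiter */
    return false;
}
```

Because the check only looks at the switching thread's own list, it takes no lock on the waiters, which is consistent with the commit's note that the former two-lock ABBA pattern goes away.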
Han-Chen-BC pushed a commit to Han-Chen-BC/visionfive2_linux that referenced this pull request on Sep 26, 2025
This reverts commit 5ae263a6ca027c4b79c4dfacfcf4eca8209eefd0, because disambiguating the per-thread lock is not as useless as it seemed at first.

Fixes this regression when CONFIG_DEBUG_HARD_SPINLOCKS is on:

[ 52.090120]
[ 52.090129] ============================================
[ 52.090134] WARNING: possible recursive locking detected
[ 52.090139] 5.10.199-00830-g18654c202dd6 #118 Not tainted
[ 52.090143] --------------------------------------------
[ 52.090147] monitor-dlk-A:4/493 is trying to acquire lock:
[ 52.090152] c34a7010 (__RAWLOCK(&thread->lock)){-.-.}-{0:0}, at: evl_lock_mutex_timeout+0x4e0/0x870
[ 52.090169]
[ 52.090173] but task is already holding lock:
[ 52.090176] c34a5810 (__RAWLOCK(&thread->lock)){-.-.}-{0:0}, at: evl_lock_mutex_timeout+0x4e0/0x870
[ 52.090192]
[ 52.090195] other info that might help us debug this:
[ 52.090199]  Possible unsafe locking scenario:
[ 52.090202]
[ 52.090205]        CPU0
[ 52.090208]        ----
[ 52.090211]   lock(__RAWLOCK(&thread->lock));
[ 52.090221]   lock(__RAWLOCK(&thread->lock));
[ 52.090229]
[ 52.090233]  *** DEADLOCK ***
[ 52.090235]
[ 52.090239]  May be due to missing lock nesting notation
[ 52.090242]
[ 52.090246] 2 locks held by monitor-dlk-A:4/493:
[ 52.090249]  #0: c2d030d4 (&mon->mutex){....}-{0:0}, at: evl_lock_mutex_timeout+0x104/0x870
[ 52.090267]  #1: c34a5810 (__RAWLOCK(&thread->lock)){-.-.}-{0:0}, at: evl_lock_mutex_timeout+0x4e0/0x870

Signed-off-by: Philippe Gerum <[email protected]>
Han-Chen-BC pushed a commit to Han-Chen-BC/visionfive2_linux that referenced this pull request on Sep 26, 2025
evl_net_learn_ipv4_route() may be called from a softirq context. Make sure we don't enter fs reclaim, use atomic allocation instead.

Fixes this lockdep splat:

[ 393.641581] ================================
[ 393.641584] WARNING: inconsistent lock state
[ 393.641588] 6.1.111-00840-g87ef751da8a7-dirty #30 Not tainted
[ 393.641594] --------------------------------
[ 393.641597] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 393.641601] swapper/0/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
[ 393.641611] c190530c (fs_reclaim){+.?.}-{0:0}, at: __kmem_cache_alloc_node+0x2c/0x204
[ 393.641647] {SOFTIRQ-ON-W} state was registered at:
[ 393.641651]   fs_reclaim_acquire+0x70/0xa8
[ 393.641664]   __kmem_cache_alloc_node+0x2c/0x204
[ 393.641673]   kmalloc_node_trace+0x24/0x4c
[ 393.641682]   init_rescuer+0x3c/0xe8
[ 393.641696]   workqueue_init+0xa0/0x1e4
[ 393.641716]   kernel_init_freeable+0x88/0x240
[ 393.641733]   kernel_init+0x14/0x140
[ 393.641751]   ret_from_fork+0x14/0x1c
[ 393.641760] irq event stamp: 280800
[ 393.641764] hardirqs last enabled at (280800): [<c012ff8c>] handle_softirqs+0xa0/0x480
[ 393.641784] hardirqs last disabled at (280798): [<c01ab6f4>] sync_current_irq_stage+0x214/0x268
[ 393.641799] softirqs last enabled at (280780): [<c01301c0>] handle_softirqs+0x2d4/0x480
[ 393.641813] softirqs last disabled at (280799): [<c013050c>] __irq_exit_rcu+0x144/0x188
[ 393.641827]
[ 393.641827] other info that might help us debug this:
[ 393.641831]  Possible unsafe locking scenario:
[ 393.641831]
[ 393.641833]        CPU0
[ 393.641835]        ----
[ 393.641837]   lock(fs_reclaim);
[ 393.641843]   <Interrupt>
[ 393.641845]     lock(fs_reclaim);
[ 393.641852]
[ 393.641852]  *** DEADLOCK ***
[ 393.641852]
[ 393.641853] 2 locks held by swapper/0/0:
[ 393.641859]  #0: c18e21c0 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0xc8/0x3d4
[ 393.641891]  #1: c18e21c0 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x64/0x1c8
[ 393.641920]
[ 393.641920] stack backtrace:
[ 393.641924] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.111-00840-g87ef751da8a7-dirty #30
[ 393.641933] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 393.641937] IRQ stage: Linux
[ 393.641944]  unwind_backtrace from show_stack+0x10/0x14
[ 393.641967]  show_stack from dump_stack_lvl+0x94/0xcc
[ 393.641986]  dump_stack_lvl from mark_lock.part.0+0x730/0x940
[ 393.642004]  mark_lock.part.0 from __lock_acquire+0x978/0x2924
[ 393.642016]  __lock_acquire from lock_acquire+0xf8/0x368
[ 393.642029]  lock_acquire from fs_reclaim_acquire+0x70/0xa8
[ 393.642042]  fs_reclaim_acquire from __kmem_cache_alloc_node+0x2c/0x204
[ 393.642058]  __kmem_cache_alloc_node from kmalloc_trace+0x28/0x58
[ 393.642072]  kmalloc_trace from evl_net_learn_ipv4_route+0x6c/0x12c
[ 393.642095]  evl_net_learn_ipv4_route from ip_route_output_flow+0x5c/0x64
[ 393.642113]  ip_route_output_flow from ip_send_unicast_reply+0x144/0x50c
[ 393.642132]  ip_send_unicast_reply from tcp_v4_send_reset+0x25c/0x514
[ 393.642151]  tcp_v4_send_reset from tcp_v4_rcv+0x98c/0xcdc
[ 393.642164]  tcp_v4_rcv from ip_protocol_deliver_rcu+0x3c/0x248
[ 393.642178]  ip_protocol_deliver_rcu from ip_local_deliver_finish+0xd0/0x1c8
[ 393.642194]  ip_local_deliver_finish from ip_sublist_rcv_finish+0x38/0xa0
[ 393.642210]  ip_sublist_rcv_finish from ip_sublist_rcv+0x1e8/0x340
[ 393.642225]  ip_sublist_rcv from ip_list_rcv+0xe4/0x2f8
[ 393.642240]  ip_list_rcv from __netif_receive_skb_list_core+0x18c/0x1fc
[ 393.642258]  __netif_receive_skb_list_core from netif_receive_skb_list_internal+0x1f8/0x3d4
[ 393.642275]  netif_receive_skb_list_internal from net_rx_action+0xe0/0x3cc
[ 393.642291]  net_rx_action from handle_softirqs+0xdc/0x480
[ 393.642312]  handle_softirqs from __irq_exit_rcu+0x144/0x188
[ 393.642332]  __irq_exit_rcu from irq_exit+0x8/0x28
[ 393.642352]  irq_exit from arch_do_IRQ_pipelined+0x30/0x64
[ 393.642372]  arch_do_IRQ_pipelined from sync_current_irq_stage+0x160/0x268
[ 393.642386]  sync_current_irq_stage from __inband_irq_enable+0x48/0x54
[ 393.642400]  __inband_irq_enable from cpuidle_enter_state+0x198/0x3e8
[ 393.642421]  cpuidle_enter_state from cpuidle_enter+0x30/0x40
[ 393.642436]  cpuidle_enter from do_idle+0x1e0/0x2ac
[ 393.642460]  do_idle from cpu_startup_entry+0x28/0x2c
[ 393.642481]  cpu_startup_entry from rest_init+0xd4/0x188
[ 393.642504]  rest_init from arch_post_acpi_subsys_init+0x0/0x8

Signed-off-by: Philippe Gerum <[email protected]>
Signed-off-by: Woshiluo Luo <[email protected]>