deepin: KABI: KABI reservation for sched structures #551

opsiff · 2025-01-08T03:41:39Z

We reserve some fields beforehand for sched structures prone to change, therefore, we can hot add/change features of sched with this enhancement. After reserving, normally cache does not matter as the reserved fields are not accessed at all.

Link: https://gitee.com/openeuler/kernel/issues/I8ZJI8
Link: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/commit/f11bec33927d12cf8a0f95b1ad10653d69897444

deepin-ci-robot · 2025-01-08T03:42:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from opsiff. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

deepin/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

We reserve some fields beforehand for sched structures prone to change, therefore, we can hot add/change features of sched with this enhancement. After reserving, normally cache does not matter as the reserved fields are not accessed at all. Link: https://gitee.com/openeuler/kernel/issues/I8ZJI8 Link: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/commit/f11bec33927d12cf8a0f95b1ad10653d69897444 Signed-off-by: Phil Auld's avatarPhil Auld <[email protected]> Signed-off-by: Guan Jing <[email protected]> Signed-off-by: Wentao Guan <[email protected]>

deepin-ci-robot · 2025-01-08T06:52:03Z

deepin pr auto review

代码审查意见：

新增头文件：在sched.h文件中新增了<linux/deepin_kabi.h>头文件，需要确认该头文件是否必要，以及是否已经包含了所有需要的依赖。
KABI保留字段：在多个结构体中添加了DEEPIN_KABI_RESERVE宏，用于保留字段以支持KABI（内核应用二进制接口）。这些保留字段的位置和数量需要根据具体的需求和未来的扩展计划进行合理规划。
结构体扩展：在task_struct中添加了task_struct_extend指针，用于扩展任务结构体。需要确保这个指针的使用不会导致内存泄漏或指针悬挂问题，并且在使用时进行了适当的初始化和清理。
代码风格：在添加新字段时，应该保持代码风格的一致性，例如在添加新字段时，应该遵循已有的注释和代码格式。
性能考虑：添加新的保留字段可能会影响结构体的对齐和内存占用，需要评估这些改动对性能的影响，并确保不会对系统性能产生负面影响。
安全性：在添加新的结构体和字段时，应该确保不会引入新的安全漏洞，例如内存泄漏、缓冲区溢出等。
文档和注释：应该为新增的结构体和字段添加适当的文档和注释，以便其他开发者理解这些改动的原因和用途。
测试：在添加新的结构体和字段后，应该进行充分的测试，以确保这些改动不会引入新的bug，并且与现有的代码和功能兼容。

总体来说，这些改动需要仔细考虑其对内核架构和性能的影响，并确保代码的质量和安全性。

commit c98cc9797b7009308fff73d41bc1d08642dab77a upstream. Running a modified trace-cmd record --nosplice where it does a mmap of the ring buffer when '--nosplice' is set, caused the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 6.15.0-rc7-test-00002-gfb7d03d8a82f #551 Not tainted ------------------------------------------------------ trace-cmd/1113 is trying to acquire lock: ffff888100062888 (&buffer->mutex){+.+.}-{4:4}, at: ring_buffer_map+0x11c/0xe70 but task is already holding lock: ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (&cpu_buffer->mapping_lock){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 ring_buffer_map+0xcf/0xe70 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 do_mmap+0x9d7/0x1010 vm_mmap_pgoff+0x20b/0x390 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #4 (&mm->mmap_lock){++++}-{4:4}: __might_fault+0xa5/0x110 _copy_to_user+0x22/0x80 _perf_ioctl+0x61b/0x1b70 perf_ioctl+0x62/0x90 __x64_sys_ioctl+0x134/0x190 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #3 (&cpuctx_mutex){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 perf_event_init_cpu+0x325/0x7c0 perf_event_init+0x52a/0x5b0 start_kernel+0x263/0x3e0 x86_64_start_reservations+0x24/0x30 x86_64_start_kernel+0x95/0xa0 common_startup_64+0x13e/0x141 -> #2 (pmus_lock){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 perf_event_init_cpu+0xb7/0x7c0 cpuhp_invoke_callback+0x2c0/0x1030 __cpuhp_invoke_callback_range+0xbf/0x1f0 _cpu_up+0x2e7/0x690 cpu_up+0x117/0x170 cpuhp_bringup_mask+0xd5/0x120 bringup_nonboot_cpus+0x13d/0x170 smp_init+0x2b/0xf0 kernel_init_freeable+0x441/0x6d0 kernel_init+0x1e/0x160 ret_from_fork+0x34/0x70 ret_from_fork_asm+0x1a/0x30 -> #1 (cpu_hotplug_lock){++++}-{0:0}: cpus_read_lock+0x2a/0xd0 ring_buffer_resize+0x610/0x14e0 __tracing_resize_ring_buffer.part.0+0x42/0x120 tracing_set_tracer+0x7bd/0xa80 tracing_set_trace_write+0x132/0x1e0 vfs_write+0x21c/0xe80 ksys_write+0xf9/0x1c0 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #0 (&buffer->mutex){+.+.}-{4:4}: __lock_acquire+0x1405/0x2210 lock_acquire+0x174/0x310 __mutex_lock+0x192/0x18c0 ring_buffer_map+0x11c/0xe70 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 do_mmap+0x9d7/0x1010 vm_mmap_pgoff+0x20b/0x390 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: &buffer->mutex --> &mm->mmap_lock --> &cpu_buffer->mapping_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&cpu_buffer->mapping_lock); lock(&mm->mmap_lock); lock(&cpu_buffer->mapping_lock); lock(&buffer->mutex); *** DEADLOCK *** 2 locks held by trace-cmd/1113: #0: ffff888106b847e0 (&mm->mmap_lock){++++}-{4:4}, at: vm_mmap_pgoff+0x192/0x390 #1: ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70 stack backtrace: CPU: 5 UID: 0 PID: 1113 Comm: trace-cmd Not tainted 6.15.0-rc7-test-00002-gfb7d03d8a82f #551 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 print_circular_bug.cold+0x178/0x1be check_noncircular+0x146/0x160 __lock_acquire+0x1405/0x2210 lock_acquire+0x174/0x310 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? __mutex_lock+0x169/0x18c0 __mutex_lock+0x192/0x18c0 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? function_trace_call+0x296/0x370 ? __pfx___mutex_lock+0x10/0x10 ? __pfx_function_trace_call+0x10/0x10 ? __pfx___mutex_lock+0x10/0x10 ? _raw_spin_unlock+0x2d/0x50 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? __mutex_lock+0x5/0x18c0 ring_buffer_map+0x11c/0xe70 ? do_raw_spin_lock+0x12d/0x270 ? find_held_lock+0x2b/0x80 ? _raw_spin_unlock+0x2d/0x50 ? rcu_is_watching+0x15/0xb0 ? _raw_spin_unlock+0x2d/0x50 ? trace_preempt_on+0xd0/0x110 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 ? ring_buffer_lock_reserve+0x99/0xff0 ? __pfx___mmap_region+0x10/0x10 ? ring_buffer_lock_reserve+0x99/0xff0 ? __pfx_ring_buffer_lock_reserve+0x10/0x10 ? __pfx_ring_buffer_lock_reserve+0x10/0x10 ? bpf_lsm_mmap_addr+0x4/0x10 ? security_mmap_addr+0x46/0xd0 ? lock_is_held_type+0xd9/0x130 do_mmap+0x9d7/0x1010 ? 0xffffffffc0370095 ? __pfx_do_mmap+0x10/0x10 vm_mmap_pgoff+0x20b/0x390 ? __pfx_vm_mmap_pgoff+0x10/0x10 ? 0xffffffffc0370095 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fb0963a7de2 Code: 00 00 00 0f 1f 44 00 00 41 f7 c1 ff 0f 00 00 75 27 55 89 cd 53 48 89 fb 48 85 ff 74 3b 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 5b 5d c3 0f 1f 00 48 8b 05 e1 9f 0d 00 64 RSP: 002b:00007ffdcc8fb878 EFLAGS: 00000246 ORIG_RAX: 0000000000000009 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb0963a7de2 RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000 RBP: 0000000000000001 R08: 0000000000000006 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffdcc8fbe68 R14: 00007fb096628000 R15: 00005633e01a5c90 </TASK> The issue is that cpus_read_lock() is taken within buffer->mutex. The memory mapped pages are taken with the mmap_lock held. The buffer->mutex is taken within the cpu_buffer->mapping_lock. There's quite a chain with all these locks, where the deadlock can be fixed by moving the cpus_read_lock() outside the taking of the buffer->mutex. Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 117c392 ("ring-buffer: Introducing ring-buffer mapping functions") Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 2d6a6cfe969f246c616b0b0f65253060838d58a8)

deepin-ci-robot requested review from chenchongbiao and huangbibo January 8, 2025 03:42

opsiff assigned winnscode and opsiff Jan 8, 2025

opsiff requested a review from winnscode January 8, 2025 06:03

opsiff force-pushed the linux-6.6.y-kabi-sched branch from b35fbbc to 884f72d Compare January 8, 2025 06:51

Avenger-285714 self-assigned this Jan 8, 2025

Avenger-285714 self-requested a review January 8, 2025 06:53

opsiff merged commit 34aa3d7 into linux-6.6.y Jan 10, 2025
7 of 10 checks passed

opsiff deleted the linux-6.6.y-kabi-sched branch January 13, 2025 07:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

deepin: KABI: KABI reservation for sched structures #551

deepin: KABI: KABI reservation for sched structures #551

Uh oh!

opsiff commented Jan 8, 2025

Uh oh!

deepin-ci-robot commented Jan 8, 2025

Uh oh!

deepin-ci-robot commented Jan 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

deepin: KABI: KABI reservation for sched structures #551

deepin: KABI: KABI reservation for sched structures #551

Uh oh!

Conversation

opsiff commented Jan 8, 2025

Uh oh!

deepin-ci-robot commented Jan 8, 2025

Uh oh!

deepin-ci-robot commented Jan 8, 2025

deepin pr auto review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants