Merge branch 'master' into '6.0/stage' #8

github-actions · 2020-05-15T21:52:14Z

No description provided.

Currently, the kernel config is embedded inside of manage.py. This is inconvenient when building kernels manually for testing. Move it to a separate file so that, e.g., it can be copied into the build tree easily.

About half (!) of vmlinux comes from relocation sections (~130M out of ~250M). But, vmlinux is an ET_EXEC file, so relocations don't even apply to it. We can massively shrink vmlinux by removing the unnecessary sections.

And add docstrings to the functions with more complex signatures.

These will be shared with vmtest.

This should just be executed as python3 -m vmtest.manage now.

--build-kernel-org needs the token to get the available releases even if we're not uploading.

If the path doesn't exist, there are no available releases. Otherwise, we need to check for other errors.

Any changes of the vmtest kernel config will require a rebuild of all kernels and a way to distinguish the rebuild. Add CONFIG_LOCALVERSION which we will bump each time the config changes.

Add a verrevcmp() function based on the coreutils implementation (which comes from gnulib, which is derived from the implementation in dpkg). This will be used by vmtest.

The current implementation of vmtest has a few issues:

1. Building drgn for each kernel version on Travis is slow, mostly because they don't all run in parallel.
2. For local, incremental testing, recreating the filesystem image and rebuilding drgn is slow, and syncing the code to the filesystem image is brittle.
3. The filesystem image is the only communication channel, and reading the exit status from the filesystem image is awkward and fragile.
4. Creating and accessing the filesystem image requires root.

This reworks vmtest to use the build on the host via VirtFS with a simple agent on the guest that can execute arbitrary commands and return the exit status. This has a few more moving parts but is faster and saner overall.

disable drgn testing when constructing debian package

Currently, drgn_language_from_die() returns the default language when it encounters an unknown DW_LANG because the dwarf_info_cache always wants a language. The next change will want to detect the unknown language case, so make drgn_language_from_die() return NULL if the language is unknown, move it to language.c, and fold drgn_language_from_dw_lang() into it.

Jay reported that the default language detection was happening too early and not finding "main". We need to make sure to do it after the DWARF index is actually populated. The problem with that is that it makes error reporting much harder, as we don't want to return a fatal error from drgn_program_set_language_from_main() if we actually succeeded in loading debug info. That means we probably need to ignore errors in drgn_program_set_language_from_main(). To reduce the surface area where we'd be failing, let's get the language directly from the DWARF index. This also allows us to avoid setting the language if it's actually unknown (information which is lost by the time we convert it to a drgn_object in the current code).

The vmtest rework missed a few new files.

elfutils 0.179 was just released, and it includes my fix for vmcores with >= 2^16 phdrs.

Based on:

3a7728808 Prepare for 0.179

With the following patches:

configure: Add --disable-programs
configure: Add --disable-shared
libdwfl: add interface for attaching to/detaching from threads
libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame
libdwfl: export __libdwfl_frame_reg_get as dwfl_frame_register
libdwfl: add interface for evaluating DWARF expressions in a frame

The configure script allows the user to not use any OpenMP implementation, but dwarf_index.c uses the locking APIs unconditionally. This compiles but fails at runtime. Add simple stubs for the locking API. This is useful when debugging crashes in DWARF indexing during development.

A few recent changes weren't formatted with black.

drgn may be compiled with some CPU-specific features (e.g., -march=native), so make sure that we support those features inside of the VM, too.

The internal _page_offset() helper gets the value of PAGE_OFFSET, but the fallback when KASLR is disabled has been out of date since Linux v4.20 and never handled 5-level page tables. Additionally, it makes more sense as part of the Linux kernel (formerly vmcoreinfo) object finder so that it's cleanly accessible outside of drgn internals.

Similarly to PAGE_OFFSET, vmemmap makes more sense as part of the Linux kernel object finder than an internal helper. While we're here, let's fix the definition for 5-level page tables. This only matters for kernels with commit 77ef56e4f0fb ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") but without eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y") (namely, v4.14, v4.15, and v4.16); since v4.17, 5-level page table support enables KASLR.

Before Linux v4.11, /proc/kcore didn't have valid physical addresses, so it's currently not possible to read from physical memory on old kernels. However, if we can figure out the address of the direct mapping, then we can determine the corresponding physical addresses for the segments and add them.
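The address arithmetic behind this can be sketched as follows. This is only an illustration under stated assumptions, not the libdrgn code: direct_map_paddr() and its direct_map_size parameter are hypothetical, and the sketch assumes the direct mapping is a linear map of physical memory starting at PAGE_OFFSET.

```python
from typing import Optional


def direct_map_paddr(
    vaddr: int, size: int, page_offset: int, direct_map_size: int
) -> Optional[int]:
    """Return the physical address of a segment, or None if unknown.

    Hypothetical helper: a segment that lies entirely inside the direct
    mapping has paddr = vaddr - page_offset. Segments elsewhere (e.g.
    vmalloc space) cannot be translated this way.
    """
    if vaddr >= page_offset and vaddr - page_offset + size <= direct_map_size:
        return vaddr - page_offset
    return None
```

A segment outside the assumed direct-mapping window yields None, so a caller would leave its physical address unset rather than guess.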

This hasn't been used since commit 417a6f0 ("libdrgn: make memory reader pluggable with callbacks").

DRGN_UNREACHABLE() currently expands to abort(), but assert() provides more information. If NDEBUG is defined, we can use __builtin_unreachable() instead. DRGN_UNREACHABLE() isn't drgn-specific, so this renames it to UNREACHABLE(). It's also not really related to errors, so this moves it to internal.h.

Program_load_debug_info() is the last user of the resize_array()/realloc_array() utility functions. We can clean it up by using a vector and finally get rid of those functions. This also happens to fix three bugs in Program_load_debug_info(): we weren't setting a Python exception if we couldn't allocate the path_args array, we weren't zeroing path_args after resizing the array, and we weren't freeing the path_args array. Shame on whoever wrote this.

internal.h includes both drgn-specific helpers and generic utility functions. Split the latter into their own util.h header and use it instead of internal.h in the generic data structure code. This makes it easier to copy the data structures into other projects/test programs.

fls() can be implemented with __bitop(), and we can get rid of clz() since it's only used by fls().

c_integer_literal() has an open-coded equivalent of fls() that assumes that unsigned long long is 64 bits. Use fls() instead.

-fsanitize=undefined reports that the read_u* helpers rely on unaligned loads. Use memcpy() instead.

UNARY_OP_SIGNED_2C() uses a union of int64_t and uint64_t to avoid signed integer overflow... except that there's a typo and the uint64_t is actually an int64_t. Fix it and add a test that would catch it with -fsanitize=undefined.
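The idea of doing the arithmetic on the unsigned representation and reinterpreting the result as signed can be modeled in Python. This is a sketch of the technique, not the C macro: Python integers never overflow, so masking to 64 bits stands in for uint64_t wraparound, and the names to_s64() and neg_s64() are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def to_s64(u: int) -> int:
    """Reinterpret the low 64 bits of u as a two's complement int64_t."""
    u &= MASK64
    return u - (1 << 64) if u & (1 << 63) else u


def neg_s64(x: int) -> int:
    """Negate x with 64-bit wraparound instead of signed overflow."""
    # Negating the unsigned representation wraps modulo 2**64, which is
    # well defined; only the final reinterpretation is signed.
    return to_s64(-(x & MASK64))
```

The interesting case is the most negative value: neg_s64(-2**63) wraps back to -2**63, which is the operation that is undefined for a plain int64_t negation in C.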

There are a few big use cases for this in drgn:

* Helpers for accessing memory in the virtual address space of userspace tasks.
* Removing the libkdumpfile dependency for vmcores.
* Handling gaps in the virtual address space of /proc/kcore (cf. #27).

I dragged my feet on implementing this because I thought it would be more complicated, but the page table layout on x86-64 isn't too bad. This commit implements page table walking using a page table iterator abstraction. The first thing we'll add on top of this will be a helper for reading memory from a virtual address space, but in the future it'd also be possible to export the page table iterator directly.
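For readers unfamiliar with the layout, here is a minimal sketch of a 4-level x86-64 page table walk. It is not the iterator added by this commit: translate() and the read_u64 callback are hypothetical, 5-level paging is not handled, and only the present bit and the huge-page bit are interpreted.

```python
from typing import Callable

PAGE_SHIFT = 12
BITS_PER_LEVEL = 9
PRESENT = 1 << 0
PSE = 1 << 7  # huge page bit in PUD/PMD entries
ADDR_MASK = 0x000FFFFFFFFFF000  # physical address bits 51:12


def translate(read_u64: Callable[[int], int], pgd_phys: int, vaddr: int) -> int:
    """Translate vaddr to a physical address.

    read_u64(paddr) is an assumed callback that returns the 64-bit
    little-endian value stored at a physical address.
    """
    table = pgd_phys
    for level in (3, 2, 1, 0):  # PGD, PUD, PMD, PTE
        shift = PAGE_SHIFT + BITS_PER_LEVEL * level
        index = (vaddr >> shift) & 0x1FF
        entry = read_u64(table + 8 * index)
        if not entry & PRESENT:
            raise LookupError(f"address {vaddr:#x} is not mapped")
        # A PTE always maps a 4K page; a PUD/PMD entry with PSE set
        # maps a 1G/2M page directly.
        if level == 0 or (level in (1, 2) and entry & PSE):
            page_mask = (1 << shift) - 1
            return (entry & ADDR_MASK & ~page_mask) | (vaddr & page_mask)
        table = entry & ADDR_MASK
    raise AssertionError("unreachable")
```

Passing the memory reader as a callback keeps the walk independent of where physical memory comes from (a core dump, /proc/kcore, or a test fixture).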

Now that we can walk page tables, we can finally read memory from userspace tasks. Closes #53.

These are two of the most common use cases for reading a process's memory.

I originally wanted to avoid depending on another vmcoreinfo field, but the next change is going to depend on swapper_pg_dir in vmcoreinfo anyways, and it ends up being simpler to use it.

Now that we can walk page tables, we can use it in a memory reader that reads kernel memory via the kernel page table. This means that we don't need libkdumpfile for ELF vmcores anymore (although I'll keep the functionality around until this code has been validated more).

5.7 is up to rc4 (oops). Better late than never.

The automake/libtool compilation output is obnoxiously verbose. Switch on automake's silent mode, and make the custom rules honor it.

Rebase on master and fix dwfl_frame_module/dwfl_frame_dwarf_frame to decrement the program counter when necessary.

Based on:

a8493c12a libdw: Skip imported compiler_units in libdw_visit_scopes walking DIE tree

With the following patches:

configure: Add --disable-programs
configure: Add --disable-shared
libdwfl: simplify activation frame logic
libdwfl: add interface for attaching to/detaching from threads
libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame
libdwfl: export __libdwfl_frame_reg_get as dwfl_frame_register
libdwfl: add interface for evaluating DWARF expressions in a frame

For functions that call a noreturn function, the compiler may omit code after the call instruction. This means that the return address may not lie in the caller's symbol. dwfl_frame_pc() returns whether a frame is an "activation", i.e., its program counter is guaranteed to lie within the caller. This is only the case for the initial frame, frames interrupted by a signal, and the signal trampoline frame. For everything else, we need to decrement the program counter before doing any lookups.
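The effect of the decrement can be illustrated with a toy symbol table. This is a sketch only: lookup_pc(), find_symbol(), and the symbol tuples are hypothetical and do not correspond to libdwfl or libdrgn interfaces.

```python
import bisect
from typing import List, Optional, Tuple

Symbol = Tuple[int, int, str]  # (start, end exclusive, name)


def lookup_pc(pc: int, is_activation: bool) -> int:
    """Return the address to use for symbol/CFI lookups for a frame."""
    # For a non-activation frame, pc is a return address, which may be
    # one past the end of the caller if the call was its last instruction.
    return pc if is_activation else pc - 1


def find_symbol(symbols: List[Symbol], address: int) -> Optional[str]:
    """Find the symbol containing address; symbols are sorted by start."""
    starts = [start for start, _, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i >= 0 and symbols[i][0] <= address < symbols[i][1]:
        return symbols[i][2]
    return None
```

With symbols [(0x1000, 0x1010, "caller"), (0x1010, 0x1040, "next_func")] and a return address of 0x1010, looking up the raw address finds next_func, while looking up lookup_pc(0x1010, False) finds caller.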

Sync "6.0/stage" with "master" via GitHub Actions

The CI has intermittently been hitting the following test failures on Python 3.8 with Clang:

  ======================================================================
  ERROR: test_task_cpu (tests.linux_kernel.helpers.test_sched.TestSched)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "/home/runner/work/drgn/drgn/tests/linux_kernel/helpers/test_sched.py", line 40, in test_task_cpu
      with fork_and_stop(os.sched_setaffinity, 0, (cpu,)) as (pid, _):
    File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/contextlib.py", line 113, in __enter__
      return next(self.gen)
    File "/home/runner/work/drgn/drgn/tests/linux_kernel/__init__.py", line 203, in fork_and_stop
      ret = pickle.load(pipe_r)
  EOFError: Ran out of input

The EOFError occurs because the forked process segfaults immediately:

  python[132]: segfault at 7f8f87085014 ip 00007f8f891e9774 sp 00007ffccf7acf00 error 4 in ld-linux-x86-64.so.2[16774,7f8f891d5000+2a000] likely on CPU 0 (core 0, socket 0)

The segfault is on dereferencing cache_new in _dl_load_cache_lookup() in ld-linux here: https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-cache.c;h=88bf78ad7c914b02109d6ddef7e08c0e8fd4574d;hb=f94f6d8a3572840d3ba42ab9ace3ea522c99c0c2#l489

Which is coming from a libomp fork handler:

  #0 0x00007f5566f9d774 in _dl_load_cache_lookup (name=name@entry=0x7f55654afde6 "libmemkind.so") at ./elf/dl-cache.c:498
  #1 0x00007f5566f91982 in _dl_map_object (loader=loader@entry=0x55f8a170b670, name=name@entry=0x7f55654afde6 "libmemkind.so", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048191, nsid=<optimized out>) at ./elf/dl-load.c:2193
  #2 0x00007f5566f959a9 in dl_open_worker_begin (a=a@entry=0x7fffcf5851f0) at ./elf/dl-open.c:534
  #3 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf585050, operate=operate@entry=0x7f5566f95900 <dl_open_worker_begin>, args=args@entry=0x7fffcf5851f0) at ./elf/dl-error-skeleton.c:208
  #4 0x00007f5566f94f9a in dl_open_worker (a=a@entry=0x7fffcf5851f0) at ./elf/dl-open.c:782
  #5 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf5851d0, operate=operate@entry=0x7f5566f94f60 <dl_open_worker>, args=args@entry=0x7fffcf5851f0) at ./elf/dl-error-skeleton.c:208
  #6 0x00007f5566f9534e in _dl_open (file=<optimized out>, mode=-2147483647, caller_dlopen=0x7f55653fa882, nsid=-2, argc=9, argv=<optimized out>, env=0x55f8a1477e10) at ./elf/dl-open.c:883
  #7 0x00007f5566a6663c in dlopen_doit (a=a@entry=0x7fffcf585460) at ./dlfcn/dlopen.c:56
  #8 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf5853c0, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
  #9 0x00007f5566b4abd3 in __GI__dl_catch_error (objname=0x7fffcf585418, errstring=0x7fffcf585420, mallocedp=0x7fffcf585417, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:227
  #10 0x00007f5566a6612e in _dlerror_run (operate=operate@entry=0x7f5566a665e0 <dlopen_doit>, args=args@entry=0x7fffcf585460) at ./dlfcn/dlerror.c:138
  #11 0x00007f5566a666c8 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at ./dlfcn/dlopen.c:71
  #12 ___dlopen (file=<optimized out>, mode=<optimized out>) at ./dlfcn/dlopen.c:81
  #13 0x00007f55653fa882 in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
  #14 0x00007f5565413556 in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
  #15 0x00007f5565421d1a in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
  #16 0x00007f5566ac0fc1 in __run_fork_handlers (who=who@entry=atfork_run_child, do_locking=do_locking@entry=true) at ./posix/register-atfork.c:130
  #17 0x00007f5566ac08d3 in __libc_fork () at ./posix/fork.c:108
  #18 0x00007f5566e108ad in os_fork_impl (module=<optimized out>) at ./Modules/posixmodule.c:6250
  #19 os_fork (module=<optimized out>, _unused_ignored=<optimized out>) at ./Modules/clinic/posixmodule.c.h:2750

This doesn't happen in Python 3.9, which I bisected to CPython commit 45a78f906d2d ("bpo-44434: Don't call PyThread_exit_thread() explicitly (GH-26758)") (in v3.11, backported to v3.9.6). That commit describes a different symptom where the process aborts because libgcc_s can't be loaded. I don't understand how that issue can cause our crash, but the fix appears to be the same.

The discussion also suggests a workaround: linking to libgcc_s explicitly. Apply the workaround, which appears to fix our problem. We only do this for the CI and not for the general build for a few reasons:

1. I'm nervous about explicitly linking to this low-level library unconditionally, and the logic to decide when it's necessary (only for Python 3.8 and glibc) isn't worth the trouble.
2. The situation required to hit it (drgn + Python threading + fork) is unlikely outside of our test suite.
3. Python 3.8 is EOL.
4. Builds with libkdumpfile already pull in libgcc_s via libkdumpfile -> libsnappy -> libstdc++ -> libgcc_s.

Signed-off-by: Omar Sandoval <[email protected]>

osandov and others added 30 commits March 27, 2020 17:13

vmtest: move to top-level directory

739fe56

vmtest/manage: move kernel config to separate file

dda5980

Currently, the kernel config is embedded inside of manage.py. This is inconvenient when building kernels manually for testing. Move it to a separate file so that, e.g., it can be copied into the build tree easily.

vmtest/manage: remove relocations from vmlinux

32deb22

About half (!) of vmlinux comes from relocation sections (~130M out of ~250M). But, vmlinux is an ET_EXEC file, so relocations don't even apply to it. We can massively shrink vmlinux by removing the unnecessary sections.
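To check how much of an ELF image consists of relocation sections, one can sum the sizes of the SHT_REL and SHT_RELA sections. The following is a rough sketch rather than the code in manage.py: relocation_bytes() is a hypothetical helper, and it assumes a little-endian ELF64 image whose section count fits in e_shnum.

```python
import struct

SHT_RELA = 4
SHT_REL = 9


def relocation_bytes(elf: bytes) -> int:
    """Return the total size of relocation sections in an ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    # Elf64_Ehdr: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C.
    (e_shoff,) = struct.unpack_from("<Q", elf, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x3A)
    total = 0
    for i in range(e_shnum):
        shdr = e_shoff + i * e_shentsize
        # Elf64_Shdr: sh_type at offset 4, sh_size at offset 32.
        (sh_type,) = struct.unpack_from("<I", elf, shdr + 4)
        (sh_size,) = struct.unpack_from("<Q", elf, shdr + 32)
        if sh_type in (SHT_REL, SHT_RELA):
            total += sh_size
    return total
```

Comparing the result with the file size, e.g. relocation_bytes(open(path, "rb").read()) against os.path.getsize(path), gives the kind of ratio quoted above.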

vmtest/manage: add type annotations

0548926

And add docstrings to the functions with more complex signatures.

Split some utility functions out of setup.py

b369ca0

These will be shared with vmtest.

vmtest/manage: use util.nproc() instead of multiprocessing.cpu_count()

f73f4f4

vmtest/manage: chmod -x

435a344

This should just be executed as python3 -m vmtest.manage now.

vmtest/manage: always get dropbox token for --build-kernel-org

c55d199

--build-kernel-org needs the token to get the available releases even if we're not uploading.

vmtest/manage: don't open-code dict.fromkeys()

2a31b05

vmtest/manage: handle API errors in get_available_kernel_releases()

9d4b214

If the path doesn't exist, there are no available releases. Otherwise, we need to check for other errors.

vmtest: add -vmtestN localversion

6d8d415

Any changes of the vmtest kernel config will require a rebuild of all kernels and a way to distinguish the rebuild. Add CONFIG_LOCALVERSION which we will bump each time the config changes.

util: add version comparison implementation

7c7286e

Add a verrevcmp() function based on the coreutils implementation (which comes from gnulib, which is derived from the implementation in dpkg). This will be used by vmtest.
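For reference, the dpkg-style comparison can be sketched in Python as below. This is a transliteration of the well-known algorithm written for illustration, not necessarily the code added in this commit; the helper names _is_digit() and _order() are made up here.

```python
def _is_digit(c: str) -> bool:
    return "0" <= c <= "9"


def _order(c: str) -> int:
    """Sort key for a non-digit character: '~' < end of string < letters < rest."""
    if _is_digit(c):
        return 0
    if c.isascii() and c.isalpha():
        return ord(c)
    if c == "~":
        return -1
    return ord(c) + 256


def verrevcmp(v1: str, v2: str) -> int:
    """Return <0, 0, or >0 if v1 sorts before, equal to, or after v2."""
    i = j = 0
    while i < len(v1) or j < len(v2):
        first_diff = 0
        # Compare the non-digit prefixes character by character.
        while (i < len(v1) and not _is_digit(v1[i])) or (
            j < len(v2) and not _is_digit(v2[j])
        ):
            c1 = _order(v1[i]) if i < len(v1) else 0
            c2 = _order(v2[j]) if j < len(v2) else 0
            if c1 != c2:
                return c1 - c2
            i += 1
            j += 1
        # Compare the digit runs numerically: skip leading zeros, then
        # the longer run wins, otherwise the first differing digit.
        while i < len(v1) and v1[i] == "0":
            i += 1
        while j < len(v2) and v2[j] == "0":
            j += 1
        while i < len(v1) and j < len(v2) and _is_digit(v1[i]) and _is_digit(v2[j]):
            if not first_diff:
                first_diff = ord(v1[i]) - ord(v2[j])
            i += 1
            j += 1
        if i < len(v1) and _is_digit(v1[i]):
            return 1
        if j < len(v2) and _is_digit(v2[j]):
            return -1
        if first_diff:
            return first_diff
    return 0
```

Unlike a plain string comparison, this orders "5.6" before "5.10", and a tilde suffix such as "1.0~rc1" sorts before "1.0".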

disable drgn testing when constructing debian package

6c28b37

Merge pull request #6 from sdimitro/disable_testing

d53ffe5

disable drgn testing when constructing debian package

libdrgn: Add cpp language and tests

d8fadf1

Add missing files to MANIFEST.in

63613d4

The vmtest rework missed a few new files.

Update README to note that C++ support is in progress

2addae5

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

5ac71b4

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

98f8275

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

7b818f3

Run black on some stray changes

cf8d969

A few recent changes weren't formatted with black.

vmtest: use -cpu host instead of kvm64

3adc8f2

drgn may be compiled with some CPU-specific features (e.g., -march=native), so make sure that we support those features inside of the VM, too.

helpers: add pgtable_l5_enabled()

1dbc718

osandov and others added 25 commits May 4, 2020 13:20

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

1844ceb

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

596df1a

libdrgn: get rid of OFF_MAX

d759c7e

This hasn't been used since commit 417a6f0 ("libdrgn: make memory reader pluggable with callbacks").

libdrgn: improve and document bit operations

340e00d

fls() can be implemented with __bitop(), and we can get rid of clz() since it's only used by fls().
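As a point of reference, fls() ("find last set") has a one-line Python equivalent. The sketch below assumes the usual convention, as in the Linux kernel's fls(): a 1-based bit index, with 0 returned for an input of 0. It is an illustration, not the C implementation.

```python
def fls(x: int) -> int:
    """Return the 1-based index of the most significant set bit, or 0."""
    if x < 0:
        raise ValueError("fls() is only defined for non-negative integers")
    # int.bit_length() is the number of bits needed to represent x,
    # which is exactly the 1-based position of its highest set bit.
    return x.bit_length()
```

This is also the number of bits an unsigned value needs, which is the quantity a caller cares about when choosing an integer type wide enough for a literal.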

libdrgn: don't open-code fls()

3d59e04

c_integer_literal() has an open-coded equivalent of fls() that assumes that unsigned long long is 64 bits. Use fls() instead.

libdrgn: don't use unaligned loads to parse DWARF

8f81ea2

-fsanitize=undefined reports that the read_u* helpers rely on unaligned loads. Use memcpy() instead.

helpers: add access_process_vm() and access_remote_vm()

8a27683

Now that we can walk page tables, we can finally read memory from userspace tasks. Closes #53.

helpers: add cmdline() and environ()

4b82d1e

These are two of the most common use cases for reading a process's memory.
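Once the bytes have been read from the task's address space, turning them into arguments or environment variables is just a matter of splitting on NUL bytes, since the kernel lays both out as consecutive NUL-terminated strings. The sketch below shows only that parsing step; split_args() and parse_environ() are hypothetical names, not the helpers added here.

```python
from typing import Dict, List


def split_args(data: bytes) -> List[bytes]:
    """Split a block of NUL-terminated strings into a list."""
    if not data:
        return []
    # Drop the terminator of the last string so that split() does not
    # produce a trailing empty element.
    if data.endswith(b"\0"):
        data = data[:-1]
    return data.split(b"\0")


def parse_environ(data: bytes) -> Dict[bytes, bytes]:
    """Parse NUL-terminated NAME=value strings into a dictionary."""
    env = {}
    for entry in split_args(data):
        name, sep, value = entry.partition(b"=")
        if sep:
            env[name] = value
    return env
```

The values stay as bytes because a process's arguments and environment are not guaranteed to be valid text in any particular encoding.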

libdrgn: use swapper_pg_dir in vmcoreinfo for fallback PAGE_OFFSET

e697be7

I originally wanted to avoid depending on another vmcoreinfo field, but the next change is going to depend on swapper_pg_dir in vmcoreinfo anyways, and it ends up being simpler to use it.

setup.py: add 5.7 to vmtest kernels

62ddc96

5.7 is up to rc4 (oops). Better late than never.

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

5ed0631

libdrgn: build in silent mode by default

bf54510

The automake/libtool compilation output is obnoxiously verbose. Switch on automake's silent mode, and make the custom rules honor it.

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

e39448d

Merge branch 'refs/heads/upstream-HEAD' into repo-HEAD

8097403

Sync "6.0/stage" with "master" via GitHub Actions

adbc347

Merge pull request #7 from sdimitro/automate_backports_r

987c85a

Sync "6.0/stage" with "master" via GitHub Actions

sdimitro approved these changes May 15, 2020

sdimitro requested a review from prakashsurya May 15, 2020 21:55

prakashsurya approved these changes May 16, 2020

prakashsurya merged commit bf4c45c into 6.0/stage May 16, 2020

5 participants
