'h' for help, '2' for inline notes, 'p' for presenter notes
HTML5 slides. Created in python-landslide. For more background info see presenter notes. Press 'h'.
Most important slide. Please contact me. Work upstream on elfutils and valgrind. elfutils used by systemtap, systemd, perf. Happy to work on gdb, gcc, binutils etc. Work for Red Hat (which might explain some of my biases).
Seamlessly go from binary back to code. Limited to "native" ELF code. Stuff you build with GCC. Some recommendations. Please do/use this. Hope useful for others. Some questions. How would you do this? Hope to get feedback. Again. Happy to help implement this to make things easy everywhere.
I know how most of this stuff works in Fedora, but need help with other distros.
What is happening? Every user really is a "debugger". You want to observe (trace, profile) the system as it is running. Optimized code really is different from "debug code". Get useful backtraces. Analyse a core file "off-line".
Having everything ready makes recreating "debug environment" that matches the production environment precisely so much easier.
Run code with minimal privileges.
Breaks stuff, might cause use of more privileged code, security theater.
Context separation, sandboxing, pinning memory highly recommended. Useful security measures. Run as many things as possible in different contexts, with minimal privileges. Use selinux policies or syscall sandboxing to prevent programs from doing "evil" things. "deny ptrace" however breaks things. It isn't just "ptrace". It doesn't just block "debuggers". It blocks most inter-process synchronisation, signaling, debugging, tracing and profiling by normal (unprivileged) users. Both were tried in Fedora and were disabled in the end. With ptrace denied or restricted in scope you can no longer observe your own processes. So you also cannot drop privileges (debugging/profiling/tracing as root!). Everything still runs in the same user/security context, so a program can still manipulate files, insert preload code to disable the restriction, etc.
*) Make it a hard failure when they are not there.
ELF note. SHA1 hash of code sections in ELF image. Put there by the linker. Unique identifier. In each ELF image, executable, shared library. Loaded in memory. Put in core files.
Fedora errors out when no build-id is found. Debian only warns.
If your build is reproducible, then build-id will be the same. Stripping symbols or adding debuginfo should not change it.
If you know the build-ids and the addresses where the ELF images are/were loaded, then you have enough information to match any address to original source.
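A minimal sketch of how a tool could pull the build-id out of the raw bytes of an ELF note section or segment. It assumes the standard note layout (three 32-bit words for name size, descriptor size and type, followed by the name and descriptor, each padded to 4 bytes) and that the build-id note has name "GNU" and type 3. The function name `find_build_id` is made up for illustration; locating the note bytes inside the ELF file is left out.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used for GNU build-ids


def find_build_id(note_data, little_endian=True):
    """Return the build-id as a hex string, or None if no such note exists.

    note_data is the raw content of a note section/segment (assumption:
    the caller already located it in the ELF file or core file).
    """
    fmt = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(note_data):
        namesz, descsz, ntype = struct.unpack_from(fmt, note_data, offset)
        offset += 12
        name = note_data[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = note_data[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
    return None
```

Because the note is part of the loaded image, the same walk works on the note data of a binary on disk, a mapped image, or a core file.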
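The lookup described here can be sketched as follows. This is a simplified illustration: it assumes each ELF image occupies one contiguous, non-overlapping address range and reports the offset relative to the start of that range. A real tool would still have to translate that offset through the image's program headers before consulting debug info. The names `make_lookup` and `lookup` are hypothetical.

```python
import bisect


def make_lookup(images):
    """Build an address lookup from (start, end, build_id) tuples.

    Assumptions: end is exclusive and the ranges do not overlap.
    """
    ordered = sorted(images)
    starts = [start for start, _, _ in ordered]

    def lookup(address):
        # Find the last image whose start is <= address.
        i = bisect.bisect_right(starts, address) - 1
        if i < 0:
            return None
        start, end, build_id = ordered[i]
        if address >= end:
            return None  # falls in a gap between images
        return build_id, address - start

    return lookup
```

With the (build-id, offset) pair in hand, the build-id selects the right debug file and the offset selects the location inside it.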
Shameless plug eu-stack.
If you don't adopt anything else, please do this! Having reliable stacktraces is so useful. Fedora enables this on all arches. Shameless plug: elfutils eu-stack ('2').
.eh_frame is loaded in memory and so can be accessed easily and fast. Frame pointers only get you so far. It is always the tricky code, signal handlers, fork/exec/clone, unexpected termination in the prologue/epilogue that manipulates the frame pointer, etc. Maintaining frame pointers bloats code and reduces optimization opportunities. GCC is really good at it. GLIBC now has CFI for all/most hand written assembler.
Only exception might be the kernel. But then please put the unwind info in the .debug_frame.
Normally only .dynsym symbols available. This provides parts of .symtab normally stripped away. You cannot see "inlines" but still very useful. Recommended if the alternative would be having just "raw" addresses (for which you then need build-id to look them up).
Fedora only. But upstream support in gdb and elfutils. Slightly primitive. Alternative might be new ELF compressed section support. Or just not strip the full .symtab out (valgrind needs it in some cases - although only really for ld.so).
This is a lot of info and sadly somewhat entangled.
But please always generate it and then strip it into a separate .debug file.
This will give you inlines (program structure, which code ended up where). Arguments to functions and local variables, plus which value they have at which point in the program. The types and structures used by the program. Matching addresses to source lines. .gdb_index provides debuggers a quick way to navigate some of these structures, so they don't need to scan it all at startup, even if you only want to use a small portion. -g3 (which adds macro information) used to be very expensive, but recent GCC versions generate much denser data. Nobody really uses macros in debuginfo much though, since nobody generates them...so chicken, egg. Both indexing and dense macros are proposed as DWARFv5 extensions.
Please don't use compressed ELF sections (.zdebug or SHF_COMPRESSED)
Do use DWZ the DWARF optimization and duplicate removal tool
Full debuginfo is big! So yes, compress. But ELF section compression is the wrong level. It isn't supported by many programs (valgrind for example doesn't). There are two variants (if you use any, please not .zdebug). It prevents simply mmapping the data and using an index to only read/use what you need. It causes very slow startup. You can however use both if you really want to get the most compression. But I would recommend using DWZ only and then compressing the whole file(s) for storage (like in a package), but installing uncompressed for direct usage.
Package the (post-processed) sources (as they were compiled).
Can be adjusted in two ways:
debugedit also provides source file list found.
Put them somewhere under /usr/src/debug/[package-version]/
debugedit is both more and less flexible. It is more flexible because it provides you the actual source file list used in the DWARF describing the program. It is less flexible because it isn't a full DWARF rewriter. It adjusts the location/directories as long as they are smaller... So set up a big enough build root PATH name. It is probably time to rewrite debugedit to support proper DWARF rewriting.
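To illustrate why the replacement directory has to be no longer than the build directory, here is a toy sketch of rewriting path prefixes inside a table of NUL-terminated strings without changing its size. This is not debugedit's actual implementation, just the general constraint: if nothing may move, the new string has to fit in the space of the old one. The function name and the example paths are made up.

```python
def rewrite_prefix_in_place(table, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every NUL-terminated string.

    The table keeps its size and every string keeps its start offset,
    so the replacement must not be longer than the original prefix.
    """
    if len(new_prefix) > len(old_prefix):
        raise ValueError("replacement must not be longer than the original prefix")
    pieces = []
    for entry in table.split(b"\0"):
        if entry.startswith(old_prefix):
            rewritten = new_prefix + entry[len(old_prefix):]
            # Pad with NUL bytes so the following strings do not move.
            entry = rewritten.ljust(len(entry), b"\0")
        pieces.append(entry)
    return b"\0".join(pieces)
```

A full DWARF rewriter would instead regenerate the tables and fix up every offset that refers into them, which is what removes the length restriction.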
There are two ways to "link" your binaries to their debug files.
The .gnu_debuglink name has to be searched under well known paths (/usr/lib/debug + original location and/or subdirs). This makes it fragile, but more tools support it and it is the fallback used for when there is no build-id. But it might be time to deprecate/remove it because the file names inherently conflict between package versions.
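A sketch of the kind of search a consumer has to do for a .gnu_debuglink name. The order follows what gdb documents (same directory, a .debug subdirectory, then the global debug directory plus the original location); other tools may search differently. The function name is hypothetical and the CRC check that normally follows is left out.

```python
import posixpath


def debuglink_candidates(binary_path, debuglink_name, debug_root="/usr/lib/debug"):
    """List the places where a .gnu_debuglink file name might be found."""
    directory = posixpath.dirname(binary_path)
    return [
        posixpath.join(directory, debuglink_name),
        posixpath.join(directory, ".debug", debuglink_name),
        # Global debug directory + original location of the binary.
        posixpath.join(debug_root, directory.lstrip("/"), debuglink_name),
    ]
```

Note that every candidate depends on where the binary is installed and on a plain file name, which is exactly why two package versions cannot both install their debug files.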
Fedora supports both linking build-id.debug -> debuglink file. Fedora also throws in extra link to main exe under .build-id. But in debuginfo package, so that link could mismatch...
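For comparison, the build-id based location needs nothing but the build-id itself. A minimal sketch, assuming the common layout under /usr/lib/debug/.build-id/ where the first two hex digits form a subdirectory (the function name is made up):

```python
def build_id_debug_path(build_id_hex, debug_root="/usr/lib/debug"):
    """Return the conventional .build-id path for a separate debug file."""
    build_id_hex = build_id_hex.lower()
    if len(build_id_hex) < 3 or any(c not in "0123456789abcdef" for c in build_id_hex):
        raise ValueError("build-id must be a hex string of at least 3 digits")
    # First byte (two hex digits) is the directory, the rest is the file name.
    return f"{debug_root}/.build-id/{build_id_hex[:2]}/{build_id_hex[2:]}.debug"
```

The path is independent of the binary's name and install location, so it survives renames and works for core files where only the build-id is known.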
Work in progress.
To get there:
This is where I still have more questions than answers. build-ids can conflict for minor version updates (the files should be completely identical though). Dropping .gnu_debuglink (or changing install/renamed paths) will need program updates. Should the build-id-main-file backlinks move into main package?