@@ -11,6 +11,8 @@ project on its own that probably needs to have its own debugging document (not
1111that I could find one). But here are some tips that are important in a rustc
1212context:
1313
14+ ### Minimize the example
15+
1416As a general rule, compilers generate lots of information from analyzing code.
1517Thus, a useful first step is usually to find a minimal example. One way to do
1618this is to
@@ -24,6 +26,13 @@ everything relevant to the new crate
24263 . further minimize the issue by making the code shorter (there are tools that
2527help with this like ` creduce ` )
2628
29+ For more discussion of the methodology behind steps 2 and 3 above, see the
30+ [epic blog post][mcve-blog] from pnkfelix specifically about Rust program minimization.
31+
32+ [mcve-blog]: https://blog.pnkfx.org/blog/2019/11/18/rust-bug-minimization-patterns/
33+
34+ ### Enable LLVM internal checks
35+
2736The official compilers (including nightlies) have LLVM assertions disabled,
2837which means that LLVM assertion failures can show up as compiler crashes (not
2938ICEs but "real" crashes) and other sorts of weird behavior. If you are
@@ -34,12 +43,29 @@ anything turns up.
3443
3544The rustc build process builds the LLVM tools into
3645` ./build/<host-triple>/llvm/bin ` . They can be called directly.
46+ These tools include:
47+ * [`llc`], which compiles bitcode (`.bc` files) to native assembly or object code; this can be used to
48+ replicate LLVM backend bugs.
49+ * [`opt`], a bitcode transformer that runs LLVM optimization passes.
50+ * [`bugpoint`], which reduces large test cases to small, useful ones.
51+ * and many others, some of which are referenced in the text below.
52+
53+ [`llc`]: https://llvm.org/docs/CommandGuide/llc.html
54+ [`opt`]: https://llvm.org/docs/CommandGuide/opt.html
55+ [`bugpoint`]: https://llvm.org/docs/Bugpoint.html
56+
57+ By default, the Rust build system does not check for changes to the LLVM source code or
58+ its build configuration settings. So, if you need to rebuild the LLVM that is linked
59+ into ` rustc ` , first delete the file ` llvm-finished-building ` , which should be located
60+ in ` build/<host-triple>/llvm/ ` .
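A minimal sketch of the step above, assuming a POSIX shell. The helper name `force_llvm_rebuild` and the example triple are illustrative and not part of the Rust build system; only the stamp file name and its location come from the text.

```shell
# Remove the stamp file so that the next build rebuilds LLVM.
# $1 is the per-host build directory, i.e. build/<host-triple>.
force_llvm_rebuild() {
    stamp="$1/llvm/llvm-finished-building"
    if [ -e "$stamp" ]; then
        rm -- "$stamp"
        echo "removed $stamp"
    else
        echo "no stamp file at $stamp"
    fi
}

# Example call (the triple is only an example; use your own host triple):
# force_llvm_rebuild build/x86_64-unknown-linux-gnu
```

After the stamp file is gone, rebuild the compiler the way you normally do.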
3761
3862The default rustc compilation pipeline has multiple codegen units, which is
3963hard to replicate manually and means that LLVM is called multiple times in
4064parallel. If you can get away with it (i.e. if it doesn't make your bug
4165disappear), passing ` -C codegen-units=1 ` to rustc will make debugging easier.
4266
67+ ### Get your hands on raw LLVM input
68+
4369For rustc to generate LLVM IR, you need to pass the ` --emit=llvm-ir ` flag. If
4470you are building via cargo, use the ` RUSTFLAGS ` environment variable (e.g.
4571` RUSTFLAGS='--emit=llvm-ir' ` ). This causes rustc to spit out LLVM IR into the
@@ -52,24 +78,22 @@ other useful options. Also, debug info in LLVM IR can clutter the output a lot:
5278` RUSTFLAGS="-C debuginfo=0" ` is really useful.
5379
5480` RUSTFLAGS="-C save-temps" ` outputs LLVM bitcode (not the same as IR) at
55- different stages during compilation, which is sometimes useful. One just needs
56- to convert the bitcode files to ` .ll ` files using ` llvm-dis ` which should be in
57- the target local compilation of rustc.
81+ different stages during compilation, which is sometimes useful. The output LLVM
82+ bitcode will be in ` .bc ` files in the compiler's output directory, set via the
83+ ` --out-dir DIR ` argument to ` rustc ` .
5884
59- If you are seeing incorrect behavior due to an optimization pass, a very handy
60- LLVM option is ` -opt-bisect-limit ` , which takes an integer denoting the index
61- value of the highest pass to run. Index values for taken passes are stable
62- from run to run; by coupling this with software that automates bisecting the
63- search space based on the resulting program, an errant pass can be quickly
64- determined. When an ` -opt-bisect-limit ` is specified, all runs are displayed
65- to standard error, along with their index and output indicating if the
66- pass was run or skipped. Setting the limit to an index of -1 (e.g.,
67- ` RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1" ` ) will show all passes and
68- their corresponding index values.
85+ * If you are hitting an assertion failure or segmentation fault from the LLVM
86+ backend when invoking ` rustc ` itself, it is a good idea to try passing each
87+ of these ` .bc ` files to the ` llc ` command, and see if you get the same
88+ failure. (LLVM developers often prefer a bug reduced to a ` .bc ` file over one
89+ that uses a Rust crate for its minimized reproduction.)
6990
70- If you want to play with the optimization pipeline, you can use the ` opt ` tool
71- from ` ./build/<host-triple>/llvm/bin/ ` with the LLVM IR emitted by rustc. Note
72- that rustc emits different IR depending on whether ` -O ` is enabled, even
91+ * To get human-readable versions of the LLVM bitcode, convert
92+ the bitcode (`.bc`) files to `.ll` files using `llvm-dis`, which should be
93+ among the LLVM tools built with your local rustc.
94+
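The two bullets above can be sketched as a small loop over the `.bc` files. This is an illustration only: the helper name `for_each_bc` and the `out` directory are hypothetical, and the tool paths assume the `./build/<host-triple>/llvm/bin` location mentioned earlier.

```shell
# Run a command on every .bc file in a directory and report the files
# for which the command exits with a non-zero status.
# Usage: for_each_bc DIR COMMAND [ARGS...]
for_each_bc() {
    dir=$1
    shift
    for bc in "$dir"/*.bc; do
        [ -e "$bc" ] || continue          # the glob matched nothing
        "$@" "$bc" || echo "FAILED: $bc"
    done
}

# Hypothetical invocations, with `out` being the directory given to --out-dir:
# for_each_bc out ./build/<host-triple>/llvm/bin/llc -filetype=obj -o /dev/null
# for_each_bc out ./build/<host-triple>/llvm/bin/llvm-dis
```

The first invocation looks for a `.bc` file that makes `llc` fail on its own; the second writes a `.ll` file next to each `.bc` file.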
95+
96+ Note that rustc emits different IR depending on whether ` -O ` is enabled, even
7397without LLVM's optimizations, so if you want to play with the IR rustc emits,
7498you should:
7599
@@ -93,6 +117,18 @@ to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
93117codegen-units=1`, then, because the multiple codegen units run in parallel, the
94118printouts will mix together and you won't be able to read anything.
95119
120+ * One caveat to the aforementioned methodology: the ` -print ` family of options
121+ to LLVM only prints the IR unit that the pass runs on (e.g., just a
122+ function), and does not include any referenced declarations, globals,
123+ metadata, etc. This means you cannot in general feed the output of ` -print `
124+ into ` llc ` to reproduce a given problem.
125+
126+ * Within LLVM itself, calling `F.getParent()->dump()` at the beginning of
127+ `SafeStackLegacyPass::runOnFunction` will dump the whole module, which
128+ may provide a better basis for reproduction. (However, you
129+ should be able to get that same dump from the `.bc` files dumped by
130+ `-C save-temps`.)
131+
96132If you want just the IR for a specific function (say, you want to see why it
97133causes an assertion or doesn't optimize correctly), you can use ` llvm-extract ` ,
98134e.g.
@@ -105,6 +141,45 @@ $ ./build/$TRIPLE/llvm/bin/llvm-extract \
105141 > extracted.ll
106142```
107143
144+ ### Investigate LLVM optimization passes
145+
146+ If you are seeing incorrect behavior due to an optimization pass, a very handy
147+ LLVM option is ` -opt-bisect-limit ` , which takes an integer denoting the index
148+ value of the highest pass to run. Index values for taken passes are stable
149+ from run to run; by coupling this with software that automates bisecting the
150+ search space based on the resulting program, an errant pass can be quickly
151+ determined. When an ` -opt-bisect-limit ` is specified, all runs are displayed
152+ to standard error, along with their index and output indicating if the
153+ pass was run or skipped. Setting the limit to an index of -1 (e.g.,
154+ ` RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1" ` ) will show all passes and
155+ their corresponding index values.
156+
157+ If you want to play with the optimization pipeline, you can use the [`opt`] tool
158+ from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc.
159+
160+ When investigating the implementation of LLVM itself, you should be
161+ aware of its [internal debug infrastructure][llvm-debug].
162+ This is provided in LLVM Debug builds, which you enable for rustc
163+ LLVM builds by changing these settings in `config.toml`:
164+ ```
165+ [llvm]
166+ # Indicates whether the LLVM assertions are enabled or not
167+ assertions = true
168+
169+ # Indicates whether the LLVM build is a Release or Debug build
170+ optimize = false
171+ ```
172+ The quick summary is:
173+ * Setting `assertions=true` enables coarse-grained debug messaging.
174+ * Beyond that, setting `optimize=false` enables fine-grained debug messaging.
175+ * `LLVM_DEBUG(dbgs() << msg)` in LLVM is like `debug!(msg)` in `rustc`.
176+ * The `-debug` option turns on all messaging; it is like setting the
177+ environment variable `RUSTC_LOG=debug` in `rustc`.
178+ * The `-debug-only=<pass1>,<pass2>` variant is more selective; it is like
179+ setting the environment variable `RUSTC_LOG=path1,path2` in `rustc`.
180+
181+ [llvm-debug]: https://llvm.org/docs/ProgrammersManual.html#the-llvm-debug-macro-and-debug-option
182+
108183### Getting help and asking questions
109184
110185If you have some questions, head over to the [ rust-lang Zulip] and
@@ -164,7 +239,9 @@ create a minimal working example with Godbolt. Go to
164239 optimizations transform it.
165240
1662415 . Once you have a godbolt link demonstrating the issue, it is pretty easy to
167- fill in an LLVM bug. Just visit [ bugs.llvm.org] ( https://bugs.llvm.org/ ) .
242+ file an LLVM bug. Just visit their [GitHub issues page][llvm-issues].
243+
244+ [llvm-issues]: https://github.com/llvm/llvm-project/issues
168245
169246### Porting bug fixes from LLVM
170247