| 1 | +# MIR Background topics |
| 2 | + |
| 3 | +This section covers a number of common compiler terms that arise when |
| 4 | +talking about MIR and optimizations. We try to give the general |
| 5 | +definition while providing some Rust-specific context. |
| 6 | + |
| 7 | +<a name=cfg></a> |
| 8 | + |
| 9 | +## What is a control-flow graph? |
| 10 | + |
| 11 | +A control-flow graph is a common term from compilers. If you've ever |
| 12 | +used a flow-chart, then the concept of a control-flow graph will be |
| 13 | +pretty familiar to you. It's a representation of your program that |
| 14 | +exposes the underlying control flow in a very clear way. |
| 15 | + |
| 16 | +A control-flow graph is structured as a set of **basic blocks** |
| 17 | +connected by edges. The key idea of a basic block is that it is a set |
| 18 | +of statements that execute "together" -- that is, whenever you branch |
| 19 | +to a basic block, you start at the first statement and then execute |
| 20 | +all the remainder. Only at the end of the block is there the possibility of |
| 21 | +branching to more than one place (in MIR, we call that final statement |
| 22 | +the **terminator**): |
| 23 | + |
| 24 | +``` |
| 25 | +bb0: { |
| 26 | + statement0; |
| 27 | + statement1; |
| 28 | + statement2; |
| 29 | + ... |
| 30 | + terminator; |
| 31 | +} |
| 32 | +``` |
| 33 | + |
| 34 | +Many expressions that you are used to in Rust compile down to multiple |
| 35 | +basic blocks. For example, consider an if statement: |
| 36 | + |
| 37 | +```rust |
| 38 | +a = 1; |
| 39 | +if some_variable { |
| 40 | + b = 1; |
| 41 | +} else { |
| 42 | + c = 1; |
| 43 | +} |
| 44 | +d = 1; |
| 45 | +``` |
| 46 | + |
| 47 | +This would compile into four basic blocks: |
| 48 | + |
| 49 | +``` |
| 50 | +BB0: { |
| 51 | + a = 1; |
| 52 | + if some_variable { goto BB1 } else { goto BB2 } |
| 53 | +} |
| 54 | + |
| 55 | +BB1: { |
| 56 | + b = 1; |
| 57 | + goto BB3; |
| 58 | +} |
| 59 | + |
| 60 | +BB2: { |
| 61 | + c = 1; |
| 62 | + goto BB3; |
| 63 | +} |
| 64 | + |
| 65 | +BB3: { |
| 66 | + d = 1; |
| 67 | + ...; |
| 68 | +} |
| 69 | +``` |
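
As a rough illustration, the four blocks above can be modelled with a small data structure. This is a minimal sketch and not rustc's actual MIR representation: the `BasicBlock` and `Terminator` types and the `successors` and `example_cfg` helpers are hypothetical names made up for this example, and the `Return` terminator on the last block is an assumption, since the original leaves that block's ending elided.

```rust
// Hypothetical, simplified types for illustration only;
// these are NOT rustc's real MIR data structures.
#[derive(Debug)]
enum Terminator {
    // Unconditional jump to one block.
    Goto(usize),
    // Two-way branch on a condition.
    If { then_bb: usize, else_bb: usize },
    // Assumed ending for the last block (the original elides it).
    Return,
}

#[derive(Debug)]
struct BasicBlock {
    // Statements run in order, with no branching in between.
    statements: Vec<&'static str>,
    // Only the terminator can transfer control elsewhere.
    terminator: Terminator,
}

// The outgoing edges of a block are determined by its terminator alone.
fn successors(block: &BasicBlock) -> Vec<usize> {
    match block.terminator {
        Terminator::Goto(target) => vec![target],
        Terminator::If { then_bb, else_bb } => vec![then_bb, else_bb],
        Terminator::Return => vec![],
    }
}

// Builds the BB0..BB3 graph from the `if` example above.
fn example_cfg() -> Vec<BasicBlock> {
    vec![
        BasicBlock {
            statements: vec!["a = 1"],
            terminator: Terminator::If { then_bb: 1, else_bb: 2 },
        },
        BasicBlock { statements: vec!["b = 1"], terminator: Terminator::Goto(3) },
        BasicBlock { statements: vec!["c = 1"], terminator: Terminator::Goto(3) },
        BasicBlock { statements: vec!["d = 1"], terminator: Terminator::Return },
    ]
}

fn main() {
    let cfg = example_cfg();
    for (i, bb) in cfg.iter().enumerate() {
        println!("BB{}: {:?} -> {:?}", i, bb.statements, successors(bb));
    }
}
```

Keeping the edges on the terminator, rather than in a separate edge list, mirrors the idea that a block can only branch at its end.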
| 70 | + |
| 71 | +When using a control-flow graph, a loop simply appears as a cycle in |
| 72 | +the graph, and the `break` keyword translates into a path out of that |
| 73 | +cycle. |
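
To make the loop case concrete, here is a small sketch of a `loop` with a `break`, annotated with one plausible block layout. The function `count_to` and the block numbering in the comments are assumptions for illustration; the MIR that the compiler actually produces typically contains extra blocks and temporaries.

```rust
// Hypothetical example function; the block layout in the comments is a
// hand-written sketch, not actual compiler output.
fn count_to(limit: u32) -> u32 {
    // BB0: { i = 0; goto BB1 }
    let mut i = 0;
    loop {
        // BB1: { if i >= limit { goto BB3 } else { goto BB2 } }
        // The `break` is the edge BB1 -> BB3, the path out of the cycle.
        if i >= limit {
            break;
        }
        // BB2: { i = i + 1; goto BB1 }
        // The edge BB2 -> BB1 closes the cycle BB1 -> BB2 -> BB1.
        i += 1;
    }
    // BB3: { return i }
    i
}

fn main() {
    println!("{}", count_to(3));
}
```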
| 74 | + |
| 75 | +<a name=dataflow></a> |
| 76 | + |
| 77 | +## What is a dataflow analysis? |
| 78 | + |
| 79 | +*to be written* |
| 80 | + |
| 81 | +<a name=quantified></a> |
| 82 | + |
| 83 | +## What is "universally quantified"? What about "existentially quantified"? |
| 84 | + |
| 85 | +*to be written* |
| 86 | + |
| 87 | +<a name=variance></a> |
| 88 | + |
| 89 | +## What are co- and contra-variance? |
| 90 | + |
| 91 | +*to be written* |
| 92 | + |
| 93 | +<a name=free-vs-bound></a> |
| 94 | + |
| 95 | +## What is a "free region" or a "free variable"? What about "bound region"? |
| 96 | + |
| 97 | +Let's describe the concepts of free vs bound in terms of program |
| 98 | +variables, since that's the thing we're most familiar with. |
| 99 | + |
| 100 | +- Consider this expression: `a + b`. In this expression, `a` and `b` |
| 101 | + refer to local variables that are defined *outside* of the |
| 102 | + expression. We say that those variables **appear free** in the |
| 103 | + expression. To see why this term makes sense, consider the next |
| 104 | + example. |
| 105 | +- In contrast, consider this expression, which creates a closure: `|a, |
| 106 | + b| a + b`. Here, the `a` and `b` in `a + b` refer to the arguments |
| 107 | + that the closure will be given when it is called. We say that the |
| 108 | + `a` and `b` there are **bound** to the closure, and that the closure |
| 109 | + signature `|a, b|` is a **binder** for the names `a` and `b` |
| 110 | + (because any references to `a` or `b` within refer to the variables |
| 111 | + that it introduces). |
| 112 | + |
| 113 | +So there you have it: a variable "appears free" in some |
| 114 | +expression/statement/whatever if it refers to something defined |
| 115 | +outside of that expression/statement/whatever. Equivalently, we can |
| 116 | +then refer to the "free variables" of an expression -- which is just |
| 117 | +the set of variables that "appear free". |
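
The distinction can be seen in a few lines of Rust. This is only an illustrative sketch: the function `demo` and the closure names `add` and `add_b` are made up here, and the mixed case (one bound name, one free name) is an extra example not covered in the text above.

```rust
// Hypothetical demo function returning the three computed values.
fn demo() -> (i32, i32, i32) {
    let a = 1;
    let b = 2;

    // In the expression `a + b`, both names refer to variables defined
    // outside the expression, so both appear free in it.
    let sum_free = a + b;

    // The closure signature `|a, b|` is a binder: inside the body,
    // `a` and `b` refer to the closure's own arguments, not the outer locals.
    let add = |a: i32, b: i32| a + b;

    // Mixed case: `a` is bound by the closure signature, while `b` appears
    // free in the closure and refers to the outer local `b`.
    let add_b = |a: i32| a + b;

    (sum_free, add(10, 20), add_b(10))
}

fn main() {
    println!("{:?}", demo());
}
```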
| 118 | + |
| 119 | +So what does this have to do with regions? Well, we can apply the |
| 120 | +analogous concept to types and regions. For example, in the type `&'a |
| 121 | +u32`, `'a` appears free. But in the type `for<'a> fn(&'a u32)`, it |
| 122 | +does not. |
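
Both situations show up in ordinary Rust code. The following is a minimal sketch, assuming a made-up struct `Holder` and helper functions `read` and `apply`; the function pointer type here also has a `-> u32` return type, which the type in the text above does not, purely so the example produces a value.

```rust
// Within the field type `&'a u32`, the region `'a` appears free:
// it refers to something introduced outside that type, namely the
// `<'a>` parameter on the enclosing struct declaration.
struct Holder<'a> {
    value: &'a u32,
}

// Hypothetical helper that works for a reference with any region.
fn read(x: &u32) -> u32 {
    *x
}

// In `for<'a> fn(&'a u32) -> u32`, the `for<'a>` is a binder, so `'a`
// is bound within the type and does not appear free in it.
fn apply(f: for<'a> fn(&'a u32) -> u32, value: u32) -> u32 {
    // `f` must accept a reference to this local, whatever its region is.
    f(&value)
}

fn main() {
    let n = 5;
    let h = Holder { value: &n };
    println!("{}", apply(read, *h.value));
}
```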