passes. The first thing compiler engineers should focus on when pursuing
4799
4799
memory safety is to lower their frontend’s AST to MIR. Several compiled
4800
4800
languages already pass through a mid-level IR: Swift passes through
4801
-
SIL,<span class="citation" data-cites="sil">[<a href="https://github.com/swiftlang/swift/blob/main/docs/SIL.rst" role="doc-biblioref">sil</a>]</span> Rust passes through MIR,<span class="citation" data-cites="mir">[<a href="https://rustc-dev-guide.rust-lang.org/mir/index.html" role="doc-biblioref">mir</a>]</span> and Circle passes through it’s
4801
+
SIL,<span class="citation" data-cites="sil">[<a href="https://github.com/swiftlang/swift/blob/main/docs/SIL.rst" role="doc-biblioref">sil</a>]</span> Rust passes through MIR,<span class="citation" data-cites="mir">[<a href="https://rustc-dev-guide.rust-lang.org/mir/index.html" role="doc-biblioref">mir</a>]</span> and Circle passes through its
4802
4802
mid-level IR when targeting the new object model. There is an effort
4803
4803
called ClangIR<span class="citation" data-cites="clangir">[<a href="https://discourse.llvm.org/t/rfc-upstreaming-clangir/76587" role="doc-biblioref">clangir</a>]</span> to lower Clang to an MLIR
4804
4804
dialect called CIR, but the project is in an early phase and doesn’t
proposal/draft.md
+8-8
@@ -210,7 +210,7 @@ The "billion-dollar mistake" is a type safety problem. Consider `std::unique_ptr
210
210
211
211
As Hoare observes, the problem comes from conflating two different things, a pointer to an object and an empty state, into the same type and giving them the same interface. Smart pointers should only hold valid pointers. Denying the null state eliminates undefined behavior.
212
212
213
-
We address the type safety problem by overhauling the object model. Safe C++ features a new kind of move: [_relocation_](#relocation-object-model), also called _destructive move_. The object model is called an _affine_ or a _linear_ type system. Unless explicitly initialized, objects start out _uninitialized_. They can't be used in this state. When you assign to an object, it becomes initialized. When you relocate from an object, it's value is moved and it's reset to uninitialized. If you relocate from an object inside control flow, it becomes _potentially uninitialized_, and its destructor is conditionally executed after reading a compiler-generated drop flag.
213
+
We address the type safety problem by overhauling the object model. Safe C++ features a new kind of move: [_relocation_](#relocation-object-model), also called _destructive move_. The object model is called an _affine_ or a _linear_ type system. Unless explicitly initialized, objects start out _uninitialized_. They can't be used in this state. When you assign to an object, it becomes initialized. When you relocate from an object, its value is moved and it's reset to uninitialized. If you relocate from an object inside control flow, it becomes _potentially uninitialized_, and its destructor is conditionally executed after reading a compiler-generated drop flag.
214
214
215
215
`std2::box` is our version of `unique_ptr`. It has no null state. There's no default constructor. Dereference it without risk of undefined behavior. If this design is so much safer, why doesn't C++ simply introduce its own fixed `unique_ptr` without a null state? Blame C++11 move semantics.
216
216
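The null state that C++11 move semantics force on `std::unique_ptr` can be observed in standard C++. The following is a minimal sketch, not code from the proposal; `consume` and `null_after_move` are hypothetical names introduced for illustration.

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: takes ownership of the pointer and reads the value.
int consume(std::unique_ptr<int> p) { return *p; }

// Returns true if the moved-from pointer is left in the null state.
bool null_after_move() {
  std::unique_ptr<int> p = std::make_unique<int>(42);
  int value = consume(std::move(p));
  // C++11 move semantics leave `p` alive but null.
  // Dereferencing `p` at this point would be undefined behavior.
  return value == 42 && p == nullptr;
}
```

Because a moved-from object must remain in some valid state until its destructor runs, `unique_ptr` needs the null state. With relocation, the source object would instead become uninitialized and unusable.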
@@ -530,7 +530,7 @@ public:
530
530
};
531
531
```
532
532
533
-
The [safety model](#memory-safety-as-terms-and-conditions) establishes rules for where library code must insert panic calls. If a function is marked safe but is internally unsound for some values of its arguments, it should check those arguments and panic before executing the unsafe operation. Unsafe functions generally don't panic because its the responsibility of their callers to observe the preconditions of the function.
533
+
The [safety model](#memory-safety-as-terms-and-conditions) establishes rules for where library code must insert panic calls. If a function is marked safe but is internally unsound for some values of its arguments, it should check those arguments and panic before executing the unsafe operation. Unsafe functions generally don't panic because it's the responsibility of their callers to observe the preconditions of the function.
534
534
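The division of responsibility between safe and unsafe functions can be sketched in standard C++. This is only an illustration under stated assumptions: `panic`, `get_unchecked`, and `get_checked` are hypothetical names, and `std::abort` stands in for whatever panic mechanism the library actually uses.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a panic: report the failure and terminate.
[[noreturn]] inline void panic(const char* msg) {
  std::fprintf(stderr, "panic: %s\n", msg);
  std::abort();
}

// Plays the role of an unsafe function: no check is performed.
// Precondition (the caller's responsibility): index is within bounds.
inline int get_unchecked(const int* data, std::size_t index) {
  return data[index];
}

// Plays the role of a safe function: it is unsound for some argument
// values, so it checks them and panics before the unsafe operation.
inline int get_checked(const int* data, std::size_t size, std::size_t index) {
  if (index >= size) panic("index out of bounds");
  return get_unchecked(data, index);
}
```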
535
535
# Design overview
536
536
@@ -925,7 +925,7 @@ Garbage collection requires storing objects on the _heap_. But C++ is about _man
925
925
926
926
### Use-after-free
927
927
928
-
`std::string_view` was added to C++17 as a safer alternative to passing character pointers around. Unfortunately, its rvalue-reference constructor is so dangerously designed that its reported to _encourage_ use-after-free bugs.[@string-view-use-after-free]
928
+
`std::string_view` was added to C++17 as a safer alternative to passing character pointers around. Unfortunately, its rvalue-reference constructor is so dangerously designed that it's reported to _encourage_ use-after-free bugs.[@string-view-use-after-free]
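A short sketch of the hazard in standard C++ follows. The dangerous lines are left commented out on purpose, since executing them is undefined behavior; `make_greeting` and `greeting_length` are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Returns a temporary std::string by value.
std::string make_greeting() { return std::string("hello") + " world"; }

// Dangerous pattern (compiles without complaint on many toolchains):
//   std::string_view sv = make_greeting();  // the temporary dies at the semicolon
//   auto n = sv.size() ? sv[0] : '\0';      // reads freed memory

// Safer pattern: keep the owning string alive for as long as the view is used.
std::size_t greeting_length() {
  std::string owner = make_greeting();
  std::string_view sv = owner;  // the view refers to storage owned by `owner`
  return sv.size();
}
```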
I've relabelled the example to show function points and region names of variables and loans. If we run live analysis on 'R0, the region for the variable `ref`, we see it's live at points 'R0 = { 4, 8, 9, 10, 11 }. These are the points where its subsequently used. We'll grow the loan regions 'R1 and 'R2 until their constraint equations are satisfied.
1389
+
I've relabelled the example to show function points and region names of variables and loans. If we run live analysis on 'R0, the region for the variable `ref`, we see it's live at points 'R0 = { 4, 8, 9, 10, 11 }. These are the points where it's subsequently used. We'll grow the loan regions 'R1 and 'R2 until their constraint equations are satisfied.
1390
1390
1391
1391
`'R1 : 'R0 @ P3` means that starting at P3, 'R1 contains all points 'R0 does, along all control flow paths, as long as 'R0 is live. 'R1 = { 3, 4 }. Grow 'R2 the same way: 'R2 = { 7, 8, 9, 10, 11 }.
1392
1392
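The region-growing step can be sketched in standard C++. This is a minimal illustration, not Circle's implementation. It assumes a straight-line control flow graph with points 0 through 11 (the example's actual graph is not reproduced here) and assumes the 'R2 constraint is anchored at P7. The names `Region`, `Cfg`, `grow_region`, and `straight_line_cfg` are hypothetical.

```cpp
#include <map>
#include <set>
#include <vector>

using Region = std::set<int>;                 // set of program points
using Cfg = std::map<int, std::vector<int>>;  // point -> successor points

// Solves a single constraint `sub : sup @ start`: beginning at `start`,
// follow every control flow edge and add each reached point to the result
// for as long as that point is also in `sup` (that is, `sup` is live there).
Region grow_region(const Cfg& cfg, const Region& sup, int start) {
  Region result{start};
  std::vector<int> worklist{start};
  while (!worklist.empty()) {
    int point = worklist.back();
    worklist.pop_back();
    auto it = cfg.find(point);
    if (it == cfg.end()) continue;  // no successors
    for (int next : it->second) {
      // Only extend along paths where `sup` is live, and visit each point once.
      if (sup.count(next) && result.insert(next).second)
        worklist.push_back(next);
    }
  }
  return result;
}

// Assumed CFG for illustration: point p flows only to point p + 1.
Cfg straight_line_cfg(int num_points) {
  Cfg cfg;
  for (int p = 0; p + 1 < num_points; ++p) cfg[p].push_back(p + 1);
  return cfg;
}
```

Under those assumptions, growing from point 3 against 'R0 = { 4, 8, 9, 10, 11 } yields { 3, 4 }, and growing from point 7 yields { 7, 8, 9, 10, 11 }, which matches the regions quoted above. A full solver would repeat this over all constraints until no region changes.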
@@ -1440,7 +1440,7 @@ Circle tries to identify all three of these points when forming borrow checker e
1440
1440
1441
1441
The invariants that are tested are established with a network of lifetime constraints. It might not be the case that the invalidating action is obviously related to either the place of the loan or the use that extends the loan. More completely describing the chain of constraints could help users diagnose borrow checker errors. But there's a fine line between presenting an error like the one above, which is already pretty wordy, and overwhelming programmers with information.
1442
1442
1443
-
### Lifetime constraints on called functinos
1443
+
### Lifetime constraints on called functions
1444
1444
1445
1445
Borrow checking is easiest to understand when applied to a single function. The function is lowered to a control flow graph, the compiler assigns regions to loans and borrow variables, emits lifetime constraints where there are assignments, iteratively grows regions until the constraints are solved, and walks the instructions, checking for invalidating actions on loans in scope. Within the definition of the function, there's nothing it can't analyze. The complexity arises when passing and receiving borrows through function calls.
1446
1446
@@ -2555,7 +2555,7 @@ Lifetime safety also guarantees that the `lock_guard` is in scope (meaning the m
2555
2555
2556
2556
Interior mutability is a legal loophole around exclusivity. You're still limited to one mutable borrow or any number of shared borrows to an object. Types with a deconfliction strategy use `unsafe_cell` to safely strip the const off shared borrows, allowing users to mutate the protected resource.
2557
2557
2558
-
Safe C++ and Rust conflate exclusive access with mutable borrows and shared access with const borrows. It's is an economical choice, because one type qualifier, `const` or `mut`, also determines exclusivity. But the cast-away-const model of interior mutability is an awkward consequence. This design may not be the only way: The Ante language[@ante] experiments with separate `own mut` and `shared mut` qualifiers. That's really attractive, because you're never mutating something through a const reference. This three-state system doesn't map onto C++'s existing type system as easily, but that doesn't mean the const/mutable borrow treatment, which does integrate elegantly, is the most expressive. A `shared` type qualifier merits investigation during the course of this project.
2558
+
Safe C++ and Rust conflate exclusive access with mutable borrows and shared access with const borrows. It's an economical choice, because one type qualifier, `const` or `mut`, also determines exclusivity. But the cast-away-const model of interior mutability is an awkward consequence. This design may not be the only way: The Ante language[@ante] experiments with separate `own mut` and `shared mut` qualifiers. That's really attractive, because you're never mutating something through a const reference. This three-state system doesn't map onto C++'s existing type system as easily, but that doesn't mean the const/mutable borrow treatment, which does integrate elegantly, is the most expressive. A `shared` type qualifier merits investigation during the course of this project.
2559
2559
2560
2560
* `T^` - Exclusive mutable access. Permits standard conversion to `shared T^` and `const T^`.
2561
2561
* `shared T^` - Shared mutable access. Permits standard conversion to `const T^`. Only types that enforce interior mutability have overloads with shared mutable access.
@@ -2610,7 +2610,7 @@ class [[
2610
2610
2611
2611
`std2::mutex` is another candidate for use with `std2::arc`. This type is thread safe. As shown in the [thread safety](#thread-safety) example, it provides threads with exclusive access to its interior data using a synchronization object. The borrow checker prevents the reference to the inner data from being used outside of the mutex's lock. Therefore, `std2::mutex` is `sync` if its inner type is `send`. Why make it conditional on `send` when the mutex is already providing threads with exclusive access to the inner value? This provides protection for the rare type with thread affinity. A type is `send` if it can both be copied to a different thread _and used_ by a different thread.
2612
2612
2613
-
`std2::arc<std2::mutex<T>>` is `send` if `std2::mutex<T>` is `send` and `sync`. `std2::mutex<T>` is `send` and `sync` if `T` is `send`. Since most types are `send` by construction, we can safely mutate shared state over multiple threads as long as its wrapped in a `std2::mutex` and that's owned by an `std2::arc`. The `arc` provides shared ownership. The `mutex` provides shared mutation.
2613
+
`std2::arc<std2::mutex<T>>` is `send` if `std2::mutex<T>` is `send` and `sync`. `std2::mutex<T>` is `send` and `sync` if `T` is `send`. Since most types are `send` by construction, we can safely mutate shared state over multiple threads as long as it's wrapped in a `std2::mutex` and that's owned by an `std2::arc`. The `arc` provides shared ownership. The `mutex` provides shared mutation.
2614
2614
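For comparison, the same sharing pattern can be approximated in standard C++ with `std::shared_ptr` and `std::mutex`. This is only an analogy: standard C++ has no `send` or `sync` checking and does not stop code from touching the data without holding the lock. `Counter` and `count_with_threads` are hypothetical names.

```cpp
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Shared state: by convention, `lock` must be held while touching `value`.
struct Counter {
  std::mutex lock;
  int value = 0;
};

// Spawns `num_threads` workers that each increment the shared counter
// `increments` times, then returns the final count.
int count_with_threads(int num_threads, int increments) {
  auto shared = std::make_shared<Counter>();  // shared ownership
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([shared, increments] {
      for (int i = 0; i < increments; ++i) {
        std::lock_guard<std::mutex> guard(shared->lock);  // exclusive access
        ++shared->value;
      }
    });
  }
  for (auto& worker : workers) worker.join();
  return shared->value;  // all workers have finished, so this read is not racy
}
```

Here the programmer must remember to take the lock. In the model described above, the borrow checker is what prevents the inner reference from escaping the lock.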
2615
2615
```cpp
2616
2616
class thread {
@@ -2773,7 +2773,7 @@ We should also revise the policy for using lifetime parameters in class definiti
2773
2773
2774
2774
# Implementation guidance
2775
2775
2776
-
The intelligence behind the _ownership and borrowing_ safety model resides in the compiler's middle-end, in its _MIR analysis_ passes. The first thing compiler engineers should focus on when pursuing memory safety is to lower their frontend's AST to MIR. Several compiled languages already pass through a mid-level IR: Swift passes through SIL,[@sil] Rust passes through MIR,[@mir] and Circle passes through it's mid-level IR when targeting the new object model. There is an effort called ClangIR[@clangir] to lower Clang to an MLIR dialect called CIR, but the project is in an early phase and doesn't have enough coverage to support the language or library features described in this document.
2776
+
The intelligence behind the _ownership and borrowing_ safety model resides in the compiler's middle-end, in its _MIR analysis_ passes. The first thing compiler engineers should focus on when pursuing memory safety is to lower their frontend's AST to MIR. Several compiled languages already pass through a mid-level IR: Swift passes through SIL,[@sil] Rust passes through MIR,[@mir] and Circle passes through its mid-level IR when targeting the new object model. There is an effort called ClangIR[@clangir] to lower Clang to an MLIR dialect called CIR, but the project is in an early phase and doesn't have enough coverage to support the language or library features described in this document.
2777
2777
2778
2778
The AST->MIR and MIR->LLVM pipelines (or whatever codegen is used) fully replace the compiler's old AST->LLVM codegen. It is more difficult to lower through MIR than directly emitting LLVM, but implementing new codegen is not a very large investment. You can look into Circle's MIR support with the `-print-mir` and `-print-mir-drop` cmdline options, which print the MIR before and after drop elaboration, respectively.