|
| 1 | +# February, 2025 |
| 2 | +## Feb 22, 2025 |
| 3 | +### Progress |
| 4 | +In the past month, I got more of the basic compilation in place. Informal can now compile basic numeric expressions, load large constants in the binary, and compile basic branching blocks. That is enough of the concrete, fundamental building blocks to take the next step. It's now at a point where I see complexity slowly building up across the layers. Each construct adds to the surface area, and concepts have to be woven carefully through the layers of the system. Sound familiar? This is the complexity trap that ultimately swallows any large system. The code starts elegantly enough, but then accumulates more and more responsibilities, until the system itself calcifies into an amorphous blob that hides bugs, resists change and slowly destroys the momentum of the project. There are many ways to manage this trap. How do we manage it in Informal? And how can we do that in a way that benefits not just the compiler, but also the systems written in this language? |
| 5 | + |
| 6 | +### Pattern-based abstractions |
| 7 | +Abstractions are a double-edged sword. There's a push and pull between wanting to control every detail of execution and not wanting to be bothered by all of the minutiae. A culture of design patterns often becomes dogma: layers are added to simple systems until the layers themselves become the complexity, and things are unnecessarily generalized to support hypothetical future use-cases at the expense of added ceremony for today's practical needs. Good abstractions simplify the most common needs, and enable tackling problems that seemed impossible otherwise. They make complexity manageable. |
| 8 | + |
| 9 | +Pattern-based types are such a mechanism for abstraction in Informal. When you think about it, types represent a universe of values. You can define them in many ways - by their structure (structural / record types), by abstract names / labels (nominal types), by predicates (predicative types), or by patterns that describe their universe of values. A pattern for some text may look like `Date: "${DAY}-${MONTH}-${YEAR}"`. Such string patterns are often used as "f-strings" to turn values into text, but patterns can be used in either direction: to format values into a string, or to parse a string into its constituent values. Most languages already support rich patterns for structures of data, but what about arrays? Or binary data? Or... if programs are just values, can we represent them with types? |
| 10 | +``` |
| 11 | +ForLoop(VAR, COLLECTION, BODY): |
| 12 | +    for VAR in COLLECTION: |
| 13 | +        BODY |
| 14 | +``` |
| 15 | +This may look like regular code, but it gives you a mechanism to express the structure of code in code, using the normal constructs of the language. The uppercase parameters here represent code that is passed in lazily, in its raw symbolic form, leaving it to the receiving code to decide when and how to evaluate it. Just like with f-strings, this code pattern can be used to construct or to parse. You can match against such patterns in functions, and use them to desugar syntax, transform code patterns into faster equivalents, or optimize lower-level blocks of assembly intrinsics. Such term-rewriting systems are powerful enough to represent a lot of local transformations, but what if we want to express rules and optimizations that work holistically across the whole program? Things like register allocation, loop-invariant code motion or data-flow and control-flow analysis? |
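To make the construct-or-parse idea concrete, here is a minimal sketch of term rewriting in Python rather than Informal. Everything in it is an assumption made for illustration: terms are nested tuples, uppercase strings inside a pattern play the role of the uppercase parameters, and the names `match`, `build`, `rewrite` and the two rules are hypothetical, not part of the Informal compiler.

```python
# Terms are nested tuples such as ("mul", "a", 2). Uppercase strings in a
# pattern are variables, mirroring the uppercase parameters in the text.

def is_var(p):
    return isinstance(p, str) and p.isupper()

def match(pattern, term, env=None):
    """Parse direction: return variable bindings if `term` fits, else None."""
    env = dict(env or {})
    if is_var(pattern):
        if pattern in env and env[pattern] != term:
            return None  # the same variable must bind to the same term
        env[pattern] = term
        return env
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def build(pattern, env):
    """Construct direction: fill the pattern's variables from `env`."""
    if is_var(pattern):
        return env[pattern]
    if isinstance(pattern, tuple):
        return tuple(build(p, env) for p in pattern)
    return pattern

def rewrite(term, rules):
    """Rewrite bottom-up, applying the first matching rule at each node."""
    if isinstance(term, tuple):
        term = tuple(rewrite(t, rules) for t in term)
    for lhs, rhs in rules:
        env = match(lhs, term)
        if env is not None:
            return rewrite(build(rhs, env), rules)
    return term

# Hypothetical local optimizations, written as (pattern, replacement) pairs.
RULES = [
    (("mul", "X", 2), ("shl", "X", 1)),  # x * 2  becomes  x << 1
    (("add", "X", 0), "X"),              # x + 0  becomes  x
]

result = rewrite(("add", ("mul", "a", 2), 0), RULES)
```

The same pattern tuple serves both directions: `match` takes a term apart into bindings and `build` puts one together from bindings, which is the property the f-string analogy points at. Each rule only ever sees one node at a time, which is why this style stays local.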
| 16 | + |
| 17 | +For that, we turn to the other superpower of types in Informal: types are declarative. They express the fundamental constraints of what is true, rather than an imperative sequence of instructions for checking it. Expressing graph-coloring register allocation through Datalog-style rules is simple compared to the complexity of maintaining such invariants in imperative programs. This higher-level abstraction also opens the door to combining phases under the hood. I have the inkling of an idea for how to turn this Datalog-style rule-matching into a fast GPU operation, but that is far beyond the scope of this post. The important thing is that expressing these concepts as rules lets us state the fundamental constraints once, while iterating on different ways of evaluating those rules quickly. |
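As a rough illustration of what "Datalog-style rules" can look like for a whole-program analysis, the sketch below derives liveness and register interference for a tiny made-up program. It is written in Python, not Informal, and it is an assumption about the general technique rather than how Informal implements it: the relation names (`succ`, `defs`, `uses`, `live_in`, `live_out`, `interferes`) are hypothetical, and the rules are evaluated by naive repetition until nothing new is derived. It stops at the interference relation, which is the input to graph coloring, and does not perform the coloring itself.

```python
# Facts for a hypothetical three-instruction program:
#   0: a = 1
#   1: b = a + 1
#   2: return a + b
succ = {(0, 1), (1, 2)}                # succ(N, M): M may run right after N
defs = {("a", 0), ("b", 1)}            # defs(V, N): instruction N writes V
uses = {("a", 1), ("a", 2), ("b", 2)}  # uses(V, N): instruction N reads V

def liveness(succ, defs, uses):
    """Naive bottom-up evaluation of three rules, repeated to a fixpoint.

    live_in(V, N)  :- uses(V, N).
    live_in(V, N)  :- live_out(V, N), not defs(V, N).
    live_out(V, N) :- succ(N, M), live_in(V, M).
    """
    live_in, live_out = set(uses), set()
    while True:
        new_out = {(v, n) for (n, m) in succ for (v, k) in live_in if k == m}
        new_in = set(uses) | {fact for fact in new_out if fact not in defs}
        if new_in == live_in and new_out == live_out:
            return live_in, live_out
        live_in, live_out = new_in, new_out

live_in, live_out = liveness(succ, defs, uses)

# interferes(A, B) :- live_out(A, N), live_out(B, N), A != B.
interferes = {(a, b) for (a, n) in live_out for (b, m) in live_out
              if n == m and a != b}
```

The rules in the docstring say only what is true of a live variable; the `while` loop is one possible evaluation strategy and could be swapped for a faster one without touching the rules.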
| 18 | + |
| 19 | +### Restrictions on recursion and the stack-based model of function evaluation |
| 20 | +Stack semantics dominate all of the major modern languages. Calling a function implies pushing onto and popping off of a call stack. It's so pervasive that most CPU architectures have built-in instructions for such call/ret primitives. |
| 21 | +But this was not always the case. There's a [fascinating story](https://vanemden.wordpress.com/2014/06/18/how-recursion-got-into-programming-a-comedy-of-errors-3/) about how recursion made its way into Algol 60. Supporting recursion has a lot of implications for a language. In a language without recursion, to pass arguments to a function, you can write them to a fixed location (register or memory), confident that no other call will overwrite it while it is still needed. As long as function calls are acyclic (i.e. no direct or indirect recursion), this is a safe assumption. Of course, there are caveats for multi-threading, FFI/external calls and more. But disallowing recursion opens the door to vastly more efficient calling conventions. At the program level, this is an NP-class tiling problem, but there is a lot more freedom in the solution space without stack semantics. Disallowing recursion also makes Informal much easier to adapt to GPUs. Of course, we can still support tail recursion, which covers most of the common use-cases. |
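The fixed-location argument can be sketched in a few lines of Python. This is a simple greedy scheme chosen for illustration, not the calling convention Informal uses, and it ignores the threading and FFI caveats mentioned above. The function `static_frames` and the example call graph are hypothetical; the sketch assumes every function appears in `frame_size`, and it relies on the standard-library `graphlib` module (Python 3.9+) to reject recursive call graphs.

```python
from graphlib import CycleError, TopologicalSorter

def static_frames(calls, frame_size):
    """Give every function a fixed frame offset, or raise on recursion.

    calls:      {caller: set of callees}
    frame_size: {function: number of slots for its arguments and locals}
    """
    callers = {f: set() for f in frame_size}
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)
    try:
        # TopologicalSorter takes {node: predecessors}, so callers sort first.
        order = list(TopologicalSorter(callers).static_order())
    except CycleError as err:
        raise ValueError(f"recursive call chain: {err.args[1]}") from err
    offset = {}
    for f in order:
        # Start past every caller's frame: a caller is still live while f runs.
        offset[f] = max((offset[c] + frame_size[c] for c in callers[f]), default=0)
    return offset

# Hypothetical program: main calls parse and emit, parse calls lex.
CALLS = {"main": {"parse", "emit"}, "parse": {"lex"}}
FRAME_SIZE = {"main": 2, "parse": 3, "emit": 4, "lex": 1}
offsets = static_frames(CALLS, FRAME_SIZE)
```

In this example `parse` and `emit` are never active at the same time, so they receive the same offset and share memory, with no stack pointer to adjust at run time. Finding the tightest such layout for a real program is the harder packing problem the paragraph above alludes to; the greedy rule here only guarantees that frames on the same call chain never overlap.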
| 22 | + |