Version: 0.4.4.1
- Target Architecture: x86-64 (AMD64)
- Targets: Windows x64 (COFF/PE) + Linux x86-64 (ELF)
- Calling Conventions: Microsoft x64 ABI (Windows) + System V AMD64 ABI (Linux)
This document details the internal architecture, data structures, and algorithms used in the Baa compiler.
- Pipeline Architecture
- Component Boundaries & Size Guard
- Lexical Analysis
- Syntactic Analysis
- Abstract Syntax Tree
- Semantic Analysis
- Intermediate Representation
- IR Mem2Reg Pass
- IR Out-of-SSA Pass
- IR SSA Verification
- IR Well-Formedness Verification
- IR Canonicalization Pass
- IR CFG Simplification Pass
- IR Data Layout Module
- IR Constant Folding Pass
- IR Dead Code Elimination Pass
- IR Copy Propagation Pass
- IR Common Subexpression Elimination Pass
- IR Global Value Numbering Pass
- IR Loop Invariant Code Motion Pass
- IR Inlining Pass
- IR Loop Unrolling Pass
- Instruction Selection
- Register Allocation
- Code Generation
- Global Data Section
- Naming & Entry Point
- IR Developer Guide
- Code Review Checklist
The compiler is orchestrated by the Driver (src/driver/main.c), which acts as the entry point and build manager. It parses command-line arguments to determine which stages of compilation to run.
flowchart LR
A[".baa Source"] --> B["Lexer + Preprocessor"]
B -->|Tokens| C["Parser"]
C -->|AST| D["Semantic Analysis"]
D -->|Validated AST| E["IR Lowering"]
E -->|IR| F["Optimizer"]
F -->|Optimized IR| G["Code Generator"]
G --> H[".s Assembly"]
H -->|GCC -c| I[".o Object"]
I -->|GCC -o| J[".exe Executable"]
style A fill:#e1f5fe
style J fill:#c8e6c9
style E fill:#fff3e0
style F fill:#fff3e0
| Stage | Input | Output | Component | Description |
|---|---|---|---|---|
| 1. Frontend | .baa Source | AST | lexer.c, parser.c | Tokenizes, handles macros, and builds the syntax tree. |
| 2. Analysis | AST | Valid AST | analysis.c | Semantic Pass: Checks types, scopes, and resolves symbols. |
| 3. IR Lowering | AST | IR | ir_lower.c (v0.3.0.3+) + ir_builder.c | Converts AST expressions/statements to SSA-form Intermediate Representation using the IR Builder. |
| 4. Optimization | IR | Optimized IR | ir_optimizer.c, ir_mem2reg.c, ir_sccp.c, ir_gvn.c, etc. | Full middle-end: Inlining (O2), Mem2Reg, Canon, InstCombine, SCCP, ConstFold, CopyProp, GVN (O2), CSE (O2), DCE, CFGSimplify, LICM. |
| 5. Backend | IR | .s Assembly | isel.c, regalloc.c, emit.c | Lowers IR to machine instructions, allocates registers, and emits x86-64 AT&T assembly. |
| 6. Assemble | .s Assembly | .o Object | gcc -c | Invokes external assembler. On Windows (v0.4.4.1), toolchain calls run via ASCII staging paths, then outputs are copied back to requested UTF-8 paths. |
| 7. Link | .o Object | .exe Executable | gcc | Links with C Runtime. On Windows (v0.4.4.1), link inputs/outputs are staged on ASCII paths for GCC compatibility. |
Note (v0.3.2.4+): The compiler uses the full IR-based backend pipeline end-to-end: AST → IR → Optimizer → ISel → RegAlloc → Emit → Assembly.
flowchart TB
Driver["Driver / CLI\nsrc/driver/main.c"] --> Lexer["Lexer + Preprocessor\nsrc/frontend/lexer.c"]
Lexer --> Parser["Parser\nsrc/frontend/parser.c"]
Parser --> Analyzer["Semantic Analysis\nsrc/frontend/analysis.c"]
Analyzer --> Lower["IR Lowering\nsrc/middleend/ir_lower.c (v0.3.0.3+)"]
Lower --> IR["IR Module\nsrc/middleend/ir.c (v0.3.0+)"]
IR --> Backend["Backend\nsrc/backend/isel.c + src/backend/regalloc.c + src/backend/emit.c"]
Backend --> GCC["External Toolchain\nMinGW-w64 gcc"]
Driver --> Diagnostics["Diagnostics\nsrc/support/error.c"]
Driver --> Updater["Updater\nsrc/support/updater.c (Windows-only)"]
style Lower fill:#fff3e0
style IR fill:#fff3e0
style Backend fill:#fff3e0
The driver in main.c (v0.2.0+) supports multi-file compilation and various modes:
| Flag | Mode | Output | Action |
|---|---|---|---|
| (Default) | Compile & Link | .exe | Runs full pipeline. Deletes intermediate .s and .o files. |
| -o <file> | Custom Output | .exe | Sets the linked output filename (default: out.exe). |
| (Multiple Files) | Multi-File Build | .exe | Compiles each .baa to .o and links them. |
| -S, -s | Assembly Only | .s | Stops after code emission. Writes <input>.s (or -o when a single input file is used). |
| -c | Compile Only | .o | Stops after assembling. Writes <input>.o (or -o when a single input file is used). |
| -v | Verbose | - | Prints commands and compilation time; keeps intermediate .s files. |
| --debug-info | Debug Info | .s/.o/.exe | Emits source .file/.loc info and passes -g to toolchain. |
| --asm-comments | Assembly Comments | .s | Emits explanatory comments in generated assembly (prologue/epilogue/blocks). |
| -O0 / -O1 / -O2 | Optimization Level | - | Selects optimizer aggressiveness (-O1 is default). |
| --dump-ir | IR Dump | stdout | Prints Baa IR (Arabic) after semantic analysis (v0.3.0.6+). |
| --emit-ir | IR Emit | <input>.ir | Writes Baa IR (Arabic) to a .ir file after semantic analysis (v0.3.0.7). |
| --dump-ir-opt | Optimized IR Dump | stdout | Prints Baa IR (Arabic) after optimization (v0.3.2.6.5). |
| --verify | Verify (All) | stderr | Runs --verify-ir + --verify-ssa (requires -O1/-O2) (v0.3.2.9.1). |
| --verify-ir | IR Verification | stderr | Verifies IR well-formedness (operands/types/terminators/phi/calls) after optimization and before Out-of-SSA/backend (v0.3.2.6.5). |
| --verify-ssa | SSA Verification | stderr | Verifies SSA invariants after Mem2Reg and before Out-of-SSA (requires -O1/-O2) (v0.3.2.5.3). |
| --verify-gate | Verifier Gate (Debug) | stderr | Runs --verify-ir/--verify-ssa after each optimizer iteration (requires -O1/-O2) (v0.3.2.6.5). |
| --time-phases | Phase Timings | stderr | Prints per-phase timing and IR arena memory stats ([TIME]/[MEM]) (v0.3.2.9.2). |
| --target=<t> | Target Select | .s/.o/.exe | Selects backend target: x86_64-windows or x86_64-linux. |
| -fPIC / -fPIE | Code Model (ELF) | .s/.o/.exe | Enables PIC/PIE-friendly emission on Linux/ELF. |
| -fno-pic / -fno-pie | Disable PIC/PIE | .s/.o/.exe | Disables PIC/PIE modes. |
| -mcmodel=small | Code Model | .s/.o/.exe | Uses small code model (only supported model). |
| -fstack-protector / -fstack-protector-all / -fno-stack-protector | Stack Protector (ELF) | .s/.o/.exe | Controls stack-canary emission on Linux/ELF. |
| -funroll-loops | Loop Unrolling (Opt-in) | - | Conservatively fully-unrolls small constant-trip-count loops after Out-of-SSA (v0.3.2.7.1). |
| --version | Version Info | stdout | Displays compiler version and build date. |
| --help, -h | Help | stdout | Shows usage information. |
| update | Self-Update | - | Downloads and installs the latest version. |
The source tree now uses physical component directories under src/, and the build targets those component source files directly:
| Component | Current scope |
|---|---|
| Frontend | source loading, lexing, preprocessing, parsing, AST construction |
| Middle-End | semantic analysis, IR construction, IR verification, IR optimization |
| Backend | target-aware IR lowering, register allocation, assembly emission |
| Driver | CLI orchestration, staging/toolchain execution, updater entry points |
| Support | shared diagnostics and shared declarations |
The full ownership/dependency contract now lives in Component Ownership.
Component-local internal facades are also in place for the main implementation roots:
- src/frontend/frontend_internal.h
- src/middleend/middleend_internal.h
- src/backend/backend_internal.h
- src/driver/driver_internal.h
- src/support/support_internal.h
These headers are not public API; they centralize implementation-facing includes during the transition.
Size governance for handwritten modules is also active:
- scripts/check_module_sizes.py scans src/**/*.c and src/**/*.h.
- Warning threshold: 700 physical lines per file.
- Error threshold: 1000 physical lines per file.
- scripts/qa_run.py --mode full|stress runs the guard before the expensive QA stages.
- CI runs the same guard before full QA on both Windows and Linux.
This remains a transitional hardening step: implementation files and component-owned headers now live under component directories, and local internal facade headers reduce direct cross-component include leakage.
Current in-place split pattern (2026-03-06):
- parser.c now delegates to parser_types.c, parser_expr.c, parser_stmt.c, and parser_decl.c.
- analysis.c now delegates to analysis_scope.c, analysis_types.c, analysis_semantic_utils.c, analysis_builtins.c, analysis_format.c, analysis_infer_expr.c, and analysis_visit.c.
- lexer.c, isel.c, regalloc.c, ir.c, ir_text.c, ir_verify_ir.c, ir_lower.c, and emit.c also use companion implementation files to shrink the original hotspots while preserving their exported entry points.
- scripts/module_size_allowlist.txt is currently empty; the size guard has no active legacy exceptions.
- driver*.h and process.h now live under src/driver/.
- emit.h, isel.h, regalloc.h, target.h, and code_model.h now live under src/backend/.
- all ir*.h headers now live under src/middleend/.
- lexer.h, ast.h, parser.h, and analysis.h now live under src/frontend/ as the frontend-owned public surface.
- version.h, read_file.h, diagnostics.h, updater.h, and target_contract.h now live under src/support/ as the support-owned public surface.
- src/baa.h is now a compatibility umbrella over those component-owned headers.
- support/diagnostics.h no longer pulls in frontend lexer declarations directly; error_report(...) is a compatibility macro over error_report_loc(...).
- the build intentionally uses no project-wide include directories; source files must use same-directory includes or explicit relative component paths.
- the src/ root now effectively contains only the baa.h compatibility umbrella and resource files.
The repository includes a small benchmark suite under bench/ and a runner script:
- Runner: scripts/bench.py
- Bench programs: bench/runtime_*.baa (compile+run) and bench/compile_*.baa (compile-time/memory)
Examples:
# Run all benchmarks (compile + runtime)
python3 scripts/bench.py --mode all
# Compile-only benchmark (compiler only, no assembler/linker)
python3 scripts/bench.py --mode compile_s --opt O2
# Memory profiling on Linux (uses /usr/bin/time -v)
python3 scripts/bench.py --mode mem --opt O2
# Include verifier and per-phase stats
python3 scripts/bench.py --mode compile_s --opt O2 --verify --time-phases
Notes:
- The runner uses repo-relative paths to avoid toolchain quoting issues when the repo path contains spaces.
- --time-phases prints [TIME]/[MEM] lines to stderr for machine parsing.
- Primary runner: scripts/qa_run.py
  - --mode quick: integration smoke (tests/integration/**/*.baa via tests/test.py)
  - --mode full: integration + regression + verify smoke + multi-file smoke
  - --mode stress: full + tests/stress/*.baa + seeded fuzz-lite (timeout-guarded)
- Legacy runners remain valid:
  - tests/test.py (integration)
  - tests/regress.py (integration + corpus + negatives)
- On all hosts: docs-derived v0.2.x corpus runs under tests/corpus_v2x_docs/.
- Test metadata markers (recognized by tests/test.py and tests/regress.py):
  - // RUN: execution contract (expect-pass, expect-fail, runtime, compile-only, skip)
  - // FLAGS: per-test compiler flags
  - // EXPECT: negative diagnostic anchor(s)
  - // ARGS: runtime executable arguments
  - // STDIN: stdin lines for runtime tests (may repeat; joined with \n + trailing newline)
  - // EXPECT-ASM: assembly substring expectations (used with -S compile-only tests)
- Stress programs live under tests/stress/.
Windows build:
cmake -B build -G "MinGW Makefiles"
cmake --build build
python scripts\qa_run.py --mode full
Linux build:
cmake -B build-linux -DCMAKE_BUILD_TYPE=Release
cmake --build build-linux -j
python3 scripts/qa_run.py --mode full
The compiler uses a centralized Diagnostic Module (src/support/error.c) to handle errors and warnings.
Note (v0.3.2.9.4): semantic analysis errors now use error_report(...) as well, so semantic diagnostics include file:line:col and source context like parser errors.
Error Features:
- Source Context: Prints the actual line of code where the error occurred.
- Pointers: Uses ^ to point exactly to the offending token.
- Colored Output: Errors displayed in red (ANSI) when terminal supports it (v0.2.8+).
- Panic Mode Recovery (v0.3.7): When a syntax error is found, the parser does not exit immediately. It reports the error, enters panic mode, then synchronizes by context:
  - Statement mode: sync on ., } and statement starters.
  - Declaration mode: sync on declaration starters (صحيح, نص, هيكل, اتحاد, ثابت, ساكن, ...).
  - Switch mode: sync on حالة, افتراضي, } and statement terminators.
  - Parsing resumes after the nearest valid anchor to reduce cascading diagnostics.
Warning Features (v0.2.8+):
- Non-fatal: Warnings do not stop compilation by default.
- Colored Output: Warnings displayed in yellow (ANSI) when terminal supports it.
- Warning Names: Each warning shows its type in brackets: [-Wunused-variable].
- Configurable: Enable with -Wall or specific -W<type> flags.
- Errors Mode: Use -Werror to treat warnings as fatal errors.
- Numeric Diagnostics (v0.3.5.5): -Wimplicit-narrowing and -Wsigned-unsigned-compare.
ANSI Color Support:
- Windows 10+: Automatically enables Virtual Terminal Processing.
- Unix/Linux: Detects TTY via isatty().
- Override with -Wcolor (force on) or -Wno-color (force off).
The Lexer (src/frontend/lexer.c) transforms raw bytes into Token structures.
The Lexer now supports Nested Includes via a state stack and Macro Definitions.
// Represents the state of a single file being parsed
typedef struct {
    char* source;          // Full source code buffer (owned by this state)
    char* cur_char;        // Current reading pointer
    const char* filename;  // Current file name
    int line;
    int col;
} LexerState;

// Definition (Macro)
typedef struct {
    char* name;   // Macro name
    char* value;  // Replacement value
} Macro;

// The main Lexer context
typedef struct {
    // Current state
    LexerState state;

    // Include stack
    LexerState stack[10];  // Maximum include depth: 10
    int stack_depth;

    // Preprocessor state
    Macro macros[100];     // Macro table (maximum 100)
    int macro_count;       // Number of defined macros
    bool skipping;         // Are we in skipping mode? (derived from the conditional stack)

    // Conditional stack (#إذا_عرف/#وإلا/#نهاية) so that nesting is handled correctly
    struct {
        unsigned char parent_active;
        unsigned char cond_true;
        unsigned char in_else;
    } if_stack[32];
    int if_depth;
} Lexer;

Lexer Limits:
| Limit | Value | Description |
|---|---|---|
| Max Include Depth | 10 | stack[10] - maximum nested #تضمين |
| Max Macros | 100 | macros[100] - maximum #تعريف macros |
| Max Conditional Nesting | 32 | if_stack[32] - maximum nested #إذا_عرف |
The preprocessor is integrated directly into the lexer_next_token function. It intercepts directives starting with # before tokenizing normal code.
When #تعريف NAME VALUE is encountered:
- The name and value are parsed as strings.
- They are stored in the macros table.
- When the Lexer later encounters an IDENTIFIER:
  - It checks the macro table
  - If found, replaces the token's value with the macro value
  - Updates the token type based on the value (INT if numeric, STRING if quoted, IDENTIFIER otherwise)
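The lookup-and-reclassify step described above can be sketched as follows. This is only an illustration under stated assumptions: `MacroTable`, `macro_define`, `macro_lookup`, `classify_value`, and the `TokKind` values are hypothetical names, not the lexer's actual API, and a plain linear search is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_MACROS 100  /* mirrors macros[100] in the Lexer struct */

/* Hypothetical token kinds, used only by this sketch. */
typedef enum { TOK_INT, TOK_STRING, TOK_IDENTIFIER } TokKind;

typedef struct {
    const char *name;
    const char *value;
} MacroEntry;

typedef struct {
    MacroEntry entries[MAX_MACROS];
    int count;
} MacroTable;

/* Adds a macro; returns 0 on success, -1 when the table is full. */
static int macro_define(MacroTable *t, const char *name, const char *value) {
    assert(t && name && value);
    if (t->count >= MAX_MACROS) return -1;
    t->entries[t->count].name = name;
    t->entries[t->count].value = value;
    t->count++;
    return 0;
}

/* Linear search; returns the replacement text, or NULL if undefined. */
static const char *macro_lookup(const MacroTable *t, const char *name) {
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0) return t->entries[i].value;
    return NULL;
}

/* Re-types a substituted token from the shape of its replacement text:
 * quoted text is a STRING, all-digit text is an INT, anything else stays
 * an IDENTIFIER. */
static TokKind classify_value(const char *value) {
    size_t n = strlen(value);
    if (n >= 2 && value[0] == '"' && value[n - 1] == '"') return TOK_STRING;
    if (n == 0) return TOK_IDENTIFIER;
    for (size_t i = 0; i < n; i++)
        if (!isdigit((unsigned char)value[i])) return TOK_IDENTIFIER;
    return TOK_INT;
}
```

In this sketch the table stores borrowed pointers; the `Macro` struct shown earlier uses `char*` fields, so the real lexer presumably owns copies of both strings.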
When #إذا_عرف NAME is encountered:
- The lexer checks if NAME exists in the macro table.
- If it exists, normal parsing continues.
- If not, the lexer enters Skipping Mode.
- In Skipping Mode, all tokens are discarded until #وإلا or #نهاية is found.
When #الغاء_تعريف NAME is encountered:
- The lexer searches for NAME in the macro table.
- If found, the entry is removed (by shifting subsequent entries).
- If not found, the directive is ignored.
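Removal by shifting can be illustrated with a small self-contained sketch. `MacroEntry`, `MacroTable`, and `macro_undef` are hypothetical names chosen for this example, and string ownership is left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_MACROS 100

typedef struct { const char *name; const char *value; } MacroEntry;
typedef struct { MacroEntry entries[MAX_MACROS]; int count; } MacroTable;

/* Removes `name` if present by shifting the later entries down one slot,
 * which keeps the remaining macros in definition order.
 * Returns 1 when an entry was removed, 0 when the name was not defined. */
static int macro_undef(MacroTable *t, const char *name) {
    assert(t && name);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            memmove(&t->entries[i], &t->entries[i + 1],
                    (size_t)(t->count - i - 1) * sizeof t->entries[0]);
            t->count--;
            return 1;
        }
    }
    return 0;  /* unknown name: the directive is silently ignored */
}
```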
When #تضمين "file" is encountered:
- The filename is extracted from the quoted string.
- Include resolution tries, in order:
  - source-file directory (<source_dir>/<path>),
  - exact path as written,
  - {BAA_HOME}/<path> (for relative paths),
  - CLI include paths from -I (in the user-provided order),
  - for bare names: <source_dir>/stdlib/<name>, stdlib/<name>, {BAA_STDLIB}/<name>, {BAA_HOME}/stdlib/<name>.
- The first successful candidate is normalized to a canonical active path.
- The normalized path is checked against the current include stack to reject cycles early.
- The selected file is read into memory.
- The current lexer state is pushed onto the include stack.
- The lexer state is updated to point to the new file's content.
- When EOF is reached, the previous state is popped and restored.
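The cycle check and depth limit from the steps above could look roughly like this. It is a simplified sketch: `IncludeStack`, `include_enter`, and `include_leave` are hypothetical names, the stack here tracks every active file (the real `Lexer` keeps the current state outside `stack[10]`), and paths are assumed to be already normalized.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_INCLUDE_DEPTH 10  /* mirrors stack[10] in the Lexer struct */

typedef struct {
    const char *active[MAX_INCLUDE_DEPTH];  /* canonical paths currently open */
    int depth;
} IncludeStack;

typedef enum { INC_OK, INC_CYCLE, INC_TOO_DEEP } IncludeResult;

/* Called before switching the lexer to a new file. Rejects a path that is
 * already open further down the stack, and enforces the depth limit. */
static IncludeResult include_enter(IncludeStack *s, const char *canonical_path) {
    assert(s && canonical_path);
    for (int i = 0; i < s->depth; i++)
        if (strcmp(s->active[i], canonical_path) == 0) return INC_CYCLE;
    if (s->depth >= MAX_INCLUDE_DEPTH) return INC_TOO_DEEP;
    s->active[s->depth++] = canonical_path;
    return INC_OK;
}

/* Called at EOF of an included file to restore the previous one. */
static void include_leave(IncludeStack *s) {
    assert(s && s->depth > 0);
    s->depth--;
}
```

Because only active files are tracked, including the same header twice in sequence is accepted; only re-entry while the file is still open counts as a cycle.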
The preprocessor supports nested conditionals via if_stack[32]:
| Field | Purpose |
|---|---|
| parent_active | Was the parent conditional block active? |
| cond_true | Is the current condition (or branch) true? |
| in_else | Are we currently in the #وإلا (else) branch? |
Nesting rules:
- Maximum 32 nested conditional levels
- #إذا_عرف pushes a new level onto if_stack
- #وإلا toggles cond_true within the current level
- #نهاية pops the current level
- skipping mode is computed from the stack state
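One way to derive `skipping` from the three per-level fields is sketched below. The function names (`cond_push`, `cond_else`, `cond_pop`, `cond_update`) are hypothetical, and rejecting a second #وإلا at the same level is an assumption of this sketch rather than documented behavior.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IF_DEPTH 32  /* mirrors if_stack[32] */

typedef struct {
    unsigned char parent_active;
    unsigned char cond_true;
    unsigned char in_else;
} IfFrame;

typedef struct {
    IfFrame stack[MAX_IF_DEPTH];
    int depth;
    bool skipping;
} CondState;

/* Recomputes skipping: code is live only when the enclosing block was live
 * and the current branch is the true one. */
static void cond_update(CondState *s) {
    if (s->depth == 0) { s->skipping = false; return; }
    const IfFrame *top = &s->stack[s->depth - 1];
    s->skipping = !(top->parent_active && top->cond_true);
}

/* #إذا_عرف: `defined` is the macro-table lookup result. False on overflow. */
static bool cond_push(CondState *s, bool defined) {
    assert(s);
    if (s->depth >= MAX_IF_DEPTH) return false;
    IfFrame *f = &s->stack[s->depth++];
    f->parent_active = !s->skipping;  /* still the value from before the push */
    f->cond_true = defined;
    f->in_else = 0;
    cond_update(s);
    return true;
}

/* #وإلا: flips the branch. False when unmatched or repeated. */
static bool cond_else(CondState *s) {
    if (s->depth == 0) return false;
    IfFrame *top = &s->stack[s->depth - 1];
    if (top->in_else) return false;
    top->in_else = 1;
    top->cond_true = !top->cond_true;
    cond_update(s);
    return true;
}

/* #نهاية: leaves the current level. False when unmatched. */
static bool cond_pop(CondState *s) {
    if (s->depth == 0) return false;
    s->depth--;
    cond_update(s);
    return true;
}
```

Recording `parent_active` at push time is what keeps an inner #وإلا from re-enabling code inside an outer block that is being skipped.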
| Feature | Description |
|---|---|
| UTF-8 Handling | Full Unicode support for Arabic text |
| Strict UTF-8 Validation (v0.3.7) | Rejects invalid UTF-8 sequences in identifiers and string/char literals |
| BOM Detection | Skips 0xEF 0xBB 0xBF if present |
| Arabic Numerals | Normalizes ٠-٩ → 0-9 |
| Arabic Punctuation | Handles ؛ (semicolon) 0xD8 0x9B |
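The BOM and numeral rows in the table can be made concrete with a short sketch. Arabic-Indic digits ٠-٩ are U+0660 to U+0669, which encode in UTF-8 as the byte 0xD9 followed by 0xA0 to 0xA9. The helper names `skip_bom` and `normalize_digits` are hypothetical, and whether the real lexer normalizes digits everywhere or only inside numeric literals is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer just past a leading UTF-8 BOM (EF BB BF), or `src`
 * unchanged when there is none. */
static const char *skip_bom(const char *src) {
    assert(src);
    const unsigned char *p = (const unsigned char *)src;
    if (p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) return src + 3;
    return src;
}

/* Copies `src` into `dst`, rewriting each Arabic-Indic digit to its ASCII
 * counterpart. `dst` must hold at least strlen(src) + 1 bytes.
 * Returns the length of the normalized string. */
static size_t normalize_digits(const char *src, char *dst) {
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;
    while (*p) {
        if (p[0] == 0xD9 && p[1] >= 0xA0 && p[1] <= 0xA9) {
            dst[n++] = (char)('0' + (p[1] - 0xA0));
            p += 2;  /* consumed a two-byte sequence */
        } else {
            dst[n++] = (char)*p++;
        }
    }
    dst[n] = '\0';
    return n;
}
```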
Keywords: صحيح, ص٨, ص١٦, ص٣٢, ص٦٤, ط٨, ط١٦, ط٣٢, ط٦٤, عشري, عشري٣٢, حرف, نص, منطقي, عدم, حجم, نوع, ثابت, ساكن, إذا, وإلا, طالما, لكل, اختر, حالة, افتراضي, اطبع, اقرأ, إرجع, توقف, استمر, تعداد, هيكل, اتحاد
Literals: INTEGER, STRING, CHAR, TRUE, FALSE
Operators: + - * / % ++ -- ! ~ && || & | ^ << >>
Comparison: == != < > <= >=
Delimiters: ( ) { } [ ] , . : ؛
Special: IDENTIFIER, EOF
Note:
- ثابت (const) was added in v0.2.7.
- ساكن (static storage) was added in v0.3.7.5.
The Parser (src/frontend/parser.c) builds the AST using Recursive Descent with 2-token lookahead.
typedef struct {
    Lexer* lexer;     // Reference to the lexer for token stream
    Token current;    // Current token (lookahead)
    Token next;       // Next token (2-token lookahead)
    bool panic_mode;  // Panic mode for error recovery
    bool had_error;   // Did an error occur during parsing?
} Parser;

Parser Type Alias Registry:
The parser maintains its own type alias registry (separate from semantic analysis):
#define PARSER_MAX_TYPE_ALIASES 256
typedef struct {
    char* name;
    DataType target_type;
    char* target_type_name;
    DataType target_ptr_base_type;
    char* target_ptr_base_type_name;
    int target_ptr_depth;
    FuncPtrSig* target_func_sig;  // Owned (may be NULL)
} ParserTypeAlias;

Panic Mode Recovery (v0.3.7):
- When a syntax error is detected, panic_mode is set to true
- The parser synchronizes on statement terminators (., }) or declaration starters
- This prevents cascading error messages from a single syntax error
- After synchronization, normal parsing resumes
Synchronization Modes:
typedef enum {
    PARSER_SYNC_STATEMENT = 0,    // Statement-level sync
    PARSER_SYNC_DECLARATION = 1,  // Declaration-level sync
    PARSER_SYNC_SWITCH = 2        // Switch-case sync
} ParserSyncMode;

Program ::= Declaration* EOF
Declaration ::= FuncDecl | GlobalVarDecl | GlobalArrayDecl | EnumDecl | StructDecl | UnionDecl | TypeAliasDecl
FuncDecl ::= Type ID "(" ParamList ")" Block
| Type ID "(" ParamList ")" "." // Prototype (v0.2.5+)
GlobalVarDecl ::= DeclMods TypeSpec ID ("=" Expr)? "."
GlobalArrayDecl ::= DeclMods "صحيح" ID "[" INT "]" ArrayInit? "." // v0.3.3+
EnumDecl ::= "تعداد" ID "{" EnumMembers? "}" // v0.3.4+
StructDecl ::= "هيكل" ID "{" FieldDecl* "}" // v0.3.4+
UnionDecl ::= "اتحاد" ID "{" FieldDecl* "}" // v0.3.4.5+
TypeAliasDecl ::= "نوع" ID "=" TypeSpec "." // v0.3.6.5+
DeclMod ::= "ثابت" | "ساكن"
DeclMods ::= DeclMod*
TypeSpec ::= Type | EnumType | StructType | UnionType | AliasType
Type ::= "صحيح" | "ص٨" | "ص١٦" | "ص٣٢" | "ص٦٤"
| "ط٨" | "ط١٦" | "ط٣٢" | "ط٦٤"
| "عشري" | "عشري٣٢" | "حرف" | "نص" | "منطقي" | "عدم"
EnumType ::= "تعداد" ID
StructType ::= "هيكل" ID
UnionType ::= "اتحاد" ID
AliasType ::= ID // Resolved via parser alias registry
Block ::= "{" Statement* "}"
Statement ::= VarDecl | ArrayDecl | Assign | ArrayAssign | MemberAssign
| If | Switch | While | For | Return | Print | Read | CallStmt
| Break | Continue
VarDecl ::= DeclMods TypeSpec ID ("=" Expr)? "." // initializer optional for static storage
ArrayDecl ::= DeclMods "صحيح" ID "[" INT "]" ArrayInit? "." // v0.3.3+
EnumMembers ::= ID (COMMA ID)* COMMA?
FieldDecl ::= "ثابت"? TypeSpec ID "."
ArrayInit ::= "=" "{" (Expr (COMMA Expr)* COMMA?)? "}"
COMMA ::= "," | "،"
Assign ::= ID "=" Expr "."
ArrayAssign ::= ID "[" Expr "]" "=" Expr "."
MemberAssign ::= MemberAccess "=" Expr "."
MemberAccess ::= Primary ":" ID (":" ID)*
If ::= "إذا" "(" Expr ")" Block ("وإلا" (Block | If))?
Switch ::= "اختر" "(" Expr ")" "{" Case* Default? "}"
Case ::= "حالة" (INT | CHAR) ":" Statement*
Default ::= "افتراضي" ":" Statement*
While ::= "طالما" "(" Expr ")" Block
For ::= "لكل" "(" Init? "؛" Expr? "؛" Update? ")" Block
Break ::= "توقف" "."
Continue ::= "استمر" "."
Return ::= "إرجع" Expr? "."
Print ::= "اطبع" Expr "."
Read ::= "اقرأ" ID "."
Implemented via precedence climbing:
Logical OR ::= Logical AND { "||" Logical AND }
Logical AND ::= Bitwise OR { "&&" Bitwise OR }
Bitwise OR ::= Bitwise XOR { "|" Bitwise XOR }
Bitwise XOR ::= Bitwise AND { "^" Bitwise AND }
Bitwise AND ::= Equality { "&" Equality }
Equality ::= Relational { ("==" | "!=") Relational }
Relational ::= Shift { ("<" | ">" | "<=" | ">=") Shift }
Shift ::= Additive { ("<<" | ">>") Additive }
Additive ::= Multiplicative { ("+" | "-") Multiplicative }
Multiplicative ::= Unary { ("*" | "/" | "%") Unary }
Unary ::= ("!" | "~" | "-" | "++" | "--") Unary | Postfix
Postfix ::= Primary { "++" | "--" }
Primary ::= INT | STRING | CHAR | ID | ArrayAccess | Call | "حجم" "(" (TypeSpec | Expr) ")" | "(" Expr ")"
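To show how precedence climbing realizes the binary-operator levels of this grammar, here is a small evaluator sketch. It is not the Baa parser: it evaluates integers directly instead of building AST nodes, omits unary and postfix operators, and every name in it (`Cursor`, `OpInfo`, `OPS`, `parse_expr`, ...) is hypothetical. The precedence numbers simply follow the order of the grammar levels, from Logical OR (lowest) to Multiplicative (highest).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct { const char *p; } Cursor;
typedef struct { const char *text; int prec; } OpInfo;

/* Two-character operators are listed before their one-character prefixes
 * so that "<<" is never mistaken for "<". */
static const OpInfo OPS[] = {
    {"||", 1}, {"&&", 2}, {"==", 6}, {"!=", 6}, {"<<", 8}, {">>", 8},
    {"<=", 7}, {">=", 7}, {"|", 3},  {"^", 4},  {"&", 5},  {"<", 7},
    {">", 7},  {"+", 9},  {"-", 9},  {"*", 10}, {"/", 10}, {"%", 10},
};

static void skip_ws(Cursor *c) {
    while (isspace((unsigned char)*c->p)) c->p++;
}

/* Returns the operator at the cursor without consuming it, or NULL. */
static const OpInfo *peek_op(Cursor *c) {
    skip_ws(c);
    for (size_t i = 0; i < sizeof OPS / sizeof OPS[0]; i++)
        if (strncmp(c->p, OPS[i].text, strlen(OPS[i].text)) == 0) return &OPS[i];
    return NULL;
}

static long apply(const char *op, long a, long b) {
    if (!strcmp(op, "||")) return a || b;
    if (!strcmp(op, "&&")) return a && b;
    if (!strcmp(op, "|"))  return a | b;
    if (!strcmp(op, "^"))  return a ^ b;
    if (!strcmp(op, "&"))  return a & b;
    if (!strcmp(op, "==")) return a == b;
    if (!strcmp(op, "!=")) return a != b;
    if (!strcmp(op, "<"))  return a < b;
    if (!strcmp(op, ">"))  return a > b;
    if (!strcmp(op, "<=")) return a <= b;
    if (!strcmp(op, ">=")) return a >= b;
    if (!strcmp(op, "<<")) return a << b;
    if (!strcmp(op, ">>")) return a >> b;
    if (!strcmp(op, "+"))  return a + b;
    if (!strcmp(op, "-"))  return a - b;
    if (!strcmp(op, "*"))  return a * b;
    assert(b != 0);  /* only "/" and "%" remain */
    return !strcmp(op, "/") ? a / b : a % b;
}

static long parse_expr(Cursor *c, int min_prec);

/* Primary: a decimal literal or a parenthesized expression. */
static long parse_primary(Cursor *c) {
    skip_ws(c);
    if (*c->p == '(') {
        c->p++;
        long v = parse_expr(c, 1);
        skip_ws(c);
        assert(*c->p == ')');
        c->p++;
        return v;
    }
    assert(isdigit((unsigned char)*c->p));
    long v = 0;
    while (isdigit((unsigned char)*c->p)) v = v * 10 + (*c->p++ - '0');
    return v;
}

/* Consumes operators whose precedence is at least `min_prec`. Parsing the
 * right operand at prec + 1 makes every operator left-associative. */
static long parse_expr(Cursor *c, int min_prec) {
    long lhs = parse_primary(c);
    for (;;) {
        const OpInfo *op = peek_op(c);
        if (!op || op->prec < min_prec) return lhs;
        c->p += strlen(op->text);
        long rhs = parse_expr(c, op->prec + 1);
        lhs = apply(op->text, lhs, rhs);
    }
}
```

A single loop with a precedence table replaces one function per grammar level, which is the main appeal of the technique when the operator set grows.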
The parser uses synchronize() to recover from errors.
Example Scenario:
صحيح س = ١٠ // Error: Missing dot
صحيح ص = ٢٠.
- Parser expects . but finds صحيح.
- report_error() is called.
- synchronize() is called. It skips until it sees صحيح (start of next statement).
- Parser continues parsing صحيح ص = ٢٠.
- At the end, compiler exits with status 1 if any errors were found.
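The scenario above can be modeled with a toy token stream. This sketch covers statement-mode synchronization only; `MiniParser`, the `Tok` kinds, and the bodies of `report_error` and `synchronize` are assumptions for illustration and do not reproduce the real parser's signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds: T_TYPE_KW stands for a statement starter such
 * as صحيح, T_DOT for the "." terminator. */
typedef enum { T_TYPE_KW, T_IDENT, T_ASSIGN, T_INT, T_DOT, T_RBRACE, T_EOF } Tok;

typedef struct {
    const Tok *toks;  /* must end with T_EOF */
    size_t pos;
    bool panic_mode;
    bool had_error;
} MiniParser;

static bool starts_statement(Tok t) { return t == T_TYPE_KW; }

/* Records the error and enters panic mode instead of exiting. */
static void report_error(MiniParser *p) {
    p->had_error = true;
    p->panic_mode = true;
}

/* Skips tokens until a safe anchor: resumes after a "." terminator, or at
 * a "}" or statement starter, then leaves panic mode. */
static void synchronize(MiniParser *p) {
    assert(p && p->toks);
    while (p->toks[p->pos] != T_EOF) {
        Tok t = p->toks[p->pos];
        if (t == T_DOT) { p->pos++; break; }
        if (t == T_RBRACE || starts_statement(t)) break;
        p->pos++;
    }
    p->panic_mode = false;
}
```

`had_error` stays set after recovery, which is what lets the driver exit with status 1 once all diagnostics have been reported.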
Type alias parsing note (v0.3.6.5):
- نوع is handled as a contextual keyword in parser declarations.
- This preserves existing identifier/member usages like س:نوع while still supporting نوع اسم = ... at top-level.
The AST uses a tagged union structure for type-safe node representation.
| Category | Node Types |
|---|---|
| Structure | NODE_PROGRAM, NODE_FUNC_DEF, NODE_BLOCK, NODE_TYPE_ALIAS |
| Variables | NODE_VAR_DECL, NODE_ASSIGN, NODE_VAR_REF |
| Array Decls | NODE_ARRAY_DECL, NODE_ARRAY_ACCESS, NODE_ARRAY_ASSIGN |
| Member Access | NODE_MEMBER_ACCESS, NODE_MEMBER_ASSIGN, NODE_DEREF_ASSIGN |
| Control Flow | NODE_IF, NODE_WHILE, NODE_FOR, NODE_RETURN |
| Branching | NODE_SWITCH, NODE_CASE, NODE_BREAK, NODE_CONTINUE |
| Expressions | NODE_BIN_OP, NODE_UNARY_OP, NODE_POSTFIX_OP, NODE_CALL_EXPR |
| Literals | NODE_INT, NODE_STRING, NODE_CHAR, NODE_BOOL, NODE_NULL |
| Calls & I/O | NODE_CALL_STMT, NODE_PRINT, NODE_READ |
typedef struct Node {
    NodeType type;       // Discriminator
    struct Node* next;   // Linked list for siblings
    union { ... } data;  // Type-specific payload
} Node;

The Semantic Analyzer (src/analysis.c) performs a static check on the AST before code generation.
- Symbol Resolution: Verifies variables are declared before use.
- Type Checking: Enforces compatibility across primitive/sized numeric types, نص/حرف/منطقي/عشري, and compound types (تعداد/هيكل/اتحاد) including array elements.
- Scope Validation: Manages visibility rules.
- Constant Checking (v0.2.7+): Prevents reassignment of immutable variables.
- Static Storage Rules (v0.3.7.5): Validates ساكن declarations and enforces compile-time initializers for static-storage objects.
- Control Flow Validation: Ensures break and continue are used only within loops/switches.
- Function Validation: Checks function prototypes and definitions match.
- Usage Tracking (v0.2.8+): Tracks variable usage for unused variable warnings.
- Dead Code Detection (v0.2.8+): Detects unreachable code after return/break.
- Type Alias Validation (v0.3.6.5): Registers aliases, validates alias targets, and enforces strict alias/symbol name collision diagnostics.
- Array Shape Validation (v0.3.9): Tracks array rank/dimensions in symbols, validates index-count match, and performs compile-time out-of-bounds checks for constant indices.
- Pointer Semantics (v0.3.10): Validates pointer arithmetic, comparisons, dereference, and address-of constraints.
- Type Casting (v0.3.10.5): Enforces rules for explicit scalar and pointer conversions.
- Function Pointers (v0.3.10.6): Validates assignment, comparison (EQ/NE only), and indirect calls matching exact signatures.
- Variadic Functions (v0.4.0.5): Validates ... signatures, variadic builtin usage (بدء_معاملات/معامل_تالي/نهاية_معاملات), and fixed/extra argument checks for variadic direct calls.
- Inline Assembly (v0.4.0.6): Validates مجمع { ... } blocks, enforces fixed-register constraint subset (=a/=c/=d, a/c/d), checks output lvalue requirements, and restricts operand types to integer/pointer forms.
- Standard Library Modules + Float Extensions (v0.4.2): Validates Math/System/Time builtins (جذر_تربيعي/أس/جيب/جيب_تمام/ظل/مطلق/عشوائي/متغير_بيئة/نفذ_أمر/وقت_حالي/وقت_كنص), Arabic float format specs (%ع/%أ), and accepts عشري٣٢ as a float keyword alias.
- Error Handling Builtins (v0.4.3): Validates تأكد/توقف_فوري/كود_خطأ_النظام/ضبط_كود_خطأ_النظام/نص_كود_خطأ for arity/type contracts with Arabic diagnostics.
The current cross-file linkage contract is intentionally small and explicit:
- top-level functions are externally visible by default,
- a prototype declaration in a .baahd header does not emit a body,
- top-level ساكن globals and arrays are lowered with internal linkage at file scope,
- ساكن on functions remains rejected in parsing/semantic validation,
- shared global-variable APIs are not first-class yet because there is no separate extern-style variable declaration syntax.
This behavior is now locked by multi-file QA smoke in addition to the existing single-file semantic checks.
The analyzer tracks is_const / is_static and enforces immutability + static-storage constraints:
| Error Condition | Error Message |
|---|---|
| Reassigning a constant | Arabic semantic error for modifying ثابت |
| Modifying constant array element | Arabic semantic error for constant array mutation |
| Automatic constant without initializer | Arabic semantic error (الثابت ... يجب تهيئته) |
| Static-storage non-constant initializer | Arabic semantic error for non-constant static initializer |
The analyzer generates warnings for potential issues that don't prevent compilation:
Algorithm:
- Each symbol has an is_used flag initialized to false.
- When a variable is referenced (in expressions, assignments, etc.), the flag is set to true.
- At end of function scope, all local variables with is_used == false generate a warning.
- At end of program, all global variables with is_used == false generate a warning.
Exception: Function parameters are marked as "used" implicitly to avoid false positives.
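A minimal sketch of this flag-based tracking, including the parameter exception, might look as follows. `MiniSymbol`, `declare`, `mark_used`, and `collect_unused` are hypothetical names; the real analyzer works on the `Symbol` records and reports through its warning module rather than returning a list.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    const char *name;
    bool is_param;
    bool is_used;
} MiniSymbol;

/* Declares a symbol. Parameters start out as used so that an unread
 * parameter never produces a false-positive warning. */
static void declare(MiniSymbol *s, const char *name, bool is_param) {
    assert(s && name);
    s->name = name;
    s->is_param = is_param;
    s->is_used = is_param;
}

/* Called for every reference to `name` in an expression or assignment. */
static void mark_used(MiniSymbol *syms, int count, const char *name) {
    for (int i = 0; i < count; i++)
        if (strcmp(syms[i].name, name) == 0) syms[i].is_used = true;
}

/* At scope end: collects the names that would receive a warning.
 * `out` needs room for `count` entries. Returns how many were found. */
static int collect_unused(const MiniSymbol *syms, int count, const char **out) {
    int n = 0;
    for (int i = 0; i < count; i++)
        if (!syms[i].is_used) out[n++] = syms[i].name;
    return n;
}
```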
Algorithm:
- While analyzing a block, track if a "terminating" statement was encountered.
- Terminating statements: NODE_RETURN, NODE_BREAK, NODE_CONTINUE.
- If a terminating statement was found and there are more statements after it, generate a warning.
Implementation:
static void analyze_statements_with_dead_code_check(Node* statements, const char* context) {
    bool found_terminator = false;
    Node* stmt = statements;

    while (stmt) {
        if (found_terminator) {
            warning_report(WARN_DEAD_CODE, ...);
            found_terminator = false; // Avoid multiple warnings
        }

        analyze_node(stmt);

        if (is_terminating_statement(stmt)) {
            found_terminator = true;
        }
        stmt = stmt->next;
    }
}

When a local variable is declared with the same name as a global variable, a WARN_SHADOW_VARIABLE warning is generated.
The analyzer emits WARN_IMPLICIT_NARROWING when an implicit numeric conversion may lose information.
Covered conversion sites:
- Variable declaration initializers
- Assignments
- Return expressions
- Function-call arguments
- Array element assignments
- Struct/union member assignments
The check is constant-aware: if the source expression is a compile-time constant that is provably representable in the destination type, the warning is suppressed.
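The constant-aware suppression can be illustrated with a simplified decision function. This is a sketch under assumptions: types are reduced to a bit width plus signedness, non-constant sources warn purely on width, and `fits_signed`, `fits_unsigned`, and `should_warn_narrowing` are hypothetical names. The analyzer's actual rules, for example around floating-point or same-width sign changes, are not described in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `value` is representable in a signed integer of `bits` bits. */
static bool fits_signed(int64_t value, int bits) {
    assert(bits >= 1 && bits <= 64);
    if (bits == 64) return true;
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}

/* True when `value` is representable in an unsigned integer of `bits` bits. */
static bool fits_unsigned(int64_t value, int bits) {
    assert(bits >= 1 && bits <= 64);
    if (value < 0) return false;
    if (bits == 64) return true;  /* any non-negative int64_t fits */
    return (uint64_t)value <= (((uint64_t)1 << bits) - 1);
}

/* Decides whether an implicit conversion deserves a narrowing warning.
 * A compile-time constant warns only if it does not fit the destination;
 * anything else warns whenever the destination is narrower. */
static bool should_warn_narrowing(int src_bits, int dst_bits, bool dst_signed,
                                  bool is_constant, int64_t const_value) {
    if (is_constant)
        return dst_signed ? !fits_signed(const_value, dst_bits)
                          : !fits_unsigned(const_value, dst_bits);
    return dst_bits < src_bits;
}
```

Under these assumptions, initializing an 8-bit signed variable with the literal 100 stays silent, while the literal 300 or a non-constant 64-bit expression would warn.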
The analyzer emits WARN_SIGNED_UNSIGNED_COMPARE for comparison operators (==, !=, <, >, <=, >=) when the integer-promotion result mixes signed and unsigned domains.
- Bitwise operators (&, |, ^, ~, <<, >>) are restricted to integer-like types.
- Shift-count literals are range-checked (0..63) during semantic analysis.
- NODE_SIZEOF is resolved to a compile-time integer value when size information is known.
- عدم (void) rules are enforced:
  - no variable declarations of type عدم (local/global),
  - no function parameters of type عدم,
  - return shape must match function type (إرجع. only in عدم functions, value required in non-void).
Since v0.2.4, analysis.c maintains its own symbol table for isolation. This ensures validation logic is independent from the backend pipeline.
In v0.3.7, semantic lookups were optimized using hash-indexed chains for local/global symbol lookup while preserving deterministic semantics and existing symbol ownership.
Future improvement: Unify symbol tables into a shared context object passed between phases.
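The hash-indexed chains mentioned above can be sketched as an index layer over a flat symbol array. The bucket count matches ANALYSIS_SYMBOL_HASH_BUCKETS (257), but everything else is an assumption: the names `HSymbol`, `HSymTable`, `table_add`, and `table_find` are hypothetical, and FNV-1a is used only as a stand-in since the actual hash function is not documented here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_BUCKETS 257
#define MAX_SYMBOLS 100

typedef struct {
    const char *name;
    int next;  /* index of the next symbol in the same bucket, or -1 */
} HSymbol;

typedef struct {
    HSymbol syms[MAX_SYMBOLS];  /* symbols stay in declaration order */
    int count;
    int heads[HASH_BUCKETS];    /* first symbol index per bucket, or -1 */
} HSymTable;

/* 32-bit FNV-1a, reduced to a bucket index. */
static uint32_t hash_name(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h % HASH_BUCKETS;
}

static void table_init(HSymTable *t) {
    t->count = 0;
    for (int i = 0; i < HASH_BUCKETS; i++) t->heads[i] = -1;
}

/* Appends a symbol and links it at the head of its bucket chain.
 * Returns the symbol's index, or -1 when the table is full. */
static int table_add(HSymTable *t, const char *name) {
    assert(t && name);
    if (t->count >= MAX_SYMBOLS) return -1;
    uint32_t b = hash_name(name);
    int idx = t->count++;
    t->syms[idx].name = name;
    t->syms[idx].next = t->heads[b];
    t->heads[b] = idx;
    return idx;
}

/* Walks only one bucket chain; the most recent declaration is found first. */
static int table_find(const HSymTable *t, const char *name) {
    for (int i = t->heads[hash_name(name)]; i != -1; i = t->syms[i].next)
        if (strcmp(t->syms[i].name, name) == 0) return i;
    return -1;
}
```

Keeping the array as the owner and chaining by index leaves iteration order deterministic, which fits the stated goal of preserving existing semantics and symbol ownership.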
The analyzer walks the AST recursively. It maintains a Symbol Table stack to track active variables in the current scope. If it encounters:
- x = "text" (where x is int): Reports a type mismatch error.
- print y (where y is undeclared): Reports an undefined symbol error.
- x = 5 (where x is const): Reports a const reassignment error (v0.2.7+).
The semantic analyzer uses the following constants (defined in src/analysis.c):
| Constant | Value | Description |
|---|---|---|
| ANALYSIS_MAX_SYMBOLS | 100 | Maximum symbols per scope (global/local) |
| ANALYSIS_MAX_SCOPES | 64 | Maximum nested scope depth |
| ANALYSIS_MAX_FUNCS | 128 | Maximum function declarations |
| ANALYSIS_MAX_FUNC_PARAMS | 32 | Maximum parameters per function |
| ANALYSIS_SYMBOL_HASH_BUCKETS | 257 | Hash table buckets for symbol lookup |
| ANALYSIS_MAX_ENUMS | 128 | Maximum enum definitions |
| ANALYSIS_MAX_STRUCTS | 128 | Maximum struct definitions |
| ANALYSIS_MAX_UNIONS | 128 | Maximum union definitions |
| ANALYSIS_MAX_ENUM_MEMBERS | 128 | Maximum members per enum |
| ANALYSIS_MAX_STRUCT_FIELDS | 128 | Maximum fields per struct/union |
| ANALYSIS_MAX_TYPE_ALIASES | 256 | Maximum type alias definitions |
The semantic analyzer maintains several internal data structures for symbol management:
typedef struct {
    char name[32];                // Symbol name
    ScopeType scope;              // Scope (SCOPE_GLOBAL or SCOPE_LOCAL)
    DataType type;                // Data type (variable: its type; array: element type)
    char type_name[32];           // Type name when TYPE_ENUM/TYPE_STRUCT (empty otherwise)
    DataType ptr_base_type;       // Pointer base type when type == TYPE_POINTER
    char ptr_base_type_name[32];  // Compound type name of the pointer base
    int ptr_depth;                // Pointer depth when type == TYPE_POINTER
    FuncPtrSig* func_sig;         // Function pointer signature when type == TYPE_FUNC_PTR
    bool is_array;                // Is the symbol an array?
    int array_rank;               // Number of dimensions
    int64_t array_total_elems;    // Product of the dimensions
    int* array_dims;              // Array dimensions (owned by the symbol table)
    int offset;                   // Stack offset or address
    bool is_const;                // Is it constant (immutable)?
    bool is_static;               // Is the storage static?
    bool is_used;                 // Has this variable been used? (for warnings)
    int decl_line;                // Declaration line (for warnings)
    int decl_col;                 // Declaration column (for warnings)
    const char* decl_file;        // Declaration file (for warnings)
} Symbol;

typedef struct {
    char* name;                        // Function name (owned, strdup)
    DataType return_type;              // Return type
    DataType return_ptr_base_type;     // Return pointer base type
    char* return_ptr_base_type_name;   // Pointer base type name (owned, strdup)
    int return_ptr_depth;              // Return pointer depth
    FuncPtrSig* return_func_sig;       // Returned function-pointer signature (owned, clone)
    DataType* param_types;             // Parameter types (owned, malloc)
    DataType* param_ptr_base_types;    // Parameter pointer base types
    char** param_ptr_base_type_names;  // Parameter pointer base type names
    int* param_ptr_depths;             // Parameter pointer depths
    FuncPtrSig** param_func_sigs;      // Parameter function-pointer signatures
    int param_count;                   // Number of parameters
    FuncPtrSig* ref_funcptr_sig;       // Signature of the "function reference" as a value (owned, clone)
    bool is_defined;                   // Has the function been defined (does it have a body)?
    const char* decl_file;             // Declaration file
    int decl_line;                     // Declaration line
    int decl_col;                      // Declaration column
} FuncSymbol;

typedef struct FuncPtrSig {
    DataType return_type;
    DataType return_ptr_base_type;
    char* return_ptr_base_type_name;  // owned (may be NULL)
    int return_ptr_depth;
    int param_count;
    DataType* param_types;            // owned (malloc)
    DataType* param_ptr_base_types;   // owned (malloc)
    char** param_ptr_base_type_names; // owned (malloc); elements owned (strdup)
    int* param_ptr_depths;            // owned (malloc)
} FuncPtrSig;
EnumDef (enum definition):
typedef struct {
    char* name;            // owned (strdup)
    int member_count;
    struct {
        char* name;        // owned (strdup)
        int64_t value;
    } members[ANALYSIS_MAX_ENUM_MEMBERS];
} EnumDef;
StructDef (struct definition):
typedef struct {
    char* name;            // owned (strdup)
    int field_count;
    StructFieldDef fields[ANALYSIS_MAX_STRUCT_FIELDS];
    int size;
    int align;
    bool layout_done;
    bool layout_in_progress;
} StructDef;
UnionDef (union definition):
typedef struct {
    char* name;            // owned (strdup)
    int field_count;
    StructFieldDef fields[ANALYSIS_MAX_STRUCT_FIELDS];
    int size;
    int align;
    bool layout_done;
    bool layout_in_progress;
} UnionDef;
StructFieldDef (struct/union field definition):
typedef struct {
    char* name;               // owned (strdup)
    DataType type;
    char* type_name;          // owned (strdup) when TYPE_ENUM/TYPE_STRUCT
    DataType ptr_base_type;
    char* ptr_base_type_name;
    int ptr_depth;
    bool is_const;
    int offset;
    int size;
    int align;
} StructFieldDef;
TypeAliasDef (type alias definition):
typedef struct {
    char* name;                      // owned (strdup)
    DataType target_type;            // Target type after resolving the alias
    char* target_type_name;          // owned (strdup) when TYPE_ENUM/TYPE_STRUCT/TYPE_UNION
    DataType target_ptr_base_type;
    char* target_ptr_base_type_name;
    int target_ptr_depth;
    FuncPtrSig* target_func_sig;     // owned (clone) when TYPE_FUNC_PTR
} TypeAliasDef;
Semantic lookups use hash-indexed chains for O(1) average-case lookup:
// Symbol tables
static Symbol global_symbols[ANALYSIS_MAX_SYMBOLS];
static int global_count = 0;
static int global_symbol_hash_head[ANALYSIS_SYMBOL_HASH_BUCKETS];
static int global_symbol_hash_next[ANALYSIS_MAX_SYMBOLS];
static Symbol local_symbols[ANALYSIS_MAX_SYMBOLS];
static int local_count = 0;
static int local_symbol_hash_head[ANALYSIS_SYMBOL_HASH_BUCKETS];
static int local_symbol_hash_next[ANALYSIS_MAX_SYMBOLS];
// Scope stack
static int scope_stack[ANALYSIS_MAX_SCOPES];
static int scope_depth = 0;
The hash function used is FNV-1a 32-bit for fast string hashing.
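As a rough illustration of how FNV-1a hashing combines with head/next index chains, here is a minimal sketch. The `Demo*` names, the capacity, and the bucket count are hypothetical and are not taken from the Baa sources; only the FNV-1a constants are standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_SYMBOLS  64  /* hypothetical capacity, not ANALYSIS_MAX_SYMBOLS */
#define DEMO_HASH_BUCKETS 16  /* hypothetical bucket count */

/* FNV-1a 32-bit: XOR each byte into the hash, then multiply by the FNV prime. */
static uint32_t demo_fnv1a32(const char* s) {
    assert(s != NULL);
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char* p = (const unsigned char*)s; *p; ++p) {
        h ^= *p;
        h *= 16777619u;                       /* FNV prime */
    }
    return h;
}

typedef struct {
    char names[DEMO_MAX_SYMBOLS][32];
    int  count;
    int  head[DEMO_HASH_BUCKETS];  /* first symbol index per bucket, -1 if empty */
    int  next[DEMO_MAX_SYMBOLS];   /* next symbol index in the same bucket, -1 at end */
} DemoSymTab;

static void demo_symtab_init(DemoSymTab* t) {
    t->count = 0;
    for (int i = 0; i < DEMO_HASH_BUCKETS; ++i) t->head[i] = -1;
}

/* Adds a name at the front of its bucket chain. Returns its index, or -1 if rejected. */
static int demo_symtab_add(DemoSymTab* t, const char* name) {
    if (t->count >= DEMO_MAX_SYMBOLS || strlen(name) >= 32) return -1;
    int idx = t->count++;
    strcpy(t->names[idx], name);
    uint32_t b = demo_fnv1a32(name) % DEMO_HASH_BUCKETS;
    t->next[idx] = t->head[b];
    t->head[b] = idx;
    return idx;
}

/* Walks only the chain of the name's bucket. Returns the index, or -1 if absent. */
static int demo_symtab_find(const DemoSymTab* t, const char* name) {
    uint32_t b = demo_fnv1a32(name) % DEMO_HASH_BUCKETS;
    for (int i = t->head[b]; i != -1; i = t->next[i])
        if (strcmp(t->names[i], name) == 0) return i;
    return -1;
}
```

Storing chain links as integer indices rather than pointers keeps the tables as plain static arrays, which matches the layout shown above.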
The AST uses the following type enumeration (defined in src/frontend/ast.h):
typedef enum {
    TYPE_INT,       // صحيح / ص٦٤ (int64)
    // Sized integers (v0.3.5.5)
    TYPE_I8,        // ص٨
    TYPE_I16,       // ص١٦
    TYPE_I32,       // ص٣٢
    TYPE_U8,        // ط٨
    TYPE_U16,       // ط١٦
    TYPE_U32,       // ط٣٢
    TYPE_U64,       // ط٦٤
    TYPE_STRING,    // نص (حرف[])
    TYPE_POINTER,   // مؤشر (generic pointer)
    TYPE_FUNC_PTR,  // Function pointer: دالة(...) -> type
    TYPE_BOOL,      // منطقي (bool - stored as byte)
    TYPE_CHAR,      // حرف (UTF-8 sequence)
    TYPE_FLOAT,     // عشري (float64) + عشري٣٢ (alias in v0.4.2)
    TYPE_VOID,      // عدم (void)
    TYPE_ENUM,      // تعداد (stored as int64)
    TYPE_STRUCT,    // هيكل (not a first-class value)
    TYPE_UNION      // اتحاد (not a first-class value)
} DataType;
OpType Enum (Binary Operations):
typedef enum {
    // Arithmetic operations
    OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_MOD,
    // Bitwise operations
    OP_BIT_AND, OP_BIT_OR, OP_BIT_XOR, OP_SHL, OP_SHR,
    // Comparison operations
    OP_EQ, OP_NEQ, OP_LT, OP_GT, OP_LTE, OP_GTE,
    // Logical operations
    OP_AND, OP_OR
} OpType;
UnaryOpType Enum:
typedef enum {
    UOP_NEG,      // Negation (-)
    UOP_NOT,      // Logical NOT (!)
    UOP_BIT_NOT,  // Bitwise NOT (~)
    UOP_ADDR,     // Address-of (&)
    UOP_DEREF,    // Dereference (*)
    UOP_INC,      // Increment (++)
    UOP_DEC       // Decrement (--)
} UnaryOpType;
| Type | C Type | Size | Notes |
|---|---|---|---|
| صحيح | int64_t | 8 bytes | Signed integer |
| نص | char* | 8 bytes | Pointer to read-only string (.rdata/.rodata) |
| منطقي | bool (stored as int) | 8 bytes | Stored as 0/1 in 8-byte slots |
I/O note: The backend dynamically resolves format strings (%lld, %llu, %g/%e) for integers/floats (Arabic %ع/%أ → C %f/%e in formatted builtins). Strings (نص) and Characters (حرف) are handled with a custom UTF-8 emission loop or packed format.
The parser performs constant folding on arithmetic expressions. If both operands of a binary operation are integer literals, the compiler evaluates the result at compile-time.
Example:
- Source: ٢ * ٣ + ٤
- Before folding: BinOp(+, BinOp(*, 2, 3), 4)
- After folding: Int(10)
Supported Operations: +, -, *, /, %
Note: Division/modulo by zero is detected and reported during folding.
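The folding rule above can be sketched on a tiny stand-in AST. The `DemoNode` type and `demo_fold()` are hypothetical and do not reflect the parser's real node layout; wrap-around on overflow and leaving `INT64_MIN / -1` unfolded are assumptions made here to keep the sketch free of undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mini-AST: an integer literal or a binary operation. */
typedef enum { DEMO_INT, DEMO_BINOP } DemoKind;

typedef struct DemoNode {
    DemoKind kind;
    char op;                  /* '+', '-', '*', '/', '%' when kind == DEMO_BINOP */
    int64_t value;            /* literal value when kind == DEMO_INT */
    struct DemoNode* lhs;
    struct DemoNode* rhs;
} DemoNode;

/* Folds `n` in place, bottom-up. Returns false on division/modulo by zero. */
static bool demo_fold(DemoNode* n) {
    assert(n != NULL);
    if (n->kind == DEMO_INT) return true;
    if (!demo_fold(n->lhs) || !demo_fold(n->rhs)) return false;
    /* Only fold when both operands ended up as integer literals. */
    if (n->lhs->kind != DEMO_INT || n->rhs->kind != DEMO_INT) return true;

    int64_t a = n->lhs->value, b = n->rhs->value, r;
    switch (n->op) {
        case '+': r = (int64_t)((uint64_t)a + (uint64_t)b); break;
        case '-': r = (int64_t)((uint64_t)a - (uint64_t)b); break;
        case '*': r = (int64_t)((uint64_t)a * (uint64_t)b); break;
        case '/':
        case '%':
            if (b == 0) return false;                   /* report to the caller */
            if (a == INT64_MIN && b == -1) return true; /* leave unfolded (assumption) */
            r = (n->op == '/') ? a / b : a % b;
            break;
        default:
            return true;                                /* unknown operator: keep as-is */
    }
    n->kind = DEMO_INT;
    n->value = r;
    n->lhs = n->rhs = NULL;
    return true;
}
```

Folding bottom-up is what lets `BinOp(+, BinOp(*, 2, 3), 4)` collapse in one call: the inner product becomes a literal before the outer sum is examined.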
The IR Module (src/ir.h, src/ir.c) provides an Arabic-first Intermediate Representation using SSA (Static Single Assignment) form.
Memory management (v0.3.2.6.1): IR objects are now allocated from a module-owned arena (src/ir_arena.c) and freed in bulk by ir_module_free(). IR passes should treat IR nodes as module-owned and avoid per-node frees.
IR serialization (v0.3.2.6.3): The compiler also includes a machine-readable IR text serializer/reader for round-trip tests (src/ir_text.c, src/ir_text.h). This format is separate from the Arabic-first debug printer (ir_module_print()).
Baa's IR is designed with three goals:
- Arabic Identity: All opcodes, types, and predicates have Arabic names.
- Technical Parity: Comparable to LLVM IR, GIMPLE, or WebAssembly in capabilities.
- SSA Form: Each virtual register is assigned exactly once, enabling powerful optimizations.
IRModule
├── name: char* // Module name (source file)
├── arena: IRArena // IR memory arena (all IR objects allocated here)
├── cached_i8_ptr_type: IRType* // Common type cache
├── globals: IRGlobal* // Global variables
├── global_count: int
├── funcs: IRFunc* // Functions
├── func_count: int
├── strings: IRStringEntry* // C string literal table
├── string_count: int
├── baa_strings: IRBaaStringEntry* // Baa string table (حرف[])
└── baa_string_count: int
IRFunc
├── name: char*
├── ret_type: IRType*
├── params: IRParam[]
├── param_count: int // Number of parameters
├── blocks: IRBlock* // Linked list of basic blocks
├── block_count: int // Number of blocks
├── entry: IRBlock* // Entry block pointer
├── next_reg: int // Virtual register counter (next available %م<n>)
├── next_inst_id: int // Instruction ID counter
├── ir_epoch: uint32_t // IR change counter (invalidates analyses)
├── def_use: IRDefUse* // Def-Use analysis cache (heap allocated)
├── next_block_id: int // Block ID counter
├── is_prototype: bool // Is this a declaration without body?
└── next: IRFunc* // Next function in module
IRBlock
├── label: char* // Arabic label (e.g., "بداية", "حلقة")
├── id: int
├── parent: IRFunc* // Function containing this block
├── first/last: IRInst* // Instruction list
├── inst_count: int
├── succs[2]: IRBlock* // Successors (0-2 for br/br_cond)
├── succ_count: int
├── preds: IRBlock** // Predecessors (dynamic array)
├── pred_count: int
├── pred_capacity: int
├── idom: IRBlock* // Immediate dominator
├── dom_frontier: IRBlock** // Dominance frontier
├── dom_frontier_count: int
└── next: IRBlock* // Next block in function
IRInst
├── op: IROp // Opcode
├── type: IRType* // Result type
├── id: int // Instruction ID for diagnostics/tests
├── dest: int // Destination register (-1 if none)
├── operands[4]: IRValue* // Up to 4 operands
├── operand_count: int // Number of operands used
├── cmp_pred: IRCmpPred // For comparison instructions
├── phi_entries: IRPhiEntry* // Linked list of [value, block] pairs
├── call_target: char* // Direct call target name (NULL for indirect)
├── call_callee: IRValue* // Indirect call callee value (NULL for direct)
├── call_args: IRValue** // Argument list
├── call_arg_count: int // Number of arguments
├── src_file: const char* // Source file (debug info)
├── src_line: int // Source line (debug info)
├── src_col: int // Source column (debug info)
├── dbg_name: const char* // Optional symbol name for debugging
├── parent: IRBlock* // Block containing this instruction
├── prev: IRInst* // Previous instruction in block
└── next: IRInst* // Next instruction in block
The IR system uses an arena allocator for efficient memory management. All IR objects (types, values, instructions, blocks, functions, globals) are allocated from the module-owned arena and freed in bulk when the module is destroyed.
Arena Structure:
typedef struct IRArenaChunk {
struct IRArenaChunk* next;
size_t used;
size_t cap;
unsigned char data[]; // Flexible array member
} IRArenaChunk;
typedef struct IRArena {
IRArenaChunk* head;
size_t default_chunk_size;
} IRArena;
typedef struct IRArenaStats {
size_t chunks; // Number of allocated chunks
size_t used_bytes; // Total used bytes
size_t cap_bytes; // Total capacity
} IRArenaStats;
Key Functions:
void ir_arena_init(IRArena* arena, size_t default_chunk_size);
void ir_arena_destroy(IRArena* arena);
void* ir_arena_alloc(IRArena* arena, size_t size, size_t align);
void* ir_arena_calloc(IRArena* arena, size_t count, size_t size, size_t align);
char* ir_arena_strdup(IRArena* arena, const char* s);
void ir_arena_get_stats(const IRArena* arena, IRArenaStats* out_stats);
Important Notes:
- IR passes should treat IR nodes as module-owned and avoid per-node frees.
- Memory is freed in bulk by ir_module_free() via ir_arena_destroy().
- The arena provides O(1) allocation with minimal overhead.
- All IR objects are annotated with the following source comment (in English: "Note: this structure is allocated inside the IR arena and freed all at once."):
ملاحظة: هذه البنية تُخصَّص داخل ساحة IR (Arena) وتُحرَّر دفعة واحدة.
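To make the bump-allocation idea concrete, the following is a minimal sketch of a chunked arena with aligned allocation and bulk free. It mirrors the field names of the structures above but is not the code in src/ir_arena.c; the `demo_*` functions and the chunk growth policy are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct DemoChunk {
    struct DemoChunk* next;
    size_t used;
    size_t cap;
    unsigned char data[];     /* flexible array member, as in IRArenaChunk */
} DemoChunk;

typedef struct {
    DemoChunk* head;
    size_t default_chunk_size;
} DemoArena;

static void demo_arena_init(DemoArena* a, size_t default_chunk_size) {
    a->head = NULL;
    a->default_chunk_size = default_chunk_size;
}

/* Bump-allocates `size` bytes aligned to `align` (a power of two). NULL on OOM. */
static void* demo_arena_alloc(DemoArena* a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    DemoChunk* c = a->head;
    if (c != NULL) {
        uintptr_t base = (uintptr_t)c->data;
        uintptr_t p = (base + c->used + (align - 1)) & ~(uintptr_t)(align - 1);
        if ((size_t)(p - base) + size <= c->cap) {
            c->used = (size_t)(p - base) + size;
            return (void*)p;
        }
    }
    /* No chunk yet, or the current one is full: start a chunk that fits the request. */
    size_t cap = a->default_chunk_size;
    if (cap < size + align) cap = size + align;
    c = malloc(sizeof *c + cap);
    if (c == NULL) return NULL;
    c->next = a->head;
    c->used = 0;
    c->cap = cap;
    a->head = c;
    uintptr_t base = (uintptr_t)c->data;
    uintptr_t p = (base + (align - 1)) & ~(uintptr_t)(align - 1);
    c->used = (size_t)(p - base) + size;
    return (void*)p;
}

/* Frees every chunk at once; individual allocations are never freed. */
static void demo_arena_destroy(DemoArena* a) {
    for (DemoChunk* c = a->head; c != NULL; ) {
        DemoChunk* n = c->next;
        free(c);
        c = n;
    }
    a->head = NULL;
}
```

Because allocation only advances an offset within the newest chunk, passes that "delete" an instruction simply unlink it; the memory is reclaimed when the whole arena is destroyed.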
Usage Pattern:
IRModule* module = ir_module_new("program.baa");
// All IR objects allocated via ir_module_get_current() use the arena
IRFunc* func = ir_func_new("الرئيسية", ret_type);
ir_module_add_func(module, func);
// ... build IR ...
ir_module_free(module); // Bulk free all arena memory
The IR system maintains a thread-local context for the current module to simplify allocation:
void ir_module_set_current(IRModule* module);
IRModule* ir_module_get_current(void);
This allows IR construction functions to allocate from the correct arena without passing the module explicitly.
Indirect Call Support (v0.3.10.6):
For indirect function calls through function pointers:
- call_target is NULL
- call_callee contains the IRValue (register) holding the function pointer
- ISel lowers this to call *%reg instead of call @function
| Category | Opcode | Arabic | Description |
|---|---|---|---|
| Arithmetic | IR_OP_ADD | جمع | Addition |
| | IR_OP_SUB | طرح | Subtraction |
| | IR_OP_MUL | ضرب | Multiplication |
| | IR_OP_DIV | قسم | Division |
| | IR_OP_MOD | باقي | Modulo |
| | IR_OP_NEG | سالب | Negation |
| Memory | IR_OP_ALLOCA | حجز | Stack allocation |
| | IR_OP_LOAD | حمل | Load from memory |
| | IR_OP_STORE | خزن | Store to memory |
| | IR_OP_PTR_OFFSET | إزاحة_مؤشر | Pointer offset: base + index * sizeof(pointee) |
| Comparison | IR_OP_CMP | قارن | Compare with predicate |
| Logical | IR_OP_AND | و | Bitwise AND |
| | IR_OP_OR | أو | Bitwise OR |
| | IR_OP_XOR | أو_حصري | Bitwise XOR |
| | IR_OP_NOT | نفي | Bitwise NOT |
| | IR_OP_SHL | ازاحة_يسار | Shift left |
| | IR_OP_SHR | ازاحة_يمين | Shift right (signed/unsigned-aware) |
| Control | IR_OP_BR | قفز | Unconditional branch |
| | IR_OP_BR_COND | قفز_شرط | Conditional branch |
| | IR_OP_RET | رجوع | Return |
| | IR_OP_CALL | نداء | Function call |
| SSA | IR_OP_PHI | فاي | Phi node |
| | IR_OP_COPY | نسخ | Copy value |
| | IR_OP_NOP | NOP | No operation |
| Conversion | IR_OP_CAST | تحويل | Type cast |
| Type | Arabic | Bits | Description |
|---|---|---|---|
| IR_TYPE_VOID | فراغ | 0 | No value |
| IR_TYPE_I1 | ص١ | 1 | Boolean |
| IR_TYPE_I8 | ص٨ | 8 | Byte/Char (8-bit signed) |
| IR_TYPE_I16 | ص١٦ | 16 | Short (16-bit signed) |
| IR_TYPE_I32 | ص٣٢ | 32 | Int (32-bit signed) |
| IR_TYPE_I64 | ص٦٤ | 64 | Long (64-bit signed) |
| IR_TYPE_U8 | ط٨ | 8 | Unsigned byte |
| IR_TYPE_U16 | ط١٦ | 16 | Unsigned short |
| IR_TYPE_U32 | ط٣٢ | 32 | Unsigned int |
| IR_TYPE_U64 | ط٦٤ | 64 | Unsigned long |
| IR_TYPE_CHAR | حرف | 8 | UTF-8 char (packed into i64) |
| IR_TYPE_F64 | ع٦٤ | 64 | Float (double) |
| IR_TYPE_PTR | مؤشر | 64 | Pointer |
| IR_TYPE_ARRAY | مصفوفة | varies | Array |
| IR_TYPE_FUNC | دالة | 64 | Function pointer type (v0.3.10.6+) |
Note (v0.3.10.6): IR_TYPE_FUNC values are used as function pointers (they can be stored, loaded, and compared EQ/NE against 0) and are lowered on x86-64 to a 64-bit value, like an ordinary pointer. ir_builder_emit_call_indirect() is used for indirect calls.
| Predicate | Arabic | Description |
|---|---|---|
| IR_CMP_EQ | يساوي | Equal |
| IR_CMP_NE | لا_يساوي | Not Equal |
| IR_CMP_GT | أكبر | Greater Than (signed) |
| IR_CMP_LT | أصغر | Less Than (signed) |
| IR_CMP_GE | أكبر_أو_يساوي | Greater or Equal (signed) |
| IR_CMP_LE | أصغر_أو_يساوي | Less or Equal (signed) |
| IR_CMP_UGT | أكبر_بدون_إشارة | Greater Than (unsigned) |
| IR_CMP_ULT | أصغر_بدون_إشارة | Less Than (unsigned) |
| IR_CMP_UGE | أكبر_أو_يساوي_بدون_إشارة | Greater or Equal (unsigned) |
| IR_CMP_ULE | أصغر_أو_يساوي_بدون_إشارة | Less or Equal (unsigned) |
Registers use Arabic naming with Arabic-Indic numerals:
- Format: %م<n>, where م = مؤقت (temporary)
- Examples: %م٠, %م١, %م٢, ...
The int_to_arabic_numerals() function converts integers to Arabic-Indic digits (٠١٢٣٤٥٦٧٨٩).
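One way such a conversion can work is sketched below. The function name, signature, and buffer handling are hypothetical (the real `int_to_arabic_numerals()` signature is not shown in this document); the only hard fact used is that Arabic-Indic digits U+0660..U+0669 encode in UTF-8 as the byte 0xD9 followed by 0xA0 plus the digit value.

```c
#include <assert.h>
#include <stddef.h>

/* Writes `n` (n >= 0) as Arabic-Indic digits in UTF-8, two bytes per digit.
   Returns the number of bytes written (excluding the terminator),
   or 0 if `out` is too small. */
static size_t demo_to_arabic_numerals(int n, char* out, size_t out_size) {
    assert(n >= 0 && out != NULL);
    char ascii[16];
    size_t len = 0;
    do {                                   /* collect decimal digits, least significant first */
        ascii[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n > 0);
    if (out_size < len * 2 + 1) return 0;
    size_t w = 0;
    while (len > 0) {                      /* emit most significant digit first */
        char d = ascii[--len];
        out[w++] = (char)0xD9;
        out[w++] = (char)(0xA0 + (d - '0'));
    }
    out[w] = '\0';
    return w;
}
```

With this sketch, register 10 would print as `%م` followed by the four bytes for ١٠.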
Baa Source:
صحيح الرئيسية() {
صحيح س = ١٠.
صحيح ص = ٢٠.
إرجع س + ص.
}
Generated IR (Arabic mode):
دالة الرئيسية() -> ص٦٤ {
بداية:
%م٠ = حجز ص٦٤
خزن ص٦٤ ١٠, %م٠
%م١ = حجز ص٦٤
خزن ص٦٤ ٢٠, %م١
%م٢ = حمل ص٦٤ %م٠
%م٣ = حمل ص٦٤ %م١
%م٤ = جمع ص٦٤ %م٢, %م٣
رجوع ص٦٤ %م٤
}
Key functions for building IR directly (without builder):
// Module
IRModule* ir_module_new(const char* name);
void ir_module_add_func(IRModule* module, IRFunc* func);
int ir_module_add_string(IRModule* module, const char* str);
// Function
IRFunc* ir_func_new(const char* name, IRType* ret_type);
int ir_func_alloc_reg(IRFunc* func);
IRBlock* ir_func_new_block(IRFunc* func, const char* label);
// Block
IRBlock* ir_block_new(const char* label, int id);
void ir_block_append(IRBlock* block, IRInst* inst);
// Instructions
IRInst* ir_inst_binary(IROp op, IRType* type, int dest, IRValue* lhs, IRValue* rhs);
IRInst* ir_inst_cmp(IRCmpPred pred, int dest, IRValue* lhs, IRValue* rhs);
IRInst* ir_inst_load(IRType* type, int dest, IRValue* ptr);
IRInst* ir_inst_store(IRValue* value, IRValue* ptr);
IRInst* ir_inst_br(IRBlock* target);
IRInst* ir_inst_br_cond(IRValue* cond, IRBlock* if_true, IRBlock* if_false);
IRInst* ir_inst_ret(IRValue* value);
IRInst* ir_inst_call(const char* target, IRType* ret_type, int dest, IRValue** args, int arg_count);
IRInst* ir_inst_call_indirect(IRValue* callee, IRType* ret_type, int dest, IRValue** args, int arg_count);
IRInst* ir_inst_phi(IRType* type, int dest);
// Printing
void ir_module_print(IRModule* module, FILE* out, int use_arabic);
void ir_module_dump(IRModule* module, const char* filename, int use_arabic);
The IR Builder (src/ir_builder.h, src/ir_builder.c) provides a convenient builder pattern API:
// Builder lifecycle
IRBuilder* ir_builder_new(IRModule* module);
void ir_builder_free(IRBuilder* builder);
// Function/Block creation
IRFunc* ir_builder_create_func(IRBuilder* builder, const char* name, IRType* ret_type);
IRBlock* ir_builder_create_block(IRBuilder* builder, const char* label);
void ir_builder_set_insert_point(IRBuilder* builder, IRBlock* block);
// Register allocation
int ir_builder_alloc_reg(IRBuilder* builder);
// Emit instructions (auto-appends to current block)
int ir_builder_emit_add(IRBuilder* builder, IRType* type, IRValue* lhs, IRValue* rhs);
int ir_builder_emit_sub(IRBuilder* builder, IRType* type, IRValue* lhs, IRValue* rhs);
int ir_builder_emit_mul(IRBuilder* builder, IRType* type, IRValue* lhs, IRValue* rhs);
int ir_builder_emit_alloca(IRBuilder* builder, IRType* type);
int ir_builder_emit_load(IRBuilder* builder, IRType* type, IRValue* ptr);
void ir_builder_emit_store(IRBuilder* builder, IRValue* value, IRValue* ptr);
int ir_builder_emit_ptr_offset(IRBuilder* builder, IRType* type, IRValue* base, IRValue* index);
int ir_builder_emit_cast(IRBuilder* builder, IRType* from, IRValue* v, IRType* to);
void ir_builder_emit_br(IRBuilder* builder, IRBlock* target);
void ir_builder_emit_br_cond(IRBuilder* builder, IRValue* cond, IRBlock* if_true, IRBlock* if_false);
void ir_builder_emit_ret(IRBuilder* builder, IRValue* value);
int ir_builder_emit_call(IRBuilder* builder, const char* target, IRType* ret_type, IRValue** args, int arg_count);
int ir_builder_emit_call_indirect(IRBuilder* builder, IRValue* callee, IRType* ret_type, IRValue** args, int arg_count);
// Control flow structure helpers
void ir_builder_create_if_then(IRBuilder* builder, IRValue* cond,
const char* then_label, const char* merge_label,
IRBlock** then_block, IRBlock** merge_block);
void ir_builder_create_while(IRBuilder* builder,
const char* header_label, const char* body_label,
const char* exit_label,
IRBlock** header_block, IRBlock** body_block,
IRBlock** exit_block);
// Constants
IRValue* ir_builder_const_int(int64_t value);
IRValue* ir_builder_const_i64(int64_t value);
IRValue* ir_builder_const_bool(int value);
Benefits over low-level API:
- Automatic register allocation
- Automatic CFG edge management (successors/predecessors)
- Source location propagation
- Control flow structure helpers for if/else/while
- Statistics tracking
Expression lowering lives in src/ir_lower.h and src/ir_lower.c and is built on top of the IR Builder (src/ir_builder.h, src/ir_builder.c).
Key concepts:
- IRLowerCtx: Lowering context (builder + local bindings + control-flow stacks + debug bounds-check toggle).
- ir_lower_bind_local(): Binds a variable name to its حجز pointer register. (Statement lowering will populate this in v0.3.0.4.)
  - Local bindings now carry array metadata (rank/dimensions/element type) to support multi-dimensional indexing.
- lower_expr(): Lowers AST expressions into IR operands (IRValue*) and emits IR instructions via the builder.
Currently lowered expressions:
- NODE_INT, NODE_STRING, NODE_CHAR, NODE_BOOL, NODE_FLOAT, NODE_NULL
- NODE_VAR_REF (loads via حمل)
- NODE_BIN_OP (arithmetic, comparisons, logical ops, pointer difference)
- NODE_UNARY_OP (سالب, bitwise نفي, !, UOP_ADDR via pointers, UOP_DEREF via load)
- NODE_POSTFIX_OP (++/-- postfix via load + add/sub + store; expression result is the old value)
- NODE_SIZEOF -> compile-time constant size
- NODE_CAST -> تحويل (cast)
- NODE_CALL_EXPR -> نداء (supports direct calls, and indirect calls via IR_TYPE_FUNC pointers)
- Builtin string calls in NODE_CALL_EXPR (v0.3.9):
  - طول_نص: loop until terminator and return length
  - قارن_نص: lexicographic compare over Baa حرف
  - نسخ_نص / دمج_نص: heap allocation via malloc + copy loops
  - حرر_نص: frees the memory via free
- Builtin dynamic memory calls in NODE_CALL_EXPR (v0.3.11):
  - حجز_ذاكرة: lowers to malloc
  - تحرير_ذاكرة: lowers to free
  - إعادة_حجز: lowers to realloc
  - نسخ_ذاكرة: lowers to memcpy
  - تعيين_ذاكرة: lowers to memset
- Builtin file I/O calls in NODE_CALL_EXPR (v0.3.12):
  - فتح_ملف: lowers to fopen (handle is عدم* representing FILE*)
  - اغلق_ملف: lowers to fclose
  - اقرأ_حرف: lowers to fgetc + UTF-8 packing into حرف
  - اكتب_حرف: lowers to fputc
  - اقرأ_ملف: lowers to fread
  - اكتب_ملف: lowers to fwrite
  - نهاية_ملف: lowers to feof
  - موقع_ملف: lowers to ftello (Linux) / _ftelli64 (Windows)
  - اذهب_لموقع: lowers to fseeko (Linux) / _fseeki64 (Windows)
  - اقرأ_سطر: reads bytes until \n/EOF and returns nullable نص
  - اكتب_سطر: lowers to fputs + fputc('\n')
- Builtin variadic runtime calls in NODE_CALL_EXPR (v0.4.0.5):
  - بدء_معاملات: initializes variadic cursor from hidden variadic base.
  - معامل_تالي: reads next packed argument slot as requested type and advances cursor.
  - نهاية_معاملات: clears variadic cursor.
- Builtin standard-library module calls in NODE_CALL_EXPR (v0.4.2):
  - Math: جذر_تربيعي -> sqrt, أس -> pow, جيب -> sin, جيب_تمام -> cos, ظل -> tan, مطلق -> llabs, عشوائي -> rand
  - System: متغير_بيئة -> getenv (+ C-string → Baa string conversion), نفذ_أمر -> system
  - Time: وقت_حالي -> time, وقت_كنص -> ctime (+ C-string → Baa string conversion)
- Builtin error-handling calls in NODE_CALL_EXPR (v0.4.3):
  - تأكد / توقف_فوري: fail-fast abort paths with message emission.
  - كود_خطأ_النظام / ضبط_كود_خطأ_النظام: host errno bridge (__errno_location on Linux, _errno on Windows).
  - نص_كود_خطأ: lowers to strerror + C-string → Baa string conversion.
Statement lowering is implemented in the same module and currently supports:
- NODE_VAR_DECL: emit حجز + خزن and bind the variable name via ir_lower_bind_local()
- NODE_ASSIGN: emit خزن to an existing local binding
- NODE_RETURN: emit رجوع
- NODE_PRINT: emit نداء @اطبع(...) (builtin call)
- NODE_READ: emit نداء @اقرأ(%ptr) (builtin call)
- NODE_ARRAY_DECL / NODE_ARRAY_ASSIGN (including multi-dimensional index chains)
- NODE_MEMBER_ASSIGN on indexed array elements and structs
- NODE_DEREF_ASSIGN: store value through dereferenced pointer
Control flow lowering extends statement lowering to produce a full CFG using:
- قفز (unconditional branch)
- قفز_شرط (conditional branch)
Currently lowered control-flow nodes:
- NODE_IF: then/else/merge blocks with قفز_شرط
- NODE_WHILE: header/body/exit blocks, back edge to header (قفز)
- NODE_FOR: init + header/body/increment/exit blocks (استمر targets increment)
- NODE_SWITCH: comparison-chain dispatch + case blocks + default + end (with fallthrough)
- NODE_BREAK: branch to active loop/switch exit block
- NODE_CONTINUE: branch to active loop header/increment block
For full specification, see BAA_IR_SPECIFICATION.md.
The IR printer provides a canonical, Arabic-first text format for debugging and tooling.
- Core printer entry point: ir_module_print()
- Instruction formatting: ir_inst_print()
- Values / registers / immediates: ir_value_print()
- Arabic-Indic numerals for registers: int_to_arabic_numerals()
The driver exposes the printer via the CLI flag --dump-ir implemented in src/main.c. This flag:
- Parses + analyzes the source as usual.
- Builds an IR module using IRBuilder and lowers AST statements using lower_stmt().
- Prints IR to stdout.
Note: --dump-ir is a debug/inspection output mode. The default compilation pipeline is fully IR-based: AST → IR → Optimizer → ISel → RegAlloc → Emit.
Example invocation:
build\baa.exe --dump-ir program.baa
The IR analysis layer provides foundational compiler analyses required by the upcoming optimizer pipeline:
- CFG validation: ensure each block has a terminator (قفز/قفز_شرط/رجوع)
- Predecessor rebuilding: recompute preds[] and succs[] from terminator instructions (useful after IR edits)
- Dominator tree + dominance frontier: compute idom for each block and build dominance frontier sets
- Loop detection (v0.3.2.7.1): natural loop discovery via back edges using dominance (src/ir_loop.c, src/ir_loop.h).
- LICM (v0.3.2.7.1): conservative hoisting of pure loop-invariant computations to preheaders (src/ir_licm.c, src/ir_licm.h).
- Strength reduction (v0.3.2.7.1): instruction selection reduces ضرب by power-of-two constants inside loops to shl.
- Loop unrolling (v0.3.2.7.1): optional conservative full unroll for small constant trip-count loops (after Out-of-SSA) (src/ir_unroll.c, src/ir_unroll.h).
- Inlining (v0.3.2.7.2): conservative inliner at -O2 for small internal functions with a single call site (src/ir_inline.c, src/ir_inline.h).
Implementation lives in src/ir_analysis.c.
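The strength-reduction item in the list above rests on a simple identity: multiplying by 2^k equals shifting left by k. A minimal sketch follows; the helper names are hypothetical and this is not the ISel code, and the wrap-around behavior is written with unsigned arithmetic as an assumption about the intended semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when c is a positive power of two (exactly one bit set). */
static bool demo_is_power_of_two(int64_t c) {
    return c > 0 && (c & (c - 1)) == 0;
}

/* Returns k such that c == 1 << k. The caller must check demo_is_power_of_two(c). */
static int demo_log2(int64_t c) {
    assert(demo_is_power_of_two(c));
    int k = 0;
    while ((c >>= 1) != 0) ++k;
    return k;
}

/* Computes x * c, using a shift when c is a power of two.
   Unsigned arithmetic gives two's complement wrap-around without undefined behavior;
   the conversion back to int64_t behaves as expected on mainstream compilers. */
static int64_t demo_mul_const(int64_t x, int64_t c) {
    if (demo_is_power_of_two(c))
        return (int64_t)((uint64_t)x << demo_log2(c));
    return (int64_t)((uint64_t)x * (uint64_t)c);
}
```

For example, `demo_mul_const(x, 8)` takes the shift path with k = 3, while a multiplier such as 6 falls back to an ordinary multiply.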
Canonical Mem2Reg is a correctness-first SSA construction step that promotes a safe subset of local variables represented by حجز/خزن/حمل into SSA values:
- Computes dominance + dominance frontiers (via ir_func_compute_dominators()).
- Inserts فاي nodes at join points.
- Performs SSA renaming to rewrite حمل/خزن into SSA register values (usually نسخ).
File: src/ir_mem2reg.c
Entry Point: ir_mem2reg_run()
Pass Descriptor: IR_PASS_MEM2REG (used with the optimizer pipeline).
Constraints (correctness-first):
- No pointer escape (not passed to نداء, not used inside فاي, not stored as a value)
- Alloca block must dominate all uses (ensures SSA correctness)
- Must be definitely initialized before any load on all paths (must-def initialization). The initializing خزن may be in a different block as long as every path to a حمل has a prior store.
Pipeline position: Runs first inside each optimizer iteration (before Canon/InstCombine/SCCP/ConstFold/CopyProp/etc.) via ir_optimizer_run().
Out-of-SSA eliminates فاي before the backend by inserting copies on CFG edges. When a predecessor has multiple successors, a copy placed at its end would also execute on the other outgoing paths, so the pass splits such a critical edge to create an insertion block:
P -> B becomes P -> E -> B
File: src/ir_outssa.c
Entry Point: ir_outssa_run()
Driver integration: Executed in src/main.c before isel_run_ex() to ensure no IR_OP_PHI reaches ISel/RegAlloc/Emit.
SSA verification is an analysis step that validates IR invariants after Mem2Reg and before Out-of-SSA:
- Single definition: each virtual register is defined exactly once (SSA property), including function parameter registers.
- Dominance: every use is dominated by the register’s definition (with edge semantics for فاي).
- Phi correctness (فاي): exactly one incoming value per predecessor block, no duplicates, and no non-predecessor entries.
This verifier is exposed via the CLI flag:
- --verify-ssa: aborts compilation with diagnostics on the first violations (capped). It requires -O1/-O2 because Mem2Reg runs in the optimizer pipeline.
Files: src/ir_verify_ssa.c, header: src/ir_verify_ssa.h
IR well-formedness verification validates general IR invariants that should hold regardless of SSA state:
- Operand counts and required fields per instruction
- Type consistency between instruction results and operands
- Terminator placement (must end blocks; no instructions after terminators)
- Phi placement and incoming-edge shape (after rebuilding predecessors)
- Intra-module call signature checks when the callee exists in the same IR module
This verifier is exposed via the CLI flag:
- --verify-ir: aborts compilation with diagnostics on the first violations (capped).
Files: src/ir_verify_ir.c, header: src/ir_verify_ir.h
Pipeline position: Executed after optimization and before Out-of-SSA/backend in src/main.c.
Canonicalization normalizes instruction forms to increase matchability for later optimizations (CSE/DCE/constfold):
- Commutative ops: constant placement and deterministic operand ordering
- Comparisons: swap operands and predicate when the constant is on the left
File: src/ir_canon.c
Entry Point: ir_canon_run()
Pass Descriptor: IR_PASS_CANON (used with the optimizer pipeline).
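The comparison rule can be sketched as follows: when the constant is on the left, the operands are exchanged and the predicate is mirrored so the result is unchanged. The `Demo*` types are hypothetical stand-ins for IR values, not the structures used in src/ir_canon.c, and only the signed predicates are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEMO_EQ, DEMO_NE, DEMO_GT, DEMO_LT, DEMO_GE, DEMO_LE } DemoPred;

typedef struct { bool is_const; int64_t imm; int reg; } DemoOperand;
typedef struct { DemoPred pred; DemoOperand lhs, rhs; } DemoCmp;

/* Predicate that yields the same result once the two operands are exchanged. */
static DemoPred demo_swap_pred(DemoPred p) {
    switch (p) {
        case DEMO_GT: return DEMO_LT;
        case DEMO_LT: return DEMO_GT;
        case DEMO_GE: return DEMO_LE;
        case DEMO_LE: return DEMO_GE;
        default:      return p;      /* EQ and NE are symmetric */
    }
}

/* Moves a left-hand constant to the right: (5 < %r) becomes (%r > 5).
   Returns true if the comparison was changed. */
static bool demo_canon_cmp(DemoCmp* c) {
    assert(c != NULL);
    if (!c->lhs.is_const || c->rhs.is_const) return false;
    DemoOperand tmp = c->lhs;
    c->lhs = c->rhs;
    c->rhs = tmp;
    c->pred = demo_swap_pred(c->pred);
    return true;
}
```

After this normalization, `5 < %r` and `%r > 5` have the same shape, so a later pass that matches on opcode and operands sees them as one expression.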
InstCombine performs fast, local instruction simplifications to improve later passes (SCCP/constfold/copyprop/DCE). It rewrites eligible instructions into نسخ (IR_OP_COPY) or constants rather than deleting SSA definitions directly.
File: src/ir_instcombine.c
Entry Point: ir_instcombine_run()
Testing: Integration validation via scripts/qa_run.py --mode full and --mode stress.
SCCP (Sparse Conditional Constant Propagation) combines reachability with SSA constant propagation:
- Tracks reachable blocks and feasible edges.
- Propagates integer constants through SSA.
- Folds قفز_شرط (IR_OP_BR_COND) into قفز (IR_OP_BR) when the condition becomes constant.
File: src/ir_sccp.c
Entry Point: ir_sccp_run()
Testing: Integration validation via scripts/qa_run.py --mode full and --mode stress.
CFG simplification reduces unnecessary control-flow structure:
- قفز_شرط cond, X, X becomes قفز X
- Removes trivial قفز-only blocks conservatively, avoiding unsafe phi interactions
- Provides a reusable critical-edge splitting helper for IR passes
File: src/ir_cfg_simplify.c
Entry Point: ir_cfg_simplify_run()
Helper: ir_cfg_split_critical_edge()
Pass Descriptor: IR_PASS_CFG_SIMPLIFY (used with the optimizer pipeline).
The Data Layout module provides a central source of truth for target-specific type information (size, alignment, store size). Currently hardcoded for Windows x86-64, but designed to support multiple backends in the future.
File: src/ir_data_layout.c / src/ir_data_layout.h
Key API:
- ir_type_size_bytes(dl, type): Returns size in bytes (e.g., i32 → 4).
- ir_type_alignment(dl, type): Returns required alignment (e.g., i32 → 4).
- ir_type_store_size(dl, type): Returns memory size for storage.
Arithmetic Contract (v0.3.2.6.6):
- Overflow: Two's complement wrap (no undefined behavior).
- Safe Division: INT64_MIN / -1 → INT64_MIN (no trap).
- Safe Modulo: INT64_MIN % -1 → 0 (no trap).
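A minimal C sketch of helpers that satisfy this contract is shown below. The function names are hypothetical; treating a zero divisor as a precondition violation is an assumption, since the contract above does not say how division by zero is handled at this level.

```c
#include <assert.h>
#include <stdint.h>

/* Two's complement wrap-around: do the arithmetic in uint64_t, where overflow
   is well defined, then convert back (behaves as expected on mainstream compilers). */
static int64_t demo_wrap_add(int64_t a, int64_t b) {
    return (int64_t)((uint64_t)a + (uint64_t)b);
}

static int64_t demo_wrap_mul(int64_t a, int64_t b) {
    return (int64_t)((uint64_t)a * (uint64_t)b);
}

/* INT64_MIN / -1 overflows in C and traps on x86-64 idiv; return INT64_MIN instead. */
static int64_t demo_safe_div(int64_t a, int64_t b) {
    assert(b != 0);                       /* assumption: zero divisor handled elsewhere */
    if (a == INT64_MIN && b == -1) return INT64_MIN;
    return a / b;
}

/* Any value modulo -1 is 0, so this also covers INT64_MIN % -1 without trapping. */
static int64_t demo_safe_mod(int64_t a, int64_t b) {
    assert(b != 0);                       /* assumption: zero divisor handled elsewhere */
    if (b == -1) return 0;
    return a % b;
}
```

The same helpers are useful inside a constant folder, so that compile-time evaluation agrees with what the generated code computes at run time.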
The IR constant folding pass optimizes Baa IR by evaluating arithmetic and comparison instructions at compile time when both operands are immediate constants. It replaces all uses of the folded register with the constant value and removes the instruction from its block.
File: src/ir_constfold.c
Entry Point: ir_constfold_run()
Pass Descriptor: IR_PASS_CONSTFOLD (used with the optimizer pipeline).
Supported Operations:
- Arithmetic: جمع (add), طرح (sub), ضرب (mul), قسم (div), باقي (mod)
- Comparisons: قارن (eq, ne, gt, lt, ge, le)
How it works:
- Scans each function and block for foldable instructions.
- If both operands are immediate integer constants, computes the result.
- Replaces all uses of the destination register with a new constant IRValue.
- Removes the folded instruction from its block.
- Pass is function-local; virtual registers are scoped per function.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
API: See docs/API_REFERENCE.md for function signatures.
The IR dead code elimination pass removes useless IR after lowering/other optimizations:
- Dead SSA instructions: any instruction that produces a destination register which is never used, and has no side effects.
- Unreachable blocks: any basic block not reachable from the function entry block.
File: src/ir_dce.c
Entry Point: ir_dce_run()
Pass Descriptor: IR_PASS_DCE
Conservative correctness rules:
- نداء (calls) are treated as side-effecting and are not removed even if the result is unused.
- خزن (stores) are not removed.
- Terminators (قفز, قفز_شرط, رجوع) are never removed.
CFG hygiene:
- Unreachable-block removal uses ir_func_rebuild_preds() before/after pruning.
- Phi nodes are pruned of incoming edges from removed predecessor blocks to avoid dangling references.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
API: See docs/API_REFERENCE.md for function signatures.
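The dead-instruction half of this pass can be sketched as a use-count loop that runs until nothing changes. The flat instruction array and the `DemoDceInst` type are hypothetical simplifications; unreachable-block removal and phi pruning are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_REGS 32   /* hypothetical register limit for the sketch */

typedef struct {
    int  dest;         /* destination register, -1 if none */
    int  a, b;         /* operand registers, -1 if unused */
    bool side_effect;  /* true for calls, stores, and terminators */
    bool removed;
} DemoDceInst;

/* Removes instructions whose result is unused and that have no side effects.
   Repeats because removing one instruction can make its operands dead too.
   Returns the number of removed instructions. */
static int demo_dce(DemoDceInst* insts, int n) {
    int removed_total = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        int uses[DEMO_MAX_REGS] = {0};
        for (int i = 0; i < n; ++i) {          /* count uses by live instructions */
            if (insts[i].removed) continue;
            assert(insts[i].a < DEMO_MAX_REGS && insts[i].b < DEMO_MAX_REGS);
            if (insts[i].a >= 0) uses[insts[i].a]++;
            if (insts[i].b >= 0) uses[insts[i].b]++;
        }
        for (int i = 0; i < n; ++i) {
            DemoDceInst* in = &insts[i];
            if (in->removed || in->side_effect || in->dest < 0) continue;
            assert(in->dest < DEMO_MAX_REGS);
            if (uses[in->dest] == 0) {
                in->removed = true;
                changed = true;
                ++removed_total;
            }
        }
    }
    return removed_total;
}
```

Marking calls, stores, and terminators as `side_effect` is what keeps them alive even when their result register has no uses, matching the conservative rules listed above.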
The IR copy propagation pass removes redundant SSA copy chains by replacing uses of registers defined by نسخ (IR_OP_COPY) with their original source values. This simplifies the IR and improves the effectiveness of later passes (like common subexpression elimination and dead code elimination).
File: src/ir_copyprop.c
Entry Point: ir_copyprop_run()
Pass Descriptor: IR_PASS_COPYPROP
Scope: Function-local (virtual registers are scoped per function in the current IR).
What it does:
- Detects IR_OP_COPY instructions (نسخ) and builds an alias map (%مX → source value).
- Canonicalizes copy chains (e.g. %م٣ = نسخ %م٢, %م٢ = نسخ %م١) so %م٣ is rewritten to %م١.
- Rewrites operands in:
  - normal instruction operands
  - نداء call arguments
  - فاي phi incoming values
- Removes نسخ instructions after propagation.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
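Chain canonicalization can be sketched with a plain integer alias map. This is a simplification under stated assumptions: copy sources are restricted to registers (the real pass also handles constants), and the names below are hypothetical rather than taken from src/ir_copyprop.c.

```c
#include <assert.h>

#define DEMO_MAX_REGS 32      /* hypothetical register limit for the sketch */
#define DEMO_NO_ALIAS (-1)

/* alias[r] == s means "%r = copy %s"; DEMO_NO_ALIAS means %r is not a copy.
   SSA gives each register one definition, so copy chains cannot form cycles. */
static int demo_resolve(const int alias[DEMO_MAX_REGS], int reg) {
    assert(reg >= 0 && reg < DEMO_MAX_REGS);
    while (alias[reg] != DEMO_NO_ALIAS) reg = alias[reg];
    return reg;
}

/* Rewrites every operand to the root of its copy chain.
   Returns the number of operands that changed. */
static int demo_rewrite_operands(const int alias[DEMO_MAX_REGS], int* operands, int count) {
    int changed = 0;
    for (int i = 0; i < count; ++i) {
        int root = demo_resolve(alias, operands[i]);
        if (root != operands[i]) {
            operands[i] = root;
            ++changed;
        }
    }
    return changed;
}
```

With `alias[3] = 2` and `alias[2] = 1`, resolving register 3 walks the chain to register 1, which is the rewrite described for `%م٣` above.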
The IR common subexpression elimination (CSE) pass detects duplicate computations with identical opcode and operands, replacing subsequent uses with the first computed result.
File: src/ir_cse.c
Entry Point: ir_cse_run()
Pass Descriptor: IR_PASS_CSE
Algorithm:
- For each function and block, hash each pure expression (opcode + operand signatures).
- If a duplicate hash is found, replace all uses of the duplicate instruction's destination register with the original result.
- Remove redundant instructions after propagation.
Eligible Operations (pure, no side effects):
- Arithmetic: جمع (add), طرح (sub), ضرب (mul), قسم (div), باقي (mod)
- Comparisons: قارن (compare)
- Logical: و (and), أو (or), نفي (not)
NOT Eligible (side effects or non-deterministic):
- Memory: حجز (alloca), حمل (load), خزن (store)
- Control: نداء (call), فاي (phi), terminators (branches/returns)
Pipeline position: Enabled at -O2 after GVN.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
API: See docs/API_REFERENCE.md for function signatures.
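The matching step of the algorithm can be sketched on a single block. To stay short, a linear scan over earlier instructions stands in for the hash table the pass description mentions, and the `Demo*` types are hypothetical; the sketch only records which earlier result would replace each duplicate.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DEMO_ADD, DEMO_SUB, DEMO_MUL, DEMO_LOAD } DemoOp;

typedef struct { DemoOp op; int dest; int a; int b; } DemoInst;

/* Loads are excluded because memory may change between two identical loads. */
static bool demo_is_pure(DemoOp op) { return op != DEMO_LOAD; }

/* For each instruction i, replaced_by[i] receives the dest register of an earlier
   identical pure instruction in the block, or -1 if instruction i must be kept.
   Returns the number of duplicates found. */
static int demo_local_cse(const DemoInst* insts, int n, int* replaced_by) {
    int dups = 0;
    for (int i = 0; i < n; ++i) {
        replaced_by[i] = -1;
        if (!demo_is_pure(insts[i].op)) continue;
        for (int j = 0; j < i; ++j) {
            /* Only match against instructions that are themselves kept. */
            if (replaced_by[j] == -1 &&
                insts[j].op == insts[i].op &&
                insts[j].a == insts[i].a &&
                insts[j].b == insts[i].b) {
                replaced_by[i] = insts[j].dest;
                ++dups;
                break;
            }
        }
    }
    return dups;
}
```

A caller would then rewrite uses of each duplicate's destination to the recorded register and drop the duplicate, which is the replace-then-remove sequence listed in the algorithm.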
GVN (Global Value Numbering) removes redundant pure expressions across dominator scopes, even when they use different SSA registers due to copies. Unlike CSE, which relies on exact opcode/operand matching, GVN assigns value numbers to expressions based on their semantic equivalence.
File: src/ir_gvn.c
Entry Point: ir_gvn_run()
Pass Descriptor: IR_PASS_GVN
Algorithm:
- Computes dominance tree for each function.
- Assigns value numbers to expressions based on opcode and operand value numbers.
- Detects equivalent expressions even when they use different SSA registers.
- Replaces redundant computations with the original value.
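The difference from CSE is easier to see in code. The following sketch numbers values within a single scope and deliberately leaves out the dominator-tree walk; `GvnState` and the `gvn_*` helpers are hypothetical names for illustration, not the API of src/ir_gvn.c.

```c
#include <assert.h>

#define GVN_MAX_REGS  32
#define GVN_MAX_EXPRS 32

typedef struct { int op, lhs_vn, rhs_vn, vn; } GvnExpr;

typedef struct {
    int vn_of[GVN_MAX_REGS];      /* value number per register, 0 = unset */
    GvnExpr exprs[GVN_MAX_EXPRS];
    int expr_count;
    int next_vn;
} GvnState;

void gvn_init(GvnState *s)
{
    for (int i = 0; i < GVN_MAX_REGS; i++) s->vn_of[i] = 0;
    s->expr_count = 0;
    s->next_vn = 1;
}

/* Value number of a register; unseen registers get a fresh number. */
int gvn_reg(GvnState *s, int reg)
{
    assert(reg >= 0 && reg < GVN_MAX_REGS);
    if (s->vn_of[reg] == 0) s->vn_of[reg] = s->next_vn++;
    return s->vn_of[reg];
}

/* dst = copy src: both registers share one value number. */
void gvn_copy(GvnState *s, int dst, int src)
{
    assert(dst >= 0 && dst < GVN_MAX_REGS);
    s->vn_of[dst] = gvn_reg(s, src);
}

/* dst = op lhs, rhs: reuse the number of an equivalent expression if any. */
int gvn_binop(GvnState *s, int dst, int op, int lhs, int rhs)
{
    int l = gvn_reg(s, lhs);
    int r = gvn_reg(s, rhs);
    assert(dst >= 0 && dst < GVN_MAX_REGS);
    for (int i = 0; i < s->expr_count; i++) {
        const GvnExpr *e = &s->exprs[i];
        if (e->op == op && e->lhs_vn == l && e->rhs_vn == r)
            return s->vn_of[dst] = e->vn;
    }
    assert(s->expr_count < GVN_MAX_EXPRS);
    s->exprs[s->expr_count++] = (GvnExpr){ op, l, r, s->next_vn };
    return s->vn_of[dst] = s->next_vn++;
}
```

Because a copy inherits its source's value number, two additions whose operands differ only by a copy receive the same number, which exact operand matching would miss.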
Pipeline position: Enabled at -O2 after copy propagation and before CSE.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
LICM (Loop Invariant Code Motion) identifies pure instructions inside loops that depend only on values outside the loop and moves them to the loop preheader.
File: src/ir_licm.c
Entry Point: ir_licm_run()
Pass Descriptor: IR_PASS_LICM
Safety Constraints:
- Does not move memory operations or calls.
- Does not move division/remainder to avoid changing trap behavior when the loop is not entered.
- Requires a single preheader for the loop header (otherwise skips that loop).
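The constraints above amount to a per-instruction predicate. The sketch below encodes them under the assumption of a simplified opcode enum; `SketchOp` and `licm_can_hoist` are illustrative names, and the real pass may check more conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode subset, for illustration only. */
typedef enum {
    OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_MOD, OP_CMP,
    OP_LOAD, OP_STORE, OP_CALL
} SketchOp;

bool licm_can_hoist(SketchOp op, const bool *operand_invariant,
                    int operand_count, bool has_single_preheader)
{
    assert(operand_count >= 0);
    if (!has_single_preheader)
        return false;                           /* the whole loop is skipped */
    switch (op) {
    case OP_LOAD: case OP_STORE: case OP_CALL:  /* memory ops and calls stay */
    case OP_DIV: case OP_MOD:                   /* could trap if the loop never runs */
        return false;
    default:
        break;
    }
    for (int i = 0; i < operand_count; i++)
        if (!operand_invariant[i])
            return false;                       /* depends on a loop-varying value */
    return true;
}
```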
Pipeline position: Runs in both -O1 and -O2 after CFG simplification.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
The inlining pass expands function calls directly at their call sites, enabling further optimizations by exposing the function body to the optimizer.
File: src/ir_inline.c
Entry Point: ir_inline_run()
Algorithm:
- Conservatively inlines small internal functions with a single call site.
- Applied before Mem2Reg (before SSA construction) to avoid phi complexity.
- Relies on Mem2Reg + subsequent optimization passes for "cleanup after inlining".
Pipeline position: Enabled at -O2 only, runs before the main optimization loop.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
Loop unrolling replicates loop bodies to reduce loop overhead and enable further optimizations.
File: src/ir_unroll.c
Entry Point: ir_unroll_run()
Constraints:
- Applies only when the trip count is constant and small.
- Applies only to natural loops with a single preheader.
- Runs after Out-of-SSA because that makes loop values explicit through copies.
Pipeline position: Enabled with -funroll-loops flag after Out-of-SSA.
Testing: Covered by integration corpus and optimizer-enabled smoke in scripts/qa_run.py --mode full.
The instruction selection pass converts Baa IR (SSA form) into an abstract machine representation (MachineModule) that closely mirrors x86-64 instructions while keeping virtual registers. Physical register assignment is deferred to the register allocation pass (v0.3.2.2).
Files: src/isel.h, src/isel.c
Entry Point: isel_run_ex() — takes an IRModule* plus a BaaTarget to select ABI/object-format behavior.
Multi-target note (v0.3.2.8.1): The backend is being refactored to accept a BaaTarget descriptor (src/target.h) so the same IR can be lowered for Windows x64 (COFF) or Linux x86-64 (ELF). The driver exposes this via --target=....
IRModule ──→ isel_run_ex() ──→ MachineModule
IRFunc │ MachineFunc
IRBlock │ MachineBlock
IRInst │ MachineInst (1:N expansion)
▼
ISelCtx (internal context)
- current function/block
- vreg counter
- stack size tracking
Each IR instruction is lowered to one or more MachineInst nodes. The expansion ratio is typically 1:1 to 1:4 depending on the IR opcode (e.g., IR_OP_DIV expands to MOV + CQO + IDIV).
| Structure | Description |
|---|---|
| MachineOp | Enum of x86-64 opcodes: ADD, SUB, IMUL, SHL, SHR, SAR, IDIV, DIV, NEG, CQO, ADDSD, SUBSD, MULSD, DIVSD, UCOMISD, XORPD, CVTSI2SD, CVTTSD2SI, MOV, LEA, LOAD, STORE, CMP, TEST, SETcc (E, NE, G, L, GE, LE, A, B, AE, BE, P, NP), MOVZX, MOVSX, AND, OR, NOT, XOR, JMP, JE, JNE, CALL, TAILJMP, RET, PUSH, POP, NOP, LABEL, COMMENT |
| MachineOperandKind | NONE, VREG, IMM, MEM, LABEL, GLOBAL, FUNC, XMM |
| MachineOperand | Union: vreg number, immediate value, memory (base+offset), label id, global/func name, xmm register |
| MachineInst | Doubly-linked list node: op + dst/src1/src2 + ir_reg + comment + src_loc + dbg_name + sysv_al (for varargs) |
| MachineBlock | Label + instruction list + successors + linked-list next |
| MachineFunc | Name + block list + vreg counter + stack_size + param_count |
| MachineModule | Function list + globals (ref from IR) + strings (ref from IR) + baa_strings |
MachineInst Structure:
typedef struct MachineInst {
MachineOp op; // opcode
MachineOperand dst; // destination operand
MachineOperand src1; // first source operand
MachineOperand src2; // second source operand (optional)
// source tracking info
int ir_reg; // original IR register (links back to the IR)
const char* comment; // optional comment (for readability)
// debug info
const char* src_file;
int src_line;
int src_col;
int ir_inst_id; // IR instruction id (if any)
const char* dbg_name; // optional variable/symbol name
// SystemV AMD64 varargs: AL = number of XMM registers used to pass arguments.
// -1 => setting AL is not explicitly requested (default 0).
int sysv_al;
// doubly-linked list
struct MachineInst* prev;
struct MachineInst* next;
} MachineInst;
MachineBlock Structure:
typedef struct MachineBlock {
char* label; // block name
int id; // block id
MachineInst* first; // first instruction
MachineInst* last; // last instruction
int inst_count; // instruction count
struct MachineBlock* succs[2]; // successors (0-2)
int succ_count;
struct MachineBlock* next; // next block in the list
} MachineBlock;
MachineFunc Structure:
typedef struct MachineFunc {
char* name; // function name
MachineBlock* blocks; // block list
int block_count;
int next_vreg; // virtual register counter
int stack_size; // local stack size
int param_count; // parameter count
struct MachineFunc* next; // next function
} MachineFunc;
MachineModule Structure:
typedef struct MachineModule {
MachineFunc* funcs; // function list
int func_count;
IRGlobal* globals; // reference from IR (not owned)
int global_count;
IRStringEntry* strings; // reference from IR (not owned)
int string_count;
IRBaaStringEntry* baa_strings; // reference from IR (not owned)
int baa_string_count;
} MachineModule;
| IR Opcode | Machine Pattern | Notes |
|---|---|---|
| IR_OP_ADD / IR_OP_SUB / IR_OP_MUL | MOV dst, lhs; OP dst, rhs | Two-address form. Immediates inlined as src2 |
| IR_OP_DIV / IR_OP_MOD | MOV RAX, lhs; CQO; IDIV rhs | If rhs is immediate, temp vreg is allocated for MOV |
| IR_OP_NEG | MOV dst, src; NEG dst | Two-instruction pattern |
| IR_OP_ALLOCA | LEA dst, [RBP - offset] | Stack offset tracked in ISelCtx.stack_size |
| IR_OP_LOAD | LOAD dst, [ptr] or LOAD dst, @global | Global variables use MACH_OP_GLOBAL operand |
| IR_OP_STORE | STORE [ptr], src | Immediate values can be stored directly to memory |
| IR_OP_PTR_OFFSET | MOV dst, base; (scale index); ADD dst, index_scaled | Used for array indexing: computes element address using data layout element size |
| IR_OP_CMP | CMP lhs, rhs; SETcc tmp; MOVZX dst, tmp | SETcc selected by predicate (EQ/NE/GT/LT/GE/LE). If LHS is immediate, temp vreg is used |
| IR_OP_AND / IR_OP_OR / IR_OP_XOR | MOV dst, lhs; OP dst, rhs | Same two-address form as arithmetic |
| IR_OP_SHL | MOV dst, lhs; SHL dst, rhs | If rhs is non-immediate, count is moved to RCX/CL |
| IR_OP_SHR | MOV dst, lhs; SHR/SAR dst, rhs | SHR for unsigned types, SAR for signed types |
| IR_OP_NOT | MOV dst, src; NOT dst | Bitwise NOT |
| IR_OP_BR | JMP label | Unconditional jump |
| IR_OP_BR_COND | TEST cond, cond; JNE true_label; JMP false_label | Three-instruction pattern |
| IR_OP_RET | MOV RAX, val; RET | Uses special vreg -2 (= RAX) |
| IR_OP_CALL | MOV param_regs, args...; (setup stack args); CALL @func/*reg; MOV dst, RAX | Direct: CALL @func. Indirect: CALL *reg (callee value). ABI: Windows (shadow) / SysV (no shadow). In v0.4.0.5 variadic Baa calls pass packed extras via hidden __baa_va_base pointer. In v0.4.0.6 inline asm is lowered as a pseudo-call (__baa_inline_asm_v0406) and converted in ISel into raw assembly lines, with register inputs/outputs moved into place. |
| IR_OP_CALL + IR_OP_RET (tail) | MOV param_regs, args...; TAILJMP @func | v0.3.2.7.3: enabled only at -O2 and conservatively (register args only) |
| IR_OP_PHI | NOP | Placeholder; copy insertion deferred to register allocation |
| IR_OP_CAST | MOV dst, src (larger/same size) or MOVZX/MOVSX dst, src (smaller to larger) | Size and sign dependent conversion (تحويل) |
The instruction selector uses negative vreg numbers to represent physical register constraints that will be resolved during register allocation:
| Vreg | Physical Register | Purpose |
|---|---|---|
| -1 | RBP | Memory base for stack accesses |
| -2 | RAX | Return value register |
| -3 | RSP | Stack pointer base for outgoing call frames |
| -4 | R11 | Reserved scratch register (spill-base fixups, mem-to-mem avoidance) |
| -5 | RDX | Remainder register for idiv / backend fixed constraint |
| -6 | RCX | Shift count register (cl) for variable shifts |
| -10.. | ABI arg regs | Function arguments (target-dependent). Windows: -10..-13 → RCX/RDX/R8/R9. SysV: -10..-15 → RDI/RSI/RDX/RCX/R8/R9 |
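To make the target-dependent row concrete, here is a minimal sketch of the lookup, assuming the base vreg is -10 and argument i maps to vreg -10 - i as the table indicates. `SketchAbi`, `SketchReg`, and `abi_arg_vreg_to_reg` are hypothetical names, and the enum ordering is not claimed to match PhysReg.

```c
#include <assert.h>

typedef enum { SKETCH_WINDOWS_X64, SKETCH_SYSV_AMD64 } SketchAbi;

/* Hypothetical register enum; the real PhysReg ordering may differ. */
typedef enum {
    SK_NONE = -1,
    SK_RCX, SK_RDX, SK_RSI, SK_RDI, SK_R8, SK_R9
} SketchReg;

/* Maps an ABI argument vreg (-10, -11, ...) to its register for the
 * given ABI, or SK_NONE when the vreg is outside the argument range. */
SketchReg abi_arg_vreg_to_reg(SketchAbi abi, int vreg)
{
    static const SketchReg win[]  = { SK_RCX, SK_RDX, SK_R8, SK_R9 };
    static const SketchReg sysv[] = { SK_RDI, SK_RSI, SK_RDX, SK_RCX, SK_R8, SK_R9 };
    int index = -10 - vreg;            /* -10 -> 0, -11 -> 1, ... */
    int count = (abi == SKETCH_WINDOWS_X64) ? 4 : 6;
    assert(abi == SKETCH_WINDOWS_X64 || abi == SKETCH_SYSV_AMD64);
    if (index < 0 || index >= count)
        return SK_NONE;
    return (abi == SKETCH_WINDOWS_X64) ? win[index] : sysv[index];
}
```

Note that the same vreg resolves differently per target: -10 is RCX on Windows and RDI under SysV, which is why ISel can stay target-neutral until rewrite time.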
- Virtual registers preserved: ISel keeps IR virtual register numbers intact. Physical register mapping is entirely deferred to v0.3.2.2 (register allocation).
- Immediate inlining: Constants are embedded as MACH_OP_IMM wherever x86-64 encoding permits. Where not allowed (CMP first operand, IDIV divisor), a temp vreg + MOV is emitted.
- Phi nodes as NOPs: Phi instructions become NOP placeholders. Actual copy insertion into predecessor blocks is deferred to SSA destruction during register allocation.
- MachineModule references IR data: Global variables and string tables are referenced (not copied) from the IR module. Memory is freed by the IR module.
- Stack size tracking: Each IR_OP_ALLOCA increases stack_size by the store size of the allocated pointee type (rounded up to its alignment via the target data layout). The LEA instruction uses the accumulated offset.
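The stack size bookkeeping can be sketched as follows, assuming power-of-two alignments and sizes supplied by the caller in place of the data layout module. `align_up` and `isel_alloca_offset` are hypothetical helpers, not functions from src/isel.c.

```c
#include <assert.h>

/* Rounds `value` up to a multiple of `align` (a power of two). */
int align_up(int value, int align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Grows the frame for one alloca and returns the offset to use in
 * `LEA dst, [RBP - offset]`. */
int isel_alloca_offset(int *stack_size, int store_size, int align)
{
    assert(store_size > 0);
    *stack_size += align_up(store_size, align);
    return *stack_size;
}
```

Three allocas of sizes 8, 1, and 12 with alignments 8, 8, and 4 would land at offsets 8, 16, and 28 in this sketch.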
Testing: Backend behavior is validated by integration runtime tests under tests/integration/backend/.
The register allocator transforms virtual register references in machine instructions into physical x86-64 registers. It uses the Linear Scan algorithm for simplicity and fast compilation.
Source: src/regalloc.h / src/regalloc.c
MachineModule (vregs)
│
├── 1. Number Instructions ← Sequential numbering for position tracking
├── 2. Compute def/use ← Per-block def/use bitsets
├── 3. Liveness Analysis ← Iterative dataflow → live-in/live-out
├── 4. Build Live Intervals ← vreg → [start, end] ranges
├── 5. Linear Scan ← Assign physical registers, spill on pressure
├── 6. Insert Spill Code ← Handle spilled vregs
└── 7. Rewrite Operands ← Replace VREG → physical reg / MEM
│
▼
MachineModule (physical regs)
| Structure | Purpose |
|---|---|
| PhysReg | Enum of 16 x86-64 physical registers (RAX=0 through R15=15) |
| LiveInterval | Per-vreg range: {vreg, start, end, phys_reg, spilled, spill_offset} |
| BlockLiveness | Per-block bitsets: {def, use, live_in, live_out} as uint64_t* arrays |
| RegAllocCtx | Full context: function, inst_map, block liveness, intervals, vreg→phys mapping, spill tracking |
Registers are allocated in a specific priority order to minimize callee-save overhead:
- Caller-saved temporaries: R10 (free to use, no save/restore). R11 is reserved as a scratch register for spill/base fixups.
- General purpose: RSI, RDI (callee-saved on Windows x64, caller-saved on SysV)
- Callee-saved: RBX, R12, R13, R14, R15 (require save/restore in prologue/epilogue)
- ABI-reserved: RCX, RDX, R8, R9 (argument registers, allocated last). RAX is reserved for return value and backend scratch sequences.
Always reserved: RSP (stack pointer), RBP (frame pointer) — never allocated.
ISel emits negative vregs for ABI-fixed locations. The register allocator resolves these during rewrite:
| Virtual Reg | Physical Reg | Purpose |
|---|---|---|
| -1 | RBP | Frame pointer (memory base) |
| -2 | RAX | Return value |
| -4 | R11 | Scratch register for spilled memory bases |
| -5 | RDX | Remainder register (idiv) |
| -6 | RCX | Shift count register (cl) |
| -10 | RCX | 1st argument (Windows x64) |
| -11 | RDX | 2nd argument (Windows x64) |
| -12 | R8 | 3rd argument (Windows x64) |
| -13 | R9 | 4th argument (Windows x64) |
The liveness analysis uses iterative dataflow on bitsets:
- def/use computation: Walk each block's instructions. For each instruction, if a vreg is used before being defined in the block, it goes into use. If defined, it goes into def. Two-address form (e.g., add dst, dst, src) records dst as both use and def.
- Dataflow iteration: Iterate in reverse block order until fixpoint (max 100 iterations):
  - live_out[B] = union(live_in[S]) for all successors S of B
  - live_in[B] = use[B] union (live_out[B] - def[B])
- Interval construction: Walk instructions sequentially, extending intervals for vregs in live_in/live_out sets at block boundaries.
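The dataflow iteration is compact enough to sketch directly. The version below assumes at most 64 vregs so each set fits in one uint64_t word, whereas the real BlockLiveness uses uint64_t* arrays; `SketchBlock` and `liveness_solve` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUCCS 2

typedef struct {
    uint64_t def, use, live_in, live_out;   /* one bit per vreg (up to 64) */
    int succs[MAX_SUCCS];                   /* successor block indices */
    int succ_count;
} SketchBlock;

/* Iterates to a fixpoint (or max_iters); returns the number of passes made. */
int liveness_solve(SketchBlock *blocks, int block_count, int max_iters)
{
    int iters = 0;
    bool changed = true;
    assert(block_count > 0 && max_iters > 0);
    while (changed && iters < max_iters) {
        changed = false;
        iters++;
        for (int b = block_count - 1; b >= 0; b--) {   /* reverse block order */
            SketchBlock *blk = &blocks[b];
            uint64_t out = 0;
            for (int s = 0; s < blk->succ_count; s++)
                out |= blocks[blk->succs[s]].live_in;
            uint64_t in = blk->use | (out & ~blk->def);
            if (in != blk->live_in || out != blk->live_out) {
                blk->live_in = in;
                blk->live_out = out;
                changed = true;
            }
        }
    }
    return iters;
}
```

Visiting blocks in reverse order tends to reach the fixpoint in few passes because liveness flows backward from uses to definitions; loops are what force the extra iterations.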
When register pressure exceeds available registers, the allocator spills the longest-lived interval (comparing current candidate vs active intervals). Spilled vregs are assigned stack offsets relative to RBP. During rewrite, spilled VREG operands are converted to MEM operands [RBP + offset], leveraging x86-64's ability to have one memory operand per instruction. Exception: if a spilled vreg is used as the base of a memory operand (e.g. MACH_LOAD/MACH_STORE through a spilled pointer), the allocator reloads the pointer base into a reserved scratch register (R11) immediately before the instruction.
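The spill heuristic can be illustrated with a stripped-down linear scan. This is a sketch under several assumptions: intervals arrive sorted by start, registers are plain indices, and spilling only marks the interval rather than assigning a stack offset. `SketchInterval` and `linear_scan` are hypothetical and do not mirror src/regalloc.c.

```c
#include <assert.h>

#define SPILLED (-1)

typedef struct {
    int vreg, start, end;
    int phys_reg;       /* assigned register index, or SPILLED */
} SketchInterval;

/* `iv` must be sorted by increasing start. Registers are 0..reg_count-1. */
void linear_scan(SketchInterval *iv, int n, int reg_count)
{
    int active[16];     /* indices into iv of intervals holding a register */
    int active_count = 0;
    assert(reg_count > 0 && reg_count <= 16);

    for (int i = 0; i < n; i++) {
        /* Expire intervals that ended before this one starts. */
        int kept = 0;
        for (int a = 0; a < active_count; a++)
            if (iv[active[a]].end >= iv[i].start)
                active[kept++] = active[a];
        active_count = kept;

        if (active_count < reg_count) {
            /* Pick the lowest register not held by an active interval. */
            int reg = 0;
            for (int a = 0; a < active_count; a++)
                if (iv[active[a]].phys_reg == reg) { reg++; a = -1; }
            iv[i].phys_reg = reg;
            active[active_count++] = i;
            continue;
        }

        /* Pressure: spill whichever of {active, current} ends last. */
        int victim = 0;
        for (int a = 1; a < active_count; a++)
            if (iv[active[a]].end > iv[active[victim]].end)
                victim = a;
        if (iv[active[victim]].end > iv[i].end) {
            iv[i].phys_reg = iv[active[victim]].phys_reg;
            iv[active[victim]].phys_reg = SPILLED;
            active[victim] = i;
        } else {
            iv[i].phys_reg = SPILLED;
        }
    }
}
```

Spilling the interval that ends last frees a register for the longest stretch, which is the usual rationale for this choice in linear scan allocators.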
- Linear scan over graph coloring: Chosen for simplicity and O(n log n) compilation speed. Sufficient for the current optimization level.
- Spill via rewrite (not explicit loads/stores): Spilled vregs become [RBP+offset] MEM operands directly, avoiding extra load/store insertion. Works because x86-64 allows one memory operand per instruction. Exception: spilled pointer bases used in MACH_OP_MEM.base_vreg are reloaded into R11 before MACH_LOAD/MACH_STORE.
- RSP/RBP always reserved: Frame pointer is always maintained for simple stack access. No frame pointer omission.
- Callee-saved tracking: RegAllocCtx.callee_saved_used[] tracks which callee-saved registers are allocated, informing prologue/epilogue generation in the code emission phase.
Testing: Register allocation behavior is validated by integration runtime tests under tests/integration/backend/.
The backend uses a target abstraction layer (src/target.h, src/target.c) to separate OS/object-format/calling-convention assumptions from the rest of the backend (isel/regalloc/emit). This enables support for multiple targets.
Target Kinds:
typedef enum {
BAA_TARGET_X86_64_WINDOWS = 0, // Windows x86-64 (COFF/PE)
BAA_TARGET_X86_64_LINUX = 1, // Linux x86-64 (ELF)
} BaaTargetKind;
typedef enum {
BAA_OBJFORMAT_COFF = 0, // Windows PE/COFF
BAA_OBJFORMAT_ELF = 1, // Linux ELF
} BaaObjectFormat;
Calling Convention Descriptor:
typedef struct BaaCallingConv {
int int_arg_reg_count; // number of integer argument registers
int int_arg_phys_regs[8]; // PhysReg values (from regalloc.h)
int ret_phys_reg; // PhysReg (usually RAX)
unsigned int callee_saved_mask; // bitmask over PhysReg
unsigned int caller_saved_mask; // bitmask over PhysReg
int stack_align_bytes; // stack alignment at call sites (usually 16)
// ABI argument registers are represented in Machine IR as negative virtual registers
// arg i -> (abi_arg_vreg0 - i)
int abi_arg_vreg0; // default: -10
int abi_ret_vreg; // default: -2 (RAX)
int shadow_space_bytes; // Windows: 32, SysV: 0
bool home_reg_args_on_call; // Windows varargs: true, SysV: false
bool sysv_set_al_zero_on_call; // SysV varargs rule: true
} BaaCallingConv;
Target Descriptor:
typedef struct BaaTarget {
BaaTargetKind kind;
const char* name; // short name: x86_64-windows, x86_64-linux
const char* triple; // future: full triple
BaaObjectFormat obj_format;
const IRDataLayout* data_layout;
const BaaCallingConv* cc;
const char* default_exe_ext; // ".exe" on Windows, "" on Linux
} BaaTarget;
Target Selection:
const BaaTarget* baa_target_builtin_windows_x86_64(void);
const BaaTarget* baa_target_builtin_linux_x86_64(void);
const BaaTarget* baa_target_host_default(void);
const BaaTarget* baa_target_parse(const char* s); // "x86_64-windows" or "x86_64-linux"
ABI Differences:
| Feature | Windows x64 | SystemV AMD64 (Linux) |
|---|---|---|
| Integer args | RCX, RDX, R8, R9 | RDI, RSI, RDX, RCX, R8, R9 |
| Shadow space | 32 bytes | None |
| Varargs | Home register args on stack | Set AL = number of XMM args |
| Callee-saved | RBX, RBP, RDI, RSI, R12-R15 | RBX, RBP, R12-R15 |
| Stack alignment | 16 bytes | 16 bytes |
Backend Integration:
- isel_run_ex() takes const BaaTarget* to select ABI/object-format behavior
- regalloc_run_ex() accepts target for calling convention-aware allocation
- emit_module_ex() uses target for code emission decisions (sections, symbols)
The code emission pass is the final backend stage that converts machine IR (after register allocation) into x86-64 assembly text in AT&T syntax, compatible with GAS (GNU Assembler) for both COFF (Windows) and ELF (Linux) targets.
Source: src/emit.h / src/emit.c
MachineModule (physical regs)
│
├── 1. Emit rodata section ← Format strings (COFF: .rdata, ELF: .rodata)
├── 2. Emit .data section ← Global variables with initializers
├── 3. Emit .text section ← Functions:
│ ├── Function prologue ← Stack setup + callee-saved preservation
│ ├── Instruction emission ← Translate each MachineInst to AT&T
│ └── Function epilogue ← Callee-saved restoration + return
└── 4. Emit string table ← .Lstr_N labels for string literals
│
▼
Assembly file (.s)
| Aspect | AT&T Syntax | Intel Syntax (for comparison) |
|---|---|---|
| Register prefix | %rax, %rcx | rax, rcx |
| Immediate prefix | $10 | 10 |
| Operand order | mov source, dest | mov dest, source |
| Size suffix | movq (64-bit), movl (32-bit), movb (8-bit) | mov qword, mov dword, mov byte |
| Memory addressing | offset(%base) | [base + offset] |
The prologue sets up the stack frame and preserves callee-saved registers:
push %rbp # Save old frame pointer
mov %rsp, %rbp # Set up new frame pointer
sub $N, %rsp # Allocate stack space (N = local + shadow + callee-save, 16-byte aligned)
mov %rbx, -8(%rbp) # Save callee-saved registers (if used)
mov %r12, -16(%rbp)
...
Stack frame layout:
High addresses
┌─────────────────┐
│ Return address │ ← pushed by CALL
├─────────────────┤
│ Old RBP │ ← pushed by prologue
├─────────────────┤ ← RBP points here
│ Local vars │ (func->stack_size bytes)
├─────────────────┤
│ Shadow space │ (32 bytes for Windows x64)
├─────────────────┤
│ Callee-saved │ (RBX, R12-R15 if used)
├─────────────────┤ ← RSP points here (16-byte aligned)
Low addresses
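The value N in the prologue's sub $N, %rsp follows from the three regions in the layout. A minimal sketch of that arithmetic, assuming 8 bytes per saved general-purpose register; `frame_size_bytes` is a hypothetical helper name.

```c
#include <assert.h>

/* Bytes to subtract from RSP in the prologue, rounded up to 16. */
int frame_size_bytes(int local_bytes, int shadow_bytes, int callee_saved_count)
{
    assert(local_bytes >= 0 && shadow_bytes >= 0 && callee_saved_count >= 0);
    int total = local_bytes + shadow_bytes + 8 * callee_saved_count;
    return (total + 15) & ~15;
}
```

For 24 bytes of locals, 32 bytes of shadow space, and two saved registers, the raw total is 72 and the aligned frame is 80 bytes. Rounding to a multiple of 16 works because RSP is already 16-byte aligned once push %rbp has run at function entry.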
Callee-saved register detection:
The emitter scans all instructions in the function to determine which callee-saved registers (RBX, RSI, RDI, R12-R15) are used as destinations. Only used registers are preserved in the prologue and restored in the epilogue.
The epilogue restores callee-saved registers and tears down the stack frame:
mov -16(%rbp), %r12 # Restore callee-saved registers (reverse order)
mov -8(%rbp), %rbx
leave # Equivalent to: mov %rbp, %rsp; pop %rbp
ret # Return to caller
Each MachineInst is translated to one or more AT&T assembly instructions:
| Machine Op | AT&T Output | Notes |
|---|---|---|
| MACH_MOV | movq %src, %dst | Skips redundant mov %reg, %reg |
| MACH_ADD | addq %src2, %dst | Two-address form (dst = dst + src2) |
| MACH_SUB | subq %src2, %dst | Two-address form |
| MACH_IMUL | imulq %src2, %dst | Two-address form |
| MACH_NEG | negq %dst | Unary negation |
| MACH_CQO | cqo | Sign-extend RAX into RDX:RAX |
| MACH_IDIV | idivq %src1 | Signed division (RDX:RAX / src1) |
| MACH_LEA | leaq offset(%base), %dst | Load effective address |
| MACH_LOAD | movq offset(%base), %dst | Load from memory |
| MACH_STORE | movq %src, offset(%base) | Store to memory |
| MACH_CMP | cmpq %src2, %src1 | Compare (sets flags) |
| MACH_TEST | testq %src2, %src1 | Bitwise AND (sets flags) |
| MACH_SETcc | sete %dst8 | Set byte if condition (6 variants: E, NE, G, L, GE, LE) |
| MACH_MOVZX | movzbq %src8, %dst64 | Zero-extend byte to qword |
| MACH_AND | andq %src2, %dst | Bitwise AND |
| MACH_OR | orq %src2, %dst | Bitwise OR |
| MACH_NOT | notq %dst | Bitwise NOT |
| MACH_XOR | xorq %src2, %dst | Bitwise XOR |
| MACH_JMP | jmp .LBB_N | Unconditional jump |
| MACH_JE | je .LBB_N | Jump if equal |
| MACH_JNE | jne .LBB_N | Jump if not equal |
| MACH_CALL | sub $32, %rsp; call <sym>; add $32, %rsp / sub $32, %rsp; call *%reg; add $32, %rsp | Direct/indirect call. Shadow space on Windows only |
| MACH_TAILJMP | restore callee-saved; leave; home args; jmp func | Tail call optimization (no new return address) |
| MACH_RET | (triggers epilogue emission) | Return handled by epilogue |
| MACH_PUSH | pushq %src | Push to stack |
| MACH_POP | popq %dst | Pop from stack |
| MACH_LABEL | .LBB_N: | Block label |
| MACH_NOP | (skipped) | No operation |
Format strings (rodata):
.section .rdata,"dr" # COFF (Windows)
.section .rodata # ELF (Linux)
fmt_int: .asciz "%d\n"
fmt_str: .asciz "%s\n"
fmt_scan_int: .asciz "%d"
Global variables (.data):
.data
global_var: .quad 42 # Integer initializer
global_str: .quad .Lbs_0 # Baa string pointer initializer (ptr<char>)
global_fp: .quad جمع # Function pointer initializer (func address)
Linkage note (v0.3.7.5):
- Globals lowered from ساكن use internal linkage.
- ELF emission prints .local <symbol> for internal globals.
- Non-internal globals are exported with .globl <symbol>.
String tables (rodata):
.section .rdata,"dr" # COFF (Windows)
.section .rodata # ELF (Linux)
# C strings (i8*) for printf/scanf formats, etc.
.Lstr_0: .asciz "%d\n"
.Lstr_1: .asciz "%s\n"
# Baa strings (char[]) as packed .quad entries, null-terminated.
.Lbs_0:
.quad <packed 'م'>
.quad <packed 'ر'>
.quad <packed 'ح'>
.quad <packed 'ب'>
.quad <packed 'ا'>
.quad 0
The emitter translates Arabic function names to their C runtime equivalents:
| Baa Name | Assembly Name | Purpose |
|---|---|---|
| الرئيسية | main | Program entry point |
| اطبع | printf | Print function |
| اقرأ | scanf | Input function |
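The translation in the table is a plain lookup with a pass-through default. A sketch follows, assuming UTF-8 source encoding; `emit_symbol_name` is a hypothetical name and not a function from src/emit.c.

```c
#include <assert.h>
#include <string.h>

/* Returns the assembly symbol for a Baa function name. */
const char *emit_symbol_name(const char *baa_name)
{
    static const struct { const char *baa, *c; } map[] = {
        { "الرئيسية", "main"   },
        { "اطبع",     "printf" },
        { "اقرأ",     "scanf"  },
    };
    assert(baa_name != NULL);
    for (size_t i = 0; i < sizeof map / sizeof map[0]; i++)
        if (strcmp(baa_name, map[i].baa) == 0)
            return map[i].c;
    return baa_name;   /* other names are emitted unchanged */
}
```

Any name outside the table, such as جمع, comes back as-is, consistent with functions keeping their Arabic UTF-8 names as labels.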
This backend is being refactored to support multiple ABIs via BaaTarget (src/target.h).
- Windows x64 ABI: shadow space (32 bytes) around calls; first 4 args in RCX/RDX/R8/R9; return in RAX.
- SystemV AMD64 (Linux): no shadow space; first 6 args in RDI/RSI/RDX/RCX/R8/R9; return in RAX; varargs require AL=0 when no XMM args.
- AT&T syntax: Chosen for compatibility with GAS (GNU Assembler) which is the default on MinGW-w64.
- Redundant move elimination: The emitter skips mov %reg, %reg instructions that may result from register allocation.
- Callee-saved detection: Scans all instructions to determine which registers need preservation, minimizing prologue/epilogue overhead.
- Call frame management: Allocates shadow space on Windows; emits SysV call sequence on ELF targets.
- Size suffix inference: Determines instruction size suffix (q/l/w/b) from operand size_bits field.
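Size suffix inference reduces to a mapping from operand width to a mnemonic suffix. A minimal sketch, assuming size_bits only takes the four standard widths; `att_size_suffix` is a hypothetical name.

```c
#include <assert.h>

/* Maps an operand width in bits to the AT&T mnemonic suffix. */
char att_size_suffix(int size_bits)
{
    switch (size_bits) {
    case 64: return 'q';
    case 32: return 'l';
    case 16: return 'w';
    case 8:  return 'b';
    default:
        assert(!"unsupported operand width");
        return 'q';     /* fallback when assertions are disabled */
    }
}
```

An emitter could append the result to a base mnemonic, so that mov with a 32-bit operand becomes movl.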
Entry Points:
- emit_module(): Top-level entry point for complete assembly file
- emit_func(): Emits single function with prologue/epilogue
- emit_inst(): Translates individual machine instruction
Testing: Integration testing via full compilation pipeline (no standalone unit tests yet).
| Section | Contents |
|---|---|
| .data | Global variables (mutable) |
| .rdata / .rodata | String literals (read-only) |
| .text | Executable code |
Strings are collected during parsing and emitted with unique labels:
# COFF (Windows)
.section .rdata,"dr"
.LC0:
.asciz "مرحباً"
.LC1:
.asciz "العالم"
# ELF (Linux)
.section .rodata
.LC0:
.asciz "مرحباً"
.LC1:
.asciz "العالم"
| Aspect | Details |
|---|---|
| Entry Point | الرئيسية → exported as main |
| Name Mangling | None - functions use their Arabic UTF-8 names as assembly labels |
| Special Case | الرئيسية is explicitly exported as main using .globl main |
| Main with args (v0.3.12.5) | If the user defines صحيح الرئيسية(صحيح عدد، نص[] معاملات), the compiler lowers the user function as __baa_user_main and emits an ABI wrapper named الرئيسية (exported as main). The wrapper converts C char** argv into Baa نص[] (حرف[] packed UTF-8) before calling __baa_user_main. |
| Custom startup (v0.3.12.5) | --startup=custom selects a custom entry symbol __baa_start (driver injects a small startup stub and links with -Wl,-e,__baa_start). The stub delegates to CRT/libc startup (mainCRTStartup on Windows, __libc_start_main on Linux). |
| External Calls | C runtime (printf, etc.) via toolchain symbol resolution |