FCC (Forth Compiler) is a minimal and pedagogical Forth compiler written in C that generates assembly code for the FASM assembler. It supports a Forth-like syntax and serves as both a learning tool for compiler construction and a functional compiler for simple programs.
- Author: Chris Curl
- License: MIT License (c) 2025
- Language: C
- Target: 32-bit x86 Assembly (FASM format)
- fcl: Forth compiler for Linux systems
- fcw: Forth compiler for Windows systems
The compiler follows a streamlined three-phase approach:
- IRL Generation - Parse source and generate Intermediate Representation Language (IRL)
- Optimization - Perform peephole optimizations on the IRL
- Code Generation - Output assembly code for Linux x86
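To make the optimization phase above more concrete, here is a minimal sketch of one peephole rule applied to an array of IRL instructions. The `IRL_T` layout, the `OP_*` encoding, and the `peephole()` function are assumptions made for illustration (only the names PUSHA, POPA, LIT, and ADD come from the compiler's IRL), and the rule assumes that a PUSHA immediately followed by a POPA leaves the machine state unchanged. The rules FCC actually applies may differ.

```c
#include <stddef.h>

/* Hypothetical opcode subset; names borrowed from the IRL, encoding assumed. */
enum { OP_NOP, OP_PUSHA, OP_POPA, OP_LIT, OP_ADD };

typedef struct {
    int op;   /* one of the OP_* values */
    int arg;  /* operand, used only by OP_LIT in this sketch */
} IRL_T;

/* Removes adjacent PUSHA/POPA pairs in place and returns the new length.
   Comparing against the last instruction kept (not the previous input)
   lets nested pairs such as PUSHA PUSHA POPA POPA collapse completely. */
size_t peephole(IRL_T *code, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 && code[out - 1].op == OP_PUSHA && code[i].op == OP_POPA) {
            out--;      /* cancel the pair */
            continue;
        }
        code[out++] = code[i];
    }
    return out;
}
```

Because the pass works on a small window of neighboring instructions and rewrites the array in place, it needs no extra storage and can run once between IRL generation and code generation.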
#define VARS_SZ 500 // Maximum number of variables/symbols
#define STRS_SZ 500 // Maximum number of string literals
#define CODE_SZ 5000 // Maximum number of IRL instructions
#define HEAP_SZ 5000 // Maximum number of characters in the HEAP

typedef struct {
char type; // 'I'=Integer, 'F'=Function, 'T'=Target
char name[23]; // Symbol name
char asmName[8]; // Generated assembly name
int sz; // Size in bytes
} SYM_T;

typedef struct {
char name[32]; // Generated string name (S1, S2, etc.)
char *val; // String value (heap allocated)
} STR_T;

- next_ch() - Advances to next character, handles line reading and EOF
- next_line() - Reads next line from input file
- next_token() - Extracts next token, handles comments (//) and numbers
- checkNumber(char *w, int base) - Parses numbers in multiple bases:
  - Binary: %1010 (prefix %)
  - Decimal: #123 or 123 (prefix # or none)
  - Hexadecimal: $FF (prefix $)
  - Character literals: 'Y' (single quotes)
  - Supports negative numbers with - prefix
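The literal forms listed above can be sketched in a few lines of C. The function below is a hypothetical stand-in, not the compiler's actual checkNumber(): the name `parse_number`, its return convention, and the `out` parameter are assumptions. It also assumes that a minus sign follows the base prefix (for example `#-5`), that `base` is 2, 10, or 16, and it does no overflow checking.

```c
/* Hypothetical stand-in for checkNumber(): returns 1 and stores the value
   in *out when w is a complete numeric literal, otherwise returns 0. */
int parse_number(const char *w, int base, long *out) {
    /* Character literal such as 'Y' */
    if (w[0] == '\'' && w[1] != '\0' && w[2] == '\'' && w[3] == '\0') {
        *out = (unsigned char)w[1];
        return 1;
    }

    /* A leading prefix overrides the default base */
    if (*w == '%') { base = 2; w++; }
    else if (*w == '#') { base = 10; w++; }
    else if (*w == '$') { base = 16; w++; }

    /* Assumption: the sign, if any, comes after the prefix */
    int negative = 0;
    if (*w == '-') { negative = 1; w++; }
    if (*w == '\0') return 0;   /* a bare "-" or bare prefix is not a number */

    long value = 0;
    for (; *w; w++) {
        int digit;
        if (*w >= '0' && *w <= '9') digit = *w - '0';
        else if (*w >= 'a' && *w <= 'f') digit = *w - 'a' + 10;
        else if (*w >= 'A' && *w <= 'F') digit = *w - 'A' + 10;
        else return 0;
        if (digit >= base) return 0;   /* digit not valid in this base */
        value = value * base + digit;
    }
    *out = negative ? -value : value;
    return 1;
}
```

Returning 0 for anything that is not a full literal lets the caller fall through to a symbol lookup, so a word such as `dup` or a lone `-` is never mistaken for a number.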
- findSymbol(char *name, char type) - Locates symbol by name and type
- addSymbol(char *name, char type) - Adds new symbol to table
- genTargetSymbol() - Generates unique target labels (Tgt1, Tgt2, etc.)
- addString(char *str) - Adds string literal to string table
- dumpSymbols() - Outputs symbol declarations in assembly format
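As an illustration of how the symbol functions above might fit together, the following sketch implements lookup, insertion, and target-label generation over a fixed-size table. It reuses the SYM_T layout and VARS_SZ limit shown earlier, but the function names (`find_symbol`, `add_symbol`, `gen_target_symbol`), the return values, the generated `asmName` scheme, and the 4-byte default size are all assumptions rather than the compiler's actual behavior.

```c
#include <stdio.h>
#include <string.h>

#define VARS_SZ 500

typedef struct {
    char type;        /* 'I'=Integer, 'F'=Function, 'T'=Target */
    char name[23];    /* symbol name */
    char asmName[8];  /* generated assembly name */
    int  sz;          /* size in bytes */
} SYM_T;

static SYM_T syms[VARS_SZ];
static int numSyms = 0;
static int numTargets = 0;

/* Linear search by name and type; returns the index, or -1 when absent. */
int find_symbol(const char *name, char type) {
    for (int i = 0; i < numSyms; i++) {
        if (syms[i].type == type && strcmp(syms[i].name, name) == 0) return i;
    }
    return -1;
}

/* Adds a symbol and returns its index. An existing symbol is reused.
   Returns -1 when the table is full or the name does not fit. */
int add_symbol(const char *name, char type) {
    int i = find_symbol(name, type);
    if (i >= 0) return i;
    if (numSyms >= VARS_SZ || strlen(name) >= sizeof syms[0].name) return -1;
    SYM_T *s = &syms[numSyms];
    s->type = type;
    strcpy(s->name, name);
    /* Assumed naming scheme: type letter plus table index, e.g. "I0" */
    snprintf(s->asmName, sizeof s->asmName, "%c%d", type, numSyms);
    s->sz = 4;        /* assumed: one 32-bit cell */
    return numSyms++;
}

/* Creates a fresh jump-target symbol named Tgt1, Tgt2, ... */
int gen_target_symbol(void) {
    char name[16];
    snprintf(name, sizeof name, "Tgt%d", ++numTargets);
    return add_symbol(name, 'T');
}
```

A linear search is a reasonable fit for a table capped at a few hundred entries, and checking the bounds before every insertion keeps the fixed-size arrays from overflowing.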
The compiler uses an internal instruction set:
Stack Operations:
- PUSHA, POPA - Push/pop accumulator
- SWAP, SP4 - Stack manipulation
- POPB - Pop to second register
Memory Operations:
- STORE, FETCH - 32-bit memory store/load
- CSTORE, CFETCH - 8-bit (byte) memory store/load
- LOADSTR - Load string address
Arithmetic:
- ADD, SUB, MULT, DIVIDE - Basic arithmetic
- DIVMOD - Division with both quotient and remainder
- LT, GT, EQ, NEQ - Comparisons
- AND, OR, XOR - Bitwise operations
Control Flow:
- TESTA - Test accumulator against zero
- JMP, JMPZ, JMPNZ - Conditional/unconditional jumps
- TARGET - Jump target labels
- DEF, CALL, RETURN - Function definition and calls
Register Operations:
- MOVAB, MOVAC, MOVAD - Copy accumulator to EBX, ECX, EDX
- SYS - System call interrupt
Special:
- LIT - Literal values
- PLEQ - Plus-equals operation (+!)
- INCTOS, DECTOS - Increment/decrement top of stack
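The control-flow instructions above suggest how a conditional such as the language's `if ... then` could be lowered: test the accumulator, jump over the body when it is zero, and place a target label at the end. The sketch below shows one way to do that with a small stack of pending labels so that nested conditionals resolve correctly. The function names, the textual IRL format, and the `MAX_NEST` limit are hypothetical, and any stack cleanup for the consumed condition is omitted.

```c
#include <stdio.h>

#define MAX_NEST 32   /* assumed nesting limit for this sketch */

static int pendingTargets[MAX_NEST];  /* labels awaiting a matching "then" */
static int depth = 0;
static int nextTarget = 0;

/* "if": emit a test and a forward jump to a label that does not exist yet.
   Returns the label number, or -1 when nesting is too deep. */
int compile_if(FILE *out) {
    if (depth >= MAX_NEST) return -1;
    int t = ++nextTarget;
    pendingTargets[depth++] = t;
    fprintf(out, "TESTA\nJMPZ Tgt%d\n", t);
    return t;
}

/* "then": resolve the innermost pending jump by emitting its label.
   Returns the label number, or -1 when no "if" is open. */
int compile_then(FILE *out) {
    if (depth == 0) return -1;
    int t = pendingTargets[--depth];
    fprintf(out, "TARGET Tgt%d\n", t);
    return t;
}
```

Calling `compile_if(stdout)` twice and then `compile_then(stdout)` twice emits labels in the order Tgt2, Tgt1, which is the last-in, first-out pairing that nested conditionals require.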
Variables:
var myVar // Declare integer variable
Functions:
: myFunc // Function definition
42 myVar ! // Store 42 in myVar
; // End function
Control Structures:
condition if // Conditional execution
// code
then
begin // Loops
// code
condition
while // While loop
again // Infinite loop
until // Until loop
Stack Operations:
42 // Push literal
dup // Duplicate top
drop // Remove top
swap // Swap top two
over // Copy second to top
Memory Operations:
@ // Fetch 32-bit value from address
! // Store 32-bit value to address
c@ // Fetch 8-bit (byte) value from address
c! // Store 8-bit (byte) value to address
+! // Add to memory location
1+ 1- // Increment/decrement TOS
Register and System Operations:
->reg1 // Copy TOS to EAX (no-op, already in EAX)
->reg2 // Copy TOS to EBX
->reg3 // Copy TOS to ECX
->reg4 // Copy TOS to EDX
sys // Execute system call (INT 0x80)
Arithmetic and Logic:
+ - * / // Basic arithmetic
/mod // Division with quotient and remainder
< = <> > // Comparisons
AND OR XOR // Bitwise operations
Source Code Comments:
// // Comment until the end of the line
( ... ) // In-line comment
- Direct system calls via sys command
- ELF executable format
- No external library dependencies
- Custom function call convention using EBP stack
cd fcl # Change directory
make fcl # Compile the fcl program
./fcl > output.asm # Compile fcl.fth to assembly code
./fcl myfile.fth > output.asm # Compile specific file to assembly code
fasm output.asm program # Assemble to executable using FASM
chmod +x program # Make the program executable
./program # Run the program
- Syntax errors show line number, column, and source context
- Fatal errors terminate compilation
- Warnings are displayed as comments in output
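A diagnostic that carries the line number, column, and source context can be produced with a single formatted write. The sketch below is only an illustration: the function names and the exact message wording are assumptions, not FCC's actual output. The warning helper relies on the fact that FASM treats text after `;` as a comment, so a warning written into the generated assembly does not affect assembly.

```c
#include <stdio.h>

/* Formats a syntax error as three lines: the message with its position,
   the offending source line, and a caret under column col (1-based).
   Assumes src contains no tabs, which would misalign the caret. */
int format_error(char *buf, size_t cap, const char *msg,
                 int line, int col, const char *src) {
    return snprintf(buf, cap, "Error at line %d, column %d: %s\n%s\n%*s\n",
                    line, col, msg, src, col, "^");
}

/* Formats a warning as an assembly comment so it can be written
   straight into the generated output. */
int format_warning(char *buf, size_t cap, const char *msg, int line) {
    return snprintf(buf, cap, "; WARNING (line %d): %s\n", line, msg);
}
```

The `%*s` conversion right-aligns the caret in a field `col` characters wide, which places it directly beneath the reported column without building the padding by hand.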
var counter
var limit
: increment counter @ 1+ counter ! ;
: mil ( n--m ) 1000 dup * * ;
: main
0 counter !
1 mil limit !
begin
counter @ .d " " puts
increment
counter @ limit @ =
until
"Done!" puts
;
- Input Processing - Read source file (defaults to fcl.fth if no argument provided)
- IRL Generation - Parse declarations and generate intermediate representation
- Optimization - Perform peephole optimizations on IRL
- Code Generation - Output ELF assembly with startup code and runtime support
- Symbol Output - Generate variable and string declarations in data section
- No built-in I/O functions (must use system calls via sys)
- Limited error checking and recovery
- No floating-point support
- Fixed-size tables and heap
- Basic optimization only (peephole)
- else clause not yet implemented
- Byte and word memory access (c@, c!, @, !)
- Direct system call support via register operations
- Multi-base number literals (binary, decimal, hex, character)
- Integrated optimization pass
- Compact, self-contained compiler
- Clean separation of IRL generation and code emission
- Stack-based execution model with register access
This compiler serves as an example of a minimal but functional compiler implementation, demonstrating core compiler concepts in a clear and understandable way.