| File | Entity | Role |
|---|---|---|
rtl/leaf.vhdl |
leaf |
Top-level: Wishbone interface, clock gating, counters, COP interface passthrough |
rtl/wb_ctrl.vhdl |
wb_ctrl |
Wishbone B4 master FSM |
rtl/clk_ctrl.vhdl |
clk_ctrl |
Clock gating |
rtl/counters.vhdl |
counters |
mcycle, time, instret counters |
rtl/core.vhdl |
core |
Core integration: IF + ID/EX pipeline |
rtl/if_stage.vhdl |
if_stage |
Instruction fetch, PC register, flush |
rtl/id_stage.vhdl |
id_stage |
Decode, register file, CSRs |
rtl/main_ctrl.vhdl |
main_ctrl |
Main control decoder and immediate generator |
rtl/reg_file.vhdl |
reg_file |
32×32 register file |
rtl/csrs.vhdl |
csrs |
Machine CSRs and trap control |
rtl/ex_block.vhdl |
ex_block |
ALU, branch, CSR logic, load/store |
rtl/alu_ctrl.vhdl |
alu_ctrl |
ALU operation decoder |
rtl/alu.vhdl |
alu |
ALU datapath |
rtl/br_detector.vhdl |
br_detector |
Branch condition evaluation |
rtl/dmls_block.vhdl |
dmls_block |
Data memory load/store alignment |
rtl/csrs_logic.vhdl |
csrs_logic |
CSR write data muxing |
rtl/leaf_pkg.vhdl |
leaf_pkg |
ISA constants, opcodes, ALU ops, component declarations |
Leaf implements a two-stage pipeline with Wishbone B4 bus interface:
┌──────────────────────────────────────┐
│ leaf (top) │
│ ┌──────────┐ ┌──────────────────┐ │
clk_i ────────────┼─▶│clk_ctrl │─▶│ core │ │
rst_i ────────────┼─▶│ │ │ ┌─────────────┐ │ │
│ └──────────┘ │ │ IF Stage │ │ │
│ ┌──────────┐ │ │ (if_stage) │ │ │
ack_i ────────────┼─▶│ wb_ctrl │◀─┼──│ • PC fetch │ │ │
err_i ────────────┼─▶│ (FSM) │──┼──│ • imem rd │ │ │
dat_i ◀───────────┼──│ │ │ │ • flush │ │ │
│ └──────────┘ │ └──────┬──────┘ │ │
│ │ │pipeline │ │
│ ┌──────────┐ │ ┌──────▼──────┐ │ │
│ │ counters │ │ │ ID/EX │ │ │
│ │ (cycle, │ │ │ (id_stage + │ │ │
│ │ time, │ │ │ ex_block) │ │ │
│ │ instret)│ │ │ • decode │ │ │
│ └──────────┘ │ │ • reg file │ │ │
│ │ │ • CSR │ │ │
│ │ │ • ALU │ │ │
│ │ │ • branch │ │ │
│ │ │ • load/store│ │ │
│ │ └─────────────┘ │ │
└──────────────────────────────────────┘
IF stage writes to pipeline registers on each clock; ID/EX operates combinatorially from those registers and writes results back in the same cycle. Both stages advance together — there is no independent stall per stage.
leaf (top)
├── wb_ctrl Wishbone B4 master FSM
├── clk_ctrl Glitch-free clock gating
├── counters cycle, time, instret counters
└── core Core (IF + ID/EX pipeline)
├── if_stage Instruction fetch (IF)
├── id_stage Decode + register file + CSRs (ID)
│ ├── main_ctrl Instruction decoder and immediate generator
│ ├── reg_file 32 × XLEN register file
│ └── csrs Machine-mode CSRs and trap logic
└── ex_block ALU + branch + load/store (EX)
├── alu_ctrl ALU operation decoder
├── alu ALU datapath (bypass chain)
├── br_detector Branch condition evaluation
├── dmls_block Data memory load/store alignment
└── csrs_logic CSR write data mux
| Domain | Signal | Source | Consumers |
|---|---|---|---|
| Free-running | clk_i |
External input | wb_ctrl, counters, clk_ctrl |
| Gated | clk |
clk_ctrl(clk_i, clk_en) |
core (pipeline) |
| Component | Reset Signal | Source | Deassertion |
|---|---|---|---|
wb_ctrl |
rst_i |
External | Immediate after rst_i |
clk_ctrl |
rst_i |
External | Immediate (clock forced on during reset) |
counters |
rst_i |
External | Immediate after rst_i |
core |
reset |
wb_ctrl |
1 cycle after rst_i (when FSM exits START) |
The core's reset is derived from the Wishbone FSM START state, introducing a 1-cycle skew relative to rst_i.
File: rtl/leaf.vhdl
| Generic | Default | Description |
|---|---|---|
RESET_ADDR |
0x00000000 |
Reset vector address |
CSRS_MHART_ID |
0x00000000 |
Machine hart ID (mhartid CSR) |
REG_FILE_SIZE |
32 | Register file size (16 or 32) |
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Master clock (50 MHz, 20 ns) |
rst_i |
in | 1 | Asynchronous reset (active high) |
ex_irq_i |
in | 1 | External interrupt (level-sensitive) |
sw_irq_i |
in | 1 | Software interrupt (level-sensitive) |
tm_irq_i |
in | 1 | Timer interrupt (level-sensitive) |
ack_i |
in | 1 | Wishbone acknowledge |
err_i |
in | 1 | Wishbone error |
dat_i |
in | XLEN | Wishbone read data bus |
cop_dat_i |
in | XLEN | Coprocessor read data (default 0) |
cop_adr_o |
out | 6 | Coprocessor address (CSR address offset) |
cop_dat_o |
out | XLEN | Coprocessor write data |
cop_we_o |
out | 1 | Coprocessor write strobe |
cyc_o |
out | 1 | Wishbone cycle |
stb_o |
out | 1 | Wishbone strobe |
we_o |
out | 1 | Wishbone write enable |
sel_o |
out | 4 | Wishbone byte selects |
adr_o |
out | XLEN | Wishbone address |
dat_o |
out | XLEN | Wishbone write data |
┌─────────────────────────────────────────────────────────────────┐
│ leaf │
│ │
│ clk_i ──▶ clk_ctrl ──clk──┐ │
│ rst_i ──▶ clk_ctrl │ │
│ ┌────┴────┐ │
│ ack_i ──┐ │ │ │
│ err_i ──┤ ┌───────┐ │ core │ │
│ dat_i ◀─┼──┤wb_ctrl├─┘ IF │ │
│ │ │ B4 │◀─── ID+CSR │ │
│ cyc_o ──┤ │ FSM │───▶ EX │ │
│ stb_o ──┤ └───────┘ │ │ │
│ we_o ───┤ │ │ │
│ adr_o ──┤ ┌──┴────────┘ │
│ dat_o ──┘ │ retire │
│ ┌──────▼─────┐ │
│ │ counters │ │
│ │ cycle ─────┼──▶ core │
│ │ timer ─────┼──▶ core │
│ │ instret ───┼──▶ core │
│ └────────────┘ │
│ │
│ cop_dat_i ──▶ core cop_adr_o ◀── core │
│ cop_dat_o ◀── core cop_we_o ◀── core │
└─────────────────────────────────────────────────────────────────┘
leaf.vhdl
┌────────────────────────────────────┐
│ │
│ ┌──────────────┐ │
│ │ wb_ctrl │ ◀── imrd_en │
│ │ │ ◀── dmrd_en │
│ │ │ ◀── dmwr_en │
│ │ │ ◀── imrd_addr │
│ │ (arbitrates)│ ◀── dmrw_addr │
│ │ │ ◀── dmwr_data │
│ │ │ ◀── dmwr_be │
│ │ │ │
│ │ ──▶ imrd_err ────▶ core │
│ │ ──▶ dmrd_err ────▶ core │
│ │ ──▶ dmwr_err ────▶ core │
│ │ ──▶ imrd_data ──▶ core │
│ │ ──▶ dmrd_data ──▶ core │
│ │ ──▶ clk_en ──▶ clk_ctrl │
│ │ ──▶ reset ──▶ core │
│ └──────────────┘ │
│ │
│ ┌──────────────┐ │
│ │ clk_ctrl │ ──▶ clk ──▶ core│
│ └──────────────┘ │
│ │
│ ┌──────────────┐ │
│ │ counters │ ──▶ cycle ──▶ core│
│ │ │ ──▶ timer ──▶ core│
│ │ │ ──▶ instret ─▶ core│
│ └──────────────┘ │
│ │
│ cop_adr_o ◀────── core (direct) │
│ cop_dat_o ◀────── core (direct) │
│ cop_we_o ◀────── core (direct) │
│ cop_dat_i ──────▶ core (direct) │
└────────────────────────────────────┘
The COP interface bypasses wb_ctrl — it is a private channel between core and external coprocessor. No bus arbitration or error handling is performed on this path.
wb_ctrlreceiveserr_ifrom Wishbone slave- FSM transitions to
ERRORstate - Combinatorial logic asserts
imrd_err,dmrd_err, ordmwr_errbased on current enable signals - Error signals propagate to
core:imrd_err→if_stage→ setsimrd_faultin pipeline registerdmrd_err/dmwr_err→ex_block→ setsdmld_fault/dmst_fault
id_stagedetects fault in decode →csrstriggers exception- FSM returns to
IDLEon next clock
File: rtl/wb_ctrl.vhdl
Implements a Wishbone B4-compatible master with a single-cycle arbitration FSM.
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock (free-running) |
rst_i |
in | 1 | Asynchronous reset (active high) |
imrd_en_i |
in | 1 | Instruction fetch enable (from core) |
dmrd_en_i |
in | 1 | Data read enable (from core) |
dmwr_en_i |
in | 1 | Data write enable (from core) |
ack_i |
in | 1 | Wishbone acknowledge |
err_i |
in | 1 | Wishbone error |
dat_i |
in | XLEN | Wishbone read data |
dmwr_be_i |
in | 4 | Data write byte enables (from core) |
imrd_addr_i |
in | XLEN | Instruction fetch address (from core) |
dmrw_addr_i |
in | XLEN | Data memory address (from core) |
dmwr_data_i |
in | XLEN | Data write data (from core) |
cyc_o |
out | 1 | Wishbone cycle |
stb_o |
out | 1 | Wishbone strobe |
we_o |
out | 1 | Wishbone write enable |
clk_en_o |
out | 1 | Clock enable (to clk_ctrl) |
reset_o |
out | 1 | Core reset (to core) |
imrd_err_o |
out | 1 | Instruction fetch bus error (to core) |
dmrd_err_o |
out | 1 | Data read bus error (to core) |
dmwr_err_o |
out | 1 | Data write bus error (to core) |
sel_o |
out | 4 | Wishbone byte selects |
adr_o |
out | XLEN | Wishbone address |
dat_o |
out | XLEN | Wishbone write data |
imrd_data_o |
out | XLEN | Instruction data (to core) |
dmrd_data_o |
out | XLEN | Read data (to core) |
| State | Description |
|---|---|
START |
Initial reset state, asserts internal reset |
IDLE |
Waits for imem request (imrd_en) |
READ_INSTR |
Instruction fetch cycle, waits for ack_i or err_i |
BRD_CYCLE |
Transition to data read |
READ_DATA |
Data read cycle, waits for ack_i or err_i |
RMW_CYCLE |
Transition to data write |
WRITE_DATA |
Data write cycle, waits for ack_i or err_i |
EXECUTE |
Single-cycle execute — clock gating enabled, bus released |
ERROR |
Bus error response — signals error to core |
Read-modify-write is used for stores: the FSM goes READ_INSTR → RMW_CYCLE → WRITE_DATA → EXECUTE, ensuring the bus is acquired for the full memory operation.
File: rtl/clk_ctrl.vhdl
Generates a glitch-free gated clock using a transparent latch (enable sampled on falling edge) + AND gate.
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Master clock |
rst_i |
in | 1 | Reset (forces clock on during reset) |
clk_en |
in | 1 | Clock enable (from wb_ctrl) |
clk |
out | 1 | Gated clock (to core) |
The clk_en is asserted when the Wishbone FSM is in START, EXECUTE, or ERROR states — meaning the core clock is stopped during bus transactions and running when the pipeline has work to do.
clk_i ─┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──
clk_en ─┐ └──────┐ └──────┐ └──────┐ └──
en_latch ─┐ └──────┐ └──────┐ └──────┐ └──
clk ─┐──┐ └──┐──┐ └──┐──┐ └──┐──┐ └──┐──┐
FETCH EXEC FETCH EXEC FETCH EXEC FETCH EXEC
File: rtl/counters.vhdl
Tracks three 64-bit values: mcycle (free-running, resettable), time (free-running, no reset), minstret (increments on instruction retire).
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock (free-running, not gated) |
reset_i |
in | 1 | Reset |
retire_i |
in | 1 | Instruction retire pulse (from core) |
cycle_o |
out | 64 | Cycle counter value (CSR 0xC00/0xC80) |
timer_o |
out | 64 | Timer value (CSR 0xC01/0xC81) |
instret_o |
out | 64 | Instruction retired counter (CSR 0xC02/0xC82) |
| Counter | CSR (low) | CSR (high) | Reset | Behavior |
|---|---|---|---|---|
mcycle |
0xC00 |
0xC80 |
Yes | Increments every clk_i cycle (free-running) |
time |
0xC01 |
0xC81 |
No | Increments every clk_i cycle (free-running, separate register) |
minstret |
0xC02 |
0xC82 |
Yes | Increments on instruction retire (retire_i) |
The time counter has no reset — it counts continuously from power-on as a free-running real-time clock, independent of the core's operating state.
The retire pulse is generated in if_stage.vhdl as:
retire_o <= pcwr_en_i and not flush_reg;flush_reg is the registered version of flush (captured in the pipeline register). Since flush_reg reflects the flush from the previous cycle (when the instruction was fetched), a current taken branch has flush_reg = 0 and is counted. The speculatively fetched instruction after the branch has flush_reg = 1 and is not counted.
This counts one instruction per valid pipeline advance:
- Normal instructions: counted on each pipeline cycle
- Taken branches: branch is counted, next instruction (flushed) is not
- Traps: trap-causing instruction (ecall/ebreak) is counted
- Stalls: no count when pipeline is stalled (pcwr_en = '0')
- Bus errors: faulted instruction is not counted (flush = '1')
File: rtl/core.vhdl
Integrates IF stage, ID stage, and execution block into a two-stage pipeline.
core.vhdl (pipeline flow →)
imem ┌──────────┐ pc, next_pc ┌──────────┐ control ┌──────────┐
─────────────▶│ if_stage │ instr, flush │ id_stage │ signals │ ex_block │
imrd_data_i │ │────────────────▶│ │────────────▶│ │
│ pc_reg │ imrd_fault │ main_ctrl│ func3/7 │ alu_ctrl │──▶ alu_res
│ flush │────────────────▶│ reg_file │ imm/jmp │ alu │──▶ dmld_data
│ retire │◀────────────────│ csrs │◀────────────│ dmls │──▶ csrwr_data
└──────────┘ pcwr_en └──────────┘ res/dmld │ br_det │──▶ taken
▲ ▲ └──────────┘──▶ target
│ │
└─── taken, target ──────────┘
| Generic | Default | Description |
|---|---|---|
RESET_ADDR |
0x00000000 |
Reset vector address |
CSRS_MHART_ID |
0x00000000 |
Machine hart ID |
REG_FILE_SIZE |
32 | Register file size (16 or 32) |
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Gated clock (from clk_ctrl) |
reset_i |
in | 1 | Core reset (from wb_ctrl, 1 cycle after rst_i) |
ex_irq_i |
in | 1 | External interrupt |
sw_irq_i |
in | 1 | Software interrupt |
tm_irq_i |
in | 1 | Timer interrupt |
imrd_err_i |
in | 1 | Instruction memory bus error |
dmrd_err_i |
in | 1 | Data read bus error |
dmwr_err_i |
in | 1 | Data write bus error |
imrd_data_i |
in | XLEN | Instruction data from Wishbone |
dmrd_data_i |
in | XLEN | Data read data from Wishbone |
cycle_i |
in | 64 | Cycle counter value |
timer_i |
in | 64 | Timer value |
instret_i |
in | 64 | Instruction retired counter value |
cop_dat_i |
in | XLEN | Coprocessor read data |
cop_adr_o |
out | 6 | Coprocessor address |
cop_dat_o |
out | XLEN | Coprocessor write data |
cop_we_o |
out | 1 | Coprocessor write enable |
retire_o |
out | 1 | Instruction retire pulse |
imrd_en_o |
out | 1 | Instruction fetch enable |
dmrd_en_o |
out | 1 | Data read enable |
dmwr_en_o |
out | 1 | Data write enable |
dmwr_be_o |
out | 4 | Data write byte enables |
imrd_addr_o |
out | XLEN | Instruction fetch address |
dmrw_addr_o |
out | XLEN | Data memory address |
dmwr_data_o |
out | XLEN | Data write data |
File: rtl/if_stage.vhdl
Manages the program counter, instruction fetch request, and pipeline flush logic.
if_stage.vhdl (flow →)
taken ───┐
target ──┼─────┐
pcwr_en ─┤ │ ┌──────────────────┐
imrd_err ┤ ├───▶ pc_reg_proc │──── pc_reg ──▶ imrd_addr_o
reset_i ─┤ │ │ MUX(0:RESET, │ │
│ │ │ 1:target, │ │
│ │ │ 2:next_res, │ ├──────▶ next_res (PC+4)
│ │ │ 3:hold) │ │
│ │ └──────────────────┘ │
│ │ ├──────▶ imrd_en_o
│ │ ┌──────────────────┐ │
└─────┼───▶ flush_val │ │
│ │ taken or err │───────┼──────▶ flush_reg ──▶ flush_o
│ │ or not pcwr_en │ │
│ └──────────────────┘ │
│ │
imrd_data_i ─────────────┼──────────────────────────────┘
│ ┌──────────────────┐
└───▶ out_pipe_proc │──── pc_o
│ (pipeline reg) │──── next_pc_o
│ │──── instr_o
│ │──── imrd_fault_o
└──────────────────┘
retire_o <= pcwr_en_i and not flush_reg
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock |
reset_i |
in | 1 | Synchronous reset (active high) |
pcwr_en_i |
in | 1 | Pipeline advance enable |
imrd_err_i |
in | 1 | Instruction memory bus error |
taken_i |
in | 1 | Branch/jump taken (from ex_block) |
target_i |
in | XLEN | Branch/jump target address |
imrd_data_i |
in | XLEN | Instruction data from Wishbone |
imrd_en_o |
out | 1 | Instruction fetch request (to wb_ctrl) |
imrd_fault_o |
out | 1 | Instruction bus fault (to pipeline register) |
flush_o |
out | 1 | Discard current pipeline instruction |
retire_o |
out | 1 | Instruction retire pulse (= pcwr_en_i and not flush_reg) |
imrd_addr_o |
out | XLEN | Fetch address (to wb_ctrl) |
pc_o |
out | XLEN | Current PC (to ID/EX) |
next_pc_o |
out | XLEN | PC + 4 (to ID/EX) |
instr_o |
out | XLEN | Fetched instruction (to ID/EX) |
pc_regholds the current PC, updated everyclk_iviapc_reg_procnext_resis PC+4 (combinatorial)flush_valis the combinatorial flush value (taken_i or imrd_err_i or not pcwr_en_i)flush_regcapturesflush_valin the pipeline register — represents the validity of the current instruction- Pipeline register (
out_pipe_proc) capturespc_o,next_pc_o,instr_o,flush_reg,imrd_fault_oon the rising clock edge imrd_en_o = pcwr_en_i— fetch active whenever pipeline advancesimrd_addr_o = pc_reg— fetch address always reflects the current PCretire_o = pcwr_en_i and not flush_reg— retire pulse, indicates valid instruction completed
- Reset:
pc_reg <= RESET_ADDR - Branch taken (
taken_i = '1'):pc_reg <= target_i - Pipeline advance (
pcwr_en_i = '1'):pc_reg <= next_res - Stall (no condition above):
pc_regholds value
File: rtl/id_stage.vhdl
Combines instruction decode, register file read, and CSR access. Passes decoded control signals to ex_block.
id_stage.vhdl (flow →)
instr_i ──▶ field extraction
├── func3(14:12) ──▶ func3_o
├── func7(31:25) ──▶ func7_o
├── rs1(19:15) ──▶ reg_file.rd_addr0
├── rs2(24:20) ──▶ reg_file.rd_addr1
├── rd(11:7) ──▶ reg_file.wr_addr
└── csr(31:20) ──▶ csrs.rw_addr
instr_i ──▶ main_ctrl ──▶ imm_o, jmp_o, br_en_o,
opcode decode opd_src_sel, pass,
imm gen ftype, op_en,
dmls_mode, dmls_en
┌─────────── reg_file (32 × XLEN) ────────────┐
│ rd_addr0 ◀── rs1 rd_data0 ──▶ rd_data0_o
│ rd_addr1 ◀── rs2 rd_data1 ──▶ rd_data1_o
│ wr_addr ◀── rd │
│ wr_data0 ◀── exec_res (from ex_block) │
│ wr_data1 ◀── dmld_data (from ex_block) │
│ wr_data2 ◀── next_pc (from if_stage) │
│ wr_data3 ◀── csrrd_data (from csrs) │
└──────────────────────────────────────────────┘
csrs (CSR registers + trap logic)
├── wr_data ◀── csrwr_data_i (from ex_block)
├── rw_addr ◀── instr(31:20)
├── rd_data ──▶ csrrd_data_o
├── trap_taken_o, trap_target_o ──▶ ex_block
├── pcwr_en_o ────────────────────▶ if_stage (pipeline advance)
├── cop_adr_o, cop_dat_o, cop_we_o ──▶ external COP
└── faults/irqs ──▶ exception decode ──▶ mepc, mcause, mtval
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock |
reset_i |
in | 1 | Synchronous reset (active high) |
ex_irq_i |
in | 1 | External interrupt |
sw_irq_i |
in | 1 | Software interrupt |
tm_irq_i |
in | 1 | Timer interrupt |
imrd_malgn_i |
in | 1 | Instruction fetch misaligned |
imrd_fault_i |
in | 1 | Instruction fetch bus fault |
dmld_malgn_i |
in | 1 | Data load misaligned |
dmld_fault_i |
in | 1 | Data load bus fault |
dmst_malgn_i |
in | 1 | Data store misaligned |
dmst_fault_i |
in | 1 | Data store bus fault |
cycle_i |
in | 64 | Cycle counter value |
timer_i |
in | 64 | Timer value |
instret_i |
in | 64 | Instruction retired counter |
exec_res_i |
in | XLEN | ALU execution result |
dmld_data_i |
in | XLEN | Data load result (from dmls_block) |
pc_i |
in | XLEN | Current PC (from if_stage) |
next_pc_i |
in | XLEN | PC + 4 (from if_stage) |
instr_i |
in | XLEN | Fetched instruction |
flush_i |
in | 1 | Flush — discard current instruction |
csrwr_data_i |
in | XLEN | CSR write data (from csrs_logic in ex_block) |
cop_dat_i |
in | XLEN | Coprocessor read data |
func3_o |
out | 3 | funct3 field |
func7_o |
out | 7 | funct7 field |
imm_o |
out | XLEN | Decoded immediate |
jmp_o |
out | 1 | Jump (JAL/JALR) |
br_en_o |
out | 1 | Branch enable |
opd0_src_sel_o |
out | 1 | Select PC vs reg0 as ALU operand 0 |
opd1_src_sel_o |
out | 1 | Select imm vs reg1 as ALU operand 1 |
opd0_pass_o |
out | 1 | Gate ALU operand 0 |
opd1_pass_o |
out | 1 | Gate ALU operand 1 |
ftype_o |
out | 1 | Instruction type for ALU control |
op_en_o |
out | 1 | ALU operation enable |
dmls_mode_o |
out | 1 | Data memory mode (0=load, 1=store) |
dmls_en_o |
out | 1 | Data memory enable |
cop_adr_o |
out | 6 | Coprocessor address |
cop_dat_o |
out | XLEN | Coprocessor write data |
cop_we_o |
out | 1 | Coprocessor write enable |
pcwr_en_o |
out | 1 | Pipeline advance enable (to if_stage) |
trap_taken_o |
out | 1 | Trap taken |
trap_target_o |
out | XLEN | Trap handler address |
rd_data0_o |
out | XLEN | Register file read port 0 |
rd_data1_o |
out | XLEN | Register file read port 1 |
csrrd_data_o |
out | XLEN | CSR read data |
Sub-blocks instantiated within id_stage:
File: rtl/main_ctrl.vhdl
Decodes the instruction opcode to generate control signals and the appropriate immediate value.
| Port | Direction | Width | Description |
|---|---|---|---|
imrd_malgn_i |
in | 1 | Instruction fetch misaligned |
dmld_malgn_i |
in | 1 | Data load misaligned |
dmld_fault_i |
in | 1 | Data load fault |
flush_i |
in | 1 | Pipeline flush |
instr_i |
in | XLEN | Instruction word |
instr_err_o |
out | 1 | Illegal instruction |
csrwr_en_o |
out | 1 | CSR write enable |
regwr_en_o |
out | 1 | Register file write enable |
regwr_sel_o |
out | 2 | Register write data select (0=ALU, 1=dmem, 2=next_pc, 3=CSR) |
dmls_mode_o |
out | 1 | Data memory mode (0=load, 1=store) |
dmls_en_o |
out | 1 | Data memory enable |
jmp_o |
out | 1 | Jump (JAL/JALR) |
br_en_o |
out | 1 | Branch enable |
opd0_src_sel_o |
out | 1 | Select PC vs reg0 as ALU operand 0 |
opd1_src_sel_o |
out | 1 | Select imm vs reg1 as ALU operand 1 |
opd0_pass_o |
out | 1 | Gate ALU operand 0 |
opd1_pass_o |
out | 1 | Gate ALU operand 1 |
ftype_o |
out | 1 | Instruction type for ALU control |
op_en_o |
out | 1 | ALU operation enable |
imm_o |
out | XLEN | Decoded immediate |
Immediate encoding per RISC-V specification: I-type, S-type, B-type, U-type, J-type, Z-type (shamt for CSR).
File: rtl/reg_file.vhdl
32 × XLEN register file with combinatorial read (dual-port) and synchronous write. Register x0 is hardwired to zero.
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock |
we_i |
in | 1 | Write enable |
wr_sel_i |
in | 2 | Write data mux select (0=ALU, 1=dmem, 2=next_pc, 3=CSR) |
wr_addr_i |
in | 5 | Write destination register address |
wr_data0_i |
in | XLEN | Write data from ALU result |
wr_data1_i |
in | XLEN | Write data from data load |
wr_data2_i |
in | XLEN | Write data from next PC |
wr_data3_i |
in | XLEN | Write data from CSR read |
rd_addr0_i |
in | 5 | Read port 0 address |
rd_addr1_i |
in | 5 | Read port 1 address |
rd_data0_o |
out | XLEN | Read port 0 data |
rd_data1_o |
out | XLEN | Read port 1 data |
Dual-implementation: SIZE=16 selects small_reg_file (4-bit addressing), SIZE=32 selects large_reg_file (5-bit). Default is 32.
File: rtl/csrs.vhdl
Implements machine-mode CSR registers and all trap/exception logic.
| Port | Direction | Width | Description |
|---|---|---|---|
clk_i |
in | 1 | Clock |
reset_i |
in | 1 | Synchronous reset |
ex_irq_i |
in | 1 | External interrupt |
sw_irq_i |
in | 1 | Software interrupt |
tm_irq_i |
in | 1 | Timer interrupt |
imrd_malgn_i |
in | 1 | Instruction fetch misaligned |
imrd_fault_i |
in | 1 | Instruction fetch fault |
instr_err_i |
in | 1 | Illegal instruction |
dmld_malgn_i |
in | 1 | Data load misaligned |
dmld_fault_i |
in | 1 | Data load fault |
dmst_malgn_i |
in | 1 | Data store misaligned |
dmst_fault_i |
in | 1 | Data store fault |
wr_en_i |
in | 1 | CSR write enable |
wr_mode_i |
in | 3 | CSR write mode (funct3) |
rw_addr_i |
in | 12 | CSR address |
wr_data_i |
in | XLEN | CSR write data |
exec_res_i |
in | XLEN | ALU result (for mtval on misaligned) |
pc_i |
in | XLEN | Current PC (for mepc/mtval on ebreak) |
next_pc_i |
in | XLEN | Next PC (for mepc on WFI) |
cycle_i |
in | 64 | Cycle counter |
timer_i |
in | 64 | Timer value |
instret_i |
in | 64 | Instruction retired counter |
cop_dat_i |
in | XLEN | Coprocessor read data |
cop_adr_o |
out | 6 | Coprocessor address |
cop_dat_o |
out | XLEN | Coprocessor write data |
cop_we_o |
out | 1 | Coprocessor write enable |
pcwr_en_o |
out | 1 | Pipeline advance (0 during WFI until interrupt) |
trap_taken_o |
out | 1 | Exception/interrupt/mret taken |
trap_target_o |
out | XLEN | Trap handler or return address |
rd_data_o |
out | XLEN | CSR read data |
- System calls:
ecall,ebreak,mret,wfidecoded from write enable + address - Interrupt pending:
mip_meip/msip/mtipdirectly wired from external IRQ inputs (level-sensitive) - Exception vector:
exc_takencombines all fault signals, ecall, ebreak, and interrupts - Trap taken:
trap_taken_o <= exc_taken or mret— redirects pipeline for both traps and MRET - mstatus: MIE/MPIE updated on entry (save+disable) and MRET (restore)
- mepc: Saves PC on trap;
next_pcon WFI (return after wakeup); writable via CSR - mcause: Priority encoder for exception source; interrupt bit =
int_taken - mtval: Address for misaligned access faults; PC for ebreak; zero otherwise
- Coprocessor window: CSR addresses
0x7C0–0x7FFforwarded tocop_dat_owithcop_we_ostrobe
| Address | Register | Description |
|---|---|---|
0x300 |
mstatus |
Machine status (MIE, MPIE) |
0x301 |
misa |
ISA and extensions (RV32I) |
0x304 |
mie |
Interrupt enable (MEIE, MTIE, MSIE) |
0x305 |
mtvec |
Trap vector base address |
0x320 |
mcountinhibit |
Machine counter inhibit (WARL) — not implemented |
0x321 |
mhpmevent3 |
Hardware performance event select (future) |
0x323–0x32F |
mhpmevent4–31 |
Hardware performance event select (future) |
0x340 |
mscratch |
Machine scratchpad |
0x341 |
mepc |
Exception program counter |
0x342 |
mcause |
Trap cause |
0x343 |
mtval |
Trap value |
0x344 |
mip |
Interrupt pending |
| Address | Register | Description |
|---|---|---|
0xC00 |
cycle |
Cycle counter (low) |
0xC01 |
time |
Timer (low) |
0xC02 |
instret |
Instruction retired (low) |
0xC80 |
cycleh |
Cycle counter (high) |
0xC81 |
timeh |
Timer (high) |
0xC82 |
instreth |
Instruction retired (high) |
mcountinhibit (CSR 0x320) is a WARL register that allows software to selectively pause performance counters:
| Bit | Field | Control |
|---|---|---|
| 0 | CY | mcycle — 1 = inhibit increment |
| 2 | IR | minstret — 1 = inhibit increment |
| others | — | Hardwired to 0 (reserved) |
When a bit is 1, the respective counter stops incrementing. Bit 1 (TM for time) is hardwired to 0 — time is an independent wall-clock timer and should not be inhibited.
Note: mcountinhibit is not yet implemented in Leaf. Future implementation requires:
- Add
mcountinhibit_regincsrs.vhdl(bits 0 and 2 writable WARL, others hardwired to 0) - Add
mcountinhibit_oports incsrs→id_stage→core - Add
inhibit_iport incounters— gating on increments (inhibit_i(0)lockscycle,inhibit_i(2)locksinstret) - Connect
core.mcountinhibit_o→counters.inhibit_iinleaf.vhdl
tm_irq is an external core input — Leaf does not generate it internally. The time counter (CSR 0xC01/0xC81) increments every clk_i cycle and is readable by software, but there is no mtimecmp register to compare the timer and generate the IRQ automatically.
To use timer interrupts, external hardware must:
- Program a comparison value via memory-mapped register or coprocessor CSR
- Compare against
timeor its own counter - Assert
tm_irqwhen the condition is met
Implementation of mtimecmp per the RISC-V Privileged Spec (section 3.1.11) is a future improvement.
CSR addresses 0x7C0 to 0x7FF are reserved for coprocessor attachment. Reads are forwarded to cop_dat_i, writes to cop_dat_o with cop_we_o strobe.
Exception sources, their mcause codes, and mtval behavior:
| Code | Source | mtval |
|---|---|---|
| 0 | Instruction address misaligned | Target address (exec_res) |
| 1 | Instruction access fault | PC of faulted instruction |
| 2 | Illegal instruction | 0 |
| 3 | Breakpoint (ebreak) | PC of breakpoint instruction |
| 4 | Load address misaligned | Effective address (exec_res) |
| 5 | Load access fault | Effective address (exec_res) |
| 6 | Store address misaligned | Effective address (exec_res) |
| 7 | Store access fault | Effective address (exec_res) |
| 11 | Environment call (ecall) | 0 |
Interrupt codes (mcause bit 31 = 1):
| Code | Source |
|---|---|
| 3 | Machine software interrupt |
| 7 | Machine timer interrupt |
| 11 | Machine external interrupt |
Trap flow:
- Current PC is saved to
mepc mstatus.MIEis saved tomstatus.MPIE, thenMIEis clearedmcauseandmtvalare set- PC jumps to
mtvec
File: rtl/ex_block.vhdl
Contains all datapath execution logic: ALU, branch detection, load/store alignment, and CSR write data muxing.
ex_block.vhdl (flow →)
Operand selection:
reg0_i ─────┐ gtd_opd0
pc_i ───────┼──▶ MUX ──▶ AND ───┐ (gated)
opd0_src_sel┘ opd0_pass ▲ │
│ │
reg1_i ─────┐ │ │
imm_i ──────┼──▶ MUX ──▶ AND ───┐ │ │
opd1_src_sel┘ opd1_pass ▲ │ │
│ │
ALU: │ │
┌──────────┐ ┌──────────┐ │ │
│ alu_ctrl │───▶│ alu │◀─────┘ │
│ op_en_i │ │arith→comp│◀──────────┘
│ ftype_i │ │→logic→shf│──── alu_res
│ func3/7 │ └──────────┘ │
└──────────┘ │
│
Branch: │
┌─────────────────────┐ │
│ br_detector │ │
│ compare(reg0, reg1) │──── branch─┼──▶ taken_o
│ mode = func3 │ │ (branch or jmp or trap)
└─────────────────────┘ │
│
Load/Store: │
┌────────────────────────────┐ │
│ dmls_block │ │
│ alu_res ──▶ addr align │◀────┘
│ func3 ────▶ dtype decode │──── dmld_data_o
│ reg1 ─────▶ store data rot │──── dmwr_data_o
│ dmrd_data ▶ load align │──── dm_byte_en_o
│ dmrd/wr_err▶ fault detect │──── dmrd/wr_en_o
│ ▶ misalign det │──── dmld/st_malgn/fault_o
└────────────────────────────┘
CSR write data:
┌──────────────────────┐
│ csrs_logic │
│ mode = func3 │──── csrwr_data_o
│ csrrd_data_i │
│ reg0_i / imm_i │
└──────────────────────┘
Target: target_o <= trap_target when trap_taken else alu_res & 0
| Port | Direction | Width | Description |
|---|---|---|---|
trap_taken_i |
in | 1 | Trap taken (from csrs) |
trap_target_i |
in | XLEN | Trap handler PC |
func3_i |
in | 3 | funct3 field |
func7_i |
in | 7 | funct7 field |
reg0_i |
in | XLEN | Register file read port 0 |
reg1_i |
in | XLEN | Register file read port 1 |
pc_i |
in | XLEN | Current PC |
imm_i |
in | XLEN | Decoded immediate |
csrrd_data_i |
in | XLEN | CSR read data |
jmp_i |
in | 1 | Jump (JAL/JALR) |
br_en_i |
in | 1 | Branch enable |
opd0_src_sel_i |
in | 1 | Select PC vs reg0 as ALU operand 0 |
opd1_src_sel_i |
in | 1 | Select imm vs reg1 as ALU operand 1 |
opd0_pass_i |
in | 1 | Gate ALU operand 0 |
opd1_pass_i |
in | 1 | Gate ALU operand 1 |
ftype_i |
in | 1 | Instruction type for ALU control |
op_en_i |
in | 1 | ALU operation enable |
dmls_mode_i |
in | 1 | Data memory mode (0=load, 1=store) |
dmls_en_i |
in | 1 | Data memory enable |
dmrd_err_i |
in | 1 | Data read bus error |
dmwr_err_i |
in | 1 | Data write bus error |
dmrd_data_i |
in | XLEN | Data read data (from Wishbone) |
imrd_malgn_o |
out | 1 | Instruction fetch misaligned |
dmld_malgn_o |
out | 1 | Data load misaligned |
dmld_fault_o |
out | 1 | Data load bus fault |
dmst_malgn_o |
out | 1 | Data store misaligned |
dmst_fault_o |
out | 1 | Data store bus fault |
dmrd_en_o |
out | 1 | Data read request (to wb_ctrl) |
dmwr_en_o |
out | 1 | Data write request (to wb_ctrl) |
dmwr_data_o |
out | XLEN | Data write data |
dmrw_addr_o |
out | XLEN | Data memory address |
dm_byte_en_o |
out | 4 | Data byte enables |
dmld_data_o |
out | XLEN | Data load result (aligned/sign-extended) |
csrwr_data_o |
out | XLEN | CSR write data (from csrs_logic) |
taken_o |
out | 1 | Branch/jump/trap taken |
target_o |
out | XLEN | Branch/jump/trap target address |
res_o |
out | XLEN | ALU result |
Sub-blocks instantiated within ex_block:
File: rtl/alu_ctrl.vhdl
Combinational decoder that maps instruction fields to ALU operation codes.
| Port | Direction | Width | Description |
|---|---|---|---|
op_en_i |
in | 1 | ALU operation enable (0 = idle/ADD) |
ftype_i |
in | 1 | Format type (0 = R-type, 1 = I-type) |
func3_i |
in | 3 | funct3 field |
func7_i |
in | 7 | funct7 field |
op_o |
out | 6 | ALU operation code (ALU_ADD, ALU_SUB, etc.) |
Decoding logic:
op_en_i = 0→ALU_ADD(pipeline bubble)func3 = 000,func7 = 0100000,ftype = 0→ALU_SUBfunc3 = 101,func7 = 0100000→ALU_SRA- Otherwise maps
func3to the corresponding ALU operation (ADD, SLL, SLT, SLTU, XOR, SRL, OR, AND)
File: rtl/alu.vhdl
Combinational datapath organized as a bypass chain: arith → comp → logic → shifter → res_o.
| Port | Direction | Width | Description |
|---|---|---|---|
opd0_i |
in | XLEN | Operand 0 |
opd1_i |
in | XLEN | Operand 1 |
op_i |
in | 6 | ALU operation code from alu_ctrl |
res_o |
out | XLEN | Result |
Sub-blocks:
- arith_unit: ADD/SUB via
unsignedaddition with conditional 2's complement ofopd1_i - comparator: SLT/SLTU using MSB comparison with
arith_res(31)for same-sign case - logic_unit: XOR/OR/AND with bypass
- shifter: SLL/SRL/SRA via
numeric_stdshift functions (5-bit shift amount fromopd1_i(4:0))
When a sub-block's operation is not selected, it passes through the previous result.
File: rtl/br_detector.vhdl
Combinational comparator for branch condition evaluation.
| Port | Direction | Width | Description |
|---|---|---|---|
reg0_i |
in | XLEN | Register value 0 (RS1) |
reg1_i |
in | XLEN | Register value 1 (RS2) |
mode_i |
in | 3 | Branch mode (funct3: EQ/NE/LT/GE/LTU/GEU) |
en_i |
in | 1 | Branch enable (from br_en_i) |
branch_o |
out | 1 | Branch condition met |
Output is gated: branch_o <= branch_i and en_i.
File: rtl/dmls_block.vhdl
Handles data memory load/store alignment and sign-extension for all RISC-V load/store data types (byte, halfword, word, signed/unsigned).
| Port | Direction | Width | Description |
|---|---|---|---|
dmrd_err_i |
in | 1 | Data read bus error |
dmwr_err_i |
in | 1 | Data write bus error |
dmls_mode_i |
in | 1 | Mode (0=load, 1=store) |
dmls_en_i |
in | 1 | Enable |
dmls_dtype_i |
in | 3 | Data type (LSU_BYTE, LSU_BYTEU, LSU_HALF, LSU_HALFU, LSU_WORD) |
dmst_data_i |
in | XLEN | Store data from register |
dmls_addr_i |
in | XLEN | Load/store address |
dmrd_data_i |
in | XLEN | Data read data from Wishbone |
dmld_malgn_o |
out | 1 | Load address misaligned |
dmld_fault_o |
out | 1 | Load bus fault |
dmst_malgn_o |
out | 1 | Store address misaligned |
dmst_fault_o |
out | 1 | Store bus fault |
dmrd_en_o |
out | 1 | Data read request |
dmwr_en_o |
out | 1 | Data write request |
dmwr_data_o |
out | XLEN | Data write data (byte-rotated to align with byte enables) |
dmrw_addr_o |
out | XLEN | Data memory address (word-aligned) |
dm_byte_en_o |
out | 4 | Byte enables |
dmld_data_o |
out | XLEN | Load data (aligned and sign/zero-extended) |
File: rtl/csrs_logic.vhdl
Combinational mux that computes the CSR write data based on the instruction's funct3 field.
| Port | Direction | Width | Description |
|---|---|---|---|
csrwr_mode_i |
in | 3 | CSR write mode (funct3) |
csrrd_data_i |
in | XLEN | Current CSR read data |
regwr_data_i |
in | XLEN | Register file read data (RS1) |
immwr_data_i |
in | XLEN | Zero-extended immediate (uimm) |
csrwr_data_o |
out | XLEN | CSR write data |
Modes: 001=CSRRW, 010=CSRRS, 011=CSRRC, 101=CSRRWI, 110=CSRRSI, 111=CSRRCI. others (incl. 000) = 0 (ECALL/EBREAK/MRET/WFI).