raeesiarya/risc-v_processor_fpga

Pipelined RISC-V for PYNQ-Z1

A self-contained RV32I/RV32F SoC with a UART-tethered boot flow, local instruction/data memories, and a pipelined floating-point unit. The core targets the Digilent PYNQ-Z1 (Zynq-7020) and ships with software, tests, and build scripts to take the design from simulation to FPGA.

Highlights

  • Four-stage in-order pipeline: fetch, decode, execute/memory, and writeback, with single-cycle forwarding and centralized stall/kill control.
  • ISA support: RV32I + CSR (tohost) plus an RV32F subset (add/sub, mul, fused multiply-add, sign-inject, moves, int↔fp convert).
  • Multi-stage FPU with explicit pipeline latency tracking; the integer pipeline stalls while FP results retire to preserve precise state.
  • Memory-mapped UART console for boot, file loading, and debug plus MMIO performance counters (cycle, instruction, branch, branch-correct).
  • On-chip memories only: BIOS ROM, IMEM, and DMEM inferred as block RAMs with byte-enable writes.
  • PYNQ-Z1 top level with PLL-based clock generation, debounced buttons/switch synchronizers, and board pin constraints.

Repository Layout

  • hardware/src/ — RTL for the core (riscv_core), FPU (execute/fpu), local memories, UART, clocking, and PYNQ top (z1top.v).
  • hardware/sim/ — iverilog/VCS testbenches for ISA, C/assembly suites, BIOS, UART parsing, MMIO counters, and small benchmarks.
  • hardware/scripts/ — Vivado Tcl for synth/impl, UART hex loader (hex_to_serial), CPI/FOM helpers, and FPGA control.
  • hardware/run_all_sims — Python wrapper to run the full regression suite (ISA + C tests + directed benches) and capture logs.
  • software/ — BIOS, UART utilities, 151 library (MMIO helpers), ISA tests, C micro-tests, and benchmarks (mmult, fpmmult, bsort, ssort, bdd, echo, uart_parse, plus reduced-size small/ variants).
  • docs/ — CPU datapath diagram (fa25_ee151_cpu_diagram.drawio.png) and checkpoint notes.

CPU & Memory Map

Pipeline Overview

  • Fetch: parameterized PC with BIOS vs IMEM muxing; synchronous instruction memories.
  • Decode: instruction decode, immediate generation, hazard detection, and control-flow kill injection.
  • Execute / Memory: ALU, branch comparator, forwarding network, memory address generation, DMEM/IMEM write masking, and FPU issue/retire logic.
  • Writeback: load sign/zero extension, MMIO data muxing, CSR writes, and integer/FP register writeback.
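
The byte-enable stores in the Execute / Memory stage and the load extension in Writeback can be modeled in a few lines of Python. This is an illustrative sketch, not the repository's RTL: the function names are hypothetical, and it assumes naturally aligned accesses (the README does not say how the core treats misaligned ones, so the sketch simply rejects them).

```python
def store_byte_enables(addr: int, size: int) -> int:
    """Return a 4-bit write mask for a store of `size` bytes (1, 2, or 4).

    Bit i of the mask enables byte lane i of the 32-bit memory word.
    Assumption: only naturally aligned stores are modeled.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    offset = addr & 0x3
    if offset % size != 0:
        raise ValueError("misaligned store is not modeled")
    return ((1 << size) - 1) << offset


def load_extend(word: int, addr: int, size: int, signed: bool) -> int:
    """Pick `size` bytes out of a 32-bit memory word and extend to 32 bits.

    signed=True models lb/lh/lw, signed=False models lbu/lhu.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    offset = addr & 0x3
    if offset % size != 0:
        raise ValueError("misaligned load is not modeled")
    bits = 8 * size
    raw = (word >> (8 * offset)) & ((1 << bits) - 1)
    if signed and raw & (1 << (bits - 1)):
        # Fill the upper bits with ones, staying inside 32 bits.
        raw |= (0xFFFFFFFF << bits) & 0xFFFFFFFF
    return raw
```

For example, `store_byte_enables(0x1000_0002, 2)` enables the upper two byte lanes, and `load_extend(word, addr, 1, signed=True)` corresponds to an `lb`.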

A tohost CSR (0x51E) is implemented for ISA and C testbench completion signaling.
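
As a rough illustration of how a testbench might interpret a tohost write, the sketch below assumes the common riscv-tests convention: a value of 1 means pass, and any other non-zero value encodes the failing test number in its upper bits. The README does not state which encoding this repository uses, so treat both the function name and the convention as assumptions.

```python
from typing import Optional, Tuple


def decode_tohost(value: int) -> Tuple[str, Optional[int]]:
    """Classify a tohost CSR value (assumed riscv-tests encoding).

    0            -> test still running (nothing written yet)
    1            -> pass
    (n << 1) | 1 -> test case n failed
    """
    if value == 0:
        return ("running", None)
    if value == 1:
        return ("pass", None)
    return ("fail", value >> 1)
```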

Floating Point Unit

  • Dedicated FP register file (3R/1W).
  • Stage 1 combinational ops (mul, sign-inject, moves, int→fp convert).
  • Pipelined add/sub alignment/normalize and FMA paths.
  • FP operations assert pipeline stalls until completion to maintain correct forwarding and precise architectural state.
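
The sign-inject operations are the simplest of the FPU ops to illustrate, since they are pure bit manipulation on the IEEE 754 single-precision encoding. The sketch below follows the RISC-V F-extension definitions of FSGNJ.S, FSGNJN.S, and FSGNJX.S; it is a Python reference model, not the repository's RTL, and the helper names are hypothetical.

```python
import struct

SIGN_MASK = 0x80000000  # bit 31: sign
MAG_MASK = 0x7FFFFFFF   # bits 30..0: exponent and fraction


def fsgnj(rs1: int, rs2: int) -> int:
    """FSGNJ.S: magnitude of rs1, sign of rs2."""
    return (rs2 & SIGN_MASK) | (rs1 & MAG_MASK)


def fsgnjn(rs1: int, rs2: int) -> int:
    """FSGNJN.S: magnitude of rs1, inverted sign of rs2."""
    return (~rs2 & SIGN_MASK) | (rs1 & MAG_MASK)


def fsgnjx(rs1: int, rs2: int) -> int:
    """FSGNJX.S: magnitude of rs1, sign is XOR of both signs."""
    return ((rs1 ^ rs2) & SIGN_MASK) | (rs1 & MAG_MASK)


def float_to_bits(x: float) -> int:
    """Encode a Python float as a 32-bit single-precision pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Decode a 32-bit single-precision pattern into a Python float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]
```

With both operands set to the same register, `fsgnj` acts as a move, `fsgnjn` as negation, and `fsgnjx` as absolute value, which is how the standard `fmv.s`, `fneg.s`, and `fabs.s` pseudo-instructions are defined.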

Performance Counters

  • Free-running cycle, instruction, branch, and branch-correct counters.
  • Store to 0x8000_0018 resets all counters to zero.
  • Instruction counter increments only on committed (non-bubble, non-killed) instructions.
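
The four counters are enough to derive CPI and a branch accuracy figure. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it assumes the branch-correct counter counts correctly predicted branches, which the README implies but does not spell out.

```python
def summarize_counters(cycles: int, instructions: int,
                       branches: int, branches_correct: int) -> dict:
    """Derive CPI and branch accuracy from raw counter readings.

    Ratios with a zero denominator are reported as NaN.
    """
    nan = float("nan")
    cpi = cycles / instructions if instructions else nan
    accuracy = branches_correct / branches if branches else nan
    return {"cpi": cpi, "branch_accuracy": accuracy}
```

For instance, readings of 1500 cycles, 1000 instructions, 200 branches, and 150 correct branches give a CPI of 1.5 and a branch accuracy of 0.75.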

Memory Map (word-aligned)

| Address | Function |
|---|---|
| 0x4000_0000 | BIOS ROM (read-only, initialized from software/bios/bios.hex) |
| 0x1000_0000 | IMEM base (64 KiB window, word-addressed) |
| 0x1000_0000 | DMEM base (64 KiB window, separate RAM with byte enables) |
| 0x8000_0000 | UART ctrl (bit0: TX ready, bit1: RX valid) |
| 0x8000_0004 | UART RX data (low 8 bits) |
| 0x8000_0008 | UART TX data (store byte) |
| 0x8000_0010 | Cycle counter (load) |
| 0x8000_0014 | Instruction counter (load) |
| 0x8000_0018 | Counter reset (store) |
| 0x8000_001c | Branch counter (load) |
| 0x8000_0020 | Branch-correct counter (load) |

Address partitioning note: For loads/stores, the top nibble of the address selects the target (DMEM, IMEM write-only, BIOS read-only, or MMIO). The same numeric address may refer to different physical memories depending on whether it is used as a PC (instruction fetch) or a data address.
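
The partitioning rule can be sketched as a small decoder. This model covers only what this README states or shows: 0x1 is IMEM for fetches and DMEM for data accesses, 0x3 stores write both IMEM and DMEM (as in the hex_to_serial example), 0x4 is the BIOS ROM, and 0x8 is MMIO. Function names are hypothetical, and any case the README does not describe returns an empty result rather than a guess.

```python
def fetch_target(pc: int) -> str:
    """Which memory an instruction fetch at `pc` reads ('' if not described)."""
    nibble = (pc >> 28) & 0xF
    if nibble == 0x1:
        return "IMEM"
    if nibble == 0x4:
        return "BIOS"
    return ""


def data_targets(addr: int, is_store: bool) -> set:
    """Which memories a load/store at `addr` touches.

    An empty set means the README does not describe that access,
    not that the hardware necessarily ignores it.
    """
    nibble = (addr >> 28) & 0xF
    if nibble == 0x1:
        return {"DMEM"}
    if nibble == 0x3 and is_store:
        # Inferred from the loader example that writes IMEM and DMEM at once.
        return {"IMEM", "DMEM"}
    if nibble == 0x4 and not is_store:
        return {"BIOS"}  # BIOS is read-only
    if nibble == 0x8:
        return {"MMIO"}
    return set()
```

Note how `0x1000_0000` resolves to IMEM through `fetch_target` but to DMEM through `data_targets`, which is the aliasing the note describes.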

Software Stack

  • software/151_library: UART helpers, MMIO counter accessors, ASCII/string utils, and type defs.
  • software/bios: UART command shell (file, jal, lw/lhu/lbu, sw/sh/sb) used to load hex images and jump to user code.
  • Benchmarks and tests (all build to .hex via riscv64-unknown-elf-*):
    • Integer: mmult, bsort, ssort, bdd
    • Floating point: fpmmult
    • small/ reduced-size mirrors of the above for faster simulation
    • c_tests/ micro-programs (fib, sum, strcmp, cachetest, vecadd, replace)
    • asm/ directed assembly, echo/, uart_parse/, and riscv-isa-tests/ harness

Typical build pattern:

# Build BIOS and a sample workload
make -C software/bios
make -C software/mmult
make -C software/fpmmult
# Build reduced-size images for simulation
make -C software/small/mmult
make -C software/small/fpmmult

Build, Simulate, and Program

  • Lint: make -C hardware lint
  • Single testbench: make -C hardware sim/cpu_tb.fst (iverilog) or make -C hardware sim/cpu_tb.vpd (VCS). Logs land in hardware/sim/.
  • Full regression: cd hardware && ./run_all_sims --simulator iverilog (pass --simulator vcs if available). Results stored under test_results/.
  • ISA tests only: make -C hardware isa-tests; C tests: make -C hardware c-tests.
  • Synthesis/implementation: make -C hardware synth then make -C hardware impl. Generated bitstream: hardware/build/impl/z1top.bit.
  • Program FPGA: make -C hardware program (uses program.tcl with the generated bitstream). make -C hardware screen opens a 115200 baud UART session.
  • Clean: make -C hardware clean-sim or make -C hardware clean-build.

Running on Hardware

  1. Program the bitstream (make -C hardware program). The BIOS ROM contents are baked into the bitstream, so no separate BIOS load is needed.
  2. Open the UART console (make -C hardware screen).
  3. Load a hex image over UART from the host. The helper script accepts a base address:
    # Example: write a program into both IMEM and DMEM, then run it from IMEM
    hardware/scripts/hex_to_serial software/mmult/mmult.hex 30000000
  4. At the BIOS prompt (151>), jump to your program: jal 10000000.
  5. Use BIOS commands to inspect memory (lw/lhu/lbu), write words/halfwords/bytes (sw/sh/sb), or load additional images (file <addr> <len>).
  6. UART status/data are visible at 0x8000_0000/4/8; performance counters at 0x8000_0010/14/1c/20 (store to 0x8000_0018 to clear).
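
To make the UART register usage in step 6 concrete, here is a minimal polling sketch written against a fake MMIO bus so it can run on a host. The addresses and status bits come from the memory map above; the `FakeUart` class and the `uart_putc`/`uart_getc` names are hypothetical stand-ins, not the API of software/151_library.

```python
UART_CTRL = 0x80000000  # bit0: TX ready, bit1: RX valid
UART_RX = 0x80000004    # received byte in the low 8 bits
UART_TX = 0x80000008    # store a byte to transmit
TX_READY = 1 << 0
RX_VALID = 1 << 1


class FakeUart:
    """Host-side stand-in for the MMIO bus (always ready to transmit)."""

    def __init__(self, rx: bytes = b""):
        self.rx = list(rx)     # bytes waiting to be read by the CPU
        self.tx = bytearray()  # bytes the CPU has transmitted

    def load(self, addr: int) -> int:
        if addr == UART_CTRL:
            return TX_READY | (RX_VALID if self.rx else 0)
        if addr == UART_RX:
            return self.rx.pop(0) if self.rx else 0
        raise ValueError("unmapped load")

    def store(self, addr: int, value: int) -> None:
        if addr != UART_TX:
            raise ValueError("unmapped store")
        self.tx.append(value & 0xFF)


def uart_putc(bus, ch: int) -> None:
    """Spin until TX ready, then write one byte."""
    while not (bus.load(UART_CTRL) & TX_READY):
        pass
    bus.store(UART_TX, ch)


def uart_getc(bus) -> int:
    """Spin until RX valid, then read one byte."""
    while not (bus.load(UART_CTRL) & RX_VALID):
        pass
    return bus.load(UART_RX) & 0xFF
```

On the real core the same poll-then-access pattern would be done with loads and stores to these addresses from C or assembly.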

Notes

  • RTL includes simulation stubs (hardware/stubs) and behavioral models (hardware/sim_models) for PLL/BUFG when running outside Vivado.
  • Board-level constraints for the PYNQ-Z1 live in hardware/src/z1top.xdc.
  • The BIOS hex is pulled in with $readmemh; keep software/bios/bios.hex up to date before synthesis.
  • Default CPU clock is derived from CLK_125MHZ_FPGA via the PLL in clocks.v/z1top.v.
