
RISC-V

Bootstrapping

The boot flow is tied to the privilege modes of each architecture. For simplicity, this documentation covers RISC-V as an example.

RISC-V has three privilege levels:

  1. Machine Mode: This is the level with the most privilege; firmware like OpenSBI runs here.
  2. Supervisor Mode: This privilege level is where our kernel runs.
  3. User Mode: This is the level with the least privilege; typically, user processes run here.


Compared to other CPU architectures, RISC-V's boot process is straightforward. We use OpenSBI as our Supervisor Execution Environment (SEE), i.e., our Machine Mode (M-Mode) run-time firmware. The Supervisor Binary Interface (SBI) is a standard interface for interacting with the SEE, and OpenSBI is an implementation of this standard.

When running inside QEMU, a jump address (0x8020_0000) is used: QEMU loads our kernel into memory at this address and starts execution at 0x8000_0000, where OpenSBI is located. OpenSBI in turn jumps to 0x8020_0000, where our kernel is located.

Traditional vs QEMU RISC-V Boot Flow

The imagery below was developed by Henry Gressmann ("Boot-Flow on RISC-V").


A traditional boot flow looks like this:

    System ROM           System ROM         Disk / Network       Disk / Network
┌────────────────┐   ┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Device         │   │ First Stage    │   │ Second Stage   │   │                │
│ Specific       ├──>│ Bootloader     ├──>│ Bootloader     ├──>│ Kernel         │
│ Firmware       │   │ (e.g., UEFI)   │   │ (e.g., GRUB 2) │   │                │
└────────────────┘   └────────────────┘   └────────────────┘   └────────────────┘

                      Loads the Second     Loads the kernel
                      Stage Bootloader,    into memory, e.g.,
                      e.g., from address   from disk.
                      specified in GPT.

The QEMU RISC-V boot flow is simpler and looks like this:

    System ROM              RAM                  RAM
                        0x8000_0000          0x8020_0000
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Device         │   │                │   │                │
│ Specific       ├──>│ OpenSBI        ├──>│ Kernel         │
│ Firmware       │   │                │   │                │
└────────────────┘   └────────────────┘   └────────────────┘

  M-Mode              M-Mode               S-Mode

  Loads OpenSBI       Loads the kernel
  into RAM.           and device tree
                      into RAM.

Such an architecture is not only simpler, it also enables writing a single kernel for all RISC-V CPUs that implement SBI: SBI puts a layer of abstraction between the hardware and our kernel. SBI also provides functionality such as console output and a Flattened Device Tree (FDT).

Interacting with SBI is handled by the sbi crate. This crate uses the ecall instruction to trap into the SEE (OpenSBI on QEMU), where a handler services the trap before returning to the kernel. This works much like a system call: first, you set up registers, then you execute ecall, and finally you read the registers that contain the return values.
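
To make this concrete, below is a minimal sketch of such a raw SBI call, following the standard SBI calling convention; the sbi_call function and the debug-console example are hypothetical illustrations and do not reflect the sbi crate's actual API. The extension ID is placed in register a7, the function ID in a6, and arguments in a0 and a1; after ecall, a0 holds the SBI error code and a1 the return value.

Sketch of a Raw SBI Call
use core::arch::asm;

/// Hypothetical raw SBI call, following the standard SBI calling convention.
#[cfg(target_arch = "riscv64")]
unsafe fn sbi_call(eid: usize, fid: usize, arg0: usize, arg1: usize) -> (usize, usize) {
    let (error, value): (usize, usize);
    asm!(
        "ecall",
        in("a7") eid,                  // SBI extension ID
        in("a6") fid,                  // SBI function ID
        inlateout("a0") arg0 => error, // argument 0; receives the SBI error code
        inlateout("a1") arg1 => value, // argument 1; receives the SBI return value
    );
    (error, value)
}

// Example: write the byte 'A' to the SBI debug console
// (DBCN extension, EID 0x4442434E, FID 2).
// let (error, _) = unsafe { sbi_call(0x4442434E, 2, b'A' as usize, 0) };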

unCORE currently uses riscv-rt. This crate provides a runtime for RISC-V as well as handlers for interrupts and exceptions. The linker script currently in use for 64-bit RISC-V is derived from the linker script that riscv-rt ships. QEMU takes an ELF file via the -kernel parameter; this ELF is built according to our linker script.

When OpenSBI and riscv-rt have finished running, unCORE is entered. The entry function lives (as the only function) in code/uncore/src/main.rs:

unCORE Entry Function Signature
/// The RISC-V 64bit entrypoint, called by the [`riscv-rt`] runtime after SBI has set up
/// the machine.
#[cfg(target_arch = "riscv64")]
#[riscv_rt::entry]
fn riscv64_entry(hart: usize) -> ! {
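    // ... (the actual kernel initialization continues here)
}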

The entry function is called with one argument: the HART (in RISC-V parlance, a "hardware thread", i.e., a CPU core) on which the setup has been called. This proves useful because some system initialization steps need to happen only once, while others have to happen on each HART.
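
As a rough illustration, the HART ID can be used to gate such one-time setup. The following sketch is hypothetical (the GLOBAL_INIT_DONE flag and initialize_hart function are not unCORE's actual code): the first HART to arrive performs the global initialization, while every HART performs its own per-HART setup.

Sketch of One-Time vs. Per-HART Initialization
use core::sync::atomic::{AtomicBool, Ordering};

/// Tracks whether the one-time, global initialization has already run.
static GLOBAL_INIT_DONE: AtomicBool = AtomicBool::new(false);

/// Hypothetical initialization routine executed by every HART.
fn initialize_hart(hart: usize) {
    // Only the first HART to reach this point performs the global setup;
    // the atomic compare-exchange succeeds exactly once.
    if GLOBAL_INIT_DONE
        .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
    {
        // e.g., set up the kernel heap and parse the device tree.
    }

    // Every HART then performs its own setup,
    // e.g., installing its trap handler.
    let _ = hart;
}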