Commit 8c68d48

Add some documentation on syscalls and argument handling
1 parent 4d3b393 commit 8c68d48

2 files changed

+224
-40
lines changed


doc/README.md

Lines changed: 0 additions & 40 deletions
@@ -95,46 +95,6 @@ When the external RAM is written to by the VM, the dirty flag will be set. The f
9595

9696
bool uvm32_extramDirty(uvm32_state_t *vmst)
9797

98-
## syscall ABI
99-
100-
All communication between bytecode and the vm host is performed via syscalls.
101-
102-
To make a syscall, register `a7` is set with the syscall number (a `UVM32_SYSCALL_x`) and `a0`, `a1` are set with the syscall parameters. The response is returned in `a2`.
103-
104-
[target.h](common/uvm32_target.h#L12)
105-
106-
```c
107-
static uint32_t syscall(uint32_t id, uint32_t param1, uint32_t param2) {
108-
register uint32_t a0 asm("a0") = (uint32_t)(param1);
109-
register uint32_t a1 asm("a1") = (uint32_t)(param2);
110-
register uint32_t a2 asm("a2");
111-
register uint32_t a7 asm("a7") = (uint32_t)(id);
112-
113-
asm volatile (
114-
"ecall"
115-
: "=r"(a2) // output
116-
: "r"(a7), "r"(a0), "r"(a1) // input
117-
: "memory"
118-
);
119-
return a2;
120-
}
121-
```
122-
The [RISC-V SBI](https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc) is not followed, a simpler approach is taken.
123-
124-
## syscalls
125-
126-
There are two inbuilt syscalls used by uvm32, `halt()` and `yield()`.
127-
128-
`halt()` tells the host that the program has ended normally. `yield()` tells the host that the program requires more instructions to be executed. Halt is handled internally and transitions the VM to `UVM32_STATUS_ENDED`, `yield()` is handled in the VM host like other syscalls.
129-
130-
Syscalls are handled in the host by reading the syscall identifier, then using the provided functions to get arguments and set a return response. Direct access to the VM's memory space is not allowed, to avoid memory corruption issues.
131-
132-
The following functions are used to access syscall parameters safely:
133-
134-
uint32_t uvm32_getval(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t);
135-
const char *uvm32_getcstr(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t);
136-
void uvm32_setval(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t, uint32_t val);
137-
uvm32_evt_syscall_buf_t uvm32_getbuf(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t argPtr, uvm32_arg_t argLen);
13898

13999
## Event driven operation
140100

doc/syscall.md

Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
1+
# Syscalls
2+
3+
## Introduction
4+
5+
To interact with the host, code running in the vm makes syscalls.
6+
A syscall acts like a function call which accepts up to two arguments and optionally returns one.
7+
8+
From [apps/common/uvm32_target.h](apps/common/uvm32_target.h):
9+
10+
```c
uint32_t syscall(uint32_t id, uint32_t param1, uint32_t param2) {
    ...
}
```
15+
16+
Both arguments and the return value are `uint32_t`. On uvm32, a 32-bit machine, that is wide enough to hold any integer type of up to 32 bits, or a pointer.
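Other 32-bit types can travel through a `uint32_t` argument as a raw bit pattern. The following is a minimal sketch, assuming `float` is a 32-bit IEEE 754 type on both sides; `f32_to_u32` and `u32_to_f32` are hypothetical helper names, not part of uvm32:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, so it fits in one syscall argument. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Sender: copy the float's bytes into a uint32_t (no numeric conversion). */
uint32_t f32_to_u32(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/* Receiver: recover the float from the same bit pattern. */
float u32_to_f32(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Using `memcpy` rather than a pointer cast avoids strict-aliasing problems, and a plain `(uint32_t)f` cast would convert the numeric value instead of preserving the bits.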
17+
18+
To do anything useful, VM code must make syscalls and you will likely need to define some which make sense for your application.
19+
20+
(**Note**, uvm32 assumes that hosts are the same endianness as the vm - little endian. If this is not the case, you will have a bad time, patches welcome...)
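For illustration only, this sketch shows what byte-order-independent access looks like. `load_le32` and `host_is_little_endian` are hypothetical helpers, not uvm32 functions, and nothing here claims that uvm32 reads memory this way:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value byte by byte.
 * The result is the same whatever the host's byte order is. */
uint32_t load_le32(const uint8_t *p) {
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 1 when the host stores the least significant byte first. */
int host_is_little_endian(void) {
    const uint32_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

A host could call something like `host_is_little_endian()` at startup to fail early rather than misread VM data.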
21+
22+
## Inbuilt syscalls
23+
24+
There are two inbuilt syscalls used by uvm32, `halt()` and `yield()`.
25+
26+
`halt()` tells the host that the program has ended normally. `yield()` tells the host that the program requires more instructions to be executed. `halt()` is handled internally and transitions the VM to `UVM32_STATUS_ENDED`, whereas `yield()` is handled in the VM host like other syscalls.
27+
28+
## Worked example
29+
30+
[`common/uvm32_common_custom.h`](common/uvm32_common_custom.h) defines numbers for a few useful syscalls. For example, there is a syscall which prints a single NULL-terminated C string:
31+
32+
```c
#define UVM32_SYSCALL_PRINT 0x00000002
```
35+
36+
Using this definition, we can now use the syscall from vm code and handle it in the host.
37+
38+
In the vm code, print "Hello world\n":
39+
40+
```c
syscall(UVM32_SYSCALL_PRINT, (uint32_t)"Hello world\n", 0);
```
43+
44+
In the host code, we receive an event for every syscall. We receive and handle the syscall as follows:
45+
46+
```c
uvm32_state_t vmst;
uvm32_evt_t evt;
uvm32_init(&vmst);
uvm32_load(&vmst, code, code_len);
uvm32_run(&vmst, &evt, 100);
switch(evt.typ) {
    case UVM32_EVT_SYSCALL:
        switch(evt.data.syscall.code) {
            case UVM32_SYSCALL_PRINT:
                printf("%s", uvm32_arg_getcstr(&vmst, &evt, ARG0));
            break;
        }
    break;
    ...
}
```
63+
64+
In order to get the string we expect in argument 0 (the first argument), we call `uvm32_arg_getcstr(&vmst, &evt, ARG0)`.
65+
66+
## Syscall argument handling
67+
68+
You might reasonably ask, "why can't I just get the string pointer? Why must I call `uvm32_arg_getcstr()`?".
69+
70+
The answer is safety. uvm32 takes the view that code running inside the sandbox is untrusted and so could send all kinds of invalid data in syscalls. Imagine what might happen if we did:
71+
72+
```c
syscall(UVM32_SYSCALL_PRINT, (uint32_t)0xBADBAD, 0);
```
75+
76+
Our host code could receive anything. Perhaps there is a valid C string in that location, or perhaps there is data which never terminates, leaving the host trying forever to print the string.
77+
78+
uvm32 guarantees that syscall arguments are safe to access in the host. To do this, it places some limits on what can be passed through a syscall. All arguments and the return value must be accessed through the `uvm32_arg_*` set of functions.
79+
80+
## `uvm32_arg_getval()`
81+
82+
`uint32_t uvm32_arg_getval(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t arg)`
83+
84+
Reads either `ARG0` or `ARG1` and returns the value as a `uint32_t`.
85+
86+
Passing other integer types requires both sides to cast appropriately, for example:
87+
88+
```c
int16_t x = -1234;
// pass the variable itself, cast to the syscall argument type
syscall(UVM32_SYSCALL_PRINTI16, (uint32_t)x, 0);
```
92+
93+
```c
case UVM32_EVT_SYSCALL:
    switch(evt.data.syscall.code) {
        case UVM32_SYSCALL_PRINTI16:
            printf("%d", (int16_t)uvm32_arg_getval(&vmst, &evt, ARG0));
        break;
    }
break;
```
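The cast round trip above can be checked in isolation. This is a small sketch with hypothetical helper names (`i16_to_arg`, `arg_to_i16`). It assumes the narrowing conversion keeps the low 16 bits, which is what common compilers do, although older C standards leave that conversion implementation-defined:

```c
#include <assert.h>
#include <stdint.h>

/* VM side: widen a signed 16-bit value into the uint32_t syscall argument.
 * The value is sign-extended, then wraps modulo 2^32. */
uint32_t i16_to_arg(int16_t v) {
    return (uint32_t)v;
}

/* Host side: narrow the argument back to the type both sides agreed on. */
int16_t arg_to_i16(uint32_t arg) {
    return (int16_t)arg;
}

/* Round trip for the value used in the example above. */
void i16_roundtrip_example(void) {
    uint32_t arg = i16_to_arg(-1234);
    assert(arg == 0xFFFFFB2Eu);       /* 2^32 - 1234 */
    assert(arg_to_i16(arg) == -1234);
}
```

If the host omitted the `(int16_t)` cast, it would print 4294966062 instead of -1234, which is why both sides must agree on the type.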
102+
103+
## `uvm32_arg_getcstr()`
104+
105+
`const char *uvm32_arg_getcstr(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t arg);`
106+
107+
Reads either `ARG0` or `ARG1` and returns the value as a NULL-terminated C string in memory that is valid for the host to read. To guarantee safety, uvm32 checks that every byte, including the NULL terminator, is safe to access. If the string is invalid and reading it would go outside the vm's memory space, an empty string (not a `NULL` pointer) is returned and the next call to `uvm32_run()` will pass back `UVM32_EVT_ERR`.
108+
109+
Though convenient, `uvm32_arg_getcstr()` is inefficient as it must scan the entire string to check it is safe to access.
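A minimal sketch of this kind of check, assuming the VM's memory is one contiguous host buffer and the syscall argument is an offset into it. `checked_cstr`, its parameters and the `err` flag are hypothetical names for illustration, not the uvm32 implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the string at mem[offset] if a terminator exists inside mem.
 * Otherwise sets *err to 1 and returns an empty string, never NULL. */
const char *checked_cstr(const uint8_t *mem, uint32_t mem_len,
                         uint32_t offset, int *err) {
    assert(mem != NULL && err != NULL);
    if (offset >= mem_len) {
        *err = 1;                 /* start is outside VM memory */
        return "";
    }
    /* memchr scans only the bytes that remain after offset. */
    if (memchr(mem + offset, '\0', mem_len - offset) == NULL) {
        *err = 1;                 /* no terminator before the end of memory */
        return "";
    }
    return (const char *)(mem + offset);
}
```

The scan is linear in the string length, which is the cost mentioned above.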
110+
111+
## `uvm32_arg_getslice()`
112+
113+
`uvm32_slice_t uvm32_arg_getslice(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t argPtr, uvm32_arg_t argLen);`
115+
116+
Reads a slice (a bounded array) of memory where `argPtr` holds the starting address and `argLen` holds the length.
117+
118+
For example, in the VM:
119+
120+
```c
uint8_t buf[10];
for (int i=0;i<10;i++) {
    buf[i] = i*10;
}
syscall(PRINTBUF, (uint32_t)buf, sizeof(buf));
```
127+
128+
In the host:
129+
130+
```c
uvm32_slice_t slice = uvm32_arg_getslice(vmst, evt, ARG0, ARG1);
for (int i=0;i<slice.len;i++) {
    printf("%d\n", slice.ptr[i]);
}
```
136+
137+
As the length is known in advance, `uvm32_arg_getslice()` is both fast and safe.
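To show why a known length makes the check cheap, here is a sketch of a constant-time bounds test. It assumes the VM memory is one contiguous host buffer addressed by offset; `example_slice_t` and `checked_slice` are hypothetical names, not the uvm32 types or implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bounded view into VM memory. */
typedef struct {
    const uint8_t *ptr;
    uint32_t len;
} example_slice_t;

/* Returns a slice of len bytes at mem[offset], or {NULL, 0} if any part
 * of the requested range falls outside mem. */
example_slice_t checked_slice(const uint8_t *mem, uint32_t mem_len,
                              uint32_t offset, uint32_t len) {
    example_slice_t s = { NULL, 0 };
    assert(mem != NULL);
    /* Two comparisons, written so that offset + len can never overflow. */
    if (offset <= mem_len && len <= mem_len - offset) {
        s.ptr = mem + offset;
        s.len = len;
    }
    return s;
}
```

Unlike a string scan, the cost does not depend on the length of the data.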
138+
139+
## `uvm32_arg_getslice_fixed()`
140+
141+
Where the host knows in advance what size slice to expect, it can use:
142+
143+
`uvm32_slice_t uvm32_arg_getslice_fixed(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t arg, uint32_t len)`
144+
145+
For example, in the VM:
146+
147+
```c
float vector3_pos[3] = {0.5f, 10.0f, 13.2f};
syscall(PRINTVEC3, (uint32_t)vector3_pos, 0);
```
151+
152+
In the host:
153+
154+
```c
uvm32_slice_t slice = uvm32_arg_getslice_fixed(vmst, evt, ARG0, sizeof(float) * 3);
float *vector3 = (float *)slice.ptr;
...
```
158+
159+
## `uvm32_arg_setval()`
160+
161+
`void uvm32_arg_setval(uvm32_state_t *vmst, uvm32_evt_t *evt, uvm32_arg_t, uint32_t val);`
163+
164+
Writes a `uint32_t` value into the pointer in the argument. Other integer types can be handled through casting.
165+
166+
For example, in the VM:
167+
168+
```c
int8_t val;
syscall(GETI8, (uint32_t)&val, 0);
// val now equals -73
```
173+
174+
In the host:
175+
```c
int8_t x = -73;
uvm32_arg_setval(vmst, evt, ARG0, (uint32_t)x);
```
179+
180+
## Returning values
181+
182+
Currently, uvm32 only supports returning integer types from syscalls, as returning a bare pointer into the host's memory space is unsafe.
183+
184+
To return a value from a syscall, in the VM:
185+
186+
```c
uint32_t sum = syscall(ADD, 13, 17);
...
```
190+
191+
In the host:
192+
193+
```c
uint32_t a = uvm32_arg_getval(vmst, evt, ARG0);
uint32_t b = uvm32_arg_getval(vmst, evt, ARG1);
uvm32_arg_setval(vmst, evt, RET, a+b);
```
199+
200+
# syscall ABI
201+
202+
To make a syscall, register `a7` is set with the syscall number (a `UVM32_SYSCALL_x`) and `a0`, `a1` are set with the syscall parameters. The response is returned in `a2`.
203+
204+
[target.h](common/uvm32_target.h#L12)
205+
206+
```c
static uint32_t syscall(uint32_t id, uint32_t param1, uint32_t param2) {
    register uint32_t a0 asm("a0") = (uint32_t)(param1);
    register uint32_t a1 asm("a1") = (uint32_t)(param2);
    register uint32_t a2 asm("a2");
    register uint32_t a7 asm("a7") = (uint32_t)(id);

    asm volatile (
        "ecall"
        : "=r"(a2) // output
        : "r"(a7), "r"(a0), "r"(a1) // input
        : "memory"
    );
    return a2;
}
```
222+
The [RISC-V SBI](https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc) is not followed; a simpler approach is taken.
223+
224+
