docs/source/user_guide/fastcache.md
33 additions & 33 deletions
@@ -50,13 +50,43 @@ qd.init(arch=qd.gpu)
50
50
# qd.init(arch=qd.gpu, print_non_pure=True)
51
51
```
52
52
53
+
## Dataclass fields with cached values
54
+
55
+
By default, for `dataclasses.dataclass` parameters, fastcache includes only the *type* of each field in the cache key, not its value. This is fine for fields like ndarrays, where the compiled kernel depends only on the dtype and dimensionality, not on the actual data.
56
+
57
+
However, some dataclass fields hold configuration values that get baked into the compiled kernel — typically values used with `qd.static()`, such as loop bounds or feature flags:
58
+
59
+
```python
60
+
for i in qd.static(range(config.num_layers)):
61
+
    ...
62
+
```
63
+
64
+
Here the value of `num_layers` is compiled into the kernel: the loop is unrolled at compile time. If `num_layers` changes, a different kernel must be compiled.
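To see why an unrolled loop bakes its bound into the compiled result, here is a minimal plain-Python sketch. It is not how Quadrants compiles kernels; `unrolled_source` and `compile_unrolled` are hypothetical helpers that only mimic unrolling by generating one statement per iteration.

```python
def unrolled_source(num_layers: int) -> str:
    """Return source for a function whose loop over `num_layers` is unrolled."""
    lines = ["def run(x):"]
    for i in range(num_layers):
        # One statement per iteration: the bound is no longer a runtime value.
        lines.append(f"    x = x + {i}")
    lines.append("    return x")
    return "\n".join(lines)


def compile_unrolled(num_layers: int):
    """Compile the unrolled source and return the resulting function."""
    namespace = {}
    exec(unrolled_source(num_layers), namespace)
    return namespace["run"]


# Different bounds yield different generated code, hence different "kernels".
two = compile_unrolled(2)
three = compile_unrolled(3)
```

Because the generated source differs for each bound, a cache entry built for one value of `num_layers` cannot stand in for another.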
65
+
66
+
Mark such fields with `add_value_to_cache_key` so their values are included in the cache key:
67
+
68
+
```python
69
+
import dataclasses
70
+
from quadrants.lang._fast_caching import FIELD_METADATA_CACHE_VALUE
With this annotation, changing `num_envs` from 100 to 200 produces a different cache key so the correct compiled kernel is looked up (or compiled if not yet cached). Without it, the wrong kernel could be loaded.
80
+
81
+
Note: `@qd.data_oriented` objects and `qd.Template` parameters already include primitive values in the cache key automatically — this annotation is only needed for `dataclasses.dataclass` fields.
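The difference between type-only and value-aware keying can be illustrated with a small standard-library sketch. This is an assumption-laden toy, not the fastcache implementation: the metadata key `CACHE_VALUE`, the `cache_key` helper, and the fields of `SimConfig` are all hypothetical and chosen only for illustration.

```python
import dataclasses

# Hypothetical metadata key, used only in this sketch.
CACHE_VALUE = "cache_value"


@dataclasses.dataclass
class SimConfig:
    # Value is part of the key: it would be baked into the compiled kernel.
    num_layers: int = dataclasses.field(default=4, metadata={CACHE_VALUE: True})
    # Only the type is part of the key.
    dt: float = 0.01


def cache_key(obj) -> tuple:
    """Build a toy cache key from a dataclass instance."""
    parts = []
    for f in dataclasses.fields(obj):
        value = getattr(obj, f.name)
        entry = (f.name, type(value).__name__)
        if f.metadata.get(CACHE_VALUE):
            # Opted-in fields contribute their value as well as their type.
            entry += (value,)
        parts.append(entry)
    return tuple(parts)


same = cache_key(SimConfig(4, 0.01)) == cache_key(SimConfig(4, 0.5))
differs = cache_key(SimConfig(4)) != cache_key(SimConfig(8))
```

In this toy, changing `dt` leaves the key unchanged, while changing `num_layers` produces a new key, which mirrors the opt-in behavior described above.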
82
+
53
83
## Constraints
54
84
55
85
A kernel is eligible for fastcache only if all of the following hold:
56
86
57
87
### 1. All data flows through parameters
58
88
59
-
The kernel must receive every piece of data it operates on as an explicit parameter. It must **not** capture variables from the enclosing Python scope (closures over ndarrays, mutable globals, or any other external state). This is the core "purity" constraint — the compiled kernel's behavior must be fully determined by its arguments.
89
+
The kernel must receive every piece of data it operates on as an explicit parameter. It must **not** capture variables from the enclosing Python scope (closures over fields, ndarrays, or mutable globals). This is the core "purity" constraint — the compiled kernel's behavior must be fully determined by its arguments.
60
90
61
91
```python
62
92
a = qd.ndarray(qd.f32, (10,))
@@ -95,8 +125,8 @@ Fastcache supports the following parameter types:
| `dataclasses.dataclass` | Yes | member types recursively; member values if annotated with `FIELD_METADATA_CACHE_VALUE` (see [Appendix — compound-type cache keying](#compound-type-cache-keying)) |
99
-
| `@qd.data_oriented` objects | Yes | member types recursively; primitive member types and values baked into kernel (see [Appendix — compound-type cache keying](#compound-type-cache-keying)) |
128
+
| `dataclasses.dataclass` | Yes | field types recursively; field values if annotated with `add_value_to_cache_key` (see [above](#dataclass-fields-with-cached-values)) |
129
+
| `@qd.data_oriented` objects | Yes | member types and primitive member values recursively |
100
130
| `qd.Template` primitives (int, float, bool) | Yes | type and value (baked into kernel) |
101
131
| Non-template primitives (int, float, bool) | Yes | type only |
102
132
| `enum.Enum` | Yes | name and value |
@@ -142,33 +172,3 @@ print(obs.cache_stored) # True if the compiled kernel was stored to cach
142
172
```
143
173
144
174
On the first run you'll see `cache_stored=True` but `cache_loaded=False`. On the second run (after `qd.init`), `cache_loaded=True`.
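The store-then-load pattern can be mimicked with a toy on-disk cache. This is only a sketch under stated assumptions: `get_kernel` and the JSON "artifact" are hypothetical, and the sketch assumes `cache_stored` is `False` on a cache hit, which the text above does not state for the real flags.

```python
import json
import tempfile
from pathlib import Path


def get_kernel(cache_dir: Path, key: str) -> dict:
    """Return a toy 'compiled kernel' plus flags describing the cache path taken."""
    path = cache_dir / f"{key}.json"
    if path.exists():
        # Cache hit: reuse the stored artifact, nothing new is written.
        artifact = json.loads(path.read_text())
        return {"artifact": artifact, "cache_loaded": True, "cache_stored": False}
    # Cache miss: "compile", then persist the result for later runs.
    artifact = {"compiled_for": key}
    path.write_text(json.dumps(artifact))
    return {"artifact": artifact, "cache_loaded": False, "cache_stored": True}


with tempfile.TemporaryDirectory() as tmp:
    first = get_kernel(Path(tmp), "kernel_a")   # miss: compiled and stored
    second = get_kernel(Path(tmp), "kernel_a")  # hit: loaded from disk
```

The first call takes the miss path and writes the artifact; the second call with the same key finds it on disk and takes the load path.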
145
-
146
-
## Appendix
147
-
148
-
### Compound-type cache keying
149
-
150
-
The args hasher walks compound-type kernel parameters recursively. For each leaf member it decides what (if anything) contributes to the cache key. The headline rules:
151
-
152
-
**`@qd.data_oriented`:** the walker descends into `vars(obj)`. For each child:
153
-
154
-
- `qd.ndarray` member — `(dtype, ndim, layout)` is included in the cache key. Element values are not.
155
-
- Primitive (`int` / `float` / `bool` / `enum.Enum`) member — value is baked into the kernel (same semantics as a `qd.Template` primitive). Two instances of the same class with different primitive member values get different cache entries.
156
-
- Nested `@qd.data_oriented` member — recurses.
157
-
- Nested `dataclasses.dataclass` member — recurses (with the dataclass rules below).
158
-
- `qd.field` member — fastcache is disabled for the entire kernel call. The kernel still runs via normal compilation; a warn-level log line is emitted.
159
-
160
-
**`dataclasses.dataclass`:** the walker descends into the declared members. For each member, only the *type* is included in the cache key by default — **not** the value. To include a member's value, annotate it:
161
-
162
-
```python
163
-
import dataclasses
164
-
from quadrants.lang._fast_caching import FIELD_METADATA_CACHE_VALUE
This is necessary whenever the compiled kernel depends on the member's *value* rather than just its type (for example, when the value is used as a loop bound that the compiler bakes into the generated code). Without the annotation, two `SimConfig` instances with different `num_layers` values would share a fastcache key, and the second instance would silently load a kernel compiled for the wrong value.
173
-
174
-
Note the asymmetry: `@qd.data_oriented` primitive members are baked into the kernel automatically (same semantics as `qd.Template`); `dataclasses.dataclass` members contribute only their *type* to the cache key unless you opt in per-member.
docs/source/user_guide/tensor.md
0 additions & 9 deletions
@@ -205,15 +205,6 @@ fill(b) # ndarray branch
205
205
206
206
The kernel argument is unwrapped to the bare impl before the template-mapper / AST sees it, so kernel bodies still write `x[i, j]` and pay no per-call cost for the wrapper.
207
207
208
-
`qd.Tensor` is also the right annotation when storing a tensor as a `dataclasses.dataclass` member:
209
-
210
-
```python
211
-
@dataclass
212
-
class State:
213
-
    a: qd.Tensor
214
-
    b: qd.Tensor
215
-
```
216
-
217
208
## Pickle
218
209
219
210
`qd.Tensor` objects are picklable on **both** backends, including under non-identity layouts. Round-trip (pickle then unpickle) preserves the canonical data, the dtype, the shape, and the layout: