Commit fc6783a

docs: add guide on file uploads (#2017)
Co-authored-by: Benjie <[email protected]>
1 parent 8304619 commit fc6783a

2 files changed: 86 additions, 0 deletions

src/pages/learn/_meta.ts

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ export default {
   "best-practices": "",
   "thinking-in-graphs": "",
   "serving-over-http": "",
+  "file-uploads": "",
   authorization: "",
   pagination: "",
   "schema-design": "Schema Design",

src/pages/learn/file-uploads.mdx

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
# Handling File Uploads in GraphQL

GraphQL was not designed with file uploads in mind. While it’s technically possible to implement them, doing so requires
extending the transport layer and introduces several risks, both in security and reliability.

This guide explains why file uploads via GraphQL are problematic and presents safer alternatives.

## Why uploads are challenging

The [GraphQL specification](https://spec.graphql.org/draft/) is transport-agnostic and serialization-agnostic (though HTTP and JSON are the most prevalent combination seen in the community).
GraphQL was designed to work with relatively small requests from clients, and was not designed with handling binary data in mind.

File uploads, by contrast, typically handle binary data such as images and PDFs, which many encodings, including JSON, cannot represent directly.
One option is to nest one encoding inside another (e.g. a base64-encoded string within the JSON), but this is inefficient: base64 inflates the payload by roughly a third, and it does not lend itself to streamed processing, so it is unsuitable for larger binary files.
Instead, `multipart/form-data` is a common choice for transferring binary data, but it is not without its own set of complexities.

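To put a rough number on the base64 overhead: padded base64 turns every 3 input bytes into 4 output characters, before any JSON escaping or framing is counted. The following is a minimal sketch of that arithmetic; the function name is illustrative and not part of any library.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` bytes.
// Every group of 3 input bytes (including a final partial group) becomes
// 4 output characters.
function base64EncodedLength(byteLength: number): number {
  if (!Number.isInteger(byteLength) || byteLength < 0) {
    throw new RangeError("byteLength must be a non-negative integer");
  }
  return 4 * Math.ceil(byteLength / 3);
}

// Example: a 10 MiB file grows by roughly a third once encoded.
const tenMiB = 10 * 1024 * 1024;
const encodedLength = base64EncodedLength(tenMiB);
const overheadBytes = encodedLength - tenMiB;
```

The whole encoded string also has to sit inside a single JSON value, which is why this approach is hard to process as a stream.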
Supporting uploads over GraphQL usually involves adopting community conventions, the most prevalent of which is the
[GraphQL multipart request specification](https://github.com/jaydenseric/graphql-multipart-request-spec).
This specification has been successfully implemented in many languages and frameworks, but users
implementing it must pay very close attention to ensure that they do not introduce
security or reliability concerns.

## Risks to be aware of

### Memory exhaustion from repeated variables

GraphQL operations allow the same variable to be referenced multiple times. If a file upload variable is reused, the underlying
stream may be read multiple times or prematurely drained. This can result in incorrect behavior or memory exhaustion.

A safe practice is to use trusted documents or a validation rule to ensure each upload variable is referenced exactly once.

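As an illustration of the "exactly once" rule, here is a deliberately naive sketch that counts textual references to a variable. It assumes the document contains a single operation that declares the variable once, and it does not parse GraphQL, so a `$name` inside a string literal or comment would be miscounted. Both function names are hypothetical; a production check would more likely be a validation rule that walks the parsed document.

```typescript
// Counts uses of `$name` in a GraphQL document, excluding its declaration.
// Assumption: one operation, declaring `$name` exactly once.
function countVariableUses(document: string, name: string): number {
  if (!/^[_A-Za-z][_0-9A-Za-z]*$/.test(name)) {
    throw new Error(`Invalid variable name: ${name}`);
  }
  // Match `$name` only when it is not the prefix of a longer name
  // (so `$file` does not match inside `$fileName`).
  const pattern = new RegExp(`\\$${name}(?![_0-9A-Za-z])`, "g");
  const occurrences = (document.match(pattern) ?? []).length;
  // One occurrence is the declaration in the operation's variable list.
  return Math.max(0, occurrences - 1);
}

// Throws unless the upload variable is referenced exactly once.
function assertUploadUsedOnce(document: string, name: string): void {
  const uses = countVariableUses(document, name);
  if (uses !== 1) {
    throw new Error(
      `Upload variable $${name} must be used exactly once, found ${uses}`,
    );
  }
}
```

With trusted documents this check can run once at build time instead of on every request.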
### Stream leaks on failed operations

GraphQL executes in phases: validation, then execution. If validation fails or an authorization check prematurely terminates execution, uploaded
file streams may never be consumed. If your server buffers or retains these streams, it can cause memory leaks.

To avoid this, ensure that all streams are terminated when the request finishes, whether or not they were consumed in resolvers.
An alternative to consider is writing incoming files to temporary storage immediately, and passing references (like filenames) into
resolvers. Ensure this storage is cleaned up after request completion, regardless of success or failure.

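One way to guarantee that cleanup runs on every path is a `try`/`finally` wrapper around the whole request. This is a minimal sketch: the `UploadStream` interface and `withUploadCleanup` are hypothetical names, and it assumes that calling `destroy()` on an already finished stream is harmless (as it is for Node.js readable streams).

```typescript
// Hypothetical minimal view of an upload stream. Node.js readable streams
// expose `destroy()`, but this interface is an assumption of the sketch.
interface UploadStream {
  destroy(): void;
}

// Runs the request handler, then destroys every upload stream, whether the
// handler returned normally, failed validation, or threw.
async function withUploadCleanup<T>(
  uploads: UploadStream[],
  run: () => Promise<T>,
): Promise<T> {
  try {
    return await run();
  } finally {
    for (const upload of uploads) {
      try {
        upload.destroy();
      } catch {
        // Keep going so one failing stream cannot block the others.
      }
    }
  }
}
```

The same shape works for the temporary-storage variant: replace `destroy()` with deletion of the temporary file.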
### Cross-Site Request Forgery (CSRF)

`multipart/form-data` is one of the content types permitted in a “simple” request under the CORS spec, so a cross-origin upload
does not trigger a preflight check. Without explicit CSRF protection, your GraphQL server may unknowingly accept uploads from malicious origins.

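One possible mitigation (not the only one) is to require a custom request header on any request whose content type would otherwise skip the preflight, since browsers must preflight cross-origin requests that carry headers outside the CORS safelist. The sketch below assumes lower-cased header keys, and the header name `x-graphql-require-preflight` is made up for this example.

```typescript
// Content types a browser may send cross-origin without a CORS preflight.
const SIMPLE_CONTENT_TYPES = [
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
];

// Hypothetical header name: any header outside the CORS safelist would do.
const PREFLIGHT_HEADER = "x-graphql-require-preflight";

// `headers` is assumed to use lower-cased keys.
function passesCsrfCheck(headers: Record<string, string | undefined>): boolean {
  const raw = headers["content-type"] ?? "";
  const semicolon = raw.indexOf(";");
  // Drop parameters such as `; boundary=...` before comparing.
  const contentType = (semicolon === -1 ? raw : raw.slice(0, semicolon))
    .trim()
    .toLowerCase();
  const isSimple =
    contentType === "" || SIMPLE_CONTENT_TYPES.indexOf(contentType) !== -1;
  // Other types (e.g. application/json) already force a preflight.
  if (!isSimple) return true;
  const marker = headers[PREFLIGHT_HEADER];
  return marker !== undefined && marker !== "";
}
```

Legitimate clients then add the header to every upload request, while a cross-origin form post cannot set it without a preflight that the server's CORS policy can refuse.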
### Oversized or excess payloads

Attackers may submit very large uploads or include extraneous files under unused variable names. Servers that accept and
buffer these can be overwhelmed.

Enforce request size caps and reject any files not explicitly referenced in the `map` field of the multipart payload.

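A sketch of both checks follows. In the multipart request spec, `map` is a JSON object from multipart field names to the paths in `operations` that receive each file. The `PartInfo` type, the function name, and the limits are assumptions made for this illustration; a real server would enforce the size cap while streaming and abort early instead of measuring afterwards.

```typescript
// Shape of the `map` field: multipart field name -> target paths.
type UploadMap = Record<string, string[]>;

// Hypothetical summary of a received file part.
interface PartInfo {
  fieldName: string; // multipart field name of the file part
  byteLength: number; // size observed while reading the part
}

// Illustrative limits, not recommended values.
const MAX_FILES = 5;
const MAX_FILE_BYTES = 10 * 1024 * 1024;

// Returns a list of human-readable problems; empty means acceptable.
function findUploadViolations(map: UploadMap, parts: PartInfo[]): string[] {
  const violations: string[] = [];
  if (parts.length > MAX_FILES) {
    violations.push(`too many files: ${parts.length}`);
  }
  for (const part of parts) {
    // Use hasOwnProperty so names like "constructor" are not treated as mapped.
    const mapped = Object.prototype.hasOwnProperty.call(map, part.fieldName)
      ? map[part.fieldName]
      : undefined;
    if (!mapped || mapped.length === 0) {
      violations.push(`file "${part.fieldName}" is not referenced in map`);
    }
    if (part.byteLength > MAX_FILE_BYTES) {
      violations.push(`file "${part.fieldName}" exceeds ${MAX_FILE_BYTES} bytes`);
    }
  }
  return violations;
}
```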
### Untrusted file metadata

Information such as file names, MIME types, and contents should never be trusted. To mitigate risk:

- Sanitize filenames to prevent path traversal or injection issues.
- Sniff file types independently of declared MIME types, and reject mismatches.
- Validate file contents. Be aware of format-specific exploits like zip bombs or maliciously crafted PDFs.

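The first two bullets can be sketched as follows. The function names and the allowed character set are assumptions for this example, and the signature table covers only three formats by their well-known leading bytes. Matching a signature says nothing about whether the rest of the file is safe, so content validation is still required.

```typescript
// Keeps only the final path segment and a conservative character set.
function sanitizeFilename(untrusted: string): string {
  const base = untrusted.split(/[\\/]/).pop() ?? "";
  const cleaned = base
    .replace(/[^A-Za-z0-9._-]/g, "_") // replace anything unexpected
    .replace(/^\.+/, ""); // no leading dots (hidden files, "..")
  return cleaned.length > 0 ? cleaned.slice(0, 255) : "upload";
}

// Leading "magic bytes" of a few common formats.
const SIGNATURES: Array<{ mime: string; bytes: number[] }> = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// Returns the sniffed MIME type, or null when no known signature matches.
function sniffMimeType(head: Uint8Array): string | null {
  for (const { mime, bytes } of SIGNATURES) {
    if (head.length >= bytes.length && bytes.every((b, i) => head[i] === b)) {
      return mime;
    }
  }
  return null;
}

// True only when the declared type equals the sniffed one.
function declaredTypeMatches(declared: string, head: Uint8Array): boolean {
  const sniffed = sniffMimeType(head);
  return sniffed !== null && sniffed === declared.trim().toLowerCase();
}
```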
## Recommendation: Use signed URLs

The most secure and scalable approach is to avoid uploading files through GraphQL entirely. Instead:

1. Use a GraphQL mutation to request a signed upload URL from your storage provider (e.g., Amazon S3).
2. Upload the file directly from the client using that URL.
3. Submit a second mutation to associate the uploaded file with your application’s data (or use an automatically triggered process, such as AWS Lambda, to do the same).

You should ensure that these file uploads are only retained for a short period such that an attacker completing only steps 1 and 2 will not exhaust your storage.
When processing the file upload (step 3), the file should be moved to more permanent storage as appropriate.

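From the client's side, the three-step flow above might look like the sketch below. Every name here is a hypothetical stand-in: `requestUploadUrl` and `attachUpload` would wrap GraphQL mutations on your API, and `putFile` would wrap an HTTP request to the storage provider. Passing them in as dependencies keeps the sketch independent of any particular client library.

```typescript
// Hypothetical operations supplied by the application.
interface UploadFlowDeps {
  // Step 1: GraphQL mutation returning a short-lived signed URL.
  requestUploadUrl(filename: string): Promise<{ uploadId: string; url: string }>;
  // Step 2: direct HTTP upload to the storage provider.
  putFile(url: string, body: Uint8Array): Promise<void>;
  // Step 3: GraphQL mutation linking the stored file to application data.
  attachUpload(uploadId: string): Promise<void>;
}

async function uploadViaSignedUrl(
  deps: UploadFlowDeps,
  filename: string,
  body: Uint8Array,
): Promise<string> {
  const { uploadId, url } = await deps.requestUploadUrl(filename);
  // The bytes go straight to storage and never pass through the GraphQL server.
  await deps.putFile(url, body);
  // If this step never happens, the short retention period reclaims the file.
  await deps.attachUpload(uploadId);
  return uploadId;
}
```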
This separates responsibilities cleanly, protects your server from binary data handling, and aligns with best practices for
modern web architecture.

## If you still choose to support uploads

If your application truly requires file uploads through GraphQL, proceed with caution. At a minimum, you should:

- Use a well-maintained implementation of the
[GraphQL multipart request spec](https://github.com/jaydenseric/graphql-multipart-request-spec).
- Enforce a rule that upload variables are only referenced once.
- Stream uploads to disk or cloud storage rather than buffering them in memory.
- Ensure that streams are always terminated when the request ends, whether or not they were consumed.
- Apply strict request size limits and validate all fields.
- Treat file names, types, and contents as untrusted data.
