
# Serverless: Federated GraphQL

Ever wanted to run Federated GraphQL in AWS Lambda?

In this repo we compare Cold/Warm Start times of Federated GraphQL solutions (Cosmo, Mesh, Apollo Gateway, Apollo Router). We provide a minimal/scrappy build of Apollo Router for Lambda via Amazon Linux 2, as well as a version of Cosmo built in Go that you can use.

- ⚡️ TL;DR: I'd recommend the lambda-cosmo-custom alternative (available as bootstrap-cosmo-arm in the Releases), which is a lot less hacky and much more performant (300-500ms Cold Starts). See the README for details on how to use it.

- ⚡️ TL;DR 2: Of the Apollo Router variants, lambda-directly-optimized beats all other variants and is on par with the alternatives for Cold and Warm Starts (use the bootstrap-directly-optimized-graviton-arm-size binary).


## Motivation

Serverless is great for getting started with a low-cost Cloud setup that'll scale you from zero to profitable without having to worry about infrastructure overhead. That said, there are not many great Federated GraphQL solutions that work out-of-the-box for Serverless. Router from Apollo and Cosmo from Wundergraph are both tailored to long-running processes, e.g. in a k8s cluster. Mesh and Apollo Gateway are both JavaScript programs, which incur a massive penalty in Cold Start times and are thus not a great solution.

## Methodology

Currently, Apollo Router does not support running in AWS Lambda (apollographql/router#364). Instead, it focuses on running as a long-lived process, which means it is less optimized for quick startup and is built with dependencies that do not mesh well with Lambda's Amazon Linux 2 environment. The same goes for Cosmo, although there we can actually use the binary directly, with a bit of indirection.

But what if we were a little bit creative? Could we get it to work? The answer is: Yes! (sorta)...

This repository contains five examples:

- lambda-with-server/: Spins up an Apollo Router using the apollo-router crate, and proxies Lambda Events to the local HTTP server.
- lambda-directly/: Uses the TestHarness that Apollo Router uses in its own tests to make GraphQL requests without needing a full Router. The Lambda takes the incoming event, runs it through the TestHarness, and returns the result.
- lambda-directly-optimized/: Same approach as lambda-directly, but the TestHarness is constructed once and reused across all invocations. Loading the configuration and initializing the Supergraph are also moved into Lambda's Initialization phase, which runs at full resources. Additionally, we build this for the ARM architecture and optimize it for the AWS Graviton CPU.
- lambda-cosmo/: A small Rust wrapper that starts the Cosmo binary and proxies events to the server.
- lambda-cosmo-custom/: Spins up a Cosmo server using the Cosmo Router and proxies Lambda Events to the local HTTP server, similar to lambda-with-server.

We apply some additional tricks to reduce the size of the Apollo variants in the bootstrap-directly-optimized-graviton-arm-size binary, which has a noticeable impact on Cold Starts.

Check out the code and Dockerfile for each. There's really not a lot going on, and each is a minimal implementation compared to what you'd want in Production. My current recommendation would be to either use the bootstrap-directly-optimized-graviton-arm binary produced from the lambda-directly-optimized approach in AWS Lambda, or to run Apollo Router in App Runner, where it performs extremely well (I can max out the allowed 200 concurrent requests on a 0.25 CPU and 0.5 GB Memory setting).

## Measurements

| Measurement (ms) | GraphQL Mesh (512 MB) | GraphQL Mesh (1024 MB) | GraphQL Mesh (2048 MB) | lambda-directly-optimized (512 MB) | lambda-directly-optimized (1024 MB) | lambda-directly-optimized (2048 MB) | Cosmo (512 MB) | Cosmo (1024 MB) | Cosmo (2048 MB) | Apollo Gateway (512 MB) | Apollo Gateway (1024 MB) | Apollo Gateway (2048 MB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Average warm start response time | 10.2 ms | 10 ms | 10.3 ms | 6.8 ms | 6.2 ms | 6.8 ms | 10.7 ms | 10 ms | 9.8 ms | 8.8 ms | 8.9 ms | 9.8 ms |
| Average cold start response time | 615.9 ms | 609.8 ms | 565.2 ms | 703.3 ms | 681.8 ms | 678 ms | 442.9 ms | 464.7 ms | 427.7 ms | 1037.7 ms | 871.2 ms | 851 ms |
| Fastest warm response time | 6.9 ms | 7.9 ms | 8 ms | 5 ms | 5 ms | 6 ms | 6.9 ms | 7.9 ms | 7.9 ms | 6.9 ms | 6.9 ms | 6.9 ms |
| Slowest warm response time | 38.9 ms | 38.9 ms | 38.9 ms | 11 ms | 11 ms | 9 ms | 19 ms | 11.9 ms | 10.9 ms | 12 ms | 12 ms | 10.9 ms |
| Fastest cold response time | 495.9 ms | 495.9 ms | 495.9 ms | 625 ms | 625 ms | 625 ms | 328 ms | 328 ms | 328 ms | 797 ms | 797 ms | 797 ms |
| Slowest cold response time | 877 ms | 786.9 ms | 786.9 ms | 2724 ms | 804 ms | 724.9 ms | 581 ms | 531 ms | 505 ms | 1170 ms | 1039.9 ms | 898 ms |

Of the Apollo variants specifically:

| Approach | Advantages | Performance |
| --- | --- | --- |
| lambda-with-server | Full router functionality (almost) | Cold Start: ~1.58s<br>Warm Start: ~49ms |
| lambda-directly | No need to wait for a server to start first (lower overhead) | Cold Start: ~1.32s<br>Warm Start: ~314ms |
| lambda-directly-optimized | No need to wait for a server to start first (lower overhead)<br>Built for ARM<br>Optimized for the Graviton CPU | Optimized for size:<br>Cold Start: ~0.7s<br>Warm Start: ~20ms<br><br>Optimized for speed:<br>Cold Start: ~0.9s<br>Warm Start: ~20ms |

## How to use

Each of the approaches is generic and can be used as-is. Simply grab whichever variant you want from the Releases page, where the bootstrap artifact from each build is uploaded.

For example, let's say we want to try running lambda-directly-optimized.

First we create a folder to hold our artifacts, which we will .zip up and deploy to Lambda:

```bash
$ mkdir apollo-router
```

Then we download the relevant binary from the latest release:

```bash
$ curl -sSL https://github.com/codetalkio/apollo-router-lambda/releases/latest/download/bootstrap-directly-optimized-graviton-arm-size -o bootstrap
$ mv bootstrap ./apollo-router/bootstrap
```

Now we just need to add our router.yaml and supergraph.graphql, since the services look these up from the same folder during startup:

```bash
# From wherever your router.yaml is:
$ cp router.yaml ./apollo-router/router.yaml
# From wherever your supergraph.graphql is:
$ cp supergraph.graphql ./apollo-router/supergraph.graphql
```
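If you don't have a router.yaml yet, a minimal one can be sketched like this (the listen address below is an assumption; adjust it to match your own Apollo Router configuration):

```shell
# Minimal sketch of a router.yaml; the listen address is an assumption,
# not taken from this repo's configuration.
mkdir -p apollo-router
cat > apollo-router/router.yaml <<'EOF'
supergraph:
  listen: 127.0.0.1:4000
EOF
cat apollo-router/router.yaml
```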

You now have the following contents in your apollo-router folder:

```
.
└── apollo-router
    ├── bootstrap
    ├── router.yaml
    └── supergraph.graphql
```

And you're ready to deploy using your preferred method of AWS CDK/SAM/SLS/SST/CloudFormation/Terraform.

## Comparison: Federation via Apollo Router (Cold Start)

The lambda-directly-optimized approach is the only one that enters the realm of "acceptable" cold starts: still high, but almost always below 1 second. The other two approaches unfortunately have quite high cold start times; lambda-directly wins by a tiny margin, but neither is great. None of the variants talk to any Subgraphs, so this purely measures startup overhead.

### lambda-with-server

Direct Router Cold Screenshot 2023-10-31 at 20 50 17

A good 450ms of this is spent just waiting for the Router to spin up:

Apollo as Server in Lambda Screenshot 2023-10-30 at 23 19 50

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
| --- | --- | --- | --- | --- | --- |
| Average warm start response time | 8.3 ms | 8.7 ms | 7.6 ms | 7.6 ms | 8 ms |
| Average cold start response time | 2870.9 ms | 2570.4 ms | 2174.1 ms | 1012.8 ms | 943.4 ms |
| Fastest warm response time | 6 ms | 6 ms | 6 ms | 6.9 ms | 6.9 ms |
| Slowest warm response time | 16.9 ms | 16.9 ms | 16.9 ms | 16.9 ms | 16.9 ms |
| Fastest cold response time | 837 ms | 837 ms | 837 ms | 837 ms | 837 ms |
| Slowest cold response time | 3861.9 ms | 3861.9 ms | 2612.9 ms | 1625 ms | 1139 ms |

### lambda-directly

Lambda Router Cold Screenshot 2023-10-31 at 20 57 47

### lambda-directly-optimized (optimized for speed)

Cold start (No Query) Screenshot 2023-11-11 at 23 25 15

A few samples of lambda-directly-optimized (optimized for speed) Cold Starts:

Overview of Cold starts (No Query) Screenshot 2023-11-11 at 23 24 59

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
| --- | --- | --- | --- | --- | --- |
| Average warm start response time | 9.7 ms | 5.4 ms | 5.6 ms | 6.1 ms | 5.8 ms |
| Average cold start response time | 858 ms | 837.6 ms | 775.5 ms | 768.3 ms | 753.2 ms |
| Fastest warm response time | 4.9 ms | 4.9 ms | 4.9 ms | 4.9 ms | 4.9 ms |
| Slowest warm response time | 23 ms | 8 ms | 7 ms | 7 ms | 7 ms |
| Fastest cold response time | 719 ms | 719 ms | 719 ms | 719 ms | 719 ms |
| Slowest cold response time | 1075 ms | 981.9 ms | 981.9 ms | 981.9 ms | 868 ms |

### lambda-directly-optimized (optimized for size)

Cold start (No Query) Screenshot 2023-11-13 at 18 01 42

A few samples of lambda-directly-optimized (optimized for size) Cold Starts:

Overview of Cold starts (No Query) Screenshot 2023-11-13 at 18 01 10

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
| --- | --- | --- | --- | --- | --- |
| Average warm start response time | 5.2 ms | 5.6 ms | 5.2 ms | 5.6 ms | 5.5 ms |
| Average cold start response time | 735.8 ms | 735.6 ms | 698.1 ms | 698.8 ms | 688.1 ms |
| Fastest warm response time | 4 ms | 4 ms | 4.9 ms | 4.9 ms | 4.9 ms |
| Slowest warm response time | 72.9 ms | 20.9 ms | 9.9 ms | 8 ms | 8 ms |
| Fastest cold response time | 617 ms | 617 ms | 617 ms | 617 ms | 617 ms |
| Slowest cold response time | 985 ms | 985 ms | 894.9 ms | 894.9 ms | 762 ms |

## Comparison: Federation via Apollo Router (Warm Start)

Here we see both lambda-directly-optimized and lambda-with-server shine. Once the Apollo Router/TestHarness has started, there is relatively little overhead. lambda-directly, on the other hand, builds a new TestHarness on each request and keeps paying that high cost on every invocation, slowing it down.

These examples talk to one warm subgraph implemented in Rust, to simulate a realistic warm run.

### lambda-with-server

Direct Router Warm (Products query)  Screenshot 2023-10-31 at 20 50 48

### lambda-directly

Lambda Router Warm (Products query) Screenshot 2023-10-31 at 20 41 43

### lambda-directly-optimized (optimized for size)

Warm start (talking to Products) Screenshot 2023-11-11 at 23 48 46

## Comparison: Rust Subgraph in AWS Lambda

For reference, to show how far we could go, here's a subgraph implemented in Rust using async-graphql and wrapped up in cargo-lambda.

Cold Start (201ms):

Screenshot 2023-10-21 at 12 13 18

Warm Start (8ms):

Screenshot 2023-10-21 at 12 14 41

## Comparison: Federation via Apollo Gateway

To have something to compare the Apollo Router PoC against more directly, here's an alternative using Apollo Gateway.

Cold start (1.23s):

Cold start ms-gateway Screenshot 2023-10-22 at 21 45 34

Warm start (120ms):

Warm start subgraph times Screenshot 2023-10-22 at 16 13 26

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | 512 MB | 1024 MB | 2048 MB |
| --- | --- | --- | --- |
| Average warm start response time | 8.8 ms | 8.9 ms | 9.8 ms |
| Average cold start response time | 1037.7 ms | 871.2 ms | 851 ms |
| Fastest warm response time | 6.9 ms | 6.9 ms | 6.9 ms |
| Slowest warm response time | 12 ms | 12 ms | 10.9 ms |
| Fastest cold response time | 797 ms | 797 ms | 797 ms |
| Slowest cold response time | 1170 ms | 1039.9 ms | 898 ms |

## Comparison: Federation via GraphQL Mesh

Another comparison point against the Apollo Router PoC, here's one alternative using GraphQL Mesh.

Cold start (956ms):

Cold start ms-mesh Screenshot 2023-10-22 at 21 42 45

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | 512 MB | 1024 MB | 2048 MB |
| --- | --- | --- | --- |
| Average warm start response time | 10.2 ms | 10 ms | 10.3 ms |
| Average cold start response time | 615.9 ms | 609.8 ms | 565.2 ms |
| Fastest warm response time | 6.9 ms | 7.9 ms | 8 ms |
| Slowest warm response time | 38.9 ms | 38.9 ms | 38.9 ms |
| Fastest cold response time | 495.9 ms | 495.9 ms | 495.9 ms |
| Slowest cold response time | 877 ms | 786.9 ms | 786.9 ms |

## Comparison: Federation via Cosmo Router

Another comparison point against the Apollo Router PoC, here's one alternative using Cosmo Router, using the variant from lambda-cosmo-custom.

Cold start (339ms):

ms-cosmo best (Cold, No query) Screenshot 2023-12-06 at 22 57 57

Breakdown of only the router (making no queries to subgraphs):

| Measurement (ms) | ms-cosmo (512 MB) | ms-cosmo (1024 MB) | ms-cosmo (2048 MB) |
| --- | --- | --- | --- |
| Average warm start response time | 10.7 ms | 10 ms | 9.8 ms |
| Average cold start response time | 442.9 ms | 464.7 ms | 427.7 ms |
| Fastest warm response time | 6.9 ms | 7.9 ms | 7.9 ms |
| Slowest warm response time | 19 ms | 11.9 ms | 10.9 ms |
| Fastest cold response time | 328 ms | 328 ms | 328 ms |
| Slowest cold response time | 581 ms | 531 ms | 505 ms |