Bachelor thesis project: deploying open-source LLMs with Ollama on AWS cloud infrastructure with GPU support.
Author: Stepan Konecny
Client -> API Gateway (REST, API key auth) -> VPC Link -> NLB -> EC2 (g5.xlarge, NVIDIA GPU) -> Ollama (Docker)
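A client call follows the path above: the request authenticates at API Gateway with an API key and is proxied through the VPC Link and NLB to Ollama's standard `/api/generate` endpoint. A minimal sketch (assuming the gateway stage proxies paths straight through to Ollama; adjust the route and model name to your deployment):

```python
"""Minimal sketch of calling the deployed Ollama API through API Gateway."""
import json
import sys
import urllib.request


def build_request(api_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming /api/generate request with API-key auth."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{api_url.rstrip('/')}/api/generate",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # API Gateway usage-plan key
        },
        method="POST",
    )


def generate(api_url: str, api_key: str, model: str, prompt: str) -> str:
    """Send the prompt and return the model's full response text."""
    with urllib.request.urlopen(build_request(api_url, api_key, model, prompt)) as resp:
        return json.load(resp)["response"]


if __name__ == "__main__" and len(sys.argv) >= 3:
    # usage: python client_sketch.py <API_URL> <API_KEY>
    print(generate(sys.argv[1], sys.argv[2], "llama3", "Why is the sky blue?"))
```

The repo's `examples/client.py` provides the full client; this sketch only illustrates the request shape.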
```
ollama-aws/
├── terraform/               # Infrastructure as Code (Terraform)
│   ├── main.tf              # VPC, EC2 (g5.xlarge), NLB, API Gateway, security groups
│   ├── variables.tf         # Configurable parameters (region, AMI, SSH key)
│   └── outputs.tf           # Terraform outputs (instance IP, API URL, API key)
├── scripts/
│   ├── setup-gpu.sh         # EC2 user_data - installs NVIDIA drivers, Docker, Ollama
│   └── deploy-ollama.sh     # Manual redeployment script for the Ollama container
├── examples/
│   ├── client.py            # Python client - generate, chat, list models
│   └── client.js            # JavaScript/Node.js client - same functionality
├── tests/
│   ├── test_api.py          # Functional tests - connectivity, auth, generation, error handling
│   └── benchmark.py         # Performance benchmark - latency, TTFT, throughput
└── docker-compose.yml       # Docker Compose config for Ollama with GPU passthrough
```
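The GPU passthrough in `docker-compose.yml` typically takes this shape (a sketch, not necessarily the repo's exact file; image tag, volume name, and restart policy are assumptions):

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"               # Ollama's default API port, targeted by the NLB
    volumes:
      - ollama-data:/root/.ollama   # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all            # expose the A10G GPU to the container
              capabilities: [gpu]
    restart: unless-stopped

volumes:
  ollama-data:
```

The `devices` reservation requires the NVIDIA Container Toolkit on the host, which `scripts/setup-gpu.sh` installs.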
| Resource | Details |
|---|---|
| VPC | 10.0.0.0/16, public + private subnet |
| EC2 | g5.xlarge (NVIDIA A10G GPU, 24GB VRAM), Ubuntu 24.04, 100GB gp3 |
| NLB | Internal Network Load Balancer, TCP forwarding to port 11434 |
| API Gateway | REST API with API key authentication, VPC Link to NLB |
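The EC2 row of the table maps to a Terraform resource along these lines (a sketch; variable and resource names are assumptions, not the repo's actual `main.tf`):

```hcl
resource "aws_instance" "ollama" {
  ami           = var.ami_id       # Ubuntu 24.04 AMI for the chosen region
  instance_type = "g5.xlarge"      # 1x NVIDIA A10G, 24 GB VRAM
  subnet_id     = aws_subnet.private.id
  key_name      = var.ssh_key_name
  user_data     = file("${path.module}/../scripts/setup-gpu.sh")

  root_block_device {
    volume_size = 100              # GB
    volume_type = "gp3"
  }
}
```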
```sh
# Deploy infrastructure
cd terraform
terraform init
terraform apply

# Test API
python tests/test_api.py <API_URL> <API_KEY>

# Run benchmark
python tests/benchmark.py <API_URL> <API_KEY>
```
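The benchmark's core metrics can be sketched as follows (hypothetical helpers, not the repo's `benchmark.py`): TTFT is the delay until the first streamed chunk arrives, and throughput is tokens produced per second of generation time after that first chunk.

```python
"""Sketch of streaming-benchmark metrics against Ollama's /api/generate."""
import json
import time
import urllib.request


def summarize(tokens: int, ttft_s: float, total_s: float) -> dict:
    """Derive throughput from token count, time-to-first-token, and total latency."""
    gen_time = total_s - ttft_s
    return {
        "ttft_s": ttft_s,
        "latency_s": total_s,
        "tokens_per_s": tokens / gen_time if gen_time > 0 else 0.0,
    }


def measure(api_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Stream /api/generate, timing the first and last chunks."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    req = urllib.request.Request(
        f"{api_url.rstrip('/')}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    start = time.perf_counter()
    ttft = None
    tokens = 0
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # Ollama streams one JSON object per line
            if ttft is None:
                ttft = time.perf_counter() - start
            if not json.loads(line).get("done"):
                tokens += 1  # approximation: one streamed chunk per token
    return summarize(tokens, ttft or 0.0, time.perf_counter() - start)
```

Measuring per-chunk timestamps like this is what lets the benchmark separate queueing/network latency (TTFT) from raw generation speed (tokens/s).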