
LLMCache - API Caching Proxy

Never pay for the same LLM call twice.

A high-performance API caching proxy built with FastAPI and SQLAlchemy that provides intelligent caching for HTTP requests. By caching API responses and serving them again when the same request comes in, it cuts costs and improves response times. It is optimized for Large Language Model (LLM) APIs and other usage-based services, where every avoided call is money saved.

Quick start

Set up the env file. You can use the .env.example file as a template:

API_SECRET_TOKEN=my-secret-token
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/proxy_cache

Start LLMCache locally with Docker Compose:

docker compose -f docker-compose.local.yml up -d
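Once the containers are up, you can sanity-check the proxy by sending a request through it directly. A minimal sketch with httpx; it assumes the proxy appends the request path to x-proxy-base-url and forwards the Authorization header to the upstream API:

import httpx

# Lists OpenAI models through the proxy; repeating the same request
# should be answered from the cache.
response = httpx.get(
    "http://localhost:8001/models",
    headers={
        "x-proxy-auth": "my-secret-token",
        "x-proxy-base-url": "https://api.openai.com/v1",
        "Authorization": "Bearer <your-openai-key>",
    },
)
print(response.status_code)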

You can now use it with any API you need to cache, including the most common LLM client libraries. Example for the OpenAI Python SDK:

import openai

# Route OpenAI calls through LLMCache: x-proxy-auth must match API_SECRET_TOKEN,
# and x-proxy-base-url is the upstream API the proxy forwards to.
openai.api_base = "http://localhost:8001"
openai.default_headers = {
    "x-proxy-auth": "my-secret-token",
    "x-proxy-base-url": "https://api.openai.com/v1",
}
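If you use version 1.x of the openai package, the same headers can go on the client constructor instead of the module-level globals. A minimal sketch, assuming OPENAI_API_KEY is set in your environment and using a placeholder model name:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8001",
    default_headers={
        "x-proxy-auth": "my-secret-token",
        "x-proxy-base-url": "https://api.openai.com/v1",
    },
)

# Repeating this exact call should be answered from the cache.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever model you normally call
    messages=[{"role": "user", "content": "Hello"}],
)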

Example for the Anthropic Python SDK:

import anthropic
import httpx

client = anthropic.Anthropic(
    base_url="http://localhost:8001",
    # Attach the proxy headers to every request made by the SDK.
    http_client=httpx.Client(
        headers={
            "x-proxy-auth": "my-secret-token",
            "x-proxy-base-url": "https://api.anthropic.com",
        },
    ),
)
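A quick way to see the cache in action (a sketch with a placeholder model name; the second identical call should be served from the cache instead of hitting the Anthropic API):

# Assumes ANTHROPIC_API_KEY is set in your environment.
message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello"}],
)
print(message.content)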

How It Works

  1. Incoming requests are authenticated using the X-Proxy-Auth header
  2. A unique cache key is generated based on (see the sketch after this list):
    • HTTP method
    • Path
    • Headers (filtered)
    • Query parameters/POST data
  3. If a valid cached response exists, it's returned immediately
  4. If no cache exists or it's expired, the request is forwarded to the target API
  5. The response is cached for future use
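To make the caching step concrete, here is a minimal sketch of the cache-key idea in Python. The function name and the exact set of filtered headers are hypothetical, not the project's actual implementation:

import hashlib
import json

def cache_key(method: str, path: str, headers: dict, body: bytes) -> str:
    # Drop headers that change on every request or must never influence caching.
    ignored = {"x-proxy-auth", "authorization", "user-agent", "content-length"}
    filtered = {k.lower(): v for k, v in headers.items() if k.lower() not in ignored}
    canonical = json.dumps(
        {
            "method": method,
            "path": path,
            "headers": filtered,
            "body": body.decode("utf-8", "replace"),
        },
        sort_keys=True,
    )
    # Requests that produce the same digest are treated as the same call.
    return hashlib.sha256(canonical.encode()).hexdigest()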

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Testing

To test a local version of LLMCache, use the following command:

docker compose -f docker-compose.test.yml up -d

For the tests to run properly, you will also need to set the additional environment variables listed in the .env.example file.

License

This project is licensed under the MIT License - see the LICENSE file for details.
