Experiment with open-source AI models
A modern AI playground that combines the development experience of Next.js with the performance of the Cloudflare Workers platform. Experiment with 50+ open-source AI models, including GPT-OSS, Leonardo, Llama, Qwen, Gemma, DeepSeek, and more. Features text-to-speech with multiple voices and real-time speech-to-text transcription.
OpenGPT leverages three core technologies to deliver an exceptional AI development experience:
| Technology | What it brings | Why it matters | 
|---|---|---|
| 🚀 OpenNext | Seamless Next.js → Cloudflare Workers deployment | Deploy Next.js apps globally with the most affordable edge compute offering |
| 🤖 AI SDK v5 | Universal AI framework with streaming support | Connect to any AI provider with type-safe, streaming APIs |
| ⚡️ Cloudflare Workers AI | Global AI inference | Sub-100ms latency worldwide with 50+ open-source models |
- Chat Mode: Conversational AI with 30+ text generation models
- Image Mode: High-quality image generation with 5+ image models
- Text-to-Speech (TTS): Voice synthesis with multiple speaker options
- Speech-to-Text (STT): Real-time audio transcription with visual feedback (see the browser sketch after this list)
- Seamless Switching: Toggle between modes without losing context
- Thinking Process Visualization: See how AI models reason through problems
- Collapsible Reasoning: Clean UI that shows/hides reasoning on demand
- Universal Compatibility: Works with any AI model that supports reasoning tokens
- AI Elements UI: Professional, accessible components built with AI Elements
- Responsive Design: Mobile-first responsive interactions
- Real-time Streaming: See responses as they're generated
- Type Safety: Full TypeScript support
- One-Command Deploy: `pnpm deploy` to Cloudflare Workers globally
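
To make the STT feature concrete, here is a minimal browser-side sketch: it records a short clip with the MediaRecorder API and posts it to the `/api/speech-to-text` route from the architecture section. The fixed five-second duration and the `{ text }` response shape are illustrative assumptions, not the project's actual frontend code.

```typescript
// Browser-side sketch: record ~5s of microphone audio and transcribe it.
async function recordAndTranscribe(): Promise<string> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream); // WebM/Opus in most browsers
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  const stopped = new Promise<void>((resolve) => {
    recorder.onstop = () => resolve();
  });
  recorder.start();
  await new Promise((r) => setTimeout(r, 5_000)); // record for five seconds
  recorder.stop();
  await stopped;
  stream.getTracks().forEach((t) => t.stop()); // release the microphone

  const form = new FormData();
  form.append('audio', new Blob(chunks, { type: recorder.mimeType }));
  const res = await fetch('/api/speech-to-text', { method: 'POST', body: form });
  const { text } = await res.json(); // assumed response shape
  return text;
}
```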
```bash
# Clone the repository
git clone https://github.com/devhims/opengpt.git
cd opengpt

# Install dependencies
pnpm install

# Start development server
pnpm dev
```

Visit http://localhost:3000 to see OpenGPT in action! 🚀
- Create `.dev.vars` for local development:

```bash
# .dev.vars (not committed to git)
NEXTJS_ENV=development
```

- For production secrets:

```bash
wrangler secret put NEXTJS_ENV
```

| Command | Description |
|---|---|
| `pnpm dev` | Start development server with Turbopack |
| `pnpm build` | Build the Next.js application |
| `pnpm preview` | Preview the Cloudflare Workers build locally |
| `pnpm deploy` | Build and deploy to Cloudflare Workers globally |
| `pnpm lint` | Run ESLint with TypeScript rules |
| `pnpm format` | Format code with Prettier |
| `pnpm cf-typegen` | Generate Cloudflare binding types |
- GPT-OSS: OpenAI-compatible 20B and 120B variants
- Meta Llama: 4 Scout 17B, 3.3 70B, 3.1 family (6 variants), 3.2 family (3 variants), 3.0 family (3 variants)
- Google Gemma: 3 12B IT, 7B IT, and LoRA variants (4 total)
- Mistral: Small 3.1 24B, 7B v0.1/v0.2 variants (5 total)
- Qwen: QwQ 32B, 2.5 Coder 32B, and 1.5 family variants (6 total)
- DeepSeek: R1 Distill Qwen 32B, Math 7B, Coder variants (4 total)
- Black Forest Labs: FLUX-1-Schnell (fast, high-quality text-to-image)
- Leonardo AI: Lucid Origin and Phoenix 1.0
- Stability AI: Stable Diffusion XL Base 1.0
- ByteDance: Stable Diffusion XL Lightning (ultra-fast generation)
- Text-to-Speech (TTS):
  - Deepgram Aura-1: 12+ expressive voices (Luna, Athena, Zeus, Angus, etc.)
  - MeloTTS: Multi-language support (EN, ES, FR, ZH, JP, KR) with regional accents
- Speech-to-Text (STT):
  - Deepgram Nova-3: High-accuracy real-time transcription with punctuation
OpenGPT showcases a modern, production-ready architecture with comprehensive request handling:
```mermaid
flowchart TD
    User[👤 User] --> UI[🎨 Next.js Frontend]
    UI --> ModeToggle{Mode Selection}
    ModeToggle -->|💬 Chat| ChatPath[Chat Request Path]
    ModeToggle -->|🖼️ Image| ImagePath[Image Request Path]
    ModeToggle -->|🗣️ Speech| SpeechPath[Speech Request Path]
    ChatPath --> ChatAPI["📡 /api/chat"]
    ImagePath --> ImageAPI["📡 /api/image"]
    SpeechPath --> SpeechAPI["📡 /api/speech-to-text | /api/text-to-speech"]
    ChatAPI --> RateLimit1[🚫 Rate Limiter]
    ImageAPI --> RateLimit2[🚫 Rate Limiter]
    SpeechAPI --> RateLimit3[🚫 Rate Limiter]
    RateLimit1 --> RateCheck1{Rate OK?}
    RateLimit2 --> RateCheck2{Rate OK?}
    RateLimit3 --> RateCheck3{Rate OK?}
    RateCheck1 -->|❌| RateError1[429 Error]
    RateCheck1 -->|✅| ChatProcessing[🤖 Chat Processing]
    RateCheck2 -->|❌| RateError2[429 Error]
    RateCheck2 -->|✅| ImageProcessing[🎨 Image Processing]
    RateCheck3 -->|❌| RateError3[429 Error]
    RateCheck3 -->|✅| SpeechProcessing[🗣️ Speech Processing]
    ChatProcessing --> ModelType{Model Type}
    ModelType -->|Standard| AISDKPath["🔧 AI SDK v5 + workers-ai-provider"]
    ModelType -->|GPT-OSS| DirectPath["🎯 Direct env.AI.run"]
    ImageProcessing --> ImageAI["🎨 Direct env.AI.run"]
    SpeechProcessing --> SpeechAI["🗣️ Direct env.AI.run"]
    AISDKPath --> WorkersAI1[⚡️ Cloudflare Workers AI]
    DirectPath --> WorkersAI2[⚡️ Cloudflare Workers AI]
    ImageAI --> WorkersAI3[⚡️ Cloudflare Workers AI]
    SpeechAI --> WorkersAI4[⚡️ Cloudflare Workers AI]
    WorkersAI1 --> Streaming[🌊 Real-time Streaming]
    WorkersAI2 --> Batch["📦 Batch Processing + Emulated Stream"]
    WorkersAI3 --> ImageResponse[📸 Generated Image]
    WorkersAI4 --> SpeechResponse["🔊 Audio/Text Response"]
    Streaming --> ParseReasoning[🧠 Parse Reasoning]
    Batch --> ParseReasoning
    ParseReasoning --> ChatSuccess[✅ Chat Response]
    ImageResponse --> ImageSuccess[✅ Image Response]
    SpeechResponse --> SpeechSuccess[✅ Speech Response]
    RateError1 --> ErrorUI[🚨 Error Display]
    RateError2 --> ErrorUI
    RateError3 --> ErrorUI
    ChatSuccess --> ResponseUI[📥 Response Display]
    ImageSuccess --> ResponseUI
    SpeechSuccess --> ResponseUI
```
Chat Route Processing:
- Standard Models: Uses AI SDK v5 with the `workers-ai-provider` wrapper for streaming
- GPT-OSS Models: Direct `env.AI.run` call with emulated streaming via `createUIMessageStream`
- All Models: Connect to the same Cloudflare Workers AI backend
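
For illustration, a chat handler following this split might look like the sketch below. It uses the real libraries named above (`workers-ai-provider`, AI SDK v5, OpenNext's `getCloudflareContext`), but the `isGptOssModel` helper and the request body shape are hypothetical stand-ins, not the project's actual code.

```typescript
// Hypothetical app/api/chat/route.ts sketch of the dual-path routing.
import { streamText, convertToModelMessages } from 'ai';
import { createWorkersAI } from 'workers-ai-provider';
import { getCloudflareContext } from '@opennextjs/cloudflare';

// Hypothetical helper: GPT-OSS models bypass the AI SDK wrapper.
const isGptOssModel = (model: string) => model.includes('gpt-oss');

export async function POST(req: Request) {
  const { model, messages } = await req.json();
  const { env } = getCloudflareContext();

  if (isGptOssModel(model)) {
    // GPT-OSS path: one batch call; the UI emulates streaming afterwards.
    const result = await env.AI.run(model, { messages });
    return Response.json(result);
  }

  // Standard path: AI SDK v5 streams tokens as they are generated.
  const workersai = createWorkersAI({ binding: env.AI });
  const result = streamText({
    model: workersai(model),
    messages: convertToModelMessages(messages),
  });
  return result.toUIMessageStreamResponse();
}
```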
Image Route Processing:
- All Image Models: Direct `env.AI.run` call (no AI SDK wrapper needed)
- Response Handling: Supports both base64 and binary stream responses
- Format Conversion: Automatic conversion to both base64 and Uint8Array for frontend compatibility
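
A sketch of that direct call and the dual response handling, under the assumption that base64-returning models (e.g. FLUX-1-Schnell) respond with `{ image: '<base64>' }` while the Stable Diffusion variants return a binary stream; the content type is a simplification.

```typescript
// Hypothetical app/api/image/route.ts sketch of the direct env.AI.run path.
import { getCloudflareContext } from '@opennextjs/cloudflare';

export async function POST(req: Request) {
  const { model, prompt } = await req.json();
  const { env } = getCloudflareContext();

  const result = await env.AI.run(model, { prompt });

  // Base64 JSON case (assumed shape: { image: '<base64>' }).
  if (result && typeof result === 'object' && 'image' in result) {
    const base64 = (result as { image: string }).image;
    const bytes = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
    return new Response(bytes, { headers: { 'Content-Type': 'image/png' } });
  }

  // Binary stream case: pass the bytes straight through.
  return new Response(result as ReadableStream, {
    headers: { 'Content-Type': 'image/png' },
  });
}
```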
Speech Route Processing:
- Speech-to-Text: Direct `env.AI.run` call with the `@cf/deepgram/nova-3` model
- Text-to-Speech: Direct `env.AI.run` call with the `@cf/deepgram/aura-1` or `@cf/myshell-ai/melotts` models
- Audio Processing: WebM/MP4 audio file handling with automatic format detection
- Voice Options: 12+ Aura-1 speakers, multi-language MeloTTS with regional accents
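
The sketch below shows how a combined speech handler might dispatch between the two models. The payload fields (`audio`, `text`, `speaker`) and output encoding are simplified assumptions; consult the Workers AI model schemas for the exact shapes.

```typescript
// Hypothetical speech route sketch: STT via Nova-3, TTS via Aura-1.
import { getCloudflareContext } from '@opennextjs/cloudflare';

export async function POST(req: Request) {
  const { env } = getCloudflareContext();
  const form = await req.formData();

  if (form.has('audio')) {
    // Speech-to-Text: forward the uploaded WebM/MP4 recording to Nova-3.
    const audio = form.get('audio') as File;
    const result = await env.AI.run('@cf/deepgram/nova-3', {
      audio: { body: audio.stream(), contentType: audio.type }, // assumed schema
    });
    return Response.json(result);
  }

  // Text-to-Speech: synthesize with an Aura-1 speaker (e.g. 'luna').
  const text = form.get('text') as string;
  const speaker = (form.get('speaker') as string) ?? 'luna';
  const audio = await env.AI.run('@cf/deepgram/aura-1', { text, speaker });
  return new Response(audio as ReadableStream, {
    headers: { 'Content-Type': 'audio/mpeg' }, // assumed MP3 output
  });
}
```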
Rate Limiting:
- Shared Infrastructure: All three routes use the same `checkRateLimit` utility
- Per-endpoint Limits: Separate daily limits for chat (20), image (5), and speech (10) requests
- Storage: Hybrid Upstash Redis + Cloudflare KV fallback
- Frontend Validation: Client-side input validation and optional rate limit pre-checking
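
An illustrative shape for such a `checkRateLimit` utility follows: Upstash Redis when credentials are configured, Cloudflare KV otherwise. The `RATE_LIMIT_KV` binding name, key format, and 24-hour window are assumptions, not the project's actual implementation.

```typescript
// Sketch of a Redis-first, KV-fallback daily rate limiter.
import { Redis } from '@upstash/redis';

interface RateLimitEnv {
  UPSTASH_REDIS_REST_URL?: string;
  UPSTASH_REDIS_REST_TOKEN?: string;
  RATE_LIMIT_KV: KVNamespace; // hypothetical binding name
}

export async function checkRateLimit(
  env: RateLimitEnv,
  ip: string,
  endpoint: 'chat' | 'image' | 'speech',
  limit: number,
): Promise<{ allowed: boolean; remaining: number }> {
  const day = new Date().toISOString().slice(0, 10);
  const key = `rl:${endpoint}:${ip}:${day}`; // assumed key format

  let count: number;
  if (env.UPSTASH_REDIS_REST_URL && env.UPSTASH_REDIS_REST_TOKEN) {
    const redis = new Redis({
      url: env.UPSTASH_REDIS_REST_URL,
      token: env.UPSTASH_REDIS_REST_TOKEN,
    });
    count = await redis.incr(key);
    if (count === 1) await redis.expire(key, 86_400); // first hit opens the 24h window
  } else {
    // KV fallback: read-modify-write (best-effort, not atomic).
    count = Number((await env.RATE_LIMIT_KV.get(key)) ?? 0) + 1;
    await env.RATE_LIMIT_KV.put(key, String(count), { expirationTtl: 86_400 });
  }

  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}
```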
- Rate Limiting: IP-based daily limits (20 chat, 5 image, 10 speech requests) with Redis/KV storage
- Model Routing: Smart routing between Standard Models (streaming) and GPT-OSS Models (batch)
- AI Processing: Direct Cloudflare Workers AI integration with optimized parameters
- Response Handling: Reasoning token parsing, format conversion, and UI display
```bash
# Build and deploy in one command
pnpm deploy

# Or step by step
pnpm build
npx wrangler deploy
```

| Variable | Description |
|---|---|
| `UPSTASH_REDIS_REST_URL` | Upstash Redis URL (optional) |
| `UPSTASH_REDIS_REST_TOKEN` | Upstash Redis token (optional) |
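
If you enable Upstash-backed rate limiting, these credentials can be set the same way as the other production secrets:

```bash
# Optional: set Upstash credentials as Worker secrets
wrangler secret put UPSTASH_REDIS_REST_URL
wrangler secret put UPSTASH_REDIS_REST_TOKEN
```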
- Add model to constants:

```typescript
// src/constants/index.ts
export const CLOUDFLARE_AI_MODELS = {
  textGeneration: [
    // Add your new model here
    '@cf/vendor/new-model',
    // ... existing models
  ] as const,
  imageGeneration: [
    // For image models
  ] as const,
  speech: [
    // For speech-to-text models
  ] as const,
  textToSpeech: [
    // For text-to-speech models
  ] as const,
};
```

- Update utility functions:
```typescript
// src/constants/index.ts
export function getTextGenerationModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textGeneration;
}

export function getSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.speech;
}

export function getTextToSpeechModels(): readonly string[] {
  return CLOUDFLARE_AI_MODELS.textToSpeech;
}
```

- Test the integration:
```bash
pnpm dev
# Test the new model in the UI
```

We welcome contributions!
```bash
# Fork the repo and clone your fork
git clone https://github.com/devhims/opengpt.git

# Create a feature branch
git checkout -b feature/new-feature

# Make your changes and test
pnpm dev

# Run linting and formatting
pnpm lint
pnpm format

# Commit using conventional commits
git commit -m "feat: add new feature"

# Push and create a PR
git push origin feature/new-feature
```

- TypeScript: Strict mode enabled
- Formatting: Prettier with Tailwind class sorting
- Linting: ESLint with Next.js rules
This project is licensed under the MIT License.
Made with ❤️ for the AI community
⭐ Star this repo if you find it useful!

