feat: add RAG-Anything Studio — standalone multimodal-aware Web UI by devinlovekoala · Pull Request #270 · HKUDS/RAG-Anything

devinlovekoala · 2026-04-28T02:48:03Z

Summary

This PR introduces RAG-Anything Studio, a standalone, local-first Web UI for RAG-Anything.

Motivation

While RAG-Anything provides strong multimodal capabilities, it currently lacks a practical interface for:

inspecting multimodal parsing outputs
debugging retrieval behavior across text and images
configuring heterogeneous model providers and storage backends

This PR provides a minimal, low-risk UI layer to improve usability and observability without modifying the core pipeline.

Scope

This PR does not redesign RAG-Anything.
It adds an optional UI layer on top of the existing system.

All code lives under /raganything_studio and interacts with RAG-Anything only through its public APIs.

What's included

Backend (raganything_studio/backend/) — FastAPI service wrapping RAGAnything without modifying its internals
Frontend (raganything_studio/frontend/) — React + TypeScript SPA (prebuilt and served from backend/static/, no Node.js required)

Key capabilities

Multimodal interaction
- Multimodal query interface with image-evidence display
- Image preview support for knowledge graph nodes
System configuration
- Full settings page for LLM / VLM / embedding providers
- Remote vector DB and hybrid storage configuration
- Provider model discovery and embedding dimension auto-detection
Reliability & usability
- Test-connection and readiness guards before insert/query
- Phased concurrency controls and file-hash caching for insert jobs
Visualization
- Knowledge graph visualization (Cytoscape fcose layout)

Design principles

Zero risk to existing users — fully isolated under /raganything_studio
No core pipeline changes — uses only public APIs
Optional dependency — installed via extras (pip install ".[studio]")
Maintainability-first — Studio evolves independently from core

Test plan

Automated checks:

python -m pip install -e . completes in a clean virtual environment
cd raganything_studio/frontend && npm ci && npm run build
python -m raganything_studio --host 127.0.0.1 --port 7860
curl -fsS http://127.0.0.1:7860/api/health
pytest tests/studio

Manual checks:

Open http://127.0.0.1:7860 and verify the Studio UI loads
In Settings, save valid model/storage settings and verify invalid storage combinations are rejected
Use Test connection for the configured model provider and any configured database storage backend
Upload a document, start processing via the UI, and confirm the job reaches succeeded and the document becomes indexed
Open Knowledge Graph and confirm graph nodes/edges render
Run a query with multimodal retrieval enabled and confirm the answer includes sources plus image/media preview when the indexed document contains visual content

Related issue: #269

…rials

…ved job errors - Add ReadinessContext: globally tracks llm/embedding key status and indexed doc count - UploadPage: show gate banner with missing keys listed, disable process button when unconfigured - QueryPage: dual guard — blocks query if unconfigured OR no indexed docs, shows contextual action - Dashboard: replace static metrics with 3-step onboarding track (Config → Upload → Query), auto-advances active step based on real state; keep metrics + recent doc list below - JobPage: extract error summary from traceback (last non-File line), show collapsible traceback behind 'Show details'; log console auto-scrolls to bottom on every log update - styles.css: full rewrite with CSS custom properties, refined sidebar (sticky, footer status dot), improved typography scale, gate-banner, onboarding-track, error-summary, log-section components

Backend: - POST /api/settings/test-connection: tests LLM/Embedding/Vision using unsaved form values; falls back to saved key when form field is blank - Embedding test returns detected_dim by measuring actual vector length - LLM/Vision test sends a minimal 4-token prompt and measures latency Frontend: - SettingsPage: each provider card gets a 'Test connection' button with spinner, green latency badge on success, red Failed badge on error - Embedding card: on successful test, detected_dim is auto-applied to the Dimension field and shown as 'Auto-detected: N' alongside it - API key placeholder shows hint when key is already saved so user knows they can leave it blank during a re-test

- Backend: POST /api/settings/list-models — fetches /v1/models from any OpenAI-compatible provider; falls back to saved API key when form key is blank; returns ModelListResponse{ok, models[], error} - Provider registry: 15 known platforms (SiliconFlow, 阿里云百炼, 百度千帆, 火山引擎, OpenRouter, DeepSeek, Groq, etc.) with pre-filled base URLs; selecting a known provider auto-fills Base URL and hides the URL field - Frontend: ProviderSection replaces free-text model input with a two-mode ModelPickerField — text input + "Load models" button before fetch, grouped searchable dropdown after; groups by owned_by field matching Cherry Studio UX - Provider select reordered into optgroups: Popular (CN), International, Self-hosted, Other - CSS: model-dropdown-panel, model-group, model-option, base-url-display, model-picker-row, load-models-btn and related selectors - pyproject.toml: add httpx>=0.25 as explicit dependency

- _test_vision: replace text-only openai_complete_if_cache call with a direct httpx multimodal POST containing a 1×1 white PNG; plain text calls are rejected by GLM-4V and most VLMs with InvalidResponseError - _is_vision_capable: pattern-match model IDs against known VL keywords (vl, vision, glm-4v, qwen-vl, llava, gpt-4o, claude-3, etc.) to tag ModelInfo.vision_capable in the /list-models response - ModelPickerField: when kind==='vision', default visionOnly filter to true and show "VL only" toggle button; VL badge shown on capable models - styles: .model-badge--vision (blue), .vision-filter-btn + .active state

…s, improved processing efficiency version

…repaired the data model

…rage mode selection

…n detection and full configuration

privat655 · 2026-05-07T14:31:38Z

wow great idea

devinlovekoala added 15 commits April 25, 2026 11:21

chore: ignore circuit-domain branch artifacts and personal study mate…

0e09a35

…rials

feat(studio): align with the LightRAG Web UI style version

15464e4

feat(studio): optimized file hash caching, phased concurrency setting…

8b27b0e

…s, improved processing efficiency version

fix(studio): fixed backend loading issues of the knowledge graph and …

0d53361

…repaired the data model

feat(studio): optimize the knowledge graph display UI version

93f9f4c

feat(studio): upgrade Query feature version

24dd114

feat(studio): added support for remote vector database and hybrid sto…

18b3c2c

…rage mode selection

feat(studio): added support for remote vector database and hybrid sto…

be253e6

…rage mode selection

fix(studio): standardize API page Swagger UI style rendering issues

7ffa398

feat(studio): optimize database configuration features, add connectio…

f746677

…n detection and full configuration

feat(studio): add support for local model services

be80e1b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add RAG-Anything Studio — standalone multimodal-aware Web UI#270

feat: add RAG-Anything Studio — standalone multimodal-aware Web UI#270
devinlovekoala wants to merge 15 commits into
HKUDS:mainfrom
devinlovekoala:gui-support

devinlovekoala commented Apr 28, 2026 •

edited

Loading

Uh oh!

privat655 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

devinlovekoala commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Scope

What's included

Key capabilities

Design principles

Test plan

Uh oh!

privat655 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

devinlovekoala commented Apr 28, 2026 •

edited

Loading