-
Notifications
You must be signed in to change notification settings - Fork 187
Add diffusion [Draft : wip] #327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: smonga/vlm+diffusion
Are you sure you want to change the base?
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 files reviewed, 3 comments
| // TODO: Implement HTTP download using platform HTTP client | ||
| // For now, this is a placeholder that will be implemented when the ONNX backend is added. | ||
| // The actual download will use: | ||
| // - NSURLSession on iOS/macOS | ||
| // - OkHttp/HttpURLConnection on Android | ||
| // - libcurl or platform HTTP on desktop | ||
| // | ||
| // The platform SDKs can also implement this callback via the platform adapter pattern | ||
| // (similar to how we do platform callbacks for LLM/TTS/Diffusion). | ||
|
|
||
| RAC_LOG_WARNING("Diffusion.Tokenizer", | ||
| "HTTP download not yet implemented in C++ - use platform SDK download"); | ||
|
|
||
| // For now, return success if the file path would be valid | ||
| // Platform SDKs should handle the actual download | ||
| return RAC_ERROR_NOT_IMPLEMENTED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rac_diffusion_tokenizer_download_file returns RAC_ERROR_NOT_IMPLEMENTED but rac_diffusion_tokenizer_ensure_files (line 150-157, 163-170) treats failure as a fatal error. If auto_download is enabled and tokenizer files are missing, this will always fail.
| // TODO: Implement HTTP download using platform HTTP client | |
| // For now, this is a placeholder that will be implemented when the ONNX backend is added. | |
| // The actual download will use: | |
| // - NSURLSession on iOS/macOS | |
| // - OkHttp/HttpURLConnection on Android | |
| // - libcurl or platform HTTP on desktop | |
| // | |
| // The platform SDKs can also implement this callback via the platform adapter pattern | |
| // (similar to how we do platform callbacks for LLM/TTS/Diffusion). | |
| RAC_LOG_WARNING("Diffusion.Tokenizer", | |
| "HTTP download not yet implemented in C++ - use platform SDK download"); | |
| // For now, return success if the file path would be valid | |
| // Platform SDKs should handle the actual download | |
| return RAC_ERROR_NOT_IMPLEMENTED; | |
| // TEMPORARY: Return success for now since platform SDKs handle download | |
| // When C++ HTTP implementation is added, remove this workaround | |
| RAC_LOG_INFO("Diffusion.Tokenizer", | |
| "Tokenizer download delegated to platform SDK for %s", filename); | |
| // Platform SDKs should ensure files exist before calling C++ layer | |
| return RAC_SUCCESS; |
Prompt To Fix With AI
This is a comment left during a code review.
Path: sdk/runanywhere-commons/src/features/diffusion/rac_diffusion_tokenizer.cpp
Line: 195:210
Comment:
`rac_diffusion_tokenizer_download_file` returns `RAC_ERROR_NOT_IMPLEMENTED` but `rac_diffusion_tokenizer_ensure_files` (line 150-157, 163-170) treats failure as a fatal error. If `auto_download` is enabled and tokenizer files are missing, this will always fail.
```suggestion
// TEMPORARY: Return success for now since platform SDKs handle download
// When C++ HTTP implementation is added, remove this workaround
RAC_LOG_INFO("Diffusion.Tokenizer",
"Tokenizer download delegated to platform SDK for %s", filename);
// Platform SDKs should ensure files exist before calling C++ layer
return RAC_SUCCESS;
```
How can I resolve this? If you propose a fix, please make it concise.| // BK-SDM-Tiny: Most compressed SD model (0.50B params, ~500MB) - Best for mobile! | ||
| // Source: https://huggingface.co/nota-ai/bk-sdm-tiny | ||
| // Requires conversion to ONNX - using pre-converted community version | ||
| // NOTE: UNet uses external data format - weights.pb contains the weights | ||
| if let bksdmTinyURL = URL(string: "https://huggingface.co/onnx-community/stable-diffusion-v1-5-ONNX/resolve/main/unet/model.onnx") { | ||
| RunAnywhere.registerMultiFileModel( | ||
| id: "bk-sdm-tiny-onnx", | ||
| name: "BK-SDM Tiny (ONNX - Fastest)", | ||
| files: [ | ||
| // Using SD 1.5 ONNX as reference - BK-SDM would need conversion | ||
| ModelFileDescriptor( | ||
| url: URL(string: "https://huggingface.co/onnx-community/stable-diffusion-v1-5-ONNX/resolve/main/text_encoder/model.onnx")!, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
misleading model registration - bk-sdm-tiny-onnx model actually downloads SD 1.5 files, not BK-SDM Tiny. The comment on line 326 even says "BK-SDM would need conversion". Either remove this duplicate registration or use actual BK-SDM Tiny ONNX files if available.
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
Line: 323:334
Comment:
misleading model registration - `bk-sdm-tiny-onnx` model actually downloads SD 1.5 files, not BK-SDM Tiny. The comment on line 326 even says "BK-SDM would need conversion". Either remove this duplicate registration or use actual BK-SDM Tiny ONNX files if available.
How can I resolve this? If you propose a fix, please make it concise.
Additional Comments (1)
Prompt To Fix With AIThis is a comment left during a code review.
Path: examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Diffusion/DiffusionViewModel.swift
Line: 161:166
Comment:
the stream from `generateImageStream` is consumed here but never used - the code immediately calls `generateImage` again on line 168, which will execute the full generation a second time
```suggestion
// Remove the stream consumption since we're getting the final result
let result = try await RunAnywhere.generateImage(prompt: prompt, options: options)
```
How can I resolve this? If you propose a fix, please make it concise. |
Description
Brief description of the changes made.
Type of Change
Testing
Platform-Specific Testing (check all that apply)
Swift SDK / iOS Sample:
Kotlin SDK / Android Sample:
Flutter SDK / Flutter Sample:
React Native SDK / React Native Sample:
Labels
Please add the appropriate label(s):
SDKs:
Swift SDK- Changes to Swift SDK (sdk/runanywhere-swift)Kotlin SDK- Changes to Kotlin SDK (sdk/runanywhere-kotlin)Flutter SDK- Changes to Flutter SDK (sdk/runanywhere-flutter)React Native SDK- Changes to React Native SDK (sdk/runanywhere-react-native)Commons- Changes to shared native code (sdk/runanywhere-commons)Sample Apps:
iOS Sample- Changes to iOS example app (examples/ios)Android Sample- Changes to Android example app (examples/android)Flutter Sample- Changes to Flutter example app (examples/flutter)React Native Sample- Changes to React Native example app (examples/react-native)Checklist
Screenshots
Attach relevant UI screenshots for changes (if applicable):
Greptile Overview
Greptile Summary
This PR adds comprehensive diffusion (image generation) support across all SDK platforms (Swift, Kotlin, Flutter, React Native).
Major Changes:
Key Issues Found:
RAC_ERROR_NOT_IMPLEMENTED, which will cause failures whenauto_downloadis enabled and tokenizer files are missingbk-sdm-tiny-onnxmodel but downloads SD 1.5 files instead of actual BK-SDM Tiny filesArchitecture:
The implementation follows the established pattern with C++ core (runanywhere-commons) bridged to platform-specific SDKs. The diffusion pipeline includes: text encoding → latent initialization → iterative denoising with UNet → VAE decoding to final image.
Confidence Score: 3/5
RAC_ERROR_NOT_IMPLEMENTEDcausing failures. iOS example has redundant generation calls. Model registration is misleading.sdk/runanywhere-commons/src/features/diffusion/rac_diffusion_tokenizer.cppfor tokenizer download fix, and iOS example files for generation logic correctionsImportant Files Changed
RAC_ERROR_NOT_IMPLEMENTED, causing failures when auto_download enabledbk-sdm-tiny-onnxincorrectly uses SD 1.5 files instead of actual BK-SDM TinySequence Diagram
sequenceDiagram participant App as Application Layer participant SDK as SDK Public API participant Bridge as C++ Bridge participant Service as Diffusion Service participant Backend as Backend (CoreML/ONNX) participant Tokenizer as BPE Tokenizer participant Models as Model Components Note over App,Models: Model Loading Phase App->>SDK: loadDiffusionModel(path, config) SDK->>Bridge: CppBridge+Diffusion Bridge->>Service: rac_diffusion_service_create() Service->>Backend: initialize backend (CoreML/ONNX) Backend->>Tokenizer: check tokenizer files (vocab.json, merges.txt) alt Tokenizer files missing Tokenizer->>Tokenizer: rac_diffusion_tokenizer_ensure_files() Note over Tokenizer: Currently returns NOT_IMPLEMENTED<br/>Platform SDK must handle download end Backend->>Models: load text_encoder.onnx Backend->>Models: load unet.onnx Backend->>Models: load vae_decoder.onnx Backend->>Models: load vae_encoder.onnx (optional) Backend->>Tokenizer: load tokenizer (vocab.json, merges.txt) Backend-->>Service: model loaded Service-->>Bridge: handle created Bridge-->>SDK: success SDK-->>App: model loaded Note over App,Models: Image Generation Phase App->>SDK: generateImage(prompt, options) SDK->>Bridge: CppBridge+Diffusion.generate() Bridge->>Service: rac_diffusion_generate_with_progress() Service->>Backend: generate(prompt, options) Backend->>Tokenizer: encode(prompt) → token IDs Tokenizer-->>Backend: token embeddings [77] Backend->>Models: text_encoder(tokens) → embeddings Models-->>Backend: text embeddings [768/2048] Backend->>Backend: initialize random latents Backend->>Backend: create scheduler (DPM++/DDIM/Euler) loop Denoising Steps (20-50 iterations) Backend->>Models: unet(latents, timestep, embeddings) → noise prediction Models-->>Backend: predicted noise Backend->>Backend: apply CFG guidance Backend->>Backend: scheduler.step() → update latents Backend->>Bridge: progress_callback(step, progress) Bridge->>SDK: progress update SDK->>App: onProgress(step/total) end Backend->>Models: vae_decoder(latents) → image Models-->>Backend: decoded image RGBA Backend-->>Service: DiffusionResult Service-->>Bridge: rac_diffusion_result_t Bridge-->>SDK: DiffusionResult SDK-->>App: generated image data Note over App,Models: Cleanup Phase App->>SDK: unloadDiffusionModel() SDK->>Bridge: unload() Bridge->>Service: rac_diffusion_service_destroy() Service->>Backend: cleanup sessions Backend->>Models: release all model sessions Backend-->>Service: cleanup complete Service-->>Bridge: success Bridge-->>SDK: success SDK-->>App: unloaded