feat: huggingface provider added #1006
Conversation
Warning: Rate limit exceeded. @qwerty-dvorak has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 6 minutes and 4 seconds before requesting another review.

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered again. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

⛔ Files ignored due to path filters (6)
📒 Files selected for processing (38)
📝 Walkthrough

Adds HuggingFace as a native provider to Bifrost, implementing chat completions, embeddings, text generation, speech synthesis, and transcription through the HuggingFace Inference API with multi-backend routing, model caching, and full OpenAI-compatible request/response translation.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant HFProvider as HuggingFace<br/>Provider
    participant ModelCache as Model Provider<br/>Mapping Cache
    participant HFInference as HuggingFace<br/>Inference API
    participant HTTPClient as FastHTTP<br/>Client

    Client->>HFProvider: ChatCompletion(modelID, request)
    HFProvider->>ModelCache: Get model mapping for modelID
    alt Cache Hit
        ModelCache-->>HFProvider: {provider, model}
    else Cache Miss
        HFProvider->>HFInference: List models (inference providers)
        HFInference-->>HFProvider: Model list
        HFProvider->>ModelCache: Store mapping
        ModelCache-->>HFProvider: {provider, model}
    end
    HFProvider->>HFProvider: buildRequestURL(modelID, provider)
    HFProvider->>HFProvider: ToHuggingFaceChatCompletionRequest(bifrostReq)
    HFProvider->>HTTPClient: Execute POST request
    HTTPClient->>HFInference: Send to inference provider
    HFInference-->>HTTPClient: Chat completion response
    HTTPClient-->>HFProvider: Response body + latency
    alt Success
        HFProvider->>HFProvider: Parse response
        HFProvider->>HFProvider: Enrich with metadata<br/>(provider, model, latency)
        HFProvider-->>Client: BifrostChatResponse
    else Not Found (404)
        HFProvider->>ModelCache: Clear stale mapping
        HFProvider->>HFInference: Retry List models
        HFInference-->>HFProvider: Updated model list
        loop Retry with new mapping
            HFProvider->>HTTPClient: Execute POST request (retry)
            HTTPClient->>HFInference: Send to new provider
            HFInference-->>HTTPClient: Response
            HTTPClient-->>HFProvider: Response body
        end
        HFProvider-->>Client: BifrostChatResponse
    else Error
        HFProvider->>HFProvider: Decode HuggingFaceError
        HFProvider-->>Client: BifrostError
    end
```
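The cache-hit, cache-miss, and 404-invalidation branches in the diagram can be sketched as a small thread-safe cache. This is a minimal illustration of the pattern only; the type and function names here are assumptions, not Bifrost's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// mapping pairs an inference provider with its provider-specific model name.
type mapping struct {
	Provider string
	Model    string
}

// modelCache holds modelID → mapping entries, guarded for concurrent use.
type modelCache struct {
	mu sync.RWMutex
	m  map[string]mapping
}

func newModelCache() *modelCache { return &modelCache{m: map[string]mapping{}} }

func (c *modelCache) get(modelID string) (mapping, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[modelID]
	return v, ok
}

func (c *modelCache) put(modelID string, v mapping) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[modelID] = v
}

// clear drops a stale entry, as the diagram's 404 branch does before retrying.
func (c *modelCache) clear(modelID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.m, modelID)
}

// resolve follows the diagram: a hit returns immediately; a miss calls the
// provided refresh func (standing in for the "list models" API call) and
// stores the result.
func (c *modelCache) resolve(modelID string, refresh func(string) (mapping, error)) (mapping, error) {
	if v, ok := c.get(modelID); ok {
		return v, nil
	}
	v, err := refresh(modelID)
	if err != nil {
		return mapping{}, err
	}
	c.put(modelID, v)
	return v, nil
}

func main() {
	cache := newModelCache()
	calls := 0
	refresh := func(id string) (mapping, error) {
		calls++
		return mapping{Provider: "fal-ai", Model: id}, nil
	}
	m1, _ := cache.resolve("whisper-large-v3", refresh) // miss: refreshes
	m2, _ := cache.resolve("whisper-large-v3", refresh) // hit: no refresh
	fmt.Println(m1.Provider, m2.Model, calls)
	// On a 404 the provider would call cache.clear(modelID) and resolve again.
}
```

On a 404 the provider clears the entry and resolves again, which forces a fresh "list models" call with the updated mapping.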
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:
Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
🧪 Test Suite Available

This PR can be tested by a repository admin.
135b9b5 to 5baaee2
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1229: Thought content is emitted twice in the stream when delta contains both Content and Thought.

At lines 1207-1213, the code aggregates both `delta.Content` and `delta.Thought` into a single `contentDelta` string and emits it as `OutputTextDelta` at line 1218. Then at lines 1343-1377, if `hasThought` is true, the code separately emits `ReasoningSummaryTextDelta` containing the thought content again. Since these are sequential conditions (not mutually exclusive), when a delta contains both fields, thought appears in:

- The aggregated `OutputTextDelta` event (lines 1207-1213, 1218)
- The separate `ReasoningSummaryTextDelta` event (lines 1365-1377)

Clarify the intended behavior: should thought only appear in the reasoning delta event, or is it correct for it to appear in both? If thought should be separate, remove `delta.Thought` from the aggregation at lines 1207-1213.
🧹 Nitpick comments (6)
docs/features/unified-interface.mdx (1)
88-106: Verify Hugging Face capability matrix matches actual implementation

The Hugging Face row advertises ✅ for Text/Text (stream)/Chat/Chat (stream)/Embeddings/TTS/STT and ❌ for both Responses modes. Please double‑check this against the actual HuggingFace provider implementation (especially Responses and any streaming audio paths) so the matrix doesn't drift from reality.
core/providers/huggingface/huggingface_test.go (1)
59-62: Move `client.Shutdown()` inside the subtest or use `t.Cleanup`.

`client.Shutdown()` is called unconditionally after `t.Run`, but if the subtest is run in parallel or skipped, this could cause issues. Consider using `t.Cleanup` for proper resource cleanup.

```diff
+	t.Cleanup(func() {
+		client.Shutdown()
+	})
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
```

core/providers/huggingface/speech.go (1)
27-28: Consider avoiding empty Parameters struct allocation.

`hfRequest.Parameters` is allocated on line 28 even if no parameters are actually mapped. Consider only allocating when there are parameters to set.

```diff
 // Map parameters if present
 if request.Params != nil {
-	hfRequest.Parameters = &HuggingFaceSpeechParameters{}
-
 	// Map generation parameters from ExtraParams if available
 	if request.Params.ExtraParams != nil {
 		genParams := &HuggingFaceTranscriptionGenerationParameters{}
 		// ... parameter mapping ...
-		hfRequest.Parameters.GenerationParameters = genParams
+		hfRequest.Parameters = &HuggingFaceSpeechParameters{
+			GenerationParameters: genParams,
+		}
 	}
 }
```

core/providers/huggingface/chat.go (1)
36-38: Errors from `sonic.Marshal` are silently ignored.

Multiple calls to `sonic.Marshal` discard the error using `_`. While marshalling simple structs rarely fails, silently ignoring errors could mask issues in edge cases (e.g., cyclic references, unusual types). Consider logging or returning an error when marshalling fails for better debuggability:

```go
contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
if err != nil {
	// At minimum, log the error for debugging
	if debug {
		fmt.Printf("[huggingface debug] Failed to marshal content: %v\n", err)
	}
	continue
}
```

Also applies to: 61-62, 189-194

core/providers/huggingface/huggingface.go (1)
core/providers/huggingface/huggingface.go (1)
51-85: Typo: "aquire" should be "acquire"

The function names use "aquire" instead of the correct spelling "acquire". While this doesn't affect functionality, it's a code quality issue that should be fixed for consistency and readability.

```diff
-func aquireHuggingFaceChatResponse() *HuggingFaceChatResponse {
+func acquireHuggingFaceChatResponse() *HuggingFaceChatResponse {

-func aquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {
+func acquireHuggingFaceTranscriptionResponse() *HuggingFaceTranscriptionResponse {

-func aquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {
+func acquireHuggingFaceSpeechResponse() *HuggingFaceSpeechResponse {
```

Don't forget to update the call sites at lines 348, 790, and 861.
core/providers/huggingface/types.go (1)
379-396: Silent error swallowing in UnmarshalJSON

The `UnmarshalJSON` method silently returns `nil` when both boolean and string unmarshaling fail, leaving the struct in an uninitialized state. This could mask malformed JSON input. Consider returning an error when the input doesn't match expected types:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
+
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```

This would require adding `"fmt"` to the imports.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/chat_completion_stream.go` (1 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types copy.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/features/unified-interface.mdx` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (1 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
`**`
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `transports/config.schema.json`
- `core/internal/testutil/responses_stream.go`
- `core/schemas/bifrost.go`
- `core/providers/huggingface/embedding.go`
- `docs/apis/openapi.json`
- `core/schemas/mux.go`
- `docs/features/unified-interface.mdx`
- `core/internal/testutil/chat_completion_stream.go`
- `ui/lib/constants/config.ts`
- `core/providers/huggingface/types copy.go`
- `core/providers/huggingface/models.go`
- `ui/README.md`
- `core/schemas/account.go`
- `core/bifrost.go`
- `core/providers/huggingface/transcription.go`
- `ui/lib/constants/icons.tsx`
- `ui/lib/constants/logs.ts`
- `Makefile`
- `core/providers/huggingface/huggingface_test.go`
- `core/providers/huggingface/speech.go`
- `core/internal/testutil/account.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/providers/huggingface/utils.go`
- `core/providers/huggingface/chat.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
🧬 Code graph analysis (10)
core/schemas/bifrost.go (1)
- ui/lib/types/config.ts (1): `ModelProvider` (171-174)

core/providers/huggingface/embedding.go (2)
- core/schemas/embedding.go (4): `BifrostEmbeddingRequest` (9-16), `BifrostEmbeddingResponse` (22-28), `EmbeddingData` (118-122), `EmbeddingStruct` (124-128)
- core/providers/huggingface/types.go (2): `HuggingFaceEmbeddingRequest` (278-288), `HuggingFaceEmbeddingResponse` (299-299)

core/schemas/mux.go (2)
- core/schemas/responses.go (2): `BifrostResponsesStreamResponse` (1422-1460), `ResponsesStreamResponseTypeOutputTextDelta` (1370-1370)
- core/schemas/utils.go (1): `Ptr` (16-18)

core/internal/testutil/chat_completion_stream.go (1)
- core/internal/testutil/utils.go (1): `CreateBasicChatMessage` (247-254)

core/providers/huggingface/models.go (3)
- core/providers/huggingface/types.go (1): `HuggingFaceListModelsResponse` (21-23)
- core/schemas/bifrost.go (13): `ModelProvider` (32-32), `RequestType` (86-86), `ChatCompletionRequest` (92-92), `ChatCompletionStreamRequest` (93-93), `TextCompletionRequest` (90-90), `TextCompletionStreamRequest` (91-91), `ResponsesRequest` (94-94), `ResponsesStreamRequest` (95-95), `EmbeddingRequest` (96-96), `SpeechRequest` (97-97), `SpeechStreamRequest` (98-98), `TranscriptionRequest` (99-99), `TranscriptionStreamRequest` (100-100)
- core/schemas/models.go (1): `Model` (109-129)

core/schemas/account.go (1)
- ui/lib/types/config.ts (3): `AzureKeyConfig` (23-27), `VertexKeyConfig` (36-42), `BedrockKeyConfig` (53-60)

core/bifrost.go (2)
- core/schemas/bifrost.go (1): `HuggingFace` (51-51)
- core/providers/huggingface/huggingface.go (1): `NewHuggingFaceProvider` (88-120)

core/providers/huggingface/speech.go (3)
- core/schemas/speech.go (2): `BifrostSpeechRequest` (9-16), `BifrostSpeechResponse` (22-29)
- core/providers/huggingface/types.go (5): `HuggingFaceSpeechRequest` (304-310), `HuggingFaceSpeechParameters` (313-316), `HuggingFaceTranscriptionGenerationParameters` (342-359), `HuggingFaceTranscriptionEarlyStopping` (363-366), `HuggingFaceSpeechResponse` (319-323)
- core/schemas/bifrost.go (2): `BifrostResponseExtraFields` (287-296), `HuggingFace` (51-51)

core/providers/huggingface/utils.go (5)
- core/providers/huggingface/huggingface.go (1): `HuggingFaceProvider` (25-31)
- core/schemas/models.go (1): `BifrostListModelsRequest` (23-34)
- core/providers/utils/utils.go (5): `GetRequestPath` (219-239), `MakeRequestWithContext` (39-93), `HandleProviderAPIError` (317-337), `CheckAndDecodeBody` (423-431), `NewBifrostOperationError` (449-460)
- core/providers/huggingface/types.go (3): `HuggingFaceInferenceProviderMappingResponse` (25-30), `HuggingFaceInferenceProviderMapping` (39-42), `HuggingFaceHubError` (263-266)
- core/schemas/provider.go (1): `ErrProviderResponseDecode` (29-29)

core/providers/huggingface/chat.go (2)
- core/schemas/chatcompletions.go (17): `BifrostChatRequest` (12-19), `ChatContentBlockTypeText` (497-497), `ChatContentBlockTypeImage` (498-498), `ChatAssistantMessage` (541-545), `ChatToolMessage` (536-538), `ChatToolChoiceStruct` (339-344), `BifrostResponseChoice` (582-590), `ChatMessageRole` (415-415), `ChatAssistantMessageToolCall` (564-570), `ChatAssistantMessageToolCallFunction` (573-576), `ChatNonStreamResponseChoice` (605-608), `BifrostLogProbs` (593-598), `ContentLogProb` (632-637), `LogProb` (625-629), `BifrostLLMUsage` (640-647), `ChatStreamResponseChoice` (611-613), `ChatStreamResponseChoiceDelta` (616-622)
- core/providers/huggingface/types.go (12): `HuggingFaceChatRequest` (47-67), `HuggingFaceChatMessage` (69-77), `HuggingFaceContentItem` (80-84), `HuggingFaceImageRef` (86-88), `HuggingFaceToolCall` (90-94), `HuggingFaceFunction` (96-100), `HuggingFaceResponseFormat` (102-105), `HuggingFaceStreamOptions` (114-116), `HuggingFaceTool` (118-121), `HuggingFaceToolFunction` (123-127), `HuggingFaceChatResponse` (129-136), `HuggingFaceChatStreamResponse` (190-199)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (29)
.github/workflows/release-pipeline.yml (1)
118-118: LGTM! Consistent HuggingFace API key propagation across release jobs.

The HUGGING_FACE_API_KEY has been properly added to all four release jobs (core, framework, plugins, bifrost-http) that require provider API keys for testing. The implementation is consistent with existing API key patterns and correctly sources from GitHub secrets.
Also applies to: 191-191, 268-268, 357-357
core/schemas/account.go (1)
54-56: LGTM! HuggingFaceKeyConfig follows established patterns.

The new `HuggingFaceKeyConfig` type and its integration into the `Key` struct follow the same pattern as existing provider configurations (Azure, Vertex, Bedrock). The implementation is consistent with:

- Similar `Deployments` map structure used by other providers
- Proper JSON tags with `omitempty`
- Go naming conventions
- Optional field design
Also applies to: 17-17
core/internal/testutil/account.go (1)
96-96: LGTM! HuggingFace test configuration added with conservative settings.

The HuggingFace provider has been properly integrated into the test account setup. Notable configuration choices:
- MaxRetries: 1 - Much lower than other providers (8-10), suggesting less reliability or desire to fail fast
- Timeout: 300 seconds - Higher than most providers (120s), indicating potentially longer response times
- Retry backoff: 2s-30s - Conservative settings for retry attempts
These settings appear intentional for the HuggingFace Inference API characteristics. Ensure these align with production use cases and adjust if needed based on actual performance data.
Also applies to: 259-266, 512-524
.github/workflows/pr-tests.yml (1)
118-118: LGTM! HuggingFace API key added to test environment.

The HUGGING_FACE_API_KEY has been properly added to the PR test workflow, consistent with the release pipeline changes and other provider API key patterns.
ui/README.md (1)
84-84: LGTM! Documentation updated with HuggingFace provider.

The README has been updated to include HuggingFace in the list of supported providers, keeping the documentation in sync with the code changes.
core/internal/testutil/responses_stream.go (1)
693-693: Verify the lifecycle streaming safety threshold of 300 is appropriate.

The response count safety check at line 693 uses a threshold of 300 chunks. However, the file shows different thresholds for different streaming scenarios: tool streaming (100), reasoning streaming (150), lifecycle streaming (300), and basic streaming (500). Before merging, clarify:
- Why lifecycle streaming allows 3x more chunks than tool streaming—is this intentional differentiation or oversight?
- Is 300 chunks a reasonable upper bound based on actual HuggingFace lifecycle streaming behavior?
- Consider adding logging when approaching thresholds to help diagnose unexpected verbosity.
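The kind of threshold guard the last bullet suggests can be sketched as follows: count chunks, warn when approaching the cap, fail when it is exceeded. The channel type and cap values are illustrative, not the test suite's actual code.

```go
package main

import "fmt"

// consumeStream drains a chunk stream with a safety cap: it returns an error
// once the cap is exceeded and logs a warning within 90% of the cap.
func consumeStream(chunks <-chan string, maxChunks int) error {
	count := 0
	for range chunks {
		count++
		if count > maxChunks {
			return fmt.Errorf("stream exceeded safety threshold of %d chunks", maxChunks)
		}
		if count*10 >= maxChunks*9 { // within 90% of the cap: log for diagnosis
			fmt.Printf("warning: %d/%d chunks consumed\n", count, maxChunks)
		}
	}
	return nil
}

func main() {
	ch := make(chan string, 5)
	for i := 0; i < 5; i++ {
		ch <- "delta"
	}
	close(ch)
	fmt.Println(consumeStream(ch, 300)) // well under the lifecycle cap of 300
}
```

Logging near the cap (rather than only failing at it) makes runaway verbosity diagnosable from test output instead of manifesting as an opaque failure.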
transports/config.schema.json (1)
135-140: HuggingFace correctly wired into config schema
`providers.huggingface` reuses the generic `provider` schema, and the semantic cache `provider` enum now includes `"huggingface"`, both consistent with other providers and with the new `ModelProvider` constant. No issues spotted.

Also applies to: 764-784
core/schemas/bifrost.go (1)
35-83: ModelProvider and provider lists updated consistently for HuggingFace
`HuggingFace` is added as a `ModelProvider` and included in both `SupportedBaseProviders` and `StandardProviders` with the correct `"huggingface"` identifier. This aligns with the rest of the provider plumbing.

ui/lib/constants/config.ts (1)
5-24: UI config: HuggingFace placeholder and key requirement look correct

The `huggingface` model placeholder and `isKeyRequiredByProvider` entry are consistent with other providers and with the expected HF auth model. No issues from a UI/config standpoint.

Also applies to: 26-44
ui/lib/constants/logs.ts (1)
2-20: Logging constants updated to recognize HuggingFace

Adding `"huggingface"` to `KnownProvidersNames` and `ProviderLabels` keeps the provider type and display labels consistent across the UI logging layer. Looks good.

Also applies to: 43-61
docs/apis/openapi.json (1)
3239-3259: OpenAPI ModelProvider enum now includes HuggingFace (and Cerebras)

Extending the `ModelProvider` enum to include `"huggingface"` (and `"cerebras"`) brings the public API spec in line with the backend/provider constants and config schema. The change is additive and backward‑compatible.

core/bifrost.go (1)
1327-1328: LGTM!

The HuggingFace provider is correctly integrated into the provider factory, following the established pattern used by other providers that return only a pointer (like OpenAI, Anthropic, Mistral, Gemini).
core/providers/huggingface/embedding.go (2)
22-31: LGTM with a note on unsupported input types.

The conversion correctly handles `Text` and `Texts` input types. The comment on lines 29-30 appropriately documents that embedding/embeddings (int arrays) are not supported by HuggingFace feature extraction.
57-92: LGTM!

The response conversion correctly:
- Validates nil input
- Pre-allocates the slice with proper capacity
- Maps each embedding to the appropriate Bifrost structure
- Documents that HuggingFace doesn't return usage information
core/providers/huggingface/models.go (2)
46-104: LGTM!

The `deriveSupportedMethods` function correctly:
- Normalizes the pipeline string for case-insensitive matching
- Uses a map to deduplicate methods
- Handles both pipeline tags and model tags for flexibility
- Returns a sorted, deduplicated list of supported methods
34-39: Code correctly distinguishes between `model.ID` and `model.ModelID` fields.

The distinction is valid: `model.ModelID` (maps to `"modelId"` in the HuggingFace API response) represents the model's user-facing identifier, while `model.ID` (maps to `"_id"` in the API response) represents the internal HuggingFace identifier. The code appropriately uses `ModelID` for the composite ID and Name, and `ID` for the HuggingFaceID field.

core/providers/huggingface/speech.go (1)
94-116: LGTM!

The response conversion correctly maps the HuggingFace response to Bifrost format, with appropriate nil checks and documentation about missing usage/alignment data.
Makefile (2)
14-21: LGTM! Improved tool binary discovery for portable environments.

The logic for detecting Go binary paths (`GOBIN`, `GOPATH`, `DEFAULT_GOBIN`) and constructing tool paths (`AIR_BIN`, `GOTESTSUM_BIN`) is well-structured. This enables robust tool invocation without relying on global `PATH` configuration, which benefits environments like Nix.
65-69: Good safety check for preventing root execution in local development.

The guard against running as root on developer machines while allowing CI environments is appropriate. This prevents global npm install failures on systems like Nix.
core/providers/huggingface/transcription.go (1)
100-139: LGTM! Response conversion is well-implemented.

The
`ToBifrostTranscriptionResponse` method properly validates inputs, handles optional timestamp chunks, and correctly maps them to `TranscriptionSegment` structures.
201-315: LGTM! Non-streaming response conversion is comprehensive.The
ToBifrostChatResponsemethod properly handles:
- Nil checks and model validation
- Choice conversion with message, role, content, and tool calls
- Logprobs conversion including nested top logprobs
- Usage information mapping
317-413: LGTM! Streaming response conversion is well-implemented.The
ToBifrostChatStreamResponsemethod correctly converts streaming delta responses, handling roles, content, reasoning (as thought), tool calls, and logprobs appropriately.core/providers/huggingface/utils.go (1)
83-128: LGTM! URL building with proper encoding and ExtraParams handling.The
buildModelHubURLfunction correctly:
- Applies default and maximum limits
- URL-encodes all query parameters
- Handles various types in
ExtraParamswith appropriate conversions- Constructs a well-formed URL with the inference provider filter
core/providers/huggingface/huggingface.go (3)
477-650: LGTM: Stream processing goroutineThe goroutine properly handles:
- Context cancellation checks before processing each chunk
- Deferred cleanup of resources (channel close, response release)
- Scanner buffer sizing for large responses
- Error handling with proper error propagation to the response channel
- Both regular chat streaming and ResponsesStream fallback modes
132-205: LGTM: Request handling with proper resource managementThe
completeRequestfunction correctly:
- Uses defer for cleanup of fasthttp resources
- Makes a copy of the response body before releasing to avoid use-after-free
- Handles error responses with proper error type extraction
- Supports debug logging controlled by environment variable
87-120: LGTM: Provider initialization

Provider initialization correctly handles configuration defaults, client setup with timeouts, pool pre-warming, proxy configuration, and base URL normalization.
core/providers/huggingface/types.go (3)
1-43: LGTM: Model and inference provider types

The model metadata types and inference provider mapping structures are well-defined with appropriate JSON tags for API compatibility.
44-260: LGTM: Chat completion types

Comprehensive type definitions for chat requests/responses with:
- Support for both streaming and non-streaming responses
- Flexible content handling via
`json.RawMessage`
- Tool/function calling support
- Logprobs support
- Time info for streaming diagnostics
274-411: LGTM: Embedding, Speech, and Transcription types

Well-structured types for:
- Embedding with flexible input types and encoding format options
- Speech synthesis with generation parameters
- Transcription with timestamp support and generation configuration
- Type aliases for backward compatibility
5baaee2 to 5a72875
Actionable comments posted: 1
♻️ Duplicate comments (12)
docs/contributing/adding-a-provider.mdx (2)
500-527: Variable name inconsistency in documentation example (duplicate comment).

The example declares `hfReq` on line 500 but then references `providerReq` in the parameter mapping section. This inconsistency was flagged in a past review and remains unresolved. Replace all `providerReq` references with `hfReq` for consistency.

Apply this diff to fix the variable names:

```diff
 // Build the request
 hfReq := &HuggingFaceChatRequest{
 	Model:    bifrostReq.Model,
 	Messages: hfMessages,
 }

 // Map parameters
 if bifrostReq.Params != nil {
 	params := bifrostReq.Params

 	// Map standard parameters
 	if params.Temperature != nil {
-		providerReq.Temperature = params.Temperature
+		hfReq.Temperature = params.Temperature
 	}
 	if params.MaxTokens != nil {
-		providerReq.MaxTokens = params.MaxTokens
+		hfReq.MaxTokens = params.MaxTokens
 	}
 	// ... other standard parameters

 	// Handle provider-specific ExtraParams
 	if params.ExtraParams != nil {
 		if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-			providerReq.CustomParam = &customParam
+			hfReq.CustomParam = &customParam
 		}
 	}
 }

-return providerReq
+return hfReq
```
1405-1427: Incomplete/truncated code example (duplicate comment).

The code example is cut off mid-function at the tool/function calling section, which may confuse contributors trying to implement converters. This was flagged in a past review. Either complete the example with the tool calling logic and final return, or explicitly mark it as abbreviated with a comment like:

```go
	// Tool/Function calling
	if params.Tools != nil && len(params.Tools) > 0 {
		// Convert tools...
	}
}

return hfReq
```

Consider adding clarity about why the example is truncated and pointing contributors to full working examples in `core/providers/huggingface/` or `core/providers/anthropic/`.

core/providers/huggingface/huggingface_test.go (1)
29-33: TranscriptionModel and SpeechSynthesisModel are swapped.

Based on model capabilities:

- `Kokoro-82M` is a text-to-speech (TTS) model → should be `SpeechSynthesisModel`
- `whisper-large-v3` is a speech-to-text (transcription) model → should be `TranscriptionModel`

Apply this diff to fix the model assignments:

```diff
-		TranscriptionModel:   "fal-ai/hexgrad/Kokoro-82M",
-		SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3",
+		TranscriptionModel:   "fal-ai/openai/whisper-large-v3",
+		SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M",
 		SpeechSynthesisFallbacks: []schemas.Fallback{
-			{Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"},
+			{Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"},
 		},
```

core/providers/huggingface/speech.go (1)
37-63: Type assertions for `int` will fail when values come from JSON unmarshaling.

When `ExtraParams` is populated from JSON (e.g., from HTTP request bodies), numeric values are unmarshaled as `float64`, not `int`. These type assertions will silently fail, and the parameters won't be set.

Handle both `int` and `float64` types for all integer parameters. Example for `max_new_tokens`:

```diff
-	if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
-		genParams.MaxNewTokens = &val
+	if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok {
+		intVal := int(val)
+		genParams.MaxNewTokens = &intVal
+	} else if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok {
+		genParams.MaxNewTokens = &val
 	}
```

Apply the same pattern to: `max_length`, `min_length`, `min_new_tokens`, `num_beams`, `num_beam_groups`, and `top_k`.

Consider extracting a helper function to reduce code duplication:

```go
func getIntParam(params map[string]any, key string) *int {
	if val, ok := params[key].(float64); ok {
		intVal := int(val)
		return &intVal
	}
	if val, ok := params[key].(int); ok {
		return &val
	}
	return nil
}
```
14-16: Incorrect error message references "speech" instead of "transcription".

The error message says "speech request input cannot be nil" but this is a transcription request converter.

```diff
 if request.Input == nil {
-	return nil, fmt.Errorf("speech request input cannot be nil")
+	return nil, fmt.Errorf("transcription request input cannot be nil")
 }
```
38-63: Type assertions for `int` will fail when values come from JSON unmarshaling.

Same issue as in `speech.go`: when `ExtraParams` comes from JSON, numeric values are `float64`, not `int`. These assertions will silently fail.

Handle both `int` and `float64` types. Consider creating a shared helper function in the package to avoid code duplication across `speech.go` and `transcription.go`:

```go
// In a shared utils file
func getIntFromExtra(params map[string]any, key string) *int {
	if val, ok := params[key].(float64); ok {
		intVal := int(val)
		return &intVal
	}
	if val, ok := params[key].(int); ok {
		return &val
	}
	return nil
}
```

Then use it as:

```go
genParams.MaxNewTokens = getIntFromExtra(request.Params.ExtraParams, "max_new_tokens")
```
69-81: Critical: Nil pointer dereference remains unaddressed.

Line 74 still dereferences `tc.Function.Name` without checking for nil, which can cause a panic. Apply the previously suggested fix:

```diff
 for _, tc := range msg.ChatAssistantMessage.ToolCalls {
+	if tc.Function.Name == nil {
+		continue // Skip tool calls without a function name
+	}
 	hfToolCall := HuggingFaceToolCall{
 		ID:   tc.ID,
 		Type: tc.Type,
 		Function: HuggingFaceFunction{
 			Name:      *tc.Function.Name,
 			Arguments: tc.Function.Arguments,
 		},
 	}
 	hfToolCalls = append(hfToolCalls, hfToolCall)
 }
```
277-293: Critical: Wrong provider constant remains unfixed.

Line 279 still uses `schemas.Gemini` instead of `schemas.HuggingFace` in the operation check, causing incorrect permission validation. Apply the previously suggested fix:

```diff
-if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
 	return nil, err
 }
```
700-707: Minor: Wrong request type in Embedding error messages remains unfixed.

Lines 701 and 706 use `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` in error messages, which will confuse debugging. Apply the previously suggested fix:

```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }
```
773-776: Major: Wrong task type check in Speech remains unfixed. Line 774 checks for `"automatic-speech-recognition"` (transcription task) instead of `"text-to-speech"` (speech generation task). This will cause Speech operations to incorrectly validate against the wrong task type. Apply the previously suggested fix:

```diff
 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
 	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
```
831-838: Major: Wrong task type and request type in Transcription remain unfixed. Two issues here:

- Lines 832 and 837 use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` in error messages
- Line 836 checks for the `"text-to-speech"` task instead of `"automatic-speech-recognition"` (the tasks are swapped with the Speech function)

Apply the previously suggested fix:

```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
```

core/providers/huggingface/types.go (1)
21-23: Critical: Struct definition may not match API response format. The HuggingFace `/api/models` endpoint returns a JSON array directly, but `HuggingFaceListModelsResponse` expects an object with a `models` field. This mismatch will cause unmarshaling to fail. Run this script to verify the API response format:

```bash
#!/bin/bash
# Check the actual response format from HuggingFace API
curl -s "https://huggingface.co/api/models?limit=1" | jq -c 'if type == "array" then "ARRAY" else "OBJECT with keys: " + (keys | join(", ")) end'
```

If the API returns `"ARRAY"`, change the response type to a slice or add a custom `UnmarshalJSON`:

```diff
-type HuggingFaceListModelsResponse struct {
-	Models []HuggingFaceModel `json:"models"`
-}
+type HuggingFaceListModelsResponse []HuggingFaceModel
```

And update usage in `listModelsByKey` (line ~256 in huggingface.go):

```diff
-response := huggingfaceAPIResponse.ToBifrostListModelsResponse(providerName)
+response := ToBifrostListModelsResponse(huggingfaceAPIResponse, providerName)
```
🧹 Nitpick comments (8)
docs/contributing/adding-a-provider.mdx (1)
43-43: Hyphenate compound adjectives: "OpenAI-compatible" throughout documentation. For consistency and grammatical correctness, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible":

```diff
- #### Non-OpenAI Compatible Providers
+ #### Non-OpenAI-compatible Providers

- #### OpenAI Compatible Providers
+ #### OpenAI-compatible Providers

- ### OpenAI Compatible Providers
+ ### OpenAI-compatible Providers

- ### For OpenAI Compatible Providers
+ ### For OpenAI-compatible Providers
```

Apply these changes throughout the document to improve consistency and readability.
Also applies to: 71-71, 629-629, 1469-1469
Makefile (3)
14-21: Binary path variables introduce deterministic tool discovery, which is good for portability. The approach of chaining GOBIN → GOPATH/bin → default Go locations is sound and addresses real pain points on systems like Nix. However, Line 21's fallback to `which` may not handle absolute paths reliably:

```make
GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))
```

`which` typically expects command names (searched in PATH), not absolute paths. If `which` is called with a full path, it may fail unexpectedly on some systems. Consider using `command -v` or simplifying to just fall back to the bare name:

```diff
- GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,$(shell which gotestsum 2>/dev/null || echo gotestsum))
+ GOTESTSUM_BIN := $(if $(strip $(DEFAULT_GOBIN)),$(DEFAULT_GOBIN)/gotestsum,gotestsum)
```

The `|| echo gotestsum` ensures the variable is never empty, but the bare name should be sufficient; downstream checks like Line 103 already verify existence.
89-100: install-air logic is sound but the binary availability check is convoluted (Line 97). The condition:

```shell
if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then
```

will work (it warns only if the full path doesn't exist), but it's unnecessarily complex. Since `INSTALLED` is a full path, `which` on an absolute path may fail unexpectedly on some shells. Simplify to:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```
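The discovery-and-fallback chain these targets implement can be sketched in Go terms (the helper name is illustrative; `exec.LookPath` plays the role of the shell's `command -v`):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// findTool mirrors the Makefile logic: prefer an explicit bin directory,
// then a PATH lookup, then the bare name so callers never get an empty string.
func findTool(binDir, tool string) string {
	if binDir != "" {
		candidate := filepath.Join(binDir, tool)
		if info, err := os.Stat(candidate); err == nil && !info.IsDir() {
			return candidate
		}
	}
	if path, err := exec.LookPath(tool); err == nil {
		return path
	}
	return tool
}

func main() {
	fmt.Println(findTool("", "sh"))               // resolved via PATH on most systems
	fmt.Println(findTool("/nonexistent", "nope")) // falls back to the bare name
}
```

The key design point is the same as in the review suggestion: an existence check on a full path (`os.Stat` here, `[ -x ... ]` in the Makefile) is simpler and more portable than passing that path to `which`.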
102-113: install-gotestsum has the same `which` issue and convoluted logic (Lines 103, 110). Line 103:

```shell
if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then
```

If GOTESTSUM_BIN is a full path like `/home/user/go/bin/gotestsum`, `which` on it will fail. Simplify to just check the file:

```diff
- if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ if [ -x "$(GOTESTSUM_BIN)" ]; then \
```

Line 110 has the same convoluted conditional as Line 97; apply the same simplification:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```

core/schemas/mux.go (1)
1146-1221: Confirm intent of mixing `delta.Thought` into `output_text.delta` and potential duplication. The new logic now (a) enters the text path when either `hasContent` or `hasThought` is true and (b) builds `contentDelta` by concatenating `delta.Content` and `delta.Thought`, then uses that for `ResponsesStreamResponseTypeOutputTextDelta.Delta`, while still emitting a separate `ResponsesStreamResponseTypeReasoningSummaryTextDelta` based on `delta.Thought`.

This is a behavioral change from "text delta only reflects `Content`" to "text delta reflects `Content` + `Thought`", and also means the same `Thought` tokens appear in both the output-text and reasoning-summary streams. If `Thought` is intended to remain non-user-visible reasoning, this could leak it into the primary text channel and may surprise existing consumers that only look at `output_text.delta`.

Please double-check that:

- Existing UIs/clients that consume `output_text.delta` are meant to see `Thought` text, and
- They can tolerate `Thought` appearing both in `output_text.delta` and `reasoning_summary_text.delta`.

If the goal was only to advance lifecycle (item creation / closing) when thought-only chunks arrive, a narrower change that gates on `hasThought` but keeps `Delta` based solely on `Content` might be safer. Happy to sketch that refactor if you confirm the desired semantics.

core/providers/huggingface/huggingface_test.go (1)
59-63: Consider using defer for client.Shutdown() to ensure cleanup on panic. If `RunAllComprehensiveTests` panics, `client.Shutdown()` won't be called, potentially leaving resources unreleased.

```diff
+	defer client.Shutdown()
+
 	t.Run("HuggingFaceTests", func(t *testing.T) {
 		testutil.RunAllComprehensiveTests(t, client, ctx, testConfig)
 	})
-	client.Shutdown()
```

core/providers/huggingface/chat.go (2)
34-64: Consider handling marshaling errors for content conversion. Lines 37 and 61 discard marshaling errors when converting content. While the API may reject invalid payloads, explicitly handling these errors would make debugging easier. Consider logging or returning errors:

```diff
 if msg.Content.ContentStr != nil {
-	contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr)
+	contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
+	if err != nil {
+		// Log warning or return error
+		continue
+	}
 	hfMsg.Content = json.RawMessage(contentJSON)
 }
```

Apply similar handling at line 61 for ContentBlocks.
136-195: Consider handling errors in ResponseFormat and ToolChoice conversions. Lines 138-144 and 189-193 silently discard marshaling errors when converting `ResponseFormat` and `ToolChoice`. Since these are optional but important parameters, logging failures would aid debugging. Example for ResponseFormat:

```diff
 if params.ResponseFormat != nil {
 	responseFormatJSON, err := sonic.Marshal(params.ResponseFormat)
-	if err == nil {
+	if err != nil {
+		// Log warning: failed to marshal ResponseFormat
+	} else {
 		var hfResponseFormat HuggingFaceResponseFormat
 		if err := sonic.Unmarshal(responseFormatJSON, &hfResponseFormat); err == nil {
 			hfReq.ResponseFormat = &hfResponseFormat
+		} else {
+			// Log warning: failed to unmarshal ResponseFormat
 		}
 	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/chat_completion_stream.go` (1 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types copy.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/features/unified-interface.mdx` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (1 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- core/providers/huggingface/utils.go
🚧 Files skipped from review as they are similar to previous changes (14)
- core/providers/huggingface/types copy.go
- ui/README.md
- docs/features/unified-interface.mdx
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- ui/lib/constants/logs.ts
- core/providers/huggingface/embedding.go
- transports/config.schema.json
- ui/lib/constants/config.ts
- .github/workflows/release-pipeline.yml
- core/schemas/bifrost.go
- core/internal/testutil/chat_completion_stream.go
- core/internal/testutil/account.go
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `core/providers/huggingface/huggingface_test.go`
- `core/providers/huggingface/models.go`
- `core/providers/huggingface/chat.go`
- `core/providers/huggingface/speech.go`
- `core/schemas/mux.go`
- `Makefile`
- `core/providers/huggingface/transcription.go`
- `core/bifrost.go`
- `ui/lib/constants/icons.tsx`
- `core/schemas/account.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/providers/huggingface/types.go`
- `core/providers/huggingface/huggingface.go`
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (5)
- core/internal/testutil/setup.go (1): `SetupTest` (51-60)
- core/internal/testutil/account.go (2): `ComprehensiveTestConfig` (47-64), `TestScenarios` (22-44)
- core/schemas/bifrost.go (2): `HuggingFace` (51-51), `Fallback` (131-134)
- core/schemas/models.go (1): `Model` (109-129)
- core/internal/testutil/tests.go (1): `RunAllComprehensiveTests` (15-62)

core/providers/huggingface/models.go (3)
- core/providers/huggingface/types.go (1): `HuggingFaceListModelsResponse` (21-23)
- core/schemas/bifrost.go (13): `ModelProvider` (32-32), `RequestType` (86-86), `ChatCompletionRequest` (92-92), `ChatCompletionStreamRequest` (93-93), `TextCompletionRequest` (90-90), `TextCompletionStreamRequest` (91-91), `ResponsesRequest` (94-94), `ResponsesStreamRequest` (95-95), `EmbeddingRequest` (96-96), `SpeechRequest` (97-97), `SpeechStreamRequest` (98-98), `TranscriptionRequest` (99-99), `TranscriptionStreamRequest` (100-100)
- core/schemas/models.go (2): `BifrostListModelsResponse` (36-45), `Model` (109-129)

core/providers/huggingface/speech.go (3)
- core/schemas/speech.go (2): `BifrostSpeechRequest` (9-16), `BifrostSpeechResponse` (22-29)
- core/providers/huggingface/types.go (5): `HuggingFaceSpeechRequest` (304-310), `HuggingFaceSpeechParameters` (313-316), `HuggingFaceTranscriptionGenerationParameters` (342-359), `HuggingFaceTranscriptionEarlyStopping` (363-366), `HuggingFaceSpeechResponse` (319-323)
- core/schemas/bifrost.go (2): `BifrostResponseExtraFields` (287-296), `HuggingFace` (51-51)

core/schemas/mux.go (2)
- core/schemas/responses.go (2): `BifrostResponsesStreamResponse` (1422-1460), `ResponsesStreamResponseTypeOutputTextDelta` (1370-1370)
- core/schemas/utils.go (1): `Ptr` (16-18)

core/bifrost.go (2)
- core/schemas/bifrost.go (1): `HuggingFace` (51-51)
- core/providers/huggingface/huggingface.go (1): `NewHuggingFaceProvider` (88-120)

core/schemas/account.go (1)
- ui/lib/types/config.ts (3): `AzureKeyConfig` (23-27), `VertexKeyConfig` (36-42), `BedrockKeyConfig` (53-60)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ...yml - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (9)
Makefile (2)
155-155: Variable substitutions are correct and enable deterministic test execution. The use of
`$(AIR_BIN)` and `$(GOTESTSUM_BIN)` throughout the dev and test targets is consistent and properly formatted. This decoupling from PATH-based tool discovery is essential for reliable test execution, especially for the new HuggingFace provider tests.

Also applies to: 163-163, 349-349, 401-401, 425-425, 454-454, 548-548
24-29: Color vars, root check, and next installation fallback are sound.
- Color refactor (lines 24–29): Using
`printf` centralizes ANSI code definitions, good for maintainability.
- Root user guard (lines 67-70): Defensive check prevents npm permission issues on dev machines; appropriate.
- Next installation fallback (lines 75–85): Thoughtful multi-level strategy (local, then npx, then global) handles Nix-like environments well.
Also applies to: 66-70, 75-85
core/schemas/account.go (1)
54-56: LGTM! Consistent with existing provider key config patterns. The
`HuggingFaceKeyConfig` type follows the established pattern used by Azure, Vertex, and Bedrock configurations, with a `Deployments` map for model-to-deployment mapping.

core/bifrost.go (1)
1327-1328: LGTM! Provider integration follows established patterns. The HuggingFace provider case follows the same pattern as other providers like OpenAI, Anthropic, and Gemini that return a provider instance without an error.
core/providers/huggingface/speech.go (1)
94-116: LGTM! Response conversion is straightforward and correct. The
`ToBifrostSpeechResponse` properly validates inputs and maps the response fields. The comment about missing usage/alignment data is helpful for future maintainers.

core/providers/huggingface/models.go (3)
16-44: LGTM! Model list conversion handles edge cases properly. The method correctly:
- Skips models with empty
`ModelID`
- Skips models without derivable supported methods
- Pre-allocates the result slice with appropriate capacity
- Creates a composite ID with the provider prefix
46-104: LGTM! Comprehensive method derivation from pipeline and tags. The function properly:
- Normalizes the pipeline tag
- Uses a set to deduplicate methods
- Handles both primary pipeline tags and secondary model tags
- Returns a sorted slice for consistent output
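The derivation pattern the review describes can be illustrated with a small self-contained sketch (the tag strings and method names below are hypothetical stand-ins, not the PR's actual mapping table):

```go
package main

import (
	"fmt"
	"sort"
)

// deriveSupportedMethods normalizes the pipeline tag, collects methods into a
// set to deduplicate across primary and secondary tags, and returns them
// sorted for stable output.
func deriveSupportedMethods(pipelineTag string, tags []string) []string {
	set := map[string]struct{}{}
	add := func(m string) { set[m] = struct{}{} }
	switch pipelineTag {
	case "text-generation", "conversational":
		add("chat.completions")
	case "feature-extraction", "sentence-similarity":
		add("embeddings")
	case "text-to-speech":
		add("audio.speech")
	case "automatic-speech-recognition":
		add("audio.transcriptions")
	}
	for _, t := range tags {
		if t == "conversational" {
			add("chat.completions") // secondary tag can also imply chat support
		}
	}
	out := make([]string, 0, len(set))
	for m := range set {
		out = append(out, m)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(deriveSupportedMethods("text-generation", []string{"conversational"}))
}
```

The set-then-sort step is what makes the output deterministic even when a method is implied by both the pipeline tag and a secondary tag.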
11-14: These constants are used in `core/providers/huggingface/utils.go` (lines 90, 92-93), so no action is needed.

core/providers/huggingface/transcription.go (1)
100-139: LGTM! Response conversion with proper segment mapping. The method correctly:
- Validates non-nil receiver and non-empty model name
- Maps chunks to segments with proper ID assignment
- Safely handles timestamp arrays with bounds checking before access
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1220: Remove duplicate delta handling for Thought content. The code processes
`delta.Thought` twice with conflicting approaches:
- Lines 1207-1213: Concatenates
`Thought` with `Content` into a single `contentDelta` string emitted as `ResponsesStreamResponseTypeOutputTextDelta`
delta.Thoughtseparately asResponsesStreamResponseTypeReasoningSummaryTextDeltaThis means consumers receive the thought content both mixed into regular text deltas and as a separate reasoning delta event. Keep the separate reasoning delta emission (lines 1369-1377) and remove the concatenation of
ThoughtintocontentDelta(lines 1211-1212).
♻️ Duplicate comments (15)
core/providers/huggingface/types.go (1)
21-23: Critical: Struct definition doesn't match HuggingFace API response format. This issue was previously flagged: the HuggingFace
`/api/models` endpoint returns a JSON array directly, but `HuggingFaceListModelsResponse` expects an object with a `models` field. This will cause unmarshaling failures. Either change the response type to `[]HuggingFaceModel` or add custom `UnmarshalJSON` logic.

docs/contributing/adding-a-provider.mdx (2)
510-527: Variable name inconsistency in documentation example. The example code references
`providerReq` (e.g., `providerReq.Temperature`), but the variable was named `hfReq` on line 500. This inconsistency could confuse contributors.
1406-1427: Code example is truncated mid-function. The code block is incomplete, cutting off inside the parameter mapping section. This could confuse contributors trying to follow the pattern.
core/providers/huggingface/transcription.go (2)
14-16: Incorrect error message references "speech" instead of "transcription". The error message says "speech request input cannot be nil" but this is a transcription request converter.
38-79: Type assertions for `int` will fail when values come from JSON unmarshalling. When
`ExtraParams` is populated from JSON (e.g., from request bodies), numeric values are unmarshalled as `float64`, not `int`. The type assertions like `.(int)` on lines 38, 41, 44, 47, 50, 53, and 62 will silently fail, causing these parameters to be ignored.
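A helper that accepts both representations might look like this (a sketch; the helper name mirrors `getIntFromExtra` from the diff, but the exact signature is an assumption):

```go
package main

import "fmt"

// getIntFromExtra sketches a tolerant lookup: encoding/json decodes numbers
// into float64 when the target is interface{}, so a bare v.(int) assertion
// misses values that came from JSON.
func getIntFromExtra(extra map[string]any, key string) *int {
	v, ok := extra[key]
	if !ok {
		return nil
	}
	switch n := v.(type) {
	case int:
		return &n
	case float64:
		i := int(n)
		return &i
	}
	return nil
}

func main() {
	// Values set in Go code arrive as int; values decoded from JSON as float64.
	extra := map[string]any{"max_new_tokens": float64(256), "num_beams": 4}
	fmt.Println(*getIntFromExtra(extra, "max_new_tokens")) // 256
	fmt.Println(*getIntFromExtra(extra, "num_beams"))      // 4
}
```

The same switch covers all the integer parameters listed below, regardless of whether the request was built in Go or decoded from a JSON body.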
intandfloat64types for all integer parameters (max_new_tokens,max_length,min_length,min_new_tokens,num_beams,num_beam_groups,top_k).core/providers/huggingface/speech.go (1)
37-77: Type assertions for `int` will fail when values come from JSON unmarshalling. When
`ExtraParams` is populated from JSON, numeric values are unmarshalled as `float64`, not `int`. The type assertions like `.(int)` on lines 37, 40, 43, 46, 49, 52, and 61 will silently fail, causing these parameters to be ignored.
intandfloat64types for all integer parameters.core/providers/huggingface/chat.go (1)
69-78: Potential nil pointer dereference when accessing `tc.Function.Name`. On line 74,
`tc.Function.Name` is dereferenced without checking if it's nil. If the tool call has a nil `Name` field, this will cause a panic. Add a nil check before dereferencing or skip tool calls without function names.
core/providers/huggingface/utils.go (3)
136-162: Empty provider and model names when input has no slashes. When
`splitIntoModelProvider` receives a model name with no slashes (t == 0), both `prov` and `model` remain empty strings. Downstream code at lines 313 and 398 in `huggingface.go` formats this as `"" : ""`, resulting in malformed model identifiers. Handle the t == 0 case by setting a default provider (e.g., "hf-inference") and using the input as the model name.
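The suggested default could be sketched like this (assuming the first path segment names the inference provider; the function name is illustrative, not the PR's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelProvider splits "provider/model" on the first slash and falls
// back to the default "hf-inference" route for bare model names, so neither
// return value is ever empty for non-empty input.
func splitModelProvider(input string) (provider, model string) {
	if i := strings.Index(input, "/"); i > 0 {
		return input[:i], input[i+1:]
	}
	return "hf-inference", input
}

func main() {
	fmt.Println(splitModelProvider("nebius/some-model"))
	fmt.Println(splitModelProvider("gpt2")) // no slash: defaults to hf-inference
}
```

With this shape, the downstream `provider : model` formatting never sees two empty strings.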
164-197: Incomplete provider routing in `getInferenceProviderRouteURL`: 13 of 19 defined providers will error. The function only handles 6 providers (
`fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, `scaleway`) while `INFERENCE_PROVIDERS` defines 19 total. Providers like `cerebras`, `cohere`, `groq`, `featherlessAI`, `fireworksAI`, `hyperbolic`, `novita`, `nscale`, `ovhcloud`, `publicai`, `together`, `wavespeed`, and `zaiOrg` will hit the default error case.
INFERENCE_PROVIDERS.
180-191: Copy-paste error: Wrong provider names in error messages. Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" (line 184) and "scaleway" (line 190).
core/providers/huggingface/huggingface.go (5)
277-281: Bug: Wrong provider constant used in ListModels.
`CheckOperationAllowed` is called with `schemas.Gemini` instead of `schemas.HuggingFace`. This is a copy-paste error and will cause incorrect operation permission checks.
295-301: Bug: Wrong request types in unsupported operation errors. Both
`TextCompletion` and `TextCompletionStream` return errors with `schemas.EmbeddingRequest` instead of the correct request types (`schemas.TextCompletionRequest` and `schemas.TextCompletionStreamRequest`).
700-707: Bug: Wrong request type in Embedding error messages. The error messages reference
`schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` when the embedding operation is unsupported (lines 701 and 706).
773-776: Bug: Wrong task type check in Speech. The Speech function checks for the
`"automatic-speech-recognition"` task, but Speech (text-to-speech) should check for `"text-to-speech"`. The task checks appear to be swapped between the Speech and Transcription functions.
831-838: Bug: Wrong task type check and error messages in Transcription. Two issues:
- The error messages use
`schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` (lines 832, 837)
- Line 836 checks for the
`"text-to-speech"` task, but Transcription should check for `"automatic-speech-recognition"`
🧹 Nitpick comments (4)
Makefile (2)
88-100: Extract repeated DEFAULT_GOBIN logic into a shared variable for DRY. Lines 95 and 108 both replicate the DEFAULT_GOBIN fallback logic:

```make
$(if $(strip $(GOBIN)),$(GOBIN)/gotestsum,$(if $(strip $(GOPATH)),$(GOPATH)/bin/gotestsum,...))
```

This can be defined once and reused, improving maintainability and consistency across targets. Consider extracting into a helper pattern (or simply reusing DEFAULT_GOBIN):

```make
INSTALLED_AIR_PATH := $(DEFAULT_GOBIN)/air
INSTALLED_GOTESTSUM_PATH := $(DEFAULT_GOBIN)/gotestsum
```

Then reference these in the informational messages instead of computing the logic twice.
Also applies to: 102-113
103-103: Minor: Quote GOTESTSUM_BIN in the which invocation for robustness. Line 103 uses `which $(GOTESTSUM_BIN)` without quotes. While binary names rarely contain spaces, shell best practice is to quote variable expansions:

```shell
which "$(GOTESTSUM_BIN)" > /dev/null 2>&1
```

core/providers/huggingface/types.go (2)
379-396: Consider returning an error for invalid early_stopping values. The `UnmarshalJSON` method silently ignores invalid values by returning `nil` at line 395. If the JSON value is neither a boolean nor a string, this will leave the field in an indeterminate state. Consider returning an error to surface invalid API responses:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}
 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}
-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```
66-66: Extra field won't capture unknown fields with the `json:"-"` tag. The
`Extra` field at line 66 has the `json:"-"` tag, which excludes it from JSON marshaling/unmarshaling. This means it won't capture unknown additional fields as the comment suggests. If you want to capture unknown fields, remove the `json:"-"` tag or use a different approach. If this field is for internal use only (not from/to JSON), the current implementation is correct but the comment should be clarified.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/chat_completion_stream.go` (1 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types copy.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/features/unified-interface.mdx` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (1 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- core/providers/huggingface/types copy.go
🚧 Files skipped from review as they are similar to previous changes (12)
- core/internal/testutil/chat_completion_stream.go
- core/providers/huggingface/huggingface_test.go
- ui/lib/constants/config.ts
- core/providers/huggingface/models.go
- ui/lib/constants/logs.ts
- core/schemas/account.go
- core/schemas/bifrost.go
- transports/config.schema.json
- docs/apis/openapi.json
- ui/README.md
- .github/workflows/pr-tests.yml
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `core/schemas/mux.go`
- `docs/features/unified-interface.mdx`
- `core/internal/testutil/account.go`
- `core/providers/huggingface/embedding.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/providers/huggingface/speech.go`
- `Makefile`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/utils.go`
- `ui/lib/constants/icons.tsx`
- `core/providers/huggingface/chat.go`
- `core/bifrost.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
🧬 Code graph analysis (6)
core/schemas/mux.go (2)
- core/schemas/responses.go (2): `BifrostResponsesStreamResponse` (1422-1460), `ResponsesStreamResponseTypeOutputTextDelta` (1370-1370)
- core/schemas/utils.go (1): `Ptr` (16-18)

core/internal/testutil/account.go (4)
- core/schemas/bifrost.go (1): `HuggingFace` (51-51)
- core/schemas/account.go (1): `Key` (8-18)
- core/schemas/provider.go (4): `ProviderConfig` (234-242), `NetworkConfig` (45-53), `DefaultRequestTimeoutInSeconds` (15-15), `ConcurrencyAndBufferSize` (128-131)
- core/internal/testutil/cross_provider_scenarios.go (1): `ProviderConfig` (45-53)

core/providers/huggingface/embedding.go (3)
- core/schemas/embedding.go (4): `BifrostEmbeddingRequest` (9-16), `BifrostEmbeddingResponse` (22-28), `EmbeddingData` (118-122), `EmbeddingStruct` (124-128)
- core/providers/huggingface/types.go (2): `HuggingFaceEmbeddingRequest` (278-288), `HuggingFaceEmbeddingResponse` (299-299)
- core/schemas/chatcompletions.go (1): `BifrostLLMUsage` (640-647)

core/providers/huggingface/transcription.go (3)
- core/schemas/transcriptions.go (2): `BifrostTranscriptionRequest` (3-10), `BifrostTranscriptionResponse` (16-26)
- core/providers/huggingface/types.go (5): `HuggingFaceTranscriptionRequest` (328-333), `HuggingFaceTranscriptionRequestParameters` (336-339), `HuggingFaceTranscriptionGenerationParameters` (342-359), `HuggingFaceTranscriptionEarlyStopping` (363-366), `HuggingFaceTranscriptionResponse` (399-402)
- core/schemas/bifrost.go (2): `BifrostResponseExtraFields` (287-296), `HuggingFace` (51-51)

core/providers/huggingface/chat.go (3)
- core/schemas/chatcompletions.go (4): `BifrostChatRequest` (12-19), `BifrostChatResponse` (26-41), `BifrostResponseChoice` (582-590), `BifrostLLMUsage` (640-647)
- core/providers/huggingface/types.go (11): `HuggingFaceChatRequest` (47-67), `HuggingFaceChatMessage` (69-77), `HuggingFaceContentItem` (80-84), `HuggingFaceImageRef` (86-88), `HuggingFaceToolCall` (90-94), `HuggingFaceFunction` (96-100), `HuggingFaceResponseFormat` (102-105), `HuggingFaceStreamOptions` (114-116), `HuggingFaceTool` (118-121), `HuggingFaceToolFunction` (123-127), `HuggingFaceChatResponse` (129-136)
- core/schemas/bifrost.go (2): `BifrostResponseExtraFields` (287-296), `HuggingFace` (51-51)

core/bifrost.go (2)
- core/schemas/bifrost.go (1): `HuggingFace` (51-51)
- core/providers/huggingface/huggingface.go (1): `NewHuggingFaceProvider` (88-120)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ...yml - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (21)
Makefile (3)
15-21: Tool-path discovery logic is sound; minor consistency opportunity. The conditional expansion for DEFAULT_GOBIN and fallback chains for AIR_BIN and GOTESTSUM_BIN are well-structured. However, the logic could be slightly clearer:

- Lines 17-18: `GOBIN ?=` and `GOPATH ?=` query the environment each time; consider memoizing to `$(shell go env GOBIN)` and `$(shell go env GOPATH)` without the conditional assignment if these are truly static per invocation.
- Line 21: The fallback to `which gotestsum` is good, but ensure it's tested on systems where `which` may not be available (rare, but worth verifying).

Overall, the approach is solid for improving portability and reducing hard-coded tool assumptions.
155-155: Binary invocation refactoring is correct. The substitution of direct `air` with `$(AIR_BIN)` is clean and allows the Makefile to find air without relying on PATH. Well done.

Also applies to: 163-163
349-349: Test target refactoring to use $(GOTESTSUM_BIN) looks good. All test invocations now consistently use the computed GOTESTSUM_BIN variable, ensuring tests can run even if gotestsum is not on PATH. Proper fallback logic in the variable definition (line 21) makes this robust.
Verify that tests pass when GOTESTSUM_BIN resolves to a full path outside PATH (e.g., via GOBIN). Run a quick smoke test locally to confirm:
make test-core PROVIDER=<any_provider>.Also applies to: 401-401, 425-425, 454-454, 548-548
core/internal/testutil/account.go (2)
96-96: LGTM! HuggingFace provider added to configured providers.
The addition follows the established pattern and correctly references the HuggingFace constant defined in `core/schemas/bifrost.go`.
259-266: LGTM! Key retrieval properly configured.
The HuggingFace key configuration follows the same pattern as other providers, retrieving the API key from the `HUGGING_FACE_API_KEY` environment variable.
core/providers/huggingface/types.go (1)
129-408: Comprehensive type definitions for HuggingFace integration.
The type definitions provide good coverage for the chat, embeddings, speech, and transcription APIs. The use of pointer fields, `json.RawMessage`, and `map[string]any` provides appropriate flexibility for varying API responses while maintaining type safety where possible.
docs/features/unified-interface.mdx (1)
98-98: Documentation correctly reflects HuggingFace provider capabilities.
The provider support matrix entry accurately documents HuggingFace's supported operations and follows the consistent format of the other provider entries.
core/bifrost.go (2)
26-26: Import correctly added for HuggingFace provider.
The import follows the consistent pattern used for other provider packages.
1327-1328: HuggingFace provider properly integrated into factory.
The provider instantiation follows the same pattern as other providers, correctly calling `NewHuggingFaceProvider` with config and logger parameters. The integration is consistent with the existing provider architecture.
7-2065: LGTM! Comprehensive provider implementation guide.
The documentation provides excellent coverage of both OpenAI-compatible and custom provider patterns, with clear examples, conventions, and checklists. The phase-based workflow and file-organization guidelines will help contributors implement providers consistently.
core/providers/huggingface/embedding.go (2)
10-55: LGTM! Solid embedding request converter.
The function properly handles nil inputs, correctly splits model/provider, and safely extracts ExtraParams with type assertions. The mapping logic is clear and follows the converter pattern.
58-93: LGTM! Well-implemented response converter.
The method correctly handles nil responses with error returns, pre-allocates slices for performance, and properly documents that HuggingFace doesn't return usage information (setting it to zero). The conversion logic is clean and efficient.
core/providers/huggingface/transcription.go (1)
100-139: LGTM! Solid transcription response converter.
The method properly validates inputs, safely handles timestamp array access, and cleanly converts HuggingFace chunks to Bifrost segments. The implementation is defensive and correct.
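The defensive chunk-to-segment conversion described above can be sketched roughly as follows; the type and field names here are placeholders for illustration, not the real Bifrost or HuggingFace schemas:

```go
package main

import "fmt"

// hfChunk mirrors the shape discussed above: a text piece plus a
// [start, end] timestamp pair that may be missing or short.
type hfChunk struct {
	Text      string
	Timestamp []float64
}

type segment struct {
	Text       string
	Start, End float64
}

// chunksToSegments converts chunks defensively: timestamps are only
// read when both bounds are actually present, so a short or missing
// timestamp slice can never cause an out-of-range panic.
func chunksToSegments(chunks []hfChunk) []segment {
	segs := make([]segment, 0, len(chunks)) // pre-allocate like the reviewed code
	for _, c := range chunks {
		s := segment{Text: c.Text}
		if len(c.Timestamp) >= 2 {
			s.Start, s.End = c.Timestamp[0], c.Timestamp[1]
		}
		segs = append(segs, s)
	}
	return segs
}

func main() {
	segs := chunksToSegments([]hfChunk{
		{Text: "hello", Timestamp: []float64{0, 1.2}},
		{Text: "world"}, // missing timestamps stay zero instead of panicking
	})
	fmt.Println(len(segs), segs[0].End)
}
```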
core/providers/huggingface/speech.go (1)
94-116: LGTM! Clean speech response converter.
The method properly validates inputs and clearly documents that the HuggingFace TTS API doesn't return usage or alignment data. The implementation is straightforward and correct.
core/providers/huggingface/chat.go (2)
318-413: LGTM! Well-structured streaming response converter.
The method properly handles nil responses and cleanly converts all streaming delta fields (role, content, reasoning, tool calls, logprobs) to Bifrost format. The implementation is thorough and correct.
86-89: No action needed. The `debug` variable is properly accessible.
The `debug` variable is defined at package level in `huggingface.go` (line 22) and is correctly accessible in `chat.go`. In Go, all files within the same package share a common namespace, so no import or additional qualification is required for package-level variables. The code will compile without errors.
Likely an incorrect or invalid review comment.
core/providers/huggingface/utils.go (2)
83-128: LGTM! Well-implemented URL builder.
The function properly constructs the model hub URL with pagination, sorting, and provider filtering. The type switch for handling ExtraParams is robust and covers common types.
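A minimal sketch of the URL construction pattern praised above, using the standard library. The endpoint follows the public `https://huggingface.co/api/models` listing API, but treat the exact parameter names and the clamp bounds as assumptions rather than the PR's actual values:

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildHubURL clamps the page size, then attaches sort and provider
// filters as query parameters. Parameter names and the default page
// size are illustrative assumptions.
func buildHubURL(limit int, inferenceProvider string) string {
	if limit <= 0 || limit > 100 {
		limit = 50 // illustrative default page size
	}
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("sort", "trendingScore")
	if inferenceProvider != "" {
		q.Set("inference_provider", inferenceProvider)
	}
	return "https://huggingface.co/api/models?" + q.Encode()
}

func main() {
	// An out-of-range limit is clamped; url.Values.Encode sorts keys alphabetically.
	fmt.Println(buildHubURL(500, "fal-ai"))
}
```

Using `url.Values` instead of manual string concatenation gets escaping and deterministic key ordering for free, which also makes the output easy to assert in tests.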
130-134: LGTM! Solid utility functions.
The `buildModelInferenceProviderURL`, `convertToInferenceProviderMappings`, and `getModelInferenceProviderMapping` functions are well-implemented with proper error handling and clean logic.
Also applies to: 199-251
core/providers/huggingface/huggingface.go (3)
20-126: LGTM! Well-structured provider implementation.
The provider struct follows standard patterns with proper field ordering. The use of sync.Pool for response objects is a good performance optimization, and the acquire/release helpers are correctly implemented. The constructor properly initializes all fields and pre-warms the pools.
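The acquire/release pool pattern referred to above looks roughly like this sketch; the struct and its fields are placeholders, not the provider's real response schema:

```go
package main

import (
	"fmt"
	"sync"
)

// chatResponse stands in for the provider's pooled response struct.
type chatResponse struct {
	ID      string
	Choices []string
}

var chatResponsePool = sync.Pool{
	New: func() any { return &chatResponse{} },
}

// acquireChatResponse hands out a response object from the pool.
func acquireChatResponse() *chatResponse {
	return chatResponsePool.Get().(*chatResponse)
}

// releaseChatResponse zeroes the object before returning it to the
// pool, so a later Get never observes a previous request's data.
func releaseChatResponse(r *chatResponse) {
	*r = chatResponse{}
	chatResponsePool.Put(r)
}

func main() {
	r := acquireChatResponse()
	r.ID = "resp-1"
	r.Choices = append(r.Choices, "hello")
	fmt.Println(r.ID, len(r.Choices))
	releaseChatResponse(r)
}
```

Resetting inside the release helper (rather than trusting callers) is the design choice that makes the pool safe to pre-warm and reuse across requests.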
303-653: LGTM! Robust chat completion implementation.
Both sync and streaming chat completion methods are well-implemented. The streaming logic properly handles SSE parsing, error cases, and the complex ResponsesStream fallback scenario. The workaround for combined usage/choices chunks (lines 597-626) is clever and well-documented.
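The combined usage/choices workaround mentioned above can be illustrated with a toy splitter; the chunk type is a placeholder, not Bifrost's real streaming schema:

```go
package main

import "fmt"

// streamChunk is a stand-in for a provider SSE chunk.
type streamChunk struct {
	Choices []string
	Usage   *int
}

// splitFinalChunk sketches the workaround: some backends send their
// last streaming chunk with both choices and usage populated, so it
// is split into a choices-only event followed by a usage-only event,
// matching the usual OpenAI-style stream shape consumers expect.
func splitFinalChunk(c streamChunk) []streamChunk {
	if c.Usage != nil && len(c.Choices) > 0 {
		return []streamChunk{
			{Choices: c.Choices},
			{Usage: c.Usage},
		}
	}
	return []streamChunk{c} // ordinary chunks pass through unchanged
}

func main() {
	u := 42
	out := splitFinalChunk(streamChunk{Choices: []string{"done"}, Usage: &u})
	fmt.Println(len(out)) // the combined chunk becomes two events
}
```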
655-677: LGTM! Clean delegation to chat completion methods.
The Responses and ResponsesStream methods correctly delegate to the corresponding chat completion methods with appropriate context flags and request type overrides.
Force-pushed from 5a72875 to 00ecc06 (compare)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/schemas/mux.go (1)
1146-1221: `Thought` is now streamed as both user-visible text and as a separate reasoning delta, creating double-streaming and changed semantics for all providers.
The code confirms the concerns raised:
- Lines 1146–1148 compute `hasContent` and `hasThought`; line 1149 branches if either is true.
- Lines 1207–1214 concatenate both `*delta.Content` and `*delta.Thought` into `contentDelta`, which becomes the `OutputTextDelta.Delta`.
- Line 1228 sets `state.TextItemHasContent = true` for either content or thought.
- Lines 1369–1380 emit `delta.Thought` separately as `ResponsesStreamResponseTypeReasoningSummaryTextDelta`.

This creates two key problems:
Double-streaming: Chunks with only `Thought` now emit both an `output_text.delta` and a `reasoning_summary_text.delta`, causing reasoning content to appear in the visible text channel.

Semantic shift: Chunks with both `Content` and `Thought` now concatenate them into a single visible delta, mixing chain-of-thought into user-facing text instead of keeping reasoning separate.

Unguarded change: This logic applies uniformly to all providers (no provider-specific conditionals), making this a behavioral change across the entire stack, not isolated to HuggingFace.
If `Thought` is intended as a distinct reasoning stream (per OpenAI's `reasoning` semantics), this mixes concerns. If it's a fallback for providers that emit primary text as `Thought`, this needs explicit provider-level detection to avoid leaking reasoning into normal output for other providers.

Recommend verifying this is intentional across all providers and that clients consuming only `output_text.delta` are prepared to receive reasoning content.
♻️ Duplicate comments (17)
docs/contributing/adding-a-provider.mdx (1)
510-527: Variable name inconsistency persists from previous review.
The code example declares `hfReq` at line 500 but references `providerReq` throughout lines 511–527. This identical issue was flagged in a previous review. While you clarified that examples are for reference, this particular snippet is labeled "Real Example from `core/providers/huggingface/chat.go`" and should be corrected for accuracy.
Apply this diff to align variable names:
```diff
 // Map parameters
 if bifrostReq.Params != nil {
     params := bifrostReq.Params

     // Map standard parameters
     if params.Temperature != nil {
-        providerReq.Temperature = params.Temperature
+        hfReq.Temperature = params.Temperature
     }
     if params.MaxTokens != nil {
-        providerReq.MaxTokens = params.MaxTokens
+        hfReq.MaxTokens = params.MaxTokens
     }
     // ... other standard parameters

     // Handle provider-specific ExtraParams
     if params.ExtraParams != nil {
         if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-            providerReq.CustomParam = &customParam
+            hfReq.CustomParam = &customParam
         }
     }
 }

-return providerReq
+return hfReq
```
core/internal/testutil/account.go (1)
512-524: Align HuggingFace retry policy with other providers
`MaxRetries` is set to `1` for HuggingFace while essentially all other remote providers use 8–10 retries, and the comment "HuggingFace can be variable" argues for more retries, not fewer. This will make tests unnecessarily flaky on transient errors.
Recommend matching the standard 10-retry policy and updating/removing the comment:
```diff
 case schemas.HuggingFace:
     return &schemas.ProviderConfig{
         NetworkConfig: schemas.NetworkConfig{
             DefaultRequestTimeoutInSeconds: 300,
-            MaxRetries:                     1, // HuggingFace can be variable
+            MaxRetries:                     10, // Align with other variable cloud providers
             RetryBackoffInitial:            2 * time.Second,
             RetryBackoffMax:                30 * time.Second,
         },
```
29-33: SpeechSynthesisModel and TranscriptionModel appear to be swapped.
Kokoro-82Mis a text-to-speech model and should be assigned toSpeechSynthesisModel, whilewhisper-large-v3is a speech-to-text model and should be assigned toTranscriptionModel. TheSpeechSynthesisFallbacksshould also reference the TTS model.- TranscriptionModel: "fal-ai/hexgrad/Kokoro-82M", - SpeechSynthesisModel: "fal-ai/openai/whisper-large-v3", + TranscriptionModel: "fal-ai/openai/whisper-large-v3", + SpeechSynthesisModel: "fal-ai/hexgrad/Kokoro-82M", SpeechSynthesisFallbacks: []schemas.Fallback{ - {Provider: schemas.HuggingFace, Model: "fal-ai/openai/whisper-large-v3"}, + {Provider: schemas.HuggingFace, Model: "fal-ai/hexgrad/Kokoro-82M"}, },core/providers/huggingface/transcription.go (2)
14-16: Incorrect error message references "speech" instead of "transcription".The error message says "speech request input cannot be nil" but this is a transcription request converter.
if request.Input == nil { - return nil, fmt.Errorf("speech request input cannot be nil") + return nil, fmt.Errorf("transcription request input cannot be nil") }
38-63: Type assertions forintwill fail when values come from JSON unmarshalling.When
ExtraParamsis populated from JSON, numeric values are unmarshalled asfloat64, notint. These type assertions will silently fail, causing parameters to be ignored.Consider handling both types. Example fix for one parameter:
- if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok { - genParams.MaxNewTokens = &val + if val, ok := request.Params.ExtraParams["max_new_tokens"].(int); ok { + genParams.MaxNewTokens = &val + } else if val, ok := request.Params.ExtraParams["max_new_tokens"].(float64); ok { + intVal := int(val) + genParams.MaxNewTokens = &intVal }Apply the same pattern to
max_length,min_length,min_new_tokens,num_beams,num_beam_groups, andtop_k.core/providers/huggingface/speech.go (1)
37-62: Type assertions forintmay fail when values come from JSON.Same issue as in transcription.go - when
ExtraParamsis populated from JSON unmarshaling, numeric values are typicallyfloat64, notint. These type assertions will silently fail.Apply the same fix pattern as suggested for transcription.go - handle both
intandfloat64types for all integer parameters.core/providers/huggingface/chat.go (2)
69-81: Potential nil pointer dereference when accessingtc.Function.Name.On line 74,
tc.Function.Nameis dereferenced without checking if it's nil. If a tool call has a nilNamefield, this will cause a panic.for _, tc := range msg.ChatAssistantMessage.ToolCalls { + if tc.Function.Name == nil { + continue // Skip tool calls without a function name + } hfToolCall := HuggingFaceToolCall{ ID: tc.ID, Type: tc.Type,
86-89: Undefined variabledebugwill cause compilation error.The
debugvariable is referenced at lines 86 and 250 but does not appear to be defined in this file. This will cause a compilation failure.#!/bin/bash # Check if debug variable is defined in the huggingface package rg -n "^var debug\b|^const debug\b|debug\s*:=\s*(true|false)" core/providers/huggingface/Also applies to: 250-256
core/providers/huggingface/utils.go (3)
136-162: Empty provider and model names when input has no slashes.When
splitIntoModelProviderreceives a model name with no slashes (t == 0), bothprovandmodelremain empty strings. This results in malformed model identifiers downstream. A user passing"llama-7b"(without organization prefix) would trigger this issue.Consider defaulting to
hf-inferenceand using the original name as the model:+ } else { + // No slashes - default to hf-inference with the full name as model + prov = hfInference + model = bifrostModelName + if debug { + fmt.Printf("[huggingface debug] splitIntoModelProvider (t==0): prov=%s, model=%s\n", prov, model) + } }
180-191: Copy-paste error: Wrong provider names in error messages.
Lines 184 and 190 incorrectly reference "nebius provider" when the actual providers are "sambanova" and "scaleway" respectively.

```diff
 case "sambanova":
     if requestType == schemas.EmbeddingRequest {
         defaultPath = "/sambanova/v1/embeddings"
     } else {
-        return "", fmt.Errorf("nebius provider only supports embedding requests")
+        return "", fmt.Errorf("sambanova provider only supports embedding requests")
     }
 case "scaleway":
     if requestType == schemas.EmbeddingRequest {
         defaultPath = "/scaleway/v1/embeddings"
     } else {
-        return "", fmt.Errorf("nebius provider only supports embedding requests")
+        return "", fmt.Errorf("scaleway provider only supports embedding requests")
     }
```
164-197: Incomplete provider routing: 13 of 19 defined providers will error.
The `getInferenceProviderRouteURL` function only handles 6 providers (`fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, `scaleway`) while `INFERENCE_PROVIDERS` defines 19 total. Providers like `cerebras`, `cohere`, `groq`, `fireworks-ai`, `together`, etc. will hit the default error case.
This affects embedding, speech, and transcription operations for the majority of defined providers.
Either expand the routing logic to handle all providers or document which providers are actually supported for which request types.
core/providers/huggingface/huggingface.go (6)
51-85: Typo: "aquire" should be "acquire".
The pool helper functions are misspelled as `aquireHuggingFaceChatResponse`, `aquireHuggingFaceTranscriptionResponse`, and `aquireHuggingFaceSpeechResponse`. The correct spelling is "acquire". Update the function names and their call sites at lines 348, 790, and 861.
277-281: Bug: Wrong provider constant in ListModels.
Line 279 uses `schemas.Gemini` instead of `schemas.HuggingFace` in the `CheckOperationAllowed` call. This causes incorrect operation permission checks for the HuggingFace provider.
Apply this diff:

```diff
-if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
+if err := providerUtils.CheckOperationAllowed(schemas.HuggingFace, provider.customProviderConfig, schemas.ListModelsRequest); err != nil {
     return nil, err
 }
```
295-301: Bug: Wrong request types in unsupported operation errors.
Both `TextCompletion` (line 296) and `TextCompletionStream` (line 300) return errors using `schemas.EmbeddingRequest` instead of the correct request types. This was previously flagged and marked as addressed, but the issue persists in the current code.
Apply this diff:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
700-707: Bug: Wrong request type in Embedding error messages.
Lines 701 and 706 use `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` when returning unsupported operation errors in the Embedding method.
Apply this diff:

```diff
 if providerMapping == nil {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }
```
773-776: Bug: Wrong task type check in Speech.
Line 774 checks for the `"automatic-speech-recognition"` task, but Speech (text-to-speech) should check for `"text-to-speech"`. The task checks appear to be swapped between the Speech and Transcription functions.
Apply this diff:

```diff
 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
     return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
```
831-838: Bug: Wrong task type check and error messages in Transcription.
Two issues in the Transcription function:

- Lines 832 and 837 use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest` in error messages
- Line 836 checks for the `"text-to-speech"` task, but Transcription should check for `"automatic-speech-recognition"` (which is what Speech incorrectly checks at line 774, suggesting these are swapped)

Apply this diff:

```diff
 if providerMapping == nil {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }

 mapping, ok := providerMapping[inferenceProvider]
-if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "text-to-speech" {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (1)
core/internal/testutil/account.go (1)
530-876: Consider adding a ComprehensiveTestConfig entry for HuggingFace.
`GetConfiguredProviders` and `GetConfigForProvider` now support `schemas.HuggingFace`, but `AllProviderConfigs` doesn't define a `ComprehensiveTestConfig` for it. That means HuggingFace won't participate in these cross-provider end-to-end scenarios.
Recommend adding a HuggingFace entry here (with a representative chat/embedding model) so the new provider is exercised alongside the others.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/chat_completion_stream.go` (1 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types copy.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/features/unified-interface.mdx` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (1 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/bifrost.go
- core/internal/testutil/chat_completion_stream.go
- core/bifrost.go
- core/schemas/account.go
- docs/features/unified-interface.mdx
- ui/lib/constants/config.ts
- .github/workflows/pr-tests.yml
- .github/workflows/release-pipeline.yml
- core/providers/huggingface/types copy.go
- core/providers/huggingface/embedding.go
- core/providers/huggingface/types.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
`transports/config.schema.json`, `core/providers/huggingface/huggingface_test.go`, `core/schemas/mux.go`, `core/internal/testutil/responses_stream.go`, `core/providers/huggingface/transcription.go`, `ui/lib/constants/logs.ts`, `core/providers/huggingface/models.go`, `core/internal/testutil/account.go`, `docs/apis/openapi.json`, `Makefile`, `core/providers/huggingface/speech.go`, `ui/lib/constants/icons.tsx`, `docs/contributing/adding-a-provider.mdx`, `core/providers/huggingface/chat.go`, `core/providers/huggingface/utils.go`, `core/providers/huggingface/huggingface.go`
🧬 Code graph analysis (6)
core/providers/huggingface/huggingface_test.go (4)
core/internal/testutil/setup.go (1)
- `SetupTest` (51-60)

core/internal/testutil/account.go (2)
- `ComprehensiveTestConfig` (47-64)
- `TestScenarios` (22-44)

core/schemas/bifrost.go (2)
- `HuggingFace` (51-51)
- `Fallback` (131-134)

core/internal/testutil/tests.go (1)
- `RunAllComprehensiveTests` (15-62)
core/providers/huggingface/transcription.go (4)
core/schemas/transcriptions.go (2)
- `BifrostTranscriptionRequest` (3-10)
- `BifrostTranscriptionResponse` (16-26)

core/providers/huggingface/types.go (5)
- `HuggingFaceTranscriptionRequest` (328-333)
- `HuggingFaceTranscriptionRequestParameters` (336-339)
- `HuggingFaceTranscriptionGenerationParameters` (342-359)
- `HuggingFaceTranscriptionEarlyStopping` (363-366)
- `HuggingFaceTranscriptionResponse` (399-402)

core/schemas/models.go (1)
- `Model` (109-129)

core/schemas/bifrost.go (2)
- `BifrostResponseExtraFields` (287-296)
- `HuggingFace` (51-51)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
- `HuggingFaceListModelsResponse` (21-23)

core/schemas/bifrost.go (13)
- `ModelProvider` (32-32)
- `RequestType` (86-86)
- `ChatCompletionRequest` (92-92)
- `ChatCompletionStreamRequest` (93-93)
- `TextCompletionRequest` (90-90)
- `TextCompletionStreamRequest` (91-91)
- `ResponsesRequest` (94-94)
- `ResponsesStreamRequest` (95-95)
- `EmbeddingRequest` (96-96)
- `SpeechRequest` (97-97)
- `SpeechStreamRequest` (98-98)
- `TranscriptionRequest` (99-99)
- `TranscriptionStreamRequest` (100-100)

core/schemas/models.go (1)
- `Model` (109-129)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
- `HuggingFace` (51-51)

core/schemas/account.go (1)
- `Key` (8-18)

core/schemas/provider.go (4)
- `ProviderConfig` (234-242)
- `NetworkConfig` (45-53)
- `DefaultRequestTimeoutInSeconds` (15-15)
- `ConcurrencyAndBufferSize` (128-131)

core/internal/testutil/cross_provider_scenarios.go (1)
- `ProviderConfig` (45-53)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
- `BifrostSpeechRequest` (9-16)
- `BifrostSpeechResponse` (22-29)

core/providers/huggingface/types.go (5)
- `HuggingFaceSpeechRequest` (304-310)
- `HuggingFaceSpeechParameters` (313-316)
- `HuggingFaceTranscriptionGenerationParameters` (342-359)
- `HuggingFaceTranscriptionEarlyStopping` (363-366)
- `HuggingFaceSpeechResponse` (319-323)

core/schemas/bifrost.go (2)
- `BifrostResponseExtraFields` (287-296)
- `HuggingFace` (51-51)
core/providers/huggingface/huggingface.go (7)
core/providers/huggingface/types.go (5)
- `HuggingFaceChatResponse` (129-136)
- `HuggingFaceTranscriptionResponse` (399-402)
- `HuggingFaceSpeechResponse` (319-323)
- `HuggingFaceResponseError` (268-272)
- `HuggingFaceHubError` (263-266)

core/providers/utils/utils.go (7)
- `GetRequestPath` (219-239)
- `SetExtraHeaders` (178-208)
- `HandleProviderAPIError` (317-337)
- `CheckAndDecodeBody` (423-431)
- `NewBifrostOperationError` (449-460)
- `HandleProviderResponse` (343-387)
- `ShouldSendBackRawResponse` (482-487)

core/schemas/bifrost.go (4)
- `HuggingFace` (51-51)
- `EmbeddingRequest` (96-96)
- `BifrostStream` (321-328)
- `ChatCompletionRequest` (92-92)

core/providers/huggingface/chat.go (1)
- `ToHuggingFaceChatCompletionRequest` (12-199)

core/schemas/mux.go (1)
- `ChatToResponsesStreamState` (966-982)

core/schemas/responses.go (1)
- `BifrostResponsesResponse` (45-83)

core/providers/huggingface/embedding.go (1)
- `ToHuggingFaceEmbeddingRequest` (10-55)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1469-~1469: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1779-~1779: The official name of this software platform is spelled with a capital “H”.
Context: ...`.yml` - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (18)
core/internal/testutil/responses_stream.go (1)
692-695: Lifecycle stream safety cap increase looks appropriate.
Bumping the lifecycle loop guard to `responseCount > 300` keeps the infinite-loop protection but avoids prematurely truncating providers that emit many small lifecycle events (e.g., more granular Hugging Face streaming), especially given the existing 200s context timeout. This looks like a safe, non-breaking adjustment in the broader streaming test harness.
Makefile (3)
24-29: Color initialization refactor is cleaner and more portable.
Replacing hard-coded ANSI escape sequences with `$(shell printf ...)` improves readability and portability across different shells. This is a good improvement.
15-21: Binary path resolution logic is sound but relies on Go toolchain consistency.
The GOBIN/GOPATH detection and fallback logic correctly identifies where `go install` places binaries. The pattern of preferring an explicit `DEFAULT_GOBIN` and then falling back to `which` is robust for most environments. However, ensure downstream targets (e.g., `dev`, `test`) tolerate scenarios where binaries are missing or not in `PATH`; these targets already call `install-air` and `install-gotestsum`, which is good.
Verify that all targets invoking `$(AIR_BIN)` or `$(GOTESTSUM_BIN)` are guarded by corresponding `install-*` targets or handle missing binaries gracefully.
65-86: install-ui improvements handle the root check and Next.js discovery well.
The root check (lines 67-70) prevents permission issues on developer machines. The multi-stage Next.js discovery (lines 76-85) gracefully handles systems without global npm directories (e.g., Nix). The fallback chain (local → npx → global) is well-reasoned for environment diversity.
transports/config.schema.json (2)
135-140: HuggingFace provider wired into transport config schema correctly.
Using the generic `#/$defs/provider` ref for `"huggingface"` is consistent with other HTTP providers and looks good.
764-784: Semantic cache embedding provider enum correctly extended for HuggingFace.
Including `"huggingface"` in the `semanticcache` plugin `provider` enum keeps the plugin in sync with the newly supported embedding provider.
3239-3258: ModelProvider enum now exposes HuggingFace in OpenAPIAdding
"huggingface"to theModelProviderenum keeps the public OpenAPI schema aligned with the backend provider set; change looks correct.core/internal/testutil/account.go (1)
77-99: HuggingFace correctly added to configured providersIncluding
schemas.HuggingFaceinGetConfiguredProviderskeeps the comprehensive test account aligned with the new provider set.ui/lib/constants/logs.ts (2)
2-20: KnownProvidersNames correctly extended with HuggingFaceAdding
"huggingface"here ensures log views and filters recognize the new provider at the type level.
43-61: ProviderLabels updated for HuggingFace display nameThe
huggingface: "HuggingFace"label keeps the logs UI consistent with the rest of the product’s provider naming.core/providers/huggingface/huggingface_test.go (1)
12-63: Test structure looks good overall.The test follows the established pattern: parallel execution, environment variable check for API key, proper setup/teardown with
defer cancel(), and client shutdown. The comprehensive test config enables a reasonable subset of scenarios for the HuggingFace provider.core/providers/huggingface/models.go (2)
16-44: Model ID conversion logic looks correct.The conversion properly handles nil input, pre-allocates the slice, skips models without IDs or supported methods, and constructs composite IDs with the provider prefix. The use of
model.ModelIDfor the display name andmodel.IDfor the HuggingFace reference appears intentional based on the type definitions.
46-104: Method derivation logic is well-structured.The function correctly uses a set for deduplication, handles both pipeline tags and model tags with appropriate fallbacks, and returns a sorted slice for deterministic output. The broad tag matching (e.g.,
strings.Contains(tagLower, "embedding")) is reasonable for discovery purposes.core/providers/huggingface/transcription.go (1)
100-139: Response conversion implementation is correct.The
ToBifrostTranscriptionResponsemethod properly validates inputs, maps the text response, and safely converts chunks to segments with appropriate bounds checking for timestamps.core/providers/huggingface/speech.go (1)
9-24: Request and response conversion logic is well-implemented.The
ToHuggingFaceSpeechRequestproperly validates input, extracts provider/model, and maps the text input. TheToBifrostSpeechResponsecorrectly handles nil cases and maps the audio data with appropriate extra fields.Also applies to: 94-116
core/providers/huggingface/chat.go (1)
201-315: Response conversion implementations are comprehensive.Both
ToBifrostChatResponseandToBifrostChatStreamResponseproperly handle nil checks, convert choices with messages/deltas, map tool calls, logprobs, and usage information. The streaming response correctly maps theReasoningfield toThought.Also applies to: 317-413
core/providers/huggingface/utils.go (2)
83-128: URL construction for model hub API is well-implemented.The
buildModelHubURLfunction properly handles pagination limits with bounds checking, sets appropriate query parameters, and handles various ExtraParams types with type switching.
217-251: Model inference provider mapping retrieval is correctly implemented.Proper use of fasthttp with deferred resource release, appropriate error handling for API responses, and clean conversion to internal mapping structure.
Force-pushed from f279893 to 4663a80 (compare)
Actionable comments posted: 1
♻️ Duplicate comments (10)
core/providers/huggingface/utils.go (1)
168-201: Only a subset of INFERENCE_PROVIDERS are routable in getInferenceProviderRouteURL.
The switch handles `fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, and `scaleway`, but other entries in `INFERENCE_PROVIDERS` (e.g., `cerebras`, `cohere`, `groq`, `featherless-ai`, `fireworks-ai`, `hyperbolic`, `novita`, `nscale`, `ovhcloud`, `publicai`, `together`, `wavespeed`, `zai-org`) fall through to the default error. That means models mapped to those providers (and surfaced via list-models) will currently be unusable.
Either (a) add routing cases for the remaining providers with correct default paths or (b) remove any providers you don't intend to support yet from `INFERENCE_PROVIDERS` so they don't appear as "supported" in filters and mappings.
259-266: Fix HuggingFace test API key env var name and optionally support HF_TOKEN.
This branch reads `HUGGING_FACE_API_KEY`, but the PR/docs use `HUGGINGFACE_API_KEY` and the provider is supposed to support `HF_TOKEN` as well. As written, CI that exports `HUGGINGFACE_API_KEY`/`HF_TOKEN` will see empty keys in tests.
Consider standardizing on `HUGGINGFACE_API_KEY` with an `HF_TOKEN` fallback:

```diff
-	case schemas.HuggingFace:
-		return []schemas.Key{
-			{
-				Value:  os.Getenv("HUGGING_FACE_API_KEY"),
-				Models: []string{},
-				Weight: 1.0,
-			},
-		}, nil
+	case schemas.HuggingFace:
+		key := os.Getenv("HUGGINGFACE_API_KEY")
+		if key == "" {
+			key = os.Getenv("HF_TOKEN")
+		}
+		return []schemas.Key{
+			{
+				Value:  key,
+				Models: []string{},
+				Weight: 1.0,
+			},
+		}, nil
```

Makefile (3)
21-21: Simplify `GOTESTSUM_BIN` fallback for consistency with `AIR_BIN`.
The `shell which gotestsum 2>/dev/null || echo gotestsum` fallback adds complexity without benefit. For consistency with `AIR_BIN` (line 20), use a simple fallback.
103-104: Replace `which` with `command -v` for proper full-path handling.
The `which $(GOTESTSUM_BIN)` fallback doesn't work correctly with full paths. Use `command -v`, which handles both cases, or align with the `AIR_BIN` pattern using only `[ -x ]`.
97-98: Incorrect use of `which` with full path variables.
At lines 97-98 (and 110-111), `which $$INSTALLED` is used where `$$INSTALLED` is an absolute path. The `which` command searches `PATH` by name, not by checking if a full path exists. Replace with a direct existence check: `[ ! -x "$$INSTALLED" ]`.
core/providers/huggingface/chat.go (2)
93-96: Undefined variable `debug` will cause compilation error.
The variable `debug` is referenced but never defined in this file. This will prevent the code from compiling.
Either define the debug variable or remove the debug logging:

```diff
+// Package-level debug flag (set via build tags or environment)
+var debug = false
+
 func ToHuggingFaceChatCompletionRequest(bifrostReq *schemas.BifrostChatRequest) *HuggingFaceChatRequest {
```

Or remove the debug blocks entirely if not needed:

```diff
-	if debug {
-		fmt.Printf("[huggingface debug] Added tool_call_id=%s to tool message\n", *msg.ChatToolMessage.ToolCallID)
-	}
```
257-263: Same undefined `debug` variable issue.
This is the same compilation error as in the request converter - `debug` is undefined.
core/providers/huggingface/huggingface.go (3)
295-301: Bug: Wrong request type in unsupported operation errors.
Both `TextCompletion` and `TextCompletionStream` use `schemas.EmbeddingRequest` instead of their respective request types. This causes incorrect error messages.
Apply this diff:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
713-716: Bug: Wrong request type in Embedding error message.
Line 715 uses `schemas.SpeechRequest` instead of `schemas.EmbeddingRequest` when the mapping check fails.
Apply this diff:

```diff
 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "feature-extraction" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
 }
```
847-854: Bug: Wrong request type in Transcription error messages.
Lines 848 and 853 use `schemas.SpeechRequest` instead of `schemas.TranscriptionRequest`.
Apply this diff:

```diff
 if providerMapping == nil {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
 mapping, ok := providerMapping[inferenceProvider]
 if !ok || mapping.ProviderModelMapping == "" || mapping.ProviderTask != "automatic-speech-recognition" {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TranscriptionRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (9)
docs/contributing/adding-a-provider.mdx (1)
41-75: Fix hyphenation for compound adjectives.
According to English grammar rules, compound adjectives should be hyphenated when they precede a noun. Update all instances of "OpenAI Compatible" to "OpenAI-compatible" when used as adjectives:
- Line 43: "not OpenAI compatible" → "not OpenAI-compatible"
- Line 71: "#### OpenAI Compatible Providers" → "#### OpenAI-Compatible Providers"
- Line 629: "### OpenAI Compatible Providers" → "### OpenAI-Compatible Providers"
- Line 1475: "### For OpenAI Compatible Providers" → "### For OpenAI-Compatible Providers"
Also update any related section titles and cross-references to maintain consistency throughout the document.
Also applies to: 629-632, 1475-1477
core/providers/huggingface/utils.go (2)
27-82: `inferenceProvider` constants and provider lists look consistent with HF docs.
`INFERENCE_PROVIDERS` and `PROVIDERS_OR_POLICIES` are well-structured; if you want slightly tighter typing, you could append the `auto` constant instead of the raw `"auto"` literal in `PROVIDERS_OR_POLICIES`, but the current form is functionally fine.
257-296: Int extraction helper is flexible; be aware of truncation semantics.
`extractIntFromInterface` handles most numeric variants (including `json.Number`) and falls back cleanly; just note that all float cases (and `json.Number` via `Float64`) are truncated via `int(...)`, which is fine if all upstream numeric fields are expected to be integral.
core/providers/huggingface/huggingface_test.go (1)
24-57: HuggingFace scenarios omit embedding/audio tests despite models being configured.
You've set `EmbeddingModel`, `TranscriptionModel`, and `SpeechSynthesisModel`, but all corresponding scenario flags (`Embedding`, `Transcription`, `TranscriptionStream`, `SpeechSynthesis`, `SpeechSynthesisStream`) are `false`, so `RunAllComprehensiveTests` won't actually exercise those paths. Once you're confident in those flows, consider flipping these booleans to `true` so chat, embeddings, and audio are all covered end-to-end.
core/providers/huggingface/models.go (1)
46-104: deriveSupportedMethods' pipeline/tag heuristics look solid and conservative.
Normalizing the pipeline tag, aggregating methods via a set, and falling back to tags for embeddings, chat/text, TTS, and ASR is a good balance between coverage and precision, and sorting the final method list keeps responses deterministic.
core/providers/huggingface/embedding.go (1)
10-13: Inconsistent nil handling - returning `(nil, nil)` may cause silent failures.
Returning `(nil, nil)` when `bifrostReq` is nil differs from other converters in this PR (e.g., `ToBifrostEmbeddingResponse` returns an error for nil input). This could lead to silent failures where the caller doesn't know if the request was successfully converted or if the input was nil.
Consider returning an explicit error for consistency:

```diff
 func ToHuggingFaceEmbeddingRequest(bifrostReq *schemas.BifrostEmbeddingRequest) (*HuggingFaceEmbeddingRequest, error) {
 	if bifrostReq == nil {
-		return nil, nil
+		return nil, fmt.Errorf("bifrost embedding request is nil")
 	}
```

Alternatively, if nil-in-nil-out is intentional, document this behavior with a comment.
core/providers/huggingface/speech.go (1)
9-12: Inconsistent nil handling pattern.
Similar to `ToHuggingFaceEmbeddingRequest`, returning `(nil, nil)` for nil input may cause silent failures. Consider returning an error or documenting the nil-in-nil-out behavior for consistency across the provider.
core/providers/huggingface/transcription.go (1)
124-129: Potential silent data loss when `Timestamp` has exactly one element.
The condition `len(chunk.Timestamp) >= 2` handles empty and full timestamps, but if a chunk has exactly one timestamp element, both `start` and `end` will be zero. Consider logging or handling this edge case explicitly if it indicates malformed data.
core/providers/huggingface/chat.go (1)
36-38: Silently discarding marshalling errors could hide bugs.
Multiple locations discard `sonic.Marshal` errors with `_`. While this may be acceptable for optional fields, consider logging these errors at debug level to aid troubleshooting.

```diff
-	contentJSON, _ := sonic.Marshal(*msg.Content.ContentStr)
+	contentJSON, err := sonic.Marshal(*msg.Content.ContentStr)
+	if err != nil {
+		// Log error but continue - content will be empty
+		continue
+	}
 	hfMsg.Content = json.RawMessage(contentJSON)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
- .github/workflows/pr-tests.yml (1 hunks)
- .github/workflows/release-pipeline.yml (4 hunks)
- Makefile (8 hunks)
- core/bifrost.go (2 hunks)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/responses_stream.go (1 hunks)
- core/providers/huggingface/chat.go (1 hunks)
- core/providers/huggingface/embedding.go (1 hunks)
- core/providers/huggingface/huggingface.go (1 hunks)
- core/providers/huggingface/huggingface_test.go (1 hunks)
- core/providers/huggingface/models.go (1 hunks)
- core/providers/huggingface/speech.go (1 hunks)
- core/providers/huggingface/transcription.go (1 hunks)
- core/providers/huggingface/types.go (1 hunks)
- core/providers/huggingface/utils.go (1 hunks)
- core/schemas/account.go (2 hunks)
- core/schemas/bifrost.go (3 hunks)
- core/schemas/mux.go (2 hunks)
- docs/apis/openapi.json (1 hunks)
- docs/contributing/adding-a-provider.mdx (1 hunks)
- docs/features/unified-interface.mdx (1 hunks)
- transports/config.schema.json (2 hunks)
- ui/README.md (1 hunks)
- ui/lib/constants/config.ts (2 hunks)
- ui/lib/constants/icons.tsx (1 hunks)
- ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- core/schemas/bifrost.go
- core/bifrost.go
- ui/lib/constants/logs.ts
- transports/config.schema.json
- .github/workflows/pr-tests.yml
- core/schemas/account.go
- core/schemas/mux.go
- .github/workflows/release-pipeline.yml
- ui/lib/constants/config.ts
- core/internal/testutil/responses_stream.go
- docs/features/unified-interface.mdx
- ui/README.md
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- docs/apis/openapi.json
- core/providers/huggingface/embedding.go
- core/internal/testutil/account.go
- core/providers/huggingface/models.go
- Makefile
- core/providers/huggingface/speech.go
- docs/contributing/adding-a-provider.mdx
- core/providers/huggingface/transcription.go
- ui/lib/constants/icons.tsx
- core/providers/huggingface/chat.go
- core/providers/huggingface/utils.go
- core/providers/huggingface/huggingface_test.go
- core/providers/huggingface/huggingface.go
- core/providers/huggingface/types.go
🧬 Code graph analysis (5)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest (9-16), BifrostEmbeddingResponse (22-28), EmbeddingData (118-122), EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest (303-313), HuggingFaceEmbeddingResponse (324-324)
core/schemas/chatcompletions.go (1)
BifrostLLMUsage(640-647)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
HuggingFace (51-51)
core/schemas/account.go (1)
Key (8-18)
core/schemas/provider.go (4)
ProviderConfig (234-242), NetworkConfig (45-53), DefaultRequestTimeoutInSeconds (15-15), ConcurrencyAndBufferSize (128-131)
core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/providers/huggingface/speech.go (4)
core/schemas/speech.go (2)
BifrostSpeechRequest (9-16), BifrostSpeechResponse (22-29)
core/providers/huggingface/types.go (5)
HuggingFaceSpeechRequest (329-335), HuggingFaceSpeechParameters (338-341), HuggingFaceTranscriptionGenerationParameters (367-384), HuggingFaceTranscriptionEarlyStopping (388-391), HuggingFaceSpeechResponse (344-348)
core/schemas/models.go (1)
Model (109-129)
core/schemas/bifrost.go (2)
BifrostResponseExtraFields (287-296), HuggingFace (51-51)
core/providers/huggingface/huggingface.go (4)
core/providers/utils/utils.go (11)
GetProviderName (734-741), GetRequestPath (219-239), SetExtraHeaders (178-208), MakeRequestWithContext (39-93), HandleProviderAPIError (317-337), NewBifrostOperationError (449-460), HandleProviderResponse (343-387), ShouldSendBackRawResponse (482-487), NewUnsupportedOperationError (391-402), ProcessAndSendBifrostError (571-601), GetBifrostResponseForStreamResponse (777-805)
core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest (12-206)
core/providers/huggingface/embedding.go (1)
ToHuggingFaceEmbeddingRequest (10-58)
core/providers/huggingface/transcription.go (1)
ToHuggingFaceTranscriptionRequest(9-101)
core/providers/huggingface/types.go (1)
core/schemas/models.go (1)
Model(109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (31)
docs/apis/openapi.json (1)
3238-3258: ModelProvider enum now correctly exposes HuggingFace (and Cerebras) in the public API.
The added `"cerebras"` and `"huggingface"` enum values align with the new `ModelProvider` constants and make these providers addressable via the OpenAPI schema; no issues from a schema/compat perspective.
core/providers/huggingface/utils.go (3)
83-128: Model hub URL builder is robust and respects pagination and overrides.
The URL construction correctly clamps `limit`, sets sensible defaults (`full`, `sort`, `direction`, `inference_provider`), and then lets `ExtraParams` override as needed; this should work well for listing inference-provider-backed models.
203-255: Provider-mapping fetch and conversion are straightforward and aligned with fasthttp/providerUtils patterns.
The mapping conversion guards against nil, preserves tasks and model mappings, and the HTTP helper correctly acquires/releases fasthttp objects, handles non-200 responses via `HandleProviderAPIError`, and decodes JSON into the mapping type; this looks solid.
137-166: All call sites of `splitIntoModelProvider` properly handle the error return value. Every instance checks `nameErr != nil` and either propagates the error or wraps it in an appropriate error type (`UnsupportedOperationError`). The stricter validation of model name format is safely enforced across the codebase with no gaps in error handling.
core/internal/testutil/account.go (2)
77-99: Including HuggingFace in the configured providers set is correct.
Adding `schemas.HuggingFace` here keeps the test harness aligned with the new provider and ensures it participates in cross-provider setups.
512-524: HuggingFace ProviderConfig defaults look reasonable.
A 300s timeout, 10 retries, and moderate backoff (2s–30s) with standard concurrency/buffer mirror how other "variable" cloud providers are configured; this should be fine as a starting point.
core/providers/huggingface/models.go (1)
16-44: Model listing transformation correctly scopes IDs and filters unsupported models.
`ToBifrostListModelsResponse` sensibly skips models without IDs or derived methods, prefixes IDs with the Bifrost provider key, and stores the raw Hugging Face ID separately; this gives a clean, provider-scoped surface for `/v1/models`.
core/providers/huggingface/embedding.go (1)
60-96: LGTM!
The `ToBifrostEmbeddingResponse` method correctly converts HuggingFace embeddings to Bifrost format, properly handles nil input with an error, and documents that usage information is unavailable from the HuggingFace API.
core/providers/huggingface/speech.go (1)
98-119: LGTM!
The response converter properly validates the model name and correctly notes that HuggingFace TTS doesn't return usage or alignment data.
Makefile (1)
66-70: Good addition of root-user guard for local development.
Preventing `make install-ui` from running as root on developer machines avoids common permission issues with npm global installs.
core/providers/huggingface/transcription.go (1)
38-82: LGTM!
The integer parameter extraction correctly uses `extractIntFromInterface` to handle both `int` and `float64` types from JSON unmarshalling, addressing the concern from previous reviews.
core/providers/huggingface/chat.go (2)
69-76: LGTM - nil pointer dereference fix is correctly implemented.
The code now safely handles a nil `tc.Function.Name` by using a default empty string, preventing potential panics.
324-420: LGTM!
The streaming response converter correctly handles delta fields, tool calls, logprobs, and usage conversion. The nil handling returning plain `nil` (without error) is appropriate for streaming contexts.
core/providers/huggingface/huggingface.go (10)
1-31: LGTM: Package setup and provider struct are well-structured.
The debug toggle via environment variable and the provider struct with proper configuration fields follow established patterns from other providers.
33-85: LGTM: Object pooling implementation is correct.
The acquire/release pattern with struct reset ensures clean state reuse. The nil checks in release functions prevent panics.
87-120: LGTM: Provider constructor follows established patterns.
Pre-warming response pools and proper configuration handling align with other provider implementations.
122-130: LGTM: Helper methods correctly delegate to utility functions.
132-205: LGTM: HTTP request handling is robust.
The response body copy at line 192 correctly prevents use-after-free when fasthttp releases its internal buffer. Error response parsing properly extracts HuggingFace-specific error details.
207-293: LGTM: Model listing implementation handles both keyed and keyless modes correctly.
The operation check at line 279 correctly uses `schemas.HuggingFace` (the copy-paste issue from Gemini was fixed).
303-375: LGTM: ChatCompletion implementation is well-structured.
The model name splitting and reconstruction pattern correctly handles HuggingFace's `modelName:inferenceProvider` format. Response conversion and extra fields population follow established patterns.
377-659: LGTM: Streaming implementation is comprehensive.
The SSE parsing, context cancellation handling, and the workaround for combined usage+content chunks (lines 604-645) are well-documented and correctly implemented. Resource cleanup via defers ensures proper release of fasthttp resources.
661-683: LGTM: Responses API correctly adapts ChatCompletion endpoints.
The fallback pattern via context flag enables code reuse while maintaining proper response type conversion.
756-830: LGTM: Speech implementation correctly validates task type.
The task check at line 787 correctly validates `"text-to-speech"` for Speech operations (the swapped task check from past review was fixed).
core/providers/huggingface/types.go (8)
1-48: LGTM: Model response unmarshaling correctly handles both API formats.
The custom `UnmarshalJSON` at lines 30-48 properly handles both the top-level array format (current API) and the object format with a `models` field (backward compatibility). This addresses the past review concern.
50-68: LGTM: Inference provider mapping types are well-structured.
The separation between API response types (`HuggingFaceInferenceProviderInfo`) and internal types (`HuggingFaceInferenceProviderMapping`) provides clean abstraction.
69-213: LGTM: Chat types comprehensively cover OpenAI-compatible format.
The flexible use of `json.RawMessage` for content and tool_choice allows handling various input formats. Response types include full logprobs and usage support.
214-285: LGTM: Streaming types correctly model delta structures.
The streaming response types properly handle incremental content delivery with optional usage and timing information.
286-298: LGTM: Error types distinguish Hub and inference API responses.
299-324: LGTM: Embedding types support flexible input formats.
The `interface{}` type for `Inputs` correctly handles both single string and string array inputs required by the feature extraction API.
326-349: LGTM: Speech types appropriately model text-to-speech API.
350-436: LGTM: Transcription types handle complex union types correctly.
The custom `MarshalJSON`/`UnmarshalJSON` for `HuggingFaceTranscriptionEarlyStopping` properly handles the boolean or string ("never") union type. The type aliases at lines 435-436 provide convenient access.
Actionable comments posted: 1
♻️ Duplicate comments (4)
Makefile (3)
97-98: Simplify binary availability check: `which` does not work with full paths.
The condition `if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then` contains a redundant and confusing `which` call. Since `$$INSTALLED` is an absolute path (e.g., `/home/user/go/bin/air`), `which` will always fail, making the `&&` logic unintuitive. Simply check if the file is executable.

```diff
-	if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
-		echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \
+	if [ ! -x "$$INSTALLED" ]; then \
+		echo "$(YELLOW)Note: the installed air binary may not be on your PATH...$(NC)"; \
```
110-111: Simplify binary availability check: same issue as install-air (line 97).
Apply the same simplification to `install-gotestsum`: remove the redundant `which` call and rely solely on the `[ -x ]` check.
103-103: Replace `which` with a more portable check or simplify to match the AIR_BIN pattern.
Line 103 uses `which $(GOTESTSUM_BIN)` as a fallback after the `[ -x ]` check. Since `GOTESTSUM_BIN` may be a full path, `which` is unreliable. Either replace with `command -v` (which works for both names and full paths) or simplify to just `[ -x ]` for consistency with the `AIR_BIN` pattern on line 90.
Option 1 (recommended): Simplify to match the `AIR_BIN` pattern:

```diff
-	@if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+	@if [ -x "$(GOTESTSUM_BIN)" ]; then \
```

Option 2: Use `command -v` for portability:

```diff
-	@if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+	@if [ -x "$(GOTESTSUM_BIN)" ] || command -v $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
```

core/providers/huggingface/huggingface.go (1)
295-301: Fix request type in unsupported TextCompletion operations.
Both `TextCompletion` and `TextCompletionStream` return an unsupported-operation error but incorrectly tag it as `schemas.EmbeddingRequest` instead of the appropriate text-completion request types. That makes error classification and telemetry misleading.
Consider updating to:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (2)
core/providers/huggingface/types.go (1)
405-421: Consider returning an error for invalid EarlyStopping values.
The `UnmarshalJSON` method returns `nil` (line 420) when the value is neither a boolean nor a string. This silently ignores invalid values. Consider returning an error for non-null values that don't match expected types to catch malformed API responses.
Apply this diff to add error handling:

```diff
 func (e *HuggingFaceTranscriptionEarlyStopping) UnmarshalJSON(data []byte) error {
+	// Handle null explicitly
+	if string(data) == "null" {
+		return nil
+	}
+
 	// Try boolean first
 	var boolVal bool
 	if err := json.Unmarshal(data, &boolVal); err == nil {
 		e.BoolValue = &boolVal
 		return nil
 	}

 	// Try string
 	var stringVal string
 	if err := json.Unmarshal(data, &stringVal); err == nil {
 		e.StringValue = &stringVal
 		return nil
 	}

-	return nil
+	return fmt.Errorf("early_stopping must be a boolean or string, got: %s", string(data))
 }
```

core/providers/huggingface/huggingface.go (1)
685-754: Embedding implementation matches the expected routing and extra-field semantics.
The Embedding path:
- Checks operation permissions for HuggingFace/Embedding.
- Builds the HF request via `ToHuggingFaceEmbeddingRequest`.
- Splits the model into `inferenceProvider` and `modelName`, then resolves a `getModelInferenceProviderMapping` entry and validates `ProviderTask == "feature-extraction"`.
- Uses the provider-specific model id and `getInferenceProviderRouteURL` to derive the target URL, then executes `completeRequest`.
- Converts `HuggingFaceEmbeddingResponse` into a `BifrostEmbeddingResponse` and fills `ExtraFields` (provider, model requested, request type, latency, raw response).
This is correct and consistent with how embeddings are handled for other providers.
As a non-blocking improvement, the repeated pattern of splitting the model, looking up `getModelInferenceProviderMapping`, validating `ProviderTask`, and deriving the route URL (here and in `Speech`/`Transcription`) could be factored into a small helper to avoid drift if the mapping rules ever change.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
- .github/workflows/pr-tests.yml (1 hunks)
- .github/workflows/release-pipeline.yml (4 hunks)
- Makefile (8 hunks)
- core/bifrost.go (2 hunks)
- core/internal/testutil/account.go (3 hunks)
- core/internal/testutil/responses_stream.go (1 hunks)
- core/providers/huggingface/chat.go (1 hunks)
- core/providers/huggingface/embedding.go (1 hunks)
- core/providers/huggingface/huggingface.go (1 hunks)
- core/providers/huggingface/huggingface_test.go (1 hunks)
- core/providers/huggingface/models.go (1 hunks)
- core/providers/huggingface/speech.go (1 hunks)
- core/providers/huggingface/transcription.go (1 hunks)
- core/providers/huggingface/types.go (1 hunks)
- core/providers/huggingface/utils.go (1 hunks)
- core/schemas/account.go (2 hunks)
- core/schemas/bifrost.go (3 hunks)
- core/schemas/mux.go (2 hunks)
- docs/apis/openapi.json (1 hunks)
- docs/contributing/adding-a-provider.mdx (1 hunks)
- docs/features/unified-interface.mdx (1 hunks)
- transports/config.schema.json (2 hunks)
- ui/README.md (1 hunks)
- ui/lib/constants/config.ts (2 hunks)
- ui/lib/constants/icons.tsx (1 hunks)
- ui/lib/constants/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/mux.go
- .github/workflows/release-pipeline.yml
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- core/schemas/account.go
- transports/config.schema.json
- ui/lib/constants/logs.ts
- docs/features/unified-interface.mdx
- core/internal/testutil/responses_stream.go
- core/schemas/bifrost.go
- core/internal/testutil/account.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- core/bifrost.go
- ui/lib/constants/config.ts
- core/providers/huggingface/embedding.go
- docs/contributing/adding-a-provider.mdx
- Makefile
- core/providers/huggingface/chat.go
- core/providers/huggingface/models.go
- core/providers/huggingface/utils.go
- core/providers/huggingface/transcription.go
- ui/lib/constants/icons.tsx
- core/providers/huggingface/huggingface.go
- core/providers/huggingface/huggingface_test.go
- core/providers/huggingface/speech.go
- core/providers/huggingface/types.go
🧬 Code graph analysis (9)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(88-120)
core/providers/huggingface/embedding.go (2)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest (9-16), BifrostEmbeddingResponse (22-28), EmbeddingData (118-122), EmbeddingStruct (124-128)
core/providers/huggingface/types.go (2)
HuggingFaceEmbeddingRequest (303-313), HuggingFaceEmbeddingResponse (324-324)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (2)
BifrostChatRequest (12-19), LogProb (625-629)
core/providers/huggingface/types.go (12)
HuggingFaceChatRequest (72-92), HuggingFaceChatMessage (94-102), HuggingFaceContentItem (105-109), HuggingFaceImageRef (111-113), HuggingFaceToolCall (115-119), HuggingFaceFunction (121-125), HuggingFaceResponseFormat (127-130), HuggingFaceStreamOptions (139-141), HuggingFaceTool (143-146), HuggingFaceToolFunction (148-152), HuggingFaceChatResponse (154-161), HuggingFaceChatStreamResponse (215-224)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse (24-26)
core/schemas/bifrost.go (13)
ModelProvider (32-32), RequestType (86-86), ChatCompletionRequest (92-92), ChatCompletionStreamRequest (93-93), TextCompletionRequest (90-90), TextCompletionStreamRequest (91-91), ResponsesRequest (94-94), ResponsesStreamRequest (95-95), EmbeddingRequest (96-96), SpeechRequest (97-97), SpeechStreamRequest (98-98), TranscriptionRequest (99-99), TranscriptionStreamRequest (100-100)
core/schemas/models.go (1)
Model(109-129)
core/providers/huggingface/utils.go (2)
core/providers/huggingface/huggingface.go (1)
HuggingFaceProvider (25-31)
core/providers/utils/utils.go (5)
GetRequestPath (219-239), MakeRequestWithContext (39-93), HandleProviderAPIError (317-337), CheckAndDecodeBody (423-431), NewBifrostOperationError (449-460)
core/providers/huggingface/transcription.go (3)
core/schemas/transcriptions.go (2)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)core/providers/huggingface/types.go (4)
HuggingFaceTranscriptionRequest(353-358)HuggingFaceTranscriptionRequestParameters(361-364)HuggingFaceTranscriptionGenerationParameters(367-384)HuggingFaceTranscriptionEarlyStopping(388-391)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (5)
core/providers/huggingface/types.go (4)
HuggingFaceChatResponse(154-161)HuggingFaceResponseError(293-297)HuggingFaceHubError(288-291)HuggingFaceEmbeddingResponse(324-324)core/providers/utils/utils.go (5)
SetExtraHeaders(178-208)MakeRequestWithContext(39-93)HandleProviderAPIError(317-337)CheckAndDecodeBody(423-431)NewBifrostOperationError(449-460)core/schemas/bifrost.go (15)
ModelProvider(32-32)HuggingFace(51-51)RequestType(86-86)BifrostError(356-365)ErrorField(374-381)ListModelsRequest(89-89)EmbeddingRequest(96-96)ChatCompletionRequest(92-92)ChatCompletionStreamRequest(93-93)ResponsesStreamRequest(95-95)ResponsesRequest(94-94)SpeechRequest(97-97)SpeechStreamRequest(98-98)TranscriptionRequest(99-99)TranscriptionStreamRequest(100-100)core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest(12-207)core/providers/huggingface/embedding.go (1)
ToHuggingFaceEmbeddingRequest(10-58)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
SetupTest(51-60)core/internal/testutil/account.go (2)
ComprehensiveTestConfig(47-64)TestScenarios(22-44)core/schemas/bifrost.go (2)
HuggingFace(51-51)Fallback(131-134)core/schemas/models.go (1)
Model(109-129)core/internal/testutil/tests.go (1)
RunAllComprehensiveTests(15-62)
core/providers/huggingface/speech.go (3)
core/schemas/speech.go (2)
BifrostSpeechRequest(9-16)BifrostSpeechResponse(22-29)core/providers/huggingface/types.go (4)
HuggingFaceSpeechRequest(329-335)HuggingFaceSpeechParameters(338-341)HuggingFaceTranscriptionGenerationParameters(367-384)HuggingFaceTranscriptionEarlyStopping(388-391)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(287-296)HuggingFace(51-51)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (34)
Makefile (5)
15-21: ✓ Well-structured tool path resolution with sensible fallbacks.

The new `DEFAULT_GOBIN` variable elegantly prioritizes `GOBIN` → `GOPATH/bin` → empty, and `AIR_BIN` and `GOTESTSUM_BIN` consistently prefer local paths. This avoids hardcoding tool expectations and gracefully handles systems with custom Go installations (e.g., Nix).
24-29: ✓ Dynamic ANSI color codes improve portability and maintainability.

Using shell `printf` instead of literal escape sequences is more portable and easier to read. This pattern is cleaner than hardcoded escape codes.
65-70: ✓ Root check and multi-fallback Next.js detection enhance developer experience.

Lines 67-70 prevent root-user npm installs on local dev machines, which avoids permission issues on NixOS and similar systems. Lines 76-85 provide three tiers of fallback (local node_modules, npx, global), making the install flow robust across different environments.
Also applies to: 75-85
155-155: ✓ Consistent use of $(AIR_BIN) variable throughout dev target.

Both debug and non-debug paths now use the centralized `$(AIR_BIN)` variable, ensuring the tool path is resolved consistently.

Also applies to: 163-163
349-349: ✓ $(GOTESTSUM_BIN) consistently used across all test targets.

All test invocations now reference `$(GOTESTSUM_BIN)` instead of the bare `gotestsum` command, ensuring they use the resolved tool path. This improves reliability on systems where gotestsum is installed in non-standard locations.

Also applies to: 401-401, 425-425, 454-454, 548-548
core/bifrost.go (2)
26-26: LGTM!

The import statement is correctly placed in alphabetical order and follows Go conventions.

1327-1328: LGTM!

The HuggingFace provider is correctly wired into the factory switch with the proper constructor call and return pattern, consistent with other providers.
core/providers/huggingface/types.go (6)
11-48: LGTM!

The custom `UnmarshalJSON` implementation properly handles both the array and object response formats from the HuggingFace API, addressing the earlier review concern.
90-92: Clarify the purpose of the Extra field.

The `Extra` field is tagged with `json:"-"`, which means it won't be marshaled or unmarshaled. If the intent is to capture unknown additional fields from the API response, you'd need custom `UnmarshalJSON` logic. If it's for application-level metadata only (not from the API), the current approach is fine.
288-297: LGTM!

The error types properly represent different HuggingFace API error response formats.

303-324: LGTM!

The embedding types properly handle flexible input formats, and the response structure matches HuggingFace's feature extraction API output.
329-348: LGTM!

The speech types follow a consistent pattern with the chat types. Note that the `Extra` fields with `json:"-"` tags won't capture unknown API fields (the same consideration as the chat request's `Extra` field).
435-436: LGTM!

The type aliases provide convenient alternative names for shared generation parameters, improving code readability.
ui/lib/constants/config.ts (2)
14-14: LGTM!

The HuggingFace model placeholder examples demonstrate both chat and embedding models with proper HuggingFace Hub naming conventions.

34-34: LGTM!

Correctly marks HuggingFace as requiring an API key, consistent with other cloud-based inference providers.
core/providers/huggingface/huggingface_test.go (1)
12-63: LGTM! Test configuration is well-structured.

The test setup correctly:
- Checks for API key before running
- Uses parallel execution
- Configures comprehensive scenarios
- Enables appropriate features (chat, streaming, tool calls, vision)
- Disables unsupported features (text completion, embedding, speech, transcription)
The model assignments appear correct and the disabled scenarios align with the provider's current capabilities.
core/providers/huggingface/models.go (1)
16-104: LGTM! Model listing conversion is well-implemented.

The code demonstrates good practices:
- Proper nil handling and input validation
- Pre-allocated slices for performance
- Skip logic for invalid models (empty ID or no supported methods)
- Comprehensive method derivation from both pipeline tags and model tags
- Deduplication using a map-based approach
- Sorted output for consistency
The logic correctly maps HuggingFace model metadata to Bifrost's schema.
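The map-based deduplication and sorted-output pattern praised above can be sketched like this; the tag-to-method table here is an illustrative subset, not the actual mapping in `models.go`:

```go
package main

import (
	"fmt"
	"sort"
)

// deriveMethods collects supported methods from a pipeline tag plus model tags.
// Tags from the two sources may repeat, so a set (map) deduplicates them,
// and the result is sorted for deterministic output.
func deriveMethods(pipelineTag string, tags []string) []string {
	set := map[string]struct{}{}

	// mapTag is an illustrative tag-to-method mapping.
	mapTag := func(t string) {
		switch t {
		case "text-generation", "conversational":
			set["chat.completions"] = struct{}{}
		case "feature-extraction", "sentence-similarity":
			set["embeddings"] = struct{}{}
		case "automatic-speech-recognition":
			set["audio.transcriptions"] = struct{}{}
		case "text-to-speech":
			set["audio.speech"] = struct{}{}
		}
	}
	mapTag(pipelineTag)
	for _, t := range tags {
		mapTag(t)
	}

	out := make([]string, 0, len(set)) // pre-allocated, as the review notes
	for m := range set {
		out = append(out, m)
	}
	sort.Strings(out) // stable ordering across calls
	return out
}

func main() {
	fmt.Println(deriveMethods("text-generation", []string{"conversational", "feature-extraction"}))
}
```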
core/providers/huggingface/embedding.go (1)
10-96: LGTM! Embedding conversion logic is solid.

The implementation correctly handles:
- Nil input validation
- Model/provider extraction via `splitIntoModelProvider`
- Mapping single text vs. array of texts
- Provider-specific parameters from `ExtraParams`
- Embedding array construction with proper indexing
- Missing usage information (documented that HF doesn't provide it)
The converters follow the established pattern and handle both request and response transformations cleanly.
core/providers/huggingface/transcription.go (1)
9-142: LGTM! Transcription converters handle complex parameter mapping well.

The implementation demonstrates:
- Proper input validation with clear error messages
- Correct use of `extractIntFromInterface` to handle JSON numeric types (addressing past review concerns)
- Comprehensive generation parameter mapping (do_sample, max_new_tokens, temperature, etc.)
- Proper handling of the polymorphic `early_stopping` field (bool or string)
- Segment conversion with timestamp preservation
- Clean separation of concerns
Previous issues with error messages and integer type assertions have been addressed.
docs/contributing/adding-a-provider.mdx (1)
1-2070: Documentation is comprehensive and well-structured.

The guide provides excellent coverage of:
- Clear distinction between OpenAI-compatible and custom API providers
- Step-by-step implementation phases with proper ordering
- File structure and naming conventions
- Type definitions and validation requirements
- Converter patterns with real examples
- Test setup and CI/CD integration
- UI and configuration updates
The reference code examples serve their purpose of illustrating patterns without needing to be complete implementations (as clarified by the author in past reviews).
core/providers/huggingface/speech.go (1)
9-120: LGTM! Speech synthesis converters are well-implemented.

The code follows the same robust patterns as transcription:
- Input validation with clear error messages
- Proper use of `extractIntFromInterface` for JSON numeric handling
- Comprehensive generation parameter mapping
- Handling of the polymorphic `early_stopping` field
- Clear documentation that HF TTS doesn't provide usage/alignment data
- Clean response construction with proper metadata
Previous type assertion issues have been resolved.
core/providers/huggingface/chat.go (1)
12-421: LGTM! Chat converters handle complex transformations correctly.

The implementation demonstrates excellent handling of:
- Nil pointer protection for tool calls (fixed from past review)
- Message content conversion (string vs. structured blocks)
- Image URL handling for vision models
- Tool call conversion with proper nil checks
- Response format and stream options
- Debug logging (variable properly defined in package scope)
- Logprobs and top_logprobs conversion
- Streaming delta transformation
The code properly handles both non-streaming and streaming responses with appropriate type conversions.
core/providers/huggingface/utils.go (3)
137-166: Good fix for model name validation.

The `splitIntoModelProvider` function now properly handles invalid input (no slashes) by returning an error instead of producing empty strings. The debug logging helps trace the parsing logic, and the distinction between single-slash (org/model) and multi-slash (provider/org/model) formats is clear.
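The single-slash vs. multi-slash parsing rule described here can be sketched as below. The function name mirrors the one under review, but the body and the `"hf-inference"` default backend are assumptions for illustration, not the actual Bifrost implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// splitIntoModelProvider parses "org/model" (default backend) and
// "provider/org/model" (explicit inference provider), and rejects
// names with no slash instead of returning empty strings.
func splitIntoModelProvider(model string) (provider, hubModel string, err error) {
	parts := strings.Split(model, "/")
	switch {
	case len(parts) < 2 || parts[0] == "" || parts[1] == "":
		return "", "", fmt.Errorf("invalid model %q: expected org/model or provider/org/model", model)
	case len(parts) == 2:
		// Single slash: plain hub model, routed to the default backend.
		return "hf-inference", model, nil
	default:
		// Multi slash: the first segment names the inference provider.
		return parts[0], strings.Join(parts[1:], "/"), nil
	}
}

func main() {
	p, m, _ := splitIntoModelProvider("sambanova/meta-llama/Llama-3.1-8B-Instruct")
	fmt.Println(p, m)
}
```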
168-201: Provider routing correctly limited to supported operations.

As clarified in past reviews, only 6 providers (fal-ai, hf-inference, nebius, replicate, sambanova, scaleway) support embedding/speech/transcription operations. The other 13 providers in `INFERENCE_PROVIDERS` are used for chat/text-generation, which follows a different routing pattern. The error messages have been corrected to reference the appropriate provider names.
257-296: Excellent utility function for JSON numeric handling.

The `extractIntFromInterface` helper comprehensively handles all numeric types that can result from JSON unmarshaling:

- All signed and unsigned integer types
- Float types (with conversion)
- `json.Number`, with fallback parsing

This is used throughout the provider to safely extract integer parameters from `ExtraParams`, addressing the type assertion issues flagged in earlier reviews.

core/providers/huggingface/huggingface.go (9)
33-85: Response pooling helpers look correct and safe to reuse.

The sync.Pool setup and acquire/release helpers for `HuggingFaceChatResponse`, `HuggingFaceTranscriptionResponse`, and `HuggingFaceSpeechResponse` correctly reset structs on acquire and only reuse them after the call site is done, which avoids stale state and minimizes allocations. No changes needed here.
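The reset-on-acquire pooling pattern endorsed here can be sketched as follows; `chatResponse` is an illustrative stand-in for the pooled provider response types:

```go
package main

import (
	"fmt"
	"sync"
)

// chatResponse stands in for a pooled response struct.
type chatResponse struct {
	ID      string
	Choices []string
}

var chatResponsePool = sync.Pool{
	New: func() interface{} { return &chatResponse{} },
}

// acquireChatResponse zeroes the struct before handing it out, so a
// reused object never leaks fields from a previous request.
func acquireChatResponse() *chatResponse {
	r := chatResponsePool.Get().(*chatResponse)
	*r = chatResponse{} // reset all fields
	return r
}

// releaseChatResponse returns the object to the pool; the caller must
// not touch it afterwards.
func releaseChatResponse(r *chatResponse) {
	chatResponsePool.Put(r)
}

func main() {
	r := acquireChatResponse()
	r.ID = "req-1"
	releaseChatResponse(r)

	r2 := acquireChatResponse() // may reuse the same allocation, but always zeroed
	fmt.Println(r2.ID == "")
}
```

Resetting on acquire (rather than on release) keeps the invariant in one place: every object handed out is guaranteed clean no matter how it was returned.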
87-131: Provider initialization and URL handling are consistent with other providers.

`NewHuggingFaceProvider`, `GetProviderKey`, and `buildRequestURL` follow the existing provider patterns: they apply defaults, configure the fasthttp client (including proxy), normalize `BaseURL`, honor `CustomProviderConfig`, and respect per-request path overrides. This looks good and matches the broader design.
132-205: Core HTTP request path is robust and correctly decouples the response body.

`completeRequest` cleanly centralizes request construction, extra-header injection, auth, context-aware execution, non-200 handling via `HandleProviderAPIError`, gzip-aware decoding via `CheckAndDecodeBody`, and copies the body before releasing the fasthttp response to avoid use-after-free. The debug logging is also appropriately gated. No functional issues with this implementation.
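The "copy the body before releasing the response" guard called out above matters because a pooled response's body slice points into a buffer that gets recycled on release. A minimal illustration of the hazard and the fix, independent of fasthttp itself:

```go
package main

import "fmt"

// copyBody makes an independent copy of a body slice whose backing
// array may be recycled once the pooled response is released.
func copyBody(body []byte) []byte {
	return append([]byte(nil), body...) // fresh backing array
}

func main() {
	pooled := []byte(`{"ok":true}`) // stands in for resp.Body()
	safe := copyBody(pooled)

	// Simulate the pool recycling the buffer after release.
	for i := range pooled {
		pooled[i] = 0
	}

	fmt.Println(string(safe)) // {"ok":true}
}
```

Without the copy, `safe` would alias the recycled buffer and silently read corrupted bytes, which is exactly the use-after-free class of bug the review confirms is avoided.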
207-293: Model listing logic is aligned with Bifrost patterns.

`listModelsByKey` and `ListModels` correctly apply operation-allowed checks, build the model hub URL, attach auth and extra headers, handle HTTP and provider-level errors, decode via `HuggingFaceListModelsResponse`, and then delegate fan-out across keys via `HandleMultipleListModelsRequests`, including latency and optional raw-response propagation. This looks consistent and complete.
303-375: ChatCompletion request/response wiring looks solid.The ChatCompletion path correctly:
- Checks operation permissions.
- Normalizes the model via `splitIntoModelProvider` before building the HF request.
- Uses `CheckContextAndGetRequestBody` with `ToHuggingFaceChatCompletionRequest` and explicitly disables streaming.
- Builds the URL with `buildRequestURL`, calls `completeRequest`, and uses the pooled `HuggingFaceChatResponse` for JSON decoding.
- Converts to `BifrostChatResponse` and fills `ExtraFields` (provider, requested model, request type, latency, raw response when enabled).

This end-to-end flow is cohesive and matches the intended Bifrost provider contract.
377-659: Streaming chat implementation handles SSE and Responses fallback correctly.

The `ChatCompletionStream` implementation is careful and comprehensive: it validates operation permissions, handles the Responses→Chat fallback via a stream state object, builds a streaming POST with `Accept: text/event-stream`, and uses `bufio.Scanner` on `BodyStream()` with an increased buffer. The loop:

- Filters comments, empty lines, and `[DONE]`.
- Parses SSE `data:` lines, detects error payloads via `HuggingFaceResponseError`, and sends a structured `BifrostError`.
- Decodes normal chunks into `HuggingFaceChatStreamResponse`, converts to Bifrost stream responses, populates per-chunk metadata (including latency and chunk index), and optionally attaches raw JSON.
- Handles the combined usage+content case for the Responses fallback by splitting content/usage into separate events.
- Propagates scanner errors via `ProcessAndSendError` and ensures resources/channels are released.

This is a solid streaming implementation; no blocking issues identified.
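The SSE scan loop described in the streaming review can be sketched as below; the helper name and the exact prefix handling are illustrative assumptions, not the Bifrost code:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSSE reads an event stream line by line, skips blank lines and
// comments, strips the "data: " prefix, and stops at the [DONE] sentinel.
// Each returned payload would normally be JSON-decoded into a stream chunk.
func parseSSE(body string) []string {
	scanner := bufio.NewScanner(strings.NewReader(body))
	// Enlarge the buffer so long chunks don't hit bufio.ErrTooLong.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

	var payloads []string
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, ":") { // blank line or SSE comment
			continue
		}
		data, ok := strings.CutPrefix(line, "data: ")
		if !ok {
			continue // ignore non-data fields in this sketch
		}
		if data == "[DONE]" {
			break
		}
		payloads = append(payloads, data)
	}
	return payloads
}

func main() {
	stream := ": keep-alive\n\ndata: {\"delta\":\"hi\"}\n\ndata: [DONE]\n"
	fmt.Println(parseSSE(stream))
}
```

Growing the scanner buffer up front is the same precaution the review notes: default `bufio.Scanner` limits are too small for large streamed chunks.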
661-683: Responses and ResponsesStream wrappers are straightforward and correct.

These methods reuse the ChatCompletion/ChatCompletionStream implementations and only adapt to the Responses API shape, setting `ExtraFields.RequestType`, `Provider`, and `ModelRequested` appropriately. This is a good reuse of existing logic and behaves as expected.
756-830: Speech and SpeechStream behaviors are coherent with provider capabilities.

`Speech` correctly reuses the model-mapping pattern (including `ProviderTask == "text-to-speech"`), builds the HF-specific speech request via `ToHuggingFaceSpeechRequest`, executes `completeRequest`, and converts the pooled `HuggingFaceSpeechResponse` into a `BifrostSpeechResponse` with fully populated `ExtraFields` (including latency and optional raw response). `SpeechStream` explicitly returns an unsupported-operation error tagged with `schemas.SpeechStreamRequest`, which is appropriate if streaming speech isn't supported yet.
832-906: Transcription and TranscriptionStream correctly target ASR and error types.

`Transcription`:

- Checks operation permissions for `schemas.TranscriptionRequest`.
- Splits model/provider and resolves a mapping where `ProviderTask == "automatic-speech-recognition"`.
- Builds the body via `ToHuggingFaceTranscriptionRequest`, derives the inference route URL with `getInferenceProviderRouteURL`, and calls `completeRequest`.
- Uses the pooled `HuggingFaceTranscriptionResponse` to decode and convert to `BifrostTranscriptionResponse`, filling provider/model/request-type/latency/RawResponse metadata.

`TranscriptionStream` cleanly returns an unsupported-operation error with the correct `schemas.TranscriptionStreamRequest` tag. This aligns with the intended ASR behavior and prior fixes to the task and error types.
4663a80 to 53cb91d
Actionable comments posted: 0
♻️ Duplicate comments (4)
Makefile (3)
97-98: DUPLICATE COMMENT: `which` with full-path variable—still present despite prior review.

This pattern was flagged in a previous review (commit f279893) but remains in the code. While the AND logic with `[ ! -x "$$INSTALLED" ]` does provide fallback protection, the condition `! which $$INSTALLED` will always be true for full paths (since `which` cannot resolve absolute paths). Simplify by removing the redundant `which` check:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```
103-104: DUPLICATE COMMENT: `which` fallback for full-path variable—unclear intent and inconsistent with the AIR_BIN pattern.

This pattern was flagged in a previous review (commit f279893). Although the `[ -x ]` check comes first and will handle full paths correctly (short-circuiting the `which` fallback), the fallback itself is ineffective for full paths and inconsistent with the simpler `[ -x "$(AIR_BIN)" ]` check used for AIR_BIN on line 90. Consider aligning both patterns:

```diff
- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \
```
110-111: DUPLICATE COMMENT: `which` with full-path variable—same issue as lines 97-98.

This pattern was flagged in a previous review (commit f279893) but persists here. The AND logic provides practical protection, but the `which` check is redundant for full paths. Apply the same simplification as line 97:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```

core/providers/huggingface/huggingface.go (1)
381-387: Bug: Incorrect request type in unsupported operation errors.

Both `TextCompletion` and `TextCompletionStream` return errors with `schemas.EmbeddingRequest` instead of the correct request types. Apply this diff to fix the error types:

```diff
 func (provider *HuggingFaceProvider) TextCompletion(ctx context.Context, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (*schemas.BifrostTextCompletionResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionRequest, provider.GetProviderKey())
 }

 func (provider *HuggingFaceProvider) TextCompletionStream(ctx context.Context, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostTextCompletionRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.EmbeddingRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.TextCompletionStreamRequest, provider.GetProviderKey())
 }
```
🧹 Nitpick comments (3)
Makefile (1)
75-85: Next.js fallback chain is reasonable but could be simplified.

The logic prefers local `./node_modules/.bin/next`, then `npx`, then a global install. However, the condition structure is a bit convoluted:
- Line 77: checks if local next exists
- Line 79: only runs npm install if npx is found (but the intent seems to be to install next if not found)
- Line 83: fallback comment mentions "may fail on Nix"
The logic works, but consider clarifying the intent—should the npm install on line 81 run unconditionally if local next is missing, or should it depend on npx availability? Currently, it only runs `npm install` if `npx` exists, which may not match the intent.

core/schemas/account.go (1)
54-57: Consider adding an `Endpoint` field for self-hosted deployments.

The current implementation only includes `Deployments`, which covers model-to-deployment mappings. Per the linked issue, self-hosted endpoints are an optional enhancement. For future extensibility, consider whether an `Endpoint` field (similar to `AzureKeyConfig`) would be beneficial for users deploying models on custom infrastructure or using dedicated inference endpoints. This is non-blocking for the initial implementation.

```diff
 type HuggingFaceKeyConfig struct {
+	Endpoint    string            `json:"endpoint,omitempty"`    // Custom HuggingFace inference endpoint URL
 	Deployments map[string]string `json:"deployments,omitempty"` // Mapping of model identifiers to deployment names
 }
```

core/providers/huggingface/embedding.go (1)
60-104: Consider simplifying zero-usage handling.

The conversion logic is correct. However, lines 94-100 explicitly create a zero-valued `BifrostLLMUsage` when the HuggingFace response doesn't include usage. You could simplify by leaving `Usage` as `nil` when not provided, since the Bifrost schema allows it to be omitted. If you prefer explicit zero values, the current implementation is fine. Otherwise, you can remove lines 94-100:

```diff
 // Map usage information if available
 if response.Usage != nil {
 	bifrostResponse.Usage = &schemas.BifrostLLMUsage{
 		PromptTokens:     response.Usage.PromptTokens,
 		CompletionTokens: response.Usage.CompletionTokens,
 		TotalTokens:      response.Usage.TotalTokens,
 	}
-} else {
-	// Set empty usage if not provided
-	bifrostResponse.Usage = &schemas.BifrostLLMUsage{
-		PromptTokens:     0,
-		CompletionTokens: 0,
-		TotalTokens:      0,
-	}
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
- `.github/workflows/pr-tests.yml` (1 hunk)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunk)
- `core/providers/huggingface/chat.go` (1 hunk)
- `core/providers/huggingface/embedding.go` (1 hunk)
- `core/providers/huggingface/huggingface.go` (1 hunk)
- `core/providers/huggingface/huggingface_test.go` (1 hunk)
- `core/providers/huggingface/models.go` (1 hunk)
- `core/providers/huggingface/speech.go` (1 hunk)
- `core/providers/huggingface/transcription.go` (1 hunk)
- `core/providers/huggingface/types.go` (1 hunk)
- `core/providers/huggingface/utils.go` (1 hunk)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `docs/apis/openapi.json` (1 hunk)
- `docs/contributing/adding-a-provider.mdx` (1 hunk)
- `docs/features/unified-interface.mdx` (1 hunk)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (1 hunk)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunk)
- `ui/lib/constants/logs.ts` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- ui/README.md
- core/schemas/mux.go
- core/internal/testutil/account.go
- transports/config.schema.json
- core/providers/huggingface/utils.go
- .github/workflows/pr-tests.yml
- docs/apis/openapi.json
- core/providers/huggingface/speech.go
- docs/features/unified-interface.mdx
- ui/lib/constants/config.ts
- core/bifrost.go
- core/internal/testutil/responses_stream.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
`core/schemas/bifrost.go`, `core/schemas/account.go`, `core/providers/huggingface/transcription.go`, `docs/contributing/adding-a-provider.mdx`, `Makefile`, `core/providers/huggingface/huggingface_test.go`, `core/providers/huggingface/embedding.go`, `core/providers/huggingface/chat.go`, `ui/lib/constants/logs.ts`, `core/providers/huggingface/models.go`, `ui/lib/constants/icons.tsx`, `core/providers/huggingface/huggingface.go`, `core/providers/huggingface/types.go`
🧬 Code graph analysis (6)
- `core/schemas/bifrost.go` (1)
  - `ui/lib/types/config.ts` (1): `ModelProvider` (171-174)
- `core/providers/huggingface/transcription.go` (6)
  - `core/schemas/transcriptions.go` (2): `BifrostTranscriptionRequest` (3-10), `BifrostTranscriptionResponse` (16-26)
  - `core/providers/huggingface/types.go` (5): `HuggingFaceTranscriptionRequest` (376-381), `HuggingFaceTranscriptionRequestParameters` (384-387), `HuggingFaceTranscriptionGenerationParameters` (390-407), `HuggingFaceTranscriptionEarlyStopping` (411-414), `HuggingFaceTranscriptionResponse` (447-450)
  - `ui/components/ui/input.tsx` (1): `Input` (15-69)
  - `core/schemas/models.go` (1): `Model` (109-129)
  - `core/providers/elevenlabs/transcription.go` (1): `ToBifrostTranscriptionResponse` (100-150)
  - `core/schemas/bifrost.go` (2): `BifrostResponseExtraFields` (287-296), `HuggingFace` (51-51)
- `core/providers/huggingface/embedding.go` (3)
  - `core/schemas/embedding.go` (4): `BifrostEmbeddingRequest` (9-16), `BifrostEmbeddingResponse` (22-28), `EmbeddingData` (118-122), `EmbeddingStruct` (124-128)
  - `core/providers/huggingface/types.go` (2): `HuggingFaceEmbeddingRequest` (303-313), `HuggingFaceEmbeddingResponse` (324-328)
  - `core/schemas/chatcompletions.go` (1): `BifrostLLMUsage` (640-647)
- `core/providers/huggingface/chat.go` (2)
  - `core/schemas/chatcompletions.go` (12): `BifrostChatRequest` (12-19), `ChatContentBlockTypeText` (497-497), `ChatContentBlockTypeImage` (498-498), `ChatAssistantMessage` (541-545), `ChatToolMessage` (536-538), `BifrostResponseChoice` (582-590), `ChatAssistantMessageToolCall` (564-570), `ChatNonStreamResponseChoice` (605-608), `BifrostLogProbs` (593-598), `LogProb` (625-629), `ChatStreamResponseChoice` (611-613), `ChatStreamResponseChoiceDelta` (616-622)
  - `core/providers/huggingface/types.go` (12): `HuggingFaceChatRequest` (72-92), `HuggingFaceChatMessage` (94-102), `HuggingFaceContentItem` (105-109), `HuggingFaceImageRef` (111-113), `HuggingFaceToolCall` (115-119), `HuggingFaceFunction` (121-125), `HuggingFaceResponseFormat` (127-130), `HuggingFaceStreamOptions` (139-141), `HuggingFaceTool` (143-146), `HuggingFaceToolFunction` (148-152), `HuggingFaceChatResponse` (154-161), `HuggingFaceChatStreamResponse` (215-224)
- `core/providers/huggingface/models.go` (3)
  - `core/providers/huggingface/types.go` (1): `HuggingFaceListModelsResponse` (24-26)
  - `core/schemas/bifrost.go` (13): `ModelProvider` (32-32), `RequestType` (86-86), `ChatCompletionRequest` (92-92), `ChatCompletionStreamRequest` (93-93), `TextCompletionRequest` (90-90), `TextCompletionStreamRequest` (91-91), `ResponsesRequest` (94-94), `ResponsesStreamRequest` (95-95), `EmbeddingRequest` (96-96), `SpeechRequest` (97-97), `SpeechStreamRequest` (98-98), `TranscriptionRequest` (99-99), `TranscriptionStreamRequest` (100-100)
  - `core/schemas/models.go` (2): `BifrostListModelsResponse` (36-45), `Model` (109-129)
- `core/providers/huggingface/types.go` (1)
  - `core/schemas/models.go` (1): `Model` (109-129)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (25)
.github/workflows/release-pipeline.yml (1)
118-118: Environment variable additions are correct and necessary.

The `HUGGING_FACE_API_KEY` environment variable is properly added to all four release jobs with correct GitHub Secrets reference syntax. The environment variables are inherited by the release scripts' child processes (specifically `go test`), where HuggingFace tests access the key via `os.Getenv("HUGGING_FACE_API_KEY")`. This is the correct pattern for passing secrets to test suites during the release workflow. The documentation already uses consistent naming (`HUGGING_FACE_API_KEY` with underscore), as shown in `docs/contributing/adding-a-provider.mdx`, so no documentation updates are needed.

Likely an incorrect or invalid review comment.
Makefile (5)
15-21: Binary path resolution and variable initialization look sound.

The new variables (GOBIN, GOPATH, DEFAULT_GOBIN, AIR_BIN, GOTESTSUM_BIN) properly detect and prioritize local Go binary paths, with sensible fallbacks. Line 21's GOTESTSUM_BIN simplification (removal of the shell `which` fallback) is an improvement. This approach supports the HuggingFace provider tests by ensuring test tooling is available in predictable locations.
24-29: Color variables using printf are a good improvement.

Replacing hardcoded ANSI escape sequences with shell `printf` calls is more portable and avoids raw escape codes in the Makefile. This is a solid enhancement.
65-70: Root check in install-ui is appropriate for local development.

The guard against running `make` as root on developer machines (except CI environments) prevents npm permission issues on systems like Nix. This is a pragmatic improvement.
155-155: Using AIR_BIN variable in dev target is correct.

The change from direct `air` invocation to `$(AIR_BIN)` ensures the tool is invoked from the detected/installed path, improving reliability. Good alignment with the binary path resolution strategy.

Also applies to: 163-163
349-349: Test targets correctly use GOTESTSUM_BIN variable.

All test targets (test, test-core, test-plugins) now use `$(GOTESTSUM_BIN)` instead of direct `gotestsum` invocations. This ensures consistency with the binary path resolution and supports the HuggingFace provider testing workflow. The change is well-applied across all affected targets.

Also applies to: 401-401, 425-425, 454-454, 548-548
core/schemas/bifrost.go (3)
35-52: LGTM!

The HuggingFace provider constant is correctly added with the lowercase value `"huggingface"`, consistent with the naming convention used by other providers.
55-62: LGTM!

Adding HuggingFace to `SupportedBaseProviders` is appropriate, allowing it to serve as a base provider for custom provider configurations.
65-83: LGTM!

HuggingFace is correctly added to `StandardProviders`, completing the registration as a built-in provider.

ui/lib/constants/logs.ts (2)
2-20: LGTM!

The `"huggingface"` entry is correctly placed in alphabetical order within the `KnownProvidersNames` array, and the `ProviderName` type will automatically include it through type derivation.
43-61: LGTM!

The `ProviderLabels` entry correctly uses the brand-appropriate capitalization `"HuggingFace"` for user-facing display.

core/schemas/account.go (1)
8-18: LGTM!

The `HuggingFaceKeyConfig` field is correctly added to the `Key` struct following the established pattern for provider-specific configurations.

core/providers/huggingface/models.go (1)
16-104: LGTM! Well-structured model listing and method derivation.

The implementation correctly:
- Handles nil inputs and skips invalid models
- Derives supported methods from pipeline tags and model tags with comprehensive coverage
- Pre-allocates slices for performance
- Returns sorted method lists for consistency
core/providers/huggingface/types.go (2)
28-48: LGTM! Flexible JSON unmarshaling.

The custom `UnmarshalJSON` correctly handles both the array form `[...]` and object form `{"models": [...]}` returned by different HuggingFace API versions, with proper error handling when neither format matches.
416-444: LGTM! Correct handling of union types.

The custom JSON marshaling/unmarshaling for `HuggingFaceTranscriptionEarlyStopping` properly handles both boolean and string (`"never"`) forms, which aligns with the HuggingFace API's flexible parameter schema.

core/providers/huggingface/huggingface_test.go (1)
12-63: LGTM! Comprehensive test configuration. The test correctly:

- Gates on the `HUGGING_FACE_API_KEY` environment variable
- Uses appropriate models for each feature (transcription, speech synthesis, embeddings, chat, vision)
- Configures comprehensive test scenarios covering chat, streaming, tool calls, images, embeddings, and more
- Follows testutil patterns with proper setup and cleanup
core/providers/huggingface/transcription.go (2)
9-101: LGTM! Robust parameter handling. The conversion correctly:
- Validates inputs with appropriate error messages
- Uses `extractIntFromInterface` to handle numeric parameters from JSON (which may be `float64` or `int`)
- Handles the `early_stopping` union type (bool or string)
- Maps all generation parameters comprehensively
103-142: LGTM! Clean response conversion. The conversion properly:
- Validates non-nil response and model name
- Maps transcription chunks to Bifrost segments with timestamps
- Sets appropriate provider metadata in ExtraFields
core/providers/huggingface/chat.go (2)
12-207: LGTM! Comprehensive chat request conversion. The conversion correctly:
- Handles messages, roles, names, and content (string and structured blocks)
- Safely processes tool calls with nil checks for `Function.Name`
- Maps all chat parameters including response format, stream options, and tools
- Uses `json.RawMessage` for flexible fields like `ToolChoice`
209-323: LGTM! Thorough response conversion. The conversion properly:
- Validates inputs and constructs base response fields
- Converts choices, messages, tool calls, and logprobs to Bifrost format
- Maps usage information when available
- Sets appropriate ExtraFields metadata
core/providers/huggingface/huggingface.go (5)
389-461: LGTM! Solid chat completion implementation. The implementation correctly:
- Checks operation permissions
- Splits model identifiers into provider and model components
- Converts requests using the chat converter
- Handles errors and raw responses appropriately
- Sets all required ExtraFields metadata
463-745: LGTM! Robust streaming implementation. The streaming logic properly:
- Handles SSE parsing with proper line-by-line processing
- Checks for context cancellation between chunks
- Parses error responses in the stream
- Converts HuggingFace stream responses to Bifrost format
- Supports fallback to `ResponsesStream` when needed with proper state management
- Handles combined usage and content chunks by splitting them into separate events
771-840: LGTM! Clean embedding implementation. The implementation correctly:
- Validates operation permissions
- Retrieves provider mapping to get the correct provider-specific model ID
- Validates the task type matches "feature-extraction"
- Converts responses with proper error handling
842-918: LGTM! Complete speech synthesis flow. The implementation correctly:
- Validates permissions and request body
- Retrieves and validates provider mapping for "text-to-speech" task
- Downloads audio from the returned URL
- Converts response with proper metadata
924-993: LGTM! Correct transcription implementation. The implementation correctly:
- Validates permissions and model mapping
- Checks for "automatic-speech-recognition" task
- Converts requests and responses with proper error handling
- Sets appropriate ExtraFields metadata
cdb45be to a1ee290 (Compare)
Actionable comments posted: 3
♻️ Duplicate comments (3)
Makefile (2)
97-98: **Redundant `which` check with full path will always fail.** Although a past review comment flagged this (marked "✅ Addressed in commit f279893"), the pattern remains at line 97: `if ! which $$INSTALLED && [ ! -x "$$INSTALLED" ]`. Since `$$INSTALLED` is an absolute path (e.g., `/home/user/go/bin/air`), the `which` command will always fail (it only searches executables in PATH by name, not by path). The `[ ! -x "$$INSTALLED" ]` check is what matters. Simplify to rely solely on the path check:

```diff
- if ! which $$INSTALLED >/dev/null 2>&1 && [ ! -x "$$INSTALLED" ]; then \
+ if [ ! -x "$$INSTALLED" ]; then \
```

This aligns with how `AIR_BIN` is checked earlier (line 90).
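A minimal shell sketch of the recommended pattern: when a variable already holds an absolute path, test executability directly instead of consulting `which` (the paths and install hint here are illustrative):

```shell
# check_bin: succeed iff the given path is an executable file.
check_bin() {
  [ -x "$1" ]
}

if check_bin /bin/sh; then
  echo "ok: /bin/sh"
fi
if check_bin /nonexistent/gotestsum; then
  echo "unexpected"
else
  echo "fallback: would run the install target instead"
fi
```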
103-104: **Inconsistent binary availability check; `which` fallback doesn't handle full paths.** Line 103 uses `[ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN)` while `AIR_BIN` (line 90) uses just `[ -x "$(AIR_BIN)" ]`. A past review flagged this inconsistency (marked "✅ Addressed"), but the pattern remains. If `GOTESTSUM_BIN` resolves to a full path like `/home/user/go/bin/gotestsum`, the `which` fallback fails (it only searches PATH by name). For consistency and clarity, simplify to match the `AIR_BIN` pattern:

```diff
- @if [ -x "$(GOTESTSUM_BIN)" ] || which $(GOTESTSUM_BIN) > /dev/null 2>&1; then \
+ @if [ -x "$(GOTESTSUM_BIN)" ]; then \
```

Also apply the same simplification to line 110 in `install-gotestsum` to remove the redundant `which` check.

docs/contributing/adding-a-provider.mdx (1)
500-527: Variable name inconsistency persists in documentation example. The example code declares `hfReq` on line 500, but lines 510-527 reference `providerReq` (e.g., `providerReq.Temperature`). This could confuse contributors.

Apply this diff to fix the variable names:

```diff
 // Map parameters
 if bifrostReq.Params != nil {
 	params := bifrostReq.Params

 	// Map standard parameters
 	if params.Temperature != nil {
-		providerReq.Temperature = params.Temperature
+		hfReq.Temperature = params.Temperature
 	}
 	if params.MaxTokens != nil {
-		providerReq.MaxTokens = params.MaxTokens
+		hfReq.MaxTokens = params.MaxTokens
 	}
 	// ... other standard parameters

 	// Handle provider-specific ExtraParams
 	if params.ExtraParams != nil {
 		if customParam, ok := params.ExtraParams["custom_param"].(string); ok {
-			providerReq.CustomParam = &customParam
+			hfReq.CustomParam = &customParam
 		}
 	}
 }

-return providerReq
+return hfReq
```
🧹 Nitpick comments (9)
ui/README.md (1)
12-12: Consider removing the specific provider count to further reduce maintenance burden. Line 12 mentions "15+ AI providers" which will require updates whenever the count changes. Since the objective is to redirect users to external documentation rather than maintain this kind of reference, consider removing the number entirely and simply linking to the provider configuration docs.

```diff
-- **Provider Management** - Configure [15+ AI providers](https://docs.getbifrost.ai/quickstart/gateway/provider-configuration)
+- **Provider Management** - Configure [AI providers](https://docs.getbifrost.ai/quickstart/gateway/provider-configuration)
```

core/schemas/mux.go (1)
1146-1221: `delta.Thought` now flows into both output text deltas and reasoning deltas; verify this is intentional.

The new logic:

- Treats `hasContent := delta.Content != nil && *delta.Content != ""` and `hasThought := delta.Thought != nil && *delta.Thought != ""`.
- Enters the text path when `hasContent || hasThought`.
- Builds `contentDelta` by concatenating `delta.Content` and `delta.Thought` (when present) and emits it as `ResponsesStreamResponseTypeOutputTextDelta` via `Delta: &contentDelta`.

Further down, the existing block at lines 1369-1380 still emits a separate `ResponsesStreamResponseTypeReasoningSummaryTextDelta` for non-empty `delta.Thought`.

Net effect:

- For chunks with only `delta.Thought` (no `delta.Content`), we now:
  - Create a text item and mark `TextItemHasContent = true`.
  - Emit `response.output_text.delta` with the thought text.
  - Also emit `reasoning_summary_text.delta` with the same text.
- For chunks where both `Content` and `Thought` are set, the main `output_text.delta` stream carries `content + thought` while reasoning deltas still carry `thought` alone.

If `delta.Thought` is meant as reasoning/chain-of-thought that should not appear in the primary user-visible text stream, this is a behavior change and may cause reasoning to be shown twice (or in the wrong place) in consumers that use both `response.output_text.delta` and `reasoning_summary_text.delta`.

If the intent is to surface thought text in the main output stream for specific providers (e.g., Hugging Face) while still emitting reasoning events, it might be worth:

- Documenting this clearly, and/or
- Considering a guard such as:
  - Only including `delta.Thought` in `contentDelta` when the upstream provider flags it as user-visible, or
  - Skipping the separate `ReasoningSummaryTextDelta` emission when you've already folded `Thought` into the main `Delta`.

Can you confirm the intended semantics here and whether clients are expected to consume both event types simultaneously?
core/providers/huggingface/utils.go (1)
291-319: Consider adding context support for request cancellation. The `downloadAudioFromURL` function doesn't accept or use a `context.Context`, which means:

- No timeout control beyond the client's default
- No cancellation support if the caller's context is cancelled

Consider accepting a context and using `DoTimeout` or checking context cancellation:

```diff
-func (provider *HuggingFaceProvider) downloadAudioFromURL(audioURL string) ([]byte, error) {
+func (provider *HuggingFaceProvider) downloadAudioFromURL(ctx context.Context, audioURL string) ([]byte, error) {
 	req := fasthttp.AcquireRequest()
 	resp := fasthttp.AcquireResponse()
 	defer fasthttp.ReleaseRequest(req)
 	defer fasthttp.ReleaseResponse(resp)

 	req.SetRequestURI(audioURL)
 	req.Header.SetMethod(http.MethodGet)

-	err := provider.client.Do(req, resp)
+	_, err := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if err != nil {
 		return nil, fmt.Errorf("failed to download audio: %w", err)
 	}
```

docs/contributing/adding-a-provider.mdx (1)
43-43: Optional: Consider hyphenating "OpenAI-Compatible" for consistency. Static analysis suggests using a hyphen to join "OpenAI" and "Compatible" when used as a compound adjective (e.g., "OpenAI-Compatible Providers"). This is a minor grammatical nitpick and optional to address.
Also applies to: 71-71, 629-629, 1475-1475
core/providers/huggingface/speech.go (1)
10-12: Consider returning an explicit error for nil inputs. Returning `(nil, nil)` for nil input can make error handling ambiguous for callers: they need to check both return values. Consider returning an error instead, or document this behavior clearly.

```diff
 func ToHuggingFaceSpeechRequest(request *schemas.BifrostSpeechRequest) (*HuggingFaceSpeechRequest, error) {
 	if request == nil {
-		return nil, nil
+		return nil, fmt.Errorf("speech request cannot be nil")
 	}
```

Alternatively, if `nil` input is a valid case that should be handled silently, add a comment explaining this design choice.

Also applies to: 99-101
core/providers/huggingface/huggingface_test.go (1)
12-63: Comprehensive HuggingFace test configuration looks solid. Env gating, `SetupTest` usage, and the `ComprehensiveTestConfig` (models and enabled scenarios) are consistent with the provider's implemented surfaces and should give good end-to-end coverage. As a minor polish, you could `defer client.Shutdown()` immediately after successful setup so it still runs if additional subtests are added or the test body grows.
10-65: Embedding request/response converters are correct; consider mapping more params later. The request converter correctly handles `hf-inference` vs other providers (using `Inputs` vs `Input`), and the response converter builds `EmbeddingData` and `BifrostLLMUsage` in line with Bifrost's schema. As a future enhancement, you could also plumb through typed fields like `EncodingFormat` and `Dimensions` (plus any non-text embedding inputs you decide to support) if/when Bifrost starts exposing them for HuggingFace embeddings.

Also applies to: 68-110
core/providers/huggingface/models.go (1)
16-44: Model listing and capability derivation are reasonable and safe. Transforming Hub models into `schemas.Model` with IDs of the form `provider/inferenceProvider/modelId` and deriving `SupportedMethods` from `pipeline_tag` plus tags gives a sensible, conservative view over the catalog. The heuristics for chat/text, embeddings, TTS, and ASR look balanced; if you find important HF tags that don't get mapped yet, they can be incrementally added to `deriveSupportedMethods`.

Also applies to: 46-104
core/providers/huggingface/transcription.go (1)
11-119: Transcription converters look correct; you can later expose more params. The request converter cleanly handles `hf-inference` (raw audio) vs other providers and the fal-ai data-URL requirement, including an explicit guard against unsupported wav input. Generation parameters are mapped in a type-safe way via `SafeExtractIntPointer` and direct float/bool assertions, and the response converter sensibly turns `chunks.timestamp` into `TranscriptionSegment` ranges.

When you want richer control, you can also project `Language`, `Prompt`, and `ResponseFormat` from `BifrostTranscriptionParameters` (or `ExtraParams`) into the HuggingFace request; doing so won't disturb the current happy path.

Also applies to: 121-159
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
- `core/internal/testutil/scenarios/audio/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/audio/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/audio/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/audio/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/audio/Technical_Terms.mp3` is excluded by `!**/*.mp3`
📒 Files selected for processing (32)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `Makefile` (8 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (3 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/internal/testutil/transcription.go` (6 hunks)
- `core/providers/gemini/speech.go` (2 hunks)
- `core/providers/gemini/transcription.go` (2 hunks)
- `core/providers/gemini/utils.go` (0 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/providers/utils/audio.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (2 hunks)
- `core/schemas/transcriptions.go` (1 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/features/unified-interface.mdx` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (6 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (7)
- core/internal/testutil/responses_stream.go
- docs/apis/openapi.json
- core/providers/huggingface/chat.go
- transports/config.schema.json
- .github/workflows/pr-tests.yml
- core/internal/testutil/account.go
- docs/features/unified-interface.mdx
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
`core/providers/gemini/transcription.go`, `core/schemas/transcriptions.go`, `ui/lib/constants/logs.ts`, `core/providers/huggingface/models.go`, `core/providers/gemini/speech.go`, `core/internal/testutil/transcription.go`, `core/schemas/account.go`, `ui/lib/constants/config.ts`, `core/providers/huggingface/huggingface_test.go`, `ui/README.md`, `core/schemas/mux.go`, `core/providers/huggingface/embedding.go`, `core/providers/huggingface/transcription.go`, `core/bifrost.go`, `core/providers/huggingface/speech.go`, `core/schemas/bifrost.go`, `core/providers/utils/audio.go`, `ui/lib/constants/icons.tsx`, `Makefile`, `core/providers/huggingface/utils.go`, `docs/contributing/adding-a-provider.mdx`, `core/providers/huggingface/huggingface.go`, `core/providers/huggingface/types.go`
🧠 Learnings (1)
📚 Learning: 2025-09-01T15:29:17.076Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 372
File: core/providers/utils.go:552-598
Timestamp: 2025-09-01T15:29:17.076Z
Learning: The detectAudioMimeType function in core/providers/utils.go is specifically designed for Gemini's audio format support, which only includes: WAV (audio/wav), MP3 (audio/mp3), AIFF (audio/aiff), AAC (audio/aac), OGG Vorbis (audio/ogg), and FLAC (audio/flac). The implementation should remain focused on these specific formats rather than being over-engineered for general-purpose audio detection.
Applied to files:
core/providers/utils/audio.go
🧬 Code graph analysis (11)
core/providers/gemini/transcription.go (1)
core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
`HuggingFaceListModelsResponse` (24-26)
core/schemas/bifrost.go (13)
`ModelProvider` (32-32), `RequestType` (86-86), `ChatCompletionRequest` (92-92), `ChatCompletionStreamRequest` (93-93), `TextCompletionRequest` (90-90), `TextCompletionStreamRequest` (91-91), `ResponsesRequest` (94-94), `ResponsesStreamRequest` (95-95), `EmbeddingRequest` (96-96), `SpeechRequest` (97-97), `SpeechStreamRequest` (98-98), `TranscriptionRequest` (99-99), `TranscriptionStreamRequest` (100-100)
core/schemas/models.go (1)
`Model` (109-129)
core/providers/gemini/speech.go (1)
core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)
core/internal/testutil/transcription.go (4)
core/schemas/transcriptions.go (3)
`BifrostTranscriptionRequest` (3-10), `TranscriptionInput` (28-30), `TranscriptionParameters` (32-49)
core/internal/testutil/utils.go (1)
`GetProviderVoice` (38-86)
core/schemas/speech.go (4)
`BifrostSpeechRequest` (9-16), `SpeechParameters` (43-58), `SpeechVoiceInput` (65-68), `BifrostSpeechResponse` (22-29)
core/internal/testutil/test_retry_framework.go (3)
`TestRetryContext` (168-173), `SpeechRetryConfig` (216-223), `WithSpeechTestRetry` (1326-1476)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
`AzureKeyConfig` (23-27), `VertexKeyConfig` (36-42), `BedrockKeyConfig` (53-60)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
`SetupTest` (51-60)
core/internal/testutil/account.go (2)
`ComprehensiveTestConfig` (47-64), `TestScenarios` (22-44)
core/schemas/bifrost.go (2)
`HuggingFace` (51-51), `Fallback` (131-134)
core/schemas/models.go (1)
`Model` (109-129)
core/internal/testutil/tests.go (1)
`RunAllComprehensiveTests` (15-62)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
`BifrostResponsesStreamResponse` (1422-1460), `ResponsesStreamResponseTypeOutputTextDelta` (1370-1370)
core/schemas/utils.go (1)
`Ptr` (16-18)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
`BifrostEmbeddingRequest` (9-16), `BifrostEmbeddingResponse` (22-28), `EmbeddingData` (118-122), `EmbeddingStruct` (124-128)
core/providers/huggingface/types.go (2)
`HuggingFaceEmbeddingRequest` (303-314), `HuggingFaceEmbeddingResponse` (325-329)
core/schemas/chatcompletions.go (1)
`BifrostLLMUsage` (640-647)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
`HuggingFace` (51-51)
core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(84-117)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
ModelProvider(171-174)
ui/lib/constants/icons.tsx (1)
ui/scripts/fix-paths.js (1)
path(4-4)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[grammar] ~43-~43: Use a hyphen to join words.
Context: ... OpenAI API compatible: #### Non-OpenAI Compatible Providers If the provider ha...
(QB_NEW_EN_HYPHEN)
[grammar] ~71-~71: Use a hyphen to join words.
Context: ...T - Add comprehensive tests #### OpenAI Compatible Providers If the provider is...
(QB_NEW_EN_HYPHEN)
[grammar] ~629-~629: Use a hyphen to join words.
Context: ...ls (for response converters) ### OpenAI Compatible Providers If you are impleme...
(QB_NEW_EN_HYPHEN)
[grammar] ~1475-~1475: Use a hyphen to join words.
Context: ...r complete details. --- ### For OpenAI Compatible Providers For OpenAI-compati...
(QB_NEW_EN_HYPHEN)
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
🔇 Additional comments (27)
ui/README.md (2)
1-242: Excellent update addressing past feedback—README now emphasizes external docs and less maintenance burden.The README has been significantly improved to align with the prior feedback to redirect to documentation rather than maintain hardcoded lists. The shift from "dashboard" to "web interface," addition of external documentation links, updated architecture section (Redux Toolkit + RTK Query), and clearer Quick Start flow are all valuable enhancements.
209-218: Verify that all external documentation links are current and accessible. The README now extensively links to `docs.getbifrost.ai` for detailed information. Ensure these endpoints are live, correct, and reflect the current state of the documentation (especially those that reference new HuggingFace provider features and recent architectural changes like the move to RTK Query).

.github/workflows/release-pipeline.yml (2)
118-118: Environment variable additions are consistent and correctly placed. The four additions of `HUGGING_FACE_API_KEY` across the release jobs use the correct naming convention (consistent with the resolution documented in commit 53cb91d) and follow the established pattern of other provider API keys. The placement in the environment sections is logical and maintains alphabetical/categorical grouping with other secrets.

Also applies to: 191-191, 268-268, 357-357
118-118: HUGGING_FACE_API_KEY is defined but not used by any release script. The environment variable is added consistently across all four release jobs with correct syntax, but examination of the release scripts reveals it is never referenced. None of `release-core.sh`, `release-framework.sh`, `release-all-plugins.sh`, `release-single-plugin.sh`, or `release-bifrost-http.sh` checks for or uses this variable; they only handle `CODECOV_TOKEN` and `GH_TOKEN`/`GITHUB_TOKEN`.
- This variable is intended for a future feature not yet implemented, or
- The scripts that should consume it need to be updated to use `HUGGING_FACE_API_KEY`

Likely an incorrect or invalid review comment.
Makefile (5)
14-21: Binary path management logic is well-structured. The introduction of GOBIN, GOPATH, and DEFAULT_GOBIN to compute AIR_BIN and GOTESTSUM_BIN provides a consistent way to locate or fall back to binaries, avoiding hard-coded path assumptions. This improves portability across systems with different Go installation layouts.
24-29: Color definitions via printf improve portability and maintainability. Using `$(shell printf)` instead of embedded ANSI codes is clearer and easier to maintain. This approach also enables future conditional color handling if needed.
75-85: Clarify the npm install working directory in the next.js installation fallback. At line 81, the code runs `npm --prefix . install next` while already in the `ui/` directory (line 76: `@cd ui &&`). The `--prefix .` appears redundant since you're already in that directory. Either remove the prefix or adjust the path if this is intentional.

Verify that this installs next in the correct location (`ui/node_modules`). If intended, a comment would help clarify the reasoning.
155-155: Consistent use of $(AIR_BIN) in dev target. Both debug and normal mode branches correctly invoke `$(AIR_BIN)`, ensuring the target works when `air` is in a non-standard location (e.g., managed by `DEFAULT_GOBIN`).

Also applies to: 163-163
349-349: Test targets consistently use the $(GOTESTSUM_BIN) variable. All test invocations now use `$(GOTESTSUM_BIN)` instead of a hard-coded "gotestsum", ensuring tests run even when the binary is in a non-standard location. The `GOWORK=off` flags are correctly positioned.

Also applies to: 401-401, 425-425, 454-454, 548-548
core/providers/gemini/transcription.go (1)
6-8: Centralizing audio MIME detection via utils is consistent and safe. Using `utils.DetectAudioMimeType` in `ToGeminiTranscriptionRequest` keeps Gemini transcription in sync with the shared audio detector and avoids duplicated logic; for any non-empty `Input.File` you now get a deterministic MIME type within Gemini's supported set, with a safe MP3 fallback. This looks correct and consistent with the speech path.

Also applies to: 159-165
core/providers/utils/audio.go (1)
64-119: Shared `DetectAudioMimeType` correctly mirrors Gemini's supported formats. The new `DetectAudioMimeType` covers exactly the expected Gemini formats (WAV, MP3, AIFF/AIFC, AAC, OGG, FLAC) using robust header checks and keeps a conservative MP3 fallback for short/unknown data. The ordering of checks (e.g., AAC before MP3 frame sync) is sensible and matches the earlier Gemini-specific behavior, making this a solid centralization of the audio detection logic. Based on learnings, this stays scoped to Gemini's supported formats without over-engineering.
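The header checks can be illustrated with a few magic-byte comparisons. This is a simplified sketch; the real `DetectAudioMimeType` covers more formats (AIFF/AIFC, AAC) and inspects MP3 frame sync bits:

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffAudio inspects leading magic bytes to guess the audio MIME type,
// falling back conservatively to MP3 for short or unknown data.
func sniffAudio(data []byte) string {
	switch {
	case len(data) >= 12 && bytes.Equal(data[0:4], []byte("RIFF")) && bytes.Equal(data[8:12], []byte("WAVE")):
		return "audio/wav"
	case len(data) >= 4 && bytes.Equal(data[0:4], []byte("fLaC")):
		return "audio/flac"
	case len(data) >= 4 && bytes.Equal(data[0:4], []byte("OggS")):
		return "audio/ogg"
	case len(data) >= 3 && bytes.Equal(data[0:3], []byte("ID3")):
		return "audio/mp3" // MP3 with an ID3v2 tag
	default:
		return "audio/mp3" // conservative fallback
	}
}

func main() {
	wav := append([]byte("RIFF\x00\x00\x00\x00WAVE"), 0)
	fmt.Println(sniffAudio(wav), sniffAudio([]byte("fLaC....")))
}
```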
core/providers/gemini/speech.go (2)
151-161: No functional change in WAV conversion branch. The minor edit around the `else` branch in `ToBifrostSpeechResponse` doesn't alter behavior: PCM is still converted to WAV when requested, otherwise raw PCM is passed through. No issues here.
176-183: Using `utils.DetectAudioMimeType` for speech responses keeps MIME typing consistent. Switching `MIMEType` in `ToGeminiSpeechResponse` to `utils.DetectAudioMimeType(bifrostResp.Audio)` reuses the shared detector and ensures outbound audio responses are labeled with a MIME type from the same, well-defined set as other Gemini audio flows. This is a clean and correct consolidation.

ui/lib/constants/logs.ts (1)
10-10: LGTM! The HuggingFace provider addition follows the established pattern and is correctly integrated into both the provider names array and the labels mapping.
Also applies to: 60-60
core/schemas/account.go (2)
54-56: LGTM! The `HuggingFaceKeyConfig` type definition follows the established pattern from other provider configurations (Azure, Vertex, Bedrock) and correctly includes the `Deployments` field for model-to-deployment mapping.
17-17: Fix typo in JSON tag: "hugggingface" → "huggingface". The JSON tag contains three 'g's (`hugggingface_key_config`) instead of two. This typo will break API compatibility and serialization.

Apply this diff:

```diff
-	HuggingFaceKeyConfig *HuggingFaceKeyConfig `json:"hugggingface_key_config,omitempty"` // Hugging Face-specific key configuration
+	HuggingFaceKeyConfig *HuggingFaceKeyConfig `json:"huggingface_key_config,omitempty"` // Hugging Face-specific key configuration
```

Likely an incorrect or invalid review comment.
core/schemas/bifrost.go (1)
51-51: LGTM! HuggingFace provider correctly registered in schemas. The provider constant follows the established naming conventions:

- PascalCase for the Go constant (`HuggingFace`)
- Lowercase string value (`"huggingface"`)
- Appropriately added to both `SupportedBaseProviders` (enabling custom providers to use HuggingFace as a base) and `StandardProviders` (registering it as a built-in provider)

Also applies to: 61-61, 82-82
core/providers/huggingface/utils.go (2)
178-189: Confirmed: copy-paste error fixed. The error messages now correctly reference `sambanova` (line 182) and `scaleway` (line 188) instead of the previous erroneous "nebius" references noted in past reviews.
132-149: LGTM! Model name parsing now handles edge cases. The function now properly returns an error when the model name has no slashes (`t == 0`), addressing the previously flagged issue. The logic correctly:

- Returns an error for invalid format (no slashes)
- Defaults to the `hf-inference` provider for single-slash models (e.g., `org/model`)
- Extracts the explicit provider for multi-slash models (e.g., `provider/org/model`)
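The parsing rules above can be sketched as follows (function name hypothetical, not the provider's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// splitModel applies the three rules: error on no slash, default the
// backend to hf-inference on one slash, and peel off an explicit
// inference provider when there are two or more slashes.
func splitModel(id string) (provider, model string, err error) {
	switch strings.Count(id, "/") {
	case 0:
		return "", "", fmt.Errorf("invalid model %q: expected org/model or provider/org/model", id)
	case 1:
		return "hf-inference", id, nil
	default:
		i := strings.Index(id, "/")
		return id[:i], id[i+1:], nil
	}
}

func main() {
	p, m, _ := splitModel("nebius/Qwen/Qwen3-Embedding-8B")
	fmt.Println(p, m)
	p, m, _ = splitModel("google/gemma-2-2b-it")
	fmt.Println(p, m)
}
```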
7-13: Good addition of quick reference note. The note directing contributors to reference implementations (`cerebras/` for OpenAI-compatible, `huggingface/` or `anthropic/` for custom APIs) is helpful for understanding real-world patterns beyond the simplified examples.
41-67: LGTM! Integer extraction now handles JSON float64 values. The use of `schemas.SafeExtractIntPointer` for integer parameters (`max_new_tokens`, `max_length`, `min_length`, etc.) properly handles the case where JSON unmarshaling produces `float64` instead of `int`. This addresses the concern raised in past reviews.

ui/lib/constants/config.ts (1)
40-40: LGTM! HuggingFace correctly added to UI constants. The model placeholder provides helpful examples (`google/gemma-2-2b-it`, `nebius/Qwen/Qwen3-Embedding-8B`) that demonstrate the HuggingFace model naming convention with provider prefixes. The key requirement is correctly set to `true`.

Likely an incorrect or invalid review comment.
core/bifrost.go (1)
26-37: HuggingFace provider wiring into Bifrost factory looks correct. The new import and `createBaseProvider` branch for `schemas.HuggingFace` follow the existing provider pattern and cleanly integrate the new provider without altering existing behavior.

Also applies to: 1327-1328
core/providers/huggingface/huggingface.go (3)
83-117: Provider construction, request execution, and model listing follow existing patterns. `NewHuggingFaceProvider`'s client setup, pool pre-warming, and base-URL handling match other providers; `completeRequestWithRetry` cleanly centralizes model-ID validation and 404 cache invalidation, and `listModelsByKey`'s fan-out/aggregation logic is careful about errors, latency, and optional raw responses. The wiring into `ListModels` with `CheckOperationAllowed` and the keyless path is also consistent with the rest of Bifrost.

Also applies to: 129-206, 268-401
430-751: Chat completion (sync/stream) integration looks robust. The chat path correctly normalizes the model string via `splitIntoModelProvider`, uses `CheckContextAndGetRequestBody` + `ToHuggingFaceChatCompletionRequest`, and maps back into Bifrost responses with proper `ExtraFields` (provider, requested model, request type, latency, raw response). The streaming implementation uses `bufio.Scanner` with an enlarged buffer, handles SSE framing, distinguishes error envelopes vs normal chunks, and supports the Responses→Chat fallback by re-emitting events as `ResponsesStream` where needed. Overall this matches Bifrost's streaming conventions and should behave well under load.
777-837: Embedding, Speech, and Transcription paths are consistent and guarded. The Embedding/Speech/Transcription methods all:

- Check `CheckOperationAllowed` for the relevant request type,
- Derive `inferenceProvider`/`modelName` via `splitIntoModelProvider`,
- Use the appropriate `ToHuggingFace*` converter plus `completeRequestWithRetry` for routing and retry,
- Convert back into Bifrost responses and set `ExtraFields` (provider, requested model, request type, latency, optional raw response).

The hf-inference vs non-hf-inference split for transcription audio, and the fal-ai audio-URL handling combined with the new test fixtures, align well with the new converter logic.

Also applies to: 839-902, 908-975
core/providers/huggingface/types.go (1)
10-68: HuggingFace model, embedding, and transcription types align well with the APIs. The model types (including `HuggingFaceModel`, `HuggingFaceListModelsResponse`, and the inference-provider mapping structs) give a clean projection of the Hub's model/info endpoints, and the custom `UnmarshalJSON` for list-models makes the code resilient to both array and object forms. Similarly, the embedding response's multi-shape unmarshal logic and the transcription generation/early-stopping union cover the main formats exposed by HuggingFace while still mapping cleanly into Bifrost's abstractions.

Also applies to: 301-397, 431-513
1249a29 to 55ca6ee (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (3)
core/providers/huggingface/types.go (1)
271-272: Remove unusedExtrafield or document its intended purpose.The
Extra map[string]anyfield withjson:"-"tag is defined but never used anywhere in the codebase. This was flagged in a previous review but appears unaddressed. Either remove it or add documentation explaining its intended future use.type HuggingFaceSpeechRequest struct { Text string `json:"text"` Provider string `json:"provider" validate:"required"` Model string `json:"model" validate:"required"` Parameters *HuggingFaceSpeechParameters `json:"parameters,omitempty"` - Extra map[string]any `json:"-"` }docs/features/providers/huggingface.mdx (1)
47-52: Clarify fal‑ai audio format wording to match the current enforcement.

The bullets say "Only MP3 … WAV and other formats are explicitly rejected", but the snippet's error suggests "mp3 or ogg" and the guard only special‑cases `audio/wav`. To avoid confusion for maintainers, consider tightening either:
- the prose (e.g., “WAV is explicitly rejected; MP3 is supported and other formats like OGG are best‑effort”), or
- the code/example to truly enforce MP3‑only if that’s the intended contract.
This keeps docs and behavior aligned while still documenting the HuggingFace‑specific fal‑ai limitation (WAV rejected, MP3 required). Based on learnings, this is specific to fal‑ai when routed via HuggingFace.
Also applies to: 107-118
core/providers/huggingface/transcription.go (1)
24-47: falAI transcription branch still omits `Model` and `Provider` fields.

Non‑falAI requests correctly set both `Model` and `Provider`, but the `falAI` branch only sets `AudioURL`. For multi‑provider Hugging Face usage, fal‑ai still needs model and provider identifiers; without them you risk provider‑side 4xx or ambiguous behavior.

You've already added the empty‑file guard; the remaining fix is to also populate `Model` and `Provider` in the falAI branch.

Suggested patch for falAI branch
```diff
-	encoded := base64.StdEncoding.EncodeToString(request.Input.File)
-	mimeType := getMimeTypeForAudioType(utils.DetectAudioMimeType(request.Input.File))
-	if mimeType == "audio/wav" {
-		return nil, fmt.Errorf("fal-ai provider does not support audio/wav format; please use a different format like mp3 or ogg")
-	}
-	encoded = fmt.Sprintf("data:%s;base64,%s", mimeType, encoded)
-	hfRequest = &HuggingFaceTranscriptionRequest{
-		AudioURL: encoded,
-	}
+	encoded := base64.StdEncoding.EncodeToString(request.Input.File)
+	mimeType := getMimeTypeForAudioType(utils.DetectAudioMimeType(request.Input.File))
+	if mimeType == "audio/wav" {
+		return nil, fmt.Errorf("fal-ai provider does not support audio/wav format; please use a different format like mp3 or ogg")
+	}
+	encoded = fmt.Sprintf("data:%s;base64,%s", mimeType, encoded)
+	hfRequest = &HuggingFaceTranscriptionRequest{
+		AudioURL: encoded,
+		Model:    schemas.Ptr(modelName),
+		Provider: schemas.Ptr(string(inferenceProvider)),
+	}
```
🧹 Nitpick comments (7)
ui/README.md (1)
11-17: Address prior feedback on manual feature maintenance in README.A past review comment suggested avoiding manually maintained feature lists in favor of redirecting entirely to docs (as done in the main README). While these changes improve the README with external links, the Key Features section still maintains an inline list of 7 features. Consider consolidating this to a single link to the full feature documentation, keeping only the most critical items or removing the list entirely in favor of a pointer to the complete feature guide.
This aligns with the spirit of avoiding lists that can become outdated and keeping the README concise.
For reference, a more consolidated approach might be:
```markdown
### Key Features

Bifrost UI provides comprehensive AI infrastructure management. [View all features →](https://docs.getbifrost.ai/features)
```
1214-1241: Simplify condition by removing redundant check.

The condition at line 1216 contains a redundant `hasContent` check inside the second disjunct. Since the outer `hasContent` already short-circuits when true, the inner `|| hasContent` in `(hasReasoning || hasContent)` is unnecessary.

🔎 Apply this diff to simplify the condition:

```diff
-	// Emit text delta - at least one is required for lifecycle validation
-	// Even for reasoning-only responses, we emit an empty delta on the first chunk
-	if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent)) {
+	// Emit text delta - at least one is required for lifecycle validation
+	// Even for reasoning-only responses, we emit an empty delta on the first chunk
+	if hasContent || (!state.TextItemHasContent && hasReasoning) {
```

docs/contributing/adding-a-provider.mdx (2)
1713-1714: Minor formatting issue: Missing newline before code block.

Line 1714 ends with `}` but there's no blank line before the next section. This could cause rendering issues in some Markdown processors.

```diff
 	return nil, fmt.Errorf("unsupported provider: %s", targetProviderKey)
 }
+
```
1999-2002: Minor: Hyphenate "Tool-calling" for grammatical consistency.

Per LanguageTool and consistency with "OpenAI-compatible" elsewhere in the doc:

```diff
-**Tool calling tests fail**:
+**Tool-calling tests fail**:
```

core/internal/testutil/transcription.go (1)
73-97: Consider extracting fixture loading into a helper function.

The fixture loading pattern is duplicated across 5 locations with nearly identical code:

```go
_, filename, _, _ := runtime.Caller(0)
dir := filepath.Dir(filename)
filePath := filepath.Join(dir, "scenarios", "media", fmt.Sprintf("%s.mp3", tc.name))
fileContent, err := os.ReadFile(filePath)
if err != nil {
	t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
}
```

🔎 Consider extracting to a helper:

```go
// loadAudioFixture loads a pre-generated audio fixture for testing
func loadAudioFixture(t *testing.T, fixtureName string) []byte {
	t.Helper()
	_, filename, _, _ := runtime.Caller(1)
	dir := filepath.Dir(filename)
	filePath := filepath.Join(dir, "scenarios", "media", fmt.Sprintf("%s.mp3", fixtureName))
	content, err := os.ReadFile(filePath)
	if err != nil {
		t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
	}
	return content
}
```

Also applies to: 261-278, 369-386, 463-480, 561-578
core/providers/huggingface/huggingface.go (2)
369-417: Consider labeling aggregated raw list‑models responses by inference provider instead of index.

Right now combined raw responses are stored under keys like `"provider_0"`, `"provider_1"`. If you ever inspect this blob, mapping back to which entry came from which inference provider is indirect.

If you decide to iterate on observability later, consider storing them keyed by the concrete inference provider identifier (e.g., `"hf-inference"`, `"fal-ai"`) instead of just the ordinal index.
49-59: Chat response pooling is only partially used; you may simplify or expand it.

`acquireHuggingFaceChatResponse` pulls from a pool that you pre‑warm in `NewHuggingFaceProvider`, but `ChatCompletion` never returns these objects to the pool (only `Responses` does via `defer releaseHuggingFaceChatResponse`). That means the pre‑warmed chat pool doesn't meaningfully reduce allocations for the main chat path.

Not urgent, but for clarity/perf you could either:

- Drop pooling for chat responses and allocate directly in `ChatCompletion`, or
- Ensure returned chat responses are eventually recycled back into this pool if you add a release point in the higher layers.
Also applies to: 97-103, 494-529
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- `core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/Technical_Terms.mp3` is excluded by `!**/*.mp3`
- `ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (36)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `core/bifrost.go` (2 hunks)
- `core/internal/testutil/account.go` (4 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/internal/testutil/transcription.go` (6 hunks)
- `core/providers/gemini/speech.go` (1 hunks)
- `core/providers/gemini/transcription.go` (2 hunks)
- `core/providers/gemini/utils.go` (0 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/responses.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/providers/openai/openai.go` (1 hunks)
- `core/providers/utils/audio.go` (1 hunks)
- `core/providers/utils/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (3 hunks)
- `core/schemas/transcriptions.go` (1 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/docs.json` (1 hunks)
- `docs/features/providers/huggingface.mdx` (1 hunks)
- `docs/features/providers/providers-unified-interface.mdx` (2 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (6 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (17)
- core/providers/gemini/transcription.go
- docs/features/providers/providers-unified-interface.mdx
- core/providers/huggingface/models.go
- docs/apis/openapi.json
- core/schemas/transcriptions.go
- core/schemas/account.go
- core/providers/huggingface/chat.go
- .github/workflows/pr-tests.yml
- core/providers/huggingface/speech.go
- ui/lib/constants/config.ts
- core/providers/utils/audio.go
- ui/lib/constants/logs.ts
- core/internal/testutil/responses_stream.go
- transports/config.schema.json
- core/providers/huggingface/utils.go
- core/providers/huggingface/responses.go
- docs/docs.json
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `core/providers/gemini/speech.go`
- `core/providers/huggingface/embedding.go`
- `core/internal/testutil/transcription.go`
- `ui/lib/constants/icons.tsx`
- `core/providers/openai/openai.go`
- `core/schemas/bifrost.go`
- `core/schemas/mux.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/providers/utils/utils.go`
- `core/bifrost.go`
- `core/internal/testutil/account.go`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/huggingface_test.go`
- `core/providers/huggingface/huggingface.go`
- `ui/README.md`
- `docs/features/providers/huggingface.mdx`
- `core/providers/huggingface/types.go`
🧠 Learnings (8)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- `core/providers/gemini/speech.go`
- `core/providers/huggingface/embedding.go`
- `core/internal/testutil/transcription.go`
- `core/providers/openai/openai.go`
- `core/schemas/bifrost.go`
- `core/schemas/mux.go`
- `core/providers/utils/utils.go`
- `core/bifrost.go`
- `core/internal/testutil/account.go`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/huggingface_test.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
- `core/providers/huggingface/embedding.go`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/huggingface_test.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/openai.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/openai.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.
Applied to files:
core/providers/huggingface/huggingface_test.go
📚 Learning: 2025-12-15T10:06:05.395Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: docs/features/providers/huggingface.mdx:39-61
Timestamp: 2025-12-15T10:06:05.395Z
Learning: For fal-ai transcription requests routed through HuggingFace in Bifrost, WAV (audio/wav) is not supported and should be rejected. Only MP3 format is supported. Update the documentation and any related examples to reflect MP3 as the required input format for HuggingFace-based transcription, and note WAV should not be used. This applies specifically to the HuggingFace provider integration in this repository.
Applied to files:
docs/features/providers/huggingface.mdx
📚 Learning: 2025-12-09T17:08:21.123Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: docs/features/providers/huggingface.mdx:171-195
Timestamp: 2025-12-09T17:08:21.123Z
Learning: In docs/features/providers/huggingface.mdx, use the official Hugging Face naming conventions for provider identifiers in the capabilities table (e.g., ovhcloud-ai-endpoints, z-ai). Do not map to SDK identifiers like ovhcloud or zai-org; this aligns with Hugging Face's public docs and improves consistency for readers.
Applied to files:
docs/features/providers/huggingface.mdx
🧬 Code graph analysis (9)
core/providers/gemini/speech.go (1)
core/providers/utils/audio.go (1)
- `DetectAudioMimeType` (78-119)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
- `BifrostEmbeddingRequest` (9-16)
- `BifrostEmbeddingResponse` (22-28)
- `EmbeddingData` (118-122)
- `EmbeddingStruct` (124-128)

core/providers/huggingface/types.go (2)

- `HuggingFaceEmbeddingRequest` (161-172)
- `InputsCustomType` (211-214)

core/schemas/chatcompletions.go (1)

- `BifrostLLMUsage` (845-852)
core/internal/testutil/transcription.go (4)
core/schemas/transcriptions.go (3)
- `BifrostTranscriptionRequest` (3-10)
- `TranscriptionInput` (28-30)
- `TranscriptionParameters` (32-49)

core/internal/testutil/utils.go (1)

- `GetProviderVoice` (39-87)

core/schemas/speech.go (4)

- `BifrostSpeechRequest` (9-16)
- `SpeechParameters` (43-58)
- `SpeechVoiceInput` (65-68)
- `BifrostSpeechResponse` (22-29)

core/internal/testutil/test_retry_framework.go (5)

- `GetTestRetryConfigForScenario` (1118-1150)
- `TestRetryContext` (168-173)
- `SpeechRetryConfig` (216-223)
- `SpeechRetryCondition` (144-147)
- `WithSpeechTestRetry` (1328-1478)
core/providers/openai/openai.go (2)
core/schemas/chatcompletions.go (1)
- `ChatStreamResponseChoice` (783-785)

ui/lib/types/logs.ts (1)

- `ReasoningDetails` (127-134)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (3)
- `Cerebras` (47-47)
- `Perplexity` (46-46)
- `HuggingFace` (51-51)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
- `HuggingFace` (51-51)

core/providers/huggingface/huggingface.go (1)

- `NewHuggingFaceProvider` (86-120)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (1)
- `HuggingFace` (51-51)

core/schemas/account.go (1)

- `Key` (8-20)

core/schemas/provider.go (5)

- `ProviderConfig` (267-276)
- `NetworkConfig` (48-56)
- `DefaultRequestTimeoutInSeconds` (15-15)
- `ConcurrencyAndBufferSize` (131-134)
- `Provider` (314-361)

core/internal/testutil/cross_provider_scenarios.go (1)

- `ProviderConfig` (45-53)
core/providers/huggingface/huggingface_test.go (5)
core/internal/testutil/setup.go (1)
- `SetupTest` (51-60)

core/internal/testutil/account.go (2)

- `ComprehensiveTestConfig` (60-82)
- `TestScenarios` (22-57)

core/schemas/bifrost.go (2)

- `HuggingFace` (51-51)
- `Fallback` (149-152)

core/schemas/models.go (1)

- `Model` (109-129)

core/internal/testutil/tests.go (1)

- `RunAllComprehensiveTests` (15-78)
core/providers/huggingface/types.go (1)
core/schemas/chatcompletions.go (3)
- `ChatStreamOptions` (237-240)
- `ChatTool` (252-257)
- `ChatToolChoiceFunction` (444-446)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (29)
core/providers/gemini/speech.go (2)
8-8: LGTM: Import added to support shared utility.

The import of the `utils` package correctly supports the refactored MIME type detection on line 175.
175-175: LGTM: Good refactoring to consolidate MIME type detection.

The change to use `utils.DetectAudioMimeType` consolidates audio MIME type detection logic into a shared utility, reducing code duplication across providers. The utility function (in `core/providers/utils/audio.go`) provides comprehensive format detection for WAV, MP3, AAC, AIFF, FLAC, and OGG with a sensible fallback.

ui/README.md (4)
12-12: Verify the provider count is accurate.

Line 12 references "15+ AI providers" in the link text. Please confirm this count reflects the addition of the HuggingFace provider in this PR and remains current.
53-53: Architecture documentation aligns well with provider changes.

The addition of "Redux Toolkit with RTK Query" in the Technology Stack is well-timed with the expanded provider system (HuggingFace) and provides clear context for developers integrating with the backend.
166-169: "Adding New Features" section properly updated for RTK Query.

The steps now correctly reflect RTK Query-based API state management and React hooks for local state, aligning with the backend provider architecture changes in this PR.
133-155: RTK Query example is clear and practical.

The code example demonstrates `useGetLogsQuery` and `useCreateProviderMutation` with proper error handling via `getErrorMessage`. This is a helpful reference for developers and aligns with the new provider integration patterns introduced in this PR.

core/schemas/mux.go (2)
1155-1160: LGTM: Reasoning-only response support implemented correctly.

The introduction of `hasContent` and `hasReasoning` guards effectively handles models that emit reasoning without visible text. The outer condition `hasContent || (hasReasoning && !state.TextItemAdded)` ensures a text item is created on the first chunk for reasoning-only responses, while subsequent reasoning chunks correctly skip text item creation and are handled separately (lines 1382-1393).
1409-1457: LGTM: Text item closure correctly handles reasoning-only responses.

The modification to close text items "regardless of whether it has content" (line 1410) is essential for reasoning-only responses. Previously, items with no visible content deltas would remain unclosed, violating lifecycle expectations. This ensures proper cleanup and response completion even when only reasoning/thought content is emitted.
docs/contributing/adding-a-provider.mdx (1)
7-13: LGTM! Clear quick reference for contributors.

The note providing quick references to existing implementations (OpenAI-compatible: cerebras/groq, Custom API: huggingface/anthropic) is helpful for new contributors to understand patterns quickly.
core/internal/testutil/account.go (4)
114-114: LGTM! HuggingFace correctly added to configured providers.

The provider is appropriately placed in the list alongside other cloud providers.
327-334: LGTM! HuggingFace key configuration is consistent.

The API key is correctly sourced from the `HUGGING_FACE_API_KEY` environment variable. The absence of `UseForBatchAPI` is appropriate since HuggingFace doesn't support the batch API.
589-601: LGTM! Network configuration is well-tuned for HuggingFace.

The 300-second timeout appropriately handles cold starts for serverless inference endpoints, and the retry configuration (10 retries, 2s-30s backoff) matches other cloud providers for consistent resilience.
1020-1053: LGTM! Comprehensive test scenario configuration for HuggingFace.

The test configuration appropriately:

- Uses inference provider routing format for models (e.g., `groq/openai/gpt-oss-120b`)
- Enables supported features (chat, streaming, tool calls, vision, embeddings, transcription, speech)
- Disables unsupported features (text completion, multiple tool calls, streaming transcription/speech)
- Includes OpenAI fallback for resilience

The `MultipleToolCalls: false` while `ToolCalls: true` suggests HuggingFace supports single tool calls but not parallel tool calling, which is a reasonable limitation.

core/internal/testutil/transcription.go (2)
73-97: LGTM! Fixture-based approach for Fal-AI format incompatibility is sound.

The comment clearly explains the technical limitation: Fal-AI speech models return WAV format, but transcription requires MP3. Using pre-generated fixtures is a pragmatic workaround. Error handling with `t.Fatalf` ensures tests fail fast if fixtures are missing.
277-277: Blank identifier for second return value is correct.

Based on learnings, `GenerateTTSAudioForTest` returns `([]byte, string)` and handles errors internally via `t.Fatalf()`, so the blank identifier is appropriate here.

core/providers/huggingface/types.go (5)
32-52: LGTM! Flexible UnmarshalJSON handles API response variations.

The implementation correctly handles both the array format `[...]` (current API) and the object format `{"models": [...]}` (potential legacy/alternate format), providing backwards compatibility.
99-130: LGTM! HuggingFaceToolChoice correctly implements flexible enum/object pattern.

The implementation properly handles the `tool_choice` field, which can be either an enum string (`"auto"`, `"none"`, `"required"`) or a function object. Reusing `schemas.ChatToolChoiceFunction` for the function sub-object maintains consistency with the core schemas.
211-254: LGTM! InputsCustomType handles HuggingFace's flexible input formats.

The implementation correctly supports:
- String input (single text)
- Array input (multiple texts)
- Object input (fallback to struct fields)
Both MarshalJSON and UnmarshalJSON are symmetric and handle all cases appropriately.
347-364: LGTM! UnmarshalJSON now properly returns error on invalid input.

The implementation correctly returns a descriptive error when the input is neither a boolean nor a string, addressing the previous review comment about failing fast on invalid data.
174-209: LGTM! MarshalJSON correctly dereferences pointer fields.

The implementation properly uses `*r.Provider`, `*r.Model`, etc. instead of wrapping pointers again, addressing the previous double-pointer issue.

.github/workflows/release-pipeline.yml (1)
90-121: HUGGING_FACE_API_KEY wiring into release jobs looks consistent.

All four release jobs now receive `HUGGING_FACE_API_KEY` from `secrets.HUGGING_FACE_API_KEY`, matching the rest of the repo's naming and usage. No issues from a workflow or secret‑handling perspective.

Also applies to: 165-195, 242-271, 327-360
core/schemas/bifrost.go (1)
35-52: HuggingFace provider registration is coherent with existing enums and lists.

Adding `HuggingFace` to `ModelProvider`, `SupportedBaseProviders`, and `StandardProviders` is consistent with how other providers are declared and exposed; nothing else appears missing here.

Also applies to: 55-63, 66-85
core/providers/utils/utils.go (1)
1045-1056: Extending non‑`[DONE]` behavior to HuggingFace is reasonable.

Treating `schemas.HuggingFace` as a provider that ends streams on `finish_reason` instead of `[DONE]` aligns with Cerebras/Perplexity handling and integrates cleanly with existing streaming logic.

core/providers/openai/openai.go (1)
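The two stream-termination conventions can be contrasted with a small sketch. The helper and chunk shape below are illustrative assumptions (field names mirror the OpenAI-style chunk format), not the actual parsing in `core/providers/utils`:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// streamDone reports whether an SSE data payload terminates the stream.
// Some providers send a literal "[DONE]" sentinel; others signal the end
// by setting finish_reason on the final chunk instead.
func streamDone(payload string, endsOnFinishReason bool) bool {
	if strings.TrimSpace(payload) == "[DONE]" {
		return true
	}
	if !endsOnFinishReason {
		return false
	}
	var chunk struct {
		Choices []struct {
			FinishReason *string `json:"finish_reason"`
		} `json:"choices"`
	}
	if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
		return false
	}
	return len(chunk.Choices) > 0 && chunk.Choices[0].FinishReason != nil
}

func main() {
	fmt.Println(streamDone(`[DONE]`, false))                                // true
	fmt.Println(streamDone(`{"choices":[{"finish_reason":"stop"}]}`, true)) // true
	fmt.Println(streamDone(`{"choices":[{"finish_reason":null}]}`, true))   // false
}
```

Keying the behavior on a per-provider flag, as Bifrost does for Cerebras, Perplexity, and HuggingFace, avoids waiting for a sentinel that will never arrive.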
1047-1054: Streaming now correctly emits reasoning‑only chat deltas.

Including `Delta.Reasoning` and `Delta.ReasoningDetails` in the emission predicate ensures reasoning‑only chunks are no longer silently skipped while preserving existing content/audio/tool‑call behavior. Consider adding/expanding a ChatCompletionStream test that feeds a reasoning‑only SSE delta to lock this in.

core/providers/huggingface/huggingface_test.go (1)
12-75: Comprehensive HuggingFace test configuration is well‑scoped.

Env‑gated test setup, use of `testutil.ComprehensiveTestConfig` with realistic models, and selectively enabled scenarios all align with existing provider tests and should give good coverage without over‑reaching unsupported flows.

core/bifrost.go (1)
17-39: HuggingFace provider wiring into the factory looks consistent.

Import and `createBaseProvider` case for `schemas.HuggingFace` match the existing provider pattern (construct and return the concrete provider, no extra side effects). No issues from the core wiring side.

Also applies to: 1854-1893
core/providers/huggingface/embedding.go (1)
10-69: Embedding request/response conversions look robust and align with HuggingFace shapes.

The request converter correctly distinguishes `hf-inference` vs other providers (using `Inputs` vs `Input`) and surfaces HF‑specific params from `ExtraParams`. The response unmarshal path defensively handles object, 2D, and 1D array formats and normalizes everything into `BifrostEmbeddingResponse` with sane defaults. No issues from this implementation.

Also applies to: 71-159
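As a rough illustration of the multi-shape normalization described here — the function name, object key, and types below are assumptions for the sketch, not the actual code in `embedding.go`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingVectors normalizes three possible response shapes
// (2D array, 1D array, wrapped object) into a list of vectors.
func embeddingVectors(data []byte) ([][]float32, error) {
	// 2D array: [[...],[...]]
	var twoD [][]float32
	if err := json.Unmarshal(data, &twoD); err == nil {
		return twoD, nil
	}
	// 1D array: [...] -> wrap as a single vector
	var oneD []float32
	if err := json.Unmarshal(data, &oneD); err == nil {
		return [][]float32{oneD}, nil
	}
	// Object form: {"embeddings": [[...]]} (key name is illustrative)
	var obj struct {
		Embeddings [][]float32 `json:"embeddings"`
	}
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, fmt.Errorf("unrecognized embedding response: %w", err)
	}
	return obj.Embeddings, nil
}

func main() {
	v1, _ := embeddingVectors([]byte(`[0.1,0.2]`))
	v2, _ := embeddingVectors([]byte(`[[0.1],[0.2]]`))
	v3, _ := embeddingVectors([]byte(`{"embeddings":[[0.3]]}`))
	fmt.Println(len(v1), len(v2), len(v3)) // prints "1 2 1"
}
```

Ordering the attempts from most to least structured keeps ambiguous inputs (an empty array, say) resolving to the richest shape first.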
core/providers/huggingface/huggingface.go (2)
132-211: Core HTTP / retry plumbing for HuggingFace provider looks solid.
`completeRequestWithRetry` + `completeRequest` cleanly separate 404/model‑alias retry from generic HTTP/error handling, and the decoded body is safely copied before response release. Combined with `listModelsByKey`'s fan‑out aggregation and latency tracking, this gives you a good base for all operations. No changes needed here.

Also applies to: 236-271, 369-419
916-999: Embedding, Speech, and Transcription entrypoints are consistent with the shared helpers.

These methods correctly:

- Guard operations via `CheckOperationAllowed`.
- Use `splitIntoModelProvider` once per call and feed `completeRequestWithRetry` with the appropriate `requiredTask`.
- Convert to/from HuggingFace types via the dedicated converters and wrap errors with `NewBifrostOperationError`.
- Attach provider/model/request‑type/latency and optional raw request/response into `ExtraFields`.

Aside from the separate falAI transcription concern called out in transcription.go, this surface looks coherent and aligned with the rest of Bifrost's provider implementations.
Also applies to: 1001-1081, 1087-1173
4c33aab to bb7d1a9
Actionable comments posted: 2
♻️ Duplicate comments (1)
core/providers/huggingface/transcription.go (1)
24-47: fal-ai transcription request missing required `Model` and `Provider` fields.

The fal-ai branch (lines 44-46) only sets `AudioURL` but doesn't include `Model` and `Provider` fields, which the fal-ai API requires according to past review analysis. The non-fal-ai branch correctly sets these fields (lines 32-36).

🔎 Apply this diff to add the missing fields:

```diff
 	hfRequest = &HuggingFaceTranscriptionRequest{
 		AudioURL: encoded,
+		Model:    schemas.Ptr(modelName),
+		Provider: schemas.Ptr(string(inferenceProvider)),
 	}
```
🧹 Nitpick comments (7)
core/internal/testutil/responses_stream.go (1)
693-694: LGTM! Threshold increase accommodates more verbose streaming responses.

Increasing the safety guard from 100 to 300 is appropriate for the lifecycle test, which validates the complete sequence of streaming events and may legitimately produce more chunks, especially with the new HuggingFace provider.
Optional observation: Other streaming tests have lower thresholds (line 394: 100 chunks, line 527: 150 chunks). If you observe similar threshold issues with those tests for verbose providers, consider adjusting them as well. However, the current values may be intentionally tuned to each test's expected complexity.
core/schemas/mux.go (1)
1214-1241: Simplify the emission condition and verify empty delta handling.

The delta emission logic correctly handles reasoning-only responses by emitting an empty delta on the first chunk. However, the condition at line 1216 contains a redundant term that can be simplified for clarity.
🔎 Simplify the condition at line 1216
The condition `hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent))` can be simplified because `hasContent` appears in both the outer OR and the inner expression:

```diff
- if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent)) {
+ if hasContent || (!state.TextItemHasContent && hasReasoning) {
```

This simplification makes the intent clearer: emit a delta if there's content, OR if this is the first delta and there's reasoning (for reasoning-only responses).
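The equivalence can be confirmed mechanically by checking all eight combinations of the three booleans (variable names shortened here for the standalone check):

```go
package main

import "fmt"

func main() {
	// original:   c || (!t && (r || c))
	// simplified: c || (!t && r)
	// where c = hasContent, r = hasReasoning, t = state.TextItemHasContent
	for i := 0; i < 8; i++ {
		c, r, t := i&1 == 1, i&2 == 2, i&4 == 4
		orig := c || (!t && (r || c))
		simp := c || (!t && r)
		if orig != simp {
			panic("conditions differ")
		}
	}
	fmt.Println("equivalent for all 8 input combinations")
}
```

Intuitively, when `c` is true the outer disjunct already fires, so the inner `|| c` can never change the result; this is the absorption law `c || (x && (r || c)) == c || (x && r)`.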
Additionally, verify that downstream consumers (clients, proxies, other providers) correctly handle empty content deltas for reasoning-only responses, as this is a relatively uncommon edge case.
```shell
#!/bin/bash
# Description: Search for code that processes output_text.delta events to verify empty delta handling

# Search for delta processing logic
rg -n -C5 'output_text\.delta|OutputTextDelta|response\.Delta' --type go -g '!*_test.go' -g '!core/schemas/'
```

core/providers/utils/audio.go (1)
76-98: Consider adding an inline comment explaining the 0xF6 mask.The implementation is correct. Based on learnings, the
0xF6mask at line 96 is intentionally stricter than the standard0xF0to check both the sync word (top 4 bits = 0xF) and the Layer field bits (bits 2-1 = 00), preventing MP3 Layer III (which has Layer bits = 11) from being misidentified as AAC.🔎 Consider adding a comment to document this design decision:
```diff
  // AAC: ADIF or ADTS (0xFFF sync) - check before MP3 frame sync to avoid misclassification
  if bytes.HasPrefix(audioData, adif) {
      return "audio/aac"
  }
+ // ADTS: 0xFF followed by top 4 bits = 0xF and Layer bits (2-1) = 00
+ // Mask 0xF6 prevents MP3 Layer III (Layer bits = 11) from being misidentified as AAC
  if len(audioData) >= 2 && audioData[0] == 0xFF && (audioData[1]&0xF6) == 0xF0 {
      return "audio/aac"
  }
```

core/schemas/transcriptions.go (1)
37-40: LGTM! HuggingFace transcription parameters added correctly.

The four new optional fields (`MaxLength`, `MinLength`, `MaxNewTokens`, `MinNewTokens`) are properly typed as integer pointers with appropriate JSON tags. The inline comments indicating "used by HuggingFace" address the previous review feedback and make the provider association clear.

Optional: Consider enhancing comments with brief functional descriptions
While the current comments clearly indicate these parameters are HuggingFace-specific, adding brief functional descriptions could improve developer understanding. For example:

```diff
- MaxLength *int `json:"max_length,omitempty"` // Maximum length of the transcription used by HuggingFace
+ MaxLength *int `json:"max_length,omitempty"` // Maximum length of the generated transcription (HuggingFace)
- MinLength *int `json:"min_length,omitempty"` // Minimum length of the transcription used by HuggingFace
+ MinLength *int `json:"min_length,omitempty"` // Minimum length of the generated transcription (HuggingFace)
- MaxNewTokens *int `json:"max_new_tokens,omitempty"` // Maximum new tokens to generate used by HuggingFace
+ MaxNewTokens *int `json:"max_new_tokens,omitempty"` // Maximum new tokens to generate in transcription (HuggingFace)
- MinNewTokens *int `json:"min_new_tokens,omitempty"` // Minimum new tokens to generate used by HuggingFace
+ MinNewTokens *int `json:"min_new_tokens,omitempty"` // Minimum new tokens to generate in transcription (HuggingFace)
```

This is purely for clarity and not required.
core/providers/huggingface/utils.go (1)
75-81: Minor maintainability tweaks for provider list and mapping cache.

Two small improvements you may want to consider (non-blocking):
In `PROVIDERS_OR_POLICIES` (lines 75–81), append the `auto` constant instead of the string literal `"auto"` to keep things self‑documenting and avoid accidental divergence:

```go
out = append(out, auto)
```

`modelProviderMappingCache` currently never expires once populated in `getModelInferenceProviderMapping`. If Hugging Face changes mappings at runtime, a long‑lived process will keep using stale routes. If that's a concern for you, wrapping this in a small TTL cache (or invalidating on non‑fatal provider errors) would make the behavior more robust, while preserving the existing fast‑path.

Also applies to: 213-267
core/internal/testutil/account.go (1)
95-118: HuggingFace test wiring looks good; consider exercising Responses/Reasoning when stable.

The HuggingFace additions to `GetConfiguredProviders`, `GetKeysForProvider`, and `GetConfigForProvider` look consistent with other cloud providers, and the comprehensive test config covers chat, streaming, vision, embeddings, transcription, and speech.

Given the provider also implements the Responses API, you might eventually want to:

- Set a `ReasoningModel` for HuggingFace, and
- Flip `Reasoning` to `true` in the HuggingFace `TestScenarios`

so Responses/Reasoning flows get the same coverage as other providers once you're confident in that path.
Also applies to: 327-335, 589-602, 1020-1053
core/providers/huggingface/types.go (1)
234-234: Optional: Consider removing the unused `Extra` field.

The `Extra` field in `HuggingFaceSpeechRequest` has the `json:"-"` tag, meaning it's not serialized. If this field isn't being used elsewhere in the codebase, removing it would improve code clarity. If it's reserved for future use, adding a brief comment explaining its purpose would be helpful.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- `core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/Technical_Terms.mp3` is excluded by `!**/*.mp3`
- `ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (36)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)core/bifrost.go(2 hunks)core/internal/testutil/account.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/transcription.go(6 hunks)core/providers/gemini/speech.go(1 hunks)core/providers/gemini/transcription.go(2 hunks)core/providers/gemini/utils.go(0 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/responses.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/providers/openai/openai.go(1 hunks)core/providers/utils/audio.go(1 hunks)core/providers/utils/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(3 hunks)core/schemas/transcriptions.go(1 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/docs.json(1 hunks)docs/features/providers/huggingface.mdx(1 hunks)docs/features/providers/providers-unified-interface.mdx(2 hunks)transports/config.schema.json(2 hunks)ui/README.md(6 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (15)
- core/providers/gemini/speech.go
- core/schemas/bifrost.go
- core/bifrost.go
- docs/apis/openapi.json
- docs/docs.json
- ui/lib/constants/logs.ts
- core/providers/gemini/transcription.go
- docs/features/providers/huggingface.mdx
- core/providers/huggingface/huggingface_test.go
- core/providers/huggingface/chat.go
- core/schemas/account.go
- ui/lib/constants/config.ts
- core/providers/openai/openai.go
- core/providers/huggingface/responses.go
- core/providers/huggingface/speech.go
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `core/providers/utils/utils.go`
- `core/providers/utils/audio.go`
- `core/schemas/transcriptions.go`
- `transports/config.schema.json`
- `ui/lib/constants/icons.tsx`
- `docs/features/providers/providers-unified-interface.mdx`
- `core/internal/testutil/responses_stream.go`
- `core/internal/testutil/transcription.go`
- `core/schemas/mux.go`
- `core/providers/huggingface/models.go`
- `core/providers/huggingface/transcription.go`
- `ui/README.md`
- `core/providers/huggingface/embedding.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/providers/huggingface/utils.go`
- `core/internal/testutil/account.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
🧠 Learnings (5)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- `core/providers/utils/utils.go`
- `core/providers/utils/audio.go`
- `core/schemas/transcriptions.go`
- `core/internal/testutil/responses_stream.go`
- `core/internal/testutil/transcription.go`
- `core/schemas/mux.go`
- `core/providers/huggingface/models.go`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/embedding.go`
- `core/providers/huggingface/utils.go`
- `core/internal/testutil/account.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
📚 Learning: 2025-09-01T15:29:17.076Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 372
File: core/providers/utils.go:552-598
Timestamp: 2025-09-01T15:29:17.076Z
Learning: The detectAudioMimeType function in core/providers/utils.go is specifically designed for Gemini's audio format support, which only includes: WAV (audio/wav), MP3 (audio/mp3), AIFF (audio/aiff), AAC (audio/aac), OGG Vorbis (audio/ogg), and FLAC (audio/flac). The implementation should remain focused on these specific formats rather than being over-engineered for general-purpose audio detection.
Applied to files:
core/providers/utils/audio.go
📚 Learning: 2025-12-10T15:15:14.041Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/audio.go:92-98
Timestamp: 2025-12-10T15:15:14.041Z
Learning: In core/providers/utils/audio.go, within DetectAudioMimeType, use a mask of 0xF6 for ADTS sync detection instead of the standard 0xF0. This stricter check validates that the top nibble is 0xF and the Layer field bits (bits 2-1) are 00, preventing MP3 Layer III (Layer bits 11) from being misidentified as AAC. Ensure unit tests cover this behavior and document the rationale in code comments.
Applied to files:
core/providers/utils/audio.go
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
- `core/providers/huggingface/models.go`
- `core/providers/huggingface/transcription.go`
- `core/providers/huggingface/embedding.go`
- `core/providers/huggingface/utils.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
🧬 Code graph analysis (7)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (3)
`Cerebras` (47-47), `Perplexity` (46-46), `HuggingFace` (51-51)
core/internal/testutil/transcription.go (5)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest(3-10)TranscriptionInput(28-30)TranscriptionParameters(32-49)core/internal/testutil/utils.go (2)
GetProviderVoice(39-87)GetErrorMessage(642-675)core/schemas/speech.go (4)
BifrostSpeechRequest(9-16)SpeechParameters(43-58)SpeechVoiceInput(65-68)BifrostSpeechResponse(22-29)core/internal/testutil/test_retry_framework.go (5)
GetTestRetryConfigForScenario(1118-1150)TestRetryContext(168-173)SpeechRetryConfig(216-223)SpeechRetryCondition(144-147)WithSpeechTestRetry(1328-1478)core/internal/testutil/validation_presets.go (1)
SpeechExpectations(146-162)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1440-1479)ResponsesStreamResponseTypeOutputTextDelta(1388-1388)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
HuggingFaceListModelsResponse(28-30)core/schemas/bifrost.go (9)
ModelProvider(32-32)RequestType(88-88)ChatCompletionRequest(94-94)ChatCompletionStreamRequest(95-95)ResponsesRequest(96-96)ResponsesStreamRequest(97-97)EmbeddingRequest(98-98)SpeechRequest(99-99)TranscriptionRequest(101-101)core/schemas/models.go (2)
BifrostListModelsResponse(36-45)Model(109-129)
core/providers/huggingface/transcription.go (5)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)TranscriptionSegment(87-98)core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest(258-264)HuggingFaceTranscriptionRequestParameters(267-270)HuggingFaceTranscriptionGenerationParameters(273-290)HuggingFaceTranscriptionEarlyStopping(294-297)HuggingFaceTranscriptionResponse(330-333)core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)core/schemas/utils.go (1)
SafeExtractIntPointer(486-494)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(394-405)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (4)
core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest(11-106)core/providers/huggingface/embedding.go (2)
ToHuggingFaceEmbeddingRequest(11-78)UnmarshalHuggingFaceEmbeddingResponse(82-168)core/providers/huggingface/speech.go (1)
ToHuggingFaceSpeechRequest(9-96)
core/providers/huggingface/types.go (1)
core/schemas/chatcompletions.go (3)
ChatStreamOptions(237-240)ChatTool(252-257)ChatToolChoiceFunction(444-446)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (31)
core/schemas/mux.go (3)
1155-1159: LGTM! Correct gating logic for reasoning-only responses.

The introduction of `hasContent` and `hasReasoning` flags with the conditional `hasContent || (hasReasoning && !state.TextItemAdded)` correctly handles both content-only and reasoning-only streaming responses. The text item is created on the first chunk when either content or reasoning is present.
1410-1411: LGTM! Correct lifecycle handling for reasoning-only responses.

The change to close the text item regardless of whether it has content is necessary to support reasoning-only responses. This ensures the proper lifecycle sequence (added → deltas → done) is completed even when only empty deltas are emitted.
1155-1241: Well-structured solution for reasoning-only streaming responses.

The changes cohesively address the requirement to support reasoning-only models in the Chat-to-Responses streaming conversion:

- Gating flags (`hasContent`, `hasReasoning`) cleanly separate concerns
- First-chunk semantics correctly create text items for reasoning-only responses
- Empty delta emission satisfies lifecycle requirements (at least one delta per item)
- Item closure updated to work regardless of content presence

The state tracking (`TextItemAdded`, `TextItemHasContent`, `TextItemClosed`) properly coordinates the lifecycle events.

Also applies to: 1410-1411
.github/workflows/pr-tests.yml (1)
118-118: LGTM!

The `HUGGING_FACE_API_KEY` environment variable is correctly added following the established pattern for other provider API keys. This enables CI tests to run the new HuggingFace provider test suite.

.github/workflows/release-pipeline.yml (4)
118-118: LGTM!

The `HUGGING_FACE_API_KEY` is correctly propagated to the `core-release` job, following the established pattern for other provider API keys.

193-193: LGTM!

Consistent propagation of `HUGGING_FACE_API_KEY` to the `framework-release` job.

270-270: LGTM!

Consistent propagation of `HUGGING_FACE_API_KEY` to the `plugins-release` job.

359-359: LGTM!

Consistent propagation of `HUGGING_FACE_API_KEY` to the `bifrost-http-release` job.

core/providers/utils/audio.go (2)
64-74: LGTM!

The header constants are well-organized and use idiomatic Go byte slices for efficient prefix matching.
99-119: LGTM!

The remaining format detection logic is correct:
- AIFF/AIFC detection properly handles both variants
- FLAC and OGG detection use correct magic bytes
- MP3 frame sync detection covers common MPEG audio variants (0xFB, 0xF3, 0xF2, 0xFA)
- The ordering (AAC before MP3 frame sync) prevents misclassification
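The detection order in the bullet points above can be condensed into a small standalone sketch (an illustration of the technique, not the repo's actual `DetectAudioMimeType`):

```go
package main

import (
	"bytes"
	"fmt"
)

// detectAudioMime mirrors the ordering discussed above: container magics
// first, then the stricter ADTS/AAC check, then MP3 frame sync, with
// audio/mp3 as the fallback.
func detectAudioMime(b []byte) string {
	switch {
	case len(b) >= 12 && bytes.HasPrefix(b, []byte("RIFF")) && bytes.Equal(b[8:12], []byte("WAVE")):
		return "audio/wav"
	case bytes.HasPrefix(b, []byte("fLaC")):
		return "audio/flac"
	case bytes.HasPrefix(b, []byte("OggS")):
		return "audio/ogg"
	// ADTS: mask 0xF6 requires Layer bits (2-1) = 00, so MP3 Layer III
	// (Layer bits = 11) cannot match here.
	case len(b) >= 2 && b[0] == 0xFF && (b[1]&0xF6) == 0xF0:
		return "audio/aac"
	case len(b) >= 2 && b[0] == 0xFF && (b[1] == 0xFB || b[1] == 0xF3 || b[1] == 0xF2 || b[1] == 0xFA):
		return "audio/mp3"
	}
	return "audio/mp3" // fallback for unrecognized input
}

func main() {
	fmt.Println(detectAudioMime([]byte{0xFF, 0xF1, 0x50})) // audio/aac (ADTS)
	fmt.Println(detectAudioMime([]byte{0xFF, 0xFB, 0x90})) // audio/mp3 (Layer III)
	fmt.Println(detectAudioMime([]byte("fLaC....")))       // audio/flac
}
```

Note how `0xFB & 0xF6 = 0xF2`, so an MP3 Layer III frame falls past the AAC case and lands in the MP3 branch, which is exactly the misclassification the mask prevents.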
The fallback to `audio/mp3` is a reasonable default for unrecognized formats.

core/providers/utils/utils.go (1)
1048-1057: Verify HuggingFace Inference API stream termination behavior.

Adding `schemas.HuggingFace` to providers that don't send `[DONE]` markers follows an established pattern for non-standard streaming APIs. The comment correctly lists all three providers.

However, verification is needed: confirm that HuggingFace's Inference API actually terminates streams via `finish_reason` rather than `[DONE]` markers, as this behavior is not explicitly documented in public HuggingFace API references. Consider adding test coverage or inline documentation referencing the specific API behavior that justifies this classification.

docs/contributing/adding-a-provider.mdx (1)
424-527: HuggingFace chat converter example is now consistent and accurate.

The `ToHuggingFaceChatCompletionRequest` snippet uses `hfReq` consistently and mirrors the actual converter pattern well (messages, content blocks, tool calls, params). It's a solid reference for new providers.

ui/README.md (1)
3-18: UI README accurately reflects the current architecture and usage.

The updated description (Next.js + RTK Query/Redux, websocket logging, provider/MCP/plugin docs links) aligns with the rest of the PR and centralizes details in docs instead of duplicating them. Looks good.
Also applies to: 46-57, 129-155
transports/config.schema.json (1)
96-151: Schema wiring for HuggingFace provider and semantic cache looks consistent.

Adding `"huggingface"` under `providers` with the base `provider` schema and including it in the semantic cache `provider` enum cleanly aligns config with the new core provider. No issues from a schema/interop perspective.

Also applies to: 813-833
core/internal/testutil/transcription.go (2)
73-97: LGTM on the Fal-AI/HuggingFace fixture handling path.

The conditional handling correctly addresses the format incompatibility between Fal-AI models (which only return WAV) and the test requirements (which need MP3). The error handling with `t.Fatalf` ensures tests fail fast when fixtures are missing.
98-178: TTS generation and transcription request construction looks good.

The retry framework integration, temp file management with cleanup, and transcription request construction follow established patterns in the codebase.
core/providers/huggingface/models.go (2)
16-44: Model conversion logic is well-structured.

The function correctly:
- Handles nil response
- Filters models without IDs or supported methods
- Constructs consistent composite model IDs
- Pre-allocates slice capacity for efficiency
11-14: Constants are actively used in utils.go for enforcing model fetch limits—no action needed.

The constants `defaultModelFetchLimit` and `maxModelFetchLimit` are referenced in `core/providers/huggingface/utils.go` (lines 90, 92–93) to enforce minimum and maximum bounds on the model fetch limit parameter.

core/providers/huggingface/embedding.go (2)
11-78: Request conversion logic is well-implemented.

The function correctly:

- Handles nil input gracefully
- Differentiates between `hfInference` (using `Inputs`) and other providers (using `Input`)
- Maps standard parameters and provider-specific `ExtraParams`
- Uses the `InputsCustomType` wrapper for flexible input handling
80-168: Response unmarshalling handles multiple HuggingFace response formats gracefully.

The function correctly handles the three known response shapes:

- Standard object with `data`, `model`, `usage` fields
- 2D array `[[float64...], ...]` for batch embeddings
- 1D array `[float64...]` for single embeddings

The float64→float32 conversion is appropriate for embedding storage efficiency.
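The three shapes can be handled with a fall-through unmarshal like the following sketch (illustrative only — the provider uses sonic and its own types, while this stand-in uses `encoding/json`):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseEmbeddings tries the object form first, then the 2D array,
// then the 1D array, mirroring the fall-through strategy above.
func parseEmbeddings(raw []byte) ([][]float32, error) {
	var obj struct {
		Data []struct {
			Embedding []float32 `json:"embedding"`
		} `json:"data"`
	}
	if err := json.Unmarshal(raw, &obj); err == nil && len(obj.Data) > 0 {
		out := make([][]float32, 0, len(obj.Data))
		for _, d := range obj.Data {
			out = append(out, d.Embedding)
		}
		return out, nil
	}
	var arr2D [][]float32
	if err := json.Unmarshal(raw, &arr2D); err == nil {
		return arr2D, nil
	}
	var arr1D []float32
	if err := json.Unmarshal(raw, &arr1D); err == nil {
		return [][]float32{arr1D}, nil
	}
	return nil, fmt.Errorf("unrecognized embedding response shape")
}

func main() {
	batch, _ := parseEmbeddings([]byte(`[[0.1, 0.2], [0.3, 0.4]]`))
	single, _ := parseEmbeddings([]byte(`[0.5, 0.6]`))
	fmt.Println(len(batch), len(single)) // 2 1
}
```

The ordering matters: unmarshalling a top-level array into the object struct fails cleanly, and a 1D array of numbers fails the `[][]float32` decode, so each shape lands in exactly one branch.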
core/providers/huggingface/transcription.go (1)
121-160: Response conversion is well-structured.

The function correctly:
- Validates non-nil response and non-empty model
- Maps chunks to transcription segments with proper timestamp handling
- Sets appropriate `ExtraFields` for provider tracking

core/providers/huggingface/huggingface.go (9)
19-118: Provider struct and initialization are well-designed.

Good practices observed:
- Sync pools with pre-warming for reduced GC pressure
- Proper timeout configuration from network config
- Thread-safe model mapping cache using `sync.Map`
- Proxy configuration support
131-207: Cache-first retry pattern is appropriate.

The design correctly prioritizes cache hits (most common case) over immediate validation, only paying the cost of re-validation on 404 (cache miss). This is an intentional optimization per prior discussion.
209-269: Request execution with proper resource management.

The function correctly:
- Sets appropriate Content-Type based on request type
- Handles error responses with guarded message overwrites
- Copies response body before releasing fasthttp resources to prevent use-after-free
271-436: Concurrent model listing implementation is robust.

Good patterns:
- Parallel fetching with proper synchronization (WaitGroup + channel)
- Graceful handling of partial failures (returns first error only if all fail)
- Average latency calculation across successful requests
- Correct provider constant usage (past issue addressed)
446-583: Chat completion implementation follows established patterns.

Both sync and streaming paths correctly:
- Handle model parsing errors with proper error responses
- Transform model names to HuggingFace format
- Populate `ExtraFields` for diagnostics
- Leverage OpenAI-compatible streaming for efficiency
585-615: Responses API correctly delegates to ChatCompletion.

The implementation properly converts Responses requests to ChatCompletion format and updates the response's `RequestType` to maintain accurate metadata.
617-700: Embedding implementation is well-structured.

The function correctly:
- Uses the model alias cache for retry on 404
- Handles custom response unmarshalling for HuggingFace's varied response formats
- Tracks raw request/response when enabled
702-875: Speech and Transcription implementations are complete.

Both functions correctly:
- Handle provider-specific request formats (hf-inference uses raw audio)
- Include proper input validation (nil checks addressed)
- Use model alias cache for retry logic
- Download audio from URLs when needed (Speech)
877-930: Unsupported operation stubs are consistent.

All batch and file operations correctly return `NewUnsupportedOperationError` with appropriate request type identifiers.

core/providers/huggingface/types.go (1)
1-342: Excellent implementation of comprehensive HuggingFace provider types.

The file is well-structured with clear separation of concerns across models, chat, embeddings, speech, and transcription types. The custom JSON marshaling/unmarshaling logic correctly handles multiple input formats and edge cases. Previous feedback has been properly addressed:

- `HuggingFaceListModelsResponse.UnmarshalJSON` (lines 34-52) now handles both array and object responses
- `HuggingFaceTranscriptionEarlyStopping.UnmarshalJSON` (line 326) returns proper errors for invalid input
- Field naming improved (`ProviderModelID` instead of `ProviderModelMapping`)

The use of sonic for JSON operations and the flexible type handling (e.g., `InputsCustomType`, `HuggingFaceToolChoice`) demonstrate solid design for API integration.
b5d2c41 to
338f1a0
Compare
Actionable comments posted: 2
♻️ Duplicate comments (2)
core/providers/huggingface/transcription.go (1)
37-47: fal-ai branch missing required `Model` and `Provider` fields.

The fal-ai branch constructs the request with only `AudioURL` but omits `Model` and `Provider` fields. According to the HuggingFace inference providers documentation, these fields are required for fal-ai API calls.
```diff
  } else {
      encoded := base64.StdEncoding.EncodeToString(request.Input.File)
      mimeType := getMimeTypeForAudioType(utils.DetectAudioMimeType(request.Input.File))
      if mimeType == "audio/wav" {
          return nil, fmt.Errorf("fal-ai provider does not support audio/wav format; please use a different format like mp3 or ogg")
      }
      encoded = fmt.Sprintf("data:%s;base64,%s", mimeType, encoded)
      hfRequest = &HuggingFaceTranscriptionRequest{
          AudioURL: encoded,
+         Model:    schemas.Ptr(modelName),
+         Provider: schemas.Ptr(string(inferenceProvider)),
      }
  }
```

core/providers/huggingface/types.go (1)
229-235: The `Extra` field in `HuggingFaceSpeechRequest` remains unaddressed.

Line 234: The `Extra` field is still present with type `map[string]any` and the `json:"-"` tag. A previous review comment recommended either removing this field if unused, or changing it to `json.RawMessage` for consistency with similar patterns in the codebase. While the `Extra` fields were removed from `HuggingFaceSpeechParameters` and `HuggingFaceSpeechResponse`, this one remains.
🧹 Nitpick comments (5)
docs/contributing/adding-a-provider.mdx (1)
2000-2002: Minor: Use hyphen in compound adjective "Tool-calling".

For grammatical consistency with other compound adjectives in the document (e.g., "OpenAI-compatible"), consider hyphenating "Tool-calling" when used as a compound adjective modifying "tests".
Suggested fix
```diff
-**Tool calling tests fail**:
+**Tool-calling tests fail**:
```
1155-1241: LGTM! Well-designed support for reasoning-only responses.

The introduction of `hasContent` and `hasReasoning` flags with the updated gating logic correctly handles reasoning-only model responses by:
- Creating text items even without content when reasoning is present
- Emitting an empty delta on the first chunk to satisfy lifecycle validation requirements
- Prioritizing content over empty deltas when both are present
The implementation ensures proper event sequencing for the Responses API streaming format.
💡 Optional simplification of condition on line 1216
The condition on line 1216 contains a redundant `hasContent` check:

```go
if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent))
```

Since `hasContent` appears in the outer OR, when evaluating the second part `hasContent` must be false, making the inner `|| hasContent` redundant. The condition simplifies to:

```diff
-if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent)) {
+if hasContent || (!state.TextItemHasContent && hasReasoning) {
```

This is purely for readability and does not affect correctness.
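The claimed equivalence is easy to check exhaustively over all eight input combinations:

```go
package main

import "fmt"

func main() {
	// Exhaustively verify that
	//   hasContent || (!textItemHasContent && (hasReasoning || hasContent))
	// is equivalent to
	//   hasContent || (!textItemHasContent && hasReasoning)
	for _, hasContent := range []bool{false, true} {
		for _, hasReasoning := range []bool{false, true} {
			for _, textItemHasContent := range []bool{false, true} {
				original := hasContent || (!textItemHasContent && (hasReasoning || hasContent))
				simplified := hasContent || (!textItemHasContent && hasReasoning)
				if original != simplified {
					panic("conditions differ")
				}
			}
		}
	}
	fmt.Println("all 8 cases agree")
}
```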
core/internal/testutil/transcription.go (1)
261-278: Consider extracting a helper for repeated fixture loading pattern.

The Fal-AI fixture loading pattern is duplicated across multiple test functions (here, lines 369-386, 463-480, 561-578). Consider extracting a helper:
```go
func loadAudioFixture(t *testing.T, fixtureName string) []byte {
    _, filename, _, _ := runtime.Caller(1)
    dir := filepath.Dir(filename)
    filePath := filepath.Join(dir, "scenarios", "media", fixtureName+".mp3")
    data, err := os.ReadFile(filePath)
    if err != nil {
        t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
    }
    return data
}
```

This is optional since the current implementation works correctly and is test code.
core/providers/huggingface/embedding.go (1)
117-142: Consider edge case handling for empty array responses.

If the HuggingFace API returns an empty array `[]`, the 2D array unmarshal at line 119 would succeed with `len(arr2D) == 0`, resulting in a response with zero embeddings. This may be intentional, but ensure callers handle empty `Data` arrays gracefully.

core/providers/huggingface/responses.go (1)
66-84: Consider returning an error when `ToBifrostResponsesResponse()` returns nil.

If `resp.ToBifrostResponsesResponse()` at line 76 returns nil (conversion failure), the function silently returns `nil, nil`, which could mask errors. Consider returning an explicit error to help diagnose conversion issues.

🔎 Suggested improvement

```diff
 responsesResp := resp.ToBifrostResponsesResponse()
-if responsesResp != nil {
-    responsesResp.ExtraFields.Provider = schemas.HuggingFace
-    responsesResp.ExtraFields.ModelRequested = requestedModel
-    responsesResp.ExtraFields.RequestType = schemas.ResponsesRequest
+if responsesResp == nil {
+    return nil, fmt.Errorf("failed to convert chat response to responses response")
 }
+responsesResp.ExtraFields.Provider = schemas.HuggingFace
+responsesResp.ExtraFields.ModelRequested = requestedModel
+responsesResp.ExtraFields.RequestType = schemas.ResponsesRequest
 return responsesResp, nil
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- `core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/Technical_Terms.mp3` is excluded by `!**/*.mp3`
- `ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (38)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)core/bifrost.go(2 hunks)core/changelog.md(1 hunks)core/internal/testutil/account.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/transcription.go(6 hunks)core/providers/gemini/speech.go(1 hunks)core/providers/gemini/transcription.go(2 hunks)core/providers/gemini/utils.go(0 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/responses.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/providers/openai/openai.go(1 hunks)core/providers/utils/audio.go(1 hunks)core/providers/utils/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(3 hunks)core/schemas/transcriptions.go(1 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/docs.json(1 hunks)docs/features/providers/huggingface.mdx(1 hunks)docs/features/providers/providers-unified-interface.mdx(2 hunks)transports/changelog.md(1 hunks)transports/config.schema.json(2 hunks)ui/README.md(6 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
✅ Files skipped from review due to trivial changes (3)
- transports/changelog.md
- core/providers/huggingface/utils.go
- core/changelog.md
🚧 Files skipped from review as they are similar to previous changes (18)
- docs/features/providers/providers-unified-interface.mdx
- .github/workflows/release-pipeline.yml
- core/schemas/transcriptions.go
- transports/config.schema.json
- core/providers/utils/audio.go
- core/internal/testutil/responses_stream.go
- ui/lib/constants/logs.ts
- core/providers/openai/openai.go
- docs/features/providers/huggingface.mdx
- core/providers/huggingface/speech.go
- docs/apis/openapi.json
- core/providers/utils/utils.go
- core/bifrost.go
- ui/README.md
- .github/workflows/pr-tests.yml
- core/providers/gemini/transcription.go
- core/providers/huggingface/huggingface_test.go
- docs/docs.json
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/account.go, core/providers/gemini/speech.go, ui/lib/constants/icons.tsx, core/providers/huggingface/chat.go, core/providers/huggingface/responses.go, core/internal/testutil/transcription.go, core/schemas/bifrost.go, ui/lib/constants/config.ts, core/providers/huggingface/transcription.go, core/schemas/mux.go, core/providers/huggingface/models.go, core/internal/testutil/account.go, core/providers/huggingface/embedding.go, docs/contributing/adding-a-provider.mdx, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/schemas/account.go, core/providers/gemini/speech.go, core/providers/huggingface/chat.go, core/providers/huggingface/responses.go, core/internal/testutil/transcription.go, core/schemas/bifrost.go, core/providers/huggingface/transcription.go, core/schemas/mux.go, core/providers/huggingface/models.go, core/internal/testutil/account.go, core/providers/huggingface/embedding.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
core/providers/huggingface/chat.go, core/providers/huggingface/responses.go, core/providers/huggingface/transcription.go, core/providers/huggingface/models.go, core/providers/huggingface/embedding.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
🧬 Code graph analysis (10)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
AzureKeyConfig (23-27), VertexKeyConfig (36-42), BedrockKeyConfig (63-71)
core/providers/gemini/speech.go (1)
core/providers/utils/audio.go (1)
DetectAudioMimeType (78-119)
core/providers/huggingface/chat.go (2)
core/schemas/chatcompletions.go (5)
BifrostChatRequest (12-19), ChatStreamOptions (237-240), ChatToolChoiceStruct (390-395), ChatToolChoiceTypeFunction (382-382), ChatToolChoiceFunction (444-446)
core/providers/huggingface/types.go (6)
HuggingFaceChatRequest (76-94), HuggingFaceResponseFormat (132-135), HuggingFaceToolChoice (99-104), EnumStringTypeAuto (109-109), EnumStringTypeNone (110-110), EnumStringTypeRequired (111-111)
core/internal/testutil/transcription.go (5)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest (3-10), TranscriptionInput (28-30), TranscriptionParameters (32-49)
core/schemas/bifrost.go (3)
HuggingFace (51-51), BifrostError (465-474), SpeechRequest (99-99)
core/internal/testutil/utils.go (4)
GetProviderVoice (39-87), GenerateTTSAudioForTest (568-640), TTSTestTextBasic (20-20), TTSTestTextMedium (23-23)
core/schemas/speech.go (4)
BifrostSpeechRequest (9-16), SpeechParameters (43-58), SpeechVoiceInput (65-68), BifrostSpeechResponse (22-29)
core/internal/testutil/test_retry_framework.go (5)
GetTestRetryConfigForScenario (1118-1150), TestRetryContext (168-173), SpeechRetryConfig (216-223), SpeechRetryCondition (144-147), WithSpeechTestRetry (1328-1478)
core/schemas/bifrost.go (1)
ui/lib/types/config.ts (1)
ModelProvider (197-200)
core/providers/huggingface/transcription.go (6)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest (3-10), BifrostTranscriptionResponse (16-26), TranscriptionSegment (87-98)
core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest (258-264), HuggingFaceTranscriptionRequestParameters (267-270), HuggingFaceTranscriptionGenerationParameters (273-290), HuggingFaceTranscriptionEarlyStopping (294-297), HuggingFaceTranscriptionResponse (330-333)
core/schemas/models.go (1)
Model (109-129)
core/providers/utils/audio.go (1)
DetectAudioMimeType (78-119)
core/schemas/utils.go (1)
SafeExtractIntPointer (486-494)
core/schemas/bifrost.go (2)
BifrostResponseExtraFields (394-405), HuggingFace (51-51)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse (1440-1479), ResponsesStreamResponseTypeOutputTextDelta (1388-1388)
core/schemas/utils.go (1)
Ptr (16-18)
core/internal/testutil/account.go (3)
core/schemas/bifrost.go (2)
HuggingFace (51-51), Fallback (149-152)
core/schemas/provider.go (4)
ProviderConfig (267-276), NetworkConfig (48-56), ConcurrencyAndBufferSize (131-134), Provider (314-361)
core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig (45-53)
core/providers/huggingface/embedding.go (4)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest (9-16), BifrostEmbeddingResponse (22-28), EmbeddingData (118-122), EmbeddingStruct (124-128)
core/providers/huggingface/types.go (3)
HuggingFaceEmbeddingRequest (161-172), InputsCustomType (174-177), EncodingType (219-219)
core/schemas/models.go (1)
Model (109-129)
core/schemas/chatcompletions.go (1)
BifrostLLMUsage (845-852)
core/providers/huggingface/types.go (1)
core/schemas/chatcompletions.go (3)
ChatStreamOptions (237-240), ChatTool (252-257), ChatToolChoiceFunction (444-446)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (46)
core/providers/gemini/speech.go (1)
175-175: LGTM! Clean refactor to shared audio MIME detection utility.
Moving to `utils.DetectAudioMimeType` consolidates audio MIME type detection logic, reducing code duplication across providers and improving maintainability. The shared utility handles multiple audio formats with proper fallback logic.
docs/contributing/adding-a-provider.mdx (4)
7-13: Well-structured quick reference for contributors.
The introduction provides clear guidance on where to find production-ready implementations for both OpenAI-compatible and custom API providers. This helps contributors quickly locate relevant examples based on their integration needs.
64-69: Clear file creation order guidance.
Marking the file creation order as "CRITICAL" with explicit numbered steps helps prevent common mistakes where contributors might create files in the wrong order, leading to circular dependencies or incomplete implementations.
1770-1786: Comprehensive UI integration checklist.
The checklist covers all necessary files and locations for UI integration, making it easy for contributors to verify they haven't missed any required updates. This will reduce incomplete PRs and review cycles.
2033-2069: Thorough pre-submission checklist.
The final checklist consolidates all requirements across provider implementation, tests, schema, UI, CI/CD, and documentation. This comprehensive list helps ensure complete and high-quality provider contributions.
core/schemas/mux.go (1)
1410-1411: LGTM! Correct lifecycle management for reasoning-only responses.
Removing the `state.TextItemHasContent` check from the text item closure condition is the right approach. This ensures that text items are properly closed even for reasoning-only responses where the text item is created but may only contain an empty content delta.
This change maintains correct event sequencing and lifecycle transitions in the Responses API streaming format.
ui/lib/constants/config.ts (2)
61-61: LGTM!
Correctly marks HuggingFace as requiring an API key, consistent with the PR objectives that introduce `HUGGING_FACE_API_KEY`. The alphabetical ordering is also correct.
94-124: Add HuggingFace to BaseProvider type and PROVIDER_SUPPORTED_REQUESTS mapping.
HuggingFace is defined as a `ProviderName` in the constants but is missing from the `BaseProvider` type definition, which prevents it from being included in the `PROVIDER_SUPPORTED_REQUESTS` mapping. Since HuggingFace appears configured in the codebase (with sample models and icons), it should be added to both:
- the `BaseProvider` type in `ui/lib/types/config.ts` (currently: `"openai" | "anthropic" | "cohere" | "gemini" | "bedrock"`)
- `PROVIDER_SUPPORTED_REQUESTS` in `ui/lib/constants/config.ts` (with supported request types: `list_models`, `chat_completion`, `chat_completion_stream`, `text_completion`, `text_completion_stream`, `responses`, `responses_stream`, `embedding`)
core/schemas/bifrost.go (3)
35-53: LGTM! HuggingFace provider constant is properly defined.
The `HuggingFace` constant follows the established naming convention and string value pattern used by other providers.
56-63: HuggingFace correctly added to SupportedBaseProviders.
This enables custom providers to use HuggingFace as their base provider type.
66-85: HuggingFace correctly added to StandardProviders.
The provider is now included in the complete list of built-in providers, enabling it for standard provider operations.
core/schemas/account.go (2)
9-20: LGTM! HuggingFaceKeyConfig field properly integrated into Key struct.
The field follows the established pattern of other provider-specific configurations (Azure, Vertex, Bedrock) with appropriate optional semantics via pointer type and `omitempty` JSON tag.
70-73: HuggingFaceKeyConfig struct properly defined.
The `Deployments` map follows the same pattern as other provider key configurations, enabling model-to-deployment name mappings for future HuggingFace inference endpoint integration. Based on learnings, this is correctly reserved for future use.
core/internal/testutil/account.go (4)
114-114: HuggingFace properly added to configured providers list.
Consistent with other provider entries in the list.
327-334: HuggingFace key configuration looks good.
The environment variable `HUGGING_FACE_API_KEY` is consistently used. Note that `UseForBatchAPI` is intentionally omitted since HuggingFace doesn't support batch API operations (as confirmed by the `AllProviderConfigs` scenarios).
589-601: HuggingFace provider configuration is well-tuned.
The 300-second timeout accommodates HuggingFace model cold starts, and the retry configuration (10 retries with 2s-30s backoff) aligns with other variable/cloud providers.
1020-1053: Comprehensive HuggingFace test configuration.
The test config properly defines:
- Provider-specific model identifiers (using HuggingFace's routing format)
- Accurate scenario flags reflecting HuggingFace capabilities (e.g., `MultipleToolCalls: false`, streaming transcription/speech disabled)
- Appropriate fallbacks to OpenAI
The configuration enables thorough testing of the HuggingFace provider integration.
core/providers/huggingface/chat.go (5)
11-14: Appropriate nil guard for request conversion.
Returning `(nil, nil)` for nil or empty input is a reasonable convention that allows callers to handle this case gracefully.
17-52: Parameter mapping looks correct.
The mappings from Bifrost parameters to HuggingFace parameters are straightforward and preserve optional semantics via pointer checks.
54-66: ResponseFormat conversion now properly handles errors.
The JSON marshal/unmarshal approach for format conversion returns meaningful errors, addressing the previous review concern about silent error swallowing.
68-73: Verify if `IncludeObfuscation` should also be forwarded.
Only `IncludeUsage` is copied from `params.StreamOptions`, but `schemas.ChatStreamOptions` also has an `IncludeObfuscation` field. If the HuggingFace API supports this option, consider forwarding it as well:

```diff
 if params.StreamOptions != nil {
 	hfReq.StreamOptions = &schemas.ChatStreamOptions{
-		IncludeUsage: params.StreamOptions.IncludeUsage,
+		IncludeUsage:       params.StreamOptions.IncludeUsage,
+		IncludeObfuscation: params.StreamOptions.IncludeObfuscation,
 	}
 }
```

If HuggingFace doesn't support obfuscation, the current behavior is correct.
77-102: ToolChoice handling covers primary use cases.
The implementation correctly handles:
- String enum values (`auto`, `none`, `required`)
- Function-based tool choice via `ChatToolChoiceTypeFunction`
- Guard to prevent setting invalid/empty ToolChoice
Note: `ChatToolChoiceStruct` also supports `Custom` and `AllowedTools` types which aren't handled, but these may not be applicable to HuggingFace's API.
core/internal/testutil/transcription.go (5)
73-97: Fixture-based testing for Fal-AI/HuggingFace properly implemented.
The code correctly:
- Detects Fal-AI models via prefix check
- Uses `runtime.Caller` to locate fixture files relative to source
- Handles file read errors with `t.Fatalf` (addressing previous review concern)
- Constructs proper transcription request with mp3 format
98-178: TTS generation path for non-Fal-AI providers looks good.
The else branch maintains the existing TTS generation flow with proper retry configuration and error handling.
369-386: Fixture loading for AllResponseFormats test is consistent.
Same pattern as other Fal-AI branches, correctly using the `RoundTrip_Basic_MP3.mp3` fixture.
463-480: Fixture loading for WithCustomParameters test is consistent.
Uses the `RoundTrip_Medium_MP3.mp3` fixture appropriately for the medium-length text test.
561-578: Fixture loading for MultipleLanguages test is consistent.
Correctly reuses the `RoundTrip_Basic_MP3.mp3` fixture for language testing.
core/providers/huggingface/models.go (3)
16-44: LGTM!
The model conversion logic is well-structured with proper nil checks, empty model filtering, and correct field mapping. The `HuggingFaceID` correctly uses `model.ID` (the original identifier) while `Name` uses `model.ModelID`.
46-102: LGTM!
The method derivation logic correctly maps HuggingFace pipeline types and tags to Bifrost request types. The deduplication via map and deterministic sorting ensures consistent output.
11-14: No action needed. The `maxModelFetchLimit` constant is used in `core/providers/huggingface/utils.go` at lines 92-93 for limit validation and should be retained.
Likely an incorrect or invalid review comment.
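The dedupe-then-sort pattern praised in the models.go comment above is a standard Go idiom; a minimal standalone version (function name hypothetical, not the provider's actual code):

```go
package main

import (
	"fmt"
	"sort"
)

// dedupeSorted removes duplicates via a set, then sorts so the
// output is deterministic regardless of input order.
func dedupeSorted(items []string) []string {
	seen := make(map[string]struct{}, len(items))
	out := make([]string, 0, len(items))
	for _, it := range items {
		if _, ok := seen[it]; ok {
			continue // already collected
		}
		seen[it] = struct{}{}
		out = append(out, it)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(dedupeSorted([]string{"chat", "embedding", "chat", "speech"}))
	// → [chat embedding speech]
}
```

Without the final sort, map-driven dedup would yield nondeterministic ordering across runs, which is why the review calls out the sorting step specifically.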
core/providers/huggingface/embedding.go (1)
11-78: LGTM!
The request conversion correctly handles provider-specific field mapping (`Inputs` for `hf-inference`, `Input` for others), properly extracts model/provider info, and maps both standard and HuggingFace-specific parameters from `ExtraParams`.
core/providers/huggingface/responses.go (2)
13-32: LGTM!
The conversion chain (ResponsesRequest → ChatRequest → HuggingFaceChatRequest) includes proper nil checks at each step, addressing the previous concern about nil pointer dereference.
34-62: LGTM!
The function correctly delegates to `CheckContextAndGetRequestBody`, includes the defensive nil check for `hfReq`, and properly configures streaming options when `isStreaming` is true.
core/providers/huggingface/transcription.go (2)
49-116: LGTM!
The parameter mapping correctly addresses the previous gating issue by mapping typed parameters (`MaxNewTokens`, `MaxLength`, etc.) outside the `ExtraParams` block. Generation parameters are properly initialized and populated from both schema-level fields and `ExtraParams`.
121-160: LGTM!
The response conversion correctly handles nil responses, validates required model name, maps text content, and safely extracts timestamp data from chunks with proper bounds checking.
core/providers/huggingface/huggingface.go (8)
29-81: LGTM!
Response pool management is well-implemented with proper struct reset on acquire and nil-safe release functions. Pre-warming the pools based on concurrency configuration is a good optimization.
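The acquire/release discipline described here (reset on acquire, nil-safe release) looks roughly like the following sketch; the `response` struct and function names are illustrative stand-ins, not the provider's actual pooled types:

```go
package main

import (
	"fmt"
	"sync"
)

// response is a stand-in for a pooled response struct.
type response struct {
	Text   string
	Chunks []string
}

var responsePool = sync.Pool{
	New: func() any { return &response{} },
}

// acquireResponse resets fields on the way out of the pool so no
// data leaks between requests, even if a caller forgot to clear it.
func acquireResponse() *response {
	r := responsePool.Get().(*response)
	r.Text = ""
	r.Chunks = r.Chunks[:0] // keep capacity, drop contents
	return r
}

// releaseResponse is nil-safe so callers can defer it unconditionally.
func releaseResponse(r *response) {
	if r == nil {
		return
	}
	responsePool.Put(r)
}

func main() {
	r := acquireResponse()
	r.Text = "hello"
	releaseResponse(r)

	r2 := acquireResponse()
	fmt.Printf("%q %d\n", r2.Text, len(r2.Chunks)) // → "" 0
}
```

Resetting on acquire (rather than on release) guarantees a clean struct even for objects freshly created by `New`.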
209-269: LGTM!
The `completeRequest` method correctly handles audio content-type detection, error response parsing with proper message preservation (addressing past feedback), body decoding, and safe buffer copying to avoid use-after-free issues.
271-417: LGTM!
The concurrent model fetching implementation is well-structured with proper goroutine management, WaitGroup synchronization, and result aggregation. Error handling correctly returns the first error when all requests fail, and latency averaging is safely guarded.
585-615: LGTM!
The Responses implementation correctly delegates to ChatCompletion/ChatCompletionStream while properly overriding ExtraFields to reflect the actual request type. The streaming variant correctly sets the fallback context flag.
617-700: LGTM!
The embedding implementation correctly uses the model alias cache for request execution, handles multiple response formats via `UnmarshalHuggingFaceEmbeddingResponse`, and properly populates all ExtraFields including raw request/response when enabled.
702-782: LGTM!
The Speech implementation correctly manages the response pool with `defer releaseHuggingFaceSpeechResponse`, properly passes context to `downloadAudioFromURL`, and handles audio data conversion. ExtraFields are comprehensively populated.
788-875: LGTM!
The Transcription implementation correctly handles the hf-inference special case with proper input validation, manages the response pool with deferred release, and populates ExtraFields appropriately. The nil check for `request.Input` addresses the previous nil dereference concern.
882-930: LGTM!
All unsupported operations correctly return `NewUnsupportedOperationError` with the appropriate request type, maintaining consistency across the provider interface.
core/providers/huggingface/types.go (3)
73-143: Chat types correctly structured for HuggingFace API.
The chat-related types appropriately reuse Bifrost schemas where possible (`ChatMessage`, `ChatStreamOptions`, `ChatTool`, `ChatToolChoiceFunction`) while defining HuggingFace-specific variants like `HuggingFaceToolChoice` where the API contract differs. The custom `MarshalJSON` for `HuggingFaceToolChoice` correctly handles the union type (enum string or function object).
157-224: Embedding types correctly handle flexible input formats.
The `InputsCustomType` with custom JSON marshaling properly supports the HuggingFace API's flexible input field that accepts either a single string, an array of strings, or an object. The unmarshal logic tries formats in a sensible order (string → array → object) and the marshal logic appropriately prioritizes array over single text.
256-342: Transcription types are well-structured and error handling is correct.
The transcription-related types properly handle the HuggingFace ASR API, including:
- Flexible audio input (raw bytes or URL)
- Comprehensive generation parameters
- Union type for `EarlyStopping` (bool or string "never") with symmetric marshaling
- Proper error handling in `UnmarshalJSON` (line 326 correctly returns an error for invalid input)
The type aliases at the end provide convenient access without code duplication.
70fd6d0 to 3ea5fc8 (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (1)
ui/lib/constants/config.ts (1)
40-40: Fix the HuggingFace model identifier in the placeholder example.
The model identifier `nebius/Qwen/Qwen3-Embedding-8B` uses an incorrect format. The "nebius/" prefix indicates a Nebius inference service identifier, not a standard HuggingFace model format. Standard HuggingFace model identifiers follow the pattern `organization/model-name`.
🔎 Proposed fix

```diff
- huggingface: "e.g. sambanova/meta-llama/Llama-3.1-8B-Instruct, nebius/Qwen/Qwen3-Embedding-8B",
+ huggingface: "e.g. meta-llama/Llama-3.1-8B-Instruct, Qwen/Qwen2.5-72B-Instruct",
```
🧹 Nitpick comments (4)
core/providers/openai/openai.go (1)
1047-1054: LGTM! Enhanced streaming emission for reasoning fields.
The broadened condition correctly triggers chunk emission when `Delta.Reasoning` or `Delta.ReasoningDetails` are present, following existing patterns (`!= nil` for pointers like `Content`, `len() > 0` for slices like `ToolCalls`). The updated comment clearly documents the expanded behavior.
Optional: Verify test coverage
Consider verifying that test coverage includes scenarios where:
- `Delta.Reasoning` is present (non-nil, including empty string case if valid)
- `Delta.ReasoningDetails` has entries
- Both fields are present simultaneously
- These fields are present alongside other delta fields (Content, Audio, ToolCalls)
This ensures the new emission paths are exercised and behave as expected.
core/internal/testutil/transcription.go (2)
73-178: Fal‑AI-on-HuggingFace fixture path is correct; consider extracting a small helper
The conditional path for `schemas.HuggingFace` + `fal-ai/` models that loads pre-generated mp3 fixtures and wires them into `BifrostTranscriptionRequest` neatly avoids wav/mp3 mismatches and fails fast on missing files. You might optionally pull the `runtime.Caller` + `scenarios/media/<name>.mp3` resolution into a small helper to reuse in other Fal‑AI branches below and keep the test code DRY.
261-278: Repeated Fal‑AI fixture loading could be centralized for clarity
The Additional/Advanced/Language transcription tests repeat the same Fal‑AI-on-HuggingFace mp3 fixture lookup pattern (via `runtime.Caller` and `os.ReadFile` with `t.Fatalf` on failure). Extracting a shared `loadFalAIAudioFixture(t, name string) []byte` would remove repetition and make it easier to adjust fixture paths or error messages in one place. The continued use of `audioData, _ = GenerateTTSAudioForTest(...)` in non-Fal‑AI branches is correct given its `([]byte, string)` signature and internal `t.Fatalf` on error. Based on learnings, this usage is intentional.
Also applies to: 369-386, 463-480, 561-578
core/providers/huggingface/transcription.go (1)
11-118: Transcription request conversion covers validations and provider-specific nuances well
Nil/empty-input checks, provider/model splitting, fal‑ai's `audio_url` data‑URL handling with a WAV guard, and the detailed mapping of typed + ExtraParams into `GenerationParameters` all look correct and consistent with the rest of the provider. Only micro‑nit (optional): you always attach an empty `generation_parameters` object even when no fields are set; if you care about minimizing payloads you could gate assignment behind a "has any field" check.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3 is excluded by `!**/*.mp3`
- core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3 is excluded by `!**/*.mp3`
- core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3 is excluded by `!**/*.mp3`
- core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3 is excluded by `!**/*.mp3`
- core/internal/testutil/scenarios/media/Technical_Terms.mp3 is excluded by `!**/*.mp3`
- ui/package-lock.json is excluded by `!**/package-lock.json`
📒 Files selected for processing (38)
- .github/workflows/pr-tests.yml (1 hunks)
- .github/workflows/release-pipeline.yml (4 hunks)
- core/bifrost.go (2 hunks)
- core/changelog.md (1 hunks)
- core/internal/testutil/account.go (4 hunks)
- core/internal/testutil/responses_stream.go (1 hunks)
- core/internal/testutil/transcription.go (6 hunks)
- core/providers/gemini/speech.go (1 hunks)
- core/providers/gemini/transcription.go (2 hunks)
- core/providers/gemini/utils.go (0 hunks)
- core/providers/huggingface/chat.go (1 hunks)
- core/providers/huggingface/embedding.go (1 hunks)
- core/providers/huggingface/huggingface.go (1 hunks)
- core/providers/huggingface/huggingface_test.go (1 hunks)
- core/providers/huggingface/models.go (1 hunks)
- core/providers/huggingface/responses.go (1 hunks)
- core/providers/huggingface/speech.go (1 hunks)
- core/providers/huggingface/transcription.go (1 hunks)
- core/providers/huggingface/types.go (1 hunks)
- core/providers/huggingface/utils.go (1 hunks)
- core/providers/openai/openai.go (1 hunks)
- core/providers/utils/audio.go (1 hunks)
- core/providers/utils/utils.go (1 hunks)
- core/schemas/account.go (2 hunks)
- core/schemas/bifrost.go (3 hunks)
- core/schemas/mux.go (3 hunks)
- core/schemas/transcriptions.go (1 hunks)
- docs/apis/openapi.json (1 hunks)
- docs/contributing/adding-a-provider.mdx (1 hunks)
- docs/docs.json (1 hunks)
- docs/features/providers/huggingface.mdx (1 hunks)
- docs/features/providers/providers-unified-interface.mdx (2 hunks)
- transports/changelog.md (1 hunks)
- transports/config.schema.json (2 hunks)
- ui/README.md (6 hunks)
- ui/lib/constants/config.ts (2 hunks)
- ui/lib/constants/icons.tsx (1 hunks)
- ui/lib/constants/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (16)
- .github/workflows/pr-tests.yml
- core/providers/utils/audio.go
- transports/config.schema.json
- core/providers/gemini/transcription.go
- ui/lib/constants/logs.ts
- docs/features/providers/providers-unified-interface.mdx
- core/providers/huggingface/embedding.go
- core/providers/gemini/speech.go
- core/providers/huggingface/huggingface_test.go
- docs/apis/openapi.json
- core/providers/utils/utils.go
- core/changelog.md
- core/providers/huggingface/models.go
- core/providers/huggingface/speech.go
- .github/workflows/release-pipeline.yml
- docs/features/providers/huggingface.mdx
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/internal/testutil/transcription.go, core/providers/huggingface/responses.go, core/schemas/bifrost.go, core/bifrost.go, transports/changelog.md, core/schemas/account.go, core/providers/openai/openai.go, core/internal/testutil/account.go, ui/lib/constants/icons.tsx, core/schemas/mux.go, ui/README.md, core/schemas/transcriptions.go, core/providers/huggingface/chat.go, docs/docs.json, core/providers/huggingface/utils.go, core/providers/huggingface/transcription.go, docs/contributing/adding-a-provider.mdx, ui/lib/constants/config.ts, core/internal/testutil/responses_stream.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
🧠 Learnings (5)
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/internal/testutil/transcription.go, core/providers/huggingface/responses.go, core/schemas/bifrost.go, core/bifrost.go, core/schemas/account.go, core/providers/openai/openai.go, core/internal/testutil/account.go, core/schemas/mux.go, core/schemas/transcriptions.go, core/providers/huggingface/chat.go, core/providers/huggingface/utils.go, core/providers/huggingface/transcription.go, core/internal/testutil/responses_stream.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
core/providers/huggingface/responses.go, core/providers/huggingface/chat.go, core/providers/huggingface/utils.go, core/providers/huggingface/transcription.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/openai.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/openai.go
🧬 Code graph analysis (8)
core/providers/huggingface/responses.go (5)
core/schemas/responses.go (2)
BifrostResponsesRequest (32-39), BifrostResponsesResponse (45-85)
core/providers/huggingface/types.go (1)
HuggingFaceChatRequest (76-94)
core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest (11-106)
core/schemas/chatcompletions.go (1)
BifrostChatResponse (27-42)
core/schemas/bifrost.go (3)
HuggingFace (51-51), RequestType (88-88), ResponsesRequest (96-96)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace(51-51)core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(66-99)
core/providers/openai/openai.go (3)
core/schemas/chatcompletions.go (1)
ChatStreamResponseChoice(783-785)core/providers/gemini/types.go (1)
Content(977-985)ui/lib/types/logs.ts (1)
ReasoningDetails(127-134)
core/internal/testutil/account.go (4)
core/schemas/bifrost.go (2)
HuggingFace(51-51)OpenAI(35-35)core/schemas/account.go (1)
Key(8-20)core/schemas/provider.go (5)
ProviderConfig(267-276)NetworkConfig(48-56)DefaultRequestTimeoutInSeconds(15-15)ConcurrencyAndBufferSize(131-134)Provider(314-361)core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1440-1479)ResponsesStreamResponseTypeOutputTextDelta(1388-1388)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/huggingface/chat.go (3)
core/schemas/chatcompletions.go (4)
BifrostChatRequest(12-19)ChatStreamOptions(237-240)ChatToolChoiceStruct(390-395)ChatToolChoiceFunction(444-446)core/providers/huggingface/types.go (2)
HuggingFaceChatRequest(76-94)HuggingFaceToolChoice(99-104)ui/lib/types/logs.ts (1)
Function(152-157)
core/providers/huggingface/transcription.go (5)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest(3-10)BifrostTranscriptionResponse(16-26)TranscriptionSegment(87-98)core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionRequest(258-264)HuggingFaceTranscriptionRequestParameters(267-270)HuggingFaceTranscriptionGenerationParameters(273-290)HuggingFaceTranscriptionEarlyStopping(294-297)HuggingFaceTranscriptionResponse(330-333)core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)core/schemas/utils.go (1)
SafeExtractIntPointer(486-494)core/schemas/bifrost.go (2)
BifrostResponseExtraFields(394-405)HuggingFace(51-51)
core/providers/huggingface/huggingface.go (4)
core/providers/huggingface/types.go (2)
HuggingFaceTranscriptionResponse(330-333)HuggingFaceSpeechResponse(243-245)core/providers/utils/utils.go (2)
MakeRequestWithContext(41-95)NewBifrostOperationError(733-744)core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)core/providers/huggingface/embedding.go (2)
ToHuggingFaceEmbeddingRequest(11-78)UnmarshalHuggingFaceEmbeddingResponse(82-168)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (36)
transports/changelog.md (1)
1-1: LGTM! Changelog entry is clear and concise. The changelog entry accurately summarizes the HuggingFace provider addition with UI integration.
docs/contributing/adding-a-provider.mdx (1)
1-2070: LGTM! Comprehensive contributor guide for adding providers. This is an excellent, thorough guide that provides clear step-by-step instructions for adding both OpenAI-compatible and custom API providers. The structure is logical, examples are helpful, and the guide covers all necessary aspects from research to CI/CD integration.
docs/docs.json (1)
158-166: LGTM! Documentation navigation structure updated appropriately. The new Providers group consolidates provider-related documentation logically, and the HuggingFace page entry fits well within this structure.
core/schemas/mux.go (2)
1155-1241: LGTM! Proper handling of reasoning-only streaming responses. The new gating logic correctly handles edge cases where responses contain only reasoning content without text. The empty delta emission on the first chunk ensures proper lifecycle validation while maintaining backward compatibility with content-only and mixed content/reasoning responses.

1410-1456: LGTM! Text item closure logic improved. Closing the text item regardless of whether it has content properly supports reasoning-only responses while maintaining correct lifecycle sequencing. This is a good improvement over the previous content-dependent closure logic.
ui/lib/constants/config.ts (1)
61-61: LGTM! HuggingFace key requirement correctly configured. Setting `huggingface: true` appropriately indicates that an API key is required for the HuggingFace provider.

core/internal/testutil/responses_stream.go (1)
693-693: LGTM! Increased safety threshold for complex streaming scenarios. Raising the threshold from 100 to 300 is reasonable to accommodate providers that emit richer lifecycle events (such as reasoning, tool calls, and multiple content types) while still protecting against infinite loops.
core/schemas/account.go (2)
9-20: LGTM! HuggingFaceKeyConfig field added for future use. The new field follows the existing pattern for provider-specific key configurations and is properly structured with the `omitempty` tag. Based on learnings, this field is reserved for future Hugging Face inference endpoint deployments; it is currently unused and should not be flagged for missing OpenAPI documentation until the feature is actively implemented.
70-72: LGTM! HuggingFaceKeyConfig type definition follows established patterns. The new type mirrors the structure of other provider-specific configs (AzureKeyConfig, BedrockKeyConfig) with a Deployments map for model-to-deployment mappings.
core/bifrost.go (2)
26-26: Import correctly placed. The HuggingFace provider import follows the alphabetical ordering convention used by other provider imports.
1889-1890: HuggingFace case correctly integrated into provider factory. The switch case follows the established pattern for providers that don't return initialization errors. The placement before `schemas.Nebius` maintains logical grouping.

core/schemas/bifrost.go (2)
51-51: HuggingFace constant correctly defined. The constant follows the established naming convention with lowercase provider identifier string.
62-62: Provider registration is complete. HuggingFace is correctly added to both `SupportedBaseProviders` (enabling custom provider configurations based on HuggingFace) and `StandardProviders` (registering as a built-in provider).

Also applies to: 62-62, 83-83
core/internal/testutil/account.go (4)
114-114: Provider correctly added to configured providers list. HuggingFace is appropriately placed in the provider list, maintaining consistency with other provider entries.
327-334: HuggingFace key configuration is appropriate. The key retrieval uses the standardized `HUGGING_FACE_API_KEY` environment variable. The omission of `UseForBatchAPI` is intentional since the HuggingFace test configuration shows batch operations are not enabled for this provider.
589-601: HuggingFace provider configuration is well-tuned. The 300-second timeout appropriately accounts for HuggingFace model cold starts. The retry settings (10 retries, 2s-30s backoff) align with other cloud providers, and the concurrency/buffer settings match the standard pattern.
1020-1053: Comprehensive test configuration for HuggingFace is well-structured. The test configuration appropriately reflects HuggingFace capabilities:
- Models use the correct HuggingFace Inference API naming convention (provider/org/model)
- `MultipleToolCalls: false` aligns with the learning that HuggingFace streaming tool call data arrives as single delta chunks
- Stream variants for transcription and speech are correctly disabled
- Fallback to OpenAI is properly configured
core/schemas/transcriptions.go (1)
37-40: HuggingFace transcription parameters are well-documented. The new generation parameters (`MaxLength`, `MinLength`, `MaxNewTokens`, `MinNewTokens`) are correctly typed as optional pointers and include clear inline comments indicating their HuggingFace-specific purpose. This follows the pattern established by the Elevenlabs-specific fields below.
11-30: Request conversion function has proper nil guards. The function correctly implements defensive nil checks at each conversion step. The guard at lines 25-27 addresses the previous review concern about dereferencing a nil request.
34-52: Response conversion correctly enriches metadata. The function properly handles nil input and ensures model propagation. The ExtraFields enrichment with `Provider`, `ModelRequested`, and `RequestType` follows the pattern used by other providers.

Note: The function mutates the input `resp.Model` (line 41) as a side effect. This appears intentional to ensure model information is preserved in the response chain, but be aware of this behavior if the caller expects the original response to remain unmodified.

core/providers/huggingface/chat.go (1)
11-105: Chat request conversion aligns well with Bifrost schema and HF expectations. The field mapping (parameters, tools, tool_choice, response_format, stream options) is consistent and defensive, and error surfacing for `ResponseFormat` conversion is appropriate. I don't see gaps relative to the surrounding provider implementation.
52-147: Inference provider list, model hub URL, and model/provider parsing look coherent. The provider enum/set and `buildModelHubURL` correctly mirror Hugging Face's hub filters, and `splitIntoModelProvider` cleanly separates provider vs. model while explicitly rejecting ambiguous no-slash model strings. This matches the rest of the provider's routing logic.
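The split-and-reject rule described above can be sketched as follows. Note that `splitModelProvider`, its return shape, and the error text are illustrative assumptions for this review, not the repository's actual `splitIntoModelProvider` implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelProvider is a hypothetical sketch of the routing rule the review
// describes: "provider/org/model" yields (provider, "org/model"), while a
// string with no slash is rejected as ambiguous.
func splitModelProvider(s string) (provider, model string, err error) {
	idx := strings.Index(s, "/")
	if idx < 0 {
		// No slash: we cannot tell whether this is a provider or a model.
		return "", "", fmt.Errorf("ambiguous model %q: expected provider/model", s)
	}
	return s[:idx], s[idx+1:], nil
}

func main() {
	p, m, err := splitModelProvider("fal-ai/openai/whisper-large-v3")
	fmt.Println(p, m, err) // fal-ai openai/whisper-large-v3 <nil>
}
```

Only the first slash is significant, so multi-segment hub IDs like `openai/whisper-large-v3` survive intact on the model side.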
195-322: Provider mapping, validation, and audio download helpers are robust. `getModelInferenceProviderMapping`/`getValidatedProviderModelID` combine caching, precise task checks, and safe error propagation; `downloadAudioFromURL` and `getMimeTypeForAudioType` correctly use context-aware requests, status checks, and body copying to avoid use-after-free. Overall this utility layer looks solid.
3-239: UI README refresh is aligned with current architecture and docs strategy. The README now accurately describes the UI's role, tech stack, and project structure while delegating detailed feature/provider lists to the central docs site. This reduces drift and looks good for long-term maintainability.
core/providers/huggingface/transcription.go (1)
121-159: Transcription response→Bifrost mapping is straightforward and safe. The response adapter correctly sets provider/model metadata, copies `Text`, and turns `Chunks` into `TranscriptionSegment`s with guarded timestamp handling (defaulting start/end to 0 when missing). This is a clean, minimal translation.
18-99: Provider construction, client config, and pooling are well-structured. The provider struct fields, fasthttp client configuration (timeouts, connection limits), proxy wiring, base URL normalization, and pre-warming of speech/transcription response pools are all coherent and align with how other providers in this codebase are set up.
111-190: Model-alias cache helper cleanly encapsulates routing + retry behavior. `completeRequestWithModelAliasCache` cleanly separates URL construction, provider-task validation, optional embedding model rewrite, and a single 404-driven retry with cache invalidation. The control flow is clear and reuses `completeRequest` without duplicating HTTP/error logic.
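The "resolve alias from cache, then on a 404 invalidate and retry exactly once" pattern described above can be sketched roughly like this. All names here (`aliasCache`, `callWithRetry`, `errNotFound`) are hypothetical stand-ins, not the repository's actual API:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("404 not found")

// aliasCache sketches a model→alias cache with single-retry-on-404 semantics.
type aliasCache struct {
	mu    sync.Mutex
	cache map[string]string
}

// resolve returns the cached alias or performs a fresh lookup and caches it.
func (c *aliasCache) resolve(model string, lookup func(string) (string, error)) (string, error) {
	c.mu.Lock()
	alias, ok := c.cache[model]
	c.mu.Unlock()
	if ok {
		return alias, nil
	}
	alias, err := lookup(model)
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.cache[model] = alias
	c.mu.Unlock()
	return alias, nil
}

// callWithRetry performs the request with the resolved alias and, on a 404,
// invalidates the cache entry and retries exactly once with a fresh lookup.
func (c *aliasCache) callWithRetry(model string, lookup func(string) (string, error), do func(alias string) error) error {
	alias, err := c.resolve(model, lookup)
	if err != nil {
		return err
	}
	err = do(alias)
	if !errors.Is(err, errNotFound) {
		return err // success, or a non-404 error: no retry
	}
	c.mu.Lock()
	delete(c.cache, model) // 404: the cached alias is stale
	c.mu.Unlock()
	alias, err = c.resolve(model, lookup)
	if err != nil {
		return err
	}
	return do(alias)
}

func main() {
	c := &aliasCache{cache: map[string]string{"m": "stale"}}
	err := c.callWithRetry("m",
		func(string) (string, error) { return "fresh", nil },
		func(a string) error {
			if a == "stale" {
				return errNotFound
			}
			return nil
		})
	fmt.Println("retried OK:", err == nil)
}
```

The single retry bounds the worst case to two upstream calls per request while still self-healing after upstream remappings.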
190-250: Core HTTP request helper handles headers, audio content types, and errors correctly. `completeRequest` uses context-aware requests, sets JSON vs audio content types appropriately (via `DetectAudioMimeType` + `getMimeTypeForAudioType`), and preserves useful error information by layering `HuggingFaceResponseError` over `HandleProviderAPIError`. Copying the body before releasing the response avoids fasthttp buffer pitfalls.
252-398: Model listing fan-out and aggregation logic is solid and tolerant to partial failures. `listModelsByKey`'s per-provider goroutines, shared result channel, and aggregation of data/latency/raw responses are implemented carefully: errors per provider don't poison the whole result unless everything fails, and average latency plus combined raw responses are exposed only when requested.
427-596: Chat/Responses paths reuse OpenAI-compatible surface sensibly. Chat completion (sync + stream) correctly validates models via `splitIntoModelProvider`, normalizes the `model` as `<hub-id>:<provider>`, uses the shared chat converter, and leverages the OpenAI streaming helper with a custom request converter. The Responses/ResponsesStream fallback through chat is consistent with how other providers are currently wired and preserves extra fields.
598-681: Embedding path integrates request conversion, alias routing, and flexible response parsing cleanly. The Embedding method checks operation permissions, validates the model string, uses the dedicated converter, then routes via `completeRequestWithModelAliasCache` with the correct task, and finally normalizes the variety of possible HF embedding response shapes via `UnmarshalHuggingFaceEmbeddingResponse`. ExtraFields and optional raw request/response are populated consistently.
683-763: Speech flow correctly composes provider mapping, response pooling, and audio download. Text-to-speech requests go through provider/alias validation, use the speech converter, and then populate a pooled `HuggingFaceSpeechResponse`. Downloading the final audio via `downloadAudioFromURL` with context and then converting to `BifrostSpeechResponse` keeps the HTTP/audio concerns separated and maintains consistent ExtraFields and optional raw payloads.
769-861: Transcription flow handles hf-inference vs provider JSON paths and mapping consistently. The Transcription method distinguishes raw-audio `hf-inference` requests from JSON-based ones, validates presence of input audio, routes via `completeRequestWithModelAliasCache` with the correct ASR task, and uses pooled HF responses plus the dedicated converter to build `BifrostTranscriptionResponse`. ExtraFields and raw payload flags follow the same conventions as other operations.
14-72: Model and inference-provider mapping types match Hub API semantics with flexible JSON handling. `HuggingFaceModel` and `HuggingFaceListModelsResponse.UnmarshalJSON` correctly accommodate both top-level array and legacy `{"models":[...]}` forms, and the inference provider mapping structs (`HuggingFaceInferenceProviderInfo`/`Mapping`) align with providerId+task usage seen in the routing utilities.
75-217: Chat/tool-choice and embedding input types are well-shaped and JSON-friendly. `HuggingFaceChatRequest` plus `HuggingFaceToolChoice.MarshalJSON` model the dual string/object `tool_choice` contract cleanly, and `InputsCustomType` handles string/array/object inputs symmetrically via custom (un)marshaling. This matches how the chat and embedding converters populate these fields.
228-342: Speech and transcription parameter/response types, including the `early_stopping` union, are solid. Speech reuses the shared generation-parameter struct, and the transcription request/parameter types plus `HuggingFaceTranscriptionEarlyStopping`'s custom (un)marshal correctly model the "bool or string" union with proper validation. The transcription response/chunk types fit the conversion logic used in `ToBifrostTranscriptionResponse`.
Actionable comments posted: 0
🧹 Nitpick comments (7)
core/internal/testutil/transcription.go (1)
73-177: Fal‑AI/HuggingFace mp3 fixture handling looks correct; consider a small helper to DRY it up. The conditional branches for `schemas.HuggingFace` + `fal-ai/` models correctly:
- Bypass TTS and load pre-generated mp3 fixtures.
- Fail fast on `os.ReadFile` errors with clear messages.
- Keep non-HF/fal paths on the existing TTS + retry flow.
- Preserve correct transcription params (`Language`, `Format: "mp3"`, `ResponseFormat`, fallbacks).

Also, the `GenerateTTSAudioForTest(...)`; `audioData, _ = ...` usage is correct and intentional given it returns `([]byte, string)` and handles errors via `t.Fatalf()` internally. Based on learnings, there's no need to add extra error handling around it.

You might optionally extract a small helper like `loadFalAudioFixture(t, testName string) []byte` (wrapping the `runtime.Caller` + `filepath.Join` + `os.ReadFile` pattern) to avoid repeating the same 7–8 lines across all five branches, but that's purely for readability/maintainability.

Also applies to: 261-278, 369-386, 463-480, 561-578
core/schemas/mux.go (1)
1155-1160: Reasoning-only streaming handling is correct; the delta condition could be slightly simplified (optional). The new `hasContent`/`hasReasoning` gating plus:
- Creating the text item when either content is present or on the first reasoning-only chunk, and
- Always closing the text item on `finish_reason`,

gives a consistent lifecycle for both contentful and reasoning-only responses and fixes the prior gap where reasoning-only streams might never get an `output_text` item.

The `OutputTextDelta` guard `if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent))` is logically sound but a bit redundant; it could be simplified to `if hasContent || (!state.TextItemHasContent && hasReasoning)` for readability, without changing behavior. This is optional and non-blocking.

Also applies to: 1214-1241, 1410-1457
core/changelog.md (1)
1-1: Optional wording polish for readability. Purely stylistic, but you might consider changing "feat: added HuggingFace provider using Inference Provider API, support for chat(with stream also), response(with stream also), TTS and speech synthesis" to "feat: added HuggingFace provider using Inference Provider API, with support for chat (including streaming), responses (including streaming), TTS, and speech synthesis" to improve grammar and clarity. No functional impact either way.
core/providers/utils/audio.go (1)
64-119: `DetectAudioMimeType` logic matches the intended, limited format set; just be sure tests cover the edge cases. The header constants and detection flow (WAV → ID3/MP3 → AAC via ADIF/ADTS → AIFF/AIFC → FLAC → OGG → MP3 frame sync → mp3 fallback) are aligned with the documented supported formats, and the stricter ADTS mask is intentional to avoid misclassifying MP3 as AAC, per prior discussion. Based on learnings, the 0xF6 mask here is correct.
If not already in place, it’s worth having unit tests that:
- Positively detect each of WAV/MP3/AAC/AIFF/OGG/FLAC.
- Specifically cover an MP3 frame with bits that would trip a naive ADTS check, to lock in the stricter 0xF6 behavior.
Otherwise this helper looks good.
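Such a magic-byte check, including the stricter 0xF6 ADTS mask, could look roughly like this. `detectAudioMime` is a simplified stand-in for the real `DetectAudioMimeType` and omits some of its formats (AIFF/AIFC, ADIF):

```go
package main

import (
	"bytes"
	"fmt"
)

// detectAudioMime sketches header-based audio sniffing. The ADTS check uses
// the stricter 0xF6 mask: the second byte must match 1111x110, i.e. the sync
// bits are set AND the layer bits are 00, so an MP3 Layer III frame header
// (layer bits 11) is not misread as AAC.
func detectAudioMime(b []byte) string {
	switch {
	case len(b) >= 12 && bytes.Equal(b[0:4], []byte("RIFF")) && bytes.Equal(b[8:12], []byte("WAVE")):
		return "audio/wav"
	case len(b) >= 3 && bytes.Equal(b[0:3], []byte("ID3")):
		return "audio/mp3" // MP3 with ID3v2 tag
	case len(b) >= 4 && bytes.Equal(b[0:4], []byte("fLaC")):
		return "audio/flac"
	case len(b) >= 4 && bytes.Equal(b[0:4], []byte("OggS")):
		return "audio/ogg"
	case len(b) >= 2 && b[0] == 0xFF && b[1]&0xF6 == 0xF0:
		return "audio/aac" // ADTS AAC, strict mask excludes MP3
	case len(b) >= 2 && b[0] == 0xFF && b[1]&0xE0 == 0xE0:
		return "audio/mp3" // generic MPEG frame sync
	default:
		return "audio/mp3" // fallback, mirroring the flow above
	}
}

func main() {
	fmt.Println(detectAudioMime([]byte{0xFF, 0xF1, 0x00})) // ADTS AAC header
	fmt.Println(detectAudioMime([]byte{0xFF, 0xFB, 0x90})) // MP3 Layer III header
}
```

Worked check of the mask: for an ADTS header byte `0xF1`, `0xF1 & 0xF6 = 0xF0` (AAC); for an MP3 Layer III byte `0xFB`, `0xFB & 0xF6 = 0xF2 ≠ 0xF0`, so it falls through to the MPEG frame-sync case.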
core/providers/huggingface/chat.go (1)
11-14: Consider returning an error for nil input instead of `(nil, nil)`. When `bifrostReq` or `bifrostReq.Input` is nil, the function returns `(nil, nil)`. This silent nil return could mask bugs at call sites since the caller receives no error but also no usable request. Consider returning an explicit error to signal the invalid input.

🔎 Suggested improvement

```diff
 func ToHuggingFaceChatCompletionRequest(bifrostReq *schemas.BifrostChatRequest) (*HuggingFaceChatRequest, error) {
 	if bifrostReq == nil || bifrostReq.Input == nil {
-		return nil, nil
+		return nil, fmt.Errorf("bifrost chat request or input is nil")
 	}
```

docs/contributing/adding-a-provider.mdx (1)
1999-2002: Minor: Hyphenate "Tool-calling" for consistency. Static analysis flagged that "Tool calling" should be hyphenated as a compound modifier.
🔎 Suggested fix
```diff
-**Tool calling tests fail**:
+**Tool-calling tests fail**:
```
234-234: Remove or document the unused `Extra` field. The `Extra` field is tagged with `json:"-"`, which excludes it from JSON marshaling. Based on previous review discussions, this field is unused throughout the codebase. Either remove it entirely, or if it's reserved for future provider-specific metadata, document its intended purpose with a clear comment.

🔎 Proposed fix

If the field is truly unused, remove it:

```diff
 type HuggingFaceSpeechRequest struct {
 	Text       string                       `json:"text"`
 	Provider   string                       `json:"provider" validate:"required"`
 	Model      string                       `json:"model" validate:"required"`
 	Parameters *HuggingFaceSpeechParameters `json:"parameters,omitempty"`
-	Extra      map[string]any               `json:"-"`
 }
```

Or, if it's reserved for future use, document it clearly:

```diff
 type HuggingFaceSpeechRequest struct {
 	Text       string                       `json:"text"`
 	Provider   string                       `json:"provider" validate:"required"`
 	Model      string                       `json:"model" validate:"required"`
 	Parameters *HuggingFaceSpeechParameters `json:"parameters,omitempty"`
+	// Extra holds provider-specific opaque data for future extensions (not serialized)
 	Extra map[string]any `json:"-"`
 }
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- `core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/Technical_Terms.mp3` is excluded by `!**/*.mp3`
- `ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (38)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `core/bifrost.go` (2 hunks)
- `core/changelog.md` (1 hunks)
- `core/internal/testutil/account.go` (4 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/internal/testutil/transcription.go` (6 hunks)
- `core/providers/gemini/speech.go` (1 hunks)
- `core/providers/gemini/transcription.go` (2 hunks)
- `core/providers/gemini/utils.go` (0 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/responses.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/providers/openai/openai.go` (1 hunks)
- `core/providers/utils/audio.go` (1 hunks)
- `core/providers/utils/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (3 hunks)
- `core/schemas/transcriptions.go` (1 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/docs.json` (1 hunks)
- `docs/features/providers/huggingface.mdx` (1 hunks)
- `docs/features/providers/providers-unified-interface.mdx` (2 hunks)
- `transports/changelog.md` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (6 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (17)
- docs/apis/openapi.json
- core/providers/huggingface/embedding.go
- docs/features/providers/providers-unified-interface.mdx
- core/providers/huggingface/huggingface_test.go
- ui/lib/constants/logs.ts
- core/schemas/bifrost.go
- ui/README.md
- core/providers/huggingface/transcription.go
- core/providers/huggingface/models.go
- core/providers/huggingface/responses.go
- docs/docs.json
- transports/config.schema.json
- .github/workflows/pr-tests.yml
- core/schemas/account.go
- core/providers/gemini/transcription.go
- core/providers/huggingface/speech.go
- ui/lib/constants/config.ts
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- `core/providers/utils/utils.go`
- `core/providers/huggingface/chat.go`
- `core/bifrost.go`
- `docs/features/providers/huggingface.mdx`
- `core/internal/testutil/responses_stream.go`
- `core/providers/openai/openai.go`
- `core/schemas/mux.go`
- `core/providers/gemini/speech.go`
- `core/providers/utils/audio.go`
- `core/schemas/transcriptions.go`
- `docs/contributing/adding-a-provider.mdx`
- `core/internal/testutil/account.go`
- `core/internal/testutil/transcription.go`
- `core/providers/huggingface/utils.go`
- `transports/changelog.md`
- `ui/lib/constants/icons.tsx`
- `core/changelog.md`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
🧠 Learnings (9)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- `core/providers/utils/utils.go`
- `core/providers/huggingface/chat.go`
- `core/bifrost.go`
- `core/internal/testutil/responses_stream.go`
- `core/providers/openai/openai.go`
- `core/schemas/mux.go`
- `core/providers/gemini/speech.go`
- `core/providers/utils/audio.go`
- `core/schemas/transcriptions.go`
- `core/internal/testutil/account.go`
- `core/internal/testutil/transcription.go`
- `core/providers/huggingface/utils.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
- `core/providers/huggingface/chat.go`
- `core/providers/huggingface/utils.go`
- `core/providers/huggingface/huggingface.go`
- `core/providers/huggingface/types.go`
📚 Learning: 2025-12-15T10:06:05.395Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: docs/features/providers/huggingface.mdx:39-61
Timestamp: 2025-12-15T10:06:05.395Z
Learning: For fal-ai transcription requests routed through HuggingFace in Bifrost, WAV (audio/wav) is not supported and should be rejected. Only MP3 format is supported. Update the documentation and any related examples to reflect MP3 as the required input format for HuggingFace-based transcription, and note WAV should not be used. This applies specifically to the HuggingFace provider integration in this repository.
Applied to files:
docs/features/providers/huggingface.mdx
📚 Learning: 2025-12-09T17:08:21.123Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: docs/features/providers/huggingface.mdx:171-195
Timestamp: 2025-12-09T17:08:21.123Z
Learning: In docs/features/providers/huggingface.mdx, use the official Hugging Face naming conventions for provider identifiers in the capabilities table (e.g., ovhcloud-ai-endpoints, z-ai). Do not map to SDK identifiers like ovhcloud or zai-org; this aligns with Hugging Face's public docs and improves consistency for readers.
Applied to files:
docs/features/providers/huggingface.mdx
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.
Applied to files:
core/providers/openai/openai.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.
Applied to files:
core/providers/openai/openai.go
📚 Learning: 2025-09-01T15:29:17.076Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 372
File: core/providers/utils.go:552-598
Timestamp: 2025-09-01T15:29:17.076Z
Learning: The detectAudioMimeType function in core/providers/utils.go is specifically designed for Gemini's audio format support, which only includes: WAV (audio/wav), MP3 (audio/mp3), AIFF (audio/aiff), AAC (audio/aac), OGG Vorbis (audio/ogg), and FLAC (audio/flac). The implementation should remain focused on these specific formats rather than being over-engineered for general-purpose audio detection.
Applied to files:
core/providers/utils/audio.go
📚 Learning: 2025-12-10T15:15:14.041Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/audio.go:92-98
Timestamp: 2025-12-10T15:15:14.041Z
Learning: In core/providers/utils/audio.go, within DetectAudioMimeType, use a mask of 0xF6 for ADTS sync detection instead of the standard 0xF0. This stricter check validates that the top nibble is 0xF and the Layer field bits (bits 2-1) are 00, preventing MP3 Layer III (Layer bits 11) from being misidentified as AAC. Ensure unit tests cover this behavior and document the rationale in code comments.
Applied to files:
core/providers/utils/audio.go
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
🧬 Code graph analysis (9)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (3)
Cerebras(47-47)Perplexity(46-46)HuggingFace(51-51)
core/bifrost.go (1)
core/schemas/bifrost.go (1)
HuggingFace(51-51)
core/providers/openai/openai.go (3)
core/schemas/chatcompletions.go (1)
ChatStreamResponseChoice(783-785)core/providers/gemini/types.go (1)
Content(977-985)ui/lib/types/logs.ts (1)
ReasoningDetails(127-134)
core/schemas/mux.go (2)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse(1440-1479)ResponsesStreamResponseTypeOutputTextDelta(1388-1388)core/schemas/utils.go (1)
Ptr(16-18)
core/providers/gemini/speech.go (1)
core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)
core/internal/testutil/account.go (3)
core/schemas/bifrost.go (3)
HuggingFace(51-51)Fallback(149-152)OpenAI(35-35)core/schemas/provider.go (5)
ProviderConfig(267-276)NetworkConfig(48-56)DefaultRequestTimeoutInSeconds(15-15)ConcurrencyAndBufferSize(131-134)Provider(314-361)core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/internal/testutil/transcription.go (4)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest (3-10), TranscriptionInput (28-30), TranscriptionParameters (32-49)
core/internal/testutil/utils.go (1)
GetProviderVoice (39-87)
core/schemas/speech.go (4)
BifrostSpeechRequest (9-16), SpeechParameters (43-58), SpeechVoiceInput (65-68), BifrostSpeechResponse (22-29)
core/internal/testutil/test_retry_framework.go (5)
GetTestRetryConfigForScenario (1118-1150), TestRetryContext (168-173), SpeechRetryConfig (216-223), SpeechRetryCondition (144-147), WithSpeechTestRetry (1328-1478)
core/providers/huggingface/huggingface.go (7)
core/providers/huggingface/types.go (5)
HuggingFaceTranscriptionResponse (330-333), HuggingFaceSpeechResponse (243-245), HuggingFaceResponseError (151-155), HuggingFaceListModelsResponse (28-30), HuggingFaceHubError (146-149)
core/providers/utils/utils.go (2)
ConfigureProxy (100-148), GetRequestPath (221-241)
core/providers/utils/audio.go (1)
DetectAudioMimeType (78-119)
core/providers/huggingface/chat.go (1)
ToHuggingFaceChatCompletionRequest (11-106)
core/providers/huggingface/embedding.go (2)
ToHuggingFaceEmbeddingRequest (11-78), UnmarshalHuggingFaceEmbeddingResponse (82-168)
core/providers/huggingface/speech.go (1)
ToHuggingFaceSpeechRequest (9-96)
core/providers/huggingface/transcription.go (1)
ToHuggingFaceTranscriptionRequest(11-119)
core/providers/huggingface/types.go (1)
core/schemas/chatcompletions.go (3)
ChatStreamOptions (237-240), ChatTool (252-257), ChatToolChoiceFunction (444-446)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ...yml` - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (27)
.github/workflows/release-pipeline.yml (1)
118-118: Environment variable naming needs verification against actual usage. The changes correctly add `HUGGING_FACE_API_KEY` to all four release jobs (lines 118, 193, 270, 359) following the same underscore pattern as other provider keys. However, verification reveals:
- Standard Hugging Face documentation specifies HUGGINGFACE_API_KEY (without underscores) for authentication
- The codebase uses HUGGING_FACE_API_KEY with underscores (consistent with other providers like ELEVENLABS_API_KEY)
- No direct code references to HUGGING_FACE_API_KEY were found in tests or integrations
- Test configuration (conftest.py) doesn't include this variable in its API keys fixture
Confirm that the chosen naming (HUGGING_FACE_API_KEY) aligns with how Hugging Face libraries or test code actually consume the variable in this project, and verify the GitHub secret is configured in repository settings.
core/internal/testutil/responses_stream.go (1)
693-693: LGTM: Appropriate threshold increase for comprehensive streaming tests. The increase from 100 to 300 chunks is reasonable given the enhanced streaming capabilities and new provider support (HuggingFace). This safety guard still prevents infinite loops while accommodating more comprehensive lifecycle event sequences.
core/providers/openai/openai.go (1)
1047-1054: LGTM: Proper reasoning content handling in streaming. The broadened emission condition now correctly handles reasoning-related delta fields (`Delta.Reasoning` and `Delta.ReasoningDetails`) in addition to regular content, ensuring reasoning model responses stream properly. This aligns with the OpenAI reasoning API patterns. Based on learnings, this supports the ResponsesReasoning API behavior where reasoning blocks are properly identified and streamed.
core/internal/testutil/account.go (4)
114-114: LGTM: HuggingFace correctly added to configured providers. The addition properly integrates HuggingFace into the test infrastructure's provider list.
327-334: LGTM: HuggingFace key configuration is correct. The API key sourcing from `HUGGING_FACE_API_KEY` is consistent with the documented environment variable. The absence of `UseForBatchAPI` aligns with the HuggingFace provider not supporting batch operations in the current implementation.
589-601: LGTM: Well-tuned configuration for HuggingFace Inference API. The configuration appropriately accounts for HuggingFace's characteristics:
- 300s timeout: Accommodates model cold starts on Inference API
- 10 retries with 2s-30s backoff: Provides resilience matching other cloud providers
- Concurrency (4) and buffer (10): Consistent with other provider configurations
1020-1053: LGTM: Comprehensive HuggingFace test configuration. The configuration properly defines HuggingFace's capabilities:
- Model selection: Uses appropriate HuggingFace Inference API models across chat, vision, embedding, transcription, and speech synthesis
- Scenarios: Correctly enables supported features (chat, streaming, tools, vision, audio) while appropriately disabling unsupported ones (reasoning, batch/file operations)
- Fallbacks: Includes OpenAI gpt-4o-mini for test resilience
The reasoning scenario is correctly set to `false` as HuggingFace doesn't provide a native reasoning API equivalent to OpenAI's o1 models.
core/schemas/transcriptions.go (1)
37-40: LGTM: Well-documented HuggingFace generation parameters. The four new fields (`MaxLength`, `MinLength`, `MaxNewTokens`, `MinNewTokens`) are properly implemented:
- Clear documentation: Each field's comment explicitly indicates HuggingFace usage
- Correct types: Optional pointers (`*int`) with `omitempty` allow provider-specific flexibility
- API alignment: Parameters match HuggingFace Inference API's automatic-speech-recognition generation controls
The per-field documentation approach is clearer and more maintainable than a grouped comment.
transports/changelog.md (1)
1-1: Changelog entry is fine as-is. The new line succinctly documents the HuggingFace+UI addition; no changes needed.
core/providers/gemini/speech.go (1)
169-176: Using `utils.DetectAudioMimeType` here is a good consolidation. Swapping the inline MIME sniffing for `utils.DetectAudioMimeType(bifrostResp.Audio)` keeps Gemini's synthetic response in sync with the shared audio detection logic used elsewhere, and reduces duplication. No issues spotted.
core/bifrost.go (1)
26-26: HuggingFace wiring into `createBaseProvider` is consistent with other providers. Adding the HuggingFace import and `case schemas.HuggingFace: return huggingface.NewHuggingFaceProvider(config, bifrost.logger), nil` follows the same pattern as OpenAI/Mistral/Elevenlabs/OpenRouter, and integrates cleanly with the existing `targetProviderKey` / custom-provider logic. No issues spotted here. Also applies to: 1889-1890
core/providers/utils/utils.go (1)
1049-1052: Verify HuggingFace Inference API streaming behavior against current documentation. HuggingFace's TGI supports the Messages API, which is fully compatible with the OpenAI Chat Completion API, and you can use OpenAI's client libraries or third-party libraries expecting OpenAI schema to interact with TGI's Messages API. However, the available documentation does not explicitly confirm whether HuggingFace sends a `[DONE]` sentinel marker at stream termination or simply closes the connection. OpenAI sends a final chunk with "[DONE]" to indicate the end of the stream, and if HuggingFace's API is truly fully compatible with OpenAI's, it should follow the same pattern. Confirm whether this implementation detail still holds true for the current version of the Inference API or Inference Endpoints you're targeting.
docs/features/providers/huggingface.mdx (1)
1-254: Documentation looks comprehensive and well-structured. The documentation thoroughly covers HuggingFace provider implementation details including model aliasing, request format differences across inference providers, and the capability matrix. The past review feedback has been addressed.
One minor observation: The note at line 197 clarifying the checkmark convention is helpful for readers.
core/providers/huggingface/chat.go (1)
78-101: ToolChoice handling is well-implemented. The logic correctly handles both string-based tool choices (`auto`, `none`, `required`) and structured function-based tool choices. The guard at line 99 ensures `hfReq.ToolChoice` is only set when valid.
core/providers/huggingface/utils.go (3)
130-147: Model parsing logic is correct and handles edge cases. The `splitIntoModelProvider` function properly handles:
- `t == 0` (no slashes): returns error
- `t == 1` (one slash): sets provider to `auto` and uses full string as model
- `t > 1` (multiple slashes): splits on first slash for provider, rest for model
This correctly handles model formats like `hf-inference/meta-llama/Llama-3-8B-Instruct`.
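The three cases above can be sketched as a small standalone function (the real signature and error handling in core/providers/huggingface/utils.go may differ; this is an illustrative reimplementation):

```go
package main

import (
	"fmt"
	"strings"
)

// splitIntoModelProvider sketches the parsing rules described above:
//   0 slashes -> error; 1 slash -> provider "auto", whole string as model;
//   2+ slashes -> text before the first slash is the provider, rest is the model.
func splitIntoModelProvider(modelID string) (provider, model string, err error) {
	switch strings.Count(modelID, "/") {
	case 0:
		return "", "", fmt.Errorf("invalid model id %q: expected provider/model", modelID)
	case 1:
		return "auto", modelID, nil
	default:
		parts := strings.SplitN(modelID, "/", 2)
		return parts[0], parts[1], nil
	}
}

func main() {
	p, m, _ := splitIntoModelProvider("hf-inference/meta-llama/Llama-3-8B-Instruct")
	fmt.Println(p, m) // hf-inference meta-llama/Llama-3-8B-Instruct
	p, m, _ = splitIntoModelProvider("meta-llama/Llama-3-8B-Instruct")
	fmt.Println(p, m) // auto meta-llama/Llama-3-8B-Instruct
}
```

Splitting on only the first slash is what lets HuggingFace org/model IDs (which themselves contain a slash) survive the provider prefix.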
213-267: Model mapping cache implementation looks solid. The caching logic properly:
- Checks cache first before making HTTP requests
- Uses type assertion to validate cached data
- Stores mappings in cache only when non-nil
- Returns properly structured errors
The error handling at lines 243-246 now correctly guards against empty messages, preserving fallback messages.
318-322: Defensive copy of audio data prevents use-after-free. The explicit copy at line 319 correctly prevents potential use-after-free issues since fasthttp response bodies reference internal buffers that get recycled.
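The aliasing hazard being guarded against can be demonstrated without fasthttp; here a plain byte slice stands in for the transport-owned body buffer that gets recycled:

```go
package main

import "fmt"

func main() {
	// Simulates a transport-owned buffer (e.g. a fasthttp response body)
	// that will be reused after the handler returns.
	transportBuf := []byte("audio-bytes")

	aliased := transportBuf                        // shares the backing array
	copied := append([]byte(nil), transportBuf...) // defensive copy

	// Transport recycles the buffer for the next response.
	copy(transportBuf, "XXXXXXXXXXX")

	fmt.Println(string(aliased)) // clobbered: XXXXXXXXXXX
	fmt.Println(string(copied))  // still: audio-bytes
}
```

The `append([]byte(nil), src...)` idiom allocates a fresh backing array, so later buffer reuse cannot corrupt the returned audio.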
docs/contributing/adding-a-provider.mdx (2)
7-13: Excellent quick reference section. The note at the beginning pointing to specific provider implementations for reference is very helpful for contributors.
64-69: Clear file creation order guidance. The explicit ordering (types.go → utils.go → feature files → provider.go → tests) is critical for maintainability and is well-documented.
core/providers/huggingface/huggingface.go (8)
29-63: Response pooling implementation is correct. The `sync.Pool` pattern for `HuggingFaceTranscriptionResponse` and `HuggingFaceSpeechResponse` is properly implemented:
- Acquire functions reset the struct before returning
- Release functions check for nil before putting back
- Both Speech (lines 727-728) and Transcription (lines 825-826) properly acquire and defer release
The pool pre-warming in the constructor (lines 78-81) helps reduce allocation pressure during startup.
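A minimal sketch of the acquire/reset/release pattern described above, using a hypothetical `speechResponse` stand-in rather than the provider's real response types:

```go
package main

import (
	"fmt"
	"sync"
)

// speechResponse is a hypothetical stand-in for a pooled response struct.
type speechResponse struct {
	Audio []byte
}

var speechPool = sync.Pool{
	New: func() any { return &speechResponse{} },
}

// acquire resets the struct before handing it out, so no stale data
// leaks between requests.
func acquire() *speechResponse {
	r := speechPool.Get().(*speechResponse)
	*r = speechResponse{}
	return r
}

// release guards against nil before returning the struct to the pool.
func release(r *speechResponse) {
	if r == nil {
		return
	}
	speechPool.Put(r)
}

func main() {
	r := acquire()
	r.Audio = []byte{1, 2, 3}
	release(r)

	r2 := acquire() // may or may not be the same object; always reset
	fmt.Println(len(r2.Audio))
}
```

Resetting on acquire (rather than on release) keeps the invariant in one place even if a caller forgets to clean up before releasing.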
111-188: Cache-first retry logic is well-designed. The `completeRequestWithModelAliasCache` function implements an efficient cache-first pattern:
- Uses cached model mapping for initial request
- On 404, clears cache and re-fetches mapping
- Retries with updated model ID
This minimizes API calls (1 call on cache hit, 3 on miss) as clarified by the author in past comments. The embedding model field update logic at lines 136-144 and 161-168 correctly handles the retry scenario.
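The cache-first, invalidate-on-404 flow can be sketched as follows; `client`, `resolve`, and `call` are hypothetical stand-ins for the mapping fetch and the inference call, and `calls` counts only inference calls (the "3 on miss" figure above also counts the mapping re-fetch):

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("404 not found")

// client is a hypothetical stand-in for the provider: resolve fetches the
// model->provider mapping, call performs the inference request.
type client struct {
	cache   map[string]string
	resolve func(model string) string
	call    func(resolved string) error
	calls   int
}

func (c *client) complete(model string) error {
	resolved, ok := c.cache[model]
	if !ok {
		resolved = c.resolve(model)
		c.cache[model] = resolved
	}
	c.calls++
	err := c.call(resolved)
	if errors.Is(err, errNotFound) {
		delete(c.cache, model) // stale mapping: invalidate and re-resolve
		resolved = c.resolve(model)
		c.cache[model] = resolved
		c.calls++
		return c.call(resolved)
	}
	return err
}

func main() {
	good := "hf-inference/m"
	c := &client{
		cache:   map[string]string{"m": "stale/m"},
		resolve: func(string) string { return good },
		call: func(r string) error {
			if r != good {
				return errNotFound
			}
			return nil
		},
	}
	fmt.Println(c.complete("m"), c.calls) // <nil> 2
}
```

On a fresh cache hit with a valid mapping this costs a single inference call; only a stale mapping pays the extra resolve-and-retry round trip.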
220-238: Error handling preserves fallback messages correctly. The guarded assignments at lines 226-228 and 233-234 ensure that:
- `bifrostErr.Type` is only set if `errorResp.Type` is non-empty
- `bifrostErr.Error.Message` is only overwritten if `errorResp.Message` is non-empty
This preserves any fallback message already set by `HandleProviderAPIError` when JSON parsing fails.
252-398: Parallel ListModels aggregation is robust. The implementation:
- Spawns concurrent goroutines for each inference provider
- Properly uses channels and WaitGroup for coordination
- Aggregates successful results and tracks first error
- Calculates average latency across successful responses
- Combines raw responses when enabled
The channel closure in the background goroutine (lines 343-346) ensures the range loop terminates correctly.
546-563: Streaming delegates to shared OpenAI handler. The `ChatCompletionStream` correctly uses `openai.HandleOpenAIChatCompletionStreaming` with a custom request converter. This reuses battle-tested streaming logic while allowing HuggingFace-specific request transformation.
584-596: ResponsesStream context flag for conversion. The context value `BifrostContextKeyIsResponsesToChatCompletionFallback` at line 589 properly signals downstream handlers that this is a Responses-to-ChatCompletion fallback flow.
794-798: Input validation for hf-inference transcription is correct. The nil/empty check at lines 795-797 properly guards against the panic that was identified in past review comments. The error message clearly indicates the requirement.
863-911: Unsupported operations return consistent errors. All unsupported operations (Batch*, File*) consistently use `providerUtils.NewUnsupportedOperationError` with the correct request type. This provides clear feedback to callers.
3ea5fc8 to
39d5ca5
Actionable comments posted: 0
🧹 Nitpick comments (7)
core/providers/utils/audio.go (1)
92-98: Document the rationale for the 0xF6 mask. The 0xF6 mask is correct and intentional (distinguishing AAC from MP3 Layer III), but the inline comment doesn't explain why 0xF6 is used instead of the standard 0xF0. Based on learnings, this stricter mask checks both the sync word (top 4 bits = 0xF) and the Layer field bits (bits 2-1 = 00), preventing MP3 Layer III files (which have Layer bits = 11) from being misidentified as AAC.
🔎 Suggested documentation enhancement
 // AAC: ADIF or ADTS (0xFFF sync) - check before MP3 frame sync to avoid misclassification
 if bytes.HasPrefix(audioData, adif) {
 	return "audio/aac"
 }
+// ADTS sync: 0xFF followed by top 4 bits = 0xF and Layer field = 00
+// Mask 0xF6 checks sync (top 4 bits) AND Layer bits (bits 2-1) = 00 to distinguish from MP3 Layer III (11)
 if len(audioData) >= 2 && audioData[0] == 0xFF && (audioData[1]&0xF6) == 0xF0 {
 	return "audio/aac"
 }
Based on learnings, the mask 0xF6 is intentionally stricter than 0xF0 to prevent MP3 misidentification.
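As a standalone illustration (not the actual `DetectAudioMimeType` code), the mask check behaves as follows:

```go
package main

import "fmt"

// isADTSSync applies the stricter 0xF6 mask described above: it requires the
// sync word (top 4 bits of byte 1 must be 0xF) AND the MPEG Layer bits
// (bits 2-1 of byte 1) to be 00, which is true for ADTS AAC but not for
// MP3 Layer III (Layer bits 11).
func isADTSSync(b []byte) bool {
	return len(b) >= 2 && b[0] == 0xFF && b[1]&0xF6 == 0xF0
}

func main() {
	aac := []byte{0xFF, 0xF1} // ADTS, MPEG-4, Layer 00, no CRC
	mp3 := []byte{0xFF, 0xFB} // MPEG-1 Layer III frame sync
	fmt.Println(isADTSSync(aac), isADTSSync(mp3))
}
```

With the looser 0xF0 mask, the MP3 header 0xFF 0xFB would also pass (0xFB & 0xF0 == 0xF0), which is exactly the misclassification the stricter mask prevents.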
ui/README.md (1)
11-17: Verify provider count accuracy and consider reducing manual list maintenance. Line 12 references "15+ AI providers"; with HuggingFace now added, please verify this count is still accurate. Additionally, per the previous reviewer's feedback (Pratham-Mishra04), consider moving the entire Key Features section to be purely link-based with descriptions pulled from external docs rather than maintaining a hardcoded list here.
core/internal/testutil/transcription.go (2)
73-97: Consider extracting fixture-reading logic to reduce duplication. The Fal-AI/HuggingFace fixture-reading block appears 5 times with nearly identical structure (runtime.Caller → filepath construction → os.ReadFile → error check). Extracting to a helper function would improve maintainability and reduce the ~55 lines of duplicated code.
🔎 Suggested helper function approach
Add a helper function to this file:
// getAudioForTranscriptionTest returns audio data for transcription tests.
// For Fal-AI models on HuggingFace, reads from mp3 fixture; otherwise generates TTS audio.
func getAudioForTranscriptionTest(
	ctx context.Context,
	t *testing.T,
	client *bifrost.Bifrost,
	testConfig ComprehensiveTestConfig,
	speechSynthesisProvider schemas.ModelProvider,
	speechSynthesisModel string,
	text string,
	voiceType string,
	fixtureName string,
) []byte {
	if testConfig.Provider == schemas.HuggingFace && strings.HasPrefix(testConfig.TranscriptionModel, "fal-ai/") {
		_, filename, _, _ := runtime.Caller(1)
		dir := filepath.Dir(filename)
		filePath := filepath.Join(dir, "scenarios", "media", fmt.Sprintf("%s.mp3", fixtureName))
		audioData, err := os.ReadFile(filePath)
		if err != nil {
			t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
		}
		return audioData
	}
	audioData, _ := GenerateTTSAudioForTest(ctx, t, client, speechSynthesisProvider, speechSynthesisModel, text, voiceType, "mp3")
	return audioData
}
-	var transcriptionRequest *schemas.BifrostTranscriptionRequest
-	if testConfig.Provider == schemas.HuggingFace && strings.HasPrefix(testConfig.TranscriptionModel, "fal-ai/") {
-		// For Fal-AI models on HuggingFace, we have to use mp3 but fal-ai speech models only return wav
-		// So we read from a pre-generated mp3 file to avoid format issues
-		_, filename, _, _ := runtime.Caller(0)
-		dir := filepath.Dir(filename)
-		filePath := filepath.Join(dir, "scenarios", "media", fmt.Sprintf("%s.mp3", tc.name))
-		fileContent, err := os.ReadFile(filePath)
-		if err != nil {
-			t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
-		}
+	audioData := getAudioForTranscriptionTest(ctx, t, client, testConfig, speechSynthesisProvider, speechSynthesisModel, tc.text, tc.voiceType, tc.name)
+	var transcriptionRequest *schemas.BifrostTranscriptionRequest
+	if testConfig.Provider == schemas.HuggingFace && strings.HasPrefix(testConfig.TranscriptionModel, "fal-ai/") {
 		transcriptionRequest = &schemas.BifrostTranscriptionRequest{
 			Provider: testConfig.Provider,
 			Model:    testConfig.TranscriptionModel,
 			Input: &schemas.TranscriptionInput{
-				File: fileContent,
+				File: audioData,
 			},
Also applies to: 261-278, 369-386, 463-480, 561-578
377-377: Hardcoded fixture names may be fragile. Lines 377, 471, and 569 hardcode fixture names ("RoundTrip_Basic_MP3.mp3" and "RoundTrip_Medium_MP3.mp3") instead of deriving them from test context. If the RoundTrip tests change or fixtures are reorganized, these references may break without obvious connection.
Consider either:
- Creating test-specific fixtures matching the test names (e.g., "Format_json.mp3", "WithCustomParameters.mp3", "Language_en.mp3"), or
- Defining fixture name constants if reuse is intentional (e.g., `const DefaultTranscriptionFixture = "RoundTrip_Basic_MP3.mp3"`).
core/schemas/mux.go (1)
1216-1216: Consider simplifying the condition. The condition `hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent))` contains a redundant `|| hasContent` in the inner expression. This simplifies to `if hasContent || (!state.TextItemHasContent && hasReasoning) {`. This doesn't affect correctness but improves clarity.
🔎 Suggested simplification
-if hasContent || (!state.TextItemHasContent && (hasReasoning || hasContent)) {
+if hasContent || (!state.TextItemHasContent && hasReasoning) {
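The equivalence of the two conditions can be confirmed by exhausting the truth table:

```go
package main

import "fmt"

func main() {
	// Exhaustively verify the simplification suggested above:
	// hasContent || (!tic && (hasReasoning || hasContent))
	//   == hasContent || (!tic && hasReasoning)
	equal := true
	for _, hasContent := range []bool{false, true} {
		for _, tic := range []bool{false, true} { // state.TextItemHasContent
			for _, hasReasoning := range []bool{false, true} {
				orig := hasContent || (!tic && (hasReasoning || hasContent))
				simpl := hasContent || (!tic && hasReasoning)
				if orig != simpl {
					equal = false
				}
			}
		}
	}
	fmt.Println(equal) // true
}
```

Intuitively, whenever the inner `|| hasContent` would matter, the outer `hasContent ||` has already short-circuited the expression to true.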
11-78: Consider explicitly rejecting non-text embedding inputs for HuggingFace
`ToHuggingFaceEmbeddingRequest` only inspects `Input.Text`/`Input.Texts` and silently produces a request without `input`/`inputs` when callers supply only `Embedding`/`Embeddings`. For HF, vector-to-vector embeddings aren't supported, so it may be clearer to fail fast instead of sending an effectively empty request body.
You could, for example, detect this case and return an error (or wrap it into a provider-specific "unsupported input type" BifrostError) when `Embedding`/`Embeddings` are set but no text input is present.
226-253: Consider removing unused `Extra` field.
Line 234 defines an `Extra` field with `json:"-"` that's never populated or used. If it's reserved for future use, add a comment explaining its purpose; otherwise, remove it to reduce maintenance burden.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3is excluded by!**/*.mp3core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3is excluded by!**/*.mp3core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3is excluded by!**/*.mp3core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3is excluded by!**/*.mp3core/internal/testutil/scenarios/media/Technical_Terms.mp3is excluded by!**/*.mp3ui/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (38)
.github/workflows/pr-tests.yml(1 hunks).github/workflows/release-pipeline.yml(4 hunks)core/bifrost.go(2 hunks)core/changelog.md(1 hunks)core/internal/testutil/account.go(4 hunks)core/internal/testutil/responses_stream.go(1 hunks)core/internal/testutil/transcription.go(6 hunks)core/providers/gemini/speech.go(1 hunks)core/providers/gemini/transcription.go(2 hunks)core/providers/gemini/utils.go(0 hunks)core/providers/huggingface/chat.go(1 hunks)core/providers/huggingface/embedding.go(1 hunks)core/providers/huggingface/huggingface.go(1 hunks)core/providers/huggingface/huggingface_test.go(1 hunks)core/providers/huggingface/models.go(1 hunks)core/providers/huggingface/responses.go(1 hunks)core/providers/huggingface/speech.go(1 hunks)core/providers/huggingface/transcription.go(1 hunks)core/providers/huggingface/types.go(1 hunks)core/providers/huggingface/utils.go(1 hunks)core/providers/openai/openai.go(1 hunks)core/providers/utils/audio.go(1 hunks)core/providers/utils/utils.go(1 hunks)core/schemas/account.go(2 hunks)core/schemas/bifrost.go(3 hunks)core/schemas/mux.go(3 hunks)core/schemas/transcriptions.go(1 hunks)docs/apis/openapi.json(1 hunks)docs/contributing/adding-a-provider.mdx(1 hunks)docs/docs.json(1 hunks)docs/features/providers/huggingface.mdx(1 hunks)docs/features/providers/providers-unified-interface.mdx(2 hunks)transports/changelog.md(1 hunks)transports/config.schema.json(2 hunks)ui/README.md(6 hunks)ui/lib/constants/config.ts(2 hunks)ui/lib/constants/icons.tsx(1 hunks)ui/lib/constants/logs.ts(2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (20)
- core/providers/huggingface/responses.go
- core/schemas/bifrost.go
- core/providers/utils/utils.go
- .github/workflows/pr-tests.yml
- transports/changelog.md
- transports/config.schema.json
- core/providers/gemini/transcription.go
- core/changelog.md
- ui/lib/constants/logs.ts
- ui/lib/constants/config.ts
- docs/features/providers/providers-unified-interface.mdx
- core/providers/openai/openai.go
- core/providers/huggingface/speech.go
- docs/features/providers/huggingface.mdx
- core/providers/huggingface/models.go
- core/schemas/transcriptions.go
- core/providers/huggingface/transcription.go
- core/providers/huggingface/huggingface_test.go
- core/internal/testutil/responses_stream.go
- docs/docs.json
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/providers/huggingface/embedding.go, core/internal/testutil/account.go, ui/lib/constants/icons.tsx, core/bifrost.go, core/providers/gemini/speech.go, core/schemas/mux.go, docs/apis/openapi.json, core/internal/testutil/transcription.go, docs/contributing/adding-a-provider.mdx, core/providers/huggingface/chat.go, core/providers/utils/audio.go, core/providers/huggingface/utils.go, core/schemas/account.go, ui/README.md, core/providers/huggingface/types.go, core/providers/huggingface/huggingface.go
🧠 Learnings (6)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/providers/huggingface/embedding.go, core/internal/testutil/account.go, core/bifrost.go, core/providers/gemini/speech.go, core/schemas/mux.go, core/internal/testutil/transcription.go, core/providers/huggingface/chat.go, core/providers/utils/audio.go, core/providers/huggingface/utils.go, core/schemas/account.go, core/providers/huggingface/types.go, core/providers/huggingface/huggingface.go
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
core/providers/huggingface/embedding.go, core/providers/huggingface/chat.go, core/providers/huggingface/utils.go, core/providers/huggingface/types.go, core/providers/huggingface/huggingface.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.
Applied to files:
core/internal/testutil/account.go, core/internal/testutil/transcription.go
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
📚 Learning: 2025-09-01T15:29:17.076Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 372
File: core/providers/utils.go:552-598
Timestamp: 2025-09-01T15:29:17.076Z
Learning: The detectAudioMimeType function in core/providers/utils.go is specifically designed for Gemini's audio format support, which only includes: WAV (audio/wav), MP3 (audio/mp3), AIFF (audio/aiff), AAC (audio/aac), OGG Vorbis (audio/ogg), and FLAC (audio/flac). The implementation should remain focused on these specific formats rather than being over-engineered for general-purpose audio detection.
Applied to files:
core/providers/utils/audio.go
📚 Learning: 2025-12-10T15:15:14.041Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/audio.go:92-98
Timestamp: 2025-12-10T15:15:14.041Z
Learning: In core/providers/utils/audio.go, within DetectAudioMimeType, use a mask of 0xF6 for ADTS sync detection instead of the standard 0xF0. This stricter check validates that the top nibble is 0xF and the Layer field bits (bits 2-1) are 00, preventing MP3 Layer III (Layer bits 11) from being misidentified as AAC. Ensure unit tests cover this behavior and document the rationale in code comments.
Applied to files:
core/providers/utils/audio.go
🧬 Code graph analysis (9)
core/providers/huggingface/embedding.go (3)
core/schemas/embedding.go (4)
BifrostEmbeddingRequest (9-16), BifrostEmbeddingResponse (22-28), EmbeddingData (118-122), EmbeddingStruct (124-128)
core/providers/huggingface/types.go (3)
HuggingFaceEmbeddingRequest (161-172), InputsCustomType (174-177), EncodingType (219-219)
core/schemas/chatcompletions.go (1)
BifrostLLMUsage(845-852)
core/internal/testutil/account.go (3)
core/schemas/bifrost.go (2)
HuggingFace (51-51), Fallback (149-152)
core/schemas/provider.go (4)
ProviderConfig (267-276), NetworkConfig (48-56), ConcurrencyAndBufferSize (131-134), Provider (314-361)
core/internal/testutil/cross_provider_scenarios.go (1)
ProviderConfig(45-53)
core/bifrost.go (2)
core/schemas/bifrost.go (1)
HuggingFace (51-51)
core/providers/huggingface/huggingface.go (1)
NewHuggingFaceProvider(66-99)
core/providers/gemini/speech.go (1)
core/providers/utils/audio.go (1)
DetectAudioMimeType(78-119)
core/schemas/mux.go (3)
core/providers/gemini/types.go (2)
Content (977-985), Type (782-782)
core/schemas/responses.go (2)
BifrostResponsesStreamResponse (1440-1479), ResponsesStreamResponseTypeOutputTextDelta (1388-1388)
core/schemas/utils.go (1)
Ptr(16-18)
core/internal/testutil/transcription.go (4)
core/schemas/transcriptions.go (3)
BifrostTranscriptionRequest (3-10), TranscriptionInput (28-30), TranscriptionParameters (32-49)
core/internal/testutil/utils.go (3)
GetProviderVoice (39-87), GetErrorMessage (642-675), GenerateTTSAudioForTest (568-640)
core/schemas/speech.go (4)
BifrostSpeechRequest (9-16), SpeechParameters (43-58), SpeechVoiceInput (65-68), BifrostSpeechResponse (22-29)
core/internal/testutil/test_retry_framework.go (5)
GetTestRetryConfigForScenario (1118-1150), TestRetryContext (168-173), SpeechRetryConfig (216-223), SpeechRetryCondition (144-147), WithSpeechTestRetry (1328-1478)
core/providers/huggingface/utils.go (4)
core/schemas/models.go (1)
BifrostListModelsRequest (23-34)
core/providers/utils/utils.go (3)
GetRequestPath (221-241), MakeRequestWithContext (41-95), NewBifrostOperationError (733-744)
core/providers/huggingface/types.go (2)
HuggingFaceInferenceProviderMapping (68-71), HuggingFaceHubError (146-149)
core/schemas/provider.go (1)
ErrProviderResponseDecode(29-29)
core/providers/huggingface/types.go (2)
core/schemas/models.go (1)
Model (109-129)
core/schemas/chatcompletions.go (3)
ChatStreamOptions(237-240)ChatTool(252-257)ChatToolChoiceFunction(444-446)
core/providers/huggingface/huggingface.go (4)
core/providers/utils/audio.go (1)
DetectAudioMimeType (78-119)
core/providers/huggingface/embedding.go (2)
ToHuggingFaceEmbeddingRequest (11-78), UnmarshalHuggingFaceEmbeddingResponse (82-168)
core/providers/huggingface/speech.go (1)
ToHuggingFaceSpeechRequest (9-96)
core/providers/huggingface/transcription.go (1)
ToHuggingFaceTranscriptionRequest(11-119)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ...yml` - ✅ Environment variable added to `.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (44)
core/providers/gemini/speech.go (1)
175-175: LGTM! Good refactoring to use the shared utility. The change correctly uses the centralized `utils.DetectAudioMimeType` function instead of a local implementation, improving code maintainability and consistency across providers.
ui/README.md (3)
46-72: Architecture section is well-structured and accurate. The updated architecture description correctly reflects the technology stack changes (Redux Toolkit with RTK Query) and maintains clarity on the integration model between UI and backend.
129-155: Verify RTK Query imports and exports match actual codebase structure. The code example imports `useGetLogsQuery`, `useCreateProviderMutation`, and `getErrorMessage` from `@/lib/store`. Please confirm that this module path and these specific exports exist and are publicly available, as the actual implementation may differ from this documentation.
172-236: External documentation links are current and accessible. The referenced documentation pages at docs.getbifrost.ai (provider configuration, governance features, observability, and MCP integration) are published and actively maintained. The benchmarking documentation at docs.getbifrost.ai/benchmarking/getting-started is accessible with current performance data. The main GitHub repository at github.com/maximhq/bifrost is active, and the documentation structure effectively reduces manual maintenance by redirecting users to external resources rather than maintaining duplicate content locally.
core/schemas/mux.go (1)
1155-1242: LGTM: Reasoning-only response support is well-implemented. The gating logic correctly handles reasoning-only models by:
- Creating text items when reasoning is present but no content has been emitted yet (line 1159)
- Emitting an empty delta on the first reasoning-only chunk to satisfy lifecycle validation (lines 1219-1225)
- Tracking whether the text item has received any delta via `TextItemHasContent` (line 1240)
- Closing text items regardless of content presence to support reasoning-only responses (line 1411)
The logic properly handles content-only, reasoning-only, and mixed content+reasoning scenarios.
Also applies to: 1410-1411
docs/contributing/adding-a-provider.mdx (7)
7-38: Documentation structure and introduction are well-organized. The quick reference to existing providers and clear explanation of the gateway pattern provide excellent context for contributors. The explicit instructions for registering in `core/schemas/bifrost.go` are actionable and specific.
41-85: Directory structure guidance is comprehensive and practical. The clear distinction between OpenAI-compatible and custom API providers, along with the emphasized file creation order, helps prevent common implementation mistakes. The references to actual provider implementations (`core/providers/huggingface/`, `core/providers/cerebras/`) are valuable for contributors.
86-627: File conventions are thorough and enforce excellent separation of concerns. The strict rules (marked with "CRITICAL RULE") combined with real examples (especially the HuggingFace chat converter on lines 412-528) provide clear patterns for contributors. The emphasis on naming conventions, nil checks, and pure transformation functions promotes maintainable code.
629-971: OpenAI-compatible provider guidance is clear and complete. The section effectively explains when and how to leverage the OpenAI handler functions, with complete examples from the Cerebras provider. The distinction between constructor return types (with/without error) and the advantages of this approach are well-articulated.
972-1487: Implementation steps are well-structured and comprehensive. The phase-by-phase approach with explicit ordering, checklists, and the planning document template (lines 989-1021) provides a clear roadmap. The note on line 1405 about omitted code being for brevity (with reference to complete examples) appropriately addresses the documentation's reference nature.
1489-1788: UI integration steps are comprehensive and actionable. The 8-step process with specific file paths, code examples, and a complete checklist (lines 1770-1786) ensures contributors don't miss any integration points. Each step includes context about where to make changes and why.
1791-2071: Testing guidelines are thorough and practical. The complete test file example, scenario configuration table (lines 1884-1896), and troubleshooting section (lines 1978-2003) provide contributors with everything needed to write comprehensive tests. The final pre-submission checklist organized by category ensures nothing is missed.
docs/apis/openapi.json (1)
4413-4437: ModelProvider enum update for Hugging Face looks consistent. Adding `"huggingface"` here matches the new provider wiring and keeps the public API enum in sync with backend support; no further schema changes needed in this file.
.github/workflows/release-pipeline.yml (1)
90-121: HUGGING_FACE_API_KEY propagation across release jobs is correct. Wiring `HUGGING_FACE_API_KEY` into all four release steps is consistent with other provider keys and should unblock Hugging Face–backed tests/releases in these pipelines. Please just confirm the corresponding GitHub secret is defined in the target environment so these jobs don't start failing at runtime.
Also applies to: 165-195, 242-271, 327-360
core/schemas/account.go (1)
9-20: HuggingFace key config wiring is structurally sound and backward-compatible. The new `HuggingFaceKeyConfig` type and `Key.HuggingFaceKeyConfig` field follow the same pattern as the Azure/Vertex/Bedrock configs (pointer + `omitempty`, deployments map), so this won't break existing payloads and cleanly reserves space for future Hugging Face deployment mappings. Based on learnings, it's fine that this is not yet surfaced or used elsewhere in the API surface.
Also applies to: 70-72
core/bifrost.go (2)
26-26: Import wiring for HuggingFace provider looks correct. Importing `core/providers/huggingface` alongside other providers is consistent and required for the new switch case below. No issues here.
1889-1890: HuggingFace registered in `createBaseProvider` consistently. The new `schemas.HuggingFace` case mirrors other providers that construct a concrete provider and return it with a nil error, so HF will participate correctly in init, updates, and fallbacks.
core/internal/testutil/account.go (4)
95-118: HuggingFace added to comprehensive test providers. Including `schemas.HuggingFace` in `GetConfiguredProviders` keeps it aligned with the concrete configs and scenario table below, so HF will be exercised in comprehensive tests.
327-334: Test key configuration for HuggingFace is consistent. The HuggingFace test key is sourced from `HUGGING_FACE_API_KEY` with empty `Models` and weight 1.0, matching the pattern used for other single-key providers. Omitting `UseForBatchAPI` is fine given HF scenarios currently don't enable batch/file APIs.
589-601: ProviderConfig defaults for HuggingFace are reasonable. A 300s default timeout plus 10 retries and moderate backoff windows are in line with other "variable" cloud providers and give HF some resilience to cold starts without being extreme. Concurrency and buffer reuse the shared `Concurrency` constant, which keeps tests consistent.
1020-1053: HuggingFace test scenario entry is well-shaped. The HuggingFace `ComprehensiveTestConfig` wires chat, vision, embedding, transcription, and TTS models plus scenarios and an OpenAI fallback, mirroring how other providers are modeled. As long as these booleans match real HF capabilities, this should integrate cleanly into the cross-provider test matrix.
core/providers/huggingface/chat.go (1)
11-105: Chat request conversion to HuggingFace format looks correct. The helper cleanly maps all standard chat parameters, response format, streaming options, tools, and tool choice into the HF request struct, and now surfaces ResponseFormat conversion failures instead of silently dropping them. The nil-guard at the top is safe given upstream validation.
core/providers/huggingface/embedding.go (1)
80-168: Embedding response unmarshal covers expected HF shapes
`UnmarshalHuggingFaceEmbeddingResponse` sensibly tries the structured object, then a 2D array, then a 1D array, always normalizing into `BifrostEmbeddingResponse` with a consistent `object="list"` and a non-nil `Usage`. This should handle the common HF embedding response variants without surprising callers.
core/providers/huggingface/utils.go (5)
17-81: Inference provider enums and registry are coherent. The `inferenceProvider` constants, `INFERENCE_PROVIDERS`, and `PROVIDERS_OR_POLICIES` give a clear, type-safe catalog of supported HF providers plus the `auto` policy, matching how the rest of the provider code expects to route requests. Using a precomputed slice keeps call-site code simple.
83-147: Model hub and provider URL helpers, plus model parsing, are well-structured. `buildModelHubURL` and `buildModelInferenceProviderURL` correctly assemble the Hub API URLs with pagination, sorting, and an `inference_provider` filter, while `splitIntoModelProvider` cleanly distinguishes explicit provider prefixes (>=2 slashes) from the `auto` case (org/model) and rejects obviously invalid names (no slash). This aligns with the model IDs used in the test configs.
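The splitting rule described above (explicit provider prefix for two or more slashes, the `auto` policy for a bare org/model ID) can be sketched as follows. The function name, signature, and error text here are illustrative assumptions, not the provider's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModel is a minimal sketch of the routing rule: two or more slashes
// means an explicit inference-provider prefix, exactly one slash means an
// "org/model" ID routed via the "auto" policy, and no slash is invalid.
func splitModel(id string) (provider, model string, err error) {
	switch strings.Count(id, "/") {
	case 0:
		return "", "", fmt.Errorf("invalid model id %q: expected at least org/model", id)
	case 1:
		return "auto", id, nil // org/model -> auto policy
	default:
		parts := strings.SplitN(id, "/", 2)
		return parts[0], parts[1], nil // provider/org/model -> explicit provider
	}
}

func main() {
	p, m, _ := splitModel("sambanova/meta-llama/Llama-3.1-8B-Instruct")
	fmt.Println(p, m)
}
```

Splitting only on the first slash keeps the remainder intact, so the provider-specific part of a three-segment ID stays a valid two-part Hub model name.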
149-193: Routing only embedding/speech/transcription through supported providers is appropriate. `getInferenceProviderRouteURL` restricts routing to the subset of HF providers that actually support embeddings, text-to-speech, or transcription and returns a clear error otherwise, which is consistent with its use only in those code paths. The hf-inference pipeline selection by `RequestType` also looks correct.
195-267: Provider-model mapping cache and validation are robust. `convertToInferenceProviderMappings`, `getModelInferenceProviderMapping`, and `getValidatedProviderModelID` combine to fetch, cache, and validate provider-specific model IDs with good error handling (HTTP status, decode failures, unsupported operations). Using a `sync.Map` keyed by HF model name avoids redundant Hub calls without adding contention.
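The caching pattern noted above can be sketched roughly as follows, with `fetch` standing in for the real Hub lookup; all names here are illustrative, not the provider's actual types:

```go
package main

import (
	"fmt"
	"sync"
)

// mappingCache caches provider-specific model IDs keyed by HF model name,
// so the Hub is only consulted once per model under normal operation.
type mappingCache struct {
	entries sync.Map                           // model name -> provider model ID
	fetch   func(model string) (string, error) // stand-in for the Hub lookup
}

func (c *mappingCache) get(model string) (string, error) {
	if v, ok := c.entries.Load(model); ok {
		return v.(string), nil // cache hit: no Hub call
	}
	id, err := c.fetch(model)
	if err != nil {
		return "", err
	}
	// LoadOrStore keeps the first writer's value if two goroutines race.
	actual, _ := c.entries.LoadOrStore(model, id)
	return actual.(string), nil
}

// invalidate drops a stale mapping, e.g. after a 404 from the provider.
func (c *mappingCache) invalidate(model string) { c.entries.Delete(model) }

func main() {
	calls := 0
	c := &mappingCache{fetch: func(m string) (string, error) {
		calls++
		return m + "@resolved", nil
	}}
	c.get("org/model")
	c.get("org/model")
	fmt.Println(calls)
}
```

The `invalidate` hook mirrors the cache-clearing-on-404 retry behavior discussed elsewhere in this review: a stale alias is dropped and re-resolved on the next request.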
294-344: Audio download and MIME normalization utilities are safe and context-aware. `downloadAudioFromURL` now uses `MakeRequestWithContext`, checks status codes, decodes the body safely, and returns a copied byte slice to avoid use-after-free issues. `getMimeTypeForAudioType` provides a sensible default (audio/mpeg) and normalizes `audio/mp3` while passing other `audio/*` types through unchanged.
core/providers/huggingface/huggingface.go (10)
18-27: LGTM! The provider struct is well-designed with appropriate fields for HTTP client configuration, caching, and custom provider support. The `modelProviderMappingCache` using `sync.Map` is a good choice for concurrent access patterns.
65-99: LGTM! The constructor properly configures the HTTP client with reasonable defaults (5000 max connections per host, 60s idle timeout, 10s wait timeout), pre-warms response pools based on concurrency settings, and handles proxy configuration and base URL defaulting correctly.
111-188: Cache-first optimization working as intended. The retry logic efficiently handles model alias resolution by attempting the cached mapping first, then clearing the cache and re-validating only on 404. This minimizes API calls for high cache hit rates.
Based on learnings, this cache-first pattern was confirmed as the intended design.
190-250: LGTM! The request execution properly handles:
- Audio vs. JSON content types with MIME detection
- Authorization headers
- Error response parsing with fallback message preservation (addressed from past review)
- Body copying to prevent use-after-free with fasthttp's internal buffer
252-398: LGTM! The model listing implementation efficiently queries multiple inference providers in parallel using goroutines, properly aggregates results, calculates average latency, and handles partial failures gracefully. Resource cleanup with defer statements is correct.
400-425: LGTM! ListModels properly checks operation permissions and delegates to the multi-key handler. TextCompletion and TextCompletionStream correctly return unsupported operation errors with the proper request types (addressed from past reviews).
427-564: LGTM! Chat completion methods properly:
- Parse and validate model names with descriptive errors (addressed from past review)
- Use direct struct allocation instead of pooling to avoid leaks (addressed from past review)
- Delegate streaming to OpenAI-compatible helper with custom request converter
- Set all required ExtraFields for observability
566-681: LGTM! Responses/ResponsesStream properly fall back to chat completion with context tracking. Embedding implementation:
- Validates model names with descriptive errors
- Uses cache-aware retry logic for model alias resolution
- Handles multiple response formats via custom unmarshaling
- Properly tracks raw request/response data when enabled
683-861: LGTM! Speech and Transcription methods properly:
- Acquire and release pooled responses with defer (preventing leaks)
- Validate task types correctly (text-to-speech vs automatic-speech-recognition, addressed from past reviews)
- Handle hf-inference raw audio special case with nil checks (addressed from past review)
- Pass context to downloadAudioFromURL (addressed from past review)
- Use proper error types for unsupported streaming operations
863-911: LGTM! Batch and file operations correctly return unsupported operation errors with appropriate request types, as these features are not supported by the HuggingFace provider.
core/providers/huggingface/types.go (6)
12-52: LGTM! Model types are well-defined, and the custom `UnmarshalJSON` for `HuggingFaceListModelsResponse` correctly handles both the current array format `[...]` and the legacy object format `{"models": [...]}`, with a descriptive error for unexpected formats (addressed from past review).
54-71: LGTM! Inference provider mapping types are correct, with `ProviderModelID` properly reflecting the provider-specific model ID from the HuggingFace API (addressed from past review).
73-143: LGTM! Chat types properly represent HuggingFace request structures:
- `HuggingFaceToolChoice` correctly handles both enum string ("auto", "none", "required") and object forms with a custom `MarshalJSON`
- Reuses common schema types (`schemas.ChatMessage`, `schemas.ChatTool`) for consistency
- Response format and JSON schema types support structured outputs
145-155: LGTM! Error types appropriately distinguish between Hub API errors and Inference API errors, with proper fields for error parsing and reporting.
157-224: LGTM! Embedding types properly handle flexible input formats:
- `InputsCustomType` custom marshaling/unmarshaling supports string, array, and object representations
- `HuggingFaceEmbeddingRequest` separates `Input` (for most providers) from `Inputs` (for hf-inference)
- The `EncodingType` enum supports both float and base64 encoding formats
255-342: LGTM! Transcription types properly represent the HuggingFace ASR API:
- `HuggingFaceTranscriptionEarlyStopping` correctly handles both boolean and string "never" values with proper error handling (addressed from past review)
- Generation parameters provide comprehensive control over text generation
- Response types support both full text and timestamped chunks
ef5f796 to 3e8d6d7 (Compare)
Actionable comments posted: 1
♻️ Duplicate comments (3)
ui/README.md (1)
1-243: Significantly improved documentation structure with good external links, but consider the extent of inline documentation. The comprehensive updates (terminology, architecture refresh, new Quick Links section) meaningfully address documentation clarity. The shift toward docs.getbifrost.ai links is helpful. However, this partially addresses the past reviewer's suggestion to "redirect everything to the docs" — the README still contains substantial inline content (architecture diagrams, feature descriptions, code examples) that mirrors docs content. Consider whether sections like Architecture (lines 46–72), Feature descriptions (lines 76–104), and Configuration guidance (lines 172–190) could be condensed further with a note to "see documentation for details" to avoid drift between README and canonical docs.
Also note: The HuggingFace provider integration (the core of this PR) is not explicitly mentioned in the README. The generic "15+ providers" link (line 12) will include it, but no specific acknowledgment of the new provider is present.
ui/lib/constants/config.ts (1)
40-40: Both model identifier examples use a non-standard 3-part format. The previous review correctly identified that `nebius/Qwen/Qwen3-Embedding-8B` is not a standard HuggingFace model ID. Additionally, `sambanova/meta-llama/Llama-3.1-8B-Instruct` has the same issue—both use 3-part paths that appear to be deployment-specific routing identifiers rather than standard HuggingFace Hub model IDs. Standard HuggingFace model identifiers follow the 2-part `organization/model-name` format. Update to use examples from the PR objectives:
🔎 Suggested fix
- huggingface: "e.g. sambanova/meta-llama/Llama-3.1-8B-Instruct, nebius/Qwen/Qwen3-Embedding-8B",
+ huggingface: "e.g. meta-llama/Llama-3.1-8B-Instruct, google/gemma-2-9b-it",
core/providers/huggingface/transcription.go (1)
11-47: fal-ai branch missing `Model` and `Provider` fields in the request struct. The past review comment noted that fal-ai's API requires `Model` and `Provider` fields. The non-fal-ai branch at lines 32-36 correctly sets these fields, but the fal-ai branch at lines 44-46 only sets `AudioURL`. This may cause API failures.
🔎 Suggested fix
  hfRequest = &HuggingFaceTranscriptionRequest{
  	AudioURL: encoded,
+ 	Model:    schemas.Ptr(modelName),
+ 	Provider: schemas.Ptr(string(inferenceProvider)),
  }
🧹 Nitpick comments (2)
core/internal/testutil/transcription.go (1)
261-278: Consider extracting repeated fixture path discovery logic. The pattern of using `runtime.Caller(0)` to locate and read fixture files is duplicated across 6 locations. Consider extracting this into a helper function for maintainability.
🔎 Suggested helper function
// getTestFixturePath returns the path to a test fixture file relative to the test source.
func getTestFixturePath(fixtureName string) string {
	_, filename, _, _ := runtime.Caller(1) // Caller of this function
	dir := filepath.Dir(filename)
	return filepath.Join(dir, "scenarios", "media", fmt.Sprintf("%s.mp3", fixtureName))
}

// readTestFixture reads a test fixture file, failing the test if not found.
func readTestFixture(t *testing.T, fixtureName string) []byte {
	filePath := getTestFixturePath(fixtureName)
	data, err := os.ReadFile(filePath)
	if err != nil {
		t.Fatalf("failed to read audio fixture %s: %v", filePath, err)
	}
	return data
}
Also applies to: 369-386, 463-480, 561-578
core/providers/huggingface/types.go (1)
226-253: Consider removing the unused `Extra` field from `HuggingFaceSpeechRequest`. The `Extra map[string]any` field at line 234 has a `json:"-"` tag and is never populated or used in the codebase. Per the past review discussion, consider removing it or documenting its intended purpose.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (6)
- `core/internal/testutil/scenarios/media/Numbers_And_Punctuation.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Basic_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Medium_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/RoundTrip_Technical_MP3.mp3` is excluded by `!**/*.mp3`
- `core/internal/testutil/scenarios/media/Technical_Terms.mp3` is excluded by `!**/*.mp3`
- `ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (38)
- `.github/workflows/pr-tests.yml` (1 hunks)
- `.github/workflows/release-pipeline.yml` (4 hunks)
- `core/bifrost.go` (2 hunks)
- `core/changelog.md` (1 hunks)
- `core/internal/testutil/account.go` (4 hunks)
- `core/internal/testutil/responses_stream.go` (1 hunks)
- `core/internal/testutil/transcription.go` (6 hunks)
- `core/providers/gemini/speech.go` (1 hunks)
- `core/providers/gemini/transcription.go` (2 hunks)
- `core/providers/gemini/utils.go` (0 hunks)
- `core/providers/huggingface/chat.go` (1 hunks)
- `core/providers/huggingface/embedding.go` (1 hunks)
- `core/providers/huggingface/huggingface.go` (1 hunks)
- `core/providers/huggingface/huggingface_test.go` (1 hunks)
- `core/providers/huggingface/models.go` (1 hunks)
- `core/providers/huggingface/responses.go` (1 hunks)
- `core/providers/huggingface/speech.go` (1 hunks)
- `core/providers/huggingface/transcription.go` (1 hunks)
- `core/providers/huggingface/types.go` (1 hunks)
- `core/providers/huggingface/utils.go` (1 hunks)
- `core/providers/openai/openai.go` (1 hunks)
- `core/providers/utils/audio.go` (1 hunks)
- `core/providers/utils/utils.go` (1 hunks)
- `core/schemas/account.go` (2 hunks)
- `core/schemas/bifrost.go` (3 hunks)
- `core/schemas/mux.go` (3 hunks)
- `core/schemas/transcriptions.go` (1 hunks)
- `docs/apis/openapi.json` (1 hunks)
- `docs/contributing/adding-a-provider.mdx` (1 hunks)
- `docs/docs.json` (1 hunks)
- `docs/features/providers/huggingface.mdx` (1 hunks)
- `docs/features/providers/supported-providers.mdx` (3 hunks)
- `transports/changelog.md` (1 hunks)
- `transports/config.schema.json` (2 hunks)
- `ui/README.md` (6 hunks)
- `ui/lib/constants/config.ts` (2 hunks)
- `ui/lib/constants/icons.tsx` (1 hunks)
- `ui/lib/constants/logs.ts` (2 hunks)
💤 Files with no reviewable changes (1)
- core/providers/gemini/utils.go
🚧 Files skipped from review as they are similar to previous changes (19)
- core/schemas/bifrost.go
- .github/workflows/pr-tests.yml
- core/providers/gemini/transcription.go
- .github/workflows/release-pipeline.yml
- core/internal/testutil/responses_stream.go
- transports/changelog.md
- core/providers/openai/openai.go
- transports/config.schema.json
- core/providers/huggingface/responses.go
- core/providers/gemini/speech.go
- core/bifrost.go
- core/providers/huggingface/huggingface_test.go
- core/providers/huggingface/speech.go
- docs/features/providers/huggingface.mdx
- core/changelog.md
- core/schemas/transcriptions.go
- docs/apis/openapi.json
- core/providers/utils/audio.go
- ui/lib/constants/logs.ts
👮 Files not reviewed due to content moderation or server errors (1)
- ui/lib/constants/icons.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
core/schemas/account.go, ui/lib/constants/config.ts, docs/features/providers/supported-providers.mdx, core/providers/huggingface/embedding.go, core/providers/huggingface/chat.go, core/schemas/mux.go, ui/lib/constants/icons.tsx, core/providers/huggingface/transcription.go, core/providers/huggingface/models.go, ui/README.md, core/providers/utils/utils.go, core/providers/huggingface/utils.go, docs/contributing/adding-a-provider.mdx, core/internal/testutil/account.go, core/internal/testutil/transcription.go, docs/docs.json, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
core/schemas/account.go, core/providers/huggingface/embedding.go, core/providers/huggingface/chat.go, core/schemas/mux.go, core/providers/huggingface/transcription.go, core/providers/huggingface/models.go, core/providers/utils/utils.go, core/providers/huggingface/utils.go, core/internal/testutil/account.go, core/internal/testutil/transcription.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-15T10:34:13.855Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/chat.go:370-389
Timestamp: 2025-12-15T10:34:13.855Z
Learning: In the HuggingFace provider (core/providers/huggingface/), streaming tool call data arrives as a single delta chunk with all fields (id, type, function.name, function.arguments) present at once. Do not implement accumulation logic across deltas for streaming tool calls in this codebase; rely on single-chunk data as the guaranteed behavior. If there is existing accumulation logic in this directory, remove it or guard it behind this assumption.
Applied to files:
core/providers/huggingface/embedding.go, core/providers/huggingface/chat.go, core/providers/huggingface/transcription.go, core/providers/huggingface/models.go, core/providers/huggingface/utils.go, core/providers/huggingface/huggingface.go, core/providers/huggingface/types.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.
Applied to files:
core/internal/testutil/account.go, core/internal/testutil/transcription.go
📚 Learning: 2025-12-15T10:19:32.071Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/internal/testutil/transcription.go:241-258
Timestamp: 2025-12-15T10:19:32.071Z
Learning: In core/internal/testutil package, the function GenerateTTSAudioForTest returns ([]byte, string) not ([]byte, error), and it already handles errors internally by calling t.Fatalf(). Therefore, the blank identifier for the second return value is correct and no external error checking is needed.
Applied to files:
core/internal/testutil/transcription.go
🧬 Code graph analysis (6)
core/schemas/account.go (1)
ui/lib/types/config.ts (3)
`AzureKeyConfig` (23-27), `VertexKeyConfig` (36-42), `BedrockKeyConfig` (63-71)
core/providers/huggingface/embedding.go (4)
core/schemas/embedding.go (4)
`BifrostEmbeddingRequest` (9-16), `BifrostEmbeddingResponse` (22-28), `EmbeddingData` (118-122), `EmbeddingStruct` (124-128)
core/providers/huggingface/types.go (3)
`HuggingFaceEmbeddingRequest` (161-172), `InputsCustomType` (174-177), `EncodingType` (219-219)
core/schemas/models.go (1)
`Model` (109-129)
core/schemas/chatcompletions.go (1)
`BifrostLLMUsage` (845-852)
core/providers/huggingface/models.go (3)
core/providers/huggingface/types.go (1)
`HuggingFaceListModelsResponse` (28-30)
core/schemas/bifrost.go (9)
`ModelProvider` (32-32), `RequestType` (88-88), `ChatCompletionRequest` (94-94), `ChatCompletionStreamRequest` (95-95), `ResponsesRequest` (96-96), `ResponsesStreamRequest` (97-97), `EmbeddingRequest` (98-98), `SpeechRequest` (99-99), `TranscriptionRequest` (101-101)
core/schemas/models.go (2)
`BifrostListModelsResponse` (36-45), `Model` (109-129)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (3)
`Cerebras` (47-47), `Perplexity` (46-46), `HuggingFace` (51-51)
core/internal/testutil/account.go (3)
core/schemas/bifrost.go (3)
`HuggingFace` (51-51), `Fallback` (149-152), `OpenAI` (35-35)
core/schemas/provider.go (5)
`ProviderConfig` (267-276), `NetworkConfig` (48-56), `DefaultRequestTimeoutInSeconds` (15-15), `ConcurrencyAndBufferSize` (131-134), `Provider` (314-361)
core/internal/testutil/cross_provider_scenarios.go (1)
`ProviderConfig` (45-53)
core/providers/huggingface/types.go (1)
core/schemas/chatcompletions.go (3)
`ChatStreamOptions` (237-240), `ChatTool` (252-257), `ChatToolChoiceFunction` (444-446)
🪛 LanguageTool
docs/contributing/adding-a-provider.mdx
[uncategorized] ~1785-~1785: The official name of this software platform is spelled with a capital “H”.
Context: ....yml- ✅ Environment variable added to.github/workflows/release-pipeline.yml` (4 jobs...
(GITHUB)
[grammar] ~2000-~2000: Use a hyphen to join words.
Context: ...ror or incorrect stream handling Tool calling tests fail: - Solution: Ve...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (45)
ui/lib/constants/config.ts (1)
61-61: LGTM! Correctly marks HuggingFace as requiring an API key, which aligns with the authentication requirements specified in the PR objectives.
docs/contributing/adding-a-provider.mdx (1)
docs/contributing/adding-a-provider.mdx (1)
1-2071: Excellent comprehensive guide for provider contributors! This documentation represents a significant improvement to the contributor experience. The guide is thorough, well-structured, and provides clear patterns for both OpenAI-compatible and custom API providers.
Strengths:
- Clear separation of concerns with strict file conventions
- Progressive examples from simple to complex
- Multiple verification checklists throughout
- Real working examples from existing providers (HuggingFace, Cerebras)
- Comprehensive testing guidance with scenario configuration
- Complete UI integration steps with all affected files
- Helpful troubleshooting sections
Structure:
- ✅ Phase-based implementation workflow
- ✅ CRITICAL markers for important sequences (e.g., file creation order)
- ✅ Separate sections for OpenAI-compatible vs. custom providers
- ✅ Code examples follow stated conventions
- ✅ All past review concerns have been addressed
The guide successfully balances comprehensiveness with usability, providing both high-level patterns and detailed reference implementations.
core/schemas/account.go (2)
9-20: LGTM! HuggingFaceKeyConfig follows established patterns. The addition of `HuggingFaceKeyConfig` to the `Key` struct is consistent with other provider-specific configurations (Azure, Vertex, Bedrock). The structure and formatting align with existing conventions. Based on learnings, this field is reserved for future Hugging Face inference endpoint deployments and is intentionally unused in the current implementation.
70-72: LGTM! HuggingFaceKeyConfig struct is properly defined. The struct definition follows the same pattern as other provider configs with a `Deployments` map for model-to-deployment mappings.
docs/features/providers/supported-providers.mdx (3)
1-14: LGTM! Documentation restructure improves clarity. The changes from "Unified Interface" to "Supported Providers" with reorganized sections (Overview, Response Format) make the documentation more focused and easier to navigate. The updated description clearly highlights Bifrost's multi-provider support and OpenAI-compatible formats.
96-96: LGTM! HuggingFace provider capabilities accurately documented. The provider support matrix correctly lists HuggingFace capabilities:
- ✅ Models, Chat, Chat streaming, Responses, Responses streaming, Embeddings, TTS, STT
- ❌ Text completions, streaming variants, Batch, Files
This aligns with the implementation in the PR.
127-148: LGTM! Custom Providers and metadata sections add valuable context. The new sections clearly explain:
- Custom provider configurations and use cases
- Provider metadata in the `extra_fields` response section
- Configuration options with links to Go SDK and Gateway docs
These additions improve the documentation's completeness.
docs/docs.json (1)
158-166: LGTM! Provider documentation properly grouped. The new "Providers" group logically organizes provider-related documentation (supported-providers, custom-providers, huggingface) under a unified navigation section with appropriate icon. This improves documentation discoverability and structure.
core/schemas/mux.go (2)
core/schemas/mux.go (2)
1155-1241: LGTM! Streaming conversion properly handles reasoning-only responses. The updated logic correctly:
- Gates content emission with `hasContent` and `hasReasoning` checks
- Creates text items when content OR reasoning is present (line 1159)
- Emits an empty delta for reasoning-only first chunks to satisfy lifecycle requirements (lines 1220-1225)
- Tracks content state with the `TextItemHasContent` flag
This implementation supports models that output reasoning without text content, maintaining proper OpenAI-compatible streaming event sequencing.
1410-1411: LGTM! Text item closure supports reasoning-only responses. Removing the dependency on `TextItemHasContent` for closure (line 1411) ensures that text items are properly closed even for reasoning-only responses where no actual text content was emitted. This completes the lifecycle correctly for all response types.
core/providers/huggingface/chat.go (1)
11-106: LGTM! Comprehensive chat request conversion with proper error handling. The `ToHuggingFaceChatCompletionRequest` function correctly:
- Validates input with nil checks (line 12)
- Maps all standard parameters (frequency penalty, temperature, top_p, etc.)
- Handles ResponseFormat conversion with proper error propagation (lines 57-64)
- Converts StreamOptions and Tools arrays
- Supports both string-based (auto/none/required) and structured ToolChoice formats
All past review concerns have been addressed, including error handling for ResponseFormat conversion.
core/internal/testutil/account.go (3)
core/internal/testutil/account.go (3)
114-114: LGTM! HuggingFace properly integrated into test configuration.HuggingFace is correctly:
- Added to configured providers list (line 114)
- Configured with API key retrieval from the `HUGGING_FACE_API_KEY` environment variable (line 330)
- Following the same pattern as other providers
Also applies to: 327-334
589-601: LGTM! HuggingFace provider config has appropriate settings for cold starts. The configuration is well-tuned:
- 300s timeout accommodates model cold starts on serverless inference
- 10 retries aligns with other cloud providers for resilience
- 2s initial → 30s max backoff provides appropriate retry spacing
- Concurrency (4) and buffer size (10) match other providers
1020-1053: LGTM! Comprehensive test configuration for HuggingFace provider.

The test config properly defines:
- Models for chat, vision, embedding, transcription, and speech synthesis
- Comprehensive scenario coverage (chat, streaming, tool calls, embedding, audio)
- Appropriate capability flags (e.g., `Reasoning: false`, `Batch: false`)
- Fallback to OpenAI gpt-4o-mini

This enables thorough testing of the HuggingFace provider integration.
core/providers/huggingface/embedding.go (2)
11-78: LGTM! Embedding request conversion handles provider differences correctly.

The `ToHuggingFaceEmbeddingRequest` function properly:
- Splits the model into inference provider and model name with error handling (lines 16-19)
- Initializes the request with provider-specific fields (Model/Provider for non-hf-inference, empty for hf-inference)
- Uses the correct input field based on provider: `Inputs` for hf-inference, `Input` for others (lines 40-44)
- Maps standard parameters (EncodingFormat, Dimensions)
- Extracts HuggingFace-specific parameters from ExtraParams (normalize, prompt_name, truncate, truncation_direction)

All past review concerns about input field selection have been addressed.
80-168: LGTM! Response unmarshaling supports multiple HuggingFace formats.

The `UnmarshalHuggingFaceEmbeddingResponse` function robustly handles three response formats:
- Standard object with Data/Model/Usage fields (lines 94-114)
- 2D array of embeddings for batch inputs (lines 119-141)
- 1D array for a single embedding (lines 146-164)

The fallback logic with ordered attempts and default Usage provision (lines 107-111, 136-140, 159-163) ensures compatibility across different HuggingFace inference providers.
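The ordered-fallback decoding described above can be sketched as follows; this is a minimal illustration using the standard library's `encoding/json` (the provider itself uses sonic), with hypothetical type and function names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingObject mirrors the standard OpenAI-style shape (hypothetical
// field subset; the real provider struct carries Model and Usage too).
type embeddingObject struct {
	Data []struct {
		Embedding []float64 `json:"embedding"`
	} `json:"data"`
}

// decodeEmbeddings tries the three shapes in order: object, 2D array
// (batch input), then 1D array (single embedding).
func decodeEmbeddings(raw []byte) ([][]float64, error) {
	var obj embeddingObject
	if err := json.Unmarshal(raw, &obj); err == nil && len(obj.Data) > 0 {
		out := make([][]float64, len(obj.Data))
		for i, d := range obj.Data {
			out[i] = d.Embedding
		}
		return out, nil
	}
	var batch [][]float64
	if err := json.Unmarshal(raw, &batch); err == nil && len(batch) > 0 {
		return batch, nil
	}
	var single []float64
	if err := json.Unmarshal(raw, &single); err == nil {
		return [][]float64{single}, nil
	}
	return nil, fmt.Errorf("unrecognized embedding response: %s", raw)
}

func main() {
	got, err := decodeEmbeddings([]byte(`[[0.1, 0.2], [0.3, 0.4]]`))
	fmt.Println(len(got), err) // 2 <nil>
}
```

Each attempt only succeeds when the shape both parses and yields data, so an object payload never falls through to the array paths and vice versa.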
core/providers/huggingface/models.go (2)
16-44: LGTM! Model list conversion with proper filtering.

The `ToBifrostListModelsResponse` function correctly:
- Filters out models with an empty ModelID (lines 25-27)
- Filters out models without supported methods (lines 29-32)
- Constructs Model entries with the proper ID format: `provider/inferenceProvider/modelID` (line 35)
- Populates the Name, SupportedMethods, and HuggingFaceID fields appropriately

This ensures only actionable, properly identified models are exposed.
46-102: LGTM! Comprehensive method derivation from pipeline and tags.

The `deriveSupportedMethods` function effectively:
- Maps pipeline types to core request types (conversational→chat, feature-extraction→embedding, etc.)
- Augments methods from tag patterns covering embedding, chat/completion, TTS, and STT
- Deduplicates via a map-based set (lines 49-54)
- Returns deterministically sorted results (lines 95-101)

This approach ensures models are correctly advertised with their actual capabilities based on HuggingFace metadata. All past concerns about unsupported methods have been addressed.
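The map-based set plus deterministic sort pattern can be sketched like this; the pipeline/tag mapping here is a small illustrative subset, not the provider's full table:

```go
package main

import (
	"fmt"
	"sort"
)

// deriveMethods collects method names implied by the pipeline type and
// tags into a map (which deduplicates), then returns a sorted slice so
// the output order is deterministic across runs.
func deriveMethods(pipeline string, tags []string) []string {
	set := map[string]struct{}{}
	if pipeline == "conversational" {
		set["chat"] = struct{}{}
	}
	for _, t := range tags {
		switch t {
		case "feature-extraction", "sentence-similarity":
			set["embedding"] = struct{}{}
		case "text-generation":
			set["chat"] = struct{}{} // overlaps with pipeline; the set absorbs it
		}
	}
	out := make([]string, 0, len(set))
	for m := range set {
		out = append(out, m)
	}
	sort.Strings(out) // Go map iteration order is random; sort for determinism
	return out
}

func main() {
	fmt.Println(deriveMethods("conversational", []string{"text-generation", "feature-extraction"}))
	// [chat embedding]
}
```

Without the final sort, two calls with identical inputs could advertise methods in different orders, which would make the model list unstable for consumers.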
core/internal/testutil/transcription.go (2)
73-98: LGTM - Fixture-based audio loading for fal-ai/HuggingFace transcription tests.

The implementation correctly:
- Uses `runtime.Caller(0)` to locate fixtures relative to the test file
- Properly handles file read errors with `t.Fatalf`
- Constructs the transcription request with appropriate parameters for the fal-ai path
98-178: LGTM - TTS generation path for non-fal-ai providers.

The else branch properly implements the TTS-based audio generation flow with:
- Correct retry configuration using `GetTestRetryConfigForScenario`
- Proper cleanup of temporary audio files
- Consistent request construction

Based on learnings, `GenerateTTSAudioForTest` returns `([]byte, string)` and handles errors internally via `t.Fatalf()`, so the blank identifier usage at line 277 is correct.

core/providers/huggingface/utils.go (7)
1-51: LGTM - Well-structured inference provider constants and types.

The inference provider type system is cleanly defined with:
- A dedicated `inferenceProvider` string type for type safety
- A comprehensive list of 19 providers matching the HuggingFace documentation
- A special `auto` policy for automatic provider selection
52-81: LGTM - Provider lists are correctly structured.

The `INFERENCE_PROVIDERS` slice and `PROVIDERS_OR_POLICIES` (which adds "auto") are properly initialized. The IIFE pattern for `PROVIDERS_OR_POLICIES` ensures the slice is created once at init time.
83-121: LGTM - Model hub URL builder handles edge cases well.

The function properly:
- Enforces pagination limits with `defaultModelFetchLimit` and `maxModelFetchLimit`
- Handles various `ExtraParams` types via a type switch
- Uses proper URL encoding via `url.Values`
130-147: LGTM - Model/provider parsing with proper error handling.

The `splitIntoModelProvider` function now correctly:
- Returns an error for model names without slashes (t==0)
- Handles single-slash models (org/model) with the `auto` provider
- Handles multi-slash models (provider/org/model) correctly

This addresses the past review concern about empty provider/model names.
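A minimal sketch of this slash-based routing rule follows; the function name and return shape are illustrative, not the provider's actual signature:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelProvider applies the rule described above: a bare name is an
// error, "org/model" gets the "auto" provider, and "provider/org/model"
// splits at the first slash into provider and model.
func splitModelProvider(model string) (provider, name string, err error) {
	parts := strings.SplitN(model, "/", 2)
	switch strings.Count(model, "/") {
	case 0:
		return "", "", fmt.Errorf("invalid model %q: expected org/model", model)
	case 1:
		return "auto", model, nil // single slash: let HuggingFace pick the provider
	default:
		return parts[0], parts[1], nil // first segment is the inference provider
	}
}

func main() {
	p, m, _ := splitModelProvider("nebius/meta-llama/Llama-3.1-8B")
	fmt.Println(p, m) // nebius meta-llama/Llama-3.1-8B
}
```

Splitting at the first slash (rather than the last) is what keeps the `org/model` remainder intact for multi-slash names.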
149-193: LGTM - Provider routing correctly scoped to supported operations.

Per the past discussion, this function intentionally handles only 6 providers (`fal-ai`, `hf-inference`, `nebius`, `replicate`, `sambanova`, `scaleway`) because these are the only providers that support embedding, speech, and transcription operations per the HuggingFace documentation. The other providers in `INFERENCE_PROVIDERS` are used for chat/text-generation, which follows a different routing pattern.
213-267: LGTM - Model inference provider mapping with proper caching.

The implementation correctly:
- Checks the cache first before making HTTP requests
- Uses `sync.Map` for concurrent access safety
- Handles error responses with proper message preservation
- Stores results in the cache after a successful fetch
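The cache-first pattern with `sync.Map` can be sketched as follows, with `fetch` standing in for the HTTP call to the HuggingFace mapping endpoint (all names here are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// mappingCache holds modelID -> resolved mapping; sync.Map is safe for
// concurrent readers and writers without explicit locking.
var mappingCache sync.Map

// resolveProviderModel returns the cached mapping when present, and only
// calls fetch (the network round trip) on a cache miss, storing the
// result for subsequent requests.
func resolveProviderModel(modelID string, fetch func(string) (string, error)) (string, error) {
	if v, ok := mappingCache.Load(modelID); ok {
		return v.(string), nil // cache hit: no HTTP request
	}
	resolved, err := fetch(modelID)
	if err != nil {
		return "", err // errors are not cached, so the next call retries
	}
	mappingCache.Store(modelID, resolved)
	return resolved, nil
}

func main() {
	calls := 0
	fetch := func(id string) (string, error) { calls++; return "hf-inference/" + id, nil }
	resolveProviderModel("org/model", fetch)
	out, _ := resolveProviderModel("org/model", fetch)
	fmt.Println(out, calls) // hf-inference/org/model 1
}
```

Note that under concurrent misses two goroutines may both call `fetch` and the last `Store` wins; that is acceptable here because the mapping is idempotent.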
294-322: LGTM - Audio download with context support.

The `downloadAudioFromURL` function correctly:
- Accepts a context for cancellation/timeout support (addressing past review)
- Uses `providerUtils.MakeRequestWithContext` for context-aware requests
- Copies the body to avoid use-after-free issues with fasthttp's internal buffer
core/providers/huggingface/transcription.go (2)
48-116: LGTM - Parameter mapping correctly handles both typed and ExtraParams.

The implementation properly:
- Always initializes `genParams` regardless of `ExtraParams` presence (fixing the past gating bug)
- Maps typed fields from `request.Params` first
- Overlays `ExtraParams` values when present
- Handles the `early_stopping` union type (bool or string)
- Uses `schemas.SafeExtractIntPointer` for safe numeric extraction
121-160: LGTM - Response conversion handles segments correctly.

The `ToBifrostTranscriptionResponse` method:
- Validates a non-empty `requestedModel`
- Maps chunks to `TranscriptionSegment` with proper timestamp extraction
- Safely handles variable-length timestamp arrays
core/providers/huggingface/types.go (5)
12-52: LGTM - Model types with flexible JSON unmarshaling.

The `HuggingFaceListModelsResponse.UnmarshalJSON` correctly handles both:
- A top-level JSON array (most common for `/api/models`)
- An object with a `models` field (fallback)

This addresses the past review concern about the struct/API format mismatch.
54-71: LGTM - Inference provider mapping types correctly structured.

The `HuggingFaceInferenceProviderInfo` now uses `ProviderModelID` (addressing the past rename suggestion), and the internal `HuggingFaceInferenceProviderMapping` struct correctly separates task and model ID.
73-143: LGTM - Chat request types with flexible tool choice handling.

The `HuggingFaceToolChoice` type correctly:
- Supports enum values ("auto", "none", "required") via `EnumValue`
- Supports a function object via the `Function` field
- Has a proper `MarshalJSON` that emits the correct format for each case
157-224: LGTM - Embedding types with flexible input handling.

The `InputsCustomType` with custom `UnmarshalJSON`/`MarshalJSON`:
- Handles string, array-of-strings, and object forms
- Correctly returns an error for unexpected formats
- Uses `sonic` for efficient JSON operations
292-327: LGTM - EarlyStopping type with proper error handling.

The `HuggingFaceTranscriptionEarlyStopping.UnmarshalJSON` now correctly returns an error with the invalid data when neither boolean nor string parsing succeeds, addressing the past review concern about silently accepting invalid input.

core/providers/huggingface/huggingface.go (11)
29-63: LGTM - Response pooling with proper acquire/release pattern.

The transcription and speech response pools are correctly implemented:
- Pool initialization with `sync.Pool`
- Acquire functions reset structs before returning them
- Release functions handle nil checks
- Pre-warming in the constructor based on the concurrency config
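The acquire/reset/release pooling pattern can be sketched as follows (struct and helper names are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

type speechResponse struct {
	Audio []byte
}

var speechPool = sync.Pool{New: func() any { return &speechResponse{} }}

// acquireSpeechResponse resets the struct before handing it out, so a
// caller never observes data left over from a previous request.
func acquireSpeechResponse() *speechResponse {
	r := speechPool.Get().(*speechResponse)
	r.Audio = r.Audio[:0] // keep the backing array, drop stale contents
	return r
}

// releaseSpeechResponse tolerates nil, mirroring the defensive release
// helpers described above.
func releaseSpeechResponse(r *speechResponse) {
	if r == nil {
		return
	}
	speechPool.Put(r)
}

func main() {
	r := acquireSpeechResponse()
	r.Audio = append(r.Audio, 0x01)
	releaseSpeechResponse(r)
	fmt.Println(len(acquireSpeechResponse().Audio)) // 0
}
```

Resetting on acquire (rather than on release) guarantees a clean struct even for objects freshly created by the pool's `New` function.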
65-99: LGTM - Provider constructor with sensible defaults.

The `NewHuggingFaceProvider` correctly:
- Sets read/write timeouts from the config
- Configures a proxy if present
- Defaults the base URL to `defaultInferenceBaseURL`
- Pre-warms the response pools
111-188: LGTM - Model alias caching with 404 retry optimization.

Per the past discussion, this is an intentional cache-first optimization where:
- Most requests complete with one API call due to the high cache hit rate
- Only on a 404 (cache miss) is there a re-validation and retry
- This saves API calls compared to always validating first
190-250: LGTM - HTTP request handling with proper error message preservation.

The `completeRequest` function correctly:
- Sets appropriate content types for audio vs JSON
- Handles error responses with guarded message overwrites (addressing past review)
- Copies the response body to avoid use-after-free with fasthttp
252-398: LGTM - Parallel model listing across inference providers.

The `listModelsByKey` function:
- Spawns goroutines for each inference provider
- Uses channels and WaitGroup for coordination
- Aggregates results with proper error handling
- Calculates average latency across successful responses
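The fan-out/fan-in coordination can be sketched as follows; `fetchModels` stands in for the per-provider HTTP call, and error aggregation plus latency averaging are omitted for brevity:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetchModels is a stand-in for the per-provider request; the real code
// hits each inference provider's model listing endpoint.
func fetchModels(provider string) ([]string, error) {
	return []string{provider + "/model-a"}, nil
}

// listAllModels fans out one goroutine per provider, collects results
// over a buffered channel, and merges them after all workers finish.
func listAllModels(providers []string) []string {
	results := make(chan []string, len(providers)) // buffered: senders never block
	var wg sync.WaitGroup
	for _, p := range providers {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			if models, err := fetchModels(p); err == nil {
				results <- models
			} // failed providers are simply skipped in this sketch
		}(p)
	}
	wg.Wait()
	close(results) // safe: every sender has returned
	var all []string
	for models := range results {
		all = append(all, models...)
	}
	sort.Strings(all) // deterministic aggregate order
	return all
}

func main() {
	fmt.Println(listAllModels([]string{"nebius", "fal-ai"}))
	// [fal-ai/model-a nebius/model-a]
}
```

Buffering the channel to the provider count lets every goroutine send without blocking, so `wg.Wait()` cannot deadlock against unread sends.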
427-508: LGTM - ChatCompletion with proper model name formatting.

The implementation correctly:
- Parses provider/model from the request
- Reformats the model as `modelName:inferenceProvider` for downstream processing
- Converts to the HuggingFace request format
- Enriches the response with provider metadata
510-564: LGTM - ChatCompletionStream delegates to OpenAI-compatible handler.

The streaming implementation correctly:
- Uses the shared `openai.HandleOpenAIChatCompletionStreaming` for SSE handling
- Passes a custom request converter for the HuggingFace-specific format
- Sets stream=true in the request body
566-596: LGTM - Responses API implemented via ChatCompletion fallback.

The Responses and ResponsesStream methods correctly:
- Convert to the chat request format
- Delegate to ChatCompletion/ChatCompletionStream
- Convert the response back to the Responses format
- Set appropriate context for stream fallback detection
598-681: LGTM - Embedding with model alias caching.

The Embedding function correctly:
- Validates and parses the model name
- Uses `completeRequestWithModelAliasCache` for model resolution
- Handles raw request/response tracking
- Uses the custom `UnmarshalHuggingFaceEmbeddingResponse` for flexible format handling
683-763: LGTM - Speech with audio download post-processing.

The Speech function correctly:
- Uses model alias caching for the text-to-speech task
- Acquires/releases the pooled response
- Downloads audio from the returned URL using the context-aware method
- Converts to the Bifrost response format
769-856: LGTM - Transcription with hf-inference audio handling.

The Transcription function correctly:
- Validates input for the hf-inference provider (addressing the past nil deref concern)
- Sends raw audio bytes for hf-inference vs JSON for other providers
- Uses model alias caching for the automatic-speech-recognition task
- Properly acquires/releases the pooled response
3e8d6d7 to 60e18dc
Summary
Adding Hugging Face inference provider.
Changes
Type of change
Affected areas
How to test
Added a new environment variable for Hugging Face:
HUGGING_FACE_API_KEY=""
Screenshots/Recordings
Breaking changes
Related issues
Closes #430
Checklist
Read `docs/contributing/README.md` and followed the guidelines