Conversation


@colinosullivan-ie colinosullivan-ie commented Dec 7, 2025

Which issue(s) does this pull-request address?

#1228

Closes: #

Description

Adding docstring documentation to the Backend module

Checklist

General

Code quality checks

  • Code quality checks pass: mise check (mise fix to auto-fix)

Testing

  • Unit tests pass: mise test:unit
  • E2E tests pass: mise test:e2e
  • Tests are included (for bug fixes or new features)

Documentation

  • Documentation is updated
  • Embedme embeds code examples in docs. To update after edits, run: mise docs:fix

@colinosullivan-ie colinosullivan-ie requested a review from a team as a code owner December 7, 2025 21:15
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Dec 7, 2025
@github-actions github-actions bot added the python Python related functionality label Dec 7, 2025
@gemini-code-assist

Summary of Changes

Hello @colinosullivan-ie, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the documentation within the backend module of the beeai-framework by introducing comprehensive docstrings across several key classes and methods. The primary goal is to improve the developer experience by providing clear, detailed explanations and usage examples for core components such as Backend, ChatModel, EmbeddingModel, DocumentLoader, TextSplitter, and VectorStore. This effort ensures that developers can quickly understand and effectively utilize the framework's AI model and data processing capabilities.

Highlights

  • Backend Module Documentation: Comprehensive docstrings have been added to the Backend class and its core methods (__init__, from_name, from_provider, clone) in python/beeai_framework/backend/backend.py. These docstrings clarify the purpose, attributes, arguments, returns, and provide usage examples for each component, significantly improving the clarity of the unified interface for chat and embedding models.
  • Chat Model Documentation: Detailed docstrings are now present for ChatModelKwargs and the ChatModel abstract base class, along with all its abstract methods and key functionalities like __init__, config, from_name, clone, and get_default_parameters in python/beeai_framework/backend/chat.py. This enhances understanding of how to interact with various LLM providers, including tool calling, structured outputs, and model configuration.
  • Embedding Model Documentation: The EmbeddingModelKwargs and EmbeddingModel abstract base class, including its properties (model_id, provider_id, emitter) and methods (__init__, create, from_name, clone, destroy), have received extensive docstrings in python/beeai_framework/backend/embedding.py. This provides clear guidance on converting text into vector embeddings and managing embedding model instances.
  • Document and Text Processing Documentation: Docstrings have been added to the DocumentLoader, TextSplitter, QueryLike protocol, and VectorStore abstract base classes and their respective methods in python/beeai_framework/backend/document_loader.py, python/beeai_framework/backend/text_splitter.py, and python/beeai_framework/backend/vector_store.py. This improves the documentation for loading documents, splitting text into chunks, and interacting with vector databases for semantic search.
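To make the Highlights above concrete, here is a minimal sketch of the Google-style docstring format the new documentation follows (the attribute names and wording below are illustrative stand-ins, not text copied from the diff):

class Backend:
    """Unified interface over a chat model and an embedding model.

    Attributes:
        chat: The chat model used for text generation.
        embedding: The embedding model used to produce vector embeddings.

    Example:
        >>> backend = Backend.from_provider("openai")
        >>> response = await backend.chat.create(messages=messages)
    """
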
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize your Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does an excellent job of adding comprehensive docstrings to the backend module, significantly improving the documentation. The new docstrings are clear, follow a consistent style, and include helpful examples.

I've identified a couple of areas for improvement:

  • The examples for the clone methods in Backend, ChatModel, and EmbeddingModel can be misleading. They imply a deep copy behavior that is not guaranteed by the base implementation, which could lead to user confusion.
  • The placeholder error messages in NotImplementedError exceptions for abstract methods could be made more descriptive to improve the developer experience when subclassing.

These are minor points in an otherwise great contribution to the project's documentation.

Comment on lines +114 to +117
Example:
>>> original = Backend.from_provider("openai")
>>> cloned = await original.clone()
>>> # Modifications to cloned won't affect original

medium

The example for clone assumes that the underlying chat and embedding models implement a deep copy clone method. As noted in the ChatModel and EmbeddingModel base classes, their default clone implementation performs a shallow copy (returns self). If the specific models (e.g., from the "openai" provider) do not override this, the example is misleading as cloned will not be a deep copy. This could cause unexpected side effects for users. The documentation should be clear about this dependency or use a more robust example that doesn't rely on a specific provider's implementation.
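
One way to make the example hold regardless of provider is for a subclass to return a genuinely independent copy. A rough sketch under that assumption (the subclass below is hypothetical, and a naive deep copy may not suit providers that hold live client objects):

import copy

from beeai_framework.backend.backend import Backend


class DeepCopyingBackend(Backend):
    async def clone(self) -> "DeepCopyingBackend":
        # Hypothetical override: copy everything so that changes to the
        # clone's chat or embedding models cannot leak back into the original.
        return copy.deepcopy(self)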

Comment on lines +747 to +751
Example:
>>> original = ChatModel.from_name("openai:gpt-4")
>>> cloned = await original.clone()
>>> # Modifications to cloned won't affect original
>>> cloned.config(parameters=ChatModelParameters(temperature=0.9))

medium

The example provided for the clone method can be misleading. It uses ChatModel.from_name("openai:gpt-4"), which returns an OpenAIChatModel instance. However, the base ChatModel.clone method returns self (a shallow copy) and logs a warning if not overridden. If OpenAIChatModel or its parents do not override clone to perform a deep copy, the example is incorrect because modifications to cloned will affect original. This could lead to subtle bugs for users who copy this example.

To avoid confusion, the example should either use a mock class that is known to implement clone correctly, or be rephrased to not rely on a specific implementation that may not have this feature.
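
A toy illustration of that mock-based approach (FakeChatModel and its attributes are invented purely for the docstring example and are not part of the framework):

import asyncio
import copy


class FakeChatModel:
    """Stand-in model whose clone() returns an independent deep copy."""

    def __init__(self) -> None:
        self.temperature = 0.0

    async def clone(self) -> "FakeChatModel":
        return copy.deepcopy(self)


async def main() -> None:
    original = FakeChatModel()
    cloned = await original.clone()
    cloned.temperature = 0.9
    assert original.temperature == 0.0  # modifying the clone leaves the original untouched


asyncio.run(main())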

Comment on lines +286 to +289
Example:
>>> original = EmbeddingModel.from_name("openai:text-embedding-3-small")
>>> cloned = await original.clone()
>>> # Modifications to cloned won't affect original

medium

Similar to the ChatModel.clone method, the example for EmbeddingModel.clone can be misleading. It uses EmbeddingModel.from_name("openai:text-embedding-3-small") and implies a deep copy is made. However, the base implementation returns self and warns if not overridden. If OpenAIEmbeddingModel does not implement a deep clone, this example is incorrect and could cause bugs for users.

Consider using a mock class in the example to demonstrate the intended behavior of a correctly implemented clone method, to avoid confusion.

>>> print(len(chunks))
>>> print(chunks[0]) # First chunk
"""
raise NotImplementedError("Implement me")

medium

The NotImplementedError message "Implement me" is generic. For a better developer experience, consider providing a more descriptive message.

Suggested change
raise NotImplementedError("Implement me")
raise NotImplementedError("Subclasses must implement the `split_text` method.")
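
Taken together with the docstring, the flagged method might end up looking roughly like this; whether the framework marks it with @abstractmethod, or whether split_text is async, is not visible in this diff, so treat the signature as an assumption. The same descriptive-message pattern applies to the other occurrences flagged below.

from abc import ABC, abstractmethod


class TextSplitter(ABC):
    @abstractmethod
    async def split_text(self, text: str) -> list[str]:
        """Split the input text into chunks.

        Raises:
            NotImplementedError: If the subclass doesn't implement this method.
        """
        raise NotImplementedError("Subclasses must implement the `split_text` method.")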

Raises:
NotImplementedError: If the subclass doesn't implement this method.
"""
raise NotImplementedError("Implement me")

medium

The NotImplementedError message "Implement me" is generic. For a better developer experience, consider providing a more descriptive message.

Suggested change
raise NotImplementedError("Implement me")
raise NotImplementedError("Subclasses must implement the `_class_from_name` method.")

>>> ids = await vector_store.add_documents(documents)
>>> print(ids) # ['id1', 'id2']
"""
raise NotImplementedError("Implement me")

medium

The NotImplementedError message "Implement me" is generic. For a better developer experience, consider providing a more descriptive message.

Suggested change
raise NotImplementedError("Implement me")
raise NotImplementedError("Subclasses must implement the `add_documents` method.")

... filter={"source": "documentation"}
... )
"""
raise NotImplementedError("Implement me")

medium

The NotImplementedError message "Implement me" is generic. For a better developer experience, consider providing a more descriptive message.

Suggested change
raise NotImplementedError("Implement me")
raise NotImplementedError("Subclasses must implement the `search` method.")

Raises:
NotImplementedError: If the subclass doesn't implement this method.
"""
raise NotImplementedError("Implement me")

medium

The NotImplementedError message "Implement me" is generic. For a better developer experience, consider providing a more descriptive message.

Suggested change
raise NotImplementedError("Implement me")
raise NotImplementedError("Subclasses must implement the `_class_from_name` method.")

Comment on lines +69 to +84
Attributes:
tool_call_fallback_via_response_format: Enable fallback to response format for tool calls.
retry_on_empty_response: Automatically retry when the model returns an empty response.
model_supports_tool_calling: Whether the underlying model supports native tool calling.
allow_parallel_tool_calls: Allow the model to make multiple tool calls simultaneously.
ignore_parallel_tool_calls: Ignore all but the first tool call when multiple are returned.
use_strict_tool_schema: Use strict JSON schema validation for tool parameters.
use_strict_model_schema: Use strict JSON schema validation for structured outputs.
supports_top_level_unions: Whether the model supports union types at the top level.
parameters: Default parameters for model generation (temperature, max_tokens, etc.).
cache: Cache implementation for storing and retrieving model outputs.
settings: Additional provider-specific settings.
middlewares: List of middleware to apply during model execution.
tool_choice_support: Set of supported tool choice modes (required, none, single, auto).
fix_invalid_tool_calls: Automatically attempt to fix malformed tool calls.
"""
Contributor

Could you do comments on each attribute instead? Like it is done in AgentOptions.

@Tomas2D Tomas2D left a comment

Thank you for your contribution. Just please update the way comments are done for Pydantic Models / Kwargs. Prefer an inline description instead of the top-level one, as it improves readability.
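
For reference, a rough sketch of the inline-description style being asked for, assuming ChatModelKwargs is a Pydantic model with placeholder defaults (only two attributes from the quoted block are shown):

from pydantic import BaseModel


class ChatModelKwargs(BaseModel):
    tool_call_fallback_via_response_format: bool = False
    """Enable fallback to response format for tool calls."""

    retry_on_empty_response: bool = False
    """Automatically retry when the model returns an empty response."""

    # ...the remaining attributes would carry their own inline description
    # in the same way, instead of one top-level Attributes block.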
