Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 35 additions & 3 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -478,7 +478,39 @@ Net effect: the code for a Hermes-compatible planning todo tool is present and d

**Takeaway for future work**: Before adding ColorScheme-based theming, confirm that the configuration API actually returns usable RGB values on the target LibreOffice versions. If not, prefer letting LibreOffice handle dark/light mode by default and only layer on minimal, well-tested overrides.

## 8. Gotchas
## 8. Error Handling Style Guide

To maintain a predictable and reliable codebase, follow these rules when handling errors:

### 1. Unified Error Classes
- **Avoid raising base `Exception`**: Do not use `raise Exception("...")` or catch raw `except Exception:` unless absolutely necessary at the highest level boundary (like a thread run loop).
- **Use Custom Exceptions**: Prefer creating and using specific exceptions that inherit from a central `WriterAgentException` (to be created in `plugin/framework/errors.py`), such as `NetworkError`, `ConfigError`, `UnoObjectError`, or `ToolExecutionError`. This allows callers to distinguish between user-correctable issues (like bad config) and systemic failures.

### 2. Standardized Error Payload Formats
- **Consistent JSON Structure**: When an API endpoint, MCP route, or AI tool execution fails, serialize the error as a standardized dictionary/JSON object:
```json
{
"status": "error",
"code": "SPECIFIC_ERROR_CODE",
"message": "Human readable summary of the failure.",
"details": { "key": "value" }
}
```
- **Never return raw strings** for tool errors; wrap them in the standard `{"status": "error", ...}` format so the LLM and UI can parse them predictably.

### 3. Improve Error Context Logging
- **Rich Context**: When raising a custom exception, attach context (e.g., `endpoint`, `document_url`, `tool_name`) so the top-level logger can print *what* failed, not just *that* it failed.
- **Use `log_exception`**: When catching unexpected errors, use `logging.log_exception(e, context="ModuleContext")` (or `traceback.format_exc()` into `debug_log`) rather than silently swallowing the traceback or just printing the message.

### 4. Recovery Patterns for Common Failures
- **Graceful Degradation**: Where external systems are involved (HTTP requests, LLM parsing, UNO COM calls), anticipate transient failures.
- **LLM Parse Failures**: If the AI returns malformed JSON, catch the specific parsing error and inject a recovery prompt (e.g., "The previous tool call had malformed JSON. Please fix the syntax and try again.") before failing the session.
- **UNO Stale References**: When interacting with LibreOffice documents, elements (like shapes or text cursors) can be deleted by the user mid-operation. Catch `DisposedException` or `RuntimeException`, invalidate the document cache, and abort gracefully without crashing the extension.
- **Network Retries**: For HTTP 502/503/504, implement limited backoff/retry loops instead of immediately failing the generation request.

---

## 9. Gotchas

- **Settings dialog fields**: The list of settings is defined in **`MainJob._get_settings_field_specs()`** (single source); `_apply_settings_result` derives apply keys from it. Settings dialog field list in XDL must match the names in that method.
- **Library name**: `WriterAgentDialogs` (folder name) must match `library:name` in `dialog.xlb`.
Expand All @@ -499,7 +531,7 @@ Net effect: the code for a Hermes-compatible planning todo tool is present and d

---

## 9. References
## 10. References

- LibreOffice xmlscript: `~/Desktop/libreoffice/xmlscript/` (if you have a local clone)
- DTD: `xmlscript/dtd/dialog.dtd`
Expand All @@ -508,7 +540,7 @@ Net effect: the code for a Hermes-compatible planning todo tool is present and d

---

## 10. Debugging Tips (Agent Hard-won Lessons)
## 11. Debugging Tips (Agent Hard-won Lessons)

### UNO UI Controls
- **Populating ListBox/ComboBox**: Setting `.Text` or `.String` is often not enough for selection lists. Use the **`StringItemList`** property (a tuple of strings) to populate a `ListBox` or `ComboBox` model. The UI will not show items otherwise.
Expand Down
113 changes: 113 additions & 0 deletions ERROR_HANDLING_DEV_PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
# Error Handling & Consistency: Dev Plan

## Focus: Standardize Error Patterns Across Modules

**Goal**: Make debugging easier and more predictable by creating unified error classes, standardizing error payload formats, improving error context logging, and adding recovery patterns for common failures.

This plan details a phased refactor to clean up widespread, ad-hoc `Exception` usage across the codebase.

---

## 1. Audit & Problem Statement

Currently, the WriterAgent codebase suffers from fragmented error handling:
- **Overuse of Base Exception**: Modules like `plugin/modules/http/client.py`, `plugin/framework/document.py`, and `plugin/framework/image_utils.py` extensively use `raise Exception("...")` and catch generic `except Exception:`.
- **Fragmented Custom Exceptions**: There are isolated custom exceptions scattered throughout (`ConfigAccessError` in `config.py`, `AuthError` in `auth.py`, `TunnelError` in `tunnel/__init__.py`, `BusyError` in `mcp_protocol.py`, `IdentifiedError` in `aihordeclient`).
- **Inconsistent Error Payloads**: Tool execution errors, MCP API responses, and streaming errors do not follow a strict JSON payload schema. Sometimes they return `{"status": "error", "message": "..."}`, sometimes just a raw string, sometimes an empty `{}`.
- **Lost Context**: Generic exceptions often lose their traceback or operational context (e.g. *which* tool failed, *which* UNO document threw the error), making the `writeragent_debug.log` hard to parse.

---

## 2. Refactor Strategy & Unified Error Classes

### Phase 1: Centralized Error Hierarchy (`plugin/framework/errors.py`)
Create a new central file `plugin/framework/errors.py` (or similar) defining a standard exception hierarchy:

```python
class WriterAgentException(Exception):
"""Base exception for all WriterAgent errors."""
def __init__(self, message, code="INTERNAL_ERROR", context=None):
super().__init__(message)
self.code = code
self.context = context or {}

class ConfigError(WriterAgentException):
"""Configuration, Auth, or Settings issues."""
# Replaces ConfigAccessError, AuthError

class NetworkError(WriterAgentException):
"""HTTP/Network related failures."""
# Replaces ad-hoc LlmClient / Request exceptions

class UnoObjectError(WriterAgentException):
"""LibreOffice UNO interface failures (stale docs, missing properties)."""

class ToolExecutionError(WriterAgentException):
"""Tool invocation and execution failures."""

class AgentParsingError(WriterAgentException):
"""LLM output / JSON parsing failures."""
```

*Action Items*:
- Map existing scattered custom exceptions (`ConfigAccessError`, `AuthError`, etc.) to inherit from or be replaced by these centralized classes.
- Update `plugin/modules/http/client.py` to raise `NetworkError` or `AgentParsingError` instead of generic `Exception`.
- Update `plugin/framework/document.py` UNO interactions to catch `Exception` and re-raise as context-rich `UnoObjectError` or handle gracefully.

---

## 3. Standardize Error Payload Formats

### Phase 2: Payload Normalization
Standardize how tools, MCP server responses, and UI elements serialize errors.

**Proposed Standard JSON Payload**:
```json
{
"status": "error",
"code": "SPECIFIC_ERROR_CODE",
"message": "Human readable summary of the failure.",
"details": {
"key": "value (e.g., endpoint URL, tool name, token count)"
}
}
```

*Action Items*:
- Refactor `ToolRegistry.execute()` (in `plugin/framework/tool_registry.py`) to consistently output this schema when catching a `ToolExecutionError` or `WriterAgentException`.
- Refactor MCP Server (`plugin/modules/http/server.py` and `mcp_protocol.py`) to map `WriterAgentException` codes to appropriate JSON-RPC or HTTP error payloads.
- Ensure the AI UI (Chat Sidebar) unwraps these payloads cleanly (e.g., in `send_handlers.py` and `api.format_error_for_display()`).

---

## 4. Improve Error Context Logging

### Phase 3: Enhanced Diagnostics
Generic exceptions currently lose the context of *what* the system was doing.

*Action Items*:
- Update `plugin/framework/logging.py` (`debug_log` and `log_exception`) to automatically extract and format the `.context` dictionary attached to `WriterAgentException`.
- When catching generic `Exception` in high-level loops (like the `drain_loop` in `tool_loop.py` or background workers in `worker_pool.py`), ensure the full `traceback.format_exc()` is reliably logged with context before gracefully returning the standardized error payload.

---

## 5. Recovery Patterns for Common Failures

### Phase 4: Resiliency
Based on the audit, implement localized recovery patterns where `except Exception` blocks are currently failing silently or crashing abruptly:

1. **UNO Stale Object Recovery** (`plugin/framework/document.py`):
- When referencing a document element (e.g. `cursor`, `shape`) that no longer exists (often resulting in `DisposedException` or `RuntimeException`), catch it, invalidate the `DocumentCache`, and attempt a retry before giving up.
2. **Network/HTTP Transient Failures** (`plugin/modules/http/client.py`):
- Implement exponential backoff for 502/503/504 errors on API endpoints instead of immediately hard-failing the chat session.
3. **JSON Parse Fallbacks** (`send_handlers.py` / `tool_call_parsers`):
- If an LLM returns malformed JSON for a tool call, inject a fast-recovery prompt (e.g., "The previous tool call had malformed JSON. Please fix syntax and try again.") instead of surfacing the raw JSON decode error to the user.

---

## Summary of Next Steps for Implementation
1. Add `plugin/framework/errors.py`.
2. Migrate `plugin/framework/config.py` and `plugin/framework/auth.py` to use new error classes.
3. Overhaul `plugin/modules/http/client.py` to strip out bare `raise Exception("...")` calls.
4. Refactor `ToolRegistry` to emit the standardized `{"status": "error", "code": "...", "message": "..."}` payload format.
5. Run the full test suite (`make test`) ensuring that mocked/simulated error conditions correctly trigger the new payloads and log outputs.