57 changes: 57 additions & 0 deletions docs/SIF/SEO_DASHBOARD_INSIGHTS.md
@@ -79,3 +79,60 @@ The `RealTimeSemanticMonitor` service runs periodically (default: daily or on-de
1. **Polls SIF**: Checks for new indexed documents.
2. **Runs Agents**: Executes agent logic against the fresh index.
3. **Generates Alerts**: If a critical threshold is breached (e.g., Health < 50%), it sends a system notification.

---

## 🤝 Team Huddle

The SEO Dashboard includes a dedicated **Team Huddle** stream that translates agent orchestration into a user-readable operational timeline.

### Data Contract
Each huddle item conforms to a normalized event envelope so the widget, activity page, and notification system render the same source of truth.

| Contract Block | Required Fields | Notes |
|---|---|---|
| `status` | `agent_id`, `state`, `started_at`, `last_heartbeat_at` | `state` enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`. |
| `run` | `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome` | `trigger` enum: `scheduled`, `manual`, `event_driven`. |
| `event` | `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at` | `event_type` enum: `insight`, `task`, `system`, `handoff`. |
| `alert` | `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged` | Used by in-product banners and digest notifications. |
| `approval` | `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state` | `approval_state` enum: `pending`, `approved`, `rejected`, `expired`. |
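
As an illustration of how a consumer might enforce the `event` block of this contract, here is a minimal sketch. The names `HuddleEvent`, `EventType`, and `parse_event` are hypothetical, and the integer `event_id` and string timestamp types are assumptions; the contract above only fixes the field names and the enum values.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(str, Enum):
    """Allowed `event_type` values, taken from the contract table."""
    INSIGHT = "insight"
    TASK = "task"
    SYSTEM = "system"
    HANDOFF = "handoff"


@dataclass(frozen=True)
class HuddleEvent:
    event_id: int        # assumed integer; the contract only says it increases monotonically
    run_id: str
    agent_id: str
    event_type: EventType
    severity: str
    summary: str
    created_at: str      # assumed ISO 8601 string


REQUIRED_EVENT_FIELDS = (
    "event_id", "run_id", "agent_id", "event_type",
    "severity", "summary", "created_at",
)


def parse_event(payload: dict) -> HuddleEvent:
    """Build a HuddleEvent, raising ValueError on missing fields or unknown enum values."""
    missing = [name for name in REQUIRED_EVENT_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    fields = {name: payload[name] for name in REQUIRED_EVENT_FIELDS}
    fields["event_type"] = EventType(payload["event_type"])  # ValueError if not in the enum
    return HuddleEvent(**fields)
```

The other four blocks could be modeled the same way, each with its own required-field tuple and enum.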

### Refresh + Stream Semantics
- **Initial load**: fetch the latest 50 Team Huddle rows for the active workspace.
- **Near real-time stream**: server-sent events (SSE) push deltas every 1-3 seconds when new events exist.
- **Polling fallback**: if SSE disconnects, poll every 15 seconds with `since=<last_event_timestamp>`.
- **Ordering rule**: sort by `created_at DESC`, break ties using monotonically increasing `event_id`.
- **Idempotency**: clients de-duplicate using `event_id` to prevent duplicate cards during reconnect.
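
The ordering and idempotency rules can be sketched as a single merge step on the client. This is an assumption-laden illustration: `merge_feed` is a hypothetical helper, `created_at` is assumed to be an ISO 8601 UTC string (so string comparison matches time order), and ties are assumed to place the higher `event_id` first.

```python
def merge_feed(current: list, incoming: list) -> list:
    """Merge new rows into the feed, dropping duplicates and re-sorting.

    Rows are dicts with at least `event_id` and `created_at`.
    """
    by_id = {row["event_id"]: row for row in current}
    for row in incoming:
        # A row replayed after a reconnect has a known event_id and is skipped.
        by_id.setdefault(row["event_id"], row)
    # Newest first; among equal timestamps the higher event_id comes first (assumed).
    return sorted(
        by_id.values(),
        key=lambda r: (r["created_at"], r["event_id"]),
        reverse=True,
    )
```

The same function works for both SSE deltas and polling responses, which keeps the fallback path from producing duplicate cards.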

### Latency Targets
- **P50 ingest-to-display**: <= 2 seconds for `status` and `event` updates.
- **P95 ingest-to-display**: <= 5 seconds under normal load.
- **Critical alerts**: banner render in <= 3 seconds P95 after alert creation.
- **Approval state changes**: reflected in UI in <= 2 seconds P95.
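
One way to check measured latencies against the P50 and P95 targets is a nearest-rank percentile over a sample window. This is only a sketch: `percentile` and `meets_targets` are hypothetical names, and the choice of the nearest-rank method is an assumption, since the targets above do not say how percentiles are computed.

```python
TARGETS_MS = {50: 2000, 95: 5000}  # P50 <= 2 s, P95 <= 5 s, from the list above


def percentile(samples_ms: list, pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]


def meets_targets(samples_ms: list) -> dict:
    """Return, per percentile, whether the observed latency is within target."""
    return {pct: percentile(samples_ms, pct) <= limit for pct, limit in TARGETS_MS.items()}
```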

### Failure + Fallback Behavior
- If stream transport fails, show a non-blocking "Live updates paused" badge and automatically switch to polling.
- If both stream and polling fail, keep last known data, mark timestamp as stale, and expose a "Retry" action.
- If huddle payload validation fails, quarantine invalid records and render a generic "system event" row instead of crashing the feed.
- If agent status heartbeats are missing for >2 intervals, render agent as `degraded` with tooltip context.
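
The heartbeat rule in the last bullet could be evaluated on the client roughly as follows. `effective_state` is a hypothetical helper, and the heartbeat interval is passed in as a parameter because this document does not specify it.

```python
from datetime import datetime, timedelta, timezone


def effective_state(
    reported_state: str,
    last_heartbeat_at: datetime,
    now: datetime,
    interval: timedelta,
    max_missed_intervals: int = 2,
) -> str:
    """Return `degraded` when more than `max_missed_intervals` have elapsed since the last heartbeat."""
    if now - last_heartbeat_at > max_missed_intervals * interval:
        return "degraded"
    return reported_state


# Example with an assumed 30-second heartbeat interval.
last_seen = datetime(2024, 1, 1, tzinfo=timezone.utc)
state = effective_state(
    "running", last_seen, last_seen + timedelta(seconds=90), timedelta(seconds=30)
)
```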

### User-Visible Detail Tiers + Security Constraints
- **Tier 1 (Overview)**: summary text, agent name, timestamp, severity color.
- **Tier 2 (Operational)**: run metadata (`run_id`, trigger, duration, outcome), alert thresholds, approval state.
- **Tier 3 (Debug/Admin)**: correlation IDs, raw payload excerpt, retry metadata, trace IDs.
- Access controls:
  - Tier 1 is available to all workspace members.
  - Tier 2 requires analyst/editor role.
  - Tier 3 requires admin role and is excluded from exported reports by default.
- Sensitive fields (tokens, secrets, external auth headers, personal identifiers) must be redacted prior to persistence and never emitted in SSE payloads.
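
The redaction requirement might be implemented as a recursive pass applied before a record is persisted or emitted. The key list below is hypothetical; the real set of sensitive field names is not given in this document.

```python
# Hypothetical key names; the document lists categories, not concrete keys.
SENSITIVE_KEYS = frozenset({"token", "secret", "authorization", "api_key", "email"})
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of `value` with sensitive keys masked at any nesting depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Because the function returns a new structure, the original payload is left untouched for any in-memory processing that still needs it.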

### Acceptance Criteria: View Full Team Activity
- "View Full Team Activity" opens a full-page activity timeline filtered to the currently selected date range and workspace.
- Expected row fields: `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`, `run_id`, `workflow_type`, `outcome`, `approval_state` (if present), `alert_id` (if present).
- Interaction flow:
  1. User clicks **View Full Team Activity** from Team Huddle widget.
  2. System opens Activity page and preserves dashboard filters (date, agent, severity).
  3. User expands a row to view Tier 2 details; admins can toggle Tier 3 diagnostics.
  4. User can acknowledge alerts inline and approve/reject pending approvals where authorized.
  5. Returning to Dashboard restores previous scroll position and active widget tab.
- Empty state behavior: show "No team activity in this range" plus quick actions to clear filters or jump to last 24 hours.
70 changes: 70 additions & 0 deletions docs/SIF/SIF_AGENTS_TEAM_ARCHITECTURE.md
@@ -114,6 +114,76 @@ The agents are visible to the user in three key areas:

---

## 🤝 Team Huddle (System Contract)

The Team Huddle is the canonical operational surface for multi-agent coordination. It must stay consistent across dashboard widget, notifications, and the full activity view.

### Event/Data Contract
All orchestration updates are emitted as typed records under a shared schema:

- **`status`**
  - `agent_id`, `state`, `started_at`, `last_heartbeat_at`, `run_id?`
  - State enum: `idle`, `running`, `blocked`, `waiting_approval`, `degraded`.
- **`run`**
  - `run_id`, `workflow_type`, `trigger`, `started_at`, `ended_at`, `duration_ms`, `outcome`
  - Trigger enum: `scheduled`, `manual`, `event_driven`.
- **`event`**
  - `event_id`, `run_id`, `agent_id`, `event_type`, `severity`, `summary`, `created_at`, `metadata`
  - Event type enum: `insight`, `task`, `system`, `handoff`.
- **`alert`**
  - `alert_id`, `event_id`, `threshold_key`, `threshold_value`, `observed_value`, `created_at`, `is_acknowledged`
- **`approval`**
  - `approval_id`, `run_id`, `action_label`, `requested_by`, `requested_at`, `expires_at`, `approval_state`
  - Approval state enum: `pending`, `approved`, `rejected`, `expired`.

### Refresh + Stream Semantics
- Primary transport is SSE with incremental delivery for each record type.
- Clients bootstrap with latest N (default 50) records, then subscribe for deltas.
- On disconnect, reconnect with exponential backoff; if retries are exhausted, switch to 15-second polling.
- Feed ordering is deterministic by `created_at DESC`, tie-broken by `event_id`.
- Duplicate prevention uses `event_id` as the idempotency key; `status` records are keyed by `agent_id` + `last_heartbeat_at` instead.
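
A rough sketch of the reconnect schedule and the two idempotency keys follows. The base delay, growth factor, cap, and retry count are illustrative assumptions (this document specifies only that backoff is exponential and that polling takes over once retries run out), and both function names are hypothetical.

```python
def backoff_delays(
    base_s: float = 1.0, factor: float = 2.0, cap_s: float = 30.0, max_retries: int = 5
) -> list:
    """Delays, in seconds, to wait before each reconnect attempt (all defaults are assumed)."""
    return [min(cap_s, base_s * factor ** attempt) for attempt in range(max_retries)]


def idempotency_key(record_type: str, record: dict) -> str:
    """Key used to drop duplicates: `event_id` for events, agent plus heartbeat for status."""
    if record_type == "status":
        return f'{record["agent_id"]}:{record["last_heartbeat_at"]}'
    if record_type == "event":
        return str(record["event_id"])
    raise ValueError(f"no idempotency key defined here for record type {record_type!r}")
```

After the last delay in the schedule has elapsed without a successful reconnect, the client would fall back to polling every 15 seconds.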

### Latency SLOs
- P50 ingest-to-UI: <= 2s for status/event.
- P95 ingest-to-UI: <= 5s for all non-bulk events.
- Critical alert propagation: <= 3s P95.
- Approval decision reflection: <= 2s P95.

### Failure + Fallback Behavior
- If ingestion pipeline lags, emit synthetic `system` event with severity `warning` to inform users.
- If an agent misses two heartbeat windows, transition status to `degraded` and suspend dependent handoffs.
- If schema validation fails, route to dead-letter queue and emit sanitized `system` placeholder event.
- If the transport is unavailable, the UI remains functional in read-only cached mode with manual refresh controls.
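
The schema-validation rule could look like the sketch below, with a plain list standing in for the dead-letter queue. `ingest` is a hypothetical name, and the placeholder's `warning` severity and summary text are assumptions, not part of the contract.

```python
REQUIRED_EVENT_FIELDS = (
    "event_id", "run_id", "agent_id", "event_type",
    "severity", "summary", "created_at",
)


def ingest(record: dict, dead_letter_queue: list) -> dict:
    """Pass valid records through; quarantine invalid ones and return a sanitized placeholder."""
    missing = [name for name in REQUIRED_EVENT_FIELDS if name not in record]
    if not missing:
        return record
    dead_letter_queue.append(record)  # keep the raw record for later inspection
    # The placeholder carries no raw payload, only what is needed to render a row.
    return {
        "event_id": record.get("event_id"),
        "event_type": "system",
        "severity": "warning",                          # assumed severity
        "summary": "An event could not be displayed.",  # assumed wording
        "created_at": record.get("created_at"),
    }
```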

### User Detail Tiers + Security Constraints
- **Tier 1: Summary** — agent, summary, timestamp, severity.
- **Tier 2: Operational** — run context, thresholds, workflow outcome, approval state.
- **Tier 3: Diagnostic** — trace/correlation IDs, retry counters, raw sanitized metadata.
- Role mapping:
  - Workspace Member -> Tier 1
  - Analyst/Editor -> Tier 1-2
  - Admin/Owner -> Tier 1-3
- Security rules:
  - Secrets, credentials, API keys, and personal identifiers are redacted before persistence.
  - Tier 3 data is never included in default exports or external webhook mirrors.
  - Approval actions require explicit authorization and audit logging of actor + timestamp.
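
The role mapping lends itself to a simple field projection applied server-side before a row is returned. The role strings and the per-field tier assignments below are illustrative assumptions; unknown fields default to Tier 3 and unknown roles see nothing, so the sketch fails closed.

```python
# Highest tier each role may see, following the role mapping above (role strings assumed).
ROLE_MAX_TIER = {"member": 1, "analyst": 2, "editor": 2, "admin": 3, "owner": 3}

# Illustrative tier per field; fields not listed are treated as Tier 3.
FIELD_TIER = {
    "agent_id": 1, "summary": 1, "created_at": 1, "severity": 1,
    "run_id": 2, "workflow_type": 2, "outcome": 2, "approval_state": 2,
    "trace_id": 3, "retry_count": 3,
}


def visible_fields(row: dict, role: str) -> dict:
    """Return only the fields of `row` that `role` is allowed to see."""
    max_tier = ROLE_MAX_TIER.get(role, 0)  # unknown roles get no fields
    return {
        name: value
        for name, value in row.items()
        if FIELD_TIER.get(name, 3) <= max_tier
    }
```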

### Acceptance Criteria: View Full Team Activity
- The "View Full Team Activity" control navigates from widget to a dedicated timeline route and preserves filters.
- The timeline supports filtering by agent, event type, severity, status, and approval state.
- Minimum visible fields per row:
  - `event_id`, `created_at`, `agent_id`, `event_type`, `severity`, `summary`
  - `run_id`, `workflow_type`, `outcome`
  - `alert_id` (when present), `approval_id` + `approval_state` (when present)
- Row expansion reveals Tier 2 details; Tier 3 panel is visible only for admin/owner roles.
- Inline interactions:
  1. Acknowledge/unacknowledge alerts.
  2. Approve/reject pending approval requests.
  3. Jump from event row to related task/insight detail.
- Navigation continuity: returning to dashboard restores previous Team Huddle scroll position and active filters.
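
The timeline filtering criterion can be expressed as a conjunction of optional equality filters. This is a minimal sketch with a hypothetical `filter_timeline` helper; it assumes rows are flat dicts and that each filter name matches a row field.

```python
def filter_timeline(rows: list, **criteria) -> list:
    """Keep rows matching every supplied filter; filters set to None are ignored."""
    active = {name: value for name, value in criteria.items() if value is not None}
    return [
        row for row in rows
        if all(row.get(name) == value for name, value in active.items())
    ]


rows = [
    {"event_id": 1, "agent_id": "seo", "event_type": "insight", "severity": "info"},
    {"event_id": 2, "agent_id": "seo", "event_type": "task", "severity": "warning"},
    {"event_id": 3, "agent_id": "content", "event_type": "task", "severity": "info"},
]
seo_tasks = filter_timeline(rows, agent_id="seo", event_type="task", severity=None)
```

Passing the dashboard's current filter values straight through, with `None` for unset ones, is one way to preserve filters when navigating from the widget to the timeline.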

---

## 🚀 Future Roadmap

* **Inter-Agent Chat**: Allow agents to debate strategy (e.g., SEO Agent vs. Creative Agent).