CLI Agent Spec — Full Index

All 67 failure modes across 7 parts. Each failure mode is linked to its source file.


Table of Contents


1. Ecosystem Runtime Agent Specific

Agent-specific patterns discovered from real frameworks, libraries, and multi-agent deployments.

34 failure modes  |  🔴 11 critical · 🟠 19 high · 🟡 4 medium

| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §34 | Shell Injection via Agent-Constructed Commands | 🔴 Critical | Common | Hard | High | High | Medium |
| §37 | REPL / Interactive Mode Accidental Triggering | 🔴 Critical | Situational | Hard | High | Critical | Low |
| §42 | Debug / Trace Mode Secret Leakage | 🔴 Critical | Situational | Hard | Low | Low | High |
| §43 | Tool Output Result Size Unboundedness | 🔴 Critical | Common | Hard | Critical | High | Critical |
| §45 | Headless Authentication / OAuth Browser Flow Blocking | 🔴 Critical | Common | Hard | High | Critical | Low |
| §50 | Stdin Consumption Deadlock | 🔴 Critical | Common | Hard | High | Critical | Low |
| §53 | Credential Expiry Mid-Session | 🔴 Critical | Common | Hard | High | High | Low |
| §60 | OS Output Buffer Deadlock | 🔴 Critical | Common | Hard | High | Critical | Low |
| §61 | Bidirectional Pipe Payload Deadlock | 🔴 Critical | Situational | Hard | High | Critical | Low |
| §62 | $EDITOR and $VISUAL Trap | 🔴 Critical | Common | Hard | High | Critical | Low |
| §64 | Headless Display and GUI Launch Blocking | 🔴 Critical | Common | Hard | High | Critical | Low |
| §35 | Agent Hallucination Input Patterns | 🟠 High | Common | Hard | Medium | Medium | Low |
| §38 | Runtime Dependency Version Mismatch | 🟠 High | Common | Medium | High | High | Low |
| §40 | parse() vs parseAsync() Silent Race Condition | 🟠 High | Common (Node.js ecosystem) | Hard | High | High | Low |
| §41 | Update Notifier Side-Channel Output Pollution | 🟠 High | Common (Node.js/npm ecosystem) | Medium | Medium | Medium | Medium |
| §46 | API Schema to CLI Flag Translation Loss | 🟠 High | Common | Medium | High | Medium | Medium |
| §47 | MCP Wrapper Schema Staleness | 🟠 High | Common | Hard | High | High | Low |
| §49 | Async Job / Polling Protocol Absence | 🟠 High | Common | Hard | High | High | Medium |
| §51 | Shell Word Splitting and Glob Expansion Interference | 🟠 High | Common | Medium | Medium | Medium | Low |
| §54 | Conditional / Dependent Argument Requirements | 🟠 High | Common | Hard | High | Medium | Low |
| §55 | Silent Data Truncation | 🟠 High | Common | Hard | Medium | Medium | Low |
| §56 | Exit Code Masking in Shell Pipelines | 🟠 High | Common | Hard | Medium | Low | Low |
| §58 | Multi-Agent Concurrent Invocation Conflict | 🟠 High | Situational | Hard | Medium | High | Low |
| §59 | High-Entropy String Token Poisoning | 🟠 High | Common | Medium | High | Low | High |
| §65 | Global Configuration State Contamination | 🟠 High | Common | Hard | Medium | High | Low |
| §66 | Symlink Loop and Recursive Traversal Exhaustion | 🟠 High | Situational | Hard | Medium | Critical | Low |
| §67 | Agent-Generated Input Syntax Rejection | 🟠 High | Common | Easy | High | Medium | Low |
| §68 | Third-Party Library Stdout Pollution | 🟠 High | Common | Medium | Medium | Low | High |
| §69 | Argument Order Ambiguity | 🟠 High | Common | Medium | Medium | Medium | Low |
| §70 | Single-Argument Arity Forcing Agent Loop Overhead | 🟠 High | Common | Easy | Medium | Medium | Low |
| §44 | Agent Knowledge Packaging Absence | 🟡 Medium | Very Common | Easy | High | High | Medium |
| §52 | Recursive Command Tree Discovery Cost | 🟡 Medium | Very Common | Easy | High | Medium | High |
| §57 | Locale-Dependent Error Messages | 🟡 Medium | Situational | Easy | High | Low | Medium |
| §63 | Terminal Column Width Output Corruption | 🟡 Medium | Common | Easy | Medium | Low | Medium |

2. Execution And Reliability

Execution flow, blocking behavior, atomicity, and reliability under agent orchestration.

8 failure modes  |  🔴 4 critical · 🟠 3 high · 🟡 1 medium

| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §10 | Interactivity & TTY Requirements | 🔴 Critical | Common | Hard | High | Critical | Low |
| §11 | Timeouts & Hanging Processes | 🔴 Critical | Common | Hard | High | Critical | Low |
| §12 | Idempotency & Safe Retries | 🔴 Critical | Common | Hard | High | High | Medium |
| §13 | Partial Failure & Atomicity | 🔴 Critical | Common | Hard | High | High | Medium |
| §14 | Argument Validation Before Side Effects | 🟠 High | Common | Medium | Medium | Medium | Low |
| §15 | Race Conditions & Concurrency | 🟠 High | Situational | Hard | Medium | Medium | Low |
| §16 | Signal Handling & Graceful Cancellation | 🟠 High | Situational | Hard | Medium | Medium | Low |
| §17 | Child Process Leakage | 🟡 Medium | Situational | Hard | Low | Low | Low |
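
As a rough illustration of the blocking and hanging behaviors indexed above (§10, §11), the sketch below shows one way a caller might invoke a CLI defensively. It is not taken from the spec's source files: the helper name `run_noninteractive`, its default timeout, and the shape of the returned dictionary are all assumptions made for this example.

```python
import subprocess
import sys


def run_noninteractive(argv, timeout_s=30.0):
    """Run a command without a shell, with stdin closed and a hard timeout.

    Hypothetical helper for illustration; not part of the spec.
    """
    try:
        proc = subprocess.run(
            argv,                      # list form: no shell parsing of arguments
            stdin=subprocess.DEVNULL,  # a prompt reads EOF instead of blocking
            capture_output=True,       # stdout and stderr are drained together
            text=True,
            timeout=timeout_s,         # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "exit_code": None, "stdout": "", "stderr": ""}
    return {
        "timed_out": False,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Example: a child that would otherwise wait on stdin sees EOF immediately.
    result = run_noninteractive([sys.executable, "-c", "input()"], timeout_s=5.0)
    print(result["timed_out"], result["exit_code"])
```

Passing the command as a list and closing stdin addresses only the caller's side; a tool can still hang for other reasons, which is why the timeout is kept as a backstop.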

3. Security

Destructive operations, authentication, secret handling, and prompt injection.

3 failure modes  |  🔴 3 critical

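| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §23 | Side Effects & Destructive Operations | 🔴 Critical | Common | Medium | Medium | High | Medium |
| §24 | Authentication & Secret Handling | 🔴 Critical | Common | Hard | Medium | Medium | Low |
| §25 | Prompt Injection via Output | 🔴 Critical | Situational | Hard | High | High | High |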

4. Output And Parsing

How CLI tools format, stream, and structure their output for agent consumption.

9 failure modes  |  🔴 2 critical · 🟠 4 high · 🟡 3 medium

| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §1 | Exit Codes & Status Signaling | 🔴 Critical | Very Common | Hard | High | High | Low |
| §2 | Output Format & Parseability | 🔴 Critical | Very Common | Easy | High | Medium | High |
| §3 | Stderr vs Stdout Discipline | 🟠 High | Very Common | Hard | Medium | Low | High |
| §5 | Pagination & Large Output | 🟠 High | Common | Hard | High | High | Critical |
| §8 | ANSI & Color Code Leakage | 🟠 High | Common | Hard | Medium | Low | Medium |
| §9 | Binary & Encoding Safety | 🟠 High | Situational | Hard | Low | Medium | Low |
| §4 | Verbosity & Token Cost | 🟡 Medium | Very Common | Easy | High | Low | High |
| §6 | Command Composition & Piping | 🟡 Medium | Common | Easy | Medium | Low | Low |
| §7 | Output Non-Determinism | 🟡 Medium | Common | Hard | Medium | Medium | Low |
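
To make one entry in this part concrete, here is a minimal sketch of how a consumer might defend against ANSI and color code leakage (§8) before parsing output. The function name `strip_ansi` is hypothetical, and the pattern covers only CSI sequences (the `ESC [` family used for colors and cursor movement); other escape types, such as OSC hyperlinks, would need additional handling.

```python
import re

# Matches CSI escape sequences: ESC '[' followed by parameter bytes (0x30-0x3F),
# intermediate bytes (0x20-0x2F), and one final byte (0x40-0x7E).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove CSI escape sequences, leaving the visible text untouched."""
    return ANSI_CSI.sub("", text)


if __name__ == "__main__":
    colored = "\x1b[1;32mok\x1b[0m: 3 files changed"
    print(strip_ansi(colored))
```

Stripping on the consumer side is a workaround; a tool that detects a non-TTY stdout and suppresses color itself avoids the problem at the source.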

5. Environment And State

Session state, configuration, working directory, filesystem, network, and runtime environment.

7 failure modes  |  🟠 4 high · 🟡 3 medium

| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §26 | Stateful Commands & Session Management | 🟠 High | Common | Hard | Medium | Medium | Low |
| §28 | Config File Shadowing & Precedence | 🟠 High | Common | Hard | High | High | Medium |
| §31 | Network Proxy Unawareness | 🟠 High | Situational | Hard | Medium | High | Low |
| §32 | Self-Update & Auto-Upgrade Behavior | 🟠 High | Situational | Hard | Medium | High | Low |
| §27 | Platform & Shell Portability | 🟡 Medium | Common | Easy | Medium | Medium | Low |
| §29 | Working Directory Sensitivity | 🟡 Medium | Common | Medium | Medium | Low | Low |
| §30 | Undeclared Filesystem Side Effects | 🟡 Medium | Common | Hard | Low | Low | Low |

6. Errors And Discoverability

Error quality, retry guidance, schema discovery, and versioning.

5 failure modes  |  🟠 3 high · 🟡 2 medium

| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §18 | Error Message Quality | 🟠 High | Very Common | Easy | High | Medium | High |
| §19 | Retry Hints in Error Responses | 🟠 High | Very Common | Medium | High | High | Medium |
| §22 | Schema Versioning & Output Stability | 🟠 High | Common | Hard | High | High | Medium |
| §20 | Environment & Dependency Discovery | 🟡 Medium | Common | Easy | Medium | Medium | Low |
| §21 | Schema & Help Discoverability | 🟡 Medium | Very Common | Easy | High | Medium | Medium |

7. Observability

Audit trails, request tracing, and operational visibility.

1 failure mode  |  🟡 1 medium

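| # | Title | Severity | Frequency | Detectability | Token Spend | Time | Context |
|---|-------|----------|-----------|---------------|-------------|------|---------|
| §33 | Observability & Audit Trail | 🟡 Medium | Very Common | Easy | Medium | High | Medium |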

67 active failure modes across 7 parts. CLI Agent Spec v1.6 — 2026-04-01.