diff --git a/.cursor/commands/specfact.01-import.md b/.cursor/commands/specfact.01-import.md
index 1c3acd2..e06af34 100644
--- a/.cursor/commands/specfact.01-import.md
+++ b/.cursor/commands/specfact.01-import.md
@@ -21,17 +21,20 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr
 
 ## Workflow
 
-1. **Execute CLI**: `specfact import from-code [<bundle-name>] --repo <path> [options]`
+1. **Execute CLI**: `specfact [GLOBAL OPTIONS] import from-code [<bundle-name>] --repo <path> [options]`
    - CLI extracts: routes (FastAPI/Flask/Django), schemas (Pydantic), relationships, contracts (OpenAPI scaffolds), source tracking
    - Uses active plan if bundle not specified
+   - Note: `--no-interactive` is a global option and must appear before the subcommand (e.g., `specfact --no-interactive import from-code ...`).
    - **Auto-enrichment enabled by default**: Automatically enhances vague acceptance criteria, incomplete requirements, and generic tasks using PlanEnricher (same logic as `plan review --auto-enrich`)
    - Use `--no-enrich-for-speckit` to disable auto-enrichment
+   - **Contract extraction**: OpenAPI contracts are extracted automatically **only** for features with `source_tracking.implementation_files` and detectable API endpoints (FastAPI/Flask patterns). For enrichment-added features or Django apps, use `specfact contract init` after enrichment (see Phase 4)
 
-2. **LLM Enrichment** (if `--enrichment` provided):
-   - Read `.specfact/projects/<bundle>/enrichment_context.md`
-   - Enrich: business context, "why" reasoning, missing acceptance criteria
-   - Validate: contracts vs code, feature/story alignment
-   - Save enrichment report to `.specfact/projects/<bundle>/reports/enrichment/` (bundle-specific, Phase 8.5, if created)
+2. **LLM Enrichment** (Copilot-only, before applying `--enrichment`):
+   - Read CLI artifacts: `.specfact/projects/<bundle>/enrichment_context.md`, feature YAMLs, contract scaffolds, and brownfield reports
+   - Scan the codebase within `--entry-point` (and adjacent modules) to identify missing features, dependencies, and behavior; do **not** rely solely on AST-derived YAML
+   - Compare code findings vs CLI artifacts, then add missing features/stories, reasoning, and acceptance criteria (each added feature must include at least one story)
+   - Save the enrichment report to `.specfact/projects/<bundle>/reports/enrichment/<bundle>-<timestamp>.enrichment.md` (bundle-specific, Phase 8.5)
+   - **CRITICAL**: Follow the exact enrichment report format (see "Enrichment Report Format" section below) to ensure successful parsing
 
 3. **Present**: Bundle location, report path, summary (features/stories/contracts/relationships)
 
@@ -42,7 +45,7 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr
 **Rules:**
 
 - Execute CLI first - never create artifacts directly
-- Use `--no-interactive` flag in CI/CD environments
+- Use the global `--no-interactive` flag in CI/CD environments (must appear before the subcommand)
 - Never modify `.specfact/` directly
 - Use CLI output as grounding for validation
 - Code generation requires LLM (only via AI IDE slash prompts, not CLI-only)
@@ -55,7 +58,7 @@ When in copilot mode, follow this three-phase workflow:
 
 ```bash
 # Execute CLI to get structured output
-specfact import from-code [<bundle-name>] --repo <path> --no-interactive
+specfact --no-interactive import from-code [<bundle-name>] --repo <path>
 ```
 
 **Capture**:
@@ -71,10 +74,10 @@ specfact import from-code [<bundle-name>] --repo <path> --no-interactive
 **What to do**:
 
 - Read CLI-generated artifacts (use file reading tools for display only)
-- Research codebase for additional context
-- Identify missing features/stories
-- Suggest confidence adjustments
-- Extract business context
+- Scan the codebase within `--entry-point` for missing features/behavior and compare
+  against CLI artifacts
+- Identify missing features/stories and add reasoning/acceptance criteria (no direct edits to `.specfact/`)
+- Suggest confidence adjustments and extract business context
+- **CRITICAL**: Generate enrichment report in the exact format specified below (see "Enrichment Report Format" section)
 
 **What NOT to do**:
 
@@ -83,20 +86,159 @@ specfact import from-code [<bundle-name>] --repo <path> --no-interactive
 - ❌ Bypass CLI validation
 - ❌ Write to `.specfact/` folder directly (always use CLI)
 - ❌ Use direct file manipulation tools for writing (use CLI commands)
+- ❌ Deviate from the enrichment report format (will cause parsing failures)
 
 **Output**: Generate enrichment report (Markdown) saved to `.specfact/projects/<bundle>/reports/enrichment/` (bundle-specific, Phase 8.5)
 
+**Enrichment Report Format** (REQUIRED for successful parsing):
+
+The enrichment parser expects a specific Markdown format. Follow this structure exactly:
+
+```markdown
+# [Bundle Name] Enrichment Report
+
+**Date**: YYYY-MM-DDTHH:MM:SS
+**Bundle**: <bundle-name>
+
+---
+
+## Missing Features
+
+1. **Feature Title** (Key: FEATURE-XXX)
+   - Confidence: 0.85
+   - Outcomes: outcome1, outcome2, outcome3
+   - Stories:
+     1. Story title here
+        - Acceptance: criterion1, criterion2, criterion3
+     2. Another story title
+        - Acceptance: criterion1, criterion2
+
+2. **Another Feature** (Key: FEATURE-YYY)
+   - Confidence: 0.80
+   - Outcomes: outcome1, outcome2
+   - Stories:
+     1. Story title
+        - Acceptance: criterion1, criterion2, criterion3
+
+## Confidence Adjustments
+
+- FEATURE-EXISTING-KEY: 0.90 (reason: improved understanding after code review)
+
+## Business Context
+
+- Priority: High priority feature for core functionality
+- Constraint: Must support both REST and GraphQL APIs
+- Risk: Potential performance issues with large datasets
+```
+
+**Format Requirements**:
+
+1. **Section Header**: Must use `## Missing Features` (case-insensitive, but prefer this exact format)
+2. **Feature Format**:
+   - Numbered list: `1. **Feature Title** (Key: FEATURE-XXX)`
+   - **Bold title** is required (use `**Title**`)
+   - **Key in parentheses**: `(Key: FEATURE-XXX)` - must be uppercase, alphanumeric with hyphens/underscores
+   - Fields on separate lines with `-` prefix:
+     - `- Confidence: 0.85` (float between 0.0-1.0)
+     - `- Outcomes: comma-separated or line-separated list`
+     - `- Stories:` (required - each feature must have at least one story)
+3. **Stories Format**:
+   - Numbered list under `Stories:` section: `1. Story title`
+   - **Indentation**: Stories must be indented (2-4 spaces) under the feature
+   - **Acceptance Criteria**: `- Acceptance: criterion1, criterion2, criterion3`
+     - Can be comma-separated on one line
+     - Or multi-line (each criterion on new line)
+     - Must start with `- Acceptance:`
+4. **Optional Sections**:
+   - `## Confidence Adjustments`: List existing features with confidence updates
+   - `## Business Context`: Priorities, constraints, risks (bullet points)
+5. **File Naming**: `<bundle>-<timestamp>.enrichment.md` (e.g., `djangogoat-2025-12-23T23-50-00.enrichment.md`)
+
+**Example** (working format):
+
+```markdown
+## Missing Features
+
+1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION)
+   - Confidence: 0.85
+   - Outcomes: User registration, login, profile management
+   - Stories:
+     1. User can sign up for new account
+        - Acceptance: sign_up view processes POST requests, creates User automatically, user is logged in after signup, redirects to profile page
+     2. User can log in with credentials
+        - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected, on failure error message is displayed
+```
+
+**Common Mistakes to Avoid**:
+
+- ❌ Missing `(Key: FEATURE-XXX)` - parser needs this to identify features
+- ❌ Missing `Stories:` section - every feature must have at least one story
+- ❌ Stories not indented - parser expects indented numbered lists
+- ❌ Missing `- Acceptance:` prefix - acceptance criteria won't be parsed
+- ❌ Using bullet points (`-`) instead of numbers (`1.`) for stories
+- ❌ Feature title not in bold (`**Title**`) - parser may not extract title correctly
+
 ### Phase 3: CLI Artifact Creation (REQUIRED)
 
 ```bash
 # Use enrichment to update plan via CLI
-specfact import from-code [<bundle-name>] --repo <path> --enrichment <report-path> --no-interactive
+specfact --no-interactive import from-code [<bundle-name>] --repo <path> --enrichment <report-path>
 ```
 
 **Result**: Final artifacts are CLI-generated with validated enrichments
 
 **Note**: If code generation is needed, use the validation loop pattern (see [CLI Enforcement Rules](./shared/cli-enforcement.md#standard-validation-loop-pattern-for-llm-generated-code))
 
+### Phase 4: OpenAPI Contract Generation (REQUIRED for Sidecar Validation)
+
+**When contracts are generated automatically:**
+
+The `import from-code` command attempts to extract OpenAPI contracts automatically, but **only if**:
+
+1. Features have `source_tracking.implementation_files` (AST-detected features)
+2. The OpenAPI extractor finds API endpoints (FastAPI/Flask patterns like `@app.get`, `@router.post`, `@app.route`)
+
+**When contracts are NOT generated:**
+
+Contracts are **NOT** generated automatically when:
+
+- Features were added via enrichment (no `source_tracking.implementation_files`)
+- Django applications (Django `path()` patterns are not detected by the extractor)
+- Features without API endpoints (models, utilities, middleware, etc.)
+- Framework SDKs or libraries without web endpoints
+
+**How to generate contracts manually:**
+
+For features that need OpenAPI contracts (e.g., for sidecar validation with CrossHair), use:
+
+```bash
+# Generate contract for a single feature
+specfact --no-interactive contract init --bundle <bundle> --feature <feature-key> --repo <path>
+
+# Example: Generate contracts for all enrichment-added features
+specfact --no-interactive contract init --bundle djangogoat-validation --feature FEATURE-USER-AUTHENTICATION --repo .
+specfact --no-interactive contract init --bundle djangogoat-validation --feature FEATURE-NOTES-MANAGEMENT --repo .
+# ... repeat for each feature that needs a contract
+```
+
+**When to apply contract generation:**
+
+- **After Phase 3** (enrichment applied): Check which features have contracts in `.specfact/projects/<bundle>/contracts/`
+- **Before sidecar validation**: All features that will be analyzed by CrossHair/Specmatic need OpenAPI contracts
+- **For Django apps**: Always generate contracts manually after enrichment, as Django URL patterns are not auto-detected
+
+**Verification:**
+
+```bash
+# Check which features have contracts
+ls .specfact/projects/<bundle>/contracts/*.yaml
+
+# Compare with total features
+ls .specfact/projects/<bundle>/features/*.yaml
+```
+
+If the contract count is less than the feature count, generate missing contracts using `contract init`.
+
 ## Expected Output
 
 **Success**: Bundle location, report path, summary (features/stories/contracts/relationships)
diff --git a/.cursor/commands/specfact.02-plan.md b/.cursor/commands/specfact.02-plan.md
index 2f1aced..8bc2b92 100644
--- a/.cursor/commands/specfact.02-plan.md
+++ b/.cursor/commands/specfact.02-plan.md
@@ -103,7 +103,9 @@ specfact plan [--bundle <bundle>] [options] --no-interactive
 **What to do**:
 
 - Read CLI-generated artifacts (use file reading tools for display only)
-- Research codebase for additional context
+- Use CLI artifacts as the source of truth for keys/structure/metadata
+- Scan codebase only if asked to align the plan with implementation or to add missing features
+- When scanning, compare findings against CLI artifacts and propose updates via CLI commands
 - Identify missing features/stories
 - Suggest confidence adjustments
 - Extract business context
diff --git a/.cursor/commands/specfact.03-review.md b/.cursor/commands/specfact.03-review.md
index f57d665..c40efef 100644
--- a/.cursor/commands/specfact.03-review.md
+++ b/.cursor/commands/specfact.03-review.md
@@ -131,6 +131,9 @@ For these cases, use the **export-to-file → LLM reasoning → import-from-file
 
 **CRITICAL**: Always use `/tmp/` for temporary artifacts to avoid polluting the codebase. Never create temporary files in the project root.
 
+**CRITICAL**: Question IDs are generated per run and can change if you re-run review.
+**Do not** re-run `plan review` between exporting questions and applying answers. Always answer using the exact exported questions file for that session.
+
 **Note**: The `--max-questions` parameter (default: 5) limits the number of questions per session, not the total number of available questions. If there are more questions available, you may need to run the review multiple times to answer all questions. Each session will ask different questions (avoiding duplicates from previous sessions).
 **Export questions to file for LLM reasoning:**
@@ -393,6 +396,11 @@ specfact plan review [<bundle>] --list-questions --output-questions /tmp/qu
 **What to do**:
 
+0. **Grounding rule**:
+   - Treat CLI-exported questions as the source of truth; consult codebase/docs only to answer them (do not invent new artifacts)
+   - **Feature/Story Completeness note**: Answers here are clarifications only. They do **NOT** create stories.
+     For missing stories, use `specfact plan add-story` (or `plan update-story --batch-updates` if stories already exist).
+
 1. **Read exported questions file** (`/tmp/questions.json`):
    - Review all questions and their categories
    - Identify questions requiring code/feature analysis
@@ -601,6 +609,102 @@ Create one with: specfact plan init legacy-api
 - Use `plan update-idea` to update idea fields directly
 - If bundle needs regeneration, use `import from-code --enrichment`
 
+**Note on OpenAPI Contracts:**
+
+After applying enrichment or review updates, check if features need OpenAPI contracts for sidecar validation:
+
+- Features added via enrichment typically don't have contracts (no `source_tracking`)
+- Django applications require manual contract generation (Django URL patterns not auto-detected)
+- Use `specfact contract init --bundle <bundle> --feature <feature-key>` to generate contracts for features that need them
+
+**Enrichment Report Format** (for `import from-code --enrichment`):
+
+When generating enrichment reports for use with `import from-code --enrichment`, follow this exact format:
+
+```markdown
+# [Bundle Name] Enrichment Report
+
+**Date**: YYYY-MM-DDTHH:MM:SS
+**Bundle**: <bundle-name>
+
+---
+
+## Missing Features
+
+1. **Feature Title** (Key: FEATURE-XXX)
+   - Confidence: 0.85
+   - Outcomes: outcome1, outcome2, outcome3
+   - Stories:
+     1. Story title here
+        - Acceptance: criterion1, criterion2, criterion3
+     2. Another story title
+        - Acceptance: criterion1, criterion2
+
+2. **Another Feature** (Key: FEATURE-YYY)
+   - Confidence: 0.80
+   - Outcomes: outcome1, outcome2
+   - Stories:
+     1. Story title
+        - Acceptance: criterion1, criterion2, criterion3
+
+## Confidence Adjustments
+
+- FEATURE-EXISTING-KEY: 0.90 (reason: improved understanding after code review)
+
+## Business Context
+
+- Priority: High priority feature for core functionality
+- Constraint: Must support both REST and GraphQL APIs
+- Risk: Potential performance issues with large datasets
+```
+
+**Format Requirements**:
+
+1. **Section Header**: Must use `## Missing Features` (case-insensitive, but prefer this exact format)
+2. **Feature Format**:
+   - Numbered list: `1. **Feature Title** (Key: FEATURE-XXX)`
+   - **Bold title** is required (use `**Title**`)
+   - **Key in parentheses**: `(Key: FEATURE-XXX)` - must be uppercase, alphanumeric with hyphens/underscores
+   - Fields on separate lines with `-` prefix:
+     - `- Confidence: 0.85` (float between 0.0-1.0)
+     - `- Outcomes: comma-separated or line-separated list`
+     - `- Stories:` (required - each feature must have at least one story)
+3. **Stories Format**:
+   - Numbered list under `Stories:` section: `1. Story title`
+   - **Indentation**: Stories must be indented (2-4 spaces) under the feature
+   - **Acceptance Criteria**: `- Acceptance: criterion1, criterion2, criterion3`
+     - Can be comma-separated on one line
+     - Or multi-line (each criterion on new line)
+     - Must start with `- Acceptance:`
+4. **Optional Sections**:
+   - `## Confidence Adjustments`: List existing features with confidence updates
+   - `## Business Context`: Priorities, constraints, risks (bullet points)
+5. **File Naming**: `<bundle>-<timestamp>.enrichment.md` (e.g., `djangogoat-2025-12-23T23-50-00.enrichment.md`)
+
+**Example** (working format):
+
+```markdown
+## Missing Features
+
+1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION)
+   - Confidence: 0.85
+   - Outcomes: User registration, login, profile management
+   - Stories:
+     1. User can sign up for new account
+        - Acceptance: sign_up view processes POST requests, creates User automatically, user is logged in after signup, redirects to profile page
+     2. User can log in with credentials
+        - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected, on failure error message is displayed
+```
+
+**Common Mistakes to Avoid**:
+
+- ❌ Missing `(Key: FEATURE-XXX)` - parser needs this to identify features
+- ❌ Missing `Stories:` section - every feature must have at least one story
+- ❌ Stories not indented - parser expects indented numbered lists
+- ❌ Missing `- Acceptance:` prefix - acceptance criteria won't be parsed
+- ❌ Using bullet points (`-`) instead of numbers (`1.`) for stories
+- ❌ Feature title not in bold (`**Title**`) - parser may not extract title correctly
+
 ## Context
 
 {ARGS}
diff --git a/.cursor/commands/specfact.04-sdd.md b/.cursor/commands/specfact.04-sdd.md
index 5e96f3a..5d0cf16 100644
--- a/.cursor/commands/specfact.04-sdd.md
+++ b/.cursor/commands/specfact.04-sdd.md
@@ -86,6 +86,7 @@ specfact plan harden [<bundle>] [--sdd <path>] --no-interactive
 **What to do**:
 
 - Read CLI-generated SDD (use file reading tools for display only)
+- Treat CLI SDD as the source of truth; scan codebase only to enrich WHY/WHAT/HOW context
 - Research codebase for additional context
 - Suggest improvements to WHY/WHAT/HOW sections
diff --git a/.cursor/commands/specfact.05-enforce.md b/.cursor/commands/specfact.05-enforce.md
index e804719..7ae5001 100644
--- a/.cursor/commands/specfact.05-enforce.md
+++ b/.cursor/commands/specfact.05-enforce.md
@@ -90,6 +90,7 @@ specfact enforce sdd [<bundle>] [--sdd <path>] --no-interactive
 **What to do**:
 
 - Read CLI-generated validation report (use file reading tools for display only)
+- Treat the CLI report as the source of truth; scan codebase only to explain deviations or propose fixes
 - Research codebase for context on deviations
 - Suggest fixes for validation failures
diff --git a/.cursor/commands/specfact.06-sync.md b/.cursor/commands/specfact.06-sync.md
index ad20105..9e2c6cf 100644
--- a/.cursor/commands/specfact.06-sync.md
+++ b/.cursor/commands/specfact.06-sync.md
@@ -93,6 +93,7 @@ specfact sync bridge --adapter <adapter> --repo <path> [options] --no-interactive
 **What to do**:
 
 - Read CLI-generated sync results (use file reading tools for display only)
+- Treat CLI sync output as the source of truth; scan codebase only to explain conflicts
 - Research codebase for context on conflicts
 - Suggest resolution strategies
diff --git a/.cursor/commands/specfact.compare.md b/.cursor/commands/specfact.compare.md
index 2bb2441..860f336 100644
--- a/.cursor/commands/specfact.compare.md
+++ b/.cursor/commands/specfact.compare.md
@@ -90,6 +90,7 @@ specfact plan compare [--bundle <bundle>] [options] --no-interactive
 **What to do**:
 
 - Read CLI-generated comparison report (use file reading tools for display only)
+- Treat the comparison report as the source of truth; scan codebase only to explain or confirm deviations
 - Research codebase for context on deviations
 - Suggest fixes for missing features or mismatches
diff --git a/.cursor/commands/specfact.validate.md b/.cursor/commands/specfact.validate.md
index 6f74f39..89464df 100644
--- a/.cursor/commands/specfact.validate.md
+++ b/.cursor/commands/specfact.validate.md
@@ -92,6 +92,7 @@ specfact repro --repo <path> [options] --no-interactive
 **What to do**:
 
 - Read CLI-generated validation report (use file reading tools for display only)
+- Treat the validation report as the source of truth; scan codebase only to explain failures
 - Research codebase for context on failures
 - Suggest fixes for validation failures
diff --git a/.github/prompts/specfact.01-import.prompt.md b/.github/prompts/specfact.01-import.prompt.md
index 1c3acd2..fbdab31 100644
--- a/.github/prompts/specfact.01-import.prompt.md
+++ b/.github/prompts/specfact.01-import.prompt.md
@@ -24,6 +24,7 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr
 1. **Execute CLI**: `specfact import from-code [<bundle-name>] --repo <path> [options]`
    - CLI extracts: routes (FastAPI/Flask/Django), schemas (Pydantic), relationships, contracts (OpenAPI scaffolds), source tracking
    - Uses active plan if bundle not specified
+   - Note: `--no-interactive` is a global option and must appear before the subcommand (e.g., `specfact --no-interactive import from-code ...`).
    - **Auto-enrichment enabled by default**: Automatically enhances vague acceptance criteria, incomplete requirements, and generic tasks using PlanEnricher (same logic as `plan review --auto-enrich`)
    - Use `--no-enrich-for-speckit` to disable auto-enrichment
@@ -42,7 +43,7 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr
 **Rules:**
 
 - Execute CLI first - never create artifacts directly
-- Use `--no-interactive` flag in CI/CD environments
+- Use the global `--no-interactive` flag in CI/CD environments (must appear before the subcommand)
 - Never modify `.specfact/` directly
 - Use CLI output as grounding for validation
 - Code generation requires LLM (only via AI IDE slash prompts, not CLI-only)
@@ -55,7 +56,7 @@ When in copilot mode, follow this three-phase workflow:
 
 ```bash
 # Execute CLI to get structured output
-specfact import from-code [<bundle-name>] --repo <path> --no-interactive
+specfact --no-interactive import from-code [<bundle-name>] --repo <path>
 ```
 
 **Capture**:
@@ -90,7 +91,7 @@ specfact import from-code [<bundle-name>] --repo <path> --no-interactive
 
 ```bash
 # Use enrichment to update plan via CLI
-specfact import from-code [<bundle-name>] --repo <path> --enrichment <report-path> --no-interactive
+specfact --no-interactive import from-code [<bundle-name>] --repo <path> --enrichment <report-path>
 ```
 
 **Result**: Final artifacts are CLI-generated with validated enrichments
diff --git a/.github/prompts/specfact.03-review.prompt.md b/.github/prompts/specfact.03-review.prompt.md
index 643ab70..f57d665 100644
--- a/.github/prompts/specfact.03-review.prompt.md
+++ b/.github/prompts/specfact.03-review.prompt.md
@@ -16,6 +16,75 @@
 Review project bundle to identify/resolve ambiguities and missing information. A
 
 **Quick:** `/specfact.03-review` (uses active plan) or `/specfact.03-review legacy-api`
 
+## Interactive Question Presentation
+
+**CRITICAL**: When presenting questions interactively, **ALWAYS** generate and display multiple answer options in a table format. This makes it easier for users to select appropriate answers.
+
+### Answer Options Format
+
+For each question, generate 3-5 reasonable answer options based on:
+
+- **Code analysis**: Review existing patterns, similar features, error handling approaches
+- **Domain knowledge**: Best practices, common scenarios, industry standards
+- **Business context**: Product requirements, user needs, feature relationships
+
+**Present options in a numbered table with recommended answer:**
+
+```text
+Question 1/5
+Category: Interaction & UX Flow
+Q: What error/empty states should be handled for story STORY-XXX?
+
+Current Plan Settings:
+Story STORY-XXX Acceptance: [current acceptance criteria]
+
+Answer Options:
+┌─────┬─────────────────────────────────────────────────────────────────┐
+│ No. │ Option                                                          │
+├─────┼─────────────────────────────────────────────────────────────────┤
+│ 1   │ Error handling: Invalid input produces clear error messages     │
+│     │ Empty states: Missing data shows "No data available" message    │
+│     │ Validation: Required fields validated before processing         │
+│     │ ⭐ Recommended (based on code analysis)                         │
+├─────┼─────────────────────────────────────────────────────────────────┤
+│ 2   │ Error handling: Network failures retry with exponential backoff │
+│     │ Empty states: Show empty state UI with helpful guidance         │
+│     │ Validation: Schema-based validation with clear error messages   │
+├─────┼─────────────────────────────────────────────────────────────────┤
+│ 3   │ Error handling: Errors logged to stderr with exit codes (CLI)   │
+│     │ Empty states: Sensible defaults when data is missing            │
+│     │ Validation: Covered in OpenAPI contract files                   │
+├─────┼─────────────────────────────────────────────────────────────────┤
+│ 4   │ Not applicable - error handling covered in contract files       │
+├─────┼─────────────────────────────────────────────────────────────────┤
+│ 5   │ [Custom answer - type your own]                                 │
+└─────┴─────────────────────────────────────────────────────────────────┘
+
+Your answer (1-5, or type custom answer): [1] ⭐ Recommended
+```
+
+**CRITICAL**: Always provide a **recommended answer** (marked with ⭐) based on:
+
+- Code analysis (what the actual implementation does)
+- Best practices (industry standards, common patterns)
+- Domain knowledge (what makes sense for this feature)
+
+The recommendation helps less-experienced users make informed decisions.
+
+### Guidelines for Answer Options
+
+- **Option 1-3**: Specific, actionable options based on code analysis and domain knowledge
+- **Option 4**: "Not applicable" or "Covered elsewhere" when appropriate
+- **Option 5**: Always include "[Custom answer - type your own]" as the last option
+- **Base options on research**: Review codebase, similar features, existing patterns
+- **Make options specific**: Avoid generic responses - be concrete and actionable
+- **Use numbered selection**: Allow users to select by number (1-5) or letter (A-E)
+- **⭐ Always provide a recommended answer**: Mark one option as recommended (⭐) based on:
+  - Code analysis (what the actual implementation does or should do)
+  - Best practices (industry standards, common patterns)
+  - Domain knowledge (what makes sense for this specific feature)
+  - The recommendation helps less-experienced users make informed decisions
+
 ## Parameters
 
 ### Target/Input
@@ -28,6 +97,7 @@ Review project bundle to identify/resolve ambiguities and missing information. A
 - `--list-questions` - Output questions in JSON format. Default: False
 - `--output-questions PATH` - Save questions directly to file (JSON format). Use with `--list-questions` to save instead of stdout. Default: None
 - `--list-findings` - Output all findings in structured format. Default: False
+- `--output-findings PATH` - Save findings directly to file (JSON/YAML format). Use with `--list-findings` to save instead of stdout. Default: None
 - `--findings-format FORMAT` - Output format: json, yaml, or table. Default: json for non-interactive, table for interactive
 
 ### Behavior/Options
@@ -36,10 +106,20 @@ Review project bundle to identify/resolve ambiguities and missing information. A
 - `--answers JSON` - JSON object with question_id -> answer mappings. Default: None
 - `--auto-enrich` - Automatically enrich vague acceptance criteria using PlanEnricher (same enrichment logic as `import from-code`). Default: False (opt-in for review, but import has auto-enrichment enabled by default)
+
+**Important**: `--auto-enrich` will **NOT** resolve partial findings such as:
+
+- Missing error handling specifications ("Interaction & UX Flow" category)
+- Vague acceptance criteria requiring domain knowledge ("Completion Signals" category)
+- Business context questions requiring human judgment
+
+For these cases, use the **export-to-file → LLM reasoning → import-from-file** workflow (see Step 4).
+
 ### Advanced/Configuration
 
 - `--max-questions INT` - Maximum questions per session. Default: 5 (range: 1-10)
+  **Important**: This limits the number of questions asked per review session, not the total number of available questions. If there are more questions than the limit, you may need to run the review multiple times to answer all questions. Each session will ask different questions (avoiding duplicates from previous sessions).
+
 ## Workflow
 
 ### Step 1: Parse Arguments
@@ -47,111 +127,178 @@ Review project bundle to identify/resolve ambiguities and missing information. A
 - Extract bundle name (defaults to active plan if not specified)
 - Extract optional parameters (max-questions, category, etc.)
 
-### Step 2: Execute CLI to Get Findings
+### Step 2: Execute CLI to Export Questions
 
-**First, get findings to understand what needs enrichment:**
+**CRITICAL**: Always use `/tmp/` for temporary artifacts to avoid polluting the codebase. Never create temporary files in the project root.
 
-```bash
-# Get findings (saves to stdout - can redirect to file)
-specfact plan review [<bundle>] --list-findings --findings-format json --no-interactive > findings.json
+**Note**: The `--max-questions` parameter (default: 5) limits the number of questions per session, not the total number of available questions. If there are more questions available, you may need to run the review multiple times to answer all questions. Each session will ask different questions (avoiding duplicates from previous sessions).
-# Or get questions and save directly to file (recommended)
-specfact plan review [<bundle>] --list-questions --output-questions questions.json --no-interactive
+
+**Export questions to file for LLM reasoning:**
+
+```bash
+# Export questions to file (REQUIRED for LLM enrichment workflow)
+# Use /tmp/ to avoid polluting the codebase
+specfact plan review [<bundle>] --list-questions --output-questions /tmp/questions.json --no-interactive
 # Uses active plan if bundle not specified
 ```
 
+**Optional: Get findings for comprehensive analysis:**
+
+```bash
+# Get findings (saves to stdout - can redirect to /tmp/)
+# Use /tmp/ to avoid polluting the codebase
+# Option 1: Redirect output (includes CLI banner - not recommended)
+specfact plan review [<bundle>] --list-findings --findings-format json --no-interactive > /tmp/findings.json
+
+# Option 2: Save directly to file (recommended - clean JSON only)
+specfact plan review [<bundle>] --list-findings --output-findings /tmp/findings.json --no-interactive
+```
+
 **Note**: The `--output-questions` option saves questions directly to a file, avoiding the need for complex JSON parsing. The ambiguity scanner now recognizes the simplified format (e.g., "Must verify X works correctly (see contract examples)") as valid and will not flag it as vague.
 
-### Step 3: Create Enrichment Report (if needed)
+**Important**: Always use `/tmp/` for temporary files (`questions.json`, `findings.json`, etc.) to keep the project root clean and avoid accidental commits of temporary artifacts.
+
+### Step 3: LLM Reasoning and Answer Generation
+
+**CRITICAL**: For partial findings (missing error handling, vague acceptance criteria, business context), `--auto-enrich` will **NOT** resolve them. You must use LLM reasoning.
 
-Based on the findings, create a Markdown enrichment report that addresses:
+**CRITICAL WORKFLOW**: Present questions with answer options **IN THE CHAT**, wait for user selection, then add selected answers to file.
-- **Business Context**: Priorities, constraints, unknowns
-- **Confidence Adjustments**: Feature confidence score updates (if needed)
-- **Missing Features**: New features to add (if any)
-- **Manual Updates**: Guidance for updating `idea.yaml` fields like `target_users`, `value_hypothesis`, `narrative`
+**Workflow:**
 
-**Enrichment Report Format:**
+1. **Read the exported questions file** (`/tmp/questions.json`):
 
-```markdown
-## Business Context
+   - Review all questions in the file
+   - Identify which questions require code/feature analysis
+   - Determine which questions need domain knowledge or business context
 
-### Priorities
-- Priority 1
-- Priority 2
+2. **Research codebase and features** (as needed):
 
-### Constraints
-- Constraint 1
-- Constraint 2
+   - For error handling questions: Check existing error handling patterns in the codebase
+   - For acceptance criteria questions: Review related features and stories
+   - For business context questions: Review `idea.yaml`, `product.yaml`, and related documentation
 
-### Unknowns
-- Unknown 1
-- Unknown 2
+3. **Present questions with answer options IN THE CHAT** (REQUIRED):
 
-## Confidence Adjustments
+   **DO NOT add answers to the file yet!** Present each question with answer options in the chat conversation and wait for user selection.
 
-FEATURE-KEY → 0.95
-FEATURE-OTHER → 0.8
+   For each question:
 
-## Missing Features
+   - **Generate 3-5 reasonable answer options** based on:
+     - Code analysis (existing patterns, similar features)
+     - Domain knowledge (best practices, common scenarios)
+     - Business context (product requirements, user needs)
+   - **Present options in a table format** in the chat with numbered choices:
 
-(If any features are missing)
+     ```text
+     Question 1/5
+     Category: Interaction & UX Flow
+     Q: What error/empty states should be handled for story STORY-XXX?
-## Recommendations for Manual Updates + Current Plan Settings: + Story STORY-XXX Acceptance: [current acceptance criteria] -### idea.yaml Updates Required + Answer Options: + ┌─────┬─────────────────────────────────────────────────────────────────┐ + │ No. │ Option │ + ├─────┼─────────────────────────────────────────────────────────────────┤ + │ 1 │ Error handling: Invalid input produces clear error messages │ + │ │ Empty states: Missing data shows "No data available" message │ + │ │ Validation: Required fields validated before processing │ + │ │ ⭐ Recommended (based on code analysis) │ + ├─────┼─────────────────────────────────────────────────────────────────┤ + │ 2 │ Error handling: Network failures retry with exponential backoff │ + │ │ Empty states: Show empty state UI with helpful guidance │ + │ │ Validation: Schema-based validation with clear error messages │ + ├─────┼─────────────────────────────────────────────────────────────────┤ + │ 3 │ Error handling: Errors logged to stderr with exit codes (CLI) │ + │ │ Empty states: Sensible defaults when data is missing │ + │ │ Validation: Covered in OpenAPI contract files │ + ├─────┼─────────────────────────────────────────────────────────────────┤ + │ 4 │ Not applicable - error handling covered in contract files │ + ├─────┼─────────────────────────────────────────────────────────────────┤ + │ 5 │ [Custom answer - type your own] │ + └─────┴─────────────────────────────────────────────────────────────────┘ -**target_users:** -- Primary: [description] -- Secondary: [description] + Your answer (1-5, or type custom answer): [1] ⭐ Recommended + ``` + + - **Wait for user to select an answer** (number 1-5, letter A-E, or custom text) + - **Option 5 (or last option)** should always be "[Custom answer - type your own]" to allow free-form input + - **Base options on code analysis** - review similar features, existing error handling patterns, and domain knowledge + - **Make options specific and actionable** - not generic 
responses + - **⭐ Always provide a recommended answer** - mark one option as recommended (⭐) based on code analysis, best practices, and domain knowledge. This helps less-experienced users make informed decisions. + - **Present one question at a time** and wait for user selection before moving to the next + +4. **After user has selected all answers**: + + - **THEN** export the selected answers to a separate file `/tmp/answers.json` + - Map user selections to the actual answer text (if user selected option 1, use the text from option 1) + - If user selected a custom answer, use that text directly + - **Export format**: Create a JSON object with `question_id -> answer` mappings + - **DO NOT** add answers to the file until user has selected all answers + - **CRITICAL**: Export answers to `/tmp/answers.json` (not `/tmp/questions.json`) for CLI import + +**Example `/tmp/questions.json` structure:** + +```json +{ + "questions": [ + { + "id": "Q001", + "category": "Interaction & UX Flow", + "question": "What error/empty states should be handled for story STORY-XXX?", + "related_sections": ["features.FEATURE-XXX.stories.STORY-XXX.acceptance"] + } + ], + "total": 5 +} +``` -**value_hypothesis:** -[Value proposition] +**Example `/tmp/answers.json` structure (exported after user selections):** -**narrative:** -[Improved narrative] +```json +{ + "Q001": "Error handling should include: network failures (retry with exponential backoff), invalid input (clear validation messages), empty results (show 'No data available' message), timeout errors (show progress indicator and allow cancellation). Based on analysis of similar features in the codebase.", + "Q002": "Answer for question 2 based on code review..." +} ``` -### Step 4: Apply Enrichment +**CRITICAL**: Export answers to `/tmp/answers.json` (separate file), not to `/tmp/questions.json`. The CLI expects a file path for `--answers`, not a JSON string extracted from the questions file. 
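To make the export step concrete, here is a minimal Python sketch of building `/tmp/answers.json` from chat selections; the question ID, option text, and in-memory structures are illustrative assumptions, not output from a real session:

```python
import json

# Illustrative questions export mirroring the /tmp/questions.json structure above;
# the ID and texts are hypothetical examples.
questions = {
    "questions": [
        {
            "id": "Q001",
            "category": "Interaction & UX Flow",
            "question": "What error/empty states should be handled for story STORY-XXX?",
        }
    ],
    "total": 1,
}

# User selections collected in the chat, already resolved to the full answer text
# (e.g. the user picked option 1, so we store that option's wording).
selections = {"Q001": "Invalid input produces clear validation messages"}

# Keep only answers whose question IDs actually exist in the export.
known_ids = {q["id"] for q in questions["questions"]}
answers = {qid: text for qid, text in selections.items() if qid in known_ids}

# Write the separate answers file the CLI reads via --answers /tmp/answers.json.
with open("/tmp/answers.json", "w", encoding="utf-8") as f:
    json.dump(answers, f, indent=2)
```

The resulting file contains only `question_id -> answer` pairs, which is the shape the CLI expects for `--answers /tmp/answers.json`.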
-#### Option A: Use enrichment to answer review questions
+### Step 4: Apply Enrichment via CLI

-**Recommended workflow:**
+**REQUIRED workflow for partial findings:**

-1. **Get questions and save to file:**
+1. **Export questions to file** (already done in Step 2):

   ```bash
-   specfact plan review [] --list-questions --output-questions questions.json --no-interactive
+   # Use /tmp/ to avoid polluting the codebase
+   specfact plan review [] --list-questions --output-questions /tmp/questions.json --no-interactive
   ```

-2. **Edit the JSON file** to add answers:
-
-   ```json
-   {
-     "questions": [...],
-     "total": 5,
-     "answers": {
-       "Q001": "Answer for question 1",
-       "Q002": "Answer for question 2"
-     }
-   }
-   ```
+2. **LLM reasoning and user selection** (Step 3):
+
+   - LLM presents questions with answer options **IN THE CHAT**
+   - User selects answers (1-5, A-E, or custom text)
+   - **After user has selected all answers**, LLM exports the selected answers to `/tmp/answers.json` (separate file, not `/tmp/questions.json`)

-3. **Extract answers and provide to CLI:**

+3. **Import answers via CLI** (after user selections are complete):

   ```bash
-   # Extract answers from file (simple approach)
-   specfact plan review [] --answers "$(jq -c '.answers' questions.json)" --no-interactive
-
-   # Or provide answers directly
-   specfact plan review [] --answers '{"Q001": "answer1", "Q002": "answer2"}' --no-interactive
+   # Import answers from exported file
+   # Use /tmp/ to avoid polluting the codebase
+   specfact plan review [] --answers /tmp/answers.json --no-interactive
   ```

-**Alternative**: Create answers JSON from enrichment report:
+**CRITICAL**:

-```bash
-specfact plan review [] --answers '{"Q001": "answer1", "Q002": "answer2"}'
-```
+- Do NOT add answers to the file until the user has selected all answers
+- Present questions in chat, wait for selections
+- Export answers to `/tmp/answers.json` (separate file, not `/tmp/questions.json`)
+- Import via CLI using the file path: `--answers /tmp/answers.json`
+
+**Alternative approaches** (for non-partial findings only):

 #### Option B: Update idea fields directly via CLI

@@ -169,6 +316,14 @@ specfact import from-code [] --repo . --enrichment enrichment-repor

 **Note:**

+- **For partial findings**: Always use Option A (export → LLM reasoning → import)
+- **For business context only**: Option B (update-idea) may be sufficient
+- **For bundle regeneration**: Only use Option C if you need to regenerate the bundle
+- **CRITICAL**: Never manually edit `.specfact/` files directly - always use CLI commands
+  - This includes `idea.yaml`, `product.yaml`, feature files, story files, etc.
+ - Even if a file doesn't exist yet, use CLI commands to create it (e.g., `plan update-idea` will create `idea.yaml` if needed) + - Direct file modification bypasses validation and can cause inconsistencies + - **Preferred**: Use Option A (answers) or Option B (update-idea) for most cases - Only use Option C if you need to regenerate the bundle - **CRITICAL**: Never manually edit `.specfact/` files directly - always use CLI commands @@ -207,11 +362,12 @@ When in copilot mode, follow this three-phase workflow: ### Phase 1: CLI Grounding (REQUIRED) ```bash -# Option 1: Get findings (redirect to file if needed) -specfact plan review [] --list-findings --findings-format json --no-interactive > findings.json +# Option 1: Get findings (redirect to /tmp/ to avoid polluting codebase) +# Option 1: Save findings directly to file (recommended - clean JSON only) +specfact plan review [] --list-findings --output-findings /tmp/findings.json --no-interactive -# Option 2: Get questions and save directly to file (recommended - avoids JSON parsing) -specfact plan review [] --list-questions --output-questions questions.json --no-interactive +# Option 2: Get questions and save directly to /tmp/ (recommended - avoids JSON parsing) +specfact plan review [] --list-questions --output-questions /tmp/questions.json --no-interactive ``` **Capture**: @@ -223,35 +379,102 @@ specfact plan review [] --list-questions --output-questions questio **Note**: Use `--output-questions` to save questions directly to a file. This avoids the need for complex on-the-fly Python code to extract JSON from CLI output. -### Phase 2: LLM Enrichment (OPTIONAL, Copilot Only) +**CRITICAL**: Always use `/tmp/` for temporary artifacts (`questions.json`, `findings.json`, etc.) to avoid polluting the codebase and prevent accidental commits of temporary files. 
+ +### Phase 2: LLM Enrichment (REQUIRED for Partial Findings) -**Purpose**: Add semantic understanding to CLI findings +**Purpose**: Add semantic understanding and domain knowledge to CLI findings + +**CRITICAL**: `--auto-enrich` will **NOT** resolve partial findings. LLM reasoning is **REQUIRED** for: + +- Missing error handling specifications ("Interaction & UX Flow" category) +- Vague acceptance criteria requiring domain knowledge ("Completion Signals" category) +- Business context questions requiring human judgment **What to do**: -- Read CLI-generated findings (use file reading tools for display only) -- Research codebase for additional context -- Generate enrichment report or batch update file -- Address ambiguities with business context +1. **Read exported questions file** (`/tmp/questions.json`): + - Review all questions and their categories + - Identify questions requiring code/feature analysis + - Determine questions needing domain knowledge + +2. **Research codebase**: + - For error handling: Analyze existing error handling patterns + - For acceptance criteria: Review related features and stories + - For business context: Review `idea.yaml`, `product.yaml`, documentation + +3. **Present questions with answer options IN THE CHAT** (REQUIRED): + + **DO NOT add answers to the file yet!** Present each question with answer options in the chat conversation. + + **For each question:** + + - Generate 3-5 reasonable options based on code analysis and domain knowledge + - Present in a numbered table (1-5) or lettered table (A-E) **IN THE CHAT** + - Include a "[Custom answer]" option as the last choice + - Make options specific and actionable, not generic + - **Wait for user to select an answer** before moving to the next question + + **Example format (present in chat):** + + ```text + Question 1/5 + Category: Interaction & UX Flow + Q: What error/empty states should be handled for story STORY-XXX? 
+
+   Answer Options:
+   ┌─────┬─────────────────────────────────────────────────────────────┐
+   │ No. │ Option                                                      │
+   ├─────┼─────────────────────────────────────────────────────────────┤
+   │  1  │ [Option based on code analysis - specific and actionable]   │
+   │     │ ⭐ Recommended (based on code analysis)                     │
+   │  2  │ [Option based on best practices - domain knowledge]         │
+   │  3  │ [Option based on similar features - pattern matching]       │
+   │  4  │ [Not applicable / covered elsewhere]                        │
+   │  5  │ [Custom answer - type your own]                             │
+   └─────┴─────────────────────────────────────────────────────────────┘
+
+   Your answer (1-5, or type custom answer): [1] ⭐ Recommended
+   ```
+
+4. **After user has selected all answers**:
+
+   - **THEN** export the selected answers to `/tmp/answers.json` (separate file, not `/tmp/questions.json`)
+   - Map user selections (1-5) to the actual answer text from the options
+   - If user selected a custom answer, use that text directly
+   - **DO NOT** export answers until the user has selected all answers

 **What NOT to do**:

+- ❌ Use `--auto-enrich` expecting it to resolve partial findings
 - ❌ Create YAML/JSON artifacts directly (even if they don't exist yet)
 - ❌ Modify CLI artifacts directly (use CLI commands to update)
 - ❌ Edit `idea.yaml`, `product.yaml`, feature files, or story files manually
 - ❌ Create new artifact files manually - use CLI commands instead
 - ❌ Bypass CLI validation
 - ❌ Write to `.specfact/` folder directly (always use CLI)
+- ❌ Create temporary files in project root (always use `/tmp/`)

-**Output**: Generate enrichment report (Markdown) or batch update JSON/YAML file
+**Output**: `/tmp/answers.json` file populated with `question_id -> answer` mappings

 ### Phase 3: CLI Artifact Creation (REQUIRED)

+**For partial findings (REQUIRED workflow):**
+
 ```bash
-# Use enrichment to update plan via CLI
-specfact plan update-feature [--bundle ] --batch-updates  --no-interactive
-# Or use auto-enrich:
+# Import answers from the exported /tmp/answers.json file
+# Use /tmp/ to avoid polluting the codebase
+specfact plan review [] --answers /tmp/answers.json --no-interactive
+```
+
+**For non-partial findings only:**
+
+```bash
+# Use auto-enrich for simple vague criteria (not partial findings)
 specfact plan review [] --auto-enrich --no-interactive
+
+# Or use batch updates for feature updates
+specfact plan update-feature [--bundle ] --batch-updates  --no-interactive
 ```

 **Result**: Final artifacts are CLI-generated with validated enrichments

@@ -292,56 +515,91 @@ Create one with: specfact plan init legacy-api
 # Get findings first
 /specfact.03-review --list-findings  # List all findings
 /specfact.03-review --list-findings --findings-format json  # JSON format for enrichment
+/specfact.03-review --list-findings --output-findings /tmp/findings.json  # Save findings to file (clean JSON)

 # Interactive review
-/specfact.03-review  # Uses active plan
+/specfact.03-review  # Uses active plan (default: 5 questions per session)
 /specfact.03-review legacy-api  # Specific bundle
-/specfact.03-review --max-questions 3  # Limit questions
+/specfact.03-review --max-questions 3  # Limit questions per session (may need multiple runs)
 /specfact.03-review --category "Functional Scope"  # Focus category
+/specfact.03-review --max-questions 10  # Ask more questions per session (up to 10)

 # Non-interactive with answers
 /specfact.03-review --answers '{"Q001": "answer"}'  # Provide answers directly
 /specfact.03-review --list-questions  # Output questions as JSON to stdout
-/specfact.03-review --list-questions --output-questions questions.json  # Save questions to file
-
-# Auto-enrichment
-/specfact.03-review --auto-enrich  # Auto-enrich vague criteria
+/specfact.03-review --list-questions --output-questions /tmp/questions.json  # Save questions to /tmp/
+
+# Auto-enrichment (NOTE: Will NOT resolve partial findings - use export/LLM/import workflow instead)
+/specfact.03-review --auto-enrich  # Auto-enrich simple vague criteria only
+
+# Recommended workflow for partial
findings (use /tmp/ to avoid polluting codebase) +/specfact.03-review --list-questions --output-questions /tmp/questions.json # Export questions (default: 5 per session) +# [LLM reasoning: present questions in chat, wait for user selections, then export answers] +/specfact.03-review --answers /tmp/answers.json # Import answers from file +# [Repeat if more questions available - each session asks different questions] +/specfact.03-review --list-questions --output-questions /tmp/questions.json # Export next batch +/specfact.03-review --answers /tmp/answers.json # Import next batch ``` ## Enrichment Workflow -**Note**: Import command (`specfact import from-code`) has **auto-enrichment enabled by default** using PlanEnricher. Review command requires explicit `--auto-enrich` flag. +**CRITICAL**: `--auto-enrich` will **NOT** resolve partial findings such as: -**Typical workflow when enrichment is needed:** +- Missing error handling specifications ("Interaction & UX Flow" category) +- Vague acceptance criteria requiring domain knowledge ("Completion Signals" category) +- Business context questions requiring human judgment -1. **Get questions** (save to file for easy editing): +**For partial findings, use this REQUIRED workflow:** + +1. **Export questions to file** (use `/tmp/` to avoid polluting codebase): ```bash - specfact plan review --list-questions --output-questions questions.json --no-interactive + specfact plan review [] --list-questions --output-questions /tmp/questions.json --no-interactive ``` -2. **Get findings** (optional, for comprehensive analysis): +2. **Get findings** (optional, for comprehensive analysis - use `/tmp/`): ```bash - specfact plan review --list-findings --findings-format json --no-interactive > findings.json + specfact plan review [] --list-findings --output-findings /tmp/findings.json --no-interactive ``` -3. **Analyze findings**: Review missing information (target_users, value_hypothesis, etc.) +3. 
**LLM reasoning and user selection** (REQUIRED for partial findings): + + **CRITICAL**: Present questions with answer options **IN THE CHAT**, wait for user selections, then add selected answers to file. + + - Read `/tmp/questions.json` file + - Research codebase for error handling patterns, feature relationships, domain knowledge + - **Present each question with answer options IN THE CHAT** (see Step 3 for format) + - **Wait for user to select answers** (1-5, A-E, or custom text) + - **After user has selected all answers**, export selected answers to `/tmp/answers.json` (separate file) + - Map user selections to actual answer text (if user selected option 1, use the text from option 1) + - **Export format**: Create a JSON object with `question_id -> answer` mappings + - **DO NOT** export answers to file until user has selected all answers + - **CRITICAL**: Export to `/tmp/answers.json` (not `/tmp/questions.json`) for CLI import + +4. **Import answers via CLI** (after user selections are complete): + + ```bash + # Import answers from exported file + specfact plan review [] --answers /tmp/answers.json --no-interactive + ``` + + **CRITICAL**: Use the file path `/tmp/answers.json` (not a JSON string extracted from `/tmp/questions.json`) + +5. **Verify**: Run `plan review` again to confirm improvements -4. **Apply automatic enrichment** (if needed): + **Important**: The `--max-questions` parameter (default: 5) limits questions per session, not the total available. If there are more questions, repeat the workflow (Steps 2-4) until all are answered. Each session asks different questions, avoiding duplicates from previous sessions. - - **During import**: Auto-enrichment happens automatically (enabled by default) - - **After import**: Use `specfact plan review --auto-enrich` to enhance vague criteria - - **Note**: The scanner now recognizes simplified format (e.g., "Must verify X works correctly (see contract examples)") as valid +**For non-partial findings only:** -5. 
**Create enrichment report** (for business context, confidence adjustments, missing features): Write Markdown file addressing findings +- **During import**: Auto-enrichment happens automatically (enabled by default) +- **After import**: Use `specfact plan review --auto-enrich` for simple vague criteria +- **Note**: The scanner now recognizes simplified format (e.g., "Must verify X works correctly (see contract examples)") as valid -6. **Apply manual enrichment**: - - **Preferred**: Edit `questions.json` file to add answers, then use `--answers` with the file - - **Alternative**: Use `plan update-idea` to update idea fields directly - - **Last resort**: If bundle needs regeneration, use `import from-code --enrichment` +**Alternative approaches** (for business context only): -7. **Verify**: Run `plan review` again to confirm improvements +- Use `plan update-idea` to update idea fields directly +- If bundle needs regeneration, use `import from-code --enrichment` ## Context diff --git a/.github/workflows/pr-orchestrator.yml b/.github/workflows/pr-orchestrator.yml index e359867..8d8ed67 100644 --- a/.github/workflows/pr-orchestrator.yml +++ b/.github/workflows/pr-orchestrator.yml @@ -294,6 +294,9 @@ jobs: runs-on: ubuntu-latest needs: [package-validation] if: github.event_name == 'push' && github.ref == 'refs/heads/main' + outputs: + published: ${{ steps.publish.outputs.published }} + version: ${{ steps.publish.outputs.version }} permissions: contents: read steps: @@ -324,6 +327,7 @@ jobs: PYPI_API_TOKEN: ${{ secrets.PYPI_API_TOKEN }} run: | ./.github/workflows/scripts/check-and-publish-pypi.sh + - name: Summary if: always() @@ -344,3 +348,64 @@ jobs: echo "| Status | ⏭️ Skipped (version not newer) |" fi } >> "$GITHUB_STEP_SUMMARY" + + create-release: + name: Create GitHub Release + runs-on: ubuntu-latest + needs: [publish-pypi] + if: needs.publish-pypi.outputs.published == 'true' && github.event_name == 'push' && github.ref == 'refs/heads/main' + permissions: + 
contents: write
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Install GitHub CLI
+        run: |
+          type -p curl >/dev/null || (sudo apt update && sudo apt install curl -y)
+          curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
+          && sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
+          && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
+          && sudo apt update \
+          && sudo apt install gh -y
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Make scripts executable
+        run: |
+          chmod +x .github/workflows/scripts/generate-release-notes.sh
+          chmod +x .github/workflows/scripts/create-github-release.sh
+
+      - name: Get version from PyPI publish step
+        id: get_version
+        run: |
+          # Use version from publish-pypi job output
+          VERSION="${{ needs.publish-pypi.outputs.version }}"
+          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
+          echo "📦 Version: $VERSION"
+
+      - name: Create GitHub Release
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          VERSION="${{ steps.get_version.outputs.version }}"
+          echo "🚀 Creating GitHub release for version $VERSION..."
+ ./.github/workflows/scripts/create-github-release.sh "$VERSION" + + - name: Release Summary + if: always() + run: | + VERSION="${{ steps.get_version.outputs.version }}" + { + echo "## GitHub Release Summary" + echo "| Parameter | Value |" + echo "|-----------|--------|" + echo "| Version | $VERSION |" + echo "| Status | ✅ Release created |" + echo "| URL | https://github.com/${{ github.repository }}/releases/tag/v${VERSION} |" + } >> "$GITHUB_STEP_SUMMARY" diff --git a/.github/workflows/scripts/create-github-release.sh b/.github/workflows/scripts/create-github-release.sh new file mode 100755 index 0000000..ad14d2f --- /dev/null +++ b/.github/workflows/scripts/create-github-release.sh @@ -0,0 +1,45 @@ +#!/usr/bin/env bash +set -euo pipefail + +# create-github-release.sh +# Create a GitHub release with release notes from CHANGELOG.md +# Usage: create-github-release.sh +# Requires: GITHUB_TOKEN environment variable + +if [[ $# -ne 1 ]]; then + echo "Usage: $0 " >&2 + exit 1 +fi + +VERSION="$1" +# Ensure version has 'v' prefix +if [[ ! $VERSION =~ ^v ]]; then + VERSION="v${VERSION}" +fi + +echo "📝 Generating release notes for $VERSION..." + +# Generate release notes from CHANGELOG.md +RELEASE_NOTES=$(./.github/workflows/scripts/generate-release-notes.sh "$VERSION") + +if [[ -z "$RELEASE_NOTES" ]]; then + echo "❌ Failed to generate release notes" >&2 + exit 1 +fi + +echo "🚀 Creating GitHub release $VERSION..." 
+ +# Check if release already exists +if gh release view "$VERSION" &>/dev/null; then + echo "⚠️ Release $VERSION already exists, skipping creation" + exit 0 +fi + +# Create the release +gh release create "$VERSION" \ + --title "$VERSION" \ + --notes "$RELEASE_NOTES" \ + --repo "$GITHUB_REPOSITORY" + +echo "✅ Successfully created GitHub release $VERSION" + diff --git a/.github/workflows/scripts/generate-release-notes.sh b/.github/workflows/scripts/generate-release-notes.sh new file mode 100755 index 0000000..ba1564e --- /dev/null +++ b/.github/workflows/scripts/generate-release-notes.sh @@ -0,0 +1,57 @@ +#!/usr/bin/env bash +set -euo pipefail + +# generate-release-notes.sh +# Extracts release notes from CHANGELOG.md for a given version +# Usage: generate-release-notes.sh +# Outputs release notes to stdout + +if [[ $# -ne 1 ]]; then + echo "Usage: $0 " >&2 + exit 1 +fi + +VERSION="$1" +# Remove 'v' prefix if present for CHANGELOG matching +VERSION_NO_V=${VERSION#v} + +# Extract the section for this version from CHANGELOG.md +# Look for the version header and extract until the next version header or end of file +python3 << PYTHON_SCRIPT +import re +import sys + +version = "$VERSION_NO_V" +changelog_path = "CHANGELOG.md" + +try: + with open(changelog_path, 'r', encoding='utf-8') as f: + content = f.read() + + # Find the version section + # Pattern: ## [version] - date + pattern = rf'## \[{re.escape(version)}\][^\n]*\n(.*?)(?=\n## \[|\Z)' + match = re.search(pattern, content, re.DOTALL) + + if match: + notes = match.group(1).strip() + # Remove leading/trailing whitespace and horizontal rules + notes = re.sub(r'^---\s*$', '', notes, flags=re.MULTILINE) + notes = notes.strip() + + if notes: + print(notes) + else: + print(f"No release notes found for version {version}", file=sys.stderr) + sys.exit(1) + else: + print(f"Version {version} not found in CHANGELOG.md", file=sys.stderr) + sys.exit(1) +except FileNotFoundError: + print(f"Error: {changelog_path} not found", 
file=sys.stderr) + sys.exit(1) +except Exception as e: + print(f"Error extracting release notes: {e}", file=sys.stderr) + sys.exit(1) +PYTHON_SCRIPT + diff --git a/CHANGELOG.md b/CHANGELOG.md index 2972ea8..4c250cd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,60 @@ All notable changes to this project will be documented in this file. --- +## [0.20.5] - 2025-12-24 + +### Fixed (0.20.5) + +- **Sidecar Template Code Quality**: Fixed formatting and linting issues in sidecar template files + - **`adapters.py`**: Removed whitespace from blank line, removed unused imports (`HttpRequest`, `QueryDict`), fixed exception chaining with `raise ... from None` + - **`crosshair_django_wrapper.py`**: Combined nested if statements to reduce complexity (SIM102) + - **`populate_contracts.py`**: Replaced for loop with `any()` expression for better Pythonic code (SIM110) + - **`django_form_extractor.py`**: Combined nested if statements, fixed indentation issues throughout the file + - **`django_url_extractor.py`**: Combined nested if statements, improved code formatting + - All files now pass `hatch run format` checks with no errors + - Improves code maintainability and follows Python best practices + +--- + +## [0.20.4] - 2025-12-23 + +### Fixed (0.20.4) + +- **Enrichment Parser Story Merging**: Fixed critical issue where stories from enrichment reports were not added when updating existing features + - Previously, stories were only added when creating new features, not when updating existing ones + - Now correctly merges stories from enrichment reports into existing features (adds new stories that don't already exist by key) + - Also updates feature title if it was empty + - Preserves existing stories while adding new ones from enrichment reports + - Enables full dual-stack enrichment workflow: CLI grounding → LLM enrichment → CLI artifact creation with complete story details + - Verified with DjangoGoat validation: 24 stories now correctly added across 8 features + +--- + 
+## [0.20.3] - 2025-12-22 + +### Added (0.20.3) + +- **Sidecar Template Guidance (Phase B)**: Added refresh workflow guidance and recommended CrossHair defaults to sidecar templates for internal research validation. + +### Fixed (0.20.3) + +- **Sidecar Adapters**: Resolved registry adapter typing, callback closure binding, duplicate adapter definitions, and teardown return flow in sidecar templates. + +--- + +## [0.20.2] - 2025-12-22 + +### Fixed (0.20.2) + +- **`repro` CrossHair Execution**: Avoided import-time side effects by expanding directory targets into files and excluding `__main__.py` + - Prevents Flask-style CLI entrypoints from consuming CrossHair arguments + - Keeps contract exploration focused on analyzable code paths +- **`repro` CrossHair Imports**: Use module targets with `PYTHONPATH` roots to support namespace packages + - Fixes relative-import failures for layouts like `flask/sansio` without `__init__.py` +- **`repro` Success Messaging**: Clarified output when only CrossHair fails (advisory) instead of reporting full success + +--- + ## [0.20.1] - 2025-12-20 ### Fixed (0.20.1) diff --git a/docs/examples/integration-showcases/integration-showcases-testing-guide.md b/docs/examples/integration-showcases/integration-showcases-testing-guide.md index 8f6d4a9..bb076c7 100644 --- a/docs/examples/integration-showcases/integration-showcases-testing-guide.md +++ b/docs/examples/integration-showcases/integration-showcases-testing-guide.md @@ -185,6 +185,11 @@ def process_payment(request): 2. **Enrichment Report Creation**: The AI will: - Draft an enrichment markdown file: `-.enrichment.md` (saved to `.specfact/projects//reports/enrichment/`, Phase 8.5) - Include missing features, stories, confidence adjustments, and business context + - **CRITICAL**: Follow the exact enrichment report format (see [Dual-Stack Enrichment Guide](../../guides/dual-stack-enrichment.md) for format requirements): + - Features must use numbered list: `1. 
**Feature Title** (Key: FEATURE-XXX)` + - Each feature must have a `Stories:` section with numbered stories + - Stories must have `- Acceptance:` criteria + - Stories must be indented under the feature 3. **Apply Enrichment**: The AI will run: ```bash @@ -1041,6 +1046,7 @@ Report written to: .specfact/projects//reports/enforcement/report-< - Smoke tests (pytest) - only if `tests/smoke/` directory exists **CrossHair Setup**: Before running `repro` for the first time, set up CrossHair configuration: + ```bash specfact repro setup ``` diff --git a/docs/guides/dual-stack-enrichment.md b/docs/guides/dual-stack-enrichment.md index 0566d4d..90537ff 100644 --- a/docs/guides/dual-stack-enrichment.md +++ b/docs/guides/dual-stack-enrichment.md @@ -1,7 +1,8 @@ # Dual-Stack Enrichment Pattern **Status**: ✅ **AVAILABLE** (v0.13.0+) -**Last Updated**: 2025-12-02 +**Last Updated**: 2025-12-23 +**Version**: v0.20.4 (enrichment parser improvements: story merging, format validation) --- @@ -74,6 +75,7 @@ specfact [options] --no-interactive - Identify missing features/stories - Suggest confidence adjustments - Extract business context +- **CRITICAL**: Generate enrichment report in the exact format specified below (see "Enrichment Report Format" section) **What NOT to do**: @@ -82,23 +84,126 @@ specfact [options] --no-interactive - ❌ Bypass CLI validation - ❌ Write to `.specfact/` folder directly (always use CLI) - ❌ Use direct file manipulation tools for writing (use CLI commands) +- ❌ Deviate from the enrichment report format (will cause parsing failures) **Output**: Generate enrichment report (Markdown) saved to `.specfact/projects//reports/enrichment/` (bundle-specific, Phase 8.5) +**Enrichment Report Format** (REQUIRED for successful parsing): + +The enrichment parser expects a specific Markdown format. Follow this structure exactly: + +```markdown +# [Bundle Name] Enrichment Report + +**Date**: YYYY-MM-DDTHH:MM:SS +**Bundle**: + +--- + +## Missing Features + +1. 
**Feature Title** (Key: FEATURE-XXX) + - Confidence: 0.85 + - Outcomes: outcome1, outcome2, outcome3 + - Stories: + 1. Story title here + - Acceptance: criterion1, criterion2, criterion3 + 2. Another story title + - Acceptance: criterion1, criterion2 + +2. **Another Feature** (Key: FEATURE-YYY) + - Confidence: 0.80 + - Outcomes: outcome1, outcome2 + - Stories: + 1. Story title + - Acceptance: criterion1, criterion2, criterion3 + +## Confidence Adjustments + +- FEATURE-EXISTING-KEY: 0.90 (reason: improved understanding after code review) + +## Business Context + +- Priority: High priority feature for core functionality +- Constraint: Must support both REST and GraphQL APIs +- Risk: Potential performance issues with large datasets +``` + +**Format Requirements**: + +1. **Section Header**: Must use `## Missing Features` (case-insensitive, but prefer this exact format) +2. **Feature Format**: + - Numbered list: `1. **Feature Title** (Key: FEATURE-XXX)` + - **Bold title** is required (use `**Title**`) + - **Key in parentheses**: `(Key: FEATURE-XXX)` - must be uppercase, alphanumeric with hyphens/underscores + - Fields on separate lines with `-` prefix: + - `- Confidence: 0.85` (float between 0.0-1.0) + - `- Outcomes: comma-separated or line-separated list` + - `- Stories:` (required - each feature must have at least one story) +3. **Stories Format**: + - Numbered list under `Stories:` section: `1. Story title` + - **Indentation**: Stories must be indented (2-4 spaces) under the feature + - **Acceptance Criteria**: `- Acceptance: criterion1, criterion2, criterion3` + - Can be comma-separated on one line + - Or multi-line (each criterion on new line) + - Must start with `- Acceptance:` +4. **Optional Sections**: + - `## Confidence Adjustments`: List existing features with confidence updates + - `## Business Context`: Priorities, constraints, risks (bullet points) +5. 
**File Naming**: `-.enrichment.md` (e.g., `djangogoat-2025-12-23T23-50-00.enrichment.md`) + +**Example** (working format): + +```markdown +## Missing Features + +1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION) + - Confidence: 0.85 + - Outcomes: User registration, login, profile management + - Stories: + 1. User can sign up for new account + - Acceptance: sign_up view processes POST requests, creates User automatically, user is logged in after signup, redirects to profile page + 2. User can log in with credentials + - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected, on failure error message is displayed +``` + +**Common Mistakes to Avoid**: + +- ❌ Missing `(Key: FEATURE-XXX)` - parser needs this to identify features +- ❌ Missing `Stories:` section - every feature must have at least one story +- ❌ Stories not indented - parser expects indented numbered lists +- ❌ Missing `- Acceptance:` prefix - acceptance criteria won't be parsed +- ❌ Using bullet points (`-`) instead of numbers (`1.`) for stories +- ❌ Feature title not in bold (`**Title**`) - parser may not extract title correctly + +**Important Notes**: + +- **Stories are merged**: When updating existing features (not creating new ones), stories from the enrichment report are merged into the existing feature. New stories are added, existing stories are preserved. +- **Feature titles updated**: If a feature exists but has an empty title, the enrichment report will update it. +- **Validation**: The enrichment parser validates the format and will fail with clear error messages if the format is incorrect. 
+
### Phase 3: CLI Artifact Creation (REQUIRED)

```bash
# Use enrichment to update plan via CLI
-specfact plan update-feature [--bundle <name>] [options] --no-interactive
+specfact import from-code [<bundle>] --repo <path> --enrichment <report.md> --no-interactive
```

**Result**: Final artifacts are CLI-generated with validated enrichments

+**What happens during enrichment application**:
+
+- Missing features are added with their stories and acceptance criteria
+- Existing features are updated (confidence, outcomes, title if empty)
+- Stories are merged into existing features (new stories added, existing preserved)
+- Business context is applied to the plan bundle
+- All changes are validated and saved via CLI
+
## Standard Validation Loop Pattern (For LLM-Generated Code)

When generating or enhancing code via LLM, **ALWAYS** follow this pattern:

-```
+```text
1. CLI Prompt Generation (Required)
   ↓ CLI generates structured prompt → saved to .specfact/prompts/
diff --git a/docs/reference/commands.md b/docs/reference/commands.md
index 8571652..0eaafeb 100644
--- a/docs/reference/commands.md
+++ b/docs/reference/commands.md
@@ -272,7 +272,11 @@ specfact import from-code [OPTIONS]
- **Large codebases**: Focus on specific modules or subsystems for faster analysis
- **Incremental modernization**: Modernize one part of the codebase at a time
  - Example: `--entry-point src/core` analyzes only `src/core/` and its subdirectories
-- `--enrichment PATH` - Path to Markdown enrichment report from LLM (applies missing features, confidence adjustments, business context)
+- `--enrichment PATH` - Path to Markdown enrichment report from LLM (applies missing features, confidence adjustments, business context). The enrichment report must follow a specific format (see [Dual-Stack Enrichment Guide](../guides/dual-stack-enrichment.md) for format requirements).
When applied:
+  - Missing features are added with their stories and acceptance criteria
+  - Existing features are updated (confidence, outcomes, title if empty)
+  - Stories are merged into existing features (new stories added, existing preserved)
+  - Business context is applied to the plan bundle

**Note**: The bundle name (positional argument) will be automatically sanitized (lowercased, spaces/special chars removed) for filesystem persistence. The bundle is created at `.specfact/projects/<bundle_name>/`.
diff --git a/pyproject.toml b/pyproject.toml
index a0f9c15..4a5d859 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "specfact-cli"
-version = "0.20.1"
+version = "0.20.5"
description = "Brownfield-first CLI: Reverse engineer legacy Python → specs → enforced contracts. Automate legacy code documentation and prevent modernization regressions."
readme = "README.md"
requires-python = ">=3.11"
diff --git a/resources/prompts/shared/cli-enforcement.md b/resources/prompts/shared/cli-enforcement.md
index 4a9612c..c2952db 100644
--- a/resources/prompts/shared/cli-enforcement.md
+++ b/resources/prompts/shared/cli-enforcement.md
@@ -37,6 +37,12 @@ These operations **require LLM** and are only available via AI IDE slash prompts
**Access**: Only available via AI IDE slash prompts (Cursor, CoPilot, etc.)
**Pattern**: Slash prompt → LLM generates → CLI validates → Apply if valid

+## LLM Grounding Rules
+
+- Treat CLI artifacts as the source of truth for keys, structure, and metadata.
+- Scan the codebase only when asked to infer missing behavior/context or explain deviations; respect `--entry-point` scope when provided.
+- Use codebase findings to propose updates via CLI (enrichment report, plan update commands), never to rewrite artifacts directly.
+
## Rules

1. 
**Execute CLI First**: Always run CLI commands before any analysis
diff --git a/resources/prompts/specfact.01-import.md b/resources/prompts/specfact.01-import.md
index a711fee..388f628 100644
--- a/resources/prompts/specfact.01-import.md
+++ b/resources/prompts/specfact.01-import.md
@@ -25,17 +25,20 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr

## Workflow

-1. **Execute CLI**: `specfact import from-code [<bundle>] --repo <path> [options]`
+1. **Execute CLI**: `specfact [GLOBAL OPTIONS] import from-code [<bundle>] --repo <path> [options]`
   - CLI extracts: routes (FastAPI/Flask/Django), schemas (Pydantic), relationships, contracts (OpenAPI scaffolds), source tracking
   - Uses active plan if bundle not specified
+   - Note: `--no-interactive` is a global option and must appear before the subcommand (e.g., `specfact --no-interactive import from-code ...`).
   - **Auto-enrichment enabled by default**: Automatically enhances vague acceptance criteria, incomplete requirements, and generic tasks using PlanEnricher (same logic as `plan review --auto-enrich`)
   - Use `--no-enrich-for-speckit` to disable auto-enrichment
+   - **Contract extraction**: OpenAPI contracts are extracted automatically **only** for features with `source_tracking.implementation_files` and detectable API endpoints (FastAPI/Flask patterns). For enrichment-added features or Django apps, use `specfact contract init` after enrichment (see Phase 4)

-2. **LLM Enrichment** (if `--enrichment` provided):
-   - Read `.specfact/projects/<bundle>/enrichment_context.md`
-   - Enrich: business context, "why" reasoning, missing acceptance criteria
-   - Validate: contracts vs code, feature/story alignment
-   - Save enrichment report to `.specfact/projects/<bundle>/reports/enrichment/` (bundle-specific, Phase 8.5, if created)
+2. 
**LLM Enrichment** (Copilot-only, before applying `--enrichment`):
+   - Read CLI artifacts: `.specfact/projects/<bundle>/enrichment_context.md`, feature YAMLs, contract scaffolds, and brownfield reports
+   - Scan the codebase within `--entry-point` (and adjacent modules) to identify missing features, dependencies, and behavior; do **not** rely solely on AST-derived YAML
+   - Compare code findings vs CLI artifacts, then add missing features/stories, reasoning, and acceptance criteria (each added feature must include at least one story)
+   - Save the enrichment report to `.specfact/projects/<bundle>/reports/enrichment/<bundle>-<timestamp>.enrichment.md` (bundle-specific, Phase 8.5)
+   - **CRITICAL**: Follow the exact enrichment report format (see "Enrichment Report Format" section below) to ensure successful parsing

3. **Present**: Bundle location, report path, summary (features/stories/contracts/relationships)

@@ -46,7 +49,7 @@ Import codebase → plan bundle. CLI extracts routes/schemas/relationships/contr

**Rules:**

- Execute CLI first - never create artifacts directly
-- Use `--no-interactive` flag in CI/CD environments
+- Use the global `--no-interactive` flag in CI/CD environments (must appear before the subcommand)
- Never modify `.specfact/` directly
- Use CLI output as grounding for validation
- Code generation requires LLM (only via AI IDE slash prompts, not CLI-only)

@@ -59,7 +62,7 @@ When in copilot mode, follow this three-phase workflow:

```bash
# Execute CLI to get structured output
-specfact import from-code [<bundle>] --repo <path> --no-interactive
+specfact --no-interactive import from-code [<bundle>] --repo <path>
```

**Capture**:

@@ -75,10 +78,10 @@ specfact import from-code [<bundle>] --repo <path> --no-interactive

**What to do**:

- Read CLI-generated artifacts (use file reading tools for display only)
-- Research codebase for additional context
-- Identify missing features/stories
-- Suggest confidence adjustments
-- Extract business context
+- Scan the codebase within `--entry-point` for missing features/behavior and compare
against CLI artifacts +- Identify missing features/stories and add reasoning/acceptance criteria (no direct edits to `.specfact/`) +- Suggest confidence adjustments and extract business context +- **CRITICAL**: Generate enrichment report in the exact format specified below (see "Enrichment Report Format" section) **What NOT to do**: @@ -87,20 +90,159 @@ specfact import from-code [] --repo --no-interactive - ❌ Bypass CLI validation - ❌ Write to `.specfact/` folder directly (always use CLI) - ❌ Use direct file manipulation tools for writing (use CLI commands) +- ❌ Deviate from the enrichment report format (will cause parsing failures) **Output**: Generate enrichment report (Markdown) saved to `.specfact/projects//reports/enrichment/` (bundle-specific, Phase 8.5) +**Enrichment Report Format** (REQUIRED for successful parsing): + +The enrichment parser expects a specific Markdown format. Follow this structure exactly: + +```markdown +# [Bundle Name] Enrichment Report + +**Date**: YYYY-MM-DDTHH:MM:SS +**Bundle**: + +--- + +## Missing Features + +1. **Feature Title** (Key: FEATURE-XXX) + - Confidence: 0.85 + - Outcomes: outcome1, outcome2, outcome3 + - Stories: + 1. Story title here + - Acceptance: criterion1, criterion2, criterion3 + 2. Another story title + - Acceptance: criterion1, criterion2 + +2. **Another Feature** (Key: FEATURE-YYY) + - Confidence: 0.80 + - Outcomes: outcome1, outcome2 + - Stories: + 1. Story title + - Acceptance: criterion1, criterion2, criterion3 + +## Confidence Adjustments + +- FEATURE-EXISTING-KEY: 0.90 (reason: improved understanding after code review) + +## Business Context + +- Priority: High priority feature for core functionality +- Constraint: Must support both REST and GraphQL APIs +- Risk: Potential performance issues with large datasets +``` + +**Format Requirements**: + +1. **Section Header**: Must use `## Missing Features` (case-insensitive, but prefer this exact format) +2. **Feature Format**: + - Numbered list: `1. 
**Feature Title** (Key: FEATURE-XXX)` + - **Bold title** is required (use `**Title**`) + - **Key in parentheses**: `(Key: FEATURE-XXX)` - must be uppercase, alphanumeric with hyphens/underscores + - Fields on separate lines with `-` prefix: + - `- Confidence: 0.85` (float between 0.0-1.0) + - `- Outcomes: comma-separated or line-separated list` + - `- Stories:` (required - each feature must have at least one story) +3. **Stories Format**: + - Numbered list under `Stories:` section: `1. Story title` + - **Indentation**: Stories must be indented (2-4 spaces) under the feature + - **Acceptance Criteria**: `- Acceptance: criterion1, criterion2, criterion3` + - Can be comma-separated on one line + - Or multi-line (each criterion on new line) + - Must start with `- Acceptance:` +4. **Optional Sections**: + - `## Confidence Adjustments`: List existing features with confidence updates + - `## Business Context`: Priorities, constraints, risks (bullet points) +5. **File Naming**: `-.enrichment.md` (e.g., `djangogoat-2025-12-23T23-50-00.enrichment.md`) + +**Example** (working format): + +```markdown +## Missing Features + +1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION) + - Confidence: 0.85 + - Outcomes: User registration, login, profile management + - Stories: + 1. User can sign up for new account + - Acceptance: sign_up view processes POST requests, creates User automatically, user is logged in after signup, redirects to profile page + 2. 
User can log in with credentials
+        - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected, on failure error message is displayed
+```
+
+**Common Mistakes to Avoid**:
+
+- ❌ Missing `(Key: FEATURE-XXX)` - parser needs this to identify features
+- ❌ Missing `Stories:` section - every feature must have at least one story
+- ❌ Stories not indented - parser expects indented numbered lists
+- ❌ Missing `- Acceptance:` prefix - acceptance criteria won't be parsed
+- ❌ Using bullet points (`-`) instead of numbers (`1.`) for stories
+- ❌ Feature title not in bold (`**Title**`) - parser may not extract title correctly
+
### Phase 3: CLI Artifact Creation (REQUIRED)

```bash
# Use enrichment to update plan via CLI
-specfact import from-code [<bundle>] --repo <path> --enrichment <report.md> --no-interactive
+specfact --no-interactive import from-code [<bundle>] --repo <path> --enrichment <report.md>
```

**Result**: Final artifacts are CLI-generated with validated enrichments

**Note**: If code generation is needed, use the validation loop pattern (see [CLI Enforcement Rules](./shared/cli-enforcement.md#standard-validation-loop-pattern-for-llm-generated-code))

+### Phase 4: OpenAPI Contract Generation (REQUIRED for Sidecar Validation)
+
+**When contracts are generated automatically:**
+
+The `import from-code` command attempts to extract OpenAPI contracts automatically, but **only if**:
+
+1. Features have `source_tracking.implementation_files` (AST-detected features)
+2. The OpenAPI extractor finds API endpoints (FastAPI/Flask patterns like `@app.get`, `@router.post`, `@app.route`)
+
+**When contracts are NOT generated:**
+
+Contracts are **NOT** generated automatically when:
+
+- Features were added via enrichment (no `source_tracking.implementation_files`)
+- Django applications (Django `path()` patterns are not detected by the extractor)
+- Features without API endpoints (models, utilities, middleware, etc.)
+- Framework SDKs or libraries without web endpoints
+
+**How to generate contracts manually:**
+
+For features that need OpenAPI contracts (e.g., for sidecar validation with CrossHair), use:
+
+```bash
+# Generate contract for a single feature
+specfact --no-interactive contract init --bundle <bundle> --feature <feature-key> --repo <path>
+
+# Example: Generate contracts for all enrichment-added features
+specfact --no-interactive contract init --bundle djangogoat-validation --feature FEATURE-USER-AUTHENTICATION --repo .
+specfact --no-interactive contract init --bundle djangogoat-validation --feature FEATURE-NOTES-MANAGEMENT --repo .
+# ... repeat for each feature that needs a contract
+```
+
+**When to apply contract generation:**
+
+- **After Phase 3** (enrichment applied): Check which features have contracts in `.specfact/projects/<bundle>/contracts/`
+- **Before sidecar validation**: All features that will be analyzed by CrossHair/Specmatic need OpenAPI contracts
+- **For Django apps**: Always generate contracts manually after enrichment, as Django URL patterns are not auto-detected
+
+**Verification:**
+
+```bash
+# Check which features have contracts
+ls .specfact/projects/<bundle>/contracts/*.yaml
+
+# Compare with total features
+ls .specfact/projects/<bundle>/features/*.yaml
+```
+
+If the contract count is less than the feature count, generate missing contracts using `contract init`.
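+The verification step above can be automated. A minimal sketch, assuming contract YAMLs are named after their feature keys (the bundle path shown is a hypothetical example):

```python
from pathlib import Path


def missing_contracts(bundle_dir: str) -> list[str]:
    """List feature keys that have a feature YAML but no matching contract YAML."""
    root = Path(bundle_dir)
    features = {p.stem for p in (root / "features").glob("*.yaml")}
    contracts = {p.stem for p in (root / "contracts").glob("*.yaml")}
    return sorted(features - contracts)


# Print a `contract init` command for every feature still missing a contract.
for key in missing_contracts(".specfact/projects/my-bundle"):
    print(f"specfact --no-interactive contract init --bundle my-bundle --feature {key} --repo .")
```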
+ ## Expected Output **Success**: Bundle location, report path, summary (features/stories/contracts/relationships) diff --git a/resources/prompts/specfact.02-plan.md b/resources/prompts/specfact.02-plan.md index be192b4..66c7c01 100644 --- a/resources/prompts/specfact.02-plan.md +++ b/resources/prompts/specfact.02-plan.md @@ -107,7 +107,9 @@ specfact plan [--bundle ] [options] --no-interactive **What to do**: - Read CLI-generated artifacts (use file reading tools for display only) -- Research codebase for additional context +- Use CLI artifacts as the source of truth for keys/structure/metadata +- Scan codebase only if asked to align the plan with implementation or to add missing features +- When scanning, compare findings against CLI artifacts and propose updates via CLI commands - Identify missing features/stories - Suggest confidence adjustments - Extract business context diff --git a/resources/prompts/specfact.03-review.md b/resources/prompts/specfact.03-review.md index 7f38c61..35ddc85 100644 --- a/resources/prompts/specfact.03-review.md +++ b/resources/prompts/specfact.03-review.md @@ -135,6 +135,9 @@ For these cases, use the **export-to-file → LLM reasoning → import-from-file **CRITICAL**: Always use `/tmp/` for temporary artifacts to avoid polluting the codebase. Never create temporary files in the project root. +**CRITICAL**: Question IDs are generated per run and can change if you re-run review. +**Do not** re-run `plan review` between exporting questions and applying answers. Always answer using the exact exported questions file for that session. + **Note**: The `--max-questions` parameter (default: 5) limits the number of questions per session, not the total number of available questions. If there are more questions available, you may need to run the review multiple times to answer all questions. Each session will ask different questions (avoiding duplicates from previous sessions). 
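+Because question IDs are session-specific, any automation should key answers strictly off the exported file rather than regenerating questions. A sketch, assuming the export is a JSON list (or a dict with a `questions` list) of `{"id", "question"}` entries — the real export shape may differ:

```python
import json


def draft_answers(questions_path: str, answer_fn) -> dict[str, str]:
    """Build answers keyed by the exact IDs from this session's export."""
    with open(questions_path) as fh:
        payload = json.load(fh)
    # Accept either a bare list or a wrapper dict (assumed shapes).
    questions = payload.get("questions", []) if isinstance(payload, dict) else payload
    return {q["id"]: answer_fn(q["question"]) for q in questions}
```

Reuse the same exported file for the whole session; never re-run `plan review` between export and applying answers.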
**Export questions to file for LLM reasoning:** @@ -397,6 +400,11 @@ specfact plan review [] --list-questions --output-questions /tmp/qu **What to do**: +0. **Grounding rule**: + - Treat CLI-exported questions as the source of truth; consult codebase/docs only to answer them (do not invent new artifacts) + - **Feature/Story Completeness note**: Answers here are clarifications only. They do **NOT** create stories. + For missing stories, use `specfact plan add-story` (or `plan update-story --batch-updates` if stories already exist). + 1. **Read exported questions file** (`/tmp/questions.json`): - Review all questions and their categories - Identify questions requiring code/feature analysis @@ -605,6 +613,102 @@ Create one with: specfact plan init legacy-api - Use `plan update-idea` to update idea fields directly - If bundle needs regeneration, use `import from-code --enrichment` +**Note on OpenAPI Contracts:** + +After applying enrichment or review updates, check if features need OpenAPI contracts for sidecar validation: + +- Features added via enrichment typically don't have contracts (no `source_tracking`) +- Django applications require manual contract generation (Django URL patterns not auto-detected) +- Use `specfact contract init --bundle --feature ` to generate contracts for features that need them + +**Enrichment Report Format** (for `import from-code --enrichment`): + +When generating enrichment reports for use with `import from-code --enrichment`, follow this exact format: + +```markdown +# [Bundle Name] Enrichment Report + +**Date**: YYYY-MM-DDTHH:MM:SS +**Bundle**: + +--- + +## Missing Features + +1. **Feature Title** (Key: FEATURE-XXX) + - Confidence: 0.85 + - Outcomes: outcome1, outcome2, outcome3 + - Stories: + 1. Story title here + - Acceptance: criterion1, criterion2, criterion3 + 2. Another story title + - Acceptance: criterion1, criterion2 + +2. 
**Another Feature** (Key: FEATURE-YYY) + - Confidence: 0.80 + - Outcomes: outcome1, outcome2 + - Stories: + 1. Story title + - Acceptance: criterion1, criterion2, criterion3 + +## Confidence Adjustments + +- FEATURE-EXISTING-KEY: 0.90 (reason: improved understanding after code review) + +## Business Context + +- Priority: High priority feature for core functionality +- Constraint: Must support both REST and GraphQL APIs +- Risk: Potential performance issues with large datasets +``` + +**Format Requirements**: + +1. **Section Header**: Must use `## Missing Features` (case-insensitive, but prefer this exact format) +2. **Feature Format**: + - Numbered list: `1. **Feature Title** (Key: FEATURE-XXX)` + - **Bold title** is required (use `**Title**`) + - **Key in parentheses**: `(Key: FEATURE-XXX)` - must be uppercase, alphanumeric with hyphens/underscores + - Fields on separate lines with `-` prefix: + - `- Confidence: 0.85` (float between 0.0-1.0) + - `- Outcomes: comma-separated or line-separated list` + - `- Stories:` (required - each feature must have at least one story) +3. **Stories Format**: + - Numbered list under `Stories:` section: `1. Story title` + - **Indentation**: Stories must be indented (2-4 spaces) under the feature + - **Acceptance Criteria**: `- Acceptance: criterion1, criterion2, criterion3` + - Can be comma-separated on one line + - Or multi-line (each criterion on new line) + - Must start with `- Acceptance:` +4. **Optional Sections**: + - `## Confidence Adjustments`: List existing features with confidence updates + - `## Business Context`: Priorities, constraints, risks (bullet points) +5. **File Naming**: `-.enrichment.md` (e.g., `djangogoat-2025-12-23T23-50-00.enrichment.md`) + +**Example** (working format): + +```markdown +## Missing Features + +1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION) + - Confidence: 0.85 + - Outcomes: User registration, login, profile management + - Stories: + 1. 
User can sign up for new account + - Acceptance: sign_up view processes POST requests, creates User automatically, user is logged in after signup, redirects to profile page + 2. User can log in with credentials + - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected, on failure error message is displayed +``` + +**Common Mistakes to Avoid**: + +- ❌ Missing `(Key: FEATURE-XXX)` - parser needs this to identify features +- ❌ Missing `Stories:` section - every feature must have at least one story +- ❌ Stories not indented - parser expects indented numbered lists +- ❌ Missing `- Acceptance:` prefix - acceptance criteria won't be parsed +- ❌ Using bullet points (`-`) instead of numbers (`1.`) for stories +- ❌ Feature title not in bold (`**Title**`) - parser may not extract title correctly + ## Context {ARGS} diff --git a/resources/prompts/specfact.04-sdd.md b/resources/prompts/specfact.04-sdd.md index 9dd9364..6e40699 100644 --- a/resources/prompts/specfact.04-sdd.md +++ b/resources/prompts/specfact.04-sdd.md @@ -90,6 +90,7 @@ specfact plan harden [] [--sdd ] --no-interactive **What to do**: - Read CLI-generated SDD (use file reading tools for display only) +- Treat CLI SDD as the source of truth; scan codebase only to enrich WHY/WHAT/HOW context - Research codebase for additional context - Suggest improvements to WHY/WHAT/HOW sections diff --git a/resources/prompts/specfact.05-enforce.md b/resources/prompts/specfact.05-enforce.md index b1be7df..0d0c227 100644 --- a/resources/prompts/specfact.05-enforce.md +++ b/resources/prompts/specfact.05-enforce.md @@ -94,6 +94,7 @@ specfact enforce sdd [] [--sdd ] --no-interactive **What to do**: - Read CLI-generated validation report (use file reading tools for display only) +- Treat the CLI report as the source of truth; scan codebase only to explain deviations or propose fixes - Research codebase for context on deviations - Suggest fixes for validation failures diff --git 
a/resources/prompts/specfact.06-sync.md b/resources/prompts/specfact.06-sync.md index adf7f23..d3929b0 100644 --- a/resources/prompts/specfact.06-sync.md +++ b/resources/prompts/specfact.06-sync.md @@ -97,6 +97,7 @@ specfact sync bridge --adapter --repo [options] --no-interactiv **What to do**: - Read CLI-generated sync results (use file reading tools for display only) +- Treat CLI sync output as the source of truth; scan codebase only to explain conflicts - Research codebase for context on conflicts - Suggest resolution strategies diff --git a/resources/prompts/specfact.compare.md b/resources/prompts/specfact.compare.md index 97e91de..637c198 100644 --- a/resources/prompts/specfact.compare.md +++ b/resources/prompts/specfact.compare.md @@ -94,6 +94,7 @@ specfact plan compare [--bundle ] [options] --no-interactive **What to do**: - Read CLI-generated comparison report (use file reading tools for display only) +- Treat the comparison report as the source of truth; scan codebase only to explain or confirm deviations - Research codebase for context on deviations - Suggest fixes for missing features or mismatches diff --git a/resources/prompts/specfact.validate.md b/resources/prompts/specfact.validate.md index 2595e43..2548a8e 100644 --- a/resources/prompts/specfact.validate.md +++ b/resources/prompts/specfact.validate.md @@ -96,6 +96,7 @@ specfact repro --repo [options] --no-interactive **What to do**: - Read CLI-generated validation report (use file reading tools for display only) +- Treat the validation report as the source of truth; scan codebase only to explain failures - Research codebase for context on failures - Suggest fixes for validation failures diff --git a/resources/templates/sidecar/README.md b/resources/templates/sidecar/README.md new file mode 100644 index 0000000..963ad57 --- /dev/null +++ b/resources/templates/sidecar/README.md @@ -0,0 +1,187 @@ +# Sidecar Validation Templates + +Purpose: Run validation tools against a target repository without 
modifying its source. + +This template set is intended for Phase B validation and can be copied into a +separate sidecar workspace. Use `.specfact/projects//contracts/` as the +source of truth for API contracts (specmatic/OpenAPI). + +## Quick Start + +1. Copy this folder into a sidecar workspace, or run: + +```bash +./sidecar-init.sh /path/to/sidecar /path/to/target/repo bundle-name +``` + +1. Export environment variables (or use the `.env` file created by `sidecar-init.sh`): + +```bash +export REPO_PATH=/path/to/target/repo +export BUNDLE_NAME=bundle-name +export SEMGREP_CONFIG=/path/to/semgrep.yml # optional +export REPO_PYTHONPATH="/path/to/target/repo/src:/path/to/target/repo" +export SIDECAR_SOURCE_DIRS="/path/to/target/repo/src" # optional (defaults to src/) +export PYTHON_CMD=python3 # optional (auto-detects venv if .venv/ or venv/ exists) +export DJANGO_SETTINGS_MODULE=project.settings # optional (auto-detected for Django projects) +export SPECMATIC_CMD=/path/to/specmatic # optional (CLI binary, or: "npx --yes specmatic") +export SPECMATIC_JAR=/path/to/specmatic.jar # optional (java -jar fallback) +export SPECMATIC_CONFIG=/path/to/specmatic.yaml # optional config file +export SPECMATIC_TEST_BASE_URL=http://localhost:5000 # optional target +export SPECMATIC_HOST=localhost # optional target host +export SPECMATIC_PORT=5000 # optional target port +export SPECMATIC_TIMEOUT=30 # optional request timeout (seconds) +export SPECMATIC_AUTO_STUB=1 # optional (default: 1) run stub when no target configured +export SPECMATIC_STUB_HOST=127.0.0.1 # optional stub host +export SPECMATIC_STUB_PORT=19000 # optional stub port +export SPECMATIC_STUB_WAIT=15 # optional stub startup wait (seconds) +export SIDECAR_APP_CMD="python -m your_app" # optional app command to run +export SIDECAR_APP_HOST=127.0.0.1 # optional app host +export SIDECAR_APP_PORT=5000 # optional app port +export SIDECAR_APP_WAIT=15 # optional app startup wait (seconds) +export 
BINDINGS_PATH=/path/to/bindings.yaml # optional bindings map +export FEATURES_DIR=/path/to/features # optional features dir (FEATURE-*.yaml) +export CROSSHAIR_VERBOSE=1 # optional CrossHair debug output +export CROSSHAIR_REPORT_ALL=1 # optional report all postconditions +export CROSSHAIR_REPORT_VERBOSE=1 # optional report stack traces +export CROSSHAIR_MAX_UNINTERESTING_ITERATIONS=50 # optional iteration budget +export CROSSHAIR_PER_PATH_TIMEOUT=2 # optional per-path timeout +export CROSSHAIR_PER_CONDITION_TIMEOUT=10 # optional per-condition timeout +export CROSSHAIR_ANALYSIS_KIND=icontract # optional kinds (comma-separated) +export CROSSHAIR_EXTRA_PLUGIN=/path/to/plugin.py # optional extra plugins +export RUN_BASEDPYRIGHT=0 # optional toggle per tool (default: 0) +export TIMEOUT_BASEDPYRIGHT=30 # optional per-tool timeout +export GENERATE_HARNESS=1 # optional (default: 1) +export HARNESS_PATH=harness_contracts.py # optional +export INPUTS_PATH=inputs.json # optional +export SIDECAR_REPORTS_DIR=/path/to/repo/.specfact/projects/bundle/reports/sidecar # optional +``` + +1. Run the sidecar script: + +```bash +./run_sidecar.sh +``` + +## Refreshing the Sidecar Workspace + +If you update templates (for example `adapters.py`, `run_sidecar.sh`, or the +harness generator), re-initialize the sidecar workspace so the new templates +are copied over: + +```bash +./sidecar-init.sh /path/to/sidecar /path/to/target/repo bundle-name +``` + +Notes: + +- This overwrites existing template files in the sidecar workspace. +- Preserve local changes (for example `bindings.yaml` or `.env`) before + re-running if you have custom edits. +- Re-run `./run_sidecar.sh` with `GENERATE_HARNESS=1` if you want a fresh + `harness_contracts.py` and `inputs.json` after template updates. + +## Notes + +- CrossHair requires contracts (icontract/PEP316/deal) or registered contracts. + Use `harness_contracts.py` or `crosshair_plugin.py` to attach contracts + externally without touching production code. 
+
+- **Dual CrossHair Analysis**: The sidecar runs CrossHair in two modes:
+  1. **Source code analysis**: Analyzes source directories directly to catch existing decorators
+     (beartype, icontract, PEP316, deal) already present in the codebase (e.g., SpecFact CLI dogfooding).
+  2. **Harness analysis**: Analyzes generated harness files to catch contracts added externally
+     for code without decorators (e.g., DjangoGoat, Flask, Requests).
+
+  Both analyses are necessary for complete coverage:
+
+  - **Case A**: Code with existing decorators → CrossHair analyzes source directly
+  - **Case B**: Code without decorators → CrossHair analyzes harness with externally-added contracts
+- Specmatic contracts are expected in:
+  `<repo>/.specfact/projects/<bundle>/contracts/`
+- If you only have the Python `specmatic` package installed, note it does not
+  expose a CLI or module runner. Provide a CLI path (`SPECMATIC_CMD`), use `npx`,
+  or supply a jar (`SPECMATIC_JAR`) to execute Specmatic in the sidecar.
+- For contract tests that hit a running service, set `SPECMATIC_TEST_BASE_URL`
+  (or `SPECMATIC_HOST`/`SPECMATIC_PORT`) so Specmatic knows where to send requests.
+- If you don't have a target service, the sidecar can auto-start a Specmatic
+  stub (`SPECMATIC_AUTO_STUB=1`) or launch a real service with `SIDECAR_APP_CMD`.
+- All reports/logs should be written to SpecFact bundle reports, not into the
+  target repo.
+
+## CrossHair Defaults (Suggested)
+
+These defaults provide stable results for most repositories; tune them if your
+codebase is large or has heavy initialization.
+ +```bash +export CROSSHAIR_ANALYSIS_KIND=icontract +export CROSSHAIR_PER_PATH_TIMEOUT=2 +export CROSSHAIR_PER_CONDITION_TIMEOUT=10 +export CROSSHAIR_MAX_UNINTERESTING_ITERATIONS=50 +export CROSSHAIR_REPORT_ALL=1 +export CROSSHAIR_REPORT_VERBOSE=0 +``` + +## Tool Toggles & Timeouts + +The sidecar runner supports basic toggles (set to `0` to skip): + +- `RUN_SEMGREP`, `RUN_BASEDPYRIGHT`, `RUN_SPECMATIC`, `RUN_CROSSHAIR` +- `GENERATE_HARNESS` (generate `harness_contracts.py` and `inputs.json` from OpenAPI) + +And per-tool timeouts in seconds: + +- `TIMEOUT_SEMGREP`, `TIMEOUT_BASEDPYRIGHT`, `TIMEOUT_SPECMATIC`, `TIMEOUT_CROSSHAIR` + +## Harness Generation + +The harness is auto-generated from OpenAPI contracts in: +`/.specfact/projects//contracts/` + +Generated outputs (in the sidecar workspace): + +- `harness_contracts.py` (CrossHair harness) +- `inputs.json` (deterministic example requests/responses) +- `bindings.yaml` (optional mapping to real code) + +Bindings let you attach harness functions to real code without editing the repo. +If `bindings.yaml` is present, the harness will call the bound function instead +of returning a fixed example. + +Binding schema (minimal): + +```yaml +bindings: + - operation_id: create_item + target: your_package.factory:ItemFactory + method: create + factory: + args: ["$request.item_type"] + call_style: kwargs +``` + +Optional keys: + +- `adapter`: name of a function in `adapters.py` to handle complex setup. +- `factory.target`: alternate factory callable (module:func) to create instance. +- `factory.args` / `factory.kwargs`: supports `$request.` or `$env.` values. + +Available adapters (see `adapters.py` for config fields): + +- `call_method_with_factory`: construct instance then call method. +- `call_constructor_then_method`: construct instance via `init` and call method. +- `call_classmethod`: call a classmethod/staticmethod on a class target. +- `call_with_context_manager`: create resource in `with` block then call method. 
+- `call_async`: call async function and run the coroutine. +- `call_with_setup_teardown`: run setup/teardown around a target call. +- `call_with_request_transform`: rename/drop/set/coerce request fields before call. +- `call_generator`: consume a generator/iterator and return list/last/count. +- `call_from_registry`: resolve a callable from a registry/entrypoint map. +- `call_with_overrides`: temporarily override module attributes during a call. +- `call_with_contextvars`: set context variables for the call duration. +- `call_with_session`: create a session/transaction around the call. +- `call_with_callbacks`: inject callbacks into the request payload. + +Logs are written to: + +- `/.specfact/projects//reports/sidecar/` diff --git a/resources/templates/sidecar/__init__.py b/resources/templates/sidecar/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/resources/templates/sidecar/adapters.py b/resources/templates/sidecar/adapters.py new file mode 100644 index 0000000..74b146c --- /dev/null +++ b/resources/templates/sidecar/adapters.py @@ -0,0 +1,632 @@ +# pyright: reportMissingImports=false +"""Sidecar binding adapters.""" + +from __future__ import annotations + +import asyncio +import contextvars +import importlib +import inspect +import os +import sys +from collections.abc import Callable +from typing import Any, Protocol, cast, runtime_checkable + + +@runtime_checkable +class SupportsGet(Protocol): + def get(self, key: Any, default: Any = None) -> Any: ... + + +@runtime_checkable +class SupportsGetItem(Protocol): + def __getitem__(self, key: Any) -> Any: ... 
+ + +def call_method_with_factory( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Create an instance via factory and call a method on it.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + method_name = binding.get("method") + if not method_name: + raise ValueError("Binding missing method") + factory_cfg = binding.get("factory") or {} + factory_target_name = factory_cfg.get("target") or target_name + factory_target = load_binding(factory_target_name) + factory_args = [resolve_value(item, request) for item in factory_cfg.get("args", [])] + factory_kwargs = {key: resolve_value(val, request) for key, val in factory_cfg.get("kwargs", {}).items()} + instance = factory_target(*factory_args, **factory_kwargs) + method = getattr(instance, method_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(method, call_style, request, args) + + +def call_constructor_then_method( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Construct an object and call a method on the instance.""" + target_name = binding.get("target") or binding.get("class") + if not target_name: + raise ValueError("Binding missing target") + method_name = binding.get("method") + if not method_name: + raise ValueError("Binding missing method") + init_cfg = binding.get("init") or {} + ctor = load_binding(target_name) + ctor_args = [resolve_value(item, request) for item in init_cfg.get("args", [])] + ctor_kwargs = {key: resolve_value(val, request) for key, val in init_cfg.get("kwargs", {}).items()} + instance = 
ctor(*ctor_args, **ctor_kwargs) + method = getattr(instance, method_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(method, call_style, request, args) + + +def call_classmethod( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Call a classmethod/staticmethod on a target class.""" + target_name = binding.get("target") or binding.get("class") + if not target_name: + raise ValueError("Binding missing target") + method_name = binding.get("method") + if not method_name: + raise ValueError("Binding missing method") + klass = load_binding(target_name) + method = getattr(klass, method_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(method, call_style, request, args) + + +def call_with_context_manager( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Use a context manager to create a resource, then call an optional method.""" + target_name = binding.get("target") + if not target_name: + raise ValueError("Binding missing target") + method_name = binding.get("method") + init_cfg = binding.get("init") or {} + ctor = load_binding(target_name) + ctor_args = [resolve_value(item, request) for item in init_cfg.get("args", [])] + ctor_kwargs = {key: resolve_value(val, request) for key, val in init_cfg.get("kwargs", {}).items()} + with ctor(*ctor_args, **ctor_kwargs) as resource: + if not method_name: + return resource + method = getattr(resource, method_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(method, call_style, request, args) + + +def 
call_async( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Call an async function and run the coroutine to completion.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + target = load_binding(target_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + result = call_target(target, call_style, request, args) + if asyncio.iscoroutine(result): + try: + loop = asyncio.get_running_loop() + except RuntimeError: + return asyncio.run(result) + if loop.is_running(): + raise RuntimeError("Async adapter cannot run inside a running event loop") + return loop.run_until_complete(result) + return result + + +def call_with_setup_teardown( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Run setup/teardown around a target call.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + setup_name = binding.get("setup") + teardown_name = binding.get("teardown") + setup_args = [resolve_value(item, request) for item in binding.get("setup_args", [])] + setup_kwargs = {key: resolve_value(val, request) for key, val in binding.get("setup_kwargs", {}).items()} + teardown_args = [resolve_value(item, request) for item in binding.get("teardown_args", [])] + teardown_kwargs = {key: resolve_value(val, request) for key, val in binding.get("teardown_kwargs", {}).items()} + setup_result = None + if setup_name: + setup_func = load_binding(setup_name) + setup_result = setup_func(*setup_args, **setup_kwargs) + call_style = binding.get("call_style", 
"dict") + args = binding.get("args", []) + request_payload = request + setup_key = binding.get("setup_result_key") + if setup_key and isinstance(request, dict): + request_payload = dict(request) + request_payload[setup_key] = setup_result + target = load_binding(target_name) + result = None + try: + result = call_target(target, call_style, request_payload, args) + finally: + if teardown_name: + teardown_func = load_binding(teardown_name) + teardown_pass = binding.get("teardown_pass", "none") + if teardown_pass == "setup": + teardown_func(setup_result, *teardown_args, **teardown_kwargs) + elif teardown_pass == "result": + teardown_func(result, *teardown_args, **teardown_kwargs) + elif teardown_pass == "both": + teardown_func(setup_result, result, *teardown_args, **teardown_kwargs) + else: + teardown_func(*teardown_args, **teardown_kwargs) + return result + + +def call_with_request_transform( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Transform request fields before invoking the target.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + if not isinstance(request, dict): + raise ValueError("Request transform adapter expects dict request") + transform = binding.get("transform") or {} + rename = transform.get("rename", {}) + drop = set(transform.get("drop", [])) + set_values = transform.get("set", {}) + coerce = transform.get("coerce", {}) + payload = dict(request) + for old_key, new_key in rename.items(): + if old_key in payload: + payload[new_key] = payload.pop(old_key) + for key in drop: + payload.pop(key, None) + for key, value in set_values.items(): + payload[key] = resolve_value(value, payload) + for key, cast_name in coerce.items(): + if key not in payload: + continue + value = payload[key] + if 
cast_name == "int": + payload[key] = int(value) + elif cast_name == "float": + payload[key] = float(value) + elif cast_name == "str": + payload[key] = str(value) + elif cast_name == "bool": + payload[key] = bool(value) + target = load_binding(target_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(target, call_style, payload, args) + + +def _consume_iterable(iterable: Any, limit: int | None) -> list[Any]: + items: list[Any] = [] + for item in iterable: + items.append(item) + if limit is not None and len(items) >= limit: + break + return items + + +async def _consume_async_iterable(iterable: Any, limit: int | None) -> list[Any]: + items: list[Any] = [] + async for item in iterable: + items.append(item) + if limit is not None and len(items) >= limit: + break + return items + + +def call_generator( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Consume a generator/iterator and return collected items.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + target = load_binding(target_name) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + result = call_target(target, call_style, request, args) + consume = binding.get("consume", "all") + limit = None if consume in ("all", None) else int(consume) + if inspect.isasyncgen(result) or hasattr(result, "__aiter__"): + try: + loop = asyncio.get_running_loop() + except RuntimeError: + items = asyncio.run(_consume_async_iterable(result, limit)) + else: + if loop.is_running(): + raise RuntimeError("Async generator adapter cannot run inside a running event loop") + items = loop.run_until_complete(_consume_async_iterable(result, limit)) + else: + items = _consume_iterable(result, 
limit) + collect = binding.get("collect", "list") + if collect == "last": + return items[-1] if items else None + if collect == "tuple": + return tuple(items) + if collect == "count": + return len(items) + return items + + +def call_from_registry( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Any], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Resolve a callable from a registry and invoke it.""" + registry_target = binding.get("registry") + if not registry_target: + raise ValueError("Binding missing registry") + registry_obj: Any = load_binding(registry_target) + key = binding.get("key") + request_key = binding.get("request_key") + if key is None and request_key and isinstance(request, dict): + key = request.get(request_key) + if key is None: + raise ValueError("Registry key not provided") + entry = None + lookup = binding.get("lookup") + if lookup == "call": + if not callable(registry_obj): + raise ValueError("Registry lookup requested call but registry is not callable") + entry = registry_obj(key) + elif isinstance(registry_obj, SupportsGet): + entry = registry_obj.get(key) + elif isinstance(registry_obj, SupportsGetItem): + entry = registry_obj[key] + else: + raise ValueError("Registry object does not support lookup") + if entry is None: + raise ValueError("Registry entry not found") + if isinstance(entry, str) and ":" in entry: + entry = load_binding(entry) + method_name = binding.get("method") + if method_name: + entry = getattr(entry, method_name) + if not callable(entry): + raise ValueError("Registry entry is not callable") + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + return call_target(cast(Callable[..., Any], entry), call_style, request, args) + + +def call_with_overrides( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: 
Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Temporarily override module attributes during the call.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + overrides = binding.get("overrides", []) + if not isinstance(overrides, list): + raise ValueError("Overrides must be a list") + sentinel = object() + patched: list[tuple[Any, str, Any]] = [] + for override in overrides: + if not isinstance(override, dict): + continue + target = override.get("target") + if not target: + raise ValueError("Override missing target") + module_name, attr_path = target.split(":", 1) + obj = importlib.import_module(module_name) + parts = attr_path.split(".") + for part in parts[:-1]: + obj = getattr(obj, part) + attr_name = parts[-1] + old_value = getattr(obj, attr_name, sentinel) + new_value = resolve_value(override.get("value"), request) + setattr(obj, attr_name, new_value) + patched.append((obj, attr_name, old_value)) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + target = load_binding(target_name) + try: + return call_target(target, call_style, request, args) + finally: + for obj, attr_name, old_value in reversed(patched): + if old_value is sentinel: + delattr(obj, attr_name) + else: + setattr(obj, attr_name, old_value) + + +def call_with_contextvars( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Set context variables for the duration of the call.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + contexts = binding.get("context_vars", []) or binding.get("context", []) + if not isinstance(contexts, list): + raise 
ValueError("context_vars must be a list") + tokens: list[tuple[contextvars.ContextVar[Any], contextvars.Token[Any]]] = [] + for entry in contexts: + if not isinstance(entry, dict): + continue + var_target = entry.get("var") + if not var_target: + raise ValueError("Context var missing var") + var = load_binding(var_target) + if not isinstance(var, contextvars.ContextVar): + raise ValueError("Context var target is not a ContextVar") + value = resolve_value(entry.get("value"), request) + context_var = cast(contextvars.ContextVar[Any], var) + token = context_var.set(value) + tokens.append((context_var, token)) + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + target = load_binding(target_name) + try: + return call_target(target, call_style, request, args) + finally: + for var, token in reversed(tokens): + var.reset(token) + + +def call_with_session( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Create a session/transaction around the target call.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + session_target = binding.get("session") or binding.get("session_factory") + if not session_target: + raise ValueError("Binding missing session factory") + session_ctor = load_binding(session_target) + session_args = [resolve_value(item, request) for item in binding.get("session_args", [])] + session_kwargs = {key: resolve_value(val, request) for key, val in binding.get("session_kwargs", {}).items()} + session = session_ctor(*session_args, **session_kwargs) + begin_name = binding.get("begin") + commit_name = binding.get("commit", "commit") + rollback_name = binding.get("rollback", "rollback") + close_name = binding.get("close", "close") + if begin_name: + getattr(session, 
begin_name)() + request_payload = request + session_key = binding.get("session_key") + if session_key and isinstance(request, dict): + request_payload = dict(request) + request_payload[session_key] = session + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + target = load_binding(target_name) + result = None + try: + result = call_target(target, call_style, request_payload, args) + if commit_name: + getattr(session, commit_name)() + return result + except Exception: + if rollback_name: + getattr(session, rollback_name)() + raise + finally: + if close_name: + getattr(session, close_name)() + + +def call_with_callbacks( + binding: dict[str, Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """Inject callbacks into the request payload before calling.""" + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + if not isinstance(request, dict): + raise ValueError("Callback adapter expects dict request") + callbacks = binding.get("callbacks", []) + if not isinstance(callbacks, list): + raise ValueError("Callbacks must be a list") + payload = dict(request) + for entry in callbacks: + if not isinstance(entry, dict): + continue + name = entry.get("name") + if not name: + raise ValueError("Callback entry missing name") + target = entry.get("target") + if target: + callback = load_binding(target) + else: + return_value = entry.get("return") + + def _noop(*args: Any, _return_value: Any = return_value, **kwargs: Any) -> Any: + return _return_value + + callback = _noop + payload[name] = callback + call_style = binding.get("call_style", "dict") + args = binding.get("args", []) + target = load_binding(target_name) + return call_target(target, call_style, payload, args) + + +def call_django_view( + binding: dict[str, 
Any], + request: Any, + load_binding: Callable[[str], Callable[..., Any]], + call_target: Callable[[Callable[..., Any], str, Any, list[str]], Any], + resolve_value: Callable[[Any, Any], Any], +) -> Any: + """ + Convert dict request to Django HttpRequest and call Django view. + + This adapter: + 1. Creates a Django HttpRequest object from the dict + 2. Sets request.method, request.POST, request.GET, request.user + 3. Extracts path parameters from the request dict + 4. Calls the Django view function + """ + target_name = binding.get("target") or binding.get("function") + if not target_name: + raise ValueError("Binding missing target") + + # Import Django components (lazy import to avoid errors if Django not available) + # Type checker warnings are expected - Django is optional dependency + # Ensure repo path is in sys.path for Django imports + repo_path_str = os.environ.get("REPO_PATH") + if repo_path_str and repo_path_str not in sys.path: + sys.path.insert(0, repo_path_str) + + # Initialize Django if not already initialized + django_settings = os.environ.get("DJANGO_SETTINGS_MODULE") + if django_settings: + try: + import django + + if not django.apps.apps.ready: + django.setup() + except (ImportError, Exception): + pass # Django not available or already configured + + try: + from django.contrib.auth.models import AnonymousUser # type: ignore[reportMissingImports] + from django.test import RequestFactory # type: ignore[reportMissingImports] + except ImportError: + raise ImportError("Django adapter requires Django to be installed") from None + + if not isinstance(request, dict): + raise ValueError("Django adapter expects dict request") + + # Extract HTTP method (default to POST for form submissions) + method = request.get("_method", "POST").upper() + + # Extract path parameters (e.g., pk, friend_pk) + path_params = {k: v for k, v in request.items() if k.startswith("_path_")} + path_params = {k.replace("_path_", ""): v for k, v in path_params.items()} + + # Extract 
form data (POST body) + form_data = {k: v for k, v in request.items() if not k.startswith("_")} + + # Create Django HttpRequest using RequestFactory + factory = RequestFactory() + + # Build path with parameters + path = binding.get("path", "/") + for param_name, param_value in path_params.items(): + path = path.replace(f"{{{param_name}}}", str(param_value)) + + # Create request based on method + if method == "GET": + django_request = factory.get(path, form_data) + elif method == "POST": + django_request = factory.post(path, form_data) + elif method == "PUT": + django_request = factory.put(path, form_data) + elif method == "PATCH": + django_request = factory.patch(path, form_data) + elif method == "DELETE": + django_request = factory.delete(path, form_data) + else: + django_request = factory.request(REQUEST_METHOD=method, path=path) + + # Set user if provided + user_target = binding.get("user") or binding.get("user_factory") + if user_target: + user = load_binding(user_target) + if callable(user): + user = user() + django_request.user = user + else: + django_request.user = AnonymousUser() + + # Load and call the Django view + view_func = load_binding(target_name) + + # Django views typically take (request, *args, **kwargs) + # Extract args from path_params + view_args = [] + view_kwargs = {} + + # If path_params exist, they become kwargs + if path_params: + view_kwargs = path_params + else: + # Try to extract from binding args + args_list = binding.get("args", []) + for arg_name in args_list: + if arg_name in request: + view_args.append(request[arg_name]) + + # Call the view + try: + result = view_func(django_request, *view_args, **view_kwargs) + # Convert HttpResponse to dict for CrossHair + if hasattr(result, "status_code"): + return { + "status_code": result.status_code, + "content": getattr(result, "content", b"").decode("utf-8", errors="ignore"), + } + return result + except Exception as e: + # Return error info for CrossHair to analyze + return { + "error": 
type(e).__name__, + "message": str(e), + } diff --git a/resources/templates/sidecar/bindings.yaml b/resources/templates/sidecar/bindings.yaml new file mode 100644 index 0000000..917abe8 --- /dev/null +++ b/resources/templates/sidecar/bindings.yaml @@ -0,0 +1,26 @@ +bindings: + # Map operationId or function_name to a real code function. + # function is module_path:function_name. + # call_style: dict|kwargs|args|none + # args: list of request keys when call_style=args + # adapters: call_method_with_factory, call_constructor_then_method, call_classmethod, + # call_with_context_manager, call_async, call_with_setup_teardown, + # call_with_request_transform, call_generator, call_from_registry, + # call_with_overrides, call_with_contextvars, call_with_session, + # call_with_callbacks, call_django_view + # + # - operation_id: create_item + # target: your_package.factory:ItemFactory + # method: create + # factory: + # args: ["$request.item_type"] + # call_style: kwargs + # - function_name: find_item_by_id + # adapter: call_method_with_factory + # target: your_package.repo:ItemRepository + # method: find_by_id + # factory: + # args: [] + # call_style: args + # args: + # - item_id diff --git a/resources/templates/sidecar/bindings.yaml.example b/resources/templates/sidecar/bindings.yaml.example new file mode 100644 index 0000000..40223fc --- /dev/null +++ b/resources/templates/sidecar/bindings.yaml.example @@ -0,0 +1,42 @@ +bindings: + # Map operationId or function_name to a real code function. + # function is module_path:function_name. 
+ # call_style: dict|kwargs|args|none + # args: list of request keys when call_style=args + # adapters: call_method_with_factory, call_constructor_then_method, call_classmethod, + # call_with_context_manager, call_async, call_with_setup_teardown, + # call_with_request_transform, call_generator, call_from_registry, + # call_with_overrides, call_with_contextvars, call_with_session, + # call_with_callbacks, call_django_view + # + # Django view bindings example: + # - operation_id: log_in + # adapter: call_django_view + # target: authentication.views:log_in + # path: /login/ + # # Optional: user factory for authenticated requests + # # user: authentication.factories:create_test_user + # + # - operation_id: profile + # adapter: call_django_view + # target: authentication.views:profile + # path: /profile/{pk}/ + # # Path parameters are extracted from request dict with _path_ prefix + # # e.g., request['_path_pk'] = 1 + # + # - operation_id: create_item + # target: your_package.factory:ItemFactory + # method: create + # factory: + # args: ["$request.item_type"] + # call_style: kwargs + # - function_name: find_item_by_id + # adapter: call_method_with_factory + # target: your_package.repo:ItemRepository + # method: find_by_id + # factory: + # args: [] + # call_style: args + # args: + # - item_id + diff --git a/resources/templates/sidecar/crosshair_django_wrapper.py b/resources/templates/sidecar/crosshair_django_wrapper.py new file mode 100755 index 0000000..08e8cec --- /dev/null +++ b/resources/templates/sidecar/crosshair_django_wrapper.py @@ -0,0 +1,140 @@ +#!/usr/bin/env python3 +# pyright: reportMissingImports=false +""" +Django-aware CrossHair wrapper for source code analysis. + +This wrapper initializes Django's app registry before running CrossHair, +enabling analysis of Django models and views that require the app registry to be ready. 
+ +Usage: + python crosshair_django_wrapper.py + +Example: + python crosshair_django_wrapper.py check --verbose /path/to/django/app +""" + +from __future__ import annotations + +import os +import sys +from pathlib import Path + + +def _initialize_django(repo_path: Path | None = None) -> None: + """ + Initialize Django's app registry before CrossHair analysis. + + Args: + repo_path: Optional path to Django repository root. + If not provided, attempts to infer from current directory or environment. + """ + # Check if Django is available + try: + import django + except ImportError: + print("Warning: Django not available, skipping Django initialization", file=sys.stderr) + return + + # Set Django settings module if not already set + django_settings = os.environ.get("DJANGO_SETTINGS_MODULE") + if not django_settings: + # Try to infer from repo structure + if repo_path: + # Common Django project structure: /settings.py + settings_candidates = [ + repo_path / "settings.py", + repo_path / "config" / "settings.py", + repo_path / "project" / "settings.py", + ] + for candidate in settings_candidates: + if candidate.exists(): + # Infer module path from file location + # e.g., /path/to/djangogoat/settings.py -> djangogoat.settings + relative = candidate.relative_to(repo_path.parent) + module_path = str(relative.with_suffix("")).replace(os.sep, ".") + os.environ.setdefault("DJANGO_SETTINGS_MODULE", module_path) + break + + # If still not set, try common patterns + if not os.environ.get("DJANGO_SETTINGS_MODULE") and repo_path: + # Check for manage.py to infer project name + manage_py = repo_path / "manage.py" + if manage_py.exists(): + # Read manage.py to find settings module + try: + content = manage_py.read_text(encoding="utf-8") + # Look for DJANGO_SETTINGS_MODULE or os.environ.setdefault + import re + + match = re.search(r"DJANGO_SETTINGS_MODULE\s*=\s*['\"]([^'\"]+)['\"]", content) + if match: + os.environ.setdefault("DJANGO_SETTINGS_MODULE", match.group(1)) + else: + # 
Fallback: assume project name matches directory name + project_name = repo_path.name + os.environ.setdefault("DJANGO_SETTINGS_MODULE", f"{project_name}.settings") + except Exception: + # Fallback: assume project name matches directory name + project_name = repo_path.name + os.environ.setdefault("DJANGO_SETTINGS_MODULE", f"{project_name}.settings") + + # Initialize Django + try: + django.setup() + print(f"Django initialized with settings: {os.environ.get('DJANGO_SETTINGS_MODULE')}", file=sys.stderr) + except Exception as e: + print(f"Warning: Django setup failed: {e}", file=sys.stderr) + print("CrossHair will continue, but Django models may not be analyzable", file=sys.stderr) + + +def main() -> int: + """Main entry point for Django-aware CrossHair wrapper.""" + # Parse arguments + # Format: crosshair_django_wrapper.py + # We need to separate CrossHair arguments from source directories + if len(sys.argv) < 2: + print("Usage: crosshair_django_wrapper.py ", file=sys.stderr) + return 1 + + # Try to find repo path from environment or current directory + repo_path: Path | None = None + repo_path_str = os.environ.get("REPO_PATH") + if repo_path_str: + repo_path = Path(repo_path_str).resolve() + else: + # Try to infer from current working directory + cwd = Path.cwd() + # Look for manage.py or settings.py + if (cwd / "manage.py").exists() or (cwd / "settings.py").exists(): + repo_path = cwd + else: + # Try parent directories + for parent in cwd.parents: + if (parent / "manage.py").exists() or (parent / "settings.py").exists(): + repo_path = parent + break + + # Initialize Django before importing CrossHair + _initialize_django(repo_path) + + # Now import and run CrossHair + try: + from crosshair.main import main as crosshair_main + + # CrossHair's main expects sys.argv to be set up correctly + # We pass all arguments except our script name + result = crosshair_main() + # Ensure we always return an int (CrossHair may return None) + if result is None: + return 0 + return 
result if isinstance(result, int) else 1 + except ImportError: + print("Error: CrossHair not available. Install with: pip install crosshair-tool", file=sys.stderr) + return 1 + except Exception as e: + print(f"Error running CrossHair: {e}", file=sys.stderr) + return 1 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/resources/templates/sidecar/crosshair_plugin.py b/resources/templates/sidecar/crosshair_plugin.py new file mode 100644 index 0000000..6078779 --- /dev/null +++ b/resources/templates/sidecar/crosshair_plugin.py @@ -0,0 +1,34 @@ +""" +CrossHair plugin template for registering contracts without source edits. + +Use with: python -m crosshair check --extra_plugin crosshair_plugin.py +""" + +import importlib +from collections.abc import Callable +from typing import Any, cast + +from crosshair.register_contract import register_contract + + +def _is_non_empty_str(value: object) -> bool: + return isinstance(value, str) and len(value.strip()) > 0 + + +def _post_is_not_none(result: object) -> bool: + return result is not None + + +def plugin() -> None: + """ + Register contracts dynamically for target functions. + + Replace the target import and constraints below. + """ + target_module = "your_package.your_module" + target_function = "your_function" + + module = importlib.import_module(target_module) + func = cast(Callable[..., Any], getattr(module, target_function)) + + register_contract(func, pre=_is_non_empty_str, post=_post_is_not_none) diff --git a/resources/templates/sidecar/django_form_extractor.py b/resources/templates/sidecar/django_form_extractor.py new file mode 100644 index 0000000..2c39e43 --- /dev/null +++ b/resources/templates/sidecar/django_form_extractor.py @@ -0,0 +1,440 @@ +#!/usr/bin/env python3 +""" +Django form schema extractor for sidecar contract population. + +Extracts form field schemas from Django form classes and converts them to OpenAPI format. 
+"""
+
+from __future__ import annotations
+
+import ast
+import sys
+from pathlib import Path
+from typing import TYPE_CHECKING, cast
+
+
+if TYPE_CHECKING:
+    from typing import Any
+else:
+    # Runtime: Allow Any for dynamic schema structures
+    Any = object  # type: ignore[assignment, misc]
+
+
+def _django_field_to_openapi_type(field_class: str) -> dict[str, str | int | float | bool | list[str]]:
+    """
+    Convert Django form field class to OpenAPI schema type.
+
+    Args:
+        field_class: Django field class name (e.g., 'CharField', 'EmailField')
+
+    Returns:
+        OpenAPI schema dictionary
+    """
+    field_lower = field_class.lower()
+
+    # String types
+    if "char" in field_lower or "text" in field_lower or "slug" in field_lower or "url" in field_lower:
+        return {"type": "string"}
+    if "email" in field_lower:
+        return {"type": "string", "format": "email"}
+    if "password" in field_lower:
+        return {"type": "string", "format": "password"}
+    if "uuid" in field_lower:
+        return {"type": "string", "format": "uuid"}
+
+    # Numeric types
+    if "integer" in field_lower or "int" in field_lower:
+        return {"type": "integer"}
+    if "float" in field_lower or "decimal" in field_lower:
+        return {"type": "number", "format": "float"}
+
+    # Boolean
+    if "boolean" in field_lower or "bool" in field_lower:
+        return {"type": "boolean"}
+
+    # Date/time types
+    if "date" in field_lower:
+        return {"type": "string", "format": "date"}
+    if "time" in field_lower:
+        return {"type": "string", "format": "time"}
+    if "datetime" in field_lower:
+        return {"type": "string", "format": "date-time"}
+
+    # File types
+    if "file" in field_lower or "image" in field_lower:
+        return {"type": "string", "format": "binary"}
+
+    # Choice/select fields
+    if "choice" in field_lower:
+        return {"type": "string"}  # enum will be added separately if available
+
+    # Default to string
+    return {"type": "string"}
+
+
+def _extract_field_validators(
+    field_node: ast.Assign | ast.AnnAssign,
+) -> dict[str, str | int | float | bool | list[str]]:
+    """
+    Extract validators and constraints from Django form field.
+
+    Args:
+        field_node: AST node for field assignment
+
+    Returns:
+        Dictionary with validation constraints
+    """
+    constraints: dict[str, str | int | float | bool | list[str]] = {}
+
+    # Check for field instantiation (e.g., CharField(max_length=100))
+    if isinstance(field_node, ast.Assign):
+        # ast.Assign has a single 'value' attribute, not 'values'
+        value = field_node.value
+        if isinstance(value, ast.Call):
+            # Check keyword arguments for validators
+            for kw in value.keywords:
+                if kw.arg == "max_length" and isinstance(kw.value, ast.Constant):
+                    max_len = kw.value.value
+                    if isinstance(max_len, (int, float)):
+                        constraints["maxLength"] = int(max_len)
+                elif kw.arg == "min_length" and isinstance(kw.value, ast.Constant):
+                    min_len = kw.value.value
+                    if isinstance(min_len, (int, float)):
+                        constraints["minLength"] = int(min_len)
+                elif kw.arg == "required" and isinstance(kw.value, ast.Constant):
+                    required_val = kw.value.value
+                    if isinstance(required_val, bool) and required_val is False:
+                        constraints["nullable"] = True
+                elif kw.arg == "choices" and isinstance(kw.value, (ast.List, ast.Tuple)):
+                    # Extract enum values if available
+                    enum_values: list[str] = []
+                    for elt in kw.value.elts if hasattr(kw.value, "elts") else []:
+                        if isinstance(elt, (ast.Tuple, ast.List)) and len(elt.elts) >= 1:
+                            first_val = elt.elts[0]
+                            if isinstance(first_val, ast.Constant):
+                                enum_val = first_val.value
+                                if isinstance(enum_val, str):
+                                    enum_values.append(enum_val)
+                    if enum_values:
+                        constraints["enum"] = enum_values
+
+    return constraints
+
+
+def _extract_form_fields_from_ast(form_file: Path, form_class_name: str) -> dict[str, dict[str, object]]:
+    """
+    Extract form fields from Django form class using AST.
+
+    Args:
+        form_file: Path to Python file containing form class
+        form_class_name: Name of form class (e.g., 'UserProfileForm')
+
+    Returns:
+        Dictionary mapping field names to OpenAPI schema properties
+    """
+    if not form_file.exists():
+        return {}
+
+    try:
+        content = form_file.read_text(encoding="utf-8")
+        tree = ast.parse(content, filename=str(form_file))
+    except Exception as e:
+        print(f"Warning: Could not parse {form_file}: {e}", file=sys.stderr)
+        return {}
+
+    fields: dict[str, dict[str, Any]] = {}
+
+    for node in ast.walk(tree):
+        # Find the target form class
+        if isinstance(node, ast.ClassDef) and node.name == form_class_name:
+            # Check for Meta class (ModelForm)
+            for item in node.body:
+                if isinstance(item, ast.ClassDef) and item.name == "Meta":
+                    # Extract fields from Meta.fields
+                    for meta_item in item.body:
+                        if isinstance(meta_item, ast.Assign):
+                            for target in meta_item.targets:
+                                if (
+                                    isinstance(target, ast.Name)
+                                    and target.id == "fields"
+                                    and isinstance(meta_item.value, (ast.Tuple, ast.List))
+                                ):
+                                    # Extract fields tuple/list
+                                    for field_elt in meta_item.value.elts:
+                                        if isinstance(field_elt, ast.Constant):
+                                            field_elt_value = field_elt.value
+                                            if isinstance(field_elt_value, str):
+                                                field_name = field_elt_value
+                                                # Default schema for ModelForm fields
+                                                fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+                                        elif hasattr(ast, "Str") and isinstance(field_elt, ast.Str):  # type: ignore[attr-defined, comparison-overlap]
+                                            field_elt_value = field_elt.s  # type: ignore[attr-defined, deprecated]
+                                            if isinstance(field_elt_value, str):
+                                                field_name = field_elt_value
+                                                # Default schema for ModelForm fields
+                                                fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+
+            # Extract explicit field definitions (forms.Form style)
+            for item in node.body:
+                if isinstance(item, ast.Assign):
+                    # Check if it's a field assignment
+                    for target in item.targets:
+                        if isinstance(target, ast.Name):
+                            field_name = target.id
+                            # Check if value is a Django field
+                            if isinstance(item.value, ast.Call):
+                                if isinstance(item.value.func, ast.Attribute):
+                                    field_class = item.value.func.attr
+                                elif isinstance(item.value.func, ast.Name):
+                                    field_class = item.value.func.id
+                                else:
+                                    continue
+
+                                # Convert to OpenAPI type
+                                schema = _django_field_to_openapi_type(field_class)
+                                # Add validators
+                                validators = _extract_field_validators(item)
+                                schema.update(validators)
+                                fields[field_name] = schema  # type: ignore[assignment]
+
+    return fields
+
+
+def _extract_model_fields_from_meta(form_file: Path, form_class_name: str) -> dict[str, dict[str, object]]:
+    """
+    Extract fields from Django ModelForm Meta class.
+
+    Args:
+        form_file: Path to Python file containing form class
+        form_class_name: Name of form class
+
+    Returns:
+        Dictionary mapping field names to OpenAPI schema properties
+    """
+    # For ModelForm, we can infer basic types from field names
+    # This is a simplified approach - full implementation would parse the model
+    fields: dict[str, dict[str, Any]] = {}
+
+    if not form_file.exists():
+        return fields
+
+    try:
+        content = form_file.read_text(encoding="utf-8")
+        tree = ast.parse(content, filename=str(form_file))
+    except Exception:
+        return fields
+
+    for node in ast.walk(tree):
+        if isinstance(node, ast.ClassDef) and node.name == form_class_name:
+            for item in node.body:
+                if isinstance(item, ast.ClassDef) and item.name == "Meta":
+                    for meta_item in item.body:
+                        if isinstance(meta_item, ast.Assign):
+                            for target in meta_item.targets:
+                                if (
+                                    isinstance(target, ast.Name)
+                                    and target.id == "fields"
+                                    and isinstance(meta_item.value, (ast.Tuple, ast.List))
+                                ):
+                                    for field_elt in meta_item.value.elts:
+                                        # Guard ast.Str access: it was removed in Python 3.12
+                                        if isinstance(field_elt, ast.Constant) or (
+                                            hasattr(ast, "Str") and isinstance(field_elt, ast.Str)  # type: ignore[attr-defined]
+                                        ):
+                                            if isinstance(field_elt, ast.Constant):
+                                                field_elt_value = field_elt.value
+                                                if isinstance(field_elt_value, str):
+                                                    field_name = field_elt_value
+                                                    field_name_lower = field_name.lower()
+                                                    # Infer type from field name
+                                                    if "avatar" in field_name_lower or "image" in field_name_lower:
+                                                        fields[field_name] = {"type": "string", "format": "binary"}  # type: ignore[assignment]
+                                                    elif "bio" in field_name_lower or "content" in field_name_lower:
+                                                        fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+                                                    elif "receiver" in field_name_lower or "user" in field_name_lower:
+                                                        fields[field_name] = {"type": "integer"}  # type: ignore[assignment]  # Usually a ForeignKey
+                                                    else:
+                                                        fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+                                            elif hasattr(ast, "Str") and isinstance(field_elt, ast.Str):  # type: ignore[attr-defined, comparison-overlap]
+                                                field_elt_value = field_elt.s  # type: ignore[attr-defined, deprecated]
+                                                if isinstance(field_elt_value, str):
+                                                    field_name = field_elt_value
+                                                    field_name_lower = field_name.lower()
+                                                    # Infer type from field name
+                                                    if "avatar" in field_name_lower or "image" in field_name_lower:
+                                                        fields[field_name] = {"type": "string", "format": "binary"}  # type: ignore[assignment]
+                                                    elif "bio" in field_name_lower or "content" in field_name_lower:
+                                                        fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+                                                    elif "receiver" in field_name_lower or "user" in field_name_lower:
+                                                        fields[field_name] = {"type": "integer"}  # type: ignore[assignment]  # Usually a ForeignKey
+                                                    else:
+                                                        fields[field_name] = {"type": "string"}  # type: ignore[assignment]
+
+    return fields
+
+
+def extract_form_schema(repo_path: Path, form_module: str, form_class_name: str) -> dict[str, object]:
+    """
+    Extract OpenAPI schema from Django form class.
+
+    Args:
+        repo_path: Path to Django repository root
+        form_module: Module path (e.g., 'authentication.forms')
+        form_class_name: Form class name (e.g., 'UserProfileForm')
+
+    Returns:
+        OpenAPI schema dictionary with properties and required fields
+    """
+    # Convert module path to file path
+    module_parts = form_module.split(".")
+    form_file = repo_path
+    for part in module_parts:
+        form_file = form_file / part
+    form_file = form_file.with_suffix(".py")
+
+    if not form_file.exists():
+        # Try alternative locations
+        possible_paths = [
+            repo_path / form_module.replace(".", "/") / "__init__.py",
+            repo_path / form_module.replace(".", "/") / "forms.py",
+        ]
+        for path in possible_paths:
+            if path.exists():
+                form_file = path
+                break
+        else:
+            return {"type": "object", "properties": {}, "required": []}
+
+    # Extract fields
+    fields = _extract_form_fields_from_ast(form_file, form_class_name)
+
+    # If no fields found, try ModelForm Meta extraction
+    if not fields:
+        fields = _extract_model_fields_from_meta(form_file, form_class_name)
+
+    # Build OpenAPI schema
+    properties: dict[str, dict[str, object]] = {}
+    required: list[str] = []
+
+    for field_name, field_schema in fields.items():
+        properties[field_name] = cast(dict[str, object], field_schema)
+        # Assume all fields are required unless explicitly nullable
+        if not field_schema.get("nullable", False):
+            required.append(field_name)
+
+    return {
+        "type": "object",
+        "properties": properties,
+        "required": required if required else [],
+    }
+
+
+def extract_view_form_schema(repo_path: Path, view_module: str, view_function: str) -> dict[str, object] | None:
+    """
+    Extract form schema from Django view function.
+
+    Args:
+        repo_path: Path to Django repository root
+        view_module: Module path (e.g., 'authentication.views')
+        view_function: View function name (e.g., 'sign_up')
+
+    Returns:
+        OpenAPI schema dictionary or None if no form found
+    """
+    # Convert module path to file path
+    module_parts = view_module.split(".")
+    view_file = repo_path
+    for part in module_parts:
+        view_file = view_file / part
+    view_file = view_file.with_suffix(".py")
+
+    if not view_file.exists():
+        return None
+
+    try:
+        content = view_file.read_text(encoding="utf-8")
+        tree = ast.parse(content, filename=str(view_file))
+    except Exception:
+        return None
+
+    # Find the view function
+    for node in ast.walk(tree):
+        if isinstance(node, ast.FunctionDef) and node.name == view_function:
+            # Look for form instantiation (e.g., UserCreationForm(request.POST))
+            for item in ast.walk(node):
+                if isinstance(item, ast.Call):
+                    # Check if it's a form class instantiation
+                    if isinstance(item.func, ast.Name):
+                        form_class = item.func.id
+                        # Check if it ends with 'Form' (common Django pattern)
+                        if form_class.endswith("Form"):
+                            # Try to find the form module
+                            # Look for imports in the file
+                            for import_node in ast.walk(tree):
+                                if isinstance(import_node, ast.ImportFrom) and import_node.module:
+                                    for alias in import_node.names:
+                                        if alias.name == form_class:
+                                            form_module = import_node.module or ""
+                                            if form_module:
+                                                return extract_form_schema(repo_path, form_module, form_class)
+                    elif isinstance(item.func, ast.Attribute):
+                        # Handle cases like forms.UserCreationForm
+                        if isinstance(item.func.value, ast.Name):
+                            module_name = item.func.value.id
+                            form_class = item.func.attr
+                            # Common Django form modules
+                            if module_name == "forms" and ("Creation" in form_class or "Sign" in form_class):
+                                # This is likely django.contrib.auth.forms.UserCreationForm
+                                # Return a basic schema for login/signup
+                                return {
+                                    "type": "object",
+                                    "properties": {
+                                        "username": {"type": "string", "minLength": 1},
+                                        "password1": {"type": "string", "minLength": 1},
+                                        "password2": {"type": "string", "minLength": 1},
+                                    },
+                                    "required": ["username", "password1", "password2"],
+                                }
+
+    return None
+
+
+def main() -> int:
+    """Main entry point for Django form extractor."""
+    import argparse
+    import json
+
+    parser = argparse.ArgumentParser(description="Extract Django form schemas for contract population.")
+    parser.add_argument("--repo", required=True, help="Path to Django repository")
+    parser.add_argument("--form-module", help="Form module path (e.g., authentication.forms)")
+    parser.add_argument("--form-class", help="Form class name (e.g., UserProfileForm)")
+    parser.add_argument("--view-module", help="View module path (e.g., authentication.views)")
+    parser.add_argument("--view-function", help="View function name (e.g., sign_up)")
+    parser.add_argument("--output", help="Output JSON file (default: stdout)")
+    args = parser.parse_args()
+
+    repo_path = Path(str(args.repo)).resolve()  # type: ignore[arg-type]
+
+    schema: dict[str, object] | None = None
+
+    if args.form_module and args.form_class:
+        schema = extract_form_schema(repo_path, str(args.form_module), str(args.form_class))  # type: ignore[arg-type]
+    elif args.view_module and args.view_function:
+        schema = extract_view_form_schema(repo_path, str(args.view_module), str(args.view_function))  # type: ignore[arg-type]
+    else:
+        print("Error: Must provide either --form-module/--form-class or --view-module/--view-function", file=sys.stderr)
+        return 1
+
+    if schema is None:
+        schema = {"type": "object", "properties": {}, "required": []}
+
+    output_json = json.dumps(schema, indent=2)
+
+    if args.output:
+        Path(str(args.output)).write_text(output_json, encoding="utf-8")  # type: ignore[arg-type]
+    else:
+        print(output_json)
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/resources/templates/sidecar/django_url_extractor.py b/resources/templates/sidecar/django_url_extractor.py
new file mode 100644
index 0000000..3ed2d90
--- /dev/null
+++ b/resources/templates/sidecar/django_url_extractor.py
@@ -0,0 +1,304 @@
+#!/usr/bin/env python3
+"""
+Django URL pattern extractor for sidecar contract population.
+
+Extracts URL patterns from Django urls.py files and converts them to OpenAPI paths.
+"""
+
+from __future__ import annotations
+
+import argparse
+import ast
+import json
+import re
+from collections.abc import Sequence
+from pathlib import Path
+from typing import Any
+
+
+def _extract_path_parameters(path: str) -> tuple[str, list[dict[str, object]]]:
+    """
+    Extract path parameters from Django URL pattern.
+
+    Converts Django format (<pk>, <name>) to OpenAPI format ({pk}, {name}).
+
+    Args:
+        path: Django URL pattern (e.g., 'notes//')
+
+    Returns:
+        Tuple of (normalized_path, path_params)
+    """
+    path_params: list[dict[str, Any]] = []
+    normalized_path = path
+
+    # Django pattern: <type:name> or <name>
+    pattern = r"<(?:(?P<type>\w+):)?(?P<name>\w+)>"
+    matches = list(re.finditer(pattern, path))
+
+    for match in matches:
+        param_type = match.group("type") or "str"
+        param_name = match.group("name")
+
+        # Convert Django type to OpenAPI type
+        type_map = {
+            "int": "integer",
+            "float": "number",
+            "str": "string",
+            "string": "string",
+            "slug": "string",
+            "uuid": "string",
+            "path": "string",
+        }
+        openapi_type = type_map.get(param_type.lower(), "string")
+
+        path_params.append(
+            {
+                "name": param_name,
+                "in": "path",
+                "required": True,
+                "schema": {"type": openapi_type},
+            }
+        )
+
+        # Replace with OpenAPI format
+        normalized_path = normalized_path.replace(match.group(0), f"{{{param_name}}}")
+
+    return normalized_path, path_params
+
+
+def _resolve_view_reference(view_node: ast.AST, imports: dict[str, str]) -> str | None:
+    """
+    Resolve Django view reference to a module path.
+
+    Args:
+        view_node: AST node representing the view (Name, Attribute, Call)
+        imports: Dictionary of import aliases to module paths
+
+    Returns:
+        Module path string (e.g., 'authentication.views.sign_up') or None
+    """
+    if isinstance(view_node, ast.Name):
+        # Direct reference: sign_up
+        if view_node.id in imports:
+            return imports[view_node.id]
+        return view_node.id
+    if isinstance(view_node, ast.Attribute):
+        # Attribute reference: auth_views.sign_up
+        if isinstance(view_node.value, ast.Name):
+            module_alias = view_node.value.id
+            if module_alias in imports:
+                module_path = imports[module_alias]
+                return f"{module_path}.{view_node.attr}"
+            return f"{module_alias}.{view_node.attr}"
+    elif isinstance(view_node, ast.Call) and isinstance(view_node.func, ast.Attribute):
+        # Class-based view: NoteDetailView.as_view()
+        return _resolve_view_reference(view_node.func.value, imports)
+
+    return None
+
+
+def _infer_http_method(view_name: str, view_path: str | None = None) -> str:
+    """
+    Infer HTTP method from view name or path.
+
+    Args:
+        view_name: Name of the view function
+        view_path: URL path pattern (optional)
+
+    Returns:
+        HTTP method (default: 'GET')
+    """
+    view_lower = view_name.lower()
+
+    # Common patterns
+    if any(
+        keyword in view_lower
+        for keyword in ["create", "add", "new", "signup", "sign_up", "login", "log_in", "register"]
+    ):
+        return "POST"
+    if any(keyword in view_lower for keyword in ["update", "edit", "change"]):
+        return "PUT"
+    if any(keyword in view_lower for keyword in ["delete", "remove"]):
+        return "DELETE"
+    if any(keyword in view_lower for keyword in ["list", "index", "all"]):
+        return "GET"
+    if view_path and any(keyword in view_path.lower() for keyword in ["write", "create", "add"]):
+        return "POST"
+
+    return "GET"
+
+
+def extract_django_urls(repo_path: Path, urls_file: Path | None = None) -> list[dict[str, object]]:
+    """
+    Extract URL patterns from Django urls.py file.
+
+    Args:
+        repo_path: Path to Django repository root
+        urls_file: Path to urls.py file (default: find automatically)
+
+    Returns:
+        List of URL pattern dictionaries with path, method, view, etc.
+    """
+    if urls_file is None:
+        # Try to find main urls.py
+        candidates = [
+            repo_path / "urls.py",
+            repo_path / repo_path.name / "urls.py",  # project/urls.py
+        ]
+        for candidate in candidates:
+            if candidate.exists():
+                urls_file = candidate
+                break
+
+    if urls_file is None:
+        # Search for urls.py files
+        urls_files = list(repo_path.rglob("urls.py"))
+        if urls_files:
+            urls_file = urls_files[0]
+
+    if urls_file is None or not urls_file.exists():
+        return []
+
+    with urls_file.open("r", encoding="utf-8") as f:
+        content = f.read()
+
+    try:
+        tree = ast.parse(content, filename=str(urls_file))
+    except SyntaxError:
+        return []
+
+    # Extract imports
+    imports: dict[str, str] = {}
+    for node in ast.walk(tree):
+        if isinstance(node, ast.ImportFrom):
+            module = node.module or ""
+            for alias in node.names:
+                alias_name = alias.asname or alias.name
+                imports[alias_name] = f"{module}.{alias.name}"
+        elif isinstance(node, ast.Import):
+            for alias in node.names:
+                alias_name = alias.asname or alias.name
+                imports[alias_name] = alias.name
+
+    # Find urlpatterns
+    urlpatterns: Sequence[ast.expr] = []
+    for node in ast.walk(tree):
+        if isinstance(node, ast.Assign):
+            for target in node.targets:
+                if isinstance(target, ast.Name) and target.id == "urlpatterns":
+                    if isinstance(node.value, ast.List):
+                        urlpatterns = node.value.elts
+                    break
+
+    results: list[dict[str, object]] = []
+
+    for pattern_node in urlpatterns:
+        if not isinstance(pattern_node, ast.Call):
+            continue
+
+        # Check if it's path() or re_path()
+        if isinstance(pattern_node.func, ast.Name):
+            func_name = pattern_node.func.id
+        elif isinstance(pattern_node.func, ast.Attribute):
+            func_name = pattern_node.func.attr
+        else:
+            continue
+
+        if func_name not in ("path", "re_path"):
+            continue
+
+        # Extract path pattern (first argument)
+        if not pattern_node.args:
+            continue
+
+        path_arg = pattern_node.args[0]
+        if isinstance(path_arg, ast.Constant):
+            path_pattern = path_arg.value
+        elif hasattr(ast, "Str") and isinstance(path_arg, ast.Str):  # Python < 3.8
+            path_pattern = path_arg.s  # type: ignore[attr-defined]
+        else:
+            continue
+
+        if not isinstance(path_pattern, str):
+            continue
+
+        # Extract view (second argument)
+        view_ref = None
+        if len(pattern_node.args) > 1:
+            view_node = pattern_node.args[1]
+            view_ref = _resolve_view_reference(view_node, imports)
+
+        # Extract name (keyword argument or third positional)
+        pattern_name: str | None = None
+        for kw in pattern_node.keywords:
+            if kw.arg == "name" and isinstance(kw.value, ast.Constant):
+                constant_value = kw.value.value
+                if isinstance(constant_value, str):
+                    pattern_name = constant_value
+                break
+            if kw.arg == "name" and hasattr(ast, "Str") and isinstance(kw.value, ast.Str):
+                str_value = kw.value.s  # type: ignore[attr-defined, deprecated]
+                pattern_name = str_value if isinstance(str_value, str) else None
+                break
+
+        if not pattern_name and len(pattern_node.args) > 2:
+            name_arg = pattern_node.args[2]
+            if isinstance(name_arg, ast.Constant):
+                constant_value = name_arg.value
+                if isinstance(constant_value, str):
+                    pattern_name = constant_value
+            elif hasattr(ast, "Str") and isinstance(name_arg, ast.Str):
+                str_value = name_arg.s  # type: ignore[attr-defined, deprecated]
+                pattern_name = str_value if isinstance(str_value, str) else None
+
+        # Normalize path and extract parameters
+        normalized_path, path_params = _extract_path_parameters(path_pattern)
+
+        # Infer HTTP method
+        view_name_for_inference = view_ref or pattern_name or ""
+        if not isinstance(view_name_for_inference, str):
+            view_name_for_inference = ""
+        method = _infer_http_method(view_name_for_inference, path_pattern)
+
+        # Extract operation_id from view reference or pattern name
+        operation_id = pattern_name or (view_ref.split(".")[-1] if view_ref else "unknown")
+
+        results.append(
+            {
+                "path": normalized_path,
+                "method": method,
+                "view": view_ref,
+                "operation_id": operation_id,
+                "path_params": path_params,
+                "original_path": path_pattern,
+            }
+        )
+
+    return results
+
+
+def main() -> int:
+    """Main entry point for Django URL extractor."""
+    parser = argparse.ArgumentParser(description="Extract Django URL patterns for contract population.")
+    parser.add_argument("--repo", required=True, help="Path to Django repository")
+    parser.add_argument("--urls", help="Path to urls.py file (auto-detected if not provided)")
+    parser.add_argument("--output", help="Output JSON file (default: stdout)")
+    args = parser.parse_args()
+
+    repo_path = Path(str(args.repo)).resolve()  # type: ignore[arg-type]
+    urls_file = Path(str(args.urls)).resolve() if args.urls else None  # type: ignore[arg-type]
+
+    results = extract_django_urls(repo_path, urls_file)
+
+    output_json = json.dumps(results, indent=2, sort_keys=True)
+
+    if args.output:
+        Path(str(args.output)).write_text(output_json, encoding="utf-8")  # type: ignore[arg-type]
+    else:
+        print(output_json)
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/resources/templates/sidecar/generate_harness.py b/resources/templates/sidecar/generate_harness.py
new file mode 100644
index 0000000..e62c105
--- /dev/null
+++ b/resources/templates/sidecar/generate_harness.py
@@ -0,0 +1,561 @@
+#!/usr/bin/env python3
+"""
+Generate deterministic CrossHair harness and inputs from OpenAPI contracts.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import pprint
+import re
+from copy import deepcopy
+from pathlib import Path
+from typing import Any
+
+import yaml
+
+
+METHOD_ORDER = ["get", "post", "put", "patch", "delete", "options", "head", "trace"]
+
+
+def _python_literal(value: Any) -> str:
+    return pprint.pformat(value, width=120, sort_dicts=True)
+
+
+def _base_type_name(type_str: str) -> str:
+    candidates = [part.strip() for part in type_str.split("|") if part.strip()]
+    for candidate in candidates:
+        if candidate in {"None", "NoneType"}:
+            continue
+        return re.split(r"[\[( ]", candidate, maxsplit=1)[0]
+    return candidates[0] if candidates else type_str
+
+
+def _sorted_methods(methods: list[str]) -> list[str]:
+    method_rank = {name: index for index, name in enumerate(METHOD_ORDER)}
+    return sorted(methods, key=lambda name: (method_rank.get(name, 99), name))
+
+
+def _load_yaml(path: Path) -> dict[str, Any]:
+    with path.open("r", encoding="utf-8") as handle:
+        return yaml.safe_load(handle) or {}
+
+
+def _resolve_ref(ref: str, doc: dict[str, Any]) -> dict[str, Any] | None:
+    if not ref.startswith("#/"):
+        return None
+    parts = [part for part in ref.lstrip("#/").split("/") if part]
+    node: Any = doc
+    for part in parts:
+        if not isinstance(node, dict) or part not in node:
+            return None
+        node = node[part]
+    if isinstance(node, dict):
+        return deepcopy(node)
+    return None
+
+
+def _merge_schemas(schemas: list[dict[str, Any]]) -> dict[str, Any]:
+    merged: dict[str, Any] = {}
+    for schema in schemas:
+        for key, value in schema.items():
+            if key == "required":
+                existing = set(merged.get("required", []))
+                existing.update(value or [])
+                merged["required"] = sorted(existing)
+            elif key == "properties":
+                props = merged.setdefault("properties", {})
+                for prop_key, prop_value in (value or {}).items():
+                    props[prop_key] = prop_value
+            elif key == "allOf":
+                continue
+            else:
+                merged.setdefault(key, value)
+    return merged
+
+
+def _normalize_schema(schema: dict[str, Any], doc: dict[str, Any], depth: int = 0) -> dict[str, Any]:
+    if depth > 10:
+        return schema
+    if "$ref" in schema:
+        resolved = _resolve_ref(schema["$ref"], doc)
+        if resolved is None:
+            return schema
+        return _normalize_schema(resolved, doc, depth + 1)
+
+    schema = deepcopy(schema)
+
+    if "allOf" in schema:
+        merged = _merge_schemas([_normalize_schema(item, doc, depth + 1) for item in schema["allOf"]])
+        schema.pop("allOf", None)
+        schema = _merge_schemas([merged, schema])
+
+    for key in ("oneOf", "anyOf"):
+        if key in schema:
+            schema[key] = [_normalize_schema(item, doc, depth + 1) for item in schema[key]]
+
+    if "properties" in schema:
+        schema["properties"] = {
+            prop_key: _normalize_schema(prop_value, doc, depth + 1)
+            for prop_key, prop_value in schema["properties"].items()
+        }
+
+    if "items" in schema and isinstance(schema["items"], dict):
+        schema["items"] = _normalize_schema(schema["items"], doc, depth + 1)
+
+    return schema
+
+
+def _example_from_schema(schema: dict[str, Any]) -> Any:
+    if "example" in schema:
+        return schema["example"]
+    if "default" in schema:
+        return schema["default"]
+    if "enum" in schema:
+        return schema["enum"][0] if schema["enum"] else None
+
+    schema_type = schema.get("type")
+    if schema_type == "object":
+        properties = schema.get("properties", {})
+        required = set(schema.get("required", []))
+        result: dict[str, Any] = {}
+        for key in sorted(required):
+            if key in properties:
+                result[key] = _example_from_schema(properties[key])
+        return result
+    if schema_type == "array":
+        item_schema = schema.get("items", {})
+        return [_example_from_schema(item_schema)] if item_schema else []
+    if schema_type == "string":
+        return "example"
+    if schema_type == "integer":
+        return 0
+    if schema_type == "number":
+        return 0.0
+    if schema_type == "boolean":
+        return True
+    return None
+
+
+def _sanitize_identifier(value: str) -> str:
+    normalized = re.sub(r"[^a-zA-Z0-9_]+", "_", value.strip("/"))
normalized = re.sub(r"_+", "_", normalized).strip("_") + if not normalized: + return "root" + if normalized[0].isdigit(): + return f"op_{normalized}" + return normalized.lower() + + +def _extract_request_schema(operation: dict[str, Any], doc: dict[str, Any]) -> dict[str, Any]: + request_body = operation.get("requestBody", {}) + content = request_body.get("content", {}) + # Try JSON first, then form-urlencoded (common for Django) + json_content = content.get("application/json") or {} + form_content = content.get("application/x-www-form-urlencoded") or {} + schema = json_content.get("schema") or form_content.get("schema") or {} + if not schema: + return {} + return _normalize_schema(schema, doc) + + +def _extract_response_schema(operation: dict[str, Any], doc: dict[str, Any]) -> dict[str, Any]: + responses = operation.get("responses", {}) + for status_code in sorted(responses.keys()): + if not str(status_code).startswith("2"): + continue + response = responses[status_code] or {} + content = response.get("content", {}) + json_content = content.get("application/json") or {} + schema = json_content.get("schema", {}) + if schema: + return _normalize_schema(schema, doc) + return {} + + +def _extract_operation_examples( + request_schema: dict[str, Any], response_schema: dict[str, Any] +) -> tuple[dict[str, Any], dict[str, Any]]: + return ( + _example_from_schema(request_schema) if request_schema else {}, + _example_from_schema(response_schema) if response_schema else {}, + ) + + +def _collect_operations(contracts_dir: Path) -> list[dict[str, Any]]: + operations: list[dict[str, Any]] = [] + for contract_file in sorted(contracts_dir.glob("*.openapi.yaml")): + doc = _load_yaml(contract_file) + paths = doc.get("paths", {}) if isinstance(doc, dict) else {} + for path, path_item in sorted(paths.items()): + if not isinstance(path_item, dict): + continue + methods = [method for method in path_item if method.lower() in METHOD_ORDER] + for method in _sorted_methods([method.lower() 
for method in methods]): + operation = path_item.get(method, {}) if isinstance(path_item, dict) else {} + request_schema = _extract_request_schema(operation, doc) + response_schema = _extract_response_schema(operation, doc) + request_example, response_example = _extract_operation_examples(request_schema, response_schema) + op_id = operation.get("operationId") or f"{method}_{path}" + operations.append( + { + "operation_id": op_id, + "path": path, + "method": method, + "request_schema": request_schema, + "response_schema": response_schema, + "request_example": request_example, + "response_example": response_example, + } + ) + return operations + + +def _load_feature_contracts(features_dir: Path) -> dict[str, dict[str, Any]]: + contracts: dict[str, dict[str, Any]] = {} + if not features_dir.exists(): + return contracts + for feature_file in sorted(features_dir.glob("FEATURE-*.yaml")): + feature = _load_yaml(feature_file) + if not isinstance(feature, dict): + continue + source_functions = feature.get("source_functions", []) or [] + func_names = [] + for source in source_functions: + if "::" in source: + func_names.append(source.split("::")[-1]) + stories = feature.get("stories", []) or [] + for story in stories: + contract = story.get("contracts") or {} + if not contract: + continue + return_type = contract.get("return_type") or {} + return_type_str = return_type.get("type") + nullable = return_type.get("nullable", True) + required_params = [ + param.get("name") + for param in (contract.get("parameters") or []) + if param.get("required") and param.get("name") and param.get("name") != "self" + ] + for func_name in func_names: + entry = contracts.setdefault( + func_name, + {"required_params": set(), "return_type": None, "nullable": True}, + ) + entry["required_params"].update(required_params) + if return_type_str and entry["return_type"] is None: + entry["return_type"] = return_type_str + if nullable is False: + entry["nullable"] = False + return contracts + + +def 
_load_bindings(bindings_path: Path) -> dict[str, dict[str, Any]]:
+    bindings: dict[str, dict[str, Any]] = {}
+    if not bindings_path.exists():
+        return bindings
+    data = _load_yaml(bindings_path)
+    if not isinstance(data, dict):
+        return bindings
+    for entry in data.get("bindings", []) or []:
+        if not isinstance(entry, dict):
+            continue
+        key = entry.get("operation_id") or entry.get("function_name")
+        if not key:
+            continue
+        payload = dict(entry)
+        payload.setdefault("call_style", "dict")
+        bindings[key] = payload
+    return bindings
+
+
+def _render_harness(
+    operations: list[dict[str, Any]],
+    feature_contracts: dict[str, dict[str, Any]],
+    bindings: dict[str, dict[str, Any]],
+) -> str:
+    lines: list[str] = []
+    lines.append('"""Generated sidecar harness (deterministic)."""')
+    lines.append("from __future__ import annotations")
+    lines.append("")
+    lines.append("from functools import lru_cache")
+    lines.append("from typing import Any, Callable")
+    lines.append("")
+    lines.append("from beartype import beartype")
+    lines.append("from icontract import ensure, require")
+    lines.append("")
+    lines.append("import importlib")
+    lines.append("import os")
+    lines.append("import adapters as sidecar_adapters")
+    lines.append("")
+    lines.append("# Django initialization (if Django is available)")
+    lines.append("_django_initialized = False")
+    lines.append("try:")
+    lines.append("    django_settings = os.environ.get('DJANGO_SETTINGS_MODULE')")
+    lines.append("    if django_settings:")
+    lines.append("        import django")
+    lines.append("        django.setup()")
+    lines.append("        _django_initialized = True")
+    lines.append("except Exception:")
+    lines.append("    pass  # Django not available or already configured")
+    lines.append("")
+    lines.append("")
+    lines.append("@lru_cache(maxsize=None)")
+    lines.append("def _load_binding(target: str) -> Callable[..., Any]:")
+    lines.append('    module_name, func_name = target.split(":", 1)')
+    lines.append("    module =
importlib.import_module(module_name)") + lines.append(" return getattr(module, func_name)") + lines.append("") + lines.append("def _matches_schema(value: Any, schema: dict[str, Any]) -> bool:") + lines.append(" if not schema:") + lines.append(" return True") + lines.append(' if schema.get("nullable") and value is None:') + lines.append(" return True") + lines.append(' if "enum" in schema and value not in schema.get("enum", []):') + lines.append(" return False") + lines.append(' schema_type = schema.get("type")') + lines.append(' if schema_type == "object":') + lines.append(" if not isinstance(value, dict):") + lines.append(" return False") + lines.append(' required = schema.get("required", [])') + lines.append(" for key in required:") + lines.append(" if key not in value:") + lines.append(" return False") + lines.append(' properties = schema.get("properties", {})') + lines.append(" for key, prop_schema in properties.items():") + lines.append(" if key in value and not _matches_schema(value[key], prop_schema):") + lines.append(" return False") + lines.append(" return True") + lines.append(' if schema_type == "array":') + lines.append(" if not isinstance(value, list):") + lines.append(" return False") + lines.append(' min_items = schema.get("minItems")') + lines.append(' max_items = schema.get("maxItems")') + lines.append(" if min_items is not None and len(value) < min_items:") + lines.append(" return False") + lines.append(" if max_items is not None and len(value) > max_items:") + lines.append(" return False") + lines.append(' item_schema = schema.get("items", {})') + lines.append(" return all(_matches_schema(item, item_schema) for item in value)") + lines.append(' if schema_type == "string":') + lines.append(" if not isinstance(value, str):") + lines.append(" return False") + lines.append(' min_len = schema.get("minLength")') + lines.append(' max_len = schema.get("maxLength")') + lines.append(" if min_len is not None and len(value) < min_len:") + lines.append(" 
return False")
+    lines.append("        if max_len is not None and len(value) > max_len:")
+    lines.append("            return False")
+    lines.append("        return True")
+    lines.append('    if schema_type == "integer":')
+    lines.append("        # bool is a subclass of int in Python; exclude it explicitly")
+    lines.append("        if not isinstance(value, int) or isinstance(value, bool):")
+    lines.append("            return False")
+    lines.append('        minimum = schema.get("minimum")')
+    lines.append('        maximum = schema.get("maximum")')
+    lines.append("        if minimum is not None and value < minimum:")
+    lines.append("            return False")
+    lines.append("        if maximum is not None and value > maximum:")
+    lines.append("            return False")
+    lines.append("        return True")
+    lines.append('    if schema_type == "number":')
+    lines.append("        if not isinstance(value, (int, float)) or isinstance(value, bool):")
+    lines.append("            return False")
+    lines.append('        minimum = schema.get("minimum")')
+    lines.append('        maximum = schema.get("maximum")')
+    lines.append("        if minimum is not None and value < minimum:")
+    lines.append("            return False")
+    lines.append("        if maximum is not None and value > maximum:")
+    lines.append("            return False")
+    lines.append("        return True")
+    lines.append('    if schema_type == "boolean":')
+    lines.append("        return isinstance(value, bool)")
+    lines.append('    any_of = schema.get("anyOf") or schema.get("oneOf")')
+    lines.append("    if any_of:")
+    lines.append("        return any(_matches_schema(value, item) for item in any_of)")
+    lines.append('    all_of = schema.get("allOf")')
+    lines.append("    if all_of:")
+    lines.append("        return all(_matches_schema(value, item) for item in all_of)")
+    lines.append("    return True")
+    lines.append("")
+    lines.append("def _request_has_required(request: Any, required: list[str]) -> bool:")
+    lines.append("    if not required:")
+    lines.append("        return True")
+    lines.append("    if not isinstance(request, dict):")
+    lines.append("        return False")
+    lines.append("    for key in required:")
+    lines.append("        if key not in request or request[key] is None:")
+    lines.append("            return False")
+    lines.append("    return True")
+    lines.append("")
+    lines.append("")
+    
lines.append("def _matches_return_contract(value: Any, contract: dict[str, Any]) -> bool:") + lines.append(" if not contract:") + lines.append(" return True") + lines.append(" nullable = contract.get('nullable', True)") + lines.append(" if value is None:") + lines.append(" return nullable") + lines.append(" expected = contract.get('type')") + lines.append(" if not expected:") + lines.append(" return True") + lines.append(" expected = expected.split('|')[0].strip()") + lines.append(" expected = expected.split('[', 1)[0].split('(', 1)[0]") + lines.append(" type_map = {'str': str, 'int': int, 'float': float, 'bool': bool, 'dict': dict, 'list': list}") + lines.append(" if expected in {'None', 'NoneType'}:") + lines.append(" return value is None") + lines.append(" py_type = type_map.get(expected)") + lines.append(" if py_type:") + lines.append(" return isinstance(value, py_type)") + lines.append(" return True") + lines.append("") + lines.append("") + lines.append("def _resolve_value(value: Any, request: Any) -> Any:") + lines.append(" if isinstance(value, str):") + lines.append(" if value.startswith('$request.') and isinstance(request, dict):") + lines.append(" return request.get(value.split('.', 1)[1])") + lines.append(" if value.startswith('$env.'):") + lines.append(" return os.environ.get(value.split('.', 1)[1])") + lines.append(" return value") + lines.append("") + lines.append("") + lines.append("def _resolve_list(values: list[Any], request: Any) -> list[Any]:") + lines.append(" return [_resolve_value(item, request) for item in values]") + lines.append("") + lines.append("") + lines.append("def _resolve_dict(values: dict[str, Any], request: Any) -> dict[str, Any]:") + lines.append(" return {key: _resolve_value(value, request) for key, value in values.items()}") + lines.append("") + lines.append("") + lines.append("def _call_target(target: Callable[..., Any], call_style: str, request: Any, args: list[str]) -> Any:") + lines.append(" if call_style == 'none':") + 
lines.append(" return target()") + lines.append(" if call_style == 'kwargs':") + lines.append(" payload = request if isinstance(request, dict) else {}") + lines.append(" return target(**payload)") + lines.append(" if call_style == 'args':") + lines.append(" payload = request if isinstance(request, dict) else {}") + lines.append(" resolved = [payload.get(name) for name in args]") + lines.append(" return target(*resolved)") + lines.append(" return target(request)") + lines.append("") + lines.append("") + lines.append("def _execute_binding(binding: dict[str, Any], request: Any) -> Any:") + lines.append(" adapter_name = binding.get('adapter')") + lines.append(" if adapter_name:") + lines.append(" adapter = getattr(sidecar_adapters, adapter_name, None)") + lines.append(" if adapter is None:") + lines.append(" raise ValueError(f'Unknown adapter: {adapter_name}')") + lines.append(" return adapter(binding, request, _load_binding, _call_target, _resolve_value)") + lines.append(" target_name = binding.get('target') or binding.get('function')") + lines.append(" if not target_name:") + lines.append(" raise ValueError('Binding missing target')") + lines.append(" call_style = binding.get('call_style', 'dict')") + lines.append(" args = binding.get('args', [])") + lines.append(" method_name = binding.get('method')") + lines.append(" factory_cfg = binding.get('factory') or {}") + lines.append(" if method_name:") + lines.append(" factory_target_name = factory_cfg.get('target') or target_name") + lines.append(" factory_target = _load_binding(factory_target_name)") + lines.append(" factory_args = _resolve_list(factory_cfg.get('args', []), request)") + lines.append(" factory_kwargs = _resolve_dict(factory_cfg.get('kwargs', {}), request)") + lines.append(" instance = factory_target(*factory_args, **factory_kwargs)") + lines.append(" method = getattr(instance, method_name)") + lines.append(" return _call_target(method, call_style, request, args)") + lines.append(" target = 
_load_binding(target_name)") + lines.append(" return _call_target(target, call_style, request, args)") + lines.append("") + lines.append("") + + for operation in operations: + func_name = _sanitize_identifier(f"{operation['method']}_{operation['path']}") + op_id = operation["operation_id"] + request_schema = operation["request_schema"] + response_schema = operation["response_schema"] + response_example = operation["response_example"] + feature = feature_contracts.get(op_id) or feature_contracts.get(func_name) or {} + binding = bindings.get(op_id) or bindings.get(func_name) or {} + binding_target = binding.get("target") or binding.get("function") + required_params = sorted(feature.get("required_params", [])) + return_contract: dict[str, Any] = {} + if feature.get("return_type") or feature.get("nullable") is False: + return_contract = { + "type": _base_type_name(feature["return_type"]) if feature.get("return_type") else "", + "nullable": feature.get("nullable", True), + } + lines.append(f"REQUEST_SCHEMA_{func_name.upper()} = {_python_literal(request_schema)}") + lines.append(f"RESPONSE_SCHEMA_{func_name.upper()} = {_python_literal(response_schema)}") + lines.append(f"RESPONSE_EXAMPLE_{func_name.upper()} = {_python_literal(response_example)}") + if required_params: + lines.append(f"FEATURE_REQUIRED_{func_name.upper()} = {_python_literal(required_params)}") + if return_contract: + lines.append(f"FEATURE_RETURN_CONTRACT_{func_name.upper()} = {_python_literal(return_contract)}") + if binding and binding_target: + lines.append(f"BINDING_{func_name.upper()} = {_python_literal(binding)}") + lines.append("") + lines.append(f"@require(lambda request: _matches_schema(request, REQUEST_SCHEMA_{func_name.upper()}))") + if required_params: + lines.append( + f"@require(lambda request: _request_has_required(request, FEATURE_REQUIRED_{func_name.upper()}))" + ) + lines.append(f"@ensure(lambda result: _matches_schema(result, RESPONSE_SCHEMA_{func_name.upper()}))") + if return_contract: 
+ lines.append( + f"@ensure(lambda result: _matches_return_contract(result, FEATURE_RETURN_CONTRACT_{func_name.upper()}))" + ) + lines.append("@beartype") + lines.append(f"def {func_name}(request: Any) -> Any:") + lines.append(' """Generated operation harness."""') + if binding and binding_target: + lines.append(f" return _execute_binding(BINDING_{func_name.upper()}, request)") + else: + lines.append(f" return RESPONSE_EXAMPLE_{func_name.upper()}") + lines.append("") + + lines.append("__all__ = [") + for operation in operations: + func_name = _sanitize_identifier(f"{operation['method']}_{operation['path']}") + lines.append(f' "{func_name}",') + lines.append("]") + lines.append("") + return "\n".join(lines) + + +def main() -> int: + parser = argparse.ArgumentParser(description="Generate deterministic sidecar harness from OpenAPI contracts.") + parser.add_argument("--contracts", required=True, help="Contracts directory containing *.openapi.yaml files") + parser.add_argument("--output", required=True, help="Output harness path") + parser.add_argument("--inputs", required=True, help="Output inputs JSON path") + parser.add_argument("--features", help="Features directory containing FEATURE-*.yaml files") + parser.add_argument("--bindings", help="Bindings YAML mapping operation_ids to functions") + args = parser.parse_args() + + contracts_dir = Path(args.contracts) + if not contracts_dir.exists(): + raise SystemExit(f"Contracts directory not found: {contracts_dir}") + + operations = _collect_operations(contracts_dir) + features_dir = Path(str(args.features)) if args.features else contracts_dir.parent / "features" + bindings_path = Path(str(args.bindings)) if args.bindings else Path("bindings.yaml") + feature_contracts = _load_feature_contracts(features_dir) + bindings = _load_bindings(bindings_path) + inputs_payload = { + "operations": [ + { + "operation_id": op["operation_id"], + "method": op["method"], + "path": op["path"], + "request": op["request_example"], + 
"response": op["response_example"], + } + for op in operations + ] + } + + Path(str(args.inputs)).write_text(json.dumps(inputs_payload, sort_keys=True, indent=2), encoding="utf-8") + Path(str(args.output)).write_text(_render_harness(operations, feature_contracts, bindings), encoding="utf-8") + + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/resources/templates/sidecar/harness_contracts.py b/resources/templates/sidecar/harness_contracts.py new file mode 100644 index 0000000..bb61b66 --- /dev/null +++ b/resources/templates/sidecar/harness_contracts.py @@ -0,0 +1,26 @@ +""" +Sidecar harness for attaching contracts without modifying the target repo. + +Replace the placeholders with real imports and constraints based on your bundle. +""" + +from beartype import beartype +from icontract import ensure, require + + +# Example: from your_package import your_module +# Replace "your_package.your_module" with the actual module to validate. + + +@require(lambda value: value is not None) +@ensure(lambda result: isinstance(result, bool)) +@beartype +def sidecar_example_check(value: object) -> bool: + """ + Example sidecar contract. Replace with real invariants. + """ + # Replace with calls into the target repo. + return True + + +__all__ = ["sidecar_example_check"] diff --git a/resources/templates/sidecar/populate_contracts.py b/resources/templates/sidecar/populate_contracts.py new file mode 100644 index 0000000..5bc7892 --- /dev/null +++ b/resources/templates/sidecar/populate_contracts.py @@ -0,0 +1,278 @@ +#!/usr/bin/env python3 +# pyright: reportMissingImports=false, reportImplicitRelativeImport=false +""" +Populate OpenAPI contract stubs with Django URL patterns. + +Reads Django URL patterns and populates existing OpenAPI contract files. + +Note: This is a template file that gets copied to the sidecar workspace. +The imports work at runtime when the file is in the sidecar directory. 
+""" + +from __future__ import annotations + +import argparse +import sys +from pathlib import Path +from typing import TYPE_CHECKING, cast + +import yaml + + +# Type stubs for template file imports +# These are template files that get copied to sidecar workspace where imports work at runtime +if TYPE_CHECKING: + + def extract_django_urls(repo_path: Path, urls_file: Path | None = None) -> list[dict[str, object]]: ... + def extract_view_form_schema(repo_path: Path, view_module: str, view_function: str) -> dict[str, object] | None: ... + + +# Import from same directory (sidecar templates) +# These scripts are run directly, so we need to handle imports differently +# Add current directory to path for direct import when run as script +_script_dir = Path(__file__).parent +if str(_script_dir) not in sys.path: + sys.path.insert(0, str(_script_dir)) + +# These imports work at runtime when scripts are run directly from sidecar directory +# Type checker uses TYPE_CHECKING stubs above; runtime uses actual imports below +# The sidecar directory has __init__.py, making it a package, so relative imports work at runtime +try: + # Try explicit relative imports first (preferred for type checking) + # These work when the sidecar directory is a proper package (has __init__.py) + from .django_form_extractor import ( # type: ignore[reportMissingImports] + extract_view_form_schema, + ) + from .django_url_extractor import extract_django_urls # type: ignore[reportMissingImports] +except ImportError: + # Fallback for when run as script (runtime path manipulation case) + # This happens when the script is executed directly from the sidecar workspace + # and sys.path manipulation makes absolute imports work + from django_form_extractor import ( # type: ignore[reportMissingImports] + extract_view_form_schema, + ) + from django_url_extractor import ( + extract_django_urls, # type: ignore[reportImplicitRelativeImport, reportMissingImports] + ) + + +def _match_url_to_feature(url_pattern: 
dict[str, object], feature_key: str) -> bool: + """ + Match URL pattern to feature by operation_id or view name. + + Args: + url_pattern: URL pattern dictionary from extractor + feature_key: Feature key (e.g., 'FEATURE-USER-AUTHENTICATION') + + Returns: + True if pattern matches feature + """ + operation_id = str(url_pattern.get("operation_id", "")).lower() + view = str(url_pattern.get("view", "")).lower() + feature_lower = feature_key.lower().replace("feature-", "").replace("-", "_") + + # Check if operation_id or view contains feature keywords + keywords = feature_lower.split("_") + return any(keyword and (keyword in operation_id or keyword in view) for keyword in keywords) + + +def _create_openapi_operation( + url_pattern: dict[str, object], + repo_path: Path, + form_schema: dict[str, object] | None = None, +) -> dict[str, object]: + """ + Create OpenAPI operation from Django URL pattern. + + Args: + url_pattern: URL pattern dictionary from extractor + repo_path: Path to Django repository (for form extraction) + form_schema: Optional pre-extracted form schema + + Returns: + OpenAPI operation dictionary + """ + method = str(url_pattern["method"]).lower() + path = str(url_pattern["path"]) + operation_id = str(url_pattern["operation_id"]) + path_params = url_pattern.get("path_params", []) + if not isinstance(path_params, list): + path_params = [] + view_ref = url_pattern.get("view") + + operation: dict[str, object] = { + "operationId": operation_id, + "summary": f"{method.upper()} {path}", + "responses": { + "200": {"description": "Success"}, + "400": {"description": "Bad request"}, + "500": {"description": "Internal server error"}, + }, + } + + # Add path parameters + if path_params: + operation["parameters"] = path_params + + # Add request body for POST/PUT/PATCH + if method in ("post", "put", "patch"): + # Try to extract form schema from view + schema: dict[str, object] | None = form_schema + if schema is None and view_ref: + # Try to extract from view function 
+ view_str = str(view_ref) + if "." in view_str: + parts = view_str.split(".") + if len(parts) >= 2: + view_module = ".".join(parts[:-1]) + view_function = parts[-1] + schema = extract_view_form_schema(repo_path, view_module, view_function) + + # Special case: login view doesn't use a form + if schema is None and "login" in operation_id.lower(): + schema = { + "type": "object", + "properties": { + "username": {"type": "string", "minLength": 1}, + "password": {"type": "string", "minLength": 1}, + }, + "required": ["username", "password"], + } + + # Use extracted schema or default empty schema + if schema is None: + schema = {"type": "object", "properties": {}, "required": []} + + operation["requestBody"] = { + "required": True, + "content": { + "application/x-www-form-urlencoded": { + "schema": schema, + } + }, + } + + return operation # type: ignore[return-value] + + +def populate_contracts( + contracts_dir: Path, repo_path: Path, urls_file: Path | None = None, extract_forms: bool = True +) -> dict[str, int]: + """ + Populate OpenAPI contract stubs with Django URL patterns. 
+
+    Args:
+        contracts_dir: Directory containing *.openapi.yaml files
+        repo_path: Path to Django repository
+        urls_file: Path to urls.py file (auto-detected if not provided)
+        extract_forms: Whether to extract request form schemas from view functions
+
+    Returns:
+        Dictionary with statistics (populated, skipped, errors)
+    """
+    # Extract Django URL patterns
+    url_patterns = extract_django_urls(repo_path, urls_file)
+
+    if not url_patterns:
+        return {"populated": 0, "skipped": 0, "errors": 0}
+
+    # Find all contract files
+    contract_files = list(contracts_dir.glob("*.openapi.yaml"))
+
+    stats = {"populated": 0, "skipped": 0, "errors": 0}
+
+    for contract_file in contract_files:
+        try:
+            # Load contract
+            with contract_file.open("r", encoding="utf-8") as f:
+                contract_data = yaml.safe_load(f)  # type: ignore[assignment]
+            if not isinstance(contract_data, dict):
+                contract_data = {}
+            contract = cast(dict[str, object], contract_data)
+
+            if "paths" not in contract:
+                contract["paths"] = {}
+
+            # Extract feature key from filename
+            feature_key = contract_file.stem.replace(".openapi", "").upper()
+
+            # Find matching URL patterns
+            matching_patterns = [p for p in url_patterns if _match_url_to_feature(p, feature_key)]
+
+            if not matching_patterns:
+                stats["skipped"] += 1
+                continue
+
+            # Populate paths
+            for pattern in matching_patterns:
+                path = str(pattern["path"])
+                method = str(pattern["method"]).lower()
+
+                paths_dict = contract.get("paths", {})
+                if not isinstance(paths_dict, dict):
+                    paths_dict = {}
+                    contract["paths"] = paths_dict
+                if path not in paths_dict:
+                    paths_dict[path] = {}  # type: ignore[assignment]
+
+                # Extract form schema if enabled
+                form_schema: dict[str, object] | None = None
+                if extract_forms:
+                    view_ref = pattern.get("view")
+                    if view_ref:
+                        view_str = str(view_ref)
+                        if "."
in view_str:
+                            parts = view_str.split(".")
+                            if len(parts) >= 2:
+                                view_module = ".".join(parts[:-1])
+                                view_function = parts[-1]
+                                form_schema = extract_view_form_schema(repo_path, view_module, view_function)
+
+                operation = _create_openapi_operation(pattern, repo_path, form_schema)  # type: ignore[arg-type]
+                if isinstance(paths_dict, dict) and isinstance(paths_dict.get(path), dict):
+                    paths_dict[path][method] = operation  # type: ignore[assignment, index]
+
+            # Save updated contract
+            with contract_file.open("w", encoding="utf-8") as f:
+                yaml.dump(contract, f, default_flow_style=False, sort_keys=False, allow_unicode=True)
+
+            stats["populated"] += 1
+
+        except Exception as e:
+            print(f"Error processing {contract_file}: {e}")
+            stats["errors"] += 1
+
+    return stats
+
+
+def main() -> int:
+    """Main entry point for contract population."""
+    parser = argparse.ArgumentParser(description="Populate OpenAPI contracts with Django URL patterns.")
+    parser.add_argument("--contracts", required=True, help="Contracts directory containing *.openapi.yaml files")
+    parser.add_argument("--repo", required=True, help="Path to Django repository")
+    parser.add_argument("--urls", help="Path to urls.py file (auto-detected if not provided)")
+    args = parser.parse_args()
+
+    contracts_dir = Path(str(args.contracts)).resolve()
+    repo_path = Path(str(args.repo)).resolve()
+    urls_file = Path(str(args.urls)).resolve() if args.urls else None
+
+    if not contracts_dir.exists():
+        print(f"Error: Contracts directory not found: {contracts_dir}")
+        return 1
+
+    if not repo_path.exists():
+        print(f"Error: Repository path not found: {repo_path}")
+        return 1
+
+    stats = populate_contracts(contracts_dir, repo_path, urls_file)
+
+    print(f"Populated: {stats['populated']}, Skipped:
{stats['skipped']}, Errors: {stats['errors']}") + + return 0 if stats["errors"] == 0 else 1 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/resources/templates/sidecar/run_sidecar.sh b/resources/templates/sidecar/run_sidecar.sh new file mode 100644 index 0000000..36239a7 --- /dev/null +++ b/resources/templates/sidecar/run_sidecar.sh @@ -0,0 +1,410 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Determine sidecar directory (where this script is located) +SIDECAR_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +if [[ -f ".env" ]]; then + set -a + . ./.env + set +a +fi + +REPO_PATH="${REPO_PATH:-${1:-}}" +BUNDLE_NAME="${BUNDLE_NAME:-${2:-}}" +SEMGREP_CONFIG="${SEMGREP_CONFIG:-}" +REPO_PYTHONPATH="${REPO_PYTHONPATH:-${REPO_PATH}/src:${REPO_PATH}}" +SIDECAR_SOURCE_DIRS="${SIDECAR_SOURCE_DIRS:-}" +RUN_SEMGREP="${RUN_SEMGREP:-1}" +RUN_BASEDPYRIGHT="${RUN_BASEDPYRIGHT:-0}" +RUN_SPECMATIC="${RUN_SPECMATIC:-1}" +RUN_CROSSHAIR="${RUN_CROSSHAIR:-1}" +GENERATE_HARNESS="${GENERATE_HARNESS:-1}" +TIMEOUT_SEMGREP="${TIMEOUT_SEMGREP:-60}" +TIMEOUT_BASEDPYRIGHT="${TIMEOUT_BASEDPYRIGHT:-60}" +TIMEOUT_SPECMATIC="${TIMEOUT_SPECMATIC:-60}" +TIMEOUT_CROSSHAIR="${TIMEOUT_CROSSHAIR:-60}" +HARNESS_PATH="${HARNESS_PATH:-harness_contracts.py}" +INPUTS_PATH="${INPUTS_PATH:-inputs.json}" +SIDECAR_REPORTS_DIR="${SIDECAR_REPORTS_DIR:-${REPO_PATH}/.specfact/projects/${BUNDLE_NAME}/reports/sidecar}" +BINDINGS_PATH="${BINDINGS_PATH:-bindings.yaml}" +FEATURES_DIR="${FEATURES_DIR:-}" +SPECMATIC_CMD="${SPECMATIC_CMD:-}" +SPECMATIC_JAR="${SPECMATIC_JAR:-}" +SPECMATIC_CONFIG="${SPECMATIC_CONFIG:-}" +SPECMATIC_TEST_BASE_URL="${SPECMATIC_TEST_BASE_URL:-}" +SPECMATIC_HOST="${SPECMATIC_HOST:-}" +SPECMATIC_PORT="${SPECMATIC_PORT:-}" +SPECMATIC_TIMEOUT="${SPECMATIC_TIMEOUT:-}" +SPECMATIC_AUTO_STUB="${SPECMATIC_AUTO_STUB:-1}" +SPECMATIC_STUB_HOST="${SPECMATIC_STUB_HOST:-127.0.0.1}" +SPECMATIC_STUB_PORT="${SPECMATIC_STUB_PORT:-19000}" +SPECMATIC_STUB_WAIT="${SPECMATIC_STUB_WAIT:-15}" 
+SIDECAR_APP_CMD="${SIDECAR_APP_CMD:-}" +SIDECAR_APP_HOST="${SIDECAR_APP_HOST:-127.0.0.1}" +SIDECAR_APP_PORT="${SIDECAR_APP_PORT:-}" +SIDECAR_APP_WAIT="${SIDECAR_APP_WAIT:-15}" +SIDECAR_APP_LOG="${SIDECAR_APP_LOG:-}" +CROSSHAIR_VERBOSE="${CROSSHAIR_VERBOSE:-0}" +CROSSHAIR_REPORT_ALL="${CROSSHAIR_REPORT_ALL:-0}" +CROSSHAIR_REPORT_VERBOSE="${CROSSHAIR_REPORT_VERBOSE:-0}" +CROSSHAIR_MAX_UNINTERESTING_ITERATIONS="${CROSSHAIR_MAX_UNINTERESTING_ITERATIONS:-}" +CROSSHAIR_PER_PATH_TIMEOUT="${CROSSHAIR_PER_PATH_TIMEOUT:-}" +CROSSHAIR_PER_CONDITION_TIMEOUT="${CROSSHAIR_PER_CONDITION_TIMEOUT:-}" +CROSSHAIR_ANALYSIS_KIND="${CROSSHAIR_ANALYSIS_KIND:-}" +CROSSHAIR_EXTRA_PLUGIN="${CROSSHAIR_EXTRA_PLUGIN:-}" + +if [[ -z "${REPO_PATH}" || -z "${BUNDLE_NAME}" ]]; then + echo "Usage: REPO_PATH=/path/to/repo BUNDLE_NAME=bundle ./run_sidecar.sh" + echo " Optional: SEMGREP_CONFIG=/path/to/semgrep.yml" + echo " Optional: REPO_PYTHONPATH=/path/to/repo/src:/path/to/repo" + exit 1 +fi + +CONTRACTS_DIR="${REPO_PATH}/.specfact/projects/${BUNDLE_NAME}/contracts" +export PYTHONPATH="${REPO_PYTHONPATH}:${PYTHONPATH:-}" +# Export Django settings module if set (for framework detection) +if [[ -n "${DJANGO_SETTINGS_MODULE:-}" ]]; then + export DJANGO_SETTINGS_MODULE="${DJANGO_SETTINGS_MODULE}" +fi +TIMESTAMP="$(date -u +%Y%m%dT%H%M%SZ)" +SIDECAR_APP_LOG="${SIDECAR_APP_LOG:-${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-app.log}" + +# Detect Python executable (prefer venv if available) +PYTHON_CMD="${PYTHON_CMD:-python3}" +if [[ -d "${REPO_PATH}/.venv" ]]; then + VENV_PYTHON="${REPO_PATH}/.venv/bin/python" + if [[ -f "${VENV_PYTHON}" ]]; then + PYTHON_CMD="${VENV_PYTHON}" + echo "[sidecar] using venv Python: ${PYTHON_CMD}" + fi +elif [[ -d "${REPO_PATH}/venv" ]]; then + VENV_PYTHON="${REPO_PATH}/venv/bin/python" + if [[ -f "${VENV_PYTHON}" ]]; then + PYTHON_CMD="${VENV_PYTHON}" + echo "[sidecar] using venv Python: ${PYTHON_CMD}" + fi +fi + +# Detect framework type for environment setup 
+FRAMEWORK_TYPE="${FRAMEWORK_TYPE:-}" +if [[ -z "${FRAMEWORK_TYPE}" ]]; then + # Django detection + if [[ -f "${REPO_PATH}/manage.py" ]] || find "${REPO_PATH}" -maxdepth 2 -name "urls.py" -type f 2>/dev/null | grep -q .; then + FRAMEWORK_TYPE="django" + echo "[sidecar] detected framework: Django" + # Set Django settings module if not already set + if [[ -z "${DJANGO_SETTINGS_MODULE:-}" ]]; then + # Try to detect Django settings module + if [[ -f "${REPO_PATH}/manage.py" ]]; then + SETTINGS_MODULE=$(grep -oP "DJANGO_SETTINGS_MODULE\s*=\s*['\"]([^'\"]+)['\"]" "${REPO_PATH}/manage.py" 2>/dev/null | head -1 | sed "s/.*['\"]\([^'\"]*\)['\"].*/\1/" || echo "") + if [[ -n "${SETTINGS_MODULE}" ]]; then + export DJANGO_SETTINGS_MODULE="${SETTINGS_MODULE}" + echo "[sidecar] auto-detected DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE}" + fi + fi + else + export DJANGO_SETTINGS_MODULE="${DJANGO_SETTINGS_MODULE}" + echo "[sidecar] using DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE}" + fi + # Add other framework detection here (Pyramid, etc.) 
+    fi
+fi
+
+if [[ -z "${SIDECAR_SOURCE_DIRS}" ]]; then
+    if [[ -d "${REPO_PATH}/src" ]]; then
+        SIDECAR_SOURCE_DIRS="${REPO_PATH}/src"
+    elif [[ -d "${REPO_PATH}/lib" ]]; then
+        SIDECAR_SOURCE_DIRS="${REPO_PATH}/lib"
+    else
+        SIDECAR_SOURCE_DIRS="${REPO_PATH}"
+    fi
+fi
+
+run_with_timeout() {
+    local timeout_secs="$1"
+    shift
+    if command -v timeout >/dev/null 2>&1; then
+        timeout "${timeout_secs}" "$@" || true
+    else
+        "$@" || true
+    fi
+}
+
+run_and_log() {
+    local timeout_secs="$1"
+    local log_file="$2"
+    shift 2
+    mkdir -p "$(dirname "${log_file}")"
+    if command -v timeout >/dev/null 2>&1; then
+        timeout "${timeout_secs}" "$@" 2>&1 | tee "${log_file}" || true
+    else
+        "$@" 2>&1 | tee "${log_file}" || true
+    fi
+}
+
+wait_for_port() {
+    local host="$1"
+    local port="$2"
+    local timeout_secs="$3"
+    local start_ts
+    start_ts="$(date +%s)"
+    while true; do
+        if (echo >"/dev/tcp/${host}/${port}") >/dev/null 2>&1; then
+            return 0
+        fi
+        if (( $(date +%s) - start_ts >= timeout_secs )); then
+            return 1
+        fi
+        sleep 0.2
+    done
+}
+
+resolve_specmatic_cmd() {
+    SPEC_CMD=()
+    SPEC_CMD_LABEL=""
+    if [[ -n "${SPECMATIC_CMD}" ]]; then
+        read -r -a SPEC_CMD <<< "${SPECMATIC_CMD}"
+        SPEC_CMD_LABEL="cmd"
+    elif [[ -n "${SPECMATIC_JAR}" && -f "${SPECMATIC_JAR}" ]]; then
+        SPEC_CMD=(java -jar "${SPECMATIC_JAR}")
+        SPEC_CMD_LABEL="jar"
+    elif command -v specmatic >/dev/null 2>&1; then
+        SPEC_CMD=(specmatic)
+        SPEC_CMD_LABEL="cli"
+    elif command -v npx >/dev/null 2>&1; then
+        SPEC_CMD=(npx --yes specmatic)
+        SPEC_CMD_LABEL="npx"
+    elif "${PYTHON_CMD}" - <<'PY' >/dev/null 2>&1
+import importlib.util
+raise SystemExit(0 if importlib.util.find_spec("specmatic.cli") else 1)
+PY
+    then
+        SPEC_CMD=("${PYTHON_CMD}" -m specmatic.cli)
+        SPEC_CMD_LABEL="module"
+    elif "${PYTHON_CMD}" - <<'PY' >/dev/null 2>&1
+import importlib.util
+raise SystemExit(0 if importlib.util.find_spec("specmatic.__main__") else 1)
+PY
+    then
+        SPEC_CMD=("${PYTHON_CMD}" -m specmatic)
+        SPEC_CMD_LABEL="module-main"
+    fi
+}
+
+echo "[sidecar] repo:
${REPO_PATH}"
+echo "[sidecar] bundle: ${BUNDLE_NAME}"
+echo "[sidecar] contracts: ${CONTRACTS_DIR}"
+echo "[sidecar] sources: ${SIDECAR_SOURCE_DIRS}"
+echo "[sidecar] reports: ${SIDECAR_REPORTS_DIR}"
+
+# Populate contracts with framework-specific patterns (Django, etc.)
+POPULATE_CONTRACTS="${POPULATE_CONTRACTS:-1}"
+if [[ "${POPULATE_CONTRACTS}" == "1" ]] && [[ -d "${CONTRACTS_DIR}" ]]; then
+    if [[ "${FRAMEWORK_TYPE}" == "django" ]]; then
+        echo "[sidecar] populate contracts (Django URL patterns)..."
+        run_and_log "${TIMEOUT_CROSSHAIR}" \
+            "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-populate-contracts.log" \
+            "${PYTHON_CMD}" populate_contracts.py \
+            --contracts "${CONTRACTS_DIR}" \
+            --repo "${REPO_PATH}" \
+            || echo "[sidecar] warning: contract population failed (continuing anyway)"
+    fi
+fi
+
+if [[ "${GENERATE_HARNESS}" == "1" ]]; then
+    if [[ -d "${CONTRACTS_DIR}" ]]; then
+        if [[ -z "${FEATURES_DIR}" ]]; then
+            FEATURES_DIR="${CONTRACTS_DIR}/../features"
+        fi
+        echo "[sidecar] generate harness..."
+        run_and_log "${TIMEOUT_CROSSHAIR}" \
+            "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-harness.log" \
+            "${PYTHON_CMD}" generate_harness.py \
+            --contracts "${CONTRACTS_DIR}" \
+            --output "${HARNESS_PATH}" \
+            --inputs "${INPUTS_PATH}" \
+            --features "${FEATURES_DIR}" \
+            --bindings "${BINDINGS_PATH}"
+    fi
+fi
+
+if [[ "${RUN_SEMGREP}" == "1" && -n "${SEMGREP_CONFIG}" && -f "${SEMGREP_CONFIG}" ]]; then
+    echo "[sidecar] semgrep..."
+    run_and_log "${TIMEOUT_SEMGREP}" \
+        "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-semgrep.log" \
+        semgrep --config "${SEMGREP_CONFIG}" ${SIDECAR_SOURCE_DIRS}
+fi
+
+if [[ "${RUN_BASEDPYRIGHT}" == "1" ]]; then
+    BASEDPYRIGHT_CMD=""
+    if [[ -f "${PYTHON_CMD}" ]] && "${PYTHON_CMD}" -m basedpyright --version >/dev/null 2>&1; then
+        BASEDPYRIGHT_CMD="${PYTHON_CMD} -m basedpyright"
+    elif command -v basedpyright >/dev/null 2>&1; then
+        BASEDPYRIGHT_CMD="basedpyright"
+    fi
+    if [[ -z "${BASEDPYRIGHT_CMD}" ]]; then
+        echo "[sidecar] basedpyright skipped (not available)"
+    else
+        echo "[sidecar] basedpyright..."
+ run_and_log "${TIMEOUT_BASEDPYRIGHT}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-basedpyright.log" \ + ${BASEDPYRIGHT_CMD} ${SIDECAR_SOURCE_DIRS} + fi +fi + +if [[ "${RUN_SPECMATIC}" == "1" && -d "${CONTRACTS_DIR}" ]]; then + mapfile -t SPEC_CONTRACTS < <( + find "${CONTRACTS_DIR}" -maxdepth 1 -type f \( \ + -name "*.openapi.yaml" -o -name "*.openapi.yml" -o -name "*.openapi.json" \ + \) | sort + ) + resolve_specmatic_cmd + if [[ "${#SPEC_CONTRACTS[@]}" -eq 0 && -z "${SPECMATIC_CONFIG}" ]]; then + echo "[sidecar] specmatic skipped (no contracts found)." + elif [[ "${#SPEC_CMD[@]}" -eq 0 ]]; then + echo "[sidecar] specmatic not available (set SPECMATIC_CMD or SPECMATIC_JAR)." + else + SPEC_ARGS=() + if [[ -n "${SPECMATIC_CONFIG}" ]]; then + SPEC_ARGS+=(--config "${SPECMATIC_CONFIG}") + fi + if [[ -n "${SPECMATIC_TEST_BASE_URL}" ]]; then + SPEC_ARGS+=(--testBaseURL "${SPECMATIC_TEST_BASE_URL}") + fi + if [[ -n "${SPECMATIC_HOST}" ]]; then + SPEC_ARGS+=(--host "${SPECMATIC_HOST}") + fi + if [[ -n "${SPECMATIC_PORT}" ]]; then + SPEC_ARGS+=(--port "${SPECMATIC_PORT}") + fi + if [[ -n "${SPECMATIC_TIMEOUT}" ]]; then + SPEC_ARGS+=(--timeout "${SPECMATIC_TIMEOUT}") + fi + + SIDECAR_APP_PID="" + SIDECAR_STUB_PID="" + + if [[ -n "${SIDECAR_APP_CMD}" ]]; then + echo "[sidecar] starting app: ${SIDECAR_APP_CMD}" + mkdir -p "$(dirname "${SIDECAR_APP_LOG}")" + bash -c "${SIDECAR_APP_CMD}" >"${SIDECAR_APP_LOG}" 2>&1 & + SIDECAR_APP_PID=$! + if [[ -n "${SIDECAR_APP_PORT}" ]]; then + if ! 
wait_for_port "${SIDECAR_APP_HOST}" "${SIDECAR_APP_PORT}" "${SIDECAR_APP_WAIT}"; then + echo "[sidecar] app did not become ready on ${SIDECAR_APP_HOST}:${SIDECAR_APP_PORT}" + fi + fi + if [[ -z "${SPECMATIC_TEST_BASE_URL}" && -n "${SIDECAR_APP_PORT}" ]]; then + SPECMATIC_TEST_BASE_URL="http://${SIDECAR_APP_HOST}:${SIDECAR_APP_PORT}" + SPEC_ARGS+=(--testBaseURL "${SPECMATIC_TEST_BASE_URL}") + fi + elif [[ "${SPECMATIC_AUTO_STUB}" == "1" && -z "${SPECMATIC_TEST_BASE_URL}" && -z "${SPECMATIC_HOST}" && -z "${SPECMATIC_PORT}" && -z "${SPECMATIC_CONFIG}" ]]; then + echo "[sidecar] specmatic stub (${SPEC_CMD_LABEL})..." + STUB_LOG="${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-specmatic-stub.log" + mkdir -p "$(dirname "${STUB_LOG}")" + "${SPEC_CMD[@]}" stub --host "${SPECMATIC_STUB_HOST}" --port "${SPECMATIC_STUB_PORT}" "${SPEC_CONTRACTS[@]}" \ + >"${STUB_LOG}" 2>&1 & + SIDECAR_STUB_PID=$! + if wait_for_port "${SPECMATIC_STUB_HOST}" "${SPECMATIC_STUB_PORT}" "${SPECMATIC_STUB_WAIT}"; then + SPECMATIC_TEST_BASE_URL="http://${SPECMATIC_STUB_HOST}:${SPECMATIC_STUB_PORT}" + SPEC_ARGS+=(--testBaseURL "${SPECMATIC_TEST_BASE_URL}") + else + echo "[sidecar] specmatic stub did not start on ${SPECMATIC_STUB_HOST}:${SPECMATIC_STUB_PORT}" + fi + fi + + echo "[sidecar] specmatic (${SPEC_CMD_LABEL})..." 
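The `wait_for_port` helper (a bash `/dev/tcp` probe inside a deadline loop) has a direct Python analogue. A minimal sketch, assuming a simple poll-until-deadline contract rather than the script's exact semantics:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_secs: float, interval: float = 0.2) -> bool:
    """Poll until a TCP connect succeeds or the deadline passes."""
    deadline = time.monotonic() + timeout_secs
    while True:
        try:
            # connect-and-close is the Python equivalent of bash's
            # (echo > /dev/tcp/host/port) readiness probe
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Using `time.monotonic()` instead of `date +%s` avoids wall-clock jumps, and the fractional `interval` mirrors the script's `sleep 0.2`.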
+ run_and_log "${TIMEOUT_SPECMATIC}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-specmatic.log" \ + "${SPEC_CMD[@]}" test "${SPEC_ARGS[@]}" "${SPEC_CONTRACTS[@]}" + + if [[ -n "${SIDECAR_STUB_PID}" ]]; then + kill "${SIDECAR_STUB_PID}" >/dev/null 2>&1 || true + fi + if [[ -n "${SIDECAR_APP_PID}" ]]; then + kill "${SIDECAR_APP_PID}" >/dev/null 2>&1 || true + fi + fi +fi + +if [[ "${RUN_CROSSHAIR}" == "1" ]] && command -v crosshair >/dev/null 2>&1; then + CROSSHAIR_ARGS=() + if [[ "${CROSSHAIR_VERBOSE}" == "1" ]]; then + CROSSHAIR_ARGS+=(--verbose) + fi + if [[ "${CROSSHAIR_REPORT_ALL}" == "1" ]]; then + CROSSHAIR_ARGS+=(--report_all) + fi + if [[ "${CROSSHAIR_REPORT_VERBOSE}" == "1" ]]; then + CROSSHAIR_ARGS+=(--report_verbose) + fi + if [[ -n "${CROSSHAIR_MAX_UNINTERESTING_ITERATIONS}" ]]; then + CROSSHAIR_ARGS+=(--max_uninteresting_iterations "${CROSSHAIR_MAX_UNINTERESTING_ITERATIONS}") + fi + if [[ -n "${CROSSHAIR_PER_PATH_TIMEOUT}" ]]; then + CROSSHAIR_ARGS+=(--per_path_timeout "${CROSSHAIR_PER_PATH_TIMEOUT}") + fi + if [[ -n "${CROSSHAIR_PER_CONDITION_TIMEOUT}" ]]; then + CROSSHAIR_ARGS+=(--per_condition_timeout "${CROSSHAIR_PER_CONDITION_TIMEOUT}") + fi + if [[ -n "${CROSSHAIR_ANALYSIS_KIND}" ]]; then + CROSSHAIR_ARGS+=(--analysis_kind "${CROSSHAIR_ANALYSIS_KIND}") + fi + if [[ -n "${CROSSHAIR_EXTRA_PLUGIN}" ]]; then + CROSSHAIR_ARGS+=(--extra_plugin "${CROSSHAIR_EXTRA_PLUGIN}") + fi + + # Case A: Analyze source code directly (for existing decorators: beartype, icontract, etc.) + # This catches contracts that are already in the source code (e.g., SpecFact CLI dogfooding) + # For Django projects, use the Django-aware wrapper to initialize the app registry first + echo "[sidecar] crosshair (source code - existing decorators)..." 
+ if [[ "${FRAMEWORK_TYPE}" == "django" ]]; then + # Use Django-aware wrapper for source code analysis + CROSSHAIR_WRAPPER="${SIDECAR_DIR}/crosshair_django_wrapper.py" + if [[ -f "${CROSSHAIR_WRAPPER}" ]]; then + echo "[sidecar] using Django-aware CrossHair wrapper for source analysis" + # Export environment variables for Django initialization + CROSSHAIR_ENV="" + if [[ -n "${DJANGO_SETTINGS_MODULE:-}" ]]; then + CROSSHAIR_ENV="DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE} " + fi + if [[ -n "${REPO_PATH:-}" ]]; then + CROSSHAIR_ENV="${CROSSHAIR_ENV}REPO_PATH=${REPO_PATH} " + fi + if [[ -n "${PYTHONPATH:-}" ]]; then + CROSSHAIR_ENV="${CROSSHAIR_ENV}PYTHONPATH=${PYTHONPATH} " + fi + run_and_log "${TIMEOUT_CROSSHAIR}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-crosshair-source.log" \ + env ${CROSSHAIR_ENV}"${PYTHON_CMD}" "${CROSSHAIR_WRAPPER}" check "${CROSSHAIR_ARGS[@]}" ${SIDECAR_SOURCE_DIRS} + else + echo "[sidecar] warning: Django wrapper not found, using standard CrossHair (may fail)" + run_and_log "${TIMEOUT_CROSSHAIR}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-crosshair-source.log" \ + "${PYTHON_CMD}" -m crosshair check "${CROSSHAIR_ARGS[@]}" ${SIDECAR_SOURCE_DIRS} + fi + else + # Standard CrossHair for non-Django projects + run_and_log "${TIMEOUT_CROSSHAIR}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-crosshair-source.log" \ + "${PYTHON_CMD}" -m crosshair check "${CROSSHAIR_ARGS[@]}" ${SIDECAR_SOURCE_DIRS} + fi + + # Case B: Analyze harness (for contracts added via harness generation) + # This catches contracts added externally via harness_contracts.py for code without decorators + # This is the primary analysis method for frameworks without decorators (Django, etc.) + if [[ -f "${HARNESS_PATH}" ]]; then + echo "[sidecar] crosshair (harness - external contracts)..." 
+ # Export environment variables for CrossHair subprocess + CROSSHAIR_ENV="" + if [[ -n "${DJANGO_SETTINGS_MODULE:-}" ]]; then + CROSSHAIR_ENV="DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE} " + fi + if [[ -n "${PYTHONPATH:-}" ]]; then + CROSSHAIR_ENV="${CROSSHAIR_ENV}PYTHONPATH=${PYTHONPATH} " + fi + run_and_log "${TIMEOUT_CROSSHAIR}" \ + "${SIDECAR_REPORTS_DIR}/${TIMESTAMP}-crosshair-harness.log" \ + env ${CROSSHAIR_ENV}"${PYTHON_CMD}" -m crosshair check "${CROSSHAIR_ARGS[@]}" "${HARNESS_PATH}" + else + echo "[sidecar] crosshair harness skipped (${HARNESS_PATH} not found)" + fi +fi diff --git a/resources/templates/sidecar/sidecar-init.sh b/resources/templates/sidecar/sidecar-init.sh new file mode 100755 index 0000000..6ff4148 --- /dev/null +++ b/resources/templates/sidecar/sidecar-init.sh @@ -0,0 +1,107 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +TEMPLATE_DIR="${SCRIPT_DIR}" + +TARGET_DIR="${1:-}" +REPO_PATH="${2:-}" +BUNDLE_NAME="${3:-}" + +if [[ -z "${TARGET_DIR}" ]]; then + echo "Usage: ${0} [repo_path] [bundle_name]" + exit 1 +fi + +mkdir -p "${TARGET_DIR}" +cp -R "${TEMPLATE_DIR}/." 
"${TARGET_DIR}/" + +if [[ -n "${REPO_PATH}" && -n "${BUNDLE_NAME}" ]]; then + # Detect Python executable (prefer venv if available) + PYTHON_CMD="python3" + if [[ -d "${REPO_PATH}/.venv" ]] && [[ -f "${REPO_PATH}/.venv/bin/python" ]]; then + PYTHON_CMD="${REPO_PATH}/.venv/bin/python" + elif [[ -d "${REPO_PATH}/venv" ]] && [[ -f "${REPO_PATH}/venv/bin/python" ]]; then + PYTHON_CMD="${REPO_PATH}/venv/bin/python" + fi + + # Detect framework and set environment variables + DJANGO_SETTINGS_MODULE="" + if [[ -f "${REPO_PATH}/manage.py" ]] || find "${REPO_PATH}" -maxdepth 2 -name "urls.py" -type f 2>/dev/null | grep -q .; then + # Django detected - try to extract settings module + if [[ -f "${REPO_PATH}/manage.py" ]]; then + # Try multiple patterns to extract Django settings module + SETTINGS_MODULE=$(grep -oP "DJANGO_SETTINGS_MODULE\s*=\s*['\"]([^'\"]+)['\"]" "${REPO_PATH}/manage.py" 2>/dev/null | head -1 | sed "s/.*['\"]\([^'\"]*\)['\"].*/\1/" || echo "") + # If not found, try to detect from project structure + if [[ -z "${SETTINGS_MODULE}" ]]; then + # Look for settings.py in common locations + SETTINGS_FILE=$(find "${REPO_PATH}" -maxdepth 2 -name "settings.py" -type f 2>/dev/null | head -1) + if [[ -n "${SETTINGS_FILE}" ]]; then + # Extract module path (e.g., /path/to/djangogoat/settings.py -> djangogoat.settings) + SETTINGS_DIR=$(dirname "${SETTINGS_FILE}" | sed "s|${REPO_PATH}/||" | sed "s|^\./||") + if [[ -n "${SETTINGS_DIR}" ]]; then + SETTINGS_MODULE="${SETTINGS_DIR//\//.}.settings" + else + # If settings.py is in repo root, try to detect project name from manage.py + PROJECT_NAME=$(grep -oP "DJANGO_SETTINGS_MODULE\s*=\s*['\"]([^'\"]+)['\"]" "${REPO_PATH}/manage.py" 2>/dev/null | head -1 | sed "s/.*['\"]\([^'\"]*\)['\"].*/\1/" | cut -d. 
-f1 || echo "") + if [[ -z "${PROJECT_NAME}" ]]; then + # Fallback: use directory name + PROJECT_NAME=$(basename "${REPO_PATH}") + fi + SETTINGS_MODULE="${PROJECT_NAME}.settings" + fi + fi + fi + if [[ -n "${SETTINGS_MODULE}" ]]; then + DJANGO_SETTINGS_MODULE="${SETTINGS_MODULE}" + fi + fi + # Default Django Python path includes venv if available + # Find Python version-specific site-packages directory + if [[ -d "${REPO_PATH}/.venv" ]]; then + # Try to find actual Python version directory + PYTHON_VERSION_DIR=$(find "${REPO_PATH}/.venv/lib" -maxdepth 1 -type d -name "python*" 2>/dev/null | head -1) + if [[ -n "${PYTHON_VERSION_DIR}" ]] && [[ -d "${PYTHON_VERSION_DIR}/site-packages" ]]; then + REPO_PYTHONPATH="${PYTHON_VERSION_DIR}/site-packages:${REPO_PATH}/src:${REPO_PATH}" + else + REPO_PYTHONPATH="${REPO_PATH}/.venv/lib/python*/site-packages:${REPO_PATH}/src:${REPO_PATH}" + fi + elif [[ -d "${REPO_PATH}/venv" ]]; then + PYTHON_VERSION_DIR=$(find "${REPO_PATH}/venv/lib" -maxdepth 1 -type d -name "python*" 2>/dev/null | head -1) + if [[ -n "${PYTHON_VERSION_DIR}" ]] && [[ -d "${PYTHON_VERSION_DIR}/site-packages" ]]; then + REPO_PYTHONPATH="${PYTHON_VERSION_DIR}/site-packages:${REPO_PATH}/src:${REPO_PATH}" + else + REPO_PYTHONPATH="${REPO_PATH}/venv/lib/python*/site-packages:${REPO_PATH}/src:${REPO_PATH}" + fi + else + REPO_PYTHONPATH="${REPO_PATH}/src:${REPO_PATH}" + fi + else + REPO_PYTHONPATH="${REPO_PATH}/src:${REPO_PATH}" + fi + + cat > "${TARGET_DIR}/.env" <> "${TARGET_DIR}/.env" + fi + + echo "[sidecar-init] wrote ${TARGET_DIR}/.env" + if [[ -n "${DJANGO_SETTINGS_MODULE}" ]]; then + echo "[sidecar-init] detected Django framework, set DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE}" + fi + if [[ "${PYTHON_CMD}" != "python3" ]]; then + echo "[sidecar-init] detected venv, using ${PYTHON_CMD}" + fi +fi + +echo "[sidecar-init] sidecar templates copied to ${TARGET_DIR}" diff --git a/setup.py b/setup.py index 1ad3f68..03dabf6 100644 --- a/setup.py +++ b/setup.py 
@@ -7,7 +7,7 @@ if __name__ == "__main__": _setup = setup( name="specfact-cli", - version="0.20.1", + version="0.20.5", description="SpecFact CLI - Spec→Contract→Sentinel tool for contract-driven development", packages=find_packages(where="src"), package_dir={"": "src"}, diff --git a/src/__init__.py b/src/__init__.py index 56237ca..15c4777 100644 --- a/src/__init__.py +++ b/src/__init__.py @@ -3,4 +3,4 @@ """ # Define the package version (kept in sync with pyproject.toml and setup.py) -__version__ = "0.20.1" +__version__ = "0.20.5" diff --git a/src/specfact_cli/__init__.py b/src/specfact_cli/__init__.py index 1807427..3485133 100644 --- a/src/specfact_cli/__init__.py +++ b/src/specfact_cli/__init__.py @@ -9,6 +9,6 @@ - Validating reproducibility """ -__version__ = "0.20.1" +__version__ = "0.20.5" __all__ = ["__version__"] diff --git a/src/specfact_cli/commands/repro.py b/src/specfact_cli/commands/repro.py index 2f129fc..9657748 100644 --- a/src/specfact_cli/commands/repro.py +++ b/src/specfact_cli/commands/repro.py @@ -322,8 +322,17 @@ def main( # Exit with appropriate code exit_code = report.get_exit_code() if exit_code == 0: - console.print("\n[bold green]✓[/bold green] All validations passed!") - console.print("[dim]Reproducibility verified[/dim]") + crosshair_failed = any( + check.tool == "crosshair" and check.status.value == "failed" for check in report.checks + ) + if crosshair_failed: + console.print( + "\n[bold yellow]![/bold yellow] Required validations passed, but CrossHair failed (advisory)" + ) + console.print("[dim]Reproducibility verified with advisory failures[/dim]") + else: + console.print("\n[bold green]✓[/bold green] All validations passed!") + console.print("[dim]Reproducibility verified[/dim]") elif exit_code == 1: console.print("\n[bold red]✗[/bold red] Some validations failed") raise typer.Exit(1) diff --git a/src/specfact_cli/utils/enrichment_parser.py b/src/specfact_cli/utils/enrichment_parser.py index 5a80fc3..c8c5c9c 100644 --- 
a/src/specfact_cli/utils/enrichment_parser.py +++ b/src/specfact_cli/utils/enrichment_parser.py @@ -112,7 +112,9 @@ def _parse_missing_features(self, content: str, report: EnrichmentReport) -> Non section = match.group(1) # Extract individual features (numbered or bulleted) - feature_pattern = r"(?:^|\n)(?:\d+\.|\*|\-)\s*(.+?)(?=\n(?:^\d+\.|\*|\-|\Z))" + # Stop at next feature (numbered item at start of line, optionally followed by bold text) + # This avoids stopping at story numbers which are indented + feature_pattern = r"(?:^|\n)(?:\d+\.|\*|\-)\s*(.+?)(?=\n(?:^\d+\.\s*\*\*|^\d+\.\s+[A-Z]|\*|\-|\Z))" features = re.findall(feature_pattern, section, re.MULTILINE | re.DOTALL) for feature_text in features: @@ -133,20 +135,29 @@ def _parse_feature_block(self, feature_text: str) -> dict[str, Any] | None: "stories": [], } - # Extract key (e.g., "FEATURE-IDEINTEGRATION" or "Suggested key: FEATURE-IDEINTEGRATION") - key_match = re.search(r"(?:key|Key):\s*([A-Z0-9_-]+)", feature_text, re.IGNORECASE) + # Extract title first (from bold text: "**Title** (Key: ...)" or "1. 
**Title** (Key: ...)") + # Feature text may or may not include the leading number (depends on extraction pattern) + title_match = re.search(r"^\*\*([^*]+)\*\*", feature_text, re.MULTILINE) + if not title_match: + # Try with optional number prefix + title_match = re.search(r"^\d+\.\s*\*\*([^*]+)\*\*", feature_text, re.MULTILINE) + if title_match: + feature["title"] = title_match.group(1).strip() + + # Extract key (e.g., "FEATURE-IDEINTEGRATION" or "(Key: FEATURE-IDEINTEGRATION)") + # Try parentheses format first: (Key: FEATURE-XXX) + key_match = re.search(r"\(Key:\s*([A-Z0-9_-]+)\)", feature_text, re.IGNORECASE) + if not key_match: + # Try without parentheses: Key: FEATURE-XXX + key_match = re.search(r"(?:key|Key):\s*([A-Z0-9_-]+)", feature_text, re.IGNORECASE) if key_match: feature["key"] = key_match.group(1) else: - # Try to extract from title - title_match = re.search(r"^\*\*([^*]+)\*\*", feature_text, re.MULTILINE) - if title_match: - # Generate key from title - title = title_match.group(1).strip() - feature["title"] = title - feature["key"] = f"FEATURE-{title.upper().replace(' ', '').replace('-', '')[:20]}" + # Generate key from title if we have one + if feature["title"]: + feature["key"] = f"FEATURE-{feature['title'].upper().replace(' ', '').replace('-', '')[:20]}" - # Extract title + # Extract title from "Title:" keyword if not found in bold text if not feature["title"]: title_match = re.search(r"(?:title|Title):\s*(.+?)(?:\n|$)", feature_text, re.IGNORECASE) if title_match: @@ -158,14 +169,20 @@ def _parse_feature_block(self, feature_text: str) -> dict[str, Any] | None: with suppress(ValueError): feature["confidence"] = float(confidence_match.group(1)) - # Extract outcomes + # Extract outcomes (stop at Stories: section to avoid capturing story text) outcomes_match = re.search( - r"(?:outcomes?|Outcomes?):\s*(.+?)(?:\n(?:stories?|Stories?)|$)", feature_text, re.IGNORECASE | re.DOTALL + r"(?:outcomes?|Outcomes?):\s*(.+?)(?:\n\s*(?:stories?|Stories?):|\Z)", + 
feature_text, + re.IGNORECASE | re.DOTALL, ) if outcomes_match: - outcomes_text = outcomes_match.group(1) - # Split by lines or bullets - outcomes = [o.strip() for o in re.split(r"\n|,", outcomes_text) if o.strip()] + outcomes_text = outcomes_match.group(1).strip() + # Split by lines or commas, filter out empty strings and story markers + outcomes = [ + o.strip() + for o in re.split(r"\n|,", outcomes_text) + if o.strip() and not o.strip().startswith("- Stories:") + ] feature["outcomes"] = outcomes # Extract business value or reason @@ -180,11 +197,12 @@ def _parse_feature_block(self, feature_text: str) -> dict[str, Any] | None: feature["outcomes"].append(reason) # Extract stories (REQUIRED for features to pass promotion validation) + # Stop at next feature (numbered with bold title) or section header stories_match = re.search( - r"(?:stories?|Stories?):\s*(.+?)(?:\n(?:##|$))", feature_text, re.IGNORECASE | re.DOTALL + r"(?:stories?|Stories?):\s*(.+?)(?=\n\d+\.\s*\*\*|\n##|\Z)", feature_text, re.IGNORECASE | re.DOTALL ) if stories_match: - stories_text = stories_match.group(1) + stories_text = stories_match.group(1).strip() stories = self._parse_stories_from_text(stories_text, feature.get("key", "")) feature["stories"] = stories @@ -204,9 +222,16 @@ def _parse_stories_from_text(self, stories_text: str, feature_key: str) -> list[ # Extract individual stories (numbered, bulleted, or sub-headers) # Pattern matches: "1. Story title", "- Story title", "### Story title", etc. - story_pattern = r"(?:^|\n)(?:(?:\d+\.|\*|\-|\#\#\#)\s*)?(.+?)(?=\n(?:^\d+\.|\*|\-|\#\#\#|\Z))" + # Handle indented stories (common in nested lists) + # Match numbered stories with optional indentation: " 1. Story title" or "1. 
Story title" + story_pattern = r"(?:^|\n)(?:\s*)(?:\d+\.)\s*(.+?)(?=\n(?:\s*)(?:\d+\.)|\Z)" story_matches = re.findall(story_pattern, stories_text, re.MULTILINE | re.DOTALL) + # If no matches with numbered pattern, try bulleted pattern + if not story_matches: + story_pattern = r"(?:^|\n)(?:\s*)(?:\*|\-)\s*(.+?)(?=\n(?:\s*)(?:\*|\-|\d+\.)|\Z)" + story_matches = re.findall(story_pattern, stories_text, re.MULTILINE | re.DOTALL) + for idx, story_text in enumerate(story_matches, start=1): story = self._parse_story_block(story_text, feature_key, idx) if story: @@ -246,20 +271,42 @@ def _parse_story_block(self, story_text: str, feature_key: str, story_number: in if title_match: story["title"] = title_match.group(1).strip() else: - # Use first line as title + # Use first line as title (remove leading number/bullet if present) first_line = story_text.split("\n")[0].strip() - if first_line and not first_line.startswith("#"): + # Remove leading number/bullet: "1. Title" -> "Title" or "- Title" -> "Title" + first_line = re.sub(r"^(?:\d+\.|\*|\-)\s*", "", first_line).strip() + # Remove story key prefix if present: "STORY-XXX: Title" -> "Title" + first_line = re.sub(r"^STORY-[A-Z0-9-]+:\s*", "", first_line, flags=re.IGNORECASE).strip() + if first_line and not first_line.startswith("#") and not first_line.startswith("-"): story["title"] = first_line # Extract acceptance criteria + # Handle both "- Acceptance: ..." and "Acceptance: ..." formats + # Pattern matches: "- Acceptance: ..." or "Acceptance: ..." 
(with optional indentation and dash) + # Use simple pattern that matches "Acceptance:" and captures until end or next numbered item acceptance_match = re.search( - r"(?:acceptance|Acceptance|criteria|Criteria):\s*(.+?)(?:\n(?:tasks?|Tasks?|points?|Points?)|$)", + r"(?:acceptance|Acceptance|criteria|Criteria):\s*(.+?)(?=\n\s*\d+\.|\n\s*(?:tasks?|Tasks?|points?|Points?|##)|\Z)", story_text, re.IGNORECASE | re.DOTALL, ) if acceptance_match: - acceptance_text = acceptance_match.group(1) - acceptance = [a.strip() for a in re.split(r"\n|,", acceptance_text) if a.strip()] + acceptance_text = acceptance_match.group(1).strip() + # Split by commas (common format: "criterion1, criterion2, criterion3") + # Use lookahead to split on comma-space before capital letter (sentence boundaries) + # Also split on newlines for multi-line format + acceptance = [ + a.strip() + for a in re.split(r",\s+(?=[A-Z][a-z])|\n", acceptance_text) + if a.strip() and not a.strip().startswith("-") and not a.strip().startswith("Acceptance:") + ] + # If splitting didn't work well, try simpler comma split + if not acceptance or len(acceptance) == 1: + acceptance = [ + a.strip() for a in acceptance_text.split(",") if a.strip() and not a.strip().startswith("-") + ] + # If still empty after splitting, use the whole text as one criterion + if not acceptance: + acceptance = [acceptance_text] story["acceptance"] = acceptance else: # Default acceptance if none found @@ -399,11 +446,37 @@ def apply_enrichment(plan_bundle: PlanBundle, enrichment: EnrichmentReport) -> P # Update confidence if provided if "confidence" in missing_feature_data: existing_feature.confidence = missing_feature_data["confidence"] + # Update title if provided and empty + if "title" in missing_feature_data and missing_feature_data["title"] and not existing_feature.title: + existing_feature.title = missing_feature_data["title"] # Merge outcomes if "outcomes" in missing_feature_data: for outcome in missing_feature_data["outcomes"]: if outcome 
not in existing_feature.outcomes: existing_feature.outcomes.append(outcome) + # Merge stories (add new stories that don't already exist) + stories_data = missing_feature_data.get("stories", []) + if stories_data: + existing_story_keys = {s.key for s in existing_feature.stories} + for story_data in stories_data: + if isinstance(story_data, dict): + story_key = story_data.get("key", "") + # Only add story if it doesn't already exist + if story_key and story_key not in existing_story_keys: + story = Story( + key=story_key, + title=story_data.get("title", "Untitled Story"), + acceptance=story_data.get("acceptance", []), + story_points=story_data.get("story_points"), + value_points=story_data.get("value_points"), + tasks=story_data.get("tasks", []), + confidence=story_data.get("confidence", 0.8), + draft=False, + scenarios=None, + contracts=None, + ) + existing_feature.stories.append(story) + existing_story_keys.add(story_key) else: # Create new feature with stories (if provided) stories_data = missing_feature_data.get("stories", []) diff --git a/src/specfact_cli/validators/repro_checker.py b/src/specfact_cli/validators/repro_checker.py index f67400d..6aab56a 100644 --- a/src/specfact_cli/validators/repro_checker.py +++ b/src/specfact_cli/validators/repro_checker.py @@ -7,6 +7,7 @@ from __future__ import annotations +import os import re import shutil import subprocess @@ -45,6 +46,108 @@ def _strip_ansi_codes(text: str) -> str: return ansi_escape.sub("", text) +@beartype +@require(lambda repo_path: isinstance(repo_path, Path), "repo_path must be Path") +@require(lambda targets: isinstance(targets, list), "targets must be list") +@ensure(lambda result: isinstance(result, tuple) and len(result) == 3, "Must return (list, bool, list)") +@ensure( + lambda result: isinstance(result[0], list) and isinstance(result[1], bool) and isinstance(result[2], list), + "Must return (list, bool, list)", +) +def _expand_crosshair_targets(repo_path: Path, targets: list[str]) -> 
tuple[list[str], bool, list[str]]: + """ + Expand directory targets into module names and PYTHONPATH roots, excluding __main__.py. + """ + expanded: list[str] = [] + excluded_main = False + pythonpath_roots: list[str] = [] + src_root = (repo_path / "src").resolve() + lib_root = (repo_path / "lib").resolve() + + for target in targets: + target_path = repo_path / target + if not target_path.exists(): + continue + if target_path.is_dir(): + target_root = target_path.resolve() + if target_root in (src_root, lib_root): + module_root = target_root + pythonpath_root = target_root + else: + module_root = repo_path.resolve() + pythonpath_root = repo_path.resolve() + pythonpath_root_str = str(pythonpath_root) + if pythonpath_root_str not in pythonpath_roots: + pythonpath_roots.append(pythonpath_root_str) + for py_file in target_root.rglob("*.py"): + if py_file.name == "__main__.py": + excluded_main = True + continue + module_name = _module_name_from_path(module_root, py_file) + if module_name: + expanded.append(module_name) + else: + if target_path.name == "__main__.py": + excluded_main = True + continue + if target_path.suffix == ".py": + file_path = target_path.resolve() + if file_path.is_relative_to(src_root): + module_root = src_root + pythonpath_root = src_root + elif file_path.is_relative_to(lib_root): + module_root = lib_root + pythonpath_root = lib_root + else: + module_root = repo_path.resolve() + pythonpath_root = repo_path.resolve() + pythonpath_root_str = str(pythonpath_root) + if pythonpath_root_str not in pythonpath_roots: + pythonpath_roots.append(pythonpath_root_str) + module_name = _module_name_from_path(module_root, file_path) + if module_name: + expanded.append(module_name) + + expanded = sorted(set(expanded)) + return expanded, excluded_main, pythonpath_roots + + +@beartype +@require(lambda root: isinstance(root, Path), "root must be Path") +@require(lambda file_path: isinstance(file_path, Path), "file_path must be Path") +@ensure(lambda result: result is 
None or isinstance(result, str), "Must return str or None") +def _module_name_from_path(root: Path, file_path: Path) -> str | None: + """Convert a file path to a module name relative to the root.""" + try: + rel_path = file_path.relative_to(root) + except ValueError: + return None + parts = list(rel_path.parts) + if not parts: + return None + if parts[-1] == "__init__.py": + parts = parts[:-1] + else: + parts[-1] = parts[-1].removesuffix(".py") + if not parts or any(part == "" for part in parts): + return None + return ".".join(parts) + + +@beartype +@require(lambda pythonpath_roots: isinstance(pythonpath_roots, list), "pythonpath_roots must be list") +@ensure(lambda result: result is None or isinstance(result, dict), "Must return dict or None") +def _build_crosshair_env(pythonpath_roots: list[str]) -> dict[str, str] | None: + """Build environment with PYTHONPATH for CrossHair module imports.""" + if not pythonpath_roots: + return None + env = os.environ.copy() + existing = env.get("PYTHONPATH", "") + combined = os.pathsep.join(pythonpath_roots + ([existing] if existing else [])) + env["PYTHONPATH"] = combined + return env + + @beartype @require(lambda output: isinstance(output, str), "Output must be string") @ensure(lambda result: isinstance(result, dict), "Must return dictionary") @@ -540,6 +643,7 @@ def __init__( @require(lambda tool: isinstance(tool, str) and len(tool) > 0, "Tool must be non-empty string") @require(lambda command: isinstance(command, list) and len(command) > 0, "Command must be non-empty list") @require(lambda timeout: timeout is None or timeout > 0, "Timeout must be positive if provided") + @require(lambda env: env is None or isinstance(env, dict), "env must be dict or None") @ensure(lambda result: isinstance(result, CheckResult), "Must return CheckResult") @ensure(lambda result: result.duration is None or result.duration >= 0, "Duration must be non-negative") def run_check( @@ -549,6 +653,7 @@ def run_check( command: list[str], timeout: int | 
None = None, skip_if_missing: bool = True, + env: dict[str, str] | None = None, ) -> CheckResult: """ Run a single validation check. @@ -559,6 +664,7 @@ def run_check( command: Command to execute timeout: Per-check timeout (default: budget / number of checks, must be > 0 if provided) skip_if_missing: Skip check if tool not found + env: Optional environment variables to pass to the subprocess Returns: CheckResult with status and output @@ -598,6 +704,7 @@ def run_check( text=True, timeout=check_timeout, check=False, + env=env, ) result.duration = time.time() - start @@ -690,7 +797,7 @@ def run_all_checks(self) -> ReproReport: smoke_tests = self.repo_path / "tests" / "smoke" tests_dir = self.repo_path / "tests" - checks: list[tuple[str, str, list[str], int | None, bool]] = [] + checks: list[tuple[str, str, list[str], int | None, bool, dict[str, str] | None]] = [] # Linting (ruff) - optional ruff_available, _ = check_tool_in_env(self.repo_path, "ruff", env_info) @@ -701,26 +808,10 @@ def run_all_checks(self) -> ReproReport: if (self.repo_path / "tools").exists(): ruff_command.append("tools/") ruff_command = build_tool_command(env_info, ruff_command) - checks.append( - ( - "Linting (ruff)", - "ruff", - ruff_command, - None, - True, - ) - ) + checks.append(("Linting (ruff)", "ruff", ruff_command, None, True, None)) else: # Add as skipped check with message - checks.append( - ( - "Linting (ruff)", - "ruff", - [], - None, - True, - ) - ) + checks.append(("Linting (ruff)", "ruff", [], None, True, None)) # Semgrep - optional, only if config exists if semgrep_enabled: @@ -730,25 +821,9 @@ def run_all_checks(self) -> ReproReport: if self.fix: semgrep_command.append("--autofix") semgrep_command = build_tool_command(env_info, semgrep_command) - checks.append( - ( - "Async patterns (semgrep)", - "semgrep", - semgrep_command, - 30, - True, - ) - ) + checks.append(("Async patterns (semgrep)", "semgrep", semgrep_command, 30, True, None)) else: - checks.append( - ( - "Async patterns 
(semgrep)", - "semgrep", - [], - 30, - True, - ) - ) + checks.append(("Async patterns (semgrep)", "semgrep", [], 30, True, None)) # Type checking (basedpyright) - optional basedpyright_available, _ = check_tool_in_env(self.repo_path, "basedpyright", env_info) @@ -757,9 +832,9 @@ def run_all_checks(self) -> ReproReport: if (self.repo_path / "tools").exists(): basedpyright_command.append("tools/") basedpyright_command = build_tool_command(env_info, basedpyright_command) - checks.append(("Type checking (basedpyright)", "basedpyright", basedpyright_command, None, True)) + checks.append(("Type checking (basedpyright)", "basedpyright", basedpyright_command, None, True, None)) else: - checks.append(("Type checking (basedpyright)", "basedpyright", [], None, True)) + checks.append(("Type checking (basedpyright)", "basedpyright", [], None, True, None)) # CrossHair - optional, only if source directories exist if source_dirs: @@ -771,28 +846,27 @@ def run_all_checks(self) -> ReproReport: if (self.repo_path / "tools").exists(): crosshair_targets.append("tools/") - # Build command: python -m crosshair check - crosshair_base = ["python", "-m", "crosshair", "check", *crosshair_targets] - crosshair_command = build_tool_command(env_info, crosshair_base) - checks.append( - ( - "Contract exploration (CrossHair)", - "crosshair", - crosshair_command, - 60, - True, - ) + expanded_targets, _excluded_main, pythonpath_roots = _expand_crosshair_targets( + self.repo_path, crosshair_targets ) - else: - checks.append( - ( - "Contract exploration (CrossHair)", - "crosshair", - [], - 60, - True, + if expanded_targets: + crosshair_base = ["python", "-m", "crosshair", "check", *expanded_targets] + crosshair_command = build_tool_command(env_info, crosshair_base) + crosshair_env = _build_crosshair_env(pythonpath_roots) + checks.append( + ( + "Contract exploration (CrossHair)", + "crosshair", + crosshair_command, + 60, + True, + crosshair_env, + ) ) - ) + else: + checks.append(("Contract exploration 
(CrossHair)", "crosshair", [], 60, True, None)) + else: + checks.append(("Contract exploration (CrossHair)", "crosshair", [], 60, True, None)) # Property tests - optional, only if directory exists if contracts_tests.exists(): @@ -800,25 +874,9 @@ def run_all_checks(self) -> ReproReport: if pytest_available: pytest_command = ["pytest", "tests/contracts/", "-v"] pytest_command = build_tool_command(env_info, pytest_command) - checks.append( - ( - "Property tests (pytest contracts)", - "pytest", - pytest_command, - 30, - True, - ) - ) + checks.append(("Property tests (pytest contracts)", "pytest", pytest_command, 30, True, None)) else: - checks.append( - ( - "Property tests (pytest contracts)", - "pytest", - [], - 30, - True, - ) - ) + checks.append(("Property tests (pytest contracts)", "pytest", [], 30, True, None)) # Smoke tests - optional, only if directory exists if smoke_tests.exists(): @@ -826,9 +884,9 @@ def run_all_checks(self) -> ReproReport: if pytest_available: pytest_command = ["pytest", "tests/smoke/", "-v"] pytest_command = build_tool_command(env_info, pytest_command) - checks.append(("Smoke tests (pytest smoke)", "pytest", pytest_command, 30, True)) + checks.append(("Smoke tests (pytest smoke)", "pytest", pytest_command, 30, True, None)) else: - checks.append(("Smoke tests (pytest smoke)", "pytest", [], 30, True)) + checks.append(("Smoke tests (pytest smoke)", "pytest", [], 30, True, None)) for check_args in checks: # Check budget before starting @@ -838,7 +896,7 @@ def run_all_checks(self) -> ReproReport: break # Skip checks with empty commands (tool not available) - name, tool, command, _timeout, _skip_if_missing = check_args + name, tool, command, _timeout, _skip_if_missing, _env = check_args if not command: # Tool not available - create skipped result with helpful message _tool_available, tool_message = check_tool_in_env(self.repo_path, tool, env_info) diff --git a/tests/e2e/test_enrichment_workflow.py b/tests/e2e/test_enrichment_workflow.py index 
198fc42..9d5135d 100644 --- a/tests/e2e/test_enrichment_workflow.py +++ b/tests/e2e/test_enrichment_workflow.py @@ -113,12 +113,18 @@ def test_dual_stack_enrichment_workflow(self, sample_repo: Path, tmp_path: Path) 1. **API Gateway Feature** (Key: FEATURE-APIGATEWAY) - Confidence: 0.85 - Outcomes: Provides API routing and gateway functionality - - Reason: AST missed because it's in a separate service module + - Stories: + 1. API Gateway routes requests to appropriate services + - Acceptance: Gateway receives HTTP requests, routes to correct service endpoint, returns service response + 2. API Gateway handles authentication + - Acceptance: Gateway validates API keys, forwards authenticated requests, rejects invalid requests 2. **Database Manager** (Key: FEATURE-DATABASEMANAGER) - Confidence: 0.80 - Outcomes: Handles database connections and queries - - Reason: Not detected in AST analysis + - Stories: + 1. Database Manager establishes connections + - Acceptance: Manager creates database connection pool, manages connection lifecycle, handles connection errors ## Confidence Adjustments diff --git a/tests/integration/test_enrichment_parser_integration.py b/tests/integration/test_enrichment_parser_integration.py new file mode 100644 index 0000000..bdee1b9 --- /dev/null +++ b/tests/integration/test_enrichment_parser_integration.py @@ -0,0 +1,78 @@ +"""Integration tests for enrichment parser with stories.""" + +from __future__ import annotations + +from pathlib import Path + +from specfact_cli.models.plan import Idea, PlanBundle, Product +from specfact_cli.utils.enrichment_parser import EnrichmentParser, apply_enrichment + + +class TestEnrichmentParserIntegration: + """Integration tests for enrichment parser pipeline with stories.""" + + def test_parse_and_apply_enrichment_with_stories(self, tmp_path: Path): + """Test parsing enrichment report with stories and applying to plan bundle.""" + # Create enrichment report with stories + report_content = """# Enrichment Report + 
+## Missing Features + +1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION) + - Confidence: 0.85 + - Outcomes: User registration, login, profile management + - Stories: + 1. User can sign up for new account + - Acceptance: sign_up view processes POST requests, creates User and UserProfile automatically, user is automatically logged in after signup, redirects to profile page after signup + 2. User can log in with credentials + - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected to dash, on failure error message is displayed +""" + report_file = tmp_path / "enrichment.md" + report_file.write_text(report_content) + + # Create initial plan bundle + plan_bundle = PlanBundle( + version="1.0", + idea=Idea(title="Test", narrative="Test narrative", metrics=None), + product=Product(themes=["Test"]), + features=[], + business=None, + metadata=None, + clarifications=None, + ) + + # Parse enrichment report + parser = EnrichmentParser() + enrichment = parser.parse(report_file) + + # Verify parsing + assert len(enrichment.missing_features) == 1 + feature_data = enrichment.missing_features[0] + assert feature_data["key"] == "FEATURE-USER-AUTHENTICATION" + assert feature_data["title"] == "User Authentication" + assert len(feature_data["stories"]) == 2 + + # Verify stories were parsed correctly + story1 = feature_data["stories"][0] + assert story1["title"] == "User can sign up for new account" + assert len(story1["acceptance"]) >= 4 + + story2 = feature_data["stories"][1] + assert story2["title"] == "User can log in with credentials" + assert len(story2["acceptance"]) >= 3 + + # Apply enrichment + enriched = apply_enrichment(plan_bundle, enrichment) + + # Verify enriched bundle + assert len(enriched.features) == 1 + feature = enriched.features[0] + assert feature.key == "FEATURE-USER-AUTHENTICATION" + assert feature.title == "User Authentication" + assert len(feature.stories) == 2 + + # Verify stories were added correctly 
+ assert feature.stories[0].title == "User can sign up for new account" + assert len(feature.stories[0].acceptance) >= 4 + assert feature.stories[1].title == "User can log in with credentials" + assert len(feature.stories[1].acceptance) >= 3 diff --git a/tests/unit/utils/test_enrichment_parser_stories.py b/tests/unit/utils/test_enrichment_parser_stories.py new file mode 100644 index 0000000..ae64226 --- /dev/null +++ b/tests/unit/utils/test_enrichment_parser_stories.py @@ -0,0 +1,72 @@ +"""Unit tests for enrichment parser with stories format.""" + +from __future__ import annotations + +from pathlib import Path + +from specfact_cli.utils.enrichment_parser import EnrichmentParser + + +class TestEnrichmentParserStories: + """Test EnrichmentParser with enrichment reports containing stories.""" + + def test_parse_enrichment_report_with_stories(self, tmp_path: Path): + """Test parsing enrichment report with features containing stories and acceptance criteria.""" + # Create test enrichment report with stories format + report_content = """# Enrichment Report + +## Missing Features + +1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION) + - Confidence: 0.85 + - Outcomes: User registration, login, profile management, authentication system + - Stories: + 1. User can sign up for new account + - Acceptance: sign_up view processes POST requests, creates User and UserProfile automatically, user is automatically logged in after signup, redirects to profile page after signup + 2. User can log in with credentials + - Acceptance: log_in view authenticates username/password, on success user is logged in and redirected to dash, on failure error message is displayed + +2. **Notes Management** (Key: FEATURE-NOTES-MANAGEMENT) + - Confidence: 0.85 + - Outcomes: Messaging system, dashboard with conversations + - Stories: + 1. 
User can view dashboard with latest notes + - Acceptance: dash view shows latest note from each conversation, dashboard includes all users who have exchanged notes with current user + +## Business Context + +- Priority: Core application for user management +- Constraint: Must support authentication and messaging features +""" + report_file = tmp_path / "enrichment.md" + report_file.write_text(report_content) + + parser = EnrichmentParser() + report = parser.parse(report_file) + + # Verify features were parsed + assert len(report.missing_features) == 2 + + # Verify first feature + feature1 = report.missing_features[0] + assert feature1["key"] == "FEATURE-USER-AUTHENTICATION" + assert feature1["title"] == "User Authentication" + assert feature1["confidence"] == 0.85 + assert len(feature1["outcomes"]) > 0 + assert len(feature1["stories"]) == 2 + + # Verify first story + story1 = feature1["stories"][0] + assert story1["title"] == "User can sign up for new account" + assert len(story1["acceptance"]) >= 4 # Should have multiple acceptance criteria + assert "sign_up view processes POST requests" in story1["acceptance"][0] + + # Verify second feature + feature2 = report.missing_features[1] + assert feature2["key"] == "FEATURE-NOTES-MANAGEMENT" + assert feature2["title"] == "Notes Management" + assert len(feature2["stories"]) == 1 + + # Verify business context + assert len(report.business_context["priorities"]) > 0 + assert len(report.business_context["constraints"]) > 0
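The tests above pin down the report grammar `EnrichmentParser` must accept: numbered `**Title** (Key: ...)` feature entries, a `- Confidence:` line, and nested numbered stories each carrying a comma-separated `- Acceptance:` line. As a rough, standalone sketch of how that "Missing Features" grammar maps to structured data — not specfact's actual `EnrichmentParser` implementation, just a hypothetical line-by-line regex walk over the same format — something like this would recover the shape the tests assert on:

```python
import re

# Hypothetical sample in the format exercised by the tests above.
SAMPLE = """# Enrichment Report

## Missing Features

1. **User Authentication** (Key: FEATURE-USER-AUTHENTICATION)
   - Confidence: 0.85
   - Outcomes: User registration, login
   - Stories:
     1. User can sign up for new account
        - Acceptance: processes POST requests, creates profile, logs user in, redirects to profile
     2. User can log in with credentials
        - Acceptance: authenticates credentials, redirects on success, shows error on failure

## Business Context

- Priority: Core application
"""


def parse_missing_features(report: str) -> list[dict]:
    """Walk the '## Missing Features' section line by line.

    Returns one dict per feature: key, title, confidence, and stories,
    where each story carries its comma-separated acceptance criteria.
    """
    features: list[dict] = []
    story = None
    in_section = False
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # Only parse inside the Missing Features section.
            in_section = line == "## Missing Features"
            continue
        if not in_section or not line:
            continue
        feat = re.match(r"\d+\.\s+\*\*(.+?)\*\*\s+\(Key:\s+(\S+)\)", line)
        if feat:
            features.append({"title": feat.group(1), "key": feat.group(2),
                             "confidence": None, "stories": []})
            story = None
            continue
        if line.startswith("- Confidence:") and features:
            features[-1]["confidence"] = float(line.split(":", 1)[1])
            continue
        if line.startswith("- Acceptance:") and story is not None:
            story["acceptance"] = [c.strip() for c in line.split(":", 1)[1].split(",")]
            continue
        item = re.match(r"\d+\.\s+(.*)", line)
        if item and features:
            # Numbered line without bold markers: a story under the current feature.
            story = {"title": item.group(1), "acceptance": []}
            features[-1]["stories"].append(story)
    return features


if __name__ == "__main__":
    for f in parse_missing_features(SAMPLE):
        print(f["key"], f["confidence"], len(f["stories"]))
```

This is why the "CRITICAL: Follow the exact enrichment report format" rule in the command doc matters: a line-oriented parser like this silently drops any feature or story that deviates from the expected numbering and `- Acceptance:` prefix, which is exactly the failure mode the unit and integration tests guard against.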