Commit 8be301e

feat(backlog): normalize daily summarize Markdown output (#323)
* feat(backlog): summarize Markdown normalization and TTY/CI rendering
* chore(openspec): drop implementation snapshot from change
* Update title

---------

Co-authored-by: Dominikus Nold <djm81@users.noreply.github.com>
1 parent 37d8475 commit 8be301e

13 files changed: +425 -16 lines changed

docs/getting-started/tutorial-daily-standup-sprint-review.md

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,9 @@ Preferred command path is `specfact backlog ceremony standup ...`. The legacy `s
2727
the adapter supports fetching comments
2828
- Use **`--summarize`** or **`--summarize-to <path>`** to output a **prompt** (instruction + filter context
2929
+ standup data) for a slash command (e.g. `specfact.daily`) or copy-paste to Copilot to **generate a
30-
standup summary**; add **`--comments`**/**`--annotations`** to include comment annotations in the prompt
30+
standup summary**; add **`--comments`**/**`--annotations`** to include comment annotations in the prompt.
31+
The prompt content is always **normalized to Markdown-only text** (no raw HTML tags or HTML entities) so
32+
ADO-style HTML descriptions/comments and GitHub/Markdown content render consistently.
3133
- Use the **`specfact.backlog-daily`** (or `specfact.daily`) slash prompt for interactive walkthrough with the DevOps team story-by-story (focus, issues, open questions, discussion notes as comments)
3234
- Filter by **`--assignee`**, **`--sprint`** / **`--iteration`**, **`--search`**, **`--release`**, **`--id`**, **`--first-issues`** / **`--last-issues`**, **`--blockers-first`**, and optional **`--suggest-next`**
3335

@@ -142,18 +144,21 @@ the standup table (state, assignee, limit, etc.).
142144
To get a **prompt** you can paste into Copilot or feed to a slash command (e.g. `specfact.daily`) so an AI can **generate a short standup summary** (e.g. "Today: 3 in progress, 1 blocked, 2 pending commitment"):
143145

144146
```bash
145-
# Print prompt to stdout (copy-paste to Copilot)
147+
# Print prompt to stdout (copy-paste to Copilot). In an interactive terminal, SpecFact renders a
148+
# Markdown-formatted view; in CI/non-interactive environments the same normalized Markdown is printed
149+
# without ANSI formatting.
146150
specfact backlog ceremony standup github --summarize --comments
147151

148-
# Write prompt to a file (e.g. for slash command)
152+
# Write prompt to a file (e.g. for slash command). The file always contains plain Markdown-only content
153+
# (no raw HTML, no ANSI control codes), suitable for IDE slash commands or copy/paste into Copilot.
149154
specfact backlog ceremony standup github --summarize-to ./standup-prompt.md --comments
150155
```
151156

152157
The output includes an instruction to generate a standup summary, the applied filter context (adapter,
153158
state, sprint, assignee, limit), and the same per-item data as `--copilot-export`. With
154-
`--comments`/`--annotations`, the prompt includes comment annotations when supported. Use it with the
155-
**`specfact.backlog-daily`** slash prompt for interactive team walkthrough (story-by-story, current focus,
156-
issues/open questions, discussion notes as comments).
159+
`--comments`/`--annotations`, the prompt includes normalized descriptions and comment annotations when
160+
supported. Use it with the **`specfact.backlog-daily`** slash prompt for interactive team walkthrough
161+
(story-by-story, current focus, issues/open questions, discussion notes as comments).
157162

158163
---
159164

docs/guides/agile-scrum-workflows.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -91,8 +91,10 @@ SpecFact CLI supports real-world agile/scrum practices through:
9191
to show suggested next item by value score (business_value / (story_points × priority)).
9292
**Copilot export**: Use `--copilot-export <path>` to write a summarized Markdown file of each story for
9393
Copilot. Add `--comments` (alias `--annotations`) to include descriptions and comment annotations in
94-
`--copilot-export` and `--summarize` outputs when the adapter supports `get_comments` (GitHub, ADO). Use
95-
`--first-comments N` or `--last-comments N` to scope comment volume when needed (default: include all).
94+
`--copilot-export` and `--summarize` outputs when the adapter supports `get_comments` (GitHub, ADO). All
95+
summarize/copilot-export content is **normalized to Markdown-only text** (no raw HTML tags or entities)
96+
so ADO-style HTML fields and Markdown-native fields render consistently. Use `--first-comments N` or
97+
`--last-comments N` to scope comment volume when needed (default: include all).
9698
Use `--first-issues N` or `--last-issues N` (mutually exclusive) to scope daily output to oldest/newest
9799
items by numeric issue/work-item ID.
98100
**Kanban**: omit iteration/sprint and use state + limit; unassigned = pullable work. **Scrum/SAFe**: use
@@ -158,8 +160,8 @@ specfact backlog ceremony standup github --interactive # step-through; detail
158160
# or
159161
specfact backlog ceremony standup github --copilot-export ./standup.md --comments --last-comments 5
160162
# or
161-
specfact backlog ceremony standup github --summarize --comments # prompt to stdout for AI to generate standup summary
162-
specfact backlog ceremony standup github --summarize-to ./standup-prompt.md
163+
specfact backlog ceremony standup github --summarize --comments # prompt to stdout for AI to generate standup summary (Markdown-only)
164+
specfact backlog ceremony standup github --summarize-to ./standup-prompt.md # plain Markdown file (no HTML/ANSI)
163165
```
164166

165167
Use the **`specfact.backlog-daily`** (or `specfact.daily`) slash prompt for interactive walkthrough with the
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
schema: spec-driven
2+
created: 2026-02-27
Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
# Change Validation Report: backlog-scrum-05-summarize-markdown-output
2+
3+
**Validation Date**: 2026-02-27T13:01:44+01:00
4+
**Change Proposal**: [proposal.md](./proposal.md)
5+
**Validation Method**: Dry-run dependency analysis and OpenSpec strict validation (post-implementation)
6+
7+
## Executive Summary
8+
9+
- **Breaking Changes**: 0 detected
10+
- **Dependent Files**: 2 affected (implementation and tests; both updated in same change)
11+
- **Impact Level**: Low
12+
- **Validation Result**: Pass
13+
- **User Decision**: Proceed (implementation completed)
14+
15+
## Breaking Changes Detected
16+
17+
None. All changes are additive or internal:
18+
19+
- **`_normalize_markdown_text(text: str) -> str`**: New private helper in `commands.py`; no public API change.
20+
- **`_is_interactive_tty() -> bool`**: New private helper; no public API change.
21+
- **`_build_summarize_prompt_content(...)`**: Signature unchanged; behavior change is internal (normalization of body/comment strings before inclusion). All call sites (same module and unit tests) remain compatible.
22+
23+
## Dependencies Affected
24+
25+
### Critical Updates Required
26+
27+
None.
28+
29+
### Recommended Updates
30+
31+
- **`src/specfact_cli/modules/backlog/src/commands.py`**: Already updated (normalization, TTY detection, Rich Markdown rendering).
32+
- **`tests/unit/commands/test_backlog_daily.py`**: Already updated (HTML normalization tests, existing summarize tests still pass).
33+
34+
### Optional
35+
36+
- **`docs/getting-started/tutorial-daily-standup-sprint-review.md`**: Updated to describe Markdown-only and interactive vs CI behavior.
37+
- **`docs/guides/agile-scrum-workflows.md`**: Updated to note normalized Markdown-only summarize/copilot-export content.
38+
39+
## Impact Assessment
40+
41+
- **Code Impact**: Single module (`modules/backlog/src/commands.py`); new helpers and wiring inside existing summarize path.
42+
- **Test Impact**: New unit tests for HTML normalization; existing summarize tests unchanged in contract.
43+
- **Documentation Impact**: Tutorial and agile guide updated.
44+
- **Release Impact**: Patch (backward-compatible behavior change; output format improved, not contracted).
45+
46+
## User Decision
47+
48+
**Decision**: Proceed
49+
**Rationale**: Implementation completed; no breaking changes; OpenSpec validation passed.
50+
**Next Steps**: Merge via PR from feature worktree to `dev`; optionally run `/opsx:archive` after merge.
51+
52+
## Format Validation
53+
54+
- **proposal.md Format**: Pass
55+
- Required sections present: Why, What Changes, Capabilities, Impact
56+
- Capabilities section lists new capability and modified daily-standup
57+
- **tasks.md Format**: Pass
58+
- Numbered sections and checkbox task format per config
59+
- All tasks completed except 4.3 (now completed by this validation)
60+
- **specs Format**: Pass
61+
- ADDED/MODIFIED requirements with scenarios (When/Then)
62+
- **design.md Format**: Pass
63+
- Context, Goals/Non-Goals, Decisions, Risks documented
64+
- **Config.yaml Compliance**: Pass
65+
66+
## OpenSpec Validation
67+
68+
- **Status**: Pass
69+
- **Validation Command**: `openspec validate backlog-scrum-05-summarize-markdown-output --strict`
70+
- **Issues Found**: 0
71+
- **Issues Fixed**: 0
72+
- **Re-validated**: N/A
73+
74+
## Validation Artifacts
75+
76+
- Dependency search: `_normalize_markdown_text`, `_is_interactive_tty`, `_build_summarize_prompt_content` usages confined to `commands.py` and `test_backlog_daily.py`.
77+
- No temporary workspace created; validation performed in-repo post-implementation.
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
## TDD Evidence for backlog-scrum-05-summarize-markdown-output
2+
3+
### Failing-before run (new summarize normalization tests)
4+
5+
- **Command:**
6+
7+
```bash
8+
hatch test --cover -v tests/unit/commands/test_backlog_daily.py -k "summarize_prompt_normalizes_html"
9+
```
10+
11+
- **Timestamp:** 2026-02-27 (see CI logs / local shell history for exact time)
12+
13+
- **Failure summary:**
14+
- `test_summarize_prompt_normalizes_html_description_to_markdown`:
15+
- Expected HTML `<p>` / `<br />` and `&amp;` entities to be removed from summarize prompt output.
16+
- Actual output still contained raw `<p>Line 1<br />Line 2 &amp; more</p>` in the Description section.
17+
- `test_summarize_prompt_normalizes_html_comments_to_markdown`:
18+
- Expected HTML `<div>` and `<br>` plus `&amp;` entities in comments to be removed.
19+
- Actual output still contained raw `<div>Comment &amp; note<br>next line</div>` in the Comments section.
20+
21+
These failures confirm current behavior violates the new spec delta: summarize prompts include raw HTML and entities from ADO-style bodies and comments instead of normalized Markdown-only content.
22+
23+
### Passing-after run (summarize normalization implemented)
24+
25+
- **Command:**
26+
27+
```bash
28+
hatch test --cover -v tests/unit/commands/test_backlog_daily.py -k "summarize_prompt_normalizes_html"
29+
```
30+
31+
- **Result:** ✅ 2 passed (normalization tests), remaining tests deselected in this targeted run.
32+
33+
- **Behavior summary:**
34+
- `_build_summarize_prompt_content` now:
35+
- Normalizes HTML-based `body_markdown` values to Markdown-friendly text (no `<p>`, `<br>` tags or `&amp;` entities).
36+
- Normalizes HTML comments before including them under the "Comments (annotations)" section.
37+
- New helper `_normalize_markdown_text` (with `@beartype` and `@ensure`) enforces that the returned text does not contain raw HTML tags, satisfying the updated `daily-standup` summarize requirements.
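
The behavior described above can be illustrated with a minimal, stdlib-only sketch. This is not the code from `commands.py`: the function name `normalize_markdown_text_sketch` and both regular expressions are assumptions made for illustration, and the real helper additionally carries `@beartype` and `@ensure` contracts that are omitted here. The inputs in the usage note are the HTML fragments quoted in the failure summary.

```python
import html
import re

# Tags that imply a line break in ADO-style HTML (assumed set, for illustration).
_LINE_BREAK = re.compile(r"<br\s*/?>|</(?:p|div|li)\s*>", re.IGNORECASE)

# Any remaining HTML tag. The tag name must be followed by whitespace, "/" or ">",
# so Markdown autolinks such as <https://example.com> are left untouched.
_HTML_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(?:\s[^>]*)?/?>")


def normalize_markdown_text_sketch(text: str) -> str:
    """Hypothetical stand-in for `_normalize_markdown_text`: strip tags, decode entities."""
    # 1. Turn line-breaking tags into real newlines.
    text = _LINE_BREAK.sub("\n", text)
    # 2. Drop every other tag but keep its inner text.
    text = _HTML_TAG.sub("", text)
    # 3. Decode entities such as &amp; into plain characters.
    text = html.unescape(text)
    # 4. Trim trailing whitespace per line and collapse runs of blank lines.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

With this sketch, `<p>Line 1<br />Line 2 &amp; more</p>` becomes two plain lines (`Line 1` and `Line 2 & more`), and Markdown-native input such as headings and list items passes through unchanged. A known limitation of the sketch: entities that decode to angle brackets (for example `&lt;div&gt;`) would reappear as literal tag-like text, so a production helper needs a stricter strategy.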
38+
39+
Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
## Context
2+
3+
`specfact backlog daily` already supports a `--summarize` / `--summarize-to` flow that builds a prompt-ready view of the current standup scope (filters, per-item data, body, comments). When used against ADO, the underlying work item body and comments are often stored as HTML, while GitHub and some ADO comments use Markdown. Today the summarize builder can emit raw HTML fragments and entities into the prompt, which is noisy for both humans and LLMs and inconsistent with Markdown-centric flows elsewhere in SpecFact.
4+
5+
At the same time, SpecFact needs to support both interactive, rich terminal sessions (for humans running standup from a shell) and non-interactive / CI environments where summarize output is consumed by other tools or stored as artifacts.
6+
7+
## Goals / Non-Goals
8+
9+
**Goals:**
10+
11+
- Normalize all descriptions and comments included in summarize output into clean Markdown text, regardless of provider format.
12+
- Ensure summarize prompts never contain raw HTML tags or HTML entities.
13+
- Provide a Markdown-aware, readable view of the summarize content in interactive terminals (e.g. Rich Markdown rendering), while keeping the underlying Markdown text stable and prompt-ready.
14+
- Preserve existing summarize semantics: same filters, same per-item data fields, same `--summarize` vs `--summarize-to` behavior.
15+
16+
**Non-Goals:**
17+
18+
- Do not change which items are included in standup or summarize scope (filters and selection logic remain as defined in `daily-standup`).
19+
- Do not change how comments or bodies are stored in providers; normalization is applied only at summarize/export time.
20+
- Do not introduce a hard dependency on any particular HTML-to-Markdown library that would block offline usage; implementation must remain Python-only and bundle-safe.
21+
22+
## Decisions
23+
24+
- Introduce a small normalization utility (e.g. in the backlog module package) that:
25+
- Accepts raw body/comment text and a hint about source format (HTML vs Markdown when known).
26+
- Converts HTML to Markdown using a deterministic, testable strategy.
27+
- Always returns Markdown-only text suitable for inclusion in prompts.
28+
- Extend the summarize builder for `backlog daily` so that:
29+
- Before assembling the per-item section, it passes body and comment text through the normalization utility.
30+
- It treats GitHub/Markdown-native content as Markdown but still routes through the same normalization path for consistency.
31+
- Add a simple environment/TTY detection layer around summarize output:
32+
- If running in an interactive TTY and not explicitly in CI mode, render the normalized Markdown using Rich (or an equivalent Markdown-capable view) for the user.
33+
- If output is redirected, piped, or CI mode is detected, emit plain Markdown text without terminal control codes.
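
The environment/TTY decision above could look roughly like the following sketch. The names `is_interactive_tty` and `emit_summary`, and the use of a `CI` environment variable as the CI signal, are assumptions for illustration; the design text does not specify how CI mode is detected. Rich is imported lazily so the plain path works without it.

```python
import os
import sys
from typing import Mapping, Optional, TextIO


def is_interactive_tty(stream: TextIO = sys.stdout,
                       env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only for a real terminal that is not flagged as CI (assumed rule)."""
    env = os.environ if env is None else env
    # Assumption: a truthy CI variable marks a non-interactive environment.
    if env.get("CI", "").strip().lower() in {"1", "true", "yes"}:
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def emit_summary(markdown_text: str,
                 stream: TextIO = sys.stdout,
                 env: Optional[Mapping[str, str]] = None) -> None:
    """Render with Rich in a terminal; otherwise write the Markdown text unchanged."""
    if is_interactive_tty(stream, env):
        # Lazy import: Rich is only needed for the interactive view.
        from rich.console import Console
        from rich.markdown import Markdown

        Console(file=stream).print(Markdown(markdown_text))
    else:
        # Plain path: no ANSI control codes, exactly one trailing newline.
        stream.write(markdown_text.rstrip("\n") + "\n")
```

Keeping the rendered view and the underlying text separate in this way means a `--summarize-to` style file writer can always take the plain branch, which addresses the ANSI-leak risk noted under Risks / Trade-offs.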
34+
35+
## Risks / Trade-offs
36+
37+
- HTML-to-Markdown conversion can be lossy if not carefully tuned; we must verify typical ADO HTML patterns (paragraphs, lists, bold, links) produce acceptable Markdown for standup prompts.
38+
- Rich or similar libraries must be used in a way that does not leak ANSI control codes into `--summarize-to` files or CI logs; separation between rendered view and underlying text needs to be clear in implementation.
39+
- Normalization adds a processing step per item/comment; for very large backlogs this can affect performance, so implementation should be efficient and optionally short-circuit when input is already clean Markdown.
40+
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
# Change: Normalize daily summarize Markdown output
2+
3+
## Why
4+
5+
6+
7+
The current `specfact backlog daily --summarize/--summarize-to` output often contains raw HTML fragments and entities from ADO work item comments, mixed with Markdown-formatted text from GitHub and ADO. This makes the standup summary prompt hard to read for humans and noisy for LLMs, even though the underlying data is correct.
8+
9+
## What Changes
10+
11+
12+
13+
- Normalize backlog comments and descriptions used by `specfact backlog daily --summarize/--summarize-to` so that:
14+
- HTML-formatted content is converted into clean Markdown before it is included in the prompt.
15+
- Existing Markdown content is preserved as Markdown (no lossy reformatting).
16+
- For interactive terminal sessions:
17+
- Render the summarized standup prompt using a Markdown-aware terminal view (e.g. Rich Markdown rendering) so users see a readable, formatted view instead of raw Markdown or HTML.
18+
- For non-interactive / CI environments and plain terminals:
19+
- Fall back to emitting structured Markdown text directly (never raw HTML), preserving prompt-ready formatting for copy/paste into Copilot or slash commands.
20+
- Ensure the summarize output logic can distinguish between:
21+
- Interactive rich terminal usage (formatted view, still based on the same Markdown text).
22+
- Non-interactive/CI usage (plain Markdown text, no color/control codes).
23+
24+
## Capabilities
25+
### New Capabilities
26+
- `backlog-daily-markdown-normalization`: Normalize backlog item bodies and comments into Markdown-only text for daily standup summarize prompts, with environment-aware rendering (rich Markdown view in interactive terminals, plain Markdown in CI/non-interactive mode).
27+
28+
### Modified Capabilities
29+
- `daily-standup`: Clarify that the `--summarize/--summarize-to` scenarios must:
30+
- Include only Markdown (no raw HTML fragments or entities) in per-item body/comment fields.
31+
- Prefer a Markdown-formatted view in interactive terminals while keeping the underlying output prompt-ready for LLMs.
32+
33+
34+
---
35+
36+
## Source Tracking
37+
38+
<!-- source_repo: nold-ai/specfact-cli -->
39+
- **GitHub Issue**: #324
40+
- **Issue URL**: <https://github.com/nold-ai/specfact-cli/issues/324>
41+
- **Last Synced Status**: proposed
42+
- **Sanitized**: false
43+
<!-- content_hash: da76bac5d7da3752 -->
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
## ADDED Requirements
2+
3+
### Requirement: Normalize HTML and Markdown for summarize output
4+
5+
The system SHALL normalize all backlog item descriptions and comments included in `specfact backlog daily --summarize` and `--summarize-to` output so that the resulting prompt contains **only Markdown-formatted text** (no raw HTML tags or HTML entities), regardless of whether the underlying provider stores content as HTML (e.g. ADO) or Markdown (e.g. GitHub, Markdown-style ADO comments).
6+
7+
#### Scenario: HTML comments from ADO are converted to Markdown
8+
- **WHEN** `specfact backlog daily --summarize` or `--summarize-to` includes work items whose description or comments are stored as HTML (e.g. ADO discussion/comments)
9+
- **THEN** the system converts that HTML content into readable Markdown before including it in the summarize prompt
10+
- **AND** the resulting output does not contain raw HTML tags (e.g. `<p>`, `<br />`) or un-decoded HTML entities (e.g. `&lt;div&gt;`)
11+
12+
#### Scenario: Existing Markdown comments are preserved as Markdown
13+
- **WHEN** `specfact backlog daily --summarize` or `--summarize-to` includes items whose description or comments are already stored as Markdown (e.g. GitHub issues, Markdown-formatted ADO comments)
14+
- **THEN** the system preserves the original Markdown semantics when building the summarize prompt (headings, lists, code fences, emphasis)
15+
- **AND** the system does not degrade Markdown into a less structured format (e.g. by stripping list markers or collapsing headings)
16+
17+
#### Scenario: Mixed HTML and Markdown sources produce a consistent Markdown prompt
18+
- **WHEN** the daily summarize command aggregates items from sources that use different underlying formats (HTML and Markdown)
19+
- **THEN** the combined summarize output is a single, consistent Markdown document suitable for LLM consumption
20+
- **AND** no raw HTML tags or entities appear anywhere in the per-item body or comments sections
21+
22+
### Requirement: Environment-aware rendering for summarize output
23+
24+
The system SHALL render the same normalized Markdown summarize content differently depending on whether it is running in an interactive terminal session or in a non-interactive / CI environment, while always preserving a prompt-ready Markdown representation that tools can consume.
25+
26+
#### Scenario: Interactive terminal shows rich Markdown view
27+
- **WHEN** a user runs `specfact backlog daily --summarize` in an interactive terminal that supports rich output (e.g. TTY, not redirected to a file)
28+
- **THEN** the CLI MAY render the summarize content using a Markdown-aware terminal view (for example, Rich Markdown rendering)
29+
- **AND** the user sees a readable, formatted standup summary prompt (headings, lists, emphasis) instead of raw Markdown or HTML
30+
- **AND** the underlying content remains logically the same as the Markdown text used for `--summarize-to` (same sections and text, just rendered differently)
31+
32+
#### Scenario: Non-interactive or CI environments emit plain Markdown
33+
- **WHEN** `specfact backlog daily --summarize` or `--summarize-to` is run in a non-interactive environment (e.g. CI/CD job, output redirected to a file or piped)
34+
- **THEN** the system emits plain, prompt-ready Markdown text without ANSI color codes or interactive formatting controls
35+
- **AND** the output still satisfies the existing summarize requirement to include instruction text, filter context, and per-item data (including normalized body and comments)
36+
37+
#### Scenario: Summarize-to file output is always Markdown-only
38+
- **WHEN** the user runs `specfact backlog daily --summarize-to <path>`
39+
- **THEN** the file at `<path>` contains only normalized Markdown content (no raw HTML tags or entities, no terminal control codes)
40+
- **AND** the file is suitable for direct copy/paste into IDE slash commands or Copilot prompts without additional cleanup
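
The "Markdown-only" scenarios above lend themselves to a simple automated check. The sketch below is one possible way to assert them in a test; the function name `find_markdown_only_violations` and the three patterns are assumptions for illustration and are not part of the SpecFact test suite.

```python
import re
from typing import List

# ANSI CSI sequences such as "\x1b[1m" (bold) or "\x1b[0m" (reset).
_ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# HTML tags; Markdown autolinks like <https://example.com> do not match.
_HTML_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(?:\s[^>]*)?/?>")
# Named or numeric HTML entities such as &amp; or &#160;.
_HTML_ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def find_markdown_only_violations(text: str) -> List[str]:
    """Return the kinds of forbidden content found in a summarize prompt."""
    violations: List[str] = []
    if _HTML_TAG.search(text):
        violations.append("html_tag")
    if _HTML_ENTITY.search(text):
        violations.append("html_entity")
    if _ANSI_ESCAPE.search(text):
        violations.append("ansi_escape")
    return violations
```

A test could read the file written by `--summarize-to <path>` and assert that the returned list is empty. Note that this check is deliberately strict: it would also flag HTML that appears inside Markdown code spans or fences, which a real validator may want to allow.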
41+
