Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5826f84 to
b26da38
Compare
Introduce 16 regression tests covering the full verilib-cli pipeline (create -> atomize -> specify -> verify) with a Rust mock probe-verus binary for end-to-end testing without external dependencies. Test categories: - Full pipeline with mock probe-verus - Cross-stage data flow (atomize -> specify -> verify) - Quantitative regression (enrichment coverage, spec-text, verification) - "Structure verif proj" doc requirement validation (code-name precedence, .md immutability, cert timestamps, dependency correctness, required fields) - Idempotency (atomize and verify produce stable output) All assertions check exit codes and artifact contents, never stdout/stderr message formats, for resilience to implementation changes. Also adds a GitHub Actions CI workflow that runs fmt, clippy, and tests on pull requests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b26da38 to
f795185
Compare
When --no-probe is set, parse .md frontmatter directly using the existing structure module instead of calling probe-verus stubify. This makes atomize --no-probe fully self-contained and fixes CI failures where probe-verus is not available. Co-authored-by: Cursor <cursoragent@cursor.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This pull request introduces a comprehensive regression test suite for the verilib-cli pipeline along with a CI workflow. The PR adds 16 regression tests organized into 5 categories: full pipeline tests with a mock probe-verus binary, cross-stage data flow tests, quantitative regression tests, requirement validation tests (based on the "Structure for Verification Projects" document), and idempotency tests.
Changes:
- Added 16 regression tests in
tests/regression_test.rscovering the full create → atomize → specify → verify pipeline - Introduced a Rust-based mock probe-verus binary for end-to-end testing without external dependencies
- Added test fixtures and configuration files to support the test suite
- Added GitHub Actions CI workflow that runs
fmt,clippy, and tests on pull requests - Applied extensive formatting improvements across the codebase (clippy fixes, consistent indentation, line breaks)
- Enhanced atomize command with
load_stubs_from_md_filesfunction to support--no-probemode by directly parsing .md frontmatter
Reviewed changes
Copilot reviewed 34 out of 35 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/regression_test.rs | New comprehensive test suite with 16 tests across 5 categories covering full pipeline, data flow, quantitative metrics, requirements validation, and idempotency |
| tests/helpers/mock_probe_verus.rs | Mock probe-verus binary that copies fixture files based on subcommand for testing without external dependencies |
| tests/fixtures/*.{json,csv} | Test fixtures for atoms, specs, proofs, stubs, and tracked functions |
| tests/structure_commands_test.rs | Formatting improvements to existing tests |
| src/commands/atomize.rs | Added load_stubs_from_md_files function to support --no-probe mode |
| src/structure/*.rs | Formatting improvements (indentation, line breaks) |
| src/storage/*.rs | Formatting improvements and module reordering |
| src/download/*.rs | Formatting improvements |
| src/commands/*.rs | Extensive formatting improvements across all command files |
| src/main.rs, src/cli.rs, src/constants.rs | Minor formatting improvements |
| .github/workflows/ci.yml | New CI workflow for fmt, clippy, and test checks on PRs |
| Cargo.toml | Added mock-probe-verus binary configuration |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1f8ff36 to
7f94348
Compare
Merge regression_test.rs and structure_commands_test.rs into a single file with 27 behavioral tests (down from 40), organized into 7 modules: atomize, atomize_atoms_only, specify, verify, error_handling, create, and pipeline. Removed 13 tests: stdout/stderr message assertions (not behavioral), vacuous tests with if-guards that skip assertions, and exact duplicates between the two files. Every remaining test has a doc comment explaining the behavioral property it verifies. Also fix atomize --no-probe to parse .md frontmatter directly instead of calling probe-verus stubify, so the flag works without probe-verus installed. Co-authored-by: Cursor <cursoragent@cursor.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1617a8d to
71d11bd
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 34 out of 35 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Enrichment is idempotent: running atomize twice with the same inputs | ||
| /// must produce byte-identical stubs.json. | ||
| #[test] | ||
| fn two_runs_produce_identical_output() { | ||
| let tmp = setup_project(); | ||
|
|
||
| assert_success(&cli(&["atomize", "--no-probe"], tmp.path()), "first run"); | ||
| let first: serde_json::Value = read_json(&tmp.path().join(".verilib/stubs.json")); | ||
|
|
||
| assert_success(&cli(&["atomize", "--no-probe"], tmp.path()), "second run"); | ||
| let second: serde_json::Value = read_json(&tmp.path().join(".verilib/stubs.json")); | ||
|
|
||
| assert_eq!(first, second, "atomize must be idempotent"); | ||
| } |
There was a problem hiding this comment.
The doc comment claims the output is “byte-identical”, but the test compares parsed serde_json::Values (assert_eq!(first, second)), which only checks semantic JSON equality (formatting/key order differences won’t be detected). Either adjust the comment to match the assertion, or compare the raw file bytes/strings if true byte-level stability is required.
| /// Verification is idempotent: running verify twice with the same inputs | ||
| /// must produce byte-identical stubs.json. | ||
| #[test] | ||
| fn two_runs_produce_identical_output() { | ||
| let tmp = setup_project(); | ||
| cli(&["atomize", "--no-probe"], tmp.path()); | ||
|
|
||
| assert_success(&cli(&["verify", "--no-probe"], tmp.path()), "first run"); | ||
| let first: serde_json::Value = read_json(&tmp.path().join(".verilib/stubs.json")); | ||
|
|
||
| assert_success(&cli(&["verify", "--no-probe"], tmp.path()), "second run"); | ||
| let second: serde_json::Value = read_json(&tmp.path().join(".verilib/stubs.json")); | ||
|
|
||
| assert_eq!(first, second, "verify must be idempotent"); | ||
| } |
There was a problem hiding this comment.
This doc comment says verify should produce “byte-identical stubs.json”, but the test compares deserialized JSON values, not the file’s raw bytes. Please either reword the comment to “JSON-identical” (semantic equality) or change the assertion to compare the file contents if byte-level stability is intended.
| let tmp = setup_project_with_config("config_auto_validate.json"); | ||
| cli(&["atomize", "--no-probe"], tmp.path()); | ||
| assert_success(&cli(&["specify", "--no-probe"], tmp.path()), "specify"); |
There was a problem hiding this comment.
These tests run atomize --no-probe as a prerequisite but do not assert it succeeds. If atomize fails, the specify assertions may still pass using the pre-copied fixtures, which can mask regressions in cross-stage data flow. Please assert success on the atomize step (or remove it if it’s intentionally not part of the test’s setup).
| fn sets_verified_field_from_proofs() { | ||
| let tmp = setup_project(); | ||
| cli(&["atomize", "--no-probe"], tmp.path()); | ||
| assert_success(&cli(&["verify", "--no-probe"], tmp.path()), "verify"); |
There was a problem hiding this comment.
This verify test runs atomize --no-probe but doesn’t check its exit status. If atomize fails, the verify assertion may fail for the wrong reason or (depending on fixture state) still pass without actually exercising the intended pipeline. Please assert that the atomize step succeeds.
| let before = collect_md_checksums(&dir); | ||
| assert!(!before.is_empty()); | ||
|
|
||
| cli(&["verify", "--no-probe"], tmp.path()); | ||
|
|
There was a problem hiding this comment.
verify --no-probe is invoked here without asserting success. If verify fails, the checksum comparison may be skipped only because the test panics later, and the failure reason becomes unclear. Please assert the verify command’s exit status so the test failure points to the real regression.
| let rel_path = path | ||
| .strip_prefix(structure_root) | ||
| .unwrap_or(path) | ||
| .to_string_lossy() | ||
| .to_string(); | ||
| match parse_frontmatter(path) { | ||
| Ok(fm) => { | ||
| stubs.insert(rel_path, serde_json::to_value(fm)?); |
There was a problem hiding this comment.
load_stubs_from_md_files uses to_string_lossy() on a Path to derive stub keys. On Windows this will produce src\\module.rs\\... keys, while other parts of the pipeline/fixtures typically use forward slashes (e.g., src/module.rs/...). This can make stubs.json non-portable across platforms and can diverge from probe-verus’s key format. Consider normalizing the relative path to a consistent separator (e.g., /) before inserting into the stubs map.
| let rel_path = path | |
| .strip_prefix(structure_root) | |
| .unwrap_or(path) | |
| .to_string_lossy() | |
| .to_string(); | |
| match parse_frontmatter(path) { | |
| Ok(fm) => { | |
| stubs.insert(rel_path, serde_json::to_value(fm)?); | |
| let rel_path = path.strip_prefix(structure_root).unwrap_or(path); | |
| let rel_path = rel_path | |
| .to_string_lossy() | |
| .replace('\\', "/"); | |
| match parse_frontmatter(path) { | |
| Ok(fm) => { | |
| stubs.insert(rel_path.clone(), serde_json::to_value(fm)?); |
| let src = std::path::PathBuf::from(&fixtures).join(fixture_file); | ||
| if let Some(dest) = output_path { | ||
| if let Some(parent) = std::path::Path::new(dest.as_str()).parent() { | ||
| let _ = fs::create_dir_all(parent); | ||
| } |
There was a problem hiding this comment.
The mock only writes an output file when -o/--output is present; if it’s missing, the process exits successfully without producing anything. That can mask regressions (the caller thinks probe-verus succeeded, but artifacts are missing). Consider exiting non-zero when no output path is provided so failures are immediate and easier to diagnose.
| let tmp = setup_project(); | ||
| cli(&["atomize", "--no-probe"], tmp.path()); | ||
| assert_success( | ||
| &cli(&["atomize", "--no-probe", "--check-only"], tmp.path()), | ||
| "atomize --check-only", |
There was a problem hiding this comment.
In this test, the initial atomize --no-probe invocation is not asserted to succeed. If that setup step fails, the subsequent --check-only assertion may become misleading (or fail for the wrong reason). Please assert the first command’s exit status (e.g., via assert_success) to ensure the test only fails on the behavior under test.
| let tmp = setup_project(); | ||
| cli(&["atomize", "--no-probe"], tmp.path()); | ||
|
|
||
| assert_success(&cli(&["verify", "--no-probe"], tmp.path()), "first run"); | ||
| let first: serde_json::Value = read_json(&tmp.path().join(".verilib/stubs.json")); |
There was a problem hiding this comment.
This idempotency test runs atomize --no-probe as setup but doesn’t assert that it succeeded before comparing outputs. Please assert success on the atomize step to avoid false positives/unclear failures if atomize regresses.
Bare cli() calls used as test setup were silently discarding exit codes, which could mask failures and produce confusing downstream assertions. Wrap all setup calls with assert_success. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Closing as superseded by #37 |
Introduce 16 regression tests covering the full verilib-cli pipeline (create -> atomize -> specify -> verify) with a Rust mock probe-verus binary for end-to-end testing without external dependencies.
Test categories:
All assertions check exit codes and artifact contents, never stdout/stderr message formats, for resilience to implementation changes.
Also adds a GitHub Actions CI workflow that runs fmt, clippy, and tests on pull requests.