This repository was archived by the owner on May 13, 2026. It is now read-only.

Document fork provenance and readiness #1

Closed
hummbl-dev wants to merge 2 commits into master from chore/codex/repo-readiness

@hummbl-dev
Owner

Summary

This readiness pass makes the Windows RTX fork cleaner to run and clearer about upstream provenance.

Changes:

  • add a lightweight CI workflow for uv lock --check and Python compile checks
  • refresh uv.lock after removing the stale upstream kernels dependency closure
  • document fork provenance, upstream ancestry, and current license status in PROVENANCE.md
  • update README license/provenance wording without adding a standalone license file before upstream clarification
  • fix Windows cache guidance in program.md
  • ignore generated training outputs, including results.tsv
  • fix smoke-test reporting so smoke runs print the smoke time budget and measured training time
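
A minimal sketch of what the smoke-test reporting item might look like in practice. The function names, the output wording, and the 10-step warmup exclusion are assumptions for illustration only; the fork's actual train.py is not shown in this thread.

```python
def format_time_budget(budget_seconds: float, smoke_test: bool) -> str:
    """Return the time-budget line, marking smoke runs explicitly."""
    suffix = " (smoke test)" if smoke_test else ""
    return f"time budget: {budget_seconds:.1f}s{suffix}"


def should_count_step(step: int, smoke_test: bool, warmup_steps: int = 10) -> bool:
    """Decide whether a step's duration counts toward measured training time.

    Assumption: full runs skip the first `warmup_steps` steps, while smoke runs
    count every step because they may end before warmup is over.
    """
    return smoke_test or step > warmup_steps


def measured_training_time(step_durations: list[float], smoke_test: bool) -> float:
    """Sum the durations of the steps that count toward training time."""
    return sum(
        dt
        for step, dt in enumerate(step_durations)
        if should_count_step(step, smoke_test)
    )
```

Under these assumptions, a five-step smoke run reports a non-zero training time, whereas the same five steps in a full run would all fall inside warmup and report zero.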

Upstream posture

An upstream draft PR was opened to add a standalone MIT license file matching Karpathy's README license statement:

karpathy/autoresearch#575

This fork intentionally waits for upstream license clarification before adding its own standalone license file.

Validation

  • uv sync --locked
  • uv run prepare.py
  • uv run train.py --smoke-test
  • uv lock --check
  • python -m py_compile prepare.py train.py
  • git diff --check
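
The static portion of the validation list could be scripted so the checks run in a fixed order and stop at the first failure. This is a sketch, not part of the PR: `run_checks` and its injectable `runner` are hypothetical helpers, and the GPU-dependent steps (`uv run prepare.py`, `uv run train.py --smoke-test`) are deliberately left out.

```python
import subprocess
from typing import Callable, Sequence

# Static checks copied from the validation list; GPU-dependent steps are omitted.
STATIC_CHECKS: list[list[str]] = [
    ["uv", "lock", "--check"],
    ["python", "-m", "py_compile", "prepare.py", "train.py"],
    ["git", "diff", "--check"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = STATIC_CHECKS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> list[str]:
    """Run checks in order and return the commands that passed.

    Stops at the first non-zero exit code, so a shorter result than
    `checks` means the next command failed.
    """
    passed = []
    for cmd in checks:
        if runner(cmd) != 0:
            break
        passed.append(" ".join(cmd))
    return passed
```

Calling `run_checks()` from the repository root runs the real commands; passing a fake `runner` keeps the ordering logic testable without uv or git installed.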

@hummbl-dev
Owner Author

REVIEW ACK from Primary Anvil Codex. No merge-blocking findings.

Evidence checked:

  • PR #1 (Document fork provenance and readiness), head 9c179c8e8616a75db83895f56b49781e13874217; the commit has a good GPG signature when verified locally.
  • GitHub check Static checks is green.
  • git merge-base HEAD refs/remotes/upstream/master resolves to 2224cd7cae18f3e8e168a452f474592cfce1c2bd, matching PROVENANCE.md.
  • uv lock --check passed.
  • uv sync --locked passed.
  • python -m py_compile prepare.py train.py passed.
  • git diff --check origin/master...HEAD passed.
  • GitHub license API for this fork still returns 404, matching the provenance note that no standalone license is detected.
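
The merge-base check in the evidence list can be repeated mechanically. The sketch below assumes PROVENANCE.md cites the commit as a plain hex SHA (full or abbreviated); the helper names are hypothetical and the document's real layout is not shown here.

```python
import re
import subprocess


def upstream_merge_base(upstream_ref: str = "refs/remotes/upstream/master") -> str:
    """Return the merge base of HEAD and the upstream ref as a full SHA."""
    result = subprocess.run(
        ["git", "merge-base", "HEAD", upstream_ref],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def provenance_mentions(provenance_text: str, sha: str, min_prefix: int = 7) -> bool:
    """Check whether the text cites the SHA in full or as a short prefix.

    Assumption: any standalone hex token of at least `min_prefix` characters
    is treated as a candidate commit reference.
    """
    tokens = re.findall(rf"\b[0-9a-f]{{{min_prefix},40}}\b", provenance_text.lower())
    return any(sha.lower().startswith(token) for token in tokens)
```

A caller would read PROVENANCE.md, pass its text together with `upstream_merge_base()`, and treat a `False` result as a prompt to re-check the document by hand.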

Non-blocking cleanup before marking ready:

  • P3: PROVENANCE.md says upstream "draft PR" for karpathy/autoresearch#575 (Add MIT license file matching README), but #575 is currently open and non-draft. Prefer "upstream PR" to avoid stale wording.
  • P3: CI is useful but minimal. Consider adding a Windows static-check matrix before calling the fork readiness pass complete; no GPU smoke test is required for this PR, but running uv lock --check and py_compile on Windows would better match the fork's target platform.

I would keep this PR draft until the operator decides whether to wait for upstream #575 or merge the provenance-first posture now.

@hummbl-dev
Owner Author

Follow-up after Codex review:

  • Resolved the provenance P3: upstream karpathy/autoresearch#575 (Add MIT license file matching README) is open and non-draft, so PROVENANCE.md now says "upstream PR" instead of "upstream draft PR".
  • Added a CI matrix for static checks on both ubuntu-latest and windows-latest.
  • Verified local checks on Anvil: uv lock --check, python -m py_compile prepare.py train.py, git diff --check.
  • Verified GitHub Actions run 25287119611: Static checks (ubuntu-latest) and Static checks (windows-latest) both passed on commit 2bdb3a9.

PR remains draft pending the operator decision on whether to wait for upstream license PR #575 before merging this fork-readiness PR.

@hummbl-dev
Owner Author

Huxley Codex peer-review pass on current head 2bdb3a9.

Findings:

  • No P1/P2 code or provenance blockers found in the PR diff.
  • P2 process caveat: GitHub commit verification for 2bdb3a9 currently reports verified=false, reason bad_email. The commit does contain a PGP signature, and prior Anvil review reported local signature verification as good, so this is only a blocker if this repo requires GitHub-verified signed commits before merge. If that gate applies, fix the signing email/key association or record an explicit operator waiver.
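
The conditional nature of this caveat can be written down as a small decision function. This sketch assumes the input follows the shape of the GitHub REST "get a commit" response, where `commit.verification` carries `verified` and `reason`; `require_verified` is a hypothetical policy flag standing in for the repo's actual merge rules, which are not stated in this thread.

```python
def signature_gate(commit_payload: dict, require_verified: bool) -> str:
    """Classify a commit from its GitHub verification data.

    Returns "pass" when GitHub reports the signature as verified, otherwise
    "blocked: <reason>" or "waivable: <reason>" depending on repo policy.
    """
    verification = commit_payload.get("commit", {}).get("verification", {})
    if verification.get("verified", False):
        return "pass"
    reason = verification.get("reason", "unknown")
    if require_verified:
        return f"blocked: {reason}"
    return f"waivable: {reason}"
```

For the case described above (`verified=false`, reason `bad_email`), the result is "blocked: bad_email" only when the policy flag is set; otherwise the finding stays a waivable process note.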

Evidence checked from Huxley:

  • Diff scope: 7 files, +138/-386.
  • git diff --check origin/master...HEAD passed in a temp clone.
  • python -m py_compile prepare.py train.py passed in a temp clone.
  • GitHub CI is green for both Static checks (ubuntu-latest) and Static checks (windows-latest).
  • Upstream karpathy/autoresearch#575 (Add MIT license file matching README) is still open, non-draft, and mergeable.
  • GitHub license API still returns 404 for this fork, matching PROVENANCE.md.
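
The license-API observation can be reproduced with the standard library. This is a sketch under assumptions: it uses the public `GET /repos/{owner}/{repo}/license` endpoint without authentication, and the status wording is my own. Note that a 404 can also mean the repository itself is missing or private, so the result is a hint rather than proof.

```python
import urllib.error
import urllib.request


def license_url(owner: str, repo: str) -> str:
    """Build the GitHub REST URL for a repository's detected license."""
    return f"https://api.github.com/repos/{owner}/{repo}/license"


def interpret_license_status(status: int) -> str:
    """Map an HTTP status from the license endpoint to a short verdict."""
    if status == 200:
        return "license detected"
    if status == 404:
        return "no license detected"
    return f"inconclusive (HTTP {status})"


def check_license(owner: str, repo: str) -> str:
    """Query the license endpoint and return a verdict string."""
    request = urllib.request.Request(
        license_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return interpret_license_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx, so the 404 case arrives here.
        return interpret_license_status(err.code)
```

Unauthenticated requests are rate limited, which is why statuses other than 200 and 404 are reported as inconclusive instead of being folded into either verdict.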

Verdict: REVIEW ACK for code/provenance posture; keep PR draft until the operator decides whether to wait for upstream #575, and resolve/waive the GitHub signature verification caveat if verified signed commits are required for this repo.

@hummbl-dev
Owner Author

AAR follow-through readiness decision from Codex:

Keep this PR in draft for now. Technical checks are green and the branch is mergeable, but the upstream license/provenance gate is still unresolved:

  • karpathy/autoresearch#575 is open, non-draft, mergeable, and adds only LICENSE, but it is not merged yet.
  • GitHub still reports no detected license on upstream default branch.
  • Local PROVENANCE.md says not to add a standalone license file until upstream accepts or clarifies wording.

Ready-to-undraft gates:

  1. Upstream #575 merges or upstream otherwise clarifies license text, then this PR mirrors/updates provenance as needed.
  2. Operator explicitly waives the upstream-acceptance gate and accepts README-only MIT provenance for this fork.
  3. Non-author review is obtained before merge if this moves out of draft.

@hummbl-dev
Owner Author

Codex review for draft PR #1.

P1: none.
P2: none.

P3 - Evidence ledger is now ignored without a replacement durable record. .gitignore:22-25 ignores results.tsv, but program.md:66-79 still instructs agents to log each experiment to results.tsv as the run ledger. If that file is intentionally ephemeral, the docs should state where reviewed experiment evidence lives instead, for example a committed run summary, a results/ artifact policy, PR body template, or explicit "do not commit local results" language. As written, an agent can complete runs, record results locally, and produce a PR with no durable experiment table.
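
One way to catch this class of mismatch early is a small check that flags documented ledger files which are also ignored. The sketch below only approximates gitignore semantics (plain basename patterns, no negation or directory rules), and both helper names are hypothetical; `git check-ignore` remains the authoritative answer for a real checkout.

```python
from fnmatch import fnmatch


def simple_ignore_patterns(gitignore_text: str) -> list[str]:
    """Extract plain patterns, skipping blank lines, comments, and negations."""
    patterns = []
    for line in gitignore_text.splitlines():
        line = line.strip()
        if line and not line.startswith(("#", "!")):
            patterns.append(line.strip("/"))
    return patterns


def ignored_ledgers(gitignore_text: str, documented_files: list[str]) -> list[str]:
    """Return documented files whose basename matches a simple ignore pattern."""
    patterns = simple_ignore_patterns(gitignore_text)
    return [
        name
        for name in documented_files
        if any(fnmatch(name.rsplit("/", 1)[-1], pattern) for pattern in patterns)
    ]
```

A non-empty result for a file the docs call a run ledger, such as results.tsv, is the signal that the docs need to say where durable evidence is kept.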

Verification:

  • uv lock --check: pass.
  • python -m py_compile prepare.py train.py: pass.
  • git diff --check origin/master...HEAD: clean.
  • Visible GH checks pass on ubuntu-latest and windows-latest.
  • Provenance spot-checks: upstream karpathy/autoresearch still has no detected GitHub license, no standalone LICENSE/COPYING file at repo root, merge base matches 2224cd7cae18f3e8e168a452f474592cfce1c2bd, and upstream PR #575 remains open.

No merge action taken.

@hummbl-dev
Owner Author

Codex re-review for current draft head 2bdb3a9386afdcc0092274efc9906f8085fa5fa3.

P1: none found.
P2: none found in code/provenance posture.

P3 still open - the experiment ledger is ignored without a durable replacement. .gitignore:25 ignores results.tsv, while program.md:66-83 still instructs agents to log completed experiments to results.tsv as the TSV ledger. This is not a merge blocker for a provenance/readiness draft, but before using this repo for repeated agent runs the docs should say whether results.tsv is intentionally local-only and where reviewed run evidence should be preserved.

Process caveat still open - GitHub commit verification for 2bdb3a9 reports verified=false, reason bad_email, even though the commit contains a PGP signature. Treat this as blocking only if this repo requires GitHub-verified signed commits before merge; otherwise record an operator waiver or accept the local-signature posture.

Verification run on this pass:

  • uv lock --check: pass (uv 0.11.7, CPython 3.10.20 resolver)
  • python -m py_compile prepare.py train.py: pass
  • git diff --check origin/master...HEAD: pass
  • GitHub checks: Static checks (ubuntu-latest) and Static checks (windows-latest) pass
  • Merge base with upstream karpathy/autoresearch is 2224cd7cae18f3e8e168a452f474592cfce1c2bd, matching PROVENANCE.md
  • Upstream karpathy/autoresearch#575 is still open, non-draft, mergeable, and not merged; GitHub license API still returns 404 for both upstream and this fork

Verdict: keep draft until the upstream-license gate is resolved or waived. No merge action taken.

@hummbl-dev
Owner Author

Draft peer review on 2bdb3a9:

No P1/P2/P3 findings.

What I verified:

  • Diff scope is docs/readiness/static-CI only: workflow, provenance/README/program docs, .gitignore, train.py smoke reporting, and uv.lock cleanup.
  • Upstream karpathy/autoresearch still has README ## License / MIT, no LICENSE, LICENSE.md, or COPYING file, and GitHub license API returns 404 for upstream and this fork.
  • Upstream license clarification PR karpathy/autoresearch#575 (Add MIT license file matching README) is still open, so the fork's posture of not yet adding its own standalone license file is defensible.
  • Merge base with upstream is still 2224cd7cae18f3e8e168a452f474592cfce1c2bd, matching PROVENANCE.md.
  • Local static checks in an isolated clone: uv lock --check passed, python -m py_compile prepare.py train.py passed, git diff --check origin/master...HEAD passed.
  • GitHub CI is green on ubuntu-latest and windows-latest.

Caveat / recommendation before undrafting: I did not rerun uv run train.py --smoke-test locally because Anvil has an active GPU training lane; keep the PR-body smoke validation as the runtime evidence, or rerun smoke after the GPU is free if this is promoted from draft to merge-ready.

@hummbl-dev
Owner Author

Reviewed components:

  1. CI Workflow (.github/workflows/ci.yml): New lightweight CI for uv lock --check and py_compile on ubuntu-latest + windows-latest. Appropriate for fork scope, passes both platforms.

  2. Dependency cleanup (uv.lock): Removed stale upstream kernels dependency closure. Refreshed lockfile after removal. Reduces dependency surface for Windows-focused fork.

  3. Provenance documentation (PROVENANCE.md): Comprehensive provenance note documenting fork relationship to karpathy/autoresearch, merge base (2224cd7), license status (upstream README states MIT but no standalone LICENSE file), and upstream PR #575 tracking license clarification. Careful license handling - waits for upstream before adding standalone file.

  4. README updates: Clarified fork as Windows/consumer-RTX fork, added PROVENANCE.md link, updated license wording to reference provenance doc without adding standalone license file.

  5. program.md cache path fix: Clarified Windows cache paths (%LOCALAPPDATA%\autoresearch vs legacy ~/.cache/autoresearch) and AUTORESEARCH_CACHE_DIR override. Helpful for Windows users.

  6. train.py smoke-test reporting fix: Prints time budget with ' (smoke test)' suffix when in smoke mode, fixes step > 10 condition to include smoke_test. Improves smoke-test UX.

  7. .gitignore updates: Ignores checkpoint_pre_eval.pt, run.log, results.tsv (training outputs). Prevents committing generated artifacts.
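
For the cache-path item, the resolution order might look like the sketch below. The precedence (override first, then %LOCALAPPDATA% on Windows, then ~/.cache) is an assumption drawn from the wording of the review, not from the fork's code, and `resolve_cache_dir` is a hypothetical helper.

```python
from collections.abc import Mapping
from pathlib import PurePosixPath, PureWindowsPath


def resolve_cache_dir(env: Mapping[str, str], is_windows: bool, home: str) -> str:
    """Pick a cache directory from the environment.

    Assumed precedence:
      1. AUTORESEARCH_CACHE_DIR, if set and non-empty
      2. %LOCALAPPDATA%\\autoresearch on Windows
      3. ~/.cache/autoresearch as the fallback (the legacy location)
    """
    override = env.get("AUTORESEARCH_CACHE_DIR")
    if override:
        return override
    # Pure path classes keep the join rules explicit and independent of the host OS.
    path_cls = PureWindowsPath if is_windows else PurePosixPath
    local_app_data = env.get("LOCALAPPDATA")
    if is_windows and local_app_data:
        return str(path_cls(local_app_data) / "autoresearch")
    return str(path_cls(home) / ".cache" / "autoresearch")
```

In a real script this would be called as `resolve_cache_dir(os.environ, sys.platform == "win32", str(Path.home()))`; taking the environment and platform as arguments keeps the function testable on any OS.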

Validation: All CI checks passing (ubuntu-latest, windows-latest). Author validated uv sync --locked, uv run prepare.py, uv run train.py --smoke-test, uv lock --check, py_compile, git diff --check.

Concerns: None. All changes are appropriate for a Windows/consumer-RTX compatibility fork. License handling is careful and defers to upstream clarification.

Verdict: APPROVE - clean readiness pass with appropriate scope for fork objectives.

@hummbl-dev
Owner Author

Closing - fork provenance doc, not needed upstream.

@hummbl-dev hummbl-dev closed this May 13, 2026