Summary
Each Z3 proof bundle is ~22 MB raw (45 functions, pmemlog). This is fine for a few projects, but will hit GitHub's repo size limits at scale. We need a storage strategy before scaling up.
Current numbers (pmemlog, 45 functions)
| Artifact | Raw size | Compressed (gzip) |
| --- | --- | --- |
| `smt_queries/` (45 `.smt2` files) | 12 MB | ~1.5 MB |
| `z3_proofs/` (45 `.proof` files) | 11 MB | ~1.4 MB |
| `proofs.json` | 64 KB | negligible |
| Total per bundle | ~22 MB | ~2.9 MB |
For comparison, 10 certifications of results + specs are only 1.2 MB total. The proof bundles are ~100x larger than results + specs combined.
Scaling projections
| Scenario | Raw git growth | Timeline to 5 GB soft limit |
| --- | --- | --- |
| 1 project, 1 cert/week | ~1.1 GB/year | ~4.5 years |
| 5 projects, monthly | ~1.3 GB/year | ~3.8 years |
| 20 projects, monthly | ~5.3 GB/year | < 1 year |
| 20 projects, weekly | ~23 GB/year | ~3 months |
Git compresses these files well (22 MB → 2.9 MB with gzip), so the raw figures above are an upper bound. But git never forgets: every bundle stays in the history forever.
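The projections above follow from simple arithmetic on the per-bundle size. A minimal sketch that reproduces them, assuming 52 certifications per year for "weekly", 12 for "monthly", and 1 GB = 1000 MB (the function names are illustrative, not part of any existing tooling):

```python
BUNDLE_MB = 22       # raw size of one proof bundle (pmemlog, 45 functions)
SOFT_LIMIT_GB = 5    # repo size soft limit used in the table


def yearly_growth_gb(projects: int, certs_per_year: int,
                     bundle_mb: float = BUNDLE_MB) -> float:
    """Raw repository growth per year in GB, assuming 1 GB = 1000 MB."""
    return projects * certs_per_year * bundle_mb / 1000


def years_to_limit(projects: int, certs_per_year: int,
                   limit_gb: float = SOFT_LIMIT_GB) -> float:
    """Years until the raw growth alone reaches the soft limit."""
    return limit_gb / yearly_growth_gb(projects, certs_per_year)


# (projects, certifications per project per year)
scenarios = {
    "1 project, weekly": (1, 52),
    "5 projects, monthly": (5, 12),
    "20 projects, monthly": (20, 12),
    "20 projects, weekly": (20, 52),
}

for name, (projects, certs) in scenarios.items():
    growth = yearly_growth_gb(projects, certs)
    years = years_to_limit(projects, certs)
    print(f"{name}: {growth:.1f} GB/year, {years:.2f} years to {SOFT_LIMIT_GB} GB")
```

Passing `bundle_mb=2.9` gives the same projection for compressed bundles.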
GitHub limits
- Repo size soft limit: 5 GB (GitHub warns)
- Repo size hard limit: 100 GB (GitHub may restrict pushes)
- Single file limit: 100 MB (we're fine — largest is 2.2 MB)
- Git LFS: 1 GB free storage, 1 GB bandwidth/month
Alternatives to evaluate
Option A: Git LFS (easiest migration)
- Move `.smt2` and `.proof` files to Git LFS
- Git repo stays small; LFS stores blobs externally
- Pros: minimal code changes, same workflow
- Cons: still paying for storage ($5/month per 50 GB), LFS has bandwidth limits
Option B: Keep only latest, archive to external storage
- Store only `proofs/latest/` in git (overwrite each time; no history accumulation)
- Archive timestamped bundles to IPFS, S3, or GitHub Releases
- Store the content-addressable hash in `history.json` for verification
- Pros: git stays lean forever; external storage is cheap
- Cons: needs upload/fetch code for external store
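The hash-in-`history.json` step of Option B could look roughly like the sketch below. It assumes SHA-256 as the content hash and a `history.json` that is a JSON list of entries. The actual schema is not specified here, so the field names and the `record_archive` / `verify_bundle` helpers are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bundles are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_archive(history_path: Path, bundle_path: Path, location: str) -> dict:
    """Append an entry for an archived bundle to history.json (assumed schema)."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "bundle": bundle_path.name,
        "sha256": sha256_of_file(bundle_path),
        "location": location,  # wherever the bundle was uploaded
    }
    history = json.loads(history_path.read_text()) if history_path.exists() else []
    history.append(entry)
    history_path.write_text(json.dumps(history, indent=2) + "\n")
    return entry


def verify_bundle(bundle_path: Path, entry: dict) -> bool:
    """Check a fetched bundle against the hash recorded in history.json."""
    return sha256_of_file(bundle_path) == entry["sha256"]
```

The upload itself is left out on purpose: it depends on the chosen store, and only the recorded hash is needed to verify a bundle after fetching it back.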
Option C: IPFS (most aligned with decentralization)
- Upload each proof bundle to IPFS, get a CID
- Store only the CID in `history.json` (alongside the Merkle hash on-chain)
- Anyone can pin and retrieve the bundle via CID
- Pros: content-addressable (CID = hash of content), decentralized, permanent
- Cons: needs pinning service (Pinata, web3.storage) for reliability
Option D: GitHub Releases as artifact storage
- Attach compressed proof bundles (`.tar.gz`) as release assets
- `history.json` links to the release asset URL
- GitHub Releases have a 2 GB per-file limit and no total cap on releases
- Pros: free, integrated with GitHub, no new infrastructure
- Cons: not content-addressable, relies on GitHub availability
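Producing the compressed asset for Option D needs only the standard library. A minimal sketch, assuming a bundle is a single directory on disk (the directory layout and helper names are illustrative); attaching the result to a release is a separate step not shown here:

```python
import tarfile
from pathlib import Path


def directory_size(directory: Path) -> int:
    """Total size in bytes of all regular files under a directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def compress_bundle(bundle_dir: Path, out_path: Path) -> Path:
    """Pack a proof bundle directory into a gzip-compressed tarball."""
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the bundle name
        tar.add(bundle_dir, arcname=bundle_dir.name)
    return out_path
```

Comparing `directory_size(bundle_dir)` with `out_path.stat().st_size` after compression gives the per-bundle ratio to check against the table of current numbers.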
Option E: Compress + deduplicate in-repo
- Store bundles as `.tar.gz` instead of expanded directories
- Deduplicate shared SMT preambles (most `.smt2` files share ~80% of content)
- Could reduce per-bundle size from 22 MB to ~3 MB
- Pros: no external dependencies
- Cons: still grows linearly, just slower
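One way the preamble deduplication in Option E might work is to store the shared leading lines once and keep only each file's remainder. The sketch below assumes the shared content is an identical block of lines at the start of every `.smt2` file, which has not been verified against real bundles; the function names are hypothetical.

```python
from typing import List, Tuple


def split_shared_preamble(files: List[List[str]]) -> Tuple[List[str], List[List[str]]]:
    """Split files (as lists of lines) into a common leading block and remainders."""
    if not files:
        return [], []
    shared = 0
    # Walk all files line by line in lockstep until the first difference.
    for lines in zip(*files):
        if all(line == lines[0] for line in lines):
            shared += 1
        else:
            break
    preamble = files[0][:shared]
    remainders = [f[shared:] for f in files]
    return preamble, remainders


def restore(preamble: List[str], remainder: List[str]) -> List[str]:
    """Rebuild the original file from the shared preamble and its remainder."""
    return preamble + remainder
```

If the shared content turns out to be scattered through the files instead of sitting at the top, compressing the whole bundle as one archive may capture most of the same redundancy without custom code.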
Recommended approach
- Short-term (now → 10 projects): Keep current approach. Sizes are manageable.
- Medium-term (10+ projects): Option B + D. Keep only `latest` in git, attach compressed bundles to GitHub Releases, and store the release URL + hash in `history.json`.
- Long-term: Option C (IPFS) — proof bundle CID alongside on-chain Merkle hash for full decentralization alignment.
Related