csaf-poc/ghsa

GHSA to CSAF Advisory Converter

This repository provides tooling to transform GitHub Security Advisories (GHSA) into Common Security Advisory Framework (CSAF) 2.0 advisories. Its primary focus is a converter that ingests GHSA JSON (from the GitHub API) and produces a valid CSAF advisory file.


Motivation

This project is an exploratory, proof‑of‑concept attempt to see how far a direct GHSA → CSAF mapping can go without extensive manual curation. It is neither a complete nor an authoritative converter, and the output should be treated as a starting point for experimentation, review, and potential enrichment.

Why explore this?

  • To surface the practical friction points when aligning an informal ecosystem advisory format (GHSA) with a formal standard (CSAF).
  • To provide a lightweight sandbox for evaluating the viability of adopting CSAF for open source package advisories.
  • To document gaps and edge cases rather than to claim seamless interoperability.

What this is NOT:

  • A drop‑in production adapter guaranteeing schema completeness or semantic fidelity in all cases.
  • Exhaustive coverage of all CSAF fields or advanced constructs (distribution, localization, signature handling, etc.).
  • A normalization engine that infers missing business/vendor context.

Expectation management & disclaimers:

  • Information loss is unavoidable: many CSAF fields lack source data in GHSA and are intentionally omitted.
  • Some field representations are simplified or flattened (e.g., product taxonomy) to keep the prototype maintainable.
  • No guarantee that version range formatting or score sets match best‑practice CSAF authoring guidelines.
  • Consumers should perform validation and apply domain-specific post‑processing before relying on the output.

Value of the prototype:

  • Makes differences concrete by producing tangible CSAF documents from real GHSA examples.
  • Highlights where additional metadata or tooling would be required for a robust pipeline.
  • Serves as a foundation others can iterate on (fill gaps, strengthen mappings, add validation).

Optional Sections (Product Tree & Vulnerabilities)

Both product_tree and vulnerabilities are optional in the CSAF 2.0 specification; a syntactically valid document can consist solely of the mandatory document section. For clarity and experimentation this prototype chooses to populate them when GHSA provides enough input.

Heuristic choices are applied when building these sections; for example, the ecosystem becomes a top-level language/category branch.

If these assumptions do not align with a consumer’s taxonomy strategy or introduce a risk of misinterpretation, generation of these sections could be skipped or pruned in a future version, yielding a leaner CSAF advisory focused only on tracking metadata.


Limitations and Differences

CSAF Document

| Aspect | GHSA Source | CSAF Expectation | Result / Handling / Assumption |
| --- | --- | --- | --- |
| Acknowledgments | credits_detailed entries | acknowledgments[] optional | Mapped per entry: Names = user.login, Organization = organizations_url, URLs = html_url, Summary via credit type |
| Aggregate severity | severity string | aggregate_severity.text | Direct mapping to text; namespace omitted |
| Category | n/a | document.category required | Fixed constant from config (Security Advisory) |
| CSAF version | n/a | csaf_version required | Fixed to CSAF 2.0 |
| Distribution (TLP) | n/a | document.distribution | Set TLP to White by default |
| Language (lang) | n/a | Optional | Default to en (GHSA does not provide language) |
| Notes | summary, description | Optional notes[] | Two notes created: Summary + Description |
| Publisher: Category | n/a | Category (e.g., coordinator/discoverer/other) | Use Discoverer |
| Publisher: Issuing Authority | n/a | Issuer | Use GitHub |
| Publisher: Name | user.login, user.name | Single name | Use login because it is always set |
| Publisher: Namespace | user.html_url | URI/namespace | Use HTMLURL as namespace |
| Publisher: Contact details | user.html_url, optional user.email | Optional contact string | Compose: URL: <html_url>; email: <email> if present |
| References | URLs in GHSA body | Optional references array | Not populated |
| Source language | n/a | Optional | Not populated |
| Title | summary | Required | Use summary; nil if empty |
| Tracking: Aliases | identifiers[] | Optional list | Map all GHSA identifiers to aliases |
| Tracking: ID | ghsa_id | Required | Use GHSA ID |
| Tracking: Initial release | published_at | Required (ISO 8601) | Use published_at |
| Tracking: Current release | updated_at if > published_at | Required (ISO 8601) | Use updated_at if newer, else published_at |
| Tracking: Revision history | published_at, updated_at | Required | Synthesized: 1 = published, 2 = updated (if newer); numbers via strconv.Itoa |
| Tracking: Status | n/a | Required | Fixed to final |
| Tracking: Version | n/a | Required | Length of revision history (as decimal string) |
| Digital signatures | n/a | Optional signing metadata | Not populated |

Product Tree

| Aspect | GHSA Source | CSAF Expectation | Result / Handling / Assumption |
| --- | --- | --- | --- |
| Branch hierarchy | Ecosystem → package name → vulnerable range | Hierarchical branches | Implemented as Language → ProductName → ProductVersionRange → Product. |
| Ecosystem category | package.ecosystem | Category label | Mapped to Language category. Assumption: Language denotes the programming language. Other categories are possible, e.g. Vendor if it is GitHub or the package owner. |
| Product name display | package.name and repository path | Full product name | Derived via getRepositoryName: third path segment (e.g., github.com/org/repo → repo), fallback to full package name. |
| Product ID | package.name | Stable identifier | Use full package name as product_id. |
| Version range formatting | vulnerable_version_range | Clean canonical ranges | Whitespace normalized and comparison operators replaced to avoid JSON HTML escaping (see normalizeOperators and JSON Encoding Notes). |
| Multi-product relationships | Multiple packages per advisory | Cross-product mapping | Each package is a separate branch; no merging across packages. |
| Full product names | Consolidated list | full_product_names[] | Populated alongside branches for all products. |
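
As an illustration of the "Product name display" row, the following sketch takes the third slash-separated segment of a package name and falls back to the full name otherwise. The function name repositoryName is hypothetical, and the actual getRepositoryName in this repository may handle more cases or read "third path segment" differently.

```go
package main

import (
	"fmt"
	"strings"
)

// repositoryName returns the third path segment of a package name such as
// "github.com/org/repo". Names with fewer than three segments (for example
// "lodash" or "@scope/pkg") are returned unchanged.
// Hypothetical sketch; not the converter's getRepositoryName.
func repositoryName(pkg string) string {
	parts := strings.Split(pkg, "/")
	if len(parts) >= 3 && parts[2] != "" {
		return parts[2]
	}
	return pkg
}

func main() {
	for _, p := range []string{"github.com/org/repo", "@scope/pkg", "lodash"} {
		fmt.Printf("%s -> %s\n", p, repositoryName(p))
	}
}
```

Because the product_id uses the full package name, the shortened display name does not need to be unique across packages.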

Vulnerabilities

| Aspect | GHSA Source | CSAF Expectation | Result / Handling / Assumption |
| --- | --- | --- | --- |
| Vulnerability count | Single advisory with multiple CWEs | One CWE per flaw | Single CSAF Vulnerability generated; CWEs reduced to the first (primary) one to respect 1:1 CWE constraint. |
| IDs | ghsa_id and cve_id | Multiple identifiers | IDs[] includes GHSA ID; CVE set if provided. |
| CWE mapping | cwes[] | One CWE per vulnerability | Map first CWE (id and name), omit others. |
| References | Advisory URL (html_url) | Typed references | One external reference pointing to GHSA HTML page with summary "Advisory HTML URL". |
| Product status | Affected packages | known_affected, fixed, etc. | All products derived from product tree marked as KnownAffected; no unaffected or fixed breakdown yet. |
| Scores (CVSS) | cvss_severities, cvss legacy | CVSS3 with version | Prefer CVSS v3.1/v3.0 vectors; legacy CVSS used if v3 absent; unsupported vectors are skipped. Severity derived from base score. |
| Remediations | patched_versions per vulnerability | Remediation entries | If patched versions exist, one VendorFix remediation with details "Upgrade to version: " and product IDs attached. |
| Discovery / release dates | Not distinct in GHSA | Optional fields | Omitted; document tracking covers publish/update. |
| Threats / VEX flags | Not present | Optional | Omitted. |
| Notes / Title | Advisory-level text | Optional per-vuln | Omitted to avoid duplication; document notes/title already describe the advisory. |

Not performed: semantic rewriting of version operators, inference of missing ecosystem data, enrichment of vendor/product naming conventions beyond what GHSA provides.
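
The CVSS handling described in the Vulnerabilities table can be sketched roughly as follows. The Score type and the pickV3 and severity functions are hypothetical simplifications, not the converter's code; the severity thresholds follow the CVSS v3.x qualitative severity rating scale.

```go
package main

import (
	"fmt"
	"strings"
)

// Score is a simplified, hypothetical view of one GHSA CVSS entry.
type Score struct {
	Vector string
	Base   float64
}

// pickV3 prefers the dedicated v3 entry and falls back to the legacy field
// only when that field carries a CVSS 3.x vector; anything else is skipped.
func pickV3(v3, legacy *Score) (Score, bool) {
	for _, s := range []*Score{v3, legacy} {
		if s != nil && (strings.HasPrefix(s.Vector, "CVSS:3.1/") || strings.HasPrefix(s.Vector, "CVSS:3.0/")) {
			return *s, true
		}
	}
	return Score{}, false
}

// severity maps a base score to the CVSS v3.x qualitative rating.
func severity(base float64) string {
	switch {
	case base >= 9.0:
		return "CRITICAL"
	case base >= 7.0:
		return "HIGH"
	case base >= 4.0:
		return "MEDIUM"
	case base > 0.0:
		return "LOW"
	default:
		return "NONE"
	}
}

func main() {
	legacy := &Score{Vector: "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", Base: 9.8}
	if s, ok := pickV3(nil, legacy); ok {
		fmt.Println(s.Vector, severity(s.Base))
	}
}
```

An advisory that only carries a non-v3 vector yields no score entry in this sketch, which matches the "unsupported vectors are skipped" behaviour in the table.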


Design and Workflow

High-level pipeline:

  1. Input acquisition: GHSA advisory JSON loaded into internal model (models/ghsa/...).
  2. Conversion orchestrator (service/converter/) assembles CSAF structures.
  3. Product tree construction (producttree.go): branches categorized (language → product → version range → product instance).
  4. Vulnerability section (vulnerabilities.go): maps GHSA vulnerabilities, identifiers, scores, references.
  5. Revision history (document.go): timestamps converted into sequential revision numbers (an earlier control-character issue was fixed by using a proper integer → string conversion, strconv.Itoa).
  6. Tracking and metadata: populates minimal CSAF tracking fields from GHSA advisory context.
  7. Serialization: CSAF advisory saved (current path uses upstream library defaults; HTML escaping of < may appear as \u003c).

Development notes:

  • Project layout mirrors responsibility (converter, downloader, store, models, schemas).
  • Utilities (internal/utils/ref.go) help with pointer/value wrapping to reduce noise when assembling CSAF structs.
  • Tests exist for product tree and downloader components to validate structure and basic behaviors.
  • Incremental enhancements can extend mapping coverage without breaking existing schema usage.
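
To show what pointer/value wrapping helpers of this kind typically look like, here is a minimal generic sketch. The name Ref and the Tracking struct are assumptions for illustration only; the helpers in internal/utils/ref.go and the gocsaf model types may be named and shaped differently.

```go
package main

import "fmt"

// Ref returns a pointer to a copy of v, so literals can be assigned to
// pointer-typed struct fields without a temporary variable.
// Hypothetical name; the repository's helper may differ.
func Ref[T any](v T) *T { return &v }

// Tracking is a trimmed, hypothetical struct with optional pointer fields,
// in the style commonly used for JSON models.
type Tracking struct {
	ID     *string
	Status *string
}

func main() {
	t := Tracking{
		ID:     Ref("GHSA-mh63-6h87-95cp"),
		Status: Ref("final"),
	}
	fmt.Println(*t.ID, *t.Status)
}
```

Without such a helper, each field would need its own named variable before its address could be taken, which is the noise the development note refers to.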

Usage

Prerequisites: Go ≥ 1.21.

Install dependencies:

go mod download

Run converter (example):

go run ./cmd --input examples/repository_GHSA/GHSA-mh63-6h87-95cp.json --output out/csaf.json

(The flags shown are placeholders; adjust them to the actual command interface.)

Inspect output:

jq '.' out/csaf.json

Data Mapping (GHSA → CSAF)

  • GHSA advisory ID → document.tracking.id and vulnerability IDs list.
  • CVE (if present) → vulnerabilities[].cve.
  • Package ecosystem/name → Product tree branches.
  • Severity / CVSS → vulnerabilities[].scores[] with score type set appropriately.
  • Advisory URL (html_url) → vulnerabilities[].references[].
  • Published / Updated timestamps → document.tracking.revision_history[] entries.
  • Description / summary → document.notes[] (per-vulnerability notes are omitted to avoid duplication).

Unsupported or partially mapped:

  • Advisory aliases beyond CVE/GHSA ID (unless provided explicitly).
  • Rich supplier, distributor, and release channel metadata.
  • Full remediation guidance if GHSA lacks structured fix details.

JSON Encoding Notes

We rely on the upstream gocsaf.SaveAdvisory function to serialize the CSAF advisory. That helper internally creates its own json.Encoder with Go's default settings, so we cannot call SetEscapeHTML(false) on it. As a consequence, the characters <, >, and & are HTML‑escaped in the emitted JSON (e.g. < becomes \u003c), which makes comparison operators such as < and <= in version range fields hard to read. To keep the output ASCII‑only and free of escape sequences without changing the encoder, the converter now normalizes operators into descriptive English phrases:

  • <= → less or equal
  • >= → greater or equal
  • < → less than
  • > → greater than

This sidesteps HTML escaping while preserving the intended comparison semantics in a human‑readable form. The trade‑off is that downstream tooling expecting literal operators will need to adapt. A more robust long‑term solution would be either (a) bypassing gocsaf.SaveAdvisory and performing our own encoding with enc.SetEscapeHTML(false), or (b) post‑processing the JSON to unescape these characters.
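
The operator normalization described in this section could look roughly like the sketch below. It is an assumption about the approach, not a copy of the repository's normalizeOperators: the function name normalizeRange is hypothetical, and only the four phrases listed above are used.

```go
package main

import (
	"fmt"
	"strings"
)

// operatorPhrases lists the two-character operators before their
// one-character prefixes. strings.Replacer compares old strings in argument
// order, so "<=" is matched before "<" at the same position.
var operatorPhrases = strings.NewReplacer(
	"<=", " less or equal ",
	">=", " greater or equal ",
	"<", " less than ",
	">", " greater than ",
)

// normalizeRange replaces comparison operators with English phrases and
// collapses any run of whitespace into a single space.
// Hypothetical sketch of the behaviour attributed to normalizeOperators.
func normalizeRange(r string) string {
	return strings.Join(strings.Fields(operatorPhrases.Replace(r)), " ")
}

func main() {
	fmt.Println(normalizeRange(">= 1.0.0,  < 1.2.3"))
	fmt.Println(normalizeRange("<=2.0"))
}
```

The output contains none of the characters <, >, or &, so the default encoder has nothing to escape in these fields.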


Examples

See examples/ directory:

  • global_GHSA/GHSA-cpj6-fhp6-mr6j.json (global advisory input).
  • repository_GHSA/GHSA-mh63-6h87-95cp.json (repository advisory input).
  • repository_GHSA/csaf_example_output.json (sample converted CSAF output).

You can diff the input vs. output to observe:

  • Product tree hierarchy creation.
  • Revision history entries.
  • Identifier and reference mappings.

Troubleshooting

| Symptom | Cause | Resolution |
| --- | --- | --- |
| Escaped < in version range | Default JSON encoder HTML escape | Accept as-is or post-process; custom encoder if allowed. |
| Missing product branches | Empty vulnerabilities list in GHSA | Validate input advisory content; ensure downloader acquired full data. |
| Lost CVSS vector | GHSA advisory lacks CVSS | No remediation; CSAF will omit score. |
| Unexpected whitespace in ranges | GHSA formatting quirks | Normalization collapses spaces automatically. |
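
For the first row, the "custom encoder" resolution can be sketched with the standard library alone. This is a minimal illustration of encoding/json behaviour with and without HTML escaping; the encode helper is hypothetical and the sketch does not use gocsaf.SaveAdvisory or the CSAF model types.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encode serializes v with a json.Encoder and returns the result without
// the trailing newline that Encode appends. escapeHTML toggles the default
// escaping of <, >, and &.
func encode(v any, escapeHTML bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escapeHTML)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	return string(bytes.TrimSpace(buf.Bytes())), nil
}

func main() {
	branch := map[string]string{"name": "< 1.2.3"}

	escaped, _ := encode(branch, true)    // default behaviour
	literal, _ := encode(branch, false)   // HTML escaping disabled
	fmt.Println(escaped)
	fmt.Println(literal)
}
```

Both outputs decode to the same string, so the escaped form is still valid JSON; disabling escaping only changes how the file reads to a human or to tools that compare raw text.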

Logging: the converter emits structured logs (via slog) for save operations; enable debug-level logging when extending it.


Roadmap

  • Fix issues
  • Perform extensive review
  • Check global GHSA
  • Check hidden GHSA (requires authentication probably)
  • Add CLI functionality & configuration options
  • Add more tests & validation

License and Acknowledgments

Licensed under the terms in the LICENSE file. Built upon:

  • github.com/gocsaf/csaf for CSAF model structures.
  • GitHub Security Advisory data as source material.

Contributions, issues, and suggestions are welcome.
