Releases: scanipy/scanipy-oss

scanipy 0.2.0

08 Jun 20:52
4fa9d59

The taint engine works. scanipy scan now performs real, deterministic,
local taint analysis: it follows untrusted data from sources to dangerous sinks
(through sanitizers and propagators) and reports the source → … → sink witness
behind every finding. The DSL parser, the Python frontend/IR, the matcher, the
taint engine, the detector catalog, and the scan/rules CLI are all
implemented. Scope is stated honestly (P7): single-language (Python) and
intra-file, including intra-file interprocedural analysis via TITO (taint-in,
taint-out) function summaries, with no cross-file or whole-program analysis.
scanipy is not yet published to PyPI; install from source for now
(pip install -e .). scanipy-oss remains the reserved future distribution name.

Added

  • Working scanipy scan PATH — real taint analysis with witness-backed,
    deterministic findings; zero-config; never sends code over the network (P1).
    Supports --format text|json|sarif, --detectors, --severity-threshold,
    --fail-on, --exclude, --gitignore/--no-gitignore, --config, and -o.
  • Working scanipy rules list | show ID | validate FILE — list the bundled
    detectors (sorted by id, with CWE + severity), print one spec in full, and
    validate a spec against the DSL (location-aware DSLError).
  • Seven bundled detectors (run with zero config):
    • python.injection.os-command — CWE-78 (high) — OS command injection.
    • python.injection.sql — CWE-89 (high) — SQL injection.
    • python.injection.code-injection — CWE-94 (critical) — Python code injection
      (eval/exec/compile).
    • python.traversal.path-traversal — CWE-22 (high) — path traversal.
    • python.ssrf.ssrf — CWE-918 (high) — server-side request forgery.
    • python.deserialization.unsafe-deserialization — CWE-502 (critical) —
      unsafe deserialization (pickle / unsafe YAML loader).
    • python.xxe.xxe — CWE-611 (high) — XML external entity (XXE) injection.
  • DSL parser (scanipy.dsl.parse_spec) — validates every field, all four
    pattern kinds (call, attribute, parameter, import), and the flow
    grammar; raises a location-aware DSLError on anything outside the DSL.
  • Python frontend & IR — stdlib-ast-based normalized IR with first-class
    import/alias canonicalization and a per-function CFG.
  • Pattern matcher — segment-wise dotted matching with strict single-*
    wildcard placement (exact / trailing-single / leading-greedy).
  • Taint engine — flow-sensitive forward dataflow, union-at-join, one-sided
    sanitizers (P5), and intra-file interprocedural TITO summaries with witness
    splicing.
  • Verified end-to-end example (docs/examples/end-to-end.md)
    and a release-readiness checklist (docs/release-readiness.md).
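
To make "flow-sensitive forward dataflow, union-at-join" concrete, here is a toy sketch of the general technique. It is not scanipy's implementation: the names TaintState, assign, and join are hypothetical, the state tracks only variable names, and sanitizers and TITO summaries are left out. Each tainted variable carries its witness as a tuple of steps.

```python
# Toy model (hypothetical, not scanipy's code): a taint state maps each
# tainted variable to its witness, the ordered steps from the source.
TaintState = dict[str, tuple[str, ...]]


def assign(state: TaintState, target: str, operands: list[str], step: str) -> TaintState:
    """Transfer function for `target = f(operands...)`.

    Flow-sensitive: the assignment overwrites whatever was known about
    `target` before this program point.
    """
    new_state = dict(state)
    for operand in operands:
        if operand in state:
            # Propagate taint and extend the witness with this step.
            new_state[target] = state[operand] + (step,)
            return new_state
    # No tainted operand: a clean assignment kills earlier taint on target.
    new_state.pop(target, None)
    return new_state


def join(left: TaintState, right: TaintState) -> TaintState:
    """Union-at-join: tainted after the merge if tainted on either branch."""
    merged = dict(right)
    merged.update(left)  # when both sides are tainted, keep the left witness
    return merged


# if flag: cmd = "ls " + user   else: cmd = "ls"
entry: TaintState = {"user": ("source: request argument",)}
then_branch = assign(entry, "cmd", ["user"], 'cmd = "ls " + user')
else_branch = assign(entry, "cmd", [], 'cmd = "ls"')
after_if = join(then_branch, else_branch)
```

After the merge, cmd is still tainted because one branch taints it, and after_if["cmd"] holds the two-step witness that a finding at a later sink would report.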

Changed

  • Exit-code semantics are now real (the scan path no longer returns 2 as a
    not-implemented stub): 0 = clean (no finding meets the failure gate), 1 = a
    finding met the gate (--fail-on, else the severity threshold), 2 = a
    fatal/usage error (bad path, invalid config, unknown --detectors id, unknown
    rules show id, or a rules validate failure). Per-file parse errors are
    reported on stderr and skipped — they are not fatal.
  • The taint DSL (v0) is locked for this release — see
    docs/dsl-reference.md. A spec that validates against
    the reference works with 0.2.0 as written.
  • Bumped the click floor to click>=8.2 (was click>=8.1).
  • __version__ is now 0.2.0.