A high-performance, fully CommonMark-compliant Markdown parser written in Rust.

sebastian-software/ferromark

ferromark

Fast Markdown-to-HTML for Rust workloads where throughput and predictable latency matter.

Why ferromark

  • Built for production paths, not toy inputs: docs pipelines, API rendering, and CLIs.
  • Streaming parser design avoids AST overhead on the hot path.
  • CommonMark-compliant while still tuned for raw speed.
  • Small dependency surface and straightforward integration.

Design goals

  • Linear time behavior: no regex backtracking, no parser surprises on large inputs.
  • Low allocation pressure: compact Range references into the input instead of copying text.
  • Cache-friendly execution: tight scanning loops, lookup tables, and reusable buffers.
  • Operational safety: explicit depth/limit guards against pathological nesting.
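
A minimal sketch of two of these goals, assuming a compact range made of two u32 offsets and a fixed nesting cap. The names Range, MAX_DEPTH, and quote_depth are illustrative and are not taken from ferromark's source; the cap value is arbitrary.

```rust
/// Hypothetical compact range: two u32 offsets into the input (8 bytes total).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Range {
    start: u32,
    end: u32,
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        // Assumes inputs smaller than 4 GiB; larger inputs would need to be rejected up front.
        Range { start: start as u32, end: end as u32 }
    }

    /// Borrow the referenced bytes from the input without copying them.
    fn slice<'a>(&self, input: &'a [u8]) -> &'a [u8] {
        &input[self.start as usize..self.end as usize]
    }
}

/// Hypothetical nesting limit; the value is illustrative only.
const MAX_DEPTH: usize = 32;

/// Count leading `>` markers (each optionally followed by one space), capped at MAX_DEPTH.
fn quote_depth(line: &[u8]) -> usize {
    let mut depth = 0;
    let mut i = 0;
    while i < line.len() && line[i] == b'>' && depth < MAX_DEPTH {
        depth += 1;
        i += 1;
        if i < line.len() && line[i] == b' ' {
            i += 1;
        }
    }
    depth
}

fn main() {
    let input: &[u8] = b"> quoted text";
    let text = Range::new(2, input.len());
    assert_eq!(text.slice(input), &b"quoted text"[..]);
    assert_eq!(std::mem::size_of::<Range>(), 8);

    // Pathological nesting stops at the cap instead of growing without bound.
    let hostile = vec![b'>'; 10_000];
    assert_eq!(quote_depth(&hostile), MAX_DEPTH);
}
```

Storing offsets rather than owned strings keeps each text reference at 8 bytes, and the explicit cap bounds the work done on adversarial input.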

Architecture at a glance

Input bytes (&[u8])
       │
       ▼
   Block parser (line-oriented)
       │ emits BlockEvent stream
       ▼
   Inline parser (per text range)
       │ emits InlineEvent stream
       ▼
   HTML writer (direct buffer writes)
       │
       ▼
   Output (Vec<u8>)
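
The flow above can be illustrated with a toy two-stage pipeline. This is a sketch under heavy assumptions, not ferromark's implementation: the BlockEvent shape, parse_blocks, and write_event are hypothetical, the inline stage and HTML escaping are omitted, and only single-line headings and paragraphs are recognized.

```rust
/// Hypothetical event type holding byte offsets into the input; the real BlockEvent is richer.
#[derive(Clone, Copy)]
enum BlockEvent {
    Heading { level: u8, start: usize, end: usize },
    Paragraph { start: usize, end: usize },
}

/// Line-oriented block pass: hands each event to `emit` as soon as a line is classified.
fn parse_blocks(input: &[u8], mut emit: impl FnMut(BlockEvent)) {
    let mut start = 0;
    for line in input.split(|&b| b == b'\n') {
        let end = start + line.len();
        let hashes = line.iter().take_while(|&&b| b == b'#').count();
        if (1..=6).contains(&hashes) && line.get(hashes) == Some(&b' ') {
            emit(BlockEvent::Heading { level: hashes as u8, start: start + hashes + 1, end });
        } else if !line.is_empty() {
            emit(BlockEvent::Paragraph { start, end });
        }
        start = end + 1; // step over the newline
    }
}

/// Writer stage: appends HTML straight into the caller's buffer, with no intermediate tree.
fn write_event(input: &[u8], event: BlockEvent, out: &mut Vec<u8>) {
    match event {
        BlockEvent::Heading { level, start, end } => {
            out.extend_from_slice(b"<h");
            out.push(b'0' + level);
            out.push(b'>');
            out.extend_from_slice(&input[start..end]);
            out.extend_from_slice(b"</h");
            out.push(b'0' + level);
            out.extend_from_slice(b">\n");
        }
        BlockEvent::Paragraph { start, end } => {
            out.extend_from_slice(b"<p>");
            out.extend_from_slice(&input[start..end]);
            out.extend_from_slice(b"</p>\n");
        }
    }
}

fn main() {
    let input: &[u8] = b"# Title\nBody text";
    let mut out = Vec::new();
    parse_blocks(input, |ev| write_event(input, ev, &mut out));
    print!("{}", String::from_utf8_lossy(&out));
}
```

Because events carry offsets and are consumed immediately, nothing in this sketch builds a document tree or copies text before it reaches the output buffer.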

Why this is fast

  • Block pass stays simple: cheap line scanning via memchr, container stack for quotes/lists.
  • Inline pass is staged: collect marks -> resolve precedence (code, links, emphasis) -> emit.
  • Hot-path tuning: #[inline] where it matters, #[cold] for rare paths, table-driven classification.
  • CommonMark emphasis done right: modulo-3 delimiter handling without expensive rescans.
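
For context on the modulo-3 point, the CommonMark spec restricts which delimiter runs may pair (often called the "rule of 3"). The sketch below shows only that pairing check. It assumes both runs use the same delimiter character and that flanking has already been classified; DelimRun and can_pair are hypothetical names, and how ferromark organizes its delimiter stacks around this rule is not shown here.

```rust
/// A run of `*` or `_` delimiters whose flanking has already been classified.
#[derive(Clone, Copy)]
struct DelimRun {
    len: usize,
    can_open: bool,
    can_close: bool,
}

/// CommonMark pairing check: if either run can both open and close, the two
/// lengths must not sum to a multiple of 3 unless both are multiples of 3.
fn can_pair(opener: DelimRun, closer: DelimRun) -> bool {
    if !opener.can_open || !closer.can_close {
        return false;
    }
    let either_is_both =
        (opener.can_open && opener.can_close) || (closer.can_open && closer.can_close);
    if either_is_both && (opener.len + closer.len) % 3 == 0 {
        return opener.len % 3 == 0 && closer.len % 3 == 0;
    }
    true
}

fn main() {
    let open1 = DelimRun { len: 1, can_open: true, can_close: false };
    let close1 = DelimRun { len: 1, can_open: false, can_close: true };
    let both2 = DelimRun { len: 2, can_open: true, can_close: true };
    let both3 = DelimRun { len: 3, can_open: true, can_close: true };

    assert!(can_pair(open1, close1)); // plain *text*
    assert!(!can_pair(open1, both2)); // 1 + 2 = 3 and one run is both-flanking
    assert!(can_pair(both3, both3)); // 3 + 3 = 6, but both lengths are multiples of 3
}
```

Since the check depends only on run lengths modulo 3 and the flanking flags, candidate openers can be grouped by those properties so that a closer does not need to rescan every earlier run.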

Performance

Benchmarked on Apple Silicon (M-series); latest run: February 8, 2026.

Workload: synthetic wiki-style documents with text-heavy paragraphs, lists, code blocks, and representative CommonMark features (benches/fixtures/commonmark-5k.md, benches/fixtures/commonmark-50k.md).

Method: output buffers are reused for ferromark, md4c, and pulldown-cmark where their APIs allow; comrak allocates its output internally. ferromark runs with its default GFM extensions enabled (tables, strikethrough, task lists, disallowed raw HTML); autolink literals are opt-in. The main tables use non-PGO binaries so that every parser is compared at its defaults.

CommonMark 5KB (GFM extensions enabled, includes tables)

Parser            Throughput     Relative (vs ferromark)
ferromark         289.9 MiB/s    1.00x
pulldown-cmark    247.7 MiB/s    0.85x
md4c              242.3 MiB/s    0.84x
comrak             73.7 MiB/s    0.25x

CommonMark 50KB (GFM extensions enabled, includes tables)

Parser            Throughput     Relative (vs ferromark)
ferromark         309.3 MiB/s    1.00x
pulldown-cmark    271.7 MiB/s    0.88x
md4c              247.4 MiB/s    0.80x
comrak             76.0 MiB/s    0.25x

All parsers run with GFM tables, strikethrough, and task lists enabled. Other candidates like markdown-rs are far slower in this workload and are omitted from the main tables to keep the comparison focused. Happy to run them on request.

Key results:

  • ferromark is ~17% faster than pulldown-cmark at 5KB and ~14% faster at 50KB.
  • ferromark is ~20% faster than md4c at 5KB and ~25% faster at 50KB.
  • ferromark is ~3.9-4.1x faster than comrak across 5-50KB.
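
The percentages above follow directly from the throughput tables. As a check, this small sketch recomputes them; percent_faster is a helper introduced here for illustration and is not part of the benchmark harness.

```rust
/// Percent by which throughput `a` exceeds throughput `b` (same unit for both).
fn percent_faster(a: f64, b: f64) -> f64 {
    (a / b - 1.0) * 100.0
}

fn main() {
    // (parser, ferromark MiB/s, other MiB/s), copied from the 5KB and 50KB tables.
    let rows_5k = [("pulldown-cmark", 289.9, 247.7), ("md4c", 289.9, 242.3), ("comrak", 289.9, 73.7)];
    let rows_50k = [("pulldown-cmark", 309.3, 271.7), ("md4c", 309.3, 247.4), ("comrak", 309.3, 76.0)];

    for (label, rows) in [("5KB", rows_5k), ("50KB", rows_50k)] {
        for (name, ferromark, other) in rows {
            println!(
                "{label}: ferromark vs {name}: {:.0}% faster ({:.1}x)",
                percent_faster(ferromark, other),
                ferromark / other
            );
        }
    }
}
```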

Run benchmarks: cargo bench --bench comparison

Technical Notes (Top-Tier Approaches)

These are the four parsers included in the main benchmark. Ratings use a 4-level emoji heatmap focused on end-to-end Markdown-to-HTML throughput in typical workloads.

Legend:

  • 🟩 = strongest in this row (ties allowed)
  • 🟨 = close behind the row leader
  • 🟧 = notable tradeoffs for this row
  • 🟥 = weakest for this row's goal

Scoring is relative per row so each row has at least one 🟩. Each feature row is followed by a short plain-language explanation. Ferromark optimization backlog: docs/arch/ARCH-PLAN-001-performance-opportunities.md

Feature ferromark md4c pulldown-cmark comrak
Performance-Critical Architecture and Memory
Parser model (streaming, no AST) 🟩 🟩 🟨 🟥
Streaming parsers can emit output as they scan input, which avoids building an intermediate tree and keeps memory and cache pressure low. Mapping: ferromark and md4c stream; pulldown-cmark uses a pull iterator; comrak builds an AST.
API overhead profile (push / pull / AST) 🟩 🟩 🟨 🟥
This score reflects API overhead on straight Markdown-to-HTML throughput, not API flexibility. Mapping: md4c callbacks and ferromark streaming events are lean; pulldown-cmark pull iterators are close; comrak's AST model adds more overhead for this workload.
Parse/render separation 🟨 🟩 🟩 🟧
Clear separation lets parsers stay simple and fast, while renderers can be swapped or tuned. Mapping: md4c and pulldown-cmark separate parse and render clearly; ferromark is mostly separated; comrak leans on AST-based renderers.
Inline parsing pipeline (multi-phase, delimiter stacks) 🟩 🟨 🟨 🟥
Multi-phase inline parsing (collect -> resolve -> emit) keeps the hot path linear and avoids backtracking. Mapping: ferromark uses multi-phase inline parsing; md4c and pulldown-cmark are optimized byte scanners; comrak does more AST bookkeeping.
Emphasis matching efficiency 🟩 🟨 🟨 🟥
Efficient emphasis handling reduces rescans and backtracking. Stack-based algorithms tend to win on long text-heavy documents. Mapping: ferromark uses modulo-3 stacks; md4c and pulldown-cmark are optimized; comrak pays AST overhead.
Link reference processing cost 🟩 🟩 🟩 🟨
Link labels need normalization (case folding and entity handling). Optimized implementations reduce allocations and Unicode overhead. Mapping: All four normalize labels; ferromark, md4c, and pulldown-cmark focus on minimizing allocations; comrak handles more feature paths.
Zero-copy text handling 🟩 🟨 🟨 🟥
Zero-copy means most text slices point directly into input, which reduces allocations and copy costs. Mapping: ferromark uses ranges; md4c and pulldown-cmark borrow slices; comrak allocates AST nodes.
Allocation pressure (hot path) 🟩 🟩 🟨 🟥
Fewer allocations in tight loops improve CPU utilization and reduce allocator overhead. Mapping: Streaming parsers allocate less during parse/render; AST parsers allocate many nodes.
Output buffer reuse 🟩 🟩 🟨 🟥
Reusing output buffers avoids repeated allocations across runs and stabilizes performance. Mapping: ferromark, md4c, and pulldown-cmark allow reuse; comrak allocates internally.
Memory locality (working set size) 🟩 🟩 🟨 🟥
A small working set fits in cache and reduces memory traffic. Mapping: Streaming parsers keep the working set small; AST-based parsing expands it.
Cache friendliness 🟩 🟩 🟨 🟥
Linear scans and contiguous buffers are usually best for CPU caches. Mapping: ferromark and md4c favor linear scans; pulldown-cmark is close; comrak traverses AST allocations.
SIMD availability (optional) 🟩 🟨 🟩 🟥
SIMD can accelerate scanning for special characters if the SIMD path is hot enough. Mapping: ferromark and pulldown-cmark have SIMD paths; md4c relies on C optimizations; comrak is not SIMD-focused.
Hot-path control (bounds/branch minimization) 🟩 🟩 🟧 🟥
This row measures performance headroom from low-level control in inner loops. Mapping: md4c (C) and ferromark use tighter low-level tuning where beneficial; pulldown-cmark is mostly safe-Rust hot loops; comrak prioritizes higher-level flexibility.
Dependency footprint 🟩 🟩 🟨 🟥
Fewer dependencies simplify builds and reduce binary bloat. Mapping: md4c and ferromark are minimal; pulldown-cmark is moderate; comrak is heavier.
Throughput ceiling (architectural) 🟩 🟩 🟨 🟥
With fewer allocations and tighter hot loops, streaming architectures generally allow higher throughput ceilings. Mapping: ferromark and md4c lead here; pulldown-cmark is close; comrak trades throughput for flexibility.
Core compactness (moving parts) 🟨 🟩 🟨 🟧
A compact core is easier to tune and reason about. Mapping: md4c is very compact; ferromark is lean; pulldown-cmark is moderate; comrak is larger by design.
 
Feature Coverage and Extensibility
Extension breadth (GFM and extras) 🟩 🟧 🟨 🟩
More extensions increase compatibility but add parsing work. Mapping: comrak offers the broadest extension catalog; ferromark implements all 5 GFM extensions (tables, strikethrough, task lists, autolink literals, disallowed raw HTML); pulldown-cmark and md4c support common GFM features.
Spec compliance focus (CommonMark) 🟩 🟩 🟨 🟩
Full compliance adds edge-case handling. All four are strong here, but more features usually mean more code on the hot path. Mapping: All four target CommonMark; comrak and md4c emphasize full compliance; pulldown-cmark adds extensions; ferromark passes all 652 CommonMark spec tests.
Unicode handling configurability 🟧 🟩 🟧 🟧
Configurable Unicode handling can simplify hot paths or support special environments. Mapping: md4c can be built for UTF-8, UTF-16, or ASCII-only; the Rust parsers generally assume UTF-8.
Portability 🟨 🟩 🟨 🟨
Portability matters for embedding and wide deployment. Mapping: md4c compiles almost anywhere with a C toolchain; the Rust crates are broadly portable too.
Extension configuration surface 🟨 🟩 🟨 🟨
Fine-grained flags let you disable features to reduce work. Mapping: md4c has many flags; pulldown-cmark and comrak use options; ferromark has 7 options covering all GFM extensions (allow_html, allow_link_refs, tables, strikethrough, task_lists, autolink_literals, disallowed_raw_html).
Raw HTML control (allow/deny) 🟩 🟩 🟧 🟩
Disabling raw HTML can simplify parsing and output. Mapping: md4c and comrak expose explicit switches; ferromark also exposes an explicit allow_html option; pulldown-cmark is more fixed in defaults.
GFM Tables 🟩 🟩 🟩 🟩
GFM table syntax (header, delimiter, body rows with alignment). Mapping: All four parsers support GFM tables.
Task lists, strikethrough 🟩 🟨 🟨 🟩
These GFM features are common in real-world Markdown. Mapping: All four parsers support task lists and strikethrough.
Footnotes 🟥 🟥 🟨 🟩
Footnotes add extra parsing and rendering complexity. Mapping: pulldown-cmark and comrak support footnotes; ferromark and md4c do not focus on them.
Math support 🟥 🟩 🟥 🟩
Math support often requires custom extensions. Mapping: md4c includes LaTeX math flags; comrak supports math extensions; ferromark and pulldown-cmark do not target math in the core.
Permissive autolinks 🟩 🟩 🟧 🟨
Permissive autolinks trade strictness for convenience. Mapping: ferromark and md4c support GFM autolink literals (URL, www, email); comrak has relaxed autolinks; pulldown-cmark focuses on spec defaults.
Wiki links 🟥 🟩 🟥 🟩
Wiki links are a non-CommonMark extension used in some ecosystems. Mapping: md4c and comrak support wiki links via flags/extensions; pulldown-cmark and ferromark do not.
Underline extension 🟥 🟩 🟥 🟩
Underline is an extension that changes emphasis semantics. Mapping: md4c and comrak include underline extensions; pulldown-cmark and ferromark stick closer to CommonMark emphasis rules.
Task list flexibility 🟧 🟧 🟧 🟩
Relaxed task list parsing can improve compatibility with messy inputs. Mapping: comrak offers relaxed task list options; ferromark, md4c, and pulldown-cmark support task lists with fewer knobs.
Output safety toggles 🟨 🟩 🟧 🟩
Safety toggles control whether raw HTML is emitted or escaped. Mapping: md4c and comrak provide explicit unsafe/escape switches; ferromark provides allow_html and disallowed_raw_html toggles; pulldown-cmark is more fixed in defaults.
no_std viability 🟥 🟨 🟩 🟥
no_std support matters for embedded or constrained environments. Mapping: pulldown-cmark supports no_std builds with features; md4c can be embedded in C environments; ferromark and comrak assume std.
 
Rendering and Output
Output streaming (incremental) 🟩 🟩 🟨 🟥
Output streaming lets you write HTML incrementally, which lowers peak memory and removes extra passes. Mapping: ferromark and md4c stream to buffers or callbacks; pulldown-cmark streams events; comrak often renders after AST work.
Output customization hooks 🟧 🟩 🟨 🟩
Callbacks and ASTs are great for custom rendering but add indirection compared to a single tight rendering loop. Mapping: md4c callbacks and comrak AST are very flexible; pulldown-cmark iterators are easy to transform; ferromark is lower level.
Output formats 🟥 🟧 🟨 🟩
More output formats increase flexibility but add complexity. Mapping: comrak can emit HTML, XML, and CommonMark; pulldown-cmark provides HTML plus event streams; md4c has an HTML renderer and callbacks; ferromark targets HTML.
Source position support 🟥 🟥 🟩 🟨
Tracking source positions is useful for diagnostics and tooling, but adds overhead. Mapping: pulldown-cmark has strong source map support; comrak can emit source positions; ferromark and md4c are lighter.
Source map tooling (API or CLI) 🟥 🟥 🟩 🟨
Source maps improve debuggability and tooling integration. Mapping: pulldown-cmark exposes event ranges; comrak can emit source position attributes; ferromark and md4c keep this minimal.
IO friendliness (small writes) 🟩 🟩 🟧 🟥
Many small writes can be expensive without buffering. Mapping: md4c and ferromark stream into buffers or callbacks; pulldown-cmark recommends buffered output; comrak often builds strings after AST work.

Spec Compliance

CommonMark: 100% (652/652 tests)

All CommonMark spec tests pass (no filtering).

GFM: all 5 extensions implemented

Tables, strikethrough, task lists, autolink literals, and disallowed raw HTML.

Usage

use ferromark::to_html;

// Render a Markdown string to an HTML string.
let html = to_html("# Hello\n\n**World**");
assert!(html.contains("<h1>Hello</h1>"));
assert!(html.contains("<strong>World</strong>"));

Zero-allocation API

// Caller-owned output buffer.
let mut buffer = Vec::new();
ferromark::to_html_into("# Reuse me", &mut buffer);
// Pass the same buffer to the next call to reuse its capacity.
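
A sketch of the reuse pattern across several documents. To stay runnable without external crates it uses render_into, a hypothetical stand-in that is not ferromark's API. Whether to_html_into appends to or overwrites the buffer is not stated above, so the sketch clears the buffer explicitly before each call.

```rust
/// Hypothetical stand-in for a render-into-buffer call; NOT ferromark's API.
/// It wraps the input in <p> tags so the example has something to write.
fn render_into(input: &str, out: &mut Vec<u8>) {
    out.extend_from_slice(b"<p>");
    out.extend_from_slice(input.as_bytes());
    out.extend_from_slice(b"</p>\n");
}

fn main() {
    let docs = ["first document", "second", "third one"];

    // Allocate once, sized for the largest expected output.
    let mut buffer: Vec<u8> = Vec::with_capacity(64);
    let capacity_before = buffer.capacity();

    for doc in docs {
        buffer.clear(); // drops the contents but keeps the allocation
        render_into(doc, &mut buffer);
        assert!(buffer.starts_with(b"<p>"));
        assert!(buffer.ends_with(b"</p>\n"));
    }

    // Every output fit in the initial allocation, so the buffer never regrew.
    assert_eq!(buffer.capacity(), capacity_before);
}
```

The first write into an empty Vec still allocates; the saving comes from the second call onward, once the buffer has grown to a typical output size.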

Building

# Development
cargo build

# Optimized release (recommended for benchmarks)
cargo build --release

# Run tests
cargo test

# Run CommonMark spec tests
cargo test --test commonmark_spec -- --nocapture

# Run benchmarks
cargo bench

Project Structure

src/
├── lib.rs          # Public API (to_html, to_html_into)
├── block/          # Block-level parser
│   ├── parser.rs   # Line-oriented block parsing
│   └── event.rs    # BlockEvent types
├── inline/         # Inline-level parser
│   ├── mod.rs      # Three-phase inline parsing
│   ├── marks.rs    # Mark collection
│   ├── code_span.rs
│   ├── emphasis.rs      # Modulo-3 stack optimization
│   ├── strikethrough.rs # GFM strikethrough resolution
│   └── links.rs         # Link/image/autolink parsing
├── cursor.rs       # Pointer-based byte cursor
├── range.rs        # Compact u32 range type
├── render.rs       # HTML writer
├── escape.rs       # HTML escaping (memchr-optimized)
└── limits.rs       # DoS prevention constants
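
To give a feel for what a module like escape.rs does, here is a minimal HTML-escaping sketch. It is an assumption-laden illustration, not the project's code: escape_html_into is a hypothetical name, and the standard library's position() is swapped in for a memchr-based search to keep the example dependency-free.

```rust
/// Append `input` to `out`, replacing the HTML-significant bytes & < > ".
fn escape_html_into(input: &[u8], out: &mut Vec<u8>) {
    let mut rest = input;
    // Find the next byte that needs escaping; everything before it is copied in bulk.
    while let Some(pos) = rest
        .iter()
        .position(|&b| matches!(b, b'&' | b'<' | b'>' | b'"'))
    {
        out.extend_from_slice(&rest[..pos]);
        let replacement: &[u8] = match rest[pos] {
            b'&' => b"&amp;",
            b'<' => b"&lt;",
            b'>' => b"&gt;",
            _ => b"&quot;",
        };
        out.extend_from_slice(replacement);
        rest = &rest[pos + 1..];
    }
    // Copy the tail that contains no special bytes.
    out.extend_from_slice(rest);
}

fn main() {
    let mut out = Vec::new();
    escape_html_into(b"a<b & \"c\">", &mut out);
    assert_eq!(out, b"a&lt;b &amp; &quot;c&quot;&gt;".to_vec());
}
```

Copying clean runs in one call instead of byte by byte is the part a vectorized search would speed up, since most Markdown text contains few bytes that need escaping.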

License

MIT OR Apache-2.0
