ferromark

Fast Markdown-to-HTML for Rust workloads where throughput and predictable latency matter.

Why ferromark

Built for production paths, not toy inputs: docs pipelines, API rendering, and CLIs.
Streaming parser design avoids AST overhead on the hot path.
CommonMark-compliant while still tuned for raw speed.
Small dependency surface and straightforward integration.

Design goals

Linear-time behavior: no regex backtracking, no parser surprises on large inputs.
Low allocation pressure: compact Range references into the input instead of copying text.
Cache-friendly execution: tight scanning loops, lookup tables, and reusable buffers.
Operational safety: explicit depth/limit guards against pathological nesting.
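
To make the "Range references instead of copying" goal concrete, here is a minimal sketch of what a compact range type could look like. The struct layout and the method names (`new`, `slice`, `len`) are assumptions for illustration; the actual type in `range.rs` may differ.

```rust
/// Hypothetical compact range: two u32 byte offsets into the input buffer.
/// Illustrative only; not ferromark's actual definition.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Range {
    pub start: u32,
    pub end: u32,
}

impl Range {
    /// Build a range from byte offsets. Offsets above u32::MAX are not
    /// representable, which is the price of the 8-byte footprint.
    pub fn new(start: usize, end: usize) -> Self {
        debug_assert!(start <= end && end <= u32::MAX as usize);
        Range { start: start as u32, end: end as u32 }
    }

    /// Resolve the range against the original input without copying.
    pub fn slice<'a>(&self, input: &'a [u8]) -> &'a [u8] {
        &input[self.start as usize..self.end as usize]
    }

    /// Number of bytes covered by the range.
    pub fn len(&self) -> usize {
        (self.end - self.start) as usize
    }
}
```

Under these assumptions an event carries 8 bytes per text span instead of an owned string, and `Range::new(2, 7).slice(b"# Hello world")` borrows the bytes of `Hello` directly from the input.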

Architecture at a glance

Input bytes (&[u8])
       │
       ▼
   Block parser (line-oriented)
       │ emits BlockEvent stream
       ▼
   Inline parser (per text range)
       │ emits InlineEvent stream
       ▼
   HTML writer (direct buffer writes)
       │
       ▼
   Output (Vec<u8>)
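
The flow in the diagram can be sketched in a few lines. This is a toy under stated assumptions, not ferromark's code: the `BlockEvent` variants, the `(start, end)` offset tuples, and the function names `parse_blocks` and `render` are hypothetical, only `# ` headings and single-line paragraphs are recognized, and the inline pass and HTML escaping are left out.

```rust
/// Hypothetical event type; text is carried as (start, end) byte offsets.
#[derive(Debug, PartialEq)]
enum BlockEvent {
    Heading { text: (usize, usize) },
    Paragraph { text: (usize, usize) },
}

/// Line-oriented block pass: scans line by line and pushes events to `emit`.
/// No tree is built; each event is handed over as soon as it is known.
fn parse_blocks(input: &[u8], mut emit: impl FnMut(BlockEvent)) {
    let mut pos = 0;
    while pos < input.len() {
        // Find the end of the current line (or the end of input).
        let end = input[pos..]
            .iter()
            .position(|&b| b == b'\n')
            .map_or(input.len(), |i| pos + i);
        let line = &input[pos..end];
        if line.starts_with(b"# ") {
            emit(BlockEvent::Heading { text: (pos + 2, end) });
        } else if !line.is_empty() {
            emit(BlockEvent::Paragraph { text: (pos, end) });
        }
        pos = end + 1;
    }
}

/// Writer: consumes events and appends HTML straight into the output buffer.
fn render(input: &[u8], out: &mut Vec<u8>) {
    parse_blocks(input, |ev| match ev {
        BlockEvent::Heading { text } => {
            out.extend_from_slice(b"<h1>");
            out.extend_from_slice(&input[text.0..text.1]);
            out.extend_from_slice(b"</h1>\n");
        }
        BlockEvent::Paragraph { text } => {
            out.extend_from_slice(b"<p>");
            out.extend_from_slice(&input[text.0..text.1]);
            out.extend_from_slice(b"</p>\n");
        }
    });
}
```

The point of the shape is that text moves from input to output exactly once: events hold offsets, and the writer copies the referenced bytes into the output buffer.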

Why this is fast

Block pass stays simple: cheap line scanning via memchr, container stack for quotes/lists.
Inline pass is staged: collect marks -> resolve precedence (code, links, emphasis) -> emit.
Hot-path tuning: #[inline] where it matters, #[cold] for rare paths, table-driven classification.
CommonMark emphasis done right: modulo-3 delimiter handling without expensive rescans.
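
The modulo-3 handling mentioned above comes from the CommonMark emphasis rules: when either delimiter run can both open and close emphasis, the two runs may not pair if their combined length is a multiple of 3, unless both lengths are multiples of 3. A minimal sketch of that check follows; `DelimRun` and `may_pair` are illustrative names, not ferromark's types, and the stack bookkeeping that avoids rescans is omitted.

```rust
/// A run of `*` or `_` delimiters, as a collect phase might record it.
/// Field names are hypothetical.
#[derive(Clone, Copy)]
struct DelimRun {
    len: usize,
    can_open: bool,
    can_close: bool,
}

/// CommonMark "rule of 3" for pairing an opener with a closer.
fn may_pair(opener: DelimRun, closer: DelimRun) -> bool {
    let either_is_both = (opener.can_open && opener.can_close)
        || (closer.can_open && closer.can_close);
    if !either_is_both {
        // The restriction only applies when one run is ambiguous.
        return true;
    }
    let sum_is_multiple_of_3 = (opener.len + closer.len) % 3 == 0;
    let both_multiples_of_3 = opener.len % 3 == 0 && closer.len % 3 == 0;
    !sum_is_multiple_of_3 || both_multiples_of_3
}
```

Because the outcome depends only on lengths modulo 3, a parser can keep per-remainder bookkeeping for failed opener searches instead of walking the delimiter stack again for every closer.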

Performance

Benchmarked on Apple Silicon (M-series); latest run: February 8, 2026.
Workload: synthetic wiki-style documents with text-heavy paragraphs, lists, code blocks, and representative CommonMark features (benches/fixtures/commonmark-5k.md, benches/fixtures/commonmark-50k.md).
Method: output buffers are reused for ferromark, md4c, and pulldown-cmark where APIs allow; comrak allocates output internally.
Default GFM extensions are enabled for ferromark (tables, strikethrough, task lists, disallowed raw HTML); autolink literals are opt-in.
The main tables use non-PGO binaries for apples-to-apples defaults.

CommonMark 5KB (GFM extensions enabled, includes tables)

Parser	Throughput	Relative (vs ferromark)
ferromark	289.9 MiB/s	1.00x
pulldown-cmark	247.7 MiB/s	0.85x
md4c	242.3 MiB/s	0.84x
comrak	73.7 MiB/s	0.25x

CommonMark 50KB (GFM extensions enabled, includes tables)

Parser	Throughput	Relative (vs ferromark)
ferromark	309.3 MiB/s	1.00x
pulldown-cmark	271.7 MiB/s	0.88x
md4c	247.4 MiB/s	0.80x
comrak	76.0 MiB/s	0.25x

All parsers run with GFM tables, strikethrough, and task lists enabled. Other candidates like markdown-rs are far slower in this workload and are omitted from the main tables to keep the comparison focused. Happy to run them on request.

Key results:

ferromark is ~17% faster than pulldown-cmark at 5KB and ~14% faster at 50KB.
ferromark is ~20% faster than md4c at 5KB and ~25% faster at 50KB.
ferromark is ~3.9-4.1x faster than comrak across 5-50KB.
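
The "Relative" column and the percentages above use different denominators, which is why 0.85x corresponds to ~17% faster rather than 15%. A small sketch of the arithmetic, using throughput figures from the tables (the function names are just for illustration):

```rust
/// Throughput of `other` relative to `baseline` (the "Relative" column).
fn relative(other_mib_s: f64, baseline_mib_s: f64) -> f64 {
    other_mib_s / baseline_mib_s
}

/// How much faster `baseline` is than `other`, in percent.
fn percent_faster(baseline_mib_s: f64, other_mib_s: f64) -> f64 {
    (baseline_mib_s / other_mib_s - 1.0) * 100.0
}
```

For the 5KB run, `relative(247.7, 289.9)` is about 0.85 while `percent_faster(289.9, 247.7)` is about 17; for md4c at 50KB, `percent_faster(309.3, 247.4)` is about 25.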

Run benchmarks: cargo bench --bench comparison

Technical Notes (Top-Tier Approaches)

These are the four parsers included in the main benchmark. Ratings use a 4-level emoji heatmap focused on end-to-end Markdown-to-HTML throughput in typical workloads.

Legend:

🟩 = strongest in this row (ties allowed)
🟨 = close behind the row leader
🟧 = notable tradeoffs for this row
🟥 = weakest for this row's goal

Scoring is relative per row so each row has at least one 🟩. Each feature row is followed by a short plain-language explanation. Ferromark optimization backlog: docs/arch/ARCH-PLAN-001-performance-opportunities.md

Feature	ferromark	md4c	pulldown-cmark	comrak
Performance-Critical Architecture and Memory
Parser model (streaming, no AST)	🟩	🟩	🟨	🟥
Streaming parsers can emit output as they scan input, which avoids building an intermediate tree and keeps memory and cache pressure low. Mapping: ferromark and md4c stream; pulldown-cmark uses a pull iterator; comrak builds an AST.
API overhead profile (push / pull / AST)	🟩	🟩	🟨	🟥
This score reflects API overhead on straight Markdown-to-HTML throughput, not API flexibility. Mapping: md4c callbacks and ferromark streaming events are lean; pulldown-cmark pull iterators are close; comrak's AST model adds more overhead for this workload.
Parse/render separation	🟨	🟩	🟩	🟧
Clear separation lets parsers stay simple and fast, while renderers can be swapped or tuned. Mapping: md4c and pulldown-cmark separate parse and render clearly; ferromark is mostly separated; comrak leans on AST-based renderers.
Inline parsing pipeline (multi-phase, delimiter stacks)	🟩	🟨	🟨	🟥
Multi-phase inline parsing (collect -> resolve -> emit) keeps the hot path linear and avoids backtracking. Mapping: ferromark uses multi-phase inline parsing; md4c and pulldown-cmark are optimized byte scanners; comrak does more AST bookkeeping.
Emphasis matching efficiency	🟩	🟨	🟨	🟥
Efficient emphasis handling reduces rescans and backtracking. Stack-based algorithms tend to win on long text-heavy documents. Mapping: ferromark uses modulo-3 stacks; md4c and pulldown-cmark are optimized; comrak pays AST overhead.
Link reference processing cost	🟩	🟩	🟩	🟨
Link labels need normalization (case folding and entity handling). Optimized implementations reduce allocations and Unicode overhead. Mapping: All four normalize labels; ferromark, md4c, and pulldown-cmark focus on minimizing allocations; comrak handles more feature paths.
Zero-copy text handling	🟩	🟨	🟨	🟥
Zero-copy means most text slices point directly into input, which reduces allocations and copy costs. Mapping: ferromark uses ranges; md4c and pulldown-cmark borrow slices; comrak allocates AST nodes.
Allocation pressure (hot path)	🟩	🟩	🟨	🟥
Fewer allocations in tight loops improve CPU utilization and reduce allocator overhead. Mapping: Streaming parsers allocate less during parse/render; AST parsers allocate many nodes.
Output buffer reuse	🟩	🟩	🟨	🟥
Reusing output buffers avoids repeated allocations across runs and stabilizes performance. Mapping: ferromark, md4c, and pulldown-cmark allow reuse; comrak allocates internally.
Memory locality (working set size)	🟩	🟩	🟨	🟥
A small working set fits in cache and reduces memory traffic. Mapping: Streaming parsers keep the working set small; AST-based parsing expands it.
Cache friendliness	🟩	🟩	🟨	🟥
Linear scans and contiguous buffers are usually best for CPU caches. Mapping: ferromark and md4c favor linear scans; pulldown-cmark is close; comrak traverses AST allocations.
SIMD availability (optional)	🟩	🟨	🟩	🟥
SIMD can accelerate scanning for special characters if the SIMD path is hot enough. Mapping: ferromark and pulldown-cmark have SIMD paths; md4c relies on C optimizations; comrak is not SIMD-focused.
Hot-path control (bounds/branch minimization)	🟩	🟩	🟧	🟥
This row measures performance headroom from low-level control in inner loops. Mapping: md4c (C) and ferromark use tighter low-level tuning where beneficial; pulldown-cmark is mostly safe-Rust hot loops; comrak prioritizes higher-level flexibility.
Dependency footprint	🟩	🟩	🟨	🟥
Fewer dependencies simplify builds and reduce binary bloat. Mapping: md4c and ferromark are minimal; pulldown-cmark is moderate; comrak is heavier.
Throughput ceiling (architectural)	🟩	🟩	🟨	🟥
With fewer allocations and tighter hot loops, streaming architectures generally allow higher throughput ceilings. Mapping: ferromark and md4c lead here; pulldown-cmark is close; comrak trades throughput for flexibility.
Core compactness (moving parts)	🟨	🟩	🟨	🟧
A compact core is easier to tune and reason about. Mapping: md4c is very compact; ferromark is lean; pulldown-cmark is moderate; comrak is larger by design.

Feature Coverage and Extensibility
Extension breadth (GFM and extras)	🟩	🟧	🟨	🟩
More extensions increase compatibility but add parsing work. Mapping: comrak offers the broadest extension catalog; ferromark implements all 5 GFM extensions (tables, strikethrough, task lists, autolink literals, disallowed raw HTML); pulldown-cmark and md4c support common GFM features.
Spec compliance focus (CommonMark)	🟩	🟩	🟨	🟩
Full compliance adds edge-case handling. All four are strong here, but more features usually mean more code on the hot path. Mapping: All four target CommonMark; comrak and md4c emphasize full compliance; pulldown-cmark adds extensions; ferromark passes all 652 spec tests with no filtering.
Unicode handling configurability	🟧	🟩	🟧	🟧
Configurable Unicode handling can simplify hot paths or support special environments. Mapping: md4c can be built for UTF-8, UTF-16, or ASCII-only; the Rust parsers generally assume UTF-8.
Portability	🟨	🟩	🟨	🟨
Portability matters for embedding and wide deployment. Mapping: md4c compiles almost anywhere with a C toolchain; the Rust crates are broadly portable too.
Extension configuration surface	🟨	🟩	🟨	🟨
Fine-grained flags let you disable features to reduce work. Mapping: md4c has many flags; pulldown-cmark and comrak use options; ferromark has 7 options (`allow_html`, `allow_link_refs`, `tables`, `strikethrough`, `task_lists`, `autolink_literals`, `disallowed_raw_html`), five of which toggle the GFM extensions.
Raw HTML control (allow/deny)	🟩	🟩	🟧	🟩
Disabling raw HTML can simplify parsing and output. Mapping: md4c and comrak expose explicit switches; ferromark also exposes an explicit `allow_html` option; pulldown-cmark is more fixed in defaults.
GFM Tables	🟩	🟩	🟩	🟩
GFM table syntax (header, delimiter, body rows with alignment). Mapping: All four parsers support GFM tables.
Task lists, strikethrough	🟩	🟨	🟨	🟩
These GFM features are common in real-world Markdown. Mapping: All four parsers support task lists and strikethrough.
Footnotes	🟥	🟥	🟨	🟩
Footnotes add extra parsing and rendering complexity. Mapping: pulldown-cmark and comrak support footnotes; ferromark and md4c do not focus on them.
Math support	🟥	🟩	🟥	🟩
Math support often requires custom extensions. Mapping: md4c includes LaTeX math flags; comrak supports math extensions; ferromark and pulldown-cmark do not target math in the core.
Permissive autolinks	🟩	🟩	🟧	🟨
Permissive autolinks trade strictness for convenience. Mapping: ferromark and md4c support GFM autolink literals (URL, www, email); comrak has relaxed autolinks; pulldown-cmark focuses on spec defaults.
Wiki links	🟥	🟩	🟥	🟩
Wiki links are a non-CommonMark extension used in some ecosystems. Mapping: md4c and comrak support wiki links via flags/extensions; pulldown-cmark and ferromark do not.
Underline extension	🟥	🟩	🟥	🟩
Underline is an extension that changes emphasis semantics. Mapping: md4c and comrak include underline extensions; pulldown-cmark and ferromark stick closer to CommonMark emphasis rules.
Task list flexibility	🟧	🟧	🟧	🟩
Relaxed task list parsing can improve compatibility with messy inputs. Mapping: comrak offers relaxed task list options; ferromark, md4c, and pulldown-cmark support task lists with fewer knobs.
Output safety toggles	🟨	🟩	🟧	🟩
Safety toggles control whether raw HTML is emitted or escaped. Mapping: md4c and comrak provide explicit unsafe/escape switches; ferromark provides `allow_html` and `disallowed_raw_html` toggles; pulldown-cmark is more fixed in defaults.
no_std viability	🟥	🟨	🟩	🟥
no_std support matters for embedded or constrained environments. Mapping: pulldown-cmark supports no_std builds with features; md4c can be embedded in C environments; ferromark and comrak assume std.

Rendering and Output
Output streaming (incremental)	🟩	🟩	🟨	🟥
Output streaming lets you write HTML incrementally, which lowers peak memory and removes extra passes. Mapping: ferromark and md4c stream to buffers or callbacks; pulldown-cmark streams events; comrak often renders after AST work.
Output customization hooks	🟧	🟩	🟨	🟩
Callbacks and ASTs are great for custom rendering but add indirection compared to a single tight rendering loop. Mapping: md4c callbacks and comrak AST are very flexible; pulldown-cmark iterators are easy to transform; ferromark is lower level.
Output formats	🟥	🟧	🟨	🟩
More output formats increase flexibility but add complexity. Mapping: comrak can emit HTML, XML, and CommonMark; pulldown-cmark provides HTML plus event streams; md4c has an HTML renderer and callbacks; ferromark targets HTML.
Source position support	🟥	🟥	🟩	🟨
Tracking source positions is useful for diagnostics and tooling, but adds overhead. Mapping: pulldown-cmark has strong source map support; comrak can emit source positions; ferromark and md4c are lighter.
Source map tooling (API or CLI)	🟥	🟥	🟩	🟨
Source maps improve debuggability and tooling integration. Mapping: pulldown-cmark exposes event ranges; comrak can emit source position attributes; ferromark and md4c keep this minimal.
IO friendliness (small writes)	🟩	🟩	🟧	🟥
Many small writes can be expensive without buffering. Mapping: md4c and ferromark stream into buffers or callbacks; pulldown-cmark recommends buffered output; comrak often builds strings after AST work.

Spec Compliance

CommonMark: 100% (652/652 tests)

All CommonMark spec tests pass (no filtering).

GFM: all 5 extensions implemented

Tables, strikethrough, task lists, autolink literals, and disallowed raw HTML.

Usage

use ferromark::to_html;

// Render a Markdown string to an HTML string.
let html = to_html("# Hello\n\n**World**");
assert!(html.contains("<h1>Hello</h1>"));
assert!(html.contains("<strong>World</strong>"));

Zero-allocation API

let mut buffer = Vec::new();
ferromark::to_html_into("# Reuse me", &mut buffer);
// buffer can be reused for next call
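
The reuse pattern can be sketched without ferromark itself. In the block below, `render_into` is a stand-in with the same shape as `to_html_into` (it only wraps the input in `<p>` tags), and `render_all` is a hypothetical helper. Whether `to_html_into` clears the buffer or appends to it is not stated here, so the sketch clears explicitly before each call.

```rust
/// Stand-in renderer with the same shape as `to_html_into`.
/// It is NOT ferromark; it only wraps the input in <p> tags.
fn render_into(input: &str, out: &mut Vec<u8>) {
    out.extend_from_slice(b"<p>");
    out.extend_from_slice(input.as_bytes());
    out.extend_from_slice(b"</p>\n");
}

/// Render many documents through one buffer, returning total bytes produced.
fn render_all(docs: &[&str], out: &mut Vec<u8>) -> usize {
    let mut total = 0;
    for doc in docs {
        // `clear` drops the contents but keeps the allocated capacity,
        // so later iterations write into memory that is already there.
        out.clear();
        render_into(doc, out);
        total += out.len();
        // A real caller would hand `out` to a socket or file here.
    }
    total
}
```

Once the buffer has grown to fit the largest document, subsequent renders of that size or smaller need no further allocation for output.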

Building

# Development
cargo build

# Optimized release (recommended for benchmarks)
cargo build --release

# Run tests
cargo test

# Run CommonMark spec tests
cargo test --test commonmark_spec -- --nocapture

# Run benchmarks
cargo bench

Project Structure

src/
├── lib.rs          # Public API (to_html, to_html_into)
├── block/          # Block-level parser
│   ├── parser.rs   # Line-oriented block parsing
│   └── event.rs    # BlockEvent types
├── inline/         # Inline-level parser
│   ├── mod.rs      # Three-phase inline parsing
│   ├── marks.rs    # Mark collection
│   ├── code_span.rs
│   ├── emphasis.rs      # Modulo-3 stack optimization
│   ├── strikethrough.rs # GFM strikethrough resolution
│   └── links.rs         # Link/image/autolink parsing
├── cursor.rs       # Pointer-based byte cursor
├── range.rs        # Compact u32 range type
├── render.rs       # HTML writer
├── escape.rs       # HTML escaping (memchr-optimized)
└── limits.rs       # DoS prevention constants

License

MIT OR Apache-2.0
