Python bindings for the htmd Rust library, a fast HTML to Markdown converter.
pip install htmd-py
- Python 3.9+
You can customise the HTML to Markdown conversion with the following options:
- heading_style: Style for headings (values from htmd.HeadingStyle)
- hr_style: Style for horizontal rules (values from htmd.HrStyle)
- br_style: Style for line breaks (values from htmd.BrStyle)
- link_style: Style for links (values from htmd.LinkStyle)
- link_reference_style: Style for referenced links (values from htmd.LinkReferenceStyle)
- code_block_style: Style for code blocks (values from htmd.CodeBlockStyle)
- code_block_fence: Fence style for code blocks (values from htmd.CodeBlockFence)
- bullet_list_marker: Marker for unordered lists (values from htmd.BulletListMarker)
- preformatted_code: Whether to preserve whitespace in inline code (boolean)
- skip_tags: List of HTML tags to skip during conversion (list of strings)
Options are set as attributes on an htmd.Options object, which is then passed to htmd.convert_html:
import htmd
# Simple conversion with default options
markdown = htmd.convert_html("<h1>Hello World</h1>")
print(markdown) # "# Hello World"
# Using custom options
options = htmd.Options()
options.heading_style = htmd.HeadingStyle.SETEX
options.bullet_list_marker = htmd.BulletListMarker.DASH
markdown = htmd.convert_html("<h1>Hello World</h1><ul><li>Item 1</li></ul>", options)
print(markdown)
# Skip specific HTML tags
options = htmd.create_options_with_skip_tags(["script", "style"])
markdown = htmd.convert_html("<h1>Hello</h1><script>alert('Hi');</script>", options)
print(markdown) # "# Hello" (script tag is skipped)
Refer to the htmd docs for all available options.
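When several options are set at once, a mistyped attribute name is easy to miss. The following is a minimal sketch of a helper that checks names against the option list in this README before assigning them. The helper apply_options and the constant DOCUMENTED_OPTIONS are hypothetical and not part of htmd; the example uses a plain stand-in object so it runs without htmd installed, and it assumes htmd.Options accepts attribute assignment as in the example above.

```python
from types import SimpleNamespace

# Option names copied from the list in this README.
DOCUMENTED_OPTIONS = frozenset({
    "heading_style", "hr_style", "br_style", "link_style",
    "link_reference_style", "code_block_style", "code_block_fence",
    "bullet_list_marker", "preformatted_code", "skip_tags",
})


def apply_options(options, **overrides):
    """Set each override on `options`, rejecting names not listed above.

    Hypothetical helper for illustration; it is not provided by htmd.
    """
    unknown = sorted(set(overrides) - DOCUMENTED_OPTIONS)
    if unknown:
        raise ValueError(f"Unknown option(s): {', '.join(unknown)}")
    for name, value in overrides.items():
        setattr(options, name, value)
    return options


# Stand-in object so the sketch runs without htmd installed.
# With htmd, pass htmd.Options() and values such as htmd.HeadingStyle.SETEX.
opts = apply_options(
    SimpleNamespace(),
    heading_style="setex",
    bullet_list_marker="dash",
)
print(opts.heading_style, opts.bullet_list_marker)
```

With htmd installed, the same call would presumably look like apply_options(htmd.Options(), heading_style=htmd.HeadingStyle.SETEX), with the result passed to htmd.convert_html as the second argument.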
The module provides enumeration-like objects for all option values:
import htmd
# HeadingStyle
htmd.HeadingStyle.ATX # "atx"
htmd.HeadingStyle.SETEX # "setex"
# HrStyle
htmd.HrStyle.DASHES # "dashes"
htmd.HrStyle.ASTERISKS # "asterisks"
htmd.HrStyle.UNDERSCORES # "underscores"
# BrStyle
htmd.BrStyle.TWO_SPACES # "two_spaces"
htmd.BrStyle.BACKSLASH # "backslash"
# LinkStyle
htmd.LinkStyle.INLINED # "inlined"
htmd.LinkStyle.REFERENCED # "referenced"
# LinkReferenceStyle
htmd.LinkReferenceStyle.FULL # "full"
htmd.LinkReferenceStyle.COLLAPSED # "collapsed"
htmd.LinkReferenceStyle.SHORTCUT # "shortcut"
# CodeBlockStyle
htmd.CodeBlockStyle.INDENTED # "indented"
htmd.CodeBlockStyle.FENCED # "fenced"
# CodeBlockFence
htmd.CodeBlockFence.TILDES # "tildes"
htmd.CodeBlockFence.BACKTICKS # "backticks"
# BulletListMarker
htmd.BulletListMarker.ASTERISK # "asterisk"
htmd.BulletListMarker.DASH # "dash"
Benchmarked on small (12 lines) and medium (1000 lines) markdown strings:
- vs. markdownify: 10x faster on the small input (S), 30x faster on the medium input (M)
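The benchmark script itself is not shown here. As a rough sketch of how such a comparison could be timed, the snippet below uses the standard library's timeit and takes the best of several repeats per converter. The function compare and the stand-in converter are hypothetical; this is not the method used to produce the figures above.

```python
import timeit


def compare(converters, html, repeat=5, number=20):
    """Return the best observed seconds-per-call for each named converter."""
    results = {}
    for name, fn in converters.items():
        # Bind fn as a default argument so each timer calls its own converter.
        timings = timeit.repeat(lambda fn=fn: fn(html), repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results


def stub_converter(html):
    """Stand-in converter so the sketch runs without third-party packages."""
    return html.replace("<h1>", "# ").replace("</h1>", "")


sample = "<h1>Hello World</h1>\n" * 12
timings = compare({"stub": stub_converter}, sample, repeat=3, number=10)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")

# With both packages installed, the converters could be swapped in, e.g.
# compare({"htmd": htmd.convert_html, "markdownify": markdownify.markdownify}, sample)
```

Taking the minimum over repeats reduces noise from other processes; absolute numbers will vary by machine and input.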
Maintained by lmmx. Contributions welcome!
- Issues & Discussions: Please open a GitHub issue or discussion for bugs, feature requests, or questions.
- Pull Requests: PRs are welcome!
- Install the dev extra, e.g. with uv: uv pip install -e .[dev]
- Run tests (when available) and include updates to docs or examples if relevant.
- If reporting a bug, please include the version and the error message/traceback if available.
- htmd - The underlying Rust library
- Inspired by comrak - Python bindings for Comrak, a fast Markdown to HTML converter.
Licensed under the Apache License, Version 2.0.