[Feature] Limited support for commonly used HTML tags and entities in Markdown #1249

Open
2 tasks done
kyr0 opened this issue Feb 20, 2024 · 3 comments

@kyr0

kyr0 commented Feb 20, 2024

Initial checklist

  • I agree to follow the code of conduct
  • I searched issues and discussions and couldn’t find anything (or linked relevant results below)

Problem

I'm facing an issue with preview images rendering at a huge size because the source images are very wide and, AFAIK, there is no way in CommonMark or GFM to set a width or height. The image plugin therefore correctly renders images as-is. However, inline HTML in Markdown is common and widely used. This limitation breaks the UX and behaviour of my current application, and I'm quite sure it is hindering Milkdown adoption. I couldn't find a plugin that supports the HTML tags usually supported by decent Markdown editors/libraries.

Solution

It would be great if Milkdown supported the following HTML elements, with limited support for attributes, exactly as they render here on GitHub:

Image

<img src="$string" width="$numeric_only" height="$numeric_only" /> rendered as:
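The attribute restrictions in the image example above could be enforced roughly as follows. This is only a minimal sketch, not Milkdown's API: `ImgAttrs` and `parseImgTag` are hypothetical names, and the regex-based matching assumes a single, well-formed tag with double-quoted attribute values.

```typescript
// Hypothetical shape for a parsed image; not a Milkdown or mdast type.
interface ImgAttrs {
  src: string;
  width?: number;
  height?: number;
}

// Parses one <img ...> or <img ... /> tag and keeps only the allowed
// attributes: src (any string) plus width/height when purely numeric.
function parseImgTag(html: string): ImgAttrs | null {
  const tag = /^<img\s+([^>]*?)\s*\/?>$/i.exec(html.trim());
  if (tag === null) return null;

  // name="value" pairs with double-quoted values only.
  const attrPattern = /([a-zA-Z-]+)="([^"]*)"/g;
  const attrs: Partial<ImgAttrs> = {};
  let m: RegExpExecArray | null;
  while ((m = attrPattern.exec(tag[1])) !== null) {
    const name = m[1].toLowerCase();
    const value = m[2];
    const numeric = /^\d+$/.test(value);
    if (name === "src") {
      attrs.src = value;
    } else if (name === "width" && numeric) {
      attrs.width = Number(value);
    } else if (name === "height" && numeric) {
      attrs.height = Number(value);
    }
    // Everything else (style, onerror, non-numeric sizes, ...) is dropped.
  }
  // An image without a src is not useful, so reject it.
  return attrs.src ? { ...attrs, src: attrs.src } : null;
}
```

With this sketch, `width="100"` is kept as the number `100`, while `width="100px"` or an `onerror` handler is silently discarded.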

Underline

<ins>will be underlined</ins> rendered as: will be underlined

HTML Entities and Symbols

&nbsp;&nbsp;&nbsp;&nbsp;&ndash;&copy; rendered as:     –©
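Entity handling like the example above could be sketched as a small decoder. The names `NAMED_ENTITIES` and `decodeEntities` are hypothetical, and the table is a deliberately tiny allowlist rather than the full HTML entity set.

```typescript
// Small allowlist of named entities (an assumption; extend as needed).
const NAMED_ENTITIES = new Map<string, string>([
  ["nbsp", "\u00A0"],
  ["ndash", "\u2013"],
  ["copy", "\u00A9"],
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
]);

// Replaces known named entities and numeric references (&#169; / &#xA9;).
// Unknown or out-of-range entities are left untouched.
function decodeEntities(text: string): string {
  return text.replace(
    /&(#x[0-9a-fA-F]+|#\d+|[a-zA-Z]+);/g,
    (match: string, body: string): string => {
      if (body.charAt(0) === "#") {
        const hex = body.charAt(1) === "x";
        const code = parseInt(body.slice(hex ? 2 : 1), hex ? 16 : 10);
        return code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      return NAMED_ENTITIES.get(body) ?? match;
    },
  );
}
```

Note that decoding is lossy: to write `&copy;` back out on serialization instead of the literal character, an editor would have to keep the original entity text alongside the decoded value.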

Center

<p align="center">This text is centered.</p>

rendered as:

This text is centered.

Comments

Some people need the ability to write sentences in their Markdown files that will not appear in the rendered output.

[This is a comment that will be hidden.]: #

The following is hidden:

Forced Line Breaks

<br><br> rendered as

A



B

Simple Lists, also nested (in tables)

| Syntax      | Description |
| ----------- | ----------- |
| Header      | Title |
| List        | Here's a list! <ul><li>Item one.</li><li>Item two.</li></ul> |

rendered as:

Syntax Description
Header Title
List Here's a list!
  • Item one.
  • Item two.

Table of Contents (ToC)

#### Table of Contents

- [Underline](#underline)
- [Indent](#indent)
- [Center](#center)
- [Color](#color)

Rendered as:

Table of Contents

Video and Audio

[![Video alt text](https://github.com/Milkdown/milkdown/assets/454817/0270c732-7198-45a8-8f9a-a3ca70605ae1)](https://www.youtube.com/watch?v=a8CwpGARAsQ)

rendered as:

Video alt text


All of this, I think, can be achieved by parsing text nodes as HTML, constructing internal AST nodes for the respective elements (including the additional attributes), and transforming them back into their original form on serialization, right?
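To make the parse-and-serialize idea concrete, here is a minimal sketch for a single tag, `<ins>`. The node shapes and the functions `parseInline` and `serializeInline` are hypothetical and unrelated to Milkdown's or remark's actual types; a real plugin would hook into the editor's own parser and serializer instead.

```typescript
// Hypothetical node shapes for illustration only.
type InlineNode =
  | { type: "text"; value: string }
  | { type: "underline"; value: string };

// Splits a text run into plain text and <ins>...</ins> spans.
function parseInline(source: string): InlineNode[] {
  const nodes: InlineNode[] = [];
  const pattern = /<ins>([\s\S]*?)<\/ins>/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    if (m.index > last) {
      nodes.push({ type: "text", value: source.slice(last, m.index) });
    }
    nodes.push({ type: "underline", value: m[1] });
    last = m.index + m[0].length;
  }
  if (last < source.length) {
    nodes.push({ type: "text", value: source.slice(last) });
  }
  return nodes;
}

// Writes the nodes back out in their original HTML-in-Markdown form.
function serializeInline(nodes: InlineNode[]): string {
  return nodes
    .map((n) => (n.type === "underline" ? `<ins>${n.value}</ins>` : n.value))
    .join("");
}
```

For input that only uses lowercase `<ins>` tags, `serializeInline(parseInline(text))` returns the original text unchanged, which is the round-trip property the question above is asking for.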

p.s.: Spec is highly influenced by:
https://www.markdownguide.org/hacks/

Alternatives

I'm not sure about that. Please advise.

@kyr0 kyr0 added the enhancement New feature or request label Feb 20, 2024
@kyr0
Author

kyr0 commented Feb 20, 2024

I'm pretty sure the intention is not to implement support for this in the core. Would it make sense to implement a milkdown-html plugin? And could you please point me to a single implementation that does something similar, works with the latest release, and follows the currently advised pattern?

I noticed that there have been some breaking changes in the plugin APIs over the past 2 years, making it a bit hard for a developer not familiar with the ProseMirror and Milkdown codebases to fetch some "in the wild" code and get it working quickly, to tinker with it and learn by experiment.

Is going down this road a good idea?
https://github.com/Milkdown/milkdown/blob/main/packages/plugin-math/src/index.ts

I also understand that using remark-html might be advisable to generate HTML, and to parse HTML I'd probably use linkedom, which would basically leave me with only mapping the AST node attributes.

I'll probably start tinkering with it this weekend.

@quank123wip
Contributor

Hmm, adding HTML support directly may not be a good idea, even within the scope of a simple plugin. If we're going to add these extensions, maybe we should limit HTML to specifically scoped blocks, for example inside '''html-block or something else.
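One way to read the scoped-block suggestion is as a pre-pass that separates regular Markdown from blocks explicitly marked as HTML. The sketch below assumes the marker is a fenced code block whose info string is `html-block`; the fence syntax, `Segment`, and `splitScopedHtml` are all assumptions for illustration, not an agreed design.

```typescript
// Built at runtime so this example does not contain a literal fence line.
const FENCE = "`".repeat(3);

// Hypothetical segment type: plain Markdown vs. explicitly scoped HTML.
interface Segment {
  kind: "markdown" | "html";
  content: string;
}

// Splits a document into Markdown and scoped-HTML segments, line by line.
function splitScopedHtml(source: string): Segment[] {
  const segments: Segment[] = [];
  let kind: Segment["kind"] = "markdown";
  let buffer: string[] = [];

  const flush = (): void => {
    if (buffer.length > 0) {
      segments.push({ kind, content: buffer.join("\n") });
    }
    buffer = [];
  };

  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    if (kind === "markdown" && trimmed === FENCE + "html-block") {
      flush(); // close the Markdown run before the HTML block
      kind = "html";
    } else if (kind === "html" && trimmed === FENCE) {
      flush(); // close the HTML block
      kind = "markdown";
    } else {
      buffer.push(line);
    }
  }
  flush(); // an unclosed block is emitted as HTML up to end of input
  return segments;
}
```

Only the `html` segments would then need HTML parsing and sanitizing, while everything else goes through the normal Markdown pipeline untouched.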

@kyr0
Author

kyr0 commented Jul 22, 2024

@quank123wip I agree. I've been using Milkdown for quite some time now, and as my project allows HTML to be extracted from websites and then loaded into Milkdown, I'm currently using turndown and a custom parser to sanitize the HTML. It's a complicated and error-prone process. If Milkdown supported an easy mixed mode, just as Markdown is intended to be used (all markup that is not supported by Markdown could be HTML), a specific AST node representation might make sense. But representing modern HTML can become incredibly complex, so idk. It seems to be a lot of work. For my project I'm thinking about moving from Milkdown to an editor that supports HTML natively. As much as I love Markdown, Milkdown and ProseMirror, the datatype compatibility issues my current approach brings to the table are too much. aka I'm probably using it wrong ;)
