Fix handling of indented HTML in DOMParser #315

Merged
merged 2 commits into from
Aug 20, 2024

Contributor

@mattmikolay mattmikolay commented Aug 15, 2024

This PR fixes a bug that causes DOMParser to throw an error when parsing indented HTML with the default collapsing of whitespace.

The root cause of the bug is in add_text_node. Notice the assignment of node_before on line 571:

node_before = top.content[-1]

When top.content is an empty list, the following error is thrown:

IndexError: list index out of range

Compare with the JavaScript implementation of addTextNode in the original ProseMirror:

 let nodeBefore = top.content[top.content.length - 1]

Here, when top.content is an empty array, the index expression evaluates to -1, so nodeBefore is set to top.content[-1]. In JavaScript, an out-of-range index such as -1 is an ordinary property lookup that returns undefined rather than throwing. In Python, indexing an empty list with -1 raises IndexError. Thus, we need to check that top.content is truthy (non-empty) before accessing top.content[-1].
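
To illustrate the guard described above, here is a minimal standalone sketch. It is not the actual patch: last_or_none is a hypothetical helper name, and a plain list stands in for top.content.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def last_or_none(items: Sequence[T]) -> Optional[T]:
    """Return the last element of items, or None when items is empty."""
    # Hypothetical helper, not part of the library. An empty sequence is
    # falsy, so items[-1] is only evaluated when an element exists.
    return items[-1] if items else None


content: list = []  # stands in for an empty top.content

try:
    content[-1]  # unguarded access, as on the original line
except IndexError as exc:
    unguarded_error = str(exc)

node_before = last_or_none(content)  # guarded access
```

With the guard, node_before ends up as None for empty content, which plays the same role as undefined in the JavaScript version.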

Thanks for the great library!

@mattmikolay mattmikolay marked this pull request as ready for review August 15, 2024 13:12
Contributor

@p7g p7g left a comment

Thanks for the contribution!

@p7g p7g merged commit 4754bf6 into fellowapp:main Aug 20, 2024
3 of 4 checks passed