
Search: don't include content from non-content nodes #12757

Open
stsewd wants to merge 1 commit into main from search-fix-overparsing

Conversation

@stsewd
Member

@stsewd stsewd commented Feb 5, 2026

When extracting the text of HTML documents using text(), the result includes the contents of inline scripts and styles. For pages with inline graphs, this means we index a lot of content, and ES times out when searching over those documents.
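
To illustrate the idea, here is a minimal sketch of text extraction that skips non-content nodes. It is not the code from this PR: it uses the standard library's `html.parser` instead of the project's actual HTML parser, and the names `NON_CONTENT_TAGS`, `ContentTextExtractor`, and `extract_content_text` are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical set of tags whose contents should not be indexed.
NON_CONTENT_TAGS = {"script", "style"}


class ContentTextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside non-content tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NON_CONTENT_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in NON_CONTENT_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not nested in a non-content tag.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def get_text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def extract_content_text(html):
    """Return the visible text of `html` without script/style contents."""
    parser = ContentTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.get_text()
```

With this approach, a page with a large inline graph definition inside a `<script>` tag contributes only its visible text to the index.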

@stsewd stsewd requested a review from a team as a code owner February 5, 2026 18:21
@stsewd stsewd requested a review from humitos February 5, 2026 18:21
stsewd added a commit that referenced this pull request Feb 5, 2026
While fixing #12757, I found a project that was indexing 1.5MB, and ES timed out
when searching over that content. We can reduce this even further if we want,
but hopefully, with the other fix, this case will be less common.
@stsewd stsewd requested a review from ericholscher February 5, 2026 18:30
@stsewd stsewd moved this to Needs review in 📍Roadmap Feb 5, 2026
