Optimize DOM traversal and CSS selector performance (48% faster for styled elements) #191
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Performance Improvements Plan
Identified Performance Issues
implicit_headingsandfont_sizesiterations into single pass throughprocess_styled_elementsmethod.search()to.css()and improved selector specificity (e.g.,td > pinstead oftd p)Implementation Details
Converter Optimizations:
implicit_headingsandfont_sizesmethods now share a single DOM traversal viaprocess_styled_elements, reducing redundant iterations through all styled elementslist_item_spansprivate method to cache the version-specific CSS selector, avoiding repeated calls toWordToMarkdown.soffice.major_versiontd ptotd > p(direct child selector is more efficient)li ptoli > p(direct child selector)table tr:first tdtotable tr:first-child > td(more specific and efficient).search()to.css()consistently for better performanceTesting & Validation:
test_word_to_markdown_performance.rbwith tests to validate:script/benchmarkto measure actual performance gains:Code Style Fixes:
bin/w2m: Removed redundant.to_scalllib/word-to-markdown/converter.rb: Refactoredprocess_styled_elementsto reduce line count (usednextguard clause)script/benchmark:extract_font_sizehelper method to avoid long safe navigation chains.to_afor selector benchmarksDocumentation:
Created
PERFORMANCE_IMPROVEMENTS.mdwith:Configuration Updates:
.rubocop.ymlto useplugins:instead ofrequire:for rubocop extensionsMetrics/LineLengthtoLayout/LineLengthper rubocop deprecationPerformance Impact
Measured improvements:
Benefits most noticeable with:
Security
Backward Compatibility
Code Quality
word-to-markdown.gemspecnot touched (outside scope of this PR)Progress
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.