Skip to content

Conversation

@dmitrivMS
Copy link
Contributor

Small perf improvements + fixed a bug in rtrim where it didn't handle multi-char suffixes correctly.
Perf measurement results from different models are below.

Codex

Scenario Old (ms) New (ms) Δ vs old
ltrim · leading-spaces 1331.26 295.22 +350.94%
ltrim · leading-comment-markers 197.20 82.54 +138.92%
ltrim · leading-no-match 18.17 10.86 +67.33%
rtrim · trailing-spaces 4184.59 454.51 +820.67%
rtrim · trailing-comment-markers 169.63 66.30 +155.85%
rtrim · trailing-no-match 21.58 11.21 +92.44%

Gemini

ltrim

Scenario Old (ms) New (ms) Improvement
Short string, 1 char needle 8.26 4.83 41.56%
Short string, long needle 6.30 5.99 5.00%
Long string, 1 char needle 955.75 322.78 66.23%
Long string, long needle 2582.88 1490.81 42.28%
No match 3.55 1.81 49.07%
Full match 2555.75 1616.69 36.74%
Match at start/end only 4.93 1.76 64.29%
Pathological 1 char 9614.58 3227.39 66.43%

rtrim

Scenario Old (ms) New (ms) Improvement
Short string, 1 char needle 13.74 5.27 61.63%
Short string, long needle 5.90 6.79 -14.99%
Long string, 1 char needle 2677.69 296.25 88.94%
Long string, long needle 2808.86 1563.51 44.34%
No match 2.99 1.18 60.58%
Full match 2863.60 1625.15 43.25%
Match at start/end only 5.75 1.86 67.65%
Pathological 1 char 28616.35 2964.60 89.64%

Sonet

ltrim Performance

Scenario Old (ms) New (ms) Improvement Speedup
Single char space (short string) 7.84 5.68 27.5% 1.38x
Single char space (long string) 7.18 2.70 62.4% 2.66x
Single char space (many spaces) 870.62 186.27 78.6% 4.67x
Multi-char (single occurrence) 6.21 2.92 53.0% 2.13x
No match 1.92 1.16 39.4% 1.66x
Entire string (single char) 3.36 1.52 54.9% 2.21x
Slash (single char) 5.98 2.73 54.4% 2.19x
Tab character 5.05 2.32 54.1% 2.18x
Newline character 4.68 2.21 52.8% 2.12x

ltrim Average Improvement: 53.0%

rtrim Performance

Scenario Old (ms) New (ms) Improvement Speedup
Single char space (short string) 12.42 4.64 62.6% 2.68x
Single char space (long string) 22.42 2.84 87.3% 7.89x
Single char space (many spaces) 2692.02 178.77 93.4% 15.06x
Multi-char (single occurrence) 7.31 4.12 43.7% 1.77x
No match 3.22 1.11 65.6% 2.90x
Entire string (single char) 8.29 1.52 81.7% 5.45x
Slash (single char) 14.26 2.87 79.9% 4.97x
Tab character 11.82 2.43 79.5% 4.86x
Newline character 11.61 2.15 81.4% 5.40x

rtrim Average Improvement: 75.0%

Key Insights

  • rtrim shows significantly better improvements (75.0% avg) compared to ltrim (53.0% avg)
  • Single-character optimizations are highly effective, especially for rtrim with many characters to trim
  • Biggest win: rtrim with 1000 spaces is 15x faster (93.4% improvement)
  • Multi-char scenarios still show good improvements (44-53%) despite being more complex
  • Even no-match scenarios benefit from the optimization (39-66% faster)

@dmitrivMS dmitrivMS requested review from Copilot and hediet and removed request for Copilot November 18, 2025 23:04
@vs-code-engineering vs-code-engineering bot added this to the November 2025 milestone Nov 18, 2025
Copilot AI review requested due to automatic review settings November 18, 2025 23:06
Copilot finished reviewing on behalf of dmitrivMS November 18, 2025 23:07
Copilot finished reviewing on behalf of dmitrivMS November 18, 2025 23:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the performance of the ltrim and rtrim string utility functions, fixes a bug in rtrim where it didn't correctly handle multi-character suffixes, and optimizes the AmbiguousCharacters cache implementation.

Key changes:

  • Optimized ltrim and rtrim with fast paths for single-character needles using charCodeAt instead of indexOf/lastIndexOf
  • Fixed rtrim bug: replaced incorrect lastIndexOf logic with endsWith for proper multi-char suffix handling
  • Simplified AmbiguousCharacters cache by using string concatenation as cache key instead of JSON.stringify on arrays

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/vs/base/common/strings.ts Added single-char optimizations to ltrim/rtrim, fixed rtrim multi-char bug, simplified AmbiguousCharacters cache key
src/vs/base/test/common/strings.test.ts Added test cases for multi-char trim scenarios to verify bug fix and prevent regressions

@dmitrivMS dmitrivMS merged commit d9fd98f into main Nov 19, 2025
28 checks passed
@dmitrivMS dmitrivMS deleted the dev/dmitriv/string-utils-perf branch November 19, 2025 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants