refactor array filtering#212

Merged
bodewig merged 6 commits into apache:master from esaulpaugh:master
Jun 5, 2025
Conversation

@esaulpaugh
Contributor

Prefer array truncation via <T> java.util.Arrays.copyOf(T[],int) to ArrayList::toArray because it is faster.
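A minimal sketch of the pattern being proposed (hypothetical helper, not the merged DirectoryScanner code): filter into a fixed-size scratch array sized to the input, then truncate once with Arrays.copyOf instead of accumulating into an ArrayList and calling toArray.

```java
import java.util.Arrays;

public class FilterDemo {
    // Hypothetical helper illustrating the pattern: write survivors into a
    // scratch array that is at most input.length long, then truncate once.
    static String[] compact(String[] input) {
        String[] tmp = new String[input.length];
        int n = 0;
        for (String s : input) {
            if (s != null) {
                tmp[n++] = s;
            }
        }
        // One allocation plus one System.arraycopy, versus ArrayList's
        // growth bookkeeping followed by toArray's copy.
        return Arrays.copyOf(tmp, n);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compact(new String[] { "a", null, "b" }))); // prints [a, b]
    }
}
```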

@esaulpaugh esaulpaugh changed the title improve DirectoryScanner filtering performance improve filtering performance Sep 5, 2024
@bodewig
Member

bodewig commented May 31, 2025

Thank you. Have you faced a performance problem that has been eased by this pull request or is it based on "this arraylist could and should be avoided"?

I'm not opposed to merging the change but wouldn't label it as performance improvement if the difference is likely unnoticeable for most projects.

@esaulpaugh
Contributor Author

I have not faced a performance issue but I discovered this trick while optimizing a String parser in one of my projects. It performs better in microbenchmarks compared to building an ArrayList. I suggested it to gson as well: google/gson#2734

On corretto-11.0.27 aarch64:

Benchmark     Mode  Cnt   Score    Error  Units
Measure.arr   avgt   14  17.527 ±  0.435  ns/op
Measure.list  avgt   14  44.013 ± 27.030  ns/op

On graalvm-jdk-24+36.1 (24.2.0) aarch64:

Benchmark     Mode  Cnt   Score   Error  Units
Measure.arr   avgt   14   9.795 ± 0.173  ns/op
Measure.list  avgt   14  41.321 ± 3.544  ns/op

import java.util.ArrayList;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@Fork(2)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 7, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 7, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class Measure {

  private static final String[] STRINGS = { null, "", "00", "\0", "...", null, "H", "ab", "+", "UUU", null, null, "5\t5" };
  
  @Benchmark
  public void list(Blackhole blackhole) {
      ArrayList<String> list = new ArrayList<>(/* STRINGS.length */);
      for (String s : STRINGS) {
          if (s != null) {
              list.add(s);
          }
      }
      blackhole.consume(list.toArray(new String[0]));
  }
  
  @Benchmark
  public void arr(Blackhole blackhole) {
      String[] arr = new String[STRINGS.length];
      int i = 0;
      for (String s : STRINGS) {
          if (s != null) {
              arr[i++] = s;
          }
      }
      blackhole.consume(Arrays.copyOf(arr, i));
  }
}

It's unlikely that this change alone would make a noticeable difference to users, but small changes can add up eventually, in my experience.

@bodewig
Member

bodewig commented Jun 1, 2025

As I said, I'm not opposed to merging it; it will mostly be "a refactoring".

I wondered whether an extra check for whether we've used all slots of the temporary array, avoiding copyOf altogether in some cases, might be an additional improvement. In very many cases DirectoryScanner will not hit any symlinks during the scan at all. Then again, the performance of DirectoryScanner is very likely dominated by filesystem I/O, so there really is not much of a difference anyway. The extra check would make the code more difficult to read for little real gain.
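The extra check described above could look like this (a hypothetical sketch, not the merged Ant code): when no elements were filtered out, the scratch array is already the exact size, so the copyOf is skipped and the scratch array is returned directly.

```java
import java.util.Arrays;

public class FilterNoCopy {
    // Hypothetical variant: skip Arrays.copyOf when nothing was filtered,
    // which is the common case when DirectoryScanner hits no symlinks.
    static String[] compact(String[] input) {
        String[] tmp = new String[input.length];
        int n = 0;
        for (String s : input) {
            if (s != null) {
                tmp[n++] = s;
            }
        }
        // If every slot was used, tmp is already exact; no second copy needed.
        return n == tmp.length ? tmp : Arrays.copyOf(tmp, n);
    }
}
```

As noted, the saved copy is tiny next to filesystem I/O, so the readability cost may outweigh the gain.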

@esaulpaugh esaulpaugh changed the title improve filtering performance refactor array filtering Jun 2, 2025
@bodewig
Member

bodewig commented Jun 4, 2025

Thank you @esaulpaugh. If you want to be credited in our contributors file, please add yourself to CONTRIBUTORS and contributors.xml.

@bodewig bodewig merged commit b3f64f0 into apache:master Jun 5, 2025
@bodewig
Member

bodewig commented Jun 5, 2025

Thank you
