Skip to content

Commit

Permalink
[SPARK-46970][CORE] Rewrite OpenHashSet#hasher with pattern matching
Browse files Browse the repository at this point in the history
### What changes were proposed in this pull request?
The proposed changes in this pr involve refactoring the method of creating a `Hasher[T]` instance in the code. The original code used a series of if-else statements to check the class type of `T` and create the corresponding `Hasher[T]` instance. The proposed change simplifies this process by using Scala's pattern matching feature. The new code is more concise and easier to read.

### Why are the changes needed?
The changes are needed for several reasons. Firstly, the use of pattern matching makes the code more idiomatic to Scala, which is beneficial for readability and maintainability. Secondly, the original code contains a comment about a bug in the Scala 2.9.x compiler that prevented the use of pattern matching in this context. However, Apache Spark 4.0 has switched to using Scala 2.13, and the new code has passed all tests, it appears that the bug no longer exists in the new version of Scala. Lastly, the change reduces the amount of code and makes it cleaner and easier to understand.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44998 from LuciferYang/openhashset-hasher.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
  • Loading branch information
LuciferYang authored and dongjoon-hyun committed Feb 4, 2024
1 parent 062522e commit 7ca355c
Showing 1 changed file with 6 additions and 22 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -62,28 +62,12 @@ class OpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
// specialization to work (specialized class extends the non-specialized one and needs access
// to the "private" variables).

protected val hasher: Hasher[T] = {
// It would've been more natural to write the following using pattern matching. But Scala 2.9.x
// compiler has a bug when specialization is used together with this pattern matching, and
// throws:
// scala.tools.nsc.symtab.Types$TypeError: type mismatch;
// found : scala.reflect.AnyValManifest[Long]
// required: scala.reflect.ClassTag[Int]
// at scala.tools.nsc.typechecker.Contexts$Context.error(Contexts.scala:298)
// at scala.tools.nsc.typechecker.Infer$Inferencer.error(Infer.scala:207)
// ...
val mt = classTag[T]
if (mt == ClassTag.Long) {
(new LongHasher).asInstanceOf[Hasher[T]]
} else if (mt == ClassTag.Int) {
(new IntHasher).asInstanceOf[Hasher[T]]
} else if (mt == ClassTag.Double) {
(new DoubleHasher).asInstanceOf[Hasher[T]]
} else if (mt == ClassTag.Float) {
(new FloatHasher).asInstanceOf[Hasher[T]]
} else {
new Hasher[T]
}
protected val hasher: Hasher[T] = classTag[T] match {
case ClassTag.Long => new LongHasher().asInstanceOf[Hasher[T]]
case ClassTag.Int => new IntHasher().asInstanceOf[Hasher[T]]
case ClassTag.Double => new DoubleHasher().asInstanceOf[Hasher[T]]
case ClassTag.Float => new FloatHasher().asInstanceOf[Hasher[T]]
case _ => new Hasher[T]
}

protected var _capacity = nextPowerOf2(initialCapacity)
Expand Down

0 comments on commit 7ca355c

Please sign in to comment.