Commit 82bd980
[Improvement](hash) opt for pack_fixeds (#59410)
<img width="298" height="2142" alt="image"
src="https://github.com/user-attachments/assets/3be70146-c4bd-4ff7-ac96-2645f89fed14"
/>
This pull request refactors and optimizes the handling of null maps and
key packing in hash join and hash table code, with a focus on improving
SIMD (Single Instruction, Multiple Data) usage and simplifying null
bitmap logic. The changes replace older byte-searching utilities with
new, more efficient SIMD-based functions, update how null bitmaps are
packed and processed, and streamline column null data replacement.
Additionally, the logic for determining hash key types and handling
fixed key serialization is improved for better correctness and
performance.
Key improvements and changes:
### SIMD utilities and null map handling
* Introduced new SIMD-based functions `contain_one` and `contain_zero`
in `simd/bits.h`, replacing the older `contain_byte` and related logic
for checking the presence of ones or zeros in null maps, resulting in
more efficient null detection.
* Updated all usages of null map checks throughout the codebase to use
the new `contain_one` and `contain_zero` functions, simplifying and
unifying the logic for detecting nulls in columns and filters.
[[1]](diffhunk://#diff-0732e01c1a3f38997ada381c43aff98286e86ca7519db5469a6e4dcdec5bce44L195-L200)
[[2]](diffhunk://#diff-3110bab7d558f46b88ae1958b09ac369a92cac4bff98b280b2cf83d2d7aecbf4L117-R118)
[[3]](diffhunk://#diff-3110bab7d558f46b88ae1958b09ac369a92cac4bff98b280b2cf83d2d7aecbf4L369-R371)
[[4]](diffhunk://#diff-8981dd2e1f08aaa46a97aeef27bd906c64d1bb08deedc0fe1d94c1c49dc064ceL100-R100)
[[5]](diffhunk://#diff-9fd61a223bcb3b7a9cb93c2d26c9364d8cce2131673fe286f22a80b09c6fd2c6L283-R283)
[[6]](diffhunk://#diff-9fd61a223bcb3b7a9cb93c2d26c9364d8cce2131673fe286f22a80b09c6fd2c6L601-R605)
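As a rough illustration of what such checks do, here is a minimal portable sketch. It is not the code from `simd/bits.h`: the `_sketch` function names are hypothetical, the word-at-a-time scan stands in for real SIMD instructions, and it assumes a null map holds one byte per row (0 for not null, 1 for null).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for contain_one / contain_zero (names and bodies are
// assumptions, not the actual Doris implementation).

// Returns true if any byte in [data, data + size) is non-zero.
inline bool contain_one_sketch(const uint8_t* data, size_t size) {
    assert(data != nullptr || size == 0);
    size_t i = 0;
    // Examine eight bytes per step; memcpy avoids unaligned-access issues.
    for (; i + 8 <= size; i += 8) {
        uint64_t word;
        std::memcpy(&word, data + i, sizeof(word));
        if (word != 0) return true;
    }
    // Handle the tail that does not fill a whole word.
    for (; i < size; ++i) {
        if (data[i] != 0) return true;
    }
    return false;
}

// Returns true if any byte in [data, data + size) is zero.
inline bool contain_zero_sketch(const uint8_t* data, size_t size) {
    assert(data != nullptr || size == 0);
    constexpr uint64_t kLow = 0x0101010101010101ULL;
    constexpr uint64_t kHigh = 0x8080808080808080ULL;
    size_t i = 0;
    for (; i + 8 <= size; i += 8) {
        uint64_t word;
        std::memcpy(&word, data + i, sizeof(word));
        // Classic "has a zero byte" bit trick: the result is non-zero
        // exactly when at least one byte of the word is zero.
        if (((word - kLow) & ~word & kHigh) != 0) return true;
    }
    for (; i < size; ++i) {
        if (data[i] == 0) return true;
    }
    return false;
}
```

A caller would typically use `contain_one_sketch` to skip null handling entirely when a column has no nulls, and `contain_zero_sketch` to detect a column or filter that is entirely set.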
### Hash key and null bitmap packing
* Refactored the logic for packing null maps into hash keys in
`MethodKeysFixed`, introducing new templates and helper functions for
interleaved null map packing, and replacing the old bitmap size
calculation with a simplified approach. This improves both performance
and maintainability.
[[1]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3R478-R540)
[[2]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3L500-R611)
* Updated the logic for initializing and inserting keys, ensuring
correct handling of nulls and simplifying offset calculations for key
data.
[[1]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3R653)
[[2]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3L619-R692)
[[3]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3L645-R712)
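The idea of packing null maps into a fixed-width key can be sketched as follows. This is an assumption-laden illustration, not the `MethodKeysFixed` code: the function names, the one-byte bitmap placed at offset 0, and the two-column key layout are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Combines per-column null maps (one byte per row, 0 or 1) into one bitmap
// byte per row: bit c is set when key column c is null in that row.
// Assumes at most 8 key columns so the flags fit in a single byte.
inline std::vector<uint8_t> pack_null_bitmap_sketch(
        const std::vector<const uint8_t*>& null_maps, size_t rows) {
    assert(null_maps.size() <= 8);
    std::vector<uint8_t> bitmap(rows, 0);
    // Column-major order: each pass reads one null map sequentially.
    for (size_t c = 0; c < null_maps.size(); ++c) {
        const uint8_t* col = null_maps[c];
        for (size_t r = 0; r < rows; ++r) {
            bitmap[r] |= static_cast<uint8_t>((col[r] != 0) << c);
        }
    }
    return bitmap;
}

// Hypothetical 8-byte key layout: [bitmap:1][k0:2][k1:4][padding:1].
inline std::array<uint8_t, 8> pack_fixed_key_sketch(uint8_t null_bits,
                                                    uint16_t k0, uint32_t k1) {
    std::array<uint8_t, 8> key{};  // zero-filled, so padding is deterministic
    key[0] = null_bits;
    std::memcpy(key.data() + 1, &k0, sizeof(k0));
    std::memcpy(key.data() + 3, &k1, sizeof(k1));
    return key;
}
```

Because the bitmap is part of the key bytes, two rows with identical column values but different null flags produce different keys.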
### Column null data replacement
* Simplified the `replace_column_null_data` methods for vector and
decimal columns by removing unnecessary null count checks and optimizing
the replacement logic.
[[1]](diffhunk://#diff-3fa47f544ff08bb2c8232af99312c0bbf2c58cac9da7a2b06473282b99ad5aa4L528-R530)
[[2]](diffhunk://#diff-5fdf450def955da3201cc889aa870d94def054d1168f1ef3def32e8f009dc65aL526-L529)
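The diff content is not shown here, so the following is only a guess at the shape of such a replacement loop: a single unconditional pass that overwrites values at null positions with a default, without first counting nulls. The function name and the use of `std::vector` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: replaces the value at every null position with a
// value-initialized T. There is no up-front "does this column contain nulls"
// check; the loop body is a simple select that compilers can often turn into
// branch-free code.
template <typename T>
void replace_null_data_sketch(std::vector<T>& data, const uint8_t* null_map) {
    assert(null_map != nullptr || data.empty());
    for (size_t i = 0; i < data.size(); ++i) {
        data[i] = null_map[i] ? T() : data[i];
    }
}
```

Normalizing the payload of null rows this way means rows that are null in a column carry the same bytes in that column, which matters when the raw bytes are later copied into a packed key.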
### Hash key type logic
* Improved the logic for determining the hash key type in
`hash_key_type.h` to handle cases where the number of data types exceeds
the bit size, defaulting to serialized keys as needed.
[[1]](diffhunk://#diff-4f1fb8a89cd0e13a719c3427b1ae7581b42cb7325755a3ceac4c44bdc64bd144R83-R86)
[[2]](diffhunk://#diff-4f1fb8a89cd0e13a719c3427b1ae7581b42cb7325755a3ceac4c44bdc64bd144L97-R101)
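One way to read the fallback described above is sketched below. The enum, the function name, the 8-bit bitmap capacity, and the width thresholds are all assumptions for illustration; the actual rules in `hash_key_type.h` may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical key kinds; the real header defines its own set.
enum class KeyKindSketch { Fixed64, Fixed128, Fixed256, Serialized };

// Chooses a key kind from the byte widths of the key columns.
// Assumption: nullable keys reserve one bitmap byte holding one bit per
// column, so more than 8 columns cannot be represented and the key falls
// back to the serialized form.
inline KeyKindSketch choose_key_kind_sketch(const std::vector<size_t>& column_widths,
                                            bool has_nullable) {
    constexpr size_t kNullBitmapBits = 8;
    if (has_nullable && column_widths.size() > kNullBitmapBits) {
        return KeyKindSketch::Serialized;
    }
    size_t total = has_nullable ? 1 : 0;  // bitmap byte, if needed
    for (size_t w : column_widths) {
        total += w;
    }
    if (total <= 8) return KeyKindSketch::Fixed64;
    if (total <= 16) return KeyKindSketch::Fixed128;
    if (total <= 32) return KeyKindSketch::Fixed256;
    return KeyKindSketch::Serialized;
}
```

Under these assumptions, two 4-byte columns fit in 64 bits when neither is nullable, while adding the bitmap byte pushes the same pair into the next wider fixed key.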
### Code cleanup and dependency updates
* Removed unused functions and updated includes to ensure all SIMD
utilities are properly imported where needed.
[[1]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3R20-R26)
[[2]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3R36)
[[3]](diffhunk://#diff-b8623712a5a1728bb77cc67b6ee1bbf16ef2b842044f6f6bab64c3fc5c4575f3L292-L295)
These changes collectively improve performance, maintainability, and
correctness in hash join operations, especially in handling nullable
columns and SIMD optimizations.
1 parent 4a0e4af commit 82bd980
File tree
14 files changed, +325 -90 lines changed
- be
  - src
    - pipeline/exec
      - join
    - util/simd
    - vec
      - columns
      - common/hash_table
      - core
      - data_types/serde
      - functions
  - test
    - util/simd
    - vec/columns