[SPARK-49741][DOCS] Add `spark.shuffle.accurateBlockSkewedFactor` to config docs page

### What changes were proposed in this pull request?

`spark.shuffle.accurateBlockSkewedFactor` was added in Spark 3.3.0 in https://issues.apache.org/jira/browse/SPARK-36967. It is a useful shuffle configuration for preventing issues where `HighlyCompressedMapStatus` wrongly estimates the shuffle block sizes when the block size distribution is skewed, which can cause the shuffle reducer to fetch too much data and run out of memory (OOM). This PR adds this config to the Spark config docs page to make it discoverable.

### Why are the changes needed?

To make this useful config discoverable by users, so that they can resolve shuffle fetch OOM issues themselves.

### Does this PR introduce _any_ user-facing change?

Yes, this is a documentation fix. Before this PR, `spark.shuffle.accurateBlockSkewedFactor` was not listed in the `Shuffle Behavior` section on [the Configurations page](https://spark.apache.org/docs/latest/configuration.html); now it is.

### How was this patch tested?

On the IDE: <img width="1633" alt="image" src="https://github.com/user-attachments/assets/616a94b9-2408-491c-a17b-c6dbdff14465">

Updated: <img width="1274" alt="image" src="https://github.com/user-attachments/assets/ba170e9a-eba2-4fdf-85eb-a3aebefc055e">

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#48189 from timlee0119/add-accurate-block-skewed-factor-to-doc.

Authored-by: Tim Lee <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
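
The skew problem described in the commit message can be illustrated with a small, Spark-free sketch. This is a simplified model, not Spark's actual `HighlyCompressedMapStatus` code: the function `reported_sizes` and its parameters are hypothetical names, and the rule it applies (report exact sizes only for blocks at or above a cutoff, and the average for everything else) is an assumption based on the documented behaviour of the config.

```python
from statistics import median

MB = 1024 * 1024


def reported_sizes(block_sizes, accurate_threshold, skewed_factor=-1.0):
    """Simplified, hypothetical model of average-based block size reporting.

    Blocks at or above the cutoff keep their exact size. Every other
    non-empty block is reported as the average of those remaining blocks.
    A positive skewed_factor lowers the cutoff to factor * median size.
    """
    cutoff = accurate_threshold
    if skewed_factor > 0:
        cutoff = min(cutoff, skewed_factor * median(block_sizes))
    small = [s for s in block_sizes if 0 < s < cutoff]
    avg = sum(small) / len(small) if small else 0
    return [0 if s == 0 else (s if s >= cutoff else avg) for s in block_sizes]


# 99 blocks of 1 MB plus one skewed block of 90 MB,
# which stays below the assumed 100 MB accurate-size threshold.
sizes = [1 * MB] * 99 + [90 * MB]

without = reported_sizes(sizes, accurate_threshold=100 * MB)
with_factor = reported_sizes(sizes, accurate_threshold=100 * MB, skewed_factor=5.0)

print(without[-1] / MB)      # about 1.89: the 90 MB block is hidden in the average
print(with_factor[-1] / MB)  # 90.0: the skewed block is reported exactly
```

In this model, a reducer planning its fetches from the first result would treat the 90 MB block as if it were under 2 MB, which is the kind of underestimate that can lead to fetching too much data at once.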