Use multi-threaded scan nodes with mt_dop > 0 #1
Open
zhangyifan27 wants to merge 1 commit into branch-4.1.0 from
Conversation
This patch adds a new query option DISABLE_SCAN_NODE_MT, defaulting to false. If it is set to true, the single-threaded scan node (i.e. HdfsScanNodeMt or KuduScanNodeMt) will not be used even if mt_dop > 0, so we can create more scanner threads to achieve faster scanning with mt_dop > 0. Also, the type of scan node is added to the EXPLAIN output.

Testing:
- Added e2e tests to verify the new query option.

Change-Id: I40841eaaeee1756808aa5978cb58517ffd47c040
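As a rough sketch of how the option described in this patch might be used (the query-option name comes from the patch; the table name is hypothetical):

```sql
-- Enable multi-threaded execution with 4 threads per node.
SET MT_DOP=4;
-- Opt out of the single-threaded scan node so more scanner threads are created.
SET DISABLE_SCAN_NODE_MT=true;
-- With this patch, EXPLAIN output also shows the scan node type,
-- e.g. HdfsScanNode rather than HdfsScanNodeMt.
EXPLAIN SELECT count(*) FROM my_table;
```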
Force-pushed from 8b94a13 to abfa53b
zhangyifan27 pushed a commit that referenced this pull request on Dec 4, 2024
As agreed in JIRA discussions, the current PR extends existing TRIM
functionality with the support of SQL-standardized TRIM-FROM syntax:
TRIM({[LEADING / TRAILING / BOTH] | [STRING characters]} FROM expr).
Implemented based on the existing LTRIM / RTRIM / BTRIM family of
functions prepared earlier in IMPALA-6059 and extended for UTF-8 in
IMPALA-12718. Besides, partly based on abandoned PR
https://gerrit.cloudera.org/#/c/4474 and similar EXTRACT-FROM
functionality from apache@543fa73f3a846
f0e4527514c993cb0985912b06c.
Supported syntaxes:
Syntax #1 TRIM(<where> FROM <string>);
Syntax #2 TRIM(<charset> FROM <string>);
Syntax #3 TRIM(<where> <charset> FROM <string>);
"where": Case-insensitive trim direction. Valid options are "leading",
"trailing", and "both". "leading" means trimming characters from the
start; "trailing" means trimming characters from the end; "both" means
trimming characters from both sides. For Syntax #2, since no "where"
is specified, the option "both" is implied by default.
"charset": Case-sensitive characters to be removed. This argument is
regarded as a character set going to be removed. The occurrence order
of each character doesn't matter and duplicated instances of the same
character will be ignored. NULL argument implies " " (standard space)
by default. Empty argument ("" or '') makes TRIM return the string
untouched. For Syntax #1, since no "charset" is specified, it trims
" " (standard space) by default.
"string": Case-sensitive target string to trim. This argument can be
NULL.
The UTF8_MODE query option is honored by TRIM-FROM, similarly to
existing TRIM().
UTF8_TRIM-FROM can be used to force UTF8 mode regardless of the query
option.
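To illustrate the semantics described above (the string literals are made up for the example; the expected results follow from the rules as documented):

```sql
-- Syntax #1: direction only; trims " " (standard space).
SELECT TRIM(LEADING FROM '  hi  ');          -- returns 'hi  '
-- Syntax #2: charset only; direction defaults to BOTH.
SELECT TRIM('xy' FROM 'yxhixy');             -- returns 'hi'
-- Syntax #3: direction and charset; order and duplicates in the
-- charset don't matter, so 'yyx' is equivalent to 'xy'.
SELECT TRIM(TRAILING 'yyx' FROM 'xyhixyy');  -- returns 'xyhi'
-- An empty charset returns the string untouched.
SELECT TRIM(BOTH '' FROM ' hi ');            -- returns ' hi '
```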
Design Notes:
1. No-BE. Since the existing LTRIM / RTRIM / BTRIM functions fully cover
all needed use-cases, no backend logic is required. This differs from
similar EXTRACT-FROM.
2. Syntax wrapper. TrimFromExpr class was introduced as a syntax
wrapper around FunctionCallExpr, which instantiates one of the regular
LTRIM / RTRIM / BTRIM functions. TrimFromExpr's role is to maintain
the integrity of the "phantom" TRIM-FROM built-in function.
3. No TRIM keyword. Following EXTRACT-FROM, no "TRIM" keyword was
added to the language. Although a keyword would generally allow easier
and better parsing, on the negative side it restricts the token's
usage in general contexts. However, LEADING / TRAILING / BOTH, which
were previously reserved words, are now added as keywords so they can
be used without escaping.
Change-Id: I3c4fa6d0d8d0684c4b6d8dac8fd531d205e4f7b4
Reviewed-on: http://gerrit.cloudera.org:8080/21825
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
zhangyifan27 pushed a commit that referenced this pull request on Jun 5, 2025
…quences

This adds support for a summarized textual representation of timestamps
for the event sequences present in the aggregated profile. With the
verbose format present in profile V1 and V2, it becomes difficult to
analyze an event's timestamps across instances.

The event sequences are now displayed in a histogram format, based on
the number of timestamps present (i.e. based on
json_profile_event_timestamp_limit), in order to support an easier view
for skew analysis and other possible use cases. The summary generated
from aggregated instance-level timestamps (i.e. IMPALA-13304) is used to
achieve this within the profile V2, which covers the possibility of
missing events.

Example, with Verbosity::DEFAULT and
json_profile_event_timestamp_limit = 5 (default):

Case #1, number of instances exceeded the limit:
Node Lifecycle Event Timeline Summary :
   - Open Started (4s880ms): Min: 2s312ms, Avg: 3s427ms, Max: 4s880ms, Count: 12
     HistogramCount: 4, 4, 0, 0, 4

Case #2, number of instances within the limit:
Node Lifecycle Event Timeline:
   - Open Started: 5s885ms, 1s708ms, 3s434ms
   - Open Finished: 5s885ms, 1s708ms, 3s435ms
   - First Batch Requested: 5s885ms, 1s708ms, 3s435ms
   - First Batch Returned: 6s319ms, 2s123ms, 3s570ms
   - Last Batch Returned: 7s878ms, 2s123ms, 3s570ms

With Verbosity::EXTENDED or more, all events and timestamps are printed
with full verbosity as before.

Tests:
For test_profile_tool.py, updated the generated outputs for text and
JSON profiles.

Change-Id: I4bcc0e2e7fccfa8a184cfa8a3a96d68bfe6035c0
Reviewed-on: http://gerrit.cloudera.org:8080/22245
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
zhangyifan27 pushed a commit that referenced this pull request on Sep 28, 2025
With this patch we create Iceberg file descriptors from
LocatedFileStatus objects during IcebergFileMetadataLoader's
parallelListing(). This has the following benefits:
* We parallelize the creation of Iceberg file descriptor objects.
* We don't need to maintain a large hash map with all the
  LocatedFileStatus objects at once. Now we only need to keep a few
  LocatedFileStatus objects per partition in memory while we are
  converting them to Iceberg file descriptors, i.e. the GC is free to
  destroy the LocatedFileStatus objects we no longer use.

This patch retires the startup flag
'iceberg_reload_new_files_threshold'. Since IMPALA-13254 we only list
partitions that have new data files, and we load them in parallel, i.e.
efficient incremental table loading is already covered. From that point
the startup flag only added unnecessary code complexity.

Measurements:
I created two tables (from tpcds.store_sales) to measure table loading
times for large tables:

Table #1: PARTITIONED BY SPEC(ss_item_sk, BUCKET(5, ss_sold_time_sk))
  partitions: 107818
  files: 754726
Table #2: PARTITIONED BY SPEC(ss_item_sk)
  partitions: 18000
  files: 504224

Time taken in IcebergFileMetadataLoader.load() during full table reload:
+----------+-------+------+---------+
|          | Base  | New  | Speedup |
+----------+-------+------+---------+
| Table #1 | 17.3s | 8.1s | 2.14    |
| Table #2 | 7.8s  | 4.3s | 1.8     |
+----------+-------+------+---------+

I measured incremental table loading only for Table #2 (since there are
more files per partition this is the worse scenario for the new code,
as it only uses file listings, and each new file was created in a
separate partition).

Time taken in IcebergFileMetadataLoader.load() during incremental table
reload:
+------------+------+------+---------+
| #new files | Base | New  | Speedup |
+------------+------+------+---------+
| 1          | 1.4s | 1.6s | 0.9     |
| 100        | 1.5s | 1.9s | 0.8     |
| 200        | 1.5s | 1.5s | 1       |
+------------+------+------+---------+

We lose a few tenths of a second, but I think the simplified code
justifies it.

Testing:
* Some tests were updated because we don't have the startup flag
  'iceberg_reload_new_files_threshold' anymore.

Change-Id: Ia1c2a7119d76db7ce7c43caec2ccb122a014851b
Reviewed-on: http://gerrit.cloudera.org:8080/23363
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>