Use multi-threaded scan nodes with mt_dop > 0 #1
Open
zhangyifan27 wants to merge 1 commit into branch-4.1.0 from
Conversation
This patch adds a new query option DISABLE_SCAN_NODE_MT, defaulting to false. If it is set to true, the single-threaded scan node (i.e. HdfsScanNodeMt or KuduScanNodeMt) will not be used even if mt_dop > 0, so we can create more scanner threads to achieve faster scanning with mt_dop > 0. Also, the type of scan node is added to the EXPLAIN output.

Testing:
- Added e2e tests to verify the new query option.

Change-Id: I40841eaaeee1756808aa5978cb58517ffd47c040
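As a rough sketch of how the option described in this patch might be used (the query-option name comes from the patch; the table name is hypothetical):

```sql
-- Enable multi-threaded execution with 4 threads per node.
SET MT_DOP=4;
-- Opt out of the single-threaded scan node so more scanner threads are created.
SET DISABLE_SCAN_NODE_MT=true;
-- With this patch, EXPLAIN output also shows the scan node type,
-- e.g. HdfsScanNode rather than HdfsScanNodeMt.
EXPLAIN SELECT count(*) FROM my_table;
```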
Force-pushed from 8b94a13 to abfa53b
zhangyifan27 pushed a commit that referenced this pull request on Dec 4, 2024
As agreed in JIRA discussions, the current PR extends existing TRIM
functionality with the support of SQL-standardized TRIM-FROM syntax:
TRIM({[LEADING / TRAILING / BOTH] | [STRING characters]} FROM expr).
Implemented based on the existing LTRIM / RTRIM / BTRIM family of
functions prepared earlier in IMPALA-6059 and extended for UTF-8 in
IMPALA-12718. Besides, partly based on abandoned PR
https://gerrit.cloudera.org/#/c/4474 and similar EXTRACT-FROM
functionality from apache@543fa73f3a846
f0e4527514c993cb0985912b06c.
Supported syntaxes:
Syntax #1 TRIM(<where> FROM <string>);
Syntax #2 TRIM(<charset> FROM <string>);
Syntax #3 TRIM(<where> <charset> FROM <string>);
"where": Case-insensitive trim direction. Valid options are "leading",
"trailing", and "both". "leading" means trimming characters from the
start; "trailing" means trimming characters from the end; "both" means
trimming characters from both sides. For Syntax #2, since no "where"
is specified, the option "both" is implied by default.
"charset": Case-sensitive characters to be removed. This argument is
regarded as a character set going to be removed. The occurrence order
of each character doesn't matter and duplicated instances of the same
character will be ignored. NULL argument implies " " (standard space)
by default. Empty argument ("" or '') makes TRIM return the string
untouched. For Syntax #1, since no "charset" is specified, it trims
" " (standard space) by default.
"string": Case-sensitive target string to trim. This argument can be
NULL.
The UTF8_MODE query option is honored by TRIM-FROM, similarly to
existing TRIM().
UTF8_TRIM-FROM can be used to force UTF8 mode regardless of the query
option.
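To illustrate the semantics described above (the string literals are made up for the example; the expected results follow from the rules as documented):

```sql
-- Syntax #1: direction only; trims " " (standard space).
SELECT TRIM(LEADING FROM '  hi  ');          -- returns 'hi  '
-- Syntax #2: charset only; direction defaults to BOTH.
SELECT TRIM('xy' FROM 'yxhixy');             -- returns 'hi'
-- Syntax #3: direction and charset; order and duplicates in the
-- charset don't matter, so 'yyx' is equivalent to 'xy'.
SELECT TRIM(TRAILING 'yyx' FROM 'xyhixyy');  -- returns 'xyhi'
-- An empty charset returns the string untouched.
SELECT TRIM(BOTH '' FROM ' hi ');            -- returns ' hi '
```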
Design Notes:
1. No-BE. Since the existing LTRIM / RTRIM / BTRIM functions fully cover
all needed use-cases, no backend logic is required. This differs from
similar EXTRACT-FROM.
2. Syntax wrapper. TrimFromExpr class was introduced as a syntax
wrapper around FunctionCallExpr, which instantiates one of the regular
LTRIM / RTRIM / BTRIM functions. TrimFromExpr's role is to maintain
the integrity of the "phantom" TRIM-FROM built-in function.
3. No TRIM keyword. Following EXTRACT-FROM, no "TRIM" keyword was
added to the language. Although a keyword would generally allow easier
and better parsing, on the negative side it restricts the token's
usage in general contexts. However, LEADING / TRAILING / BOTH, which
were previously reserved words, are now added as keywords so they can
be used without escaping.
Change-Id: I3c4fa6d0d8d0684c4b6d8dac8fd531d205e4f7b4
Reviewed-on: http://gerrit.cloudera.org:8080/21825
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
zhangyifan27 pushed a commit that referenced this pull request on Jun 5, 2025
…quences

This adds support for a summarized textual representation of timestamps
for the event sequences present in the aggregated profile. With the
verbose format present in profile V1 and V2, it becomes difficult to
analyze an event's timestamps across instances.

The event sequences are now displayed in a histogram format, based on
the number of timestamps present (i.e. based on
json_profile_event_timestamp_limit), in order to support an easier view
for skew analysis and other possible use cases. The summary generated
from aggregated instance-level timestamps (i.e. IMPALA-13304) is used to
achieve this within the profile V2, which covers the possibility of
missing events.

Example, with Verbosity::DEFAULT and
json_profile_event_timestamp_limit = 5 (default):

Case #1, number of instances exceeded the limit:
Node Lifecycle Event Timeline Summary :
   - Open Started (4s880ms): Min: 2s312ms, Avg: 3s427ms, Max: 4s880ms, Count: 12
     HistogramCount: 4, 4, 0, 0, 4

Case #2, number of instances within the limit:
Node Lifecycle Event Timeline:
   - Open Started: 5s885ms, 1s708ms, 3s434ms
   - Open Finished: 5s885ms, 1s708ms, 3s435ms
   - First Batch Requested: 5s885ms, 1s708ms, 3s435ms
   - First Batch Returned: 6s319ms, 2s123ms, 3s570ms
   - Last Batch Returned: 7s878ms, 2s123ms, 3s570ms

With Verbosity::EXTENDED or more, all events and timestamps are printed
with full verbosity as before.

Tests:
For test_profile_tool.py, updated the generated outputs for text and
JSON profiles.

Change-Id: I4bcc0e2e7fccfa8a184cfa8a3a96d68bfe6035c0
Reviewed-on: http://gerrit.cloudera.org:8080/22245
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
zhangyifan27 pushed a commit that referenced this pull request on Sep 28, 2025
With this patch we create Iceberg file descriptors from
LocatedFileStatus objects during IcebergFileMetadataLoader's
parallelListing(). This has the following benefits:
* We parallelize the creation of Iceberg file descriptor objects.
* We don't need to maintain a large hash map with all the
  LocatedFileStatus objects at once. Now we only need to keep a few
  LocatedFileStatus objects per partition in memory while we are
  converting them to Iceberg file descriptors, i.e. the GC is free to
  destroy the LocatedFileStatus objects we no longer use.

This patch retires the startup flag
'iceberg_reload_new_files_threshold'. Since IMPALA-13254 we only list
partitions that have new data files, and we load them in parallel, i.e.
efficient incremental table loading is already covered. From that point
the startup flag only added unnecessary code complexity.

Measurements:
I created two tables (from tpcds.store_sales) to measure table loading
times for large tables:

Table #1: PARTITIONED BY SPEC(ss_item_sk, BUCKET(5, ss_sold_time_sk))
  partitions: 107818
  files: 754726
Table #2: PARTITIONED BY SPEC(ss_item_sk)
  partitions: 18000
  files: 504224

Time taken in IcebergFileMetadataLoader.load() during full table reload:
+----------+-------+------+---------+
|          | Base  | New  | Speedup |
+----------+-------+------+---------+
| Table #1 | 17.3s | 8.1s | 2.14    |
| Table #2 | 7.8s  | 4.3s | 1.8     |
+----------+-------+------+---------+

I measured incremental table loading only for Table #2 (since there are
more files per partition this is the worse scenario for the new code,
as it only uses file listings, and each new file was created in a
separate partition).

Time taken in IcebergFileMetadataLoader.load() during incremental table
reload:
+------------+------+------+---------+
| #new files | Base | New  | Speedup |
+------------+------+------+---------+
| 1          | 1.4s | 1.6s | 0.9     |
| 100        | 1.5s | 1.9s | 0.8     |
| 200        | 1.5s | 1.5s | 1       |
+------------+------+------+---------+

We lose a few tenths of a second, but I think the simplified code
justifies it.

Testing:
* Some tests were updated because we don't have the startup flag
  'iceberg_reload_new_files_threshold' anymore.

Change-Id: Ia1c2a7119d76db7ce7c43caec2ccb122a014851b
Reviewed-on: http://gerrit.cloudera.org:8080/23363
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>