[improve](routine load) improve routine load observability #46238
run buildall
TPC-H: Total hot run time: 32703 ms
TPC-DS: Total hot run time: 197047 ms
ClickBench: Total hot run time: 31.79 s
e14a130 to 39089e6
run buildall
TPC-H: Total hot run time: 32931 ms
TPC-DS: Total hot run time: 196550 ms
ClickBench: Total hot run time: 30.83 s
39089e6 to 56b84ba
run buildall
TPC-H: Total hot run time: 32637 ms
TPC-DS: Total hot run time: 197974 ms
ClickBench: Total hot run time: 31.75 s
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
1. **reset other msg in a stream window** A routine load job is scheduled continuously, so errors from earlier runs do not need to stay on display as the job keeps running.
2. **show error info when transaction of sub task failed** A failed subtask retries continuously, and some errors, such as a repeated "too many segments" error (code: -235), can prevent the job from scheduling and consuming data properly. Such errors need to be displayed promptly so that the user is aware of them.
3. **set pause reason to other msg when reschedule job** The job manager automatically resumes jobs that are paused unexpectedly. In some scenarios, however, such as a job that cannot connect to Kafka and is auto resumed after each pause to retry, users may not see the problem for a long time. An unexpected pause always indicates a problem, so the reason for the error needs to be displayed even if auto resume occurs.
…#46238 (#46568) Cherry-picked from #46238 Co-authored-by: hui lai <[email protected]>
…#46238 (#46567) Cherry-picked from #46238 Co-authored-by: hui lai <[email protected]>
What problem does this PR solve?
related #48511
Reset other msg in a stream window
A routine load job is scheduled continuously, so errors from earlier runs do not need to stay on display as the job keeps running.
Show error info when the transaction of a subtask fails
A failed subtask retries continuously, and some errors, such as a repeated "too many segments" error (code: -235), can prevent the job from scheduling and consuming data properly. Such errors need to be displayed promptly so that the user is aware of them.
Set the pause reason to other msg when the job is rescheduled
The job manager automatically resumes jobs that are paused unexpectedly. In some scenarios, however, such as a job that cannot connect to Kafka and is auto resumed after each pause to retry, users may not see the problem for a long time. An unexpected pause always indicates a problem, so the reason for the error needs to be displayed even if auto resume occurs.
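The three changes described above can be illustrated with a minimal Python sketch. This is not the Doris implementation (which lives in the Java frontend); the class `RoutineLoadJobSketch`, its method names, and the time-based window are all assumptions made for illustration. Only the ideas come from this PR: clear stale messages per window, surface subtask transaction failures, and carry the pause reason over into other msg on auto resume.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RoutineLoadJobSketch:
    """Hypothetical model of how a job could maintain its user-visible other_msg."""
    window_seconds: float = 300.0                # assumed window length, not from the PR
    clock: Callable[[], float] = time.monotonic  # injectable so the sketch is testable
    other_msg: str = ""
    pause_reason: str = ""
    state: str = "RUNNING"
    _window_start: float = field(default=0.0, init=False)

    def __post_init__(self) -> None:
        self._window_start = self.clock()

    def _roll_window(self) -> None:
        # Change 1: once the current window has elapsed, drop the stale message
        # and start a new window.
        now = self.clock()
        if now - self._window_start >= self.window_seconds:
            self.other_msg = ""
            self._window_start = now

    def on_task_txn_committed(self) -> None:
        self._roll_window()

    def on_task_txn_failed(self, reason: str) -> None:
        # Change 2: a failed subtask transaction is shown immediately,
        # even though the subtask will be retried.
        self._roll_window()
        self.other_msg = f"task transaction failed: {reason}"

    def pause(self, reason: str) -> None:
        self.state = "PAUSED"
        self.pause_reason = reason

    def auto_resume(self) -> None:
        # Change 3: keep the pause reason visible after the job is rescheduled.
        if self.state == "PAUSED":
            self.other_msg = f"auto resumed after pause: {self.pause_reason}"
            self.pause_reason = ""
            self.state = "NEED_SCHEDULE"
```

In this sketch the message set on auto resume is itself cleared when the next window rolls over, so a problem that recurs stays visible while one that has gone away eventually disappears.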
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)