-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement recorder and replayer #403
base: master
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: saza-ku The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @saza-ku. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
88137d3
to
5c4133d
Compare
/cc @utam0k @ordovicia |
@sanposhiho: GitHub didn't allow me to request PR reviews from the following users: ordovicia. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just took a look at the doc.
I'll take a look for the implementation after @utam0k / @ordovicia.
|
||
1. Set `true` to `recorderEnabled`. | ||
2. Set the path of the kubeconfig file for your cluster to `KubeConfig`. | ||
- This feature only requires the read permission for events. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it? Looks like recorder.go have an event handers though. I guess the read permissions for all resources to be recorded are required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the read permissions for all resources to be recorded are required.
That's right.
"This feature only requires the read permission for resources." was correct.
> [!NOTE] | ||
> When a file already exists at `recordedFilePath`, it backs up the file in the same directory adding a timestamp to the filename and creates a new file for recording. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need it? I prefer do it simply, either just override the file without the error or make the recorder put out the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix it to put out the error.
@@ -0,0 +1,95 @@ | |||
# [Beta] Record your real cluster's events and replay them in the simulator | |||
|
|||
You can record events from your real cluster and replay them in the simulator. This feature is useful for reproducing issues that occur in your real cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rephrase "events" in this doc. It's unclear what it means for first-readers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrased it to "changes" or "changes in resources". How about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably go more straightforward like "You can record resource addition/update/deletion at your real cluster"
@@ -0,0 +1,95 @@ | |||
# [Beta] Record your real cluster's events and replay them in the simulator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to put the link for this page at somewhere (README?) so that people can notice
|
||
You can record events from your real cluster and replay them in the simulator. This feature is useful for reproducing issues that occur in your real cluster. | ||
|
||
## Record events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, basically it has two steps, recording resources into JSON file and replaying it on the simulator.
And, people have to record resources by starting the simulator with recorderEnabled
, then shutdown the simulator once, and then restart the simulator with the JSON file and replayerEnabled
.
The first step looks weird for me because it has to run the simulator, but essentially it just imports the resources from a real cluster and outputs the JSON file; the simulator doesn't come into play.
So... well, I guess, should we create a CLI for it?
It can be recorder
(simple single cli) or sched-simulator record
(sched-simulator
CLI with record
subcommand).
Then the flow would be:
- Users record the resource creation/update/deletion by
sched-simulator record --duration 5m
or something. - After 5min, CLI generates JSON file.
- Users start the simulator with JSON file and
replayerEnabled
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely. I'll create a new cli in simulator/cmd/recorder
.
Sure! I'll review it in this week. |
c656c73
to
dcde17d
Compare
@saza-ku: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
records := []Record{} | ||
for { | ||
select { | ||
case r := <-s.recordCh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, on every event (resource creation/update/delete) this method serializes all events that have happened so far and writes them to a file, which is a high-cost operation.
In addition, recordCh
is unbuffered, so RecordEvent
will be blocked until record
completes handling an inflight event.
I guess we need some optimization like the following, otherwise the recorder would have a limited performance.
- Asynchronous: Change
recordCh
to a buffered channel so thatRecordEvent
will not be blocked. - Buffering: Save some records in a buffer and write them together as a chunk to reduce file system operations
- Incremental: Write each record chunk into separate files to avoid writing all records many times.
@utam0k I have quite a few points to revise, so it's okay to review this PR next week :) |
What type of PR is this?
/area simulator
/kind feature
What this PR does / why we need it:
This PR implements recorder and replayer.
recorder continuously saves events in an external cluster into a JSON file. replayer reproduces these events in the simulator cluster when the simulator starts by reading the JSON file.
Which issue(s) this PR fixes:
Fixes #395
Special notes for your reviewer:
/label tide/merge-method-squash