-
Notifications
You must be signed in to change notification settings - Fork 159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: pkg/upgrade: abstract resource deletion with Action #957
base: incubation
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d0bf97e
to
0f73430
Compare
c4557c1
to
4cf9c43
Compare
a1e8ca5
to
84777e0
Compare
pkg/upgrade/upgrade.go
Outdated
// path to the field, like "metadata", "name" | ||
Path []string | ||
// set of values for the field to match object, any one matches | ||
Values []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder why this is set to array but not just string.
same as below range job.Values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current code uses it that way:
deprecatedClusterroles := []string{"rhods-namespace-read", "rhods-prometheus-operator"}
But it's possible to have two entries with just a string. Whatever is better.
pkg/upgrade/upgrade.go
Outdated
Gvk schema.GroupVersionKind | ||
Namespace string | ||
// path to the field, like "metadata", "name" | ||
Path []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i understand in unstructured.NestedString
, we can pass down different field not just metadata.name , is the plan that JobSpec
will be extend for other puropose more than only delete resource by name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was actually the main point of the proposal. https://issues.redhat.com/browse/RHOAIENG-4764 wants removing Anything with appName: watson-studio
( I used it in another draft #959 on top of old version of this, patch 5bb54af)
I was thinking a bit more about #959. If we agree about general idea, it probably makes sense to change it such a way that it just adds this new functionality for dashboard resources without touching any existing code. And then we can continue discussion (if it's not just rejected straight away) here. Just before the release it sounds less risky. |
i agree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see that we are unifying upgrades. I shared some suggestions for the solution adjusting the current approach.
Generally speaking, I think the "upgrade/migration" approach should be discussed in a bit more detail. There are a few things which come to mind (though some might be speculative):
- should we introduce some convention on how/where to define these steps? so each component owner can follow it easily?
- should we add some metadata to it? e.g. version from which the migration should be applied.
- should the list of migration steps be calculated automatically or we implement each step in a fail-safe manner?
- how long should we keep the code around?
pkg/upgrade/upgrade.go
Outdated
didWork, err := deleteResourcesWithDone(ctx, cli, &del) | ||
if err != nil || !didWork { | ||
return !didWork, err | ||
} | ||
|
||
return true, multiErr.ErrorOrNil() | ||
didWork, err = deleteResourcesWithDone(ctx, cli, &del) | ||
return !didWork, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why doing it twice instead of delegating the decision of retry to the caller?
pkg/upgrade/upgrade.go
Outdated
// deleteResourcesWithDone takes array of Jobs and deletes all matching resources. Returns true if any was deleted or error. | ||
func deleteResourcesWithDone(ctx context.Context, c client.Client, jobs *[]JobSpec) (bool, error) { | ||
var errors *multierror.Error | ||
didWork := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a native english speaker, so I find didWork
a bit hard to understand. Is !didWork
a singal that it in fact did not work or it was not done? If the latter, which I guess is the case, how about renaming didWork
to done
instead? But then it contradicts the statement Returns true if any was deleted or error.
I wonder - what do we want to signal with this flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just explain my logic behind that, it's definitely place to improve and I agree with your concerns.
So, the original delete*AndCheck checked existence of the resource right after deletion without starting next waiting cycle in ExponentialBackoffWithContext and at the point I did not want to introduce extra delay. And I did not want to add extra check for all the cases to the generic function. So generic function does not know if the deletion actually done or not, but it can signal that it performed some actions (vs there are no objects found for action). That is why the second call if the first one really finished all the work will signal that it did nothing, means for the ExponentialBackoffWithContext that the job is done.
I think in this place it is not really critical but it changes in theory original logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what this extra check gives us. We can essentially say we are done when there's nothing to delete, if there's something we perform delete and retry, as some resources can have finalizers. Not really convinced if that extra immediate .Get
gives us any benefit.
Maybe an opportunity to simplify here?
pkg/upgrade/upgrade.go
Outdated
w, err := deleteOneJob(ctx, c, job) | ||
errors = multierror.Append(errors, err) | ||
if w { | ||
didWork = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the significance of the information that at least one JobSpec
was finished successfully? Kind of repeating my previous comment :)
pkg/upgrade/upgrade.go
Outdated
Gvk: gvkOdhApplication, | ||
Namespace: dscApplicationsNamespace, | ||
Path: pathName, | ||
Values: []string{"jupiterhub"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we have any tests for those... so just in case...
Values: []string{"jupiterhub"}, | |
Values: []string{"jupyterhub"}, |
pkg/upgrade/upgrade.go
Outdated
@@ -257,175 +283,213 @@ func UpdateFromLegacyVersion(cli client.Client, platform deploy.Platform, appNS | |||
// TODO: remove function once we have a generic solution across all components. | |||
func CleanupExistingResource(ctx context.Context, cli client.Client, platform deploy.Platform, dscApplicationsNamespace, dscMonitoringNamespace string) error { | |||
var multiErr *multierror.Error | |||
var toDelete []JobSpec | |||
pathName := []string{"metadata", "name"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps better to name it as metadataNamePath
?
pkg/upgrade/upgrade.go
Outdated
// set of values for the field to match object, any one matches | ||
Values []string | ||
// hook to call before action (delete) | ||
Fixup func(ctx context.Context, c client.Client, obj *unstructured.Unstructured) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could name it PreHook
, Before
or something similar hinting the invocation phase. Fixup reminds me of rebase and git (and maybe linking, but that was long time ago). Not sure if just by reading Fixup
everyone will understand the purpose.
pkg/upgrade/upgrade.go
Outdated
@@ -43,6 +40,35 @@ import ( | |||
"github.com/opendatahub-io/opendatahub-operator/v2/pkg/metadata/labels" | |||
) | |||
|
|||
type JobSpec struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanding on the comment from related PR (#959 (comment)), there are a few things I would consider:
JobSpec
to handle multiple resources
JobSpec
could have a slice of ResourceSpec
and Action
func to perform on them. Such a function could be a Delete
, DeleteWithRetry
(wrapping the original Delete
for example). The signature could be as simple as
func Action(ctx context.Context, c client.Client, resourceSpecs ...ResourceSpec) error
Then we can have a receiver method which will be invoking it.
Or maybe if we ponder a bit more about naming... Job
implies things like schedule, retry etc, this is just Action
we want to perform on a given (set) of resources. So maybe Action
as a struct name and ActionFunc
for the behaviour we want to attach to it?
Simplified creation
We can think of creating constructor functions to wrap this new struct, e.g.
NewDeleteAction(ctx context.Context, c client.Client, resourceSpecs ...ResourceSpec)
NewDeleteWithRetryAction(ctx context.Context, c client.Client, resourceSpecs ...ResourceSpec)
NewRemoveOwnerReferenceAction(ctx context.Context, c client.Client, resourceSpecs ...ResourceSpec)
I am on the verge if we should keep the client as part of the struct or just expect it to be passed when invoking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds nice! I'll cook something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bartoszmajsak I cooked a draft, will the approach work? I did not make the wrappers yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the wrappers :)
PS: totally untested, just RFC yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the wrappers :) PS: totally untested, just RFC yet.
And I have a feeling, that it's better to have something like "ResourceList" which takes care of the filtering and produce the actual list of resources to apply the action to based on the resource specs.
Totally agree, I had similar thoughts but they have not yet crystallised in my mind. But looks like we are on the way. |
84777e0
to
61280c6
Compare
Add abstraction to perfrom action on a set of resources defined as type ResourceSpec struct { Gvk schema.GroupVersionKind Namespace string // path to the field, like "metadata", "name" Path []string // set of values for the field to match object, any one matches Values []string } Provide high level wrappers and lowlevel builder interface. It uses Action directly since there is no any validation or generation logic at the moment. Generated Action does not contain resource list and can be applied to a set of resources with Exec() or ExecWithRetry() methods. Provide also predicates to cover matching for current usecases. User can define its own actions, matchers and retry checkers with signatures: type MatcherFunc func(r ResourceSpec, obj *unstructured.Unstructured) (bool, error) type ActionFunc func(ctx context.Context, c client.Client, r ResourceSpec, obj *unstructured.Unstructured) error type RetryCheckFunc func(ctx context.Context, c client.Client, resources ...ResourceSpec) (bool, error) The latter one is taken by ExecWithRetry to check is the job is done. Examples: err = action.NewDeleteMatched(c, action.Not( action.MatchMapKeyContains(ODHAppPrefix, "spec", "selector", "matchLabels"))). Exec(ctx, d...) err = action.NewDelete(c).Exec(ctx, d...) With the builder: err = action.New(c). Do(action.Delete). ExecWithRetry(ctx, action.IfAnyLeft(action.DefaultMatcher), d...) Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
Signed-off-by: Yauheni Kaliuta <[email protected]>
61280c6
to
fe4c4ac
Compare
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@ykaliuta I think this is something that is covered by the recent refactor? |
probably can have it closed, as target is for the "incubation" and still in Draft |
The intention is to provide declarative framework to perform operations on groups of k8s objects. It was triggered by upgrade case with massive deletion with massive code duplication. The original usecases probably covered by garbage collection. Some other applications removed from the code already. I'll talk to @lburgazzoli if something like that still can be useful. |
Add abstraction to perfrom action on a set of resources defined as
Provide high level wrappers and lowlevel builder interface.
It uses Action directly since there is no any validation or
generation logic at the moment.
Generated Action does not contain resource list and can be applied
to a set of resources with Exec() or ExecWithRetry() methods.
Provide also predicates to cover matching for current usecases.
User can define its own actions, matchers and retry checkers with
signatures:
The latter one is taken by ExecWithRetry to check is the job is done.
Examples:
With the builder:
Description
How Has This Been Tested?
Merge criteria: