
fix: fixed logic to run certain operations even when the spec is unchanged #56

Merged: 7 commits, Jun 17, 2024

Conversation

afugazzotto
Contributor

@afugazzotto afugazzotto commented Jun 12, 2024

Modifications

Operations such as pipeline pausing, and status updates based on the pipeline and ISBService status, need to be performed even when the hash comparison shows the specs are unchanged.

Verification

Unit tests, and manual verification via ArgoCD by simulating a pipeline pause.

NOTE: we cannot test by checking that the ChildResourcesHealthy condition is True because it is based on the pipeline status which will never be healthy in the unit tests since there is no numaflow-controller running. This could be checked in e2e tests.

…atus updates even when the spec is unchanged.

Signed-off-by: Antonino Fugazzotto <[email protected]>
@afugazzotto afugazzotto changed the title fix: fixed logic to run certain operations like pipeline pause and st… fix: fixed logic to run certain operations even when the spec is unchanged Jun 12, 2024
Signed-off-by: Antonino Fugazzotto <[email protected]>
internal/controller/shared.go (outdated, resolved)
internal/controller/pipelinerollout_controller.go (outdated, resolved)
internal/controller/pipelinerollout_controller.go (outdated, resolved)
Signed-off-by: Antonino Fugazzotto <[email protected]>
@@ -168,7 +168,7 @@ func (r *PipelineRolloutReconciler) reconcile(
}

kubernetes.SetAnnotation(pipelineRollout, apiv1.KeyHash, childResouceSpecHash)
} else {
} else if rolloutChildOp != RolloutChildNoop {
Contributor

nit: I'd prefer this to be if rolloutChildOp == RolloutChildUpdate to be more clear.

Collaborator

yeah, I was thinking the same. The logic below applies to when we're trying to update.

Contributor Author

I actually just realized that we need to remove this. I had it in place in the previous PR, but it needs to be removed because part of the logic inside the functions called in this if-statement branch needs to execute whether or not the pipeline needs to be updated. The code in this branch needs to be refactored, but I wanted to do that in a separate PR, as discussed with @xdevxy.

Collaborator

I actually just realized that we need to remove this. I had it in place in the previous PR, but it needs to be removed because part of the logic inside the functions called in this if-statement branch needs to execute whether or not the pipeline needs to be updated.

You know, I made that comment back when you were updating the annotation immediately whenever there was a need to update. Now that you're not updating that annotation until the update has truly happened, I think it may be okay. Basically, we will revisit this "update" logic up until the update actually occurs.

Collaborator

The code in this branch needs to be refactored

are you referring partly to my other comment regarding the long function name here?

// The operations are defined by the RolloutChildOperation constants.
func makeChildResourceFromRolloutAndUpdateSpecHash(
func makeChildResourceFromRolloutAndCalculateSpecHash(
Collaborator

I hate to be picky, but I'm wondering if we can shorten this name? It's pretty long. I can see it's explicit about what it does, which can be nice. But if you have to name it that way, it sort of reveals a violation of the Single Responsibility Principle, that a function should do one thing. I'm not a total stickler for that rule. (I do think a function can do multiple things but still conceptually do one thing in general.)

Collaborator

In any case, can we either shorten the name or break it into multiple functions?

Contributor Author

I initially envisioned a different usage for this function, but after realizing that certain logic needs to be executed regardless of whether there is a creation or an update, I streamlined it into a simpler function.

Signed-off-by: Antonino Fugazzotto <[email protected]>
}
}

if rolloutChildOp == RolloutChildNoop {
annotations := rolloutObj.GetAnnotations()
if annotation, exists := annotations[apiv1.KeyHash]; exists && annotation != childResouceSpecHash {
if annotation := annotations[apiv1.KeyHash]; annotation != childResouceSpecHash {
Collaborator

ahh, that's a nice neat way to handle annotation not being present

} else if rolloutChildOp == RolloutChildUpdate {

kubernetes.SetAnnotation(pipelineRollout, apiv1.KeyHash, childResouceSpecHash)
} else {
Collaborator

did you see my comment here? I actually think it does make sense for this logic to be when we're still in the process of trying to update...unless I'm missing something?

@afugazzotto
Contributor Author

@juliev0 @xdevxy I made some changes to the spec hashing and annotation-setting code based on your feedback, to try to resolve all of the concerns. Please take a look again and let me know what you think. Thanks!

@juliev0
Collaborator

juliev0 commented Jun 15, 2024

Thank you for the adjustments @afugazzotto!

Unfortunately, and I am so sorry for this, I think I didn't think this whole annotation thing through before. I am realizing that, given that we want to do auto-healing, we want to overwrite the pipeline/ISBService if somebody hand-modifies it (like through kubectl apply). This means that we ultimately want to compare (a) the spec we intend to apply (including the "pause" designation) to (b) the spec currently on the cluster. (We should vet that.) Therefore, there may be no value to storing the hash after all. :(

cc @xdevxy @dpadhiar @a49ibrahim

@afugazzotto afugazzotto merged commit 154ac6f into main Jun 17, 2024
4 checks passed
@afugazzotto afugazzotto deleted the logicFix branch June 17, 2024 19:30
@xdevxy
Contributor

xdevxy commented Jun 17, 2024

Thank you for the adjustments @afugazzotto!

Unfortunately, and I am so sorry for this, I think I didn't think this whole annotation thing through before. I am realizing that, given that we want to do auto-healing, we want to overwrite the pipeline/ISBService if somebody hand-modifies it (like through kubectl apply). This means that we ultimately want to compare (a) the spec we intend to apply (including the "pause" designation) to (b) the spec currently on the cluster. (We should vet that.) Therefore, there may be no value to storing the hash after all. :(

cc @xdevxy @dpadhiar @a49ibrahim

We can discuss this when we implement pipeline draining with the concurrency concerns addressed. One way is to ignore the desired state in the spec when comparing hashes, and have dedicated logic to control the desired lifecycle of the pipeline.

@juliev0
Collaborator

juliev0 commented Jun 17, 2024

One way is to ignore the desired state in the spec when comparing hash and have dedicated logic to control the desired lifecycle of pipeline.

Can you clarify what you mean by "ignore the desired state in the spec"? You mean use the hash only for the purpose of auto-healing? (to compare the last applied state to what's running on the cluster?)

It seems to me that having the hash at all may be inefficient if we need to ultimately compare the desired state to what's running, but I could be wrong. Please do document the design you have in mind when you can.

@xdevxy
Contributor

xdevxy commented Jun 18, 2024

One way is to ignore the desired state in the spec when comparing hash and have dedicated logic to control the desired lifecycle of pipeline.

Can you clarify what you mean by "ignore the desired state in the spec"? You mean use the hash only for the purpose of auto-healing? (to compare the last applied state to what's running on the cluster?)

It seems to me that having the hash at all may be inefficient if we need to ultimately compare the desired state to what's running, but I could be wrong. Please do document the design you have in mind when you can.

What I mean is that we probably do not even want the desired lifecycle to be included in the PipelineRollout spec, only in the Pipeline spec. It's a good point that we also need to consider the auto-healing logic; for that, we can directly compare the Pipeline spec in the live cluster against the desired PipelineRollout spec after the first hash comparison. @afugazzotto is working on that for the auto-healing task.

@afugazzotto
Contributor Author

@juliev0 here is the issue I'm working on: #64
It has more details on the idea of having both auto-healing and the spec hash.

@juliev0
Collaborator

juliev0 commented Jun 18, 2024

#64

thank you. I have responded over there.

darshansimha referenced this pull request in darshansimha/numaplane Jun 25, 2024
…anged (#56)

Signed-off-by: Antonino Fugazzotto <[email protected]>
Signed-off-by: dsimha <[email protected]>