Dep: Upgrade kubeflow training operator #6294
base: master
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #6294 +/- ##
==========================================
- Coverage 58.49% 58.48% -0.01%
==========================================
Files 937 937
Lines 71088 71088
==========================================
- Hits 41583 41577 -6
- Misses 26353 26359 +6
Partials 3152 3152
Signed-off-by: Fabio Graetz <[email protected]>
72d0107 to 5effd97
Result of go mod tidy.
- fmt.Errorf("error unmarshaling JSON: while decoding JSON: json: unknown field \"InvalidDomain\""),
- err)
+ "error unmarshaling JSON: while decoding JSON: json: unknown field \"InvalidDomain\"",
+ err.Error())
  s.DeleterExt.AssertNotCalled(t, "DeleteProjectDomainAttributes",
Due to a dependency upgrade, the type of the error changed, so the equality assertion on the error value fails:
Error: Not equal:
expected: *errors.errorString(&errors.errorString{s:"error unmarshaling JSON: while decoding JSON: json: unknown field \"InvalidDomain\""})
actual : *fmt.wrapError(&fmt.wrapError{msg:"error unmarshaling JSON: while decoding JSON: json: unknown field \"InvalidDomain\"", err:(*errors.errorString)(0xc0001314f0)})
Test: TestDeleteTaskResourceAttributes/attribute_deletion_invalid_file
Let's only compare the error message, not the type.
  github.com/hashicorp/golang-lru v0.5.4
  github.com/imdario/mergo v0.3.13
- github.com/kubeflow/common v0.4.3
- github.com/kubeflow/training-operator v1.5.0-rc.0
+ github.com/kubeflow/training-operator v1.8.0
One important change here is that github.com/kubeflow/common isn't needed anymore, as it has been integrated into github.com/kubeflow/training-operator.
@@ -6,7 +6,7 @@ import (
  "sort"
  "time"

- commonOp "github.com/kubeflow/common/pkg/apis/common/v1"
  kubeflowv1 "github.com/kubeflow/training-operator/pkg/apis/kubeflow.org/v1"
  v1 "k8s.io/api/core/v1"
github.com/kubeflow/common has been integrated into github.com/kubeflow/training-operator. This leads to a lot of commonOp -> kubeflowv1 changes below.
@@ -1204,7 +1203,7 @@ func TestBuildResourcePytorchV1WithElastic(t *testing.T) {
  var hasContainerWithDefaultPytorchName = false

  for _, container := range pytorchJob.Spec.PyTorchReplicaSpecs[kubeflowv1.PyTorchJobReplicaTypeWorker].Template.Spec.Containers {
- if container.Name == kubeflowv1.PytorchJobDefaultContainerName {
+ if container.Name == kubeflowv1.PyTorchJobDefaultContainerName {
Pytorch -> PyTorch
Why are the changes needed?
Goal: Enable #6295, which prevents the flyteplugins kubeflow plugins from failing a task when a PytorchJob, TfJob, ... hasn't been updated by the kubeflow training operator because the job was suspended (e.g. by an external queueing system).
The run policy's "suspend" attribute is only available in a newer version of the kubeflow training operator, so the dependency has to be upgraded first.
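To illustrate the goal, here is a rough sketch of how a plugin could tell a suspended job apart from a failed one. It is not the implementation from #6295. `RunPolicy`, `JobStatus`, `Phase`, `isSuspended`, and `phaseFor` are hypothetical stand-ins defined locally so the example runs with the standard library alone, and the mapping to phases is an assumption; the only detail taken from the PR is that the run policy carries a "suspend" attribute.

```go
package main

import "fmt"

// RunPolicy is a hypothetical stand-in for the operator's run policy.
// Only the "suspend" attribute is modeled here.
type RunPolicy struct {
	Suspend *bool
}

// JobStatus is a hypothetical stand-in for a job status. Conditions holds
// condition type names in the order the operator reported them.
type JobStatus struct {
	Conditions []string
}

// Phase is a simplified task phase used only for this illustration.
type Phase string

const (
	PhaseQueued    Phase = "Queued"
	PhaseRunning   Phase = "Running"
	PhaseSucceeded Phase = "Succeeded"
	PhaseFailed    Phase = "Failed"
)

// isSuspended reports whether the suspend attribute is set and true.
func isSuspended(p RunPolicy) bool {
	return p.Suspend != nil && *p.Suspend
}

// phaseFor maps a job to a phase. A job without any reported conditions is
// treated as queued when it is suspended and as failed otherwise.
func phaseFor(policy RunPolicy, status JobStatus) Phase {
	if len(status.Conditions) == 0 {
		if isSuspended(policy) {
			return PhaseQueued
		}
		return PhaseFailed
	}
	// Otherwise the most recent condition decides the phase.
	switch status.Conditions[len(status.Conditions)-1] {
	case "Succeeded":
		return PhaseSucceeded
	case "Failed":
		return PhaseFailed
	default:
		return PhaseRunning
	}
}

func main() {
	suspend := true
	fmt.Println(phaseFor(RunPolicy{Suspend: &suspend}, JobStatus{}))
	fmt.Println(phaseFor(RunPolicy{}, JobStatus{}))
	fmt.Println(phaseFor(RunPolicy{}, JobStatus{Conditions: []string{"Created", "Running"}}))
}
```

Modeling the attribute as a pointer lets the sketch distinguish "not set" from an explicit false.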
What changes were proposed in this pull request?
Upgrade the kubeflow training operator dependency.
Related PRs
#6295
Summary by Bito
This pull request upgrades the kubeflow training operator dependency and related k8s libraries across multiple modules to support new suspend functionality. The changes include standardizing API usage by replacing legacy commonOp types with kubeflowv1 types in PyTorch and TensorFlow operators, ensuring compatibility with newer operator policies.
Unit tests added: False
Estimated effort to review (1-5, lower is better): 5