7430725 to 5a8343b
800711c to 81c8577
81c8577 to 5b73e06
byako left a comment:
Fail fast; there is no need to continue with the rest of the flow after a failure.
a3562d4 to d86fdbd
.github/workflows/e2e.yaml
Outdated
      run: go mod vendor
    - name: Build
      run: make PREFIX=artifacts cmds
    - name: install helm and kubelet
Suggested change:
-    - name: install helm and kubelet
+    - name: install helm and kubectl
Do we not need to also install kind?
    gpu_test_1=$(kubectl get pods -n gpu-test1 | grep -c 'Running')
    if [ $gpu_test_1 != 2 ]; then
      echo "gpu_test_1 $gpu_test_1 failed to match against 2 expected pods"
      exit 1
    fi
We should also check for the existence of GPU_DEVICE_* env vars in each test. It's possible for these pods to enter the Running state even if CDI didn't do its thing behind the scenes. This can happen if CDI is not enabled in the underlying container runtime, or if the DynamicResourceAllocation feature gate was somehow not enabled on the kubelet.
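A rough sketch of what such a check could look like, not what the PR implements. The helper names has_gpu_device_env and check_pod_gpu_env are hypothetical, and the sketch assumes the test images ship an env binary so that kubectl exec can dump the container environment.

```shell
# Hypothetical helper: succeed only if the given environment dump
# contains at least one GPU_DEVICE_* variable.
has_gpu_device_env() {
  printf '%s\n' "$1" | grep -q '^GPU_DEVICE_[^=]*='
}

# Hypothetical wrapper: dump the environment of one pod and check it.
# Assumption: `env` is available inside the container image.
check_pod_gpu_env() {
  check_ns="$1"
  check_pod="$2"
  # Fail if the pod cannot be reached at all.
  check_dump=$(kubectl exec -n "$check_ns" "$check_pod" -- env) || return 1
  if ! has_gpu_device_env "$check_dump"; then
    echo "pod $check_ns/$check_pod has no GPU_DEVICE_* environment variables"
    return 1
  fi
}
```

In the workflow this would be called once per pod after the Running count check, for example check_pod_gpu_env gpu-test1 followed by a pod name, exiting with status 1 on failure.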
/hold
Test seems flaky.
c3f20b4 to 34fe3d2
/hold cancel
Added kubectl waits on the pods, which may help with flakiness.
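The exact wait commands added in the PR are not shown in this thread. As a minimal sketch, assuming the pods only need to reach the Ready condition, a wait step could look like the following. The function name wait_for_pods_ready and the 120s default timeout are illustrative assumptions.

```shell
# Hypothetical helper: block until every pod in the namespace reports the
# Ready condition, or fail once the timeout expires.
# The 120s default is an assumption, not a value taken from the PR.
wait_for_pods_ready() {
  wait_ns="$1"
  wait_timeout="${2:-120s}"
  kubectl wait --for=condition=Ready pod --all -n "$wait_ns" --timeout="$wait_timeout"
}
```

Running this before counting the Running pods means the count is taken after the pods have settled, instead of racing against scheduling and image pulls.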
/lgtm
/assign @klueska
The last outstanding item is #56 (comment). We can resolve this now or do it as a follow-up. I don't care either way. I'll maybe find some time next week to finish that up.
Let's merge as-is for now, since running something is better than running nothing. But let's definitely follow up on that.
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: kannon92, klueska
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
I am hoping to at least build the driver, create a kind cluster, and maybe run a few of those pods as an e2e test.
Surprisingly, this works. I don't really understand where mine came from in the GitHub runner, but it is found.