
[BUG] Creating uiIngress fails, logging not showing details #2167

Open · 1 task done
tcassaert opened this issue Sep 13, 2024 · 4 comments

@tcassaert (Contributor)

Description

When creating a SparkApplication with uiIngress enabled, the operator fails to create the Ingress:

2024-09-13T08:55:51.736Z    INFO    sparkapplication/driveringress.go:215    Creating extensions.v1beta1/Ingress for SparkApplication web UI    {"a-eeac4d03af3a461583fb9c51f4018979": "namespace", "spark-jobs-dev": "ingressName"}
2024-09-13T08:55:51.738Z    ERROR    sparkapplication/controller.go:260    Failed to submit SparkApplication    {"name": "a-eeac4d03af3a461583fb9c51f4018979", "namespace": "spark-jobs-dev", "error": "failed to create web UI service"}
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).reconcileNewSparkApplication.func1
    /workspace/internal/controller/sparkapplication/controller.go:260
k8s.io/client-go/util/retry.OnError.func1
    /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:51
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection
    /go/pkg/mod/k8s.io/apimachinery@v0.29.3/pkg/util/wait/wait.go:145
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff
    /go/pkg/mod/k8s.io/apimachinery@v0.29.3/pkg/util/wait/backoff.go:461
k8s.io/client-go/util/retry.OnError
    /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:50
k8s.io/client-go/util/retry.RetryOnConflict
    /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:104
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).reconcileNewSparkApplication
    /workspace/internal/controller/sparkapplication/controller.go:247
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).Reconcile
    /workspace/internal/controller/sparkapplication/controller.go:179
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:227
2024-09-13T08:55:51.758Z    INFO    sparkapplication/event_handler.go:188    SparkApplication updated    {"name": "a-eeac4d03af3a461583fb9c51f4018979", "namespace": "spark-jobs-dev", "oldState": "", "newState": "SUBMISSION_FAILED"}
  • ✋ I have searched the open/closed issues and my issue is not listed.

Reproduction Code [Required]

Steps to reproduce the behavior (Helm values used to deploy the operator):

---
controller:
  batchScheduler:
    enable: true
  podMonitor:
    create: true
  uiIngress:
    enable: true
    urlFormat: '{{$appName}}.{{$appNamespace}}.batch.stag.warsaw.openeo.dataspace.copernicus.eu'
spark:
  jobNamespaces:
    - ""

Expected behavior

The operator creates the Ingress. If creation fails, the logs should show the reason for the failure.

Actual behavior

The operator fails to create the Ingress, and the logs don't show any relevant information about why it fails.

Environment & Versions

Spark Operator App version: v2.0.0-rc.0
Helm Chart Version: v2.0.0-rc.0
Kubernetes Version: 1.25.7
Apache Spark version: 3.5.2

@ChenYi015 (Contributor)

@tcassaert Thanks for reporting the bug. Would you like to contribute a fix? The error should be wrapped and returned:

if r.options.EnableUIService {
	service, err := r.createWebUIService(app)
	if err != nil {
		return fmt.Errorf("failed to create web UI service")
	}
	// ...

@tcassaert (Contributor, Author)

@ChenYi015 Should this be done by keeping fmt.Errorf, or should I use logger.Error? I see both used throughout the codebase.

@ChenYi015 (Contributor)

> @ChenYi015 Should this be done by keeping fmt.Errorf, or should I use logger.Error? I see both used throughout the codebase.

@tcassaert I think we can keep using fmt.Errorf:

if err != nil {
	return fmt.Errorf("failed to create web UI service: %v", err)
}
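(Aside: %v flattens the cause into text. With Go 1.13+ error wrapping, %w would keep the underlying error matchable via errors.Is/errors.As. A minimal illustration, with errQuota as a made-up stand-in error, not from the operator codebase:)

package main

import (
	"errors"
	"fmt"
)

// errQuota is a made-up stand-in for whatever createWebUIService might return.
var errQuota = errors.New("ingress quota exceeded")

func main() {
	err := errQuota
	wrapped := fmt.Errorf("failed to create web UI service: %w", err)
	flattened := fmt.Errorf("failed to create web UI service: %v", err)
	fmt.Println(errors.Is(wrapped, errQuota))   // true: cause still matchable
	fmt.Println(errors.Is(flattened, errQuota)) // false: cause reduced to text
}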

@tcassaert (Contributor, Author)

I've deployed a version with the added error wrapping:

2024-09-16T08:23:58.013Z        ERROR   sparkapplication/controller.go:260      Failed to submit SparkApplication    {"name": "a-e5cebe6a50f74a2f8bce42874ea56189", "namespace": "spark-jobs-dev", "error": "failed to create web UI service: failed to create ingress spark-jobs-dev/a-e5cebe6a50f74a2f8bce42874ea56189-ui-ingress: no matches for kind \"Ingress\" in version \"extensions/v1beta1\""}
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).reconcileNewSparkApplication.func1
        /workspace/internal/controller/sparkapplication/controller.go:260
k8s.io/client-go/util/retry.OnError.func1
        /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:51
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection
        /go/pkg/mod/k8s.io/apimachinery@v0.29.3/pkg/util/wait/wait.go:145
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff
        /go/pkg/mod/k8s.io/apimachinery@v0.29.3/pkg/util/wait/backoff.go:461
k8s.io/client-go/util/retry.OnError
        /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:50
k8s.io/client-go/util/retry.RetryOnConflict
        /go/pkg/mod/k8s.io/client-go@v0.29.3/util/retry/util.go:104
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).reconcileNewSparkApplication
        /workspace/internal/controller/sparkapplication/controller.go:247
github.com/kubeflow/spark-operator/internal/controller/sparkapplication.(*Reconciler).Reconcile
        /workspace/internal/controller/sparkapplication/controller.go:179
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:227

So it seems the operator falls back to the legacy Ingress API in my case:

func (r *Reconciler) createWebUIIngress(app *v1beta2.SparkApplication, service SparkService, ingressURL *url.URL, ingressClassName string) (*SparkIngress, error) {
	ingressName := util.GetDefaultUIIngressName(app)
	if util.IngressCapabilities.Has("networking.k8s.io/v1") {
		return r.createDriverIngressV1(app, service, ingressName, ingressURL, ingressClassName)
	}
	return r.createDriverIngressLegacy(app, service, ingressName, ingressURL)
}

I'm not entirely sure why though, as all other Ingresses in my cluster are networking.k8s.io/v1.
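For anyone hitting the same fallback: a capability check like util.IngressCapabilities is typically populated via API discovery, so a discovery error (e.g. RBAC on discovery, a stale cache) can look the same as a genuinely missing networking.k8s.io/v1. A minimal sketch of such a probe with client-go's discovery client, as an illustration of the general technique rather than the operator's actual implementation:

package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig the standard way; the operator itself would use in-cluster config.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	// Ask the API server whether networking.k8s.io/v1 is served. If this
	// call errors for any reason, a naive probe may wrongly conclude the
	// group/version is absent and fall back to the legacy API.
	if _, err := dc.ServerResourcesForGroupVersion("networking.k8s.io/v1"); err != nil {
		fmt.Println("networking.k8s.io/v1 not detected:", err)
		return
	}
	fmt.Println("networking.k8s.io/v1 is available")
}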
