
HOSA fails to scrape its own metrics when identity is set in config.yaml #176

ljuaneda opened this issue Oct 25, 2017 · 3 comments


ljuaneda commented Oct 25, 2017

Hi,

This is a follow-up to issue #150.

I'm using the openshift master-proxy certs in a secret to gather metrics from jolokia endpoints.
This currently works with commit 8600302 on OSCP v3.5.5.15.

I'm trying the new hawkular/hawkular-openshift-agent Docker image, which pulled version 1.4.2, but HOSA fails to scrape its own metrics on :8443:

I1025 08:06:11.320386       1 prometheus_metrics_collector.go:97] DEBUG: Told to collect all Prometheus metrics from [https://10.130.5.68:8443/metrics]
2017/10/25 08:06:11 http: TLS handshake error from 10.130.5.68:42456: read tcp 10.130.5.68:8443->10.130.5.68:42456: read: connection reset by peer
W1025 08:06:11.324820       1 metrics_collector_manager.go:186] Failed to collect metrics from [default|hawkular-openshift-agent-8ffjs|prometheus|https://10.130.5.68:8443/metrics] at [Wed, 25 Oct 2017 08:06:11 +0000]. err=Failed to collect Prometheus metrics from [https://10.130.5.68:8443/metrics]. err=Cannot scrape Prometheus URL [https://10.130.5.68:8443/metrics]: err=Get https://10.130.5.68:8443/metrics: x509: cannot validate certificate for 10.130.5.68 because it doesn't contain any IP SANs

My guess is that HOSA is not expecting unsecured connections; curl with -k (skipping certificate verification) reaches the endpoint fine:

$ oc exec hawkular-openshift-agent-8ffjs -- curl -vks https://10.130.5.68:8443/metrics
* About to connect() to 10.130.5.68 port 8443 (#0)
*   Trying 10.130.5.68...
* Connected to 10.130.5.68 (10.130.5.68) port 8443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*       subject: CN=system:master-proxy
*       start date: Feb 01 16:54:03 2017 GMT
*       expire date: Feb 01 16:54:04 2019 GMT
*       common name: system:master-proxy
*       issuer: CN=openshift-signer@1485968044
> GET /metrics HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.130.5.68:8443
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 6308
< Content-Type: text/plain; version=0.0.4
< Date: Wed, 25 Oct 2017 08:15:28 GMT
<
{ [data not shown]
* Connection #0 to host 10.130.5.68 left intact
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.00022610500000000002
go_gc_duration_seconds{quantile="0.25"} 0.00024556400000000004
go_gc_duration_seconds{quantile="0.5"} 0.000258036
go_gc_duration_seconds{quantile="0.75"} 0.000269366
go_gc_duration_seconds{quantile="1"} 0.000530231
go_gc_duration_seconds_sum 0.004801416
go_gc_duration_seconds_count 17
...

My current configuration:

$ cat hawkular-openshift-agent-configuration.cm-new.yaml
apiVersion: v1
kind: List
metadata: {}
items:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    labels:
      metrics-infra: agent
    name: hawkular-openshift-agent-configuration
    namespace: default
  data:
    config.yaml: |
      kubernetes:
        tenant: ${POD:namespace_name}
      hawkular_server:
        url: https://hawkular-metrics.openshift-infra.svc.cluster.local
        credentials:
          username: secret:openshift-infra/hawkular-metrics-account/hawkular-metrics.username
          password: secret:openshift-infra/hawkular-metrics-account/hawkular-metrics.password
        ca_cert_file: secret:openshift-infra/hawkular-metrics-certificate/hawkular-metrics-ca.certificate
      emitter:
        status_enabled: true
        metrics_enabled: true
        health_enabled: true
      identity:
        cert_file: /master-proxy/master.proxy-client.crt
        private_key_file: /master-proxy/master.proxy-client.key
      collector:
        max_metrics_per_pod: 500
        minimum_collection_interval: 10s
        default_collection_interval: 30s
        metric_id_prefix: pod/${POD:uid}/custom/
        pod_label_tags_prefix: _empty_
        tags:
          metric_name: ${METRIC:name}
          description: ${METRIC:description}
          units: ${METRIC:units}
          namespace_id: ${POD:namespace_uid}
          namespace_name: ${POD:namespace_name}
          node_name: ${POD:node_name}
          pod_id: ${POD:uid}
          pod_ip: ${POD:ip}
          pod_name: ${POD:name}
          pod_namespace: ${POD:namespace_name}
          hostname: ${POD:hostname}
          host_ip: ${POD:host_ip}
          labels: ${POD:labels}
          cluster_name: ${POD:cluster_name}
          resource_version: ${POD:resource_version}
          type: pod
          collector: hawkular_openshift_agent
          custom_metric: true
    hawkular-openshift-agent: |
      endpoints:
      - type: prometheus
        protocol: "https"
        port: 8443
        path: /metrics
        collection_interval: 30s
- apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    creationTimestamp: null
    labels:
      metrics-infra: agent
      name: hawkular-openshift-agent
    name: hawkular-openshift-agent
  spec:
    selector:
      matchLabels:
        name: hawkular-openshift-agent
    template:
      metadata:
        creationTimestamp: null
        labels:
          metrics-infra: agent
          name: hawkular-openshift-agent
      spec:
        containers:
        - command:
          - /opt/hawkular/hawkular-openshift-agent
          - -config
          - /hawkular-openshift-agent-configuration/config.yaml
          - -v
          - "4"
          env:
          - name: K8S_POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: K8S_POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: EMITTER_STATUS_CREDENTIALS_USERNAME
            valueFrom:
              secretKeyRef:
                key: username
                name: hawkular-openshift-agent-status
          - name: EMITTER_STATUS_CREDENTIALS_PASSWORD
            valueFrom:
              secretKeyRef:
                key: password
                name: hawkular-openshift-agent-status
          image: hawkular/hawkular-openshift-agent:1.4.2
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /health
              port: 8443
              scheme: HTTPS
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 1
          name: hawkular-openshift-agent
          resources: {}
          terminationMessagePath: /dev/termination-log
          volumeMounts:
          - mountPath: /hawkular-openshift-agent-configuration
            name: hawkular-openshift-agent-configuration
          - mountPath: /master-proxy
            name: master-proxy
        dnsPolicy: ClusterFirst
        nodeSelector:
          hawkular-openshift-agent: "true"
        restartPolicy: Always
        securityContext: {}
        serviceAccount: hawkular-openshift-agent
        serviceAccountName: hawkular-openshift-agent
        terminationGracePeriodSeconds: 30
        volumes:
        - configMap:
            defaultMode: 420
            name: hawkular-openshift-agent-configuration
          name: hawkular-openshift-agent-configuration
        - configMap:
            defaultMode: 420
            name: hawkular-openshift-agent-configuration
          name: hawkular-openshift-agent
        - name: master-proxy
          secret:
            defaultMode: 420
            secretName: master-proxy

Regards,

Ludovic


ljuaneda commented Oct 25, 2017

By the way, it doesn't seem to bother the liveness probe:

$ oc get pods hawkular-openshift-agent-8ffjs
NAME                             READY     STATUS    RESTARTS   AGE
hawkular-openshift-agent-8ffjs   1/1       Running   0          17m

@jmazzitelli (Contributor) commented:

Check this commit; it shows a change to the ca_cert_file setting that you may also have to incorporate:

7c7d7f5

@ljuaneda (Author) commented:

This doesn't work with OSCP version 3.5:

$ oc -n openshift-infra get secret | grep hawkular-metrics
hawkular-metrics-account                           Opaque                                2         186d
hawkular-metrics-certificate                       Opaque                                2         186d
hawkular-metrics-secrets                           Opaque                                9         186d
$ oc -n openshift-infra get secret hawkular-metrics-certificate -o json | jq -r '.data|keys[]'
hawkular-metrics-ca.certificate
hawkular-metrics.certificate

This relates to openshift/origin-metrics. Unfortunately, that change is for origin-metrics 3.6 or later; there is no backport for origin-metrics 3.5.
