
Commit abc259e

pratik705devx authored and committed
docs: update readme.md of otel
1 parent ed9c8b4 commit abc259e

File tree

1 file changed: +15 -200 lines changed
  • applications/base/services/observability/opentelemetry-kube-stack

Lines changed: 15 additions & 200 deletions
@@ -1,204 +1,19 @@
-# OpenTelemetry Kube Stack
+# OpenTelemetry Kube Stack – Base Configuration
 
-The OpenTelemetry Kube Stack is a comprehensive observability solution that provides a complete OpenTelemetry setup for Kubernetes clusters. It includes the OpenTelemetry Operator, collectors, and essential monitoring components.
+This directory contains the **base manifests** for deploying the [OpenTelemetry Kube Stack](https://opentelemetry.io/), a **unified observability framework** for collecting, processing, and exporting **traces and logs** from Kubernetes workloads and infrastructure components.
+It is designed to be **consumed by cluster repositories** as a remote base, allowing each cluster to apply **custom overrides** as needed.
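A rough sketch of what consuming this directory as a remote base could look like from a cluster repository; the Git URL, ref, and patch file below are placeholders, and only the base path comes from this repository:

```yaml
# kustomization.yaml in a cluster repository (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: observability
resources:
  # placeholder URL and ref; only the directory path is taken from this repo
  - https://github.com/<org>/<repo>/applications/base/services/observability/opentelemetry-kube-stack?ref=main
patches:
  # cluster-specific override; file name is illustrative
  - path: collector-overrides.yaml
```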
 
-## Overview
+---
 
-This chart deploys:
-- **OpenTelemetry Operator**: Manages OpenTelemetry collectors and instrumentation
-- **OpenTelemetry Collector**: Collects, processes, and exports telemetry data
-- **Kube State Metrics**: Exposes cluster-level metrics about Kubernetes objects
-- **Node Exporter**: Collects hardware and OS metrics from cluster nodes
+## About OpenTelemetry Kube Stack
 
-## Configuration
-
-### Chart Information
-- **Chart**: opentelemetry-kube-stack
-- **Version**: 0.11.1
-- **App Version**: 0.129.1
-- **Repository**: https://open-telemetry.github.io/opentelemetry-helm-charts
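With those coordinates, the pinned chart can be rendered and inspected locally. This is only a convenience sketch with an illustrative release name; the repository itself deploys the chart through its own manifests rather than a manual `helm install`:

```bash
# Inspect the pinned chart version locally
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm show values open-telemetry/opentelemetry-kube-stack --version 0.11.1
helm template otel-kube-stack open-telemetry/opentelemetry-kube-stack \
  --version 0.11.1 --namespace observability > rendered.yaml
```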
-
-### Namespace
-Deployed in the `observability` namespace alongside other monitoring components.
-
-### Security Hardening
-
-The deployment includes comprehensive security configurations:
-
-#### Container Security
-- Non-root execution (`runAsNonRoot: true`)
-- Specific user ID (`runAsUser: 65534`)
-- Security profiles (`seccompProfile.type: RuntimeDefault`)
-- Capability dropping (`capabilities.drop: [ALL]`)
-- Read-only root filesystem (`readOnlyRootFilesystem: true`)
-- Privilege escalation disabled (`allowPrivilegeEscalation: false`)
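Taken together, these bullets describe a container security context roughly like the following sketch, assembled from the list above rather than copied from the values file:

```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 65534
  seccompProfile:
    type: RuntimeDefault
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL
```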
-
-#### Resource Management
-- CPU and memory limits defined for all components
-- Resource requests set for proper scheduling
-- Memory limiter processor configured for collectors
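The memory limiter referred to here is the collector's standard `memory_limiter` processor; a typical configuration looks roughly like this, with numbers that are illustrative rather than the values pinned in this repo:

```yaml
processors:
  memory_limiter:
    check_interval: 1s        # how often memory usage is sampled
    limit_percentage: 75      # ceiling relative to the container memory limit
    spike_limit_percentage: 15
```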
-
-#### Network Security
-- OTLP receivers configured on standard ports (4317/4318)
-- Service monitors enabled for Prometheus integration
-- Node selectors for Linux-only deployment
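A minimal fragment of a collector spec matching these bullets might look like the sketch below (values illustrative):

```yaml
config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
nodeSelector:
  kubernetes.io/os: linux
```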
-
-### Key Features
-
-#### OpenTelemetry Operator
-- Manages collector lifecycle and configuration
-- Supports auto-instrumentation for applications
-- Webhook-based configuration validation
-
-#### Collector Configuration
-- OTLP receivers for traces, metrics, and logs
-- Batch processing for efficient data handling
-- Memory limiting to prevent resource exhaustion
-- Logging exporter for initial setup (can be customized)
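Wired together, those pieces form a pipeline along the lines of the sketch below; the real pipelines live in the hardened values file. Note that recent collector releases ship a `debug` exporter as the successor to the deprecated `logging` exporter:

```yaml
config:
  receivers:
    otlp:
      protocols:
        grpc: {}
        http: {}
  processors:
    memory_limiter:
      check_interval: 1s
      limit_percentage: 75
    batch: {}
  exporters:
    debug: {}   # stand-in until a real backend exporter is configured
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [debug]
```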
-
-#### Monitoring Integration
-- Prometheus ServiceMonitor resources enabled
-- Kube State Metrics for cluster-level observability
-- Node Exporter for infrastructure metrics
-- Compatible with existing Prometheus stack
-
-### Customization
-
-#### Collector Configuration
-The default collector configuration can be extended by modifying the `config` section in the hardened values file. Common customizations include:
-
-```yaml
-config:
-  exporters:
-    otlp:
-      endpoint: "your-backend:4317"
-      tls:
-        insecure: false
-    prometheusremotewrite:
-      endpoint: "https://prometheus.example.com/api/v1/write"
-```
-
-#### Resource Scaling
-Adjust resource limits based on cluster size and telemetry volume:
-
-```yaml
-resources:
-  requests:
-    memory: "256Mi"
-    cpu: "200m"
-  limits:
-    memory: "1Gi"
-    cpu: "1000m"
-```
-
-### Dependencies
-
-This chart has dependencies on:
-- OpenTelemetry CRDs (installed automatically)
-- Kubernetes 1.19+ for proper ServiceMonitor support
-- Prometheus Operator (for ServiceMonitor resources)
-
-### Compatibility
-
-#### Existing Services
-The configuration is designed to work alongside existing observability services:
-- **kube-prometheus-stack**: Kubernetes service monitors disabled to avoid conflicts
-- **Prometheus CRDs**: Installation disabled (uses existing CRDs)
-- **Grafana**: Compatible with OpenTelemetry data sources
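A sketch of the kind of values overrides these bullets imply; apart from `crds.installOtel`, which appears under Troubleshooting below, the key names are assumptions and should be checked against the chart's values.yaml:

```yaml
crds:
  installOtel: false        # key confirmed by the Troubleshooting section below
  installPrometheus: false  # assumed key for "Prometheus CRDs: installation disabled"
# ServiceMonitor conflicts with kube-prometheus-stack are handled in the hardened
# values file; the relevant keys vary by chart version.
```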
-
-#### OpenTelemetry Operator
-This deployment may conflict with the existing `opentelemetry-operator` service. Consider:
-- Using this as a replacement for the standalone operator
-- Disabling the operator component if only collectors are needed
-- Coordinating CRD management between deployments
-
-### Monitoring and Observability
-
-#### Health Checks
-Monitor the deployment status:
-```bash
-kubectl get helmrelease opentelemetry-kube-stack -n observability
-kubectl get pods -n observability -l app.kubernetes.io/name=opentelemetry-kube-stack
-```
-
-#### Collector Status
-Check OpenTelemetry collector status:
-```bash
-kubectl get opentelemetrycollector -n observability
-kubectl logs -n observability -l app.kubernetes.io/component=opentelemetry-collector
-```
-
-#### Metrics Availability
-Verify metrics collection:
-```bash
-kubectl port-forward -n observability svc/opentelemetry-kube-stack-collector 8888:8888
-curl http://localhost:8888/metrics
-```
-
-### Troubleshooting
-
-#### Common Issues
-
-1. **CRD Conflicts**: If OpenTelemetry CRDs already exist, disable installation:
-```yaml
-crds:
-  installOtel: false
-```
-
-2. **Resource Constraints**: Increase resource limits if collectors are OOMKilled:
-```yaml
-resources:
-  limits:
-    memory: "1Gi"
-```
-
-3. **Webhook Failures**: If admission webhooks cause issues:
-```yaml
-opentelemetry-operator:
-  admissionWebhooks:
-    failurePolicy: "Ignore"
-```
-
-#### Debug Commands
-```bash
-# Check operator logs
-kubectl logs -n observability -l app.kubernetes.io/name=opentelemetry-operator
-
-# Describe collector resources
-kubectl describe opentelemetrycollector -n observability
-
-# Check service monitor status
-kubectl get servicemonitor -n observability
-```
-
-### Integration Examples
-
-#### Application Instrumentation
-Enable auto-instrumentation for applications:
-```yaml
-apiVersion: opentelemetry.io/v1alpha1
-kind: Instrumentation
-metadata:
-  name: my-instrumentation
-spec:
-  exporter:
-    endpoint: http://opentelemetry-kube-stack-collector:4317
-  propagators:
-    - tracecontext
-    - baggage
-```
-
-#### Custom Exporters
-Configure exporters for your observability backend:
-```yaml
-config:
-  exporters:
-    jaeger:
-      endpoint: jaeger-collector:14250
-      tls:
-        insecure: true
-    prometheus:
-      endpoint: "0.0.0.0:8889"
-```
-
-This deployment provides a solid foundation for OpenTelemetry-based observability in Kubernetes environments with enterprise-grade security and monitoring capabilities.
+- Provides a **complete observability foundation** for Kubernetes clusters, integrating **traces and logs** under a single open standard.
+- Deployed using the **OpenTelemetry Operator**, which manages collectors, instrumentation, and telemetry pipelines declaratively via Kubernetes manifests.
+- Collects telemetry data from:
+  - **Kubernetes system components** (API server, kubelet, scheduler, etc.)
+  - **Application workloads** instrumented with OpenTelemetry SDKs or auto-instrumentation.
+- Processes data through **OpenTelemetry Collectors**, which perform transformation, filtering, batching, and enrichment before export.
+- Supports multiple backends including **Prometheus**, **Tempo**, **Loki**, **Grafana**, **Jaeger**, and **OTLP-compatible endpoints**.
+- Enables **auto-discovery and dynamic configuration** for Kubernetes workloads, simplifying instrumentation and reducing manual setup.
+- Designed for **scalability and resilience**, supporting both **agent** and **gateway** modes for distributed telemetry collection.
+- Natively integrates with **Grafana** and other observability tools for unified dashboards and correlation between metrics, traces, and logs.
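For orientation, an Operator-managed collector in agent (DaemonSet) mode looks roughly like the sketch below; the name, API version, and pipeline are illustrative and not taken from the base manifests in this directory:

```yaml
apiVersion: opentelemetry.io/v1beta1   # v1alpha1 on older operator releases
kind: OpenTelemetryCollector
metadata:
  name: node-agent                     # illustrative name
  namespace: observability
spec:
  mode: daemonset                      # "agent" pattern; use deployment for a gateway
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
          http: {}
    processors:
      batch: {}
    exporters:
      debug: {}
    service:
      pipelines:
        logs:
          receivers: [otlp]
          processors: [batch]
          exporters: [debug]
```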

0 commit comments
