Commit 0b10411

Merge pull request #5714 from ClickHouse/clickstack-kafka-logs
ClickStack - Kafka Logs Guide
2 parents 5abc76b + 306c0a7, commit 0b10411

7 files changed
Lines changed: 322 additions & 0 deletions

File tree

docs/use-cases/observability/clickstack/ingesting-data/integration-examples/index.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -23,6 +23,7 @@ For production deployments, we recommend running integrations as OpenTelemetry C
 | [AWS Lambda Logs using Rotel](/use-cases/observability/clickstack/integrations/aws-lambda) | Forward Lambda logs with Rotel |
 | [AWS CloudWatch](/use-cases/observability/clickstack/integrations/aws-cloudwatch-logs) | Forward CloudWatch log groups |
 | [JVM Metrics](/use-cases/observability/clickstack/integrations/jvm-metrics) | Monitor JVM performance |
+| [Kafka Logs](/use-cases/observability/clickstack/integrations/kafka-logs) | Collect Kafka broker logs |
 | [Kafka Metrics](/use-cases/observability/clickstack/integrations/kafka-metrics) | Monitor Kafka performance |
 | [Kubernetes](/use-cases/observability/clickstack/integrations/kubernetes) | Monitor K8s clusters |
 | [MongoDB Logs](/use-cases/observability/clickstack/integrations/mongodb-logs) | Collect MongoDB server logs |
```
Lines changed: 320 additions & 0 deletions

@@ -0,0 +1,320 @@ (new file)
---
slug: /use-cases/observability/clickstack/integrations/kafka-logs
title: 'Monitoring Kafka Logs with ClickStack'
sidebar_label: 'Kafka Logs'
pagination_prev: null
pagination_next: null
description: 'Monitoring Kafka Logs with ClickStack'
doc_type: 'guide'
keywords: ['Kafka', 'logs', 'OTEL', 'ClickStack', 'broker monitoring', 'Log4j']
---

import Image from '@theme/IdealImage';
import useBaseUrl from '@docusaurus/useBaseUrl';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import log_view from '@site/static/images/clickstack/kafka/logs/log-view.png';
import search_view from '@site/static/images/clickstack/kafka/logs/search-view.png';
import finish_import from '@site/static/images/clickstack/kafka/logs/finish-import.png';
import example_dashboard from '@site/static/images/clickstack/kafka/logs/example-dashboard.png';
import import_dashboard from '@site/static/images/clickstack/import-dashboard.png';
import { TrackedLink } from '@site/src/components/GalaxyTrackedLink/GalaxyTrackedLink';

# Monitoring Kafka Logs with ClickStack {#kafka-logs-clickstack}

:::note[TL;DR]
Collect and visualize Kafka broker logs (Log4j format) in ClickStack using the OTel `filelog` receiver. Includes a demo dataset and a pre-built dashboard.
:::

## Integration with existing Kafka {#existing-kafka}

This section covers configuring an existing Kafka installation to send broker logs to ClickStack by modifying the ClickStack OTel collector configuration.
If you would like to try the Kafka logs integration before touching your own installation, use the preconfigured setup and sample data in the ["Demo dataset"](/use-cases/observability/clickstack/integrations/kafka-logs#demo-dataset) section.

### Prerequisites {#prerequisites}
- A running ClickStack instance
- An existing Kafka installation (version 2.0 or newer)
- Access to the Kafka log files (`server.log`, `controller.log`, etc.)

<VerticalStepper headerLevel="h4">

#### Verify Kafka logging configuration {#verify-kafka}

Kafka uses Log4j and writes logs to the directory specified by the `kafka.logs.dir` system property or the `LOG_DIR` environment variable. Check your log file location:

```bash
# Default locations
ls $KAFKA_HOME/logs/  # Standard Apache Kafka (defaults to <install-dir>/logs/)
ls /var/log/kafka/    # RPM/DEB package installations
```

Key Kafka log files:
- **`server.log`**: General broker logs (startup, connections, replication, errors)
- **`controller.log`**: Controller-specific events (leader election, partition reassignment)
- **`state-change.log`**: Partition and replica state transitions

Kafka's default Log4j pattern produces lines like:

```text
[2026-03-09 14:23:45,123] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
```

:::note
For Docker-based Kafka deployments (e.g., `confluentinc/cp-kafka`), the default Log4j configuration includes only a console appender; there is no file appender, so logs are written to stdout only. To use the `filelog` receiver, you'll need to redirect logs to a file, either by adding a file appender to `log4j.properties` or by piping stdout (e.g., `| tee /var/log/kafka/server.log`).
:::
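
As a reference point, a file appender can be added along these lines. This is a sketch for Log4j 1.x as used by Kafka 3.x and earlier (Kafka 4.x moved to Log4j 2, where the equivalent lives in `log4j2.yaml`); the appender name and file path are illustrative, and the conversion pattern mirrors Kafka's stock `[%d] %p %m (%c)%n` format that this guide's parser expects:

```properties
# Illustrative log4j.properties fragment: log to both console and a file.
# "kafkaFile" and the path are placeholder names; adjust to your deployment.
log4j.rootLogger=INFO, stdout, kafkaFile

log4j.appender.kafkaFile=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaFile.File=/var/log/kafka/server.log
log4j.appender.kafkaFile.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.kafkaFile.layout=org.apache.log4j.PatternLayout
# Same pattern Kafka ships by default, so the regex_parser below still matches
log4j.appender.kafkaFile.layout.ConversionPattern=[%d] %p %m (%c)%n
```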

#### Create a custom OTel collector configuration for Kafka {#custom-otel}

ClickStack allows you to extend the base OpenTelemetry Collector configuration by mounting a custom configuration file and setting an environment variable. The custom configuration is merged with the base configuration managed by HyperDX via OpAMP.

Create a file named `kafka-logs-monitoring.yaml` with the following configuration:

```yaml
receivers:
  filelog/kafka:
    include:
      - /var/log/kafka/server.log
      - /var/log/kafka/controller.log   # optional, only exists if Log4j is configured with separate file appenders
      - /var/log/kafka/state-change.log # optional, same as above
    start_at: beginning
    multiline:
      line_start_pattern: '^\[\d{4}-\d{2}-\d{2}'
    operators:
      - type: regex_parser
        regex: '^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (?P<severity>\w+) (?P<message>.*)'
        parse_from: body
        parse_to: attributes
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%d %H:%M:%S,%L'
        severity:
          parse_from: attributes.severity

      - type: move
        from: attributes.message
        to: body

      - type: add
        field: attributes.source
        value: "kafka"

      - type: add
        field: resource["service.name"]
        value: "kafka-production"

service:
  pipelines:
    logs/kafka:
      receivers: [filelog/kafka]
      processors:
        - memory_limiter
        - transform
        - batch
      exporters:
        - clickhouse
```

:::note
- You only define new receivers and pipelines in the custom config. The processors (`memory_limiter`, `transform`, `batch`) and exporters (`clickhouse`) are already defined in the base ClickStack configuration; you just reference them by name.
- The `multiline` configuration ensures stack traces are captured as a single log entry.
- This configuration uses `start_at: beginning` to read all existing logs when the collector starts. For production deployments, change this to `start_at: end` to avoid re-ingesting logs on collector restarts.
:::

#### Configure ClickStack to load custom configuration {#load-custom}

To enable custom collector configuration in your existing ClickStack deployment, you must:

1. Mount the custom config file at `/etc/otelcol-contrib/custom.config.yaml`
2. Set the environment variable `CUSTOM_OTELCOL_CONFIG_FILE=/etc/otelcol-contrib/custom.config.yaml`
3. Mount your Kafka log directory so the collector can read the log files

<Tabs groupId="deployMethod">
<TabItem value="docker-compose" label="Docker Compose" default>

Update your ClickStack deployment configuration:

```yaml
services:
  clickstack:
    # ... existing configuration ...
    environment:
      - CUSTOM_OTELCOL_CONFIG_FILE=/etc/otelcol-contrib/custom.config.yaml
      # ... other environment variables ...
    volumes:
      - ./kafka-logs-monitoring.yaml:/etc/otelcol-contrib/custom.config.yaml:ro
      - /var/log/kafka:/var/log/kafka:ro
      # ... other volumes ...
```

</TabItem>
<TabItem value="docker-run" label="Docker Run (All-in-One Image)">

If you're using the all-in-one image with Docker, run:

```bash
docker run --name clickstack \
  -p 8080:8080 -p 4317:4317 -p 4318:4318 \
  -e CUSTOM_OTELCOL_CONFIG_FILE=/etc/otelcol-contrib/custom.config.yaml \
  -v "$(pwd)/kafka-logs-monitoring.yaml:/etc/otelcol-contrib/custom.config.yaml:ro" \
  -v /var/log/kafka:/var/log/kafka:ro \
  clickhouse/clickstack-all-in-one:latest
```

</TabItem>
</Tabs>

:::note
Ensure the ClickStack collector has the permissions needed to read the Kafka log files. In production, use read-only mounts (`:ro`) and follow the principle of least privilege.
:::

#### Verify logs in HyperDX {#verifying-logs}

Once configured, log in to HyperDX and verify that logs are flowing:

<Image img={search_view} alt="Search view"/>

<Image img={log_view} alt="Log view"/>

</VerticalStepper>

## Demo dataset {#demo-dataset}

Test the Kafka logs integration with a pre-generated sample dataset before configuring your production systems.

<VerticalStepper headerLevel="h4">

#### Download the sample dataset {#download-sample}

Download the sample log file:

```bash
curl -O https://datasets-documentation.s3.eu-west-3.amazonaws.com/clickstack-integrations/kafka/server.log
```

#### Create test collector configuration {#test-config}

Run the following to create a file named `kafka-logs-demo.yaml` with the demo collector configuration:

```bash
cat > kafka-logs-demo.yaml << 'EOF'
receivers:
  filelog/kafka:
    include:
      - /tmp/kafka-demo/server.log
    start_at: beginning
    multiline:
      line_start_pattern: '^\[\d{4}-\d{2}-\d{2}'
    operators:
      - type: regex_parser
        regex: '^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (?P<severity>\w+) (?P<message>.*)'
        parse_from: body
        parse_to: attributes
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%d %H:%M:%S,%L'
        severity:
          parse_from: attributes.severity

      - type: move
        from: attributes.message
        to: body

      - type: add
        field: attributes.source
        value: "kafka-demo"

      - type: add
        field: resource["service.name"]
        value: "kafka-demo"

service:
  pipelines:
    logs/kafka-demo:
      receivers: [filelog/kafka]
      processors:
        - memory_limiter
        - transform
        - batch
      exporters:
        - clickhouse
EOF
```

#### Run ClickStack with demo configuration {#run-demo}

Run ClickStack with the demo logs and configuration:

```bash
docker run --name clickstack-demo \
  -p 8080:8080 -p 4317:4317 -p 4318:4318 \
  -e CUSTOM_OTELCOL_CONFIG_FILE=/etc/otelcol-contrib/custom.config.yaml \
  -v "$(pwd)/kafka-logs-demo.yaml:/etc/otelcol-contrib/custom.config.yaml:ro" \
  -v "$(pwd)/server.log:/tmp/kafka-demo/server.log:ro" \
  clickhouse/clickstack-all-in-one:latest
```

#### Verify logs in HyperDX {#verify-demo-logs}

Once ClickStack is running:

1. Open [HyperDX](http://localhost:8080/) and log in to your account (you may need to create an account first)
2. Navigate to the Search view and set the source to `Logs`
3. Set the time range to include **2026-03-09 00:00:00 - 2026-03-10 00:00:00 (UTC)**

<Image img={search_view} alt="Search view"/>

<Image img={log_view} alt="Log view"/>

</VerticalStepper>

## Dashboards and visualization {#dashboards}

<VerticalStepper headerLevel="h4">

#### <TrackedLink href={useBaseUrl('/examples/kafka-logs-dashboard.json')} download="kafka-logs-dashboard.json" eventName="docs.kafka_logs_monitoring.dashboard_download">Download</TrackedLink> the dashboard configuration {#download}

#### Import the pre-built dashboard {#import-dashboard}

1. Open HyperDX and navigate to the Dashboards section.
2. Click "Import Dashboard" under the ellipsis menu in the upper-right corner.

<Image img={import_dashboard} alt="Import Dashboard"/>

3. Upload the `kafka-logs-dashboard.json` file and click "Finish Import".

<Image img={finish_import} alt="Finish importing Kafka logs dashboard"/>

#### The dashboard is created with all visualizations pre-configured {#created-dashboard}

For the demo dataset, set the time range to include **2026-03-09 00:00:00 - 2026-03-10 00:00:00 (UTC)**.

<Image img={example_dashboard} alt="Kafka Logs example dashboard"/>

</VerticalStepper>

## Troubleshooting {#troubleshooting}

**Verify the effective config includes your filelog receiver:**
```bash
docker exec <container> cat /etc/otel/supervisor-data/effective.yaml | grep -A 10 filelog
```

**Check for collector errors:**
```bash
docker exec <container> cat /etc/otel/supervisor-data/agent.log
```

**Verify the Kafka log format matches the expected pattern:**
```bash
tail -1 /var/log/kafka/server.log
```

If your Kafka installation uses a custom Log4j pattern, adjust the `regex_parser` regex accordingly.
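
If you are unsure whether a given line will parse, you can check it against an equivalent of the `regex_parser` pattern directly in the shell (POSIX character classes stand in for `\d` and `\w`; the sample line below is illustrative):

```shell
# Test one log line against the same shape of pattern the collector uses.
line='[2026-03-09 14:23:45,123] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)'
pattern='^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\] [[:alnum:]_]+ '
if printf '%s\n' "$line" | grep -qE "$pattern"; then
  echo "matches"
else
  echo "no match: adjust the regex_parser pattern"
fi
```

Run the same check with `line=$(tail -1 /var/log/kafka/server.log)` to test against your real logs.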

## Next steps {#next-steps}

- Set up [alerts](/use-cases/observability/clickstack/alerts) for critical events (broker failures, replication errors, consumer group issues)
- Combine with [Kafka Metrics](/use-cases/observability/clickstack/integrations/kafka-metrics) for comprehensive Kafka monitoring
- Create additional [dashboards](/use-cases/observability/clickstack/dashboards) for specific use cases (controller events, partition reassignment)

## Going to production {#going-to-production}

This guide extends ClickStack's built-in OpenTelemetry Collector for quick setup. For production deployments, we recommend running your own OTel Collector and sending data to ClickStack's OTLP endpoint. See [Sending OpenTelemetry data](/use-cases/observability/clickstack/ingesting-data/opentelemetry) for production configuration.
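
As a rough shape of that setup, a standalone collector can reuse the `filelog` receiver from this guide and export over OTLP instead of writing to ClickHouse directly. The endpoint hostname and authentication header below are placeholders; consult the linked page for the actual values your deployment requires:

```yaml
# Sketch only: standalone OTel Collector forwarding Kafka logs to ClickStack.
# "clickstack" hostname and the authorization header value are illustrative.
receivers:
  filelog/kafka:
    include: [/var/log/kafka/server.log]
    start_at: end   # recommended for production, see note above
exporters:
  otlphttp:
    endpoint: http://clickstack:4318
    headers:
      authorization: <YOUR_INGESTION_API_KEY>
service:
  pipelines:
    logs:
      receivers: [filelog/kafka]
      exporters: [otlphttp]
```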
Lines changed: 1 addition & 0 deletions

```diff
@@ -0,0 +1 @@
+{"version":"0.1.0","name":"Kafka Logs","tiles":[{"id":"kl_vol","x":0,"y":0,"w":8,"h":10,"config":{"name":"Log volume over time","source":"Logs","displayType":"line","granularity":"auto","select":[{"aggFn":"count","aggCondition":"","aggConditionLanguage":"sql","valueExpression":"","alias":"Log volume"}],"where":"","whereLanguage":"lucene"}},{"id":"kl_sev","x":8,"y":0,"w":8,"h":10,"config":{"name":"Severity breakdown over time","source":"Logs","displayType":"stacked_bar","granularity":"auto","select":[{"aggFn":"count","aggCondition":"","aggConditionLanguage":"sql","valueExpression":""}],"where":"","whereLanguage":"lucene","groupBy":"SeverityText"}},{"id":"kl_sevtbl","x":16,"y":0,"w":8,"h":10,"config":{"name":"Severity counts","source":"Logs","displayType":"table","granularity":"auto","select":[{"aggFn":"count","aggCondition":"","aggConditionLanguage":"sql","valueExpression":"","alias":"Count"}],"where":"","whereLanguage":"lucene","groupBy":"SeverityText"}},{"id":"kl_err","x":0,"y":10,"w":8,"h":10,"config":{"name":"Errors and warnings over time","source":"Logs","displayType":"stacked_bar","granularity":"auto","select":[{"aggFn":"count","aggCondition":"SeverityText IN ('warn', 'error', 'fatal')","aggConditionLanguage":"sql","valueExpression":"","alias":"Errors/Warnings"}],"where":"","whereLanguage":"lucene","groupBy":"SeverityText"}},{"id":"kl_repl","x":8,"y":10,"w":8,"h":10,"config":{"name":"Replication events","source":"Logs","displayType":"line","granularity":"auto","select":[{"aggFn":"count","aggCondition":"Body LIKE '%ReplicaFetcher%' OR Body LIKE '%replication%' OR Body LIKE '%ISR%'","aggConditionLanguage":"sql","valueExpression":"","alias":"Replication events"}],"where":"","whereLanguage":"lucene"}},{"id":"kl_conn","x":16,"y":10,"w":8,"h":10,"config":{"name":"Connection events","source":"Logs","displayType":"line","granularity":"auto","select":[{"aggFn":"count","aggCondition":"Body LIKE '%connection%' OR Body LIKE '%Closing socket%' OR Body LIKE '%Processor%'","aggConditionLanguage":"sql","valueExpression":"","alias":"Connection events"}],"where":"","whereLanguage":"lucene"}},{"id":"kl_errlogs","x":0,"y":20,"w":12,"h":10,"config":{"name":"Error logs","source":"Logs","displayType":"search","granularity":"auto","select":"","where":"SeverityText IN ('error', 'fatal')","whereLanguage":"sql"}},{"id":"kl_warnlogs","x":12,"y":20,"w":12,"h":10,"config":{"name":"Warning logs","source":"Logs","displayType":"search","granularity":"auto","select":"","where":"SeverityText = 'warn'","whereLanguage":"sql"}}],"filters":[]}
```
4 image files added (623 KB, 263 KB, 510 KB, 1.3 MB)