datastax/pulsar-heartbeat

Operation Monitoring for Pulsar Cluster

Pulsar Heartbeat monitors Pulsar cluster availability, tracks latency of Pulsar message pubsub, and reports failures of the Pulsar cluster. It produces synthetic workloads to measure end-to-end message pubsub latency.

It is a cloud-native application that can be installed with Helm inside the Pulsar Kubernetes cluster.

Here is a list of features that Pulsar Heartbeat supports.

  • monitor Pulsar admin REST API endpoint
  • measure end-to-end message latency from producing to consuming messages
  • for latency measurement, produce a list of messages with a user-specified payload size and message count
  • measure average latency over a list of messages
  • detect out of order delivery of a list of generated messages
  • measure a single message latency over the websocket interface
  • measure message latency generated by Pulsar function
  • monitor instance availability of broker, proxy, bookkeeper, and zookeeper in a Pulsar Kubernetes cluster
  • monitor individual Pulsar broker's health
  • Pulsar function trigger over HTTP interface
  • incident alert with OpsGenie with automatic alert clear and deduplication
  • incident alert with PagerDuty with automatic alert clear and deduplication
  • incident alert with IBM OCM (Operations Center Management) via webhook
  • customer configurable alert threshold and probe test interval
  • tracking analytics and usage
  • dead man's snitch heartbeat monitor with OpsGenie
  • alert on Slack
  • monitor multiple Pulsar clusters (with no kubernetes pods monitoring)
  • co-resident monitoring within the same Pulsar Kubernetes cluster

This is a data-driven tool that sources its configuration from a YAML or JSON file. Here is a template. The configuration file can be specified in the following overwrite order:

  • an environment variable PULSAR_OPS_MONITOR_CFG
  • a command line argument ./pulsar-heartbeat -config /path/to/pulsar_ops_monitor_config.yml
  • a default path of ../config/runtime.yml
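
The lookup order above could be sketched in Go as follows. This is an illustration only, and it assumes that earlier entries in the list take precedence over later ones, which the list suggests but does not state outright. The helper name `resolveConfigPath` is hypothetical; the environment variable, the `-config` flag, and the default path come from the list.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// defaultConfigPath is the fallback location named in the list above.
const defaultConfigPath = "../config/runtime.yml"

// resolveConfigPath returns the first non-empty candidate:
// environment value, then flag value, then the default path.
// The precedence order is an assumption made for this sketch.
func resolveConfigPath(envValue, flagValue string) string {
	if envValue != "" {
		return envValue
	}
	if flagValue != "" {
		return flagValue
	}
	return defaultConfigPath
}

func main() {
	cfgFlag := flag.String("config", "", "path to the configuration file")
	flag.Parse()

	path := resolveConfigPath(os.Getenv("PULSAR_OPS_MONITOR_CFG"), *cfgFlag)
	fmt.Println("using configuration file:", path)
}
```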

Observability

This tool exposes Prometheus-compliant metrics at the /metrics endpoint for scraping. The exported metrics are:

| Name | Type | Description |
|---|---|---|
| pulsar_pubsub_latency_ms | gauge | end-to-end message pub and sub latency in milliseconds |
| pulsar_pubsub_latency_ms_hst | summary | end-to-end message latency histogram summary over 50%, 90%, and 99% samples |
| pulsar_websocket_latency_ms | gauge | end-to-end message pub and sub latency over the websocket interface in milliseconds |
| pulsar_k8s_bookkeeper_offline_counter | gauge | bookkeeper offline instances in the Kubernetes cluster |
| pulsar_k8s_broker_offline_counter | gauge | broker offline instances in the Kubernetes cluster |
| pulsar_k8s_proxy_offline_counter | gauge | proxy offline instances in the Kubernetes cluster |
| pulsar_k8s_bookkeeper_zookeeper_counter | gauge | zookeeper offline instances in the Kubernetes cluster |
| pulsar_monitor_counter | counter | total number of heartbeats |
| pulsar_tenant_size | gauge | number of tenants, usable as a health indicator of the admin interface |
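
As an illustration of consuming these metrics outside Prometheus, the following Go sketch parses a scrape body in the Prometheus text exposition format and extracts sample values by metric name. It is a simplified reader, not a full parser: it ignores labels and timestamps and keeps the last value seen per name. The sample body and the `device` label are made up for the example; only the metric names come from the table.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples reads Prometheus text exposition lines and returns the last
// value seen for each metric name. Labels are dropped and comment lines
// (# HELP, # TYPE) are skipped.
func parseSamples(body string) map[string]float64 {
	out := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the label block or the first space.
		i := strings.IndexAny(line, "{ ")
		if i < 0 {
			continue
		}
		name, rest := line[:i], line[i:]
		// Skip past the label block, if any.
		if j := strings.LastIndex(rest, "}"); j >= 0 {
			rest = rest[j+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		out[name] = v
	}
	return out
}

func main() {
	// Hypothetical scrape output; the label name and values are illustrative.
	body := `# TYPE pulsar_pubsub_latency_ms gauge
pulsar_pubsub_latency_ms{device="cluster-a"} 12.5
# TYPE pulsar_k8s_broker_offline_counter gauge
pulsar_k8s_broker_offline_counter 0
`
	samples := parseSamples(body)
	fmt.Println("pubsub latency (ms):", samples["pulsar_pubsub_latency_ms"])
	fmt.Println("brokers offline:", samples["pulsar_k8s_broker_offline_counter"])
}
```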

In-cluster monitoring

Pulsar heartbeat can be deployed within the same Pulsar Kubernetes cluster. Kubernetes monitoring and individual broker monitoring are only supported within the same Pulsar Kubernetes cluster deployment.

Docker

Pulsar Heartbeat's official docker image can be pulled here

Docker compose

$ docker-compose up

Docker example

The runtime.yml/yaml or runtime.json file must be mounted to /config/runtime.yml as the default configuration path.

Run a Docker container that exposes Prometheus metrics for collection.

$ docker run -d -it -v $HOME/go/src/github.com/datastax/pulsar-heartbeat/config/runtime-astra.yml:/config/runtime.yml -p 8080:8080 --name=pulsar-heartbeat datastax/pulsar-heartbeat:latest

Helm chart

For the following commands, Helm version 3 is supported.

Install as part of Pulsar cluster using Helm

Pulsar Heartbeat can be installed as part of a Pulsar cluster using this Helm chart.

Install as part of DataStax Pulsar cluster using Helm

Pulsar Heartbeat can be directly enabled inside the DataStax Pulsar chart.

Development

How to build

This script builds the Pulsar Heartbeat Go application, runs static code analysis (golint), runs unit tests, and creates a binary under ./bin/pulsar-heartbeat.

$ ./scripts/ci.sh

This command runs a multi-stage build to produce a Docker image.

$ make

## Incident Management Integration

Pulsar Heartbeat supports multiple incident management platforms for alerting and escalation:

### OpsGenie

Configure OpsGenie integration in your configuration file:

```yaml
opsGenieConfig:
  intervalSeconds: 180
  heartbeatKey: "your-opsgenie-heartbeat-key"
  alertKey: "your-opsgenie-alert-key"
```

### PagerDuty

Configure PagerDuty integration via an environment variable:

export PAGER_DUTY_INTEGRATION_KEY="your-pagerduty-integration-key"

### IBM OCM (Operations Center Management)

Configure IBM OCM integration via environment variables:

export IBM_OCM_WEBHOOK_URL="https://your-ibm-ocm-webhook-url"
export IBM_OCM_API_BASE_URL="https://console.oncallmanager.ibm.com"
export IBM_OCM_API_USER="your-api-username"
export IBM_OCM_API_PASSWORD="your-api-password"

Note: For incident resolution, IBM OCM requires API credentials (IBM_OCM_API_BASE_URL, IBM_OCM_API_USER, IBM_OCM_API_PASSWORD). If only the webhook URL is provided, incidents will be created but not automatically resolved.

Note: You can enable multiple incident management platforms simultaneously. Pulsar Heartbeat will send alerts to all configured platforms and automatically resolve incidents when issues are cleared.
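
The IBM OCM note on webhook-only versus full-credential setups can be checked before deployment with a small Go sketch like the one below. It is not part of Pulsar Heartbeat; the `ocmCapabilities` helper is hypothetical, and it assumes that resolution needs the webhook URL plus all three API variables to be non-empty. The environment variable names are the ones documented for IBM OCM.

```go
package main

import (
	"fmt"
	"os"
)

// ocmCapabilities reports which IBM OCM actions the current environment
// would allow, following the documented note: a webhook URL alone permits
// incident creation, while automatic resolution also needs API credentials.
// Requiring all three API variables is an assumption of this sketch.
func ocmCapabilities(getenv func(string) string) (canCreate, canResolve bool) {
	canCreate = getenv("IBM_OCM_WEBHOOK_URL") != ""
	hasAPI := getenv("IBM_OCM_API_BASE_URL") != "" &&
		getenv("IBM_OCM_API_USER") != "" &&
		getenv("IBM_OCM_API_PASSWORD") != ""
	canResolve = canCreate && hasAPI
	return canCreate, canResolve
}

func main() {
	create, resolve := ocmCapabilities(os.Getenv)
	fmt.Printf("create incidents: %v, auto-resolve incidents: %v\n", create, resolve)
}
```

Passing the lookup function as a parameter keeps the check testable without modifying the process environment.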

