Merged
25 changes: 25 additions & 0 deletions .github/dependabot.yml
@@ -130,6 +130,21 @@ updates:
schedule:
interval: "daily"

- directory: "/application/grafana"
package-ecosystem: "pip"
schedule:
interval: "daily"

- directory: "/application/grafana"
package-ecosystem: "docker"
schedule:
interval: "daily"

- directory: "/application/grafana"
package-ecosystem: "docker-compose"
schedule:
interval: "daily"
Member

Wouldn't weekly be enough? (same for the other jobs)

Suggested change
interval: "daily"
interval: "weekly"

Member Author (@amotl, Mar 17, 2026)

We want to be informed in time when something goes south, so we can start conversations with upstream authors sooner rather than later.


- directory: "/application/ingestr"
package-ecosystem: "pip"
schedule:
@@ -140,6 +155,16 @@ updates:
schedule:
interval: "daily"

- directory: "/application/open-webui"
package-ecosystem: "docker-compose"
schedule:
interval: "daily"

- directory: "/application/open-webui/init"
package-ecosystem: "docker"
schedule:
interval: "daily"

- directory: "/application/metabase"
package-ecosystem: "pip"
schedule:
64 changes: 64 additions & 0 deletions .github/workflows/application-grafana.yml
@@ -0,0 +1,64 @@
name: "Grafana"

on:
pull_request:
paths:
- '.github/workflows/application-grafana.yml'
- 'application/grafana/**'
push:
branches: [ main ]
paths:
- '.github/workflows/application-grafana.yml'
- 'application/grafana/**'

# Allow job to be triggered manually.
workflow_dispatch:

# Run job each night after CrateDB nightly has been published.
schedule:
- cron: '0 4 * * *'

# Cancel in-progress jobs when pushing to the same branch.
concurrency:
cancel-in-progress: true
group: ${{ github.workflow }}-${{ github.ref }}

jobs:

test:
runs-on: ${{ matrix.os }}

strategy:
fail-fast: false
matrix:
os:
- "ubuntu-latest"
cratedb-version:
- "nightly"
grafana-version:
- "9.5.21"
- "10.4.19"
- "11.6"
- "12.3"
- "12.4"
- "nightly"
Member

I don't think it makes sense to test a grafana nightly, it may fail for valid reasons which could never land in a release.

Suggested change
- "nightly"

Member Author

I think this would only be the case with main. Using nightly here is the right choice to receive "going south" signals early. If we see too much flakiness due to unrelated instabilities in Grafana nightly, we can always remove that label again. I don't expect many such failures, because we are only testing a very small surface of Grafana, and I don't expect them to ship any completely dysfunctional releases; Grafana nightly is well tested.

Member


I'd prefer to instead ensure that we update this once a new major or minor version is released (Dependabot?). I'd like to avoid any effort caused by a nightly version which may never see the light of day. Anyhow, we can also test it and remove it once it causes issues.


env:
OS_TYPE: ${{ matrix.os }}
CRATEDB_VERSION: ${{ matrix.cratedb-version }}
GRAFANA_VERSION: ${{ matrix.grafana-version }}

name: "
Grafana ${{ matrix.grafana-version }},
CrateDB ${{ matrix.cratedb-version }}
"
steps:

- name: Acquire sources
uses: actions/checkout@v6

- name: Validate application/grafana
run: |
# TODO: Generalize invocation into `ngr` test runner.
cd application/grafana
bash test.sh
54 changes: 54 additions & 0 deletions application/grafana/compose.yml
@@ -0,0 +1,54 @@
services:

cratedb:
image: docker.io/crate/crate:${CRATEDB_VERSION:-latest}
command: >
crate
-Cdiscovery.type=single-node
-Cstats.enabled=true
ports:
- 4200:4200
- 5432:5432
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:4200"]
start_period: 3s
interval: 1.5s
retries: 30
timeout: 30s

grafana:
image: docker.io/grafana/grafana:${GRAFANA_VERSION:-latest}
environment:
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
- GF_AUTH_DISABLE_LOGIN_FORM=true
ports:
- "3000:3000"
depends_on:
- cratedb
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:3000"]
start_period: 3s
interval: 1.5s
retries: 30
timeout: 30s

example-weather:
build:
context: .
dockerfile_inline: |
FROM docker.io/python:3.14-slim-trixie
RUN apt-get update && apt-get install --yes git
ADD requirements.txt /
ADD example-weather.py /
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
ENV UV_SYSTEM_PYTHON=true
RUN uv pip install --requirement requirements.txt
command: python example-weather.py
depends_on:
cratedb:
condition: service_healthy
grafana:
condition: service_healthy
profiles:
- tasks
185 changes: 185 additions & 0 deletions application/grafana/example-weather.py
@@ -0,0 +1,185 @@
"""
Example program demonstrating how to work with CrateDB and Grafana using
[grafana-client] and [grafanalib].

[grafana-client]: https://github.com/grafana-toolbox/grafana-client
[grafanalib]: https://github.com/weaveworks/grafanalib
"""
import dataclasses
import json
import logging

from cratedb_toolkit.datasets import load_dataset
from grafana_client import GrafanaApi
from grafana_client.client import GrafanaClientError
from grafana_client.model import DatasourceIdentifier
from grafana_client.util import setup_logging
from grafanalib._gen import DashboardEncoder
from grafanalib.core import (SHORT_FORMAT, Dashboard, Graph, GridPos,
SqlTarget, Time, YAxes, YAxis)
from yarl import URL

logger = logging.getLogger(__name__)


DATASET_NAME = "tutorial/weather-basic"
DATASOURCE_UID = "cratedb-v2KYBt37k"
DASHBOARD_UID = "cratedb-weather-demo"
CRATEDB_SQLALCHEMY_URL = "crate://crate:crate@cratedb:4200/"
CRATEDB_GRAFANA_URL = "cratedb:5432"
GRAFANA_URL = "http://grafana:3000"


@dataclasses.dataclass
class PanelInfo:
"""
Minimal information defining a minimal graph panel.
"""
title: str
field: str
unit: str


def provision(grafana: GrafanaApi):
"""
Provision CrateDB and Grafana.

- Load example weather data into CrateDB.
- Provision Grafana with data source and dashboard.
"""

logger.info("Loading data into CrateDB")

# Load example data into CrateDB.
dataset = load_dataset(DATASET_NAME)
dataset.dbtable(dburi=CRATEDB_SQLALCHEMY_URL, table="example.weather_data").load()

logger.info("Provisioning Grafana data source and dashboard")

# Create Grafana data source.
try:
grafana.datasource.get_datasource_by_uid(DATASOURCE_UID)
grafana.datasource.delete_datasource_by_uid(DATASOURCE_UID)
except GrafanaClientError as ex:
if ex.status_code != 404:
raise
grafana.datasource.create_datasource(
{
"uid": DATASOURCE_UID,
"name": "CrateDB",
"type": "postgres",
"access": "proxy",
"url": CRATEDB_GRAFANA_URL,
"jsonData": {
"database": "doc",
"postgresVersion": 1200,
"sslmode": "disable",
},
"user": "crate",
"secureJsonData": {
"password": "crate",
},
}
)

# Create Grafana dashboard.
dashboard = Dashboard(
uid=DASHBOARD_UID,
title="CrateDB example weather dashboard",
time=Time('2023-01-01T00:00:00Z', '2023-09-01T00:00:00Z'),
refresh=None,
)
panel_infos = [
PanelInfo(title="Weather » Temperature", field="temperature", unit="celsius"),
PanelInfo(title="Weather » Humidity", field="humidity", unit="humidity"),
PanelInfo(title="Weather » Wind speed", field="wind_speed", unit="velocitykmh"),
]
for panel_info in panel_infos:
column_name = panel_info.field
unit = panel_info.unit
dashboard.panels.append(
Graph(
title=panel_info.title,
dataSource=DATASOURCE_UID,
targets=[
SqlTarget(
rawSql=f"""
SELECT
$__timeGroupAlias("timestamp", $__interval),
"location",
MEAN("{column_name}") AS "{column_name}"
FROM "example"."weather_data"
WHERE $__timeFilter("timestamp")
GROUP BY "time", "location"
ORDER BY "time"
Comment on lines +107 to +114

Member Author (@amotl, Mar 16, 2026)

Please note this querying scheme+template is super important to follow, otherwise you will set the wire and the browser on fire when processing big data.

@hammerhead told me about this the other day, but I don't know if it is general knowledge across the board and is correctly applied by all our users and customers. [1]

Footnotes

[1] I have a slight suspicion users can easily overload their clusters because nobody tells them about those details. In this spirit, a dedicated CrateDB data source plugin for Grafana would be very sweet indeed, one that guides the user appropriately, similar to how InfluxDB users do not need to be concerned about such details at all.
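The concern above can be illustrated in miniature. The expansion sketched below approximates what Grafana's $__timeGroup macro produces for PostgreSQL-flavored data sources; it is illustrative only, not Grafana's verbatim output, and the helper name is hypothetical.

```python
# Sketch: why server-side time bucketing matters for dashboard queries.
# `expand_time_group` approximates Grafana's $__timeGroup macro expansion
# for PostgreSQL-flavored data sources (illustrative, not verbatim).

def expand_time_group(column: str, interval_seconds: int) -> str:
    """Bucket `column` into `interval_seconds`-wide epoch bins, aliased as `time`."""
    return (
        f"floor(extract(epoch from {column})/{interval_seconds})"
        f"*{interval_seconds} AS time"
    )

# Without bucketing, every raw row travels over the wire to the browser.
raw = 'SELECT "timestamp", "temperature" FROM "example"."weather_data"'

# With bucketing, the database aggregates down to one row per interval.
bucketed = (
    "SELECT " + expand_time_group('"timestamp"', 300)
    + ', AVG("temperature") AS "temperature"'
    + ' FROM "example"."weather_data"'
    + " GROUP BY time ORDER BY time"
)
print(bucketed)
```

The aggregation happens inside CrateDB, so the result set size is bounded by the number of intervals on screen rather than by the raw row count.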

Member Author


However, in the following comment, you can see @matriv pinged me about the DATE_BIN() function that had been recommended by @seut.

If that works now, I would be happy to demonstrate the modern version here, even if it requires CrateDB 5.7. We can easily add the SQL template above to the documentation to inform users of older CrateDB versions.

Member


Would be great to do that, 5.7 is already a bit old.
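Since the thread settles on demonstrating DATE_BIN(), here is a hedged sketch of what the modern variant of the template might look like. It is untested against Grafana; the DATE_BIN() call (available since CrateDB 5.7) and its origin argument are assumptions drawn from this discussion, and the $__interval and $__timeFilter macros are still expanded by Grafana at query time.

```python
# Hypothetical DATE_BIN() variant of the rawSql template above.
# Untested sketch; $__interval and $__timeFilter are Grafana macros
# that get expanded before the query reaches CrateDB.
column_name = "temperature"

rawsql_date_bin = f"""
SELECT
    DATE_BIN('$__interval'::INTERVAL, "timestamp", 0) AS "time",
    "location",
    AVG("{column_name}") AS "{column_name}"
FROM "example"."weather_data"
WHERE $__timeFilter("timestamp")
GROUP BY "time", "location"
ORDER BY "time"
"""
print(rawsql_date_bin)
```

The shape mirrors the legacy template, so swapping it into `example-weather.py` would only touch the `rawSql` string.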

Member Author


Did someone use the new variant successfully already, so we can copy from there?

/cc @hammerhead, @grbade, @zolbatar

Member Author


@zolbatar will look into this detail. Thank you! 🙇

Member Author


@coderabbitai: Please capture this into a dedicated issue, so we can improve later. Thank you.

coderabbitai


🧩 Analysis chain

🏁 Script executed:

gh issue create \
  --repo crate/cratedb-examples \
  --title "Grafana: Improve SQL template to use DATE_BIN() (CrateDB 5.7+)" \
  --body "## Summary

In the Grafana end-to-end example ([application/grafana/example-weather.py](https://github.com/crate/cratedb-examples/blob/grafana/application/grafana/example-weather.py)), the SQL query template currently uses \`\$__timeGroupAlias(\"timestamp\", \$__interval)\` together with \`MEAN()\` for time-series aggregation.

It has been recommended to adopt the modern \`DATE_BIN()\` function (available since CrateDB 5.7) as the preferred SQL template for Grafana queries, since CrateDB 5.7 is already a reasonably old baseline.

Using the correct querying scheme and template is critical — as noted by `@hammerhead` — to avoid overloading the cluster or setting the wire and the browser on fire when processing large datasets. This is important knowledge that is not always applied correctly by users and customers.

## Background

- `@matriv` flagged the \`DATE_BIN()\` function (recommended by `@seut`) as the modern replacement in [this comment](https://github.com/daq-tools/kotori/pull/148#discussion_r1499035232).
- The original discussion about the querying scheme can be found [here](https://github.com/daq-tools/kotori/pull/148#discussion_r1237795785).
- `@zolbatar` has volunteered to look into the details of the new \`DATE_BIN()\` variant.

## Tasks

- [ ] Validate the \`DATE_BIN()\` SQL template for Grafana with CrateDB (coordinated with `@zolbatar`)
- [ ] Update the SQL query in \`application/grafana/example-weather.py\` to use the modern \`DATE_BIN()\` approach
- [ ] Add the legacy \`\$__timeGroupAlias\` SQL template to the documentation as a reference for users on older CrateDB versions

## References

- PR: https://github.com/crate/cratedb-examples/pull/1442
- Comment: https://github.com/crate/cratedb-examples/pull/1442#discussion_r2943186202
- Requested by: `@amotl`
" \
  --assignee amotl

Length of output: 53


@amotl, done! I've created issue #1508 to track this improvement:

👉 #1508

"Grafana: Improve SQL template to use DATE_BIN() (CrateDB 5.7+)"

The issue captures:

  • The motivation for adopting DATE_BIN() as the modern time-series aggregation approach (replacing $__timeGroupAlias + MEAN())
  • The importance of using the correct querying scheme to avoid overloading clusters
  • References to the original discussions with @hammerhead, @seut, and @matriv
  • @zolbatar's involvement in validating the new variant
  • Tasks for updating example-weather.py and documenting the legacy template for older CrateDB versions

(∩^ω^)⊃━☆゚.*・。゚

""",
refId="A",
),
],
yAxes=YAxes(
YAxis(format=unit),
YAxis(format=SHORT_FORMAT),
),
gridPos=GridPos(h=8, w=24, x=0, y=9),
)
)
# Encode grafanalib `Dashboard` entity to dictionary.
dashboard_payload = {
"dashboard": json.loads(json.dumps(dashboard, sort_keys=True, cls=DashboardEncoder)),
"overwrite": True,
"message": "Updated by grafanalib",
}
response = grafana.dashboard.update_dashboard(dashboard_payload)

# Display dashboard URL.
dashboard_url = URL(f"{grafana.url}{response['url']}").with_user(None).with_password(None)
logger.info(f"Dashboard URL: {dashboard_url}")


def validate_datasource(grafana: GrafanaApi):
"""
Validate Grafana data source.
"""
logger.info("Validating data source")
health = grafana.datasource.health_inquiry(DATASOURCE_UID)
logger.info("Health status: %s", health.status)
logger.info("Health message: %s", health.message)
assert health.success is True, "Grafana data source is not healthy"


def validate_dashboard(grafana: GrafanaApi):
"""
Validate Grafana dashboard by enumerating and executing all panel targets' `rawSql` expressions.
"""
logger.info("Validating dashboard")
dashboard = grafana.dashboard.get_dashboard(DASHBOARD_UID)
for panel in dashboard["dashboard"].get("panels", []):
for target in panel.get("targets", []):
logger.info("Validating SQL target:\n%s", target["rawSql"])

response = grafana.datasource.smartquery(DatasourceIdentifier(uid=DATASOURCE_UID), target["rawSql"])
status = response["results"]["test"]["status"]
queries = [frame["schema"]["meta"]["executedQueryString"] for frame in response["results"]["test"]["frames"]]
logger.info("Status: %s", status)
logger.info("Executed queries:\n%s", "\n".join(queries))

assert status == 200, "Dashboard query status is not 200"


if __name__ == "__main__":
"""
Boilerplate bootloader. Create a `GrafanaApi` instance and run example.
"""

# Setup logging.
setup_logging(level=logging.INFO)

# Create a `GrafanaApi` instance.
grafana_client = GrafanaApi.from_url(GRAFANA_URL)

# Invoke example conversation.
provision(grafana_client)

# Validate Grafana data source and dashboard.
validate_datasource(grafana_client)
validate_dashboard(grafana_client)
3 changes: 3 additions & 0 deletions application/grafana/requirements.txt
@@ -0,0 +1,3 @@
cratedb-toolkit
grafana-client<6
grafanalib<0.8
16 changes: 16 additions & 0 deletions application/grafana/test.sh
@@ -0,0 +1,16 @@
#!/bin/sh

# Use Grafana with CrateDB.

# The miniature stack defines {Docker,Podman} services and tasks that spin
# up CrateDB and Grafana, provision data into CrateDB, and provision a
# corresponding data source and dashboard into Grafana.

# https://github.com/grafana/grafana
# https://github.com/crate/crate

# Start services.
docker compose up --detach --wait

# Run weather data example.
docker compose run --rm example-weather
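When driving the stack without `docker compose up --wait`, one might poll the service endpoints before provisioning. A minimal sketch in Python, assuming the host port mappings from compose.yml above; the helper name is ours, not part of the example:

```python
import time
import urllib.error
import urllib.request

def wait_for(url: str, timeout: float = 60.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or until `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=1.0) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, ConnectionError):
            pass  # Service not up yet; retry after a short pause.
        time.sleep(1.5)
    return False

if __name__ == "__main__":
    # Mirrors the compose healthchecks: CrateDB on 4200, Grafana on 3000.
    assert wait_for("http://localhost:4200"), "CrateDB did not become healthy"
    assert wait_for("http://localhost:3000"), "Grafana did not become healthy"
```

This duplicates what the compose healthchecks already guarantee, so it is only useful when orchestrating the services by other means.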