Time series use case #3502

Open
wants to merge 8 commits into
base: main
from

Conversation

mneedham
Contributor

Time-series use case

@mneedham requested a review from a team as a code owner on March 14, 2025, 11:01

vercel bot commented Mar 14, 2025

@mneedham is attempting to deploy a commit to the ClickHouse Team on Vercel.

A member of the Team first needs to authorize it.


vercel bot commented Mar 14, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

2 Skipped Deployments
| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| clickhouse-docs-ru | ⬜️ Ignored (Inspect) | Visit Preview | | Mar 14, 2025 0:21am |
| clickhouse-docs-zh | ⬜️ Ignored (Inspect) | Visit Preview | | Mar 14, 2025 0:21am |

Member

@Blargian left a comment


LGTM. Left a few comments.

Comment on lines +2 to +6
title: Analysis functions- Time-series
sidebar_label: Analysis functions
description: Functions for analyzing time-series data in ClickHouse.
slug: /use-cases/time-series/analysis-functions
keywords: [time-series]

Suggested change
title: Analysis functions- Time-series
sidebar_label: Analysis functions
description: Functions for analyzing time-series data in ClickHouse.
slug: /use-cases/time-series/analysis-functions
keywords: [time-series]
title: 'Analysis functions - Time-series'
sidebar_label: 'Analysis functions'
description: 'Functions for analyzing time-series data in ClickHouse.'
slug: /use-cases/time-series/analysis-functions
keywords: ['time-series']

Yesterday I added a check on the front matter that requires a title, description, and slug, and enforces single quotes.

Comment on lines +2 to +6
title: Basic operations - Time-series
sidebar_label: Basic operations
description: Basic time-series operations in ClickHouse.
slug: /use-cases/time-series/basic-operations
keywords: [time-series]

Suggested change
title: Basic operations - Time-series
sidebar_label: Basic operations
description: Basic time-series operations in ClickHouse.
slug: /use-cases/time-series/basic-operations
keywords: [time-series]
title: 'Basic operations - Time-series'
sidebar_label: 'Basic operations'
description: 'Basic time-series operations in ClickHouse.'
slug: /use-cases/time-series/basic-operations
keywords: ['time-series']

Comment on lines +2 to +6
title: Date and time data types - Time-series
sidebar_label: Date and time data types
description: Time-series data types in ClickHouse.
slug: /use-cases/time-series/date-time-data-types
keywords: [time-series]

Suggested change
title: Date and time data types - Time-series
sidebar_label: Date and time data types
description: Time-series data types in ClickHouse.
slug: /use-cases/time-series/date-time-data-types
keywords: [time-series]
title: 'Date and time data types - Time-series'
sidebar_label: 'Date and time data types'
description: 'Time-series data types in ClickHouse.'
slug: /use-cases/time-series/date-time-data-types
keywords: ['time-series']

Comment on lines +2 to +5
slug: /use-cases/time-series
title: Time-Series
pagination_prev: null
pagination_next: null

Suggested change
slug: /use-cases/time-series
title: Time-Series
pagination_prev: null
pagination_next: null
description: 'Index page for the time-series use-case guide.'
slug: /use-cases/time-series
title: 'Time-Series'
pagination_prev: null
pagination_next: null


Let's call this page "Introduction" instead and make it the first sub-menu item, similar to the Observability guide? Then we can add an index.md whose table of contents is generated automatically at build time from the titles and descriptions in the YAML front matter. See here for example.

To do that, we leave the index.md page blank apart from the front matter and add the comment `<!-- Table of contents on this page gets automatically inserted -->`. Once the script here is updated, the table of contents gets inserted at build time.

I can do this in a follow-up PR though :-)

Comment on lines +2 to +6
title: Query performance - Time-series
sidebar_label: Query performance
description: Improving time-series query performance
slug: /use-cases/time-series/query-performance
keywords: [time-series]

Suggested change
title: Query performance - Time-series
sidebar_label: Query performance
description: Improving time-series query performance
slug: /use-cases/time-series/query-performance
keywords: [time-series]
title: 'Query performance - Time-series'
sidebar_label: 'Query performance'
description: 'Improving time-series query performance'
slug: /use-cases/time-series/query-performance
keywords: ['time-series']

Comment on lines +151 to +172
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 3 │ <- missing values
│ 2015-07-01 02:00:00 │ 1 │ <- missing values
│ 2015-07-01 04:00:00 │ 1 │
│ 2015-07-01 05:00:00 │ 2 │
│ 2015-07-01 06:00:00 │ 1 │
│ 2015-07-01 07:00:00 │ 1 │
│ 2015-07-01 08:00:00 │ 3 │
│ 2015-07-01 09:00:00 │ 2 │ <- missing values
│ 2015-07-01 12:00:00 │ 2 │
│ 2015-07-01 13:00:00 │ 4 │
│ 2015-07-01 14:00:00 │ 2 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 1 │
│ 2015-07-01 18:00:00 │ 5 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 4 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 2 │
│ 2015-07-01 23:00:00 │ 2 │
└─────────────────────┴───────────┘

I get a slightly different result when I run this query:

Suggested change
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 3 │ <- missing values
│ 2015-07-01 02:00:00 │ 1 │ <- missing values
│ 2015-07-01 04:00:00 │ 1 │
│ 2015-07-01 05:00:00 │ 2 │
│ 2015-07-01 06:00:00 │ 1 │
│ 2015-07-01 07:00:00 │ 1 │
│ 2015-07-01 08:00:00 │ 3 │
│ 2015-07-01 09:00:00 │ 2 │ <- missing values
│ 2015-07-01 12:00:00 │ 2 │
│ 2015-07-01 13:00:00 │ 4 │
│ 2015-07-01 14:00:00 │ 2 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 1 │
│ 2015-07-01 18:00:00 │ 5 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 4 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 2 │
│ 2015-07-01 23:00:00 │ 2 │
└─────────────────────┴───────────┘
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 4 │
│ 2015-07-01 01:00:00 │ 2 │ <- missing 02:00:00
│ 2015-07-01 03:00:00 │ 1 │ <- missing 04:00:00
│ 2015-07-01 05:00:00 │ 1 │
│ 2015-07-01 06:00:00 │ 2 │ <- missing 07:00:00
│ 2015-07-01 08:00:00 │ 1 │
│ 2015-07-01 09:00:00 │ 3 │
│ 2015-07-01 10:00:00 │ 2 │ <- missing 11:00:00, 12:00:00
│ 2015-07-01 13:00:00 │ 1 │
│ 2015-07-01 14:00:00 │ 4 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 2 │
│ 2015-07-01 18:00:00 │ 1 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 5 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 4 │
│ 2015-07-01 23:00:00 │ 1 │
└─────────────────────┴───────────┘

Comment on lines +188 to +213
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 3 │
│ 2015-07-01 01:00:00 │ 0 │ <- new value
│ 2015-07-01 02:00:00 │ 1 │
│ 2015-07-01 03:00:00 │ 0 │ <- new value
│ 2015-07-01 04:00:00 │ 1 │
│ 2015-07-01 05:00:00 │ 2 │
│ 2015-07-01 06:00:00 │ 1 │
│ 2015-07-01 07:00:00 │ 1 │
│ 2015-07-01 08:00:00 │ 3 │
│ 2015-07-01 09:00:00 │ 2 │
│ 2015-07-01 10:00:00 │ 0 │ <- new value
│ 2015-07-01 11:00:00 │ 0 │ <- new value
│ 2015-07-01 12:00:00 │ 2 │
│ 2015-07-01 13:00:00 │ 4 │
│ 2015-07-01 14:00:00 │ 2 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 1 │
│ 2015-07-01 18:00:00 │ 5 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 4 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 2 │
│ 2015-07-01 23:00:00 │ 2 │
└─────────────────────┴───────────┘

Suggested change
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 3 │
│ 2015-07-01 01:00:00 │ 0 │ <- new value
│ 2015-07-01 02:00:00 │ 1 │
│ 2015-07-01 03:00:00 │ 0 │ <- new value
│ 2015-07-01 04:00:00 │ 1 │
│ 2015-07-01 05:00:00 │ 2 │
│ 2015-07-01 06:00:00 │ 1 │
│ 2015-07-01 07:00:00 │ 1 │
│ 2015-07-01 08:00:00 │ 3 │
│ 2015-07-01 09:00:00 │ 2 │
│ 2015-07-01 10:00:00 │ 0 │ <- new value
│ 2015-07-01 11:00:00 │ 0 │ <- new value
│ 2015-07-01 12:00:00 │ 2 │
│ 2015-07-01 13:00:00 │ 4 │
│ 2015-07-01 14:00:00 │ 2 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 1 │
│ 2015-07-01 18:00:00 │ 5 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 4 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 2 │
│ 2015-07-01 23:00:00 │ 2 │
└─────────────────────┴───────────┘
┌────────────────hour─┬─sum(hits)─┐
│ 2015-07-01 00:00:00 │ 4 │
│ 2015-07-01 01:00:00 │ 2 │
│ 2015-07-01 02:00:00 │ 0 │ <- filled value
│ 2015-07-01 03:00:00 │ 1 │
│ 2015-07-01 04:00:00 │ 0 │ <- filled value
│ 2015-07-01 05:00:00 │ 1 │
│ 2015-07-01 06:00:00 │ 2 │
│ 2015-07-01 07:00:00 │ 0 │ <- filled value
│ 2015-07-01 08:00:00 │ 1 │
│ 2015-07-01 09:00:00 │ 3 │
│ 2015-07-01 10:00:00 │ 2 │
│ 2015-07-01 11:00:00 │ 0 │ <- filled value
│ 2015-07-01 12:00:00 │ 0 │ <- filled value
│ 2015-07-01 13:00:00 │ 1 │
│ 2015-07-01 14:00:00 │ 4 │
│ 2015-07-01 15:00:00 │ 2 │
│ 2015-07-01 16:00:00 │ 2 │
│ 2015-07-01 17:00:00 │ 2 │
│ 2015-07-01 18:00:00 │ 1 │
│ 2015-07-01 19:00:00 │ 5 │
│ 2015-07-01 20:00:00 │ 5 │
│ 2015-07-01 21:00:00 │ 4 │
│ 2015-07-01 22:00:00 │ 4 │
│ 2015-07-01 23:00:00 │ 1 │
└─────────────────────┴───────────┘
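
For context, output of this shape (zero rows inserted for the hours with no data) is what ClickHouse's `ORDER BY ... WITH FILL` modifier produces. The query under review is not part of this thread, so the following is only a minimal sketch: the table name `wikistat` and its columns `time` and `hits` are assumptions made for illustration.

```sql
-- Assumed table (hypothetical, not shown in this thread):
--   wikistat(time DateTime, hits UInt64, ...)
SELECT
    toStartOfHour(time) AS hour,   -- bucket each row into its hour
    sum(hits)
FROM wikistat
WHERE toDate(time) = '2015-07-01'  -- restrict to a single day
GROUP BY hour
-- WITH FILL inserts a row for every missing hour; non-key columns
-- such as sum(hits) get their default value (0)
ORDER BY hour ASC WITH FILL STEP toIntervalHour(1);
```

Without explicit `FROM` and `TO` bounds, `WITH FILL` only fills gaps between the smallest and largest `hour` present in the result, so hours missing at the very start or end of the day would not be added.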


## Filling empty groups {#time-series-filling-empty-groups}

In a lot of cases we deal with sparse data with some absent intervals. This results in empty buckets. Let’s take the following example where we group data by 1-hour intervals. This will out the following stats with some hours missing values:

Suggested change
In a lot of cases we deal with sparse data with some absent intervals. This results in empty buckets. Let’s take the following example where we group data by 1-hour intervals. This will out the following stats with some hours missing values:
In a lot of cases we deal with sparse data with some absent intervals. This results in empty buckets. Let’s take the following example where we group data by 1-hour intervals. This will output the following stats with some hours missing values:


## Custom grouping intervals {#time-series-custom-grouping-intervals}

We can even arbitrary intervals, e.g., 5 minutes using the [`toStartOfInterval()`](/docs/sql-reference/functions/date-time-functions#tostartofinterval) function.

Suggested change
We can even arbitrary intervals, e.g., 5 minutes using the [`toStartOfInterval()`](/docs/sql-reference/functions/date-time-functions#tostartofinterval) function.
We can even group by arbitrary intervals, e.g., 5 minutes using the [`toStartOfInterval()`](/docs/sql-reference/functions/date-time-functions#tostartofinterval) function.
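
A 5-minute grouping with `toStartOfInterval()` might look like the sketch below. As before, the `wikistat` table and its `time` and `hits` columns are assumptions for illustration, and `bucket` is just a hypothetical alias.

```sql
-- Assumed table (hypothetical): wikistat(time DateTime, hits UInt64, ...)
SELECT
    -- round each timestamp down to the start of its 5-minute window
    toStartOfInterval(time, INTERVAL 5 MINUTE) AS bucket,
    sum(hits) AS total_hits
FROM wikistat
WHERE toDate(time) = '2015-07-01'
GROUP BY bucket
ORDER BY bucket ASC
LIMIT 10;
```

Changing the interval expression (for example `INTERVAL 15 MINUTE` or `INTERVAL 2 HOUR`) changes the bucket size without touching the rest of the query.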

└────────────┴──────────┘
```

We’ve used the `toDate()` function here, which converts the specified time to a date type. Alternatively, we can batch by an hour and filter on the specific date:

Suggested change
We’ve used the `toDate()` function here, which converts the specified time to a date type. Alternatively, we can batch by an hour and filter on the specific date:
We’ve used the [`toDate()`](/sql-reference/functions/type-conversion-functions#todate) function here, which converts the specified time to a date type. Alternatively, we can batch by an hour and filter on the specific date:
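
The two approaches that sentence contrasts can be sketched side by side. This is an illustration only, again assuming a `wikistat` table with `time` and `hits` columns, which does not appear in this thread.

```sql
-- Assumed table (hypothetical): wikistat(time DateTime, hits UInt64, ...)

-- Daily buckets: toDate() drops the time-of-day part of each timestamp
SELECT
    toDate(time) AS day,
    sum(hits) AS total_hits
FROM wikistat
GROUP BY day
ORDER BY day ASC
LIMIT 5;

-- Hourly buckets for one specific date
SELECT
    toStartOfHour(time) AS hour,
    sum(hits) AS total_hits
FROM wikistat
WHERE toDate(time) = '2015-07-01'
GROUP BY hour
ORDER BY hour ASC;
```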

2 participants