Merged
29 commits
f78b690
Draft - Added monitor-and-analyze-contention.md with images.
florence-crl Feb 14, 2025
af956b3
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Feb 14, 2025
6db1672
fixed links
florence-crl Feb 14, 2025
786c833
Draft - Added monitor-and-analyze-transaction-contention.md up to Con…
florence-crl Feb 14, 2025
fa6a527
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Feb 14, 2025
f1ed6c9
Draft - Added monitor-and-analyze-transaction-contention.md up to Ana…
florence-crl Feb 18, 2025
a4339c8
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Feb 18, 2025
46480bb
fix links
florence-crl Feb 18, 2025
3199fed
fix links
florence-crl Feb 18, 2025
22ac238
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Feb 20, 2025
e5c73a6
Draft - Added monitor-and-analyze-transaction-contention.md up to Ana…
florence-crl Feb 20, 2025
4c65490
fix link.
florence-crl Feb 21, 2025
efca702
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Feb 21, 2025
6a95b07
Added monitor-and-analyze-transaction-contention.md up to Analyze usi…
florence-crl Feb 21, 2025
efe9416
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Mar 11, 2025
e144a38
Incorporated Jon St. John’s feedback.
florence-crl Mar 11, 2025
fce2217
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Mar 12, 2025
d9f5693
Incorporated DavidH and Xin’s comments from slack.
florence-crl Mar 13, 2025
46d5fd8
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Mar 13, 2025
a87b314
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Mar 17, 2025
2f1ae56
Incorporated suggestions from docs-reviewer-gpt.
florence-crl Mar 17, 2025
689e127
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Mar 17, 2025
4e31724
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Apr 14, 2025
094cd19
Incorporated Kevin’s feedback.
florence-crl Apr 14, 2025
f9c4a94
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Apr 14, 2025
4ab9e9f
Incorporated Rich’s feedback part 1.
florence-crl Apr 16, 2025
e38b223
Incorporated Rich’s feedback part 2.
florence-crl Apr 17, 2025
7846032
Merge remote-tracking branch 'origin/main' into DOC-12277
florence-crl Apr 17, 2025
d26c6ab
fixed links.
florence-crl Apr 17, 2025
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_danger}}
Not all `crdb_internal` tables are production-ready. Consult the [`crdb_internal`]({% link {{ page.version.version }}/crdb-internal.md %}#tables) page for their current status.
{{site.data.alerts.end}}
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_danger}}
Querying the `crdb_internal.cluster_locks` table triggers an RPC fan-out to all nodes in the cluster, which can make it a relatively expensive operation.
{{site.data.alerts.end}}
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_danger}}
Querying the `crdb_internal.transaction_contention_events` table triggers an expensive RPC fan-out to all nodes, making it a resource-intensive operation. Avoid frequent polling and do not use this table for continuous monitoring.
{{site.data.alerts.end}}
2 changes: 1 addition & 1 deletion src/current/_includes/v25.1/essential-metrics.md
@@ -130,7 +130,7 @@ The **Usage** column explains why each metric is important to visualize in a cus
| <a id="sql-service-latency"></a>sql.service.latency-p90, sql.service.latency-p99 | sql.service.latency | Latency of SQL request execution | These high-level metrics reflect workload performance. Monitor these metrics to understand latency over time. If abnormal patterns emerge, apply the metric's time range to the [**SQL Activity** pages]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#sql-activity-pages) to investigate interesting outliers or patterns. The [**Statements page**]({% link {{ page.version.version }}/ui-statements-page.md %}) has P90 Latency and P99 latency columns to enable correlation with this metric. |
| sql.txn.latency-p90, sql.txn.latency-p99 | sql.txn.latency | Latency of SQL transactions | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
| txnwaitqueue.deadlocks_total | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
-| sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
+| <a id="sql-distsql-contended-queries-count"></a>sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
| <a id="sql-conn-failures"></a>sql.conn.failures | sql.conn.failures.count | Number of SQL connection failures | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.|
| txn.restarts.serializable | txn.restarts.serializable | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
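The usage guidance in the rows above (alert when deadlocks are detected, investigate when serializable restarts reach tens per minute) can be sketched as a small check over two scrapes of each metric. This is a minimal illustration, not part of the changed file: it assumes both metrics are exported as cumulative counters, and `CounterSample`, `per_minute_rate`, `contention_signals`, and the threshold of 10 restarts per minute are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One scrape of a cumulative counter metric (hypothetical helper type)."""
    unix_seconds: float
    value: float


def per_minute_rate(earlier: CounterSample, later: CounterSample) -> float:
    """Average increase per minute between two samples of the same counter."""
    elapsed = later.unix_seconds - earlier.unix_seconds
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (later.value - earlier.value) * 60.0 / elapsed


def contention_signals(deadlocks, restarts, restart_threshold_per_min=10.0):
    """Return a list of warnings; an empty list means nothing was flagged.

    deadlocks: (earlier, later) samples of txnwaitqueue.deadlocks_total.
    restarts:  (earlier, later) samples of txn.restarts.serializable.
    restart_threshold_per_min is an assumed value, not a documented default.
    """
    warnings = []
    # Any new deadlock in the window is worth an alert.
    if deadlocks[1].value - deadlocks[0].value > 0:
        warnings.append("deadlocks detected")
    # "Tens of restarts per minute" is approximated by a fixed threshold.
    if per_minute_rate(*restarts) >= restart_threshold_per_min:
        warnings.append("elevated serializable restarts")
    return warnings
```

In practice a monitoring system would evaluate rules like these itself; the sketch only makes the two thresholds concrete.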
@@ -57,6 +57,12 @@
"/${VERSION}/admission-control.html"
]
},
{
"title": "Monitor and Analyze Transaction Contention",
"urls": [
"/${VERSION}/monitor-and-analyze-transaction-contention.html"
]
},
{
"title": "Performance Tuning Recipes",
"urls": [
@@ -0,0 +1,17 @@
Column | Type | Description
-------|------|------------
`collection_ts` | `TIMESTAMPTZ NOT NULL` | The timestamp when the transaction [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) event was collected.
`blocking_txn_id` | `UUID NOT NULL` | The ID of the blocking transaction. You can join this column into the [`cluster_contention_events`]({% link {{ page.version.version }}/crdb-internal.md %}#cluster_contention_events) table.
`blocking_txn_fingerprint_id` | `BYTES NOT NULL`| The ID of the blocking transaction fingerprint. To surface historical information about the transactions that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) and [`transaction_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#transaction_statistics) tables.
`waiting_txn_id` | `UUID NOT NULL` | The ID of the waiting transaction. You can join this column into the [`cluster_contention_events`]({% link {{ page.version.version }}/crdb-internal.md %}#cluster_contention_events) table.
`waiting_txn_fingerprint_id` | `BYTES NOT NULL` | The ID of the waiting transaction fingerprint. To surface historical information about the transactions that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) and [`transaction_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#transaction_statistics) tables.
`waiting_stmt_id` | `STRING NOT NULL` | The statement id of the transaction that was waiting (unique for each statement execution).
`waiting_stmt_fingerprint_id` | `BYTES NOT NULL` | The ID of the waiting statement fingerprint. To surface historical information about the statements that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) table.
`contention_duration` | `INTERVAL NOT NULL` | The interval of time the waiting transaction spent waiting for the blocking transaction.
`contending_key` | `BYTES NOT NULL` | The key on which the transactions contended.
`contending_pretty_key` | `STRING NOT NULL` | The specific key that was involved in the contention event, in readable format.
`database_name` | `STRING NOT NULL` | The database where the contention occurred.
`schema_name` | `STRING NOT NULL` | The schema where the contention occurred.
`table_name` | `STRING NOT NULL` | The table where the contention occurred.
`index_name` | `STRING NULL` | The index where the contention occurred.
`contention_type` | `STRING NOT NULL` | The type of contention. Possible values:<ul><li>`LOCK_WAIT`: Indicates that the transaction waited on a specific key. The record includes the key and the wait duration.</li><li>`SERIALIZATION_CONFLICT`: Represents a serialization conflict specific to a transaction execution. This is recorded only when a [client-side retry error]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}) containing the conflicting transaction details is emitted.</li></ul>After recording, the `contention_type` is not modified. A transaction may have multiple `LOCK_WAIT` events, as they correspond to specific keys, but only one `SERIALIZATION_CONFLICT` event.
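To show how the columns described above fit together, the following sketch totals `contention_duration` per database, schema, table, and index for rows that have already been fetched from the table. It is an illustration under stated assumptions, not part of the changed file: `total_wait_by_index` and `sample_events` are hypothetical names, the sample rows are made up, and `contention_duration` is assumed to arrive in the client as a `datetime.timedelta`.

```python
from collections import defaultdict
from datetime import timedelta


def total_wait_by_index(events):
    """Sum contention_duration per (database, schema, table, index).

    `events` is an iterable of dicts keyed by the column names in the table
    above. Returns a list of (key, total_wait, event_count) tuples, sorted
    with the longest total wait first.
    """
    totals = defaultdict(lambda: [timedelta(0), 0])
    for ev in events:
        key = (ev["database_name"], ev["schema_name"],
               ev["table_name"], ev["index_name"])
        totals[key][0] += ev["contention_duration"]
        totals[key][1] += 1
    ranked = [(key, wait, count) for key, (wait, count) in totals.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up rows for illustration only; real rows come from the table itself.
sample_events = [
    {"database_name": "db", "schema_name": "public", "table_name": "orders",
     "index_name": "orders_pkey", "contention_type": "LOCK_WAIT",
     "contention_duration": timedelta(milliseconds=40)},
    {"database_name": "db", "schema_name": "public", "table_name": "users",
     "index_name": "users_email_idx", "contention_type": "LOCK_WAIT",
     "contention_duration": timedelta(milliseconds=30)},
    {"database_name": "db", "schema_name": "public", "table_name": "orders",
     "index_name": "orders_pkey", "contention_type": "LOCK_WAIT",
     "contention_duration": timedelta(milliseconds=60)},
]

for key, wait, count in total_wait_by_index(sample_events):
    print(key, wait, count)
```

Grouping on the client keeps the number of queries against the table low, which matters because each query triggers a fan-out to all nodes.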
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_danger}}
Querying the `crdb_internal.cluster_locks` table triggers an RPC fan-out to all nodes in the cluster, which can make it a relatively expensive operation.
{{site.data.alerts.end}}
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_danger}}
Querying the `crdb_internal.transaction_contention_events` table triggers an expensive RPC fan-out to all nodes, making it a resource-intensive operation. Avoid frequent polling and do not use this table for continuous monitoring.
{{site.data.alerts.end}}
2 changes: 1 addition & 1 deletion src/current/_includes/v25.2/essential-metrics.md
@@ -130,7 +130,7 @@ The **Usage** column explains why each metric is important to visualize in a cus
| <a id="sql-service-latency"></a>sql.service.latency-p90, sql.service.latency-p99 | sql.service.latency | Latency of SQL request execution | These high-level metrics reflect workload performance. Monitor these metrics to understand latency over time. If abnormal patterns emerge, apply the metric's time range to the [**SQL Activity** pages]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#sql-activity-pages) to investigate interesting outliers or patterns. The [**Statements page**]({% link {{ page.version.version }}/ui-statements-page.md %}) has P90 Latency and P99 latency columns to enable correlation with this metric. |
| sql.txn.latency-p90, sql.txn.latency-p99 | sql.txn.latency | Latency of SQL transactions | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
| txnwaitqueue.deadlocks_total | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
-| sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
+| <a id="sql-distsql-contended-queries-count"></a>sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
| <a id="sql-conn-failures"></a>sql.conn.failures | sql.conn.failures.count | Number of SQL connection failures | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. Connection failures are not recorded in these metrics.|
| txn.restarts.serializable | txn.restarts.serializable | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
@@ -57,6 +57,12 @@
"/${VERSION}/admission-control.html"
]
},
{
"title": "Monitor and Analyze Transaction Contention",
"urls": [
"/${VERSION}/monitor-and-analyze-transaction-contention.html"
]
},
{
"title": "Performance Tuning Recipes",
"urls": [
@@ -0,0 +1,17 @@
Column | Type | Description
-------|------|------------
`collection_ts` | `TIMESTAMPTZ NOT NULL` | The timestamp when the transaction [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) event was collected.
`blocking_txn_id` | `UUID NOT NULL` | The ID of the blocking transaction. You can join this column into the [`cluster_contention_events`]({% link {{ page.version.version }}/crdb-internal.md %}#cluster_contention_events) table.
`blocking_txn_fingerprint_id` | `BYTES NOT NULL`| The ID of the blocking transaction fingerprint. To surface historical information about the transactions that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) and [`transaction_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#transaction_statistics) tables.
`waiting_txn_id` | `UUID NOT NULL` | The ID of the waiting transaction. You can join this column into the [`cluster_contention_events`]({% link {{ page.version.version }}/crdb-internal.md %}#cluster_contention_events) table.
`waiting_txn_fingerprint_id` | `BYTES NOT NULL` | The ID of the waiting transaction fingerprint. To surface historical information about the transactions that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) and [`transaction_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#transaction_statistics) tables.
`waiting_stmt_id` | `STRING NOT NULL` | The statement id of the transaction that was waiting (unique for each statement execution).
`waiting_stmt_fingerprint_id` | `BYTES NOT NULL` | The ID of the waiting statement fingerprint. To surface historical information about the statements that caused the [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention), you can join this column into the [`statement_statistics`]({% link {{ page.version.version }}/crdb-internal.md %}#statement_statistics) table.
`contention_duration` | `INTERVAL NOT NULL` | The interval of time the waiting transaction spent waiting for the blocking transaction.
`contending_key` | `BYTES NOT NULL` | The key on which the transactions contended.
`contending_pretty_key` | `STRING NOT NULL` | The specific key that was involved in the contention event, in readable format.
`database_name` | `STRING NOT NULL` | The database where the contention occurred.
`schema_name` | `STRING NOT NULL` | The schema where the contention occurred.
`table_name` | `STRING NOT NULL` | The table where the contention occurred.
`index_name` | `STRING NULL` | The index where the contention occurred.
`contention_type` | `STRING NOT NULL` | The type of contention. Possible values:<ul><li>`LOCK_WAIT`: Indicates that the transaction waited on a specific key. The record includes the key and the wait duration.</li><li>`SERIALIZATION_CONFLICT`: Represents a serialization conflict specific to a transaction execution. This is recorded only when a [client-side retry error]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}) containing the conflicting transaction details is emitted.</li></ul>After recording, the `contention_type` is not modified. A transaction may have multiple `LOCK_WAIT` events, as they correspond to specific keys, but only one `SERIALIZATION_CONFLICT` event.
Binary file added src/current/images/v25.1/contention-1.png
Binary file added src/current/images/v25.1/contention-2.png
Binary file added src/current/images/v25.1/contention-3.png
Binary file added src/current/images/v25.1/contention-4.png
Binary file added src/current/images/v25.1/contention-5.png
Binary file added src/current/images/v25.1/contention-6.png
Binary file added src/current/images/v25.2/contention-1.png
Binary file added src/current/images/v25.2/contention-2.png
Binary file added src/current/images/v25.2/contention-3.png
Binary file added src/current/images/v25.2/contention-4.png
Binary file added src/current/images/v25.2/contention-5.png
Binary file added src/current/images/v25.2/contention-6.png