DOC-12504 Clarify that the connection latency metric (and chart) does not include failed queries #19416

Open
wants to merge 8 commits into
base: main
from
Original file line number Diff line number Diff line change
@@ -1,3 +1 @@
Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and establishes the connection to the client, including [authentication]({% link cockroachcloud/authentication.md %}). This graph shows the p90 and p99 latencies for [SQL connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) to the cluster.

These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times.
Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and when it establishes the connection to the client, including [authentication]({% link cockroachcloud/authentication.md %}). This graph shows the p90 and p99 latencies for [SQL connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) to the cluster.<br /><br />These metrics characterize database connection latency, which can affect application performance, for example, by causing slow startup times. Connection failures are not recorded in these metrics.
@@ -1,5 +1 @@
This metric shows the total number of SQL [client connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) across the cluster.

Refer to the [<b>Sessions</b> page]({% link cockroachcloud/sessions-page.md %}) for more details on the sessions.

This metric also shows the distribution, or balancing, of connections across the cluster. Review [Connection Pooling]({% link {{ site.current_cloud_version }}/connection-pooling.md %}).
This metric shows the total number of SQL [client connections]({% link {{ site.current_cloud_version }}/show-sessions.md %}) across the cluster.<br /><br />Refer to the [<b>Sessions</b> page]({% link cockroachcloud/sessions-page.md %}) for more details on the sessions.<br /><br />This metric also shows the distribution, or balancing, of connections across the cluster. Review [Connection Pooling]({% link {{ site.current_cloud_version }}/connection-pooling.md %}).
3 changes: 2 additions & 1 deletion src/current/_includes/v23.1/essential-metrics.md
@@ -111,7 +111,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
| sql.txn.latency-p90, sql.txn.latency-p99 | sql.txn.latency | Latency of SQL transactions | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
| txnwaitqueue.deadlocks_total | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
| sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
| sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
| <a id="sql-conn-failures"></a>sql.conn.failures | sql.conn.failures.count | Number of SQL connection failures | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize database connection latency, which can affect application performance, for example, by causing slow startup times. Connection failures are not recorded in these metrics. |
| txn.restarts.serializable | txn.restarts.serializable | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetooold | txn.restarts.writetooold | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetoooldmulti | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
3 changes: 2 additions & 1 deletion src/current/_includes/v23.2/essential-metrics.md
@@ -111,7 +111,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
| sql.txn.latency-p90, sql.txn.latency-p99 | sql.txn.latency | Latency of SQL transactions | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
| txnwaitqueue.deadlocks_total | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
| sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
| sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
| <a id="sql-conn-failures"></a>sql.conn.failures | sql.conn.failures.count | Number of SQL connection failures | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize database connection latency, which can affect application performance, for example, by causing slow startup times. Connection failures are not recorded in these metrics. |
| txn.restarts.serializable | txn.restarts.serializable | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetooold | txn.restarts.writetooold | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetoooldmulti | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
3 changes: 2 additions & 1 deletion src/current/_includes/v24.1/essential-metrics.md
@@ -125,7 +125,8 @@ The **Usage** column explains why each metric is important to visualize in a cus
| sql.txn.latency-p90, sql.txn.latency-p99 | sql.txn.latency | Latency of SQL transactions | These high-level metrics provide a latency histogram of all executed SQL transactions. These metrics provide an overview of the current SQL workload. |
| txnwaitqueue.deadlocks_total | {% if include.deployment == 'self-hosted' %}txnwaitqueue.deadlocks.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of deadlocks detected by the transaction wait queue | Alert on this metric if its value is greater than zero, especially if transaction throughput is lower than expected. Applications should be able to detect and recover from deadlock errors. However, transaction performance and throughput can be maximized if the application logic avoids deadlock conditions in the first place, for example, by keeping transactions as short as possible. |
| sql.distsql.contended_queries.count | {% if include.deployment == 'self-hosted' %}sql.distsql.contended.queries |{% elsif include.deployment == 'advanced' %} sql.distsql.contended.queries |{% endif %} Number of SQL queries that experienced contention | This metric is incremented whenever there is a non-trivial amount of contention experienced by a statement whether read-write or write-write conflicts. Monitor this metric to correlate possible workload performance issues to contention conflicts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize the database connection latency which can affect the application performance, for example, by having slow startup times. |
| <a id="sql-conn-failures"></a>sql.conn.failures | sql.conn.failures.count | Number of SQL connection failures | This metric is incremented whenever a connection attempt fails for any reason, including timeouts. |
| <a id="sql-conn-latency"></a>sql.conn.latency-p90, sql.conn.latency-p99 | sql.conn.latency | Latency to establish and authenticate a SQL connection | These metrics characterize database connection latency, which can affect application performance, for example, by causing slow startup times. Connection failures are not recorded in these metrics. |
| txn.restarts.serializable | txn.restarts.serializable | Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetooold | txn.restarts.writetooold | Number of restarts due to a concurrent writer committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
| txn.restarts.writetoooldmulti | {% if include.deployment == 'self-hosted' %}txn.restarts.writetoooldmulti.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of restarts due to multiple concurrent writers committing first | This metric is one measure of the impact of contention conflicts on workload performance. For guidance on contention conflicts, review [transaction contention best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) and [performance tuning recipes]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). Tens of restarts per minute may be a high value, a signal of an elevated degree of contention in the workload, which should be investigated. |
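
The practical effect of the "Connection failures are not recorded in these metrics" wording can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `ConnAttempt`, `percentile`, and `summarize` are hypothetical names, and the nearest-rank percentile is a simplification, not how the cluster derives `sql.conn.latency-p90` or `sql.conn.latency-p99` from its histogram. The sketch only shows that failed attempts contribute to a separate failure count and leave the latency percentiles unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnAttempt:
    """One hypothetical connection attempt (illustrative, not a CockroachDB type)."""
    latency_ns: Optional[int]  # None models an attempt that failed


def percentile(sorted_values: List[int], p: int) -> int:
    """Nearest-rank percentile over an ascending list (assumed method, for illustration)."""
    if not sorted_values:
        raise ValueError("no samples")
    # Integer ceiling of p * n / 100 avoids floating-point rounding surprises.
    rank = (p * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]


def summarize(attempts: List[ConnAttempt]) -> dict:
    """Compute latency percentiles from successes only; count failures separately."""
    ok = sorted(a.latency_ns for a in attempts if a.latency_ns is not None)
    failures = sum(1 for a in attempts if a.latency_ns is None)
    return {
        "p90_ns": percentile(ok, 90) if ok else None,
        "p99_ns": percentile(ok, 99) if ok else None,
        "failures": failures,
    }


if __name__ == "__main__":
    # Ten successful connections (1 ms to 10 ms) plus five failed attempts.
    sample = [ConnAttempt(i * 1_000_000) for i in range(1, 11)]
    sample += [ConnAttempt(None)] * 5
    print(summarize(sample))
```

In this sketch, a burst of failed attempts changes only the `failures` value, which is why a latency chart on its own can look healthy while clients are failing to connect. Reading the latency metrics alongside `sql.conn.failures` covers both cases.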