Troubleshooting Query Plan Regressions guide #20893
Conversation
rytaft
left a comment
This is really great! Thank you for doing this! I left a few suggestions, and I bet @yuzefovich may have some more.
| - [Understand how the cost-based optimizer chooses query plans]({% link {{page.version.version}}/cost-based-optimizer.md %}) based on table statistics, and how those statistics are refreshed.
| ## Query plan regressions vs. suboptimal plans
This section seems a bit too focused on the technicality of what the Insights page currently supports. I think it's worth mentioning that the Insights page can help, but I'm not sure you need to distinguish between plan regressions v suboptimal plans.
I agree with Becca on this. This section seems confusing to me in the current form. "Slow execution" and "suboptimal plan" insights might be good starting points for troubleshooting an unsatisfactory latency for a given query, yet neither necessarily confirms / disproves that this query has experienced a query plan regression.
Perhaps a better way to include the information about the insights would be to have just a single sentence in "Before you begin" section to indicate that "suboptimal plan" insight might help with identifying / understanding the query plan regression. I'd probably omit the mention of "slow execution" insight altogether since it doesn't give much useful signal with query plan regressions - after all, the execution time exceeding the threshold controlled via the cluster setting could be the best we can do.
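For reference, that threshold is controlled by a cluster setting; a minimal sketch, assuming the setting name is `sql.insights.latency_threshold`:

```sql
-- Inspect the slow-execution insight threshold (assumed setting name).
SHOW CLUSTER SETTING sql.insights.latency_threshold;

-- Raise it so that only executions slower than 250ms are flagged.
SET CLUSTER SETTING sql.insights.latency_threshold = '250ms';
```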
| 2. If you've already identified specific time intervals in Step 1, you can use the time interval selector to create a custom time interval. Click **Apply**.
| 3. If there is only one plan in the resulting table, there was only one plan used for this statement fingerprint during this time interval, and therefore a query plan regression could not have occurred. If there are multiple plans listed in the resulting table, the query plan changed within the given time interval. By default, the table is sorted from most recent to least recent query plan. Compare the **Average Execution Time** of the different plans.
| If a plan in the table has a significantly higher average execution time than the one that preceded it, it's possible that this is a query plan regression. It's also possible that the increase in latency is coincidental, or that the plan change was not the actual cause. For example, if the average execution time of the latest query plan is significantly higher than the average execution time of the previous query plan, this could be explained by a significant increase in the **Average Rows Read** column.
An increase in Average Rows Read could indicate a query plan regression, since it's possible that the bad query plan is scanning more rows than it should.
But as I think you're intending to show, an increase in Average Rows Read could also indicate that more data was added to the table. It's probably worth mentioning both possibilities here.
To me it seems more likely that a significant increase (like an order of magnitude growth) in Average Rows Read is actually due to a plan regression, rather than due to the table size growth, since we're comparing two plans for the given query fingerprint that presumably were executed close - time-wise - to each other. I agree though that both are possibilities.
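One way to make this plan-to-plan comparison outside the console is to query the aggregated statistics directly; a rough sketch, assuming the `crdb_internal.statement_statistics` schema and its JSON field names (`cnt`, `rowsRead`, `runLat`), with a placeholder query filter:

```sql
-- Per-plan execution stats for one statement fingerprint, oldest first.
-- JSON field names are assumed from the crdb_internal schema.
SELECT
    aggregated_ts,
    encode(plan_hash, 'hex') AS plan,
    statistics -> 'statistics' ->> 'cnt' AS exec_count,
    statistics -> 'statistics' -> 'rowsRead' ->> 'mean' AS avg_rows_read,
    statistics -> 'statistics' -> 'runLat' ->> 'mean' AS avg_run_latency
FROM crdb_internal.statement_statistics
WHERE metadata ->> 'query' LIKE 'SELECT * FROM orders%' -- placeholder filter
ORDER BY aggregated_ts;
```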
| 1. In the **Explain Plans** tab, click on the Plan Gist of the more recent plan to see it in more detail.
| 2. Click on **All Plans** above to return to the list of plans.
| 3. Click on the Plan Gist of the previous plan to see it in more detail. Compare the two plans to understand what changed. Do the plans use different indexes? Are they scanning the different portions of the table? Do they use different join strategies?
nit: the different portions -> different portions
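When comparing two plans outside the console, the decoded gists can also be diffed directly; a minimal sketch, with a placeholder gist string:

```sql
-- Expand a plan gist copied from the Explain Plans tab into a readable
-- plan outline. The gist value here is a placeholder.
SELECT * FROM crdb_internal.decode_plan_gist('AgHQAQIAHwAAAAMGBA==');
```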
| #### Determine if a literal in the SQL statement has changed
| [NOTE FROM BRANDON: I need more information on this case, mainly how to identify that this is the case, and what to do about it.]
I'm not sure there is a good way to determine this without collecting a conditional statement bundle for a slow execution of the statement fingerprint (unless the DB operator happens to know that the application is using a new value for a particular placeholder). Maybe @yuzefovich has another idea?
Oof, yeah, this is a hard one. The tutorial so far assumes that there is a single good plan for a query fingerprint that might have regressed, but it's actually possible that multiple plans are good, depending on the values of placeholders ("literals").
Here is an example of two different optimal plans (although they do look similar):
```sql
CREATE TABLE small (k INT PRIMARY KEY, v INT);
CREATE TABLE large (k INT PRIMARY KEY, v INT, INDEX (v));
INSERT INTO small SELECT i, i FROM generate_series(1, 10) AS g(i);
INSERT INTO large SELECT i, 1 FROM generate_series(1, 10000) AS g(i);
ANALYZE small;
ANALYZE large;
-- this scans `large` on the _left_ side of merge join
EXPLAIN SELECT * FROM small INNER JOIN large ON small.v = large.v AND small.v = 1;
-- this scans `large` on the _right_ side of merge join
EXPLAIN SELECT * FROM small INNER JOIN large ON small.v = large.v AND small.v = 2;
```

Complicating things is that we deal with query fingerprints internally, so all such constants are removed from our observability tooling. If there was an escalation saying that a particular query fingerprint is occasionally slow, similar to Becca I'd have asked for a conditional statement bundle, and then I'd play around locally with different values of placeholders to see whether multiple plans could be chosen based on concrete placeholder values. But so far we've used statement bundles mostly as internal (to the Queries team in particular and Cockroach Labs support in general) tooling, so I'd probably not mention going down this route.
Instead, I'd consider suggesting looking into the application side to see whether the literal has changed or something like that.
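For completeness, collecting a statement bundle for one execution looks roughly like this (keeping in mind the caveat above that bundles are mostly internal tooling); a minimal sketch reusing the tables from the example:

```sql
-- Run one execution with full diagnostics; this produces a bundle ZIP
-- (plan, table statistics, environment) for offline inspection. The
-- query and literal are placeholders reusing the example tables.
EXPLAIN ANALYZE (DEBUG)
SELECT * FROM small INNER JOIN large ON small.v = large.v AND small.v = 1;
```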
| If you suspect that the query plan change is the cause of the latency increase, and you suspect that the query plan changed due to a changed query literal, [what should you do]
> what should you do
The likely problem is that the query stats don't accurately reflect how this value is represented in the data. This can be fixed by running ANALYZE <table> to refresh the stats for the table. It's also possible that a good index isn't available, which could be fixed by checking the index recommendations displayed by EXPLAIN-ing the query or on the insights page. If none of these options fixes the issue, a more drastic redesign of the schema/application may be needed.
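A minimal sketch of that stats refresh, with a placeholder table name:

```sql
-- Refresh table statistics so the optimizer sees the current data
-- distribution; ANALYZE is shorthand for CREATE STATISTICS.
ANALYZE orders;

-- Equivalent long form, naming the statistics explicitly:
CREATE STATISTICS orders_stats FROM orders;
```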
yuzefovich
left a comment
Nice, glad to see this work!
| One way of tracking down query plan regressions is to identify SQL statements whose executions are relatively high in latency. Use one or both of the following methods to identify queries that might be associated with a latency increase.
| #### Use workload insights
As I mentioned in another comment, my understanding of "slow execution" and "suboptimal plan" insights is that they cannot really be used to find or troubleshoot query plan regressions, so I'd remove "Use workload insights" approach altogether.
That said, it might be worth reaching out to TSEs / EEs to check whether their experience matches my understanding.
| 3. Among the resulting Statement Fingerprints, look for those with high latency. Click on the column headers to sort the results by **Statement Time** or **Max Latency**.
| 4. Click on the Statement Fingerprint to go to the page that details the statement and its executions.
| {{site.data.alerts.callout_success}}
| Look for statements whose **Execution Count** is high. Statements that are run once, such as import statements, aren't likely to be the cause of increased latency due to query plan regressions.
nit: capitalize IMPORT and perhaps link to the IMPORT docs page.
| #### Determine if the table indexes changed
| 1. Look at the **Used Indexes** column for the older and the newer query plans. If these aren't the same, it's likely that the creation or deletion of an index resulted in a change to the statement's query plan.
| 2. In the **Explain Plans** tab, click on the Plan Gist of the more recent plan to see it in more detail. Identify the table used in the initial "scan" step of the plan.
nit: s/table/tables/ - it's possible that we have initial scans of multiple tables.
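A quick way to check the current index set on those tables from SQL (table name is a placeholder):

```sql
-- List the indexes currently available to the optimizer; compare with
-- the Used Indexes column for the older and newer plans.
SHOW INDEXES FROM orders;

-- Full schema, which also reflects recently added or dropped indexes:
SHOW CREATE TABLE orders;
```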
yuzefovich
left a comment
Nice! This generally looks good to me now.
rytaft
left a comment
Looks great! Just a couple small suggestions.
rytaft
left a comment
LGTM thank you!
florence-crl
left a comment
Great work structuring this content! I was very nit-picky with making the text more clear and concise. Please let me know if you have any questions.
| ## What to look out for
| Query plan regressions only increase the execution time of SQL statements that use that plan. This means that the overall service latency of the cluster will only be affected during the execution of statements that are run with the problematic query plan.
Suggested change:
| Query plan regressions increase the execution time only for SQL statements that use the affected plan. This means that the overall service latency of the cluster will only be affected during the execution of statements that are run with the problematic query plan.
Query plan regressions only increase the execution time of SQL statements that use the affected plan. This means that the overall service latency of the cluster will only be affected during the execution of statements that are run with the problematic query plan.
| This might make those latency spikes harder to identify. For example, if the problematic plan only affects a query that's run on an infrequent, ad-hoc basis, it might be difficult to notice a pattern among the graphs on the [**Metrics** page]({% link {{page.version.version}}/ui-overview.md %}#metrics).
Suggested change:
| As a result, these latency spikes can be harder to identify. For example, if the problematic plan only affects a query that's run on an infrequent, ad-hoc basis, it might be difficult to notice a pattern among the graphs on the [**Metrics** page]({% link {{page.version.version}}/ui-overview.md %}#metrics).
As a result, these latency spikes can be hard to identify. For example, if the problematic plan only affects a query that's run on an infrequent, ad-hoc basis, it might be difficult to notice a pattern among the graphs on the [Metrics page]({% link {{page.version.version}}/ui-overview.md %}#metrics).
| Inspect your application to see if the literals being used within the query executions are changing.
| If you suspect that the query plan change is the cause of the latency increase, and you suspect that the query plan changed due to a changed query literal, it's possible that the table statistics don't accurately reflect how the literal values are represented in the data. You may want to [manually refresh the statistics for the table]({% link {{ page.version.version }}/create-statistics.md %}#examples). It's also possible that the table indexes are not helpful for queries with the newer literal value, in which case you may want to [check the **Insights** page for index recommendations]({% link {{ page.version.version }}/ui-insights-page.md %}#suboptimal-plan).
Suggested change:
| If you suspect the plan change caused the latency increase and was triggered by a changed query literal, table statistics may not accurately reflect how those values appear in the data. You may want to [manually refresh the statistics for the table]({% link {{ page.version.version }}/create-statistics.md %}#examples). It’s also possible that the current indexes aren’t effective for queries with the new literal value. In that case, [check the **Insights** page for index recommendations]({% link {{ page.version.version }}/ui-insights-page.md %}#suboptimal-plan).
| If this does not fix the issue, a more drastic redesign of the schema or application may be needed.
Suggested change:
| If the issue persists, a more substantial redesign of the schema or application may be required.
| #### View all events
| 1. Go to the [**Metrics** page]({% link {{page.version.version}}/ui-overview.md %}#metrics).
| 2. Go the [**Events** panel]({% link {{page.version.version}}/ui-runtime-dashboard.md %}#events-panel) on the right. Scroll to the bottom, and click **View All Events**.
Suggested change:
| 2. Go to the [**Events** panel]({% link {{page.version.version}}/ui-runtime-dashboard.md %}#events-panel) on the right. Scroll to the bottom, and click **View All Events**.
| 3. Scroll down to the approximate time when the latency increase began.
| See if any events occured around that time that may have contributed to a query plan regression. These might include schema changes that affect tables involved in the suspect SQL queries, [changed cluster settings]({% link {{ page.version.version }}/set-cluster-setting.md %}), created or dropped indexes, and more.
Suggested change:
| Check for any events around that time that may have contributed to a query plan regression. These may include schema changes affecting tables in suspect SQL queries, [modified cluster settings]({% link {{ page.version.version }}/set-cluster-setting.md %}), created or dropped indexes, and more.
| A consequential event around the time of the latency increase may have affected the way that the optimizer chose query plans. Inspect changed cluster settings, or [determine if the table indexes changed](#determine-if-the-table-indexes-changed).
Suggested change:
| An event around the time of the latency increase may have influenced how the optimizer selected query plans. Inspect changed cluster settings, or [determine if the table indexes changed](#determine-if-the-table-indexes-changed).
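The same event history can also be pulled with SQL; a rough sketch, assuming the `system.eventlog` schema and a few representative event types:

```sql
-- Events in the window around the latency increase. Column and
-- event-type names are assumed from the system.eventlog schema.
SELECT "timestamp", "eventType", info
FROM system.eventlog
WHERE "timestamp" > now() - INTERVAL '24 hours'
  AND "eventType" IN ('create_index', 'drop_index', 'set_cluster_setting')
ORDER BY "timestamp" DESC;
```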
florence-crl
left a comment
LGTM pending non-blocking nits
DOC-15152
This is a first draft, I definitely want a review for accuracy since I'm still pretty new to this product.
There's an unfinished section at the very bottom, I've left a note there and am looking for some guidance.
Happy to iterate on this more, I just want to get eyes on it.
Rendered preview: