Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query-level metrics along with query text and query execution plan #37478

Open
akshayAppd19 opened this issue Jan 24, 2025 · 3 comments
Open
Labels
enhancement New feature or request needs triage New item requiring triage receiver/oracledb

Comments

@akshayAppd19
Copy link

Component(s)

receiver/oracledb

Is your feature request related to a problem? Please describe.

Currently, this receiver produces only instance level metrics, there are no query-level metrics for users to identify slow, resource-intensive database queries. Identifying such queries helps debug the application or database performance issues. If a user wants to investigate the slow query issue, instance-level metrics are not gonna help a lot, as it does not show how a certain query is processed by the server. In this case, query-level metrics for a query that runs slowly could bring more context and greatly help the process of debugging and optimizing.
For instance, we can obtain physical reads of a single query consumed. This metric indicates disk I/O activity, and having a high number on this may suggest the query consumed too much I/O that the instance can serve, so we know where the bottleneck is and how to fix it.

Describe the solution you'd like

Oracle DB provides insights into query performance through multiple stats table. We propose to use the Oracle table V$SQL to fetch the query-level data. We will make changes to the the Oracle receiver to fetch the data from this table and produce corresponding metrics. Along with the query-level metrics, we also intend to fetch the query text and query execution plan. Query text would be fetched from table V$SQLSTATS, and query execution plan from V$SQL.

Concerns
However, the length of the sanitized query text and the execution plan can be very long, which exceeds the length limit of metric storage. The execution plan has a high cardinality value that does not fit into the metric storage.

To solve this problem, we propose sending query texts and execution plans in logs. So, every query result fetched from the SQL server would populate a metric and a corresponding log.

Describe alternatives you've considered

None

Additional context

We (a Splunk team) are interested in implementing this feature. But, since using logs like this is not what a common receiver would do, we would like to know whether the community can accept this or not before we work on the implementation.

Other receivers that have a similar approach:

k8sclusterreceiver: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/40dbf0b5487fd20e2024b66d72cd9945b90eca98/receiver/k8sclusterreceiver

@akshayAppd19 akshayAppd19 added enhancement New feature or request needs triage New item requiring triage labels Jan 24, 2025
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Copy link
Member

crobert-1 commented Jan 24, 2025

Hello @akshayAppd19, thanks for submitting this issue and for including so much information, it's really helpful!

This issue seems to have two main proposals, the first being adding metrics for SQL queries from the V$SQL and V$SQLSTATS table. I think this is a great idea, happy to have these added!

The second part of this request is adding logs support to help fill out query information that doesn't fit in the metrics. The receiver would be used in two instances in this case, so the logs receiver would be duplicating the queries made by the metrics receiver. One receiver instance can't create metrics and logs at the same time from the same data. My other main concern is that the query and query plan data doesn't change often, so my guess is that it would need to be scraped less frequently than metrics, but frequently enough that changes are caught in a timely manner.

Ideally, there'd be some unique value for a SQL query and plan that could be added to the metric which could be used to effectively communicate which query it is, and which plan. This would greatly simplify the logic and UX of components in the Collector.

This isn't a blocker for me, but I'd want to make sure there's no other column from either table that could be used for a query to effectively map the value to the query in Oracle DB.

Can you confirm there's no HASH or ID value in either table (V$SQL or V$SQLSTATS) that can be used instead of your proposed solution?

@akshayAppd19
Copy link
Author

Hello @crobert-1 yes we do want to cache query test as it won't change frequently. However, in Oracle, query plan may change even if the text doesn't change. So, we are still exploring if caching query plan would be useful or not.
The SQL_ID attribute would be used to map the metric with a query. We would use SQL_ID to fetch the query text too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request needs triage New item requiring triage receiver/oracledb
Projects
None yet
Development

No branches or pull requests

2 participants