Conversation
```python
self.elapsed = elapsed


def store_profile_events(self, packet):
    data = QueryResult([packet]).get_result()
```
Currently we have 'static' attributes that store statistics: https://clickhouse-driver.readthedocs.io/en/latest/features.html#query-execution-statistics: client.last_query.progress.total_rows, client.last_query.progress.total_bytes, etc.
I'd prefer to store statistics in the same way if possible: client.last_query.stats.select_query, client.last_query.stats.selected_rows.
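One way to keep that attribute-style access while storing the raw ProfileEvents in a dict could be a thin wrapper. This is a hypothetical sketch: the `Stats` class and the snake_case names are not part of the driver.

```python
class Stats:
    """Hypothetical attribute-style view over a stats dict,
    mirroring client.last_query.progress.* access."""

    def __init__(self, data):
        self._data = dict(data or {})

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so _data itself is never shadowed.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)


stats = Stats({"selected_rows": 100, "selected_bytes": 4096})
print(stats.selected_rows)  # 100
```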
I use the content of this data for analyzing queries, and it can be very different. The metrics vary with query type and even with the queried table engine. I guess server version may have an effect too (I only use v23 at the moment). So, in the end, this is not a stable list of metrics, and the number of options is too big (possibly more than 100).
I found that ~20 of them are the most common across queries and the most interesting. I use a pydantic model to get them:
```python
class ClickhouseStats(pydantic.BaseModel):
    elapsed: int = pydantic.Field(alias="elapsed")
    is_insert: int | None = pydantic.Field(alias="InsertQuery", default=None)
    read_bytes: int | None = pydantic.Field(alias="ReadCompressedBytes", default=None)
    write_bytes: int | None = pydantic.Field(alias="WriteBufferFromFileDescriptorWriteBytes", default=None)
    network_recv_bytes: int | None = pydantic.Field(alias="NetworkReceiveBytes", default=None)
    network_recv_time: int | None = pydantic.Field(alias="NetworkReceiveElapsedMicroseconds", default=None)
    network_send_bytes: int | None = pydantic.Field(alias="NetworkSendBytes", default=None)
    network_send_time: int | None = pydantic.Field(alias="NetworkSendElapsedMicroseconds", default=None)
    memory_usage: int | None = pydantic.Field(alias="MemoryTrackerUsage", default=None)
    memory_peak: int | None = pydantic.Field(alias="MemoryTrackerPeakUsage", default=None)
    file_open: int | None = pydantic.Field(alias="FileOpen", default=None)
    function_execute: int | None = pydantic.Field(alias="FunctionExecute", default=None)
    write_time: int | None = pydantic.Field(alias="DiskWriteElapsedMicroseconds", default=None)
    insert_rows: int | None = pydantic.Field(alias="InsertedRows", default=None)
    insert_bytes: int | None = pydantic.Field(alias="InsertedBytes", default=None)
    select_rows: int | None = pydantic.Field(alias="SelectedRows", default=None)
    select_bytes: int | None = pydantic.Field(alias="SelectedBytes", default=None)
    insert_parts: int | None = pydantic.Field(alias="InsertedCompactParts", default=None)
    real_time: int | None = pydantic.Field(alias="RealTimeMicroseconds", default=None)
    system_time: int | None = pydantic.Field(alias="SystemTimeMicroseconds", default=None)

    def __init__(self, result: CursorResult | None = None, query_info: QueryInfo | None = None):
        if query_info is None:
            query_info = result.context.query_info
        super().__init__(elapsed=int(query_info.elapsed * 1000), **(query_info.stats or {}))  # TODO: 1000?
```
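For comparison, the same alias lookup could be done without pydantic with a plain mapping. This is just a sketch, and the alias table below only lists a few of the names from the model above:

```python
# Friendly name -> ProfileEvents counter name (illustrative subset).
ALIASES = {
    "select_rows": "SelectedRows",
    "select_bytes": "SelectedBytes",
    "memory_peak": "MemoryTrackerPeakUsage",
}


def extract_stats(raw):
    """Pick the interesting counters out of a raw ProfileEvents dict.

    Missing counters are returned as None, like the pydantic defaults.
    """
    return {name: raw.get(alias) for name, alias in ALIASES.items()}
```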
I can add them here (without pydantic), but wouldn't that be too much?
Yep, it's too much. A dict will be fine.
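With the dict approach, the driver side could aggregate ProfileEvents packets roughly like this. It is a sketch assuming each event row carries a counter name and a value; names repeat across threads, so values are summed:

```python
def aggregate_profile_events(rows, stats=None):
    """Sum ProfileEvents counters into a plain dict.

    rows: iterable of (name, value) pairs extracted from a packet.
    stats: optional dict to accumulate into across packets.
    """
    stats = {} if stats is None else stats
    for name, value in rows:
        stats[name] = stats.get(name, 0) + value
    return stats
```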
Adds support for profile events in the native protocol. The events carry many different parameters, like network timings, locks, and memory usage, that can be very helpful for debugging and monitoring queries.
Not sure whether the docs need updating.
Checklist:
- Run flake8 and fix issues.
- Run pytest: no tests failed.

See https://clickhouse-driver.readthedocs.io/en/latest/development.html.