Add views for metrics about pageserver requests #9008
Conversation
Orca Security Scan Summary

| Status | Check | Issues by priority | |
|---|---|---|---|
| Passed | Infrastructure as Code | 0 0 0 0 | View in Orca |
| Passed | Secrets | 0 0 0 0 | View in Orca |
| Passed | Vulnerabilities | 0 0 0 0 | View in Orca |
Force-pushed from 48130d2 to 1993f22 (compare)
4977 tests run: 4813 passed, 0 failed, 164 skipped (full report)

Flaky tests (7): Postgres 17, Postgres 16, Postgres 15, Postgres 14

Code coverage* (full report)

* collected from Rust tests only. The comment gets automatically updated with the latest test results.

367de54 at 2024-09-17T19:41:09.997Z :recycle:
Force-pushed from 1993f22 to d2f47a0 (compare)
Can't this use additional columns for dimensions? I'm not a fan of Prometheus's format and would appreciate not exposing such opinionated names.
I actually wrote it that way at first, with an explicit "bucket" column. But to then convert them to the Prometheus metrics format in the exporter, you need a pretty complex SQL query. I found it easier to do that in C code directly.
Why can't the exporter do the table-to-metrics transformation? Shouldn't it be able to handle that transformation by itself?
You can give the exporter an arbitrary SQL query, and do the transformation in that query.
I know you can do that, but shouldn't the exporter itself know how to treat the (absence of) labels? See e.g. https://github.com/neondatabase/neon/blob/0a8c5e1214fcd3f59767a6ca4adeb68612977e51/vm-image-spec.yaml#L439C1-L449C85, where labels are read from columns.
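To illustrate the column-to-label mapping being referred to, here is a hypothetical Python sketch. The function name and sample rows are made up; the real mechanism is sql_exporter's collector configuration (its settings for designating label columns and value columns), not Python code.

```python
def rows_to_samples(columns, rows, label_cols, value_col):
    """Turn SQL query result rows into (labels, value) metric samples,
    roughly the way sql_exporter maps designated columns to labels."""
    samples = []
    for row in rows:
        rec = dict(zip(columns, row))
        labels = {c: str(rec[c]) for c in label_cols}  # label columns become label values
        samples.append((labels, float(rec[value_col])))  # one column carries the value
    return samples

# Made-up rows standing in for a metrics view's output:
columns = ["metric", "value"]
rows = [("getpage_wait_seconds_count", 42), ("pageserver_requests_sent_total", 100)]
print(rows_to_samples(columns, rows, ["metric"], "value"))
```

The point is that the exporter, not the extension, decides which columns become Prometheus labels, so the view itself can stay format-agnostic.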
cur.execute("SELECT * FROM neon_perf_counters")
cur.execute("SELECT * FROM neon_backend_perf_counters")
Surely these would be all zero?
Should the getpage wait thresholds be checked against the pageserver's getpage metric buckets, to verify they are in sync?
> Surely these would be all zero?
You mean, in this test, because it hasn't done anything? No, even just connecting to Postgres requires fetching some pages for the system catalogs that are needed for authentication.
> Should the getpage wait thresholds be checked against the pageserver's getpage metric buckets, to verify they are in sync?
That would be difficult to automate. There's always some network latency, so the latencies measured from the compute are expected to be somewhat higher. But they can also be lower, if prefetching is effective; these reported wait times start the clock when we enter the smgrread() call and know that we need to read a page. If a prefetch request has been issued for the page earlier, the response might already be in flight.
Comparing the values as measured from pageserver is a very valuable thing to do manually, though. It can tell a lot about network latency and how effective the prefetching is.
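Such a manual comparison can be done with the histogram's sum and count counters. A rough sketch, with entirely made-up numbers for both sides (the pageserver-side figure would come from the pageserver's own metrics, which are not part of this view):

```python
def mean_wait_us(wait_sum_us, wait_count):
    """Average getpage wait per request, derived from the
    getpage_wait_us_sum / getpage_wait_us_count counters."""
    return wait_sum_us / wait_count if wait_count else 0.0

# Hypothetical snapshot of the compute-side counters:
compute_mean = mean_wait_us(wait_sum_us=1_250_000, wait_count=10_000)

# Hypothetical mean latency as measured on the pageserver itself:
pageserver_mean_us = 150.0

# A compute-side mean *below* the pageserver-side one suggests effective
# prefetching: some responses were already in flight when smgrread()
# started the clock. A mean well above it points at network latency.
print(f"compute: {compute_mean:.1f} us, pageserver: {pageserver_mean_us:.1f} us")
```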
A-ha, now I understand. I didn't know sql_exporter can do that. Yeah, that makes sense, I'll do that.
There is an Epic from John for that: #8926
uint64 getpage_wait_us_count;
uint64 getpage_wait_us_sum;
uint64 getpage_wait_us_bucket[NUM_GETPAGE_WAIT_BUCKETS];
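For readers less familiar with histogram counters, here is a Python mirror of what these three fields track. The bucket thresholds below are invented for illustration (the real ones are defined by the extension), and this sketch assumes Prometheus-style cumulative buckets, where each observation increments every bucket whose upper bound it falls under:

```python
# Made-up upper bounds in microseconds; the extension defines the real
# NUM_GETPAGE_WAIT_BUCKETS thresholds.
BUCKET_THRESHOLDS_US = [100, 1_000, 10_000, float("inf")]

class GetpageWaitHistogram:
    def __init__(self):
        self.count = 0                                   # getpage_wait_us_count
        self.sum_us = 0                                  # getpage_wait_us_sum
        self.buckets = [0] * len(BUCKET_THRESHOLDS_US)   # getpage_wait_us_bucket[]

    def observe(self, wait_us):
        """Record one getpage wait: bump the total count, the running
        sum, and every cumulative bucket the wait fits into."""
        self.count += 1
        self.sum_us += wait_us
        for i, upper_bound in enumerate(BUCKET_THRESHOLDS_US):
            if wait_us <= upper_bound:
                self.buckets[i] += 1

h = GetpageWaitHistogram()
for wait in (50, 500, 5_000):
    h.observe(wait)
print(h.count, h.sum_us, h.buckets)   # 3 5550 [1, 2, 3, 3]
```

From count and sum you get the mean wait; the buckets additionally give the latency distribution, which a plain counter pair cannot.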
Can we add per-shard and/or per-pageserver host labels here? I think this info could be useful and it's also requested in the mentioned Epic
Hmm, not impossible, but it would be a bit more complicated. I'm going to punt on that for now, so that we have something, and we can add it in a follow-up PR.
 * backend had to reconnect. Note that this doesn't count the first
 * connection in each backend, only reconnects.
 */
uint64 pageserver_disconnects_total;
Are there any other error reasons except disconnections? Like a WAL waiting timeout? Should it be a separate pageserver_requests_errors then? Or just replace pageserver_disconnects_total with it?
The pageserver could send an error, or a bogus response which is turned into an error in the compute. But those would result in transaction abort and the error would be logged. The correct number for those errors is zero, and we should rely on logs for them.
pgxn/neon/neon--1.4--1.5.sql (outdated)
-- Show various metrics, for each backend. Note that the values are not
-- reset when a backend exits. When a new backend starts with the backend
-- ID, it will continue accumulating the values from where the old backend
NIT: The current implementation seems ambiguous to me. I see that you reused the same shared struct for both cumulative and per-backend stats; that's why we cannot reset the counters, but is the end result handy? Procno is an internal concept that is hard to reason about.
The view exposes 'pid' as well, which is pretty convenient to use, e.g. to show stats of just your own backend
select * from neon_backend_perf_counters where pid=pg_backend_pid();
The metrics include a histogram of how long we need to wait for a GetPage request, the number of reconnects, and the number of requests, among other things. The metrics are not yet exported anywhere, but you can query them manually.
Force-pushed from 95dd419 to b80597a (compare)
Done. It now looks like this:
and that can be converted to prometheus style format with a pretty simple query:
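The query itself did not survive this scrape, so the following is only a hypothetical sketch of such a conversion, simulated here with sqlite3 against invented column names (`metric`, `bucket_le`, `value`), not the extension's real schema:

```python
import sqlite3

# Stand-in for the neon_perf_counters view: one row per counter, with a
# NULL bucket_le for plain counters. All names and values are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE neon_perf_counters (metric TEXT, bucket_le REAL, value REAL);
    INSERT INTO neon_perf_counters VALUES
        ('getpage_wait_seconds_count', NULL, 3),
        ('getpage_wait_seconds_bucket', 0.001, 1),
        ('getpage_wait_seconds_bucket', 0.01, 3);
""")

# Prometheus-style rendering: attach an {le="..."} label when the row
# carries a bucket bound, emit the bare metric name otherwise.
rows = conn.execute("""
    SELECT CASE WHEN bucket_le IS NULL THEN metric
                ELSE metric || '{le="' || bucket_le || '"}'
           END || ' ' || value
    FROM neon_perf_counters
""").fetchall()
for (line,) in rows:
    print(line)
```

With the bucket bound exposed as a column, the whole conversion is a single CASE expression rather than the complex pivot that a bucket-per-column layout would require.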
Force-pushed from b80597a to 367de54 (compare)
This is what the view looks like: