Fix inaccurate MySQL/MariaDB, Redis metrics (#12130)
yswdqz committed Apr 22, 2024
1 parent e9a2ed2 commit 86155be
Showing 15 changed files with 294 additions and 185 deletions.
4 changes: 3 additions & 1 deletion docs/en/changes/changes.md
@@ -105,7 +105,8 @@
- `memory_swap_percentage` -> `memory_virtual_memory_percentage`
* Fix/Change UI init setting for Windows Swap -> Virtual Memory
* Fix `Memory Swap Usage`/`Virtual Memory Usage` display with UI init.(Linux/Windows)
* Fix inaccurate APISIX metrics

* Fix inaccurate APISIX metrics.
* Fix inaccurate MongoDB Metrics.
* Support Apache ActiveMQ server monitoring.
* Add Service Hierarchy auto matching layer relationships (upper -> lower) as following:
@@ -114,6 +115,7 @@
* MQE query: make metadata not return `null`.
* MQE labeled metrics Binary Operation: return empty value if the labels not match rather than report error.
* Fix inaccurate Hierarchy of RabbitMQ Server monitoring metrics.
* Fix inaccurate MySQL/MariaDB, Redis, PostgreSQL metrics.

#### UI

24 changes: 12 additions & 12 deletions docs/en/setup/backend/backend-mysql-monitoring.md
@@ -17,19 +17,19 @@ SkyWalking leverages prometheus/mysqld_exporter for collecting metrics data. It
MySQL/MariaDB monitoring provides monitoring of the status and resources of the MySQL/MariaDB server. MySQL/MariaDB cluster is cataloged as a `Layer: MYSQL` `Service` in OAP.
Each MySQL/MariaDB server is cataloged as an `Instance` in OAP.
#### Supported Metrics
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----|------|-----|-----|-----|
| MySQL Uptime | day | meter_mysql_uptime | The MySQL startup time | mysqld_exporter|
| Max Connections | | meter_mysql_max_connections | The max number of connections. | mysqld_exporter|
| Innodb Buffer Pool Size | MB | meter_mysql_innodb_buffer_pool_size | The buffer pool size in Innodb engine | mysqld_exporter|
| Thread Cache Size | | meter_mysql_thread_cache_size | The size of thread cache | mysqld_exporter|
| Current QPS| | meter_mysql_qps | Queries Per Second | mysqld_exporter|
| Current TPS | | meter_mysql_tps | Transactions Per Second | mysqld_exporter|
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----|------|--------------------------------------------------------------------------------------------------------------------------------------------------|-----|-----|
| MySQL Uptime | day | meter_mysql_uptime | The MySQL startup time | mysqld_exporter|
| Max Connections | | meter_mysql_max_connections | The max number of connections. | mysqld_exporter|
| Innodb Buffer Pool Size | MB | meter_mysql_innodb_buffer_pool_size | The buffer pool size in Innodb engine | mysqld_exporter|
| Thread Cache Size | | meter_mysql_thread_cache_size | The size of thread cache | mysqld_exporter|
| Current QPS| | meter_mysql_qps | Queries Per Second | mysqld_exporter|
| Current TPS | | meter_mysql_tps | Transactions Per Second | mysqld_exporter|
| Commands Rate | | meter_mysql_commands_insert_rate <br/>meter_mysql_commands_select_rate<br />meter_mysql_commands_delete_rate<br />meter_mysql_commands_update_rate | The rate of total number of insert/select/delete/update executed by the current server | mysqld_exporter|
| Threads | | meter_mysql_threads_connected<br />meter_mysql_threads_created<br />meter_mysql_threads_cached<br />meter_mysql_threads_running | The number of currently open connections(threads_connected) <br/> The number of threads created(threads_created) <br/> The number of threads in the thread cache(threads_cached) <br/> The number of threads that are not sleeping(threads_running) | mysqld_exporter|
| Connects | | meter_mysql_connects_available<br />meter_mysql_connects_aborted | The number of available connections(connects_available)<br/>The number of MySQL instance connection rejections(connects_aborted)| mysqld_exporter|
| Connection Errors | | meter_mysql_connection_errors_internal </br> meter_mysql_connection_errors_max_connections | Errors due to exceeding the max_connections(connection_errors_max_connections) </br>Error caused by internal system(connection_errors_internal) | mysqld_exporter|
| Slow Queries Rate | | meter_mysql_slow_queries_rate | The rate of slow queries | mysqld_exporter|
| Threads | | meter_mysql_threads_connected<br />meter_mysql_threads_created<br />meter_mysql_threads_cached<br />meter_mysql_threads_running | The number of currently open connections(threads_connected) <br/> The number of threads created(threads_created) <br/> The number of threads in the thread cache(threads_cached) <br/> The number of threads that are not sleeping(threads_running) | mysqld_exporter|
| Connects | | meter_mysql_max_connections<br />meter_mysql_status_thread_connected<br />meter_mysql_connects_aborted | The number of available connections(connects_available)<br/>The number of MySQL instance connection rejections(connects_aborted)| mysqld_exporter|
| Connection Errors | | meter_mysql_connection_errors_internal </br> meter_mysql_connection_errors_max_connections | Errors due to exceeding the max_connections(connection_errors_max_connections) </br>Error caused by internal system(connection_errors_internal) | mysqld_exporter|
| Slow Queries Rate | | meter_mysql_slow_queries_rate | The rate of slow queries | mysqld_exporter|
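
The `Connects` panel now lists the raw `meter_mysql_max_connections` and `meter_mysql_status_thread_connected` metrics in place of the derived `meter_mysql_connects_available`. The rule removed in this commit computed that value as the connection limit minus the connected threads. A minimal sketch of the same arithmetic follows; the function and argument names are hypothetical, and this is not how the dashboard itself is implemented.

```python
def connects_available(max_connections: float, threads_connected: float) -> float:
    """Remaining connection slots: configured limit minus currently open connections.

    Hypothetical helper for illustration only; the inputs stand in for the values of
    meter_mysql_max_connections and meter_mysql_status_thread_connected.
    """
    return max_connections - threads_connected


# Example with made-up sample values.
remaining = connects_available(151, 12)
print(remaining)
```

Keeping the two inputs as separate metrics leaves the subtraction to whatever consumes them, so each side can be aggregated per instance before they are combined.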

### Customizations
You can customize your own metrics/expression/dashboard panel.
24 changes: 12 additions & 12 deletions docs/en/setup/backend/backend-redis-monitoring.md
@@ -17,19 +17,19 @@ SkyWalking leverages redis-exporter for collecting metrics data from Redis. It l
Redis monitoring provides monitoring of the status and resources of the Redis server. Redis cluster is cataloged as a `Layer: REDIS` `Service` in OAP.
Each Redis server is cataloged as an `Instance` in OAP.
#### Supported Metrics
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----------------------------------|--------|---------------------------------------------------------------------------------------------------|----------------------------------------------------|----------------|
| Uptime | day | meter_redis_uptime | The uptime of Redis. | redis-exporter |
| Connected Clients | | meter_redis_connected_clients | The number of connected clients. | redis-exporter |
| Blocked Clients | | meter_redis_blocked_clients | The number of blocked clients. | redis-exporter |
| Memory Max Bytes | MB | meter_redis_memory_max_bytes | The max bytes of memory. | redis-exporter |
| Hits Rate | % | meter_redis_hit_rate | Hit rate of redis when used as a cache. | redis-exporter |
| Average Time Spend By Command | second | meter_redis_average_time_spent_by_command | Average time to execute various types of commands. | redis-exporter |
| Total Commands Trend | | meter_redis_total_commands_rate | The Trend of total commands. | redis-exporter |
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|-----------------------------------|--------|--------------------------------------------------------------------------------------------------|----------------------------------------------------|----------------|
| Uptime | day | meter_redis_uptime | The uptime of Redis. | redis-exporter |
| Connected Clients | | meter_redis_connected_clients | The number of connected clients. | redis-exporter |
| Blocked Clients | | meter_redis_blocked_clients | The number of blocked clients. | redis-exporter |
| Memory Max Bytes | MB | meter_redis_memory_max_bytes | The max bytes of memory. | redis-exporter |
| Hits Rate | % | meter_redis_hit_rate | Hit rate of redis when used as a cache. | redis-exporter |
| Average Time Spend By Command | second | meter_redis_average_time_spent_by_command | Average time to execute various types of commands. | redis-exporter |
| Total Commands Trend | | meter_redis_total_commands_rate | The Trend of total commands. | redis-exporter |
| DB keys | | meter_redis_evicted_keys_total </br> meter_redis_expired_keys_total </br> meter_redis_db_keys | The number of Expired / Evicted / total keys. | redis-exporter |
| Net Input/Output Bytes | KB | meter_redis_net_input_bytes </br> meter_redis_net_output_bytes | Total bytes of input / output of redis net. | redis-exporter |
| Memory Usage | % | meter_redis_memory_usage | Percentage of used memory. | redis-exporter |
| Total Time Spend By Command Trend | | meter_redis_commands_duration_seconds_total_rate | The trend of total time spend by command | redis-exporter |
| Net Input/Output Bytes | KB | meter_redis_net_input_bytes </br> meter_redis_net_output_bytes | Total bytes of input / output of redis net. | redis-exporter |
| Memory Usage | % | meter_redis_memory_used_bytes </br> meter_redis_memory_max_bytes | Percentage of used memory. | redis-exporter |
| Total Time Spend By Command Trend | | meter_redis_commands_duration </br> meter_redis_commands_total | The trend of total time spend by command | redis-exporter |
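
The `Memory Usage` row is now backed by two byte counters instead of a single precomputed percentage. The sketch below shows how a percentage could be derived from them; the function name is hypothetical, and treating a limit of 0 as "no limit configured" is an assumption made for this example.

```python
from typing import Optional


def memory_usage_percent(used_bytes: float, max_bytes: float) -> Optional[float]:
    """Return used memory as a percentage of the limit.

    Hypothetical helper: the inputs stand in for meter_redis_memory_used_bytes and
    meter_redis_memory_max_bytes. Returns None when no positive limit is set
    (assumption), because the ratio is undefined in that case.
    """
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes * 100.0


# Example with made-up sample values.
print(memory_usage_percent(512, 2048))
print(memory_usage_percent(512, 0))
```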

### Customizations
You can customize your own metrics/expression/dashboard panel.
@@ -34,39 +34,41 @@ metricPrefix: meter_mysql
metricsRules:
# database throughput
- name: commands_insert_rate
exp: mysql_global_status_commands_total.tagEqual('command','insert').rate('PT1M')
exp: mysql_global_status_commands_total.tagEqual('command','insert').sum(['service_instance_id','host_name']).rate('PT1M')
- name: commands_select_rate
exp: mysql_global_status_commands_total.tagEqual('command','select').rate('PT1M')
exp: mysql_global_status_commands_total.tagEqual('command','select').sum(['service_instance_id','host_name']).rate('PT1M')
- name: commands_delete_rate
exp: mysql_global_status_commands_total.tagEqual('command','delete').rate('PT1M')
exp: mysql_global_status_commands_total.tagEqual('command','delete').sum(['service_instance_id','host_name']).rate('PT1M')
- name: commands_update_rate
exp: mysql_global_status_commands_total.tagEqual('command','update').rate('PT1M')
exp: mysql_global_status_commands_total.tagEqual('command','update').sum(['service_instance_id','host_name']).rate('PT1M')
- name: qps
exp: mysql_global_status_queries.rate('PT1M')
exp: mysql_global_status_queries.rate('PT1M').sum(['service_instance_id','host_name'])
- name: tps
exp: mysql_global_status_commands_total.tagMatch('command','rollback|commit').sum(['host_name']).rate('PT1M')
exp: mysql_global_status_commands_total.tagMatch('command','rollback|commit').sum(['host_name', 'service_instance_id']).rate('PT1M')

# connections
## threads
- name: threads_connected
exp: mysql_global_status_threads_connected
exp: mysql_global_status_threads_connected.sum(['service_instance_id','host_name'])
- name: threads_created
exp: mysql_global_status_threads_created
exp: mysql_global_status_threads_created.sum(['service_instance_id','host_name'])
- name: threads_running
exp: mysql_global_status_threads_running
exp: mysql_global_status_threads_running.sum(['service_instance_id','host_name'])
- name: threads_cached
exp: mysql_global_status_threads_cached
exp: mysql_global_status_threads_cached.sum(['service_instance_id','host_name'])
## connect
- name: connects_aborted
exp: mysql_global_status_aborted_connects
- name: connects_available
exp: mysql_global_variables_max_connections.sum(['host_name']) - mysql_global_status_threads_connected.sum(['host_name'])
exp: mysql_global_status_aborted_connects.sum(['service_instance_id','host_name'])
- name: max_connections
exp: mysql_global_variables_max_connections.sum(['host_name','service_instance_id'])
- name: status_thread_connected
exp: mysql_global_status_threads_connected.sum(['host_name','service_instance_id'])
- name: connection_errors_max_connections
exp: mysql_global_status_connection_errors_total.tagEqual('error','max_connection')
exp: mysql_global_status_connection_errors_total.tagEqual('error','max_connection').sum(['service_instance_id','host_name'])
- name: connection_errors_internal
exp: mysql_global_status_connection_errors_total.tagEqual('error','internal')
exp: mysql_global_status_connection_errors_total.tagEqual('error','internal').sum(['service_instance_id','host_name'])

# slow queries
- name: slow_queries_rate
exp: mysql_global_status_slow_queries.rate('PT1M')
exp: mysql_global_status_slow_queries.sum(['service_instance_id','host_name']).rate('PT1M')
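
Most of the changed rules add `.sum(['service_instance_id','host_name'])`, which aggregates samples that share those two labels into one value per instance. The following is a toy model of that grouping, assuming samples are plain (labels, value) pairs; `sum_by` and the sample data are hypothetical and this is not SkyWalking's MAL implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[Dict[str, str], float]


def sum_by(samples: Iterable[Sample], keys: List[str]) -> Dict[Tuple[str, ...], float]:
    """Sum sample values grouped by the given label keys, dropping all other labels."""
    totals: Dict[Tuple[str, ...], float] = defaultdict(float)
    for labels, value in samples:
        # Missing labels are treated as empty strings (assumption for this sketch).
        group = tuple(labels.get(key, "") for key in keys)
        totals[group] += value
    return dict(totals)


# Made-up samples: two commands on one instance, one command on another.
samples = [
    ({"service_instance_id": "db-1", "host_name": "mysql", "command": "commit"}, 40.0),
    ({"service_instance_id": "db-1", "host_name": "mysql", "command": "rollback"}, 2.0),
    ({"service_instance_id": "db-2", "host_name": "mysql", "command": "commit"}, 10.0),
]
result = sum_by(samples, ["service_instance_id", "host_name"])
print(result)
```

In this model the `command` label disappears after grouping, so each instance ends up with a single series regardless of how many extra labels the exporter attaches.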

@@ -104,12 +104,12 @@ metricsRules:

## buffers
- name: instance_buffers_checkpoint
exp: pg_stat_bgwriter_buffers_checkpoint.rate('PT1M')
exp: pg_stat_bgwriter_buffers_checkpoint_total.rate('PT1M')
- name: instance_buffers_clean
exp: pg_stat_bgwriter_buffers_clean.rate('PT1M')
exp: pg_stat_bgwriter_buffers_clean_total.rate('PT1M')
- name: instance_buffers_backend_fsync
exp: pg_stat_bgwriter_buffers_backend_fsync.rate('PT1M')
exp: pg_stat_bgwriter_buffers_backend_fsync_total.rate('PT1M')
- name: instance_buffers_alloc
exp: pg_stat_bgwriter_buffers_alloc.rate('PT1M')
exp: pg_stat_bgwriter_buffers_alloc_total.rate('PT1M')
- name: instance_buffers_backend
exp: pg_stat_bgwriter_buffers_backend.rate('PT1M')
exp: pg_stat_bgwriter_buffers_backend_total.rate('PT1M')
@@ -46,7 +46,7 @@ metricsRules:
exp: pg_stat_database_tup_returned.sum(['service_instance_id','host_name']).rate('PT1M')
## locks
- name: locks_count
exp: pg_locks_count.tag({tags -> tags.mode = tags.datname + ":" + tags.mode}).sum(['service_instance_id','host_name'])
exp: pg_locks_count.sum(['service_instance_id','host_name'])

## sessions
- name: active_sessions
@@ -68,13 +68,13 @@ metricsRules:

## checkpoint
- name: checkpoint_write_time_rate
exp: pg_stat_bgwriter_checkpoint_write_time_total.rate('PT1M')
exp: pg_stat_bgwriter_checkpoint_write_time_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: checkpoint_sync_time_rate
exp: pg_stat_bgwriter_checkpoint_sync_time_total.rate('PT1M')
exp: pg_stat_bgwriter_checkpoint_sync_time_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: checkpoint_req_rate
exp: pg_stat_bgwriter_checkpoints_req_total.rate('PT1M')
exp: pg_stat_bgwriter_checkpoints_req_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: checkpoints_timed_rate
exp: pg_stat_bgwriter_checkpoints_timed_total.rate('PT1M')
exp: pg_stat_bgwriter_checkpoints_timed_total.rate('PT1M').sum(['service_instance_id','host_name'])

## conflicts and deadlocks
- name: conflicts_rate
@@ -84,12 +84,12 @@ metricsRules:

## buffers
- name: buffers_checkpoint
exp: pg_stat_bgwriter_buffers_checkpoint.rate('PT1M')
exp: pg_stat_bgwriter_buffers_checkpoint_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: buffers_clean
exp: pg_stat_bgwriter_buffers_clean.rate('PT1M')
exp: pg_stat_bgwriter_buffers_clean_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: buffers_backend_fsync
exp: pg_stat_bgwriter_buffers_backend_fsync.rate('PT1M')
exp: pg_stat_bgwriter_buffers_backend_fsync_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: buffers_alloc
exp: pg_stat_bgwriter_buffers_alloc.rate('PT1M')
exp: pg_stat_bgwriter_buffers_alloc_total.rate('PT1M').sum(['service_instance_id','host_name'])
- name: buffers_backend
exp: pg_stat_bgwriter_buffers_backend.rate('PT1M')
exp: pg_stat_bgwriter_buffers_backend_total.rate('PT1M').sum(['service_instance_id','host_name'])