Gobblin Metrics Performance

This document explains the performance impact of using Gobblin Metrics in applications.

Generalities

These are the main resources used by Gobblin Metrics:

CPU time for updating metrics: scales with the number of metrics and the frequency of metric updates
CPU time for metric emission and lifecycle management: scales with the number of metrics and the frequency of emission
Memory for storing metrics: scales with the number of metrics and metric contexts
I/O for reporting metrics: scales with the number of metrics and the frequency of emission
External resources for metrics emission (e.g. HDFS space, Kafka queue space): scales with the number of metrics and the frequency of emission
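
The scaling relationships above can be turned into a rough estimate of reporting volume. The sketch below is only a back-of-envelope illustration: the class `EmissionEstimate`, its methods, and every input value are hypothetical, and it assumes the simplest possible model, where each emission writes one record per metric. Real reporters may batch or compress records, so treat the result as an order-of-magnitude guide rather than a measurement.

```java
/**
 * Hypothetical back-of-envelope estimate of metric reporting volume.
 * Assumes one record per metric per emission; this is a simplification,
 * not a description of how any particular Gobblin reporter behaves.
 */
final class EmissionEstimate {

    /** Records emitted per hour: metric count times emissions per hour. */
    static long recordsPerHour(long metricCount, long emissionIntervalSeconds) {
        long emissionsPerHour = 3600 / emissionIntervalSeconds;
        return metricCount * emissionsPerHour;
    }

    /** Bytes emitted per hour, given an assumed average record size. */
    static long bytesPerHour(long metricCount, long emissionIntervalSeconds, long bytesPerRecord) {
        return recordsPerHour(metricCount, emissionIntervalSeconds) * bytesPerRecord;
    }

    public static void main(String[] args) {
        long metricCount = 500;            // hypothetical number of metrics
        long emissionIntervalSeconds = 30; // hypothetical reporting interval
        long bytesPerRecord = 200;         // hypothetical serialized size

        System.out.println(recordsPerHour(metricCount, emissionIntervalSeconds) + " records/hour");
        System.out.println(bytesPerHour(metricCount, emissionIntervalSeconds, bytesPerRecord) + " bytes/hour");
    }
}
```

Under this model, halving the emission interval or doubling the number of metrics doubles the I/O and external storage used.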

Update Metrics Performance

Metric updates are the most common interaction with Gobblin Metrics in an application. An update happens every time a counter is incremented, a meter is marked, or an entry is added to a histogram or timer. As such, metric updates are the operation most likely to affect application performance.

We measured the maximum number of metric updates that can be executed per second. Throughput varies by metric type, and it also depends on the depth in the Metric Context tree at which the metric is created. Metrics in the Root Metric Context are the fastest, while metrics deeper in the tree are slower because each update must also update all ancestors. The following table shows the reference maximum QPS (updates per second) for each metric type on an i7 processor:

Metric	Root level	Depth: 1	Depth: 2	Depth: 3
Counter	76M	39M	29M	24M
Meter	11M	7M	4.5M	3.5M
Histogram	2.4M	2.4M	1.8M	1.3M
Timer	1.4M	1.4M	1M	1M
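
To make the depth effect concrete, here is a minimal sketch of a counter that forwards every increment to its ancestors. It is a toy model, not Gobblin's actual implementation: the class `TreeCounter` and its methods are hypothetical and only illustrate why one update at depth N costs roughly N + 1 counter writes. The helper `nanosPerUpdate` converts a throughput figure from the table into an approximate cost per update.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical toy counter that propagates each increment to its parent chain.
 * Illustrates the cost of depth in a context tree; not a Gobblin class.
 */
final class TreeCounter {
    private final LongAdder count = new LongAdder();
    private final TreeCounter parent; // null for a root-level counter

    TreeCounter(TreeCounter parent) {
        this.parent = parent;
    }

    /** Increments this counter and every ancestor: one write per tree level. */
    void inc() {
        for (TreeCounter node = this; node != null; node = node.parent) {
            node.count.increment();
        }
    }

    long getCount() {
        return count.sum();
    }

    /** Approximate nanoseconds per update implied by an updates-per-second figure. */
    static double nanosPerUpdate(double updatesPerSecond) {
        return 1e9 / updatesPerSecond;
    }

    public static void main(String[] args) {
        TreeCounter root = new TreeCounter(null);
        TreeCounter depth1 = new TreeCounter(root);
        TreeCounter depth2 = new TreeCounter(depth1);

        depth2.inc(); // writes to depth2, depth1, and root
        depth1.inc(); // writes to depth1 and root

        System.out.println("root=" + root.getCount()
            + " depth1=" + depth1.getCount()
            + " depth2=" + depth2.getCount());

        // Counter figures from the table: 76M/s at root level, 24M/s at depth 3.
        System.out.printf("root: %.1f ns/update, depth 3: %.1f ns/update%n",
            nanosPerUpdate(76e6), nanosPerUpdate(24e6));
    }
}
```

Read this way, the table says a root-level counter update costs about 13 ns and a depth-3 counter update about 42 ns on the reference machine. For most applications either figure is small next to the work being measured, but the difference matters in very hot loops.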
