Releases: NetApp/harvest

Harvest Nightly Release

24 Dec 05:47
367b927

Pre-release

Nightly builds may include bugs and other issues. You might want to use the stable releases instead.

25.11.0

10 Nov 13:10
5592755

25.11.0 / 2025-11-10 Release

📌 Highlights of this major release include:

⭐ New Features

  • 🏅 We've created a Harvest Model Context Protocol (MCP) server. The Harvest MCP server provides MCP clients like GitHub Copilot, Claude Desktop, and other large language models (LLMs) access to your infrastructure monitoring data collected by Harvest from ONTAP, StorageGRID, and Cisco systems.

  • 🔥 Harvest supports monitoring NetApp AFX clusters with this release. Only performance metrics with the API name KeyPerf or StatPerf in the ONTAP metrics documentation are supported on AFX systems. As a result, some panels in the dashboards may be missing information.

  • 💎 New dashboards and additional panels:

    • Harvest includes an ASAr2 dashboard with storage units and SAN initiator group panels.
    • Harvest includes a StorageGRID S3 dashboard. Thanks to @ofu48167 for raising!
    • Harvest includes a Hosts dashboard with SAN initiator groups. Thanks to @CJLvU for raising!
    • Harvest collects FlexCache metrics from FSx.
    • The StorageGRID Tenants dashboard includes tenant descriptions and bucket versioning. Thanks to @jowanw for raising!
    • The Volume dashboard includes an autosize table panel. Thanks to @roybatty2019 for raising!
    • The Network dashboard shows all ethernet port errors. Thanks to RobertWatson for raising!
    • The Datacenter dashboard includes a System Manager panel with links to ONTAP System Manager. Thanks to Ed Barron for raising!
    • The Data Protection dashboard includes a Snapshot Policy Violations panel that shows the number of snapshots outside the defined policy scope. Thanks to Lora NeyMan for raising!
    • The Volume dashboard includes panels on hot and cold data. Thanks to prime_kiwi_05259 for raising!
    • The Snapmirror Destination dashboard includes a "TopN Destination Volumes by Average Throughput" panel. Thanks to @roybatty2019 for raising!
    • The Volume dashboard includes a Snaplock panel. Thanks to @BrendonA667 for raising!
    • The MetroCluster dashboard includes IWarp and NVM mirror metrics. Thanks to @mamoep for raising!
    • The Security dashboard includes an anti-ransomware snapshots table. Thanks to @ybizeul for raising!
    • The Workload dashboard includes min IOPs and workload size in the adaptive QoS workload table. Thanks to Paqui for raising!
    • The LUN dashboard includes a LUN's block size in the LUN table. Thanks to Venumadhu for raising!
  • 🌾 harvest grafana import includes a new command-line interface option (show-datasource) to show the datasource variable dropdown in dashboards, useful for multi-datasource setups. Thanks to @RockSolidScripts for raising!

  • harvest grafana import includes a new command-line interface option (add-cluster-label) to rewrite all panel expressions to add the specified cluster label and variable. Thanks to @RockSolidScripts for raising!

  • 📕 Documentation additions:

    • Added a tutorial for how to include StorageGRID-supplied dashboards into Harvest. Thanks to @ofu48167 for raising!
    • Included ONTAP permissions required for the StatPerf collector.
    • Clarified which APIs are used to collect each metric.
    • Clarified that the StatPerf collector does not work for FSx clusters due to ONTAP limitations.
  • Harvest reports node-scoped metrics even when some nodes are down.

  • Harvest's poller includes a /health endpoint for liveness checks. Thanks to @RockSolidScripts for raising!

  • The FcpPort and NicCommon templates work with the StatPerf collector. This means the Network dashboard works with AFX and ASAr2 clusters.
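
A liveness probe against the poller's /health endpoint could look roughly like the sketch below. It assumes that a live poller answers with HTTP 200 and ignores the response body, since the notes above only state that the endpoint exists. The function name `poller_is_alive` and the injectable `fetch` parameter are illustrative, not part of Harvest.

```python
import urllib.error
import urllib.request


def poller_is_alive(base_url, fetch=urllib.request.urlopen, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumption: a 200 status means "alive". The response body is ignored
    on purpose because its format is not described in the release notes.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `poller_is_alive("http://localhost:12990")` would probe a poller on that address; the host and port here are placeholders for whatever your poller listens on.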

Announcements

‼️ IMPORTANT We've made changes to how volume performance metrics are collected. These changes are automatic and require no action from you unless you've customized Harvest's default volume.yaml templates. Continue reading for more details on the reasons behind this change and how to accommodate it.

By default, Harvest will now use the KeyPerf collector for volume performance metrics. This better aligns with ONTAP's recommendations and what System Manager shows.

The default.yaml files for ZapiPerf and RestPerf now include a KeyPerf: prefix for the volume template (e.g., KeyPerf:volume.yaml). This instructs Harvest to use the KeyPerf collector for volumes. More details are available at: #3900
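
To make the prefix notation concrete, here is a small sketch of how an entry such as KeyPerf:volume.yaml can be read: the part before the colon names the collector, the part after it names the template file. This only illustrates the notation and is not Harvest's own parsing code; `split_template_ref` and the unprefixed `lun.yaml` entry are hypothetical.

```python
def split_template_ref(ref, default_collector):
    """Split 'Collector:file.yaml' into (collector, template_file).

    An entry without a prefix stays with the collector whose default.yaml
    it appears in (passed here as default_collector).
    """
    collector, sep, template = ref.partition(":")
    if not sep:
        return default_collector, ref
    return collector, template


# An entry from RestPerf's default.yaml that is redirected to KeyPerf.
print(split_template_ref("KeyPerf:volume.yaml", "RestPerf"))
# A hypothetical unprefixed entry stays with RestPerf.
print(split_template_ref("lun.yaml", "RestPerf"))
```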

‼️ IMPORTANT If you are using Docker Compose and want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrading, don't forget to re-import your dashboards to get all the new enhancements and fixes. You can import them via the bin/harvest grafana import CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3. For NAbox4, this step is not needed.

Known Issues

  • #3941 disabled the restperf/volume_node.yaml and zapiperf/volume_node.yaml templates because ONTAP provided incomplete metrics for them. The node_vol prefixed metrics are not used in any Harvest dashboard. If you still need these metrics, you can re-enable the templates in their corresponding default.yaml. See #3900 for details.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards for this release:

@BrendonA667, @CJLvU, @Falcon667, @RockSolidScripts, @jowanw, @mamoep, @ofu48167, @roybatty2019, @ybizeul

🌱 This release includes 34 features, 20 bug fixes, 17 documentation, 1 testing, 8 refactoring, 21 miscellaneous, and 14 ci pull requests.

🚀 Features

  • Allow Partial Aggregation For Node Scoped Objects (#3811)
  • Grafana Import Should Include Option To Show Datasource Var (#3830)
  • Adding Description And Versioning Enable In Tenant Dashboard (#3833)
  • Adding Volume Autosize Details In Volume Dashboard (#3851)
  • Network Dashboard Ethernet Port Errors Should Show All Errors (#3852)
  • Datacenter Dashboard Should Include Links To System Manager (#3853)
  • Monitor Snapshot Policy Compliance (#3857)
  • Display Hot/Cold Data Of Volumes (#3858)
  • Adding Storage-Units Rest Call In Asar2 (#3867)
  • Adding Availability-Zones Rest Call In Asar2 (#3870)
  • Harvest Should Load Asar2 Templates When Monitoring Asar2 Clusters (#3871)
  • Add Health Endpoint To Harvest Poller (#3879)
  • Adding Total Throughput Panel Of Destination Volume Of Sm-Sv (#3880)
  • Collect Volume Snaplock Information (#3883)
  • Adding Nvm_mirror Zapiperf Object And Removed Read_ops And Ops From Iwarp (#3884)
  • Honour Volume Filter In Top Client/File In Volume (#3888)
  • Harvest Mcp (#3895)
  • Asar2 Storage Unit Dashboard (#3898)
  • Use Keyperf Collector For Volume Performance Metrics (#3909)
  • Include Node Model In Aggregate Dashboard (#3929)
  • Arw Snapshot Template With Private Cli (#3933)
  • Adding Static Counter File For Keyperf Asar2 Folder (#3934)
  • Adding Min Iops And Workload Size In Adaptive Qos (#3937)
  • Storagegrid S3 Dashboard (#3940)
  • Disable Volumenode Metrics (#3941)
  • Adding Unique Type Field In Metroclustercheck (#3948)
  • Add Mcp Tool Details (#3950)
  • Cluster-Label Flag Adds New Cluster Var/Label And Update All Panels (#3955)
  • Add Plugins For Statperf Collector (#3969)
  • Root Volume Enable/Disable Handled In Template (#3975)
  • Include Tiering_minimum_cooling_days In Volume Template (#3977)
  • Adding Block_size In Lun Perf (#3982)
  • Add Nic And Fcp Port Support In Statperf (#3989)
  • Hosts Dashboard (#3994)

🐛 Bug Fixes

  • Check Asup For All Pollers In Docker-Ci (#3836)
  • Don't Fail Poller Startup When Zapi Is Disabled (#3839)
  • Handle Ha Alerts For Non Ha Nodes (#3881)
  • Clus...
25.08.1

18 Aug 14:10
8fb6c9c

📌 This release is the same as version 25.08.0, with a fix for an issue where the ONTAP REST collector fails to start if ZAPIs are disabled on the cluster.

Upgrade Recommendation: Only upgrade if you are monitoring clusters with ZAPIs disabled. If ZAPIs are enabled, you can continue using 25.08.0.

Full Changelog: v25.08.0...v25.08.1

25.08.0

13 Aug 13:07
4867219

25.08.0 / 2025-08-13 Release

📌 Highlights of this major release include:

⭐ New Features

  • StatPerf Collector

    • This collector is designed for environments where the ZapiPerf, RestPerf, or KeyPerf collectors cannot be used. It uses the well-known ONTAP statistics CLI command to gather performance statistics.
  • 💎 Three new dashboards:

    • Multi-admin verification (MAV) Dashboard provides a real-time overview of Multi-Admin Verification requests, tracking their status, approvals, and pending actions for enhanced security and operational visibility.
    • FPolicy dashboard for monitoring FPolicy performance metrics at the policy, SVM, and server levels.
    • ONTAP:Switch dashboard that provides details about switches connected to ONTAP.
  • Cisco switch dashboard updates: 💯 Thanks to @roybatty2019 for raising this issue and providing valuable guidance and examples.

    • Individual fan speeds are now displayed separately from zone speeds.
    • LLDP and CDP parsing have been refined with consistent field naming and improved data handling.
    • New traffic monitoring metrics have been added.
  • ⭐ Enhancements:

    • Quota and FSA dashboards now support filtering by volume tags.
    • Added a Junction Path variable in the Volume dashboard.
    • Added bucket quotas in the StorageGRID Tenant dashboard.
    • Added "Volume" and "Idle Timeout" columns to the CIFS sessions table in the SMB Dashboard.
    • Added Used% in the bucket table within the Tenant dashboard.
  • 📕 Documentation additions

    • Navigate to your local Grafana dashboards from the metrics documentation by linking to your Grafana instance.
    • Added documentation for Cisco Switch and StorageGRID metrics.

Announcements

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3. For NAbox4, this step is not needed.

Known Issues

💡 IMPORTANT The FSx ZapiPerf workload collector fails to collect metrics. Please use RestPerf instead.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards for this release:

@BrendonA667, @Falcon667, @T1r0l, @anguswilliams, @datamuc, @jowanw, @mamoep, @mhbeh, @mishraavinash88, @roybatty2019

🌱 This release includes 18 features, 19 bug fixes, 10 documentation, 4 refactoring, 1 miscellaneous, and 9 ci pull requests.

🚀 Features

  • Include Shelf Power Usage In "Average Power/Used Tb" And "Avera… (#3705)
  • Add Array Support To Statperf Collector (#3706)
  • Doc Includes Usage Detail Of Given Metric In Dashboards (#3710)
  • Add Filtering Support For Statperf (#3713)
  • Adding Junction Path Var In Volume Dashboard (#3716)
  • Add Storagegrid Bucket Quota (#3720)
  • Adding Volume Column In Smb Dashboard In Cifs Sessions Table (#3721)
  • Added Used% In Bucket Table In Tenant Dashboard (#3729)
  • Adding Legend In Dashboard Panels (#3734)
  • Adding Idle Duration In Cifs Session Table (#3744)
  • Mav Request Dashboard (#3746)
  • Disable Statperf Templates (#3753)
  • Add Volume Tags Support For Quota And Fsa Dashboard (#3769)
  • Fpolicy Dashboard (#3777)
  • Tags As Labels For Volume (#3786)
  • Enhance Fan Metrics And Parsing For Cisco Switches (#3790)
  • Add Tagmapper Plugin For Volume Labels (#3796)
  • Replace Ping With Go Code To Reduce External Dependencies (#3801)

🐛 Bug Fixes

  • Update Instance Key In Snapshot Policy (#3701)
  • Storagegrid Overview Dashboard - "S3 Api Requests" Panel Should … (#3726)
  • "Svm Cifs Connections And Open Files" Panel Should Include Svm I… (#3728)
  • Fsa Time Formatting (#3733)
  • Storagegrid Panel Should Include Units (#3737)
  • Statperf Collector Should Retry With Smaller Batch Size When Ont… (#3748)
  • Handle Sorting Of The Port Labels In Ifgroup (#3750)
  • After Failover/Giveback Volume Dashboard Data Query Failed (#3758)
  • Only Show Shelves Which Are Local In Mcc Cluster Case (#3759)
  • Publish Ontap Array As A Gauge Instead Of Histogram (#3760)
  • Cisco Dashboard Switch Details Should Only Show Columns Once (#3761)
  • Support Alerts Should Publish Recent Data (#3766)
  • Handle Outage Parsing In Disk Rest Template (#3768)
  • Reduce Reason Label In Metric Metadata_component_status (#3776)
  • Duplicate Rows In The Target Systems Panel Of The Metadata Dashboard (#3789)
  • Remove Index From Fsa (#3792)
  • Add Sort For Deterministic Order In Test (#3794)
  • Aggregate Dashboard Should Show % Inactive (#3803)
  • Ciscorest Collector In Default List Causes Ontap Collectors To Fail (#3805)

📕 Documentation

  • Add Storagegrid Port Information (#3699)
  • Add Instance_add Documentation For Endpoints (#3715)
  • Add Cisco And Sg Metric Documentation (#3764)
  • Add Details For Node Total Data (#3767)
  • Add Mav Note For Statperf Collector (#3782)
  • Add Flexcache Support For Rest/Statperf Collector (#3793)
  • Add Quota Dashboard Information (#3809)
  • Clarify That Restperf Is Upgraded To Keyperf For Asa R2 Clusters (#3810)
  • Clarify That Statperf Is Needed For Asa R2 Clusters (#3815)
  • Update Metric Docs (#3820)

Refactoring

  • Reuse Parsecounter Method For Generate Metric Doc (#3709)
  • Remove Unused Templates For Statperf Collector (#3784)
  • Simplify Parsing Cisco Fan Metrics (#3797)
  • Catch Nil Dereferences (#3807)

Miscellaneous

  • Merge Release/25.05.1 (#3698)

🔨 CI

  • Remove Unused Flags (#3712)
  • Add Statperf Integration Tests (#3714)
  • Remove Cr.netapp.io And Jfrog (#3749)
  • Bump Go (#3754)
  • Remove Smoke Test From Fips And Rpm (#3756)
  • Disable Nolintlint Due To False Positives (#3775)
  • Stop Pollers For Rpm And Containers After Tests (#3787)
  • Bump Go (#3808)
  • Update Integration Dependency (#3818) (#3819)

25.05.1

09 Jun 16:16
e0142c3

25.05.1 / 2025-06-09 Release

📌 This release is identical to 25.05.0 apart from the changes listed below. If you are using the Cisco collector, we recommend upgrading to version 25.05.1 to reduce cardinality issues caused by storing a switch's uptime as a label instead of a metric value.
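
The cardinality problem can be illustrated with a short sketch. In a Prometheus-style data model, a time series is identified by its metric name plus its label set, so a label whose value changes on every scrape creates a new series each time. The metric names, label names, and numbers below are made up for illustration and are not the ones Harvest exports.

```python
def series_count(samples):
    """Count distinct time series: one per unique (name, label set)."""
    return len({(name, frozenset(labels.items())) for name, labels, _ in samples})


# Three scrapes of one switch, taken 60 seconds apart (made-up numbers).
uptimes = [1000, 1060, 1120]

# Uptime stored as a label: every scrape changes the label set.
as_label = [("switch_labels", {"switch": "sw1", "uptime": str(u)}, 1.0) for u in uptimes]

# Uptime stored as the metric value: the label set never changes.
as_value = [("switch_uptime", {"switch": "sw1"}, float(u)) for u in uptimes]

print(series_count(as_label))  # 3
print(series_count(as_value))  # 1
```

With uptime as a label, the number of series grows with every scrape; with uptime as the value, one series per switch is enough.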

This release also includes:

  1. Introduced a new ONTAP: Switch dashboard that provides detailed information about switches connected to ONTAP.
  2. Enhanced functionality to parse the Cisco version when the RCF is missing.
  3. Updated to Golang 1.23.4, which includes several security vulnerability fixes (CVEs).
  4. MetroCluster internal SVMs and volumes are no longer exported when they are offline.

Full Changelog: v25.05.0...v25.05.1

25.05.0

19 May 13:42
6fee835

25.05.0 / 2025-05-19 Release

📌 Highlights of this major release include:

⭐ New Features

  • Cisco Switch collector:

    • Harvest collects metrics from all supported MetroCluster Cisco switches. More details here.
    • Harvest collects environmental, ethernet, optics, interface, link layer discovery protocol (LLDP), Cisco discovery protocol (CDP), and version related details.
    • Harvest includes a new Cisco switch dashboard. Thanks to @BrendonA667, Mamoep, and Eric Brüning for reporting and providing valuable feedback on this feature.
  • Harvest includes a new performance collector named KeyPerf, designed to gather performance counters from ONTAP objects that include a statistics field in their REST responses. More details here.

  • Harvest supports auditing volume operations such as create, delete, and modify via ONTAP CLI or REST commands, tracked through the ONTAP: AuditLog dashboard. Thanks to @mvilam79 for reporting. More details here.

  • Harvest supports filtering for the RestPerf collector. See Filter for more detail.

  • Harvest collects vscan server pool active connections. Thanks to @BrendonA667 for reporting.

  • Harvest collects uptime in lif perf templates and shows it in the SVM dashboard. Thanks to @Pengng88 for reporting.

  • Harvest collects volume footprint metrics and displays them through the Volume dashboard. Thanks to @robert Brown for reporting.

  • Harvest includes a beta template to collect ethernet switch ports. Thanks to @robert Watson for reporting!

  • ⭐ Several of the existing dashboards include new panels in this release:

    • The Disk dashboard updates CP panels Disk Utilization panel.
    • The Node dashboard includes the Node column in the Node Detail panel.
    • The Quota dashboard includes a Space Used panel. Thanks to @razaahmed for reporting.
    • The Aggregate dashboard includes a Growth Rate panel. Thanks to @preston Nguyen for reporting.
    • The Volume dashboard includes a Growth Rate panel. Thanks to @preston Nguyen for reporting.
    • The Volume dashboard includes volume footprint metrics in the FabricPool panel. Thanks to @rbrown for reporting.

Announcements

‼️ IMPORTANT If you are using Docker Compose and want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3. For NAbox4, this step is not needed.

Known Issues

💡 IMPORTANT The FSx ZapiPerf workload collector fails to collect metrics. Please use RestPerf instead.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards for this release:

@WayneShen2, @mvilam79, @RobBW, @robert Watson, @roller, @Pengng88, @gaur-piyush, @chris Gautcher, @BrendonA667, @razaahmed, @nicolai-hornung-bl, @preston Nguyen, @robert Brown, @jay-law

🌱 This release includes 28 features, 28 bug fixes, 13 documentation, 17 refactoring, 16 miscellaneous, and 11 ci pull requests.

🚀 Features

  • Disable qtree perf metrics for KeyPerf collector (#3488)
  • Volume Audit log (#3479)
  • Handled duplicate instance issue in clustersoftware plugin (#3486)
  • Split cp panels in disk dashboard (#3496)
  • Adding uptime in lif perf templates (#3507)
  • Harvest EMS Events label plugin (#3511)
  • Filter support for RestPerf Collector (#3514)
  • Adding vscan server pool rest template and plugin changes (#3519)
  • Synthesize a timestamp when it is missing from KeyPerf responses (#3544)
  • Node dashboard should include the Node column in the Node detai… (#3553)
  • Adding format for promql in cluster dashboard (#3538)
  • Harvest should monitor Cisco 3K and 9K switches (#3559)
  • Adding space used time series panel in quota dashboard (#3561)
  • Cisco collector should collect optics metrics (#3575)
  • Private CLI perf collector StatPerf (#3566)
  • Cisco collector should collect optics metrics for transceivers … (#3580)
  • Add growth rate panel for Aggregate (#3582)
  • Use timestamp provided by CLI in statperf (#3585)
  • Add crc error for switch interface (#3590)
  • Dedup statperf against other perf collectors (#3592)
  • Harvest should collect volume footprint metrics (#3598)
  • Harvest should collect ethernet switch ports (#3601)
  • Adding cisco switch dashboard (#3574)
  • Add growth rate for volume and aggregate (#3610)
  • Update Cisco dashboard units and comment (#3613)
  • Add Volume footprint metrics to Volume Dashboard (#3624)
  • Include checksums with release artifacts (#3628)
  • Cisco collector should collect CDP and LLDP metrics (#3638)

🐛 Bug Fixes

  • Handled empty node name in clustersoftware plugin (#3460)
  • Duplicate timeseries in volume dashboard (#3483)
  • Update title of number of snapmirror transfers (#3485)
  • Network dashboard link speed units should be Megabits per second (#3491)
  • Workload and workload_volume templates should invoke the instance task before the data task (#3498)
  • Handled empty scanner and export false case for vscan (#3502)
  • KeyPerf Collector Volume stats are incorrect for flexgroup (#3520)
  • EMS cache handling (#3524)
  • IWARP read and write IOPS for ZAPI should be expressed as rate (#3550)
  • Aligning Harvest Dashboard node metrics with ONTAP CLI Data (#3549)
  • Handle system:node deprecate metrics in ZapiPerf (#3554)
  • Update namespace counters (#3558)
  • StorageGrid Collector handles global_prefix inconsistently (#3565)
  • grafana import should add labels to all panel expressions when… (#3567)
  • Cisco environment plugin should trim watts (#3572)
  • Handle string parsing for switch templates (#3578)
  • yaml parsing should handle key/values with spaces, colons, quotes (#3581)
  • Handle array element for optic metrics (#3589)
  • Filter label for ems destination is missing (#3596)
  • Harvest should collect ethernet switch ports when timestamp is m… (#3603)
  • Handle histogram skips in exporter (#3606)
  • Handled nil aggr instance in aggr plugin (#3607)
  • Handle HA and volume move alerts (#3611)
  • Poller Union2 should handle prom_port (#3614)
  • Handle empty values in template (#3626)
  • Improve Cisco RCF parsing (#3629)
  • Grafana import should refuse to redirect (#3632)
  • Handle empty values in template (#3627)
  • Vscanpool plugin should only ask for fields it uses (#3639)
  • Handle uname in qtree zapi plugin (#3641)

📕 Documentation

  • Add changelog discussion link (#3495)
  • Handled plugin custom prefix name for metrics (#3493)
  • Asar2 support ([#3535](#353...
25.02.0

13 Feb 14:22
c86d525

25.02.0 / 2025-02-13 Release

📌 Highlights of this major release include:

⭐ New Features

  • ⭐ The Volume dashboard was updated to clarify that volume latencies are missing some latencies from NAS protocols. Use the workload volume metrics in the QoS row for a more detailed breakdown. Thanks to MatthiasS for reporting.

  • All Harvest dashboards default to Datacenter=All instead of the first datacenter in the list. Thanks to @roybatty2019 for reporting.

  • Harvest provides a FIPS 140-3 compliant container image, available as a separate image at ghcr.io/netapp/harvest:25.02.0-1-fips.

  • 🌾 The bin/harvest grafana import CLI:

    • Supports nested Grafana folders. Thanks to @IvanZenger for reporting.
    • Supports setting variables' default values during import. See #3384 for details. Thanks to @mamoep for reporting.
  • Harvest collects shelf firmware versions and shows them in the Shelf dashboard, Module row. Thanks to @summertony15 for reporting.

  • ⭐ Several of the existing dashboards include new panels in this release:

    • The Disk dashboard includes a Top Disk and Tape Drives Throughput by Host Adapter panel. Thanks to Amir for reporting.
    • The Datacenter and Data Protection dashboards were updated with data protection buckets and policy rows.
  • The volume templates exclude transient volumes by default. Thanks to Yann for reporting.

  • Harvest collects rewind context (rwctx) metrics for ONTAP 9.16.0 and later. Thanks to @shawnahall71 for reporting.

  • 📕 Documentation additions

🚀 Performance Improvements

  • RestPerf collector uses less memory by streaming results.

In case you missed the previous 24.11.1 dot release, here are the features included in it:

Performance Improvements in 24.11.1

  • Significant memory footprint improvements for the REST collector. More details here. Thanks to @ryan for reporting it.
  • Reduced memory footprint by using streaming in the REST collector.

New Features in 24.11.1

  • Harvest supports Top files metrics collection. More details here.
  • Volume and Cluster tags are supported via Volume and Cluster dashboards.
  • Field Replaceable Unit (FRU) details have been added to the power dashboard.
  • Track ONTAP image update progress for a cluster via the Cluster dashboard. Thanks to @knappmi for reporting it.
  • prom_port is now supported within the poller. More details here.
  • We've fixed an intermittent latency/operations spike issue in the plugin-generated Harvest performance metrics. Thanks to @wooyoungAhn for reporting it.

Announcements

‼️ IMPORTANT Harvest version 25.02.0 disables the out-of-the-box Qtree templates because of reported ONTAP slowdowns when collecting a large number of qtree objects. If you want to enable the Qtree templates, please see these instructions.

‼️ IMPORTANT Harvest version 25.02.0 removes the WorkloadDetail and WorkloadDetailVolume templates and all dashboard panels that use them. These templates are removed because they are expensive to collect and currently there is no way to collect them from ONTAP without introducing an unacceptable amount of skew in the results. See #3423 for details.

‼️ IMPORTANT If you are using Docker Compose and want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3. For NAbox4, this step is not needed.

Known Issues

💡 IMPORTANT The FSx ZapiPerf workload collector fails to collect metrics. Please use RestPerf instead.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards for this release:

@Falcon667, @IvanZenger, @cheese1, @embusalacchi, @mamoep, @roybatty2019, @summertony15, AdiZ, Amir, MatthiasS, Yann, ttlexceeded

🌱 This release includes 18 features, 22 bug fixes, 8 documentation, 1 performance, 1 refactoring, 6 miscellaneous, and 12 ci pull requests.

🚀 Features

  • Hide Transient Volumes (#3337)
  • Adding Ifgrp Api To Fetch Ifgrp Labels In Net_port (#3342)
  • Adding Rwctx Template For Restperf In 9.16.0 (#3349)
  • Add Disk And Tape Drives Throughput By Host Adapter (#3372)
  • Adding The Bucket And Policy Rest Template (#3374)
  • Include Lun And Namespace Templates In Keyperf (#3379)
  • Dashboard Variables Set Default Value On Import (#3399)
  • Optimize Workload_detail And Workload_detail_volume Through Delay Center Filter (#3406)
  • Handled Empty Admin Svm In Plugin Call (#3410)
  • Support Metric For Module Type In Frus (#3411)
  • Harvest Grafana Import Should Support Nested Grafana Folders (#3417)
  • Remove Workload Detail Templates (#3433)
  • Add Streaming To Keyperf Collector (#3435)
  • Negative Metrics Spike Handling (#3439)
  • Disabled Qtree Perf Template And Update Docs (#3445)
  • Harvest Dashboards Should Default To Datacenter=All (#3448)
  • Update Qos Row In Volume Dashboard (#3453)
  • Improve Cp Summary In Disk Dashboard (#3456)

🐛 Bug Fixes

  • Include Instances Generated By Inbuilt Plugins In Plugininstances Log (#3343)
  • Handled Duplicate Key In Securityauditdestination (#3348)
  • Harvest Permissions Should Include Fru (#3354)
  • No Instances Handling In Rest Collector (#3358)
  • Rest No Instance Handling (#3360)
  • Don't Clear Performance Volume Cache When There Is An Error (#3361)
  • Failed To Find Scanner Instance In Cache Zapiperf (#3366)
  • Installation Broken On Debian 11 Bullseye (#3368)
  • Changed Var Label To Ne Null From Empty (#3385)
  • Update Snapshot Policy Endpoint (#3391)
  • Update Export Rule Endpoint (#3392)
  • Upgrade Golang.org/X/Net Due To Dependabot Alert (#3395)
  • Update Export Rule Endpoint (#3396)
  • Remove Redundant Label From Node Template (#3404)
  • Enable Request/Response Logging For Restperf (#3408)
  • Disable Cache To Avoid Cache Poisoning Attack (#3409)
  • Duplicate Time Series In Volume Dashboard (#3418)
  • Update Tr Link In Security Dashboard (#3419)
  • Typo (#3425)
  • Handle Only Labels In Zapi Snapshotpolicy (#3444)
  • "Top Ethernet Ports By Utilization %" Panel Legend Should Not In… (#3451)
  • Handle Cp Labels In Dashboard (#3455)

📕 Documentation

  • Fix Release Announcements (#3330)
  • Keyperf Documentation (#3345)
  • Updating Doc For Custom.yaml (#3352)
  • Rest Endpoint Permissions (#3359)
  • Add Go Binary Steps For Credential Script (#3381)
  • Fix Alignment Of Template (#3421)
  • Document Podman Quadlet As A Deployment Option (#3442)
  • Add Description About Cp In Disk Dashboard (#3454)

⚡ Performance

  • Restperf Should St...
24.11.1

25 Nov 14:50
bb4113e

24.11.1 / 2024-11-25 Release

📌 Highlights of this major release include:

🚀 Performance Improvements

  • Significant memory footprint improvements for the REST collector. More details here. Thanks to Ryan for reporting it.
  • Reduced memory footprint by using streaming in the REST collector.

⭐ New Features

  • Harvest supports Top files metrics collection. More details here.
  • Volume and Cluster tags are supported via Volume and Cluster dashboards.
  • Field Replaceable Unit (FRU) details have been added to the power dashboard.
  • Track ONTAP image update progress for a cluster via the Cluster dashboard. Thanks to @knappmi for reporting it.
  • prom_port is now supported within the poller. More details here.
  • We've fixed an intermittent latency/operations spike issue in the plugin-generated Harvest performance metrics. Thanks to @wooyoungAhn for reporting it.

Announcements

‼️ IMPORTANT NetApp moved their communities from Slack to Discord. Please join us there!

‼️ IMPORTANT If you are using Docker Compose and want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the bin/harvest grafana import CLI, from the Grafana UI, or from the Maintenance > Reset Harvest Dashboards button in NAbox3. For NAbox4, this step is not needed.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards for this release:

@70tas, @BrendonA667, @Falcon667, @mark Jordan, @paqui, Ryan, @CashnMoney, @ceojinhak, @ekolove, @knappmi, @wooyoungAhn

🌱 This release includes 14 features, 8 bug fixes, 2 documentation, 3 performance, 1 testing, 1 styling, 7 refactoring, 2 miscellaneous, and 3 ci pull requests.

🚀 Features

  • Add Tags To The Volume And Cluster Dashboards (#3273)
  • Harvest Should Request Cluster Version Once (#3274)
  • Top Files Collection (#3279)
  • Enable Iface And Recvcheck Linters (#3280)
  • Harvest Should Support Per-Poller Prom_ports (#3281)
  • Harvest Should Log Number Of Renderedbytes For Each Collector (#3282)
  • Asa R2 Should Use Keyperf Instead Of Restperf (#3289)
  • Add Top Files Panels In Volume Dashboard (#3292)
  • Adding The Ems Doc Link In The Health Dashboard Table (#3295)
  • Add Dimm Panels In Power Dashboard (#3296)
  • Adding Is_space_enforcement_logical, Is_space_reporting_logical… (#3301)
  • Harvest Should Monitor Wafl.dir.size.warning (#3304)
  • Add Flexcache Keyperf Template (#3309)
  • Add Top Metrics Plugin To Keyperf (#3315)

🐛 Bug Fixes

  • Set Dashboard Variable To Refresh To Time Range Change. (#3269)
  • Correct The Mtu Unit In Network Dashboard (#3278)
  • Zapi Collection (#3285)
  • Metroclustercheck Collector Should Report Standby When Metroclus… (#3287)
  • Missing Volumes After Vol Move (#3312)
  • Metroclustercheck Collector Should Report "No Instances" (#3314)
  • Panic If No Volumes Have Analytics Enabled (#3323)
  • Partial Aggregation Handling In Plugins (#3324)

📕 Documentation

  • Update Top Clients Doc (#3311)
  • Harvest Should Include Network Port Ifgrp Permissions (#3318)

⚡ Performance

  • Reduce The Memory Footprint Of Rest Collector (#3303)
  • Add Streaming To Rest Collector (#3305)
  • Improve Memory And Cpu Performance Of Rest Collector (#3310)

🔧 Testing

  • Sort Exporters For Deterministic Tests (#3290)

Styling

Refactoring

  • Remove Extra Log (#3257)
  • Remove Env Logging (#3277)
  • Simplify Negotiateontapapi (#3288)
  • Keyperf Node Template Should Match Restperf Object Name (#3298)
  • Remove Uses Of Nolint:gocritic (#3299)
  • Remove Unused Method In Rest Collector (#3308)
  • Sync Template Names For Keyperf (#3316)

Miscellaneous

  • Update All Dependencies (#3275)
  • Update Chizkiyahu/Delete-Untagged-Ghcr-Action Action To V5 (#3300)

🔨 CI

  • Bump Go (#3270)
  • Lint Errors (#3276)
  • Ignore Volume_top_files_ Counters (#3293)

24.11.0

06 Nov 14:41

24.11.0 / 2024-11-06 Release

📌 Highlights of this major release include:

  • 💎 New dashboards:

    • SnapMirror Destinations Dashboard which displays relationship details from the destination perspective.
    • Vscan Dashboard which shows SVM-level and connection scanner details.
  • ⭐ Several of the existing dashboards include new panels in this release:

    • SnapMirror dashboard now includes relationship details from the source perspective and has been renamed to "ONTAP: SnapMirror Sources".
    • Health Dashboard's emergency events panel now includes all emergency EMS events from the last 24 hours.
    • Network Dashboard
      • Includes Link Aggregation Group (LAG) metrics
      • Adds Ethernet port details
    • S3 Object Storage dashboard includes panels for SVM-level S3 metrics.
    • Tenant Dashboard
      • Adds Tenant/Bucket Capacity Growth Chart
      • Includes average size per object details for each bucket
    • Metadata Dashboard includes a panel displaying the number of instances collected.
    • Power Dashboard includes a new "Average Power Consumption (kWh) Over Last Hour" panel.
    • SVM Dashboard now features panels for logical space and physical space at the SVM level.
    • Volume Deep Dive dashboard includes "Other IOPs" panel.
  • 🚀 Performance Improvements:

    • Reduced memory footprint by optimizing memory allocations when serving metrics.
    • Reduced API calls when using the RestPerf collector.
  • Harvest supports Top clients metrics collection. More details.

  • Harvest supports recording and replaying HTTP requests.

  • Harvest now provides a FIPS-compliant container image, available as a separate image (ghcr.io/netapp/harvest:24.08.0-1-fips).

  • Grafana import allows rewriting the cluster label during import.

Announcements

‼️ IMPORTANT NetApp moved its communities from Slack to Discord. Please join us there!

‼️ IMPORTANT If using Docker Compose and you want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:

🌱 This release includes 36 features, 24 bug fixes, 7 documentation, 7 performance, 1 testing, 3 styling, 5 refactoring, 9 miscellaneous, and 15 ci pull requests.

🚀 Features

  • Tenant Dashboard Buckets Panel Should Include (#3101)
  • Use Docker Buildx Secret For Token (#3108)
  • Enable Pprof Endpoints On Localhost (#3110)
  • Generate Fips Compliant Container Image For Harvest (#3113)
  • Support Ifgroup Level Throughput Metrics (#3117)
  • Harvest Should Include A Vscan Dashboard (#3121)
  • Vscan Dashboard Should Include Topk (#3127)
  • Top Clients Metrics Collection (#3132)
  • Adding Panels For Ontaps3svm Object (#3134)
  • Grafana Import Should Allow Rewriting Cluster Label (#3135)
  • Replace Zerolog With Slog (#3146)
  • Harvest Should Include Time-Series Panels For Tenants And Buckets (#3147)
  • Send The Harvest Version To Ontap (#3152)
  • Replace Zerolog With Slog (#3164)
  • Add Documentation For Plugin-Generated Metrics And Enable Ci (#3169)
  • Add Instances Collected Panel To Metadata Dashboard (#3178)
  • Harvest Should Use Slogs Text Format By Default (#3179)
  • Add "Average Power Consumption (Kwh) Over Last Hour" Panel To Power Dashboard (#3180)
  • Replacing connector webhook with MS workflow (#3183)
  • Handle Url Limit In Rest (#3186)
  • Keyperf Collector Templates (#3194)
  • Harvest Rest And Restperf Collectors Should Support Batching (#3195)
  • Add Top Svm By Space In Svm Dashboard (#3200)
  • All Harvest Dashboards Should Include Tags (#3202)
  • Support Destination/Source Level View - Parity With Sm (#3204)
  • Add Other Ops Panel In Volume Deep Dive Dashboard (#3209)
  • Add Nfs Templates For Keyperf Collector (#3215)
  • Adding Snapmirror Sources dashboard - 1 (#3216)
  • Keyperf Collector Templates (#3219)
  • Adding Ethernet Port Table From Netport Template (#3221)
  • Fail Ci When There Are Errors In Prometheus Or Grafana (#3232)
  • Log Cluster Name And Version With Poller Metadata (#3234)
  • Harvest Should Support Recording And Replaying Http Requests (#3235)
  • Add Emergency Events To Health Dashboard (#3238)
  • Add Keyperf Metric Docs (#3240)
  • Improve Harvest Memory Logging (#3244)
  • Doctor should handle embedded exporters (#3258)

🐛 Bug Fixes

  • Handled Non Exported Qtrees In Template (#3105)
  • Handled Nameservices In Svm Zapi Plugin (#3124)
  • Fix Disk Count In Disk Dashboard (#3126)
  • Handled Quota Index Key In Rest Template With Tests (#3128)
  • Vscan Panels Throws 422 Error (#3133)
  • Correcting The Alert Rule Expression For Required Labels (#3143)
  • Svm Dashboard - Volume Capacity Row Ordering (#3158)
  • Fsa History Data Should Work When Multi Select (#3159)
  • Do Not Log Stdout When A Credential Script Fails (#3163)
  • Remove '*' As 'All' Option In Workload Dropdown On Workload Dashboard (#3165)
  • Bin/Harvest Rest Should Read Credentials Before Fetching Data (#3166)
  • Remove Embedded Shelf Power From Total Power In Series Panel To Match Stats Panel (#3167)
  • Volume_aggr_labels Should Not Include Uuid Label (#3171)
  • Add Embedded Shelf Type For Power Calculation (#3174)
  • Using Instancename Instead Of Volname In Fabricpool Perf (#3175)
  • Correct Failed State In Workflow (#3190)
  • Handled Flexgroup Based On Volume Config Call (#3199)
  • Filter By Svm, Volume In Sm Destination Dashboard (#3220)
  • Remove _Labels From Metric Docs (#3222)
  • Update Datacenter And Cluster Variables In Dashboards (#3227)
  • Don't Double Export Aggregate Efficiency Metrics (#3230)
  • Update Keyperf Collector Static Counter File Path (#3241)
  • Fix Numbering In Quickstart (#3249)
  • Fix Value Mapping In Tenant Dashboard (#3253)
  • Rename volume latency in keyperf (#3261)

📕 Documentation

  • Fix Typo In Docs (#3112)
  • Clarify Ipv6 Support (#3119)
  • Topclients Plugin Document (#3151)
  • Add More Credential Script Troubleshooting Steps (#3154)
  • Remove Qos Service Lat...

24.08.0

12 Aug 13:19
0cd7265

24.08.0 / 2024-08-12 Release

📌 Highlights of this major release include:

  • 💎 Harvest dashboards now include links to other relevant dashboards. This makes it easier to navigate relationships between cluster objects.

  • ⭐ Several of the existing dashboards include new panels in this release:

    • The Security dashboard shows SSL certificate expiration dates and warns if certificates are expiring soon. Prometheus alerts are created for expired certificates and certificates that will expire within the next month. Thanks to @timstiller for the suggestion.
    • The Volume and Aggregate dashboards include new panels showing inactive data trends. Thanks to @razaahmed for the suggestion.
    • The Workload dashboard includes panels showing the QoS percentage utilization at the policy level for shared QoS policies. Thanks to Rusty Brown for the suggestion.
    • The Datacenter dashboard includes the number of Qtrees, Quotas, and Workloads in the Object Count panel.
    • The Aggregate dashboard now includes topk timeseries.
    • The Metadata dashboard now includes a stats panel showing the number of failed collectors. Thanks to @mamoep for the suggestion.
    • The Metadata dashboard Pollers table includes the resident set size of each poller process.
    • The StorageGRID Tenant dashboard now includes an "average size per object" column in the Tenant Quota panel. Thanks to @ofu48167 for the contribution.
  • 🌾 Quotas and Qtrees templates are separated into individual templates instead of being combined as in earlier versions of Harvest.

  • The ChangeLog plugin monitors metric value changes in addition to label changes. Thanks to @pilot7777 for the suggestion.

  • Harvest collects quotas even when there are no qtrees. Thanks to @qrm1982 for reporting.

  • The StorageGRID collector supports single sign-on via a credential script auth token. Thanks to @santosh725 for suggesting.

  • Harvest supports OAuth 2.0 ONTAP collectors via a credential script auth token.

  • Harvest handles lun and namespace metrics with simple names.

  • Harvest collects virtual_used and virtual_used_percent metrics from volumes via REST on ONTAP versions 9.14.1+.

  • Prometheus metrics retention has been increased to one year in the Docker Compose workflow.

  • Harvest creates resolution metrics for health alerts. Thanks to @faguayot for suggesting.

  • Pollers report their status via the poller_status metric in both native and container environments.

  • Grafana import allows you to specify a custom all value when importing. Thanks to ChrisGautcher for the suggestion.

  • Harvest includes remediation steps for EMS active sync events in the EMS alert runbook. Thanks to @Nikhita-13 for the contribution.

  • bin/harvest doctor reports when exporters are missing.

  • Harvest allows exporting metrics without a prefix. This can be handy when collecting from a StorageGRID Prometheus instance. See the storagegrid_metrics.yaml template for an example. Thanks to @Bhagyasri-Dolly for suggesting.

  • 📕 Documentation Additions:

    • Harvest includes a new "Getting Started" tutorial. Thanks to MichelePardini for the suggestion.

Announcements

‼️ IMPORTANT Harvest removed the Service Center row from the Workload dashboard and disabled collection of qos_detail_service_time_latency metrics. The metrics can be reenabled by setting with_service_latency: true in the WorkloadDetailVolume template file. See #3015 for details.
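
A minimal sketch of that setting, assuming with_service_latency is a top-level key in the template and that the rest of the file stays unchanged. The file name in the comment is also an assumption; see #3015 for the authoritative details.

```yaml
# WorkloadDetailVolume template (e.g. workload_detail_volume.yaml; file name assumed)
name: WorkloadDetailVolume
# ... existing template keys stay as they are ...
with_service_latency: true   # re-enables qos_detail_service_time_latency collection
```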

‼️ IMPORTANT If using Docker Compose and you want to keep your historical Prometheus data, please read how to migrate your Prometheus volume.

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the 'bin/harvest grafana import' CLI, from the Grafana UI, or from the 'Maintenance > Reset Harvest Dashboards' button in NAbox3.

Thanks to all the awesome contributors

🤘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards this release:

🌱 This release includes 40 features, 28 bug fixes, 13 documentation, 1 performance, 2 testing, 5 refactoring, 12 miscellaneous, and 11 ci pull requests.

🚀 Features

  • Prometheus Should Retain Data For Up To One Year (#2919)
  • Log Jitter During Best-Fit Template Loading (#2920)
  • Add Failed Collectors Stats In Metadata Dashboard (#2929)
  • Linking Dashboard Part-1 (#2931)
  • Pollers Should Collect And Export Their Status And Memory (#2944)
  • Include Rss In Poller Table Of Metadata Dashboard (#2948)
  • Grafana Import Should Allow You To Specify A Custom All Value (#2953)
  • Harvest Should Include Remediation Steps For Ems Active Sync Ev… (#2963)
  • Linking Dashboards Part-2 (#2968)
  • Support For Qos Percentage Utilization At Policy Level For Shared Qos Policies (#2972)
  • Linking Dashboards Part-3 (#2976)
  • Create Resolution Metrics For Health Alerts (#2977)
  • Add Qtree,Quota,Workload Counts To Datacenter Dashboard (#2978)
  • Harvest Should Track Poller Maxrss In Auto-Support (#2982)
  • Add Topk To Aggregate Dashboard Timeseries Panels (#2987)
  • Harvest Should Handle Lun And Namespace Metrics With Simple Names (#2998)
  • Harvest Should Log Rss And Maxrss Every Hour (#2999)
  • Implementing Certificate Expiry Detail In Security Dashboard (#3000)
  • Remove Topk Vars From Storagegrid Dashboards (#3002)
  • Add Inactive Data Metrics For Aggregate And Volume (#3003)
  • Harvest Should Remove Service Center Metrics (#3019)
  • Adding Quotas Detail In Asup (#3020)
  • Harvest Should Allow Exporting Metrics Without A Prefix (#3022)
  • Remove Service_time_latency Counter From Tests (#3027)
  • Harvest Should Collect Virtual_used And Virtual_used_percent (#3031)
  • Harvest Should Log Template Loading Errors (#3036)
  • Enable Changelog Plugin To Monitor Metric Value Change (#3041)
  • --Debug Cli Argument Should Enable Debug Logging (#3043)
  • Harvest Should Support Storagegrid Credentials Script With Auth… (#3048)
  • Harvest Doctor Should Report When Exporters Are Missing (#3049)
  • Update Qtree Template Doc - Collect Quotas When No Qtrees (#3056)
  • Handled User/Group Quota In Historicallabels (#3060)
  • Support Oauth2.0 Via Credential Script - Phase1 (#3066)
  • Harvest Should Not Simultaneously Publish Quota Metrics From Qt… (#3067)
  • Split Qtree/Quota Rest Templates (#3068)
  • Adding Generated Instances/Metrics Count In Health Plugin Log (#3074)
  • Health Dashboard Should Indicate When There Are No Events (#3077)
  • Keyperfmetrics Collector Infrastructure (#3078)
  • Adding Ut For Qtree Non Exported Case (#3085)
  • Tenant Dashboard Should Include An Average Size Per Object Co… (#3091)

🐛 Bug Fixes

  • Zapi Rest Parity (#2934)
  • Rest Templates Should Not Have Hyphen (#2943)
  • Restore The Svm, Qtree, User, And Group Columns To The Quota Das… (#2950)
  • Harvest Should Log Errors When Grafana Import Fails (#2962)
  • Correct Details Folder Name While Import (#2966)
  • Handling Min-Max In Gradient (#2969)
  • Use Read/Write Data Due To Missing Historical Data In Dashboards (#2979)
  • Fixing Non-Exported Flexgroup Instances Error ([#2980](https://...