23.08.0

@cgrinds released this 21 Aug 13:02
874a1f8

23.08.0 / 2023-08-21 Release

📌 Highlights of this major release include:

  • Harvest Security dashboard highlights compliance using NetApp's Security hardening guide for ONTAP

  • Harvest's credential script supports ONTAP daily credential rotation. Thanks to @mamoep for raising.

  • 🎩 Harvest makes it easy to run with both the ZAPI and REST collectors at the same time. Overlapping resources are deduplicated and only exported once. Harvest will automatically upgrade ZAPI conversations to REST when ZAPIs are suspended or disabled.

  • 💎 Updated workload dashboard now includes Service Center, Latency Breakdown, and 50 panels

  • 💎 Cluster dashboard updated to work with FSx. Some panels are blank because FSx does not have that data.

  • 📣 The Harvest team published a couple of screencasts about:

  • ⭐ Several of the existing dashboards include new panels in this release:

    • Aggregate dashboard includes busy volume panels
    • SVM dashboard includes per NFS latency heatmaps. Thanks to @rbrownATnetapp for raising.
    • Volume dashboard includes top resources by other IOPs panel and junction paths. Thanks to @tsohst for raising.
  • All Harvest dashboard tables include column filters

  • Harvest dashboards use color to highlight latency and busy threshold breaches

  • Harvest's Prometheus exporter supports TLS

  • 🌾 Harvest includes new templates to collect:

    • iWARP metrics
    • FCVI metrics
    • Per volume NFS metrics
    • Volume clone metrics
    • QoS workload policy metrics
    • NVMe/TCP and NVMe/RoCE metrics
    • Flashpool metrics are included in RestPerf. Thanks to @lobster1860 for raising
  • 📕 Documentation additions

    • Move more documentation from GitHub to Harvest documentation site
    • Clarify how to tell Harvest to continue using the ZAPI protocol
    • Clarify generic vs custom plugins. Thanks to GregS for raising
    • Clarify which version of Go is required to build Harvest. Thanks to MikeK for raising
    • Clarify how to prepare ONTAP cDOT clusters for Harvest data collection
    • EMS documentation should point to Harvest documentation site. Thanks to @cwaltham for raising
    • Clarify how to gather log files on all platforms
    • Explain how to use the --labels option of bin/harvest grafana. Thanks to @slater0013 for raising
    • Describe how to run docker compose generate command without required Harvest binaries
  • The Harvest doctor command validates collector names listed in your harvest.yml file

  • An earlier version of Harvest collected cloud store information via REST. This release adds the same for ZAPI

  • When ONTAP resources are missing, Harvest tries to collect them every hour. Earlier versions of Harvest waited 24 hours before retrying, which often caused metrics to be missing after a cluster upgrade. Thanks to @Falcon667 for raising

  • Earlier versions of Harvest created world-writable auto-support files. These files are now readable and writable only by the current user. Thanks to Bunnygirl for raising

  • bin/harvest import should work with Grafana 10. Thanks to @wooyoungAhn for raising

Announcements

‼️ IMPORTANT 23.08 fixes a REST collector bug that caused partial data collection when ONTAP paginated results. See #2109 for details.
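
To make the pagination issue concrete, here is a minimal Go sketch of collecting every record when a server splits its answer across several responses. It is not Harvest's implementation: the `page` type, `fetchFunc`, `fakeFetch`, `collectAll`, and the `page-1`/`page-2` keys are all hypothetical names chosen for illustration, and the in-memory map stands in for real HTTP calls.

```go
package main

import "fmt"

// page holds the records from one response plus a pointer to the next page.
// Both the type and its fields are hypothetical, for illustration only.
type page struct {
	Records []string
	Next    string // empty when no further pages exist
}

// fetchFunc retrieves a single page; a stand-in for an HTTP GET.
type fetchFunc func(href string) (page, error)

// fakeFetch serves pages from an in-memory map so the sketch runs offline.
func fakeFetch(pages map[string]page) fetchFunc {
	return func(href string) (page, error) {
		p, ok := pages[href]
		if !ok {
			return page{}, fmt.Errorf("no such page %q", href)
		}
		return p, nil
	}
}

// collectAll follows Next links until none remain, so the result is complete
// even when the server splits its answer across several responses.
func collectAll(first string, fetch fetchFunc) ([]string, error) {
	var all []string
	for href := first; href != ""; {
		p, err := fetch(href)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Records...)
		href = p.Next
	}
	return all, nil
}

func main() {
	pages := map[string]page{
		"page-1": {Records: []string{"vol1", "vol2"}, Next: "page-2"},
		"page-2": {Records: []string{"vol3"}},
	}
	records, err := collectAll("page-1", fakeFetch(pages))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(records), records) // 3 [vol1 vol2 vol3]
}
```

A collector that stops after the first response would report only `vol1` and `vol2` here, which is the kind of partial result the announcement describes.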

‼️ IMPORTANT Release 23.08 disables the NetConnections and NFSClients templates by default. You can enable them if needed. These templates were disabled because several customers reported that these templates created millions of metrics. None of these metrics are used in Harvest dashboards.

‼️ IMPORTANT Release 23.08 changes how Harvest monitors workloads. For detailed information, please refer to the discussion #2265.

💡 The Compliance dashboard was removed after its panels were moved to the Security dashboard.

👀 The ambient temperature metric may report higher values due to issue #2259

‼️ IMPORTANT NetApp moved their communities from Slack to Discord, please join us there!

‼️ IMPORTANT If you use Docker Compose and want to keep your historical Prometheus data, please read how to migrate your Prometheus volume

💡 IMPORTANT After upgrade, don't forget to re-import your dashboards, so you get all the new enhancements and fixes. You can import them via the bin/harvest grafana import CLI, from the Grafana UI, or from the Maintenance > Reset Harvest Dashboards button in NAbox.

Known Issues

  • Some AFF A250 systems do not report power metrics. See ONTAP bug 1511476 for more details.

  • ONTAP does not include REST metrics for offbox_vscan_server and offbox_vscan until ONTAP 9.13.1. See ONTAP bug 1473892 for more details.

IMPORTANT 7-mode filers that are not on the latest release of ONTAP may experience TLS connection issues with errors like tls: server selected unsupported protocol version 301. This is caused by a change in Go 1.18, which raised the default minimum version for TLS client connections to TLS 1.2. Please upgrade your 7-mode filers (recommended) or set tls_min_version: tls10 in your harvest.yml poller section. See #1007 for more details.
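
The "301" in that error is the TLS 1.0 protocol version printed in hexadecimal (0x0301). As a rough sketch of what lowering the minimum version means for a Go client, the example below maps a tls_min_version-style string to the matching crypto/tls constant and applies it to a tls.Config. The helper name minTLSVersion and the accepted strings other than tls10 are assumptions made for illustration; this is not Harvest's code.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// minTLSVersion maps a tls_min_version-style string to a crypto/tls constant.
// The function name and every accepted value except "tls10" are assumptions.
func minTLSVersion(name string) (uint16, error) {
	switch name {
	case "tls10":
		return tls.VersionTLS10, nil
	case "tls11":
		return tls.VersionTLS11, nil
	case "tls12":
		return tls.VersionTLS12, nil
	case "tls13":
		return tls.VersionTLS13, nil
	default:
		return 0, fmt.Errorf("unknown TLS version %q", name)
	}
}

func main() {
	v, err := minTLSVersion("tls10")
	if err != nil {
		panic(err)
	}
	// Since Go 1.18 a client with MinVersion left at zero requires TLS 1.2,
	// so older servers need the minimum lowered explicitly.
	cfg := &tls.Config{MinVersion: v}
	fmt.Printf("MinVersion = %x\n", cfg.MinVersion) // MinVersion = 301
}
```

Lowering the minimum weakens transport security, which is why upgrading the 7-mode filers is the recommended option.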

Thanks to all the awesome contributors

🀘 Thanks to all the people who've opened issues, asked questions on Discord, and contributed code or dashboards
this release:

@7840vz, @DAx-cGn, @Falcon667, @Hedius, @LukaszWasko, @MrObvious, @ReneMeier, @Sawall10, @T1r0l, @XDavidT, @amd-eulee, @aticatac, @chadpruden, @cwaltham, @cygio, @ddhti, @debert-ntap, @demalik, @electrocreative, @elsgaard, @ev1963, @faguayot, @iStep2Step, @jgasher, @jmg011, @lobster1860, @mamoep, @matejzero, @matthieu-sudo, @merdos, @pilot7777, @rbrownATnetapp, @rodenj1, @slater0013, @swordfish291, @tsohst, @wooyoungAhn, Alessandro.Nuzzo, Ed Wilts, GregS, Imthenightbird, KlausHub, MeghanaD, MikeK, Paul P2, Rusty Brown, Shubham Mer, Tudor Pascu, Watson9121, jf38800, jfong, lorenzoc, rcl23, roller, scrhobbs, troysmuller, twodot0h

🌱 This release includes 42 features, 40 bug fixes, 20 documentation, 2 performance, 4 testing, 1 styling, 9 refactoring, 20 miscellaneous, and 12 CI pull requests.

🚀 Features

  • Harvest Should Collect Iwarp Counters (#2071)
  • Update Visitpanels To Be Recursive (#2085)
  • Add Table Column Filter For Dashboards (#2088)
  • Update Lagtime Based On Lasttransfersize (#2091)
  • Harvest Should Add Grafana Import Rewrite Svm Filtering For Multi-Tenant Support (#2092)
  • Fetch Cloud_store Info In Zapi Via Plugin (#2094)
  • Collection Of Other Counters For Fcvi Perf Object (#2096)
  • Add Nfs Io Types At The Volume Level (#2098)
  • Add System Defined Workload Collection (#2099)
  • Add Workload Panels In Workload Dashboard (#2100)
  • Add Volume Clone Info In Rest (#2102)
  • Added Volume Panels In Aggr Dashboard (#2104)
  • Workload Policy Iops Metrics (#2111)
  • Autoresolve Ems Would Export Metric Value As 0 And Autoresolve=True Label (#2120)
  • Support Type Label For Volume For Backward Compatibility (#2132)
  • Volume Clone Info For Zapi (#2140)
  • Harvest Should Include Numpollers And Rss In Autosupport (#2143)
  • Colors In Grafana Dashboards To Highlight Warning, Critical Severity (#2147)
  • Security Hardening Guide (#2150)
  • Harvest Prometheus Exporter Should Support Tls (#2153)
  • Latency Units Should Be In Microseconds In Harvest Dashboard (#2156)
  • Simplify Rest Auto-Upgrade (#2167)
  • When Using A Credential Script, Re-Auth On 401S (#2180)
  • Upgrade Zapi Conversations To Rest When Zapis Are Suspended Or … (#2200)
  • When Using A Credential Script, Re-Auth On 401S (#2203)
  • Merge Compliance And Security Dashboard + Added Arw Fields (#2207)
  • Supporting Topk In S3 Dashboard (#2208)
  • Aff250 Power Calculation (#2211)
  • Use Single Go Build Command To Build Harvest And Poller Binaries (#2221)
  • Harvest Should Include A User Agent (#2224)
  • Add Collector Name Validation In Doctor (#2229)
  • Harvest Should Fetch Certificates Via A Script (#2238)
  • Include Lun Offline Ems Alert (#2252)
  • Add Panel For Other Iops On Volume Dashboard (#2254)
  • Update Ambient Temperature Calculation For Power Dashboard (#2259)
  • Nvme/Tcp And Nvme/Roce Counters (#2264)
  • Harvest Svm Dashboard Should Include Latency Heatmap Panels Nfs… (#2268)
  • Added Table Description For Cluster Compliance (#2269)
  • Update Ontap Metric Document (#2270)
  • Add Cpu_firmware_release To Cluster Dashboard (#2274)
  • Enable Cluster Dashboard For Fsx (#2303)
  • Add Junction Paths In Volumes Dashboard (#2309)

🐛 Bug Fixes

  • Disk Dashboard Power On Time Should Use Seconds Unit (#2039)
  • Update Metadata Cpu Times: Breakdown To Seconds (#2055)
  • Workload Missing Label Value (#2072)
  • Fcvi Restperf Template (#2080)
  • Change Svm Panels Row Name (#2097)
  • Correct Unit In Panels With Added Testcase (#2108)
  • Rest Collector Incomplete Data If Retrieval Exceeds Return_timeout (#2110)
  • Storagegrid Should Honor -Logtofile Option (#2119)
  • Harvest Should Always Pass Addr Argument To Credentials_script (#2128)
  • Handle Difference Of Pollinstance And Polldata Records Via Exportable (#2137)
  • Cpu_busy Description In Cluster Dashboard (#2141)
  • Reduce Auto Support Log Noise When Collecting Process Info On Mac (#2145)
  • Correct The Flashpool Panel Units (#2163)
  • Handling Label Count When Matches Applied In Ems (#2165)
  • Volume Template Fix (#2171)
  • Harvest Should Retry Every Hour When Ontap Replies With An Api-R… (#2181)
  • Ciphers Query Was Giving Wrong Result In Promql (#2188)
  • S3 Dashboard Fails To Import In Grafana 8.5.15 (#2191)
  • Harvest Auto-Support Files Should Not Be World Writable (#2193)
  • Fix Key For Qtree 7Mode (#2196)
  • Check Existing Asup Dir Permission (#2197)
  • Import Dashboard Failure With Editor Role In Grafana (#2206)
  • When Using Credentials_file Make Sure Defaults Are Copied To Poller (#2209)
  • When Using Credentials_file Make Sure Defaults Are Copied To Poller (#2215)
  • Flashpool-Data Is Missing In Restperf (#2217)
  • Disable Nfs_clients.yaml Template By Default In Rest Collector (#2219)
  • Remove Duplicate Error Message (#2222)
  • Correct Svm Rest Template Based On Version (#2239)
  • Correct Shelf Metrics In 7Mode (#2245)
  • Remove Source_node Label From Snapmirror Zapi (#2255)
  • Added Version Check For Aggr-Object-Store-Get-Iter (#2258)
  • Volume Rest Template Based On Version (#2263)
  • Nfs Heatmap Per Cluster (#2273)
  • Make Poller Mandatory For Metrics Generation Cmd (#2280)
  • Handled When Metric Not Found In Plugin (#2281)
  • Disable Netconnections In Rest By Default (#2283)
  • Grafana Ask-For-Token Should Retry At Most 5 Times (#2284)
  • Match Object Name With Zapiperf For Cifs_vserver.yaml (#2288)
  • Add Bin Dir Check Before Removing Files (#2289)
  • Adding Log Forwarding Column In Compliance Table In Security Dashboard (#2306)

📕 Documentation

  • Explain Bin/Grafana Import --Labels (#2032)
  • Update Release Checklist (#2043)
  • Update Docker Compose Generation Process To Remove Binary Dependencies (#2046)
  • Add Details About Volume Sis Stat Panel (#2047)
  • Add Harvest-Metrics Release Branch Creation For Release Steps (#2050)
  • Fix Rest Template Extend Instructions Path (#2051)
  • Fsx Does Not Support Headroom Dashboard (#2131)
  • Update Fsa Dashboard Doc (#2159)
  • Move K8 Podman Document To Documentation Site (#2160)
  • Clarify How To Tell Harvest To Continue Using The Zapi Protocol (#2162)
  • Clarify Generic Vs Custom Plugins (#2166)
  • Update Docker Docs Link To Doc Site (#2186)
  • Clarify Which Version Of Go Is Required (#2214)
  • Give Authentication Precedence Its Own Section (#2226)
  • Add Note About Workload Counter In Default Templates (#2230)
  • Simplify The Preparing Ontap Cdot Cluster Documentation (#2231)
  • Fix Ems Link (#2244)
  • Update Metric Generate Step Command (#2279)
  • Move Troubleshoot Docs To Doc Site (#2287)
  • Release 23.08 Metric Docs (#2290)

⚑ Performance

  • Improve Memory And Cpu Performance Of Restperf Collector (#2053)
  • Optimize Restperf Collector Pollinstance (#2121)

🔧 Testing

  • Add Unit Test For Restperf (#2044)
  • Adding Ems Unit Tests (#2052)
  • Add Unit Test For Rest Collector (#2062)
  • Ensure Dashboard Time Is Now-3H (#2275)

Styling

  • Address All Lint Errors In Ci (#2014)

Refactoring

  • Move Unit Testing Json Parser To Common (#2064)
  • Dashboard Tests (#2090)
  • Harvest Dashboard Jsons Should Be Sorted By Key (#2152)
  • Eliminate Usages Of Time.sleep In Test Code (#2182)
  • Fix Inconsistent Pointer Receivers (#2225)
  • Reduce Asup Log Noise (#2276)
  • Increase Max Log File Size From 5Mb To 10Mb (#2277)
  • Add Cp Command In Dashboard Sort Test (#2278)
  • Code Cleanup (#2282)

Miscellaneous

  • Bump Github.com/Shirou/Gopsutil/V3 From 3.23.3 To 3.23.4 (#2027)
  • Bump Golang.org/X/Term From 0.7.0 To 0.8.0 (#2056)
  • Bump Golang.org/X/Sys From 0.7.0 To 0.8.0 (#2057)
  • Add Renovate Bot (#2075)
  • Update Module Github.com/Imdario/Mergo To V0.3.16 (#2112)
  • Update Renovate Bot (#2116)
  • Update Renovate Commit Prefix (#2117)
  • Update Module Github.com/Shirou/Gopsutil/V3 To V3.23.5 (#2122)
  • Update All Dependencies (#2139)
  • Update Module Github.com/Imdario/Mergo To V1 (#2144)
  • Upgrade Mergo Package (#2157)
  • Update Module Github.com/Shirou/Gopsutil/V3 To V3.23.6 (#2174)
  • Update All Dependencies (#2176)
  • Update Module Golang.org/X/Term To V0.10.0 (#2183)
  • Bump Go (#2205)
  • Update All Dependencies (#2243)
  • Bump Go (#2253)
  • Update All Dependencies (#2261)
  • Bump Go (#2285)
  • Update Module Github.com/Tidwall/Gjson To V1.16.0 (#2286)

🔨 CI

  • Wait For Qos_volume Counters (#2045)
  • Update Docs For Nightly Builds (#2058)
  • Add Gh-Pages Fetch Before Mkdoc Deploy (#2067)
  • Configure Renovate (#2074)
  • Renovate Should Ignore Integration (#2078)
  • Renovate Should Run On A Schedule (#2082)
  • Renovate Group All Prs (#2136)
  • Ensure Exported Prometheus Metrics Are Unique (#2173)
  • Run Renovate Once In A Week (#2185)
  • Include Harvest Certification Tool (#2241)
  • Fix Local Ci Errors (#2266)
  • Remove Apt-Get Update (#2271)