v0.64.3 stable

Release v0.64.3

May 02, 2026

Hardened the platform against fleet-wide reconnect storms and fixed a Windows agent crash loop.

Improved

  • Per-organization rate limit on agent endpoints, so a single misbehaving fleet can no longer overwhelm the API. The default budget sits well above normal MSP traffic and is tunable per environment.
  • Agent now honors the server's Retry-After header on 429 and 503 responses, so when the API tells the fleet to back off the fleet actually backs off instead of running its own retry schedule.
  • Tighter limits on agent log shipments: smaller batch sizes and a hard cap on request body size keep one chatty agent from drowning the log ingest path.
  • Slower agent and watchdog restart cadence on Linux (30s and 15s, respectively) prevents a network blip from triggering a thundering herd of reconnects across the fleet.
  • Postgres connection pool raised to 30 connections so heartbeat storms no longer cascade into 504 errors.
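
For readers curious what honoring Retry-After involves on the client side, here is a minimal Python sketch. It is not the agent's actual code: the function name `retry_after_delay` and the `MAX_DELAY_S` safety cap are assumptions made for illustration. It handles both forms the HTTP specification allows for the header, a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

# Hypothetical safety cap so a bad header cannot park a client forever.
MAX_DELAY_S = 3600.0


def retry_after_delay(
    status: int,
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
) -> Optional[float]:
    """Return seconds to wait as requested by the server, or None.

    None means "no usable Retry-After", in which case the caller would
    fall back to its own backoff schedule.
    """
    if status not in (429, 503):
        return None

    # Header names are case-insensitive.
    value = None
    for name, raw in headers.items():
        if name.lower() == "retry-after":
            value = raw.strip()
            break
    if not value:
        return None

    if value.isascii() and value.isdigit():
        # Form 1: delay in whole seconds, e.g. "120".
        delay = float(value)
    else:
        # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        current = now or datetime.now(timezone.utc)
        delay = (when - current).total_seconds()

    # Clamp: a date in the past means "retry now"; huge values are capped.
    return min(max(delay, 0.0), MAX_DELAY_S)
```

A caller would sleep for the returned delay when it is not None, and use its own retry schedule only when the server gave no guidance.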

Fixed

  • Windows user-helper Scheduled Task was crash-looping on multiple customer tenants with an auth rejection error. The helper now starts with the correct role and a regression test prevents future drift.

A focused resilience release. The biggest piece is a three-part defense against the kind of correlated reconnect storms that can take down the API when a network blip or bad config push affects a large fleet at once: per-organization rate limits, Retry-After awareness on the agent side, and slower service restart cadence. None of this is visible during normal operation; it just keeps the worst case bounded.
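
To illustrate how a per-organization limit pairs with Retry-After, here is a rough Python sketch of a token bucket kept per organization. The release notes do not say which algorithm the platform uses, so the token bucket, the class name `OrgRateLimiter`, and its parameters are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class OrgRateLimiter:
    """Illustrative per-organization token bucket (not the platform's code).

    Each organization gets its own bucket, so one noisy fleet exhausts
    only its own budget and other organizations are unaffected.
    """

    def __init__(
        self,
        rate: float,       # tokens refilled per second (hypothetical setting)
        capacity: float,   # maximum burst size (hypothetical setting)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        # org_id -> (tokens available, time of last refill)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def check(self, org_id: str) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for one request."""
        now = self._clock()
        tokens, updated = self._buckets.get(org_id, (self._capacity, now))

        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self._capacity, tokens + (now - updated) * self._rate)

        if tokens >= 1.0:
            self._buckets[org_id] = (tokens - 1.0, now)
            return True, 0.0

        # Rejected: report how long until one full token is available,
        # which a server could send back as Retry-After on a 429.
        self._buckets[org_id] = (tokens, now)
        return False, (1.0 - tokens) / self._rate
```

In this sketch a rejected request gets a wait time that the server could return in a 429 response, which is the value a Retry-After-aware client would then honor.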

The Windows fix is more user-visible: a Scheduled Task running under the standard Users group was crash-looping with an auth rejection on tenants like nexusitsys and Revenant Global. That’s resolved, with a regression test in place so it stays resolved.