What Gets Reported

This page describes what Wolverine.CritterWatch publishes from your service to the console — useful when telemetry isn't flowing as expected, or when you want to know what kind of data is leaving the process.

Publish cadence

Surface	Cadence	Notes
Telemetry batch	Every 1 second	Always publishes, even if nothing changed (heartbeat).
Heartbeat ping	Every 30 seconds	Drives the per-node liveness dot in the UI.
Agent health probe	Every 60 seconds	Active probe; catches silent agent failures.
Broker health probe	Every ~60 seconds	One probe per configured transport.
Capability snapshot	On startup, then on Wolverine reinit	Topology — handlers, endpoints, stores, tenancy.
Source code (handler / HTTP chain)	On demand	Returned when an operator opens the corresponding detail page.

The 1-second batching is the dominant source of latency; see Architecture → Message Flow for the full picture.

What's in a telemetry batch

Each batch carries:

Service identity — service name, label, Wolverine version.
Endpoint snapshot — every listener and sender with its current status (Accepting / Stopped / TooBusy / Latched / Paused / Draining), transport type, mode.
Subscription / handler catalog — every message type the service handles or publishes, with handler bindings and routing.
Recent changes — node added/removed, agent started/stopped, leadership change, circuit breaker tripped/reset, back pressure triggered/lifted, exceptions, since the last batch.
Agent health snapshot — Healthy / Degraded / Offline for each registered agent.
Persistence counts — inbox, outbox, scheduled, handled, and dead-letter counts per durability store. Per-tenant for multi-tenant services.
Shard states — current sequence and high-water mark for each projection shard.
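As a rough mental model, the batch contents listed above could be pictured as the following data structure. This is a minimal sketch in Python, not the library's wire format: every class and field name here (`TelemetryBatch`, `PersistenceCounts`, `ShardState`, and so on) is hypothetical and chosen only to mirror the list; the status values are the ones named in the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class EndpointStatus(Enum):
    # Status values as listed in the text above.
    ACCEPTING = "Accepting"
    STOPPED = "Stopped"
    TOO_BUSY = "TooBusy"
    LATCHED = "Latched"
    PAUSED = "Paused"
    DRAINING = "Draining"


class AgentHealth(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    OFFLINE = "Offline"


@dataclass
class PersistenceCounts:
    # One set of counts per durability store (and per tenant, if multi-tenant).
    inbox: int = 0
    outbox: int = 0
    scheduled: int = 0
    handled: int = 0
    dead_letter: int = 0


@dataclass
class ShardState:
    sequence: int          # current sequence of the projection shard
    high_water_mark: int   # high-water mark of the projection shard


@dataclass
class TelemetryBatch:
    # Service identity
    service_name: str
    label: str
    wolverine_version: str
    # Snapshots and catalogs; keys are hypothetical identifiers
    endpoints: dict[str, EndpointStatus] = field(default_factory=dict)
    message_types: list[str] = field(default_factory=list)
    recent_changes: list[str] = field(default_factory=list)
    agent_health: dict[str, AgentHealth] = field(default_factory=dict)
    persistence: dict[str, PersistenceCounts] = field(default_factory=dict)
    shards: dict[str, ShardState] = field(default_factory=dict)


# A batch with no recent changes is still a valid batch: it acts as a heartbeat.
batch = TelemetryBatch(service_name="orders", label="Orders", wolverine_version="0.0.0")
batch.endpoints["example-listener"] = EndpointStatus.ACCEPTING
```

Note what the structure does not contain: there is no field for message bodies, credentials, or domain data, which matches the exclusions described in the next section.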

What's not in a telemetry batch

Message bodies. Bodies are only sent on demand when an operator opens a specific dead-letter or scheduled message for inspection.
Database connection strings. Database URIs (host + database name) are reported for identification; credentials are not.
Application data. Your domain events, aggregates, and read models stay in your service's database.
Application logs / traces. CritterWatch isn't an APM. For traces, configure an OpenTelemetry trace provider in Settings → Trace Providers.
Per-endpoint and per-handler configuration trees. Reported on demand — see Lazy-fetched detail panes below. Keeping them off the heartbeat path is what lets a service with hundreds of endpoints or tenants stay under the broker / SQS payload cap.
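To illustrate the "host + database name, no credentials" rule, here is a minimal sketch of an allow-list reduction of a connection string. It is written in Python for illustration only and is not the library's implementation; the function name `describe_database`, the `host/database` output format, and the accepted key names are all assumptions.

```python
def describe_database(connection_string: str) -> str:
    """Reduce a 'key=value;key=value' connection string to 'host/database'.

    Only the host and database name are copied out (an allow-list), so
    credentials and any other settings never reach the returned value.
    This naive parser does not handle quoted values containing ';'.
    """
    parts = {}
    for segment in connection_string.split(";"):
        if "=" not in segment:
            continue  # skip empty or malformed segments
        key, _, value = segment.partition("=")
        parts[key.strip().lower()] = value.strip()

    # Key names differ between database drivers; these are common spellings.
    host = parts.get("host") or parts.get("server") or "unknown-host"
    database = (
        parts.get("database") or parts.get("initial catalog") or "unknown-database"
    )
    return f"{host}/{database}"


identifier = describe_database(
    "Host=db1;Port=5432;Database=orders;Username=app;Password=s3cret"
)
```

The design point is the allow-list: copying out only the fields you intend to report is safer than trying to strip the fields you intend to hide.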

Lazy-fetched detail panes

A handful of detail surfaces in the console are populated by a one-time round trip to the service, not by the 1-second telemetry batch:

Pane	Wire request	What it fetches
Handler chain → Source Code	`RequestHandlerSourceCode`	Generated handler source for the message type
HTTP chain → Source Code	`RequestHttpChainSourceCode`	Generated source for the HTTP chain
HTTP chain → OpenAPI	`RequestHttpChainOpenApi`	OpenAPI operation descriptor for the chain
Pipeline tab → endpoint Properties	`RequestEndpointProperties`	Endpoint `Properties` + `Children` config tree
Handler chain → Properties	`RequestMessageHandlerProperties`	Per-handler `Properties` rows for the message type

What you see: a brief spinner the first time you open one of these panes after the service starts (or after a service rollout). The response is cached on the console side keyed by service version, so re-opening the same pane is instant for the rest of the session. A rollout invalidates the cache.

If the round trip fails (service unreachable, license refused), the pane surfaces an inline error rather than retrying silently — re-open the page after the underlying cause clears to retry.
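The caching behaviour described above can be sketched as a cache keyed by service version, so that a rollout (a new version) naturally misses the cache, and a failed round trip is surfaced rather than cached or retried. This is a Python illustration under stated assumptions: `DetailPaneCache` and its `fetch` callback are hypothetical names, and the console's real cache may be structured differently.

```python
from typing import Callable


class DetailPaneCache:
    """Console-side cache for lazily fetched pane content (hypothetical)."""

    def __init__(self, fetch: Callable[[str, str], str]):
        # fetch(service, request_name) performs the round trip to the service
        # and raises if the service is unreachable or refuses the request.
        self._fetch = fetch
        self._entries: dict[tuple[str, str, str], str] = {}

    def get(self, service: str, version: str, request_name: str) -> str:
        # The service version is part of the key, so a rollout to a new
        # version never sees entries cached for the old one.
        key = (service, version, request_name)
        if key not in self._entries:
            # If fetch raises, nothing is stored and the error propagates,
            # which the UI would show as an inline error. No silent retry.
            self._entries[key] = self._fetch(service, request_name)
        return self._entries[key]

    def __len__(self) -> int:
        return len(self._entries)


calls = []


def fake_fetch(service: str, request_name: str) -> str:
    calls.append((service, request_name))
    return f"response to {request_name}"


cache = DetailPaneCache(fake_fetch)
cache.get("orders", "1.0.0", "RequestHandlerSourceCode")  # round trip (spinner)
cache.get("orders", "1.0.0", "RequestHandlerSourceCode")  # served from cache
cache.get("orders", "1.1.0", "RequestHandlerSourceCode")  # rollout: new round trip
```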

When telemetry stops flowing

If a service goes silent in the UI (the heartbeat dot turns red), the most likely causes, in order, are:

Process is gone. Crashed, killed, or shut down without graceful shutdown. The next telemetry batch never publishes. Check the host's logs.
Broker is unreachable from the service. Telemetry never reaches the transport. Check broker connectivity from the service's network namespace.
Console is unreachable, telemetry queueing. The transport buffers messages. The dashboard will show the service as silent until the console drains the backlog after reconnect.
Service hung but process alive. A deadlock or runaway GC pause stops the publish timer. The 30-second heartbeat is the leading indicator — use the Per-node detail page to confirm.
Snapshot too big for the transport. Look in the service's own logs for `ServiceUpdates payload exceeded 240 KiB after compression`. This is a defensive guard: a snapshot that large risks being silently dropped by SQS (256 KiB cap) or other size-limited transports. The most common cause is a sudden tenant-count explosion, because every tenant adds its own `ShardStateSnapshot` rows and a `PersistenceCounts` entry. If you see this message, consult Architecture → Multi-tenancy → Snapshot size.
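The size guard in the last item can be sketched as a check on the compressed payload against a limit set slightly below the transport cap. This is a Python illustration only: the 240 KiB and 256 KiB figures come from the text, while the JSON serialization, the gzip compression, and the function names are assumptions and may not match what the library actually does.

```python
import gzip
import json

PAYLOAD_GUARD_BYTES = 240 * 1024  # guard threshold quoted in the log message
SQS_CAP_BYTES = 256 * 1024        # transport cap quoted in the text


def compressed_size(snapshot: dict) -> int:
    """Size in bytes of the snapshot after serialization and compression.

    JSON and gzip are stand-ins here; the real format is not specified.
    """
    raw = json.dumps(snapshot, separators=(",", ":")).encode("utf-8")
    return len(gzip.compress(raw))


def exceeds_guard(snapshot: dict) -> bool:
    """True when the snapshot should be flagged instead of risking a drop."""
    return compressed_size(snapshot) > PAYLOAD_GUARD_BYTES


# The guard leaves headroom below the cap for envelope and header overhead.
headroom_bytes = SQS_CAP_BYTES - PAYLOAD_GUARD_BYTES
```

Checking against a threshold below the hard cap turns a silent drop at the transport into an explicit log line in the service.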

The heartbeat dot turns amber after 60 seconds without a heartbeat and red after 150 seconds, which gives you ~2.5 minutes to spot a stuck node before the UI calls it dead.
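The threshold logic behind the dot can be sketched as a simple classification on the time since the last heartbeat. The 60-second and 150-second thresholds come from the text; the function name, the "green" label for the healthy state, and the treatment of the exact boundary values are assumptions made for this Python illustration.

```python
AMBER_AFTER_SECONDS = 60   # from the text
RED_AFTER_SECONDS = 150    # from the text (2.5 minutes)


def heartbeat_dot(seconds_since_last_heartbeat: float) -> str:
    """Classify node liveness from the age of the last heartbeat."""
    if seconds_since_last_heartbeat >= RED_AFTER_SECONDS:
        return "red"
    if seconds_since_last_heartbeat >= AMBER_AFTER_SECONDS:
        return "amber"
    return "green"  # assumed name for the healthy state
```

With a 30-second heartbeat, amber at 60 seconds corresponds to two missed pings and red at 150 seconds to five.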

Capability snapshot

On startup, the service advertises its full topology to the console:

Every registered message type and its handler binding.
Every messaging endpoint with its configuration (mode, buffering limits, circuit-breaker settings).
Every Wolverine durability store (inbox/outbox database).
Every Marten event store and document store the service uses.
Multi-tenancy mode and the current tenant list (for dynamic tenancy).
The Wolverine assembly version.

The snapshot replaces the prior shape wholesale on each rollout. So if you redeploy with a model change, the new shape appears the moment the new version checks in — there's no merge or migration logic to worry about.
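The difference between wholesale replacement and merging can be shown with a small sketch. This is a Python illustration with hypothetical names (`TopologyStore`, `check_in`), not the console's actual storage code.

```python
class TopologyStore:
    """Console-side view of each service's latest capability snapshot."""

    def __init__(self) -> None:
        self._by_service: dict[str, dict] = {}

    def check_in(self, service: str, snapshot: dict) -> None:
        # Replace, never merge: anything absent from the new snapshot
        # (a removed handler, a dropped endpoint) disappears immediately.
        self._by_service[service] = dict(snapshot)

    def topology(self, service: str) -> dict:
        return self._by_service.get(service, {})


store = TopologyStore()
store.check_in("orders", {"handlers": ["PlaceOrder", "CancelOrder"]})
# Redeploy with a model change: CancelOrder was removed, ShipOrder added.
store.check_in("orders", {"handlers": ["PlaceOrder", "ShipOrder"]})
current_handlers = store.topology("orders")["handlers"]
```

Because nothing is merged, the console never shows a handler that the running version no longer has.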

The snapshot is also re-issued whenever the Wolverine runtime is reinitialized (e.g., after a hot reload during development).

Graceful shutdown

On `IAsyncDisposable.DisposeAsync()`, the observer publishes one final telemetry batch tagged as a shutdown, cancels the periodic timers, and waits for in-flight publishes to complete. This produces a clean "service stopped" timeline entry rather than a heartbeat-timeout-induced "service silent" entry.

If your service is killed without graceful shutdown (SIGKILL, OOM kill, container forced termination), the final batch is lost — the service goes silent and the heartbeat dot transitions to red after the threshold.
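The shutdown sequence described in this section (final batch tagged as shutdown, cancel the timer, wait for in-flight publishes) can be sketched with `asyncio`. This is a Python analogue for illustration only, not the .NET implementation: `TelemetryObserver`, `dispose`, and the message shape are hypothetical, and a single timer stands in for the several periodic timers the real observer runs.

```python
import asyncio


class TelemetryObserver:
    def __init__(self, publish, interval: float = 1.0):
        self._publish = publish      # async callable that sends one message
        self._interval = interval    # seconds between periodic batches
        self._timer = None
        self._in_flight = set()

    def _track(self, coro) -> None:
        # Keep a reference to every publish so dispose() can wait for it.
        task = asyncio.create_task(coro)
        self._in_flight.add(task)
        task.add_done_callback(self._in_flight.discard)

    async def _run(self) -> None:
        while True:
            await asyncio.sleep(self._interval)
            self._track(self._publish({"kind": "batch"}))

    def start(self) -> None:
        self._timer = asyncio.create_task(self._run())

    async def dispose(self) -> None:
        # 1. Publish one final batch, tagged as a shutdown.
        self._track(self._publish({"kind": "shutdown"}))
        # 2. Cancel the periodic timer so no further batches are scheduled.
        if self._timer is not None:
            self._timer.cancel()
        # 3. Wait for every in-flight publish, including the final batch.
        await asyncio.gather(*self._in_flight, return_exceptions=True)


published = []


async def fake_publish(message: dict) -> None:
    await asyncio.sleep(0)  # simulate a hop to the transport
    published.append(message["kind"])


async def scenario() -> int:
    observer = TelemetryObserver(fake_publish, interval=0.01)
    observer.start()
    await asyncio.sleep(0.05)   # let a few periodic batches go out
    await observer.dispose()
    count_at_dispose = len(published)
    await asyncio.sleep(0.05)   # nothing further should be published
    return count_at_dispose


count_at_dispose = asyncio.run(scenario())
```

A hard kill skips `dispose()` entirely, which is why the console then has only the heartbeat timeout to go on.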

Custom transports

The default integration uses RabbitMQ. The library works with any Wolverine-supported transport, but RabbitMQ is recommended for production for these reasons:

Reliable message delivery — telemetry survives a console outage.
Decouples services from console availability.
Standard tooling for observing the queue depth on the console side.

In-memory transport is fine for development and tests but loses messages if either process restarts. SQL Server transport works but has higher persistence overhead than RabbitMQ for the telemetry volume.
