Observer & Telemetry
The CritterWatchObserver class is the heart of the Wolverine.CritterWatch library. It implements IWolverineObserver to receive Wolverine runtime callbacks and IAsyncDisposable to clean up gracefully on shutdown.
Observer Pattern
Wolverine's IWolverineObserver interface defines callbacks for all significant runtime events. The observer handles these callbacks and accumulates the resulting changes in a thread-safe collection, which it publishes on a 1-second timer.
// The observer is registered as a singleton
public class CritterWatchObserver : IWolverineObserver, IAsyncDisposable
{
    // Called by the Wolverine runtime for each event
    // (signatures only; method bodies omitted)
    void NodeStarted(WolverineNode node);
    void NodeStopped(WolverineNode node);
    void AgentStarted(Uri agentUri, WolverineNode node);
    void AgentStopped(Uri agentUri, WolverineNode node);
    void LeadershipAssumed(Guid nodeId);
    void EndpointListened(Endpoint endpoint);
    void EndpointStopped(Endpoint endpoint);
    void CircuitBreakerTripped(Endpoint endpoint);
    void CircuitBreakerReset(Endpoint endpoint);
    void BackPressureTriggered(Endpoint endpoint);
    void BackPressureLifted(Endpoint endpoint);
    void ExceptionTriggered(Exception ex, Envelope envelope);
}
Batching Pipeline
Rather than publishing each event individually, the observer uses a BatchingChannel that:
- Accepts events as they occur (non-blocking)
- Accumulates them in memory
- Publishes a single ServiceUpdates batch every second
This keeps the observer's impact on handler hot paths minimal — writing to a channel is effectively a lock-free queue push.
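The accept, accumulate, and publish steps above can be sketched with System.Threading.Channels. This is an illustrative stand-in, not the library's BatchingChannel: the ChangeBatcher class, its Post method, and the publish callback are hypothetical names, and skipping empty batches is a simplification made for this sketch only.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-in for the batching step; not the library's BatchingChannel.
public sealed class ChangeBatcher<T> : IAsyncDisposable
{
    private readonly Channel<T> _channel = Channel.CreateUnbounded<T>(
        new UnboundedChannelOptions { SingleReader = true });
    private readonly CancellationTokenSource _cts = new();
    private readonly Func<IReadOnlyList<T>, Task> _publish;
    private readonly Task _loop;

    public ChangeBatcher(TimeSpan interval, Func<IReadOnlyList<T>, Task> publish)
    {
        _publish = publish;
        _loop = Task.Run(() => RunAsync(interval));
    }

    // Non-blocking: TryWrite on an unbounded channel returns immediately.
    public bool Post(T item) => _channel.Writer.TryWrite(item);

    private async Task RunAsync(TimeSpan interval)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            while (await timer.WaitForNextTickAsync(_cts.Token))
            {
                await FlushAsync();
            }
        }
        catch (OperationCanceledException)
        {
            // Shutdown requested; fall through to the final flush.
        }
        await FlushAsync();
    }

    // Drain everything queued so far and publish it as one batch.
    // Assumption of this sketch: empty batches are skipped.
    private async Task FlushAsync()
    {
        var batch = new List<T>();
        while (_channel.Reader.TryRead(out var item)) batch.Add(item);
        if (batch.Count > 0) await _publish(batch);
    }

    public async ValueTask DisposeAsync()
    {
        _channel.Writer.TryComplete();
        _cts.Cancel();
        await _loop;
        _cts.Dispose();
    }
}
```

A caller would construct it with TimeSpan.FromSeconds(1) to match the one-second cadence described above. Producers only ever touch Post, so the cost on a handler's hot path is a single queue write.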
ServiceUpdates Packet
The ServiceUpdates record contains a full snapshot of the service's current state plus any changes since the last publish:
public record ServiceUpdates(
    string ServiceName,
    string Label,
    string WolverineVersion,
    EndpointState[] Endpoints,
    MessagingSubscription[] Subscriptions,
    WolverineChange[] Changes,
    AgentHealthReport[] AgentHealth,
    PersistenceCounts PersistenceCounts,
    ShardStateSnapshot[] ShardStates
);
WolverineChange
Individual state changes are represented as WolverineChange records with a ChangeType discriminator:
public record WolverineChange(
    ChangeType Type,
    string? NodeId,
    string? AgentUri,
    string? EndpointUri,
    string? Data
);
ChangeType values include: NodeAdded, NodeRemoved, AgentStarted, AgentStopped, LeadershipChanged, EndpointAdded, EndpointStopped, CircuitBreakerTripped, CircuitBreakerReset, BackPressureTriggered, BackPressureLifted.
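As a rough illustration of how the discriminator might be used, the sketch below builds a change from a callback's arguments and then switches on its Type. The record shape is copied from above and the enum lists only the values named in the text (the real enum may have more). The ChangeFactory and ChangeDescriber helpers are hypothetical, and which optional fields each change type fills in is an assumption.

```csharp
#nullable enable
using System;

// Only the values named in the text; the real enum may contain others.
public enum ChangeType
{
    NodeAdded, NodeRemoved, AgentStarted, AgentStopped, LeadershipChanged,
    EndpointAdded, EndpointStopped, CircuitBreakerTripped, CircuitBreakerReset,
    BackPressureTriggered, BackPressureLifted
}

public record WolverineChange(
    ChangeType Type, string? NodeId, string? AgentUri, string? EndpointUri, string? Data);

// Hypothetical helpers: which fields are populated per change type is assumed.
public static class ChangeFactory
{
    public static WolverineChange AgentStarted(Uri agentUri, Guid nodeId) =>
        new(ChangeType.AgentStarted, nodeId.ToString(), agentUri.ToString(), null, null);

    public static WolverineChange CircuitBreakerTripped(Uri endpointUri) =>
        new(ChangeType.CircuitBreakerTripped, null, null, endpointUri.ToString(), null);
}

public static class ChangeDescriber
{
    // Consumers branch on the discriminator to decide which fields are relevant.
    public static string Describe(WolverineChange change) => change.Type switch
    {
        ChangeType.NodeAdded or ChangeType.NodeRemoved or ChangeType.LeadershipChanged
            => $"{change.Type} (node {change.NodeId})",
        ChangeType.AgentStarted or ChangeType.AgentStopped
            => $"{change.Type} (agent {change.AgentUri})",
        // Every remaining value in the enum above concerns an endpoint.
        _ => $"{change.Type} (endpoint {change.EndpointUri})"
    };
}
```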
Agent Health Checks
In addition to reactive observation, the observer runs a periodic health check timer (default: 60 seconds) that:
- Requests health status for all known agents via IAgentRuntime
- Collects AgentHealthReport responses
- Includes them in the next ServiceUpdates publish
Agent health is checked actively because some agent failures are silent — the agent stops producing output without generating an observable event. The health check is designed to detect these silent failures.
Agent Health States
public enum AgentHealth
{
    Healthy,
    Degraded,
    Offline
}
Health state transitions trigger alert evaluations in CritterWatch:
- Degraded after AgentUnhealthyWarningCount consecutive degraded reports → Warning alert
- Degraded after AgentUnhealthyCriticalCount consecutive degraded reports → Critical alert
- Healthy → alert auto-resolves
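The threshold rules above amount to counting consecutive degraded reports per agent. The following is a minimal sketch of that evaluation, not CritterWatch's actual implementation. AgentAlertTracker and AlertLevel are hypothetical names, the thresholds are passed in because their defaults are not stated here, and the handling of Offline (left unchanged) is an assumption since the rules above do not cover it.

```csharp
using System;
using System.Collections.Generic;

public enum AgentHealth { Healthy, Degraded, Offline }

// Hypothetical alert levels for this sketch.
public enum AlertLevel { None, Warning, Critical }

public sealed class AgentAlertTracker
{
    private readonly int _warningCount;
    private readonly int _criticalCount;
    private readonly Dictionary<Uri, int> _degradedStreaks = new();

    public AgentAlertTracker(int agentUnhealthyWarningCount, int agentUnhealthyCriticalCount)
    {
        _warningCount = agentUnhealthyWarningCount;
        _criticalCount = agentUnhealthyCriticalCount;
    }

    // Records one health report and returns the alert level that now applies.
    public AlertLevel Record(Uri agentUri, AgentHealth health)
    {
        if (health == AgentHealth.Healthy)
        {
            // A healthy report ends the streak, so any alert auto-resolves.
            _degradedStreaks.Remove(agentUri);
            return AlertLevel.None;
        }

        _degradedStreaks.TryGetValue(agentUri, out var streak);

        // Assumption: only Degraded extends the streak; Offline leaves it unchanged.
        if (health == AgentHealth.Degraded)
        {
            streak++;
            _degradedStreaks[agentUri] = streak;
        }

        if (streak >= _criticalCount) return AlertLevel.Critical;
        if (streak >= _warningCount) return AlertLevel.Warning;
        return AlertLevel.None;
    }
}
```

With thresholds of 2 and 4, for example, the second consecutive degraded report raises a Warning, the fourth escalates to Critical, and a single healthy report resets the count.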
Capabilities Discovery
On startup, the observer reports the service's complete capability set to CritterWatch:
- Message subscriptions — all message types the service handles and their routing
- Endpoints — all messaging endpoints with their configuration and mode
- Message stores — Wolverine inbox/outbox database URIs
- Event stores — Marten event store configurations (if any)
- Tenancy — multi-tenancy cardinality and (for dynamic tenancy) the tenant list
- Wolverine version — the assembly version string
This discovery happens once at startup and again whenever the Wolverine runtime is reinitialized (e.g., after a hot reload).
Back Pressure and Circuit Breaker Tracking
The observer maintains in-memory state for all endpoint statuses. When a circuit breaker trips or back pressure activates, it:
- Records the state change in the pending changes buffer
- Includes the updated endpoint state in the next ServiceUpdates publish
- Clears the state change when the circuit breaker resets or back pressure lifts
This gives CritterWatch real-time awareness of endpoint resilience states without polling.
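One way to picture the in-memory endpoint state is a concurrent map from endpoint URI to a set of status flags, updated from the trip, reset, trigger, and lift callbacks. The sketch below is only that picture. EndpointStatus and EndpointStatusTracker are hypothetical names and do not reflect the library's EndpointState type.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical flags for this sketch; not the library's EndpointState.
[Flags]
public enum EndpointStatus
{
    Normal = 0,
    CircuitBreakerTripped = 1,
    BackPressure = 2
}

public sealed class EndpointStatusTracker
{
    private readonly ConcurrentDictionary<Uri, EndpointStatus> _statuses = new();

    // Called when a circuit breaker trips or back pressure activates.
    public EndpointStatus Set(Uri endpoint, EndpointStatus flag) =>
        _statuses.AddOrUpdate(endpoint, flag, (_, current) => current | flag);

    // Called when the circuit breaker resets or back pressure lifts.
    public EndpointStatus Clear(Uri endpoint, EndpointStatus flag) =>
        _statuses.AddOrUpdate(endpoint, EndpointStatus.Normal, (_, current) => current & ~flag);

    // Read by the publisher when it assembles the next snapshot.
    public EndpointStatus Get(Uri endpoint) =>
        _statuses.TryGetValue(endpoint, out var status) ? status : EndpointStatus.Normal;
}
```

Using flags rather than a single value lets an endpoint be under back pressure and have a tripped circuit breaker at the same time, with each condition cleared independently.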
Graceful Shutdown
On DisposeAsync(), the observer:
- Publishes one final ServiceUpdates packet with the shutdown state
- Cancels the periodic timers
- Waits for any in-flight publishes to complete
- Disposes all resources
This ensures CritterWatch receives a clean shutdown signal rather than detecting service loss through heartbeat timeout.
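The shutdown ordering can be sketched as an IAsyncDisposable whose DisposeAsync follows the four steps in sequence. This is an assumption-laden illustration, not the observer's real code. ObserverShutdownSketch, its Publish method, and the string packets are hypothetical placeholders for the actual publish path and ServiceUpdates payload.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration of the shutdown ordering only.
public sealed class ObserverShutdownSketch : IAsyncDisposable
{
    private readonly CancellationTokenSource _timers = new();
    private readonly Func<string, Task> _publish;
    private Task _inFlight = Task.CompletedTask;
    private bool _disposed;

    public ObserverShutdownSketch(Func<string, Task> publish) => _publish = publish;

    // Token the periodic publish and health check loops would observe.
    public CancellationToken TimerToken => _timers.Token;

    // Tracks the most recent publish so shutdown can wait for it.
    public void Publish(string packet) => _inFlight = _publish(packet);

    public async ValueTask DisposeAsync()
    {
        if (_disposed) return;   // Make repeated disposal a no-op.
        _disposed = true;

        await _publish("shutdown");  // 1. Final packet with the shutdown state.
        _timers.Cancel();            // 2. Stop the periodic timers.
        await _inFlight;             // 3. Wait for the in-flight publish.
        _timers.Dispose();           // 4. Release remaining resources.
    }
}
```

Guarding with a disposed flag keeps a second DisposeAsync call from sending a duplicate shutdown packet.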
