Event Sourcing

CritterWatch uses Marten event sourcing to store all service state as an immutable sequence of events. This provides complete historical visibility, temporal queries, and the ability to rebuild projections as the data model evolves.

Why Event Sourcing for CritterWatch?

CritterWatch is itself a monitoring tool for event-sourced systems, so it would be inconsistent for it not to use the same patterns internally. More practically, event sourcing solves several real problems:

Complete audit trail. Every state change — service registration, node addition, agent assignment, circuit breaker trip, operator action — is a fact recorded in time. You can answer "what was the state of trip-service 30 minutes ago?" by replaying events up to that point.

Projection flexibility. As CritterWatch evolves, new projections can be built from existing events without losing historical data. The TimelineProjection was added after ServiceSummaryProjection — it processed all existing events to build the timeline retroactively.

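Alert history. Every alert transition (raised, elevated, reduced, resolved, cleared) is recorded with its timestamp and triggering metrics, so the complete lifecycle of an alert can be reconstructed. As the Alert Records section explains, this history is kept on the alert's AlertRecord document rather than in a per-alert event stream.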

Event Streams

Each monitored service gets its own Marten event stream, keyed by service name:

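csharp
// Appending to a service's event stream (the stream key is the service name)
session.Events.Append(serviceName, events);
// Events are persisted when the session is saved; Wolverine's transactional middleware can do this for you
await session.SaveChangesAsync();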

Event streams are append-only. The current state of a service is always derivable by replaying its event stream, but in practice the ServiceSummaryProjection materializes a snapshot for fast queries.
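
The replay idea can be sketched without Marten at all. The example below folds an ordered list of events into a state object, which is roughly what a projection does on your behalf. The `ServiceState` class, the `Replay.Fold` helper, and the trimmed-down event records are illustrative stand-ins, not the CritterWatch model.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for illustration only; the real events carry richer payloads.
public record ServiceRegistered(string ServiceName, string Label);
public record NodeAdded(Guid NodeId);
public record NodeRemoved(Guid NodeId);

public class ServiceState
{
    public string Name { get; set; } = "";
    public string Label { get; set; } = "";
    public HashSet<Guid> Nodes { get; } = new();
}

public static class Replay
{
    // Apply each event in append order; the result is the current state.
    public static ServiceState Fold(IEnumerable<object> events)
    {
        var state = new ServiceState();
        foreach (var e in events)
        {
            switch (e)
            {
                case ServiceRegistered registered:
                    state.Name = registered.ServiceName;
                    state.Label = registered.Label;
                    break;
                case NodeAdded added:
                    state.Nodes.Add(added.NodeId);
                    break;
                case NodeRemoved removed:
                    state.Nodes.Remove(removed.NodeId);
                    break;
            }
        }
        return state;
    }
}
```

Because the stream is never rewritten, folding the same events always yields the same state. That property is what makes a materialized snapshot safe to discard and rebuild.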

Domain Events

All events are defined in src/CritterWatch.Services/Model/Events.cs as C# records:

csharp
// Service lifecycle
public record ServiceRegistered(string ServiceName, string Label, string WolverineVersion);
public record VersionDetected(string ServiceName, string WolverineVersion);
public record LabelUpdated(string ServiceName, string NewLabel);

// Node lifecycle
public record NodeAdded(WolverineNode Node);
public record NodeRemoved(Guid NodeId);
public record LeadershipAssumed(Guid NodeId);
public record DormantNodeEjected(Guid NodeId);

// Agent lifecycle
public record AgentStarted(Uri AgentUri, Guid NodeId);
public record AgentStopped(Uri AgentUri);
public record AgentDiscovered(Uri AgentUri);
public record AgentRemoved(Uri AgentUri);
public record AgentAssigned(Uri AgentUri, Guid NodeId);
public record AssignmentsRevised(AgentAssignment[] Assignments);

// Projection events
public record ShardStatesUpdated(ShardStateSnapshot[] States);
public record SubscriptionOrProjectionRestartedEvent(string ShardName, string Reason);

// Endpoint events
public record EndpointAdded(EndpointState State);
public record EndpointHealthUpdated(EndpointHealthState[] States);

// Capability events
public record SubscriptionsAdded(MessagingSubscription[] Subscriptions);
public record EventTypeAdded(string EventTypeName);
public record MessageStoreDiscoveredEvent(string DatabaseUri, MessageStoreType StoreType);

// Tenant events
public record TenantAdded(string TenantId);
public record TenantDisabled(string TenantId);
public record TenantEnabled(string TenantId);
public record TenantRemoved(string TenantId);

// Health events
public record AgentRestrictionsChanged(string[] Restrictions);

ServiceSummaryProjection

The primary projection is a single-stream snapshot projection. It maintains a ServiceSummary document for each service — a denormalized view of everything CritterWatch knows about that service:

csharp
public class ServiceSummaryProjection : SingleStreamProjection<ServiceSummary>
{
    // The stream ID is the ServiceName
    public static ServiceSummary Create(ServiceRegistered e)
        => new ServiceSummary { Id = e.ServiceName, Label = e.Label };

    public static ServiceSummary Apply(NodeAdded e, ServiceSummary summary)
    {
        summary.Nodes[e.Node.Id] = e.Node;
        return summary;
    }

    public static ServiceSummary Apply(NodeRemoved e, ServiceSummary summary)
    {
        summary.Nodes.Remove(e.NodeId);
        return summary;
    }

    // ... 20+ more Apply methods
}

The ServiceSummary document is queryable directly from Marten without replaying the full event stream.

TimelineProjection

The TimelineProjection is a cross-stream projection that reads events from all service streams and produces TimelineEntry documents — a chronological feed of all notable events across the entire system.

This projection uses Marten's multi-stream aggregation:

csharp
public class TimelineProjection : MultiStreamProjection<TimelineEntry, Guid>
{
    public TimelineProjection()
    {
        Identity<NodeAdded>(e => Guid.NewGuid());
        Identity<AgentStarted>(e => Guid.NewGuid());
        // etc.
    }

    public TimelineEntry Create(NodeAdded e, IEvent metadata)
        => new TimelineEntry
        {
            Id = Guid.NewGuid(),
            Category = TimelineCategory.Node,
            Severity = TimelineSeverity.Info,
            Description = $"Node {e.Node.Id} added",
            Timestamp = metadata.Timestamp
        };
}
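
To make the cross-stream idea concrete, here is a minimal in-memory sketch, independent of Marten, that maps notable events from several streams to entries and orders them into one feed. The `Recorded` envelope, the `Entry` record, and `TimelineSketch` are hypothetical names for this illustration. The real TimelineEntry has more fields, such as category and severity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down shapes used only for this sketch.
public record Recorded(string Stream, DateTimeOffset Timestamp, object Data);
public record NodeAdded(Guid NodeId);
public record AgentStarted(Uri AgentUri, Guid NodeId);
public record LabelUpdated(string NewLabel);
public record Entry(DateTimeOffset Timestamp, string Service, string Description);

public static class TimelineSketch
{
    // Yield one entry per notable event; events the feed ignores yield nothing.
    static IEnumerable<Entry> ToEntries(Recorded e)
    {
        switch (e.Data)
        {
            case NodeAdded n:
                yield return new Entry(e.Timestamp, e.Stream, $"Node {n.NodeId} added");
                break;
            case AgentStarted a:
                yield return new Entry(e.Timestamp, e.Stream, $"Agent {a.AgentUri} started");
                break;
        }
    }

    // Flatten every stream into a single feed ordered by event time.
    public static List<Entry> Build(IEnumerable<IEnumerable<Recorded>> streams) =>
        streams.SelectMany(s => s)
               .SelectMany(e => ToEntries(e))
               .OrderBy(x => x.Timestamp)
               .ToList();
}
```

Each source event produces its own entry rather than updating a shared aggregate, which is why the projection shown earlier gives every entry a fresh identity.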

Alert Records

Alerts are stored as Marten snapshot documents (AlertRecord) rather than event streams. Each alert document tracks its full transition history as an embedded array. This makes alerts easy to query by status, service, and severity without the overhead of a per-alert event stream.
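
The trade-off is easier to see with a concrete shape. The sketch below is a guess at what such a document could look like, not the actual AlertRecord. The class, property, and method names are assumptions, and the status values are taken from the alert lifecycle described earlier on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes; the real AlertRecord is not shown in this document.
public enum AlertStatus { Raised, Elevated, Reduced, Resolved, Cleared }
public record AlertTransition(AlertStatus Status, DateTimeOffset At);

public class AlertRecordSketch
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public string ServiceName { get; set; } = "";
    public AlertStatus Status { get; private set; } = AlertStatus.Raised;
    public List<AlertTransition> Transitions { get; } = new();

    // Append to the embedded history and keep the top-level status in sync for cheap filtering.
    public void Transition(AlertStatus status, DateTimeOffset at)
    {
        Transitions.Add(new AlertTransition(status, at));
        Status = status;
    }
}

public static class AlertQueries
{
    // Alerts for one service that have not been resolved or cleared yet.
    public static List<AlertRecordSketch> Open(IEnumerable<AlertRecordSketch> alerts, string service) =>
        alerts.Where(a => a.ServiceName == service
                       && a.Status != AlertStatus.Resolved
                       && a.Status != AlertStatus.Cleared)
              .ToList();
}
```

Keeping the latest status as a top-level property means a filter by status or service never has to inspect the transition array, while the array still preserves the full history.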

Schema

All CritterWatch Marten tables live in the critterwatch schema:

| Table | Description |
| --- | --- |
| critterwatch.mt_streams | Event stream metadata |
| critterwatch.mt_events | All domain events |
| critterwatch.mt_doc_servicesummary | Projected service snapshots |
| critterwatch.mt_doc_timelineentry | Timeline entries |
| critterwatch.mt_doc_alertrecord | Alert documents |
| critterwatch.mt_doc_agenthealthstate | Agent health tracking |

The schema is created automatically by Marten on first startup. No manual migration steps are required.
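
For orientation, a Marten store that places its tables in a named schema and keys event streams by string could be configured roughly as follows. This is a sketch of the relevant Marten options, not CritterWatch's actual bootstrapping code, which is not shown here. The connection string is a placeholder, and the namespace that holds `StreamIdentity` may differ between Marten versions.

```csharp
using Marten;
using Marten.Events; // StreamIdentity; newer Marten versions may expose it from a different namespace

// Placeholder connection string; substitute your own.
var connectionString = "Host=localhost;Database=critterwatch;Username=postgres;Password=postgres";

var store = DocumentStore.For(opts =>
{
    opts.Connection(connectionString);

    // Put the Marten tables (documents and events) under one schema.
    opts.DatabaseSchemaName = "critterwatch";

    // Key event streams by string (here, the service name) instead of the default Guid.
    opts.Events.StreamIdentity = StreamIdentity.AsString;
});
```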

Temporal Queries

Because service state is event-sourced, you can rebuild the state of any service as of a given point in time by aggregating its stream up to a timestamp:

csharp
// What did trip-service look like 2 hours ago?
var pastState = await session.Events
    .AggregateStreamAsync<ServiceSummary>(
        "trip-service",
        timestamp: DateTimeOffset.UtcNow.AddHours(-2));

This capability powers the "event store" view in the CritterWatch UI, letting operators browse historical service states during incident post-mortems.
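
Conceptually, a timestamp-bounded aggregation is an ordinary replay that stops early. The in-memory sketch below shows the idea using a hypothetical `Stamped` envelope and a reduced event set. It assumes the stream is already in append order and treats the bound as inclusive, which is a choice made for this example rather than a statement about Marten's behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical envelope: an event body plus the time it was appended.
public record Stamped(DateTimeOffset Timestamp, object Data);
public record NodeAdded(Guid NodeId);
public record NodeRemoved(Guid NodeId);

public static class AsOfSketch
{
    // Replay only the events recorded at or before asOf and return the live node set.
    public static HashSet<Guid> NodesAsOf(IEnumerable<Stamped> stream, DateTimeOffset asOf)
    {
        var nodes = new HashSet<Guid>();
        foreach (var e in stream.Where(x => x.Timestamp <= asOf))
        {
            if (e.Data is NodeAdded added) nodes.Add(added.NodeId);
            else if (e.Data is NodeRemoved removed) nodes.Remove(removed.NodeId);
        }
        return nodes;
    }
}
```

Asking the same stream for two different timestamps gives two different answers without storing either one, which is what makes after-the-fact incident review possible.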

Released under the MIT License.