
Architecture Decision Records (ADRs)

Memory hook: "ADRs are love letters to your future self -- explaining not just what you decided, but why, so that Future You doesn't undo a carefully considered tradeoff."


ADR-001: Choose Order Fulfillment as Domain

Status: Accepted

Context

This is a learning repository for DDD + Event Sourcing. We need a domain that:

  1. Has a non-trivial state machine (multiple states, transitions, and invariants)
  2. Is universally understood (minimal domain expertise needed to learn)
  3. Benefits from event sourcing (audit trail, temporal queries, multiple projections)
  4. Has clear aggregate boundaries (Order as the central aggregate)
  5. Provides opportunities for multiple bounded contexts (ordering, inventory, payment, shipping)
  6. Has rich business rules that can be encoded in a domain model

Decision

We will use Order Fulfillment as the domain for this learning repository.

The Order aggregate lifecycle:

text
Draft --> Submitted --> InventoryReserved --> PaymentConfirmed --> Shipped --> Delivered
  |          |              |                     |
  +----------+--------------+---------------------+---> Cancelled
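
The transition rules behind this diagram can be captured as a small guard table on the aggregate. A minimal sketch -- OrderStatus and OrderStateMachine are illustrative names, not necessarily the repository's actual types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum OrderStatus
{
    Draft, Submitted, InventoryReserved, PaymentConfirmed, Shipped, Delivered, Cancelled
}

public static class OrderStateMachine
{
    // Legal transitions, mirroring the lifecycle diagram above.
    private static readonly IReadOnlyDictionary<OrderStatus, OrderStatus[]> Allowed =
        new Dictionary<OrderStatus, OrderStatus[]>
        {
            [OrderStatus.Draft]             = new[] { OrderStatus.Submitted, OrderStatus.Cancelled },
            [OrderStatus.Submitted]         = new[] { OrderStatus.InventoryReserved, OrderStatus.Cancelled },
            [OrderStatus.InventoryReserved] = new[] { OrderStatus.PaymentConfirmed, OrderStatus.Cancelled },
            [OrderStatus.PaymentConfirmed]  = new[] { OrderStatus.Shipped, OrderStatus.Cancelled },
            [OrderStatus.Shipped]           = new[] { OrderStatus.Delivered },
            [OrderStatus.Delivered]         = Array.Empty<OrderStatus>(),   // terminal
            [OrderStatus.Cancelled]         = Array.Empty<OrderStatus>(),   // terminal
        };

    public static bool CanTransitionTo(OrderStatus from, OrderStatus to)
        => Allowed.TryGetValue(from, out var targets) && targets.Contains(to);
}
```

Command handlers would consult this guard before applying an event, turning an invalid transition (e.g., cancelling a shipped order) into a domain error rather than a corrupt state.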

Consequences

Positive:

  • Everyone understands online ordering -- no domain expertise barrier
  • The state machine has 7 states and multiple transition rules -- rich enough to demonstrate invariant enforcement
  • Natural events: OrderCreated, OrderLineAdded, OrderSubmitted, OrderShipped -- intuitive for learning event sourcing
  • Multiple useful projections: order detail, order timeline, dashboard statistics
  • Clear integration event boundaries: inventory reservation, payment confirmation, shipping notification

Negative:

  • Order fulfillment is a "textbook" domain -- some developers may find it too familiar and not challenging enough
  • Real-world order fulfillment has complexity we intentionally omit (returns, partial shipments, split payments, promotions) to keep the learning scope manageable

Risks:

  • Learners may mistake this simplified model for a production-ready order system. We mitigate this with clear documentation noting simplifications.

ADR-002: PostgreSQL as Event Store

Status: Accepted

Context

An event-sourced system needs a durable, append-only store for domain events with:

  • Per-stream loading (get all events for aggregate X)
  • Optimistic concurrency (prevent concurrent writes to the same stream)
  • Global ordering (for projection catch-up)
  • JSONB support for flexible event payloads

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| PostgreSQL | Team already knows SQL; single DB for everything; JSONB support; UNIQUE constraints for concurrency | No built-in subscriptions; manual projection infrastructure |
| EventStoreDB | Purpose-built for event sourcing; built-in subscriptions and projections | Another database to operate; learning curve; smaller community |
| MongoDB | Flexible document model; familiar to some teams | Weaker transactional guarantees; no native optimistic concurrency per document array |
| DynamoDB | Serverless scaling; built-in conditional writes | Vendor lock-in; 400 KB item limit; complex query patterns |

Decision

We will use PostgreSQL as both the event store and the read model database.

The event store schema:

sql
CREATE TABLE event_store (
    id          BIGSERIAL PRIMARY KEY,
    stream_id   UUID NOT NULL,
    event_type  TEXT NOT NULL,
    event_data  JSONB NOT NULL,
    metadata    JSONB,
    version     INT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

ALTER TABLE event_store
    ADD CONSTRAINT uq_event_store_stream_version UNIQUE (stream_id, version);
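
Appending then rides on that constraint for optimistic concurrency: the writer stamps versions starting at expectedVersion + 1 and lets PostgreSQL reject the loser of a race. A hedged sketch using Dapper and Npgsql -- AppendAsync, StoredEvent, and ConcurrencyException are illustrative names, not the repository's confirmed API:

```csharp
// Append at expectedVersion + 1, + 2, ...; a concurrent writer that raced us
// violates uq_event_store_stream_version (SQLSTATE 23505) and fails cleanly.
public async Task AppendAsync(Guid streamId, long expectedVersion, IReadOnlyList<StoredEvent> events)
{
    await using var connection = new NpgsqlConnection(_connectionString);
    await connection.OpenAsync();
    await using var tx = await connection.BeginTransactionAsync();

    try
    {
        var version = expectedVersion;
        foreach (var e in events)
        {
            await connection.ExecuteAsync(
                """
                INSERT INTO event_store (stream_id, event_type, event_data, metadata, version)
                VALUES (@StreamId, @EventType, @EventData::jsonb, @Metadata::jsonb, @Version)
                """,
                new { StreamId = streamId, e.EventType, e.EventData, e.Metadata, Version = ++version },
                tx);
        }
        await tx.CommitAsync();
    }
    catch (PostgresException ex) when (ex.SqlState == PostgresErrorCodes.UniqueViolation)
    {
        // Caller reloads the stream and retries the command.
        throw new ConcurrencyException($"Stream {streamId} was modified concurrently.", ex);
    }
}
```

Note that no SELECT-then-check is needed: the constraint is the concurrency check, which is exactly the "zero application code" benefit listed below.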

Consequences

Positive:

  • One database to operate, back up, and monitor
  • SQL is universally understood -- learners can inspect events with SELECT * FROM event_store
  • UNIQUE constraint on (stream_id, version) provides optimistic concurrency with zero application code
  • JSONB stores flexible event payloads with optional indexing
  • BIGSERIAL primary key provides natural global ordering for projection catch-up
  • All read models (projections) live in the same database -- simple queries, no network hops

Negative:

  • No built-in subscription mechanism -- we must poll for new events (see ADR-008)
  • At very large scale (> 100M events), PostgreSQL may need table partitioning
  • Not purpose-built for event sourcing -- some patterns require more manual implementation than EventStoreDB

Risks:

  • Performance may degrade if event store and read models compete for the same PostgreSQL resources. Mitigated by read replicas or separate databases in production.

ADR-003: Custom Event Store vs Marten

Status: Accepted

Context

Marten is a .NET library that provides document database and event store capabilities on top of PostgreSQL. It offers:

  • Built-in event store with optimistic concurrency
  • Built-in async projections with catch-up daemon
  • Built-in snapshot support
  • LINQ-based querying
  • Active .NET community

Using Marten would significantly reduce the amount of infrastructure code we need to write. However, this is a learning repository.

Decision

We will build a custom event store using Dapper and raw SQL, rather than adopting Marten.
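
The resulting surface area stays deliberately small. A plausible three-method shape (names are illustrative; the repository's actual interface may differ):

```csharp
public interface IEventStore
{
    // Append events, failing with a concurrency error if expectedVersion
    // does not match the current head of the stream.
    Task AppendAsync(Guid streamId, long expectedVersion,
                     IReadOnlyList<object> events, CancellationToken ct = default);

    // Load one aggregate's events in version order (for reconstruction).
    Task<IReadOnlyList<object>> LoadStreamAsync(Guid streamId, CancellationToken ct = default);

    // Load events across all streams after a global position (for projection catch-up).
    Task<IReadOnlyList<object>> LoadAfterAsync(long globalPosition, CancellationToken ct = default);
}
```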

Consequences

Positive:

  • Learners understand every layer: from SQL schema to C# interface to aggregate reconstruction
  • No "magic" -- every event store operation is visible in code and SQL
  • The IEventStore interface is simple (3 methods) and easy to reason about
  • Transferable knowledge: the concepts apply regardless of which event store library/database is used
  • Demonstrates the hexagonal architecture principle: the domain does not depend on infrastructure

Negative:

  • We write ~200 lines of infrastructure code that Marten provides out of the box
  • Our projection infrastructure is simpler (polling, no parallel catch-up) than Marten's daemon
  • We miss Marten's battle-tested edge case handling (partition pruning, event archiving, multi-tenancy)
  • If learners want to build a production system, they should evaluate Marten rather than copying our custom event store

Risks:

  • Learners may copy this custom event store into production systems without adding the robustness that Marten provides. Documentation explicitly warns against this.

ADR-004: Kafka for Integration Events Only

Status: Accepted

Context

Kafka is a distributed event streaming platform. In event-sourced architectures, Kafka can serve two roles:

  1. Event bus -- distribute integration events between bounded contexts/services
  2. Event store -- serve as the source of truth for all domain events

Some architectures use Kafka for both. We need to decide which role Kafka plays in this system.

Decision

Kafka will serve only as an integration event bus. PostgreSQL remains the event store.

Integration events flow:

text
Command Handler --> Save domain events to PostgreSQL --> Publish integration event to Kafka
                                                         --> Other services consume from Kafka
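
In code, a command handler follows the same order as the diagram: persist first, notify second. A sketch assuming Confluent.Kafka -- the handler, repository, topic name, and event types are all illustrative:

```csharp
// Persist first (PostgreSQL is the source of truth), then notify via Kafka.
// If ProduceAsync fails, the domain events are already durable; an outbox or
// retry mechanism can re-publish the notification later.
public async Task Handle(SubmitOrderCommand command, CancellationToken ct)
{
    var order = await _orders.LoadAsync(command.OrderId, ct);
    order.Submit();  // raises the OrderSubmitted domain event

    await _eventStore.AppendAsync(order.Id, order.Version, order.UncommittedEvents, ct);

    var message = new Message<string, string>
    {
        Key   = order.Id.ToString(),  // keying by aggregate id keeps per-order ordering within a partition
        Value = JsonSerializer.Serialize(new OrderSubmittedIntegrationEvent(order.Id))
    };
    await _producer.ProduceAsync("orders.order-submitted", message, ct);
}
```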

Consequences

Positive:

  • Clear separation: PostgreSQL = source of truth, Kafka = notification channel
  • If Kafka is unavailable, the system still functions (domain events are persisted; notifications are delayed)
  • Kafka retention can be finite (7-30 days) since the event store is in PostgreSQL
  • Simpler operational model -- Kafka is a replaceable component, not critical infrastructure
  • Per-aggregate loading is fast (PostgreSQL indexed query, not Kafka partition scan)

Negative:

  • Two infrastructure components (PostgreSQL + Kafka) instead of potentially one
  • Integration event publishing can fail after domain event persistence (mitigated by outbox pattern)
  • Consumers of integration events experience eventual consistency

Risks:

  • Team might be tempted to use Kafka for everything ("we already have Kafka, why not use it as the event store?"). This ADR and the kafka-integration.md documentation explicitly explain why not.

See also: kafka-integration.md for the full technical analysis.


ADR-005: Redis for Caching and Snapshots

Status: Accepted

Context

The system has three caching/performance needs:

  1. Read model caching: Reduce PostgreSQL query load for frequently accessed data
  2. Aggregate snapshots: Avoid replaying entire event streams on every command
  3. Idempotency tracking: Prevent duplicate processing of integration events

We need a fast, ephemeral store for these purposes.

Decision

We will use Redis for all three purposes: read model caching (STRING with TTL), snapshot storage (STRING), and idempotency tracking (STRING with NX + EX).
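
The three usages map to three small Redis idioms. A sketch assuming StackExchange.Redis; key names and TTLs are illustrative:

```csharp
IDatabase db = connectionMultiplexer.GetDatabase();

// 1. Read model caching: STRING with TTL.
await db.StringSetAsync($"order:detail:{orderId}", orderJson, expiry: TimeSpan.FromMinutes(5));

// 2. Aggregate snapshot: STRING, overwritten on each snapshot write.
await db.StringSetAsync($"order:snapshot:{orderId}", snapshotJson);

// 3. Idempotency: SET NX + EX -- atomically claim a message id, with automatic expiry.
bool claimed = await db.StringSetAsync(
    $"idempotency:{messageId}", "1",
    expiry: TimeSpan.FromHours(24),
    when: When.NotExists);  // false => another consumer already processed this message

if (!claimed)
    return;  // duplicate delivery -- skip processing
```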

Consequences

Positive:

  • Sub-millisecond latency for cache reads and idempotency checks
  • TTL-based automatic cleanup for cache entries and idempotency keys
  • SET NX (set if not exists) provides atomic idempotency check-and-mark
  • Single Redis instance serves all three purposes
  • Redis is completely deletable -- FLUSHALL and the system still works, just slower

Negative:

  • Redis is volatile by default -- a restart loses all cached data (acceptable for our use cases)
  • Additional infrastructure component to manage
  • Cache invalidation adds complexity to projection handlers

Risks:

  • Redis could become a single point of failure for performance (not correctness). Mitigated by designing all Redis consumers to fall back gracefully (PostgreSQL for reads, full replay for snapshots, allow duplicate processing for idempotency).

See also: redis-usage.md for detailed usage patterns.


ADR-006: MediatR for CQRS

Status: Accepted

Context

CQRS separates command handling (writes) from query handling (reads). We need a mechanism to:

  1. Route commands to their handlers
  2. Route queries to their handlers
  3. Decouple the API layer from the application layer
  4. Support cross-cutting concerns (logging, validation, transaction management) via pipeline behaviors

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| MediatR | De facto standard in .NET; simple API; pipeline behaviors; large ecosystem | Indirect dispatch can obscure control flow; overhead for very simple handlers |
| Direct injection | Explicit; easy to navigate (F12 goes to implementation) | No pipeline behaviors; handler registration is manual |
| Wolverine | More features (messaging, sagas); built-in retry | Larger learning curve; more opinionated |
| Custom mediator | Full control; no external dependency | Reinventing the wheel; no community support |

Decision

We will use MediatR for command and query dispatching.

csharp
// Command: IRequest<CommandResult>
public record CreateOrderCommand(...) : IRequest<CommandResult>;

// Query: IRequest<TResponse>
public record GetOrderByIdQuery(Guid OrderId) : IRequest<OrderResponse?>;

// API controller dispatches via MediatR:
[HttpPost]
public async Task<IActionResult> Create(CreateOrderCommand command)
    => Ok(await _mediator.Send(command));
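
The pipeline behaviors mentioned above are where cross-cutting concerns plug in. A sketch assuming MediatR's IPipelineBehavior (the behavior name and logging are illustrative):

```csharp
// Logs every request flowing through the mediator without touching any handler.
public class LoggingBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
    where TRequest : notnull
{
    private readonly ILogger<LoggingBehavior<TRequest, TResponse>> _logger;

    public LoggingBehavior(ILogger<LoggingBehavior<TRequest, TResponse>> logger)
        => _logger = logger;

    public async Task<TResponse> Handle(
        TRequest request,
        RequestHandlerDelegate<TResponse> next,
        CancellationToken cancellationToken)
    {
        _logger.LogInformation("Handling {Request}", typeof(TRequest).Name);
        var response = await next();  // invoke the next behavior or the handler itself
        _logger.LogInformation("Handled {Request}", typeof(TRequest).Name);
        return response;
    }
}

// Registered once at startup, e.g.:
// services.AddMediatR(cfg => cfg.AddOpenBehavior(typeof(LoggingBehavior<,>)));
```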

Consequences

Positive:

  • Clean separation between API controllers and application logic
  • Pipeline behaviors enable cross-cutting concerns (logging, validation, error handling) without polluting handlers
  • Handlers are small, focused, and independently testable
  • Standard pattern that most .NET developers recognize

Negative:

  • Indirect dispatch makes "go to definition" harder (F12 on _mediator.Send() goes to MediatR, not the handler)
  • For very simple operations, the handler + request + response ceremony feels heavy
  • Magic registration (assembly scanning) can surprise developers when handlers are not found

Risks:

  • Over-use: not every operation needs to go through MediatR. Simple queries without business logic could be direct service calls. We accept this minor over-use for consistency.

ADR-007: Dapper over EF Core for Event Store

Status: Accepted

Context

We need a data access mechanism for:

  1. The event store (append events, load events by stream ID)
  2. Read model queries (projections, dashboard)
  3. Projection updates (upserts, increments)

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Dapper | Minimal abstraction; raw SQL visible; fast; small library | Manual mapping; no change tracking; no migrations |
| EF Core | Migrations, change tracking, LINQ queries, rich ecosystem | Heavy abstraction over simple append-only operations; change tracking unnecessary for event store; JSONB mapping complexity |
| Npgsql directly | Zero abstraction; maximum control | Verbose; manual parameter binding; error-prone |

Decision

We will use Dapper for all database access in the event store and projection infrastructure.

csharp
// Event store: clean, visible SQL
await connection.ExecuteAsync(
    """
    INSERT INTO event_store (stream_id, event_type, event_data, metadata, version, created_at)
    VALUES (@StreamId, @EventType, @EventData::jsonb, @Metadata::jsonb, @Version, @CreatedAt)
    """,
    storedEvent);
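
Reads are the same thin pattern, with one wrinkle worth a comment: Dapper does not map snake_case columns to PascalCase properties by default. A hedged sketch (StoredEvent and LoadStreamAsync are illustrative names):

```csharp
// Once at startup: map snake_case columns (stream_id) to PascalCase properties (StreamId).
Dapper.DefaultTypeMap.MatchNamesWithUnderscores = true;

public class StoredEvent
{
    public long   Id        { get; set; }
    public Guid   StreamId  { get; set; }
    public string EventType { get; set; } = "";
    public string EventData { get; set; } = "";
    public int    Version   { get; set; }
}

public async Task<IReadOnlyList<StoredEvent>> LoadStreamAsync(Guid streamId)
{
    await using var connection = new NpgsqlConnection(_connectionString);

    var rows = await connection.QueryAsync<StoredEvent>(
        """
        SELECT id, stream_id, event_type, event_data::text AS event_data, version
        FROM event_store
        WHERE stream_id = @StreamId
        ORDER BY version
        """,
        new { StreamId = streamId });

    return rows.AsList();
}
```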

Consequences

Positive:

  • SQL is visible and debuggable -- learners see exactly what queries run
  • No impedance mismatch between C# objects and SQL operations (event store is append-only; ORM features like change tracking are useless)
  • Dapper is thin: NpgsqlConnection + SQL string + parameters. No hidden magic.
  • High performance: Dapper is one of the fastest .NET data access libraries
  • JSONB handling is straightforward with ::jsonb cast

Negative:

  • No automatic migrations -- schema changes require manual SQL scripts
  • No LINQ for queries -- all queries are raw SQL strings
  • No compile-time query validation (typos in SQL are runtime errors)
  • Manual mapping for complex result sets

Risks:

  • SQL injection if parameters are not properly parameterized. Mitigated by always using Dapper's parameterized queries, never string concatenation.

ADR-008: Polling-Based Projection Catch-Up

Status: Accepted

Context

Projections need to process new events as they are appended to the event store. There are two main approaches:

| Approach | How It Works | Pros | Cons |
| --- | --- | --- | --- |
| Polling | Background worker periodically queries event_store WHERE id > @lastCheckpoint | Simple; no additional infrastructure; works with any database | Latency (up to poll interval); unnecessary queries when no new events |
| Push-based (triggers, LISTEN/NOTIFY, CDC) | Database notifies the worker when new events arrive | Lower latency; no wasted queries | Complex setup; PostgreSQL LISTEN/NOTIFY has payload limits; CDC requires additional infrastructure |
| Event store subscriptions | Built into EventStoreDB/Marten | Native catch-up semantics; guaranteed ordering | Requires a specific event store product |

Decision

We will use polling-based projection catch-up with a configurable poll interval.

csharp
// Background worker (simplified):
while (!stoppingToken.IsCancellationRequested)
{
    var lastPosition = await GetCheckpointAsync(projectionName);
    var newEvents = await GetEventsAfterAsync(lastPosition);

    foreach (var evt in newEvents)  // "event" is a reserved C# keyword, so use a different name
    {
        await projection.HandleAsync(evt);
        await SaveCheckpointAsync(projectionName, evt.Id);
    }

    if (newEvents.Count == 0)
        await Task.Delay(pollInterval, stoppingToken);  // e.g., 500ms
}

Consequences

Positive:

  • Extremely simple to implement and understand -- a while loop with a SELECT
  • No additional infrastructure beyond PostgreSQL
  • Naturally resilient: if the worker crashes, it resumes from the last checkpoint on restart
  • Poll interval is configurable: 100ms for near-real-time, 5s for relaxed consistency
  • No event ordering issues -- events are processed in id order

Negative:

  • Latency up to the poll interval (default 500ms). Not suitable for sub-100ms requirements.
  • Wasted queries during idle periods (when no new events are being written)
  • At high event volumes, the poller may fall behind if processing is slower than production rate

Risks:

  • If the projection falls significantly behind (e.g., after a long outage), catch-up can take a long time. Mitigated by batch processing (load 100 events per poll) and monitoring projection lag.

Future evolution:

  • Add PostgreSQL LISTEN/NOTIFY as a wake-up signal to eliminate idle polling
  • Switch to Marten's async daemon if/when we adopt Marten
  • Parallelize projections by partitioning event streams

ADR Template (For New Decisions)

markdown
## ADR-NNN: [Title]

**Status**: Proposed | Accepted | Deprecated | Superseded by ADR-XXX

### Context

[What is the issue? What forces are at play? What constraints exist?]

### Decision

[What is the decision? Be specific.]

### Consequences

**Positive**:
- [Benefit 1]
- [Benefit 2]

**Negative**:
- [Cost 1]
- [Cost 2]

**Risks**:
- [Risk and mitigation]