
Architecture Decision Records (ADRs)

Memory hook: "ADRs are love letters to your future self -- explaining not just what you decided, but why, so that Future You doesn't undo a carefully considered tradeoff."


ADR-001: Choose Order Fulfillment as Domain

Status: Accepted

Context

This is a learning repository for DDD + Event Sourcing. We need a domain that:

  1. Has a non-trivial state machine (multiple states, transitions, and invariants)
  2. Is universally understood (minimal domain expertise needed to learn)
  3. Benefits from event sourcing (audit trail, temporal queries, multiple projections)
  4. Has clear aggregate boundaries (Order as the central aggregate)
  5. Provides opportunities for multiple bounded contexts (ordering, inventory, payment, shipping)
  6. Has rich business rules that can be encoded in a domain model

Decision

We will use Order Fulfillment as the domain for this learning repository.

The Order aggregate lifecycle:

text
Draft --> Submitted --> InventoryReserved --> PaymentConfirmed --> Shipped --> Delivered
  |          |              |                     |
  +----------+--------------+---------------------+---> Cancelled
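
The transition rules behind this diagram can be captured as a small guard table on the aggregate. A minimal sketch -- OrderStatus and OrderStateMachine are illustrative names, not necessarily the repository's actual types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum OrderStatus
{
    Draft, Submitted, InventoryReserved, PaymentConfirmed, Shipped, Delivered, Cancelled
}

public static class OrderStateMachine
{
    // Legal transitions, mirroring the lifecycle diagram above.
    private static readonly IReadOnlyDictionary<OrderStatus, OrderStatus[]> Allowed =
        new Dictionary<OrderStatus, OrderStatus[]>
        {
            [OrderStatus.Draft]             = new[] { OrderStatus.Submitted, OrderStatus.Cancelled },
            [OrderStatus.Submitted]         = new[] { OrderStatus.InventoryReserved, OrderStatus.Cancelled },
            [OrderStatus.InventoryReserved] = new[] { OrderStatus.PaymentConfirmed, OrderStatus.Cancelled },
            [OrderStatus.PaymentConfirmed]  = new[] { OrderStatus.Shipped, OrderStatus.Cancelled },
            [OrderStatus.Shipped]           = new[] { OrderStatus.Delivered },
            [OrderStatus.Delivered]         = Array.Empty<OrderStatus>(),   // terminal
            [OrderStatus.Cancelled]         = Array.Empty<OrderStatus>(),   // terminal
        };

    public static bool CanTransitionTo(OrderStatus from, OrderStatus to)
        => Allowed.TryGetValue(from, out var targets) && targets.Contains(to);
}
```

Command handlers would consult this guard before applying an event, turning an invalid transition (e.g., cancelling a shipped order) into a domain error rather than a corrupt state.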

Consequences

Positive:

  • Everyone understands online ordering -- no domain expertise barrier
  • The state machine has 7 states and multiple transition rules -- rich enough to demonstrate invariant enforcement
  • Natural events: OrderCreated, OrderLineAdded, OrderSubmitted, OrderShipped -- intuitive for learning event sourcing
  • Multiple useful projections: order detail, order timeline, dashboard statistics
  • Clear integration event boundaries: inventory reservation, payment confirmation, shipping notification

Negative:

  • Order fulfillment is a "textbook" domain -- some developers may find it too familiar and not challenging enough
  • Real-world order fulfillment has complexity we intentionally omit (returns, partial shipments, split payments, promotions) to keep the learning scope manageable

Risks:

  • Learners may mistake this simplified model for a production-ready order system. We mitigate this with clear documentation noting simplifications.

ADR-002: PostgreSQL as Event Store

Status: Accepted

Context

An event-sourced system needs a durable, append-only store for domain events with:

  • Per-stream loading (get all events for aggregate X)
  • Optimistic concurrency (prevent concurrent writes to the same stream)
  • Global ordering (for projection catch-up)
  • JSONB support for flexible event payloads

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| PostgreSQL | Team already knows SQL; single DB for everything; JSONB support; UNIQUE constraints for concurrency | No built-in subscriptions; manual projection infrastructure |
| EventStoreDB | Purpose-built for event sourcing; built-in subscriptions and projections | Another database to operate; learning curve; smaller community |
| MongoDB | Flexible document model; familiar to some teams | Weaker transactional guarantees; no native optimistic concurrency per document array |
| DynamoDB | Serverless scaling; built-in conditional writes | Vendor lock-in; 400 KB item limit; complex query patterns |

Decision

We will use PostgreSQL as both the event store and the read model database.

The event store schema:

sql
CREATE TABLE event_store (
    id          BIGSERIAL PRIMARY KEY,
    stream_id   UUID NOT NULL,
    event_type  TEXT NOT NULL,
    event_data  JSONB NOT NULL,
    metadata    JSONB,
    version     INT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

ALTER TABLE event_store
    ADD CONSTRAINT uq_event_store_stream_version UNIQUE (stream_id, version);
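
Appending then rides on that constraint for optimistic concurrency: the writer stamps versions starting at expectedVersion + 1 and lets PostgreSQL reject the loser of a race. A hedged sketch using Dapper and Npgsql -- AppendAsync, StoredEvent, and ConcurrencyException are illustrative names, not the repository's confirmed API:

```csharp
// Append at expectedVersion + 1, + 2, ...; a concurrent writer that raced us
// violates uq_event_store_stream_version (SQLSTATE 23505) and fails cleanly.
public async Task AppendAsync(Guid streamId, long expectedVersion, IReadOnlyList<StoredEvent> events)
{
    await using var connection = new NpgsqlConnection(_connectionString);
    await connection.OpenAsync();
    await using var tx = await connection.BeginTransactionAsync();

    try
    {
        var version = expectedVersion;
        foreach (var e in events)
        {
            await connection.ExecuteAsync(
                """
                INSERT INTO event_store (stream_id, event_type, event_data, metadata, version)
                VALUES (@StreamId, @EventType, @EventData::jsonb, @Metadata::jsonb, @Version)
                """,
                new { StreamId = streamId, e.EventType, e.EventData, e.Metadata, Version = ++version },
                tx);
        }
        await tx.CommitAsync();
    }
    catch (PostgresException ex) when (ex.SqlState == PostgresErrorCodes.UniqueViolation)
    {
        // Caller reloads the stream and retries the command.
        throw new ConcurrencyException($"Stream {streamId} was modified concurrently.", ex);
    }
}
```

Note that no SELECT-then-check is needed: the constraint is the concurrency check, which is exactly the "zero application code" benefit listed below.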

Consequences

Positive:

  • One database to operate, back up, and monitor
  • SQL is universally understood -- learners can inspect events with SELECT * FROM event_store
  • UNIQUE constraint on (stream_id, version) provides optimistic concurrency with zero application code
  • JSONB stores flexible event payloads with optional indexing
  • BIGSERIAL primary key provides natural global ordering for projection catch-up
  • All read models (projections) live in the same database -- simple queries, no network hops

Negative:

  • No built-in subscription mechanism -- we must poll for new events (see ADR-008)
  • At very large scale (> 100M events), PostgreSQL may need table partitioning
  • Not purpose-built for event sourcing -- some patterns require more manual implementation than EventStoreDB

Risks:

  • Performance may degrade if event store and read models compete for the same PostgreSQL resources. Mitigated by read replicas or separate databases in production.

ADR-003: Custom Event Store vs Marten

Status: Accepted

Context

Marten is a .NET library that provides document database and event store capabilities on top of PostgreSQL. It offers:

  • Built-in event store with optimistic concurrency
  • Built-in async projections with catch-up daemon
  • Built-in snapshot support
  • LINQ-based querying
  • Active .NET community

Using Marten would significantly reduce the amount of infrastructure code we need to write. However, this is a learning repository.

Decision

We will build a custom event store using Dapper and raw SQL, rather than adopting Marten.
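
The resulting surface area stays deliberately small. A plausible three-method shape (names are illustrative; the repository's actual interface may differ):

```csharp
public interface IEventStore
{
    // Append events, failing with a concurrency error if expectedVersion
    // does not match the current head of the stream.
    Task AppendAsync(Guid streamId, long expectedVersion,
                     IReadOnlyList<object> events, CancellationToken ct = default);

    // Load one aggregate's events in version order (for reconstruction).
    Task<IReadOnlyList<object>> LoadStreamAsync(Guid streamId, CancellationToken ct = default);

    // Load events across all streams after a global position (for projection catch-up).
    Task<IReadOnlyList<object>> LoadAfterAsync(long globalPosition, CancellationToken ct = default);
}
```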

Consequences

Positive:

  • Learners understand every layer: from SQL schema to C# interface to aggregate reconstruction
  • No "magic" -- every event store operation is visible in code and SQL
  • The IEventStore interface is simple (3 methods) and easy to reason about
  • Transferable knowledge: the concepts apply regardless of which event store library/database is used
  • Demonstrates the hexagonal architecture principle: the domain does not depend on infrastructure

Negative:

  • We write ~200 lines of infrastructure code that Marten provides out of the box
  • Our projection infrastructure is simpler (polling, no parallel catch-up) than Marten's daemon
  • We miss Marten's battle-tested edge case handling (partition pruning, event archiving, multi-tenancy)
  • If learners want to build a production system, they should evaluate Marten rather than copying our custom event store

Risks:

  • Learners may copy this custom event store into production systems without adding the robustness that Marten provides. Documentation explicitly warns against this.

ADR-004: Kafka for Integration Events Only

Status: Accepted

Context

Kafka is a distributed event streaming platform. In event-sourced architectures, Kafka can serve two roles:

  1. Event bus -- distribute integration events between bounded contexts/services
  2. Event store -- serve as the source of truth for all domain events

Some architectures use Kafka for both. We need to decide which role Kafka plays in this system.

Decision

Kafka will serve only as an integration event bus. PostgreSQL remains the event store.

Integration events flow:

text
Command Handler --> Save domain events to PostgreSQL --> Publish integration event to Kafka
                                                         --> Other services consume from Kafka
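
In code, a command handler follows the same order as the diagram: persist first, notify second. A sketch assuming Confluent.Kafka -- the handler, repository, topic name, and event types are all illustrative:

```csharp
// Persist first (PostgreSQL is the source of truth), then notify via Kafka.
// If ProduceAsync fails, the domain events are already durable; an outbox or
// retry mechanism can re-publish the notification later.
public async Task Handle(SubmitOrderCommand command, CancellationToken ct)
{
    var order = await _orders.LoadAsync(command.OrderId, ct);
    order.Submit();  // raises the OrderSubmitted domain event

    await _eventStore.AppendAsync(order.Id, order.Version, order.UncommittedEvents, ct);

    var message = new Message<string, string>
    {
        Key   = order.Id.ToString(),  // keying by aggregate id keeps per-order ordering within a partition
        Value = JsonSerializer.Serialize(new OrderSubmittedIntegrationEvent(order.Id))
    };
    await _producer.ProduceAsync("orders.order-submitted", message, ct);
}
```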

Consequences

Positive:

  • Clear separation: PostgreSQL = source of truth, Kafka = notification channel
  • If Kafka is unavailable, the system still functions (domain events are persisted; notifications are delayed)
  • Kafka retention can be finite (7-30 days) since the event store is in PostgreSQL
  • Simpler operational model -- Kafka is a replaceable component, not critical infrastructure
  • Per-aggregate loading is fast (PostgreSQL indexed query, not Kafka partition scan)

Negative:

  • Two infrastructure components (PostgreSQL + Kafka) instead of potentially one
  • Integration event publishing can fail after domain event persistence (mitigated by outbox pattern)
  • Consumers of integration events experience eventual consistency

Risks:

  • Team might be tempted to use Kafka for everything ("we already have Kafka, why not use it as the event store?"). This ADR and the kafka-integration.md documentation explicitly explain why not.

See also: kafka-integration.md for the full technical analysis.


ADR-005: Redis for Caching and Snapshots

Status: Accepted

Context

The system has three caching/performance needs:

  1. Read model caching: Reduce PostgreSQL query load for frequently accessed data
  2. Aggregate snapshots: Avoid replaying entire event streams on every command
  3. Idempotency tracking: Prevent duplicate processing of integration events

We need a fast, ephemeral store for these purposes.

Decision

We will use Redis for all three purposes: read model caching (STRING with TTL), snapshot storage (STRING), and idempotency tracking (STRING with NX + EX).
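
The three usages map to three small Redis idioms. A sketch assuming StackExchange.Redis; key names and TTLs are illustrative:

```csharp
IDatabase db = connectionMultiplexer.GetDatabase();

// 1. Read model caching: STRING with TTL.
await db.StringSetAsync($"order:detail:{orderId}", orderJson, expiry: TimeSpan.FromMinutes(5));

// 2. Aggregate snapshot: STRING, overwritten on each snapshot write.
await db.StringSetAsync($"order:snapshot:{orderId}", snapshotJson);

// 3. Idempotency: SET NX + EX -- atomically claim a message id, with automatic expiry.
bool claimed = await db.StringSetAsync(
    $"idempotency:{messageId}", "1",
    expiry: TimeSpan.FromHours(24),
    when: When.NotExists);  // false => another consumer already processed this message

if (!claimed)
    return;  // duplicate delivery -- skip processing
```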

Consequences

Positive:

  • Sub-millisecond latency for cache reads and idempotency checks
  • TTL-based automatic cleanup for cache entries and idempotency keys
  • SET NX (set if not exists) provides atomic idempotency check-and-mark
  • Single Redis instance serves all three purposes
  • Redis is completely deletable -- FLUSHALL and the system still works, just slower

Negative:

  • Redis is volatile by default -- a restart loses all cached data (acceptable for our use cases)
  • Additional infrastructure component to manage
  • Cache invalidation adds complexity to projection handlers

Risks:

  • Redis could become a single point of failure for performance (not correctness). Mitigated by designing all Redis consumers to fall back gracefully (PostgreSQL for reads, full replay for snapshots, allow duplicate processing for idempotency).

See also: redis-usage.md for detailed usage patterns.


ADR-006: MediatR for CQRS

Status: Accepted

Context

CQRS separates command handling (writes) from query handling (reads). We need a mechanism to:

  1. Route commands to their handlers
  2. Route queries to their handlers
  3. Decouple the API layer from the application layer
  4. Support cross-cutting concerns (logging, validation, transaction management) via pipeline behaviors

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| MediatR | De facto standard in .NET; simple API; pipeline behaviors; large ecosystem | Indirect dispatch can obscure control flow; overhead for very simple handlers |
| Direct injection | Explicit; easy to navigate (F12 goes to implementation) | No pipeline behaviors; handler registration is manual |
| Wolverine | More features (messaging, sagas); built-in retry | Larger learning curve; more opinionated |
| Custom mediator | Full control; no external dependency | Reinventing the wheel; no community support |

Decision

We will use MediatR for command and query dispatching.

csharp
// Command: IRequest<CommandResult>
public record CreateOrderCommand(...) : IRequest<CommandResult>;

// Query: IRequest<TResponse>
public record GetOrderByIdQuery(Guid OrderId) : IRequest<OrderResponse?>;

// API controller dispatches via MediatR:
[HttpPost]
public async Task<IActionResult> Create(CreateOrderCommand command)
    => Ok(await _mediator.Send(command));
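
The pipeline behaviors mentioned above are where cross-cutting concerns plug in. A sketch assuming MediatR's IPipelineBehavior (the behavior name and logging are illustrative):

```csharp
// Logs every request flowing through the mediator without touching any handler.
public class LoggingBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
    where TRequest : notnull
{
    private readonly ILogger<LoggingBehavior<TRequest, TResponse>> _logger;

    public LoggingBehavior(ILogger<LoggingBehavior<TRequest, TResponse>> logger)
        => _logger = logger;

    public async Task<TResponse> Handle(
        TRequest request,
        RequestHandlerDelegate<TResponse> next,
        CancellationToken cancellationToken)
    {
        _logger.LogInformation("Handling {Request}", typeof(TRequest).Name);
        var response = await next();  // invoke the next behavior or the handler itself
        _logger.LogInformation("Handled {Request}", typeof(TRequest).Name);
        return response;
    }
}

// Registered once at startup, e.g.:
// services.AddMediatR(cfg => cfg.AddOpenBehavior(typeof(LoggingBehavior<,>)));
```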

Consequences

Positive:

  • Clean separation between API controllers and application logic
  • Pipeline behaviors enable cross-cutting concerns (logging, validation, error handling) without polluting handlers
  • Handlers are small, focused, and independently testable
  • Standard pattern that most .NET developers recognize

Negative:

  • Indirect dispatch makes "go to definition" harder (F12 on _mediator.Send() goes to MediatR, not the handler)
  • For very simple operations, the handler + request + response ceremony feels heavy
  • Magic registration (assembly scanning) can surprise developers when handlers are not found

Risks:

  • Over-use: not every operation needs to go through MediatR. Simple queries without business logic could be direct service calls. We accept this minor over-use for consistency.

ADR-007: Dapper over EF Core for Event Store

Status: Accepted

Context

We need a data access mechanism for:

  1. The event store (append events, load events by stream ID)
  2. Read model queries (projections, dashboard)
  3. Projection updates (upserts, increments)

Options considered:

| Option | Pros | Cons |
| --- | --- | --- |
| Dapper | Minimal abstraction; raw SQL visible; fast; small library | Manual mapping; no change tracking; no migrations |
| EF Core | Migrations, change tracking, LINQ queries, rich ecosystem | Heavy abstraction over simple append-only operations; change tracking unnecessary for event store; JSONB mapping complexity |
| Npgsql directly | Zero abstraction; maximum control | Verbose; manual parameter binding; error-prone |

Decision

We will use Dapper for all database access in the event store and projection infrastructure.

csharp
// Event store: clean, visible SQL
await connection.ExecuteAsync(
    """
    INSERT INTO event_store (stream_id, event_type, event_data, metadata, version, created_at)
    VALUES (@StreamId, @EventType, @EventData::jsonb, @Metadata::jsonb, @Version, @CreatedAt)
    """,
    storedEvent);
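
Reads are the same thin pattern, with one wrinkle worth a comment: Dapper does not map snake_case columns to PascalCase properties by default. A hedged sketch (StoredEvent and LoadStreamAsync are illustrative names):

```csharp
// Once at startup: map snake_case columns (stream_id) to PascalCase properties (StreamId).
Dapper.DefaultTypeMap.MatchNamesWithUnderscores = true;

public class StoredEvent
{
    public long   Id        { get; set; }
    public Guid   StreamId  { get; set; }
    public string EventType { get; set; } = "";
    public string EventData { get; set; } = "";
    public int    Version   { get; set; }
}

public async Task<IReadOnlyList<StoredEvent>> LoadStreamAsync(Guid streamId)
{
    await using var connection = new NpgsqlConnection(_connectionString);

    var rows = await connection.QueryAsync<StoredEvent>(
        """
        SELECT id, stream_id, event_type, event_data::text AS event_data, version
        FROM event_store
        WHERE stream_id = @StreamId
        ORDER BY version
        """,
        new { StreamId = streamId });

    return rows.AsList();
}
```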

Consequences

Positive:

  • SQL is visible and debuggable -- learners see exactly what queries run
  • No impedance mismatch between C# objects and SQL operations (event store is append-only; ORM features like change tracking are useless)
  • Dapper is thin: NpgsqlConnection + SQL string + parameters. No hidden magic.
  • High performance: Dapper is one of the fastest .NET data access libraries
  • JSONB handling is straightforward with ::jsonb cast

Negative:

  • No automatic migrations -- schema changes require manual SQL scripts
  • No LINQ for queries -- all queries are raw SQL strings
  • No compile-time query validation (typos in SQL are runtime errors)
  • Manual mapping for complex result sets

Risks:

  • SQL injection if parameters are not properly parameterized. Mitigated by always using Dapper's parameterized queries, never string concatenation.

ADR-008: Polling-Based Projection Catch-Up

Status: Accepted

Context

Projections need to process new events as they are appended to the event store. There are two main approaches:

| Approach | How It Works | Pros | Cons |
| --- | --- | --- | --- |
| Polling | Background worker periodically queries event_store WHERE id > @lastCheckpoint | Simple; no additional infrastructure; works with any database | Latency (up to poll interval); unnecessary queries when no new events |
| Push-based (triggers, LISTEN/NOTIFY, CDC) | Database notifies the worker when new events arrive | Lower latency; no wasted queries | Complex setup; PostgreSQL LISTEN/NOTIFY has payload limits; CDC requires additional infrastructure |
| Event store subscriptions | Built into EventStoreDB/Marten | Native catch-up semantics; guaranteed ordering | Requires a specific event store product |

Decision

We will use polling-based projection catch-up with a configurable poll interval.

csharp
// Background worker (simplified):
while (!stoppingToken.IsCancellationRequested)
{
    var lastPosition = await GetCheckpointAsync(projectionName);
    var newEvents = await GetEventsAfterAsync(lastPosition);

    foreach (var evt in newEvents)  // "event" is a reserved C# keyword, so use a different name
    {
        await projection.HandleAsync(evt);
        await SaveCheckpointAsync(projectionName, evt.Id);
    }

    if (newEvents.Count == 0)
        await Task.Delay(pollInterval, stoppingToken);  // e.g., 500ms
}

Consequences

Positive:

  • Extremely simple to implement and understand -- a while loop with a SELECT
  • No additional infrastructure beyond PostgreSQL
  • Naturally resilient: if the worker crashes, it resumes from the last checkpoint on restart
  • Poll interval is configurable: 100ms for near-real-time, 5s for relaxed consistency
  • No event ordering issues -- events are processed in id order

Negative:

  • Latency up to the poll interval (default 500ms). Not suitable for sub-100ms requirements.
  • Wasted queries during idle periods (when no new events are being written)
  • At high event volumes, the poller may fall behind if processing is slower than production rate

Risks:

  • If the projection falls significantly behind (e.g., after a long outage), catch-up can take a long time. Mitigated by batch processing (load 100 events per poll) and monitoring projection lag.

Future evolution:

  • Add PostgreSQL LISTEN/NOTIFY as a wake-up signal to eliminate idle polling
  • Switch to Marten's async daemon if/when we adopt Marten
  • Parallelize projections by partitioning event streams

ADR Template (For New Decisions)

markdown
## ADR-NNN: [Title]

**Status**: Proposed | Accepted | Deprecated | Superseded by ADR-XXX

### Context

[What is the issue? What forces are at play? What constraints exist?]

### Decision

[What is the decision? Be specific.]

### Consequences

**Positive**:
- [Benefit 1]
- [Benefit 2]

**Negative**:
- [Cost 1]
- [Cost 2]

**Risks**:
- [Risk and mitigation]