0003: IAM Service Persistence Strategy
Date: 2025-08-03
Status: Accepted
Context
The iam-service requires a persistence mechanism to store tenant, user, and role data. Other services in the Citadel platform, such as book_keeper, use event sourcing. The question is whether the iam-service should follow the same pattern for consistency, or whether a different approach better suits its domain.
Decision
We will implement the iam-service using a traditional, state-oriented persistence model. The current state of aggregates (like Tenant and User) will be stored directly in a relational database (PostgreSQL) using an ORM (GORM). We will explicitly not use event sourcing for this service.
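As a rough illustration of what "current state stored directly" could look like, the sketch below defines hypothetical model structs. The struct tags follow GORM's tag conventions, but every field name, the user_roles join table, and the RoleNames helper are assumptions made for this example, not the service's actual schema. The block uses only the standard library so it runs without a database.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Hypothetical current-state models. The tags follow GORM conventions;
// the fields themselves are assumptions for illustration only.
type Tenant struct {
	ID        string `gorm:"primaryKey"`
	Name      string `gorm:"uniqueIndex"`
	CreatedAt time.Time
	UpdatedAt time.Time
}

type Role struct {
	ID   string `gorm:"primaryKey"`
	Name string `gorm:"uniqueIndex"`
}

type User struct {
	ID        string `gorm:"primaryKey"`
	TenantID  string `gorm:"index"`
	Email     string `gorm:"uniqueIndex"`
	Active    bool
	Roles     []Role `gorm:"many2many:user_roles"`
	UpdatedAt time.Time
}

// RoleNames returns the user's current role names in sorted order.
// It reads the stored state directly; no history is consulted.
func RoleNames(u User) []string {
	names := make([]string, 0, len(u.Roles))
	for _, r := range u.Roles {
		names = append(names, r.Name)
	}
	sort.Strings(names)
	return names
}

func main() {
	u := User{
		ID: "u-1", TenantID: "t-1", Email: "a@example.com", Active: true,
		Roles: []Role{{ID: "r-2", Name: "viewer"}, {ID: "r-1", Name: "admin"}},
	}
	fmt.Println(u.Active, RoleNames(u))
}
```

Each row holds only the latest values, so an update overwrites the previous state rather than appending to a log.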
Consequences
Positive:
- High Read Performance: Core authorization and authentication queries (e.g., "Is this user active?", "What are this user's roles?") are direct lookups on indexed tables that hold the current state, with no event replay or projection step in between. This is a critical requirement for a low-latency IAM system.
- Simplicity and Familiarity: The state-oriented approach is a well-understood pattern, making the service easier to implement, debug, and maintain for the majority of developers.
- Strong Consistency: Writes are immediately reflected in subsequent reads, which is the desired behavior for security-sensitive operations like changing a password or assigning a role.
Negative:
- Implicit Audit Trail: The system does not keep a complete, immutable log of all state changes by default. We mitigate this by publishing domain events (e.g., TenantCreated, UserRolesAssigned) for every state change, which a separate audit or logging service can consume.
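The mitigation above can be sketched as a state change followed by an event publication. This is a minimal sketch under stated assumptions: the DomainEvent envelope, the Publisher interface, the in-memory MemoryPublisher, and the UserStore are all hypothetical stand-ins, and only the event name UserRolesAssigned comes from this document. It does not address delivery guarantees between the write and the publish.

```go
package main

import (
	"errors"
	"fmt"
)

// DomainEvent is a hypothetical envelope for state-change notifications.
type DomainEvent struct {
	Name    string
	Subject string
	Payload map[string]any
}

// Publisher is an assumed abstraction over whatever broker is used.
type Publisher interface {
	Publish(e DomainEvent) error
}

// MemoryPublisher records events in memory; a stand-in for a real broker.
type MemoryPublisher struct{ Events []DomainEvent }

func (p *MemoryPublisher) Publish(e DomainEvent) error {
	p.Events = append(p.Events, e)
	return nil
}

// UserStore holds only the current roles per user ID (hypothetical).
type UserStore struct{ roles map[string][]string }

func NewUserStore() *UserStore {
	return &UserStore{roles: map[string][]string{}}
}

// AssignRoles overwrites the stored roles, then publishes UserRolesAssigned.
func AssignRoles(s *UserStore, p Publisher, userID string, roles []string) error {
	if userID == "" {
		return errors.New("user ID is required")
	}
	// Copy the slice so later changes by the caller do not alter stored state.
	s.roles[userID] = append([]string(nil), roles...)
	return p.Publish(DomainEvent{
		Name:    "UserRolesAssigned",
		Subject: userID,
		Payload: map[string]any{"roles": roles},
	})
}

func main() {
	store, pub := NewUserStore(), &MemoryPublisher{}
	if err := AssignRoles(store, pub, "u-1", []string{"admin"}); err != nil {
		fmt.Println("assign failed:", err)
		return
	}
	fmt.Println(store.roles["u-1"], pub.Events[0].Name)
}
```

The store remains the source of truth for reads; the published events exist only so that a downstream consumer can build the audit log.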
Rationale
The primary responsibility of the iam-service is to answer the question, "What is the state right now?" This is fundamentally different from a service like book_keeper, whose primary responsibility is to answer, "What happened over time?"
For an IAM service, read performance and strong consistency for the current state are more important than the ability to replay history to reconstruct state. The chosen approach optimizes for the core use case (fast authentication and authorization checks) while still enabling a robust audit trail through domain events. This follows the principle of choosing the persistence model that best fits each service's domain.
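To make the contrast concrete, the sketch below answers "Is this user active?" in two ways: by reading a stored row and by folding over an event history. All type names and the event kinds UserActivated and UserDeactivated are hypothetical. The replay version is deliberately naive; real event-sourced systems typically rely on projections or snapshots rather than replaying on every read.

```go
package main

import "fmt"

// UserRow is a hypothetical current-state record.
type UserRow struct{ Active bool }

// Event is a hypothetical history entry; Kind is "UserActivated"
// or "UserDeactivated" in this sketch.
type Event struct{ Kind string }

// IsActiveFromState answers the question with a single keyed lookup.
func IsActiveFromState(rows map[string]UserRow, id string) bool {
	return rows[id].Active
}

// IsActiveFromHistory rebuilds the answer by replaying every event in order.
func IsActiveFromHistory(history []Event) bool {
	active := false
	for _, e := range history {
		switch e.Kind {
		case "UserActivated":
			active = true
		case "UserDeactivated":
			active = false
		}
	}
	return active
}

func main() {
	rows := map[string]UserRow{"u-1": {Active: true}}
	history := []Event{{"UserActivated"}, {"UserDeactivated"}, {"UserActivated"}}
	fmt.Println(IsActiveFromState(rows, "u-1"), IsActiveFromHistory(history))
}
```

Both functions return the same answer here, but the first does constant work per check while the second grows with the length of the history.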