iam-service-oidc-state-storage-strategy
ARCHIVED AND SUPERSEDED
This ADR is obsolete and has been superseded by ADR-0023: IAM Service as a Policy and Authorization Claims Engine. The content below is for historical purposes only.
0022: IAM Service OIDC State Storage Strategy
Date: 2025-08-13
Status: Proposed
Context
The iam-service's OIDC authentication flow generates short-lived, stateful data, specifically the state parameter for CSRF protection and the single-use authorization code. The current implementation keeps this data in an in-memory map, which is not suitable for a distributed, multi-replica production environment: a request routed to a different replica than the one that issued the value cannot be validated, so authentication fails. A replica restart also discards all in-flight state. A shared, persistent storage mechanism is required.
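The multi-replica failure can be illustrated with a minimal sketch. The `memoryStore` type and its methods are hypothetical stand-ins for the current in-memory map, not the service's actual code; two instances model two replicas behind a load balancer.

```go
package main

import "sync"

// memoryStore is a hypothetical stand-in for the per-process in-memory map.
// Each replica of the service would hold its own, unshared instance.
type memoryStore struct {
	mu     sync.Mutex
	states map[string]struct{}
}

func newMemoryStore() *memoryStore {
	return &memoryStore{states: make(map[string]struct{})}
}

// save records a state value issued by this replica.
func (s *memoryStore) save(state string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.states[state] = struct{}{}
}

// consume reports whether the state was known to this replica and removes it,
// so that a value can be used at most once.
func (s *memoryStore) consume(state string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.states[state]
	delete(s.states, state)
	return ok
}
```

If replica A calls `save` and the callback lands on replica B, `consume` on B returns false even though the value is valid, which is the authentication failure described above.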
Decision
We will use a new, dedicated table within the existing iam-service PostgreSQL database to store OIDC state and authorization codes.
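As a rough sketch of what such a table and its access pattern could look like, assuming Go's standard `database/sql` package with any PostgreSQL driver. The table name, column names, and helper functions (`newKey`, `put`, `consume`) are illustrative assumptions, not a schema defined by this ADR.

```go
package main

import (
	"context"
	"crypto/rand"
	"database/sql"
	"encoding/base64"
	"errors"
	"time"
)

// Hypothetical schema: one row per state value or authorization code.
const createTableSQL = `
CREATE TABLE IF NOT EXISTS oidc_state_store (
    key        TEXT PRIMARY KEY,
    payload    TEXT NOT NULL,
    expires_at TIMESTAMPTZ NOT NULL
)`

const insertSQL = `INSERT INTO oidc_state_store (key, payload, expires_at) VALUES ($1, $2, $3)`

// DELETE ... RETURNING reads and removes the row in a single statement, so a
// value can be consumed at most once even if two replicas race for it.
const consumeSQL = `DELETE FROM oidc_state_store WHERE key = $1 AND expires_at > now() RETURNING payload`

var errNotFound = errors.New("oidc state not found, expired, or already used")

// newKey returns 32 random bytes encoded as an unpadded URL-safe string.
func newKey() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// put stores payload under a fresh random key that expires after ttl.
// The expiry is computed from the application clock, which assumes the
// service and database clocks are reasonably in sync.
func put(ctx context.Context, db *sql.DB, payload string, ttl time.Duration) (string, error) {
	key, err := newKey()
	if err != nil {
		return "", err
	}
	_, err = db.ExecContext(ctx, insertSQL, key, payload, time.Now().Add(ttl))
	return key, err
}

// consume returns the payload for key and deletes the row, or errNotFound
// when the key is unknown, expired, or was already consumed.
func consume(ctx context.Context, db *sql.DB, key string) (string, error) {
	var payload string
	err := db.QueryRowContext(ctx, consumeSQL, key).Scan(&payload)
	if errors.Is(err, sql.ErrNoRows) {
		return "", errNotFound
	}
	return payload, err
}
```

Because `consume` filters on `expires_at`, expired rows are never honored even before they are physically deleted; periodic deletion only keeps the table small.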
Options Considered
1. Dedicated PostgreSQL Table (Chosen)
- Pros:
  - No New Dependencies: Leverages the existing, mandatory PostgreSQL database, keeping the operational stack lean.
  - Proven Pattern: Aligns with the implementation strategy of mature Identity Providers like Keycloak.
  - Transactional Integrity: State can be written and read with strong consistency guarantees.
- Cons:
  - Requires Manual Expiration: Unlike Redis, PostgreSQL does not have a built-in TTL mechanism. A separate process must be implemented to periodically delete expired state.
2. Redis Cache
- Pros:
  - High Performance: Purpose-built for fast key-value operations.
  - Automatic Expiration: Redis's built-in Time-To-Live (TTL) feature is a perfect fit for automatically cleaning up expired state.
- Cons:
  - Adds a New Dependency: Introduces Redis as a new service to deploy, manage, and monitor.
Rationale
The primary driver for this decision is operational simplicity. Avoiding the introduction of a new core infrastructure dependency (Redis) significantly reduces the complexity of local development, CI/CD, and production deployments.
The performance of PostgreSQL is more than sufficient for this use case; the latency of a database write is negligible within the context of a user-interactive login flow. The need for a manual cleanup mechanism is a small, manageable implementation detail that is outweighed by the benefit of a simpler architecture.
Consequences
- A new database table (e.g., `oidc_state_store`) with an `expires_at` timestamp column will be added to the `iam-service`'s database schema.
- An in-process background goroutine will be launched on service startup to periodically run a `DELETE` query on this table. This follows the same pattern already used for JWKS key rotation and avoids introducing external dependencies like a Kubernetes CronJob.