0054: Service-Defined Role and Attribute Management
Date: 2025-12-03
Status: Proposed
Context
A central IAM service owns the "source of truth" for what roles and attributes exist. However, individual downstream services (e.g., erp-service) own the business logic and know what roles and attributes they require to function. This creates a significant synchronization problem and operational friction.
Currently, there is no automated process for a service to declare its authorization needs to the iam-service. Declaring them today would require manual coordination, complex migration scripts, or direct database manipulation by a Super Admin, all of which are slow, error-prone, and do not scale. We need a developer-friendly, automated way to manage the lifecycle of service-specific roles and attributes.
Decision
We will adopt a declarative, "GitOps-style" approach for managing service-defined roles and attributes, centered around a Service Manifest file. This will be combined with Lazy Initialization and Lazy Sync patterns to manage the lifecycle of these resources within tenants.
- The citadel.manifest.yaml File:
  - Each service that defines its own roles or attributes will include a citadel.manifest.yaml file in its source code repository.
  - This manifest declaratively lists the roles and attributes the service requires, including descriptions and validation rules.

        # Example: /apps/erp-service/citadel.manifest.yaml
        roles:
          - name: erp_manager
            description: "Can manage ERP settings."
        attributes:
          - key: cost_center
            type: string
            description: "The cost center for ERP reporting."
- CI/CD Sync Process:
  - On deployment, the service's CI/CD pipeline will make an S2S call to a new iam-service endpoint (e.g., POST /s2s/system/sync) with the manifest content.
  - The iam-service will idempotently create or update its internal "System Role Templates" and global attribute schemas based on the manifest. These templates will be versioned.
- Lazy Initialization for Tenants:
  - When a new tenant is created, no service-specific roles are created for them.
  - The first time an admin from that tenant needs to manage roles for a specific service (e.g., erp-service), the iam-service will detect this.
  - It will then create the actual, tenant-scoped roles (erp_manager) for that tenant just-in-time, based on the latest version of the System Role Template.
- Lazy Sync for Updates:
  - If a service updates its manifest (e.g., adds a permission to the erp_manager role), the System Role Template in iam-service is updated and its version is incremented.
  - Existing tenant roles are now "stale".
  - The next time an admin from an affected tenant interacts with their roles, the iam-service will detect the version mismatch.
  - It will perform a just-in-time migration, applying the template changes to the tenant's specific role and updating its synced version number.
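The sync and the two lazy patterns can be sketched together as follows. This is a minimal in-memory illustration, not the iam-service implementation: `IamStore`, `RoleTemplate`, `TenantRole`, `sync_manifest`, and `roles_for_admin` are hypothetical names, and the sketch assumes a role definition consists only of a name and a description (attribute schemas and permissions are left out).

```python
from dataclasses import dataclass, field


@dataclass
class RoleTemplate:
    """Hypothetical "System Role Template" held by iam-service."""
    name: str
    description: str
    version: int = 1


@dataclass
class TenantRole:
    """Tenant-scoped role, tagged with the template version it was synced from."""
    name: str
    description: str
    synced_version: int


@dataclass
class IamStore:
    # (service, role name) -> RoleTemplate
    templates: dict = field(default_factory=dict)
    # (tenant id, service) -> {role name: TenantRole}
    tenant_roles: dict = field(default_factory=dict)

    def sync_manifest(self, service: str, manifest: dict) -> None:
        """Idempotent upsert: the version changes only when the definition does."""
        for role in manifest.get("roles", []):
            key = (service, role["name"])
            current = self.templates.get(key)
            if current is None:
                self.templates[key] = RoleTemplate(role["name"], role["description"])
            elif current.description != role["description"]:
                current.description = role["description"]
                current.version += 1

    def roles_for_admin(self, tenant_id: str, service: str) -> dict:
        """Assumed entry point: a tenant admin opens role management for a service."""
        roles = self.tenant_roles.setdefault((tenant_id, service), {})
        for (svc, name), template in self.templates.items():
            if svc != service:
                continue
            role = roles.get(name)
            if role is None:
                # Lazy Initialization: the first access creates the tenant role.
                roles[name] = TenantRole(name, template.description, template.version)
            elif role.synced_version < template.version:
                # Lazy Sync: a stale role is migrated just in time.
                role.description = template.description
                role.synced_version = template.version
        return roles
```

Under these assumptions, replaying the same manifest leaves the template version unchanged, and a tenant's role only picks up a new template version on the next `roles_for_admin` call for that tenant.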
Consequences
Positive
- Developer-Friendly: Developers can manage their service's authorization needs in a simple YAML file co-located with their code. The process is transparent and requires no manual intervention.
- Automated & Scalable: The sync process is fully automated via CI/CD. The "lazy" patterns keep the work proportional to the number of active tenants, as roles are only created and updated for a tenant when needed.
- Clear Ownership: The service that needs the role is the one that defines it, following the "you build it, you run it" philosophy.
- Resilient: The "Lazy Sync" approach distributes the load of updates over time and minimizes the blast radius of any potential sync failure.
Negative
- Increased iam-service Complexity: The iam-service must now manage template versioning, and the logic for "Lazy Init" and "Lazy Sync" adds complexity to its role management features.
- Potential for Stale Roles: Roles for inactive tenants will remain stale until they are next accessed. This is an acceptable trade-off, as it avoids unnecessary background jobs and the roles are guaranteed to be up-to-date when they are actively used.
Future Considerations: Integration with an External Policy Service
This ADR is designed to be forward-compatible with the potential introduction of a dedicated, external policy-service (e.g., one built on Open Policy Agent). In such a scenario, the iam-service would evolve its role:
- The iam-service would no longer store the "System Role Templates" itself. Instead, the CI/CD sync process would populate the external policy-service.
- The iam-service's "Lazy Sync" logic would remain, but instead of querying its own database for templates, it would make an API call to the policy-service to fetch the latest role and attribute definitions.
This maintains a clean separation of concerns: the policy-service would own the definitions, while the iam-service would continue to own the assignments and their lifecycle.
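One way to keep that swap cheap is to hide the template lookup behind a small interface. The sketch below is illustrative only: `TemplateSource`, `LocalTemplateSource`, `PolicyServiceTemplateSource`, and `is_stale` are hypothetical names, and `fetch` stands in for whatever client call a future policy-service might expose (no real endpoint or payload shape is assumed beyond a version and a definition).

```python
from typing import Callable, Dict, Protocol, Tuple


class TemplateSource(Protocol):
    """Anything that can return (version, definition) for a service's role."""

    def latest(self, service: str, role_name: str) -> Tuple[int, dict]: ...


class LocalTemplateSource:
    """Current arrangement: templates live in iam-service's own storage (a dict here)."""

    def __init__(self, templates: Dict[Tuple[str, str], Tuple[int, dict]]) -> None:
        self._templates = templates

    def latest(self, service: str, role_name: str) -> Tuple[int, dict]:
        return self._templates[(service, role_name)]


class PolicyServiceTemplateSource:
    """Possible future arrangement: definitions are fetched from policy-service.

    `fetch` is a hypothetical callable wrapping the remote call; it is assumed
    to return a dict with "version" and "definition" keys.
    """

    def __init__(self, fetch: Callable[[str, str], dict]) -> None:
        self._fetch = fetch

    def latest(self, service: str, role_name: str) -> Tuple[int, dict]:
        payload = self._fetch(service, role_name)
        return payload["version"], payload["definition"]


def is_stale(source: TemplateSource, service: str, role_name: str, synced_version: int) -> bool:
    """Lazy Sync check: the same comparison regardless of where templates live."""
    version, _ = source.latest(service, role_name)
    return synced_version < version
```

With this shape, the "Lazy Sync" comparison in `is_stale` does not change when the source of definitions moves; only the object passed in as `source` does.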
This ADR establishes a clear, automated, and scalable pattern for managing the lifecycle of roles and attributes in a distributed system, significantly improving the developer experience.