
0054: Service-Defined Role and Attribute Management

Date: 2025-12-03

Status: Proposed

Context

A central IAM service owns the "source of truth" for what roles and attributes exist. However, individual downstream services (e.g., erp-service) own the business logic and know what roles and attributes they require to function. This creates a significant synchronization problem and operational friction.

Currently, there is no automated process for a service to declare its authorization needs to the iam-service. Instead, every change requires manual coordination, complex migration scripts, or direct database manipulation by a Super Admin, all of which are slow, error-prone, and do not scale. We need a developer-friendly, automated way to manage the lifecycle of service-specific roles and attributes.

Decision

We will adopt a declarative, "GitOps-style" approach for managing service-defined roles and attributes, centered around a Service Manifest file. This will be combined with Lazy Initialization and Lazy Sync patterns to manage the lifecycle of these resources within tenants.

  1. The citadel.manifest.yaml File:

    • Each service that defines its own roles or attributes will include a citadel.manifest.yaml file in its source code repository.
    • This manifest declaratively lists the roles and attributes the service requires, including descriptions and validation rules.
    # Example: /apps/erp-service/citadel.manifest.yaml
    roles:
      - name: erp_manager
        description: "Can manage ERP settings."
    attributes:
      - key: cost_center
        type: string
        description: "The cost center for ERP reporting."
  2. CI/CD Sync Process:

    • On deployment, the service's CI/CD pipeline will make a service-to-service (S2S) call to a new iam-service endpoint (e.g., POST /s2s/system/sync) with the manifest content.
    • The iam-service will idempotently create or update its internal "System Role Templates" and global attribute schemas based on the manifest. These templates will be versioned.
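
To illustrate what "idempotently create or update" with versioning could look like, here is a minimal sketch. It assumes the manifest has already been parsed into a dictionary, that templates are kept in an in-memory map instead of a database, and that a content hash is used to detect changes. The names TemplateStore, RoleTemplate, and sync are hypothetical and not part of this ADR.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class RoleTemplate:
    name: str
    description: str
    version: int
    digest: str  # hash of the last role definition that was applied


class TemplateStore:
    """In-memory stand-in for the iam-service template tables (hypothetical)."""

    def __init__(self) -> None:
        # Keyed by (service name, role name).
        self.templates: dict[tuple[str, str], RoleTemplate] = {}

    def sync(self, service: str, manifest: dict) -> list[str]:
        """Apply a parsed manifest; return the roles that were created or updated."""
        changed = []
        for role in manifest.get("roles", []):
            # A stable digest of the definition lets us detect real changes.
            digest = hashlib.sha256(
                json.dumps(role, sort_keys=True).encode("utf-8")
            ).hexdigest()
            key = (service, role["name"])
            current = self.templates.get(key)
            if current is not None and current.digest == digest:
                continue  # unchanged definition: no write, no version bump
            version = 1 if current is None else current.version + 1
            self.templates[key] = RoleTemplate(
                role["name"], role.get("description", ""), version, digest
            )
            changed.append(role["name"])
        return changed
```

Under these assumptions, re-sending the same manifest on every deployment is a no-op, and the template version only increases when a role definition actually changes. Attribute schemas could be handled the same way.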
  3. Lazy Initialization for Tenants:

    • When a new tenant is created, no service-specific roles are created for them.
    • The first time an admin from that tenant needs to manage roles for a specific service (e.g., erp-service), the iam-service will detect this.
    • It will then create the actual, tenant-scoped roles (erp_manager) for that tenant just-in-time, based on the latest version of the System Role Template.
  4. Lazy Sync for Updates:

    • If a service updates its manifest (e.g., adds a permission to the erp_manager role), the System Role Template in iam-service is updated and its version is incremented.
    • Existing tenant roles that were created from an earlier template version are now "stale".
    • The next time an admin from an affected tenant interacts with their roles, the iam-service will detect the version mismatch.
    • It will perform a just-in-time migration, applying the template changes to the tenant's specific role and updating its synced version number.
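
The Lazy Initialization and Lazy Sync steps above can be sketched as a single check that runs whenever an admin touches a service's roles. This is an illustrative sketch only: the class and field names (Template, TenantRole, TenantRoleManager, synced_version) and the permission strings are assumptions, storage is in-memory, and the "migration" simply overwrites the tenant role's permissions with the template's. A real migration would likely need to preserve tenant-specific state.

```python
from dataclasses import dataclass


@dataclass
class Template:
    name: str
    permissions: frozenset
    version: int


@dataclass
class TenantRole:
    name: str
    permissions: frozenset
    synced_version: int  # template version this role was last aligned with


class TenantRoleManager:
    """Hypothetical role manager showing just-in-time creation and sync."""

    def __init__(self, templates: dict) -> None:
        self.templates = templates  # service -> list of Template
        self.tenant_roles = {}      # (tenant_id, service) -> {role name: TenantRole}

    def roles_for_admin(self, tenant_id: str, service: str) -> dict:
        """Entry point for any admin interaction with a service's roles."""
        roles = self.tenant_roles.setdefault((tenant_id, service), {})
        for tpl in self.templates.get(service, []):
            role = roles.get(tpl.name)
            if role is None:
                # Lazy Initialization: first access creates the tenant-scoped role.
                roles[tpl.name] = TenantRole(tpl.name, tpl.permissions, tpl.version)
            elif role.synced_version < tpl.version:
                # Lazy Sync: version mismatch triggers a just-in-time migration.
                role.permissions = tpl.permissions
                role.synced_version = tpl.version
        return roles
```

In this sketch a newly created tenant has no entries at all, and a tenant that never opens role management for a service never incurs any creation or migration work for it.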

Consequences

Positive

  • Developer-Friendly: Developers can manage their service's authorization needs in a simple YAML file co-located with their code. The process is transparent and requires no manual intervention.
  • Automated & Scalable: The sync process is fully automated via CI/CD. The "lazy" patterns ensure that the system scales well, as roles are only created and updated for active tenants when needed.
  • Clear Ownership: The service that needs the role is the one that defines it, following the "you build it, you run it" philosophy.
  • Resilient: The "Lazy Sync" approach distributes the load of updates over time and minimizes the blast radius of any potential sync failure.

Negative

  • Increased iam-service Complexity: The iam-service must now manage template versioning, and the logic for "Lazy Init" and "Lazy Sync" adds complexity to its role management features.
  • Potential for Stale Roles: Roles for inactive tenants will remain stale until they are next accessed. This is an acceptable trade-off, as it avoids unnecessary background jobs and the roles are brought up to date the next time an admin from that tenant interacts with them.

Future Considerations: Integration with an External Policy Service

This ADR is designed to be forward-compatible with the potential introduction of a dedicated, external policy-service (e.g., one built on Open Policy Agent). In such a scenario, the iam-service would evolve its role:

  • The iam-service would no longer store the "System Role Templates" itself. Instead, the CI/CD sync process would populate the external policy-service.
  • The iam-service's "Lazy Sync" logic would remain, but instead of querying its own database for templates, it would make an API call to the policy-service to fetch the latest role and attribute definitions.

This maintains a clean separation of concerns: the policy-service would own the definitions, while the iam-service would continue to own the assignments and their lifecycle.
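
One way to keep that migration cheap is to have the Lazy Sync logic depend on a small template-lookup interface instead of on the database directly. The sketch below is an assumption about how this might be structured, not a committed design: TemplateSource, LocalTemplateSource, PolicyServiceTemplateSource, and stale_roles are hypothetical names, templates are plain dictionaries, and the policy-service client is represented by an injected callable instead of a real HTTP call.

```python
from typing import Protocol


class TemplateSource(Protocol):
    """Anything that can return the latest templates for a service."""

    def latest(self, service: str) -> list: ...


class LocalTemplateSource:
    """Current arrangement: templates live in the iam-service's own store."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows  # service -> list of {"name": ..., "version": ...}

    def latest(self, service: str) -> list:
        return list(self._rows.get(service, []))


class PolicyServiceTemplateSource:
    """Possible future arrangement: templates fetched from the policy-service."""

    def __init__(self, fetch) -> None:
        # fetch(service) -> list of templates; stands in for an API client.
        self._fetch = fetch

    def latest(self, service: str) -> list:
        return list(self._fetch(service))


def stale_roles(source: TemplateSource, service: str, synced: dict) -> list:
    """Return names of tenant roles whose synced version lags the latest template."""
    return [
        tpl["name"]
        for tpl in source.latest(service)
        if synced.get(tpl["name"], 0) < tpl["version"]
    ]
```

With this shape, the staleness check is unaware of where definitions come from, so swapping the local store for the policy-service would be a change of constructor argument and would leave the sync logic untouched.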

This ADR establishes a clear, automated, and scalable pattern for managing the lifecycle of roles and attributes in a distributed system, significantly improving the developer experience.