
0006: API Gateway Technology and Configuration Strategy

Date: 2025-08-04

Status: Accepted

Context

The API Gateway is a P0 component that acts as the single entry point for all client traffic. Its technology and configuration strategy must satisfy five requirements:

  1. Easy Local Development: Must be simple to run locally via docker-compose or in a devcontainer without the overhead of a full Kubernetes cluster.
  2. Production Scalability: Must be robust and scalable for production deployments on Kubernetes or other orchestrators.
  3. CI/CD Friendliness: Must be lightweight enough to be easily integrated into automated testing pipelines.
  4. Interoperability: The architecture must allow customers to use their own API gateways (e.g., cloud provider offerings) to connect to our services.
  5. Migration Flexibility: We must avoid vendor lock-in, allowing for a future migration to a different gateway technology if our initial choice becomes limiting.

Decision

We will adopt a hybrid strategy that uses Traefik Proxy as the implementation and the official Kubernetes Gateway API as our standard for production configuration.

  1. Technology Choice: Traefik Proxy

    • Traefik is chosen for its simplicity, performance, and first-class integration with both Docker and Kubernetes. Its Docker provider discovers running containers and reads their routing labels automatically, which keeps local development setup minimal.
  2. Configuration Strategy: Kubernetes Gateway API

    • For production Kubernetes deployments, all routes and policies will be defined using the standard, vendor-neutral Kubernetes Gateway API CRDs (Gateway, HTTPRoute, etc.). This ensures our configuration is portable and directly addresses the requirement for migration flexibility.
    • For local development (docker-compose), we will use Traefik's native Docker labels. This is the simplest and most idiomatic way to configure Traefik in a Docker environment, providing a frictionless developer experience.
  3. Routing Strategy: Path-Based Direct Routing

    • The public API will use a path-based routing convention: /api/<service-name>/....
    • The gateway is responsible for matching on the /api/<service-name> prefix and forwarding the request directly to the upstream service. The service itself is responsible for handling the full path.
    • Example: A request to https://erp.castlecraft.in/api/iam/v1/users will be routed to the iam-service, which handles the full /api/iam/v1/users path.
    • This strategy simplifies gateway configuration (no path rewriting) and makes services more self-contained, as their internal API paths match the public-facing ones.
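
The local-development half of this decision can be sketched as a docker-compose fragment. This is a minimal illustration under stated assumptions, not the project's actual compose file: the image names, the Traefik version tag, the host port 8080, and the upstream port 8000 are all hypothetical. Only the /api/iam prefix and the iam-service name come from the example above.

```yaml
services:
  gateway:
    image: traefik:v3.1                  # version tag is an assumption
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false   # only route containers that opt in
      - --entrypoints.web.address=:80
    ports:
      - "8080:80"                        # hypothetical host port
    volumes:
      # Read-only access to the Docker socket lets Traefik discover containers.
      - /var/run/docker.sock:/var/run/docker.sock:ro

  iam-service:
    image: example/iam-service:dev       # hypothetical image name
    labels:
      - "traefik.enable=true"
      # Match on the public prefix; no strip or rewrite middleware is attached,
      # so the service receives the full /api/iam/... path.
      - "traefik.http.routers.iam.rule=PathPrefix(`/api/iam`)"
      - "traefik.http.routers.iam.entrypoints=web"
      # Hypothetical port the service listens on inside the container.
      - "traefik.http.services.iam.loadbalancer.server.port=8000"
```

With this sketch, a request to http://localhost:8080/api/iam/v1/users would reach iam-service with the path /api/iam/v1/users unchanged.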

Consequences

Positive:

  • Satisfies All Requirements: This approach meets each of the five requirements listed in the Context.

    • Local Dev: A single docker-compose up starts the gateway and the services, with each route declared as Traefik labels on the service it belongs to.
    • Production: Traefik is a proven, scalable gateway for Kubernetes.
    • CI/CD: The same lightweight docker-compose setup can run inside a pipeline job, which makes it well suited to integration testing.
    • Interoperability: This is an architectural guarantee, not a technology choice. Because our IAM Service issues standard OIDC JWTs, any external gateway that supports JWT validation can be configured to verify them. For example, an enterprise on AWS can use the AWS API Gateway with a JWT Authorizer. They would configure the authorizer with our IAM Service as the token issuer, so that it obtains the signing keys from the IAM Service's public JWKS endpoint. Once the token is validated, the AWS API Gateway can route requests to our backend services running in their VPC. Our choice of an internal gateway (Traefik) does not prevent this.
    • Migration: By standardizing on the Kubernetes Gateway API for our production configuration, we decouple ourselves from Traefik's specific implementation details. Migrating to another gateway that supports the Gateway API (like Kong or Istio) would primarily involve swapping the gateway software, not rewriting all our route definitions.
  • Developer Experience: Developers get a simple, fast, and intuitive local environment without needing to understand complex Kubernetes manifests.

  • Operational Consistency: The operations team gets a standardized, portable, and GitOps-friendly way to manage gateway configuration in production.
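
The Interoperability point above can be illustrated with a CloudFormation fragment for a JWT authorizer on an AWS HTTP API. This is a hedged sketch of what a customer might write, not a configuration we ship: the resource names, the issuer URL, and the audience value are placeholders, and CustomerHttpApi is assumed to be defined elsewhere in the customer's template. The authorizer is given the issuer URL and, per standard OIDC discovery, locates the JWKS endpoint from the issuer's metadata.

```yaml
Resources:
  IamJwtAuthorizer:
    Type: AWS::ApiGatewayV2::Authorizer
    Properties:
      ApiId: !Ref CustomerHttpApi        # hypothetical HTTP API defined elsewhere
      Name: iam-jwt-authorizer           # hypothetical name
      AuthorizerType: JWT
      IdentitySource:
        # Read the bearer token from the Authorization header.
        - $request.header.Authorization
      JwtConfiguration:
        # Placeholder issuer URL for our IAM Service; must match the token's iss claim.
        Issuer: https://iam.example.com
        Audience:
          - example-client-id            # placeholder; must match the token's aud claim
```

Routing from the AWS API Gateway to services inside the customer's VPC is configured separately and is not shown here.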

Negative:

  • Dual Configuration Models: We will have two distinct methods for configuring routes: Docker labels for local dev and Gateway API manifests for production. This requires clear documentation so that developers understand the mapping between the two. This is a small and acceptable trade-off for the significant benefits in developer experience and production portability.
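
To make the mapping concrete, here is a sketch of the production-side counterpart to a Docker label rule such as PathPrefix(`/api/iam`), expressed with Gateway API resources. The resource names, the gatewayClassName value, and both port numbers are assumptions for illustration; the hostname and path prefix are taken from the routing example in the Decision section.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: public-gateway                   # hypothetical name
spec:
  # The only implementation-specific reference. A migration to another
  # Gateway API implementation would point this at a different GatewayClass.
  gatewayClassName: traefik              # assumed GatewayClass name
  listeners:
    - name: web
      protocol: HTTP
      port: 8000                         # assumed to match the gateway's entry point
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: iam
spec:
  parentRefs:
    - name: public-gateway
  hostnames:
    - erp.castlecraft.in
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/iam
      # No URLRewrite filter is attached, so the upstream service
      # receives the full /api/iam/... path.
      backendRefs:
        - name: iam-service
          port: 8000                     # hypothetical Service port
```

The label's router rule corresponds to the HTTPRoute match, and the label's load-balancer port corresponds to the backendRefs port, so documentation of the mapping can be organized around those two pairs.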