0006: API Gateway Technology and Configuration Strategy
Date: 2025-08-04
Status: Accepted
Context
The API Gateway is a P0 component that acts as the single entry point for all client traffic. The selection of its technology and configuration strategy must satisfy several key requirements:
- Easy Local Development: Must be simple to run locally via docker-compose or in a devcontainer without the overhead of a full Kubernetes cluster.
- Production Scalability: Must be robust and scalable for production deployments on Kubernetes or other orchestrators.
- CI/CD Friendliness: Must be lightweight enough to be easily integrated into automated testing pipelines.
- Interoperability: The architecture must allow customers to use their own API gateways (e.g., cloud provider offerings) to connect to our services.
- Migration Flexibility: We must avoid vendor lock-in, allowing for a future migration to a different gateway technology if our initial choice becomes limiting.
Decision
We will adopt a hybrid strategy that uses Traefik Proxy as the implementation and the official Kubernetes Gateway API as our standard for production configuration.
- Technology Choice: Traefik Proxy
  - Traefik is chosen for its simplicity, performance, and excellent integration with both Docker and Kubernetes. Its auto-discovery features make it exceptionally easy for local development.
- Configuration Strategy: Kubernetes Gateway API
  - For production Kubernetes deployments, all routes and policies will be defined using the standard, vendor-neutral Kubernetes Gateway API CRDs (Gateway, HTTPRoute, etc.). This ensures our configuration is portable and directly addresses the requirement for migration flexibility.
  - For local development (docker-compose), we will use Traefik's native Docker labels. This is the simplest and most idiomatic way to configure Traefik in a Docker environment, providing a frictionless developer experience.
- Routing Strategy: Path-Based Direct Routing
  - The public API will use a path-based routing convention: /api/<service-name>/...
  - The gateway is responsible for matching on the /api/<service-name> prefix and forwarding the request directly to the upstream service. The service itself is responsible for handling the full path.
  - Example: A request to https://erp.castlecraft.in/api/iam/v1/users will be routed to the iam-service, which handles the full /api/iam/v1/users path.
  - This strategy simplifies gateway configuration (no path rewriting) and makes services more self-contained, as their internal API paths match the public-facing ones.
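As an illustration of the production side of this decision, the routing convention could be expressed as a Gateway API HTTPRoute roughly like the sketch below. Only the hostname, the /api/iam prefix, and the iam-service name come from the example above. The route name, the namespace, the parent Gateway name (public-gateway), the assumption that the Kubernetes Service is also called iam-service, and the backend port (8000) are hypothetical placeholders, not values defined by this ADR.

```yaml
# Sketch only: names, namespace, and port are illustrative assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: iam-route            # hypothetical route name
  namespace: default         # hypothetical namespace
spec:
  parentRefs:
    - name: public-gateway   # hypothetical Gateway served by Traefik
  hostnames:
    - erp.castlecraft.in
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/iam  # matches /api/iam/v1/users and other sub-paths
      # No URLRewrite filter is configured, so the upstream service
      # receives the full original path, as the routing strategy requires.
      backendRefs:
        - name: iam-service  # assumed Kubernetes Service name
          port: 8000         # hypothetical service port
```

Because the manifest uses only standard Gateway API fields, the same file should in principle be accepted by any conformant implementation, which is the portability property this decision relies on.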
Consequences
Positive:
- Satisfies All Requirements: This approach directly meets all the stated goals.
  - Local Dev: docker-compose up with Traefik labels is extremely simple.
  - Production: Traefik is a proven, scalable gateway for Kubernetes.
  - CI/CD: The lightweight docker-compose setup is perfect for integration testing.
  - Interoperability: This is an architectural guarantee, not a technology choice. Because our IAM Service issues standard OIDC JWTs, any external gateway can be configured to validate them. For example, an enterprise on AWS can use the AWS API Gateway with a JWT Authorizer. They would configure the authorizer to point to our IAM Service's public JWKS endpoint. Once the token is validated, the AWS API Gateway can route requests to our backend services running in their VPC. Our choice of an internal gateway (Traefik) does not prevent this.
  - Migration: By standardizing on the Kubernetes Gateway API for our production configuration, we decouple ourselves from Traefik's specific implementation details. Migrating to another gateway that supports the Gateway API (like Kong or Istio) would primarily involve swapping the gateway software, not rewriting all our route definitions.
- Developer Experience: Developers get a simple, fast, and intuitive local environment without needing to understand complex Kubernetes manifests.
- Operational Consistency: The operations team gets a standardized, portable, and GitOps-friendly way to manage gateway configuration in production.
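For the local-development case, a minimal docker-compose sketch using Traefik's Docker labels might look like the following. It is an illustration under stated assumptions rather than the project's actual compose file: the Traefik image tag, the entrypoint name (web), the router and service key (iam), the application image, and the container port (8000) are all hypothetical. Only the /api/iam prefix and the iam-service name are taken from this document.

```yaml
# Sketch only: image names, ports, and router names are illustrative assumptions.
services:
  traefik:
    image: traefik:v3.0
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false  # route only opted-in containers
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      # Read-only access to the Docker socket enables container auto-discovery.
      - /var/run/docker.sock:/var/run/docker.sock:ro

  iam-service:
    image: example/iam-service:latest  # hypothetical image
    labels:
      - "traefik.enable=true"
      # Match on the public prefix; no path-stripping middleware is attached,
      # so the service sees the full /api/iam/... path.
      - "traefik.http.routers.iam.rule=PathPrefix(`/api/iam`)"
      - "traefik.http.routers.iam.entrypoints=web"
      - "traefik.http.services.iam.loadbalancer.server.port=8000"  # hypothetical port
```

With a file like this, docker-compose up would start the gateway and the service together, and a request to http://localhost/api/iam/v1/users would be forwarded to the iam-service container unchanged.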
Negative:
- Dual Configuration Models: We will have two distinct methods for configuring routes: Docker labels for local dev and Gateway API manifests for production. This requires clear documentation so that developers understand how a route expressed in one model maps to the other. We accept this as a small trade-off for the significant benefits in developer experience and production portability.