
Token exchange with Spring Cloud Gateway

spring-boot · spring-cloud-gateway · security · oauth2 · keycloak · microservices

Why token exchange

Most OAuth2 tutorials end at the login screen. The user authenticates, gets an access token, and every service behind the gateway trusts that same token. Same claims, same audience, same scopes. If one downstream service gets compromised, the attacker holds a token that works everywhere. And the authorization server has no record of which service requested access on behalf of which user.

RFC 8693 fixes this. A service receives the user's token, presents it to the authorization server, and gets back a new token scoped to the specific downstream service it needs to call. The user identity is preserved, but the new token can have a different audience, different scopes, different lifetime. The authorization server decides whether the exchange is allowed.

If you care about least privilege in a microservices architecture, this is how identity delegation should work. The CNCF's guide on zero trust with Keycloak covers the reasoning at scale.

This article walks through a complete implementation: Spring Cloud Gateway exchanging tokens via Keycloak's RFC 8693 endpoint, forwarding scoped JWTs to downstream Spring Boot services, with three frontend frameworks visualizing the exchange in real time. The companion repository is spring-oauth-token-exchange-rfc8693 on GitHub, and everything described here runs with a single command.


The RFC in sixty seconds

RFC 8693 adds a new OAuth2 grant type: urn:ietf:params:oauth:grant-type:token-exchange. A client sends a subject_token (the token to exchange) along with parameters like audience, scope, and requested_token_type to the authorization server's token endpoint. The server validates the subject token, applies its policy, and returns a new token scoped to the requested audience, with the user's identity preserved.

The difference from a client credentials grant is that the user's identity flows through. The downstream service knows who the user is and what roles they have, but the token itself is scoped to that specific service. The authorization server decides whether the exchange is allowed.
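On the wire, the exchange is a plain form POST to the authorization server's token endpoint. Here's a minimal sketch of the request body; the parameter names and URNs come from RFC 8693, while the helper name and scope value are illustrative:

```typescript
// Build the form body for an RFC 8693 token exchange request.
// The `audience` parameter is optional per the RFC and omitted here;
// this demo controls the audience through scopes instead.
function buildExchangeBody(subjectToken: string, scope: string): URLSearchParams {
  return new URLSearchParams({
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: subjectToken,
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    requested_token_type: "urn:ietf:params:oauth:token-type:access_token",
    scope,
  });
}
```

The confidential client POSTs this body to the token endpoint along with its own client credentials, and the response carries the new access token.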

I'd recommend Authlete's explainer as a companion to the spec. Unless reading RFCs over coffee is your idea of a good morning.


Architecture

The demo models a conference platform. A Spring Cloud Gateway sits between frontend clients and two backend microservices. The frontend authenticates with Keycloak using a public OIDC client with PKCE. The Gateway receives the user's access token, validates it via introspection, exchanges it for a service-specific JWT through Keycloak's RFC 8693 endpoint, and forwards that JWT to the downstream service.

Full architecture: frontends, gateway, Keycloak, downstream services, and observability

Two test users demonstrate role-based access: Alice has the speaker role and can access the Talk Service, while Bob has the reviewer role and can access the Review Service.

Each frontend includes a diff table comparing the token the browser sent with the token the downstream service received, claim by claim. You can actually see the exchange happen.

The request flow

Here's what a single API call looks like from the frontend to a downstream service:

A single request triggers introspection (every time) and token exchange (cached after first call)

Two separate calls to Keycloak happen at the Gateway, and they behave very differently. Both route through Traefik (http://localhost/auth/... proxied to the Keycloak container), which adds a hop but keeps all traffic flowing through a single entry point to match a production reverse proxy setup. The distinction matters for performance and is easy to overlook.

Token introspection validates the incoming access token. The Gateway uses opaqueToken(), so Spring Security calls Keycloak's /token/introspect endpoint on every request rather than decoding the JWT locally. Each introspection request travels Gateway → Traefik → Keycloak and back. Why not just validate the JWT locally? Introspection lets the authorization server enforce revocation. A locally validated JWT is trusted until it expires, even if the session has been terminated. The trade-off is latency: Spring Security does not cache introspection results, so every request round-trips to the authorization server. In production with high throughput, you'd cache introspection results with a short TTL or switch to local JWT validation if immediate revocation isn't a requirement.
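The caching idea is simple enough to sketch (in TypeScript for brevity; the names here are hypothetical, not Spring Security API): memoize positive introspection results for a short TTL, keyed by token.

```typescript
// Hypothetical sketch: cache introspection results for a short TTL so
// repeated requests with the same token skip the round-trip to Keycloak.
type Introspection = { active: boolean; [claim: string]: unknown };

function cachedIntrospector(
  introspect: (token: string) => Promise<Introspection>,
  ttlMs: number,
) {
  const cache = new Map<string, { result: Introspection; expiresAt: number }>();
  return async (token: string): Promise<Introspection> => {
    const hit = cache.get(token);
    if (hit && hit.expiresAt > Date.now()) return hit.result;
    const result = await introspect(token);
    // Only cache positive results: a token reported inactive stays inactive.
    if (result.active) cache.set(token, { result, expiresAt: Date.now() + ttlMs });
    return result;
  };
}
```

The trade-off is explicit: a revoked token keeps working for at most `ttlMs` after revocation, which is the revocation window you accept in exchange for fewer round-trips.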

Token exchange swaps the access token for a service-specific JWT via RFC 8693. This call is cached. Spring Security's InMemoryReactiveOAuth2AuthorizedClientService stores the exchanged JWT in memory, keyed by clientRegistrationId + principalName. The exchange only fires when:

  • The first request for a given user arrives after gateway startup
  • The cached token expires
  • The Gateway JVM restarts

One thing that surprised me: logging out and back in may not trigger a new exchange if the cache entry is still valid. The cache lives in JVM memory, not tied to the browser session. Restart the Gateway to clear it if you want to see the Keycloak exchange hop in every trace. In production, this caching is correct behavior.


Keycloak setup

Keycloak 26+ simplified token exchange compared to earlier versions, where you had to configure fine-grained authorization policies through the admin console. Now it's a single client attribute. The official Keycloak docs cover both the legacy V1 and the current V2 mechanism.

Clients

The architecture requires two Keycloak clients.

Client   | Represents  | Purpose
public   | SPAs        | PKCE login, no secret
gateway  | API Gateway | Confidential, performs token exchange

The public client is what the frontend uses. Standard OIDC authorization code flow with PKCE (S256), no client secret. Lightweight access tokens are enabled, which reduces token size but has implications for what claims are available (more on that below).

One thing worth calling out: Keycloak does not support truly opaque tokens. Lightweight access tokens are still JWTs, just with personal information like preferred_username, email, and given_name stripped from the payload. Anyone with the token can decode the header and see it's a JWT. The demo uses lightweight tokens as the closest Keycloak gets to opaque, combined with Spring Security's opaqueToken() configuration that forces the Gateway to validate via introspection rather than decoding locally. From the Gateway's perspective, the token is an opaque string. It never looks inside.
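This is easy to verify yourself: the header segment of any compact JWS decodes with plain base64url, no key required. A sketch using Node's Buffer (the token in the test is constructed, not a real Keycloak token):

```typescript
// Decode the header segment of a JWT. Works on any compact JWS, including
// Keycloak's "lightweight" access tokens — proof that they are not opaque.
function decodeJwtHeader(token: string): Record<string, unknown> {
  const headerB64 = token.split(".")[0];
  const json = Buffer.from(headerB64, "base64url").toString("utf8");
  return JSON.parse(json);
}
```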

For truly opaque tokens (random strings with no decodable payload), an identity provider like ZITADEL is needed. That's planned as a follow-up.

The gateway client is the Gateway's server-side client. Confidential, with a client secret. The setting that enables token exchange:

const gatewayClientConfig = {
  clientId: gw.clientId,
  enabled: true,
  publicClient: false,
  clientAuthenticatorType: "client-secret",
  secret: gw.secret,
  standardFlowEnabled: true,
  attributes: {
    "standard.token.exchange.enabled": "true", 
  },
}
typescript|setup-keycloak.ts

One attribute. That's what activates RFC 8693 on the gateway client. No admin console clicking, no fine-grained authorization policies.

The audience mapper

The public client needs an audience mapper that includes the gateway client's client_id in the aud claim of issued tokens. Without this, Keycloak rejects the exchange because the subject token wasn't intended for the requesting client.

const mapperConfig = {
  name: mapperName,
  protocol: "openid-connect",
  protocolMapper: "oidc-audience-mapper",
  config: {
    "included.client.audience": gw.clientId,
    "id.token.claim": "false",
    "lightweight.claim": "true", 
    "introspection.token.claim": "true",
    "access.token.claim": "true",
  },
}
typescript|setup-keycloak.ts

The lightweight.claim: "true" setting is easy to miss. When lightweight access tokens are enabled, Keycloak strips most claims by default. If the audience mapper doesn't explicitly opt into lightweight tokens, the aud claim doesn't make it into the token, and the exchange fails with a generic error that tells you nothing useful.

Per-service audience scoping via client scopes

Keycloak 26's standard token exchange (standard.token.exchange.enabled) doesn't support the RFC 8693 audience POST parameter. The idiomatic Keycloak approach is scope-based audience control through client scopes.

The setup script creates one client scope per downstream service: api-talk-service and api-review-service. Each scope contains an audience mapper that sets the aud claim to the respective service name:

const audMapperConfig = {
  name: audMapperName,
  protocol: "openid-connect",
  protocolMapper: "oidc-audience-mapper",
  config: {
    "included.custom.audience": service, 
    "id.token.claim": "false",
    "access.token.claim": "true",
    "lightweight.claim": "true",
    "introspection.token.claim": "true",
  },
}
typescript|setup-keycloak.ts

These scopes are registered as optional scopes on the gateway client. When the Gateway requests scope: openid api-talk-service during token exchange, Keycloak applies the api-talk-service scope, and the audience mapper sets aud: talk-service on the exchanged JWT. The review-service route requests api-review-service instead, producing a JWT with aud: review-service. No standalone service clients exist in Keycloak. Audience scoping is handled entirely through scopes.

Roles

Two realm roles: speaker and reviewer. Alice is assigned speaker, Bob is assigned reviewer. After token exchange, these roles appear in the JWT's realm_access.roles claim, where the downstream services extract them.

Automated setup

The Keycloak configuration is fully automated with a TypeScript setup script using @keycloak/keycloak-admin-client. The script is idempotent: it creates resources if they're missing, updates them to match the configuration, and handles 409 Conflict as "already exists."

No manual clicking in the admin console. The entire realm, both clients, all client scopes, roles, users, and mappers are created from code.


The Gateway: resource server and OAuth2 client

The Spring Cloud Gateway is both an OAuth2 resource server (validating incoming tokens from frontends) and an OAuth2 client (performing the token exchange with Keycloak). You need both for transparent token exchange to work.

Security filter chain

The SecurityConfiguration registers both roles in a single filter chain:

@Bean
SecurityWebFilterChain securityWebFilterChain(
        ServerHttpSecurity http,
        SecurityProperties properties
) {
    http.authorizeExchange(exchanges -> exchanges
            .pathMatchers(properties.unprotectedPaths().toArray(new String[0])).permitAll()
            .anyExchange().authenticated()
    );

    http.oauth2ResourceServer(oauth2 -> oauth2.opaqueToken(Customizer.withDefaults())); 
    http.oauth2Client(Customizer.withDefaults()); 

    http.formLogin(ServerHttpSecurity.FormLoginSpec::disable);
    http.httpBasic(ServerHttpSecurity.HttpBasicSpec::disable);
    return http.build();
}
java|SecurityConfiguration.java

oauth2ResourceServer with opaqueToken means the Gateway validates incoming tokens by calling Keycloak's introspection endpoint. It doesn't decode the token locally. oauth2Client enables the Gateway to act as a client for outbound OAuth2 flows, including token exchange.

Client registration

The token exchange grant type is configured in the Gateway's application.yml:

spring:
  cloud:
    gateway:
      server:
        webflux:
          routes:
            - id: talk-service
              uri: lb://talk-service
              predicates:
                - Path=/talk-service/**
              filters:
                - TokenRelay=talk-service
            - id: review-service
              uri: lb://review-service
              predicates:
                - Path=/review-service/**
              filters:
                - TokenRelay=review-service

  security:
    oauth2:
      resourceserver:
        opaquetoken: 
          introspection-uri: "http://localhost/auth/realms/conference/protocol/openid-connect/token/introspect"
          client-id: "gateway"
          client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
      client:
        registration:
          talk-service: 
            provider: keycloak
            client-id: gateway
            client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
            authorization-grant-type: urn:ietf:params:oauth:grant-type:token-exchange
            scope:
              - openid
              - profile
              - email
              - api-talk-service
          review-service: 
            provider: keycloak
            client-id: gateway
            client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
            authorization-grant-type: urn:ietf:params:oauth:grant-type:token-exchange
            scope:
              - openid
              - profile
              - email
              - api-review-service
        provider:
          keycloak:
            issuer-uri: http://localhost/auth/realms/conference
yaml|application.yml

Each route gets its own TokenRelay filter pointing to a dedicated client registration. Both registrations authenticate as the same gateway client but request different scopes: api-talk-service vs api-review-service. Keycloak uses these scopes to apply the matching audience mapper, setting the aud claim on the exchanged JWT to the target service name.

The opaquetoken section configures introspection. The authorization-grant-type is the RFC 8693 grant type URN, which tells Spring Security to use TokenExchangeReactiveOAuth2AuthorizedClientProvider for these registrations.

When a request hits /talk-service/**, the TokenRelay=talk-service filter looks up the talk-service registration, sees token exchange as the grant type, and requests an exchange with scope: openid api-talk-service. Keycloak returns a JWT with aud: talk-service. The review-service route does the same with api-review-service. Each downstream service gets a JWT scoped to its specific audience.

The subject token resolver

The TokenExchangeReactiveOAuth2AuthorizedClientProvider needs to know which token to exchange:

@Bean
TokenExchangeReactiveOAuth2AuthorizedClientProvider
        tokenExchangeReactiveOAuth2AuthorizedClientProvider() {
    final var provider = new TokenExchangeReactiveOAuth2AuthorizedClientProvider();
    provider.setSubjectTokenResolver(context ->
            Mono.justOrEmpty(context.getPrincipal()) 
                    .ofType(BearerTokenAuthentication.class) 
                    .map(BearerTokenAuthentication::getToken) 
    );
    return provider;
}
java|SecurityConfiguration.java

When a request arrives, Spring Security introspects the access token with Keycloak and wraps the result in a BearerTokenAuthentication. The resolver extracts the original token from that authentication object and passes it as the subject_token parameter in the RFC 8693 exchange request. Audience scoping is handled entirely on the Keycloak side through client scopes, so no custom parameters converter is needed on the Spring Security side.

The Spring Security 6.3 token exchange blog post walks through the same pattern.

Required dependencies

The Gateway needs a specific set of dependencies for the dual resource server / OAuth2 client role:

<!-- Reactive web (required for Spring Cloud Gateway) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

<!-- Spring Cloud Gateway -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-gateway-server-webflux</artifactId>
</dependency>

<!-- OAuth2 Client (enables TokenRelay filter) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-oauth2-client</artifactId>
</dependency>

<!-- Resource Server (opaque token introspection) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security-oauth2-resource-server</artifactId>
</dependency>

<!-- JWT/JOSE support -->
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-oauth2-jose</artifactId>
</dependency>
xml|pom.xml

Miss any one of these and you get either a startup failure or silent misconfiguration. spring-boot-starter-oauth2-client is what makes the TokenRelay filter available. Without it, the filter never registers and routes forward requests with no token exchange.


Downstream services: JWT validation and role extraction

The Talk Service and Review Service are plain Spring Boot MVC applications configured as JWT resource servers. They know nothing about token exchange. They receive a JWT, validate it against Keycloak's JWKS endpoint, extract roles, and enforce access control.

spring:
  security:
    oauth2:
      resourceserver:
        jwt:
          issuer-uri: "http://localhost/auth/realms/conference"
          audiences: "talk-service"
yaml|application.yml

The audiences property tells Spring Security to reject any JWT whose aud claim doesn't include talk-service. A token exchanged for review-service won't pass validation here. Each downstream service enforces its own audience, closing the loop on per-service scoping.
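The validation itself reduces to set membership on aud, which RFC 7519 allows to be either a single string or an array. A sketch of the check (not Spring Security's actual validator, which also handles issuer and expiry):

```typescript
// Sketch of the audience check: accept the token only if its aud claim
// (string or array, per RFC 7519) contains the expected audience.
function audMatches(aud: string | string[] | undefined, expected: string): boolean {
  if (aud === undefined) return false;
  return Array.isArray(aud) ? aud.includes(expected) : aud === expected;
}
```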

The role extraction problem

Keycloak places realm roles in a nested claim: realm_access.roles. Spring Security's default JwtAuthenticationConverter doesn't read from this path. Without a custom converter, no roles are extracted, and every @PreAuthorize("hasRole('...')") check fails silently, even though the JWT contains the correct roles.

The fix is a custom Converter<Jwt, JwtAuthenticationToken> that reads from both the standard authorities and the Keycloak-specific realm_access claim:

public class JwtTokenConverter implements Converter<Jwt, JwtAuthenticationToken> {

    private static final String REALM_ACCESS_CLAIM = "realm_access";
    private static final String ROLES = "roles";

    private final JwtGrantedAuthoritiesConverter jwtGrantedAuthoritiesConverter
            = new JwtGrantedAuthoritiesConverter();

    @Override
    public JwtAuthenticationToken convert(Jwt source) {
        final var idpAuthorities = jwtGrantedAuthoritiesConverter.convert(source);
        final var applicationAuthorities = convertApplicationAuthorities(source);

        final var authorities = Stream.of(idpAuthorities, applicationAuthorities)
                .flatMap(Collection::stream)
                .toList();

        return new JwtAuthenticationToken(source, authorities);
    }

    private Collection<GrantedAuthority> convertApplicationAuthorities(Jwt jwt) {
        final var realmAccessClaims =
                (Map<String, Object>) jwt.getClaim(REALM_ACCESS_CLAIM);
        if (realmAccessClaims == null) return Collections.emptyList(); 
        final var roles = (List<String>) realmAccessClaims.get(ROLES);
        if (roles == null) return Collections.emptyList(); 
        return roles.stream()
                .map(String::toUpperCase)
                .map("ROLE_"::concat)
                .<GrantedAuthority>map(SimpleGrantedAuthority::new)
                .toList();
    }
}
java|JwtTokenConverter.java

Keycloak's ["speaker"] becomes Spring Security's ROLE_SPEAKER, which @PreAuthorize("hasRole('SPEAKER')") picks up. Both downstream services use the same converter.

The converter is wired into the security configuration:

@Bean
SecurityFilterChain securityFilterChain(
        HttpSecurity http,
        SecurityProperties properties,
        JwtTokenConverter jwtTokenConverter
) {
    http.cors(Customizer.withDefaults());
    http.csrf(AbstractHttpConfigurer::disable); // Bearer tokens are not auto-attached by browsers, so CSRF protection is unnecessary
    http.authorizeHttpRequests(authorizeRequests ->
            authorizeRequests
                    .requestMatchers(properties.unprotectedPaths().toArray(new String[0])).permitAll()
                    .anyRequest().authenticated()
    );
    http.oauth2ResourceServer(oauth2 ->
            oauth2.jwt(jwt -> jwt.jwtAuthenticationConverter(jwtTokenConverter)) 
    );
    return http.build();
}
java|SecurityConfiguration.java

Observability: tracing the full exchange

The project uses the OpenTelemetry Java agent for zero-code instrumentation. Each Spring Boot service gets the agent attached via JVM options:

JAVA_TOOL_OPTIONS=-javaagent:target/opentelemetry-javaagent.jar
bash

No Spring Boot OTel starter, no code changes, no Logback XML. The agent instruments at the Netty/reactor-netty level via bytecode manipulation. Every HTTP client in the JVM gets trace propagation automatically.

Keycloak has built-in tracing (KC_TRACING_ENABLED=true). Traefik exports traces natively. The OTel Collector receives everything and forwards to Grafana LGTM (Tempo for traces, Loki for logs, Mimir for metrics).

A full request trace shows the complete chain: Frontend → Traefik → Gateway → Traefik → Keycloak (introspection + exchange) → Gateway → downstream service. Because the Gateway reaches Keycloak through Traefik, Traefik spans appear twice in every trace, once for the inbound request and once for the Gateway's call to Keycloak.

A single request trace in Grafana Tempo — gateway, Traefik, Keycloak introspection, and the downstream service all connected

The trace propagation gap

This is the kind of thing you only find by staring at traces and noticing what's missing.

When I hit a service's mirror endpoint, Grafana Tempo showed the full trace from the Gateway through to the downstream service. But the Keycloak token exchange call was invisible. The authorize exchange span was present (3.54ms), but Keycloak never appeared as a child span. Two separate, disconnected traces instead of one connected chain.

The root cause: Spring Security's TokenExchangeReactiveOAuth2AuthorizedClientProvider creates its own WebClient internally. That WebClient has no OTel instrumentation, so it never sends the traceparent header. Keycloak receives the request, starts its own orphaned trace, and the two never connect.

With the OTel Java agent, this isn't a problem. The agent instruments HTTP calls at the Netty/reactor-netty level, below Spring Security's abstraction. Every WebClient instance gets trace propagation regardless of who created it or whether it was configured with an ObservationRegistry.

What if you use spring-boot-starter-opentelemetry instead?

Spring Boot 4 ships its own OpenTelemetry starter (spring-boot-starter-opentelemetry) as an alternative to the Java agent. Not to be confused with the OpenTelemetry Spring Boot starter maintained by the OTel community. If you go with the Spring Boot starter over the agent, you need to provide an instrumented WebClient to the token exchange provider yourself:

@Bean
TokenExchangeReactiveOAuth2AuthorizedClientProvider
        tokenExchangeReactiveOAuth2AuthorizedClientProvider(
                ObservationRegistry observationRegistry) {
    final var webClient = WebClient.builder()
            .observationRegistry(observationRegistry)
            .build();

    final var responseClient =
            new WebClientReactiveTokenExchangeTokenResponseClient();
    responseClient.setWebClient(webClient);

    final var provider = new TokenExchangeReactiveOAuth2AuthorizedClientProvider();
    provider.setAccessTokenResponseClient(responseClient);
    provider.setSubjectTokenResolver(context -> {
        if (context.getPrincipal() instanceof BearerTokenAuthentication bearer) {
            return Mono.just(bearer.getToken());
        }
        return Mono.empty();
    });
    return provider;
}
java|SecurityConfiguration.java

There's a subtlety here that cost me some investigation: you might think to inject WebClient.Builder through Spring's auto-configuration, which would give you an already-instrumented builder. It doesn't work. Spring Security resolves ReactiveOAuth2AuthorizedClientProvider beans eagerly during context initialization, before WebClientAutoConfiguration runs. The application crashes during startup because the builder bean hasn't been created yet. ObjectProvider<WebClient.Builder> doesn't help either, because the crash happens in Spring Security's constructor, not in your bean method. You have to build the WebClient manually with the ObservationRegistry.

The trade-off is straightforward: the Java agent instruments everything invisibly, the starter makes you find and instrument every HTTP client yourself.


Frontend integration

All three frontends (Angular, Vue, React) use keycloak-js with PKCE and an identical auth configuration:

const keycloak = new Keycloak({
  url: "http://localhost/auth",
  realm: "conference",
  clientId: "public",
})

await keycloak.init({
  onLoad: "check-sso",
  silentCheckSsoRedirectUri: window.location.origin + "/silent-check-sso.html",
  pkceMethod: "S256",
  checkLoginIframe: true,
})
typescript|keycloak-init.ts

check-sso means: if the user already has a Keycloak session, log them in silently via an iframe. No login prompt appears unless the user explicitly navigates to the login page. All three frontends refresh the token every 30 seconds:

setInterval(async () => {
  const refreshed = await keycloak.updateToken(30)
  if (refreshed) await updateTokenState()
}, 30_000)
typescript

updateToken(30) only calls Keycloak if the token expires within 30 seconds. Most of the time it's a no-op.

Lightweight token workaround

When Keycloak issues lightweight access tokens, the JWT payload doesn't include preferred_username, given_name, family_name, or email. The frontends detect this and fall back to the userinfo endpoint:

if (!parsed.preferred_username) {
  const profile = await keycloak.loadUserProfile()
  userInfo = {
    ...parsed,
    preferred_username: profile.username ?? parsed.sub,
    given_name: profile.firstName,
    family_name: profile.lastName,
    email: profile.email,
  }
}
typescript|token-utils.ts

Smaller payloads at the cost of an extra userinfo call when you need profile data.

Framework differences

The three frontends implement the same dashboard with framework-idiomatic patterns. React uses Context API for auth state with a <ProtectedRoute> wrapper and manual fetch(). Vue uses Composition API refs with a router.beforeEach guard and manual fetch(). Angular takes a more structured approach with Signals, a CanActivateFn guard, and an HTTP interceptor that automatically injects the Bearer token.

Angular's authInterceptor is the most interesting of the three because it removes the need for manual header injection:

export const authInterceptor: HttpInterceptorFn = (req, next) => {
  const auth = inject(AuthService)
  const token = auth.accessToken()

  if (token && !req.url.includes("/auth/realms/")) {
    const cloned = req.clone({
      setHeaders: { Authorization: `Bearer ${token}` },
    })
    return next(cloned)
  }

  return next(req)
}
typescript|auth.interceptor.ts

The auth implementations are in React's AuthContext, Vue's useAuth composable, and Angular's AuthService.

Visualizing the exchange

Each frontend's dashboard calls mirror endpoints on both the Gateway and the downstream services. These endpoints echo back the HTTP headers the service received. The frontend decodes the token it sent, decodes the token the downstream service received, and renders a diff table showing which JWT claims changed.

The aud claim now shows the specific service (talk-service or review-service) instead of the generic gateway client. Timestamps change, and the user's roles carry through.

Alice's dashboard after calling the Talk Service mirror endpoint — the diff table shows every claim that changed during token exchange

The lightweight token sent by the frontend is missing personal claims (preferred_username, email, realm_access). They show as "—" in the Sent column. After exchange, Keycloak issues a full JWT with all claims populated. The mirror endpoints that enable this live in the Gateway's DebugController and the Talk Service's DebugController. These endpoints exist purely for the demo. They echo back HTTP headers including the Authorization header. Do not ship anything like this outside a development environment.
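Conceptually the diff table is a claim-by-claim comparison of two decoded payloads, something like this (the helper name is ours, not from the repo):

```typescript
// Sketch: compare two decoded JWT payloads and report which claims changed.
type Claims = Record<string, unknown>;

function diffClaims(
  sent: Claims,
  received: Claims,
): Record<string, { sent: unknown; received: unknown }> {
  const diff: Record<string, { sent: unknown; received: unknown }> = {};
  // Union of keys from both tokens, so claims present on only one side show up.
  for (const key of new Set([...Object.keys(sent), ...Object.keys(received)])) {
    const a = JSON.stringify(sent[key]);
    const b = JSON.stringify(received[key]);
    if (a !== b) diff[key] = { sent: sent[key], received: received[key] };
  }
  return diff;
}
```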


Infrastructure

The entire infrastructure runs locally via Docker Compose and mise for orchestration. Five containers provide the backbone: Traefik on port 80 as the reverse proxy routing /auth to Keycloak, Keycloak itself for OAuth2/OIDC and token exchange backed by PostgreSQL, the OTel Collector receiving telemetry via gRPC and HTTP, and Grafana LGTM at grafana.localhost bundling Tempo, Loki, and Mimir.

The full Docker Compose configuration is in support/docker-compose.yml. Exact image versions are pinned in the .env file.

A single command starts everything:

mise run demo
bash

Five phases: install frontend dependencies, build Maven projects and download the OTel agent, start Docker infrastructure, set up the Keycloak realm, then start all services and frontends. Service logs go to .logs/. Ctrl+C stops everything. The task definitions are in mise.toml.


Integration testing the exchange

The whole point of token exchange is that each downstream service gets a JWT scoped to it and only it. If that breaks silently, you won't know until someone with the wrong role gets through. So the companion repo includes an integration test that runs against the live stack.

The test uses Node 24's native TypeScript support and the built-in node:test runner. No test framework, no transpiler, no extra dependencies.

How it works

The test logs in as both demo users (alice and bob) by temporarily enabling direct access grants on the gateway client via the Keycloak Admin API, grabbing tokens, and immediately disabling the grant again. Then it calls the mirror endpoints through the gateway (/talk-service/debug/mirror, /review-service/debug/mirror). The gateway performs the RFC 8693 exchange as usual, and the mirror endpoint echoes back the HTTP headers it received, including the Authorization header with the exchanged JWT.

The test decodes that JWT and asserts two things: the aud claim matches the target service, and realm_access.roles contains the expected role.

What it covers

Eight test cases cover the important combinations:

  • Alice's exchanged JWT for the talk service has aud=talk-service and the speaker role
  • Alice can access talk-service's permission check (204) but gets rejected by review-service (403)
  • Bob's exchanged JWT for the review service has aud=review-service and the reviewer role
  • Bob can access review-service's permission check (204) but gets rejected by talk-service (403)
  • The JWTs issued for talk-service and review-service are different tokens with different audiences
  • An unauthenticated request gets a 401

That audience-isolation test matters more than it looks. If the gateway accidentally cached one exchanged token and reused it for a different service, this is the assertion that would catch it.

Running the tests

mise run demo              # start the full stack
mise run test:integration  # run the tests (in a second terminal)
bash

Output looks like this:

▶ token exchange flow
  ▶ alice (speaker)
    ✔ exchanged JWT for talk-service has aud=talk-service and role=speaker
    ✔ can access talk-service check-permission (has SPEAKER role)
    ✔ cannot access review-service check-permission (missing REVIEWER role)
  ▶ bob (reviewer)
    ✔ exchanged JWT for review-service has aud=review-service and role=reviewer
    ✔ can access review-service check-permission (has REVIEWER role)
    ✔ cannot access talk-service check-permission (missing SPEAKER role)
  ✔ exchanged tokens for talk-service and review-service are different
  ✔ rejects unauthenticated requests with 401
ℹ tests 8 | pass 8 | fail 0

Production considerations

This demo is designed to make token exchange visible, not to run in production as-is. A few things you'd change, roughly in order of importance.

Introspection on every request adds latency and load on Keycloak. Cache results with a short TTL, or switch to local JWT validation at the gateway level if you don't need immediate revocation.

Client secrets are hardcoded in YAML for demo clarity. Use environment variables or a secrets manager.

CORS is set to * and SSL is disabled on the Keycloak realm. Strictly for local development convenience.

The exchanged tokens carry all realm roles regardless of which service is the target audience. Audience scoping restricts which service accepts the token, but every JWT still contains the full set of roles. True least privilege would also scope down the roles per audience, which Keycloak supports through client scope mappings but the demo doesn't implement.

Token lifetime is another thing to check: Keycloak controls the exchanged token's lifetime independently of the original, so verify it matches your security model rather than assuming it inherits the original token's expiry.

Token refresh is client-side via keycloak-js. The Gateway doesn't deal with refresh tokens. If the exchanged token expires and the cache entry goes stale, the next request triggers a new exchange, but there's no explicit refresh flow at the gateway level.


What comes next

This article covers token exchange with Keycloak. The companion repo is designed to support additional identity providers.

ZITADEL issues opaque access tokens by default and supports RFC 8693 natively, including the ability to exchange an opaque token for a JWT. That would enable a flow where the frontend receives an opaque token (no claims leaked to the browser), and the gateway exchanges it for a JWT that downstream services validate locally without calling back to the authorization server.

The full source is at lukas-grigis/spring-oauth-token-exchange-rfc8693. If something's unclear or broken, open an issue.
