Token exchange with Spring Cloud Gateway
Why token exchange
Most OAuth2 tutorials end at the login screen. The user authenticates, gets an access token, and every service behind the gateway trusts that same token. Same claims, same audience, same scopes. If one downstream service gets compromised, the attacker holds a token that works everywhere. And the authorization server has no record of which service requested access on behalf of which user.
RFC 8693 fixes this. A service receives the user's token, presents it to the authorization server, and gets back a new token scoped to the specific downstream service it needs to call. The user identity is preserved, but the new token can have a different audience, different scopes, different lifetime. The authorization server decides whether the exchange is allowed.
If you care about least privilege in a microservices architecture, this is how identity delegation should work. The CNCF's guide on zero trust with Keycloak covers the reasoning at scale.
This article walks through a complete implementation: Spring Cloud Gateway exchanging tokens via
Keycloak's RFC 8693 endpoint, forwarding scoped JWTs to downstream Spring Boot services, with three
frontend frameworks visualizing the exchange in real time. The companion repository is
spring-oauth-token-exchange-rfc8693
on GitHub, and everything described here runs with a single command.
The RFC in sixty seconds
RFC 8693 adds a new OAuth2 grant type: urn:ietf:params:oauth:grant-type:token-exchange. A client
sends a subject_token (the token to exchange) along with parameters like audience, scope, and
requested_token_type to the authorization server's token endpoint. The server validates the
subject token, applies its policy, and returns a new token scoped to the requested audience, with
the user's identity preserved.
The difference from a client credentials grant is that the user's identity flows through: the downstream service knows who the user is and what roles they have, while the token itself is scoped to that specific service.
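Concretely, the exchange is a single form-encoded POST to the authorization server's token endpoint. Here's a minimal sketch of the request body; the parameter names come from RFC 8693, while the token and audience values are placeholders (and note that some servers, Keycloak's standard exchange included, drive audience selection via scope rather than the audience parameter):

```typescript
// Sketch of an RFC 8693 token exchange request body. The subject token
// and audience values below are illustrative placeholders.
function buildTokenExchangeBody(subjectToken: string, audience: string): URLSearchParams {
  return new URLSearchParams({
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: subjectToken,
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    requested_token_type: "urn:ietf:params:oauth:token-type:access_token",
    audience,
    scope: "openid",
  })
}

// A confidential client would POST this body to the token endpoint with
// its own client authentication (e.g. HTTP Basic), for example:
//   fetch(tokenEndpoint, {
//     method: "POST",
//     headers: { "Content-Type": "application/x-www-form-urlencoded" },
//     body: buildTokenExchangeBody(userToken, "talk-service"),
//   })
```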
I'd recommend Authlete's explainer as a companion to the spec, unless reading RFCs over coffee is your idea of a good morning.
Architecture
The demo models a conference platform. A Spring Cloud Gateway sits between frontend clients and two backend microservices. The frontend authenticates with Keycloak using a public OIDC client with PKCE. The Gateway receives the user's access token, validates it via introspection, exchanges it for a service-specific JWT through Keycloak's RFC 8693 endpoint, and forwards that JWT to the downstream service.
Two test users demonstrate role-based access: Alice has the speaker role and can access the Talk Service, while Bob has the reviewer role and can access the Review Service.
Each frontend includes a diff table comparing the token the browser sent with the token the downstream service received, claim by claim. You can actually see the exchange happen.
The request flow
Here's what happens during a single API call from the frontend to a downstream service. Two separate calls to Keycloak occur at the Gateway, and they behave very differently. Both route through Traefik (http://localhost/auth/... proxied to the Keycloak container), which adds a hop but keeps all traffic flowing through a single entry point, matching a production reverse-proxy setup. The distinction between the two calls matters for performance and is easy to overlook.
Token introspection validates the incoming access token. The Gateway uses opaqueToken(), so Spring
Security calls Keycloak's /token/introspect endpoint on every request rather than decoding the JWT
locally. Each introspection request travels Gateway → Traefik → Keycloak and back. Why not
just validate the JWT locally? Introspection lets the authorization server enforce revocation. A
locally validated JWT is trusted until it expires, even if the session has been terminated. The
trade-off is latency: Spring Security does not cache introspection results, so every request
round-trips to the authorization server. In production with high throughput, you'd cache
introspection results with a short TTL or switch to local JWT validation if immediate revocation
isn't a requirement.
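The short-TTL cache mentioned as the production mitigation can be sketched in a few lines. This is an illustrative shape, not something Spring Security ships out of the box; a real version would also bound the cache size and avoid keeping raw tokens as map keys:

```typescript
// Minimal TTL cache for introspection results, keyed by the opaque token.
// Illustrative sketch only: unbounded, and stores raw tokens as keys.
type IntrospectionResult = { active: boolean; sub?: string; exp?: number }

class IntrospectionCache {
  private entries = new Map<string, { result: IntrospectionResult; expiresAt: number }>()

  constructor(private ttlMillis: number) {}

  // Returns a cached result if fresh, otherwise calls the real
  // introspection function and caches its result for ttlMillis.
  async get(
    token: string,
    introspect: (t: string) => Promise<IntrospectionResult>,
  ): Promise<IntrospectionResult> {
    const hit = this.entries.get(token)
    if (hit && hit.expiresAt > Date.now()) return hit.result
    const result = await introspect(token)
    this.entries.set(token, { result, expiresAt: Date.now() + this.ttlMillis })
    return result
  }
}
```

The TTL bounds the revocation window: a token revoked at the authorization server stays valid locally for at most the TTL, which is the explicit trade-off against per-request round-trips.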
Token exchange swaps the access token for a service-specific JWT via RFC 8693. This call is
cached. Spring Security's InMemoryReactiveOAuth2AuthorizedClientService stores the exchanged JWT
in memory, keyed by clientRegistrationId + principalName. The exchange only fires when:
- The first request for a given user arrives after gateway startup
- The cached token expires
- The Gateway JVM restarts
One thing that surprised me: logging out and back in may not trigger a new exchange if the cache entry is still valid. The cache lives in JVM memory, not tied to the browser session. Restart the Gateway to clear it if you want to see the Keycloak exchange hop in every trace. In production, this caching is correct behavior.
Keycloak setup
Keycloak 26+ simplified token exchange compared to earlier versions, where you had to configure fine-grained authorization policies through the admin console. Now it's a single client attribute. The official Keycloak docs cover both the legacy V1 and the current V2 mechanism.
Clients
The architecture requires two Keycloak clients.
| Client | Represents | Purpose |
|---|---|---|
| public | SPAs | PKCE login, no secret |
| gateway | API Gateway | Confidential, performs token exchange |
The public client is what the frontend uses. Standard OIDC authorization code flow with PKCE (S256), no client secret. Lightweight access tokens are enabled, which reduces token size but has implications for what claims are available (more on that below).
One thing worth calling out: Keycloak does not support truly opaque tokens. Lightweight access
tokens are still JWTs, just with personal information like preferred_username, email, and
given_name stripped from the payload. Anyone with the token can decode the header and see it's a
JWT. The demo uses lightweight tokens as the closest Keycloak gets to opaque, combined with Spring
Security's opaqueToken() configuration that forces the Gateway to validate via introspection
rather than decoding locally. From the Gateway's perspective, the token is an opaque string. It
never looks inside.
For truly opaque tokens (random strings with no decodable payload), an identity provider like ZITADEL is needed. That's planned as a follow-up.
The gateway client is the Gateway's server-side client. Confidential, with a client secret. The setting that enables token exchange:
const gatewayClientConfig = {
  clientId: gw.clientId,
  enabled: true,
  publicClient: false,
  clientAuthenticatorType: "client-secret",
  secret: gw.secret,
  standardFlowEnabled: true,
  attributes: {
    "standard.token.exchange.enabled": "true",
  },
}

One attribute. That's what activates RFC 8693 on the gateway client. No admin console clicking, no fine-grained authorization policies.
The audience mapper
The public client needs an audience mapper that includes the gateway client's client_id in the
aud claim of issued tokens. Without this, Keycloak rejects the exchange because the subject token
wasn't intended for the requesting client.
const mapperConfig = {
  name: mapperName,
  protocol: "openid-connect",
  protocolMapper: "oidc-audience-mapper",
  config: {
    "included.client.audience": gw.clientId,
    "id.token.claim": "false",
    "lightweight.claim": "true",
    "introspection.token.claim": "true",
    "access.token.claim": "true",
  },
}

The lightweight.claim: "true" setting is easy to miss. When lightweight access tokens are enabled,
Keycloak strips most claims by default. If the audience mapper doesn't explicitly opt into
lightweight tokens, the aud claim doesn't make it into the token, and the exchange fails with a
generic error that tells you nothing useful.
Per-service audience scoping via client scopes
Keycloak 26's standard token exchange (standard.token.exchange.enabled) doesn't support the RFC 8693 audience POST parameter. The idiomatic Keycloak approach is scope-based audience control through client scopes.
The setup script creates one client scope per downstream service: api-talk-service and
api-review-service. Each scope contains an audience mapper that sets the aud claim to the
respective service name:
const audMapperConfig = {
  name: audMapperName,
  protocol: "openid-connect",
  protocolMapper: "oidc-audience-mapper",
  config: {
    "included.custom.audience": service,
    "id.token.claim": "false",
    "access.token.claim": "true",
    "lightweight.claim": "true",
    "introspection.token.claim": "true",
  },
}

These scopes are registered as optional scopes on the gateway client. When the Gateway requests
scope: openid api-talk-service during token exchange, Keycloak applies the api-talk-service
scope, and the audience mapper sets aud: talk-service on the exchanged JWT. The review-service
route requests api-review-service instead, producing a JWT with aud: review-service. No
standalone service clients exist in Keycloak. Audience scoping is handled entirely through scopes.
Roles
Two realm roles: speaker and reviewer. Alice is assigned speaker, Bob is assigned reviewer.
After token exchange, these roles appear in the JWT's realm_access.roles claim, where the
downstream services extract them.
Automated setup
The Keycloak configuration is fully automated with a TypeScript
setup script
using @keycloak/keycloak-admin-client. The script is idempotent: it creates resources if they're
missing, updates them to match the configuration, and handles 409 Conflict as "already exists."
No manual clicking in the admin console. The entire realm, both clients, all client scopes, roles, users, and mappers are created from code.
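The idempotency pattern is simple enough to sketch generically. The helper below is illustrative, not code from the actual setup script: treat a 409 from the create call as "already exists" and reconcile via update instead.

```typescript
// Generic "create or update" helper. A 409 Conflict from create() means
// the resource already exists, so fall through to update(). Any other
// error is re-thrown. Illustrative sketch, not the repo's setup script.
interface HttpError { status?: number }

async function ensure(
  create: () => Promise<void>,
  update: () => Promise<void>,
): Promise<"created" | "updated"> {
  try {
    await create()
    return "created"
  } catch (err) {
    if ((err as HttpError).status !== 409) throw err
    await update() // resource exists: reconcile it with the desired config
    return "updated"
  }
}
```

Running the script twice then converges to the same realm state instead of failing on the second pass.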
The Gateway: resource server and OAuth2 client
The Spring Cloud Gateway is both an OAuth2 resource server (validating incoming tokens from frontends) and an OAuth2 client (performing the token exchange with Keycloak). You need both for transparent token exchange to work.
Security filter chain
The
SecurityConfiguration
registers both roles in a single filter chain:
@Bean
SecurityWebFilterChain securityWebFilterChain(
        ServerHttpSecurity http,
        SecurityProperties properties
) {
    http.authorizeExchange(exchanges -> exchanges
            .pathMatchers(properties.unprotectedPaths().toArray(new String[0])).permitAll()
            .anyExchange().authenticated()
    );
    http.oauth2ResourceServer(oauth2 -> oauth2.opaqueToken(Customizer.withDefaults()));
    http.oauth2Client(Customizer.withDefaults());
    http.formLogin(ServerHttpSecurity.FormLoginSpec::disable);
    http.httpBasic(ServerHttpSecurity.HttpBasicSpec::disable);
    return http.build();
}

oauth2ResourceServer with opaqueToken means the Gateway validates incoming tokens by calling
Keycloak's introspection endpoint. It doesn't decode the token locally. oauth2Client enables the
Gateway to act as a client for outbound OAuth2 flows, including token exchange.
Client registration
The token exchange grant type is configured in the Gateway's application.yml:
spring:
  cloud:
    gateway:
      server:
        webflux:
          routes:
            - id: talk-service
              uri: lb://talk-service
              predicates:
                - Path=/talk-service/**
              filters:
                - TokenRelay=talk-service
            - id: review-service
              uri: lb://review-service
              predicates:
                - Path=/review-service/**
              filters:
                - TokenRelay=review-service
  security:
    oauth2:
      resourceserver:
        opaquetoken:
          introspection-uri: "http://localhost/auth/realms/conference/protocol/openid-connect/token/introspect"
          client-id: "gateway"
          client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
      client:
        registration:
          talk-service:
            provider: keycloak
            client-id: gateway
            client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
            authorization-grant-type: urn:ietf:params:oauth:grant-type:token-exchange
            scope:
              - openid
              - profile
              - email
              - api-talk-service
          review-service:
            provider: keycloak
            client-id: gateway
            client-secret: "knZMUYRIU3YC2CGZpyF8HiBdEfKzu1WD" # demo only — use environment variables or a secrets manager in production
            authorization-grant-type: urn:ietf:params:oauth:grant-type:token-exchange
            scope:
              - openid
              - profile
              - email
              - api-review-service
        provider:
          keycloak:
            issuer-uri: http://localhost/auth/realms/conference

Each route gets its own TokenRelay filter pointing to a dedicated client registration. Both
registrations authenticate as the same gateway client but request different scopes:
api-talk-service vs api-review-service. Keycloak uses these scopes to apply the matching
audience mapper, setting the aud claim on the exchanged JWT to the target service name.
The opaquetoken section configures introspection. The authorization-grant-type is the RFC 8693
grant type URN, which tells Spring Security to use
TokenExchangeReactiveOAuth2AuthorizedClientProvider for these registrations.
When a request hits /talk-service/**, the TokenRelay=talk-service filter looks up the
talk-service registration, sees token exchange as the grant type, and requests an exchange with
scope: openid api-talk-service. Keycloak returns a JWT with aud: talk-service. The
review-service route does the same with api-review-service. Each downstream service gets a JWT
scoped to its specific audience.
The subject token resolver
The TokenExchangeReactiveOAuth2AuthorizedClientProvider needs to know which token to exchange:
@Bean
TokenExchangeReactiveOAuth2AuthorizedClientProvider
        tokenExchangeReactiveOAuth2AuthorizedClientProvider() {
    final var provider = new TokenExchangeReactiveOAuth2AuthorizedClientProvider();
    provider.setSubjectTokenResolver(context ->
            Mono.justOrEmpty(context.getPrincipal())
                    .ofType(BearerTokenAuthentication.class)
                    .map(BearerTokenAuthentication::getToken)
    );
    return provider;
}

When a request arrives, Spring Security introspects the access token with Keycloak and wraps the
result in a BearerTokenAuthentication. The resolver extracts the original token from that
authentication object and passes it as the subject_token parameter in the RFC 8693 exchange
request. Audience scoping is handled entirely on the Keycloak side through client scopes, so no
custom parameters converter is needed on the Spring Security side.
The Spring Security 6.3 token exchange blog post walks through the same pattern.
Required dependencies
The Gateway needs a specific set of dependencies for the dual resource server / OAuth2 client role:
<!-- Reactive web (required for Spring Cloud Gateway) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

<!-- Spring Cloud Gateway -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-gateway-server-webflux</artifactId>
</dependency>

<!-- OAuth2 Client (enables TokenRelay filter) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-oauth2-client</artifactId>
</dependency>

<!-- Resource Server (opaque token introspection) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security-oauth2-resource-server</artifactId>
</dependency>

<!-- JWT/JOSE support -->
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-oauth2-jose</artifactId>
</dependency>

Miss any one of these and you get either a startup failure or silent misconfiguration.
spring-boot-starter-oauth2-client is what makes the TokenRelay filter available. Without it, the
filter never registers and routes forward requests with no token exchange.
Downstream services: JWT validation and role extraction
The Talk Service and Review Service are plain Spring Boot MVC applications configured as JWT resource servers. They know nothing about token exchange. They receive a JWT, validate it against Keycloak's JWKS endpoint, extract roles, and enforce access control.
spring:
  security:
    oauth2:
      resourceserver:
        jwt:
          issuer-uri: "http://localhost/auth/realms/conference"
          audiences: "talk-service"

The audiences property tells Spring Security to reject any JWT whose aud claim doesn't include
talk-service. A token exchanged for review-service won't pass validation here. Each downstream
service enforces its own audience, closing the loop on per-service scoping.
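What the audiences property enforces boils down to a membership check on the aud claim, handling RFC 7519's string-or-array encoding. A simplified sketch of just that check (Spring's actual JWT validation does considerably more):

```typescript
// Per RFC 7519 §4.1.3, aud may be a single string or an array of strings.
// Accept the token only if the expected audience is among them.
function audienceAccepted(aud: string | string[] | undefined, expected: string): boolean {
  if (aud === undefined) return false
  const audiences = Array.isArray(aud) ? aud : [aud]
  return audiences.includes(expected)
}
```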
The role extraction problem
Keycloak places realm roles in a nested claim: realm_access.roles. Spring Security's default
JwtAuthenticationConverter doesn't read from this path. Without a custom converter, no roles are
extracted, and every @PreAuthorize("hasRole('...')") check fails silently, even though the JWT
contains the correct roles.
The fix is a custom Converter<Jwt, JwtAuthenticationToken> that reads from both the standard
authorities and the Keycloak-specific realm_access claim:
public class JwtTokenConverter implements Converter<Jwt, JwtAuthenticationToken> {

    private static final String REALM_ACCESS_CLAIM = "realm_access";
    private static final String ROLES = "roles";

    private final JwtGrantedAuthoritiesConverter jwtGrantedAuthoritiesConverter
            = new JwtGrantedAuthoritiesConverter();

    @Override
    public JwtAuthenticationToken convert(Jwt source) {
        final var idpAuthorities = jwtGrantedAuthoritiesConverter.convert(source);
        final var applicationAuthorities = convertApplicationAuthorities(source);
        final var authorities = Stream.of(idpAuthorities, applicationAuthorities)
                .flatMap(Collection::stream)
                .toList();
        return new JwtAuthenticationToken(source, authorities);
    }

    private Collection<GrantedAuthority> convertApplicationAuthorities(Jwt jwt) {
        final var realmAccessClaims =
                (Map<String, Object>) jwt.getClaim(REALM_ACCESS_CLAIM);
        if (realmAccessClaims == null) return Collections.emptyList();
        final var roles = (List<String>) realmAccessClaims.get(ROLES);
        if (roles == null) return Collections.emptyList();
        return roles.stream()
                .map(String::toUpperCase)
                .map("ROLE_"::concat)
                .<GrantedAuthority>map(SimpleGrantedAuthority::new)
                .toList();
    }
}

Keycloak's ["speaker"] becomes Spring Security's ROLE_SPEAKER, which
@PreAuthorize("hasRole('SPEAKER')") picks up. Both downstream services use the same converter.
The converter is wired into the security configuration:
@Bean
SecurityFilterChain securityFilterChain(
        HttpSecurity http,
        SecurityProperties properties,
        JwtTokenConverter jwtTokenConverter
) {
    http.cors(Customizer.withDefaults());
    http.csrf(AbstractHttpConfigurer::disable); // Bearer tokens are not auto-attached by browsers, so CSRF protection is unnecessary
    http.authorizeHttpRequests(authorizeRequests ->
            authorizeRequests
                    .requestMatchers(properties.unprotectedPaths().toArray(new String[0])).permitAll()
                    .anyRequest().authenticated()
    );
    http.oauth2ResourceServer(oauth2 ->
            oauth2.jwt(jwt -> jwt.jwtAuthenticationConverter(jwtTokenConverter))
    );
    return http.build();
}

Observability: tracing the full exchange
The project uses the OpenTelemetry Java agent for zero-code instrumentation. Each Spring Boot service gets the agent attached via JVM options:
JAVA_TOOL_OPTIONS=-javaagent:target/opentelemetry-javaagent.jar

No Spring Boot OTel starter, no code changes, no Logback XML. The agent instruments at the Netty/reactor-netty level via bytecode manipulation. Every HTTP client in the JVM gets trace propagation automatically.
Keycloak has built-in tracing (KC_TRACING_ENABLED=true). Traefik exports traces natively. The OTel
Collector receives everything and forwards to Grafana LGTM (Tempo for traces, Loki for logs, Mimir
for metrics).
A full request trace shows the complete chain: Frontend → Traefik → Gateway → Traefik → Keycloak (introspection + exchange) → Gateway → downstream service. Because the Gateway reaches Keycloak through Traefik, Traefik spans appear twice in every trace, once for the inbound request and once for the Gateway's call to Keycloak.
The trace propagation gap
This is the kind of thing you only find by staring at traces and noticing what's missing.
When hitting a service mirror endpoint, Grafana Tempo showed the full trace from the Gateway through
to the downstream service. But the Keycloak token exchange call was invisible. The
authorize exchange span was present (3.54ms), but Keycloak never appeared as a child span. Two
separate, disconnected traces instead of one connected chain.
The root cause: Spring Security's TokenExchangeReactiveOAuth2AuthorizedClientProvider creates its
own WebClient internally. That WebClient has no OTel instrumentation, so it never sends the
traceparent header. Keycloak receives the request, starts its own orphaned trace, and the two
never connect.
With the OTel Java agent, this isn't a problem. The agent instruments HTTP calls at the
Netty/reactor-netty level, below Spring Security's abstraction. Every WebClient instance gets
trace propagation regardless of who created it or whether it was configured with an
ObservationRegistry.
What if you use spring-boot-starter-opentelemetry instead?
Spring Boot 4 ships its own OpenTelemetry starter (spring-boot-starter-opentelemetry) as an
alternative to the Java agent. Not to be confused with the
OpenTelemetry Spring Boot starter
maintained by the OTel community. If you go with the Spring Boot starter over the agent, you need to
provide an instrumented WebClient to the token exchange provider yourself:
@Bean
TokenExchangeReactiveOAuth2AuthorizedClientProvider
        tokenExchangeReactiveOAuth2AuthorizedClientProvider(
                ObservationRegistry observationRegistry) {
    final var webClient = WebClient.builder()
            .observationRegistry(observationRegistry)
            .build();
    final var responseClient =
            new WebClientReactiveTokenExchangeTokenResponseClient();
    responseClient.setWebClient(webClient);
    final var provider = new TokenExchangeReactiveOAuth2AuthorizedClientProvider();
    provider.setAccessTokenResponseClient(responseClient);
    provider.setSubjectTokenResolver(context -> {
        if (context.getPrincipal() instanceof BearerTokenAuthentication bearer) {
            return Mono.just(bearer.getToken());
        }
        return Mono.empty();
    });
    return provider;
}

There's a subtlety here that cost me some investigation: you might think to inject
WebClient.Builder through Spring's auto-configuration, which would give you an
already-instrumented builder. It doesn't work. Spring Security resolves
ReactiveOAuth2AuthorizedClientProvider beans eagerly during context initialization, before
WebClientAutoConfiguration runs. The application crashes during startup because the builder bean
hasn't been created yet. ObjectProvider<WebClient.Builder> doesn't help either, because the crash
happens in Spring Security's constructor, not in your bean method. You have to build the WebClient
manually with the ObservationRegistry.
The trade-off is straightforward: the Java agent instruments everything invisibly, the starter makes you find and instrument every HTTP client yourself.
Frontend integration
All three frontends (Angular, Vue, React) use keycloak-js with PKCE and an identical auth
configuration:
const keycloak = new Keycloak({
  url: "http://localhost/auth",
  realm: "conference",
  clientId: "public",
})

await keycloak.init({
  onLoad: "check-sso",
  silentCheckSsoRedirectUri: window.location.origin + "/silent-check-sso.html",
  pkceMethod: "S256",
  checkLoginIframe: true,
})

check-sso means: if the user already has a Keycloak session, log them in silently via an iframe.
No login prompt appears unless the user explicitly navigates to the login page. All three frontends
refresh the token every 30 seconds:
setInterval(async () => {
  const refreshed = await keycloak.updateToken(30)
  if (refreshed) await updateTokenState()
}, 30_000)

updateToken(30) only calls Keycloak if the token expires within 30 seconds. Most of the time it's
a no-op.
Lightweight token workaround
When Keycloak issues lightweight access tokens, the JWT payload doesn't include
preferred_username, given_name, family_name, or email. The frontends detect this and fall
back to the userinfo endpoint:
if (!parsed.preferred_username) {
  const profile = await keycloak.loadUserProfile()
  userInfo = {
    ...parsed,
    preferred_username: profile.username ?? parsed.sub,
    given_name: profile.firstName,
    family_name: profile.lastName,
    email: profile.email,
  }
}

Smaller payloads at the cost of an extra userinfo call when you need profile data.
Framework differences
The three frontends implement the same dashboard with framework-idiomatic patterns. React uses
Context API for auth state with a <ProtectedRoute> wrapper and manual fetch(). Vue uses
Composition API refs with a router.beforeEach guard and manual fetch(). Angular takes a more
structured approach with Signals, a CanActivateFn guard, and an HTTP interceptor that
automatically injects the Bearer token.
Angular's authInterceptor is the most interesting of the three because it removes the need for
manual header injection:
export const authInterceptor: HttpInterceptorFn = (req, next) => {
  const auth = inject(AuthService)
  const token = auth.accessToken()
  if (token && !req.url.includes("/auth/realms/")) {
    const cloned = req.clone({
      setHeaders: { Authorization: `Bearer ${token}` },
    })
    return next(cloned)
  }
  return next(req)
}

The auth implementations are in React's AuthContext, Vue's useAuth composable, and Angular's AuthService.
Visualizing the exchange
Each frontend's dashboard calls mirror endpoints on both the Gateway and the downstream services. These endpoints echo back the HTTP headers the service received. The frontend decodes the token it sent, decodes the token the downstream service received, and renders a diff table showing which JWT claims changed.
The aud claim now shows the specific service (talk-service or review-service) instead of the
generic gateway client. Timestamps change, and the user's roles carry through.
The lightweight token sent by the frontend is missing personal claims (preferred_username,
email, realm_access). They show as "—" in the Sent column. After exchange, Keycloak issues a
full JWT with all claims populated. The mirror endpoints that enable this live in the
Gateway's DebugController
and the
Talk Service's DebugController.
These endpoints exist purely for the demo. They echo back HTTP headers including the Authorization
header. Do not ship anything like this outside a development environment.
Infrastructure
The entire infrastructure runs locally via Docker Compose and mise for
orchestration. Five containers provide the backbone: Traefik on port 80 as the reverse proxy routing
/auth to Keycloak, Keycloak itself for OAuth2/OIDC and token exchange backed by PostgreSQL, the
OTel Collector receiving telemetry via gRPC and HTTP, and Grafana LGTM at grafana.localhost
bundling Tempo, Loki, and Mimir.
The full Docker Compose configuration is in
support/docker-compose.yml.
Exact image versions are pinned in the
.env file.
A single command starts everything:
mise run demo

Five phases: install frontend dependencies, build Maven projects and download the OTel agent, start
Docker infrastructure, set up the Keycloak realm, then start all services and frontends. Service
logs go to .logs/. Ctrl+C stops everything. The task definitions are in
mise.toml.
Integration testing the exchange
The whole point of token exchange is that each downstream service gets a JWT scoped to it and only it. If that breaks silently, you won't know until someone with the wrong role gets through. So the companion repo includes an integration test that runs against the live stack.
The test uses Node 24's native TypeScript support and the built-in node:test runner. No test
framework, no transpiler, no extra dependencies.
How it works
The test logs in as both demo users (alice and bob) by temporarily enabling direct access grants on
the gateway client via the Keycloak Admin API, grabbing tokens, and immediately disabling the grant
again. Then it calls the mirror endpoints through the gateway (/talk-service/debug/mirror,
/review-service/debug/mirror). The gateway performs the RFC 8693 exchange as usual, and the mirror
endpoint echoes back the HTTP headers it received, including the Authorization header with the
exchanged JWT.
The test decodes that JWT and asserts two things: the aud claim matches the target service, and
realm_access.roles contains the expected role.
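The decode-and-assert step looks roughly like this (names are illustrative, not the actual test file; signature verification is deliberately skipped because the downstream service already validated the token, and the test only checks scoping):

```typescript
// Decode the payload of the JWT echoed back by a mirror endpoint and
// assert its audience and role. Illustrative sketch of the test's core.
function assertExchangedToken(jwt: string, expectedAud: string, expectedRole: string): void {
  const payload = JSON.parse(Buffer.from(jwt.split(".")[1], "base64url").toString("utf8"))
  const audiences = Array.isArray(payload.aud) ? payload.aud : [payload.aud]
  if (!audiences.includes(expectedAud)) {
    throw new Error(`expected aud ${expectedAud}, got ${payload.aud}`)
  }
  const roles: string[] = payload.realm_access?.roles ?? []
  if (!roles.includes(expectedRole)) {
    throw new Error(`expected role ${expectedRole}, got ${roles}`)
  }
}
```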
What it covers
Eight test cases cover the important combinations:
- Alice's exchanged JWT for the talk service has aud=talk-service and the speaker role
- Alice can access talk-service's permission check (204) but gets rejected by review-service (403)
- Bob's exchanged JWT for the review service has aud=review-service and the reviewer role
- Bob can access review-service's permission check (204) but gets rejected by talk-service (403)
- The JWTs issued for talk-service and review-service are different tokens with different audiences
- An unauthenticated request gets a 401
That last pair matters more than it looks. If the gateway accidentally cached one exchanged token and reused it for a different service, the audience isolation test would catch it.
Running the tests
mise run demo              # start the full stack
mise run test:integration  # run the tests (in a second terminal)

Output looks like this:
▶ token exchange flow
▶ alice (speaker)
✔ exchanged JWT for talk-service has aud=talk-service and role=speaker
✔ can access talk-service check-permission (has SPEAKER role)
✔ cannot access review-service check-permission (missing REVIEWER role)
▶ bob (reviewer)
✔ exchanged JWT for review-service has aud=review-service and role=reviewer
✔ can access review-service check-permission (has REVIEWER role)
✔ cannot access talk-service check-permission (missing SPEAKER role)
✔ exchanged tokens for talk-service and review-service are different
✔ rejects unauthenticated requests with 401
ℹ tests 8 | pass 8 | fail 0
Production considerations
This demo is designed to make token exchange visible, not to run in production as-is. A few things you'd change, roughly in order of importance.
Introspection on every request adds latency and load on Keycloak. Cache results with a short TTL, or switch to local JWT validation at the gateway level if you don't need immediate revocation.
Client secrets are hardcoded in YAML for demo clarity. Use environment variables or a secrets manager.
CORS is set to * and SSL is disabled on the Keycloak realm. Strictly for local development
convenience.
The exchanged tokens carry all realm roles regardless of which service is the target audience. Audience scoping restricts which service accepts the token, but every JWT still contains the full set of roles. True least privilege would also scope down the roles per audience, which Keycloak can do through client scope mappings but the demo doesn't.
One thing not covered here is the lifetime of the exchanged token. Keycloak controls it independently of the original token, so verify it matches your security model rather than assuming it inherits the original token's expiry.
Token refresh is client-side via keycloak-js. The Gateway doesn't deal with refresh tokens. If the
exchanged token expires and the cache entry goes stale, the next request triggers a new exchange,
but there's no explicit refresh flow at the gateway level.
What comes next
This article covers token exchange with Keycloak. The companion repo is designed to support additional identity providers.
ZITADEL issues opaque access tokens by default and supports RFC 8693 natively, including the ability to exchange an opaque token for a JWT. That would enable a flow where the frontend receives an opaque token (no claims leaked to the browser), and the gateway exchanges it for a JWT that downstream services validate locally without calling back to the authorization server.
The full source is at
lukas-grigis/spring-oauth-token-exchange-rfc8693.
If something's unclear or broken, open an issue.