Build a Secure MCP Server with Keycloak, Go, and RFC 8693 Token Exchange (Part 1)
Learn to implement token tiering in MCP servers using RFC 8693 token exchange with Keycloak, separating read and write permissions with audience-bound, short-lived tokens.
Key Takeaways
- Implement token tiering to separate read and write operations on MCP tools
- Use RFC 8693 token exchange to mint narrower, audience-bound credentials
- Design step-up challenge patterns where denied requests trigger token exchange
- Validate token claims at the API layer while the gateway routes upstream
MCP makes tools easy to expose to agents. That is powerful, but it creates a quiet security problem: every tool arrives at your server the same way, so by default every tool runs on the same credential. A tool that reads an order's status and a tool that cancels the order end up indistinguishable to your identity layer.
This guide fixes that with token tiering. We build a small order-management MCP system where reads ride the ordinary login token and writes require a second, narrower token minted through RFC 8693 token exchange. Everything runs locally and nothing is mocked: Keycloak performs the real exchange, agentgateway fronts the tools, and two small Go services do the work. You drive the whole flow by hand with curl so every step of the tiering is visible. Part two of this series layers gateway policies on top of this exact stack, and Claude arrives after that.
What you'll learn
By the end you will have a working lab with two tools and two token tiers:
| MCP tool | Risk tier | Example request | Token requirement |
|---|---|---|---|
| get_order_status | Read | "Check order ord_123" | Standard login token with orders:read |
| cancel_order | Write | "Cancel order ord_123" | Exchanged token with orders:cancel and aud=orders-api |
To start off, we follow a simple rule:
- Reads can use the standard delegated token.
- Writes need a narrower, audience-bound, short-lived token.
Along the way you will run a real RFC 8693 exchange against Keycloak, watch a write get denied and then succeed, and see exactly which claims changed in between.
Why one credential is not enough
The mistake most MCP deployments make is treating authorization as one coarse question:
Can Claude call this MCP server?
The question that actually keeps you safe is finer grained:
Which exact tool is being called, for which user, by which agent, with what scope, against which downstream audience, and for how long?
The gap between those two questions is the same in every domain:
| System | Low-risk read | High-risk write |
|---|---|---|
| E-commerce | get_order_status | cancel_order |
| Kubernetes | list_pods | restart_deployment |
| GitHub | list_pull_requests | merge_pull_request |
| Finance | list_invoices | update_vendor |
The MCP spec itself calls out security expectations for tools, including access control and confirmation for sensitive operations, precisely because tools are model-controlled. RFC 8693 is what lets you meet those expectations with a credential instead of a hope: it defines a Security Token Service pattern for exchanging one token for another, downscoping scope and audience along the way.
Here is the entire behavior you are about to build. Two requests, two outcomes:
User: "What's the status of order ord_123?"
Claude uses the standard token
→ Gateway forwards the request
→ API validates orders:read
→ Success
User: "Cancel order ord_123"
Claude uses the standard token
→ API rejects it (no write scope)
Claude exchanges the token at Keycloak
→ gets a narrow, short-lived orders:cancel token
→ Gateway retries the call
→ API validates scope, audience, and actor
→ Success

The read path is the normal case. The write path is the normal case plus one extra move: a denied attempt, an RFC 8693 token exchange, and a retry.
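The deny, exchange, retry shape can be sketched in a few lines of Go. This is a minimal illustration, not code from the lab: `callFunc`, `exchangeFunc`, `errForbidden`, and `callWithStepUp` are hypothetical names, and the fake tool in `main` stands in for the real gateway and Keycloak calls.

```go
package main

import (
	"errors"
	"fmt"
)

// errForbidden stands in for the API's 403 response in this sketch.
var errForbidden = errors.New("forbidden: missing scope")

// callFunc invokes a tool with a bearer token (hypothetical signature).
type callFunc func(token string) (string, error)

// exchangeFunc trades a subject token for a narrower one (hypothetical signature).
type exchangeFunc func(subjectToken string) (string, error)

// callWithStepUp tries the standard token first and performs one exchange
// plus one retry only when the first attempt is denied.
func callWithStepUp(call callFunc, exchange exchangeFunc, standard string) (string, error) {
	out, err := call(standard)
	if err == nil || !errors.Is(err, errForbidden) {
		return out, err
	}
	narrow, err := exchange(standard)
	if err != nil {
		return "", fmt.Errorf("token exchange failed: %w", err)
	}
	return call(narrow)
}

func main() {
	// Fake write tool: it only accepts the exchanged token.
	cancel := func(token string) (string, error) {
		if token != "exchanged" {
			return "", errForbidden
		}
		return "CANCELLED", nil
	}
	exchange := func(string) (string, error) { return "exchanged", nil }

	out, err := callWithStepUp(cancel, exchange, "standard")
	fmt.Println(out, err) // CANCELLED <nil>
}
```

Reads never reach the exchange branch, so the extra round trip to the identity provider is paid only by calls that were denied.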
Now let's wire it up with real components.
The tools and where they fit
Four components make the picture above work:
| Component | What it is | Its job in this lab |
|---|---|---|
| Keycloak 26.x (Docker) | OIDC identity provider. Use Keycloak 26.2 or later: Standard Token Exchange (V2) became a fully supported feature in 26.2 and is enabled by default | Logs the user in, mints tokens, and performs the RFC 8693 token exchange |
| agentgateway | MCP gateway | The control point in front of the tools: routing, auth, per-tool policy (In the next part, we will use it for the enforcement layer) |
| Go MCP server | Small Go service | Exposes the two tools over MCP and forwards the caller's token downstream |
| Go Order API | Small Go service | Holds the business logic and acts as the real enforcement floor: it validates every token |

Figure 1. Claude has two relationships: a request path through the gateway, and an identity path to Keycloak.
The solid line is the request path: everything Claude does to a tool goes through agentgateway, never directly to the MCP server. The dotted line is the identity path: Claude talks to Keycloak to log in and, when it needs a stronger token, to perform the exchange. The Order API is the last line of defense, and it checks token signatures against Keycloak's public keys rather than trusting anything upstream.
The read request
The user asks a low-risk question, and the standard token they logged in with is enough. This is the baseline every other flow builds on.

Figure 2. The read path: the login token is enough.
The only token in play here is the standard token: the ordinary access token the user received when they logged into Keycloak. It carries orders:read. The API checks the signature and the scope, sees a read request, and answers. No exchange, no extra steps.
The cancel request
Now the user asks for something that changes state. The standard token is deliberately not enough, and that rejection is what triggers the interesting part.

Figure 3. The cancel path: one denial, one exchange, one retry.
A second token appears here, the exchanged token. The first attempt reuses the same standard token from the read path, and the API refuses it because it lacks the cancel scope. That refusal is the whole design working. Claude then exchanges the user's standard token at Keycloak for a new one that is narrower (only orders:cancel), locked to a single audience (orders-api), and short-lived (120 seconds). The audience is the blast radius: aud names the one service allowed to accept this token, so even if it leaks or the agent misroutes it, no other API in your stack will honor it. The retry carries that exchanged token, and this time the API accepts it; once part two adds gateway policy, the gateway has to agree as well.
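As a rough sketch of what that exchange request carries, the form below uses the parameter names defined by RFC 8693. `exchangeForm` is a hypothetical helper, not part of the lab code, and the client credentials (client_id and client_secret for orders-agent) would be sent alongside these fields.

```go
package main

import (
	"fmt"
	"net/url"
)

// exchangeForm builds the RFC 8693 request parameters for trading a
// subject token for one that is limited to a single audience and scope.
// Client authentication is intentionally left out of this sketch.
func exchangeForm(subjectToken, audience, scope string) url.Values {
	form := url.Values{}
	form.Set("grant_type", "urn:ietf:params:oauth:grant-type:token-exchange")
	form.Set("subject_token", subjectToken)
	form.Set("subject_token_type", "urn:ietf:params:oauth:token-type:access_token")
	form.Set("requested_token_type", "urn:ietf:params:oauth:token-type:access_token")
	form.Set("audience", audience) // the one service allowed to accept the result
	form.Set("scope", scope)       // the narrowed permission being requested
	return form
}

func main() {
	form := exchangeForm("<standard-token>", "orders-api", "orders:cancel")
	fmt.Println(form.Encode())
}
```

The narrowing lives entirely in the last two fields: the audience pins where the token is valid, and the scope pins what it can do there.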
Notice the cancel path passes through agentgateway twice: once to get denied, once to succeed. That denied-then-retry shape is not a workaround. It is exactly the step-up challenge pattern that RFC 9470 formalizes for resource servers that need stronger or fresher credentials on certain requests.
Throughout this guide we'll use Streamable HTTP, which is the recommended transport for new MCP deployments. agentgateway exposes it at http://localhost:3000/mcp/http, and both Claude Desktop and Claude Code support connecting to it directly. While agentgateway still supports the older Server-Sent Events (SSE) transport for compatibility, Streamable HTTP is the preferred choice for new projects.
Boot the stack
Prerequisites
Install these locally:
go version # 1.22 or newer for net/http PathValue
docker version # with Docker Compose v2
curl --version
jq --version

Install agentgateway:
curl -sL https://agentgateway.dev/install | bash
agentgateway --version

The installer verifies the checksum and drops the binary into /usr/local/bin, so expect a sudo password prompt.
Project layout
mcp-token-tiering/
├── go.mod
├── docker-compose.yaml
├── keycloak/
│   └── realm-export.json
├── config/
│   └── agentgateway.yaml
└── cmd/
    ├── order-api/
    │   ├── main.go
    │   └── orders.json
    └── orders-mcp/
        └── main.go
Create it or feel free to clone it from GitHub:
mkdir -p mcp-token-tiering/{cmd/order-api,cmd/orders-mcp,config,keycloak}
cd mcp-token-tiering
go mod init example.com/mcp-token-tiering
go get github.com/golang-jwt/jwt/v5
go get github.com/MicahParks/keyfunc/v3
go get github.com/modelcontextprotocol/go-sdk/mcp

The official Go MCP SDK provides the server APIs and streamable HTTP transport we use below.
Keycloak with a pre-built realm
Now let's turn the lab from "trust me" into "run me". We import a realm that contains everything the demo needs: a user, the two scopes, the MCP client, the agent client with token exchange enabled, and the downstream API client that acts as the exchange audience.
Create keycloak/realm-export.json:
{
"realm": "mcp-demo",
"enabled": true,
"accessTokenLifespan": 1800,
"users": [
{
"username": "demo-user",
"enabled": true,
"email": "[email protected]",
"emailVerified": true,
"firstName": "Demo",
"lastName": "User",
"credentials": [
{ "type": "password", "value": "demo-password", "temporary": false }
]
}
],
"clientScopes": [
{
"name": "orders:read",
"protocol": "openid-connect",
"attributes": { "include.in.token.scope": "true" }
},
{
"name": "orders:cancel",
"protocol": "openid-connect",
"attributes": { "include.in.token.scope": "true" },
"protocolMappers": [
{
"name": "orders-api-audience",
"protocol": "openid-connect",
"protocolMapper": "oidc-audience-mapper",
"config": {
"included.client.audience": "orders-api",
"access.token.claim": "true"
}
}
]
}
],
"clients": [
{
"clientId": "user-client",
"name": "Normal CLI User",
"enabled": true,
"publicClient": true,
"standardFlowEnabled": true,
"directAccessGrantsEnabled": true,
"redirectUris": ["http://localhost:3000/*"],
"webOrigins": ["+"],
"defaultClientScopes": ["profile", "email", "orders:read"],
"optionalClientScopes": [],
"protocolMappers": [
{
"name": "orders-agent-audience",
"protocol": "openid-connect",
"protocolMapper": "oidc-audience-mapper",
"config": {
"included.client.audience": "orders-agent",
"access.token.claim": "true",
"id.token.claim": "false"
}
}
]
},
{
"clientId": "orders-agent",
"name": "Agent exchange client",
"enabled": true,
"publicClient": false,
"secret": "orders-agent-secret",
"serviceAccountsEnabled": true,
"standardFlowEnabled": false,
"directAccessGrantsEnabled": false,
"attributes": {
"standard.token.exchange.enabled": "true",
"access.token.lifespan": "120"
},
"defaultClientScopes": [],
"optionalClientScopes": ["orders:cancel"]
},
{
"clientId": "orders-api",
"name": "Orders API audience",
"enabled": true,
"publicClient": false,
"standardFlowEnabled": false,
"directAccessGrantsEnabled": false,
"serviceAccountsEnabled": false
}
]
}

What each piece does:
| Object | Role in the demo |
|---|---|
| demo-user | The human. Logs in and gets a standard token with orders:read. |
| user-client client | The MCP client identity. Public client with direct access grants on, purely to simplify curl demos. |
| orders-agent client | The agent. Confidential client with Standard Token Exchange enabled and a 120 second token lifespan. |
| orders-api client | Exists only so audience=orders-api is a valid exchange target in the realm. |
| orders:read scope | Attached to user-client by default. Marks read permission. |
| orders:cancel scope | Optional scope on orders-agent. Carries an audience mapper that stamps aud=orders-api. |
| orders-agent-audience mapper | On user-client. Adds orders-agent to the standard token's aud so the agent may exchange it. |
Three rules of Keycloak token exchange are baked into that file, and knowing them saves you the three most common failures:
- The exchanging client must be in the subject token's audience. Keycloak V2 rejects an exchange unless the requester (orders-agent) is already listed in the aud of the token being exchanged, with the error access_denied: Client is not within the token audience. The orders-agent-audience mapper on user-client is what satisfies this. Remove it and the exchange fails.
- The audience must be a real client. The audience parameter must reference a client in the same realm, which is why the orders-api client exists at all and why the audience mapper lives inside the orders:cancel scope.
- Requested scopes must belong to the exchanging client. The exchange can only grant scopes attached to orders-agent as default or optional client scopes. orders:cancel is optional there, so the exchange request must ask for it explicitly.
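The three rules can be modeled as a local pre-flight check. This is only a sketch of the rules as stated above, not Keycloak's implementation: `exchangeRequest` and `preflight` are hypothetical names, and the client and scope lists are passed in by hand.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
)

// exchangeRequest captures the inputs the three rules look at (hypothetical type).
type exchangeRequest struct {
	Requester        string   // client performing the exchange
	SubjectAudiences []string // aud of the token being exchanged
	Audience         string   // requested audience parameter
	Scope            string   // requested scope
}

// preflight applies the three rules in order and reports the first violation.
func preflight(req exchangeRequest, realmClients, requesterScopes []string) error {
	if !slices.Contains(req.SubjectAudiences, req.Requester) {
		return errors.New("requester is not in the subject token audience")
	}
	if !slices.Contains(realmClients, req.Audience) {
		return errors.New("audience is not a client in this realm")
	}
	if !slices.Contains(requesterScopes, req.Scope) {
		return errors.New("scope is not attached to the requesting client")
	}
	return nil
}

func main() {
	clients := []string{"user-client", "orders-agent", "orders-api"}
	agentScopes := []string{"orders:cancel"}

	ok := exchangeRequest{
		Requester:        "orders-agent",
		SubjectAudiences: []string{"orders-agent"},
		Audience:         "orders-api",
		Scope:            "orders:cancel",
	}
	fmt.Println(preflight(ok, clients, agentScopes)) // <nil>

	// Same request, but the audience mapper was removed from user-client.
	noMapper := ok
	noMapper.SubjectAudiences = nil
	fmt.Println(preflight(noMapper, clients, agentScopes))
}
```

Running the same checks locally before calling the token endpoint is one way to tell which of the three rules a failing realm configuration violates.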
Two V2 constraints to note for real deployments: sender-constrained tokens (DPoP-bound or certificate-bound) cannot be used as the subject_token, and if the requesting client has consent required, the exchange only succeeds when the user already consented to the requested scopes.
Now create docker-compose.yaml:
services:
  keycloak:
    image: quay.io/keycloak/keycloak:26.4
    command: ["start-dev", "--import-realm"]
    environment:
      KC_BOOTSTRAP_ADMIN_USERNAME: admin
      KC_BOOTSTRAP_ADMIN_PASSWORD: admin
    ports:
      - "8080:8080"
    volumes:
      - ./keycloak/realm-export.json:/opt/keycloak/data/import/realm-export.json:ro

Start it and verify the exchange grant is live:
docker compose up -d
curl -s http://localhost:8080/realms/mcp-demo/.well-known/openid-configuration | jq '.grant_types_supported'
You should see urn:ietf:params:oauth:grant-type:token-exchange in the list. One gotcha: --import-realm only imports on a container's first startup. If you edit realm-export.json later, a plain restart will not pick up the change. Recreate the container instead:
docker compose down && docker compose up -d

Key endpoints for the rest of the lab:
- Issuer: http://localhost:8080/realms/mcp-demo
- Token: http://localhost:8080/realms/mcp-demo/protocol/openid-connect/token
- JWKS: http://localhost:8080/realms/mcp-demo/protocol/openid-connect/certs
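The same information can be read programmatically from the discovery document. The sketch below parses a hand-written sample shaped like an OIDC discovery response (it is not captured Keycloak output) and checks for the token exchange grant; `discovery` and `parseDiscovery` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"slices"
)

const tokenExchangeGrant = "urn:ietf:params:oauth:grant-type:token-exchange"

// discovery holds the standard OIDC discovery fields this lab relies on.
type discovery struct {
	Issuer              string   `json:"issuer"`
	TokenEndpoint       string   `json:"token_endpoint"`
	JWKSURI             string   `json:"jwks_uri"`
	GrantTypesSupported []string `json:"grant_types_supported"`
}

// parseDiscovery decodes the document and reports whether the
// RFC 8693 token exchange grant is advertised.
func parseDiscovery(raw []byte) (discovery, bool, error) {
	var d discovery
	if err := json.Unmarshal(raw, &d); err != nil {
		return discovery{}, false, err
	}
	return d, slices.Contains(d.GrantTypesSupported, tokenExchangeGrant), nil
}

func main() {
	// Trimmed, hand-written sample using the lab's endpoints.
	sample := []byte(`{
	  "issuer": "http://localhost:8080/realms/mcp-demo",
	  "token_endpoint": "http://localhost:8080/realms/mcp-demo/protocol/openid-connect/token",
	  "jwks_uri": "http://localhost:8080/realms/mcp-demo/protocol/openid-connect/certs",
	  "grant_types_supported": ["authorization_code", "password", "urn:ietf:params:oauth:grant-type:token-exchange"]
	}`)

	d, supported, err := parseDiscovery(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("token endpoint:", d.TokenEndpoint)
	fmt.Println("JWKS:", d.JWKSURI)
	fmt.Println("token exchange advertised:", supported)
}
```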
agentgateway in front of the tools
Create config/agentgateway.yaml:
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
mcp:
  port: 3000
  policies:
    cors:
      allowOrigins:
        - "*"
      allowHeaders:
        - "*"
      exposeHeaders:
        - "Mcp-Session-Id"
  targets:
    - name: orders
      mcp:
        host: http://localhost:9000/mcp

This is the same shape the agentgateway streamable HTTP docs show: an mcp listener on port 3000, CORS exposure for Mcp-Session-Id, and a target pointing at our MCP server. The gateway manages MCP session state and routes follow-up tool calls to the same backend.
One gotcha before you start it: running bare agentgateway with no flags loads a default file from ~/.config/agentgateway/ instead of your project config, so always pass -f config/agentgateway.yaml.
agentgateway -f config/agentgateway.yaml

A healthy start looks like this (config dump trimmed):
$ agentgateway -f config/agentgateway.yaml
2026-07-04T18:56:19.403339Z info agentgateway_app::commands::run version: { "version": "1.3.1", ... }
2026-07-04T18:56:19.405945Z info agentgateway_app::commands::run running with config: ipv6Enabled: true
...
2026-07-04T18:56:19.424063Z info state_manager loaded config from File("config/agentgateway.yaml")
2026-07-04T18:56:19.425694Z info app serving UI at http://localhost:15000/ui
As written, this config routes MCP traffic but does not authenticate it. That is deliberate: the Order API downstream is our enforcement floor, so you can drive the whole lab by hand with curl. Turning the gateway itself into a policy-enforcing front door, validating JWTs and applying per-tool rules, is exactly what part two of this series does.
The code
The order data
Create cmd/order-api/orders.json:
[
{
"id": "ord_123",
"owner": "demo-user",
"status": "PROCESSING"
},
{
"id": "ord_456",
"owner": "demo-user",
"status": "PROCESSING"
}
]

The API loads this file once at startup into an in-memory store. Both orders start as PROCESSING, and the demo moves ord_123 from PROCESSING to CANCELLED: first watching the standard token fail to do it, then watching the exchanged token succeed. Reads are more permissive: any valid token carrying orders:read or orders:cancel can fetch an order.
The Order API
The Order API is where policy has to hold even if everything upstream fails. It validates every token against Keycloak's JWKS and checks issuer, scope, audience, and actor identity. Create cmd/order-api/main.go:
package main
import (
"context"
"encoding/json"
"errors"
"log"
"net/http"
"os"
"strings"
"sync"
"github.com/MicahParks/keyfunc/v3"
"github.com/golang-jwt/jwt/v5"
)
type claims map[string]any
type order struct {
ID string `json:"id"`
Owner string `json:"owner"`
Status string `json:"status"`
}
type orderStore struct {
mu sync.Mutex
orders map[string]*order
}
func loadOrders(path string) (*orderStore, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, err
}
var list []*order
if err := json.Unmarshal(data, &list); err != nil {
return nil, err
}
store := &orderStore{orders: make(map[string]*order, len(list))}
for _, o := range list {
store.orders[o.ID] = o
}
return store, nil
}
func (s *orderStore) get(id string) (*order, bool) {
s.mu.Lock()
defer s.mu.Unlock()
o, ok := s.orders[id]
return o, ok
}
func (s *orderStore) cancel(id string) (*order, bool) {
s.mu.Lock()
defer s.mu.Unlock()
o, ok := s.orders[id]
if !ok {
return nil, false
}
o.Status = "CANCELLED"
return o, true
}
var (
store *orderStore
jwks keyfunc.Keyfunc
issuer string
)
func envOr(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
}
return fallback
}
func parseBearer(r *http.Request) (claims, error) {
auth := r.Header.Get("Authorization")
if !strings.HasPrefix(auth, "Bearer ") {
return nil, errors.New("missing bearer token")
}
raw := strings.TrimPrefix(auth, "Bearer ")
token, err := jwt.Parse(raw, jwks.Keyfunc,
jwt.WithIssuer(issuer),
jwt.WithExpirationRequired(),
jwt.WithValidMethods([]string{"RS256", "ES256"}),
)
if err != nil || !token.Valid {
return nil, errors.New("invalid token")
}
mc, ok := token.Claims.(jwt.MapClaims)
if !ok {
return nil, errors.New("invalid claims")
}
out := claims{}
for k, v := range mc {
out[k] = v
}
return out, nil
}
func claimString(c claims, name string) string {
v, _ := c[name].(string)
return v
}
func hasScope(c claims, wanted string) bool {
for _, scope := range strings.Fields(claimString(c, "scope")) {
if scope == wanted {
return true
}
}
return false
}
func audienceOK(c claims, wanted string) bool {
switch aud := c["aud"].(type) {
case string:
return aud == wanted
case []any:
for _, item := range aud {
if s, ok := item.(string); ok && s == wanted {
return true
}
}
}
return false
}
// actorIdentity returns the agent identity from the token.
// Preferred: the RFC 8693 act claim, from an STS that supports delegation.
// Fallback: the azp claim, which Keycloak V2 sets to the client that performed the exchange.
func actorIdentity(c claims) string {
if act, ok := c["act"].(map[string]any); ok {
if sub, _ := act["sub"].(string); sub != "" {
return sub
}
}
if azp := claimString(c, "azp"); azp == "orders-agent" {
return "client:" + azp
}
return ""
}
func writeJSON(w http.ResponseWriter, status int, body any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(body)
}
func getOrderStatus(w http.ResponseWriter, r *http.Request) {
c, err := parseBearer(r)
if err != nil {
log.Printf("read order=%s rejected: %v", r.PathValue("id"), err)
writeJSON(w, http.StatusUnauthorized, map[string]any{"error": err.Error()})
return
}
if !hasScope(c, "orders:read") && !hasScope(c, "orders:cancel") {
log.Printf("read order=%s sub=%v denied: missing read scope", r.PathValue("id"), c["sub"])
writeJSON(w, http.StatusForbidden, map[string]any{"error": "orders:read scope required"})
return
}
o, ok := store.get(r.PathValue("id"))
if !ok {
writeJSON(w, http.StatusNotFound, map[string]any{"error": "order not found"})
return
}
log.Printf("read order=%s sub=%v scope=%q allowed", o.ID, c["sub"], c["scope"])
writeJSON(w, http.StatusOK, map[string]any{
"orderId": o.ID,
"status": o.Status,
"subject": c["sub"],
"scope": c["scope"],
})
}
func cancelOrder(w http.ResponseWriter, r *http.Request) {
c, err := parseBearer(r)
if err != nil {
log.Printf("cancel order=%s rejected: %v", r.PathValue("id"), err)
writeJSON(w, http.StatusUnauthorized, map[string]any{"error": err.Error()})
return
}
actor := actorIdentity(c)
switch {
case !hasScope(c, "orders:cancel"):
log.Printf("cancel order=%s sub=%v denied: missing cancel scope", r.PathValue("id"), c["sub"])
writeJSON(w, http.StatusForbidden, map[string]any{"error": "orders:cancel scope required"})
case !audienceOK(c, "orders-api"):
log.Printf("cancel order=%s sub=%v denied: wrong audience", r.PathValue("id"), c["sub"])
writeJSON(w, http.StatusForbidden, map[string]any{"error": "orders-api audience required"})
case actor == "":
log.Printf("cancel order=%s sub=%v denied: no actor identity", r.PathValue("id"), c["sub"])
writeJSON(w, http.StatusForbidden, map[string]any{"error": "actor identity required"})
default:
o, ok := store.get(r.PathValue("id"))
if !ok {
writeJSON(w, http.StatusNotFound, map[string]any{"error": "order not found"})
return
}
if o.Status == "CANCELLED" {
log.Printf("cancel order=%s sub=%v actor=%s conflict: already cancelled", o.ID, c["sub"], actor)
writeJSON(w, http.StatusConflict, map[string]any{"error": "order already cancelled"})
return
}
updated, _ := store.cancel(r.PathValue("id"))
log.Printf("cancel order=%s sub=%v actor=%s allowed", updated.ID, c["sub"], actor)
writeJSON(w, http.StatusOK, map[string]any{
"orderId": updated.ID,
"status": updated.Status,
"subject": c["sub"],
"actor": actor,
})
}
}
func main() {
issuer = envOr("OIDC_ISSUER", "http://localhost:8080/realms/mcp-demo")
jwksURL := envOr("JWKS_URL", issuer+"/protocol/openid-connect/certs")
ordersPath := envOr("ORDERS_FILE", "orders.json")
var err error
store, err = loadOrders(ordersPath)
if err != nil {
log.Fatalf("failed to load orders from %s: %v", ordersPath, err)
}
jwks, err = keyfunc.NewDefaultCtx(context.Background(), []string{jwksURL})
if err != nil {
log.Fatalf("failed to load JWKS from %s: %v", jwksURL, err)
}
mux := http.NewServeMux()
mux.HandleFunc("GET /orders/{id}", getOrderStatus)
mux.HandleFunc("POST /orders/{id}/cancel", cancelOrder)
log.Println("Order API listening on http://localhost:9100")
log.Fatal(http.ListenAndServe("127.0.0.1:9100", mux))
}

Reads are simple. A token with either orders:read or orders:cancel can fetch an order. Writes are stricter. To cancel an order, the token must include the orders:cancel scope, target the orders-api audience, and identify the acting client. Only after all three checks pass does the API update the order. If the order has already been cancelled, the API returns a conflict instead.
Every request is also logged, so you can see exactly who attempted the operation and whether it was allowed or denied. The actorIdentity helper first looks for the RFC 8693 act claim and falls back to azp when it is unavailable. This lets the same code work with both full delegation and identity providers that do not yet issue act claims.
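To make the fallback concrete, here is the same actorIdentity logic run against three hand-made claim sets. The claim values are illustrative assumptions (in particular the act.sub value "agent-123"), not tokens captured from Keycloak.

```go
package main

import "fmt"

type claims map[string]any

// actorIdentity mirrors the helper in the Order API: prefer the RFC 8693
// act claim, then fall back to azp when it names the agent client.
func actorIdentity(c claims) string {
	if act, ok := c["act"].(map[string]any); ok {
		if sub, _ := act["sub"].(string); sub != "" {
			return sub
		}
	}
	if azp, _ := c["azp"].(string); azp == "orders-agent" {
		return "client:" + azp
	}
	return ""
}

func main() {
	// Hypothetical token from an STS that issues act claims.
	delegated := claims{"act": map[string]any{"sub": "agent-123"}, "azp": "orders-agent"}
	// Exchanged token without act: azp identifies the exchanging client.
	exchanged := claims{"azp": "orders-agent"}
	// Standard login token: issued to user-client, so no actor is found.
	standard := claims{"azp": "user-client"}

	fmt.Printf("%q\n", actorIdentity(delegated)) // "agent-123"
	fmt.Printf("%q\n", actorIdentity(exchanged)) // "client:orders-agent"
	fmt.Printf("%q\n", actorIdentity(standard))  // ""
}
```

The empty result for the standard token is what makes the third check in cancelOrder fail even if a login token somehow carried the right scope and audience.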
The MCP server
Now we create cmd/orders-mcp/main.go, which is our MCP server. It uses the official Go MCP SDK, exposes the two tools, and forwards any received bearer token to the Order API. The gateway and the exchange layer decide which credential the downstream API sees.
package main
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"os"
"github.com/modelcontextprotocol/go-sdk/mcp"
)
type OrderInput struct {
OrderID string `json:"orderId" jsonschema:"order ID, for example ord_123"`
}
type ToolOutput struct {
Status int `json:"status"`
Body map[string]any `json:"body"`
}
func callJSON(ctx context.Context, method, url, bearer string) (ToolOutput, error) {
req, err := http.NewRequestWithContext(ctx, method, url, bytes.NewReader([]byte("{}")))
if err != nil {
return ToolOutput{}, err
}
req.Header.Set("Authorization", bearer)
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
return ToolOutput{}, err
}
defer resp.Body.Close()
raw, _ := io.ReadAll(resp.Body)
log.Printf("forward %s %s -> %d", method, url, resp.StatusCode)
var parsed map[string]any
if err := json.Unmarshal(raw, &parsed); err != nil {
parsed = map[string]any{"raw": string(raw)}
}
out := ToolOutput{Status: resp.StatusCode, Body: parsed}
if resp.StatusCode >= 400 {
return out, fmt.Errorf("downstream API rejected call: %s", string(raw))
}
return out, nil
}
func newOrdersMCPServer(orderAPI string, bearer string) *mcp.Server {
server := mcp.NewServer(&mcp.Implementation{
Name: "orders-mcp",
Version: "v0.1.0",
}, nil)
mcp.AddTool(
server,
&mcp.Tool{
Name: "get_order_status",
Description: "Read order status. Low-risk read tool. Requires orders:read.",
},
func(ctx context.Context, req *mcp.CallToolRequest, input OrderInput) (*mcp.CallToolResult, ToolOutput, error) {
out, err := callJSON(ctx, http.MethodGet, orderAPI+"/orders/"+input.OrderID, bearer)
return nil, out, err
},
)
mcp.AddTool(
server,
&mcp.Tool{
Name: "cancel_order",
Description: "Cancel an order. Write tool. Requires an exchanged token with orders:cancel.",
},
func(ctx context.Context, req *mcp.CallToolRequest, input OrderInput) (*mcp.CallToolResult, ToolOutput, error) {
out, err := callJSON(ctx, http.MethodPost, orderAPI+"/orders/"+input.OrderID+"/cancel", bearer)
return nil, out, err
},
)
return server
}
func main() {
orderAPI := os.Getenv("ORDER_API")
if orderAPI == "" {
orderAPI = "http://localhost:9100"
}
handler := mcp.NewStreamableHTTPHandler(
func(r *http.Request) *mcp.Server {
return newOrdersMCPServer(orderAPI, r.Header.Get("Authorization"))
},
&mcp.StreamableHTTPOptions{
JSONResponse: true,
Stateless: true,
},
)
mux := http.NewServeMux()
mux.Handle("/mcp", handler)
log.Println("Orders MCP server listening on http://localhost:9000/mcp")
log.Fatal(http.ListenAndServe("127.0.0.1:9000", mux))
}

The MCP server simply forwards the bearer token it receives to the Order API and logs the response, making it easy to see each request flow through the system. In this lab, the Order API is responsible for validating tokens and enforcing authorization. In production, you would usually validate tokens earlier, either in the gateway (which part two focuses on) or in the MCP server itself.
Test it with API calls
Start the two Go services, each in its own terminal (the gateway is already up from setup):
export ORDERS_FILE="cmd/order-api/orders.json"
go run ./cmd/order-api
2026/07/05 00:34:53 Order API listening on http://localhost:9100
In another terminal:
export ORDER_API="http://localhost:9100"
go run ./cmd/orders-mcp
2026/07/05 00:34:59 Orders MCP server listening on http://localhost:9000/mcp
With all three services up, the Order API is on :9100, the MCP server on :9000, and the gateway fronts them on :3000.
Then set the token endpoint and a small helper for the rest of this section. JWT payloads are base64url encoded, which plain base64 -d decodes incorrectly, so this helper fixes the alphabet and pads the payload first:
export KC_TOKEN="http://localhost:8080/realms/mcp-demo/protocol/openid-connect/token"
decode_jwt() {
local payload
payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
while [ $((${#payload} % 4)) -ne 0 ]; do payload="${payload}="; done
printf '%s' "$payload" | base64 -d
}

Both lines are silent: export sets a variable and the block defines a function, so a clean prompt with no output is the expected result, not an error. decode_jwt prints something only when you call it with a token, which happens in the next steps. To confirm it loaded, run type decode_jwt and you should see the function body echoed back.
Mint the standard token
Log in as the user via the user-client client.
This uses the OAuth password grant, which is the simplest way to get a real token you can paste into curl; a production setup would use the interactive browser login instead, which arrives later in the series when Claude becomes the client.
RESP=$(
curl -s -X POST "$KC_TOKEN" \
-d "grant_type=password" \
-d "client_id=user-client" \
-d "username=demo-user" \
-d "password=demo-password"
)
echo "$RESP" | jq .
export STANDARD_TOKEN=$(echo "$RESP" | jq -r .access_token)

A successful response looks like this:
{
"access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6IlZ3UlQyVjBGQzBVRUUzVT1zVEFpVQ==...",
"expires_in": 1800,
"refresh_expires_in": 1800,
"refresh_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6IlZ3UlQyVjBGQzBVRUUzVT1zVEFpVQ==...",
"token_type": "Bearer",
"session_state": "ab443036-a0b6-ff5e-8485-a81f3c05375a",
"scope": "orders:read"
}
The fields that matter are scope: orders:read and token_type: Bearer; the long access_token is what lands in STANDARD_TOKEN. The refresh_token and session_state are standard OAuth response fields you can ignore for this lab.
If the response is an error rather than a token, three causes cover almost every case:
| Response | Cause | Fix |
|---|---|---|
| unauthorized_client or invalid_client | Direct access grants are off on user-client | The export sets directAccessGrantsEnabled: true; re-import the realm |
| invalid_grant | The realm did not import, so demo-user is missing | Recreate the container; import only runs on first startup |
| Connection refused or an HTML error page | Keycloak still starting, or the wrong port | Wait, then check docker compose logs keycloak |
Now inspect the token:
decode_jwt "$STANDARD_TOKEN" | jq '{sub, azp, scope, aud}'

Output:
{
"sub": null,
"azp": "user-client",
"scope": "orders:read",
"aud": "orders-agent"
}

Confirm two things: scope contains orders:read, and aud includes orders-agent. That audience entry comes from the mapper in the realm and is what permits the exchange later. If it is missing, your realm predates the mapper: recreate the container and mint a fresh token. A null sub in this filtered view is harmless. The most likely cause is that, since Keycloak 25, the sub claim is added by the built-in basic client scope, which user-client's defaultClientScopes in this export does not list. None of the tiering checks depend on reading it here.
The read works
Initialize an MCP session through agentgateway and capture the session ID it returns:
RESP=$(
curl -s -i -X POST "http://localhost:3000/mcp/http" \
-H "authorization: Bearer ${STANDARD_TOKEN}" \
-H "content-type: application/json" \
-H "accept: application/json, text/event-stream" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2025-11-25",
"capabilities": {},
"clientInfo": { "name": "curl", "version": "0.1.0" }
}
}'
)
echo "$RESP"
export SESSION=$(echo "$RESP" | grep -i '^mcp-session-id:' | tr -d '\r' | awk '{print $2}')

The gateway responds over a server-sent event stream, and the session ID in the headers is what ties your follow-up calls to the same backend.
The serverInfo block confirms that the gateway successfully connected to the orders-mcp server. The response also includes an mcp-session-id, which identifies your MCP session. We will use the session-id value in every request that follows.
Now call the read tool using that session ID:
curl -s -X POST "http://localhost:3000/mcp/http" \
-H "authorization: Bearer ${STANDARD_TOKEN}" \
-H "mcp-session-id: ${SESSION}" \
-H "content-type: application/json" \
-H "accept: application/json, text/event-stream" \
-d '{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "get_order_status",
"arguments": { "orderId": "ord_123" }
}
}' | grep '^data:' | sed 's/^data: //' | jq

The tool returns both a human-readable content block and a structuredContent block with the parsed result:
{
"jsonrpc": "2.0",
"id": 2,
"result": {
"content": [
{
"type": "text",
"text": "{\"body\":{\"orderId\":\"ord_123\",\"scope\":\"orders:read\",\"status\":\"PROCESSING\",\"subject\":null},\"status\":200}"
}
],
"structuredContent": {
"body": {
"orderId": "ord_123",
"scope": "orders:read",
"status": "PROCESSING",
"subject": null
},
"status": 200
}
}
}

Over in the two service terminals, the same call leaves a trail. The MCP server logs the forward and the Order API logs the decision:
# orders-mcp terminal
2026/07/05 00:41:12 forward GET http://localhost:9100/orders/ord_123 -> 200
# order-api terminal
2026/07/05 00:41:12 read order=ord_123 sub=<nil> scope="orders:read" allowed
The result shows the status of ord_123: still PROCESSING. Standard token, orders:read, done. This is the tier most tools should live in.
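The grep and sed pipeline used in the curl call exists only to peel the JSON out of the event stream. A Go client would do the same job roughly as sketched here; `sseData` is a hypothetical helper, and it is deliberately simplified: it treats each data: line as a separate payload instead of joining multi-line events the way a full SSE parser would.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sseData returns the payload of every "data:" line in an SSE body.
// Simplification: multi-line events are not joined, and lines longer than
// bufio.Scanner's default 64 KiB limit are not handled.
func sseData(body string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		if rest, ok := strings.CutPrefix(sc.Text(), "data:"); ok {
			// A single leading space after the colon is part of the framing.
			out = append(out, strings.TrimPrefix(rest, " "))
		}
	}
	return out
}

func main() {
	// Hand-written sample stream, not captured gateway output.
	body := "event: message\ndata: {\"jsonrpc\":\"2.0\",\"id\":2}\n\n"
	for _, payload := range sseData(body) {
		fmt.Println(payload) // {"jsonrpc":"2.0","id":2}
	}
}
```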
The write is denied
Now try the write with the same token:
curl -s -X POST "http://localhost:3000/mcp/http" \
-H "authorization: Bearer ${STANDARD_TOKEN}" \
-H "mcp-session-id: ${SESSION}" \
-H "content-type: application/json" \
-H "accept: application/json, text/event-stream" \
-d '{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "cancel_order",
"arguments": { "orderId": "ord_123" }
}
}' | grep '^data:' | sed 's/^data: //' | jq
Inside the result you will find an error envelope, with isError set and the downstream refusal passed through as text:
{
"jsonrpc": "2.0",
"id": 3,
"result": {
"content": [
{
"type": "text",
"text": "downstream API rejected call: {\"error\":\"orders:cancel scope required\"}\n"
}
],
"isError": true
}
}
And the service terminals record the denial:
# orders-mcp terminal
2026/07/05 00:41:29 forward POST http://localhost:9100/orders/ord_123/cancel -> 403
# order-api terminal
2026/07/05 00:41:29 cancel order=ord_123 sub=<nil> denied: missing cancel scope
This is the denial from Figure 3, produced by hand. The login token can read, and only read.
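A client can treat that error envelope as its step-up signal. The sketch below shows one possible way to detect it in Go, assuming the denial text stays exactly as shown above. The names toolResponse and needsStepUp are illustrative, and matching on error text is brittle compared to a structured error code, so read this as a starting point rather than the lab's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// toolResponse models only the fields of the tools/call reply shown above.
type toolResponse struct {
	Result struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
		IsError bool `json:"isError"`
	} `json:"result"`
}

// needsStepUp reports whether a reply is a denial that names the missing
// scope, which is the cue to run the token exchange and retry the call.
func needsStepUp(raw []byte, scope string) (bool, error) {
	var resp toolResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return false, err
	}
	if !resp.Result.IsError {
		return false, nil
	}
	for _, c := range resp.Result.Content {
		// Assumption: the API keeps the "<scope> scope required" wording.
		if c.Type == "text" && strings.Contains(c.Text, scope+" scope required") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	denied := []byte(`{"result":{"content":[{"type":"text","text":"downstream API rejected call: {\"error\":\"orders:cancel scope required\"}\n"}],"isError":true}}`)
	stepUp, err := needsStepUp(denied, "orders:cancel")
	fmt.Println(stepUp, err)
}
```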
Exchange the token
Now the agent client trades the user's token for a narrower one. Before attempting the exchange, inspect the token's audience. Keycloak only lets a client exchange a token if that client already appears in the token's aud claim. In this lab the audience mapper adds orders-agent to the standard token, which is what authorizes the orders-agent client to perform the RFC 8693 exchange. If orders-agent is missing here, the exchange fails with Client is not within the token audience, so this one-line check prevents the single most common failure:
decode_jwt "$STANDARD_TOKEN" | jq '.aud'
That list must include orders-agent. When it does, run the RFC 8693 exchange:
RESP=$(
curl -s -X POST "$KC_TOKEN" \
-d "grant_type=urn:ietf:params:oauth:grant-type:token-exchange" \
-d "client_id=orders-agent" \
-d "client_secret=orders-agent-secret" \
-d "subject_token=${STANDARD_TOKEN}" \
-d "subject_token_type=urn:ietf:params:oauth:token-type:access_token" \
-d "requested_token_type=urn:ietf:params:oauth:token-type:access_token" \
-d "audience=orders-api" \
-d "scope=orders:cancel"
)
echo "$RESP" | jq .
export WRITE_TOKEN=$(echo "$RESP" | jq -r .access_token)
A successful exchange returns the new token. Notice expires_in: 120 and scope: orders:cancel, and that issued_token_type confirms this came from the token-exchange grant:
{
"access_token": "eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJXd2tldUZPTzk4Y0taWG1sWFY4eXJHZk5Cc0x5blRpdFhjdTNZTTdtNXdjIn0...",
"expires_in": 120,
"refresh_expires_in": 0,
"token_type": "Bearer",
"not-before-policy": 0,
"session_state": "ab443036-a0b6-ff5e-8485-a81f3c05375a",
"scope": "orders:cancel",
"issued_token_type": "urn:ietf:params:oauth:token-type:access_token"
}
If the exchange fails, the response names the reason. Common failures and fixes:
| Response | Cause | Fix |
|---|---|---|
| access_denied: Client is not within the token audience | orders-agent missing from the subject token's aud | Confirm the orders-agent-audience mapper exists, recreate the container, re-mint |
| invalid_client | Wrong id or secret for orders-agent | client_secret=orders-agent-secret must match the realm export |
| access_denied mentioning token exchange | Standard Token Exchange is off on orders-agent | Check the Capability config toggle on the client in the admin console |
| invalid_scope | orders:cancel not assigned to orders-agent | It must be a default or optional client scope on that client |
One subtlety: the exchange only ever sees the token you pass in, so a token minted before a mapper change will keep failing. Always re-mint the standard token after touching the realm. We will crack the new token open in a minute; first, spend it.
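For a client written in Go rather than driven by curl, the same request body could be assembled as in the sketch below. This is not the lab's code: exchangeForm is a hypothetical helper, and the values passed in main are the ones from the curl command above.

```go
package main

import (
	"fmt"
	"net/url"
)

const (
	grantTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	typeAccessToken    = "urn:ietf:params:oauth:token-type:access_token"
)

// exchangeForm builds the RFC 8693 form body: the user's token goes in as
// subject_token, and audience and scope describe the narrower token wanted.
func exchangeForm(clientID, clientSecret, subjectToken, audience, scope string) url.Values {
	return url.Values{
		"grant_type":           {grantTokenExchange},
		"client_id":            {clientID},
		"client_secret":        {clientSecret},
		"subject_token":        {subjectToken},
		"subject_token_type":   {typeAccessToken},
		"requested_token_type": {typeAccessToken},
		"audience":             {audience},
		"scope":                {scope},
	}
}

func main() {
	form := exchangeForm("orders-agent", "orders-agent-secret",
		"<standard token>", "orders-api", "orders:cancel")
	// Encode percent-escapes the values, as curl's -d would need done by hand
	// for anything that is not already URL-safe.
	fmt.Println(form.Encode())
}
```

Sending it is then a single http.PostForm call against the token endpoint held in KC_TOKEN, followed by decoding access_token from the JSON response.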
The write succeeds
The exchanged token lives for 120 seconds, so if more than two minutes have passed since the exchange, run it again first. Then repeat the cancel call with WRITE_TOKEN:
curl -s -X POST "http://localhost:3000/mcp/http" \
-H "authorization: Bearer ${WRITE_TOKEN}" \
-H "mcp-session-id: ${SESSION}" \
-H "content-type: application/json" \
-H "accept: application/json, text/event-stream" \
-d '{
"jsonrpc": "2.0",
"id": 4,
"method": "tools/call",
"params": {
"name": "cancel_order",
"arguments": { "orderId": "ord_123" }
}
}' | grep '^data:' | sed 's/^data: //' | jq
This time the result carries the cancelled order, and the actor is stamped right in the body:
{
"jsonrpc": "2.0",
"id": 4,
"result": {
"content": [
{
"type": "text",
"text": "{\"body\":{\"actor\":\"client:orders-agent\",\"orderId\":\"ord_123\",\"status\":\"CANCELLED\",\"subject\":\"1db033bd-8186-4de0-864d-a42b56b25cbb\"},\"status\":200}"
}
],
"structuredContent": {
"body": {
"actor": "client:orders-agent",
"orderId": "ord_123",
"status": "CANCELLED",
"subject": "1db033bd-8186-4de0-864d-a42b56b25cbb"
},
"status": 200
}
}
}
The service terminals now show the request clearing every check. Compare the sub here to the <nil> in the read log earlier: the exchanged token carries a real subject and the actor is the agent client:
# orders-mcp terminal
2026/07/05 00:41:47 forward POST http://localhost:9100/orders/ord_123/cancel -> 200
# order-api terminal
2026/07/05 00:41:47 cancel order=ord_123 sub=1db033bd-8186-4de0-864d-a42b56b25cbb actor=client:orders-agent allowed
The API validates scope, audience, and actor, then cancels the order. Close the loop by re-running the read call from earlier with the plain STANDARD_TOKEN: the status is now CANCELLED. And run the cancel once more with a fresh exchanged token: the API returns a conflict, because even a perfect token cannot cancel an order twice. Reads stayed cheap. The write paid its toll.
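The three checks named above might be ordered as in the rough Go sketch below. It assumes signature and expiry were verified before this point and that the actor is derived from the azp claim; both are assumptions about the Order API, whose real code is not shown in this section. Only the first error string appears in the lab output; the other two are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// claims holds the already-verified token fields the check needs.
type claims struct {
	Scope    string   // space-separated scope claim
	Audience []string // aud, normalized to a list
	AZP      string   // client the token was issued to
}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// validateCancel applies the write-tier rules: scope, then audience, then actor.
func validateCancel(c claims) error {
	if !contains(strings.Fields(c.Scope), "orders:cancel") {
		return errors.New("orders:cancel scope required")
	}
	if !contains(c.Audience, "orders-api") {
		return errors.New("token audience must include orders-api") // hypothetical message
	}
	if c.AZP != "orders-agent" {
		return errors.New("unexpected acting client") // hypothetical message
	}
	return nil
}

func main() {
	login := claims{Scope: "orders:read", Audience: []string{"user-client", "orders-agent"}, AZP: "user-client"}
	exchanged := claims{Scope: "orders:cancel", Audience: []string{"orders-api"}, AZP: "orders-agent"}
	fmt.Println(validateCancel(login))
	fmt.Println(validateCancel(exchanged))
}
```

Splitting the scope with strings.Fields avoids the substring trap where a scope such as orders:cancel-preview would satisfy a naive strings.Contains check.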
The exchanged token expires
You still have ord_456 sitting in PROCESSING, and the WRITE_TOKEN you just used is scoped to cancel. Wait out the token's lifetime:
sleep 120
Then try to cancel ord_456 with that same, now-stale token:
curl -s -X POST "http://localhost:3000/mcp/http" \
-H "authorization: Bearer ${WRITE_TOKEN}" \
-H "mcp-session-id: ${SESSION}" \
-H "content-type: application/json" \
-H "accept: application/json, text/event-stream" \
-d '{
"jsonrpc": "2.0",
"id": 5,
"method": "tools/call",
"params": {
"name": "cancel_order",
"arguments": { "orderId": "ord_456" }
}
}' | grep '^data:' | sed 's/^data: //' | jq
The API rejects it before it ever looks at the order, because the token is past its exp:
{
"jsonrpc": "2.0",
"id": 5,
"result": {
"content": [
{
"type": "text",
"text": "downstream API rejected call: {\"error\":\"invalid token\"}\n"
}
],
"isError": true
}
}
# order-api terminal
2026/07/05 00:44:03 cancel order=ord_456 rejected: invalid token
That is the whole point of the short lifetime: a leaked or lingering cancel token is worthless within two minutes, so the blast radius of a stolen credential is measured in seconds, not the 30 minutes the login token would give an attacker.
To cancel ord_456, mint a fresh exchanged token exactly as you did in the "Exchange the token" section above. The retry with the new token cancels ord_456 cleanly.
The exchange, visually
You've now seen the token exchange in action. Let's look at what actually changed between the original token and the exchanged token. The differences are what make the write request secure:
decode_jwt "$STANDARD_TOKEN" | jq '{sub, azp, scope, aud}'
decode_jwt "$WRITE_TOKEN" | jq '{sub, azp, scope, aud, iat, exp}'
The exchanged token looks like this:
{
"sub": "<demo-user id>",
"azp": "orders-agent",
"scope": "orders:cancel",
"aud": "orders-api",
"iat": 1783411200,
"exp": 1783411320
}
And the diff between the two is the entire security win of this guide:
| Claim | Login token | Exchanged token |
|---|---|---|
| scope | orders:read | orders:cancel |
| aud | user-client, orders-agent | orders-api |
| azp | user-client | orders-agent |
| lifetime (exp minus iat) | 30 minutes | 2 minutes |
Read it carefully:
- Same user: sub is unchanged.
- Different acting party: azp is now the agent client.
- Narrower scope: orders:cancel only.
- Locked audience: orders-api only.
- Shorter lifetime: 120 seconds instead of 30 minutes.
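The decode_jwt helper used above is a shell function. An equivalent inspection in Go might look like the sketch below. It does not verify the signature, so it is suitable only for looking at claims, never for authorization. The type and function names are hypothetical, and unsignedToken exists only to make the demo self-contained.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// audience accepts both JSON forms of aud: a single string or an array.
type audience []string

func (a *audience) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		return nil
	}
	var one string
	if err := json.Unmarshal(b, &one); err == nil {
		*a = audience{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(b, &many); err != nil {
		return err
	}
	*a = many
	return nil
}

type tokenClaims struct {
	Sub   string   `json:"sub"`
	Azp   string   `json:"azp"`
	Scope string   `json:"scope"`
	Aud   audience `json:"aud"`
	Iat   int64    `json:"iat"`
	Exp   int64    `json:"exp"`
}

// decodeClaims reads the payload segment of a compact JWT WITHOUT verifying it.
func decodeClaims(token string) (tokenClaims, error) {
	var c tokenClaims
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return c, errors.New("not a compact JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(payload, &c)
	return c, err
}

// unsignedToken wraps a JSON payload in JWT framing for the demo only.
func unsignedToken(payloadJSON string) string {
	enc := base64.RawURLEncoding.EncodeToString
	return enc([]byte(`{"alg":"none"}`)) + "." + enc([]byte(payloadJSON)) + "."
}

func main() {
	tok := unsignedToken(`{"azp":"orders-agent","scope":"orders:cancel","aud":"orders-api","iat":1783411200,"exp":1783411320}`)
	c, err := decodeClaims(tok)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("azp=%s scope=%q aud=%v lifetime=%ds\n", c.Azp, c.Scope, c.Aud, c.Exp-c.Iat)
}
```

Normalizing aud to a list matters here because the login token carries several audiences while the exchanged token carries a single one, and a JWT may encode those two cases differently.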
In summary, this is how the two tokens differ, and how token exchange works end to end:

This is RFC 8693 doing exactly what it was designed for: the user token goes in as the subject_token, and what comes out is downscoped, audience-bound, and short-lived.
What's next: a policy layer in front of every MCP server
Everything in this part enforces the tiers inside the application: the Order API validates signature, scope, audience, and actor itself, and that is exactly where the floor belongs. But application-level checks scale linearly with your fleet. Ten MCP servers means ten implementations of the same tier logic, ten places to update when a scope changes, and ten separate log streams to search when something gets denied.
In the next part, we add one layer in front of all of them: agentgateway validating JWTs and enforcing the same tier rules as CEL policy at a single front door. A bad token is refused before it ever reaches any MCP server, the cancel tool turns invisible to callers whose token cannot use it, and every denial lands in one gateway audit log instead of scattered application logs. Same contract, one hop earlier, once for the whole fleet.