Proxy Deployment

The customer proxy is the trust boundary between browser code and Google Customer Engagement Suite (CES). It holds the service-account-scoped credentials, applies redaction and rate limits, and emits the audit log your security team needs. Every regulated CES deployment (telco, banking, healthcare, retail) goes through a proxy like this.

apps/proxy-reference/ is a working production-grade implementation. It runs on Google Cloud Run, mints tokens via Workload Identity Federation, and is designed to be a customer security team's first read. This guide walks you through deploying it.

When you need a proxy

In production, the browser should never talk directly to CES. A proxy lets you:

  • Keep credentials server-side via Workload Identity Federation (no JSON keys at rest).
  • Redact PII before it leaves your network.
  • Rate-limit per (origin, sessionId).
  • Audit every request for compliance.
  • Apply custom authorization logic (session cookies, JWT, mTLS).

Browser  -->  Your Proxy  -->  CES

The browser only knows the proxy URL. Credentials, upstream URLs, and OAuth scopes stay on the server.

Quick deploy (under 30 minutes)

The full step-by-step playbook lives in apps/proxy-reference/README.md. The short version:

# Prerequisites
gcloud services enable run.googleapis.com customerengagementsuite.googleapis.com \
  iamcredentials.googleapis.com sts.googleapis.com secretmanager.googleapis.com

# Service account + IAM
gcloud iam service-accounts create gecx-proxy
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:gecx-proxy@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/customerengagementsuite.client"

# WIF pool + provider (full commands in the README)
gcloud iam workload-identity-pools create gecx-proxy-pool --location=global
# ... see README for the OIDC provider, attribute mapping, and IAM binding.

# Deploy
gcloud run deploy gecx-proxy \
  --source=apps/proxy-reference \
  --region=us-central1 \
  --service-account="gecx-proxy@${PROJECT_ID}.iam.gserviceaccount.com" \
  --set-env-vars="NODE_ENV=production,GECX_AUTH_MODE=wif,ALLOWED_ORIGINS=https://app.example.com,GECX_CHAT_STREAM_URL=..." \
  --set-secrets="/var/secrets/gecx-wif-config/wif.json=gecx-wif-config:latest" \
  --no-allow-unauthenticated

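# Verify (Cloud Run enforces IAM because of --no-allow-unauthenticated, so send an identity token)
curl -fsS -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  "$(gcloud run services describe gecx-proxy --region=us-central1 --format='value(status.url)')/health"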

A cloudrun.yaml manifest is included at apps/proxy-reference/cloudrun.yaml for gcloud run services replace.

Authentication modes

GECX_AUTH_MODE selects how the proxy mints chat tokens. The mode is resolved once at startup. A misconfiguration logs an event with kind: auth.misconfigured, and /chat/token returns 500 PROXY_AUTH_MISCONFIGURED until it is fixed. /health keeps responding so the deployment doesn't churn.

  • wif — production default. Reads a WIF config file at GOOGLE_APPLICATION_CREDENTIALS and exchanges it with Google STS via ExternalAccountClient.fromJSON. No JSON keys in the container.
  • service-account — non-prod testing escape hatch. Reads a service-account key. Emits a one-time stderr warning. Refused unless GECX_AUTH_MODE=service-account is set explicitly.
  • mock — tests and local dev. Opaque tokens. Refused entirely when NODE_ENV=production.
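The refusal rules above can be sketched as a small startup check. This is a minimal illustration, not the reference proxy's code: resolveAuthMode is a hypothetical name, and falling back to wif when GECX_AUTH_MODE is unset is an assumption drawn from "production default".

```typescript
type AuthMode = "wif" | "service-account" | "mock";

type AuthModeResult =
  | { ok: true; mode: AuthMode }
  | { ok: false; reason: string };

const AUTH_MODES: readonly AuthMode[] = ["wif", "service-account", "mock"];

// Hypothetical helper: decides the auth mode once, at startup.
function resolveAuthMode(env: Record<string, string | undefined>): AuthModeResult {
  // Assumption: an unset variable means the production default (wif), so
  // service-account is only ever reached when it is requested explicitly.
  const raw = env["GECX_AUTH_MODE"] ?? "wif";
  const mode = AUTH_MODES.find((m) => m === raw);
  if (mode === undefined) {
    return { ok: false, reason: `unknown GECX_AUTH_MODE: ${raw}` };
  }
  if (mode === "mock" && env["NODE_ENV"] === "production") {
    return { ok: false, reason: "mock mode is refused when NODE_ENV=production" };
  }
  if (mode === "wif" && !env["GOOGLE_APPLICATION_CREDENTIALS"]) {
    return { ok: false, reason: "wif mode needs GOOGLE_APPLICATION_CREDENTIALS" };
  }
  return { ok: true, mode };
}
```

A failed result would correspond to the auth.misconfigured state: the server keeps running and answers /health, while /chat/token returns the 500 error.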

Routes

The proxy must implement these routes. The SDK expects them at these paths when you configure createProxyTransport.

GET /health

Liveness probe. Returns { status: 'ok', version, authConfigured }. Used by Cloud Run, Kubernetes, and load balancers. No auth required, never rate-limited.

/healthz is a legacy alias that returns { ok: true } — kept for Kubernetes-style probes that hardcode the older path.

POST /chat/token

Issues a short-lived chat token to the browser. The proxy authenticates the caller (session cookie, JWT, etc.), mints a CES-scoped access token via the configured auth mode, and returns it.

Response body: { token, expiresAt, tokenType, scopes? }

The browser never sees the WIF config or the service-account credential.

POST /chat/stream

Forwards a send request to the CES streamRunSession endpoint and streams the SSE / NDJSON response back. The proxy parses the JSON body, applies redaction, attaches the WIF-minted bearer token, and pipes the upstream stream to the browser.

Request: JSON body with sessionId and message payload. X-Goog-Request-Id header (or legacy Idempotency-Key) is preserved.

Response: Content-Type matches CES (application/x-ndjson or text/event-stream). The proxy also sets Cache-Control: no-store and X-Accel-Buffering: no so that intermediaries neither cache nor buffer the stream.
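The header handling for this route might look like the sketch below. The helper names are hypothetical, the header map is assumed to use lowercased keys, and the NDJSON fallback for a missing upstream Content-Type is an assumption rather than documented behavior.

```typescript
// Simplified header map; keys are assumed to be lowercased already.
type HeaderMap = Record<string, string | undefined>;

// Pick the request ID to forward upstream: X-Goog-Request-Id wins, with the
// legacy Idempotency-Key as a fallback.
function pickRequestId(headers: HeaderMap): string | undefined {
  return headers["x-goog-request-id"] ?? headers["idempotency-key"];
}

// Build the headers for the streamed reply. Content-Type is copied from the
// upstream response; the fallback value is an assumption.
function streamResponseHeaders(upstreamContentType: string | undefined): Record<string, string> {
  return {
    "Content-Type": upstreamContentType ?? "application/x-ndjson",
    "Cache-Control": "no-store",
    "X-Accel-Buffering": "no",
  };
}
```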

POST /chat/upload

Forwards multipart file uploads to the upstream upload endpoint. CES v1 prefers inline blobs (SessionInput.blob = { mimeType, data }) over a dedicated upload lane; leave GECX_CHAT_UPLOAD_URL empty unless you front a custom upload service.

The reference proxy validates every file before forwarding:

  • MIME allowlist from UPLOAD_ALLOWED_MIME_TYPES including wildcards such as image/*.
  • Per-file size limit from UPLOAD_MAX_BYTES.
  • File count limit from UPLOAD_MAX_FILES.
  • Rate limit by (origin, sessionId).
  • Metadata-only scan hook. The default hook returns passed; replace it with a malware/PII scanner in production.
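The MIME, size, and count checks can be sketched as follows. This is an illustration under assumptions: the function and type names are hypothetical, and only the environment variable names and defaults come from this guide. Rate limiting and the scan hook are left out.

```typescript
interface UploadLimits {
  allowedMimeTypes: string[];
  maxBytes: number;
  maxFiles: number;
}

interface FileMeta {
  name: string;
  mimeType: string;
  sizeBytes: number;
}

// Read limits from the environment, falling back to the documented defaults.
function loadUploadLimits(env: Record<string, string | undefined>): UploadLimits {
  const mimes = env["UPLOAD_ALLOWED_MIME_TYPES"] ?? "image/*,text/plain,application/pdf";
  return {
    allowedMimeTypes: mimes.split(",").map((m) => m.trim().toLowerCase()).filter((m) => m.length > 0),
    maxBytes: Number(env["UPLOAD_MAX_BYTES"] ?? 10485760),
    maxFiles: Number(env["UPLOAD_MAX_FILES"] ?? 5),
  };
}

// Match a MIME type against the allowlist; "image/*" matches any "image/..." type.
function mimeAllowed(mimeType: string, allowlist: string[]): boolean {
  const semi = mimeType.indexOf(";"); // drop parameters such as "; charset=utf-8"
  const base = (semi === -1 ? mimeType : mimeType.slice(0, semi)).trim().toLowerCase();
  return allowlist.some((pattern) =>
    pattern.endsWith("/*") ? base.startsWith(pattern.slice(0, -1)) : base === pattern,
  );
}

// Return a list of human-readable problems; an empty list means the upload passes.
function validateUpload(files: FileMeta[], limits: UploadLimits): string[] {
  const errors: string[] = [];
  if (files.length > limits.maxFiles) {
    errors.push(`too many files: ${files.length} > ${limits.maxFiles}`);
  }
  for (const f of files) {
    if (!mimeAllowed(f.mimeType, limits.allowedMimeTypes)) {
      errors.push(`${f.name}: MIME type ${f.mimeType} is not allowed`);
    }
    if (f.sizeBytes > limits.maxBytes) {
      errors.push(`${f.name}: exceeds ${limits.maxBytes} bytes`);
    }
  }
  return errors;
}
```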

Audit events include route, session, MIME type, size, count, scan status, and upstream status. They do not include file bytes, base64, extracted text, or file contents.

Response body should include { attachmentId?, url, scanStatus?, metadata? }. createProxyTransport maps that into UploadProgressEvent; ChatSession.attachFile(file) then stages a typed file part when lifecycle reaches ready_to_send.

POST /chat/tool-call

Server-tool dispatch. The reference implementation includes a local dispatch table for tools like apply_refund and update_shipping_address. If GECX_CHAT_TOOL_URL is set, requests are forwarded upstream.

Server tools run with service-account-scoped credentials that never reach the browser. Database writes, payment processing, and entitlement checks belong here.

For local dispatch, createServerToolHandler supports manifest-shaped definitions with inputSchema, outputSchema, approvalPolicy, sideEffectLevel, idempotency, auth, audit, and timeoutMs. The reference apply_refund and update_shipping_address actions require Idempotency-Key, validate inputs and outputs, record duplicate disposition, thread ctx.signal, and return one of these status envelopes:

{ "status": "completed", "output": {}, "idempotencyKey": "..." }
{ "status": "denied", "error": "not allowed", "errorCode": "SERVER_TOOL_UNAUTHORIZED" }
{ "status": "pending", "approvalPolicy": "supervisor_approve", "error": "pending approval" }
{ "status": "duplicate", "duplicateDisposition": "replayed", "output": {} }
{ "status": "failed", "error": "validation failed", "errorCode": "TOOL_VALIDATION_FAILED" }
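On the calling side, these envelopes fit a discriminated union keyed on status. The type below is inferred only from the five samples above, so which fields are required is an assumption; summarize is a hypothetical helper.

```typescript
// Field sets mirror the sample envelopes; the real manifest types may carry more.
type ToolEnvelope =
  | { status: "completed"; output: unknown; idempotencyKey: string }
  | { status: "denied"; error: string; errorCode: string }
  | { status: "pending"; approvalPolicy: string; error: string }
  | { status: "duplicate"; duplicateDisposition: string; output: unknown }
  | { status: "failed"; error: string; errorCode: string };

// Exhaustive switch: adding a sixth status to the union makes this fail to compile.
function summarize(envelope: ToolEnvelope): string {
  switch (envelope.status) {
    case "completed":
      return `completed (key ${envelope.idempotencyKey})`;
    case "denied":
      return `denied: ${envelope.errorCode}`;
    case "pending":
      return `pending: ${envelope.approvalPolicy}`;
    case "duplicate":
      return `duplicate: ${envelope.duplicateDisposition}`;
    case "failed":
      return `failed: ${envelope.errorCode}`;
    default: {
      const unreachable: never = envelope;
      throw new Error(`unknown envelope: ${JSON.stringify(unreachable)}`);
    }
  }
}
```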

DELETE /chat/conversations/:sessionId

Used by the governance deleteConversation method.

GET /chat/conversations/:sessionId/export

Used by the governance exportConversation method. Accepts an optional ?format=json or ?format=ndjson query parameter.

POST /chat/forget-me

Right-to-be-forgotten endpoint. Accepts { userId, sessionIds?, reason? } and fans out delete requests for each session. Returns 202 on success or 207 if some deletes failed.
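The fan-out and the 202/207 decision can be sketched like this. All names are hypothetical, deleteSession stands in for whatever call the proxy makes per session, and treating a missing sessionIds as an empty list is an assumption (the reference may instead look sessions up by userId).

```typescript
interface ForgetMeRequest {
  userId: string;
  sessionIds?: string[];
  reason?: string;
}

interface DeleteOutcome {
  sessionId: string;
  ok: boolean;
}

// 202 when every delete succeeded, 207 when at least one failed.
function forgetMeStatus(outcomes: DeleteOutcome[]): 202 | 207 {
  return outcomes.every((o) => o.ok) ? 202 : 207;
}

// Run one delete per session in parallel and record each result instead of
// letting a single failure reject the whole batch.
async function forgetMe(
  req: ForgetMeRequest,
  deleteSession: (sessionId: string) => Promise<void>,
): Promise<{ status: 202 | 207; outcomes: DeleteOutcome[] }> {
  const ids = req.sessionIds ?? []; // assumption: no IDs means nothing to fan out
  const outcomes = await Promise.all(
    ids.map(async (sessionId): Promise<DeleteOutcome> => {
      try {
        await deleteSession(sessionId);
        return { sessionId, ok: true };
      } catch {
        return { sessionId, ok: false };
      }
    }),
  );
  return { status: forgetMeStatus(outcomes), outcomes };
}
```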

POST /chat/voice-token

Mints a short-lived ephemeral token for Gemini Live so the browser can open the realtime WebSocket directly. The proxy validates the chat session, exchanges a server-side Gemini API key (or service account credential) for an ephemeral token, and returns { token, expiresAt, model, voice }.

The reference implementation in apps/proxy-reference/src/server.ts (handleVoiceToken) supports a stub mode for local dev (ALLOW_STUB_VOICE_TOKEN=1, NODE_ENV !== 'production'). The production code path, shown as inline comments, calls @google/genai's authTokens.create.

Computer-use routes

Three routes back the computer_use server tool when enabled (one existing, two additional):

  • POST /chat/tool-call (existing) routes the computer_use tool name to a createComputerUseHandler instance, which spins up a ComputerUseSession with the configured ComputerUseProvider (Browserbase, Mock, or a custom adapter).
  • GET /chat/computer-use/:sessionId/stream — signed SSE stream of PNG screenshots and structured action-log events. The signature uses HMAC over (sessionId, expiry) with constant-time verification and a 60s TTL. The verifier runs before any session lookup so callers cannot probe for valid session IDs.
  • POST /chat/computer-use/:sessionId/control — per-action approval decisions, abort, and the global admin kill switch.
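The signed-stream check described above (HMAC over sessionId and expiry, constant-time comparison, 60s TTL) might be implemented along these lines. SHA-256, the hex encoding, the "sessionId.expiry" message layout, and the function names are all assumptions; the guide does not specify them.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const STREAM_TTL_MS = 60_000; // 60s TTL from the route description

// Sign (sessionId, expiry). The message layout is an assumption.
function signStream(sessionId: string, expiresAtMs: number, secret: string): string {
  return createHmac("sha256", secret).update(`${sessionId}.${expiresAtMs}`).digest("hex");
}

// Verify before any session lookup, so an invalid signature reveals nothing
// about which session IDs exist.
function verifyStream(
  sessionId: string,
  expiresAtMs: number,
  signature: string,
  secret: string,
  nowMs: number = Date.now(),
): boolean {
  if (!Number.isInteger(expiresAtMs)) return false;
  const expected = Buffer.from(signStream(sessionId, expiresAtMs, secret), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so reject those up front.
  if (given.length !== expected.length) return false;
  const signatureOk = timingSafeEqual(given, expected);
  return signatureOk && expiresAtMs > nowMs;
}
```

A caller would mint a URL with expiresAtMs = Date.now() + STREAM_TTL_MS and pass both the expiry and the signature as query parameters.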

Every action emits a governance.computer_use.* audit event through the existing ChatGovernance sink with request-id correlation. See Computer-use and the threat model.

IAM bindings

The proxy service account needs:

| Role | Bound on | Why |
| --- | --- | --- |
| roles/customerengagementsuite.client | Project | Call CES streamRunSession and generateChatToken. |
| roles/iam.workloadIdentityUser | The WIF pool's principal | Exchange the Cloud Run service identity for an impersonated CES token. |
| roles/secretmanager.secretAccessor | The gecx-wif-config secret | Read the WIF config file mounted into the container. |
| roles/logging.logWriter | Project | Emit structured access and audit logs. |

The CES IAM role surfaces as roles/customerengagementsuite.client in most projects. Verify with gcloud iam roles list --filter='name~customerengagement' if it's not found in yours.

Environment variables (uploads, rate limiting, redaction)

| Variable | Purpose | Default |
| --- | --- | --- |
| PORT | Server port | 8080 |
| ALLOWED_ORIGINS | Comma-separated list of allowed CORS origins | http://localhost:3000 |
| GECX_CHAT_STREAM_URL | Upstream stream endpoint URL | (required for streaming) |
| GECX_CHAT_UPLOAD_URL | Upstream upload endpoint URL | (required for uploads) |
| GECX_CHAT_TOOL_URL | Upstream tool-call endpoint URL | (optional; uses local dispatch if unset) |
| GECX_CHAT_DELETE_URL_BASE | Upstream delete endpoint base URL | (optional; uses dev stub if unset) |
| GECX_CHAT_EXPORT_URL_BASE | Upstream export endpoint base URL | (optional; uses dev stub if unset) |
| UPSTREAM_TOKEN | Bearer token for upstream API calls | (empty) |
| REDACT_KEYS | Comma-separated list of additional field names to redact | (empty) |
| RATE_LIMIT_MAX | Max requests per window per (origin, session) pair | 60 |
| RATE_LIMIT_WINDOW_MS | Rate-limit window in milliseconds | 60000 |
| UPLOAD_ALLOWED_MIME_TYPES | Comma-separated MIME allowlist for /chat/upload; supports image/* style wildcards | image/*,text/plain,application/pdf |
| UPLOAD_MAX_BYTES | Max size per uploaded file in bytes | 10485760 |
| UPLOAD_MAX_FILES | Max files per upload request | 5 |

Security features

Redaction

The default redaction list covers password, secret, apiKey, api_key, token, accessToken, access_token, privateKey, private_key, creditCard, credit_card, ssn, socialSecurityNumber. Pattern-based rules also catch Google API keys (AIza…), PEM private-key headers, and credit-card-shaped digit runs.

Add custom keys via REDACT_KEYS. Redaction is destructive; the upstream never sees the original values.
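Key-based redaction can be sketched as a recursive walk over the request body. This is an illustration only: buildRedactor is a hypothetical name, the "[REDACTED]" placeholder and case-insensitive key matching are assumptions, and the pattern-based rules are not shown.

```typescript
const DEFAULT_REDACT_KEYS = [
  "password", "secret", "apiKey", "api_key", "token", "accessToken", "access_token",
  "privateKey", "private_key", "creditCard", "credit_card", "ssn", "socialSecurityNumber",
];

// Build a redactor from the defaults plus extra keys (for example, REDACT_KEYS
// split on commas). It returns a copy and never mutates its input.
function buildRedactor(extraKeys: string[] = []): (value: unknown) => unknown {
  const keys = new Set([...DEFAULT_REDACT_KEYS, ...extraKeys].map((k) => k.toLowerCase()));
  const redact = (value: unknown): unknown => {
    if (Array.isArray(value)) return value.map(redact);
    if (value !== null && typeof value === "object") {
      const out: Record<string, unknown> = {};
      for (const [key, inner] of Object.entries(value)) {
        // Assumption: keys match case-insensitively.
        out[key] = keys.has(key.toLowerCase()) ? "[REDACTED]" : redact(inner);
      }
      return out;
    }
    return value; // primitives pass through unchanged
  };
  return redact;
}
```

Because the function returns a new object, only the redacted copy needs to be forwarded upstream, which matches the destructive behavior described above.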

Rate limiting

In-memory (origin, sessionId) token bucket. Default: 60 requests per 60 seconds. When exceeded, the proxy returns 429 with Retry-After. For multi-instance deployments, swap the in-memory limiter for a Redis- or Memorystore-backed implementation — the function signature stays the same.
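A minimal in-memory token bucket keyed on (origin, sessionId) could look like the sketch below. The names are hypothetical and the reference limiter may differ in its details; this version also never evicts idle buckets, which a real deployment would need.

```typescript
interface RateDecision {
  allowed: boolean;
  retryAfterSec: number; // value for the Retry-After header when denied
}

// Defaults mirror RATE_LIMIT_MAX=60 and RATE_LIMIT_WINDOW_MS=60000.
function createRateLimiter(max = 60, windowMs = 60_000) {
  const buckets = new Map<string, { tokens: number; updatedAt: number }>();
  const refillPerMs = max / windowMs;

  return function take(origin: string, sessionId: string, now: number = Date.now()): RateDecision {
    const key = `${origin}|${sessionId}`;
    const prev = buckets.get(key) ?? { tokens: max, updatedAt: now };
    // Refill in proportion to elapsed time, capped at the bucket size.
    const tokens = Math.min(max, prev.tokens + (now - prev.updatedAt) * refillPerMs);
    if (tokens >= 1) {
      buckets.set(key, { tokens: tokens - 1, updatedAt: now });
      return { allowed: true, retryAfterSec: 0 };
    }
    buckets.set(key, { tokens, updatedAt: now });
    // Time until one whole token is available again, rounded up to seconds.
    return { allowed: false, retryAfterSec: Math.ceil((1 - tokens) / refillPerMs / 1000) };
  };
}
```

Keeping the (origin, sessionId, now) call shape is what would let a Redis- or Memorystore-backed limiter replace this one without touching the route handlers.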

Audit logging

Every route emits structured audit events to stdout as JSON. Cloud Logging picks them up automatically. Events include token.issued, stream.started, stream.completed, tool.invoked, rate.limited, conversation.delete_requested, and the user-erasure lifecycle. Swap the sink in apps/proxy-reference/src/server.ts for your SIEM.

Structured access logs

Every request also emits an http.access JSON line with requestId, route, method, status, latencyMs, and origin. The requestId is echoed in the response X-Request-Id header — pass it to the SDK's onError callback for end-to-end correlation in Cloud Logging.
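As a rough illustration, an http.access line with the fields listed above could be produced like this. The kind property name is an assumption (borrowed from the kind: auth.misconfigured wording earlier in this guide), and accessLogLine is a hypothetical helper.

```typescript
interface AccessLogEntry {
  requestId: string;
  route: string;
  method: string;
  status: number;
  latencyMs: number;
  origin: string | null;
}

// Serialize one request as a single JSON line for stdout.
function accessLogLine(entry: AccessLogEntry): string {
  return JSON.stringify({ kind: "http.access", ...entry });
}
```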

Server tool audit events include tool.invoked, tool.completed, tool.denied, tool.pending, tool.duplicate, and tool.error. They include the tool name, session ID, tool call ID, approval policy, audit classification, idempotency key, and duplicate disposition when applicable. The reference does not log raw tool input or service credentials by default.

CORS

Origins are checked against an explicit allowlist. Access-Control-Allow-Origin echoes the matched origin — never *.
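The allowlist check reduces to an exact string match against ALLOWED_ORIGINS. The sketch below uses hypothetical helper names; adding Vary: Origin is common practice for per-origin responses and is an assumption about the reference proxy.

```typescript
// Parse ALLOWED_ORIGINS, falling back to the documented default.
function parseAllowedOrigins(raw: string | undefined): string[] {
  return (raw ?? "http://localhost:3000")
    .split(",")
    .map((o) => o.trim())
    .filter((o) => o.length > 0);
}

// Echo the request origin only on an exact allowlist match. Unknown origins
// get no CORS headers at all, so the browser blocks the response.
function corsHeaders(requestOrigin: string | undefined, allowed: string[]): Record<string, string> {
  if (!requestOrigin || !allowed.includes(requestOrigin)) return {};
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Vary": "Origin", // assumption: keeps caches from reusing one origin's response for another
  };
}
```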

Tracing

The server exports setTraceHook so OpenTelemetry can be wired without imposing it as a dependency. The reference ships no @opentelemetry/* package; hosts register a hook before startup. Example in apps/proxy-reference/README.md.

Container

The bundled Dockerfile is multi-stage, built on node:20-alpine, and runs as the node user (uid 1000). Image size is under 200 MB even with google-auth-library. HEALTHCHECK probes /health every 30 seconds.

docker build apps/proxy-reference -t gecx-proxy:test
docker run --rm -p 8080:8080 -e GECX_AUTH_MODE=mock gecx-proxy:test
curl -fsS http://localhost:8080/health

Doctor

After deployment, verify the proxy with the SDK doctor:

npx gecx doctor \
  --token-endpoint https://your-proxy.run.app/chat/token \
  --deployment your-deployment-id

What's next

  • Data Governance — how deleteConversation and forgetMe call through the proxy.
  • Server Tools — defining server-side tools that run behind the proxy.
  • Error Handling — how proxy errors map to SDK error codes.
Source: docs/guides/proxy-deployment.md