Permissions

Device permissions (microphone, camera, screen, geolocation) are a first-class concern in the GECX Chat SDK. A single PermissionManager orchestrates requests across capabilities and platforms, syncs grants to governance consent flags, and emits audit events on every state change. Voice, multimodal capture, and host-level features depend on it.

The platform integration contract (browser, Expo/RN, Capacitor, kiosk) and full provider API live in Permission Providers. This page covers the mental model.

What permissions cover

Four MediaCapability values are first-class:

microphone — required by VoiceSession and captureFromMicrophone.
camera — required by captureFromCamera (and any future vision capture surface).
screen — required by captureFromScreen for screen-share moments.
geolocation — required by location-aware tools.

These are orthogonal to the SDK's API auth (which lives in auth) and to the user's consent posture (none | functional | analytics | all in ChatGovernance). Even a user at the all consent posture still has to grant microphone access explicitly the first time voice is used.

The manager + provider split

ChatClient
  └── permissions: PermissionManager  ← always present
                       │
                       └── provider: PermissionProvider  ← swappable

PermissionManager is the orchestrator. It owns:

Concurrent-request de-duplication (in-flight requests for the same capability share one promise).
The capability-to-consent-flag mapping.
The event stream (granted, revoked, denied, unsupported).
The audit pipeline (every transition emits a governance.permission_* event).
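The de-duplication responsibility listed above can be illustrated with a small sketch. It is not the SDK's implementation: `dedupeRequests` and the `PermissionResult` shape are hypothetical names chosen for illustration, and the only behavior taken from this page is that in-flight requests for the same capability share one promise.

```typescript
type MediaCapability = 'microphone' | 'camera' | 'screen' | 'geolocation';

// Assumed result shape, modeled on the statuses named on this page.
type PermissionResult = {
  status: 'granted' | 'denied' | 'unsupported' | 'prompt_blocked';
  reason?: string;
};

// Hypothetical helper: wraps an async request function so that concurrent
// calls for the same capability reuse a single in-flight promise.
function dedupeRequests(
  requestOnce: (capability: MediaCapability) => Promise<PermissionResult>,
): (capability: MediaCapability) => Promise<PermissionResult> {
  const inFlight = new Map<MediaCapability, Promise<PermissionResult>>();

  return (capability) => {
    const pending = inFlight.get(capability);
    if (pending) return pending;

    // Forget the entry once the request settles so a later call can prompt again.
    const started = requestOnce(capability).finally(() => {
      inFlight.delete(capability);
    });
    inFlight.set(capability, started);
    return started;
  };
}
```

With this shape, two components that ask for the microphone in the same tick trigger one platform prompt, while a request made after the first has settled starts a fresh one.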

PermissionProvider is the platform seam. The SDK ships three:

| Provider | Subpath | Use |
| --- | --- | --- |
| `createBrowserPermissionProvider` | `gecx-chat` | Real `navigator.mediaDevices` + `navigator.permissions` calls. Default when `navigator` exists. |
| `createMockPermissionProvider` | `gecx-chat` | Scripted grants/denials for tests. |
| `createNoopPermissionProvider` | `gecx-chat` | Always returns `unsupported`. Default in SSR. |

Native runtimes plug in their own adapter — Expo/RN, Capacitor, and kiosk skeletons are in Permission Providers. The contract is small: implement query, request, revoke?, and subscribe?.
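To make the size of that contract concrete, here is a rough sketch of what an adapter might look like. Only the four method names (query, request, revoke, subscribe) come from this page. The parameter and return types, the `PermissionResult` shape, and `createKioskPermissionProvider` are assumptions made for illustration; the authoritative signatures are in Permission Providers.

```typescript
type MediaCapability = 'microphone' | 'camera' | 'screen' | 'geolocation';

// Assumed result shape; see Permission Providers for the real one.
interface PermissionResult {
  status: 'granted' | 'denied' | 'unsupported' | 'prompt_blocked';
  reason?: string;
}

// Assumed signatures. Only the method names are taken from this page.
interface PermissionProvider {
  query(capability: MediaCapability): Promise<PermissionResult>;
  request(capability: MediaCapability): Promise<PermissionResult>;
  revoke?(capability: MediaCapability): Promise<void>;
  subscribe?(
    listener: (capability: MediaCapability, result: PermissionResult) => void,
  ): () => void;
}

// Hypothetical adapter for a kiosk where only the microphone is wired up.
function createKioskPermissionProvider(): PermissionProvider {
  const supported: ReadonlySet<MediaCapability> = new Set<MediaCapability>(['microphone']);
  const granted = new Set<MediaCapability>();

  return {
    async query(capability) {
      if (!supported.has(capability)) return { status: 'unsupported' };
      return granted.has(capability) ? { status: 'granted' } : { status: 'denied' };
    },
    async request(capability) {
      if (!supported.has(capability)) return { status: 'unsupported' };
      // A real adapter would call into the native permission API here.
      granted.add(capability);
      return { status: 'granted' };
    },
    async revoke(capability) {
      granted.delete(capability);
    },
    // subscribe is optional and omitted in this sketch.
  };
}
```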

Errors are data; ensure() is the exception path

The manager exposes two methods for the same underlying request:

permissions.request(capability) never throws. Failure is returned as data: { status: 'denied' | 'unsupported' | 'prompt_blocked', reason? }. Callers that want to render setup guidance use this.
permissions.ensure(capability) is the throwing convenience. On any non-granted result, it throws a ChatSdkError with the appropriate PERMISSION_* code. VoiceSession.start() and captureFromDevice() use this internally so their callers get exception semantics.

An insecure context is reported separately from a denial. Because getUserMedia requires HTTPS or localhost, request() on an insecure origin returns { status: 'unsupported', reason: 'insecure_context' } and ensure() throws PERMISSION_INSECURE_CONTEXT. Hosts use this to render "this feature requires HTTPS" guidance instead of confusing the user with a generic denial.
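The relationship between the two methods can be sketched as a throwing wrapper over a non-throwing request. This is an illustration under assumptions, not the SDK's code: `SketchPermissionError` stands in for ChatSdkError, `ensureGranted` is a hypothetical name, and the codes PERMISSION_UNSUPPORTED and PERMISSION_PROMPT_BLOCKED are guesses at the rest of the PERMISSION_* family. Only PERMISSION_DENIED and PERMISSION_INSECURE_CONTEXT are named on this page.

```typescript
type PermissionFailure = {
  status: 'denied' | 'unsupported' | 'prompt_blocked';
  reason?: string;
};
type PermissionResult = { status: 'granted' } | PermissionFailure;

// Stand-in for ChatSdkError, which this sketch does not import.
class SketchPermissionError extends Error {
  constructor(readonly code: string, message: string) {
    super(message);
    this.name = 'SketchPermissionError';
  }
}

function errorCodeFor(failure: PermissionFailure): string {
  if (failure.status === 'unsupported' && failure.reason === 'insecure_context') {
    return 'PERMISSION_INSECURE_CONTEXT';
  }
  if (failure.status === 'denied') return 'PERMISSION_DENIED';
  // Assumed codes: not confirmed by this page.
  return failure.status === 'unsupported'
    ? 'PERMISSION_UNSUPPORTED'
    : 'PERMISSION_PROMPT_BLOCKED';
}

// Hypothetical wrapper: turns "failure as data" into "failure as exception".
async function ensureGranted(request: () => Promise<PermissionResult>): Promise<void> {
  const result = await request();
  if (result.status !== 'granted') {
    throw new SketchPermissionError(errorCodeFor(result), `Permission ${result.status}`);
  }
}
```

A settings screen would typically branch on the returned data, while a call site that cannot proceed without the capability would prefer the throwing form.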

Stream ownership transfers

When getUserMedia returns a MediaStream, the manager hands it to the caller and keeps no references. The caller (typically VoiceSession or captureFromDevice) is responsible for stopping tracks when done. captureFromCamera, captureFromMicrophone, and captureFromScreen do this automatically and return a File to the upload pipeline.

This avoids a class of bug where the SDK retains a hot stream after the consumer has finished — which would surface as a stuck "in-use" indicator in the browser tab.
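For callers that take ownership of a stream themselves, one way to honor that responsibility is to stop every track in a finally block. The helper below is a sketch with hypothetical names (`withOwnedStream`, `StreamLike`, `TrackLike`). The structural interfaces mirror the standard `MediaStream.getTracks()` and `MediaStreamTrack.stop()` methods so the example does not depend on DOM typings.

```typescript
// Minimal structural stand-ins for the DOM MediaStreamTrack and MediaStream.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}

// Hypothetical helper: acquires a stream, runs the consumer, and always
// stops every track afterwards, even if the consumer throws.
async function withOwnedStream<T>(
  acquire: () => Promise<StreamLike>,
  use: (stream: StreamLike) => Promise<T>,
): Promise<T> {
  const stream = await acquire();
  try {
    return await use(stream);
  } finally {
    for (const track of stream.getTracks()) {
      track.stop();
    }
  }
}
```

In a browser, `acquire` could be a function that resolves with the stream obtained from the manager, and a real MediaStream satisfies `StreamLike` structurally.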

Governance integration

Permissions are not just a UX concern — every grant is an auditable event. Four new ConsentFlag values mirror the capabilities:

microphone_capture
camera_capture
screen_capture
geolocation_capture

When PermissionManager records a grant, it auto-grants the corresponding ConsentFlag on ChatGovernance. When the manager records a revoke, it withdraws the flag. The link is one-directional (manager → governance) so that governance-only consent rollups (e.g., "the user has consented to analytics") never have to consult the permission system.

Two new GovernanceAuditKind values surface in the audit log:

permission_granted — capability, timestamp, requester.
permission_revoked — capability, timestamp, requester, reason.
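The one-directional sync and the two audit kinds can be pictured with a short sketch. The flag names, audit kinds, and audit fields come from this page. `GovernanceSketch`, `recordGrant`, `recordRevoke`, and the concrete field types are hypothetical stand-ins and do not reflect the real ChatGovernance API.

```typescript
type MediaCapability = 'microphone' | 'camera' | 'screen' | 'geolocation';
type CaptureConsentFlag =
  | 'microphone_capture'
  | 'camera_capture'
  | 'screen_capture'
  | 'geolocation_capture';

// Capability-to-consent-flag mapping, as listed on this page.
const CONSENT_FLAG_FOR: Record<MediaCapability, CaptureConsentFlag> = {
  microphone: 'microphone_capture',
  camera: 'camera_capture',
  screen: 'screen_capture',
  geolocation: 'geolocation_capture',
};

// Assumed entry shape; field types are guesses.
interface PermissionAuditEntry {
  kind: 'permission_granted' | 'permission_revoked';
  capability: MediaCapability;
  timestamp: number;
  requester: string;
  reason?: string;
}

// Hypothetical stand-in for ChatGovernance. It never calls back into permissions.
class GovernanceSketch {
  readonly flags = new Set<CaptureConsentFlag>();
  readonly audit: PermissionAuditEntry[] = [];
}

function recordGrant(gov: GovernanceSketch, capability: MediaCapability, requester: string): void {
  gov.flags.add(CONSENT_FLAG_FOR[capability]);
  gov.audit.push({ kind: 'permission_granted', capability, timestamp: Date.now(), requester });
}

function recordRevoke(
  gov: GovernanceSketch,
  capability: MediaCapability,
  requester: string,
  reason: string,
): void {
  gov.flags.delete(CONSENT_FLAG_FOR[capability]);
  gov.audit.push({ kind: 'permission_revoked', capability, timestamp: Date.now(), requester, reason });
}
```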

The existing voice_recording consent flag is untouched. It governs retention of audio bytes (separate from capture permission) and continues to work as documented in Voice and Multimodal.

Voice integration

VoiceSession.start() is wired through the manager:

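// Excerpt from VoiceSession: the microphone is acquired before any network work.
async start() {
  // Throws a ChatSdkError (for example PERMISSION_DENIED) unless access is granted.
  const stream = await this.permissions.ensure('microphone');
  // ...open realtime session with stream
}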

If the user has not granted microphone access, ensure() throws PERMISSION_DENIED (or PERMISSION_INSECURE_CONTEXT, etc.) before any WebSocket is opened. The error surfaces through ChatSession.error and the React <VoiceToggle> component renders inline remediation.

This is also why chat.voice is a lazy getter: configuring voice: 'auto' does not request the microphone and does not open a connection. The provider factory only runs when something first reads session.voice. Hosts that gate voice behind a button can defer permission requests until the moment the user actually wants voice.
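The lazy-getter behavior can be illustrated with a generic sketch. `LazyVoiceHolder` is a hypothetical class, not the SDK's session type. It only demonstrates the pattern the paragraph describes: construction does no work, and the factory runs on the first property read.

```typescript
// Hypothetical stand-in for a session object; not the SDK's ChatSession.
class LazyVoiceHolder<V> {
  private cached: { value: V } | null = null;

  constructor(private readonly createVoice: () => V) {}

  // The factory runs on the first read only; later reads reuse the result.
  get voice(): V {
    if (this.cached === null) {
      this.cached = { value: this.createVoice() };
    }
    return this.cached.value;
  }
}
```

A host that gates voice behind a button can construct the holder at startup and read `voice` only inside the click handler, which is the point at which any permission prompt would appear.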

React surface

import { usePermission, usePermissionManager, PermissionPrompt } from 'gecx-chat/react';

function Settings() {
  const mic = usePermission('microphone');

  if (mic.status === 'unsupported' && mic.reason === 'insecure_context') {
    return <p>Voice requires HTTPS.</p>;
  }

  if (mic.status !== 'granted') {
    return (
      <PermissionPrompt
        capability="microphone"
        onGranted={() => console.log('mic OK')}
      />
    );
  }

  return <p>Voice ready.</p>;
}

usePermissionManager() returns the full manager for hosts that want to drive multiple capabilities together. DEFAULT_PERMISSION_COPY exports the bundled English strings so hosts can override per locale.

Where to go next

Permission Providers — full API, native-platform plug-in contracts.
Voice and Multimodal — where microphone permissions matter most.
Auth and Security — how permissions relate to API auth and consent posture.
Data Governance — consent posture, retention, and audit events.

Source: docs/concepts/permissions.md