Why: SIP trunk + site-to-site VPN not ready → test the voice agent from a browser over WebRTC
TM SecureVPN
  Browser: WebRTC test client
  → 443 signaling
  → 3478/3480 media
Shared ELB: 10.65.176.54
CCE cluster · ns livekit
  STUNner: TURN, browser MEDIA only
  → media
  LiveKit SFU ★ central hub, everyone joins HERE
  ↔ in-cluster (no TURN)
  TM-Voice-Agent: LiveKit participant
  → STT / TTS / LLM
GPU host
  GPU: STT · TTS · LLM inference
Read it as a hub, not a line: the browser (via STUNner) and the voice agent both connect into the SFU.
The SFU bridges them. The voice agent then calls the GPU for speech-to-text, text-to-speech and the LLM.
STUNner sits only on the browser-media leg — it is not in the agent's or GPU's path.
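The hub reading can be checked with a small model of the diagram. This is a minimal sketch: the node names are illustrative labels taken from the boxes, only the media and agent links are modelled (signaling through Envoy is left out), and `path` is a hypothetical helper, not part of any deployment tooling.

```python
from collections import deque

# Links read off the diagram; names are illustrative labels, not service names.
LINKS = [
    ("browser", "stunner"),      # TURN media, UDP/3478 or TCP/3480
    ("stunner", "sfu"),          # relayed media into the cluster
    ("sfu", "voice-agent"),      # in-cluster, no TURN
    ("voice-agent", "gpu"),      # STT / TTS / LLM calls
]


def path(src, dst):
    """Return the shortest hop list from src to dst, or None if unreachable."""
    graph = {}
    for a, b in LINKS:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        hops = queue.popleft()
        if hops[-1] == dst:
            return hops
        for nxt in graph.get(hops[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(hops + [nxt])
    return None


print(path("browser", "voice-agent"))
print(path("voice-agent", "gpu"))
```

The first path passes through both STUNner and the SFU; the second touches neither, which is the point of the "hub, not a line" reading.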
- SIP trunk not ready & site-to-site VPN pending — the production phone path can't be used yet, so we validate the voice agent from a browser.
- A browser speaks WebRTC, not SIP — media is dynamic-port UDP negotiated via ICE.
- The SFU is private inside CCE — the external browser (behind NAT) and the SFU (behind cluster NAT) cannot form a direct media path → NAT traversal problem.
- TURN fixes it: the browser connects outward to one fixed relay address, and STUNner forwards its media to the SFU inside the cluster, so only two fixed ports are exposed instead of a wide UDP range.
⚠ Per manifest comments, the single-LB UDP-listener reuse is “verified-by-spec, unverified-end-to-end” — live ELB UDP path is still a Phase-0 validation item.
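One way to approach the Phase-0 check of the live UDP path is a plain STUN Binding request sent from a SecureVPN host to the ELB address. The sketch below assumes the UDP/3478 listener answers unauthenticated STUN Binding requests, which TURN servers commonly do but which has not been confirmed for this deployment. The function names are hypothetical; the packet layout follows the STUN header (message type, length, magic cookie, 12-byte transaction ID).

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value in every STUN header
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(txn_id=None):
    """Return (packet, txn_id) for a STUN Binding request with no attributes."""
    txn_id = txn_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + txn_id, txn_id


def is_binding_success(data, txn_id):
    """True if data is a Binding success response matching our transaction."""
    if len(data) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", data[:8])
    return (
        msg_type == BINDING_SUCCESS
        and cookie == MAGIC_COOKIE
        and data[8:20] == txn_id
    )


def probe(host, port=3478, timeout=2.0):
    """Send one Binding request over UDP; True only if a valid reply arrives."""
    request, txn_id = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, port))
            data, _addr = sock.recvfrom(2048)
        except OSError:  # includes timeouts and unreachable errors
            return False
    return is_binding_success(data, txn_id)
```

Calling `probe("10.65.176.54")` returning `True` would only show that UDP reaches the listener and a reply comes back. It does not exercise a TURN allocation or the relay to the SFU, so it narrows the open item rather than closing it.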
🟣 Browser client (SecureVPN)
The actual test originator. A real browser on TM SecureVPN, behind NAT — reproducing exactly where a real user sits. Establishes a WebRTC session (signaling + media) to the SFU.
role: WebRTC client · transport: wss + TURN
🔵 Shared ELB — 10.65.176.54
The existing UCC-Devl load balancer. STUNner reuses it via kubernetes.io/elb.id rather than provisioning a new one — adding UDP/3478 + TCP/3480 listeners next to the Envoy TCP/443 already there.
elb.id ad1bb119-…aed11c · mixed UDP+TCP listeners
🟠 STUNner — TURN gateway
Kubernetes-native TURN server (Gateway-API). Relays browser media to the SFU on fixed ports, solving NAT traversal. Gateway udp-gateway + UDPRoute livekit-media-plane → SFU. Static TURN creds are broadcast to clients at room-join (not secret).
TURN-UDP 3478 · TURN-TCP 3480 (fallback) · authType: plaintext
🟢 LiveKit SFU — media server
The Selective Forwarding Unit that routes audio between participants for the voice agent; an SFU forwards streams rather than mixing them. ClusterIP livekit-server: signaling on :7880 (via HTTPRoute/Envoy 443), media on 50000–60000 (via STUNner). The thing we're actually testing.
svc :80→7880 · metrics :6789 · ns livekit
SIGNALING wss / TCP 443
“Who’s in the room, what codecs, ICE candidates.” Travels over WebSocket-Secure through the Envoy gateway listener on the shared ELB to the SFU on :7880. Low bandwidth, and TCP/443 is rarely blocked on client networks.
MEDIA UDP 3478 / TCP 3480
The actual audio. Cannot go peer-to-peer across two NATs, so it’s relayed through STUNner (TURN). UDP/3478 preferred; TCP/3480 is the fallback when UDP is filtered on the client network.
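For illustration, the ICE settings a browser ends up with for this media leg might look like the structure below. This is a sketch under assumptions: in this setup the TURN details are delivered to the client at room-join rather than built by hand, `ice_config` is a hypothetical helper, and the username and credential shown are placeholders. The key names match the browser's RTCConfiguration fields.

```python
import json


def ice_config(relay_host, username, credential, udp_port=3478, tcp_port=3480):
    """Build an RTCConfiguration-shaped dict that only allows relayed media."""
    return {
        "iceServers": [
            {
                "urls": [
                    f"turn:{relay_host}:{udp_port}?transport=udp",  # preferred
                    f"turn:{relay_host}:{tcp_port}?transport=tcp",  # fallback
                ],
                "username": username,
                "credential": credential,
            }
        ],
        # "relay" drops host and server-reflexive candidates, so a successful
        # call demonstrates that the TURN path itself works.
        "iceTransportPolicy": "relay",
    }


# Placeholder credentials, not the real static TURN credentials.
config = ice_config("10.65.176.54", "example-user", "example-pass")
print(json.dumps(config, indent=2))
```

Forcing `relay` is a choice for testing: with the default policy a call could succeed over some other candidate pair and hide a broken TURN listener.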