What is RTC Mode?

RTC Mode enables real-time voice communication with avatars through WebRTC providers (LiveKit or Agora). It builds on top of Host Mode — the @spatialwalk/avatarkit-rtc adapter package handles the RTC connection and feeds audio/animation data to the SDK automatically.
Web Only: RTC Mode is currently available for Web applications only. iOS and Android support is planned for a future release. In the meantime, native mobile platforms can use Host Mode with your own RTC implementation.

When to Use

  • Real-time voice conversation — users talk to an avatar via microphone
  • Low-latency interaction — WebRTC provides sub-second latency
  • Server-side AI — your RTC server processes audio and generates responses

Packages Required

| Package | Purpose | Required |
| --- | --- | --- |
| `@spatialwalk/avatarkit` | Avatar rendering SDK | Yes |
| `@spatialwalk/avatarkit-rtc` | RTC adapter | Yes |
| `livekit-client@2.16.1` | LiveKit RTC SDK | Choose one |
| `agora-rtc-sdk-ng` | Agora RTC SDK | Choose one |

Critical: `livekit-client` must be exactly version 2.16.1. Other versions are not compatible.

LiveKit vs Agora

| Feature | LiveKit | Agora |
| --- | --- | --- |
| Protocol | VP8 + RTCRtpScriptTransform | H.264 + SEI |
| Chrome | 94+ | 74+ |
| Firefox | 117+ | 78+ |
| Safari | 15.4+ | 14.1+ |
| Edge | 94+ | 79+ |
| Connection config | URL + Token + Room Name | App ID + Channel + Token + UID |
| Debug options | Via `logLevel` | `debugLogging` option |
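The two connection shapes in the table above can be modeled as a discriminated union. This is a sketch only: the type and field names here are assumptions for illustration, not the actual `@spatialwalk/avatarkit-rtc` API.

```typescript
// Illustrative only: these type and field names are assumptions for this
// sketch, not the actual @spatialwalk/avatarkit-rtc API.
type LiveKitConfig = {
  provider: "livekit";
  url: string;      // LiveKit server URL
  token: string;    // access token minted by your backend
  roomName: string;
};

type AgoraConfig = {
  provider: "agora";
  appId: string;
  channel: string;
  token: string;
  uid: number;
};

type RtcConfig = LiveKitConfig | AgoraConfig;

// Narrow on the discriminant to reach provider-specific fields.
function describe(config: RtcConfig): string {
  return config.provider === "livekit"
    ? `LiveKit room ${config.roomName} at ${config.url}`
    : `Agora channel ${config.channel} (uid ${config.uid})`;
}
```

A discriminated union keeps the two credential sets from being mixed up at compile time, which is useful because the providers require different fields.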

How It Works

RTC Mode uses the SDK in Host Mode internally. The AvatarPlayer acts as a bridge:
  1. Initializes the avatar SDK with DrivingServiceMode.host
  2. Connects to the RTC server via the chosen provider
  3. Publishes your microphone audio to the RTC server
  4. Receives animation and audio data from the RTC server
  5. Feeds animation data into the avatar SDK for rendering; audio is played through the native WebRTC audio track by the RTC provider
You don’t need to call yieldAudioData() or yieldFramesData() manually — the adapter handles this.
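The bridging in steps 4 and 5 can be sketched as pure logic: animation packets are fed into the avatar SDK, while audio packets are left to the WebRTC stack. The real adapter does this internally; the names here (`RtcPacket`, `feedFrames`, `HostModeBridge`) are illustrative assumptions, not the actual SDK API.

```typescript
// Pure-logic sketch of steps 4 and 5. Names are illustrative assumptions,
// not the actual @spatialwalk/avatarkit-rtc API.
interface RtcPacket {
  kind: "animation" | "audio";
  payload: Uint8Array;
}

class HostModeBridge {
  private framesFed = 0;

  // feedFrames stands in for the adapter's internal call into the avatar SDK
  // (its equivalent of yieldFramesData()).
  constructor(private feedFrames: (data: Uint8Array) => void) {}

  // Called for every packet received from the RTC provider.
  handle(packet: RtcPacket): void {
    if (packet.kind === "animation") {
      // Animation data goes to the avatar SDK for rendering.
      this.feedFrames(packet.payload);
      this.framesFed++;
    }
    // Audio packets are intentionally not forwarded: the browser plays the
    // native WebRTC audio track directly.
  }

  get fedCount(): number {
    return this.framesFed;
  }
}
```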

What RTC Mode Does NOT Use

Although RTC Mode builds on Host Mode internally, the following SDK/Host Mode features are not used in RTC Mode:
| Feature | Used in SDK/Host Mode | Used in RTC Mode |
| --- | --- | --- |
| `initializeAudioContext()` | Yes — required for Web Audio API playback | No — audio is played via WebRTC tracks |
| Internal audio player | Yes — SDK decodes and plays audio internally | No — RTC provider handles audio playback |
| `yieldAudioData()` / `yieldFramesData()` | Yes — you call these manually | No — the adapter calls them internally |
| `start()` / `send()` / `close()` | Yes (SDK Mode only) | No |
Audio path difference: In SDK Mode and Host Mode, the SDK plays avatar audio through its internal audio player (Web Audio API). In RTC Mode, avatar audio arrives as a native WebRTC audio track and is played by the browser’s WebRTC stack directly — the SDK’s internal audio player is not involved. This distinction matters for audio processing (e.g., echo cancellation, noise suppression).
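The audio-path split above reduces to a simple rule. The mode names below mirror the doc; the routing function itself is an assumption for illustration, not SDK code.

```typescript
// Illustrative sketch of the audio-path difference per driving mode.
// The function and return values are assumptions, not SDK API.
type DrivingMode = "sdk" | "host" | "rtc";

// SDK Mode and Host Mode: the SDK's internal Web Audio player renders audio.
// RTC Mode: the browser's WebRTC stack plays the native audio track.
function audioPath(mode: DrivingMode): "internal-web-audio" | "webrtc-track" {
  return mode === "rtc" ? "webrtc-track" : "internal-web-audio";
}
```

This is why echo cancellation and noise suppression behave differently in RTC Mode: the browser's WebRTC pipeline applies its own processing to the track.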

Server-Side Setup

Your backend sends audio to the avatar service, which publishes the resulting avatar stream to your RTC room or channel. Two approaches are available per platform:

| Platform | Framework plugin | Server SDK + egress |
| --- | --- | --- |
| LiveKit | LiveKit Agents plugin — hooks into your agent pipeline and publishes to the room | LiveKit Server (section 2) — use AvatarKit Server SDK with LiveKit egress config |
| Agora | Agora TEN plugin — coming soon | Agora Server (section 2) — use AvatarKit Server SDK with Agora egress config |
  • Framework plugin: Best if you already use LiveKit Agents or (when available) Agora TEN. The plugin handles audio → Spatialreal → RTC publish for you.
  • Server SDK + egress: Use the Golang or Python Server SDK, create a session with LiveKit or Agora egress config, and send audio. The avatar service publishes audio and animation directly to the RTC room/channel; your server does not relay that data.
Client-side setup is the same either way: the client joins the room/channel and uses @spatialwalk/avatarkit-rtc to render the avatar. See the client guides below.
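For the Server SDK + egress approach, the session-creation payload might look like the sketch below. The real Server SDKs are Golang and Python, and their field names may differ; this TypeScript sketch only illustrates the shape of a session request carrying an egress config, with all names being assumptions.

```typescript
// Hypothetical shape of a session-creation payload with an egress config.
// The real AvatarKit Server SDK (Golang/Python) may use different fields;
// this is for illustration only.
type EgressConfig =
  | { type: "livekit"; url: string; token: string; roomName: string }
  | { type: "agora"; appId: string; channel: string; token: string; uid: number };

function sessionRequest(avatarId: string, egress: EgressConfig) {
  // The avatar service publishes audio and animation straight to this
  // room/channel; your server only sends input audio to the session.
  return { avatarId, egress };
}
```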

Get Started

  • Client (browser)
  • Server (backend)