Spatialreal currently provides avatar-only services, focusing on generating and rendering real-time avatar animations from audio input. Voice conversation logic, speech synthesis, and other agent functionality are managed by your application or third-party services.
Agent Mode is coming soon — a fully managed voice agent solution with built-in conversation logic, speech synthesis, and avatar rendering.

Choose Your Integration Mode

Spatialreal offers four distinct integration modes to suit different architectural needs, latency requirements, and development preferences.

SDK Mode

In this mode, the client-side application manages the audio input. The developer passes the audio to the Spatialreal Client SDK, which handles the server interaction to retrieve animation data and render the avatar.
1. Pass Audio: The developer passes audio to the Spatialreal SDK on the client.
2. Inference: The SDK calls the inference service.
3. Render: The SDK receives drive parameters and plays the avatar.
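On the web, this flow could look roughly like the sketch below. Everything Spatialreal-specific here is an assumption made for illustration: the package name @spatialreal/client-sdk, the SpatialrealClient class, and its attach/sendAudioTrack methods are not the actual SDK API, and /api/spatialreal-token is a placeholder for whatever authentication endpoint your own backend exposes.

```typescript
// Hypothetical sketch of SDK Mode on a web client.
// The package name, class, and methods below are illustrative assumptions,
// not the actual Spatialreal Client SDK API.
import { SpatialrealClient } from "@spatialreal/client-sdk"; // hypothetical package

async function startAvatar(canvas: HTMLCanvasElement): Promise<void> {
  // Fetch a short-lived token from your own backend; authentication is the only
  // server-side piece in SDK Mode. "/api/spatialreal-token" is a placeholder path.
  const { token } = await (await fetch("/api/spatialreal-token")).json();

  // 1. Initialize the client and bind it to a render target.
  const client = new SpatialrealClient({ token, avatarId: "my-avatar" }); // hypothetical options
  await client.attach(canvas); // hypothetical method

  // 2. Pass microphone (or TTS) audio to the SDK; the SDK calls the inference service.
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  client.sendAudioTrack(mic.getAudioTracks()[0]); // hypothetical method

  // 3. The SDK receives drive parameters and plays the animated avatar on the canvas.
}
```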
Best Suited For:
  • Client-Centric Logic: Scenarios where the voice agent logic resides primarily on the device.
  • Moderate Latency: Projects where ultra-low latency is not the absolute priority.
  • Simplified Architecture: Minimal server-side development required (only for authentication), allowing most logic to remain on the client.

View SDK Mode Guide


Host Mode

In Host Mode, the developer acts as the bridge. You use the Spatialreal Server SDK to stream audio to the service and receive streaming drive parameters back. It is then your responsibility to transport this data to the client.
1. Stream Audio: The developer sends streaming audio via the Server SDK.
2. Receive Parameters: Spatialreal returns streaming drive parameters to the developer's server.
3. Transport: The developer transmits audio and parameters to the Client SDK over a custom transport layer.
4. Render: The Client SDK renders the avatar.
The custom transport layer must deliver data to the client without duplication, loss, or reordering.
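A minimal server-side sketch of this flow is shown below, using a plain WebSocket as the custom transport layer. The @spatialreal/server-sdk package, the SpatialrealSession class, and its sendAudio/driveParameters names are assumptions for illustration, not the actual Server SDK API; the audio source and framing helpers are likewise stand-ins for your own code.

```typescript
// Hypothetical sketch of Host Mode on the developer's server (Node).
// "@spatialreal/server-sdk", SpatialrealSession, and their methods/events are
// illustrative assumptions, not the real Server SDK API.
import { WebSocketServer } from "ws";
import { SpatialrealSession } from "@spatialreal/server-sdk"; // hypothetical package

// Stand-in for your voice agent's audio output (e.g. streamed TTS chunks).
async function* agentAudioChunks(): AsyncGenerator<Buffer> {
  yield Buffer.alloc(0); // placeholder
}

// Minimal framing so the client can tell audio apart from drive parameters.
function frame(kind: "audio" | "params", payload: Buffer): Buffer {
  return Buffer.concat([Buffer.from([kind === "audio" ? 0 : 1]), payload]);
}

// The WebSocket below is the "custom transport layer": it must deliver frames
// in order, without loss or duplication.
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", async (clientSocket) => {
  const session = await SpatialrealSession.connect({
    apiKey: process.env.SPATIALREAL_API_KEY ?? "",
  });

  // Spatialreal streams drive parameters back; relay them to the Client SDK.
  session.on("driveParameters", (params: Buffer) => {
    clientSocket.send(frame("params", params));
  });

  // Stream agent audio to Spatialreal, and mirror it to the client so audio
  // playback stays in sync with the avatar animation.
  for await (const chunk of agentAudioChunks()) {
    session.sendAudio(chunk); // hypothetical method
    clientSocket.send(frame("audio", chunk));
  }

  session.close();
});
```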
Best Suited For:
  • Custom Transport: Teams that already have a reliable, controllable transport layer.
  • Deep Integration: Developers willing to handle server-side adaptation for maximum control.
  • Low-Latency Demands: When you need to optimize the network path manually.

View Host Mode Guide


RTC Mode

This mode leverages Real-Time Communication (RTC) infrastructure (currently supporting LiveKit and Agora). The developer streams audio to Spatialreal, but instead of returning data to the developer, Spatialreal pushes the drive parameters directly into an RTC room/channel.
1. Stream Audio: The developer streams audio via the Server SDK.
2. Push to RTC: The Spatialreal service pushes audio and avatar drive data to an RTC room.
3. Subscribe & Play: The client joins the room using the Spatialreal RTC Client to subscribe, parse, and play.
The stream contains binary drive parameters and audio, not a pre-rendered video feed. It therefore cannot be played by standard video players and must be rendered by the Spatialreal Client.
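The server-side half of this flow might look like the sketch below. The @spatialreal/server-sdk package, the SpatialrealSession class, and the rtc configuration shape are assumptions for illustration, not the actual API.

```typescript
// Hypothetical sketch of RTC Mode, server side. Package name, class, and the
// rtc options are illustrative assumptions, not the real Spatialreal API.
import { SpatialrealSession } from "@spatialreal/server-sdk"; // hypothetical package

async function driveAvatarIntoRoom(audioChunks: AsyncIterable<Buffer>): Promise<void> {
  const session = await SpatialrealSession.connect({
    apiKey: process.env.SPATIALREAL_API_KEY ?? "",
    rtc: {
      provider: "livekit",                 // or "agora"
      url: process.env.LIVEKIT_URL ?? "",
      roomName: "demo-room",               // Spatialreal pushes audio + drive data here
      identity: "spatialreal-avatar",
    },
  });

  // Stream the agent's audio; Spatialreal publishes the resulting audio and
  // avatar drive data directly into the configured RTC room.
  for await (const chunk of audioChunks) {
    session.sendAudio(chunk); // hypothetical method
  }
  session.close();
}
```

On the client, the counterpart is equally small; again, @spatialreal/rtc-client and its join/render methods are illustrative assumptions rather than the real client API.

```typescript
// Hypothetical sketch of the client side in RTC Mode (browser).
import { SpatialrealRtcClient } from "@spatialreal/rtc-client"; // hypothetical package

async function playAvatar(canvas: HTMLCanvasElement): Promise<void> {
  const client = new SpatialrealRtcClient({ provider: "livekit" }); // hypothetical options
  await client.join({ url: "wss://your-livekit-host", token: "<room token from your backend>" });
  // The subscribed stream is binary drive data plus audio, not video, so it is
  // rendered by the Spatialreal client rather than a standard video player.
  await client.render(canvas); // hypothetical method
}
```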
Best Suited For:
  • Existing RTC Users: Teams already using LiveKit or Agora for voice agents but not using their specific agent frameworks (e.g., LiveKit Agents or TEN Framework).
  • Server-Managed State: Scenarios requiring server-side management of conversation state (e.g., handling interruptions).
  • Ultra-Low Latency: Leveraging established RTC networks for minimal delay.

View RTC Mode Guide


Framework Plugin

This is the most streamlined approach for modern voice agent frameworks. Developers use a provided plugin that sits inside the voice agent pipeline (e.g., LiveKit Agents, TEN Framework).
1. Intercept Audio: The plugin intercepts audio from the agent pipeline.
2. Send to Spatialreal: It sends the audio to Spatialreal and configures the RTC push parameters.
3. Push & Render: Spatialreal pushes data to the RTC room (the client side is identical to RTC Mode).
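Conceptually, the integration reduces to configuring and attaching a single plugin, as in the sketch below. The @spatialreal/agents-plugin package, the SpatialrealAvatarPlugin class, and its options are all illustrative assumptions; the concrete attachment call depends on the plugin/avatar API of the framework you use, so it is shown only as a comment.

```typescript
// Hypothetical sketch of the Framework Plugin approach. The package, class, and
// options below are illustrative assumptions, not the real plugin API, and the
// attachment step follows whatever your agent framework exposes.
import { SpatialrealAvatarPlugin } from "@spatialreal/agents-plugin"; // hypothetical package

// Configure once: which avatar to drive and where Spatialreal should push
// the resulting audio + drive data.
const avatar = new SpatialrealAvatarPlugin({
  apiKey: process.env.SPATIALREAL_API_KEY ?? "",
  avatarId: "my-avatar",
  rtc: { provider: "livekit", roomName: "demo-room" }, // hypothetical options
});

// Attach the plugin to your voice agent session (framework-specific call,
// e.g. something like agentSession.use(avatar) or avatar.start(agentSession)).
// The plugin intercepts the agent's outgoing audio, forwards it to Spatialreal,
// and handles interruption logic, so the agent code itself does not change.
```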
Best Suited For:
  • Framework Users: Teams already using frameworks like LiveKit Agents or TEN Framework.
  • Rapid Integration: Low implementation cost; the plugin automatically handles signal processing and conversation state (interruption logic).
  • Migration: Easy to switch if you are currently using other avatar services within these frameworks.

Comparison

Mode      | Characteristic               | Latency   | Integration Effort | Ideal Scenario
SDK Mode  | Client-centric               | Moderate  | Low                | Teams that want minimal server-side changes.
Host Mode | Custom transport layer       | Low       | High               | Apps requiring total control over data transport.
RTC Mode  | Transport via Agora/LiveKit  | Ultra-Low | Medium             | Existing RTC users needing server-side state control.
Plugin    | Voice agent framework        | Ultra-Low | Low                | Users of LiveKit Agents or TEN Framework.

Next Steps