Use this quickstart to create an agent demo with the SpatialReal SDK and LiveKit Agents. In a few steps, you will set up a backend that uses an end-to-end model and connects to a SpatialReal session. Then you will create a frontend Vue app that renders the SpatialReal avatar and streams microphone audio to the agent in real time.
You can copy each code block as-is and replace the values in .env. Or just paste this page into any LLM you like and ask it to implement the demo for you.

Prerequisites

  • SpatialReal account (SPATIALREAL_API_KEY, SPATIALREAL_APP_ID, SPATIALREAL_AVATAR_ID)
  • LiveKit Cloud project (LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET)
  • End-to-end model key for Gemini Live (GOOGLE_API_KEY)
  • Node.js 18+, Python 3.10+, pnpm, uv
1

Create workspace

Create one project folder, then scaffold a Vue frontend and a backend folder. You will use this split layout for all files in the next steps.
mkdir spatialreal-avatar-quickstart
cd spatialreal-avatar-quickstart

mkdir backend
pnpm dlx create-vite@latest frontend --template vue-ts
2

Configure backend

Add these backend files exactly as shown. Create backend/.env and paste your LiveKit, Gemini Live, and SpatialReal credentials. Both backend processes read from this file at runtime.
backend/.env
LIVEKIT_URL=wss://your-project.livekit.cloud # https://cloud.livekit.io/
LIVEKIT_API_KEY=your_api_key # LiveKit Cloud -> API Keys
LIVEKIT_API_SECRET=your_api_secret # LiveKit Cloud -> API Keys

GOOGLE_API_KEY=your_google_api_key # https://aistudio.google.com/apikey
E2E_GOOGLE_MODEL=gemini-2.5-flash-native-audio-preview-12-2025
E2E_GOOGLE_VOICE=Puck

SPATIALREAL_API_KEY=your_key # https://app.spatialreal.ai
SPATIALREAL_APP_ID=your_app_id # SpatialReal Studio -> Applications
SPATIALREAL_AVATAR_ID=your_avatar_id # SpatialReal Studio -> Avatars
Create backend/pyproject.toml to install the Python dependencies used in this guide. Use these versions to match the quickstart runtime behavior.
backend/pyproject.toml
[project]
name = "spatialreal-avatar-quickstart-backend"
version = "0.1.0"
requires-python = ">=3.10,<3.15"
dependencies = [
  "flask>=3.0.0",
  "flask-cors>=4.0.0",
  "python-dotenv>=1.0.0",
  "livekit-api>=1.1.0",
  "livekit-agents==1.4.4",
  "livekit-plugins-google==1.4.4",
  "livekit-plugins-spatialreal==1.4.4",
]
Create backend/token_server.py to generate LiveKit JWTs and dispatch the agent to the room.
backend/token_server.py
import asyncio
import os
from datetime import timedelta
from uuid import uuid4

from dotenv import load_dotenv
from flask import Flask, jsonify, request
from flask_cors import CORS
from livekit import api

load_dotenv()

app = Flask(__name__)
CORS(app)

LIVEKIT_URL = os.getenv("LIVEKIT_URL")
LIVEKIT_API_KEY = os.getenv("LIVEKIT_API_KEY")
LIVEKIT_API_SECRET = os.getenv("LIVEKIT_API_SECRET")


async def create_room_and_dispatch(room_name: str) -> None:
    lkapi = api.LiveKitAPI(LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET)
    try:
        try:
            await lkapi.room.create_room(api.CreateRoomRequest(name=room_name))
        except Exception:
            # Room may already exist.
            pass

        await lkapi.agent_dispatch.create_dispatch(
            api.CreateAgentDispatchRequest(room=room_name, agent_name="voice-assistant")
        )
    finally:
        await lkapi.aclose()


@app.route("/token", methods=["POST"])
def token():
    if not LIVEKIT_API_KEY or not LIVEKIT_API_SECRET:
        return jsonify({"error": "LiveKit credentials not configured"}), 500

    body = request.get_json() or {}
    room_name = body.get("room", "voice-agent-room")
    requested_identity = body.get("identity")
    identity = (
        requested_identity.strip()
        if isinstance(requested_identity, str) and requested_identity.strip()
        else f"browser-{uuid4().hex[:8]}"
    )

    jwt = (
        api.AccessToken(LIVEKIT_API_KEY, LIVEKIT_API_SECRET)
        .with_identity(identity)
        .with_name(identity)
        .with_ttl(timedelta(hours=1))
        .with_grants(
            api.VideoGrants(
                room_join=True,
                room=room_name,
                can_publish=True,
                can_subscribe=True,
                can_publish_data=True,
            )
        )
        .to_jwt()
    )

    try:
        asyncio.run(create_room_and_dispatch(room_name))
    except Exception as exc:
        # Keep returning token so frontend can still connect for debugging.
        print(f"Warning: Failed to dispatch agent: {exc}")

    return jsonify(
        {
            "token": jwt,
            "url": LIVEKIT_URL,
            "room": room_name,
            "identity": identity,
        }
    )


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080, debug=True)
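The token this server returns is a standard HS256 JWT signed with your LiveKit API secret. To make the contract concrete, here is a hedged, self-contained sketch that hand-rolls a structurally similar token using only the standard library. The exact claim layout is illustrative; real tokens should always come from livekit-api as in token_server.py above.

```python
# Illustrative only: hand-roll a token structurally similar to what
# api.AccessToken produces, to show what the JWT carries. Real tokens
# should come from livekit-api as in token_server.py above.
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_demo_token(api_key: str, api_secret: str, identity: str, room: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": api_key,                  # which API key signed this token
        "sub": identity,                 # participant identity
        "exp": int(time.time()) + 3600,  # 1-hour TTL, as in the server above
        "video": {                       # room grants, mirroring VideoGrants
            "roomJoin": True,
            "room": room,
            "canPublish": True,
            "canSubscribe": True,
            "canPublishData": True,
        },
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    signature = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


token = make_demo_token("demo_key", "demo_secret", "browser-abc123", "voice-agent-room")
claims_b64 = token.split(".")[1]
claims = json.loads(base64.urlsafe_b64decode(claims_b64 + "=" * (-len(claims_b64) % 4)))
print(claims["video"]["room"])  # voice-agent-room
```

If a connection fails with a permissions error, decoding the payload segment like this is a quick way to check which room and grants a token actually carries.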
Create backend/agent.py to run the Gemini Live voice agent and start the SpatialReal avatar session. After this worker joins the room, user microphone audio is processed in real time.
backend/agent.py
import os

from dotenv import load_dotenv
from livekit.agents import Agent, AgentSession, AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.plugins import google
from livekit.plugins.spatialreal import AvatarSession

load_dotenv()


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a helpful voice assistant. Keep replies short and natural."
        )


async def entrypoint(ctx: JobContext) -> None:
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    session = AgentSession(
        llm=google.realtime.RealtimeModel(
            model=os.getenv("E2E_GOOGLE_MODEL", "gemini-2.5-flash"),
            voice=os.getenv("E2E_GOOGLE_VOICE", "Puck"),
            api_key=os.getenv("GOOGLE_API_KEY"),
        )
    )

    avatar = AvatarSession()
    await avatar.start(session, room=ctx.room)

    await session.start(agent=VoiceAssistant(), room=ctx.room)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, agent_name="voice-assistant"))
3

Configure frontend

Install packages, then add these frontend files exactly as shown. Run this install command in frontend to add AvatarKit RTC and a compatible LiveKit client version. This prepares the frontend runtime before you add config and UI code.
cd frontend
pnpm install
pnpm add @spatialwalk/avatarkit @spatialwalk/avatarkit-rtc livekit-client
Create frontend/.env with your public avatar IDs and token endpoint. For production, set VITE_TOKEN_ENDPOINT to your deployed backend URL.
frontend/.env
VITE_SPATIALREAL_APP_ID=your_app_id # SpatialReal Studio -> Applications
VITE_SPATIALREAL_AVATAR_ID=your_avatar_id # SpatialReal Studio -> Avatars
VITE_TOKEN_ENDPOINT=http://localhost:8080/token # Your backend token server URL
Update frontend/vite.config.ts to enable AvatarKit build support. The /token proxy routes local frontend requests to your backend token server.
frontend/vite.config.ts
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'

export default defineConfig({
  plugins: [vue(), avatarkitVitePlugin()],
  server: {
    port: 3000,
    proxy: {
      '/token': {
        target: 'http://localhost:8080',
        changeOrigin: true,
      },
    },
  },
})
Create frontend/src/App.vue with a minimal video-call layout and two controls. This component mounts the avatar canvas, connects to LiveKit, and toggles microphone publishing.
frontend/src/App.vue
<script setup lang="ts">
import { nextTick, onBeforeUnmount, ref } from 'vue'
import { Room } from 'livekit-client'
import {
  AvatarManager,
  AvatarSDK,
  AvatarView,
  DrivingServiceMode,
  Environment,
} from '@spatialwalk/avatarkit'
import { AvatarPlayer, LiveKitProvider } from '@spatialwalk/avatarkit-rtc'

const container = ref<HTMLDivElement | null>(null)
const status = ref('Click Connect to start')
const connecting = ref(false)
const connected = ref(false)
const micOn = ref(false)

let avatarView: AvatarView | null = null
let player: AvatarPlayer | null = null
let roomClient: Room | null = null

async function connect(): Promise<void> {
  if (connecting.value || connected.value) return
  connecting.value = true
  status.value = 'Requesting token...'

  try {
    const response = await fetch(import.meta.env.VITE_TOKEN_ENDPOINT || '/token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ room: 'voice-agent-room' }),
    })
    if (!response.ok) throw new Error('Failed to fetch token')

    const { token, url, room } = await response.json()

    if (!AvatarSDK.isInitialized) {
      await AvatarSDK.initialize(import.meta.env.VITE_SPATIALREAL_APP_ID, {
        environment: Environment.intl,
        drivingServiceMode: DrivingServiceMode.host,
      })
    }

    await nextTick()
    const mountEl = container.value
    if (!mountEl) throw new Error('Avatar container is not ready')

    await player?.disconnect().catch(() => undefined)
    avatarView?.dispose()

    const avatar = await AvatarManager.shared.load(import.meta.env.VITE_SPATIALREAL_AVATAR_ID)
    avatarView = new AvatarView(avatar, mountEl)
    player = new AvatarPlayer(new LiveKitProvider(), avatarView)

    status.value = 'Connecting to LiveKit...'
    await player.connect({ url, token, roomName: room })
    roomClient = player.getNativeClient() as Room
    if (!roomClient) throw new Error('LiveKit room client is unavailable')

    await roomClient.startAudio()

    connected.value = true
    status.value = 'Connected. Click Start Mic to talk.'
  } catch (error) {
    status.value = error instanceof Error ? error.message : 'Connection failed'
  } finally {
    connecting.value = false
  }
}

async function startMic(): Promise<void> {
  if (!player || micOn.value) return

  status.value = 'Starting microphone...'
  try {
    await player.startPublishing()
    micOn.value = true
    status.value = 'Mic is on. Start speaking.'
  } catch (error) {
    status.value = error instanceof Error ? `Failed to start microphone: ${error.message}` : 'Failed to start microphone.'
  }
}

async function stopMic(): Promise<void> {
  if (!micOn.value) return
  await player?.stopPublishing().catch(() => undefined)
  micOn.value = false
  status.value = 'Mic is off.'
}

async function disconnect(): Promise<void> {
  await stopMic()
  await player?.disconnect().catch(() => undefined)
  avatarView?.dispose()
  player = null
  avatarView = null
  roomClient = null
  connected.value = false
  status.value = 'Disconnected'
}

async function toggleConnection(): Promise<void> {
  if (connecting.value) return
  if (connected.value) {
    await disconnect()
    return
  }
  await connect()
}

async function toggleMic(): Promise<void> {
  if (!connected.value || connecting.value) return
  if (micOn.value) {
    await stopMic()
    return
  }
  await startMic()
}

onBeforeUnmount(async () => {
  await disconnect()
})
</script>

<template>
  <div style="min-height:100vh; display:flex; align-items:center; justify-content:center; padding:16px;">
    <div style="width:min(720px, 100%); display:flex; flex-direction:column; gap:10px;">
      <div ref="container" style="width:100%; aspect-ratio:16/10; min-height:320px; border-radius:12px; overflow:hidden; border:1px solid;" />

      <div style="display:flex; gap:8px; flex-wrap:wrap;">
        <button :disabled="connecting" @click="toggleConnection">
          {{ connecting ? 'Connecting...' : connected ? 'Disconnect' : 'Connect' }}
        </button>
        <button :disabled="!connected || connecting" @click="toggleMic">
          {{ micOn ? 'Stop Mic' : 'Start Mic' }}
        </button>
      </div>

      <div style="font-size:14px;">{{ status }}</div>
    </div>
  </div>
</template>
4

Verify project structure

Before running, confirm your files match this tree. This quick check helps catch missing or misplaced files before launch.
SpatialReal Avatar quickstart
spatialreal-avatar-quickstart/
├── backend/
│   ├── .env
│   ├── pyproject.toml
│   ├── token_server.py
│   └── agent.py
└── frontend/
    ├── .env
    ├── vite.config.ts
    └── src/
        └── App.vue
5

Install and run

Run these commands in three terminals, then open http://localhost:3000, click Connect, and then click Start Mic. Start the token server first so the frontend can fetch JWTs.
# Terminal 1
cd spatialreal-avatar-quickstart/backend
uv sync
uv run token_server.py
Start the agent worker second so it can join the room when dispatched.
# Terminal 2
cd spatialreal-avatar-quickstart/backend
uv run agent.py dev
Start the frontend last, then open the local URL and begin speaking.
# Terminal 3
cd spatialreal-avatar-quickstart/frontend
pnpm dev

What Happens

  • The frontend requests /token
  • The backend returns a LiveKit JWT and dispatches the voice-assistant agent
  • The agent joins the room with the end-to-end realtime model
  • AvatarSession publishes the avatar output
  • AvatarPlayer renders the avatar and streams your microphone audio
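The first two bullets define a small JSON contract between the frontend and the token server. As a hedged sketch, the helper below validates that contract offline; the field names come from token_server.py and App.vue above, but the validator itself is hypothetical and not part of either runtime.

```python
# Hypothetical offline check of the /token response contract used in this
# quickstart. The field names mirror what token_server.py returns and what
# App.vue destructures; the helper itself is illustrative only.
REQUIRED_FIELDS = ("token", "url", "room", "identity")


def validate_token_response(body: dict) -> dict:
    """Raise ValueError if any contract field is missing or not a string."""
    for key in REQUIRED_FIELDS:
        if not isinstance(body.get(key), str):
            raise ValueError(f"/token response missing field: {key}")
    return body


sample = validate_token_response({
    "token": "eyJ...",
    "url": "wss://your-project.livekit.cloud",
    "room": "voice-agent-room",
    "identity": "browser-abc123",
})
print(sample["room"])  # voice-agent-room
```

A check like this is handy when you later swap in a different backend: if the frontend silently receives a response missing one of these fields, connection failures are otherwise hard to trace.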

More Implementations

AvatarKit Voice Agent Demo

A full reference repository with more implementation details, including different frontend UI options and multiple backend agent patterns.