Installation

pnpm add @spatialwalk/avatarkit

Build Tool Configuration

Required: The SDK uses WASM files that need special build configuration. You must configure your build tool before using the SDK.
Add the official plugin to vite.config.ts:
vite.config.ts
import { defineConfig } from 'vite'
import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'

export default defineConfig({
  plugins: [
    avatarkitVitePlugin(),
  ],
})
The plugin automatically handles:
  • Development Server: Sets correct MIME type for WASM files
  • Build Time: Copies WASM files to dist/assets/
  • Cloudflare Pages: Generates _headers file
  • Vite Configuration: Configures optimizeDeps, assetsInclude, etc.
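To configure Vite manually instead of using the plugin:
vite.config.ts
import { defineConfig } from 'vite'

export default defineConfig({
  plugins: [
    {
      // configureServer is a plugin hook, so it must live inside a plugin object
      name: 'wasm-mime-type',
      configureServer(server) {
        server.middlewares.use((req, res, next) => {
          // Serve .wasm files with the application/wasm MIME type in development
          if (req.url?.endsWith('.wasm')) {
            res.setHeader('Content-Type', 'application/wasm')
          }
          next()
        })
      },
    },
  ],
  optimizeDeps: {
    exclude: ['@spatialwalk/avatarkit'],
  },
  assetsInclude: ['**/*.wasm'],
  build: {
    assetsInlineLimit: 0,
    rollupOptions: {
      output: {
        // Keep .wasm file names unhashed; hash every other asset
        assetFileNames: (assetInfo) => {
          if (assetInfo.name?.endsWith('.wasm')) {
            return 'assets/[name][extname]'
          }
          return 'assets/[name]-[hash][extname]'
        },
      },
    },
  },
})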

Authentication

SDK Mode requires an App ID and a Session Token.
| Credential | How to Obtain | Notes |
| --- | --- | --- |
| App ID | SpatialReal Studio → Create App | Required for all modes |
| Session Token | Your backend server requests it from AvatarKit Server | Max 24 hours validity |
Authentication Flow:
Your Client → Your Backend → AvatarKit Server → Session Token (24 hours max)
The Session Token must be set before calling start(). Never expose your token generation logic in client-side code.
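The flow above can be sketched on the client side as follows. This is a minimal sketch, not part of the SDK: the `{ token }` response shape, the injected `fetchJson` function, and the five-minute safety margin are all assumptions for illustration. The 24-hour figure is the maximum stated above; your backend may issue shorter-lived tokens.

```typescript
// Maximum token lifetime stated in the credentials table: 24 hours.
const MAX_TOKEN_LIFETIME_MS = 24 * 60 * 60 * 1000

// Hypothetical record kept by the client: the token plus when it was received.
interface IssuedToken {
  token: string
  issuedAtMs: number
}

// Returns true when there is no token yet, or when the token is within
// `safetyMarginMs` of the assumed lifetime and should be replaced.
function needsRefresh(
  issued: IssuedToken | null,
  nowMs: number,
  safetyMarginMs: number = 5 * 60 * 1000,
  lifetimeMs: number = MAX_TOKEN_LIFETIME_MS,
): boolean {
  if (issued === null) return true
  return nowMs - issued.issuedAtMs >= lifetimeMs - safetyMarginMs
}

// Asks YOUR backend for a token. The backend, not the browser, talks to the
// AvatarKit server. `fetchJson` is injected so the sketch stays self-contained.
async function fetchSessionToken(
  endpoint: string,
  fetchJson: (url: string) => Promise<{ token: string }>,
): Promise<IssuedToken> {
  const body = await fetchJson(endpoint)
  return { token: body.token, issuedAtMs: Date.now() }
}
```

Once a token is available, pass `issued.token` to AvatarSDK.setSessionToken() before calling start().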

Quick Start

1. Initialize SDK

import {
  AvatarSDK,
  Environment,
  DrivingServiceMode,
} from '@spatialwalk/avatarkit'

await AvatarSDK.initialize('your-app-id', {
  environment: Environment.intl,  // or Environment.cn
  drivingServiceMode: DrivingServiceMode.sdk,  // Default
})

// Set session token (required before start())
AvatarSDK.setSessionToken('your-session-token')

2. Load Avatar

import { AvatarManager } from '@spatialwalk/avatarkit'

const avatar = await AvatarManager.shared.load('avatar-id', (progress) => {
  console.log(`Loading: ${progress.progress}%`)
})

3. Create View

import { AvatarView } from '@spatialwalk/avatarkit'

// Container MUST have non-zero width and height
const container = document.getElementById('avatar-container')!
const avatarView = new AvatarView(avatar, container)
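
Because a zero-sized container is an easy mistake to miss, a small guard can fail fast before the view is created. This is a minimal sketch: `hasRenderableSize` and `assertRenderableSize` are hypothetical helper names, not SDK functions. They only rely on `clientWidth` and `clientHeight`, which every HTMLElement exposes.

```typescript
// Structural type: any HTMLElement satisfies it.
interface SizedElement {
  clientWidth: number
  clientHeight: number
}

// True when the element currently has a non-zero layout size.
function hasRenderableSize(el: SizedElement): boolean {
  return el.clientWidth > 0 && el.clientHeight > 0
}

// Throws a descriptive error instead of letting the avatar fail to render.
function assertRenderableSize(el: SizedElement): void {
  if (!hasRenderableSize(el)) {
    throw new Error(
      `Avatar container is ${el.clientWidth}x${el.clientHeight}; it needs a non-zero width and height`,
    )
  }
}
```

Calling `assertRenderableSize(container)` just before `new AvatarView(avatar, container)` surfaces the problem at the point where it can be fixed.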

4. Initialize Audio Context

Critical: initializeAudioContext() must be called inside a user gesture handler (e.g., click, touchstart). This is a browser security requirement — calling it outside a user gesture will fail silently.
button.addEventListener('click', async () => {
  await avatarView.controller.initializeAudioContext()
})
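
If no single button is a natural trigger, one option is to run the initialization on whichever gesture happens first and then detach the listeners. The helper below is a sketch under that assumption; `runOnFirstGesture` and `GestureTarget` are hypothetical names, and the structural interface is used so the example does not depend on DOM typings.

```typescript
// Minimal subset of the DOM EventTarget API; an HTMLElement or `document` fits.
interface GestureTarget {
  addEventListener(type: string, listener: () => void): void
  removeEventListener(type: string, listener: () => void): void
}

// Runs `action` once, on the first of the given gesture events.
function runOnFirstGesture(
  target: GestureTarget,
  action: () => void | Promise<void>,
  events: string[] = ['click', 'touchstart'],
): void {
  const handler = (): void => {
    // Detach from every event type so the action cannot run twice.
    for (const type of events) target.removeEventListener(type, handler)
    // The action starts synchronously inside the gesture handler.
    // Errors should be handled inside `action` itself.
    void action()
  }
  for (const type of events) target.addEventListener(type, handler)
}
```

A possible call site: `runOnFirstGesture(document, () => avatarView.controller.initializeAudioContext())`.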

5. Connect and Send Audio

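// Start WebSocket connection to AvatarKit server
await avatarView.controller.start()

// Send audio data (PCM16, mono, matching the configured sample rate).
// pcmChunks is a placeholder for your own PCM16 audio, already split into chunks.
const pcmChunks: ArrayBuffer[] = [new ArrayBuffer(3200), new ArrayBuffer(3200)]  // placeholder: silence
const lastChunk = pcmChunks[pcmChunks.length - 1]
for (const chunk of pcmChunks.slice(0, -1)) {
  avatarView.controller.send(chunk, false)  // More audio follows in this round
}

// Mark the end of the conversation round with the final chunk
avatarView.controller.send(lastChunk, true)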

6. Cleanup

avatarView.controller.close()  // Close WebSocket connection
avatarView.dispose()           // Release all resources

Core API

AvatarSDK

SDK initialization and global configuration.
// Initialize
await AvatarSDK.initialize(appId: string, configuration: Configuration)

// Properties (read-only)
AvatarSDK.isInitialized   // boolean
AvatarSDK.appId           // string
AvatarSDK.configuration   // Configuration
AvatarSDK.version         // string
AvatarSDK.sessionToken    // string

// Methods
AvatarSDK.setSessionToken(token: string)  // Set auth token
AvatarSDK.setUserId(userId: string)       // Set user ID (for telemetry)
AvatarSDK.cleanup()                       // Release all SDK resources
setSessionToken() can be called before or after initialize(). If called before, the token is applied automatically during initialization.

AvatarManager

Avatar resource loading and caching. Access via the singleton AvatarManager.shared.
const manager = AvatarManager.shared

// Load avatar (downloads and caches)
const avatar = await manager.load(
  id: string,
  onProgress?: (progress: LoadProgressInfo) => void
)

// Clear all cached resources
manager.clearAll()
| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Avatar character ID |
| onProgress | (progress) => void | Progress callback with { progress: number } (0–100) |

AvatarView

3D rendering view. Automatically creates a Canvas element and an AvatarController.
// Create view — Canvas is added to container automatically
const avatarView = new AvatarView(avatar: Avatar, container: HTMLElement)
| Property / Method | Type | Description |
| --- | --- | --- |
| controller | AvatarController | Playback controller (read-only) |
| transform | { x, y, scale } | Avatar position and scale |
| onFirstRendering | () => void | Callback when first frame renders |
| dispose() | void | Release all resources |
Transform coordinates:
| Field | Range | Description |
| --- | --- | --- |
| x | -1 to 1 | Horizontal offset (-1 = left, 0 = center, 1 = right) |
| y | -1 to 1 | Vertical offset (-1 = bottom, 0 = center, 1 = top) |
| scale | > 0 | Scale factor (1.0 = original size) |
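When transform values come from user input such as drag or pinch gestures, it may help to keep them inside the documented ranges before applying them. The sketch below is illustrative only: `clampTransform` is a hypothetical helper, `MIN_SCALE` is an arbitrary lower bound chosen here (the table only requires scale > 0), and whether the SDK clamps out-of-range values itself is not stated.

```typescript
// Same shape as the transform property described above.
interface AvatarTransform {
  x: number
  y: number
  scale: number
}

// Hypothetical lower bound; the documented requirement is only scale > 0.
const MIN_SCALE = 0.01

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value))
}

// Returns a copy with x and y limited to [-1, 1] and scale kept positive.
function clampTransform(t: AvatarTransform): AvatarTransform {
  return {
    x: clamp(t.x, -1, 1),
    y: clamp(t.y, -1, 1),
    scale: Math.max(MIN_SCALE, t.scale),
  }
}
```

Assuming transform is assignable, a call might look like `avatarView.transform = clampTransform({ x: 0.5, y: 0, scale: 1.2 })`.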
Container requirement: The container element must have non-zero width and height. The Canvas fills the container and auto-resizes via ResizeObserver.

AvatarController — SDK Mode Methods

These methods are only available when drivingServiceMode is DrivingServiceMode.sdk.
// Initialize audio (MUST be in user gesture handler)
await controller.initializeAudioContext()

// Connect to AvatarKit server
await controller.start()

// Send audio data — returns conversationId
const conversationId = controller.send(
  audioData: ArrayBuffer,  // PCM16, mono
  end: boolean             // true = end of conversation round
)

// Close connection
controller.close()
send() behavior:
  • end: false — continues the current conversation round
  • end: true — marks the end of the current round; sending new audio after this starts a new round and interrupts any ongoing playback
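The two cases above can be combined when streaming one utterance from a single buffer: every chunk is sent with end: false except the final one. This is a sketch, not an SDK utility; `sendInChunks` and `AudioSender` are hypothetical names, and the chunk size is left to the caller.

```typescript
// Structural type matching the send() signature shown above.
interface AudioSender {
  send(audioData: ArrayBuffer, end: boolean): unknown
}

// Splits a PCM16 buffer into chunks and marks only the last chunk as the end
// of the conversation round. Returns the number of chunks sent.
function sendInChunks(sender: AudioSender, pcm16: ArrayBuffer, chunkBytes: number): number {
  // PCM16 samples are 2 bytes each, so chunk boundaries must stay even.
  if (chunkBytes <= 0 || chunkBytes % 2 !== 0) {
    throw new Error('chunkBytes must be a positive even number')
  }
  let sent = 0
  for (let offset = 0; offset < pcm16.byteLength; offset += chunkBytes) {
    const endOffset = Math.min(offset + chunkBytes, pcm16.byteLength)
    const isLast = endOffset === pcm16.byteLength
    sender.send(pcm16.slice(offset, endOffset), isLast)
    sent += 1
  }
  return sent
}
```

For example, `sendInChunks(avatarView.controller, pcm16Buffer, 3200)` would send 100 ms chunks at the default 16 kHz sample rate.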

AvatarController — Common Methods

Available in both SDK Mode and Host Mode.
// Playback control
controller.pause()       // Pause audio + animation
controller.resume()      // Resume playback
controller.interrupt()   // Stop current playback, clear data

// Data management
controller.clear()       // Clear all data and resources

// Conversation
controller.getCurrentConversationId()  // string | null

// Volume (affects avatar audio only, not system volume)
controller.setVolume(volume: number)   // 0.0 to 1.0
controller.getVolume(): number         // Current volume

AvatarController — Event Callbacks

// Connection state changes (SDK Mode only)
controller.onConnectionState = (state: ConnectionState) => {
  console.log('Connection:', state)
}

// Conversation state changes
controller.onConversationState = (state: ConversationState) => {
  console.log('Conversation:', state)
}

// Error handler
controller.onError = (error: Error) => {
  console.error('Error:', error)
}

Audio Format

The SDK requires audio in mono PCM16 format:
| Property | Value |
| --- | --- |
| Format | PCM16 (16-bit signed integer, little-endian) |
| Channels | Mono (1 channel) |
| Sample Rate | Configurable: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz (default: 16000) |
| Data Type | ArrayBuffer or Uint8Array |
Data size: 1 second at 16 kHz = 16,000 samples × 2 bytes = 32,000 bytes.
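The same arithmetic generalizes to other sample rates and durations. The two helpers below are a small sketch of it; their names are hypothetical and they are not part of the SDK.

```typescript
// PCM16 mono: one 16-bit sample per frame, so 2 bytes per sample.
const BYTES_PER_SAMPLE = 2

// Bytes needed to hold `seconds` of mono PCM16 audio at `sampleRate` Hz.
function pcm16ByteLength(sampleRate: number, seconds: number): number {
  return Math.round(sampleRate * seconds) * BYTES_PER_SAMPLE
}

// Duration in seconds of a mono PCM16 buffer of `byteLength` bytes.
function pcm16DurationSeconds(byteLength: number, sampleRate: number): number {
  return byteLength / BYTES_PER_SAMPLE / sampleRate
}
```

These are handy for choosing a chunk size, for instance `pcm16ByteLength(16000, 0.1)` for 100 ms chunks at 16 kHz.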
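The following helper decodes an MP3 file with the Web Audio API and converts it to mono PCM16 at the target sample rate:
async function mp3ToPcm16(mp3File: File, targetSampleRate: number): Promise<ArrayBuffer> {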
  const arrayBuffer = await mp3File.arrayBuffer()
  const audioContext = new AudioContext({ sampleRate: targetSampleRate })
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer.slice(0))

  const length = audioBuffer.length
  const channels = audioBuffer.numberOfChannels
  const pcm16Buffer = new ArrayBuffer(length * 2)
  const pcm16View = new DataView(pcm16Buffer)

  // Mix to mono if stereo
  const mono = channels === 1
    ? audioBuffer.getChannelData(0)
    : (() => {
        const mixed = new Float32Array(length)
        const left = audioBuffer.getChannelData(0)
        const right = audioBuffer.getChannelData(1)
        for (let i = 0; i < length; i++) mixed[i] = (left[i] + right[i]) / 2
        return mixed
      })()

  // Float32 → Int16
  for (let i = 0; i < length; i++) {
    const s = Math.max(-1, Math.min(1, mono[i]))
    pcm16View.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true)
  }

  await audioContext.close()  // close() returns a Promise; wait for the context to be released
  return pcm16Buffer
}

Configuration

Configuration Interface

interface Configuration {
  environment: Environment
  drivingServiceMode?: DrivingServiceMode  // Default: DrivingServiceMode.sdk
  logLevel?: LogLevel                      // Default: LogLevel.off
  audioFormat?: AudioFormat                // Default: { channelCount: 1, sampleRate: 16000 }
}

Environment

enum Environment {
  cn = 'cn',      // China region
  intl = 'intl',  // International region
}

DrivingServiceMode

enum DrivingServiceMode {
  sdk = 'sdk',    // Server-driven (default)
  host = 'host',  // Client-driven
}

AudioFormat

interface AudioFormat {
  readonly channelCount: 1   // Fixed to mono
  readonly sampleRate: number // 8000 | 16000 | 22050 | 24000 | 32000 | 44100 | 48000
}
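
Since sampleRate is typed as a plain number, an unsupported value would not be caught at compile time. The sketch below shows one way to validate it before building the audioFormat option; `SUPPORTED_SAMPLE_RATES`, `isSupportedSampleRate`, and `makeAudioFormat` are hypothetical names, and the list is copied from the comment above.

```typescript
// Sample rates listed in the AudioFormat comment above.
const SUPPORTED_SAMPLE_RATES = [8000, 16000, 22050, 24000, 32000, 44100, 48000] as const
type SupportedSampleRate = (typeof SUPPORTED_SAMPLE_RATES)[number]

// Type guard: narrows a plain number to one of the supported rates.
function isSupportedSampleRate(rate: number): rate is SupportedSampleRate {
  return (SUPPORTED_SAMPLE_RATES as readonly number[]).indexOf(rate) !== -1
}

// Builds an object with the AudioFormat shape, defaulting to 16000 Hz.
function makeAudioFormat(
  sampleRate: number = 16000,
): { readonly channelCount: 1; readonly sampleRate: SupportedSampleRate } {
  if (!isSupportedSampleRate(sampleRate)) {
    throw new Error(`Unsupported sample rate: ${sampleRate}`)
  }
  return { channelCount: 1, sampleRate }
}
```

The result could then be passed as `audioFormat: makeAudioFormat(24000)` in the configuration given to AvatarSDK.initialize().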

LogLevel

enum LogLevel {
  off = 'off',          // No logging (default)
  error = 'error',      // Errors only
  warning = 'warning',  // Errors + warnings
  all = 'all',          // All logs
}

State Management

ConnectionState

Reported via onConnectionState callback (SDK Mode only).
enum ConnectionState {
  disconnected = 'disconnected',
  connecting = 'connecting',
  connected = 'connected',
  failed = 'failed',
}

ConversationState

Reported via onConversationState callback.
enum ConversationState {
  idle = 'idle',        // Breathing animation, waiting for input
  playing = 'playing',  // Active conversation playback
  pausing = 'pausing',  // Paused during playback
}
State transitions are notified immediately when the transition starts, not when the animation completes. For example, playing is reported as soon as the transition from idle begins.

Error Handling

AvatarError

import { AvatarError } from '@spatialwalk/avatarkit'

try {
  await avatarView.controller.start()
} catch (error) {
  if (error instanceof AvatarError) {
    console.error('SDK error:', error.message, error.code)
  }
}

Error Callback

avatarView.controller.onError = (error: Error) => {
  console.error('Controller error:', error)
}
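
Both paths, the try/catch and the onError callback, can share one formatting helper. The sketch below assumes only what the example above shows, namely that an SDK error carries a message and a code; the type of code is not specified here, so it is treated as unknown. `describeError` and `ErrorWithCode` are hypothetical names.

```typescript
// An Error that may carry a `code` property, as AvatarError does in the example above.
interface ErrorWithCode extends Error {
  code?: unknown
}

// Produces a single log-friendly string for any thrown value.
function describeError(error: unknown): string {
  if (!(error instanceof Error)) return String(error)
  const withCode: ErrorWithCode = error
  return withCode.code === undefined
    ? error.message
    : `${error.message} (code: ${String(withCode.code)})`
}
```

It can be used as `controller.onError = (error) => console.error(describeError(error))`.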

Lifecycle Management

Avatar Switching

// 1. Dispose current view
currentAvatarView.dispose()

// 2. Load new avatar
const newAvatar = await AvatarManager.shared.load('new-avatar-id')

// 3. Create new view (reuse same container)
currentAvatarView = new AvatarView(newAvatar, container)

// 4. Reconnect (initializeAudioContext() must still run inside a user gesture handler)
await currentAvatarView.controller.initializeAudioContext()
await currentAvatarView.controller.start()

Resource Cleanup

dispose() automatically cleans up all resources:
  • WebSocket connections
  • Audio playback data and animation resources
  • Canvas elements and render system
  • Event listeners and callbacks
Always call dispose() when the view is no longer needed. Failing to do so may cause memory leaks.
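One way to make sure dispose() always runs is a small teardown helper that mirrors the close-then-dispose order from the Quick Start. This is a sketch with hypothetical names (`teardown`, `DisposableView`), written against structural types rather than the SDK classes.

```typescript
// Minimal shapes of the controller and view used during cleanup.
interface Closable {
  close(): void
}
interface DisposableView {
  controller: Closable
  dispose(): void
}

// Closes the connection, then disposes the view even if close() throws.
function teardown(view: DisposableView | null): void {
  if (view === null) return
  try {
    view.controller.close()
  } finally {
    view.dispose()
  }
}
```

In a component framework this would typically be called from the unmount hook, for example `teardown(avatarView)`.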

Fallback Mechanism

If the WebSocket connection cannot be established within 15 seconds, the SDK automatically enters audio-only fallback mode: audio continues playing without animation. This keeps audio playback uninterrupted when the server is unreachable.
  • The fallback mode is interruptible like normal playback
  • onConnectionState reports failed when the connection times out
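An application may want to tell the user when it is in audio-only mode. The sketch below maps the connection state strings to a simple status object; `toConnectionStatus` and `ConnectionStatus` are hypothetical names, the labels are placeholders, and treating failed as the signal for audio-only mode is an assumption based on the bullets above.

```typescript
// What the UI needs to know about the connection.
interface ConnectionStatus {
  label: string
  audioOnly: boolean
}

// Accepts the string values of ConnectionState ('connecting', 'failed', ...).
function toConnectionStatus(state: string): ConnectionStatus {
  switch (state) {
    case 'connecting':
      return { label: 'Connecting', audioOnly: false }
    case 'connected':
      return { label: 'Connected', audioOnly: false }
    case 'failed':
      // Assumption: after a failed connection the SDK plays audio without animation.
      return { label: 'Connection failed, audio only', audioOnly: true }
    case 'disconnected':
      return { label: 'Disconnected', audioOnly: false }
    default:
      return { label: `Unknown state: ${state}`, audioOnly: false }
  }
}
```

It can be wired up as `controller.onConnectionState = (state) => render(toConnectionStatus(state))`, where `render` stands in for your own UI update.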

Browser Compatibility

| Browser | Minimum Version | Rendering |
| --- | --- | --- |
| Chrome / Edge | 90+ | WebGPU (preferred) |
| Firefox | 90+ | WebGL |
| Safari | 14+ | WebGL |
| iOS Safari | 14+ | WebGL |
| Android Chrome | 90+ | WebGL |

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Audio not working | initializeAudioContext() not in user gesture | Call it inside a click or touchstart handler |
| Avatar not rendering | Container has zero dimensions | Set explicit width and height on the container |
| WASM MIME type error | Build tool misconfiguration | Use Vite plugin or Next.js wrapper |
| Session token invalid | Token expired or not set | Refresh token from backend, call setSessionToken() before start() |
| WebSocket connection failed | Network or auth issue | Check network connectivity and token validity |