Installation
pnpm add @spatialwalk/avatarkit
npm install @spatialwalk/avatarkit
yarn add @spatialwalk/avatarkit
Required: The SDK uses WASM files that need special build configuration. You must configure your build tool before using the SDK.
For Vite, add the official plugin to vite.config.ts:
import { defineConfig } from 'vite'
import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'

export default defineConfig({
  plugins: [
    avatarkitVitePlugin(),
  ],
})
The plugin automatically handles:
Development Server: Sets the correct MIME type for WASM files
Build Time: Copies WASM files to dist/assets/
Cloudflare Pages: Generates a _headers file
Vite Configuration: Configures optimizeDeps, assetsInclude, etc.
Manual Configuration (Without Plugin)
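import { defineConfig } from 'vite'

export default defineConfig({
  // Exclude the SDK from dependency pre-bundling
  optimizeDeps: {
    exclude: ['@spatialwalk/avatarkit'],
  },
  // Treat .wasm files as static assets
  assetsInclude: ['**/*.wasm'],
  build: {
    // Never inline assets as base64 data URLs
    assetsInlineLimit: 0,
    rollupOptions: {
      output: {
        // Keep WASM file names unhashed; hash every other asset
        assetFileNames: (assetInfo) => {
          if (assetInfo.name?.endsWith('.wasm')) {
            return 'assets/[name][extname]'
          }
          return 'assets/[name]-[hash][extname]'
        },
      },
    },
  },
  plugins: [
    {
      // configureServer is a Vite plugin hook, so it must live inside a plugin object
      name: 'wasm-content-type',
      configureServer(server) {
        // Serve .wasm files with the correct MIME type in development
        server.middlewares.use((req, res, next) => {
          if (req.url?.endsWith('.wasm')) {
            res.setHeader('Content-Type', 'application/wasm')
          }
          next()
        })
      },
    },
  ],
})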
For Next.js, wrap your config in next.config.mjs:
import { withAvatarkit } from '@spatialwalk/avatarkit/next'

export default withAvatarkit({
  // ...your existing Next.js config
})
The wrapper automatically handles:
Emscripten Fix: Patches scriptDirectory so the WASM glue file correctly resolves assets at /_next/static/chunks/
WASM Copying: Copies .wasm files into static/chunks/ via a custom webpack plugin (client build only)
Content-Type Headers: Adds the application/wasm response header for /_next/static/chunks/*.wasm
Config Chaining: Preserves your existing webpack and headers configurations
If you have multiple config wrappers, withAvatarkit must wrap your entire config:
import { withAvatarkit } from '@spatialwalk/avatarkit/next'
import withOtherPlugin from 'other-plugin'

export default withAvatarkit(withOtherPlugin({
  // ...your config
}))
Authentication
SDK Mode requires an App ID and a Session Token.
| Credential | How to Obtain | Notes |
| --- | --- | --- |
| App ID | SpatialReal Studio → Create App | Required for all modes |
| Session Token | Your backend server requests it from AvatarKit Server | Max 24 hours validity |
Authentication Flow: Your Client → Your Backend → AvatarKit Server → Session Token (24 hours max)
The Session Token must be set before calling start(). Never expose your token generation logic in client-side code.
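The flow above can be sketched from the client's side as a small token cache. This is only an illustration under stated assumptions: the SessionTokenInfo shape, the needsRefresh and getSessionToken helpers, and the one-minute safety margin are hypothetical and not part of the SDK, and fetchFromBackend stands in for a request to an endpoint on your own server.

```typescript
// Hypothetical shape returned by your own backend; not part of the SDK.
interface SessionTokenInfo {
  token: string
  expiresAtMs: number // absolute expiry time in epoch milliseconds
}

// True when there is no token or it expires within `marginMs`.
function needsRefresh(
  info: SessionTokenInfo | null,
  nowMs: number,
  marginMs: number = 60000, // assumed safety margin
): boolean {
  return info === null || info.expiresAtMs - nowMs <= marginMs
}

// Returns a usable token, asking the backend only when required.
// `fetchFromBackend` is a placeholder for a call to your own server.
async function getSessionToken(
  cached: SessionTokenInfo | null,
  fetchFromBackend: () => Promise<SessionTokenInfo>,
  nowMs: number = Date.now(),
): Promise<SessionTokenInfo> {
  if (cached === null || needsRefresh(cached, nowMs)) {
    return fetchFromBackend()
  }
  return cached
}
```

The resulting token string would then be passed to AvatarSDK.setSessionToken() before start() is called; the token generation itself stays on the backend.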
Quick Start
Initialize SDK
import {
  AvatarSDK,
  Environment,
  DrivingServiceMode,
} from '@spatialwalk/avatarkit'

await AvatarSDK.initialize('your-app-id', {
  environment: Environment.intl, // or Environment.cn
  drivingServiceMode: DrivingServiceMode.sdk, // Default
})

// Set session token (required before start())
AvatarSDK.setSessionToken('your-session-token')
Load Avatar
import { AvatarManager } from '@spatialwalk/avatarkit'

const avatar = await AvatarManager.shared.load('avatar-id', (progress) => {
  console.log(`Loading: ${progress.progress}%`)
})
Create View
import { AvatarView } from '@spatialwalk/avatarkit'

// Container MUST have non-zero width and height
const container = document.getElementById('avatar-container')!
const avatarView = new AvatarView(avatar, container)
Initialize Audio Context
Critical: initializeAudioContext() must be called inside a user gesture handler (e.g., click, touchstart). This is a browser security requirement — calling it outside a user gesture will fail silently.
button.addEventListener('click', async () => {
  await avatarView.controller.initializeAudioContext()
})
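Because the handler above may fire on every click, it can help to make sure the audio context is initialized only once. The once helper below is a hypothetical utility, not an SDK function; it is a minimal sketch that caches the promise returned by the first call.

```typescript
// Wraps an async initializer so repeated calls reuse the first call's promise.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null
  return () => {
    if (pending === null) {
      pending = init()
    }
    return pending
  }
}

// Usage sketch (names taken from the snippet above):
//   const initAudio = once(() => avatarView.controller.initializeAudioContext())
//   button.addEventListener('click', () => { void initAudio() })
```

Note that a rejected first call is cached as well, so an application that wants to retry after a failure would need to reset the cached promise.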
Connect and Send Audio
// Start WebSocket connection to AvatarKit server
await avatarView.controller.start()

// Send audio data (PCM16, mono, matching the configured sample rate)
const audioData: ArrayBuffer = /* your PCM16 audio data */
avatarView.controller.send(audioData, false) // More audio follows in this round

// Mark the end of the conversation round with the final chunk
const lastChunk: ArrayBuffer = /* final PCM16 chunk of this round */
avatarView.controller.send(lastChunk, true)
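When the audio is available as one large buffer, the two send() calls above generalize to a loop over chunks. The sketch below assumes a chunk duration chosen by the application (the source does not prescribe one); splitPcm16, sendRound, and the AudioSink type are hypothetical helpers, with AudioSink describing only the send(data, end) shape used above.

```typescript
// Minimal structural type for anything exposing send(data, end).
interface AudioSink {
  send(data: ArrayBuffer, end: boolean): unknown
}

// Splits PCM16 mono audio into chunks of roughly `chunkMs` milliseconds.
// Chunk boundaries always fall on whole 2-byte samples.
function splitPcm16(pcm: ArrayBuffer, sampleRate: number, chunkMs: number): ArrayBuffer[] {
  const samplesPerChunk = Math.floor((sampleRate * chunkMs) / 1000)
  const bytesPerChunk = Math.max(2, samplesPerChunk * 2)
  const chunks: ArrayBuffer[] = []
  for (let offset = 0; offset < pcm.byteLength; offset += bytesPerChunk) {
    chunks.push(pcm.slice(offset, offset + bytesPerChunk))
  }
  return chunks
}

// Sends every chunk and flags only the last one as the end of the round.
function sendRound(sink: AudioSink, chunks: ArrayBuffer[]): void {
  chunks.forEach((chunk, index) => {
    sink.send(chunk, index === chunks.length - 1)
  })
}
```

With these helpers, sendRound(avatarView.controller, splitPcm16(audioData, 16000, 100)) would send one second of 16 kHz audio as ten chunks of 3,200 bytes each.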
Cleanup
avatarView.controller.close() // Close WebSocket connection
avatarView.dispose() // Release all resources
Core API
AvatarSDK
SDK initialization and global configuration.
// Initialize
await AvatarSDK.initialize(appId: string, configuration: Configuration)

// Properties (read-only)
AvatarSDK.isInitialized // boolean
AvatarSDK.appId // string
AvatarSDK.configuration // Configuration
AvatarSDK.version // string
AvatarSDK.sessionToken // string

// Methods
AvatarSDK.setSessionToken(token: string) // Set auth token
AvatarSDK.setUserId(userId: string) // Set user ID (for telemetry)
AvatarSDK.cleanup() // Release all SDK resources
setSessionToken() can be called before or after initialize(). If called before, the token is applied automatically during initialization.
AvatarManager
Avatar resource loading and caching. Access via the singleton AvatarManager.shared.
const manager = AvatarManager.shared

// Load avatar (downloads and caches)
const avatar = await manager.load(
  id: string,
  onProgress?: (progress: LoadProgressInfo) => void
)

// Clear all cached resources
manager.clearAll()

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Avatar character ID |
| onProgress | (progress) => void | Progress callback with { progress: number } (0–100) |
AvatarView
3D rendering view. Automatically creates a Canvas element and an AvatarController.
// Create view — Canvas is added to container automatically
const avatarView = new AvatarView(avatar: Avatar, container: HTMLElement)

| Property / Method | Type | Description |
| --- | --- | --- |
| controller | AvatarController | Playback controller (read-only) |
| transform | { x, y, scale } | Avatar position and scale |
| onFirstRendering | () => void | Callback when the first frame renders |
| dispose() | void | Release all resources |
Transform coordinates:
| Field | Range | Description |
| --- | --- | --- |
| x | -1 to 1 | Horizontal offset (-1 = left, 0 = center, 1 = right) |
| y | -1 to 1 | Vertical offset (-1 = bottom, 0 = center, 1 = top) |
| scale | > 0 | Scale factor (1.0 = original size) |
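If transform values come from user input such as drag or pinch gestures, it may be worth keeping them inside the documented ranges before assigning them. The clampTransform helper and its minScale default below are hypothetical; the source does not say how the SDK treats out-of-range values, so this is a defensive sketch only.

```typescript
// Mirrors the { x, y, scale } shape of the transform property.
interface AvatarTransform {
  x: number
  y: number
  scale: number
}

// Clamps x and y to [-1, 1] and keeps scale strictly positive.
// `minScale` is an assumed lower bound, not an SDK constant.
function clampTransform(t: AvatarTransform, minScale: number = 0.01): AvatarTransform {
  const clampUnit = (value: number): number => Math.min(1, Math.max(-1, value))
  return {
    x: clampUnit(t.x),
    y: clampUnit(t.y),
    scale: Math.max(minScale, t.scale),
  }
}
```

A call such as avatarView.transform = clampTransform({ x: 0.2, y: -0.1, scale: 1.5 }) would leave in-range values unchanged.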
Container requirement: The container element must have non-zero width and height. The Canvas fills the container and auto-resizes via ResizeObserver.
AvatarController — SDK Mode Methods
These methods are only available when drivingServiceMode is DrivingServiceMode.sdk.
// Initialize audio (MUST be in user gesture handler)
await controller.initializeAudioContext()

// Connect to AvatarKit server
await controller.start()

// Send audio data; returns conversationId
const conversationId = controller.send(
  audioData: ArrayBuffer, // PCM16, mono
  end: boolean // true = end of conversation round
)

// Close connection
controller.close()
send() behavior:
end: false — continues the current conversation round
end: true — marks the end of the current round; sending new audio after this starts a new round and interrupts any ongoing playback
AvatarController — Common Methods
Available in both SDK Mode and Host Mode.
// Playback control
controller.pause() // Pause audio + animation
controller.resume() // Resume playback
controller.interrupt() // Stop current playback, clear data

// Data management
controller.clear() // Clear all data and resources

// Conversation
controller.getCurrentConversationId() // string | null

// Volume (affects avatar audio only, not system volume)
controller.setVolume(volume: number) // 0.0 to 1.0
controller.getVolume(): number // Current volume
AvatarController — Event Callbacks
// Connection state changes (SDK Mode only)
controller.onConnectionState = (state: ConnectionState) => {
  console.log('Connection:', state)
}

// Conversation state changes
controller.onConversationState = (state: ConversationState) => {
  console.log('Conversation:', state)
}

// Error handler
controller.onError = (error: Error) => {
  console.error('Error:', error)
}
Audio Format
The SDK requires audio in mono PCM16 format:
| Property | Value |
| --- | --- |
| Format | PCM16 (16-bit signed integer, little-endian) |
| Channels | Mono (1 channel) |
| Sample Rate | Configurable: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz (default: 16000) |
| Data Type | ArrayBuffer or Uint8Array |
Data size: 1 second at 16 kHz = 16,000 samples × 2 bytes = 32,000 bytes.
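The same arithmetic applies to any sample rate and duration. The two helpers below are hypothetical utilities (not SDK functions) that encode the rule of 2 bytes per mono PCM16 sample.

```typescript
const BYTES_PER_SAMPLE = 2 // PCM16 mono: one 16-bit sample per frame

// Number of bytes needed for `seconds` of audio at `sampleRate` Hz.
function pcm16ByteLength(sampleRate: number, seconds: number): number {
  return Math.round(sampleRate * seconds) * BYTES_PER_SAMPLE
}

// Playback duration in seconds of a PCM16 mono buffer of `byteLength` bytes.
function pcm16DurationSeconds(byteLength: number, sampleRate: number): number {
  return byteLength / BYTES_PER_SAMPLE / sampleRate
}
```

For example, pcm16ByteLength(16000, 1) reproduces the 32,000 bytes computed above, and half a second at 48 kHz comes to 48,000 bytes.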
// Decode an MP3 file and convert it to mono PCM16 at the target sample rate
async function mp3ToPcm16(mp3File: File, targetSampleRate: number): Promise<ArrayBuffer> {
  const arrayBuffer = await mp3File.arrayBuffer()
  // Decoding in a context created at the target rate resamples the audio to that rate
  const audioContext = new AudioContext({ sampleRate: targetSampleRate })
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer.slice(0))
  const length = audioBuffer.length
  const channels = audioBuffer.numberOfChannels
  const pcm16Buffer = new ArrayBuffer(length * 2)
  const pcm16View = new DataView(pcm16Buffer)

  // Use the single channel as-is, otherwise average the first two channels into mono
  const mono = channels === 1
    ? audioBuffer.getChannelData(0)
    : (() => {
        const mixed = new Float32Array(length)
        const left = audioBuffer.getChannelData(0)
        const right = audioBuffer.getChannelData(1)
        for (let i = 0; i < length; i++) mixed[i] = (left[i] + right[i]) / 2
        return mixed
      })()

  // Float32 [-1, 1] → Int16, little-endian
  for (let i = 0; i < length; i++) {
    const s = Math.max(-1, Math.min(1, mono[i]))
    pcm16View.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true)
  }

  // close() returns a promise; wait for the context to release its resources
  await audioContext.close()
  return pcm16Buffer
}
Configuration
Configuration Interface
interface Configuration {
  environment: Environment
  drivingServiceMode?: DrivingServiceMode // Default: DrivingServiceMode.sdk
  logLevel?: LogLevel // Default: LogLevel.off
  audioFormat?: AudioFormat // Default: { channelCount: 1, sampleRate: 16000 }
}
Environment
enum Environment {
  cn = 'cn', // China region
  intl = 'intl', // International region
}
DrivingServiceMode
enum DrivingServiceMode {
  sdk = 'sdk', // Server-driven (default)
  host = 'host', // Client-driven
}
AudioFormat
interface AudioFormat {
  readonly channelCount: 1 // Fixed to mono
  readonly sampleRate: number // 8000 | 16000 | 22050 | 24000 | 32000 | 44100 | 48000
}
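Since sampleRate is typed as a plain number, an unsupported value would not be caught at compile time. The sketch below shows one way to validate a rate before building the audioFormat option; SUPPORTED_SAMPLE_RATES, isSupportedSampleRate, and toAudioFormat are hypothetical names, and the list of rates is copied from the comment above.

```typescript
// Sample rates listed in the documentation above.
const SUPPORTED_SAMPLE_RATES = [8000, 16000, 22050, 24000, 32000, 44100, 48000] as const
type SupportedSampleRate = (typeof SUPPORTED_SAMPLE_RATES)[number]

// Type guard: narrows a number to one of the supported rates.
function isSupportedSampleRate(rate: number): rate is SupportedSampleRate {
  return (SUPPORTED_SAMPLE_RATES as readonly number[]).indexOf(rate) !== -1
}

// Builds an object matching the AudioFormat shape, or throws for an unsupported rate.
function toAudioFormat(rate: number): { channelCount: 1; sampleRate: SupportedSampleRate } {
  if (!isSupportedSampleRate(rate)) {
    throw new RangeError(`Unsupported sample rate: ${rate}`)
  }
  return { channelCount: 1, sampleRate: rate }
}
```

The result could be passed as audioFormat in the configuration given to AvatarSDK.initialize().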
LogLevel
enum LogLevel {
  off = 'off', // No logging (default)
  error = 'error', // Errors only
  warning = 'warning', // Errors + warnings
  all = 'all', // All logs
}
State Management
ConnectionState
Reported via onConnectionState callback (SDK Mode only).
enum ConnectionState {
  disconnected = 'disconnected',
  connecting = 'connecting',
  connected = 'connected',
  failed = 'failed',
}
ConversationState
Reported via onConversationState callback.
enum ConversationState {
  idle = 'idle', // Breathing animation, waiting for input
  playing = 'playing', // Active conversation playback
  pausing = 'pausing', // Paused during playback
}
State transitions are notified immediately when the transition starts, not when the animation completes. For example, playing is reported as soon as the transition from idle begins.
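Because callbacks fire at the start of a transition, an application that wants to reason about state changes may find it useful to record them. The createStateTracker helper below is a hypothetical utility, generic over the state type so that it could be used with either state enum; it is a sketch, not SDK behavior.

```typescript
// Records each distinct state change reported to `onState`.
function createStateTracker<S>(initial: S) {
  let current: S = initial
  const history: Array<{ from: S; to: S }> = []

  const onState = (next: S): void => {
    // Ignore repeated notifications of the same state.
    if (next !== current) {
      history.push({ from: current, to: next })
    }
    current = next
  }

  return { onState, history, getCurrent: (): S => current }
}

// Usage sketch (names from the sections above):
//   const tracker = createStateTracker(ConversationState.idle)
//   controller.onConversationState = tracker.onState
```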
Error Handling
AvatarError
import { AvatarError } from '@spatialwalk/avatarkit'

try {
  await avatarView.controller.start()
} catch (error) {
  if (error instanceof AvatarError) {
    console.error('SDK error:', error.message, error.code)
  }
}
Error Callback
avatarView.controller.onError = (error: Error) => {
  console.error('Controller error:', error)
}
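An application may choose to retry a failed start() a few times before giving up. The sketch below is one possible approach, not something the SDK prescribes: backoffDelayMs and withRetry are hypothetical helpers, the delay values are arbitrary, and the source does not say which errors are worth retrying, so that decision is left to the caller.

```typescript
// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt))
}

// Runs `action` up to `maxAttempts` times, sleeping between failed attempts.
// `sleep` is injected so the helper has no dependency on timers.
async function withRetry<T>(
  action: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await action()
    } catch (error) {
      lastError = error
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt))
      }
    }
  }
  throw lastError
}

// Usage sketch:
//   const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))
//   await withRetry(() => avatarView.controller.start(), 3, sleep)
```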
Lifecycle Management
Avatar Switching
// 1. Dispose current view
currentAvatarView.dispose()

// 2. Load new avatar
const newAvatar = await AvatarManager.shared.load('new-avatar-id')

// 3. Create new view (reuse same container)
currentAvatarView = new AvatarView(newAvatar, container)

// 4. Reconnect (initializeAudioContext() must still run inside a user gesture
//    handler, so trigger the switch from a click or touchstart)
await currentAvatarView.controller.initializeAudioContext()
await currentAvatarView.controller.start()
Resource Cleanup
dispose() automatically cleans up all resources:
WebSocket connections
Audio playback data and animation resources
Canvas elements and render system
Event listeners and callbacks
Always call dispose() when the view is no longer needed. Failing to do so may cause memory leaks.
Fallback Mechanism
If the WebSocket connection cannot be established within 15 seconds, the SDK automatically enters audio-only fallback mode: audio continues playing without animation. This keeps audio playback uninterrupted when the server is unreachable.
The fallback mode is interruptible like normal playback
onConnectionState reports failed when the connection times out
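An application can surface the fallback to users by mapping the reported connection state to a display mode. The playbackModeFor function and the PlaybackMode labels below are hypothetical application-level names, not SDK types; the mapping simply restates the behavior described above.

```typescript
// Application-level labels; not part of the SDK.
type PlaybackMode = 'animated' | 'audio-only' | 'waiting'

// Maps a ConnectionState value (as its string form) to a display mode.
function playbackModeFor(state: string): PlaybackMode {
  switch (state) {
    case 'connected':
      return 'animated' // server-driven animation is available
    case 'failed':
      return 'audio-only' // fallback: audio plays without animation
    default:
      return 'waiting' // 'connecting' or 'disconnected'
  }
}

// Usage sketch:
//   controller.onConnectionState = (state) => updateBadge(playbackModeFor(state))
//   (updateBadge is a placeholder for your own UI code)
```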
Browser Compatibility
| Browser | Minimum Version | Rendering |
| --- | --- | --- |
| Chrome / Edge | 90+ | WebGPU (preferred) |
| Firefox | 90+ | WebGL |
| Safari | 14+ | WebGL |
| iOS Safari | 14+ | WebGL |
| Android Chrome | 90+ | WebGL |
Common Issues
| Issue | Cause | Solution |
| --- | --- | --- |
| Audio not working | initializeAudioContext() not in user gesture | Call it inside a click or touchstart handler |
| Avatar not rendering | Container has zero dimensions | Set explicit width and height on the container |
| WASM MIME type error | Build tool misconfiguration | Use the Vite plugin or the Next.js wrapper |
| Session token invalid | Token expired or not set | Refresh the token from your backend and call setSessionToken() before start() |
| WebSocket connection failed | Network or auth issue | Check network connectivity and token validity |