Core Classes
AvatarKit
The core management class of the SDK, responsible for initialization and global configuration.
/// SDK entry class
export class AvatarKit {
/// Initialize SDK
/// @param appId - Application ID
/// @param configuration - SDK configuration
static async initialize(
appId: string,
configuration: Configuration
): Promise<void>
/// Check initialization status
static readonly isInitialized: boolean
/// Get the App ID provided during initialization
static readonly appId: string | null
/// Get the configuration object provided during initialization
static readonly configuration: Configuration | null
/// Set session token for WebSocket authentication
/// @param token - Session token
/// Note: sessionToken is not in the Configuration interface, it must be set separately using this method
static setSessionToken(token: string): void
/// Get the currently set session token
static readonly sessionToken: string | null
/// Set user ID for logging and analytics
/// @param userId - User ID
static setUserId(userId: string): void
/// Get the currently set user ID
static readonly userId: string | null
/// Get SDK version
static readonly version: string
/// Clean up SDK global resources
/// If you no longer need the digital human SDK, you must call this to release all occupied resources, including WASM modules, memory, etc.
/// Note: Only call this when the entire application no longer needs digital human functionality.
static cleanup(): void
}
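The initialization order matters because sessionToken is not part of Configuration. A minimal sketch of the bootstrap sequence follows; the `AvatarKitLike` interface and `bootstrap` function are illustrative stand-ins defined here, not SDK exports — in a real app you would call the static `AvatarKit` members imported from `@spatialwalk/avatarkit` directly.

```typescript
// Illustrative interface mirroring only the AvatarKit members documented above.
interface AvatarKitLike {
  initialize(appId: string, config: { environment: string }): Promise<void>
  setSessionToken(token: string): void
  setUserId(userId: string): void
}

// sessionToken is not in the Configuration interface, so it is set in a
// second step after initialize() resolves.
async function bootstrap(kit: AvatarKitLike, token: string): Promise<void> {
  await kit.initialize('your-app-id', { environment: 'cn' })
  kit.setSessionToken(token) // required before starting a WebSocket session
  kit.setUserId('user-123')  // optional, for logging and analytics
}
```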
AvatarManager
Character resource manager, responsible for downloading, caching, and loading character data. Use the singleton instance via AvatarManager.shared.
export class AvatarManager {
/// Get singleton instance
static readonly shared: AvatarManager
/// Load character resources
/// @param characterId - Character ID
/// @param onProgress - Progress callback (optional)
/// @returns Promise<Avatar> - Successfully loaded character object
async load(
characterId: string,
onProgress?: (progress: LoadProgressInfo) => void
): Promise<Avatar>
/// Clear cache
clearCache(): void
}
AvatarView
3D rendering view (rendering layer), responsible for 3D rendering only. It internally creates and manages an AvatarController instance.
export class AvatarView {
/// Create view instance
/// @param avatar - Loaded character object
/// @param container - Container element (required)
/// - Canvas automatically uses the full size of the container (width and height)
/// - Canvas aspect ratio adapts to container size - set container size to control aspect ratio
/// - Canvas is automatically added to the container
constructor(avatar: Avatar, container: HTMLElement)
/// Real-time communication controller
readonly avatarController: AvatarController
/// Wait for first frame to render
readonly ready: Promise<void>
/// Get rendering Canvas element
getCanvas(): HTMLCanvasElement
/// Set background image
/// @param imageUrl - Image URL
setBackgroundImage(imageUrl: string): void
/// Update camera configuration
/// @param config - Camera configuration object
updateCameraConfig(config: CameraConfig): void
/// Clean up view resources
dispose(): void
}
AvatarController
Real-time communication controller that handles WebSocket connections and audio/video data.
Important Notes:
- The sendText() method is not currently supported and will throw an error if called.
- start() and close() are only available in SDK mode.
- yieldAudioData() and yieldFramesData() are only available in Host mode.
- interrupt(), clear(), getCurrentConversationId(), setVolume(), and getVolume() are available in both modes.
- Volume Control: Use setVolume(volume) to control audio playback volume (0.0 to 1.0). This only affects the avatar's audio player, not system volume. Volume changes take effect immediately, including for currently playing audio.
export class AvatarController {
/// Start WebSocket connection
async start(): Promise<void>
/// Send audio data
/// @param audioData - Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
/// - Sample rate: 16kHz (16000 Hz) - backend requirement
/// - Format: PCM16 (16-bit signed integer, little-endian)
/// - Channels: Mono (single channel)
/// - Example: 1 second = 16000 samples × 2 bytes = 32000 bytes
/// @param end - Whether to end sending
/// - `false` (default) - Continue sending audio data for current conversation
/// - `true` - Mark the end of current conversation round. After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
/// @returns conversationId - Conversation ID (mainly used internally in SDK mode, not required for basic usage)
/// Important Notes:
/// - SDK will start returning animation data and playing once it collects enough audio data, no need to wait for `end=true`
/// - After `end=true`, sending new audio data will interrupt any ongoing playback from the previous conversation round
send(audioData: ArrayBuffer, end?: boolean): string
/// Stream audio chunks (Host mode only)
/// @param data - Audio chunk data (Uint8Array format, 16kHz mono PCM16)
/// @param isLast - Whether this is the last chunk (default: false)
/// @returns conversationId - Conversation ID for this audio session
/// Important Notes:
/// - Can be called directly to start a new session
/// - If no conversationId exists, a new one will be automatically generated
/// - Returns the conversationId that must be used when sending animation data via yieldFramesData()
yieldAudioData(data: Uint8Array, isLast?: boolean): string
/// Stream animation keyframes (Host mode only)
/// @param keyframes - Animation keyframes (obtained from your service)
/// @param conversationId - Conversation ID (required)
/// - Use getCurrentConversationId() or yieldAudioData() to get conversationId
/// - Animation data with mismatched conversationId will be discarded
/// Important Notes:
/// - Requires a valid conversationId from audio data
/// - Animation data with mismatched conversationId will be discarded
/// - Must send audio data first to get conversationId before calling this method
yieldFramesData(keyframes: any[], conversationId: string): void
/// Interrupt current playback (stops and clears data)
interrupt(): void
/// Clear all data and resources
clear(): void
/// Close connection (SDK mode only)
close(): void
/// Get current conversation ID (for Host mode)
/// Returns: Current conversationId for the active audio session, or null if no active session
getCurrentConversationId(): string | null
/// Volume control (affects only avatar audio player, not system volume)
/// @param volume - Volume level from 0.0 to 1.0
setVolume(volume: number): void
/// Get current volume level
/// @returns Current volume level from 0.0 to 1.0
getVolume(): number
/// Connection state callback (SDK mode only)
onConnectionState?: (state: ConnectionState) => void
/// Conversation state callback
onConversationState?: (state: ConversationState) => void
/// Error callback
onError?: (error: Error) => void
}
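The documented volume range is 0.0 to 1.0. The small helper below makes that range explicit before calling setVolume(); the clamping behavior for out-of-range input is this doc's precaution, not documented SDK behavior.

```typescript
// Clamp a requested volume into the 0.0–1.0 range the SDK documents.
function clampVolume(volume: number): number {
  return Math.min(1, Math.max(0, volume))
}

// Usage with an AvatarController (see the examples below):
//   avatarView.avatarController.setVolume(clampVolume(0.5))
//   const current = avatarView.avatarController.getVolume()
```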
Event Descriptions:
- onConnectionState: Connection state change callback (SDK mode only)
- onConversationState: Conversation state change callback
- onError: Error callback
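The Host-mode flow above (stream audio, capture the returned conversationId, then feed keyframes for that id) can be sketched as follows. `chunkAudio` is a helper defined here, not an SDK API, and `controller` and `keyframesFromYourService` are placeholders for your own objects.

```typescript
// Split a PCM16 byte stream into fixed-size chunks for yieldAudioData().
function chunkAudio(data: Uint8Array, chunkSize: number): Uint8Array[] {
  const chunks: Uint8Array[] = []
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    chunks.push(data.subarray(offset, offset + chunkSize))
  }
  return chunks
}

// Usage with an AvatarController in Host mode (per this reference):
//   const chunks = chunkAudio(pcm16Audio, 3200) // 3200 bytes = 100 ms of 16 kHz PCM16
//   let conversationId = ''
//   chunks.forEach((chunk, i) => {
//     conversationId = controller.yieldAudioData(chunk, i === chunks.length - 1)
//   })
//   // Animation data must reference the conversationId returned above
//   controller.yieldFramesData(keyframesFromYourService, conversationId)
```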
Configuration Types
Configuration
SDK configuration interface
interface Configuration {
/// Environment configuration
environment: Environment
/// Driving service mode (optional, default is 'sdk')
drivingServiceMode?: DrivingServiceMode
}
enum DrivingServiceMode {
sdk = 'sdk' // SDK mode: SDK handles WebSocket communication automatically
}
Environment
Environment enumeration
enum Environment {
/// China region
cn = 'cn'
/// International region
intl = 'intl'
/// Test environment
test = 'test'
}
CameraConfig
Camera configuration interface
interface CameraConfig {
/// Camera position
position: [number, number, number]
/// Camera target point
target: [number, number, number]
/// Field of view angle (degrees)
fov: number
/// Near clipping plane
near: number
/// Far clipping plane
far: number
/// Up direction vector (optional)
up?: [number, number, number]
/// Aspect ratio (optional)
aspect?: number
}
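An illustrative CameraConfig literal is shown below; the numeric values are placeholders chosen for this sketch, not SDK defaults. Pass the object to avatarView.updateCameraConfig().

```typescript
// Example CameraConfig values (placeholders, not SDK defaults).
const config = {
  position: [0, 1.5, 3] as [number, number, number], // camera in front of the avatar
  target: [0, 1.5, 0] as [number, number, number],   // looking at roughly head height
  fov: 45,   // vertical field of view in degrees
  near: 0.1, // near clipping plane
  far: 100,  // far clipping plane
}

// Usage:
//   avatarView.updateCameraConfig(config)
```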
State Types
ConnectionState
Connection state enumeration
enum ConnectionState {
/// Not connected
disconnected = 'disconnected'
/// Connecting
connecting = 'connecting'
/// Connected
connected = 'connected'
/// Connection failed
failed = 'failed'
}
ConversationState
Conversation state enumeration
enum ConversationState {
/// Idle state, showing breathing animation
idle = 'idle'
/// Playing conversation content (including transition animation period)
playing = 'playing'
}
State Descriptions:
- idle: Avatar is in idle state, waiting for a conversation to start
- playing: Avatar is playing conversation content (including during the transition animation period)
Note: The target state is notified immediately during the transition animation:
- When transitioning from idle to playing, the playing state is notified immediately
- When transitioning from playing to idle, the idle state is notified immediately
LoadProgress
Load progress enumeration
enum LoadProgress {
/// Downloading
downloading = 'downloading'
/// Completed
completed = 'completed'
/// Failed
failed = 'failed'
}
type LoadProgressInfo = {
/// Load progress type
type: LoadProgress
/// Progress percentage 0-100
progress?: number
/// Error information
error?: Error
}
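A small formatter for LoadProgressInfo values, useful inside the onProgress callback of AvatarManager.shared.load(). The types are restated locally so the sketch is self-contained; the message strings are this doc's suggestion, not SDK output.

```typescript
// Local restatement of the types documented above.
type LoadProgress = 'downloading' | 'completed' | 'failed'
interface LoadProgressInfo { type: LoadProgress; progress?: number; error?: Error }

// Map a progress event to a user-facing message.
function describeProgress(info: LoadProgressInfo): string {
  switch (info.type) {
    case 'downloading': return `Loading: ${info.progress ?? 0}%`
    case 'completed': return 'Load complete'
    case 'failed': return `Load failed: ${info.error?.message ?? 'unknown error'}`
    default: return 'Unknown state'
  }
}

// Usage:
//   const avatar = await AvatarManager.shared.load('character-id', (p) => {
//     console.log(describeProgress(p))
//   })
```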
Fallback Mechanism
The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
- SDK Mode Connection Failure: If WebSocket connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. In this mode, audio data can still be sent and will play normally, even though no animation data will be received from the server. This ensures that audio playback is not interrupted even when the service connection fails.
- SDK Mode Server Error: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session and continues playing audio independently.
- Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
- The fallback mode is interruptible, just like normal playback mode.
- Connection state callbacks (onConnectionState) will notify you when the connection fails or times out, allowing you to handle the fallback state appropriately.
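The fallback behavior above can be surfaced in the UI via the connection-state callback. The sketch below maps a 'failed' state to a notice that audio-only playback continues; the message string and `showBanner` are this doc's placeholders, not SDK APIs.

```typescript
// Local restatement of the ConnectionState values documented above.
type ConnectionState = 'disconnected' | 'connecting' | 'connected' | 'failed'

// Only 'failed' indicates the audio-only fallback path; other states need no notice.
function fallbackNotice(state: ConnectionState): string | null {
  return state === 'failed' ? 'Animation unavailable; continuing with audio only' : null
}

// Usage:
//   avatarView.avatarController.onConnectionState = (state) => {
//     const notice = fallbackNotice(state)
//     if (notice) showBanner(notice) // showBanner is your app's own UI code
//   }
```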
Usage Examples
Basic Initialization
Create Configuration
import { AvatarKit, Environment, DrivingServiceMode } from '@spatialwalk/avatarkit'
await AvatarKit.initialize('your-app-id', {
environment: Environment.cn,
drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
})
Load Character
import { AvatarManager } from '@spatialwalk/avatarkit'
const avatarManager = AvatarManager.shared
const avatar = await avatarManager.load('character-id', (progress) => {
console.log(`Loading: ${progress.progress}%`)
})
Ensure the character ID is valid and the network connection is stable; it's recommended to check network status before loading.
Create Digital Human View
Create View Instance
import { AvatarView } from '@spatialwalk/avatarkit'
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)
// Wait for first frame to render
await avatarView.ready
Start Real-time Communication
// Set up event listeners
avatarView.avatarController.onConnectionState = (state) => {
console.log('Connection state:', state)
}
avatarView.avatarController.onConversationState = (state) => {
console.log('Conversation state:', state)
}
// Start connection
await avatarView.avatarController.start()
Audio Processing
⚠️ Important: The SDK requires audio data to be in 16kHz mono PCM16 format:
- Sample Rate: 16kHz (16000 Hz) - This is a backend requirement
- Channels: Mono (single channel)
- Format: PCM16 (16-bit signed integer, little-endian)
- Byte Order: Little-endian
Audio Data Format:
- Each sample is 2 bytes (16-bit)
- Audio data must be provided as an ArrayBuffer. If your data is in a Uint8Array, it must be converted to an ArrayBuffer (e.g., using slice().buffer)
- For example: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
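If your capture pipeline produces Web Audio Float32 samples (range -1 to 1), they must be converted to PCM16 little-endian before sending. The conversion below is a standard technique sketched for this doc, not an SDK utility.

```typescript
// Convert Float32 samples (-1..1) to a PCM16 little-endian ArrayBuffer.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2) // 2 bytes per 16-bit sample
  const view = new DataView(buffer)
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])) // clamp to the valid range
    // Scale asymmetrically so -1 maps to -32768 and +1 maps to 32767
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true) // little-endian
  }
  return buffer
}

// Usage:
//   avatarView.avatarController.send(floatToPcm16(micSamples16k), false)
```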
Resampling:
- If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you must resample it to 16kHz before sending to the SDK
- For high-quality resampling, we recommend using the Web Audio API's OfflineAudioContext with anti-aliasing filtering
- See example projects for resampling implementation
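OfflineAudioContext (recommended above) is browser-only, so for illustration here is a minimal linear-interpolation resampler. It has no anti-aliasing filter, so treat it as a fallback sketch rather than a production-quality substitute for the Web Audio API approach.

```typescript
// Resample a Float32 signal from one sample rate to another using linear
// interpolation (no anti-aliasing; for sketching/testing only).
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const outLength = Math.round(input.length * toRate / fromRate)
  const output = new Float32Array(outLength)
  const ratio = fromRate / toRate
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio
    const i0 = Math.floor(pos)
    const i1 = Math.min(i0 + 1, input.length - 1)
    const frac = pos - i0
    output[i] = input[i0] * (1 - frac) + input[i1] * frac // linear interpolation
  }
  return output
}

// Usage: downsample 48 kHz capture to the 16 kHz the SDK requires:
//   const samples16k = resampleLinear(samples48k, 48000, 16000)
```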
Audio Application Example
// ⚠️ Important: Audio must be 16kHz mono PCM16 format
// If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
const audioUint8 = new Uint8Array(1024) // Example: 16kHz PCM16 audio data (512 samples = 1024 bytes)
const audioData = audioUint8.slice().buffer // Convert to ArrayBuffer
// Send audio data (end=false means continue sending for current conversation)
avatarView.avatarController.send(audioData, false) // Returns conversationId, but not required in SDK mode
// Note: SDK will start returning animation data and playing once it collects enough audio data, no need to wait for end=true
// Call when audio ends (end=true marks the end of current conversation round)
avatarView.avatarController.send(new ArrayBuffer(0), true) // end=true marks the end of current conversation round
// ⚠️ Note: After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
// Interrupt conversation
avatarView.avatarController.interrupt()
Resource Management
// Clean up view resources
avatarView.dispose()
// If you no longer need the digital human SDK, clean up global resources
AvatarKit.cleanup() // Clean up SDK global resources (only call if you no longer need the entire SDK)
When disposing of AvatarView instances, you must call dispose() to properly clean up resources. Failing to do so may cause resource leaks and rendering errors.
Error Handling
import { SPAvatarError } from '@spatialwalk/avatarkit'
try {
await avatarView.avatarController.start()
} catch (error) {
if (error instanceof SPAvatarError) {
console.error('SDK Error:', error.message, error.code)
}
}
// Or use error callback
avatarView.avatarController.onError = (error) => {
console.error('AvatarController error:', error)
}
Browser Compatibility
| Browser | Minimum Version | Rendering Backend | Notes |
|---|---|---|---|
| Chrome | 90+ | WebGPU (Recommended) / WebGL | Best Support |
| Edge | 90+ | WebGPU (Recommended) / WebGL | Best Support |
| Firefox | 90+ | WebGL | Limited WebGPU Support |
| Safari | 14+ | WebGL | Limited WebGPU Support |
| iOS Safari | 14+ | WebGL | Limited WebGPU Support |
| Android Chrome | Android 8+ | WebGL | Limited WebGPU Support |
WebGPU requires newer browser versions. SDK will automatically detect and fallback to WebGL.
- SDK automatically selects the best rendering backend (WebGPU/WebGL)
- Resources are automatically cached, no need to re-download on repeated loads
- WASM memory is automatically managed by SDK, no manual release needed
- Supports dynamic loading and unloading of resources
- Recommended to use modern browsers for best performance