CompositeVoiceConfig

Main configuration type for the CompositeVoice SDK.

Defined in: src/core/types/config.ts:480

Remarks

This is the top-level configuration object passed to the CompositeVoice constructor. It accepts a flat providers array containing all pipeline providers, plus optional settings for queue buffering, reconnection, logging, turn-taking, conversation history, I/O context, eager LLM, pipeline backpressure, tool use, error recovery, and custom extensions.

The providers array replaces the old { stt, llm, tts } pattern, enabling multi-role providers (e.g., NativeSTT covering both 'input' and 'stt' roles) and explicit audio I/O providers (e.g., MicrophoneInput, BrowserAudioOutput).
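
As a rough illustration of how a flat array can be mapped onto pipeline slots, the sketch below resolves the five roles from each provider's roles list. It is not the SDK's implementation: `SketchProvider`, `resolveRoles`, the first-match-wins rule, and the error raised for partially covered pairs are all assumptions made for this example. Only the role names, the required `llm` role, and the native fallbacks come from this page.

```typescript
// Hypothetical sketch: none of these names are exported by the SDK.
type Role = 'input' | 'stt' | 'llm' | 'tts' | 'output';

interface SketchProvider {
  name: string;
  roles: Role[];
}

const ALL_ROLES: Role[] = ['input', 'stt', 'llm', 'tts', 'output'];

function resolveRoles(providers: SketchProvider[]): Record<Role, string> {
  const slots: Partial<Record<Role, string>> = {};
  for (const provider of providers) {
    for (const role of provider.roles) {
      // Assumption: the first provider that declares a role keeps the slot.
      if (slots[role] === undefined) slots[role] = provider.name;
    }
  }
  // Documented rule: the llm role is always required.
  if (slots.llm === undefined) {
    throw new Error("A provider covering the 'llm' role is required");
  }
  // Documented fallbacks: native providers fill a pair that is fully uncovered.
  if (slots.input === undefined && slots.stt === undefined) {
    slots.input = 'NativeSTT';
    slots.stt = 'NativeSTT';
  }
  if (slots.tts === undefined && slots.output === undefined) {
    slots.tts = 'NativeTTS';
    slots.output = 'NativeTTS';
  }
  // Assumption: a half-covered pair (e.g. input without stt) is an error.
  const missing = ALL_ROLES.filter((role) => slots[role] === undefined);
  if (missing.length > 0) {
    throw new Error(`Uncovered roles: ${missing.join(', ')}`);
  }
  return slots as Record<Role, string>;
}

// Separate input and STT providers, with TTS and output left to the fallback.
const resolved = resolveRoles([
  { name: 'MicrophoneInput', roles: ['input'] },
  { name: 'DeepgramSTT', roles: ['stt'] },
  { name: 'AnthropicLLM', roles: ['llm'] },
]);
console.log(resolved);
```

In this run the `tts` and `output` slots are both filled by the native fallback because neither was declared by any provider in the array.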

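All optional fields have defaults that the SDK applies automatically. For example, logging, reconnection, and turnTaking fall back to DEFAULT_LOGGING_CONFIG, DEFAULT_RECONNECTION_CONFIG, and DEFAULT_TURN_TAKING_CONFIG when not specified.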

Examples

// Minimal setup: native STT and TTS providers around an Anthropic LLM.
import { CompositeVoice, NativeSTT, AnthropicLLM, NativeTTS } from 'composite-voice';

const agent = new CompositeVoice({
  providers: [
    new NativeSTT(),
    new AnthropicLLM({ proxyUrl: '/api/proxy/anthropic', model: 'claude-haiku-4-5' }),
    new NativeTTS(),
  ],
  conversationHistory: { enabled: true, maxTurns: 10 },
  logging: { enabled: true, level: 'debug' },
});

// Explicit pipeline: a separate provider for each role, with tuned queue sizes.
import {
  CompositeVoice,
  MicrophoneInput,
  DeepgramSTT,
  AnthropicLLM,
  DeepgramTTS,
  BrowserAudioOutput,
} from 'composite-voice';

const agent = new CompositeVoice({
  providers: [
    new MicrophoneInput({ sampleRate: 16000 }),
    new DeepgramSTT({ proxyUrl: '/api/proxy/deepgram' }),
    new AnthropicLLM({ proxyUrl: '/api/proxy/anthropic', model: 'claude-haiku-4-5' }),
    new DeepgramTTS({ proxyUrl: '/api/proxy/deepgram' }),
    new BrowserAudioOutput(),
  ],
  queue: {
    input: { maxSize: 2000 },
    output: { maxSize: 500 },
  },
});
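
The optional fields documented under Properties can be combined with either provider setup above. The fragment below is a sketch that uses only field names from this page. The specific values, such as a limit of 8 pending chunks, are arbitrary choices for illustration and not recommended defaults.

```typescript
// Illustrative values only. These keys would sit next to `providers` in the
// object passed to the CompositeVoice constructor.
const optionalSettings = {
  // Describe the I/O modality to the LLM in prose instead of frontmatter.
  ioContext: { enabled: true, format: 'prose' as const, includeFormatGuidance: true },
  // Pause LLM generation once 8 text chunks are waiting for a Live TTS provider.
  pipeline: { maxPendingChunks: 8 },
  // Attempt automatic recovery from provider errors before propagating them.
  autoRecover: true,
};
console.log(optionalSettings);
```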

See

BaseProvider for the provider interface that all providers implement
ReconnectionConfig for WebSocket reconnection settings
TurnTakingConfig for turn-taking behavior
EagerLLMConfig for speculative generation settings
ConversationHistoryConfig for multi-turn history
AudioBufferQueueConfig for queue configuration

Properties

Property	Type	Description	Defined in
`autoRecover?`	`boolean`	Whether to enable automatic error recovery. Remarks When `true`, the SDK attempts to recover from provider errors automatically (e.g., reinitializing a crashed provider) instead of propagating the error immediately.	src/core/types/config.ts:625
`conversationHistory?`	`ConversationHistoryConfig`	Conversation history configuration. Remarks When enabled, previous turns are sent to the LLM as context for multi-turn conversations. See ConversationHistoryConfig	src/core/types/config.ts:558
`eagerLLM?`	`EagerLLMConfig`	Eager LLM configuration. Remarks When enabled, the LLM starts speculatively on STT preflight events, reducing speech-to-first-token latency. See EagerLLMConfig	src/core/types/config.ts:594
`extra?`	`Record`<`string`, `unknown`>	Additional custom configuration for provider-specific or application-specific needs. Remarks This catch-all record allows you to pass arbitrary data through the configuration without extending the type. Providers can read from this via the config object.	src/core/types/config.ts:664
`ioContext?`	{ `enabled?`: `boolean`; `format?`: `"frontmatter"` \| `"prose"`; `includeFormatGuidance?`: `boolean`; }	I/O context configuration. Remarks Controls the system message prepended to every LLM request that tells the model about the current input/output modality (voice vs text). - `enabled` (default: `true`): include the I/O context message - `format` (`'frontmatter'` \| `'prose'`, default: `'frontmatter'`): how the context is formatted in the system message - `includeFormatGuidance` (default: `true`): include instructions about response format (no markdown for voice, etc.) The frontmatter format uses YAML-style `---` delimiters so the context is structured and easy to inspect in conversation history.
`ioContext.enabled?`	`boolean`	Whether to include I/O context in LLM requests.	src/core/types/config.ts:578
`ioContext.format?`	`"frontmatter"` \| `"prose"`	Format for the context message.	src/core/types/config.ts:580
`ioContext.includeFormatGuidance?`	`boolean`	Include response format guidance (no markdown, etc.).	src/core/types/config.ts:582
`logging?`	`LoggingConfig`	Logging configuration. Remarks Defaults to DEFAULT_LOGGING_CONFIG when not specified. See LoggingConfig	src/core/types/config.ts:537
`pipeline?`	{ `maxPendingChunks?`: `number`; }	Pipeline tuning options. Remarks Controls flow between pipeline stages. Currently supports backpressure between LLM and Live TTS providers.	src/core/types/config.ts:603
`pipeline.maxPendingChunks?`	`number`	Maximum text chunks buffered between LLM and TTS before pausing LLM generation. Remarks Only applies to Live (WebSocket) TTS providers. REST TTS receives the full response at once and is unaffected. When not set, no backpressure is applied (default behavior).	src/core/types/config.ts:614
`providers`	`BaseProvider`[]	Array of provider instances for the voice pipeline. Remarks Each provider declares its roles property indicating which pipeline slots it covers. The SDK resolves the 5-role pipeline (`input`, `stt`, `llm`, `tts`, `output`) from this array: - Multi-role providers (e.g., `NativeSTT` with `roles: ['input', 'stt']`) cover multiple slots with a single instance. - Single-role providers (e.g., `MicrophoneInput` with `roles: ['input']`) cover exactly one slot. - The `llm` role is always required. - When `input`+`stt` are uncovered, defaults to `NativeSTT()`. - When `tts`+`output` are uncovered, defaults to `NativeTTS()`. See BaseProvider for the interface all providers implement	src/core/types/config.ts:499
`queue?`	{ `input?`: `Partial`<`AudioBufferQueueConfig`>; `output?`: `Partial`<`AudioBufferQueueConfig`>; }	Queue configuration for input and output audio buffer queues. Remarks When separate input and STT providers are used (e.g., `MicrophoneInput` + `DeepgramSTT`), an `AudioBufferQueue` buffers audio between them to prevent frame loss during STT connection. Similarly for TTS + output. This config lets you tune queue sizes and overflow behavior. See AudioBufferQueueConfig	src/core/types/config.ts:512
`queue.input?`	`Partial`<`AudioBufferQueueConfig`>	Configuration overrides for the input→STT buffer queue.	src/core/types/config.ts:514
`queue.output?`	`Partial`<`AudioBufferQueueConfig`>	Configuration overrides for the TTS→output buffer queue.	src/core/types/config.ts:516
`reconnection?`	`ReconnectionConfig`	WebSocket reconnection configuration. Remarks Defaults to DEFAULT_RECONNECTION_CONFIG when not specified. See ReconnectionConfig	src/core/types/config.ts:527
`recovery?`	`RecoveryStrategy`	Recovery strategy configuration for automatic error recovery. Remarks Only applies when `autoRecover` is `true`. Controls the backoff behavior when the SDK attempts to recover from provider errors. See RecoveryStrategy	src/core/types/config.ts:654
`tools?`	{ `definitions`: `LLMToolDefinition`[]; `onToolCall`: (`toolCall`) => `Promise`<`LLMToolResult`>; }	Tool use configuration for LLM function calling. Remarks When provided, the LLM can invoke tools during generation. Text output is streamed to TTS as usual, while tool calls are handled via the `onToolCall` callback. After tool execution, the LLM is called again with the tool result to generate a natural language follow-up. Requires the LLM provider to implement `ToolAwareLLMProvider`.	src/core/types/config.ts:638
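`tools.definitions`	`LLMToolDefinition`[]	Tool definitions that the LLM can invoke during generation.	src/core/types/config.ts:639
`tools.onToolCall`	(`toolCall`) => `Promise`<`LLMToolResult`>	Callback that handles a tool call and resolves with the result passed back to the LLM.	src/core/types/config.ts:640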
`turnTaking?`	`TurnTakingConfig`	Turn-taking behavior configuration. Remarks Defaults to DEFAULT_TURN_TAKING_CONFIG when not specified. See TurnTakingConfig	src/core/types/config.ts:547
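
To make the `pipeline.maxPendingChunks` behavior more concrete, here is a minimal sketch of backpressure between a text producer and a slower consumer. The `ChunkGate` class and its members are hypothetical and are not part of the SDK. The sketch only illustrates the documented idea: generation pauses once the number of buffered chunks reaches the limit, and no limit applies when the option is unset.

```typescript
// Hypothetical helper, not an SDK class.
class ChunkGate {
  private pending = 0;
  private readonly maxPendingChunks: number | undefined;

  constructor(maxPendingChunks?: number) {
    this.maxPendingChunks = maxPendingChunks;
  }

  // True when the producer (LLM side) should stop emitting chunks for now.
  get shouldPause(): boolean {
    return this.maxPendingChunks !== undefined && this.pending >= this.maxPendingChunks;
  }

  // Record a chunk handed to the buffer between the two stages.
  push(): void {
    this.pending += 1;
  }

  // Record a chunk taken by the consumer (TTS side).
  drain(): void {
    if (this.pending > 0) this.pending -= 1;
  }
}

const gate = new ChunkGate(2);
gate.push();
gate.push();
console.log(gate.shouldPause); // true: two chunks are pending and the limit is 2
gate.drain();
console.log(gate.shouldPause); // false: one chunk is pending
```

A gate constructed without a limit never asks the producer to pause, which mirrors the documented default of no backpressure.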