ElevenLabsSTT
ElevenLabs STT provider for real-time streaming speech-to-text via WebSocket.
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:317
Remarks
This provider establishes a WebSocket connection to the ElevenLabs Scribe V2 real-time STT API (or a proxy server). Audio chunks are base64-encoded and sent as JSON input_audio_chunk messages. Transcription results are received as partial_transcript and committed_transcript messages.
The lifecycle is:
- Construct with ElevenLabsSTTConfig
- Call initialize() to validate configuration and build the WebSocket URL
- Call connect() to open the WebSocket and wait for session_started
- Call sendAudio() to stream audio chunks for transcription
- Call disconnect() to close the WebSocket
- Call dispose() to release all resources
Audio flow: Microphone -> AudioCapture -> sendAudio(chunk) -> WebSocket -> Scribe V2
Example
import { ElevenLabsSTT } from 'composite-voice';
const stt = new ElevenLabsSTT({
  proxyUrl: 'http://localhost:3001/api/proxy/elevenlabs',
  model: 'scribe_v2_realtime',
  commitStrategy: 'vad',
  audioFormat: 'pcm_16000',
});
await stt.initialize();
await stt.connect();
stt.onTranscription((result) => {
  if (result.isFinal) {
    console.log('Final:', result.text);
  } else {
    console.log('Partial:', result.text);
  }
});
// Stream audio chunks from microphone...
stt.sendAudio(audioChunk);
await stt.disconnect();
See
- LiveSTTProvider - The base class this provider extends.
- ElevenLabsSTTConfig - Configuration options for this provider.
- WebSocketManager - The WebSocket manager used for connection handling.
Extends
LiveSTTProvider
Constructors
Constructor
new ElevenLabsSTT(config, logger?): ElevenLabsSTT;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:331
Creates a new ElevenLabsSTT provider instance.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | ElevenLabsSTTConfig | Configuration for the ElevenLabs STT provider. |
| logger? | Logger | Optional logger instance for debug and diagnostic output. |
Returns
ElevenLabsSTT
Overrides
LiveSTTProvider.constructor
Properties
| Property | Modifier | Type | Default value | Description | Overrides | Inherited from | Defined in |
|---|---|---|---|---|---|---|---|
| config | public | ElevenLabsSTTConfig | undefined | STT-specific provider configuration. | LiveSTTProvider.config | - | src/providers/stt/elevenlabs/ElevenLabsSTT.ts:318 |
| initialized | protected | boolean | false | Tracks whether initialize has completed successfully. | - | LiveSTTProvider.initialized | src/providers/base/BaseProvider.ts:97 |
| logger | protected | Logger | undefined | Scoped logger instance for this provider. | - | LiveSTTProvider.logger | src/providers/base/BaseProvider.ts:94 |
| roles | readonly | readonly ProviderRole[] | undefined | STT providers cover the 'stt' pipeline role by default. | - | LiveSTTProvider.roles | src/providers/base/BaseSTTProvider.ts:77 |
| transcriptionCallback? | protected | (result) => void | undefined | Callback registered by the SDK or consumer to receive transcription results. Set via onTranscription. | - | LiveSTTProvider.transcriptionCallback | src/providers/base/BaseSTTProvider.ts:86 |
| type | readonly | ProviderType | undefined | Communication transport this provider uses ('rest' or 'websocket'). | - | LiveSTTProvider.type | src/providers/base/BaseProvider.ts:74 |
Accessors
isProxyMode
Get Signature
protected get isProxyMode(): boolean;
Defined in: src/providers/base/BaseProvider.ts:286
Whether the provider is in proxy mode.
Returns
boolean
true when proxyUrl is set.
Inherited from
LiveSTTProvider.isProxyMode
Methods
assertAuth()
protected assertAuth(): void;
Defined in: src/providers/base/BaseProvider.ts:272
Validate that auth is configured (either apiKey or proxyUrl).
Returns
void
Remarks
Call this in onInitialize() for any provider that requires external authentication. Native providers (NativeSTT, NativeTTS) and in-browser providers (WebLLM) should NOT call this method.
Throws
ProviderInitializationError Thrown when neither apiKey nor proxyUrl is set.
Inherited from
LiveSTTProvider.assertAuth
assertReady()
protected assertReady(): void;
Defined in: src/providers/base/BaseProvider.ts:255
Guard that throws if the provider has not been initialized.
Returns
void
Remarks
Call at the start of any method that requires the provider to be ready.
Throws
Error Thrown with a descriptive message when initialized is false.
Inherited from
LiveSTTProvider.assertReady
connect()
connect(): Promise<void>;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:479
Opens a WebSocket connection and waits for the session_started message.
Returns
Promise<void>
Remarks
Establishes a WebSocket connection to the ElevenLabs real-time STT endpoint (or proxy). The connect promise does not resolve until the server sends a session_started message, ensuring the session is fully initialized before audio streaming begins.
Auto-reconnect is disabled because each STT session is stateful and cannot be resumed after disconnection.
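The wait-for-handshake behavior described above can be sketched as follows. This is a minimal illustration, not the provider's actual code: the `MessageSource` interface, the `waitForSessionStarted` helper, and the `message_type` field name are all assumptions made for the example, and a plain Error stands in for ProviderConnectionError.

```typescript
// Minimal structural type for the sketch; a test double satisfies it.
interface MessageSource {
  addEventListener(type: 'message', listener: (event: { data: string }) => void): void;
  removeEventListener(type: 'message', listener: (event: { data: string }) => void): void;
}

// Resolves once a session_started message arrives, rejects after timeoutMs.
// The message_type field name is an assumption for illustration.
function waitForSessionStarted(socket: MessageSource, timeoutMs: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => {
      socket.removeEventListener('message', onMessage);
      reject(new Error(`No session_started received within ${timeoutMs} ms`));
    }, timeoutMs);

    function onMessage(event: { data: string }): void {
      let parsed: { message_type?: string } | null;
      try {
        parsed = JSON.parse(event.data);
      } catch {
        return; // Ignore frames that are not valid JSON.
      }
      if (parsed?.message_type === 'session_started') {
        clearTimeout(timer);
        socket.removeEventListener('message', onMessage);
        resolve();
      }
    }

    socket.addEventListener('message', onMessage);
  });
}
```

Tying the promise to the handshake message rather than to the socket's open event means callers cannot start streaming audio before the server has set up the session.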
Throws
ProviderConnectionError if the connection fails or times out.
Overrides
LiveSTTProvider.connect
disconnect()
disconnect(): Promise<void>;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:777
Disconnects from the ElevenLabs WebSocket.
Returns
Promise<void>
Remarks
Sends a final commit message to flush any buffered audio, waits briefly for remaining transcription results, then gracefully closes the WebSocket connection and releases the WebSocketManager instance.
Throws
Rethrows any error that occurs during disconnection.
Overrides
LiveSTTProvider.disconnect
dispose()
dispose(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:154
Clean up resources and dispose of the provider.
Returns
Promise<void>
Remarks
Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.
Throws
Re-throws any error raised by onDispose.
Inherited from
LiveSTTProvider.dispose
emitTranscription()
protected emitTranscription(result): void;
Defined in: src/providers/base/BaseSTTProvider.ts:206
Emit a transcription result to the registered callback.
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to emit. |
Returns
void
Remarks
Subclasses call this method whenever transcribed text is available. If no callback has been registered via onTranscription, the result is logged as a warning and dropped.
Inherited from
LiveSTTProvider.emitTranscription
getConfig()
getConfig(): STTProviderConfig;
Defined in: src/providers/base/BaseSTTProvider.ts:225
Get a shallow copy of the current STT configuration.
Returns
A new STTProviderConfig object.
Inherited from
LiveSTTProvider.getConfig
initialize()
initialize(): Promise<void>;
Defined in: src/providers/base/BaseProvider.ts:127
Initialize the provider, making it ready for use.
Returns
Promise<void>
Remarks
Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.
Throws
ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.
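The idempotent initialize-plus-hook pattern described here can be sketched like this. The class names `ProviderSketch` and `InitializationErrorSketch` are hypothetical stand-ins, not the library's classes, and the exact error message format is an assumption.

```typescript
// Hypothetical stand-in for ProviderInitializationError.
class InitializationErrorSketch extends Error {
  providerName: string;
  original: unknown;
  constructor(providerName: string, original: unknown) {
    super(`Failed to initialize ${providerName}`);
    this.providerName = providerName;
    this.original = original;
  }
}

abstract class ProviderSketch {
  protected initialized = false;

  // Runs the subclass hook once; later calls are no-ops.
  async initialize(): Promise<void> {
    if (this.initialized) return;
    try {
      await this.onInitialize();
    } catch (error) {
      // Wrap the original error with the concrete class name for diagnostics.
      throw new InitializationErrorSketch(this.constructor.name, error);
    }
    this.initialized = true;
  }

  isReady(): boolean {
    return this.initialized;
  }

  protected abstract onInitialize(): Promise<void>;
}
```

Setting the flag only after the hook succeeds keeps a failed initialization retryable.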
Inherited from
LiveSTTProvider.initialize
isFinal()
isFinal(result): boolean;
Defined in: src/providers/base/BaseSTTProvider.ts:174
Is this a final segment (but not necessarily utterance-complete)?
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to check. |
Returns
boolean
true when this is a final segment.
Remarks
A final segment represents committed text, but multi-segment providers (e.g., Deepgram) may emit several final segments for a single utterance. Only the last one will have isUtteranceComplete return true.
Inherited from
LiveSTTProvider.isFinal
isInterim()
isInterim(result): boolean;
Defined in: src/providers/base/BaseSTTProvider.ts:159
Is this an interim (partial, non-final) result?
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to check. |
Returns
boolean
true when this is an interim result.
Remarks
Interim results update as the user speaks and are replaced by subsequent results. Useful for display but not for triggering downstream processing.
Inherited from
LiveSTTProvider.isInterim
isPreflight()
isPreflight(result): boolean;
Defined in: src/providers/base/BaseSTTProvider.ts:144
Is this a preflight/eager end-of-turn signal?
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to check. |
Returns
boolean
true when this is a preflight signal.
Remarks
Used by the eager LLM pipeline for speculative generation. Only providers with preflight support (e.g., Deepgram Flux) need to override this.
Inherited from
LiveSTTProvider.isPreflight
isReady()
isReady(): boolean;
Defined in: src/providers/base/BaseProvider.ts:178
Check whether the provider has been initialized and is ready.
Returns
boolean
true when initialize has completed successfully and dispose has not yet been called.
Inherited from
LiveSTTProvider.isReady
isUtteranceComplete()
isUtteranceComplete(result): boolean;
Defined in: src/providers/base/BaseSTTProvider.ts:129
Is this result a complete utterance ready for LLM processing?
Parameters
| Parameter | Type | Description |
|---|---|---|
| result | TranscriptionResult | The transcription result to check. |
Returns
boolean
true when the utterance is complete.
Remarks
The orchestrator calls this to decide when to send transcribed text to the LLM. Concrete providers override this when they have domain-specific endpointing logic (e.g., DeepgramSTT checks speech_final).
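To illustrate how the result predicates can work together, here is a hedged sketch of a consumer that ignores interim results, accumulates final segments, and emits the joined text once the utterance is complete. `TranscriptLike`, `ResultPredicates`, and `createUtteranceCollector` are hypothetical names; only the `text` and `isFinal` fields appear in the example at the top of this page, and the orchestrator's real logic may differ.

```typescript
// Only the fields used here; the real TranscriptionResult may carry more.
interface TranscriptLike {
  text: string;
  isFinal: boolean;
}

// The subset of provider predicates this sketch relies on.
interface ResultPredicates<R> {
  isFinal(result: R): boolean;
  isUtteranceComplete(result: R): boolean;
}

// Returns a callback suitable for passing to onTranscription().
function createUtteranceCollector<R extends TranscriptLike>(
  predicates: ResultPredicates<R>,
  onUtterance: (text: string) => void,
): (result: R) => void {
  const segments: string[] = [];
  return (result) => {
    // Interim results are display-only, so they are not accumulated.
    if (!predicates.isFinal(result)) return;
    segments.push(result.text);
    // A multi-segment provider may emit several finals per utterance.
    if (predicates.isUtteranceComplete(result)) {
      onUtterance(segments.join(' '));
      segments.length = 0;
    }
  };
}
```

Because the collector takes the predicates as an argument, the same callback shape works for a provider whose final segment always completes the utterance and for one that emits several finals first.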
Inherited from
LiveSTTProvider.isUtteranceComplete
isWebSocketConnected()
isWebSocketConnected(): boolean;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:835
Checks whether the WebSocket connection to ElevenLabs is currently active.
Returns
boolean
true if the WebSocket is connected, false otherwise.
onConfigUpdate()
protected onConfigUpdate(_config): void;
Defined in: src/providers/base/BaseProvider.ts:242
Hook called after updateConfig merges new values.
Parameters
| Parameter | Type | Description |
|---|---|---|
| _config | Partial<BaseProviderConfig> | The partial configuration that was merged. |
Returns
void
Remarks
The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).
Inherited from
LiveSTTProvider.onConfigUpdate
onDispose()
protected onDispose(): Promise<void>;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:376
Disposes the provider, disconnecting and releasing resources.
Returns
Promise<void>
Overrides
LiveSTTProvider.onDispose
onInitialize()
protected onInitialize(): Promise<void>;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:346
Validates configuration and builds the WebSocket URL with query parameters.
Logs a debug warning if none of apiKey, token, or proxyUrl is configured.
Returns
Promise<void>
Overrides
LiveSTTProvider.onInitialize
onTranscription()
onTranscription(callback): void;
Defined in: src/providers/base/BaseSTTProvider.ts:191
Register a callback to receive transcription results.
Parameters
| Parameter | Type | Description |
|---|---|---|
| callback | (result) => void | Function invoked with each TranscriptionResult. |
Returns
void
Remarks
All STT providers — regardless of transport — deliver text through this callback. CompositeVoice registers it during pipeline setup so that transcription results flow into the conversation manager and, ultimately, the LLM provider.
Inherited from
LiveSTTProvider.onTranscription
processAudio()
processAudio(chunk): void;
Defined in: src/providers/base/LiveSTTProvider.ts:140
Process a raw audio chunk by sending it over the WebSocket.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunk | ArrayBuffer | Raw audio data as an ArrayBuffer. |
Returns
void
Remarks
Legacy alias for sendAudio. Delegates to sendAudioToSocket.
Inherited from
LiveSTTProvider.processAudio
resolveApiKey()
protected resolveApiKey(): string;
Defined in: src/providers/base/BaseProvider.ts:325
Resolve the API key for this provider.
Returns
string
The configured API key, or 'proxy' in proxy mode.
Remarks
Returns 'proxy' in proxy mode so that SDK clients (which require a non-empty API key string) can be instantiated without the real key.
Inherited from
LiveSTTProvider.resolveApiKey
resolveAuthHeader()
protected resolveAuthHeader(defaultAuthType?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:366
Resolve Authorization header value for the configured auth type.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string | undefined
The Authorization header value, or undefined in proxy mode.
Remarks
Returns the header value for REST or server-side WebSocket connections:
- 'token' → 'Token <apiKey>'
- 'bearer' → 'Bearer <apiKey>'
Returns undefined in proxy mode.
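The mapping above can be sketched as a small pure function. This is an illustration under assumptions: the `AuthConfigSketch` shape (in particular the `authType` field) and the choice to return undefined when no API key is set are guesses, not confirmed behavior of BaseProvider.

```typescript
type AuthTypeSketch = 'token' | 'bearer';

// Hypothetical config shape for the sketch.
interface AuthConfigSketch {
  apiKey?: string;
  proxyUrl?: string;
  authType?: AuthTypeSketch;
}

function resolveAuthHeaderSketch(
  config: AuthConfigSketch,
  defaultAuthType: AuthTypeSketch = 'token',
): string | undefined {
  // Proxy mode: the proxy injects credentials, so the client sends none.
  if (config.proxyUrl !== undefined || config.apiKey === undefined) {
    return undefined;
  }
  const authType = config.authType ?? defaultAuthType;
  return authType === 'bearer' ? `Bearer ${config.apiKey}` : `Token ${config.apiKey}`;
}
```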
Inherited from
LiveSTTProvider.resolveAuthHeader
resolveBaseUrl()
protected resolveBaseUrl(defaultUrl?): string | undefined;
Defined in: src/providers/base/BaseProvider.ts:307
Resolve the base URL for this provider.
Parameters
| Parameter | Type | Description |
|---|---|---|
| defaultUrl? | string | The provider’s default API URL. Pass undefined to let the underlying SDK use its own default. |
Returns
string | undefined
The resolved URL, or undefined when all sources are unset.
Remarks
Priority: proxyUrl > endpoint > defaultUrl.
For WebSocket providers (this.type === 'websocket'), the proxy URL’s http(s) scheme is automatically converted to ws(s).
When no URL is configured and defaultUrl is undefined, the return value is undefined — this lets SDK-based providers (Anthropic, OpenAI) fall back to their own built-in defaults.
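The priority order and scheme conversion can be sketched as follows. `UrlConfigSketch` and `resolveBaseUrlSketch` are hypothetical names, and the sketch passes the transport type as a parameter instead of reading `this.type`.

```typescript
// Hypothetical config shape; proxyUrl and endpoint are the names used in the remarks above.
interface UrlConfigSketch {
  proxyUrl?: string;
  endpoint?: string;
}

function resolveBaseUrlSketch(
  config: UrlConfigSketch,
  type: 'rest' | 'websocket',
  defaultUrl?: string,
): string | undefined {
  if (config.proxyUrl !== undefined) {
    // http -> ws and https -> wss; URLs already using ws(s) are left unchanged.
    return type === 'websocket' ? config.proxyUrl.replace(/^http/, 'ws') : config.proxyUrl;
  }
  // Falls through to undefined when nothing is configured.
  return config.endpoint ?? defaultUrl;
}
```

For example, with `type` set to 'websocket', a proxy URL of 'https://example.com/api/proxy' resolves to 'wss://example.com/api/proxy'.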
Inherited from
LiveSTTProvider.resolveBaseUrl
resolveWsProtocols()
protected resolveWsProtocols(defaultAuthType?): string[] | undefined;
Defined in: src/providers/base/BaseProvider.ts:343
Resolve WebSocket subprotocol for authentication.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| defaultAuthType | "token" \| "bearer" | 'token' | The default auth type for this provider. |
Returns
string[] | undefined
Subprotocol array for new WebSocket(url, protocols), or undefined.
Remarks
Returns the subprotocol array for direct mode based on authType:
- 'token' → ['token', apiKey] (Deepgram default)
- 'bearer' → ['bearer', apiKey] (OAuth/Bearer tokens)
Returns undefined in proxy mode (no client-side auth needed).
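A minimal sketch of the subprotocol mapping follows. The `WsAuthConfigSketch` shape and the handling of a missing API key are assumptions made for illustration.

```typescript
type WsAuthTypeSketch = 'token' | 'bearer';

// Hypothetical config shape for the sketch.
interface WsAuthConfigSketch {
  apiKey?: string;
  proxyUrl?: string;
  authType?: WsAuthTypeSketch;
}

function resolveWsProtocolsSketch(
  config: WsAuthConfigSketch,
  defaultAuthType: WsAuthTypeSketch = 'token',
): string[] | undefined {
  // Proxy mode needs no client-side auth, so no subprotocols are sent.
  if (config.proxyUrl !== undefined || config.apiKey === undefined) {
    return undefined;
  }
  return [config.authType ?? defaultAuthType, config.apiKey];
}
```

The result is meant to be passed as the second argument of `new WebSocket(url, protocols)`. Passing credentials this way is a common workaround for browser WebSocket constructors, which do not accept custom request headers.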
Inherited from
LiveSTTProvider.resolveWsProtocols
sendAudio()
sendAudio(chunk): void;
Defined in: src/providers/base/LiveSTTProvider.ts:128
Send an audio chunk for real-time transcription.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunk | ArrayBuffer | Raw audio data as an ArrayBuffer. |
Returns
void
Remarks
This is the public method required by the ILiveSTTProvider interface. It delegates to sendAudioToSocket, which subclasses implement to forward audio data over the WebSocket connection. For providers that manage their own audio (e.g. NativeSTT), sendAudioToSocket is a no-op.
Inherited from
LiveSTTProvider.sendAudio
sendAudioToSocket()
protected sendAudioToSocket(chunk): void;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:678
Sends a raw audio chunk to ElevenLabs for real-time transcription.
Parameters
| Parameter | Type | Description |
|---|---|---|
| chunk | ArrayBuffer | Raw audio data as an ArrayBuffer. |
Returns
void
Remarks
The audio ArrayBuffer is base64-encoded and sent as an input_audio_chunk JSON message. The previous_text context field is included only on the first audio chunk (when configured), to provide transcription context without sending it repeatedly.
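The encoding step can be sketched as below. The JSON field names (`message_type`, `audio_base_64`, `commit`, `previous_text`) are assumptions for illustration and should be checked against the ElevenLabs API reference; `ChunkEncoderSketch` is a hypothetical helper, not part of this provider.

```typescript
// Assumed wire format; field names are not confirmed by this page.
interface InputAudioChunkMessage {
  message_type: 'input_audio_chunk';
  audio_base_64: string;
  commit: boolean;
  previous_text?: string;
}

// Base64-encode an ArrayBuffer using the global btoa().
function arrayBufferToBase64(buffer: ArrayBuffer): string {
  let binary = '';
  new Uint8Array(buffer).forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  return btoa(binary);
}

class ChunkEncoderSketch {
  private firstChunkSent = false;
  private previousText: string | undefined;

  constructor(previousText?: string) {
    this.previousText = previousText;
  }

  // Returns the JSON string to send over the WebSocket for one chunk.
  encode(chunk: ArrayBuffer): string {
    const message: InputAudioChunkMessage = {
      message_type: 'input_audio_chunk',
      audio_base_64: arrayBufferToBase64(chunk),
      commit: false,
    };
    // Context is attached to the first chunk only, to avoid resending it.
    if (!this.firstChunkSent && this.previousText !== undefined) {
      message.previous_text = this.previousText;
    }
    this.firstChunkSent = true;
    return JSON.stringify(message);
  }
}
```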
Overrides
LiveSTTProvider.sendAudioToSocket
sendCommit()
sendCommit(): void;
Defined in: src/providers/stt/elevenlabs/ElevenLabsSTT.ts:739
Manually commits the current audio buffer, finalizing any partial transcript.
Returns
void
Remarks
This method is only meaningful when commitStrategy is 'manual'. In manual mode, the server buffers incoming audio and emits partial_transcript messages but does not finalize (commit) the transcript until instructed. Calling sendCommit() sends an empty input_audio_chunk with commit: true, telling the server to produce a committed_transcript for the buffered audio.
When commitStrategy is 'vad', this method is a no-op because the server automatically commits based on silence detection.
Example
const stt = new ElevenLabsSTT({
  proxyUrl: 'http://localhost:3001/api/proxy/elevenlabs',
  commitStrategy: 'manual',
});
// After streaming audio, manually commit to get final transcript:
stt.sendCommit();
updateConfig()
updateConfig(config): void;
Defined in: src/providers/base/BaseProvider.ts:201
Merge partial configuration updates into the current config.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config | Partial<BaseProviderConfig> | A partial configuration object whose keys will overwrite existing values. |
Returns
void
Remarks
After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
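The merge-then-notify behavior can be sketched as follows. `ConfigHolderSketch` is a hypothetical class, and the shallow-merge semantics shown here are an assumption based on the description above.

```typescript
class ConfigHolderSketch<C extends object> {
  protected config: C;

  constructor(config: C) {
    this.config = { ...config };
  }

  // Shallow-merge the partial update, then notify the subclass hook.
  updateConfig(partial: Partial<C>): void {
    this.config = { ...this.config, ...partial };
    this.onConfigUpdate(partial);
  }

  // Returns a shallow copy so callers cannot mutate internal state.
  getConfig(): C {
    return { ...this.config };
  }

  // Default hook is a no-op; subclasses override it to react to changes.
  protected onConfigUpdate(_partial: Partial<C>): void {}
}
```

A subclass could override `onConfigUpdate` to reconnect when a connection-related key changes, for example.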
Inherited from
LiveSTTProvider.updateConfig