DeepgramSTT

Deepgram real-time STT provider using native WebSocket (no SDK required).

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:263

Remarks

DeepgramSTT extends LiveSTTProvider and connects to Deepgram’s V1 WebSocket streaming transcription API directly. It supports:

Real-time interim and final transcription results
Multi-segment utterance buffering (accumulates is_final segments until speech_final to deliver a complete utterance)
Proxy mode via DeepgramSTTConfig.proxyUrl (recommended for production so the API key stays server-side)
All V1 query parameters (model, language, punctuate, smart_format, etc.)
Keep-alive and finalize (flush) methods
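
The multi-segment buffering in the list above can be pictured roughly as follows. This is a minimal sketch rather than the provider's actual code: `Segment` and `UtteranceBuffer` are hypothetical names, and only the `is_final` and `speech_final` flags are taken from the description above.

```typescript
// Hypothetical segment shape: only the fields this sketch needs.
interface Segment {
  transcript: string;
  is_final: boolean;
  speech_final: boolean;
}

// Accumulates finalized segments and releases them as one utterance
// once a segment arrives with speech_final set.
class UtteranceBuffer {
  private parts: string[] = [];

  // Returns the complete utterance on speech_final, otherwise null.
  push(segment: Segment): string | null {
    if (!segment.is_final) return null; // interim results are never buffered
    const text = segment.transcript.trim();
    if (text !== '') this.parts.push(text);
    if (!segment.speech_final) return null; // utterance still in progress
    const utterance = this.parts.join(' ');
    this.parts = [];
    return utterance;
  }

  // Drop any partially accumulated utterance (e.g. on disconnect).
  reset(): void {
    this.parts = [];
  }
}
```

Joining with a single space is an assumption of this sketch; the real provider may merge segments differently.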

Transport: Native WebSocket (no @deepgram/sdk required)

Browser support: All modern browsers (Chrome, Firefox, Safari, Edge).

Auth in browser: Direct mode uses WebSocket subprotocol ["token", apiKey]; proxy mode omits auth (the proxy injects the key via HTTP headers).

Data flow:

Microphone -> AudioCapture -> sendAudio(chunk) -> Deepgram WebSocket
                                                      |
CompositeVoice <- onTranscription(result) <----------+

Example

import { DeepgramSTT } from 'composite-voice';

const stt = new DeepgramSTT({
  proxyUrl: 'http://localhost:3001/api/proxy/deepgram',
  language: 'en-US',
  interimResults: true,
  options: {
    model: 'nova-3',
    smartFormat: true,
    punctuation: true,
  },
});

await stt.initialize();

stt.onTranscription((result) => {
  if (result.isFinal && result.speechFinal) {
    console.log('Complete utterance:', result.text);
  }
});

await stt.connect();
// ... send audio chunks via stt.sendAudio(chunk) ...
stt.sendFinalize(); // flush pending audio
await stt.disconnect();

See

LiveSTTProvider for the base WebSocket STT class
DeepgramSTTConfig for configuration options
DeepgramTranscriptionOptions for transcription parameters

Extends

LiveSTTProvider

Constructors

Constructor

new DeepgramSTT(config, logger?): DeepgramSTT;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:324

Create a new DeepgramSTT provider.

Parameters

Parameter	Type	Description
`config`	`DeepgramSTTConfig`	Deepgram STT configuration. Must include either `apiKey` or `proxyUrl`.
`logger?`	`Logger`	Optional parent logger; a child will be derived.

Returns

DeepgramSTT

Example

const stt = new DeepgramSTT({
  apiKey: 'dg_abc123...',
  options: { model: 'nova-3' },
});

Overrides

LiveSTTProvider.constructor

Properties

Property	Modifier	Type	Default value	Description	Overrides	Inherited from	Defined in
`config`	`public`	`DeepgramSTTConfig`	`undefined`	STT-specific provider configuration.	`LiveSTTProvider.config`	-	src/providers/stt/deepgram/DeepgramSTT.ts:264
`initialized`	`protected`	`boolean`	`false`	Tracks whether initialize has completed successfully.	-	`LiveSTTProvider.initialized`	src/providers/base/BaseProvider.ts:97
`logger`	`protected`	`Logger`	`undefined`	Scoped logger instance for this provider.	-	`LiveSTTProvider.logger`	src/providers/base/BaseProvider.ts:94
`roles`	`readonly`	readonly `ProviderRole`[]	`undefined`	STT providers cover the `'stt'` pipeline role by default.	-	`LiveSTTProvider.roles`	src/providers/base/BaseSTTProvider.ts:77
`transcriptionCallback?`	`protected`	(`result`) => `void`	`undefined`	Callback registered by the SDK or consumer to receive transcription results. Set via onTranscription.	-	`LiveSTTProvider.transcriptionCallback`	src/providers/base/BaseSTTProvider.ts:86
`type`	`readonly`	`ProviderType`	`undefined`	Communication transport this provider uses (`'rest'` or `'websocket'`).	-	`LiveSTTProvider.type`	src/providers/base/BaseProvider.ts:74

Accessors

isProxyMode

Get Signature

get protected isProxyMode(): boolean;

Defined in: src/providers/base/BaseProvider.ts:286

Whether the provider is in proxy mode.

Returns

boolean

true when proxyUrl is set.

Inherited from

LiveSTTProvider.isProxyMode

Methods

assertAuth()

protected assertAuth(): void;

Defined in: src/providers/base/BaseProvider.ts:272

Validate that auth is configured (either apiKey or proxyUrl).

Returns

void

Remarks

Call this in onInitialize() for any provider that requires external authentication. Native providers (NativeSTT, NativeTTS) and in-browser providers (WebLLM) should NOT call this method.

Throws

ProviderInitializationError Thrown when neither apiKey nor proxyUrl is set.

Inherited from

LiveSTTProvider.assertAuth

assertReady()

protected assertReady(): void;

Defined in: src/providers/base/BaseProvider.ts:255

Guard that throws if the provider has not been initialized.

Returns

void

Remarks

Call at the start of any method that requires the provider to be ready.

Throws

Error Thrown with a descriptive message when initialized is false.

Inherited from

LiveSTTProvider.assertReady

connect()

connect(): Promise<void>;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:447

Open a WebSocket connection to Deepgram for real-time transcription.

Returns

Promise<void>

Remarks

Builds the connection URL with all query parameters, creates a native WebSocket, and waits for the open event before resolving.

In direct mode, auth is sent via WebSocket subprotocol ["token", apiKey] (the standard Deepgram browser auth mechanism). In proxy mode, no auth is sent — the proxy injects the real API key via HTTP headers.
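
As a rough illustration of the URL-building step, the sketch below assembles a streaming URL from query parameters. `buildListenUrl` is a hypothetical helper, not part of this class; the endpoint shown is Deepgram's public V1 streaming endpoint, and the parameter names follow the V1 query parameters mentioned earlier.

```typescript
type QueryValue = string | number | boolean | undefined;

// Hypothetical helper: appends every defined parameter to the base URL.
function buildListenUrl(baseUrl: string, params: Record<string, QueryValue>): string {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const listenUrl = buildListenUrl('wss://api.deepgram.com/v1/listen', {
  model: 'nova-3',
  language: 'en-US',
  punctuate: true,
  smart_format: true,
  interim_results: true,
  endpointing: undefined, // undefined values are skipped
});

// Direct mode (browser): auth travels as a WebSocket subprotocol.
//   const socket = new WebSocket(listenUrl, ['token', apiKey]);
// Proxy mode: no subprotocol; the proxy adds the key server-side.
//   const socket = new WebSocket(proxyWsUrl);
```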

Throws

ProviderConnectionError Thrown when the provider is not initialized, or when the connection attempt times out or fails.

Overrides

LiveSTTProvider.connect

disconnect()

disconnect(): Promise<void>;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:835

Gracefully close the Deepgram WebSocket connection.

Returns

Promise<void>

Remarks

Sends a CloseStream control message for graceful server-side cleanup, then closes the WebSocket. Waits up to 1 second for the close event before force-resolving. Resets the utterance buffer and internal state.
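
The close sequence described above might look something like the sketch below. It assumes a minimal socket surface (`SocketLike`, a hypothetical interface covering a few browser WebSocket methods) and a hypothetical helper name, `closeGracefully`; the buffer and state reset are omitted.

```typescript
// Hypothetical minimal socket surface used by this sketch.
interface SocketLike {
  send(data: string): void;
  close(): void;
  addEventListener(type: 'close', listener: () => void): void;
}

// Sends CloseStream, closes the socket, and resolves on the close event
// or after timeoutMs, whichever comes first.
function closeGracefully(socket: SocketLike, timeoutMs = 1000): Promise<void> {
  return new Promise<void>((resolve) => {
    // Force-resolve if the close event never arrives.
    const timer = setTimeout(() => resolve(), timeoutMs);
    socket.addEventListener('close', () => {
      clearTimeout(timer);
      resolve();
    });
    socket.send(JSON.stringify({ type: 'CloseStream' }));
    socket.close();
  });
}
```

Registering the listener before calling `close()` avoids missing a close event that fires immediately.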

Throws

Re-throws any unexpected error during disconnection.

Overrides

LiveSTTProvider.disconnect

dispose()

dispose(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:154

Clean up resources and dispose of the provider.

Returns

Promise<void>

Remarks

Delegates to the subclass hook onDispose and resets the initialized flag. If the provider is not initialized, the call is a no-op.

Throws

Re-throws any error raised by onDispose.

Inherited from

LiveSTTProvider.dispose

emitTranscription()

protected emitTranscription(result): void;

Defined in: src/providers/base/BaseSTTProvider.ts:206

Emit a transcription result to the registered callback.

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to emit.

Returns

void

Remarks

Subclasses call this method whenever transcribed text is available. If no callback has been registered via onTranscription, the result is logged as a warning and dropped.
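
The register-then-emit pattern described here can be sketched as below. `TranscriptionEmitter` is a hypothetical stand-in, not the base class itself, and the `warn` parameter is an assumption added so the example is easy to test.

```typescript
type TranscriptionCallback<T> = (result: T) => void;

// Hypothetical stand-in for the callback plumbing in the base STT class.
class TranscriptionEmitter<T> {
  private callback?: TranscriptionCallback<T>;

  onTranscription(callback: TranscriptionCallback<T>): void {
    this.callback = callback;
  }

  // Delivers the result, or warns and drops it when nobody is listening.
  // Returns whether the result was delivered.
  emit(result: T, warn: (message: string) => void = console.warn): boolean {
    if (this.callback === undefined) {
      warn('No transcription callback registered; dropping result');
      return false;
    }
    this.callback(result);
    return true;
  }
}
```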

Inherited from

LiveSTTProvider.emitTranscription

getConfig()

getConfig(): STTProviderConfig;

Defined in: src/providers/base/BaseSTTProvider.ts:225

Get a shallow copy of the current STT configuration.

Returns

STTProviderConfig

A new STTProviderConfig object.

Inherited from

LiveSTTProvider.getConfig

initialize()

initialize(): Promise<void>;

Defined in: src/providers/base/BaseProvider.ts:127

Initialize the provider, making it ready for use.

Returns

Promise<void>

Remarks

Calls the subclass hook onInitialize. If the provider has already been initialized the call is a no-op.

Throws

ProviderInitializationError Thrown when onInitialize rejects. The original error is wrapped with the provider class name for diagnostics.

Inherited from

LiveSTTProvider.initialize

isFinal()

isFinal(result): boolean;

Defined in: src/providers/base/BaseSTTProvider.ts:174

Is this a final segment (but not necessarily utterance-complete)?

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to check.

Returns

boolean

true when this is a final segment.

Remarks

A final segment represents committed text, but multi-segment providers (e.g., Deepgram) may emit several final segments for a single utterance. Only the last one will have isUtteranceComplete return true.

Inherited from

LiveSTTProvider.isFinal

isInterim()

isInterim(result): boolean;

Defined in: src/providers/base/BaseSTTProvider.ts:159

Is this an interim (partial, non-final) result?

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to check.

Returns

boolean

true when this is an interim result.

Remarks

Interim results update as the user speaks and are replaced by subsequent results. Useful for display but not for triggering downstream processing.

Inherited from

LiveSTTProvider.isInterim

isPreflight()

isPreflight(result): boolean;

Defined in: src/providers/base/BaseSTTProvider.ts:144

Is this a preflight/eager end-of-turn signal?

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to check.

Returns

boolean

true when this is a preflight signal.

Remarks

Used by the eager LLM pipeline for speculative generation. Only providers with preflight support (e.g., Deepgram Flux) need to override this.

Inherited from

LiveSTTProvider.isPreflight

isReady()

isReady(): boolean;

Defined in: src/providers/base/BaseProvider.ts:178

Check whether the provider has been initialized and is ready.

Returns

boolean

true when initialize has completed successfully and dispose has not yet been called.

Inherited from

LiveSTTProvider.isReady

isUtteranceComplete()

isUtteranceComplete(result): boolean;

Defined in: src/providers/base/BaseSTTProvider.ts:129

Is this result a complete utterance ready for LLM processing?

Parameters

Parameter	Type	Description
`result`	`TranscriptionResult`	The transcription result to check.

Returns

boolean

true when the utterance is complete.

Remarks

The orchestrator calls this to decide when to send transcribed text to the LLM. Concrete providers override this when they have domain-specific endpointing logic (e.g., DeepgramSTT checks speech_final).
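
To show how the classification helpers might be combined, here is a small routing sketch. `ResultLike`, `Predicates`, `route`, and `deepgramLike` are hypothetical; the predicate bodies are assumptions modeled on the `isFinal` and `speechFinal` fields used in the class example, not the provider's real logic.

```typescript
// Hypothetical minimal result shape; the real TranscriptionResult may differ.
interface ResultLike {
  text: string;
  isFinal: boolean;
  speechFinal?: boolean;
}

// Stand-ins for the provider's predicate methods.
interface Predicates {
  isInterim(result: ResultLike): boolean;
  isFinal(result: ResultLike): boolean;
  isUtteranceComplete(result: ResultLike): boolean;
}

type Action = 'display' | 'commit' | 'send-to-llm';

// Checks the most specific condition first: a complete utterance is also final.
function route(predicates: Predicates, result: ResultLike): Action {
  if (predicates.isUtteranceComplete(result)) return 'send-to-llm';
  if (predicates.isFinal(result)) return 'commit';
  return 'display';
}

// Assumed behavior for a Deepgram-style provider.
const deepgramLike: Predicates = {
  isInterim: (r) => !r.isFinal,
  isFinal: (r) => r.isFinal,
  isUtteranceComplete: (r) => r.isFinal && r.speechFinal === true,
};
```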

Inherited from

LiveSTTProvider.isUtteranceComplete

isWebSocketConnected()

isWebSocketConnected(): boolean;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:892

Check whether the Deepgram WebSocket connection is currently open.

Returns

boolean

true when connected and ready to receive audio.

onConfigUpdate()

protected onConfigUpdate(_config): void;

Defined in: src/providers/base/BaseProvider.ts:242

Hook called after updateConfig merges new values.

Parameters

Parameter	Type	Description
`_config`	`Partial`<`BaseProviderConfig`>	The partial configuration that was merged.

Returns

void

Remarks

The default implementation is a no-op. Override in subclasses to react to runtime configuration changes (e.g. reconnect with a new API key).

Inherited from

LiveSTTProvider.onConfigUpdate

onDispose()

protected onDispose(): Promise<void>;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:363

Disconnect the WebSocket (if connected) and release resources.

Returns

Promise<void>

Overrides

LiveSTTProvider.onDispose

onInitialize()

protected onInitialize(): Promise<void>;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:339

Validate configuration — no SDK import required.

Returns

Promise<void>

Throws

ProviderInitializationError Thrown when neither apiKey nor proxyUrl is configured.

Overrides

LiveSTTProvider.onInitialize

onTranscription()

onTranscription(callback): void;

Defined in: src/providers/base/BaseSTTProvider.ts:191

Register a callback to receive transcription results.

Parameters

Parameter	Type	Description
`callback`	(`result`) => `void`	Function invoked with each TranscriptionResult.

Returns

void

Remarks

All STT providers — regardless of transport — deliver text through this callback. CompositeVoice registers it during pipeline setup so that transcription results flow into the conversation manager and, ultimately, the LLM provider.

Inherited from

LiveSTTProvider.onTranscription

processAudio()

processAudio(chunk): void;

Defined in: src/providers/base/LiveSTTProvider.ts:140

Process a raw audio chunk by sending it over the WebSocket.

Parameters

Parameter	Type	Description
`chunk`	`ArrayBuffer`	Raw audio data as an `ArrayBuffer`.

Returns

void

Remarks

Legacy alias for sendAudio. Delegates to sendAudioToSocket.

Inherited from

LiveSTTProvider.processAudio

resolveApiKey()

protected resolveApiKey(): Promise<string>;

Defined in: src/providers/base/BaseProvider.ts:321

Resolve the API key, calling the factory if apiKey is a function.

Returns

Promise<string>

The resolved API key string, or 'proxy' in proxy mode.

Inherited from

LiveSTTProvider.resolveApiKey

resolveAuthHeader()

protected resolveAuthHeader(defaultAuthType?): Promise<string | undefined>;

Defined in: src/providers/base/BaseProvider.ts:366

Resolve Authorization header value for the configured auth type.

Parameters

Parameter	Type	Default value	Description
`defaultAuthType`	`"token"` | `"bearer"`	`'token'`	The default auth type for this provider.

Returns

Promise<string | undefined>

The Authorization header value, or undefined in proxy mode.

Remarks

If apiKey is a factory function it is called to get a fresh token. Returns the header value for REST or server-side WebSocket connections:

'token' → 'Token <apiKey>'
'bearer' → 'Bearer <apiKey>'

Returns undefined in proxy mode.
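
The mapping above could be sketched as follows. The function names and the `ApiKeySource` factory signature are assumptions for illustration; only the `'Token <apiKey>'` and `'Bearer <apiKey>'` formats and the proxy-mode behavior come from the remarks.

```typescript
type AuthType = 'token' | 'bearer';
// Assumed shape: a literal key, or a factory returning one (possibly async).
type ApiKeySource = string | (() => string | Promise<string>);

function formatAuthHeader(authType: AuthType, key: string): string {
  return authType === 'bearer' ? `Bearer ${key}` : `Token ${key}`;
}

async function resolveAuthHeaderSketch(
  apiKey: ApiKeySource | undefined,
  proxyMode: boolean,
  authType: AuthType = 'token',
): Promise<string | undefined> {
  // In proxy mode the proxy supplies credentials, so no header is built.
  if (proxyMode || apiKey === undefined) return undefined;
  // Call the factory on every resolution so short-lived tokens stay fresh.
  const key = typeof apiKey === 'function' ? await apiKey() : apiKey;
  return formatAuthHeader(authType, key);
}
```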

Inherited from

LiveSTTProvider.resolveAuthHeader

resolveBaseUrl()

protected resolveBaseUrl(defaultUrl?): string | undefined;

Defined in: src/providers/base/BaseProvider.ts:307

Resolve the base URL for this provider.

Parameters

Parameter	Type	Description
`defaultUrl?`	`string`	The provider’s default API URL. Pass `undefined` to let the underlying SDK use its own default.

Returns

string | undefined

The resolved URL, or undefined when all sources are unset.

Remarks

Priority: proxyUrl > endpoint > defaultUrl.

For WebSocket providers (this.type === 'websocket'), the proxy URL’s http(s) scheme is automatically converted to ws(s).

When no URL is configured and defaultUrl is undefined, the return value is undefined — this lets SDK-based providers (Anthropic, OpenAI) fall back to their own built-in defaults.
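
The priority order and scheme conversion can be sketched like this. `resolveBaseUrlSketch` and `UrlConfig` are hypothetical names; the field names `proxyUrl` and `endpoint` follow the remarks above.

```typescript
interface UrlConfig {
  proxyUrl?: string;
  endpoint?: string;
}

// Priority: proxyUrl > endpoint > defaultUrl.
function resolveBaseUrlSketch(
  config: UrlConfig,
  type: 'rest' | 'websocket',
  defaultUrl?: string,
): string | undefined {
  if (config.proxyUrl !== undefined) {
    // For WebSocket providers: http -> ws, https -> wss.
    return type === 'websocket'
      ? config.proxyUrl.replace(/^http/, 'ws')
      : config.proxyUrl;
  }
  return config.endpoint ?? defaultUrl;
}
```

With the proxy URL from the class example, a WebSocket provider would end up with `ws://localhost:3001/api/proxy/deepgram`.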

Inherited from

LiveSTTProvider.resolveBaseUrl

resolveWsProtocols()

protected resolveWsProtocols(defaultAuthType?): Promise<string[] | undefined>;

Defined in: src/providers/base/BaseProvider.ts:342

Resolve WebSocket subprotocol for authentication.

Parameters

Parameter	Type	Default value	Description
`defaultAuthType`	`"token"` | `"bearer"`	`'token'`	The default auth type for this provider.

Returns

Promise<string[] | undefined>

Subprotocol array for new WebSocket(url, protocols), or undefined.

Remarks

If apiKey is a factory function it is called to get a fresh token. Returns the subprotocol array for direct mode based on authType:

'token' → ['token', apiKey] (Deepgram default)
'bearer' → ['bearer', apiKey] (OAuth/Bearer tokens)

Returns undefined in proxy mode (no client-side auth needed).
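
A minimal sketch of the subprotocol selection, assuming the key has already been resolved to a string. `wsProtocolsSketch` is a hypothetical helper, not the inherited method.

```typescript
type WsAuthType = 'token' | 'bearer';

// Returns the subprotocol pair for direct mode, or undefined in proxy mode.
function wsProtocolsSketch(
  authType: WsAuthType,
  apiKey: string,
  proxyMode: boolean,
): string[] | undefined {
  if (proxyMode) return undefined; // the proxy authenticates server-side
  return [authType, apiKey];
}

// Browser usage: the second WebSocket constructor argument accepts the array,
// and passing undefined is equivalent to omitting it.
//   const socket = new WebSocket(url, wsProtocolsSketch('token', apiKey, false));
```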

Inherited from

LiveSTTProvider.resolveWsProtocols

sendAudio()

sendAudio(chunk): void;

Defined in: src/providers/base/LiveSTTProvider.ts:128

Send an audio chunk for real-time transcription.

Parameters

Parameter	Type	Description
`chunk`	`ArrayBuffer`	Raw audio data as an `ArrayBuffer`.

Returns

void

Remarks

This is the public method required by the ILiveSTTProvider interface. It delegates to sendAudioToSocket, which subclasses implement to forward audio data over the WebSocket connection. For providers that manage their own audio (e.g. NativeSTT), sendAudioToSocket is a no-op.

Inherited from

LiveSTTProvider.sendAudio

sendAudioToSocket()

protected sendAudioToSocket(chunk): void;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:768

Send a raw audio chunk to Deepgram for real-time transcription.

Parameters

Parameter	Type	Description
`chunk`	`ArrayBuffer`	Raw audio data captured from the microphone.

Returns

void

Remarks

Sends the audio data as a binary WebSocket frame. If the connection is not open, the chunk is silently dropped and a warning is logged.

Called by the base class’s LiveSTTProvider.sendAudio method.
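
The drop-when-closed behavior can be sketched as below. `AudioSocket` and `sendChunk` are hypothetical; the value `1` is the standard `WebSocket.OPEN` ready state.

```typescript
// Hypothetical minimal socket surface for sending binary audio.
interface AudioSocket {
  readyState: number;
  send(data: ArrayBuffer): void;
}

const WS_OPEN = 1; // same value as WebSocket.OPEN

// Sends the chunk as a binary frame, or warns and drops it if the socket is
// missing or not open. Returns whether the chunk was sent.
function sendChunk(
  socket: AudioSocket | null,
  chunk: ArrayBuffer,
  warn: (message: string) => void = console.warn,
): boolean {
  if (socket === null || socket.readyState !== WS_OPEN) {
    warn('WebSocket not open; dropping audio chunk');
    return false;
  }
  socket.send(chunk);
  return true;
}
```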

Overrides

LiveSTTProvider.sendAudioToSocket

sendFinalize()

sendFinalize(): void;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:811

Send a finalize signal to flush any pending audio and force a final transcription result from Deepgram.

Returns

void

Remarks

Sends a { "type": "Finalize" } JSON message. This tells Deepgram to process any buffered audio and return a final result. Useful before disconnecting or when you need an immediate result.

sendKeepAlive()

sendKeepAlive(): void;

Defined in: src/providers/stt/deepgram/DeepgramSTT.ts:788

Send a keep-alive signal to prevent the WebSocket from timing out.

Returns

void

Remarks

Sends a { "type": "KeepAlive" } JSON message. Useful for long pauses where no audio is being sent but the connection should remain open.
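
One way to use this during silence is a periodic timer, sketched below. `startKeepAlive` and `KeepAliveTarget` are hypothetical, and the 5-second default interval is an assumption; pick a value shorter than the server's idle timeout.

```typescript
// Anything exposing sendKeepAlive(), such as a DeepgramSTT instance.
interface KeepAliveTarget {
  sendKeepAlive(): void;
}

// Starts a periodic keep-alive and returns a function that stops it.
function startKeepAlive(target: KeepAliveTarget, intervalMs = 5000): () => void {
  const timer = setInterval(() => target.sendKeepAlive(), intervalMs);
  return () => clearInterval(timer);
}

// Usage sketch:
//   const stopKeepAlive = startKeepAlive(stt);
//   ... later, when audio resumes or before disconnecting ...
//   stopKeepAlive();
```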

updateConfig()

updateConfig(config): void;

Defined in: src/providers/base/BaseProvider.ts:201

Merge partial configuration updates into the current config.

Parameters

Parameter	Type	Description
`config`	`Partial`<`BaseProviderConfig`>	A partial configuration object whose keys will overwrite existing values.

Returns

void

Remarks

After merging, the subclass hook onConfigUpdate is called so providers can react to changed values at runtime.
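
The merge-then-notify flow can be sketched as follows. `ConfigHolder` and `LoggingHolder` are hypothetical classes written for illustration and are not the library's base provider.

```typescript
// Hypothetical holder showing the shallow merge followed by the hook call.
class ConfigHolder<C extends object> {
  constructor(protected config: C) {}

  updateConfig(partial: Partial<C>): void {
    this.config = { ...this.config, ...partial }; // later keys overwrite earlier ones
    this.onConfigUpdate(partial);
  }

  // Default hook is a no-op; subclasses override it to react to changes.
  protected onConfigUpdate(_partial: Partial<C>): void {}

  // Returns a shallow copy so callers cannot mutate internal state.
  getConfig(): C {
    return { ...this.config };
  }
}

interface DemoConfig {
  language: string;
  interimResults: boolean;
}

// Example subclass that records which keys changed.
class LoggingHolder extends ConfigHolder<DemoConfig> {
  changedKeys: string[] = [];

  protected onConfigUpdate(partial: Partial<DemoConfig>): void {
    this.changedKeys.push(...Object.keys(partial));
  }
}
```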

Inherited from

LiveSTTProvider.updateConfig