AudioCapture

Manages microphone audio capture using the Web Audio API.

Defined in: src/core/audio/AudioCapture.ts:128

Remarks

AudioCapture encapsulates the browser’s getUserMedia, AudioContext, and audio processing APIs to provide a simple start/stop interface for microphone capture. It supports pause/resume, configurable sample rates, echo cancellation, noise suppression, and automatic gain control.

When starting, the class attempts to use AudioWorkletNode for off-main-thread audio processing. If AudioWorklet is unavailable (e.g. in older browsers or jsdom test environments), it transparently falls back to the deprecated ScriptProcessorNode.
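
The fallback decision can be pictured as a plain feature check. The sketch below is an illustration only, not the class's actual code: `pickProcessingPath`, `ContextLike`, and `ProcessingPath` are hypothetical names, and the real implementation may also fall back when loading the worklet module fails.

```typescript
// Hypothetical structural type: only the part of AudioContext this sketch inspects.
interface ContextLike {
  audioWorklet?: { addModule(url: string): Promise<void> };
}

type ProcessingPath = 'worklet' | 'script-processor';

// Assumption: support is detected by checking for `audioWorklet` on the context
// and for a global AudioWorkletNode constructor (passed in as a boolean here
// so the function can be exercised outside a browser).
function pickProcessingPath(
  ctx: ContextLike,
  hasWorkletNodeCtor: boolean,
): ProcessingPath {
  const supported =
    hasWorkletNodeCtor &&
    ctx.audioWorklet !== undefined &&
    typeof ctx.audioWorklet.addModule === 'function';
  return supported ? 'worklet' : 'script-processor';
}
```

In a browser, the second argument would typically be `typeof AudioWorkletNode !== 'undefined'`.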

Audio data is delivered as ArrayBuffer chunks to the callback passed to start(). The chunk size depends on the processing path: the worklet delivers 128-frame render quanta, while the ScriptProcessorNode fallback uses a power-of-two buffer size derived from AudioInputConfig.chunkDuration (default 100 ms).
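
To make the fallback buffer size concrete, here is a minimal sketch of one way a power-of-two size could be derived from a sample rate and a chunk duration. The function name and the rounding direction (up to the next power of two) are assumptions; the source only states that the size is a power of two. The 256 to 16384 range is the set of sizes that `createScriptProcessor()` accepts.

```typescript
// Smallest and largest buffer sizes accepted by createScriptProcessor().
const MIN_BUFFER_SIZE = 256;
const MAX_BUFFER_SIZE = 16384;

// Assumption: the frame count is rounded UP to the next power of two,
// then clamped to the valid ScriptProcessorNode range.
function scriptProcessorBufferSize(sampleRate: number, chunkDurationMs: number): number {
  const frames = Math.ceil((sampleRate * chunkDurationMs) / 1000);
  let size = MIN_BUFFER_SIZE;
  while (size < frames && size < MAX_BUFFER_SIZE) {
    size *= 2;
  }
  return size;
}

console.log(scriptProcessorBufferSize(16000, 100)); // 2048 (1600 frames rounded up)
console.log(scriptProcessorBufferSize(48000, 100)); // 8192 (4800 frames rounded up)
```

Under these assumptions, a 100 ms chunk at 16 kHz is delivered as 2048 frames (128 ms), so callers should not rely on chunks matching chunkDuration exactly.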

The class tracks its own state via AudioCaptureState values: 'inactive', 'starting', 'active', 'paused', and 'stopping'.
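
The five states can be read as a small state machine. The transition table below is an assumption inferred from the documented method behavior (for example, pause() only acts in 'active'); it is not taken from the source code, and `canTransition` is a hypothetical helper.

```typescript
type AudioCaptureState = 'inactive' | 'starting' | 'active' | 'paused' | 'stopping';

// Assumed transitions, inferred from the method descriptions in this reference.
const TRANSITIONS: Record<AudioCaptureState, readonly AudioCaptureState[]> = {
  inactive: ['starting'],                      // start()
  starting: ['active', 'inactive', 'stopping'], // success, failure, or stop() mid-start
  active: ['paused', 'stopping'],              // pause() or stop()
  paused: ['active', 'stopping'],              // resume() or stop()
  stopping: ['inactive'],                      // resources released
};

function canTransition(from: AudioCaptureState, to: AudioCaptureState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

console.log(canTransition('active', 'paused'));   // true
console.log(canTransition('inactive', 'paused')); // false
```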

Example

import { AudioCapture } from './AudioCapture';

// `logger` and `sttProvider` are assumed to be defined elsewhere in the application.
const capture = new AudioCapture({ sampleRate: 16000 }, logger);

// Start capturing microphone audio
await capture.start((audioData) => {
  // Process or send the PCM audio chunk
  sttProvider.sendAudio(audioData);
});

console.log(capture.isCapturing()); // true

// Pause during TTS playback
capture.pause();

// Resume after playback ends
await capture.resume();

// Stop capture and release resources
await capture.stop();

See

AudioPlayer for audio playback.
AudioInputConfig for available configuration options.

Constructors

Constructor

new AudioCapture(config?, logger?): AudioCapture;

Defined in: src/core/audio/AudioCapture.ts:152

Creates a new AudioCapture instance.

Parameters

Parameter	Type	Description
`config?`	`Partial<AudioInputConfig>`	Partial audio input configuration. Missing fields are filled from DEFAULT_AUDIO_INPUT_CONFIG.
`logger?`	`Logger`	Optional Logger instance. A child logger named `'AudioCapture'` is created if provided.

Returns

AudioCapture

Remarks

The instance starts in the 'inactive' state. Call start() to begin capturing audio.

Methods

checkPermission()

checkPermission(): Promise<"granted" | "denied" | "prompt">;

Defined in: src/core/audio/AudioCapture.ts:209

Queries the current microphone permission status without prompting the user.

Returns

Promise<"granted" | "denied" | "prompt">

The permission state: 'granted', 'denied', or 'prompt'.

Remarks

Uses the Permissions API (navigator.permissions.query). If the Permissions API is not available in the current browser, this method returns 'prompt' as a safe fallback.
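
The fallback behavior can be sketched as follows. This is a hedged illustration rather than the library's implementation: `checkMicrophonePermission` and `NavigatorLike` are hypothetical names, the navigator is injected so the logic can run outside a browser, and treating a rejected query as 'prompt' is an assumption.

```typescript
type PermissionResult = 'granted' | 'denied' | 'prompt';

// Hypothetical structural type covering only what this sketch uses.
interface NavigatorLike {
  permissions?: {
    query(descriptor: { name: string }): Promise<{ state: string }>;
  };
}

async function checkMicrophonePermission(nav: NavigatorLike): Promise<PermissionResult> {
  if (!nav.permissions) {
    return 'prompt'; // Permissions API missing: the user may still need to be asked
  }
  try {
    const status = await nav.permissions.query({ name: 'microphone' });
    const state = status.state;
    if (state === 'granted' || state === 'denied') {
      return state;
    }
    return 'prompt';
  } catch {
    // Assumption: a rejected query (for example, an unrecognized permission name)
    // is handled the same way as a missing Permissions API.
    return 'prompt';
  }
}
```

In a browser, this would be called as `checkMicrophonePermission(navigator)`.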

dispose()

dispose(): Promise<void>;

Defined in: src/core/audio/AudioCapture.ts:669

Releases all resources by stopping capture. Safe to call from any state.

Returns

Promise<void>

Remarks

This is a convenience method equivalent to calling stop(). It is intended for use in cleanup/teardown code where you want to ensure all resources are freed regardless of the current state.

getAudioContext()

getAudioContext(): AudioContext | null;

Defined in: src/core/audio/AudioCapture.ts:613

Returns the underlying AudioContext, if one has been created.

Returns

AudioContext | null

The active AudioContext, or null if none exists.

Remarks

This is intended for advanced use cases where direct access to the audio context is needed (e.g. creating custom audio processing nodes). Returns null if capture has not been started or has been stopped.

getConfig()

getConfig(): AudioInputConfig;

Defined in: src/core/audio/AudioCapture.ts:627

Returns a copy of the current audio input configuration.

Returns

AudioInputConfig

A copy of the current AudioInputConfig.

Remarks

The returned object is a shallow copy; modifying it does not affect the internal configuration. Use updateConfig() to change settings.
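
The copy semantics can be illustrated with a small stand-in class. `ConfigHolder` is hypothetical, the interface lists only the two fields named in this reference, and the default values are illustrative.

```typescript
// Hypothetical, reduced shape: only fields named in this reference.
interface AudioInputConfig {
  sampleRate: number;
  chunkDuration: number;
}

class ConfigHolder {
  // Illustrative defaults, not necessarily the library's.
  private config: AudioInputConfig = { sampleRate: 16000, chunkDuration: 100 };

  getConfig(): AudioInputConfig {
    return { ...this.config }; // shallow copy: top-level fields are duplicated
  }
}

const holder = new ConfigHolder();
const copy = holder.getConfig();
copy.sampleRate = 48000; // mutates the copy only

console.log(holder.getConfig().sampleRate); // 16000
```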

getState()

getState(): AudioCaptureState;

Defined in: src/core/audio/AudioCapture.ts:163

Returns the current capture state.

Returns

AudioCaptureState

The current AudioCaptureState ('inactive', 'starting', 'active', 'paused', or 'stopping').

isCapturing()

isCapturing(): boolean;

Defined in: src/core/audio/AudioCapture.ts:172

Checks whether the capture is actively recording audio.

Returns

boolean

true if the state is 'active', false otherwise.

isUsingWorklet()

isUsingWorklet(): boolean;

Defined in: src/core/audio/AudioCapture.ts:656

Returns whether the AudioWorklet path is being used for the current session.

Returns

boolean

true if AudioWorklet is in use, false otherwise.

Remarks

Returns false both when the ScriptProcessorNode fallback is in use and when capture is not active.

pause()

pause(): void;

Defined in: src/core/audio/AudioCapture.ts:505

Pauses audio capture by suspending the AudioContext.

Returns

void

Remarks

If the capture is not in the 'active' state, this method logs a warning and returns without error. While paused, no audio data is delivered to the callback. Call resume() to continue capturing.

requestPermission()

requestPermission(): Promise<boolean>;

Defined in: src/core/audio/AudioCapture.ts:187

Requests microphone permission from the user without starting capture.

Returns

Promise<boolean>

true if permission was granted, false if denied or unavailable.

Remarks

This method calls getUserMedia to trigger the browser’s permission prompt, then immediately stops the resulting stream. It is useful for pre-requesting permission during onboarding so that start() can succeed without a permission dialog.
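
A minimal sketch of this request-then-release pattern is shown below. The function and interface names are hypothetical, and the media devices object is injected so the logic can be exercised with a fake; the real method may differ in how it reports failures.

```typescript
// Hypothetical structural types covering only what this sketch uses.
interface TrackLike { stop(): void; }
interface StreamLike { getTracks(): TrackLike[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean }): Promise<StreamLike>;
}

async function requestMicrophonePermission(
  devices: MediaDevicesLike | undefined,
): Promise<boolean> {
  if (!devices) {
    return false; // Media Devices API unavailable
  }
  try {
    const stream = await devices.getUserMedia({ audio: true });
    // Release the microphone right away; only the permission grant was needed.
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch {
    return false; // denied, dismissed, or no usable input device
  }
}
```

In a browser, the argument would be `navigator.mediaDevices`.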

resume()

resume(): Promise<void>;

Defined in: src/core/audio/AudioCapture.ts:525

Resumes audio capture from a paused state.

Returns

Promise<void>

Remarks

If the capture is not in the 'paused' state, this method logs a warning and returns without error.
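
The state guards on pause() and resume() can be sketched as follows. `PauseResumeSketch` is a hypothetical stand-in: warnings are collected in an array instead of going to a logger, and the fire-and-forget call to suspend() is an assumption based on pause() returning void.

```typescript
type SketchState = 'inactive' | 'active' | 'paused';

// Hypothetical structural type for the parts of AudioContext used here.
interface SuspendableContext {
  suspend(): Promise<void>;
  resume(): Promise<void>;
}

class PauseResumeSketch {
  readonly warnings: string[] = [];
  private state: SketchState;
  private readonly context: SuspendableContext;

  constructor(initial: SketchState, context: SuspendableContext) {
    this.state = initial;
    this.context = context;
  }

  pause(): void {
    if (this.state !== 'active') {
      this.warnings.push(`pause() ignored in state '${this.state}'`);
      return;
    }
    void this.context.suspend(); // not awaited, since pause() is synchronous
    this.state = 'paused';
  }

  async resume(): Promise<void> {
    if (this.state !== 'paused') {
      this.warnings.push(`resume() ignored in state '${this.state}'`);
      return;
    }
    await this.context.resume();
    this.state = 'active';
  }

  getState(): SketchState {
    return this.state;
  }
}
```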

start()

start(callback): Promise<void>;

Defined in: src/core/audio/AudioCapture.ts:253

Starts capturing audio from the microphone.

Parameters

Parameter	Type	Description
`callback`	`AudioCaptureCallback`	The AudioCaptureCallback that receives audio data chunks as `ArrayBuffer`.

Returns

Promise<void>

Remarks

This method performs the following steps:

1. Requests microphone access via getUserMedia.
2. Creates an AudioContext at the configured sample rate.
3. Connects a MediaStreamAudioSourceNode through an AudioWorkletNode (preferred) or ScriptProcessorNode (fallback).
4. Delivers processed audio chunks to the provided callback.

If already capturing (state === 'active'), this method logs a warning and returns immediately without error.

Throws

AudioCaptureError if the Media Devices API is not supported by the browser.

Throws

MicrophonePermissionError if the user denies microphone access.

Throws

AudioCaptureError for any other failure during stream or context creation.

Example

await capture.start((audioData) => {
  websocket.send(audioData);
});
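
The split between MicrophonePermissionError and AudioCaptureError listed under Throws plausibly follows the error names that getUserMedia rejects with. The sketch below shows one way such a mapping could work; it is an assumption about the internals, `classifyGetUserMediaError` is a hypothetical helper, and it returns the error class name as a string rather than constructing the library's error types.

```typescript
type StartFailure = 'MicrophonePermissionError' | 'AudioCaptureError';

function classifyGetUserMediaError(err: unknown): StartFailure {
  const name =
    typeof err === 'object' && err !== null && 'name' in err
      ? String((err as { name: unknown }).name)
      : '';
  // getUserMedia rejects with a NotAllowedError when access is denied;
  // some older browsers used PermissionDeniedError instead.
  const denied = name === 'NotAllowedError' || name === 'PermissionDeniedError';
  return denied ? 'MicrophonePermissionError' : 'AudioCaptureError';
}

console.log(classifyGetUserMediaError({ name: 'NotAllowedError' })); // MicrophonePermissionError
console.log(classifyGetUserMediaError({ name: 'NotFoundError' }));   // AudioCaptureError
```

Callers of start() can then handle a permission denial (for example, by showing instructions for re-enabling the microphone) separately from other failures.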

stop()

stop(): Promise<void>;

Defined in: src/core/audio/AudioCapture.ts:549

Stops audio capture and releases all audio resources.

Returns

Promise<void>

Remarks

This method disconnects the audio processing graph, closes the AudioContext, stops all media stream tracks, and clears the callback. If already in the 'inactive' state, this method is a no-op.
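
The teardown order described above can be sketched with minimal stand-in types. `releaseResources` and the interfaces are hypothetical, and the real method may order or guard these steps differently.

```typescript
// Hypothetical structural types for the resources held during capture.
interface Disconnectable { disconnect(): void; }
interface ClosableContext { close(): Promise<void>; }
interface StoppableStream { getTracks(): { stop(): void }[]; }

interface CaptureResources {
  nodes: Disconnectable[];
  context: ClosableContext | null;
  stream: StoppableStream | null;
  callback: ((chunk: ArrayBuffer) => void) | null;
}

async function releaseResources(res: CaptureResources): Promise<void> {
  // 1. Disconnect the processing graph so no further audio flows through it.
  res.nodes.forEach((node) => node.disconnect());
  res.nodes = [];

  // 2. Close the audio context.
  if (res.context) {
    await res.context.close();
    res.context = null;
  }

  // 3. Stop every track so the browser releases the microphone.
  if (res.stream) {
    res.stream.getTracks().forEach((track) => track.stop());
    res.stream = null;
  }

  // 4. Drop the callback so a later start() must supply a new one.
  res.callback = null;
}
```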

After stopping, the instance can be restarted by calling start() again with a new callback.

updateConfig()

updateConfig(config): void;

Defined in: src/core/audio/AudioCapture.ts:641

Updates the audio input configuration.

Parameters

Parameter	Type	Description
`config`	`Partial<AudioInputConfig>`	Partial configuration to merge with the current settings.

Returns

void

Remarks

Configuration changes take effect on the next call to start(). They do not affect an active capture session; stop and restart capture for the changes to apply.
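
The deferred effect of a configuration change can be modeled as a session that snapshots the configuration when it starts. `ConfigSketch` is a hypothetical stand-in with a reduced config shape and illustrative defaults; it is not the library's implementation.

```typescript
// Hypothetical, reduced shape: only fields named in this reference.
interface AudioInputConfig {
  sampleRate: number;
  chunkDuration: number;
}

class ConfigSketch {
  private config: AudioInputConfig = { sampleRate: 16000, chunkDuration: 100 };
  private activeSampleRate: number | null = null;

  updateConfig(partial: Partial<AudioInputConfig>): void {
    // Merge the partial update over the current settings.
    this.config = { ...this.config, ...partial };
  }

  start(): void {
    // The session captures the settings in force at start time.
    this.activeSampleRate = this.config.sampleRate;
  }

  stop(): void {
    this.activeSampleRate = null;
  }

  sessionSampleRate(): number | null {
    return this.activeSampleRate;
  }
}

const sketch = new ConfigSketch();
sketch.start();
sketch.updateConfig({ sampleRate: 48000 });
console.log(sketch.sessionSampleRate()); // 16000 (running session is unaffected)

sketch.stop();
sketch.start();
console.log(sketch.sessionSampleRate()); // 48000 (applied on restart)
```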