AudioCapture
Manages microphone audio capture using the Web Audio API.
Defined in: src/core/audio/AudioCapture.ts:128
Remarks
AudioCapture encapsulates the browser’s getUserMedia, AudioContext, and audio processing APIs to provide a simple start/stop interface for microphone capture. It supports pause/resume, configurable sample rates, echo cancellation, noise suppression, and automatic gain control.
When starting, the class attempts to use AudioWorkletNode for off-main-thread audio processing. If AudioWorklet is unavailable (e.g. in older browsers or jsdom test environments), it transparently falls back to the deprecated ScriptProcessorNode.
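The fallback decision described above can be sketched roughly as follows. This is a minimal illustration, not the class's actual code: `ContextLike`, `choosePath`, and the `moduleUrl` parameter are hypothetical names, and falling back when the worklet module fails to load (rather than only when `audioWorklet` is missing) is an assumption.

```typescript
// Minimal structural type: only the part of AudioContext this sketch inspects.
interface ContextLike {
  audioWorklet?: { addModule(url: string): Promise<void> };
}

type ProcessingPath = 'worklet' | 'script-processor';

// Hypothetical helper: prefer the worklet path when it exists and its
// processor module loads; otherwise use the ScriptProcessorNode fallback.
async function choosePath(ctx: ContextLike, moduleUrl: string): Promise<ProcessingPath> {
  if (!ctx.audioWorklet) {
    return 'script-processor'; // e.g. older browsers or jsdom
  }
  try {
    await ctx.audioWorklet.addModule(moduleUrl);
    return 'worklet';
  } catch {
    // Assumption: a failed module load is treated like a missing worklet.
    return 'script-processor';
  }
}
```

After start() resolves, isUsingWorklet() reports which path was actually taken.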
Audio data is delivered as ArrayBuffer chunks to the callback provided to start(). The chunk size depends on the processing path: the worklet delivers 128-frame render quanta, while the fallback ScriptProcessorNode uses a power-of-two buffer size derived from the AudioInputConfig.chunkDuration setting (default 100ms).
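To make the chunk-size arithmetic concrete, here is one way a power-of-two buffer size could be derived from a chunk duration. The function name, the millisecond unit of its parameter, and the choice to round up (rather than down or to the nearest power of two) are assumptions; the 256 to 16384 range is the set of sizes ScriptProcessorNode accepts.

```typescript
// Buffer sizes accepted by ScriptProcessorNode: powers of two, 256..16384.
const MIN_BUFFER = 256;
const MAX_BUFFER = 16384;

// Assumption: round the requested frame count UP to the next power of two.
function scriptProcessorBufferSize(sampleRate: number, chunkDurationMs: number): number {
  const frames = Math.ceil((sampleRate * chunkDurationMs) / 1000);
  let size = MIN_BUFFER;
  while (size < frames && size < MAX_BUFFER) {
    size *= 2;
  }
  return size;
}

console.log(scriptProcessorBufferSize(16000, 100)); // 1600 frames requested -> 2048
console.log(scriptProcessorBufferSize(48000, 100)); // 4800 frames requested -> 8192
```

For comparison, a 128-frame worklet quantum at 16 kHz covers 8 ms of audio, so callbacks on the worklet path arrive far more often and carry far less data than on the fallback path.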
The class tracks its own state via AudioCaptureState values: 'inactive', 'starting', 'active', 'paused', and 'stopping'.
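The five states can be modelled as a string union. The transition table below is an assumption inferred from the method descriptions in this reference (pause() requires 'active', resume() requires 'paused', and so on); it is not taken from the source file, and `canTransition` is a hypothetical helper.

```typescript
type AudioCaptureState = 'inactive' | 'starting' | 'active' | 'paused' | 'stopping';

// Assumed transitions, inferred from the documented method behavior.
const TRANSITIONS: Record<AudioCaptureState, AudioCaptureState[]> = {
  inactive: ['starting'],
  starting: ['active', 'inactive'], // 'inactive' if start() fails
  active: ['paused', 'stopping'],
  paused: ['active', 'stopping'],
  stopping: ['inactive'],
};

// Hypothetical helper: is moving from one state to another allowed?
function canTransition(from: AudioCaptureState, to: AudioCaptureState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```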
Example
import { AudioCapture } from './AudioCapture';
const capture = new AudioCapture({ sampleRate: 16000 }, logger);
// Start capturing microphone audio
await capture.start((audioData) => {
  // Process or send the PCM audio chunk
  sttProvider.sendAudio(audioData);
});
console.log(capture.isCapturing()); // true
// Pause during TTS playback
capture.pause();
// Resume after playback ends
await capture.resume();
// Stop capture and release resources
await capture.stop();
See
- AudioPlayer for audio playback.
- AudioInputConfig for available configuration options.
Constructors
Constructor
new AudioCapture(config?, logger?): AudioCapture;
Defined in: src/core/audio/AudioCapture.ts:152
Creates a new AudioCapture instance.
Parameters
| Parameter | Type | Description |
|---|---|---|
| config? | Partial<AudioInputConfig> | Partial audio input configuration. Missing fields are filled from DEFAULT_AUDIO_INPUT_CONFIG. |
| logger? | Logger | Optional Logger instance. A child logger named 'AudioCapture' is created if provided. |
Returns
AudioCapture
Remarks
The instance starts in the 'inactive' state. Call start() to begin capturing audio.
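Filling missing fields from defaults is commonly done with an object spread, as in the sketch below. The field list is inferred from the features named in this reference, and every default value except the 100 ms chunk duration is a placeholder; `resolveConfig` is a hypothetical name.

```typescript
// Field names inferred from this reference; the real interface may differ.
interface AudioInputConfig {
  sampleRate: number;
  chunkDuration: number; // milliseconds (assumed unit)
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Placeholder defaults: only chunkDuration = 100 is stated in the docs.
const DEFAULT_AUDIO_INPUT_CONFIG: AudioInputConfig = {
  sampleRate: 48000,
  chunkDuration: 100,
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,
};

// Hypothetical helper: later spreads win, so caller values override defaults.
function resolveConfig(partial: Partial<AudioInputConfig> = {}): AudioInputConfig {
  return { ...DEFAULT_AUDIO_INPUT_CONFIG, ...partial };
}

console.log(resolveConfig({ sampleRate: 16000 }).chunkDuration); // 100
```

One caveat of this pattern: a key passed explicitly as undefined overrides the default with undefined, so callers should omit keys they do not want to set.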
Methods
checkPermission()
checkPermission(): Promise<"granted" | "denied" | "prompt">;
Defined in: src/core/audio/AudioCapture.ts:209
Queries the current microphone permission status without prompting the user.
Returns
Promise<"granted" | "denied" | "prompt">
The permission state: 'granted', 'denied', or 'prompt'.
Remarks
Uses the Permissions API (navigator.permissions.query). If the Permissions API is not available (e.g. in some browsers), this method returns 'prompt' as a safe fallback.
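A fallback of this kind might look like the sketch below. `PermissionsLike` and `queryMicPermission` are hypothetical names; the permissions object is passed in so the logic can run outside a browser. Treating a rejected query as 'prompt' is an assumption, made because some browsers are known to reject the 'microphone' descriptor name.

```typescript
type MicPermission = 'granted' | 'denied' | 'prompt';

// Minimal structural stand-in for navigator.permissions.
interface PermissionsLike {
  query(descriptor: { name: string }): Promise<{ state: string }>;
}

// Hypothetical helper: never prompts, and never throws.
async function queryMicPermission(permissions?: PermissionsLike): Promise<MicPermission> {
  if (!permissions) {
    return 'prompt'; // Permissions API unavailable
  }
  try {
    const status = await permissions.query({ name: 'microphone' });
    if (status.state === 'granted') return 'granted';
    if (status.state === 'denied') return 'denied';
    return 'prompt';
  } catch {
    // Assumption: an unsupported descriptor is treated as "not yet decided".
    return 'prompt';
  }
}
```

A 'prompt' result therefore means "unknown or undecided"; only start() or requestPermission() reveals the user's actual choice.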
dispose()
dispose(): Promise<void>;
Defined in: src/core/audio/AudioCapture.ts:669
Releases all resources by stopping capture. Safe to call from any state.
Returns
Promise<void>
Remarks
This is a convenience method equivalent to calling stop(). It is intended for use in cleanup/teardown code where you want to ensure all resources are freed regardless of the current state.
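In teardown code, a try/finally wrapper is one way to guarantee the call happens even when the work in between throws. `DisposableLike` and `withCapture` are hypothetical names used only for this sketch.

```typescript
// Anything with an async dispose(), such as an AudioCapture instance.
interface DisposableLike {
  dispose(): Promise<void>;
}

// Hypothetical helper: run `body`, then always dispose the resource.
async function withCapture<T extends DisposableLike, R>(
  capture: T,
  body: (capture: T) => Promise<R>,
): Promise<R> {
  try {
    return await body(capture);
  } finally {
    await capture.dispose(); // runs on success and on failure
  }
}
```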
getAudioContext()
getAudioContext(): AudioContext | null;
Defined in: src/core/audio/AudioCapture.ts:613
Returns the underlying AudioContext, if one has been created.
Returns
AudioContext | null
The active AudioContext, or null if none exists.
Remarks
This is intended for advanced use cases where direct access to the audio context is needed (e.g. creating custom audio processing nodes). Returns null if capture has not been started or has been stopped.
getConfig()
getConfig(): AudioInputConfig;
Defined in: src/core/audio/AudioCapture.ts:627
Returns a copy of the current audio input configuration.
Returns
AudioInputConfig
A copy of the current AudioInputConfig.
Remarks
The returned object is a shallow copy; modifying it does not affect the internal configuration. Use updateConfig() to change settings.
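The copy semantics can be illustrated with a small stand-in class. `ConfigHolder` is hypothetical and carries only two assumed fields; it mimics the documented getConfig()/updateConfig() behavior rather than reproducing AudioCapture itself.

```typescript
// Reduced, assumed shape of the configuration for this illustration.
interface AudioInputConfig {
  sampleRate: number;
  chunkDuration: number;
}

// Hypothetical stand-in mirroring the documented copy/merge behavior.
class ConfigHolder {
  private config: AudioInputConfig = { sampleRate: 16000, chunkDuration: 100 };

  getConfig(): AudioInputConfig {
    return { ...this.config }; // shallow copy
  }

  updateConfig(partial: Partial<AudioInputConfig>): void {
    this.config = { ...this.config, ...partial };
  }
}

const holder = new ConfigHolder();
const snapshot = holder.getConfig();
snapshot.sampleRate = 8000; // mutates only the copy
console.log(holder.getConfig().sampleRate); // 16000
holder.updateConfig({ sampleRate: 8000 });
console.log(holder.getConfig().sampleRate); // 8000
```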
getState()
getState(): AudioCaptureState;
Defined in: src/core/audio/AudioCapture.ts:163
Returns the current capture state.
Returns
AudioCaptureState
The current AudioCaptureState ('inactive', 'starting', 'active', 'paused', or 'stopping').
isCapturing()
isCapturing(): boolean;
Defined in: src/core/audio/AudioCapture.ts:172
Checks whether the capture is actively recording audio.
Returns
boolean
true if the state is 'active', false otherwise.
isUsingWorklet()
isUsingWorklet(): boolean;
Defined in: src/core/audio/AudioCapture.ts:656
Returns whether the AudioWorklet path is being used for the current session.
Returns
boolean
true if AudioWorklet is in use, false otherwise.
Remarks
Returns false both when the ScriptProcessorNode fallback is in use and when capture is not active, so a false result alone does not show that the fallback was chosen. Check getState() as well if you need to tell the two cases apart.
pause()
pause(): void;
Defined in: src/core/audio/AudioCapture.ts:505
Pauses audio capture by suspending the AudioContext.
Returns
void
Remarks
If the capture is not in the 'active' state, this method logs a warning and returns without error. While paused, no audio data is delivered to the callback. Call resume() to continue capturing.
requestPermission()
requestPermission(): Promise<boolean>;
Defined in: src/core/audio/AudioCapture.ts:187
Requests microphone permission from the user without starting capture.
Returns
Promise<boolean>
true if permission was granted, false if denied or unavailable.
Remarks
This method calls getUserMedia to trigger the browser’s permission prompt, then immediately stops the resulting stream. It is useful for pre-requesting permission during onboarding so that start() can succeed without a permission dialog.
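The request-then-release pattern described above can be sketched as follows. The `*Like` interfaces and `requestMicPermission` are hypothetical; the media-devices object is injected so the logic does not depend on browser globals.

```typescript
// Minimal structural stand-ins for the browser types involved.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean }): Promise<StreamLike>;
}

// Hypothetical helper: trigger the permission prompt, keep nothing open.
async function requestMicPermission(mediaDevices?: MediaDevicesLike): Promise<boolean> {
  if (!mediaDevices) {
    return false; // Media Devices API unavailable
  }
  try {
    const stream = await mediaDevices.getUserMedia({ audio: true });
    // Release the microphone immediately; only the permission was wanted.
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch {
    return false; // denied, dismissed, or no input device
  }
}
```

Stopping every track matters here: a stream left open would typically keep the browser's recording indicator on even though nothing is being captured.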
resume()
resume(): Promise<void>;
Defined in: src/core/audio/AudioCapture.ts:525
Resumes audio capture from a paused state.
Returns
Promise<void>
Remarks
If the capture is not in the 'paused' state, this method logs a warning and returns without error.
start()
start(callback): Promise<void>;
Defined in: src/core/audio/AudioCapture.ts:253
Starts capturing audio from the microphone.
Parameters
| Parameter | Type | Description |
|---|---|---|
| callback | AudioCaptureCallback | The AudioCaptureCallback that receives audio data chunks as ArrayBuffer. |
Returns
Promise<void>
Remarks
This method:
- Requests microphone access via getUserMedia.
- Creates an AudioContext at the configured sample rate.
- Connects a MediaStreamAudioSourceNode through an AudioWorkletNode (preferred) or ScriptProcessorNode (fallback).
- Delivers processed audio chunks to the provided callback.
If already capturing (state === 'active'), this method logs a warning and returns immediately without error.
Throws
AudioCaptureError if the Media Devices API is not supported by the browser.
Throws
MicrophonePermissionError if the user denies microphone access.
Throws
AudioCaptureError for any other failure during stream or context creation.
Example
await capture.start((audioData) => {
  websocket.send(audioData);
});
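Callers will usually want to tell a permission denial apart from other start() failures. The sketch below shows one way to do that. The two error classes are local stand-ins for the library's exports, whether MicrophonePermissionError extends AudioCaptureError is not stated here (so the more specific class is checked first), and `describeStartFailure` and its messages are hypothetical.

```typescript
// Stand-ins for the library's error classes (real constructors may differ).
class AudioCaptureError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
    this.name = 'AudioCaptureError';
  }
}
class MicrophonePermissionError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'MicrophonePermissionError';
  }
}

// Hypothetical helper: map a start() failure to a user-facing message.
function describeStartFailure(err: unknown): string {
  if (err instanceof MicrophonePermissionError) {
    return 'Microphone access was denied. Enable it in the browser settings.';
  }
  if (err instanceof AudioCaptureError) {
    return 'Audio capture could not be started: ' + err.message;
  }
  return 'Unexpected error while starting audio capture.';
}
```

In application code this would sit in the catch block around `await capture.start(...)`.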
stop()
stop(): Promise<void>;
Defined in: src/core/audio/AudioCapture.ts:549
Stops audio capture and releases all audio resources.
Returns
Promise<void>
Remarks
This method disconnects the audio processing graph, closes the AudioContext, stops all media stream tracks, and clears the callback. If already in the 'inactive' state, this method is a no-op.
After stopping, the instance can be restarted by calling start() again with a new callback.
updateConfig()
updateConfig(config): void;
Defined in: src/core/audio/AudioCapture.ts:641
Updates the audio input configuration.
Parameters
| Parameter | Type | Description |
|---|---|---|
config | Partial<AudioInputConfig> | Partial configuration to merge with the current settings. |
Returns
void
Remarks
Configuration changes take effect on the next call to start(). They do not affect an active capture session — you must stop and restart capture for changes to apply.
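Because changes apply only on the next start(), a small helper can bundle the update with a restart. `CaptureLike` and `applyConfig` are hypothetical names; the callback is passed in again because stop() clears the previous one.

```typescript
type ChunkCallback = (audioData: ArrayBuffer) => void;

// The subset of the AudioCapture surface this sketch relies on.
interface CaptureLike<C> {
  getState(): string;
  updateConfig(config: Partial<C>): void;
  stop(): Promise<void>;
  start(callback: ChunkCallback): Promise<void>;
}

// Hypothetical helper: returns true if a restart was needed and performed.
async function applyConfig<C>(
  capture: CaptureLike<C>,
  config: Partial<C>,
  callback: ChunkCallback,
): Promise<boolean> {
  capture.updateConfig(config);
  if (capture.getState() === 'inactive') {
    return false; // nothing running; the next start() picks up the change
  }
  await capture.stop();
  await capture.start(callback);
  return true;
}
```

Note that this sketch restarts a paused session in the active state, and that a restart may briefly drop audio; callers that care about either should handle those cases themselves.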