Description
Problem
When dictating with Wave, any currently playing media — music, podcasts, videos — bleeds into the microphone. This degrades transcription quality and forces users to manually pause all audio before triggering dictation. The friction breaks the 'fast activation' flow that Wave is built around.
Codebase Context
AudioRecorder.swift sets up an AVAudioRecorder with raw PCM settings but does not configure any audio session behaviour. No ducking, no interruption handling, and no coordination with other audio-producing apps exists anywhere in the pipeline.
// AudioRecorder.swift — no audio session setup today
let recorder = try AVAudioRecorder(url: url, settings: recordSettings)
Proposed Behaviour
- When dictation starts, other system audio is silenced or ducked to a low level.
- When dictation ends (stop + transcribe), audio is restored to its previous level.
- This should be opt-in via a toggle in Settings, since some users may prefer leaving audio running.
Implementation Notes
macOS offers two paths:
- AVAudioSession (macOS 12+) — set the category to .playAndRecord with the .duckOthers option before recorder.record(), and restore it after recorder.stop(). Requires import AVFAudio.
- CoreAudio — use AudioHardwareServiceGetPropertyData / AudioHardwareServiceSetPropertyData on the default output device to temporarily scale its volume to 0, then restore it afterwards.
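The CoreAudio route could be sketched roughly as below. Note the AudioHardwareService* functions have been deprecated since OS X 10.11; the AudioObject* calls used here are their modern equivalents. This is a sketch under simplifying assumptions — real devices may expose volume per channel rather than on the main element.

```swift
import CoreAudio

// Look up the system's current default output device.
func defaultOutputDeviceID() -> AudioDeviceID? {
    var deviceID = AudioDeviceID(kAudioObjectUnknown)
    var size = UInt32(MemoryLayout<AudioDeviceID>.size)
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultOutputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)
    let status = AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject), &address, 0, nil, &size, &deviceID)
    return status == noErr ? deviceID : nil
}

// Read the device's scalar volume (0.0–1.0) so it can be restored later.
func outputVolume(of device: AudioDeviceID) -> Float32? {
    var volume = Float32(0)
    var size = UInt32(MemoryLayout<Float32>.size)
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioDevicePropertyVolumeScalar,
        mScope: kAudioDevicePropertyScopeOutput,
        mElement: kAudioObjectPropertyElementMain)
    let status = AudioObjectGetPropertyData(device, &address, 0, nil, &size, &volume)
    return status == noErr ? volume : nil
}

// Scale the device volume; call with 0 before recording, and with the
// previously saved value after recording stops.
func setOutputVolume(_ volume: Float32, on device: AudioDeviceID) {
    var volume = volume
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioDevicePropertyVolumeScalar,
        mScope: kAudioDevicePropertyScopeOutput,
        mElement: kAudioObjectPropertyElementMain)
    AudioObjectSetPropertyData(
        device, &address, 0, nil, UInt32(MemoryLayout<Float32>.size), &volume)
}
```

A downside of this approach is that it mutes everything at the hardware level, including system alert sounds, rather than ducking only other apps.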
Option 1 is cleaner and uses Apple's own AVFoundation session APIs. It keeps the change confined to AudioRecorder.swift and a new boolean setting in AppState.swift.
The change is additive and backwards compatible — existing behaviour remains the default if the setting is off.
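A minimal sketch of how Option 1 could sit in AudioRecorder.swift, assuming the AVAudioSession APIs are available on the deployment target (macOS 12+, per the note above). The name duckOtherAudio is an illustrative stand-in for the new boolean setting in AppState.swift:

```swift
import AVFAudio

final class AudioRecorderController {
    // Mirrors the hypothetical Settings toggle; off preserves today's behaviour.
    var duckOtherAudio = false
    private var recorder: AVAudioRecorder?

    func start(url: URL, settings: [String: Any]) throws {
        if duckOtherAudio {
            // Duck other apps' audio only when the user has opted in.
            let session = AVAudioSession.sharedInstance()
            try session.setCategory(.playAndRecord, options: [.duckOthers])
            try session.setActive(true)
        }
        let recorder = try AVAudioRecorder(url: url, settings: settings)
        _ = recorder.record()
        self.recorder = recorder
    }

    func stop() {
        recorder?.stop()
        if duckOtherAudio {
            // Deactivating with this option tells other apps they can
            // resume playback at their previous volume.
            try? AVAudioSession.sharedInstance()
                .setActive(false, options: [.notifyOthersOnDeactivation])
        }
    }
}
```

Since the session is only touched inside the `if duckOtherAudio` branches, leaving the toggle off exercises exactly the current code path.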
Validation Checklist
- Playing media in another app (Music, Safari, etc.) is silenced when hotkey is pressed
- Audio is fully restored after dictation completes or is cancelled
- Toggling the setting off restores original behaviour
- No impact on transcription quality or recording latency