JetKVM Advanced, CGO-based 2-way Audio Support #718

pennycoders · 2025-08-02T02:17:37Z

Summary

This PR introduces bidirectional audio support to JetKVM, enabling both audio output (listening to the managed device) and audio input (microphone from browser to device). Audio is implemented using an in-process CGO architecture that directly calls C code for ALSA audio capture/playback and Opus encoding/decoding. The managed device presents itself as a USB Audio Class 1 (UAC1) gadget providing both stereo speakers and a stereo microphone interface over USB.

Key Features:

Bidirectional stereo audio (48kHz, 16-bit, 2 channels)
In-process CGO implementation for low latency and simplicity
USB Audio Gadget (UAC1) integration
WebRTC-based real-time streaming with Opus codec
Frontend controls for enabling/disabling audio output and input
HDMI or USB audio capture source selection
SDP munging for proper stereo audio support in browsers

Credits

Thank you, @lqs as your [draft] USB Audio Support #446 PR has actually encouraged me to pursue this

Thanks!
Alex

CLAassistant · 2025-08-02T02:17:43Z

All committers have signed the CLA.

pennycoders · 2025-08-04T15:52:59Z

Great news! I'll soon update this PR with Audio Input pass-through functionality too

IDisposable · 2025-08-07T22:25:07Z

This is amazing!

Would it be possible to forward the audio channel on device's input HDMI to the browser?

By this I mean that if I set my host/controlled device's audio output to the JetKVM virtual monitor then the sound is going to be coming in the HDMI stream, which might be possible to extract (I know nothing about that hardware), so we could have the host-audio come through without an additional (virtual) audit device.

pennycoders · 2025-08-08T05:44:13Z

Hi @IDisposable

Glad you like this functionality, mainly to free up as much of that USB bandwidth. I'm actually looking at this, however, it is a little trickier as it moat likely requires changes in the rv1106-system repo containing the OS too.

In case I do manage to pull that off before the v0.5.0 release for which this functionality has been scheduled, I'll update this PR.

Thanks,
Alex

vvns · 2025-08-10T12:46:35Z

JetKVM Audio PR Review & Test Feedback

Hi @pennycoders ,

First, thanks for the work on bringing audio and mic support into JetKVM.

I’ve tested the new functionality in a local LAN environment with both playback and microphone streaming active, including during real-world scenarios like a Teams call.

Test conditions:

Setup: Wired LAN, low network latency, tested with a headset mic.
Modes tested: Low, Medium, High, Ultra for both playback and mic.

Main observations:

Mic quality constant across modes
- Microphone stream sounds the same in all modes.
- Quality is acceptable but not “HD” — there is a constant background noise floor, even with a headset mic on a clean LAN.
Ultra playback distortion
- In Ultra mode only, playback sometimes has a warped/buzzy/distorted effect.
- Low/Medium/High playback modes sound good and consistent.
Latency when mic is active
- Mouse and keyboard control become noticeably less responsive whenever the mic stream is active, even on a low-latency LAN connection.
- Likely due to video/control WebSocket traffic competing with audio packets on the same channel.
Packet loss
- Playback drop rate: ~22%
- Mic drop rate: ~13%
- Loss observed despite no network congestion, pointing to buffering or scheduling bottlenecks.

Potential Improvements (Technical):

Separate transport channels
- Move audio to a dedicated WebSocket endpoint (e.g. /ws/audio) or use WebRTC for audio transport.
- Prevents video/control from being delayed by audio bursts.
Opus tuning exposure
- Make parameters adjustable via UI or JSON config:
  - bitrate, frame size, complexity
  - FEC, DTX, VBR/CBR
- Lets users balance latency, quality, and bandwidth.
ALSA parameter control
- Expose period_time and buffer_time for fine-tuning latency vs underrun protection.
Queue management
- Use a bounded audio frame queue with drop-oldest to prevent latency spikes when encoding falls behind.
Noise reduction & echo cancellation
- Integrate RNNoise or WebRTC AEC/NS for mic clarity.
- Even simple high-pass filtering can reduce constant hum.
Thread/process separation
- Run audio encode/decode in its own goroutine/process to isolate timing from video/control.

Happy to re-run these tests and provide before/after metrics once adjustments are implemented.
This PR is already a big step forward, and with these improvements, we could get low-latency, clean mic audio without impacting remote control responsiveness.

pennycoders · 2025-08-10T18:17:24Z

Hi @vvns

Thanks! Thank you very much for putting this through its paces! This is great feedback, that I can definitely work with. I initially encountered the interference with the Keyboard & Mouse that you are mentioning and made some optimizations. Do you happen to know the commit hash you've tested at? Is it the latest version of my branch? I am asking because I've tested actual calls with the latest implementation and was definitely usable.

I will break down into your feedback and see what I can do about each of the items.

If you want we can discuss more in-depth on other channels too.

Thanks,

Alex

vvns · 2025-08-10T18:26:56Z

Hi @pennycoders ,

Glad the feedback is useful! 👍
I’ve confirmed that my tests were run on the latest commit at the time — 5f905e7 from your feat/audio-support branch — so the results I reported already include your most recent optimizations.

The version is indeed usable, but in slightly more demanding conditions (e.g., during calls or with sustained mic usage) the remote control latency — which was very low before the audio feature — increases significantly, to the point where slow mouse movement becomes noticeably delayed.

I can still retest to be sure nothing was missed, but the latency impact with mic active, packet loss, and Ultra mode distortion were all observed on that commit.

I’m happy to continue sharing feedback as you push further updates, so we can iterate quickly toward the best possible audio and control experience. Let me know which channel you’d prefer for more direct discussion, so you can share any details privately if needed.

Thanks again for the great work — we’re close to a fully smooth audio + control experience.

pennycoders · 2025-08-10T19:09:56Z

Hi @vvns,

Are you on the JetKVM Discord?

If so, we can discuss there. What's your username?

Thanks,
Alex

IDisposable

This really looks nice, all my comments are questions or nits, just feel free to ignore... I wonder if we need to be more explicit in the priority assignment of the other RTC channels (medai/serial/rpc) as we really want to ensure the control signals get through at very high fidelity ... might even be worthwhile splitting up the RPC messages into control vs. advisory messaging, but that's not this PR :)

cloud.go

dev_deploy.sh

input_rpc.go

ui/src/components/WebRTCVideo.tsx

ui/src/components/popovers/AudioControlPopover.tsx

ui/src/hooks/useMicrophone.ts

webrtc.go

cloud.go

am-zed · 2025-08-23T00:50:02Z

Audio works in Firefox and Chrome, but clicking on Audio button in Chrome throws error "Cannot read properties of undefined (reading 'addEventListener')":

pennycoders · 2025-08-24T11:18:40Z

Audio works in Firefox and Chrome, but clicking on Audio button in Chrome throws error "Cannot read properties of undefined (reading 'addEventListener')":

Hi! I am actually actually developing and testing this using Chrome. You mean the Audio button in the Actions bar (The top menu), right? How did you deploy the feat/audio-support branch to your JetKVM? Also, what is your Chrome version? Is there something particular about your networking setup, such as WebSockets or WebRTC being blocked? Any funkier chrome extensions installed? Can you please try again with the latest version of the branch please?

Thanks!

Defensive programming to prevent undefined behavior when closing ALSA PCM handles. While the previous commit disabled assertions with -DNDEBUG, adding explicit NULL checks ensures graceful handling even if handles are unexpectedly NULL. All error paths that call snd_pcm_close() now verify the handle is non-NULL before closing, preventing potential crashes in edge cases.

pennycoders · 2025-11-22T07:15:19Z

FYI: Troubleshooting some issues. ATM audio is not working. Will update as I have an update

Refactor audio processing to enhance stability and code clarity: - Remove soft-clipping from audio capture pipeline - Fix hardware frame size calculation for variable sample rates - Add comprehensive error codes for audio initialization failures - Clear stop flags after cleanup to prevent initialization deadlocks - Improve mutex handling during device initialization - Simplify constant validation and remove redundant comments - Add DevPod setup instructions for Apple Silicon users - Enforce Go cache clearing in dev_deploy.sh for CGO reliability These changes improve audio capture stability when switching between HDMI and USB audio sources, and fix race conditions during device initialization and teardown.

Replace EDID with version that only advertises 60Hz timing modes (1920x1080@60Hz and 1280x720@60Hz), removing the 1920x1080@50Hz mode that was causing HDMI sources to prefer 50fps over 60fps output.

Query TC358743 HDMI receiver for detected audio sample rate before initializing ALSA capture device. This fixes distortion issues when HDMI sources send 44.1kHz audio (e.g., Armbian SBC) instead of 48kHz. Previously, the code always requested 48kHz from ALSA, but in I2S slave mode, the RV1106 I2S controller receives whatever clock rate the TC358743 master provides. This caused a sample rate mismatch where ALSA thought it was 48kHz but hardware was actually running at 44.1kHz, resulting in incorrect SpeexDSP resampling and audio distortion. Changes: - Add V4L2 ioctl to query TC358743's audio_sampling_rate control - Use detected rate when configuring ALSA (falls back to 48kHz if unavailable) - SpeexDSP resampler now gets correct input rate (44.1k, 48k, etc.) - Supports all HDMI audio sample rates: 32k, 44.1k, 48k, 88.2k, 96k, etc.

Simplify channel swap detection and improve performance based on IDisposable's review comments: - Pass bool pointer directly instead of encoding in bit flag - Remove redundant channel count check (already verified earlier) - Use ARM NEON SIMD for channel swapping (4x faster) - Process 4 frames (8 samples) per iteration with vrev32q_s16 These changes improve code clarity and boost channel swap performance by ~4x using vectorized operations.

Address 5 critical issues found in comprehensive code review: 1. Opus Encoder Configuration Failures (CRITICAL) - Split encoder settings into critical vs non-critical - Critical settings (bitrate, VBR, FEC) now fail initialization on error - Non-critical settings (complexity, DTX) log warnings but continue - Prevents silent audio quality degradation from misconfigured encoder 2. V4L2 Sample Rate Detection Error Reporting (CRITICAL) - Add specific error messages for different failure modes - Distinguish permission errors, device not found, and no signal - Validate detected sample rates are in reasonable range (8-192kHz) - Improves debuggability when HDMI audio detection fails 3. Mutex Handling in ALSA Error Recovery (CRITICAL) - Refactor handle_alsa_error() to NEVER unlock mutex internally - Caller now always responsible for unlocking after checking return - Eliminates complex mutex ownership semantics that caused deadlocks - Consistent lock/unlock patterns prevent double-unlock bugs 4. Async Audio Start Error Propagation (CRITICAL) - Make SetAudioOutputEnabled/SetAudioInputEnabled synchronous - Add 5-second timeout for audio initialization - Return errors to caller instead of only logging - Revert state on failure to maintain consistency - Users now get immediate feedback if audio fails to start 5. CgoSource Race Condition (CRITICAL) - Hold c.mu mutex during C function calls in ReadMessage/WriteMessage - Prevents use-after-free when Disconnect() called concurrently - Lock order (c.mu -> capture_mutex) is consistent, no deadlock risk - Fixes potential crash from accessing freed ALSA/codec resources These changes eliminate silent failures, improve error visibility, and prevent race conditions that could cause crashes or audio degradation.

Addressed 5 HIGH priority issues identified in code review: HIGH jetkvm#12: safe_alsa_open validation (audio.c:314-327) - Added validation for snd_pcm_nonblock() return value - Properly close handle and return error if blocking mode fails - Prevents silent failures when switching to blocking mode HIGH jetkvm#13: WebRTC write failure handling (relay.go:74-147) - Track consecutive WebRTC write failures in OutputRelay - Reconnect source after 50 consecutive failures - Throttle warning logs (first failure + every 10th) - Prevents silent audio degradation from persistent write errors HIGH jetkvm#14: Opus packet size validation (audio.c:1024-1027) - Moved validation before mutex acquisition - Reduces unnecessary lock contention for invalid packets - Validates opus_buf, opus_size bounds before hot path HIGH jetkvm#15: FEC recovery validation (audio.c:1050-1071) - Added detailed logging for FEC usage - Log warnings when decode fails and FEC is attempted - Log info/error messages for FEC success/failure - Log warning when FEC returns 0 frames (silence) - Improves debuggability of packet loss scenarios Comment accuracy fixes (audio.c:63-67, 149-150, 164-165) - Clarified RFC 7587 comment: RTP clock rate vs codec sample rate - Explained why sr/fs parameters are ignored - Added context about SpeexDSP resampling Channel map validation (audio.c:572-579) - Added validation for unexpected channel counts - Check for SND_CHMAP_UNKNOWN positions - Prevents crashes from malformed channel map data

- Fix missing time import in audio.go causing build failure - Remove 15 redundant comments that restate obvious code: * Hot path function docblocks (jetkvm_audio_read_encode, jetkvm_audio_decode_write) * Obvious state descriptions (capture_channels_swapped) * SIMD function docblock (simd_clear_samples_s16) * safe_alsa_open docblock * relay.go implementation comments (Connect if not connected, Read message, etc.) * Duplicate RFC 7587 comment in cgo_source.go - Fix CRITICAL misleading comment: mutex protection claim * OLD: "The mutexes protect... ALSA I/O" (FALSE - mutex released during I/O) * NEW: "Mutexes protect handle lifecycle, NOT the ALSA I/O itself" * Added explanation of race detection via handle pointer comparison - Reduce verbose function comments (SetAudioOutputEnabled, SetAudioInputEnabled) * Removed redundant first line restating function names * Kept valuable behavioral context (blocking, timeout) Net result: 30 lines removed, improved code clarity, fixed build error

Implement periodic polling (every ~1 second) to detect HDMI audio sample rate changes and trigger automatic reconfiguration. This prevents audio distortion when switching between 44.1kHz and 48kHz sources. Key changes: - Poll TC358743 V4L2 control every 50 frames in capture hot path - Trigger reconnection when sample rate changes - Optimize logging to only output on rate changes (reduces log spam) - Add proper state tracking to prevent duplicate logging - Fix comment accuracy and ensure all state updates are consistent Performance impact: ~100-500μs overhead every second (~0.01-0.05% CPU)

Integrate latest dev branch changes including: - HID RPC handshake improvements - Video sleep mode functionality - IPv6 address sorting fixes - OTA update flow improvements - Video stream lifecycle management All audio functionality preserved and tested.

The sleep mode feature from dev (PR jetkvm#999) prevented video from auto-starting when streaming_status == 0, which broke HDMI hotplug functionality. Modified video_restart_streaming() to check detected_signal: - No signal + not streaming -> don't start (preserve sleep mode) - Has signal -> always stop cleanly then restart (fix hotplug) This ensures proper encoder cleanup and prevents corrupted video output after HDMI disconnect/reconnect cycles.

Resolved conflicts by: - Preserving HDMI hotplug fix in video_restart_streaming() that checks detected_signal - Keeping both isSecureContext and doRpcHidHandshake imports in devices.$id.tsx No functionality was changed or broken in this merge.

- Delete unused audio_common.c and audio_common.h (237 lines of dead code) - Remove redundant encoder/decoder pointer comparisons (mutex already held) - Remove unused err variables in encode/decode hot paths

- Fix OPUS_BANDWIDTH constant from 1104 to 1105 (fullband 20kHz vs superwideband 12kHz) - Fix PacketLossPerc default from 0 to 20 to match C layer default - Add early return in FEC recovery when pcm_frames <= 0 to avoid zero-frame ALSA writes - Remove unused InputRelay fields (source, ctx, cancel) and unused NewInputRelay parameter - Add async cleanup on timeout in SetAudioOutputEnabled/SetAudioInputEnabled

- Add missing __sync_synchronize() in capture init to match playback init path - Add error handling for os.Setenv calls with warning logs on failure - Add logging when ALSA channel map is unavailable (assumes standard L/R order)

pennycoders · 2025-11-25T05:11:23Z

@IDisposable @adamshiervani I've also implemented dynamic resampler updates. In order for that feature to be supported, this PR has to be merged first (the audio does work without it too, though, defaulting to 48000KHz source sample rate): jetkvm/rv1106-system#50

Please review, test and let me know. @SuperKali has been of great help turning this on all sides (he is still seeing some issues with an armbian SBC, which I was unable to reproduce - basically the audio not sounding right when the SBC outputs to 44.1KHz - we are still tracking that down). USB Audio works for him as well though, and HDMI works quite well in my case too. Testing on as many hardware configurations as possible would be great.

Thanks,
Alex

- Copy opus data in ReadMessage to prevent aliasing with internal buffer - Remove unused ctx field from OutputRelay struct - Fix OPUS_BANDWIDTH comment accuracy (20kHz passband)

Hardware sample rate is auto-negotiated by ALSA, and frame size is derived from it. These parameters were being passed but ignored - remove them from update_audio_constants() and update_audio_decoder_constants() signatures.

All callers pass non-NULL pointers for actual_rate_out and actual_frame_size_out.

Sample rate detection and periodic checking is only relevant for HDMI where the source can change rates. USB Gadget is fixed at 48kHz.

internal/native/video.go

- Add CTA-861 extension block with HDMI Vendor-Specific Data Block - Include Audio Data Block (2ch LPCM, 32/44.1/48kHz, 16/20/24-bit) - Add Speaker Allocation and Video Capability data blocks - Set display name to JetKVM with proper display size (71x40cm) - EDID now passes edid-decode validation

pennycoders changed the title ~~JetKVM Advanced, CGO Audio Support~~ JetKVM Advanced, CGO-based Audio Support Aug 2, 2025

pennycoders force-pushed the feat/audio-support branch from db2d107 to 4f47d62 Compare August 2, 2025 02:23

adamshiervani added this to JetKVM Aug 4, 2025

adamshiervani added this to the 0.5.0 milestone Aug 4, 2025

adamshiervani moved this to Backlog in JetKVM Aug 4, 2025

adamshiervani moved this from Backlog to In progress in JetKVM Aug 4, 2025

adamshiervani moved this from In progress to In review in JetKVM Aug 4, 2025

adamshiervani moved this from In review to In progress in JetKVM Aug 4, 2025

adamshiervani moved this from In progress to In Review in JetKVM Aug 4, 2025

adamshiervani requested review from adamshiervani and ym August 4, 2025 12:26

adamshiervani mentioned this pull request Aug 4, 2025

[draft] USB Audio Support #446

Closed

3 tasks

adamshiervani linked an issue Aug 4, 2025 that may be closed by this pull request

Add sound support #315

Open

pennycoders force-pushed the feat/audio-support branch from c9f4aea to 3444607 Compare August 4, 2025 20:42

pennycoders changed the title ~~JetKVM Advanced, CGO-based Audio Support~~ JetKVM Advanced, CGO-based 2-way Audio Support Aug 4, 2025

adamshiervani mentioned this pull request Aug 5, 2025

Add sound support #315

Open

IDisposable reviewed Aug 13, 2025

View reviewed changes

pennycoders force-pushed the feat/audio-support branch from 7408195 to 767311e Compare August 13, 2025 11:35

IDisposable reviewed Aug 13, 2025

View reviewed changes

cloud.go Outdated Show resolved Hide resolved

IDisposable mentioned this pull request Aug 22, 2025

Rework keyboard management to allow device-side tracking of modifier states #725

Merged

pennycoders added 3 commits November 24, 2025 14:58

Update DefaultEDID to force 60Hz video output

818a2ca

Replace EDID with version that only advertises 60Hz timing modes (1920x1080@60Hz and 1280x720@60Hz), removing the 1920x1080@50Hz mode that was causing HDMI sources to prefer 50fps over 60fps output.

pennycoders force-pushed the feat/audio-support branch from b6050ea to db2dc88 Compare November 24, 2025 17:53

pennycoders added 4 commits November 24, 2025 20:05

pennycoders force-pushed the feat/audio-support branch from a6acc43 to ecc4a11 Compare November 24, 2025 21:11

pennycoders force-pushed the feat/audio-support branch from ecc4a11 to bbc9f06 Compare November 24, 2025 21:20

pennycoders added 2 commits November 24, 2025 23:30

pennycoders force-pushed the feat/audio-support branch from 40cea52 to 7580d00 Compare November 24, 2025 22:28

pennycoders added 5 commits November 25, 2025 00:35

Merge branch 'dev' into feat/audio-support

f324496

Resolved conflicts by: - Preserving HDMI hotplug fix in video_restart_streaming() that checks detected_signal - Keeping both isSecureContext and doRpcHidHandshake imports in devices.$id.tsx No functionality was changed or broken in this merge.

Remove dead code and redundant checks in audio implementation

3d1c2a1

- Delete unused audio_common.c and audio_common.h (237 lines of dead code) - Remove redundant encoder/decoder pointer comparisons (mutex already held) - Remove unused err variables in encode/decode hot paths

Fix misleading comment about decoder sample rate

aeb6881

pennycoders added 6 commits November 25, 2025 10:01

Fix buffer aliasing and cleanup unused fields in audio

fe940cd

- Copy opus data in ReadMessage to prevent aliasing with internal buffer - Remove unused ctx field from OutputRelay struct - Fix OPUS_BANDWIDTH comment accuracy (20kHz passband)

Remove unused sample rate and frame size parameters

4cadf38

Hardware sample rate is auto-negotiated by ALSA, and frame size is derived from it. These parameters were being passed but ignored - remove them from update_audio_constants() and update_audio_decoder_constants() signatures.

Remove unnecessary NULL checks in configure_alsa_device

82562b7

All callers pass non-NULL pointers for actual_rate_out and actual_frame_size_out.

Skip TC358743 sample rate detection for USB audio

9fe46d5

Sample rate detection and periodic checking is only relevant for HDMI where the source can change rates. USB Gadget is fixed at 48kHz.

Merge remote-tracking branch 'origin/dev' into feat/audio-support

89c0b28

Change default EDID display name to JetKVM

594efa6

IDisposable requested changes Nov 30, 2025

View reviewed changes

internal/native/video.go Outdated Show resolved Hide resolved

JetKVM Advanced, CGO-based 2-way Audio Support #718

Are you sure you want to change the base?

JetKVM Advanced, CGO-based 2-way Audio Support #718

Conversation

pennycoders commented Aug 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Credits

Uh oh!

CLAassistant commented Aug 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pennycoders commented Aug 4, 2025

Uh oh!

IDisposable commented Aug 7, 2025

Uh oh!

pennycoders commented Aug 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vvns commented Aug 10, 2025

Main observations:

Potential Improvements (Technical):

Uh oh!

pennycoders commented Aug 10, 2025

Uh oh!

vvns commented Aug 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pennycoders commented Aug 10, 2025

Uh oh!

IDisposable left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

am-zed commented Aug 23, 2025

Uh oh!

pennycoders commented Aug 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pennycoders commented Nov 22, 2025

Uh oh!

pennycoders commented Nov 25, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants

pennycoders commented Aug 2, 2025 •

edited

Loading

CLAassistant commented Aug 2, 2025 •

edited

Loading

pennycoders commented Aug 8, 2025 •

edited

Loading

vvns commented Aug 10, 2025 •

edited

Loading

pennycoders commented Aug 24, 2025 •

edited

Loading