Replace ALSA plugin layer resampling with libspeexdsp for improved audio
quality and reliability. This implementation uses direct hardware access
(hw:) instead of ALSA plugins (plughw:) and handles sample rate conversion
with SpeexDSP's high-quality sinc-based resampler.
Key changes:
- Add libspeexdsp 1.2.1 with ARM NEON optimizations to build dependencies
- Switch from plughw: to hw: device access for lower latency
- Implement conditional resampling (only when hardware rate ≠ 48kHz)
- Use SPEEX_RESAMPLER_QUALITY_DESKTOP for high-quality interpolation
- Add automatic audio dependency building in dev_deploy.sh
Quality improvements:
- Fix race condition in resampler cleanup with mutex protection
- Fix memory leak on resampler re-initialization
- Add buffer overflow validation (3840 frame limit for 192kHz)
- Improve error logging for resampling, encoding, and ALSA configuration
- Simplify code structure while maintaining all functionality
Technical details:
- Hardware negotiates actual sample rate (e.g., HDMI may vary)
- SpeexDSP converts hardware rate → 48kHz for Opus encoding
- USB Audio Gadget hardcoded to 48kHz (no resampling overhead)
- Static buffer allocation for zero allocation in hot path
- WebRTC requires 48kHz RTP clock rate per RFC 7587
USB Audio Gadget (hw:1,0) hardware only supports 48kHz for both capture
and playback due to configfs p_srate/c_srate being hardcoded. This commit
ensures both audio paths respect this hardware limitation:
- Output path: Force 48kHz when using hw:1,0, allow configurable rates for HDMI
- Input path: Always use 48kHz regardless of UI configuration
- Calculate frame size dynamically based on actual sample rate used
Also removes redundant comments that don't add debugging or maintainability value.
- Add sample rate dropdown in UI with Opus-supported rates (8k/12k/16k/24k/48kHz)
- Add sampleRate parameter to setAudioConfig RPC handler
- Validate sample rate is one of the 5 Opus-compatible values
- Configuration takes effect on next audio restart (Apply button)
Changes:
- Consolidate duplicate stop logic into helper functions
- Fix RPC getAudioConfig to return actual runtime values instead of
inconsistent defaults (bitrate was returning 128 vs actual 192)
- Improve setAudioTrack mutex handling to eliminate nested locking
- Simplify ALSA error retry logic by reorganizing conditional branches
- Split CGO Connect() into separate input/output methods for clarity
- Use map lookup for sample rate validation instead of long if-chain
- Add inline comments documenting validation steps
All changes preserve existing functionality while reducing code
duplication and improving readability. Tested with both HDMI and
USB audio sources.
- Separate capture_channels (stereo HDMI) from playback_channels (mono mic)
to prevent initialization conflicts that were breaking stereo output
- Optimize defaults for LAN use: 192kbps bitrate, complexity 8, 0% packet
loss compensation, DTX disabled (eliminates static and improves clarity)
- Add comprehensive race condition protection in C audio layer with handle
validity checks and mutex-protected cleanup operations
- Enable USB audio volume control and configure microphone as mono
- Add centralized AUDIO_DEFAULTS constant in UI with localized labels
- Add missing time import to fix compilation
This resolves audio quality issues and crash scenarios when switching
between HDMI and USB audio sources.
The sample rate cannot be configured by users - it's determined by the audio
source (HDMI device or USB gadget client). The previous UI gave the false
impression that users could select a sample rate, but the value was always
overridden by hardware detection.
Changes:
- Convert sample rate UI from dropdown to read-only display
- Show "(auto-detected from source)" label next to the value
- Remove sampleRate parameter from setAudioConfig RPC
- Update translations to clarify auto-detection
- Backend sample rate validation remains for backwards compatibility
The C code now automatically detects and adapts to whatever rate the hardware
supports, creating the Opus encoder/decoder with matching parameters to
eliminate pitch/speed distortion.
- Update default EDID with registered manufacturer ID (Dell) and proper 24-inch display dimensions (52x32cm) for better macOS/OS compatibility
- Add configurable sample rate (32/44.1/48/96 kHz) to support different HDMI audio sources
- Add packet loss compensation percentage control for FEC overhead tuning
- Fix config migration to ensure new audio parameters get defaults for existing configs
- Update all language translations for new audio settings
- Add config fields for bitrate, complexity, DTX, FEC, buffer periods
- Add RPC methods for get/set audio config and restart
- Add UI settings page with controls for all audio parameters
- Add Apply Settings button to restart audio with new config
- Add config migration for backwards compatibility
- Add translations for all 9 languages
- Clean up redundant comments and optimize log levels
- Add AudioInputAutoEnable and AudioOutputEnabled fields to backend config
- Implement RPC methods for get/set audio input auto-enable preference
- Load audio output enabled state from config on startup
- Make manual microphone toggle call backend to enable audio pipeline
- Auto-enable preference now persists across incognito sessions
- Reset microphone state to off when reaching login pages
- Fix issue where manual mic toggle required auto-enable to be on
Implement session-based microphone control with optional auto-enable:
- Microphone disabled by default, only requests permission when enabled
- Added persistent "Auto-enable Microphone" setting in audio settings
- Setting syncs to backend and works across browsers/devices
- Auto-enable respects secure context (HTTPS/localhost only)
- Refactored secure context check into reusable utility function
- Removed unused audioInputEnabled store properties
Temporarily remove the ability to switch between HDMI and USB audio
output sources. The application now uses USB audio (hw:1,0) exclusively
until HDMI audio capture issues are resolved.
Changes:
- Remove AudioOutputSource config field
- Remove audio source switching logic and UI
- Hardcode USB audio output device
- Remove related RPC methods
After setting audio output source, fetch the actual value from backend
to verify the change was applied successfully. This prevents the UI
from showing incorrect state when the backend fails to apply the change.
The optimistic update provides immediate feedback, but we now verify
the actual result and show error if backend returned a different value
than expected.
Added comprehensive translations for audio settings page:
- Page title and description
- Audio output/input settings
- Audio source selector (HDMI/USB labels)
- Success/error messages for all operations
- Translations added for all 9 languages
Updated devices.$id.settings.audio.tsx:
- Import and use translation keys (m.audio_settings_*)
- Use existing audio_output_* and audio_input_* keys for notifications
- Fix dropdown not updating: update UI optimistically, revert on error
- This prevents the dropdown from appearing stuck after source change
The optimistic update pattern improves UX by immediately reflecting
the user's choice while the backend processes the change.
Remove all subprocess-based audio code to simplify the audio system and
reduce complexity. Audio now uses CGO in-process mode exclusively.
Changes:
- Remove subprocess mode: Deleted Supervisor, IPCSource, embed.go
- Remove audio mode selection from UI (Settings → Audio)
- Remove audio mode from backend config (AudioMode field)
- Remove JSON-RPC handlers: getAudioMode/setAudioMode
- Remove Makefile targets: build_audio_output/input/binaries
- Remove standalone C binaries: jetkvm_audio_{input,output}.c
- Remove IPC protocol implementation: ipc_protocol.{c,h}
- Remove unused IPC functions from audio_common.{c,h}
- Simplify audio.go: startAudio() instead of startAudioSubprocesses()
- Update all function calls and comments to remove subprocess references
- Add constants to cgo_source.go (ipcMaxFrameSize, ipcMsgTypeOpus)
- Keep update_opus_encoder_params() for potential future runtime config
Benefits:
- Simpler codebase: -1,734 lines of code
- Better performance: No IPC overhead on embedded hardware
- Easier maintenance: Single audio implementation
- Smaller binary: No embedded audio subprocess binaries
The audio system now works exclusively via CGO direct C function calls,
with ALSA device selection (HDMI vs USB) still configurable via settings.