Milestone 2: Local Audio & GUI (The Ears)
Goal: Enable the client to process high-quality audio locally and display the interface.
1. UI Layout (client_node/ui)
- Dependencies: Add `egui`, `eframe`.
- AI Context Trap (Eframe + Tokio): Do NOT use `#[tokio::main]` on the client. `eframe` demands the main thread. Manually build a `tokio::runtime::Runtime`, spawn the background network actors, and pass MPSC channels into the `AppState` before calling `eframe::run_native()`.
- Architecture: Create `struct AppState`. Implement the `eframe::App` trait for it.
- Layout: Build the basic classic TeamSpeak UI: left panel (tree view of hardcoded channels), right panel (text chat log).
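The runtime/channel handoff described above can be sketched as follows. To keep the sketch dependency-free and runnable, `std::thread` and `std::sync::mpsc` stand in for the tokio runtime and its channels; in the real client you would build a `tokio::runtime::Runtime` by hand, `spawn` the network actors on it, and move the channel ends into `AppState` before `eframe::run_native()` takes over the main thread. The `NetEvent` type and `poll_network` method are illustrative names, not from the codebase.

```rust
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};
use std::thread;

/// Messages flowing from the background network actor to the UI.
enum NetEvent {
    ChatLine(String),
}

/// Stand-in for the eframe::App state struct.
struct AppState {
    chat_log: Vec<String>,
    from_net: Receiver<NetEvent>,
}

impl AppState {
    /// Called once per frame (eframe::App::update in the real client):
    /// drain pending events without ever blocking the UI thread.
    fn poll_network(&mut self) {
        loop {
            match self.from_net.try_recv() {
                Ok(NetEvent::ChatLine(line)) => self.chat_log.push(line),
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
            }
        }
    }
}

/// Stand-in for runtime.spawn(network_actor(...)).
fn spawn_network_actor(to_ui: Sender<NetEvent>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // A real actor would own the server connection here.
        to_ui.send(NetEvent::ChatLine("server: welcome".into())).ok();
    })
}

fn main() {
    let (tx, rx) = channel();
    spawn_network_actor(tx).join().unwrap();

    let mut app = AppState { chat_log: Vec::new(), from_net: rx };
    app.poll_network(); // one simulated frame
    assert_eq!(app.chat_log, vec!["server: welcome".to_string()]);
    println!("chat log: {:?}", app.chat_log);
}
```

The key discipline is that the UI side only ever calls `try_recv`, never `recv`, so a stalled network task can never freeze a frame.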
2. Audio Capture (client_node/audio/capture.rs)
- Dependencies: Add `cpal`, `ringbuf`.
- Device Setup: Use `cpal::default_host().default_input_device()`. Build a stream config specifically requesting 48,000 Hz and 1 channel (mono).
- Headless Abstraction: Ensure the `cpal` instantiation is hidden behind a trait so the CI test suite can inject deterministic "sine wave" `f32` vectors instead of requiring a physical microphone.
- The Producer: Create a `ringbuf::HeapRb<f32>` (e.g., 4096 capacity). Split it into `(producer, consumer)`.
- Hardware Callback: Inside the `cpal` data callback, write the raw `f32` samples directly into the `producer`. Strictly `no_std`-like rules here (no allocations, no locks).
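The headless abstraction above might look like the sketch below: capture code depends on a trait rather than on `cpal` directly, so CI can inject the deterministic sine source the bullet describes. The trait and type names (`SampleSource`, `SineSource`) are illustrative assumptions; the production impl of the trait would wrap the `cpal` input stream.

```rust
/// Anything that can fill a buffer with 48 kHz mono f32 samples.
trait SampleSource {
    fn fill(&mut self, buf: &mut [f32]);
}

/// Deterministic sine generator for CI; no microphone required.
struct SineSource {
    phase: f32,
    step: f32, // radians advanced per sample
}

impl SineSource {
    fn new(freq_hz: f32, sample_rate: f32) -> Self {
        Self {
            phase: 0.0,
            step: 2.0 * std::f32::consts::PI * freq_hz / sample_rate,
        }
    }
}

impl SampleSource for SineSource {
    fn fill(&mut self, buf: &mut [f32]) {
        for s in buf.iter_mut() {
            *s = self.phase.sin();
            self.phase = (self.phase + self.step) % (2.0 * std::f32::consts::PI);
        }
    }
}

fn main() {
    let mut src = SineSource::new(440.0, 48_000.0);
    let mut chunk = [0.0f32; 960]; // one 20 ms frame at 48 kHz
    src.fill(&mut chunk);
    // Phase starts at 0, so the first sample is 0.0, and a pure
    // sine always stays within [-1.0, 1.0].
    assert!(chunk[0].abs() < 1e-6);
    assert!(chunk.iter().all(|s| s.abs() <= 1.0));
    println!("generated {} deterministic samples", chunk.len());
}
```

In tests, the DSP thread consumes a `Box<dyn SampleSource>` holding a `SineSource`; in production it holds the `cpal`-backed implementation, and nothing downstream can tell the difference.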
3. DSP Chain & VAD (client_node/audio/dsp.rs)
- Dependencies: Add `webrtc-audio-processing`.
- Thread Spawning: Spawn a standard `std::thread` (not tokio) to act as the Audio Consumer.
- Processing Loop: Pull chunks of exactly 960 samples (20 ms at 48 kHz) from the `consumer` ring buffer.
- Filters: Pass the 960 samples through `webrtc`'s echo cancellation and noise suppression stages.
- Voice Activity Detection (VAD): Implement the `webrtc` VAD or an amplitude threshold calculator. If the chunk is "silence", drop it to save bandwidth.
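The amplitude-threshold fallback mentioned above (the simpler alternative to the `webrtc` VAD) can be a one-liner over each 960-sample chunk. The `0.01` threshold below is an assumption for illustration; real tuning depends on the microphone and the gain staging ahead of it.

```rust
/// Root-mean-square level of one 960-sample (20 ms) chunk.
fn rms(chunk: &[f32]) -> f32 {
    let sum_sq: f32 = chunk.iter().map(|s| s * s).sum();
    (sum_sq / chunk.len() as f32).sqrt()
}

/// Amplitude-threshold VAD: chunks below the threshold are
/// treated as silence and dropped to save bandwidth.
fn is_voice(chunk: &[f32], threshold: f32) -> bool {
    rms(chunk) >= threshold
}

fn main() {
    let silence = vec![0.0f32; 960];
    // Crude stand-in for speech: a full-scale 440 Hz sine at 48 kHz.
    let speech: Vec<f32> = (0..960)
        .map(|i| (2.0 * std::f32::consts::PI * 440.0 * i as f32 / 48_000.0).sin())
        .collect();

    assert!(!is_voice(&silence, 0.01)); // gated: dropped
    assert!(is_voice(&speech, 0.01));   // passed downstream
    println!("silence rms = {:.4}, speech rms = {:.4}", rms(&silence), rms(&speech));
}
```

RMS over the full 20 ms window is deliberately sluggish: it will not flap on single-sample clicks the way a peak detector would, at the cost of clipping the first few milliseconds of quiet speech onsets.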
4. Global Hotkeys / Push-To-Talk (PTT)
- Dependencies: Add `global-hotkey` (or `rdev`).
- Event Loop: Spawn a thread to listen for a specific keycode (e.g., `Mouse4` or `V`).
- Integration: Update an `Arc<AtomicBool>` `is_transmitting` flag. The DSP thread reads this flag; if it is false, it dumps the audio chunks.
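The flag handoff above is small enough to sketch in full. The hotkey event loop is replaced here by direct stores so the sketch stays dependency-free; in the client, the `global-hotkey`/`rdev` listener thread performs the `store` on key-down/key-up, and the DSP consumer performs the `load` once per chunk. `gate_chunk` is an illustrative helper name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// DSP-side gate: forwards the chunk only while the user holds PTT.
fn gate_chunk(is_transmitting: &AtomicBool, chunk: Vec<f32>) -> Option<Vec<f32>> {
    if is_transmitting.load(Ordering::Relaxed) {
        Some(chunk)
    } else {
        None // key released: dump the audio
    }
}

fn main() {
    // Shared between the hotkey thread and the DSP thread.
    let is_transmitting = Arc::new(AtomicBool::new(false));
    let chunk = vec![0.1f32; 960];

    // Key up: the chunk is dropped.
    assert!(gate_chunk(&is_transmitting, chunk.clone()).is_none());

    // Hotkey thread observed key-down and flipped the flag.
    is_transmitting.store(true, Ordering::Relaxed);
    assert!(gate_chunk(&is_transmitting, chunk).is_some());
    println!("PTT gating behaves as expected");
}
```

`Ordering::Relaxed` is sufficient here because the flag carries no data dependency: the worst case of a stale read is one extra or one missing 20 ms chunk, which is inaudible next to key-press timing jitter.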
5. Local Loopback & UI Bridge
- Loopback Thread: For testing, route the post-DSP 960-sample chunks directly into a `cpal` output stream (speaker) to physically hear the microphone quality and VAD gating.
- UI State Bridge: Use `tokio::sync::mpsc` or a simple `Arc<AtomicBool>` to signal the UI thread when VAD triggers, so `egui` can draw the green "Active Speaker" dot next to the user's name.
- Audio Dumper UI: Add a checkbox in the `egui` settings panel. When checked, write the 960-sample chunks to `raw_mic.wav` and `post_dsp.wav` using the `hound` crate for local inspection.
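The audio dumper should use `hound` as stated above; purely to illustrate what that crate produces for these dumps, the sketch below hand-writes the canonical 44-byte PCM WAV header for mono 16-bit audio at 48 kHz (the client would convert its `f32` samples to `i16` first). File name and helper name are illustrative.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Write mono 16-bit PCM samples as a minimal WAV file.
/// (Dependency-free stand-in for hound's WavWriter output.)
fn write_wav(path: &Path, sample_rate: u32, samples: &[i16]) -> io::Result<()> {
    let data_len = (samples.len() * 2) as u32;
    let mut f = File::create(path)?;
    f.write_all(b"RIFF")?;
    f.write_all(&(36 + data_len).to_le_bytes())?; // bytes remaining after this field
    f.write_all(b"WAVE")?;
    f.write_all(b"fmt ")?;
    f.write_all(&16u32.to_le_bytes())?;             // fmt chunk size
    f.write_all(&1u16.to_le_bytes())?;              // format tag = PCM
    f.write_all(&1u16.to_le_bytes())?;              // channels = 1 (mono)
    f.write_all(&sample_rate.to_le_bytes())?;       // e.g. 48_000
    f.write_all(&(sample_rate * 2).to_le_bytes())?; // byte rate = rate * block align
    f.write_all(&2u16.to_le_bytes())?;              // block align = 2 bytes/frame
    f.write_all(&16u16.to_le_bytes())?;             // bits per sample
    f.write_all(b"data")?;
    f.write_all(&data_len.to_le_bytes())?;
    for s in samples {
        f.write_all(&s.to_le_bytes())?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // One 20 ms chunk of silence, already converted from f32 to i16.
    let samples = vec![0i16; 960];
    let path = std::env::temp_dir().join("post_dsp_demo.wav");
    write_wav(&path, 48_000, &samples)?;
    let len = std::fs::metadata(&path)?.len();
    assert_eq!(len, 44u64 + 960 * 2); // 44-byte header + payload
    println!("wrote {} bytes to {}", len, path.display());
    Ok(())
}
```

Seeing the layout once makes `hound`'s `WavSpec` fields (channels, sample rate, bits per sample) obviously map onto header bytes, which helps when a dumped file refuses to open in an editor.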