# Milestone 2: Local Audio & GUI (The Ears)
**Goal:** Enable the client to process high-quality audio locally and display the interface.
### 1. UI Layout (`client_node/ui`)
- [ ] **Dependencies:** Add `egui`, `eframe`.
- [ ] **Initialization:** In `main.rs`, launch `eframe::run_native`.
- [ ] **Architecture:** Create `struct AppState`. Implement `eframe::App` trait for it.
- [ ] **Layout:** Build the classic TeamSpeak-style UI: a left panel (tree view of hardcoded channels) and a right panel (text chat log).
### 2. Audio Capture (`client_node/audio/capture.rs`)
- [ ] **Dependencies:** Add `cpal`, `ringbuf`.
- [ ] **Device Setup:** Use `cpal::default_host().default_input_device()`. Build a stream config specifically requesting `48,000 Hz` and `1 channel` (Mono).
- [ ] **Headless Abstraction:** Ensure the `cpal` instantiation is hidden behind a trait so the CI test suite can inject deterministic "sine wave" `f32` vectors instead of requiring a physical microphone.
- [ ] **The Producer:** Create a `ringbuf::HeapRb<f32>` (e.g., 4096 capacity). Split it into `(producer, consumer)`.
- [ ] **Hardware Callback:** Inside the `cpal` data callback, write the raw `f32` samples directly into the `producer`. The callback runs on a real-time audio thread, so apply strict real-time rules (no allocations, no locks, no blocking).
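The headless abstraction above can be sketched with a plain trait. The names (`SampleSource`, `SineSource`) are illustrative; in the real client, the `cpal` data callback would sit behind the same trait and push into the `ringbuf` producer, while CI injects the deterministic sine generator instead:

```rust
/// Illustrative trait so the CI test suite can inject deterministic
/// audio instead of opening a real cpal input device.
trait SampleSource {
    /// Fill `out` with the next batch of mono f32 samples.
    fn fill(&mut self, out: &mut [f32]);
}

/// Deterministic sine generator standing in for the microphone.
struct SineSource {
    phase: f32,      // normalized phase in [0, 1)
    freq_hz: f32,
    sample_rate: f32,
}

impl SampleSource for SineSource {
    fn fill(&mut self, out: &mut [f32]) {
        for s in out.iter_mut() {
            *s = (2.0 * std::f32::consts::PI * self.phase).sin();
            // Wrap the phase into [0, 1) to avoid f32 precision drift.
            self.phase = (self.phase + self.freq_hz / self.sample_rate).fract();
        }
    }
}

fn main() {
    let mut src = SineSource { phase: 0.0, freq_hz: 440.0, sample_rate: 48_000.0 };
    let mut chunk = [0.0f32; 960]; // one 20 ms frame at 48 kHz
    src.fill(&mut chunk);
    // All samples stay within the normalized [-1.0, 1.0] range.
    assert!(chunk.iter().all(|s| s.abs() <= 1.0));
}
```

In the headless path, the test harness calls `fill` and writes the result into the same `ringbuf::HeapRb<f32>` producer the hardware callback would use, so the DSP consumer is exercised identically in both modes.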
### 3. DSP Chain & VAD (`client_node/audio/dsp.rs`)
- [ ] **Dependencies:** Add `webrtc-audio-processing`.
- [ ] **Thread Spawning:** Spawn a standard `std::thread` (not tokio) to act as the Audio Consumer.
- [ ] **Processing Loop:** Pull chunks of exactly `960` samples (20 ms at 48 kHz) from the `consumer` ring buffer.
- [ ] **Filters:** Pass the 960 samples through `webrtc`'s `EchoCancellation` and `NoiseSuppression` methods.
- [ ] **Voice Activity Detection (VAD):** Implement `webrtc` VAD or an amplitude threshold calculator. If the chunk is "silence", drop it to save bandwidth.
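The amplitude-threshold fallback for VAD can be sketched as a root-mean-square gate over each 960-sample chunk. The threshold value below is an illustrative guess, not a tuned constant:

```rust
/// RMS energy of one 20 ms chunk (960 mono samples at 48 kHz).
fn rms(chunk: &[f32]) -> f32 {
    let sum_sq: f32 = chunk.iter().map(|s| s * s).sum();
    (sum_sq / chunk.len() as f32).sqrt()
}

/// Amplitude-threshold VAD: chunks below the gate are "silence"
/// and get dropped to save bandwidth.
fn is_speech(chunk: &[f32], threshold: f32) -> bool {
    rms(chunk) > threshold
}

fn main() {
    let silence = [0.0f32; 960];
    // Stand-in for voice: a 440 Hz tone at half amplitude.
    let tone: Vec<f32> = (0..960)
        .map(|i| 0.5 * (2.0 * std::f32::consts::PI * 440.0 * i as f32 / 48_000.0).sin())
        .collect();
    let threshold = 0.02; // hypothetical gate level, tune by ear

    assert!(!is_speech(&silence, threshold));
    assert!(is_speech(&tone, threshold));
}
```

A real deployment would likely add hangover (keep transmitting for a few chunks after the level drops) so word endings are not clipped; the `webrtc` VAD handles this more robustly than a raw gate.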
### 4. Global Hotkeys / Push-To-Talk (PTT)
- [ ] **Dependencies:** Add `global-hotkey` (or `rdev`).
- [ ] **Event Loop:** Spawn a thread to listen for a specific keycode (e.g., `Mouse4` or `V`).
- [ ] **Integration:** Update an `Arc<AtomicBool>` `is_transmitting` flag. The DSP thread reads this flag; if false, it drops the audio chunks.
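The integration step reduces to a shared atomic flag. The sketch below simulates the hotkey listener with a plain `std::thread`; a real build would instead store `true`/`false` from `global-hotkey` or `rdev` key-down/key-up events:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Shared PTT state: written by the hotkey thread, read by the DSP thread.
    let is_transmitting = Arc::new(AtomicBool::new(false));

    // Simulated hotkey listener; the real one would block on
    // global-hotkey/rdev events for the configured keycode.
    let flag = Arc::clone(&is_transmitting);
    let hotkey_thread = thread::spawn(move || {
        flag.store(true, Ordering::Release); // PTT key pressed
    });
    hotkey_thread.join().unwrap();

    // DSP-side check: forward the chunk only while the key is held.
    if is_transmitting.load(Ordering::Acquire) {
        println!("transmitting: forwarding audio chunk");
    } else {
        println!("muted: dropping audio chunk");
    }
}
```

An `AtomicBool` is enough here because the flag is a single yes/no value with no associated data; no mutex or channel is needed on the hot audio path.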
### 5. Local Loopback & UI Bridge
- [ ] **Loopback Thread:** For testing, route the post-DSP 960-sample chunks directly into a `cpal` output stream (Speaker) to physically hear the microphone quality and VAD gating.
- [ ] **UI State Bridge:** Use `tokio::sync::mpsc` or simple `Arc<AtomicBool>` to signal the UI thread when VAD triggers, so `egui` can draw the green "Active Speaker" dot next to the user's name.
- [ ] **Audio Dumper UI:** Add a checkbox in the `egui` settings panel. When checked, write the 960-sample chunks to `raw_mic.wav` and `post_dsp.wav` using the `hound` crate for local inspection.