Milestone 2: Local Audio & GUI (The Ears)
Goal: Enable the client to process high-quality audio locally and display the interface.
1. UI Layout (client_node/ui)
- Dependencies: Add `egui`, `eframe`.
- AI Context Trap (Eframe + Tokio): Do NOT use `#[tokio::main]` on the client. `eframe` demands the main thread. Manually build a `tokio::runtime::Runtime`, spawn the background network actors, and pass MPSC channels into the `AppState` before calling `eframe::run_native()`.
- Architecture: Create `struct AppState`. Implement the `eframe::App` trait for it.
- Layout: Build the basic classic TeamSpeak UI: left panel (tree view of hardcoded channels), right panel (text chat log).
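The runtime/channel handoff described above can be sketched as follows. To keep the sketch dependency-free and runnable, `std::thread` and `std::sync::mpsc` stand in for the tokio runtime and its channels; in the real client you would build a `tokio::runtime::Runtime` by hand, `spawn` the network actors on it, and move the channel ends into `AppState` before `eframe::run_native()` takes over the main thread. The `NetEvent` type and `poll_network` method are illustrative names, not from the codebase.

```rust
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};
use std::thread;

/// Messages flowing from the background network actor to the UI.
enum NetEvent {
    ChatLine(String),
}

/// Stand-in for the eframe::App state struct.
struct AppState {
    chat_log: Vec<String>,
    from_net: Receiver<NetEvent>,
}

impl AppState {
    /// Called once per frame (eframe::App::update in the real client):
    /// drain pending events without ever blocking the UI thread.
    fn poll_network(&mut self) {
        loop {
            match self.from_net.try_recv() {
                Ok(NetEvent::ChatLine(line)) => self.chat_log.push(line),
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
            }
        }
    }
}

/// Stand-in for runtime.spawn(network_actor(...)).
fn spawn_network_actor(to_ui: Sender<NetEvent>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // A real actor would own the server connection here.
        to_ui.send(NetEvent::ChatLine("server: welcome".into())).ok();
    })
}

fn main() {
    let (tx, rx) = channel();
    spawn_network_actor(tx).join().unwrap();

    let mut app = AppState { chat_log: Vec::new(), from_net: rx };
    app.poll_network(); // one simulated frame
    assert_eq!(app.chat_log, vec!["server: welcome".to_string()]);
    println!("chat log: {:?}", app.chat_log);
}
```

The key discipline is that the UI side only ever calls `try_recv`, never `recv`, so a stalled network task can never freeze a frame.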
2. Audio Capture (client_node/audio/capture.rs)
- Dependencies: Add `cpal`, `ringbuf`.
- Device Setup: Use `cpal::default_host().default_input_device()`. Build a stream config specifically requesting 48,000 Hz and 1 channel (mono).
- Headless Abstraction: Ensure the `cpal` instantiation is hidden behind a trait so the CI test suite can inject deterministic "sine wave" `f32` vectors instead of requiring a physical microphone.
- The Producer: Create a `ringbuf::HeapRb<f32>` (e.g., 4096 capacity). Split it into `(producer, consumer)`.
- Hardware Callback: Inside the `cpal` data callback, write the raw `f32` samples directly into the `producer`. Strictly `no_std`-like rules here (no allocations, no locks).
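The headless abstraction above might look like the sketch below: capture code depends on a trait rather than on `cpal` directly, so CI can inject the deterministic sine source the bullet describes. The trait and type names (`SampleSource`, `SineSource`) are illustrative assumptions; the production impl of the trait would wrap the `cpal` input stream.

```rust
/// Anything that can fill a buffer with 48 kHz mono f32 samples.
trait SampleSource {
    fn fill(&mut self, buf: &mut [f32]);
}

/// Deterministic sine generator for CI; no microphone required.
struct SineSource {
    phase: f32,
    step: f32, // radians advanced per sample
}

impl SineSource {
    fn new(freq_hz: f32, sample_rate: f32) -> Self {
        Self {
            phase: 0.0,
            step: 2.0 * std::f32::consts::PI * freq_hz / sample_rate,
        }
    }
}

impl SampleSource for SineSource {
    fn fill(&mut self, buf: &mut [f32]) {
        for s in buf.iter_mut() {
            *s = self.phase.sin();
            self.phase = (self.phase + self.step) % (2.0 * std::f32::consts::PI);
        }
    }
}

fn main() {
    let mut src = SineSource::new(440.0, 48_000.0);
    let mut chunk = [0.0f32; 960]; // one 20 ms frame at 48 kHz
    src.fill(&mut chunk);
    // Phase starts at 0, so the first sample is 0.0, and a pure
    // sine always stays within [-1.0, 1.0].
    assert!(chunk[0].abs() < 1e-6);
    assert!(chunk.iter().all(|s| s.abs() <= 1.0));
    println!("generated {} deterministic samples", chunk.len());
}
```

In tests, the DSP thread consumes a `Box<dyn SampleSource>` holding a `SineSource`; in production it holds the `cpal`-backed implementation, and nothing downstream can tell the difference.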
3. DSP Chain & VAD (client_node/audio/dsp.rs)
- Dependencies: Add `webrtc-audio-processing`.
- Thread Spawning: Spawn a standard `std::thread` (not tokio) to act as the Audio Consumer.
- Processing Loop: Pull chunks of exactly 960 samples (20 ms at 48 kHz) from the `consumer` ring buffer.
- Filters: Pass the 960 samples through `webrtc`'s echo cancellation and noise suppression stages.
- Voice Activity Detection (VAD): Implement the `webrtc` VAD or an amplitude threshold calculator. If the chunk is "silence", drop it to save bandwidth.
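The amplitude-threshold fallback mentioned above (the simpler alternative to the `webrtc` VAD) can be a one-liner over each 960-sample chunk. The `0.01` threshold below is an assumption for illustration; real tuning depends on the microphone and the gain staging ahead of it.

```rust
/// Root-mean-square level of one 960-sample (20 ms) chunk.
fn rms(chunk: &[f32]) -> f32 {
    let sum_sq: f32 = chunk.iter().map(|s| s * s).sum();
    (sum_sq / chunk.len() as f32).sqrt()
}

/// Amplitude-threshold VAD: chunks below the threshold are
/// treated as silence and dropped to save bandwidth.
fn is_voice(chunk: &[f32], threshold: f32) -> bool {
    rms(chunk) >= threshold
}

fn main() {
    let silence = vec![0.0f32; 960];
    // Crude stand-in for speech: a full-scale 440 Hz sine at 48 kHz.
    let speech: Vec<f32> = (0..960)
        .map(|i| (2.0 * std::f32::consts::PI * 440.0 * i as f32 / 48_000.0).sin())
        .collect();

    assert!(!is_voice(&silence, 0.01)); // gated: dropped
    assert!(is_voice(&speech, 0.01));   // passed downstream
    println!("silence rms = {:.4}, speech rms = {:.4}", rms(&silence), rms(&speech));
}
```

RMS over the full 20 ms window is deliberately sluggish: it will not flap on single-sample clicks the way a peak detector would, at the cost of clipping the first few milliseconds of quiet speech onsets.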
4. Global Hotkeys / Push-To-Talk (PTT)
- Dependencies: Add `global-hotkey` (or `rdev`).
- Event Loop: Spawn a thread to listen for a specific keycode (e.g., `Mouse4` or `V`).
- Integration: Update an `Arc<AtomicBool>` `is_transmitting` flag. The DSP thread reads this flag; if it is false, it dumps the audio chunks.
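The flag handoff above is small enough to sketch in full. The hotkey event loop is replaced here by direct stores so the sketch stays dependency-free; in the client, the `global-hotkey`/`rdev` listener thread performs the `store` on key-down/key-up, and the DSP consumer performs the `load` once per chunk. `gate_chunk` is an illustrative helper name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// DSP-side gate: forwards the chunk only while the user holds PTT.
fn gate_chunk(is_transmitting: &AtomicBool, chunk: Vec<f32>) -> Option<Vec<f32>> {
    if is_transmitting.load(Ordering::Relaxed) {
        Some(chunk)
    } else {
        None // key released: dump the audio
    }
}

fn main() {
    // Shared between the hotkey thread and the DSP thread.
    let is_transmitting = Arc::new(AtomicBool::new(false));
    let chunk = vec![0.1f32; 960];

    // Key up: the chunk is dropped.
    assert!(gate_chunk(&is_transmitting, chunk.clone()).is_none());

    // Hotkey thread observed key-down and flipped the flag.
    is_transmitting.store(true, Ordering::Relaxed);
    assert!(gate_chunk(&is_transmitting, chunk).is_some());
    println!("PTT gating behaves as expected");
}
```

`Ordering::Relaxed` is sufficient here because the flag carries no data dependency: the worst case of a stale read is one extra or one missing 20 ms chunk, which is inaudible next to key-press timing jitter.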
5. Local Loopback & UI Bridge
- Loopback Thread: For testing, route the post-DSP 960-sample chunks directly into a `cpal` output stream (speaker) to physically hear the microphone quality and VAD gating.
- UI State Bridge: Use `tokio::sync::mpsc` or a simple `Arc<AtomicBool>` to signal the UI thread when VAD triggers, so `egui` can draw the green "Active Speaker" dot next to the user's name.
- Audio Dumper UI: Add a checkbox in the `egui` settings panel. When checked, write the 960-sample chunks to `raw_mic.wav` and `post_dsp.wav` using the `hound` crate for local inspection.
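The audio dumper should use `hound` as stated above; purely to illustrate what that crate produces for these dumps, the sketch below hand-writes the canonical 44-byte PCM WAV header for mono 16-bit audio at 48 kHz (the client would convert its `f32` samples to `i16` first). File name and helper name are illustrative.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Write mono 16-bit PCM samples as a minimal WAV file.
/// (Dependency-free stand-in for hound's WavWriter output.)
fn write_wav(path: &Path, sample_rate: u32, samples: &[i16]) -> io::Result<()> {
    let data_len = (samples.len() * 2) as u32;
    let mut f = File::create(path)?;
    f.write_all(b"RIFF")?;
    f.write_all(&(36 + data_len).to_le_bytes())?; // bytes remaining after this field
    f.write_all(b"WAVE")?;
    f.write_all(b"fmt ")?;
    f.write_all(&16u32.to_le_bytes())?;             // fmt chunk size
    f.write_all(&1u16.to_le_bytes())?;              // format tag = PCM
    f.write_all(&1u16.to_le_bytes())?;              // channels = 1 (mono)
    f.write_all(&sample_rate.to_le_bytes())?;       // e.g. 48_000
    f.write_all(&(sample_rate * 2).to_le_bytes())?; // byte rate = rate * block align
    f.write_all(&2u16.to_le_bytes())?;              // block align = 2 bytes/frame
    f.write_all(&16u16.to_le_bytes())?;             // bits per sample
    f.write_all(b"data")?;
    f.write_all(&data_len.to_le_bytes())?;
    for s in samples {
        f.write_all(&s.to_le_bytes())?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // One 20 ms chunk of silence, already converted from f32 to i16.
    let samples = vec![0i16; 960];
    let path = std::env::temp_dir().join("post_dsp_demo.wav");
    write_wav(&path, 48_000, &samples)?;
    let len = std::fs::metadata(&path)?.len();
    assert_eq!(len, 44u64 + 960 * 2); // 44-byte header + payload
    println!("wrote {} bytes to {}", len, path.display());
    Ok(())
}
```

Seeing the layout once makes `hound`'s `WavSpec` fields (channels, sample rate, bits per sample) obviously map onto header bytes, which helps when a dumped file refuses to open in an editor.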