Add remaining project files

This commit is contained in:
sam
2026-05-03 10:50:25 +02:00
parent 302fdb5459
commit 989d3bcc9f
16 changed files with 644 additions and 0 deletions

@@ -0,0 +1,13 @@
# Voice App: Project Master Roadmap
Use this file to track high-level progress. Mark each milestone as [x] once all tasks in its dedicated file are complete.
- [ ] **Milestone 1: The Foundation** (The Skeleton) -> [Milestone_1.md]
- [ ] **Milestone 2: Local Audio & GUI** (The Ears) -> [Milestone_2.md]
- [ ] **Milestone 3: The First Voice Call** (The Connection) -> [Milestone_3.md]
- [ ] **Milestone 4: Multi-User Routing** (The Switchboard) -> [Milestone_4.md]
- [ ] **Milestone 5: Management & Plugins** (The Power) -> [Milestone_5.md]
- [ ] **Milestone 6: Deployment & Automation** (The Release) -> [Milestone_6.md]
---
**Current Status:** Planning Complete. Ready to Initialize.

@@ -0,0 +1,31 @@
# Milestone 1: The Foundation (The Skeleton)
**Goal:** Initialize the project and establish the shared language between client and server.
### 1. Workspace Setup
- [ ] Initialize the root Cargo workspace: run `cargo init --vcs none`, delete `src/`, and replace the generated `[package]` section of the root `Cargo.toml` with a `[workspace]` table whose `members` key is `["core_protocol", "server_node", "client_node"]`.
- [ ] Create crates: `cargo new --lib core_protocol`, `cargo new --bin server_node`, `cargo new --bin client_node`.
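- [ ] Add strict lints: either crate-level attributes such as `#![forbid(unsafe_code)]` at the top of each `lib.rs`/`main.rs`, or a shared `[workspace.lints]` table in the root `Cargo.toml` that each crate opts into with `lints.workspace = true`.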
- [ ] **Dependencies (`core_protocol`):** Add `serde`, `bincode`, `uuid`, `chrono`, `thiserror`, `secrecy` (for zeroing sensitive keys).
- [ ] **Dependencies (`server_node`):** Add `tokio` (full), `tracing`, `tracing-subscriber`, `anyhow`, `dashmap`.
- [ ] **Dependencies (`client_node`):** Add `tokio` (`rt-multi-thread`, plus `net` for `TcpStream`/`UdpSocket` and `time` for `sleep`), `tracing`, `tracing-subscriber`, `anyhow`.
### 2. Protocol Definitions (`core_protocol`)
- [ ] Create `src/tcp_events.rs`. Define `enum TcpEvent { AuthRequest { username: String, ... }, AuthResponse { session_token: u32, ... }, ChannelJoin { ... }, ChatMessage { ... } }` with `#[derive(Serialize, Deserialize)]`.
- [ ] Create `src/udp_packets.rs`. Define `struct VoicePacketHeader { pub session_token: u32, pub sequence_num: u64, pub timestamp: u64 }` with `#[derive(Serialize, Deserialize)]`.
- [ ] Create `src/constants.rs`. Define `pub const SAMPLE_RATE: u32 = 48000;`, `pub const FRAME_SIZE: usize = 960;`, `pub const TCP_PORT: u16 = 8080;`.
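
The 960-sample frame size used throughout the later milestones follows from the sample rate and the 20 ms frame duration. A minimal sketch of how `constants.rs` could derive it rather than hardcode it; `FRAME_DURATION_MS` is a hypothetical name that does not appear in the roadmap:

```rust
/// Sample rate shared by capture, DSP, and the Opus codec.
pub const SAMPLE_RATE: u32 = 48_000;

/// Frame duration in milliseconds (hypothetical constant, assumed from the
/// "960 samples (20ms)" note in Milestone 2).
pub const FRAME_DURATION_MS: u32 = 20;

/// Samples per mono frame: 48_000 * 20 / 1000 = 960.
pub const FRAME_SIZE: usize = (SAMPLE_RATE * FRAME_DURATION_MS / 1000) as usize;
```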
### 3. TCP Handshake (`server_node` & `client_node`)
- [ ] **Server:** In `server_node/src/main.rs`, initialize `tokio::net::TcpListener::bind("0.0.0.0:8080")`.
- [ ] **Server:** Spawn a new `tokio::spawn(async move { ... })` for each incoming `TcpStream`.
- [ ] **Client:** In `client_node/src/network/control.rs`, implement `TcpStream::connect("127.0.0.1:8080")`.
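- [ ] **Shared:** Implement a framing mechanism (e.g., sending a `u32` length prefix before the `bincode`-serialized `TcpEvent`) so the receiver can find message boundaries. TCP is a byte stream, so a single `read` may return a partial message or several messages at once.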
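
The length-prefix framing above could look roughly like the following. This is a sketch under assumptions, not the project's wire format: it uses blocking `std::io` traits instead of tokio so it stays self-contained, treats the payload as opaque bytes (the `bincode` step is left out), and picks big-endian byte order and a 64 KiB limit (`MAX_FRAME_LEN`) that the roadmap does not specify.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on one frame; the roadmap does not specify a limit.
pub const MAX_FRAME_LEN: u32 = 64 * 1024;

/// Writes one frame: a big-endian u32 length followed by the payload bytes.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame. The length prefix is checked against MAX_FRAME_LEN
/// before any buffer is allocated, so a hostile peer cannot force a huge
/// allocation.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    reader.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Rejecting oversized length prefixes in `read_frame` is also one way to satisfy the later "excessively large payloads" validation task: the caller can simply close the connection on `Err`.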
### 4. Login Logic & State
- [ ] **Server State:** Create `server_node/src/state.rs`. Define a `DashMap<u32, UserState>` to store active session tokens.
- [ ] **Authentication Flow:** Client sends `TcpEvent::AuthRequest`. Server generates a random `u32` session token, stores it in `DashMap`, and returns `TcpEvent::AuthResponse`.
- [ ] **Validation:** Ensure the server actively drops the connection if the client sends invalid or excessively large payloads.
### 5. Observability (Logging)
- [ ] **Initialization:** In both binaries' `main.rs`, call `tracing_subscriber::fmt::init()`.
- [ ] **Implementation:** Replace all `println!` calls with `tracing::info!`, `tracing::warn!`, or `tracing::error!`.
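- [ ] **Tracing Context:** Use `#[tracing::instrument]` on core TCP handler functions. It records the function's arguments as span fields, so passing the client IP and session ID as arguments attaches them to every log line emitted inside the handler.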

@@ -0,0 +1,32 @@
# Milestone 2: Local Audio & GUI (The Ears)
**Goal:** Enable the client to process high-quality audio locally and display the interface.
### 1. UI Layout (`client_node/ui`)
- [ ] **Dependencies:** Add `egui`, `eframe`.
- [ ] **Initialization:** In `main.rs`, launch `eframe::run_native`.
- [ ] **Architecture:** Create `struct AppState`. Implement `eframe::App` trait for it.
- [ ] **Layout:** Build the basic classic TeamSpeak UI. Left panel (tree view of hardcoded channels), right panel (text chat log).
### 2. Audio Capture (`client_node/audio/capture.rs`)
- [ ] **Dependencies:** Add `cpal`, `ringbuf`.
- [ ] **Device Setup:** Use `cpal::default_host().default_input_device()`. Build a stream config specifically requesting `48,000 Hz` and `1 channel` (Mono).
- [ ] **Headless Abstraction:** Ensure the `cpal` instantiation is hidden behind a trait so the CI test suite can inject deterministic "sine wave" `f32` vectors instead of requiring a physical microphone.
- [ ] **The Producer:** Create a `ringbuf::HeapRb<f32>` (e.g., 4096 capacity). Split it into `(producer, consumer)`.
- [ ] **Hardware Callback:** Inside the `cpal` data callback, write the raw `f32` samples directly into the `producer`. Follow strict real-time rules here: no allocations, no locks, no blocking calls.
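
The headless abstraction could be as small as one trait with a deterministic test implementation. A minimal sketch, assuming hypothetical names `AudioSource` and `SineSource` (neither is from the roadmap); the real `cpal`-backed implementation would sit behind the same trait:

```rust
use std::f64::consts::PI;

/// Hypothetical abstraction over "something that yields mono f32 samples".
/// A microphone-backed type and the test source below would both implement it.
pub trait AudioSource {
    /// Fills `out` with the next samples from the source.
    fn read(&mut self, out: &mut [f32]);
}

/// Deterministic sine generator for CI, so tests need no physical microphone.
pub struct SineSource {
    frequency_hz: f64,
    sample_rate: u32,
    next_index: u64,
}

impl SineSource {
    pub fn new(frequency_hz: f64, sample_rate: u32) -> SineSource {
        SineSource { frequency_hz: frequency_hz, sample_rate: sample_rate, next_index: 0 }
    }
}

impl AudioSource for SineSource {
    fn read(&mut self, out: &mut [f32]) {
        for sample in out.iter_mut() {
            // Compute the phase in f64 so long runs do not lose precision.
            let t = self.next_index as f64 / self.sample_rate as f64;
            *sample = (2.0 * PI * self.frequency_hz * t).sin() as f32;
            self.next_index += 1;
        }
    }
}
```

Two `SineSource` values built with the same parameters produce identical buffers, which is what makes assertions on the DSP output repeatable.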
### 3. DSP Chain & VAD (`client_node/audio/dsp.rs`)
- [ ] **Dependencies:** Add `webrtc-audio-processing`.
- [ ] **Thread Spawning:** Spawn a standard `std::thread` (not tokio) to act as the Audio Consumer.
- [ ] **Processing Loop:** Pull chunks of exactly `960` samples (20ms) from the `consumer` ringbuffer.
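- [ ] **Filters:** Run the 960 samples through `webrtc-audio-processing` with echo cancellation and noise suppression enabled. WebRTC's audio processing module works on 10 ms frames, so each 20 ms chunk will likely need to be fed as two 480-sample frames.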
- [ ] **Voice Activity Detection (VAD):** Implement `webrtc` VAD or an amplitude threshold calculator. If the chunk is "silence", drop it to save bandwidth.
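
For the amplitude-threshold variant of VAD, a root-mean-square check per chunk is one simple option. This is a sketch, not a replacement for the `webrtc` VAD; `SILENCE_RMS_THRESHOLD` is a hypothetical value that would need tuning against real microphone input:

```rust
/// Hypothetical RMS threshold below which a chunk counts as silence.
pub const SILENCE_RMS_THRESHOLD: f32 = 0.01;

/// Root-mean-square amplitude of a chunk of samples in [-1.0, 1.0].
pub fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_of_squares: f32 = samples.iter().map(|s| s * s).sum();
    (sum_of_squares / samples.len() as f32).sqrt()
}

/// Returns true when the chunk is loud enough to be worth transmitting.
pub fn is_voice(samples: &[f32]) -> bool {
    rms(samples) > SILENCE_RMS_THRESHOLD
}
```

A plain threshold tends to clip the start and end of words; a short hold time after the last voiced chunk is a common refinement, but it is left out here.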
### 4. Global Hotkeys / Push-To-Talk (PTT)
- [ ] **Dependencies:** Add `global-hotkey` (or `rdev`).
- [ ] **Event Loop:** Spawn a thread to listen for a specific keycode (e.g., `Mouse4` or `V`).
- [ ] **Integration:** Update an `Arc<AtomicBool>` `is_transmitting` flag from the hotkey thread. The DSP thread reads this flag; if it is false, the DSP thread discards the audio chunks instead of forwarding them.
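
The flag hand-off between the hotkey thread and the DSP thread can be sketched with the standard library alone. The helper names `new_ptt_flag` and `gate_chunk` are hypothetical; the hotkey crate's event loop is not shown:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Creates the shared push-to-talk flag, initially "not transmitting".
pub fn new_ptt_flag() -> Arc<AtomicBool> {
    Arc::new(AtomicBool::new(false))
}

/// Called by the DSP thread for every chunk: passes the chunk through while
/// the key is held and returns None (chunk discarded) otherwise.
pub fn gate_chunk<'a>(is_transmitting: &AtomicBool, chunk: &'a [f32]) -> Option<&'a [f32]> {
    if is_transmitting.load(Ordering::Relaxed) {
        Some(chunk)
    } else {
        None
    }
}
```

`Ordering::Relaxed` is assumed to be enough here because the flag guards no other shared data; the hotkey thread would call `flag.store(true, Ordering::Relaxed)` on key-down and `store(false, ...)` on key-up.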
### 5. Local Loopback & UI Bridge
- [ ] **Loopback Thread:** For testing, route the post-DSP 960-sample chunks directly into a `cpal` output stream (Speaker) to physically hear the microphone quality and VAD gating.
- [ ] **UI State Bridge:** Use `tokio::sync::mpsc` or simple `Arc<AtomicBool>` to signal the UI thread when VAD triggers, so `egui` can draw the green "Active Speaker" dot next to the user's name.
- [ ] **Audio Dumper UI:** Add a checkbox in the `egui` settings panel. When checked, write the 960-sample chunks to `raw_mic.wav` and `post_dsp.wav` using the `hound` crate for local inspection.

@@ -0,0 +1,28 @@
# Milestone 3: The First Voice Call (The Connection)
**Goal:** Successfully transmit compressed voice data to the server and back.
### 1. Opus Encoder (`client_node/audio/codec.rs`)
- [ ] **Dependencies:** Add `audiopus`.
- [ ] **Initialization:** Create an `audiopus::coder::Encoder` with `48,000 Hz`, `Mono`, and `Application::Voip`. Set the bitrate dynamically or hardcode to `48,000 bps` for testing.
- [ ] **Encoding Loop:** Take the 960-sample `f32` chunks from the DSP thread and call `encoder.encode_float()`. It writes a variable-length Opus packet into a caller-provided `&mut [u8]` buffer and returns the number of bytes written.
### 2. UDP Transport & Formatting (`client_node/network/voice.rs`)
- [ ] **Packet Assembly:** Construct the UDP binary packet. Byte 0-3: `SessionToken` (from TCP handshake). Byte 4-11: `SequenceNumber` (incremented every chunk). Byte 12-19: `Timestamp` (`u64`). Byte 20+: The Opus payload.
- [ ] **Fuzz Testing:** Use `proptest` to feed random byte arrays to the UDP packet parser and assert that it never panics.
- [ ] **Socket Binding:** Use `tokio::net::UdpSocket::bind("0.0.0.0:0")`.
- [ ] **Transmission:** Send the assembled binary packet to the server's UDP port at `127.0.0.1:8080`.
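
The 20-byte header layout from the packet assembly step can be sketched as a pair of serialize/parse functions. Assumptions to note: the roadmap does not state a byte order, so big-endian is chosen here for illustration, and the `VoicePacket` type and `HEADER_LEN` constant are hypothetical names. The parser returns `None` on short input rather than panicking, which is the property the `proptest` task is meant to check.

```rust
/// Fixed header size: 4 (token) + 8 (sequence) + 8 (timestamp) bytes.
pub const HEADER_LEN: usize = 20;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VoicePacket {
    pub session_token: u32,
    pub sequence_num: u64,
    pub timestamp: u64,
    pub opus_payload: Vec<u8>,
}

impl VoicePacket {
    /// Serializes header and payload. Big-endian is an assumption.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(HEADER_LEN + self.opus_payload.len());
        out.extend_from_slice(&self.session_token.to_be_bytes());
        out.extend_from_slice(&self.sequence_num.to_be_bytes());
        out.extend_from_slice(&self.timestamp.to_be_bytes());
        out.extend_from_slice(&self.opus_payload);
        out
    }

    /// Parses a datagram; any input shorter than the header yields None.
    pub fn parse(datagram: &[u8]) -> Option<VoicePacket> {
        if datagram.len() < HEADER_LEN {
            return None;
        }
        let mut token = [0u8; 4];
        token.copy_from_slice(&datagram[0..4]);
        let mut sequence = [0u8; 8];
        sequence.copy_from_slice(&datagram[4..12]);
        let mut timestamp = [0u8; 8];
        timestamp.copy_from_slice(&datagram[12..20]);
        Some(VoicePacket {
            session_token: u32::from_be_bytes(token),
            sequence_num: u64::from_be_bytes(sequence),
            timestamp: u64::from_be_bytes(timestamp),
            opus_payload: datagram[HEADER_LEN..].to_vec(),
        })
    }
}
```

Whichever byte order is chosen, client and server have to agree on it, and it should match whatever `bincode` configuration `VoicePacketHeader` is serialized with if that path is used instead.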
### 3. Server Echo Relay (`server_node/udp_relay.rs`)
- [ ] **Socket Setup:** `tokio::net::UdpSocket::bind("0.0.0.0:8080")`.
- [ ] **Validation Loop:** `recv_from` the socket. Parse the first 4 bytes as `u32`. Check the `DashMap` to ensure the `SessionToken` is valid.
- [ ] **Echo Mode:** Temporarily, immediately `send_to` the exact same `&[u8]` buffer back to the originating client `SocketAddr` to test the round-trip.
- [ ] **Zero-Byte Keep-Alives:** Implement client logic to send an empty 0-byte UDP packet every 5 seconds. The server ignores these packets before token validation; their only purpose is to keep the client's NAT mapping open.
### 4. Opus Decoder & Playback (`client_node/audio/playback.rs`)
- [ ] **Receiving:** Client UDP socket receives the echoed packet. Extract the Opus payload bytes.
- [ ] **Decoding:** Initialize `audiopus::coder::Decoder` (`48,000 Hz`, `Mono`). Call `decoder.decode_float()` to retrieve the 960 `f32` samples.
- [ ] **Playback:** Push the 960 samples into a secondary `cpal` output ringbuffer (The Speaker thread).
### 5. TCP Auto-Reconnect & UDP Chaos Simulator
- [ ] **Auto-Reconnect Logic:** Wrap the TCP connection logic in a `loop`. If the socket drops (`read` returns 0), wait with `tokio::time::sleep(Duration::from_secs(2))` and attempt to reconnect. Send a `ReconnectRequest` with the existing `SessionToken` instead of a full `AuthRequest`.
- [ ] **Chaos Simulator UI:** In the `egui` Developer Settings, add a slider for `Packet Loss %`. Intercept outgoing UDP packets in `voice.rs`; use `rand::thread_rng` to randomly drop packets based on the slider value before hitting the socket.

@@ -0,0 +1,25 @@
# Milestone 4: Multi-User Routing (The Switchboard)
**Goal:** Support multiple users in rooms with stable, synchronized audio.
### 1. Server State & Broadcast (`server_node/udp_relay.rs`)
- [ ] **Data Structure:** Expand the `DashMap` to track `ChannelId -> Vec<SessionToken>`.
- [ ] **Routing Loop:** When a UDP packet arrives, look up the sender's `ChannelId`. Iterate through all other `SessionToken`s in that channel, grab their `SocketAddr`, and use `UdpSocket::send_to` to forward the exact bytes, without decoding or re-encoding the Opus payload.
- [ ] **Whisper Lists (Direct UDP):** Modify the UDP header to include a `Target_SessionToken`. If this value is `!= 0`, bypass the channel iteration and forward the packet strictly to that target's `SocketAddr`.
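
The recipient selection for both channel broadcast and whisper can be separated from the socket code and tested on its own. A sketch under assumptions: it uses plain `HashMap`s instead of `DashMap` to stay dependency-free, and the `Routing` type with its `join` and `recipients` methods is hypothetical.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

pub type SessionToken = u32;
pub type ChannelId = u32;

/// Hypothetical routing table (plain HashMaps stand in for DashMap here).
#[derive(Default)]
pub struct Routing {
    channel_members: HashMap<ChannelId, Vec<SessionToken>>,
    user_channel: HashMap<SessionToken, ChannelId>,
    user_addr: HashMap<SessionToken, SocketAddr>,
}

impl Routing {
    /// Registers a user in a channel together with their UDP address.
    pub fn join(&mut self, token: SessionToken, channel: ChannelId, addr: SocketAddr) {
        self.channel_members.entry(channel).or_insert_with(Vec::new).push(token);
        self.user_channel.insert(token, channel);
        self.user_addr.insert(token, addr);
    }

    /// Returns the addresses a packet from `sender` should be forwarded to.
    /// A non-zero `target` is a whisper and bypasses the channel lookup.
    pub fn recipients(&self, sender: SessionToken, target: SessionToken) -> Vec<SocketAddr> {
        if target != 0 {
            return self.user_addr.get(&target).copied().into_iter().collect();
        }
        let channel = match self.user_channel.get(&sender) {
            Some(channel) => channel,
            None => return Vec::new(), // unknown sender: forward nothing
        };
        self.channel_members
            .get(channel)
            .into_iter()
            .flatten()
            .filter(|token| **token != sender)
            .filter_map(|token| self.user_addr.get(token).copied())
            .collect()
    }
}
```

The relay loop would then call `send_to` once per returned address with the unmodified datagram. Using `0` as "no whisper target" means `0` can never be issued as a real session token.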
### 2. Client Jitter Buffer (`client_node/network/jitter.rs`)
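- [ ] **Data Structure:** Create a `std::collections::BinaryHeap` wrapped in a Mutex, ordered by the packet's `SequenceNumber`. `BinaryHeap` is a max-heap, so wrap entries in `std::cmp::Reverse` to pop the lowest `SequenceNumber` first.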
- [ ] **Buffering Logic:** When packets arrive from the UDP socket, push them into the heap. Do NOT start popping until the heap contains at least 2 packets (40ms "Watermark").
- [ ] **Tick Loop:** Every 20ms, the audio playback thread pops the next expected `SequenceNumber`.
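
A minimal sketch of the jitter buffer described above, assuming hypothetical names (`JitterBuffer`, `Tick`, `WATERMARK`) and leaving out the `Mutex` wrapper. It also shows where the missing-packet and late-packet cases surface, and it assumes that an empty buffer should fall back to re-buffering, which the roadmap does not specify.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Packets to collect before playback starts (2 x 20 ms = 40 ms).
pub const WATERMARK: usize = 2;

#[derive(Debug, PartialEq, Eq)]
pub enum Tick {
    /// Not enough packets buffered yet; the caller plays silence.
    Buffering,
    /// The expected packet was available.
    Packet(Vec<u8>),
    /// The expected packet has not arrived; the caller should run PLC.
    Missing,
}

pub struct JitterBuffer {
    // Reverse turns the max-heap into a min-heap on sequence number.
    heap: BinaryHeap<Reverse<(u64, Vec<u8>)>>,
    next_seq: Option<u64>,
    started: bool,
}

impl JitterBuffer {
    pub fn new() -> JitterBuffer {
        JitterBuffer { heap: BinaryHeap::new(), next_seq: None, started: false }
    }

    /// Stores a packet. Returns false if it is older than what has already
    /// been played and was therefore dropped.
    pub fn push(&mut self, seq: u64, payload: Vec<u8>) -> bool {
        if let Some(next) = self.next_seq {
            if seq < next {
                return false;
            }
        }
        self.heap.push(Reverse((seq, payload)));
        true
    }

    /// Called every 20 ms by the playback side.
    pub fn tick(&mut self) -> Tick {
        if !self.started {
            if self.heap.len() < WATERMARK {
                return Tick::Buffering;
            }
            self.started = true;
            // Playback starts at the lowest buffered sequence number.
            self.next_seq = self.heap.peek().map(|entry| (entry.0).0);
        }
        let expected = match self.next_seq {
            Some(seq) => seq,
            None => return Tick::Buffering,
        };
        // Discard stale duplicates that sit below the expected number.
        while self.heap.peek().map_or(false, |entry| (entry.0).0 < expected) {
            self.heap.pop();
        }
        match self.heap.peek().map(|entry| (entry.0).0) {
            // Underrun: go back to buffering (an assumption of this sketch).
            None => {
                self.started = false;
                Tick::Buffering
            }
            Some(seq) if seq == expected => {
                self.next_seq = Some(expected + 1);
                match self.heap.pop() {
                    Some(Reverse((_, payload))) => Tick::Packet(payload),
                    None => Tick::Missing,
                }
            }
            Some(_) => {
                self.next_seq = Some(expected + 1);
                Tick::Missing
            }
        }
    }
}
```

On `Tick::Missing` the playback thread would ask the Opus decoder for concealment audio; on `Tick::Buffering` it would output silence.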
### 3. Packet Loss Concealment (PLC) & Playback
- [ ] **Missed Sequences:** If the next expected `SequenceNumber` is absent from the Jitter Buffer at tick time (a number was skipped), call `decoder.decode_float()` with `None` as the input packet (a null buffer at the Opus library level). This triggers Opus's internal PLC synthesis.
- [ ] **Late Packets:** If a packet arrives with a `SequenceNumber` older than what has already been played, immediately drop it.
### 4. TCP Chat & Presence Sync
- [ ] **Broadcast Events:** When a user joins or leaves a channel, the server broadcasts a `TcpEvent::UserJoined` or `UserLeft` to all users in that channel.
- [ ] **Chat Routing:** Client sends `TcpEvent::ChatMessage`. Server broadcasts it to the relevant channel.
- [ ] **UI Updates:** The client parses these TCP events to update the `egui` Tree View and append messages to the Chat Log in real-time.
### 5. Diagnostics Overlay
- [ ] **Debug UI:** Implement a developer panel (toggled via `F3` or a button) in `egui`.
- [ ] **Metrics:** Hook into the Jitter Buffer length, calculate packet loss % over the last 10 seconds, and ping the server via TCP to display live network health on the UI.
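
The packet loss percentage can be estimated from the sequence numbers seen during the measurement window. A rough sketch with a hypothetical function name; it only counts gaps between the first and last received number, so losses at the very edges of the window go unnoticed:

```rust
/// Estimates packet loss (in percent) from the sequence numbers received
/// during one measurement window, e.g. the last 10 seconds.
pub fn packet_loss_percent(received_seqs: &[u64]) -> f64 {
    // Sort and de-duplicate so reordered or repeated packets do not skew
    // the count.
    let mut unique = received_seqs.to_vec();
    unique.sort();
    unique.dedup();
    let (first, last) = match (unique.first(), unique.last()) {
        (Some(first), Some(last)) => (*first, *last),
        _ => return 0.0, // nothing received, nothing to measure
    };
    let expected = last - first + 1;
    let lost = expected - unique.len() as u64;
    lost as f64 * 100.0 / expected as f64
}
```

For example, receiving sequence numbers 1, 2, 4, and 5 means 5 packets were expected and 1 was lost, which gives 20%.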

@@ -0,0 +1,25 @@
# Milestone 5: Management & Plugins (The Power)
**Goal:** Add persistent storage, an admin web dashboard, and the Wasm sandbox.
### 1. Database Setup (`server_node/database.rs`)
- [ ] **Dependencies:** Add `sqlx` with the `sqlite` and `runtime-tokio` features.
- [ ] **Schema Migrations:** Create `users` (ID, Name, Hash, Role) and `channels` (ID, Name, ParentID, RequiredRole, Bitrate). Run migrations on startup via `sqlx::migrate!()`.
- [ ] **Permissions Check:** During the TCP `ChannelJoin` event, query the DB to ensure the user's Role is `>=` the `RequiredRole` of the channel.
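
If roles form a simple hierarchy, the comparison can be expressed by deriving an ordering on an enum. The role names below are hypothetical placeholders; the roadmap only says that `users` and `channels` carry a role column.

```rust
/// Hypothetical role ladder. Variant order defines the ranking, so later
/// variants compare as greater than earlier ones.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Guest,
    Member,
    Moderator,
    Admin,
}

/// Returns true when the user's role is at least the channel's required role.
pub fn can_join(user_role: Role, required_role: Role) -> bool {
    user_role >= required_role
}
```

Because the derived ordering follows declaration order, inserting a new role in the middle of the enum changes the ranking; storing an explicit integer rank in the database avoids that coupling.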
### 2. Web Admin Dashboard (`server_node/admin_api`)
- [ ] **Dependencies:** Add `axum`, `rust-embed`, `jsonwebtoken`, `prometheus`.
- [ ] **Static Assets:** Build the HTML/CSS for the classic control panel UI. Use `rust-embed` to compile these assets directly into the server binary.
- [ ] **API Endpoints:** Build POST routes for `/api/kick`, `/api/ban`, `/api/channel/create`, and `/api/channel/bitrate`.
- [ ] **Telemetry:** Expose a `/metrics` Prometheus endpoint to track high-level health (concurrent users, UDP packet drops, CPU usage).
- [ ] **Security:** Implement an Axum middleware that verifies a Bearer JWT before allowing access to the API routes.
### 3. Server Bookmarks (Client-Side Persistence)
- [ ] **Local Storage:** Use `directories` crate to find the OS config path (`~/.config/voiceapp` or `%APPDATA%`).
- [ ] **Serialization:** Serialize a `Vec<ServerBookmark>` to a `bookmarks.toml` or `.json` file using `serde`.
- [ ] **UI Integration:** Render the saved bookmarks in the `egui` login screen so users can 1-click connect.
### 4. Wasm Sandbox (`client_node/plugins.rs`)
- [ ] **Dependencies:** Add `extism` to the client.
- [ ] **Initialization:** Load external `.wasm` files from a local `plugins/` folder.
- [ ] **Plugin Hooks (Chat):** Before rendering a chat message, serialize it to JSON, allocate memory in the Wasm instance, call the Wasm function, and read the modified JSON back.
- [ ] **Plugin Hooks (Audio):** Pass the `&mut [f32]` array to the Wasm module *before* the Opus encoder, allowing plugins to mutate the raw audio (Voice Changers, Soundboards).

@@ -0,0 +1,20 @@
# Milestone 6: Deployment & Automation (The Release)
**Goal:** Finalize security and automate the installation for self-hosters.
### 1. Network Encryption
- [ ] **TCP TLS:** Wrap the server's `TcpListener` and client's `TcpStream` using `rustls` (via `tokio-rustls` for the async sockets). Generate or require self-signed certificates for the server.
- [ ] **UDP Encryption:** Add `chacha20poly1305`. After Opus encoding and before sending over UDP, encrypt the payload bytes with a symmetric key exchanged over the TLS-protected TCP connection. Every packet encrypted under the same key needs a unique nonce.
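
ChaCha20-Poly1305 takes a 96-bit (12-byte) nonce that must never repeat under one key. One possible construction, offered as an assumption rather than the project's design, is to concatenate the 4-byte session token and the 8-byte sequence number from the voice header. The function name `packet_nonce` is hypothetical, and the call into the `chacha20poly1305` crate is deliberately not shown.

```rust
/// ChaCha20-Poly1305 uses a 96-bit (12-byte) nonce.
pub const NONCE_LEN: usize = 12;

/// Builds a per-packet nonce from header fields (an assumed scheme):
/// bytes 0..4 = session token, bytes 4..12 = sequence number, big-endian.
pub fn packet_nonce(session_token: u32, sequence_num: u64) -> [u8; NONCE_LEN] {
    let mut nonce = [0u8; NONCE_LEN];
    nonce[..4].copy_from_slice(&session_token.to_be_bytes());
    nonce[4..].copy_from_slice(&sequence_num.to_be_bytes());
    nonce
}
```

This only stays safe if the sequence number never restarts while the same key is in use, so a reconnect that resets the counter would also need a fresh key.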
### 2. Dockerization
- [ ] **Dockerfile:** Write a multi-stage `Dockerfile`. Stage 1: `cargo build --release` using a minimal rust alpine image. Stage 2: Copy the binary to a scratch/debian container.
- [ ] **Docker Compose:** Write `docker-compose.yml` mapping ports `8080/tcp` (Control), `8080/udp` (Voice), and `3000/tcp` (Admin Dashboard), and volume-mapping the SQLite database file.
### 3. CI/CD & Auto-Installer
- [ ] **GitHub Actions:** Create `.github/workflows/release.yml`. Trigger on tags. Cross-compile binaries for `x86_64-linux`, `x86_64-windows`, and `aarch64-macos`.
- [ ] **Security Auditing:** Add `cargo audit` to the pipeline so the build fails automatically when a dependency has a known security advisory in the RustSec database.
- [ ] **Install Script:** Write `scripts/install.sh`. The script downloads the correct binary via GitHub API, creates a non-root `voiceapp` user, and writes a `/etc/systemd/system/voiceapp.service` file.
### 4. The Final Stress Test
- [ ] **Load Tester Bot:** Build a standalone Rust binary (`tests/load_tester.rs`).
- [ ] **Concurrency:** Use Tokio to spawn 100+ async tasks. Each task connects via TCP, gets a SessionToken, and then blasts pre-recorded `.wav` data over UDP to the server at exactly 20ms intervals.
- [ ] **Verification:** Use the Admin Dashboard to verify the server handles the packet throughput without CPU spiking or crashing.