Add remaining project files
74
Documentation/Concept/General_idea.md
Normal file
@@ -0,0 +1,74 @@
|
||||
# General Concept: Rust Voice Communication App
|
||||
|
||||
## 1. Core Philosophy
|
||||
|
||||
The application operates on a "Switchboard and Walkie-Talkie" model, designed for instant, drop-in voice communication.
|
||||
|
||||
* **The Switchboard (Server):** A central routing hub. It maintains the blueprint of all channels and tracks which users are in which rooms. It **does not** process audio; it strictly relays data to the correct destinations.
|
||||
* **The Walkie-Talkies (Clients):** The desktop applications. They capture local microphone input, compress it, send it to the server, and decompress incoming audio for playback.
|
||||
|
||||
## 2. The User Experience (Core Features)
|
||||
|
||||
* **Persistent Room List:** A static hierarchy of voice channels displayed on the side panel.
|
||||
* **Drop-In Audio:** No ringing or answering. Users click a room and are immediately broadcasting and receiving audio.
|
||||
* **Text Chat:** A synchronized text channel for every voice room, allowing users to share links and messages with current occupants.
|
||||
* **Active Speaker Indicators:** Visual cues (e.g., green outlines) next to user avatars that illuminate when voice data is being transmitted.
|
||||
* **Hardware Controls:** Easily accessible global Mute (microphone) and Deafen (headphones) toggles.
|
||||
|
||||
## 3. The Two-Lane Network Architecture
|
||||
|
||||
To guarantee a responsive UI while preventing robotic, lagging audio, the app utilizes two simultaneous network streams:
|
||||
|
||||
* **The Control Lane (TCP):** Ordered and reliable, at the cost of extra latency whenever a packet has to be retransmitted. Used for text messages, channel movements, authentication, and state updates, where no message may be lost.
|
||||
* **The Voice Lane (UDP):** Low-latency but unreliable. Compressed audio packets are sent continuously with no retransmission. If a packet is lost or arrives too late, the client skips it and moves on to the newest data, trading occasional gaps for minimal audio delay.
|
||||
|
||||
## 4. Cross-Platform Strategy
|
||||
|
||||
The app is natively compiled for Windows, macOS, and Linux from a single Rust codebase.
|
||||
|
||||
* **Audio I/O:** Handled via the `cpal` crate, which wraps WASAPI (Windows), CoreAudio (macOS), and ALSA (Linux; PulseAudio and PipeWire setups are usually reached through their ALSA compatibility layers).
|
||||
* **User Interface:** Powered by `egui` and `eframe`, rendering through either the `glow` (OpenGL) or `wgpu` (Vulkan, Metal, DirectX 12) backend.
|
||||
* **Global Hotkeys:** Handled via OS-level keyboard hooks (through a crate such as `global-hotkey` or `rdev`) to capture Push-to-Talk events even when the application is minimized or unfocused.
|
||||
|
||||
## 5. WebAssembly (Wasm) Plugin System
|
||||
|
||||
A secure, language-agnostic extension framework that allows users to modify the client's behavior without altering the core Rust binary.
|
||||
|
||||
* **The Sandbox:** Plugins run inside an isolated Wasm runtime. A malicious or broken plugin can crash its own sandbox but cannot crash the main voice client or access unauthorized system files.
|
||||
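* **Language Agnostic:** Users can write plugins in any language that can target WebAssembly (for example Rust or Go directly, or Python and JavaScript through toolchains that bundle an interpreter), compiling them down to a `.wasm` file.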
|
||||
* **Event Hooks:** The core application broadcasts specific triggers into the sandbox (e.g., `OnUserJoinChannel`, `OnAudioFrameCaptured`), allowing plugins to react to network events, manipulate local audio streams (e.g., voice changers), or automate chat functions.
|
||||
|
||||
## 6. Audio DSP Pipeline (Quality Control)
|
||||
|
||||
Raw microphone input is inherently messy. Before audio is compressed and sent to the network, it must pass through a local Digital Signal Processing (DSP) chain to ensure professional voice quality.
|
||||
|
||||
* **Acoustic Echo Cancellation (AEC):** Prevents the user's microphone from re-broadcasting audio coming from their own speakers.
|
||||
* **Noise Suppression:** Filters out steady background noise (e.g., computer fans, hum) and reduces transient noise such as keyboard clacking, using a lightweight algorithm (like WebRTC DSP or RNNoise).
|
||||
* **Voice Activity Detection (VAD) / Noise Gate:** Automatically stops transmitting network packets when the user is not actively speaking. A silent participant then costs only keep-alive traffic instead of a continuous audio stream.
|
||||
|
||||
## 7. Identity and Authorization
|
||||
|
||||
The system employs a strict Role-Based Access Control (RBAC) architecture to maintain order within the server.
|
||||
|
||||
* **The Hierarchy:** Users are assigned roles (e.g., Guest, Member, Moderator, Admin) which dictate their permissions.
|
||||
* **Channel Permissions:** Specific rooms can be locked behind passwords or restricted to specific roles.
|
||||
* **Moderation Tools:** Authorized users have the network authority to send `Kick`, `Ban`, or `ServerMute` commands, which the server enforces by dropping the target's network connections or ignoring their UDP packets.
|
||||
|
||||
## 8. Security and Resiliency
|
||||
|
||||
The application is designed to survive hostile network conditions and protect user privacy.
|
||||
|
||||
* **Stateful Auto-Reconnect:** If the TCP control lane drops due to a network hiccup, the client enters a "Reconnecting" state. It silently attempts to re-establish the connection and rejoin its previous voice channel without requiring user interaction.
|
||||
* **Encryption-in-Transit:** All TCP control traffic (text chat, logins) is wrapped in TLS.
|
||||
* **Voice Encryption:** UDP voice packets are encrypted with a lightweight authenticated cipher (like ChaCha20-Poly1305 or AES-GCM), using a symmetric key exchanged securely during the initial TCP handshake, preventing packet-sniffing on public networks.
|
||||
|
||||
## 9. Self-Hosting and Web Administration
|
||||
The server is designed for decentralized, user-hosted deployment. It compiles into a single, standalone executable that requires no external dependencies (no separate web servers or database installations).
|
||||
|
||||
* **Tri-Port Architecture:** The single server binary binds to three ports simultaneously:
|
||||
* TCP Control Lane (Client connections)
|
||||
* UDP Voice Lane (Audio routing)
|
||||
* HTTP Web Server (Admin dashboard)
|
||||
* **The Web Dashboard (`axum`):** A lightweight, embedded web server provides a visual interface for server owners to manage their instance from any web browser.
|
||||
* **Embedded Assets (`rust-embed`):** The entire HTML/CSS/JS frontend for the admin dashboard is compiled directly into the Rust server binary. The server hosts its own admin panel from memory.
|
||||
* **Admin REST API:** The web dashboard communicates with the core server via secure HTTP endpoints (e.g., `GET /api/users`, `POST /api/kick/:id`), protected by standard JWT authentication. This API interacts directly with the live concurrent state (`DashMap`) of the voice server.
|
||||
27
Documentation/Concept/UI_Mockups.md
Normal file
@@ -0,0 +1,27 @@
|
||||
# UI & Aesthetics: Reference Mockups
|
||||
|
||||
This project prioritizes a simple, utilitarian, and highly functional aesthetic, inspired heavily by classic communication tools like TeamSpeak 3. Below are the reference designs for both the main client application and the web administration dashboard.
|
||||
|
||||
## 1. The Main Client Application
|
||||
|
||||
The desktop client is completely focused on raw functionality, utilizing a basic blocky layout without distracting glassmorphism or overly stylized elements.
|
||||
|
||||

|
||||
|
||||
**Key Design Elements:**
|
||||
* **Channel Hierarchy (Left):** A straightforward, folder-like tree view. Active speakers are indicated by simple colored dots next to their names.
|
||||
* **Chat Interface (Right):** A plain text chat log focused on high readability and efficiency.
|
||||
* **Color Palette:** Standard system greys and soft dark themes. The focus is on low visual noise so the app can fade into the background.
|
||||
|
||||
---
|
||||
|
||||
## 2. The Web Admin Dashboard
|
||||
|
||||
The server administration dashboard (served via `axum` and embedded via `rust-embed`) provides a clean, classic control panel overview of the node's health.
|
||||
|
||||

|
||||
|
||||
**Key Design Elements:**
|
||||
* **Raw Data Focus:** The main panel highlights raw data tables, basic server logs, and straightforward metric charts.
|
||||
* **Sidebar Navigation:** A no-nonsense sidebar with plain text links to Users, Channels, Roles, and Settings.
|
||||
* **Utility Over Flash:** Designed strictly for server administrators who need to view logs and adjust permissions quickly without visual clutter.
|
||||
BIN
Documentation/Concept/admin_dashboard_ui.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 633 KiB |
BIN
Documentation/Concept/main_client_ui.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 522 KiB |
78
Documentation/High_level_plan/Technical_Specs.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# Technical Specifications & Standards
|
||||
|
||||
## 1. Database Architecture (Self-Hosted State)
|
||||
|
||||
The server uses an embedded, file-based database for persistent storage, allowing the server to be a single binary.
|
||||
|
||||
* **Core Library:** `sqlx` with SQLite. Queries go through the `sqlx` macros (`query!`, `query_as!`) for compile-time verification, and all user input is passed as bind parameters, which is what prevents SQL injection.
|
||||
* **Cryptography Standard:** `rust-argon2` for password hashing. Passwords are never stored or transmitted in plain text.
|
||||
* **High-Level Schema Map:**
|
||||
* `Users`: ID, Username, Argon2_Hash, Global_Role.
|
||||
* `Channels`: ID, Parent_ID (for nesting), Name, Is_Voice, Required_Role.
|
||||
* `Bans`: IP_Address, User_ID, Expiry_Date.
|
||||
|
||||
## 2. WebAssembly (Wasm) Plugin API
|
||||
|
||||
The plugin system must be universally accessible (polyglot) but strictly sandboxed.
|
||||
|
||||
* **Core Library:** `extism`. (Extism is a better fit than raw Wasmtime for this use case because it handles passing strings and byte arrays between the host and the plugin, avoiding manual memory pointer math.)
|
||||
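* **Data Exchange Standard:** All structured data (events, state queries, chat) passed between the Rust Host and the Wasm Guest is serialized using `JSON`. Audio buffers are the exception: they are passed as raw sample bytes.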
|
||||
* **The API Boundary (What plugins CAN do):**
|
||||
* *Read-Only State:* Plugins can query the current channel layout and user list.
|
||||
* *Intercept Audio:* Plugins can request the raw f32 audio buffer *before* Opus encoding to apply voice effects.
|
||||
* *Inject Chat:* Plugins can send text messages via the bot/plugin API.
|
||||
* **The API Boundary (What plugins CANNOT do):** File system access and raw network socket access are strictly denied at the Wasm runtime level.
|
||||
|
||||
## 3. UI and Concurrency Architecture (The Actor Pattern)
|
||||
|
||||
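`egui` is an immediate-mode GUI: the entire interface is rebuilt on the UI thread every frame (typically up to 60 times per second). If that thread blocks waiting for a network packet, the app freezes. We must use the **Actor Pattern** to keep the UI and the network isolated.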
|
||||
|
||||
* **Core Libraries:** `eframe` (for `egui`) and `tokio::sync`.
|
||||
* **The Downstream (UI to Network):** Uses `tokio::sync::mpsc` (Multi-Producer, Single-Consumer). When a user clicks "Connect," the UI sends an enum variant `UiAction::Connect(ip)` down the channel and immediately returns to drawing the screen.
|
||||
* **The Upstream (Network to UI):** Uses `tokio::sync::watch`. The background Tokio network thread holds the "Master State" (who is speaking, who is in the channel). It pushes updates to the `watch` channel. The UI simply reads the latest value from this channel on every frame and draws it.
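
The two channels above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: `std::sync::mpsc` stands in for `tokio::sync::mpsc`, an `Arc<RwLock<_>>` stands in for `tokio::sync::watch`, and the names `MasterState`, `spawn_network_actor`, and the `Disconnect`/`Quit` variants are hypothetical.

```rust
use std::sync::mpsc;
use std::sync::{Arc, RwLock};
use std::thread;

/// Commands the UI sends downstream. Only `Connect` appears in the spec;
/// the other variants are assumptions added for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum UiAction {
    Connect(String),
    Disconnect,
    Quit,
}

/// The "Master State" owned by the network side (hypothetical shape).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct MasterState {
    pub connected_to: Option<String>,
}

/// Spawns the network actor. The UI keeps the returned sender and only ever
/// reads `state`; the actor is the single writer.
pub fn spawn_network_actor(
    state: Arc<RwLock<MasterState>>,
) -> (mpsc::Sender<UiAction>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // The actor drains its mailbox; the UI thread never waits on this loop.
        for action in rx {
            match action {
                UiAction::Connect(addr) => {
                    state.write().unwrap().connected_to = Some(addr);
                }
                UiAction::Disconnect => {
                    state.write().unwrap().connected_to = None;
                }
                UiAction::Quit => break,
            }
        }
    });
    (tx, handle)
}
```

In the real client the UI would call `tx.send(...)` from a button handler and read the latest state once per frame, so neither side ever blocks on the other.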
|
||||
|
||||
## 4. Audio Engine & DSP Standards
|
||||
|
||||
Real-time audio requires strict mathematical constraints to guarantee low latency and prevent lag.
|
||||
|
||||
* **Core Libraries:** `audiopus` (Opus compression), `cpal` (Hardware IO), and `webrtc-audio-processing` (Rust bindings for Google's WebRTC DSP).
|
||||
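* **DSP Pipeline (Crucial for preventing echo):** Raw Mic Audio -> WebRTC Acoustic Echo Cancellation (AEC) -> WebRTC Noise Suppression -> Opus Encoder -> UDP Socket. AEC runs first because it works from the unmodified microphone signal, using the speaker output as its reference.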
|
||||
* **Mathematical Standards:**
|
||||
* **Sample Rate:** Strictly locked to `48,000 Hz` (48kHz). This is Opus's native fullband rate. The codec also accepts 8, 12, 16, and 24 kHz, but the project uses a single rate everywhere.
|
||||
* **Frame Size:** Strictly locked to `20 milliseconds`. At 48kHz, this is exactly `960 samples` per frame (48,000 × 0.020). Opus only accepts specific frame durations (2.5, 5, 10, 20, 40, or 60 ms), so every call to the encoder must supply exactly 960 samples.
|
||||
* **Channels:** Microphone input is captured in `Mono` (1 channel). Speaker output is played in `Stereo` (2 channels, allowing for 3D positional audio later).
|
||||
* **Bitrate:** Variable Bitrate (VBR) targeting `48 kbps`. This gives clear voice at roughly 6 KB/s per stream (about 120 bytes of Opus data per 20 ms frame, before headers).
|
||||
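
The frame and bandwidth figures in this section follow from simple arithmetic, sketched below. The function names are hypothetical, and the payload figure is an average: with VBR, individual frames will be larger or smaller.

```rust
pub const SAMPLE_RATE_HZ: u32 = 48_000;
pub const FRAME_MS: u32 = 20;

/// Samples in one frame: rate (samples/s) * duration (ms) / 1000.
pub const fn samples_per_frame(sample_rate_hz: u32, frame_ms: u32) -> u32 {
    sample_rate_hz * frame_ms / 1000
}

/// Average Opus payload per frame in bytes: bitrate (bits/s) * duration (ms) / 1000 / 8.
pub const fn avg_payload_bytes(bitrate_bps: u32, frame_ms: u32) -> u32 {
    bitrate_bps * frame_ms / 1000 / 8
}

/// Packets sent per second while the user is speaking.
pub const fn packets_per_second(frame_ms: u32) -> u32 {
    1000 / frame_ms
}
```

For the locked values this gives 960 samples per frame, 50 packets per second, and an average of 120 payload bytes per packet at 48 kbps.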
|
||||
## 5. Network Transport & NAT Traversal Strategy
|
||||
Home routers block unsolicited incoming UDP traffic. We must define how voice packets survive NAT (Network Address Translation) firewalls.
|
||||
|
||||
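* **NAT Keep-Alive Standard (UDP Hole Punching):**
* The client's outgoing UDP packets create a mapping in the user's router that lets the server's replies back in. To stop that mapping from expiring during silence, the client must send a tiny, empty "Keep-Alive" UDP packet to the server every 5 seconds, so the server's incoming voice packets aren't blocked.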
|
||||
* **UDP Payload Structure:** Every single UDP packet must begin with a strict binary header before the Opus payload:
|
||||
* `[Session Token: u32]` (Who is sending this?)
|
||||
* `[Sequence Number: u64]` (What order is this packet in?)
|
||||
* `[Timestamp: u64]` (When was this spoken?)
|
||||
* `[Encrypted Opus Data: Vec<u8>]`
|
||||
|
||||
## 6. Cryptography & Security Standards
|
||||
Audio must be encrypted so Internet service providers (or attackers on public Wi-Fi) cannot listen to private voice channels.
|
||||
|
||||
* **Core Libraries:** `rustls` (for TCP TLS) and `chacha20poly1305` (for UDP payload encryption).
|
||||
* **The Handshake Protocol:**
|
||||
1. The client connects via TCP (which is secured by standard TLS).
|
||||
2. The server generates a unique, temporary symmetric encryption key (for ChaCha20-Poly1305) for that specific user session.
|
||||
3. The server sends this key to the client over the secure TCP lane.
|
||||
4. Both the client and server use this key to encrypt and decrypt the UDP voice packets. Each packet must be encrypted under a unique nonce (the sequence number is a natural source for one).
|
||||
|
||||
## 7. Audio Playout Strategy (The Jitter Buffer)
|
||||
UDP packets do not arrive in the exact order they were sent. Some arrive early, some arrive late, and some arrive out of order. If you play them the moment they arrive, the audio will crackle and pop.
|
||||
|
||||
* **Standard Requirement: The Jitter Buffer.**
|
||||
* **Implementation Rule:** The receiving client must hold incoming UDP packets in a priority queue sorted by `Sequence Number` for a minimum of `40 milliseconds` (holding roughly 2 frames of audio) before sending them to the `cpal` speaker thread.
|
||||
* **Missing Packet Logic:** If a packet sequence number is still missing after the 40ms wait time, the client must trigger the `audiopus` decoder's built-in "Packet Loss Concealment" (PLC) to synthesize a plausible replacement for the missing audio and prevent a hard static pop.
|
||||
|
||||
## 8. Observability and Debugging
|
||||
When the self-hosted server crashes on a Linux VPS, you cannot use print statements to figure out why. You need structured, asynchronous logging.
|
||||
|
||||
* **Core Libraries:** `tracing` and `tracing-subscriber`.
|
||||
* **Implementation Rule:** Do not use `println!()`. All state changes, network drops, and database queries must be logged using `tracing::info!`, `tracing::warn!`, or `tracing::error!`. The server must output these logs to a rolling `.log` file on the host machine (for example via the `tracing-appender` crate).
|
||||
96
Documentation/Low_level_plan/Implementation_Plan.md
Normal file
@@ -0,0 +1,96 @@
|
||||
# Low-Level Implementation Plan
|
||||
|
||||
## 1. Network Packet Anatomy (The Data Plane)
|
||||
To minimize latency, we use a custom binary format for UDP voice data instead of JSON or Protobuf[cite: 1].
|
||||
|
||||
* **UDP Voice Header (Fixed 20 Bytes):**
* `u32` (4 bytes): **Session Token.** Generated during TCP handshake. The server drops any packet whose source IP/Port does not match the address registered for this token[cite: 1].
* `u64` (8 bytes): **Sequence Number.** Monotonically increasing per user. Essential for the Jitter Buffer to reorder packets[cite: 1].
* `u64` (8 bytes): **Timestamp.** Measured in audio samples (increments by 960 per 20ms frame) to handle playback timing. The width matches the `u64` timestamp used in the Technical Specs and in `VoicePacketHeader`[cite: 1].
|
||||
* **Payload:** Opus-encoded bytes (variable length; a 20ms frame averages about 40 bytes at 16kbps, 120 bytes at 48kbps, and 240 bytes at 96kbps). The bitrate is not hardcoded; it is dictated dynamically by the server's `ChannelConfig` (e.g., 16kbps for voice, 96kbps for music bots) when the user joins a room.
|
||||
|
||||
---
|
||||
|
||||
## 2. Real-Time Audio Pipeline (`client_node/audio`)
|
||||
Audio threads must be "lock-free" to prevent stuttering. We use a Single-Producer Single-Consumer (SPSC) ring buffer[cite: 1].
|
||||
|
||||
* **Global Hotkeys / Push-to-Talk:**
|
||||
* Use `global-hotkey` (or `rdev`) to hook OS-level key presses, allowing PTT even when minimized[cite: 1].
|
||||
* **Microphone Thread (The Producer):**
|
||||
* Initialize `cpal` with a 48kHz input stream[cite: 1].
|
||||
* **Rule:** The hardware callback *must only* push raw `f32` samples into the `ringbuf`. No networking or heavy math allowed here[cite: 1].
|
||||
* **DSP/Encoder Thread (The Consumer):**
|
||||
* Pull samples from `ringbuf`.
|
||||
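* Process via `webrtc_audio_processing` (Echo Cancellation, Noise Suppression, and Voice Activity Detection/VAD). WebRTC's audio processing works on 10ms frames (480 samples at 48kHz), so samples are fed through in 10ms steps. If VAD detects silence, stop transmitting to save bandwidth[cite: 1].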
|
||||
* Accumulate exactly $960$ samples ($20\text{ms}$)[cite: 1].
|
||||
* Pass to `audiopus::Encoder`.
|
||||
* Send resulting bytes to the **Network Task** via an asynchronous MPSC channel[cite: 1].
|
||||
|
||||
---
|
||||
|
||||
## 3. Jitter Buffer & Playback Logic (`client_node/network`)
|
||||
The Jitter Buffer compensates for unstable internet connection by adding a controlled "latency tax"[cite: 1].
|
||||
|
||||
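* **The Sorting Mechanism:** Incoming UDP packets are inserted into a `BinaryHeap` sorted by **Sequence Number**. Rust's `BinaryHeap` is a max-heap, so entries are wrapped in `std::cmp::Reverse` to make it behave as a Min-Heap and pop the lowest sequence number first[cite: 1].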
|
||||
* **The Watermark Strategy:**
|
||||
* Wait until the heap contains at least $40\text{ms}$ (2 frames) of audio before starting playback[cite: 1].
|
||||
* This buffer allows late-arriving packets to be inserted in the correct order[cite: 1].
|
||||
* **Playback Tick:** Every $20\text{ms}$, the playback thread pops the next sequence number.
|
||||
* **Success:** Decode the packet. Before pushing to the master `cpal` speaker buffer, multiply the specific user's `f32` decoded array by their local volume scalar (e.g., 0.5 for 50% volume) to enable **Per-User Volume Control**.
|
||||
* **Missing (Packet Loss):** If the sequence number is missing, call the `audiopus` decoder with `None` as the input packet (via `decode_float`, since playback uses `f32` samples) to trigger **Packet Loss Concealment (PLC)**, which synthesizes a "guess" of the missing sound[cite: 1].
|
||||
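
The sorting, watermark, and playback-tick rules above can be sketched as follows. This is a simplified illustration under stated assumptions: the `JitterBuffer` and `Playout` names are hypothetical, payloads are opaque byte vectors, and decoding and PLC are left to the caller, which would react to `Playout::Lost` by invoking the decoder's concealment.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Outcome of one 20 ms playback tick.
#[derive(Debug, PartialEq)]
pub enum Playout {
    /// Still filling up to the watermark; play silence.
    Buffering,
    /// The expected packet was present; decode and play it.
    Frame(Vec<u8>),
    /// The expected packet never arrived; the caller should trigger PLC.
    Lost,
}

pub struct JitterBuffer {
    // `Reverse` turns Rust's max-heap into a min-heap on sequence number.
    heap: BinaryHeap<Reverse<(u64, Vec<u8>)>>,
    // Sequence number expected at the next tick; `None` until playback starts.
    next_seq: Option<u64>,
    // Number of frames to collect before starting (2 frames = 40 ms).
    watermark_frames: usize,
}

impl JitterBuffer {
    pub fn new(watermark_frames: usize) -> Self {
        Self { heap: BinaryHeap::new(), next_seq: None, watermark_frames }
    }

    /// Called by the network task for every received voice packet.
    pub fn push(&mut self, seq: u64, payload: Vec<u8>) {
        // A packet older than the playback position can no longer be played.
        if matches!(self.next_seq, Some(next) if seq < next) {
            return;
        }
        self.heap.push(Reverse((seq, payload)));
    }

    /// Called by the playback thread every 20 ms.
    pub fn tick(&mut self) -> Playout {
        let expected = match self.next_seq {
            Some(seq) => seq,
            None => {
                if self.heap.len() < self.watermark_frames {
                    return Playout::Buffering;
                }
                // Start playback at the lowest buffered sequence number.
                match self.heap.peek() {
                    Some(Reverse((seq, _))) => *seq,
                    None => return Playout::Buffering,
                }
            }
        };
        self.next_seq = Some(expected + 1);
        // Discard stale duplicates that sit below the playback position.
        while matches!(self.heap.peek(), Some(Reverse((seq, _))) if *seq < expected) {
            self.heap.pop();
        }
        if matches!(self.heap.peek(), Some(Reverse((seq, _))) if *seq == expected) {
            if let Some(Reverse((_, payload))) = self.heap.pop() {
                return Playout::Frame(payload);
            }
        }
        Playout::Lost
    }
}
```

One limitation of this sketch: it keeps advancing the expected sequence number while the heap is empty, so a production version would also need to return to the buffering state after a long run of losses, for example when the sender's VAD pauses transmission.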
|
||||
---
|
||||
|
||||
## 4. Server Relay & Routing (`server_node/udp_relay.rs`)
|
||||
The server acts as a high-speed traffic controller. It must be "Zero-Copy" where possible.
|
||||
|
||||
* **Validation:** Use `tokio::net::UdpSocket`. On receipt, verify the `u32 Session Token` against the `DashMap` state[cite: 1].
|
||||
* **Broadcast Logic:**
|
||||
1. Identify the sender's current `ChannelId`[cite: 1].
|
||||
2. Retrieve the list of `SocketAddr` for every other user in that channel[cite: 1].
|
||||
3. Iterate and send the exact byte buffer to each address. Use the `bytes` crate to share the buffer via reference counting (`Arc`) instead of cloning[cite: 1].
|
||||
* **NAT Keep-Alives:** The server must ignore empty 0-byte UDP packets (used by clients to keep router ports open)[cite: 1].
|
||||
* **TCP Control Lane & Chat Routing:** The TCP router handles synchronized text messages and broadcasts them to users in the same `ChannelId`[cite: 1].
|
||||
* **Stateful Auto-Reconnect:** If the TCP socket drops, the client quietly reconnects and submits its existing `Session Token` to resume its channel presence without forcing a full re-login[cite: 1].
|
||||
* **Whisper Lists (Direct UDP Routing):** The server supports targeted UDP forwarding. If a packet carries a `Target_SessionToken` (an extension to the fixed voice header defined in Section 1), the server routes the audio strictly to that user, bypassing the standard channel broadcast.
|
||||
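
The validation and broadcast steps above might look like the sketch below. It is only an approximation of the routing decision: a plain `HashMap` stands in for the `DashMap` state, the `UserState` fields and the `broadcast_targets` name are assumptions, and the actual `send_to` calls on the `tokio` socket are omitted.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Per-session routing state (hypothetical shape).
pub struct UserState {
    pub channel_id: u32,
    pub udp_addr: SocketAddr,
}

/// Returns the addresses a voice packet should be relayed to.
/// An unknown token, or a packet whose source address does not match the
/// one registered for the token, yields no targets (the packet is dropped).
pub fn broadcast_targets(
    sessions: &HashMap<u32, UserState>,
    sender_token: u32,
    source: SocketAddr,
) -> Vec<SocketAddr> {
    let sender = match sessions.get(&sender_token) {
        Some(user) if user.udp_addr == source => user,
        _ => return Vec::new(),
    };
    sessions
        .iter()
        // Everyone in the sender's channel except the sender.
        .filter(|(token, user)| **token != sender_token && user.channel_id == sender.channel_id)
        .map(|(_, user)| user.udp_addr)
        .collect()
}
```

The server would then send the same received buffer to each returned address without re-encoding it.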
|
||||
---
|
||||
|
||||
## 5. Wasm Plugin ABI (`client_node/plugins`)
|
||||
Since the Wasm sandbox cannot access host memory directly, we use a shared "mailbox" system.
|
||||
|
||||
* **The ABI Pattern:**
|
||||
1. Host (Rust) serializes event data (e.g., `OnMessage`) into JSON[cite: 1].
|
||||
2. Host allocates a block of memory inside the Wasm instance and writes the JSON there[cite: 1].
|
||||
3. Host calls the Wasm function, passing the memory pointer[cite: 1].
|
||||
4. Guest (Wasm) processes and returns a pointer to its response[cite: 1].
|
||||
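* **Audio Intercepts:** For voice changers, the Host copies the raw `f32` samples into the plugin's memory as bytes, and the plugin returns the modified samples, which the Host copies back over the original buffer before it reaches the Opus encoder. (A `&mut [f32]` cannot be shared across the sandbox boundary directly, so the edit is "in-place" only from the Host's point of view.)[cite: 1]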
|
||||
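
Because samples have to cross the sandbox boundary as bytes, the audio intercept reduces to a copy-out, call, copy-back cycle. The sketch below illustrates that cycle under stated assumptions: the plugin is modelled as a plain closure rather than a real Wasm call, the function names are hypothetical, and little-endian encoding is chosen because WebAssembly memory is little-endian.

```rust
/// Serializes samples to little-endian bytes for the guest.
pub fn samples_to_bytes(samples: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(samples.len() * 4);
    for sample in samples {
        out.extend_from_slice(&sample.to_le_bytes());
    }
    out
}

/// Parses the guest's reply; `None` if the length is not a multiple of 4.
pub fn bytes_to_samples(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Runs one frame through a plugin. `plugin` is a stand-in for the real
/// Wasm call. Returns `false` and leaves `frame` untouched if the reply is
/// malformed or has the wrong number of samples.
pub fn run_audio_hook<F>(frame: &mut [f32], plugin: F) -> bool
where
    F: FnOnce(Vec<u8>) -> Vec<u8>,
{
    let reply = plugin(samples_to_bytes(frame));
    match bytes_to_samples(&reply) {
        Some(processed) if processed.len() == frame.len() => {
            frame.copy_from_slice(&processed);
            true
        }
        _ => false,
    }
}
```

Rejecting replies of the wrong length keeps a broken plugin from corrupting the 960-sample frame the encoder expects.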
|
||||
---
|
||||
|
||||
## 6. Persistence & State Management (`server_node/database.rs`)
|
||||
The server uses `sqlx` for compile-time safe database interaction[cite: 1].
|
||||
|
||||
* **Hashing:** Use `Argon2id` with a salt of at least 16 bytes. Passwords should be hashed with a minimum of $3$ passes and $64\text{MB}$ of memory[cite: 1].
|
||||
* **Migrations:** On startup, the server checks the `_sqlx_migrations` table. If the code expects a newer schema than the SQLite file has, it applies the `.sql` scripts in order before opening the network ports[cite: 1].
|
||||
* **Admin API:** The `axum` web server requires a `Bearer` token (JWT) for all sensitive routes (`/api/kick`, `/api/ban`). This token is generated when the Admin logs into the dashboard[cite: 1].
|
||||
* **Permissions & Access Control:** During TCP `ChannelJoin` events, the server checks the database for `Required_Role` and password locks before permitting entry[cite: 1].
|
||||
* **Client-Side Persistence (Bookmarks):** The `client_node` maintains a local SQLite or `.toml` file to persist Server Bookmarks (IP, Port, Password, chosen Nickname) so users don't have to manually type connection details.
|
||||
|
||||
---
|
||||
|
||||
## 7. Zero-Conf Automation Logic (`scripts/install.sh`)
|
||||
* **Environment Check:** Script verifies `systemd` availability[cite: 1].
|
||||
* **Permissioning:** Creates a non-privileged `voiceapp` user to run the binary (security hardening)[cite: 1].
|
||||
* **Auto-Update:** `update.sh` compares the local binary hash against the `latest` release on GitHub via the API. If different, it downloads, replaces, and runs `systemctl restart voice_app`[cite: 1].
|
||||
|
||||
---
|
||||
|
||||
## 8. Testing & Debugging Strategy
|
||||
To ensure the real-time audio pipeline and network remain stable during development, several specific debugging tools are built directly into the workflow, completely avoiding the need for CLI flags or terminal commands.
|
||||
|
||||
* **Developer Control Panel:** A dedicated "Testing & Debugging" tab within the `egui` client settings. This provides a purely graphical interface for all diagnostic tools.
|
||||
* **UI-Driven Audio Dumper:** A toggle in the Developer Panel that instantly records and writes the DSP pipeline streams to `.wav` files (`raw_mic.wav`, `post_dsp.wav`, `post_opus_decode.wav`) to physically inspect audio quality degradation.
|
||||
* **UI-Driven Chaos Simulator:** Sliders in the Developer Panel that dynamically inject artificial packet loss (%), latency (ms), and packet re-ordering into the outgoing UDP transport layer to stress-test the Jitter Buffer locally.
|
||||
* **In-App Debug Overlay:** An `egui` diagnostic HUD toggled via a UI button (or `F3`) that overlays real-time metrics: Network Ping (TCP and UDP), Jitter Buffer depth (ms), packet loss percentage, and active Opus PLC triggers.
|
||||
* **Load Test Dashboard:** The Server's web admin dashboard (`axum`) will feature a "Stress Test" page. Instead of running terminal scripts, the server admin can click "Spawn 100 Bots", which dynamically spins up headless internal clients that broadcast `.wav` audio to verify the server's UDP routing capacity.
|
||||
13
Documentation/Mile_Stones/Master_Milestones.md
Normal file
@@ -0,0 +1,13 @@
|
||||
# Voice App: Project Master Roadmap
|
||||
|
||||
Use this file to track high-level progress. Mark each milestone as [x] once all tasks in its dedicated file are complete.
|
||||
|
||||
- [ ] **Milestone 1: The Foundation** (The Skeleton) -> [Milestone_1.md]
|
||||
- [ ] **Milestone 2: Local Audio & GUI** (The Ears) -> [Milestone_2.md]
|
||||
- [ ] **Milestone 3: The First Voice Call** (The Connection) -> [Milestone_3.md]
|
||||
- [ ] **Milestone 4: Multi-User Routing** (The Switchboard) -> [Milestone_4.md]
|
||||
- [ ] **Milestone 5: Management & Plugins** (The Power) -> [Milestone_5.md]
|
||||
- [ ] **Milestone 6: Deployment & Automation** (The Release) -> [Milestone_6.md]
|
||||
|
||||
---
|
||||
**Current Status:** Planning Complete. Ready to Initialize.
|
||||
31
Documentation/Mile_Stones/Milestone_1.md
Normal file
@@ -0,0 +1,31 @@
|
||||
# Milestone 1: The Foundation (The Skeleton)
|
||||
**Goal:** Initialize the project and establish the shared language between client and server.
|
||||
|
||||
### 1. Workspace Setup
|
||||
- [ ] Initialize the root Cargo workspace: create the project directory with a root `Cargo.toml` that contains only a `[workspace]` table with `members = ["core_protocol", "server_node", "client_node"]` and `resolver = "2"`. No root `src/` directory is needed.
|
||||
- [ ] Create crates: `cargo new --lib core_protocol`, `cargo new --bin server_node`, `cargo new --bin client_node`.
|
||||
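- [ ] Add strict lints: put `#![forbid(unsafe_code)]` (etc.) at the top of each crate's `lib.rs`/`main.rs`, or declare the lints once under `[workspace.lints]` in the root `Cargo.toml` and opt each crate in with `[lints] workspace = true`.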
|
||||
- [ ] **Dependencies (`core_protocol`):** Add `serde`, `bincode`, `uuid`, `chrono`, `thiserror`, `secrecy` (for zeroing sensitive keys).
|
||||
- [ ] **Dependencies (`server_node`):** Add `tokio` (full), `tracing`, `tracing-subscriber`, `anyhow`, `dashmap`.
|
||||
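- [ ] **Dependencies (`client_node`):** Add `tokio` (features `rt-multi-thread`, `net`, `io-util`, `sync`, `time`, `macros`; `rt-multi-thread` alone does not enable `TcpStream` or the channels), `tracing`, `tracing-subscriber`, `anyhow`.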
|
||||
|
||||
### 2. Protocol Definitions (`core_protocol`)
|
||||
- [ ] Create `src/tcp_events.rs`. Define `enum TcpEvent { AuthRequest { username: String, ... }, AuthResponse { session_token: u32, ... }, ChannelJoin { ... }, ChatMessage { ... } }` with `#[derive(Serialize, Deserialize)]`.
|
||||
- [ ] Create `src/udp_packets.rs`. Define `struct VoicePacketHeader { pub session_token: u32, pub sequence_num: u64, pub timestamp: u64 }` with `#[derive(Serialize, Deserialize)]`.
|
||||
- [ ] Create `src/constants.rs`. Define `pub const SAMPLE_RATE: u32 = 48000;`, `pub const FRAME_SIZE: usize = 960;`, `pub const TCP_PORT: u16 = 8080;`.
|
||||
|
||||
### 3. TCP Handshake (`server_node` & `client_node`)
|
||||
- [ ] **Server:** In `server_node/src/main.rs`, initialize `tokio::net::TcpListener::bind("0.0.0.0:8080")`.
|
||||
- [ ] **Server:** Spawn a new `tokio::spawn(async move { ... })` for each incoming `TcpStream`.
|
||||
- [ ] **Client:** In `client_node/src/network/control.rs`, implement `TcpStream::connect("127.0.0.1:8080")`.
|
||||
- [ ] **Shared:** Implement a framing mechanism (e.g., sending a `u32` length prefix before the `bincode` serialized `TcpEvent`). TCP is a byte stream with no message boundaries, so the prefix tells the receiver where each event ends.
|
||||
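
A length-prefixed framing layer could be sketched as below. It uses blocking `std::io` traits so it stays self-contained; the real client and server would use the `tokio` async equivalents. The 64 KiB `MAX_FRAME_LEN` cap and the big-endian prefix are assumptions, not values taken from the plan.

```rust
use std::io::{self, Read, Write};

/// Upper bound on a single serialized event (an assumed value).
pub const MAX_FRAME_LEN: u32 = 64 * 1024;

/// Writes one frame: a 4-byte big-endian length, then the payload.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame. Rejects oversized length prefixes before allocating,
/// so a hostile peer cannot force a huge allocation.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

The payload passed to `write_frame` would be the `bincode` encoding of a `TcpEvent`; an error from `read_frame` is the point at which the server would drop the connection.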
|
||||
### 4. Login Logic & State
|
||||
- [ ] **Server State:** Create `server_node/src/state.rs`. Define a `DashMap<u32, UserState>` to store active session tokens.
|
||||
- [ ] **Authentication Flow:** Client sends `TcpEvent::AuthRequest`. Server generates an unpredictable random `u32` session token (from a cryptographically secure RNG), stores it in `DashMap`, and returns `TcpEvent::AuthResponse`.
|
||||
- [ ] **Validation:** Ensure the server actively drops the connection if the client sends invalid or excessively large payloads.
|
||||
|
||||
### 5. Observability (Logging)
|
||||
- [ ] **Initialization:** In both binaries' `main.rs`, call `tracing_subscriber::fmt::init()`.
|
||||
- [ ] **Implementation:** Replace all `println!` calls with `tracing::info!`, `tracing::warn!`, or `tracing::error!`.
|
||||
- [ ] **Tracing Context:** Use `#[tracing::instrument]` on core TCP handler functions to automatically log client IPs and session IDs.
|
||||
32
Documentation/Mile_Stones/Milestone_2.md
Normal file
@@ -0,0 +1,32 @@
|
||||
# Milestone 2: Local Audio & GUI (The Ears)
|
||||
**Goal:** Enable the client to process high-quality audio locally and display the interface.
|
||||
|
||||
### 1. UI Layout (`client_node/ui`)
|
||||
- [ ] **Dependencies:** Add `egui`, `eframe`.
|
||||
- [ ] **Initialization:** In `main.rs`, launch `eframe::run_native`.
|
||||
- [ ] **Architecture:** Create `struct AppState`. Implement `eframe::App` trait for it.
|
||||
- [ ] **Layout:** Build the basic TeamSpeak-style layout: left panel (tree view of hardcoded channels), right panel (text chat log).
|
||||
|
||||
### 2. Audio Capture (`client_node/audio/capture.rs`)
|
||||
- [ ] **Dependencies:** Add `cpal`, `ringbuf`.
|
||||
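- [ ] **Device Setup:** Use `cpal::default_host().default_input_device()` (with `cpal::traits::HostTrait` in scope). Build a stream config specifically requesting `48,000 Hz` and `1 channel` (Mono), checking the device's supported configs first, since not every device offers this format.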
|
||||
- [ ] **Headless Abstraction:** Ensure the `cpal` instantiation is hidden behind a trait so the CI test suite can inject deterministic "sine wave" `f32` vectors instead of requiring a physical microphone.
|
||||
- [ ] **The Producer:** Create a `ringbuf::HeapRb<f32>` (e.g., 4096 capacity). Split it into `(producer, consumer)`.
|
||||
- [ ] **Hardware Callback:** Inside the `cpal` data callback, write the raw `f32` samples directly into the `producer`. Strictly real-time-safe rules here (no allocations, no locks, no blocking calls).
|
||||
|
||||
### 3. DSP Chain & VAD (`client_node/audio/dsp.rs`)
|
||||
- [ ] **Dependencies:** Add `webrtc-audio-processing`.
|
||||
- [ ] **Thread Spawning:** Spawn a standard `std::thread` (not tokio) to act as the Audio Consumer.
|
||||
- [ ] **Processing Loop:** Pull chunks of exactly `960` samples (20ms) from the `consumer` ringbuffer.
|
||||
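- [ ] **Filters:** Run the 960 samples through the `webrtc` processor with echo cancellation and noise suppression enabled, as two 10ms frames of 480 samples each (the frame size WebRTC's audio processing expects). Echo cancellation also needs the speaker output fed in as its reference signal.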
|
||||
- [ ] **Voice Activity Detection (VAD):** Implement `webrtc` VAD or an amplitude threshold calculator. If the chunk is "silence", drop it to save bandwidth.
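
For the amplitude-threshold option, a minimal energy gate could look like this. It is a sketch of the simplest possible detector, not WebRTC's VAD: the function names are hypothetical, and the threshold is an assumed tuning parameter that would need to be calibrated against real microphone noise.

```rust
/// Root-mean-square level of a chunk; 0.0 for an empty chunk.
pub fn rms(chunk: &[f32]) -> f32 {
    if chunk.is_empty() {
        return 0.0;
    }
    let sum_of_squares: f32 = chunk.iter().map(|s| s * s).sum();
    (sum_of_squares / chunk.len() as f32).sqrt()
}

/// Treats a chunk as speech when its RMS level reaches the threshold.
pub fn is_voice(chunk: &[f32], threshold: f32) -> bool {
    rms(chunk) >= threshold
}
```

A plain threshold tends to clip the start and end of words, so practical gates usually add a short hold time before closing; that refinement is left out here.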
|
||||
|
||||
### 4. Global Hotkeys / Push-To-Talk (PTT)
|
||||
- [ ] **Dependencies:** Add `global-hotkey` (or `rdev`).
|
||||
- [ ] **Event Loop:** Spawn a thread to listen for a specific keycode (e.g., `Mouse4` or `V`).
|
||||
- [ ] **Integration:** Update an `Arc<AtomicBool>` `is_transmitting` flag. The DSP thread reads this flag; if false, it discards the audio chunks.
|
||||
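
The flag hand-off between the hotkey thread and the DSP thread can be sketched as below. The `PttFlag` wrapper and `gate_chunk` function are hypothetical names; `Ordering::Relaxed` is used on the assumption that the flag guards no other shared data.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Shared Push-to-Talk state: written by the hotkey thread, read by the DSP thread.
#[derive(Clone, Default)]
pub struct PttFlag(Arc<AtomicBool>);

impl PttFlag {
    /// Called from the hotkey listener on key press (true) and release (false).
    pub fn set(&self, pressed: bool) {
        self.0.store(pressed, Ordering::Relaxed);
    }

    pub fn is_transmitting(&self) -> bool {
        self.0.load(Ordering::Relaxed)
    }
}

/// Passes the chunk through while the key is held, otherwise discards it.
/// Lock-free, so it is safe to call from the audio consumer loop.
pub fn gate_chunk<'a>(flag: &PttFlag, chunk: &'a [f32]) -> Option<&'a [f32]> {
    if flag.is_transmitting() {
        Some(chunk)
    } else {
        None
    }
}
```

Each thread holds its own clone of `PttFlag`; cloning only copies the `Arc`, so both clones observe the same boolean.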
|
||||
### 5. Local Loopback & UI Bridge
|
||||
- [ ] **Loopback Thread:** For testing, route the post-DSP 960-sample chunks directly into a `cpal` output stream (Speaker) to physically hear the microphone quality and VAD gating.
|
||||
- [ ] **UI State Bridge:** Use `tokio::sync::mpsc` or simple `Arc<AtomicBool>` to signal the UI thread when VAD triggers, so `egui` can draw the green "Active Speaker" dot next to the user's name.
|
||||
- [ ] **Audio Dumper UI:** Add a checkbox in the `egui` settings panel. When checked, write the 960-sample chunks to `raw_mic.wav` and `post_dsp.wav` using the `hound` crate for local inspection.
|
||||
28
Documentation/Mile_Stones/Milestone_3.md
Normal file
@@ -0,0 +1,28 @@
|
||||
# Milestone 3: The First Voice Call (The Connection)
|
||||
**Goal:** Successfully transmit compressed voice data to the server and back.
|
||||
|
||||
### 1. Opus Encoder (`client_node/audio/codec.rs`)
|
||||
- [ ] **Dependencies:** Add `audiopus`.
|
||||
- [ ] **Initialization:** Create an `audiopus::coder::Encoder` with `48,000 Hz`, `Mono`, and `Application::Voip`. Set the bitrate dynamically or hardcode to `48,000 bps` for testing.
|
||||
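- [ ] **Encoding Loop:** Take the 960-sample `f32` chunks from the DSP thread and call `encoder.encode_float()`, passing a preallocated `&mut [u8]` output buffer. The call returns the number of bytes written; that prefix of the buffer is the variable-length Opus payload.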
|
||||
|
||||
### 2. UDP Transport & Formatting (`client_node/network/voice.rs`)
|
||||
- [ ] **Packet Assembly:** Construct the UDP binary packet. Byte 0-3: `SessionToken` (from TCP handshake). Byte 4-11: `SequenceNumber` (incremented every chunk). Byte 12-19: `Timestamp` (`u64`). Byte 20+: The Opus payload.
|
||||
- [ ] **Fuzz Testing:** Use `proptest` to aggressively throw random, garbage byte arrays at the UDP packet parser to guarantee it never panics.
|
||||
- [ ] **Socket Binding:** Use `tokio::net::UdpSocket::bind("0.0.0.0:0")`.
|
||||
- [ ] **Transmission:** Send the assembled binary packet to the server's UDP port at `127.0.0.1:8080`.
|
||||
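The packet layout above (4-byte `SessionToken`, 8-byte `SequenceNumber`, 8-byte `Timestamp`, then the Opus payload) can be sketched as follows. The type and function names are hypothetical, and big-endian byte order is an assumption, since the plan does not specify endianness. The parser returns an error for short input rather than indexing blindly, which is the property the `proptest` fuzzing item is meant to guarantee.

```rust
/// Fixed header size: 4 (token) + 8 (sequence) + 8 (timestamp).
pub const HEADER_LEN: usize = 20;

/// Borrowed view of one parsed voice packet.
#[derive(Debug, PartialEq, Eq)]
pub struct VoicePacket<'a> {
    pub session_token: u32,
    pub sequence_number: u64,
    pub timestamp: u64,
    pub payload: &'a [u8],
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    /// The datagram is shorter than the fixed header.
    TooShort { len: usize },
}

/// Serializes header fields and payload (big-endian is an assumption).
pub fn encode_packet(token: u32, seq: u64, ts: u64, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + payload.len());
    buf.extend_from_slice(&token.to_be_bytes());
    buf.extend_from_slice(&seq.to_be_bytes());
    buf.extend_from_slice(&ts.to_be_bytes());
    buf.extend_from_slice(payload);
    buf
}

/// Parses a datagram without panicking on arbitrary input.
pub fn parse_packet(buf: &[u8]) -> Result<VoicePacket<'_>, ParseError> {
    if buf.len() < HEADER_LEN {
        return Err(ParseError::TooShort { len: buf.len() });
    }
    // The length check above guarantees every slice below is in bounds.
    let (header, payload) = buf.split_at(HEADER_LEN);
    let mut token = [0u8; 4];
    token.copy_from_slice(&header[0..4]);
    let mut seq = [0u8; 8];
    seq.copy_from_slice(&header[4..12]);
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&header[12..20]);
    Ok(VoicePacket {
        session_token: u32::from_be_bytes(token),
        sequence_number: u64::from_be_bytes(seq),
        timestamp: u64::from_be_bytes(ts),
        payload,
    })
}
```

Note that under this sketch a zero-byte keep-alive datagram parses as `TooShort`, so the receive loop would need to treat an empty datagram as a keep-alive before calling the parser.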
|
||||
### 3. Server Echo Relay (`server_node/udp_relay.rs`)
|
||||
- [ ] **Socket Setup:** `tokio::net::UdpSocket::bind("0.0.0.0:8080")`.
|
||||
- [ ] **Validation Loop:** `recv_from` the socket. Parse the first 4 bytes as `u32`. Check the `DashMap` to ensure the `SessionToken` is valid.
|
||||
- [ ] **Echo Mode:** Temporarily, immediately `send_to` the exact same `&[u8]` buffer back to the originating client `SocketAddr` to test the round-trip.
|
||||
- [ ] **Zero-Byte Keep-Alives:** Implement client logic to send an empty 0-byte UDP packet every 5 seconds. The server strictly ignores the contents; the packets exist only to keep the client's NAT mapping open.
|
||||
|
||||
### 4. Opus Decoder & Playback (`client_node/audio/playback.rs`)
|
||||
- [ ] **Receiving:** Client UDP socket receives the echoed packet. Extract the Opus payload bytes.
|
||||
- [ ] **Decoding:** Initialize `audiopus::coder::Decoder` (`48,000 Hz`, `Mono`). Call `decoder.decode_float()` to retrieve the 960 `f32` samples.
|
||||
- [ ] **Playback:** Push the 960 samples into a secondary `cpal` output ringbuffer (The Speaker thread).
|
||||
|
||||
### 5. TCP Auto-Reconnect & UDP Chaos Simulator
|
||||
- [ ] **Auto-Reconnect Logic:** Wrap the TCP connection logic in a `loop`. If the socket drops (`read` returns 0), call `tokio::time::sleep(Duration::from_secs(2)).await` and attempt to reconnect. Send a `ReconnectRequest` with the existing `SessionToken` instead of a full `AuthRequest`.
|
||||
- [ ] **Chaos Simulator UI:** In the `egui` Developer Settings, add a slider for `Packet Loss %`. Intercept outgoing UDP packets in `voice.rs`; use `rand::thread_rng` to randomly drop packets based on the slider value before hitting the socket.
|
||||
25
Documentation/Mile_Stones/Milestone_4.md
Normal file
@@ -0,0 +1,25 @@
|
||||
# Milestone 4: Multi-User Routing (The Switchboard)
|
||||
**Goal:** Support multiple users in rooms with stable, synchronized audio.
|
||||
|
||||
### 1. Server State & Broadcast (`server_node/udp_relay.rs`)
|
||||
- [ ] **Data Structure:** Expand the `DashMap` to track `ChannelId -> Vec<SessionToken>`.
|
||||
- [ ] **Routing Loop:** When a UDP packet arrives, look up the sender's `ChannelId`. Iterate through all other `SessionToken`s in that channel, grab their `SocketAddr`, and use `UdpSocket::send_to` to forward the exact bytes. (Zero-copy payload routing).
|
||||
- [ ] **Whisper Lists (Direct UDP):** Modify the UDP header to include a `Target_SessionToken`. If this value is `!= 0`, bypass the channel iteration and forward the packet strictly to that target's `SocketAddr`.
|
||||
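The routing decision described above (channel broadcast versus whisper) can be sketched as a pure function over lookup tables. Plain `HashMap`s stand in for the `DashMap` here, and the struct layout, type aliases, and method name are hypothetical; the actual `send_to` calls are left to the caller.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

pub type SessionToken = u32;
pub type ChannelId = u32;

/// Hypothetical routing tables; the plan keeps these in a `DashMap`.
#[derive(Default)]
pub struct Routes {
    pub members: HashMap<ChannelId, Vec<SessionToken>>,
    pub channel_of: HashMap<SessionToken, ChannelId>,
    pub addr_of: HashMap<SessionToken, SocketAddr>,
}

impl Routes {
    /// Returns every address the packet should be forwarded to.
    /// A non-zero `whisper_target` bypasses the channel broadcast.
    pub fn targets(&self, sender: SessionToken, whisper_target: SessionToken) -> Vec<SocketAddr> {
        if whisper_target != 0 {
            // Unknown targets yield an empty list rather than an error.
            return self.addr_of.get(&whisper_target).copied().into_iter().collect();
        }
        let Some(channel) = self.channel_of.get(&sender) else {
            return Vec::new();
        };
        let Some(members) = self.members.get(channel) else {
            return Vec::new();
        };
        members
            .iter()
            .filter(|token| **token != sender) // never echo to the sender
            .filter_map(|token| self.addr_of.get(token).copied())
            .collect()
    }
}
```

Keeping the decision separate from the socket makes it straightforward to unit test without binding a port.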
|
||||
### 2. Client Jitter Buffer (`client_node/network/jitter.rs`)
|
||||
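- [ ] **Data Structure:** Create a `std::collections::BinaryHeap` wrapped in a Mutex, ordered by the packet's `SequenceNumber`. `BinaryHeap` is a max-heap, so wrap entries in `std::cmp::Reverse` (or invert `Ord`) to make the lowest `SequenceNumber` pop first.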
|
||||
- [ ] **Buffering Logic:** When packets arrive from the UDP socket, push them into the heap. Do NOT start popping until the heap contains at least 2 packets (40ms "Watermark").
|
||||
- [ ] **Tick Loop:** Every 20ms, the audio playback thread pops the next expected `SequenceNumber`.
|
||||
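The buffering and tick logic above might look roughly like the sketch below. Because `BinaryHeap` is a max-heap, entries are wrapped in `std::cmp::Reverse` so the lowest `SequenceNumber` comes out first. The `Packet`, `Tick`, and `JitterBuffer` names are hypothetical, the Mutex wrapper is omitted, and late-packet rejection is included because it falls out of the same bookkeeping.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Ordered by `seq` first (field order matters for the derived `Ord`).
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Packet {
    pub seq: u64,
    pub payload: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Tick {
    /// Watermark not reached yet; play nothing.
    Buffering,
    /// The expected packet was available.
    Play(Vec<u8>),
    /// The expected packet is absent; the caller should run PLC.
    Missing,
}

pub struct JitterBuffer {
    heap: BinaryHeap<Reverse<Packet>>,
    next_seq: u64,
    watermark: usize,
    primed: bool,
}

impl JitterBuffer {
    pub fn new(first_seq: u64, watermark: usize) -> Self {
        Self { heap: BinaryHeap::new(), next_seq: first_seq, watermark, primed: false }
    }

    /// Returns `false` when the packet is older than what was already played.
    pub fn push(&mut self, packet: Packet) -> bool {
        if packet.seq < self.next_seq {
            return false;
        }
        self.heap.push(Reverse(packet));
        true
    }

    /// Call once per 20 ms playback tick.
    pub fn tick(&mut self) -> Tick {
        if !self.primed {
            if self.heap.len() < self.watermark {
                return Tick::Buffering;
            }
            self.primed = true;
        }
        let expected = self.next_seq;
        // Discard duplicates of sequence numbers that were already played.
        while self.heap.peek().map_or(false, |r| r.0.seq < expected) {
            self.heap.pop();
        }
        self.next_seq += 1;
        if self.heap.peek().map_or(false, |r| r.0.seq == expected) {
            match self.heap.pop() {
                Some(Reverse(packet)) => Tick::Play(packet.payload),
                None => Tick::Missing,
            }
        } else {
            Tick::Missing
        }
    }
}
```

This sketch never re-enters the buffering state after an underrun; a production buffer would probably want to re-prime when the heap runs dry for several ticks.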
|
||||
### 3. Packet Loss Concealment (PLC) & Playback
|
||||
- [ ] **Missed Sequences:** If the next expected `SequenceNumber` is absent from the Jitter Buffer (a number was skipped), call `decoder.decode_float()` with `None` as the input packet instead of encoded bytes. This triggers Opus's internal PLC synthesis for that 20 ms frame.
|
||||
- [ ] **Late Packets:** If a packet arrives with a `SequenceNumber` older than what has already been played, immediately drop it.
|
||||
|
||||
### 4. TCP Chat & Presence Sync
|
||||
- [ ] **Broadcast Events:** When a user joins or leaves a channel, the server broadcasts a `TcpEvent::UserJoined` or `UserLeft` to all users in that channel.
|
||||
- [ ] **Chat Routing:** Client sends `TcpEvent::ChatMessage`. Server broadcasts it to the relevant channel.
|
||||
- [ ] **UI Updates:** The client parses these TCP events to update the `egui` Tree View and append messages to the Chat Log in real-time.
|
||||
|
||||
### 5. Diagnostics Overlay
|
||||
- [ ] **Debug UI:** Implement a developer panel (toggled via `F3` or a button) in `egui`.
|
||||
- [ ] **Metrics:** Hook into the Jitter Buffer length, calculate packet loss % over the last 10 seconds, and ping the server via TCP to display live network health on the UI.
|
||||
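One simple way to derive the packet loss figure for the diagnostics overlay is to compare how many sequence numbers a window should have contained with how many packets actually arrived. The function name and the choice of inputs are assumptions; how the 10-second window is tracked is left open.

```rust
/// Packet loss in percent for one measurement window.
/// `first_seq` and `last_seq` are the lowest and highest `SequenceNumber`
/// observed in the window; `received` is the count of packets that arrived.
pub fn loss_percent(first_seq: u64, last_seq: u64, received: u64) -> f64 {
    if last_seq < first_seq {
        return 0.0;
    }
    // Number of packets the sender must have emitted in this range.
    let expected = (last_seq - first_seq).saturating_add(1);
    // Duplicates can push `received` above `expected`; clamp at zero loss.
    let lost = expected.saturating_sub(received);
    lost as f64 * 100.0 / expected as f64
}
```

For example, a window spanning sequence numbers 0 through 99 in which 95 packets arrived reports 5% loss.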
25
Documentation/Mile_Stones/Milestone_5.md
Normal file
@@ -0,0 +1,25 @@
|
||||
# Milestone 5: Management & Plugins (The Power)
|
||||
**Goal:** Add persistent storage, an admin web dashboard, and the Wasm sandbox.
|
||||
|
||||
### 1. Database Setup (`server_node/database.rs`)
|
||||
- [ ] **Dependencies:** Add `sqlx` with the `sqlite` and `runtime-tokio` features.
|
||||
- [ ] **Schema Migrations:** Create `users` (ID, Name, Hash, Role) and `channels` (ID, Name, ParentID, RequiredRole, Bitrate). Run migrations on startup via `sqlx::migrate!()`.
|
||||
- [ ] **Permissions Check:** During the TCP `ChannelJoin` event, query the DB to ensure the user's Role $\ge$ the `RequiredRole` of the channel.
|
||||
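The role comparison in the permissions check can lean on Rust's derived ordering, where enum variants compare in declaration order. The variant names below are hypothetical, since the plan does not list the actual roles, and the database lookup that supplies both values is not shown.

```rust
/// Hypothetical role ladder; later variants outrank earlier ones
/// because derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Guest,
    Member,
    Moderator,
    Admin,
}

/// True when the user's role is at least the channel's `RequiredRole`.
pub fn can_join(user_role: Role, required_role: Role) -> bool {
    user_role >= required_role
}
```

If roles are stored as integers in SQLite, the same ordering must be preserved when mapping them back to the enum.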
|
||||
### 2. Web Admin Dashboard (`server_node/admin_api`)
|
||||
- [ ] **Dependencies:** Add `axum`, `rust-embed`, `jsonwebtoken`, `prometheus`.
|
||||
- [ ] **Static Assets:** Build the HTML/CSS for the classic control panel UI. Use `rust-embed` to compile these assets directly into the server binary.
|
||||
- [ ] **API Endpoints:** Build POST routes for `/api/kick`, `/api/ban`, `/api/channel/create`, and `/api/channel/bitrate`.
|
||||
- [ ] **Telemetry:** Expose a `/metrics` Prometheus endpoint to track high-level health (concurrent users, UDP packet drops, CPU usage).
|
||||
- [ ] **Security:** Implement an Axum middleware that verifies a Bearer JWT before allowing access to the API routes.
|
||||
|
||||
### 3. Server Bookmarks (Client-Side Persistence)
|
||||
- [ ] **Local Storage:** Use `directories` crate to find the OS config path (`~/.config/voiceapp` or `%APPDATA%`).
|
||||
- [ ] **Serialization:** Serialize a `Vec<ServerBookmark>` to a `bookmarks.toml` or `.json` file using `serde`.
|
||||
- [ ] **UI Integration:** Render the saved bookmarks in the `egui` login screen so users can 1-click connect.
|
||||
|
||||
### 4. Wasm Sandbox (`client_node/plugins.rs`)
|
||||
- [ ] **Dependencies:** Add `extism` to the client.
|
||||
- [ ] **Initialization:** Load external `.wasm` files from a local `/plugins` folder.
|
||||
- [ ] **Plugin Hooks (Chat):** Before rendering a chat message, serialize it to JSON, allocate memory in the Wasm instance, call the Wasm function, and read the modified JSON back.
|
||||
- [ ] **Plugin Hooks (Audio):** Pass the `&mut [f32]` array to the Wasm module *before* the Opus encoder, allowing plugins to mutate the raw audio (Voice Changers, Soundboards).
|
||||
20
Documentation/Mile_Stones/Milestone_6.md
Normal file
@@ -0,0 +1,20 @@
|
||||
# Milestone 6: Deployment & Automation (The Release)
|
||||
**Goal:** Finalize security and automate the installation for self-hosters.
|
||||
|
||||
### 1. Network Encryption
|
||||
- [ ] **TCP TLS:** Wrap the server's `TcpListener` and client's `TcpStream` using `rustls`. Generate or require self-signed certificates for the server.
|
||||
- [ ] **UDP Encryption:** Add `chacha20poly1305`. After Opus encoding, encrypt the payload byte array before sending it over UDP, using a symmetric key exchanged over the TLS-protected TCP connection. Each packet must use a unique nonce under that key.
|
||||
|
||||
### 2. Dockerization
|
||||
- [ ] **Dockerfile:** Write a multi-stage `Dockerfile`. Stage 1: `cargo build --release` using a minimal rust alpine image. Stage 2: Copy the binary to a scratch/debian container.
|
||||
- [ ] **Docker Compose:** Write `docker-compose.yml` mapping ports `8080/tcp` (Control), `8080/udp` (Voice), and `3000/tcp` (Admin Dashboard), and volume-mapping the SQLite database file.
|
||||
|
||||
### 3. CI/CD & Auto-Installer
|
||||
- [ ] **GitHub Actions:** Create `.github/workflows/release.yml`. Trigger on tags. Cross-compile binaries for `x86_64-linux`, `x86_64-windows`, and `aarch64-macos`.
|
||||
- [ ] **Security Auditing:** Add `cargo audit` to the pipeline to automatically fail the build if a known CVE is discovered.
|
||||
- [ ] **Install Script:** Write `scripts/install.sh`. The script downloads the correct binary via GitHub API, creates a non-root `voiceapp` user, and writes a `/etc/systemd/system/voiceapp.service` file.
|
||||
|
||||
### 4. The Final Stress Test
|
||||
- [ ] **Load Tester Bot:** Build a standalone Rust binary (`tests/load_tester.rs`).
|
||||
- [ ] **Concurrency:** Use Tokio to spawn 100+ async tasks. Each task connects via TCP, gets a SessionToken, and then blasts pre-recorded `.wav` data over UDP to the server at exactly 20ms intervals.
|
||||
- [ ] **Verification:** Use the Admin Dashboard to verify the server handles the packet throughput without CPU spiking or crashing.
|
||||
103
Documentation/Standards/Coding_and_Docs_Standards.md
Normal file
@@ -0,0 +1,103 @@
|
||||
# Project Standards: Code & Documentation
|
||||
|
||||
This document defines the absolute highest standards for the development of the Voice App project. All contributors must adhere to these rules strictly to ensure maximum performance, security, and maintainability.
|
||||
|
||||
## 1. Rust Coding Standards
|
||||
|
||||
### 1.1 Strict Linting & Safety
|
||||
The project enforces strict, zero-tolerance compiler checks. Every crate must begin with (or include in `Cargo.toml` as workspace lints):
|
||||
|
||||
```rust
#![forbid(unsafe_code)]
#![deny(clippy::all, clippy::pedantic)]
#![deny(clippy::unwrap_used, clippy::expect_used)]
```
|
||||
* **Unsafe Code:** Completely banned. If a dependency requires `unsafe`, it must be highly audited and isolated.
|
||||
* **No Unwraps:** `unwrap()` and `expect()` are forbidden in production code. You must handle the `Result` or `Option` gracefully and bubble it up using the `?` operator.
|
||||
|
||||
### 1.2 Error Handling
|
||||
* **Libraries (`core_protocol`):** Must define their own custom error enumerations using `thiserror`. Do not panic.
|
||||
* **Binaries (`client_node`, `server_node`):** Use `anyhow::Result` for application-level error bubbling. All bubbled errors must be logged using `tracing::error!` before the process safely terminates or restarts the specific actor.
|
||||
|
||||
### 1.3 Concurrency & The Actor Pattern
|
||||
* **No Shared Mutexes for Data Flow:** Passing an `Arc<Mutex<State>>` directly between the Tokio network background thread and the `egui` UI thread is prohibited, as it causes stuttering.
|
||||
* **Enforce the Actor Pattern:** Use `tokio::sync::mpsc` (UI -> Network) and `tokio::sync::watch` (Network -> UI) for all cross-thread communication.
|
||||
* **Blocking Operations:** Heavy cryptography (e.g., Argon2 hashing) and synchronous file I/O must *never* run on the Tokio executor's worker threads. Offload them using `tokio::task::spawn_blocking`. Database queries through `sqlx` are already async and can be awaited directly.
|
||||
|
||||
### 1.4 High-Performance Audio Rules
|
||||
* **Lock-Free Audio Threads:** The `cpal` audio input and output callbacks operate in real-time. You are strictly forbidden from placing `Mutex::lock()`, network socket calls, or heap memory allocations (`Vec::new()`, `String::from()`) inside the audio callback.
|
||||
* **Zero-Copy Routing:** The Server's UDP relay must use the `Bytes` type from the `bytes` crate to share network packet payloads via reference counting. It must never clone raw arrays when broadcasting to multiple users in a channel.
|
||||
|
||||
---
|
||||
|
||||
## 2. Documentation Standards
|
||||
|
||||
### 2.1 The "Why", Not the "What"
|
||||
Code should be readable enough to explain *what* it is doing. Inline comments must explain *why* a specific decision was made (e.g., "We pad this buffer to 960 samples because the encoder is configured for 20 ms frames at 48 kHz; Opus only accepts a fixed set of frame durations and rejects any other length.").
|
||||
|
||||
### 2.2 Docstrings
|
||||
Every public `struct`, `enum`, `fn`, and `mod` must have a standard `///` docstring.
|
||||
Functions returning a `Result` must include an `# Errors` section.
|
||||
|
||||
```rust
/// Encodes raw f32 audio samples into a compressed Opus frame.
///
/// # Arguments
/// * `pcm_data` - A slice of exactly 960 audio samples representing 20ms of audio.
///
/// # Errors
/// Returns `AudioError::InvalidFrameSize` if the length of `pcm_data` is not exactly 960.
pub fn encode_frame(pcm_data: &[f32]) -> Result<Vec<u8>, AudioError> { ... }
```
|
||||
|
||||
### 2.3 Module Level Documentation
|
||||
Every `lib.rs` or `mod.rs` file must begin with a `//!` documentation block explaining the architectural purpose of that module. If a developer opens a file, the first paragraph should tell them exactly what the module's responsibility is.
|
||||
|
||||
### 2.4 Architecture Decision Records (ADRs)
|
||||
If you propose a fundamental change to the architecture (e.g., changing the database from SQLite to PostgreSQL, or swapping Extism for Wasmtime), it must first be documented in a new file under `Documentation/ADR/` detailing the Context, Decision, and Consequences.
|
||||
|
||||
---
|
||||
|
||||
## 3. Security Standards
|
||||
|
||||
* **Zero-Knowledge Logging:** Never log passwords, Argon2 hashes, raw JWT tokens, or UDP Session Tokens via the `tracing` framework. Use `[REDACTED]` if you must log an event involving secure data.
|
||||
* **Memory Sanitization:** Highly sensitive cryptographic material in memory should utilize the `secrecy` crate to ensure it is Zeroized (wiped from RAM) the moment it falls out of scope.
|
||||
* **Strict API Validation:** Never trust the client. The server must validate the lengths, bounds, and permissions of every single TCP event and UDP packet before processing it.
|
||||
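One way to make the zero-knowledge logging rule hard to violate by accident is a wrapper type whose `Debug` and `Display` output is always `[REDACTED]`. The `Redacted` type below is a hypothetical, standard-library-only sketch; it does not zeroize memory, so it complements rather than replaces the `secrecy` crate mentioned above.

```rust
use std::fmt;

/// Hypothetical wrapper: the inner value can be used by code,
/// but any attempt to format it prints a fixed placeholder.
pub struct Redacted<T>(T);

impl<T> Redacted<T> {
    pub fn new(value: T) -> Self {
        Self(value)
    }

    /// Explicit, greppable access to the secret value.
    pub fn expose(&self) -> &T {
        &self.0
    }
}

impl<T> fmt::Debug for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

impl<T> fmt::Display for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

/// Example of a struct that stays safe to log with `{:?}`.
#[derive(Debug)]
pub struct Session {
    pub user_name: String,
    pub token: Redacted<u32>,
}
```

With this in place, logging a whole `Session` via `{:?}` shows the user name but never the token.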
|
||||
---
|
||||
|
||||
## 4. Git & Workflow Standards
|
||||
|
||||
* **Conventional Commits:** Commit messages must follow the standard:
|
||||
    * `feat: added global hotkeys for PTT`
    * `fix: resolved Opus frame alignment crash`
    * `refactor: migrated UDP payload to bytes crate`
    * `docs: updated API documentation for Wasm plugin`
|
||||
* **CI/CD Constraints:** No code is merged into `main` unless it successfully passes:
|
||||
    1. `cargo fmt --all -- --check`
    2. `cargo clippy --all-targets --all-features -- -D warnings`
    3. `cargo test --all`
|
||||
|
||||
---
|
||||
|
||||
## 5. Testing & Quality Assurance
|
||||
|
||||
* **Property-Based Testing:** Network boundaries (especially the UDP parser) must use property-based testing (e.g., the `proptest` crate) to throw random, garbage bytes at the decoders. A malformed network packet must never cause the server or client to panic.
|
||||
* **Mocking Hardware Interfaces:** To ensure the CI pipeline can run headless without physical microphones, the audio capture interface (`cpal`) must be abstracted. The test suite will feed deterministic "sine wave" `f32` vectors into the DSP pipeline to mathematically verify encoding/decoding.
|
||||
* **Unit Testing Core Logic:** All state transitions, RBAC permissions, and packet serializers must have 100% test coverage.
|
||||
|
||||
---
|
||||
|
||||
## 6. Cross-Platform Constraints
|
||||
|
||||
* **Strict OS Agnosticism:** Core business logic must remain OS-agnostic. Any OS-specific system calls (like Windows Registry Hooks or Linux `alsa` tweaks) must be isolated behind `#[cfg(target_os = "...")]` compilation flags.
|
||||
* **Fallback Mandate:** If an OS-specific feature is added (e.g., a Windows global hotkey), a functional equivalent or safe fallback must be provided for Linux and macOS in the exact same Pull Request.
|
||||
* **Pathing:** Hardcoding `/` or `\` in strings is forbidden. Developers must strictly use `std::path::PathBuf` for all file system interactions to ensure Windows/POSIX compatibility.
|
||||
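The pathing and `cfg` rules above can be illustrated together. In the sketch below, paths are built with `Path::join` so no separator is ever hardcoded, and the OS-specific piece is isolated behind `#[cfg(...)]` with a fallback for every other platform. The function names and the `voiceapp`/`bookmarks.toml` layout are illustrative assumptions.

```rust
use std::path::{Path, PathBuf};

/// Builds the bookmarks file path under a given config directory.
/// `join` inserts the platform's separator, so no `/` or `\` appears here.
pub fn bookmarks_path(config_dir: &Path) -> PathBuf {
    config_dir.join("voiceapp").join("bookmarks.toml")
}

/// OS-specific branch, isolated behind a compilation flag.
#[cfg(target_os = "windows")]
pub fn hotkey_backend_name() -> &'static str {
    "windows-hook"
}

/// Fallback for every other platform, shipped alongside the
/// Windows branch as the Fallback Mandate requires.
#[cfg(not(target_os = "windows"))]
pub fn hotkey_backend_name() -> &'static str {
    "portable"
}
```

The backend names returned here are placeholders; the point is only that callers see one function regardless of target OS.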
|
||||
---
|
||||
|
||||
## 7. Dependency Management & Telemetry
|
||||
|
||||
* **Dependency Minimalism:** Before adding a new crate to `Cargo.toml`, evaluate if the feature can be implemented natively in under 100 lines of Rust. Heavy, sprawling dependencies will be rejected.
|
||||
* **Security Auditing:** `cargo audit` must be integrated into the CI pipeline to automatically fail builds if a known CVE is discovered in the dependency tree.
|
||||
* **Server Telemetry:** Beyond text logs, the server must expose a `/metrics` Prometheus endpoint to track high-level health metrics: concurrent active users, UDP packet drop rate, Jitter buffer latency spikes, and CPU/Memory usage.
|
||||