From 63e7700340c8880aef59c299e54d158d5fba33b6 Mon Sep 17 00:00:00 2001
From: Matiss Jurevics
Date: Sat, 7 Feb 2026 14:50:00 +0000
Subject: [PATCH] docs(streaming): add single-server implementation plan

---
 Backend/README.md                            |  1 +
 Backend/docs/streaming-on-web-server-plan.md | 94 ++++++++++++++++++++
 2 files changed, 95 insertions(+)
 create mode 100644 Backend/docs/streaming-on-web-server-plan.md

diff --git a/Backend/README.md b/Backend/README.md
index 42d8a34..5a9300e 100644
--- a/Backend/README.md
+++ b/Backend/README.md
@@ -147,6 +147,7 @@ Stream realtime events:
 - This backend currently acts as a control plane (commands, session state, credentials, events), not a full media plane/SFU.
 - Running live transport + fan-out + recording on the same web server is possible for small loads but introduces significant CPU, RAM, and network egress pressure under concurrency.
 - For larger deployments, use a dedicated media plane (managed or self-hosted SFU + recorder) and keep this service focused on auth/session/control APIs.
+- For a pragmatic prototype path that keeps media on the current server, see `docs/streaming-on-web-server-plan.md`.
 
 ### API Docs
 OpenAPI docs are generated from Zod/OpenAPI definitions:

diff --git a/Backend/docs/streaming-on-web-server-plan.md b/Backend/docs/streaming-on-web-server-plan.md
new file mode 100644
index 0000000..6750293
--- /dev/null
+++ b/Backend/docs/streaming-on-web-server-plan.md
@@ -0,0 +1,94 @@
+# Single-Server Streaming Implementation Plan (Prototype)
+
+## Scope
+Build live camera streaming and simultaneous recording on the current web server for low-to-moderate load testing, with explicit non-scale assumptions.
+
+## Constraints
+- Keep the existing backend as the control API (`/streams/*`, device auth, command lifecycle).
+- Add media transport and recording in the same deployment for now.
+- Prefer solutions that can later be split into a dedicated media service.
+
+## Recommended Stack (Current Server)
+1. SFU: `mediasoup` (Node.js SFU library).
+2. TURN/STUN: `coturn` (external process/service, mandatory for reliable NAT traversal).
+3. Recording worker: `ffmpeg` process consuming RTP from SFU plain transports.
+4. Signaling: keep the existing Socket.IO channel (`webrtc:signal`) or migrate to REST+WS messages while preserving auth.
+5. Storage: keep the MinIO upload path and reuse the current recordings finalize flow.
+
+## Why This Stack
+- `mediasoup` gives server-side fan-out (the camera publishes once, multiple subscribers consume).
+- `ffmpeg` can write MP4/HLS outputs from server-side RTP.
+- `coturn` is required for real-world networks where direct ICE paths fail.
+- This minimizes changes to the existing route structure and DB entities.
+
+## Candidate Library Check
+- `mediasoup`: mature SFU for Node, suitable for self-hosted media routing.
+- `@roamhq/wrtc` / `node-webrtc` style bindings: useful for peer/bot use-cases, but not a full SFU architecture by themselves.
+- `werift`: pure TypeScript WebRTC stack; possible for custom flows, but higher implementation risk than mediasoup for production-like behavior.
+- Managed alternatives (LiveKit/Twilio/Agora/100ms/Mux/Cloudflare): faster and more reliable, but outside the strict single-server-in-process scope.
+
+## Implementation Phases
+
+### Phase 0: Environment + Guardrails
+1. Add env vars:
+   - `TURN_URLS`, `TURN_USERNAME`, `TURN_CREDENTIAL`
+   - `MEDIA_RECORDINGS_DIR`
+   - `MEDIA_MAX_PUBLISHERS`, `MEDIA_MAX_SUBSCRIBERS_PER_ROOM`
+2. Add an explicit README warning that this mode is prototype-only.
+3. Add a metrics baseline (CPU, RAM, event loop lag, outbound bitrate, active sessions).
+
+### Phase 1: Media Plane Skeleton
+1. Add a `media/sfu/` module:
+   - worker bootstrap
+   - router lifecycle per stream session
+   - transport creation helpers
+2. Extend `media/types.ts` provider contracts:
+   - publish transport params
+   - subscribe transport params
+   - producer/consumer lifecycle ops
+3. Add a stream session registry: in-memory state + DB mapping (`streamSessionId -> router/producer state`).
+
+### Phase 2: Publish/Subscribe Handshake
+1. Camera flow:
+   - request publish transport params
+   - connect DTLS
+   - produce video track
+2. Client flow:
+   - request subscribe transport params
+   - connect DTLS
+   - consume producer track
+3. Use the existing device auth checks and stream ownership checks.
+4. Keep `stream:started`/`stream:ended` events for UI state updates.
+
+### Phase 3: Recording on Server
+1. On the first producer for a stream, start the `ffmpeg` recording worker.
+2. Recording strategy:
+   - start with single-track MP4 for simplicity
+   - optionally add HLS segment output later
+3. On `/streams/:id/end`:
+   - stop the recorder
+   - upload the result to MinIO
+   - call the existing recording finalize path
+4. Add a retry and orphan-cleanup worker for interrupted recordings.
+
+### Phase 4: Reliability + Backpressure
+1. Remove the JPEG `stream:frame` fallback from the simulator once the SFU path is stable.
+2. Add connection timeouts, ICE restart, and stream health checks.
+3. Add admission limits per account and global concurrent stream caps.
+4. Add stale session cleanup and worker crash recovery.
+
+### Phase 5: Load Test + Exit Criteria
+1. Target load test:
+   - 1 publisher + N viewers per stream
+   - multiple concurrent streams
+2. Capture:
+   - startup latency (request -> first frame)
+   - packet loss behavior
+   - server CPU/RAM/network saturation points
+3. Define the threshold at which to migrate to a dedicated media service.
+
+## Immediate Code Changes (Low-Risk First)
+1. Add `docs` and env scaffolding for TURN and the recording worker.
+2. Add `media/sfu` interfaces with no routing behavior yet (feature-flagged).
+3. Implement one end-to-end stream path behind a flag (`MEDIA_MODE=single_server_sfu`).
+4. Deprecate the frame relay fallback after validation.
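The Phase 0 env vars can be validated once at startup so the media plane fails fast on bad configuration. Below is a minimal sketch of a hypothetical `loadMediaEnv` helper: the variable names come from the plan, but the comma-separated `TURN_URLS` format and the default values are assumptions, not existing backend behavior.

```typescript
// Sketch of a Phase 0 env guardrails loader. Variable names are from the plan;
// defaults and the comma-separated TURN_URLS format are assumptions.
interface MediaEnvConfig {
  turnUrls: string[];
  turnUsername?: string;
  turnCredential?: string;
  recordingsDir: string;
  maxPublishers: number;
  maxSubscribersPerRoom: number;
}

// Parse a positive integer, falling back when unset or invalid.
function intOr(value: string | undefined, fallback: number): number {
  const n = Number.parseInt(value ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

function loadMediaEnv(env: Record<string, string | undefined>): MediaEnvConfig {
  return {
    // e.g. TURN_URLS="turn:turn.example.com:3478,turns:turn.example.com:5349"
    turnUrls: (env.TURN_URLS ?? "").split(",").map((s) => s.trim()).filter(Boolean),
    turnUsername: env.TURN_USERNAME,
    turnCredential: env.TURN_CREDENTIAL,
    recordingsDir: env.MEDIA_RECORDINGS_DIR ?? "/tmp/recordings",
    maxPublishers: intOr(env.MEDIA_MAX_PUBLISHERS, 10),
    maxSubscribersPerRoom: intOr(env.MEDIA_MAX_SUBSCRIBERS_PER_ROOM, 20),
  };
}
```

In practice this would be wired into the backend's existing env validation (e.g. a Zod schema) rather than hand-rolled, but the shape of the config object is the same.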
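The stream session registry from Phase 1, combined with the Phase 4 admission limits, can be sketched as a small in-memory class. Everything here is illustrative: the type and method names are hypothetical, and the DB mapping side of the registry is deliberately omitted.

```typescript
// Hypothetical in-memory half of the Phase 1 stream session registry,
// enforcing the Phase 4 admission caps. Names are assumptions; the
// `streamSessionId -> router/producer state` DB mapping is not shown.
type StreamSessionState = {
  streamSessionId: string;
  accountId: string;
  producerIds: string[]; // mediasoup producer ids would be tracked here
  subscriberCount: number;
};

class StreamSessionRegistry {
  private sessions = new Map<string, StreamSessionState>();

  constructor(
    private maxPublishers: number,
    private maxSubscribersPerRoom: number,
  ) {}

  // Admit a new publisher unless the global concurrent-stream cap is reached.
  addPublisher(streamSessionId: string, accountId: string): boolean {
    if (this.sessions.size >= this.maxPublishers) return false;
    if (this.sessions.has(streamSessionId)) return false;
    this.sessions.set(streamSessionId, {
      streamSessionId,
      accountId,
      producerIds: [],
      subscriberCount: 0,
    });
    return true;
  }

  // Admit a subscriber unless the per-room cap is reached.
  addSubscriber(streamSessionId: string): boolean {
    const s = this.sessions.get(streamSessionId);
    if (!s || s.subscriberCount >= this.maxSubscribersPerRoom) return false;
    s.subscriberCount += 1;
    return true;
  }

  // Called from the `/streams/:id/end` flow to release in-memory state.
  end(streamSessionId: string): void {
    this.sessions.delete(streamSessionId);
  }

  get activeSessions(): number {
    return this.sessions.size;
  }
}
```

The stale-session cleanup from Phase 4 would iterate this map on a timer and call `end` for sessions whose router or producer has died, keeping the in-memory state and the DB mapping consistent.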
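For the Phase 3 recorder, a common pattern with mediasoup plain transports is to describe the server-side RTP stream in an SDP file and have `ffmpeg` read it. A hedged sketch of a hypothetical argument builder, assuming the camera publishes H.264 so the single video track can be copied into MP4 without transcoding (a VP8 publisher would need different handling):

```typescript
// Hypothetical builder for the Phase 3 ffmpeg recording worker's arguments.
// Assumes an SDP file describing the plain-transport RTP stream and an
// H.264 track that can be stream-copied into MP4.
function buildRecorderArgs(sdpPath: string, outputPath: string): string[] {
  return [
    // let ffmpeg read the SDP file and receive RTP over UDP
    "-protocol_whitelist", "file,udp,rtp",
    "-i", sdpPath,
    // single-track MP4 without re-encoding, per the "start simple" strategy
    "-map", "0:v:0",
    "-c:v", "copy",
    outputPath,
  ];
}
```

The worker would spawn this via `child_process.spawn("ffmpeg", buildRecorderArgs(...))` and, on `/streams/:id/end`, send SIGINT so ffmpeg finalizes the MP4 before the upload-to-MinIO step runs.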