Single-Server Streaming Implementation Plan (Prototype)

Scope

Build live camera streaming and simultaneous recording on the current web server for low-to-moderate load testing. This is a prototype that explicitly assumes a single server and is not designed to scale.

Constraints

  • Keep existing backend as the control API (/streams/*, device auth, command lifecycle).
  • Add media transport and recording in the same deployment for now.
  • Prefer solutions that can later be split into a dedicated media service.

Proposed Stack

  1. SFU: mediasoup (Node.js SFU library).
  2. TURN/STUN: coturn (external process/service, mandatory for NAT traversal reliability).
  3. Recording worker: ffmpeg process consuming RTP from SFU plain transports.
  4. Signaling: keep existing Socket.IO channel (webrtc:signal) or migrate to REST+WS messages while preserving auth.
  5. Storage: keep MinIO upload path and reuse current recordings finalize flow.

Why this stack

  • mediasoup gives server-side fan-out (camera publishes once, multiple subscribers).
  • ffmpeg can write MP4/HLS outputs from server-side RTP.
  • coturn is required for real-world networks where direct ICE paths fail.
  • This stack keeps changes to the existing route structure and DB entities to a minimum.

Candidate Library Check

  • mediasoup: mature SFU for Node, suitable for self-hosted media routing.
  • @roamhq/wrtc / node-webrtc style bindings: useful for peer/bot use cases, but not a full SFU architecture by themselves.
  • werift: pure TypeScript WebRTC stack; possible for custom flows, but higher implementation risk than mediasoup for production-like behavior.
  • Managed alternatives (LiveKit/Twilio/Agora/100ms/Mux/Cloudflare): faster to integrate and more reliable to operate, but outside the strict single-server, in-process scope.

Implementation Phases

Phase 0: Environment + Guardrails

  1. Add env vars:
    • TURN_URLS, TURN_USERNAME, TURN_CREDENTIAL
    • MEDIA_RECORDINGS_DIR
    • MEDIA_MAX_PUBLISHERS, MEDIA_MAX_SUBSCRIBERS_PER_ROOM
  2. Add explicit README warning that this mode is prototype-only.
  3. Add metrics baseline (CPU, RAM, event loop lag, outbound bitrate, active sessions).
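
A minimal sketch of how the Phase 0 env vars could be validated at startup. Only the variable names come from this plan; the `MediaConfig` shape, the comma-separated format for `TURN_URLS`, and the default limits (4 and 10) are assumptions for illustration.

```typescript
// Hypothetical config shape; only the env var names are taken from the plan.
interface MediaConfig {
  iceServers: { urls: string[]; username: string; credential: string }[];
  recordingsDir: string;
  maxPublishers: number;
  maxSubscribersPerRoom: number;
}

type Env = Record<string, string | undefined>;

// Returns the trimmed value or throws if the variable is unset or blank.
function requireVar(env: Env, key: string): string {
  const value = env[key]?.trim();
  if (!value) throw new Error(`Missing required env var: ${key}`);
  return value;
}

// Parses an optional positive integer, falling back to a default when unset.
function parsePositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

function loadMediaConfig(env: Env): MediaConfig {
  // Assumption: TURN_URLS holds a comma-separated list of turn:/turns: URLs.
  const urls = requireVar(env, "TURN_URLS")
    .split(",")
    .map((u) => u.trim())
    .filter((u) => u.length > 0);
  if (urls.length === 0) throw new Error("TURN_URLS contains no URLs");

  return {
    iceServers: [
      {
        urls,
        username: requireVar(env, "TURN_USERNAME"),
        credential: requireVar(env, "TURN_CREDENTIAL"),
      },
    ],
    recordingsDir: requireVar(env, "MEDIA_RECORDINGS_DIR"),
    // Assumed defaults; tune after the Phase 5 load test.
    maxPublishers: parsePositiveInt(env, "MEDIA_MAX_PUBLISHERS", 4),
    maxSubscribersPerRoom: parsePositiveInt(env, "MEDIA_MAX_SUBSCRIBERS_PER_ROOM", 10),
  };
}
```

In the backend this would likely be called once as `loadMediaConfig(process.env)`, so that a missing TURN credential fails at boot rather than during a call.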

Phase 1: Media Plane Skeleton

  1. Add media/sfu/ module:
    • worker bootstrap
    • router lifecycle per stream session
    • transport creation helpers
  2. Extend media/types.ts provider contracts:
    • publish transport params
    • subscribe transport params
    • producer/consumer lifecycle ops
  3. Add stream session registry in memory + DB mapping (streamSessionId -> router/producer state).
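
The in-memory half of the session registry could look roughly like the sketch below. The field names and the `StreamSessionRegistry` class are hypothetical; router, producer, and consumer ids are treated as opaque strings so the sketch does not depend on any SFU API, and the DB mapping is left out.

```typescript
// Hypothetical per-session state; ids are opaque strings in this sketch.
interface StreamSessionState {
  streamSessionId: string;
  routerId: string;
  producerId: string | null; // null until the camera starts producing
  consumerIds: Set<string>;
  createdAt: number; // epoch milliseconds
}

class StreamSessionRegistry {
  private readonly sessions = new Map<string, StreamSessionState>();

  // Registers a new session; rejects duplicates so stale state is never overwritten silently.
  create(streamSessionId: string, routerId: string, now: number = Date.now()): StreamSessionState {
    if (this.sessions.has(streamSessionId)) {
      throw new Error(`Session already registered: ${streamSessionId}`);
    }
    const state: StreamSessionState = {
      streamSessionId,
      routerId,
      producerId: null,
      consumerIds: new Set<string>(),
      createdAt: now,
    };
    this.sessions.set(streamSessionId, state);
    return state;
  }

  get(streamSessionId: string): StreamSessionState | undefined {
    return this.sessions.get(streamSessionId);
  }

  setProducer(streamSessionId: string, producerId: string): void {
    this.mustGet(streamSessionId).producerId = producerId;
  }

  addConsumer(streamSessionId: string, consumerId: string): void {
    this.mustGet(streamSessionId).consumerIds.add(consumerId);
  }

  // Returns true if a session was removed, false if it was unknown.
  remove(streamSessionId: string): boolean {
    return this.sessions.delete(streamSessionId);
  }

  get size(): number {
    return this.sessions.size;
  }

  private mustGet(streamSessionId: string): StreamSessionState {
    const state = this.sessions.get(streamSessionId);
    if (!state) throw new Error(`Unknown session: ${streamSessionId}`);
    return state;
  }
}
```

Because this state lives only in process memory, it is lost on restart; the DB mapping is what would let the stale session cleanup in Phase 4 find sessions that no longer have a live router.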

Phase 2: Publish/Subscribe Handshake

  1. Camera flow:
    • request publish transport params
    • connect DTLS
    • produce video track
  2. Client flow:
    • request subscribe transport params
    • connect DTLS
    • consume producer track
  3. Use existing device auth checks and stream ownership checks.
  4. Keep stream:started/stream:ended events for UI state updates.
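
Both flows above follow the same three-step order, which can be enforced server-side with a small state machine. This is a sketch only: the state and step names are hypothetical, and `startMedia` stands for "produce" on the camera side and "consume" on the client side.

```typescript
// Hypothetical step and state names for one transport's handshake.
type HandshakeStep = "requestParams" | "connectDtls" | "startMedia";
type HandshakeState = "idle" | "paramsIssued" | "connected" | "active";

// Allowed transitions: each state accepts exactly one next step.
const NEXT_STATE: Record<HandshakeState, Partial<Record<HandshakeStep, HandshakeState>>> = {
  idle: { requestParams: "paramsIssued" },
  paramsIssued: { connectDtls: "connected" },
  connected: { startMedia: "active" },
  active: {},
};

// Returns the next state, or throws if the step arrives out of order.
function advanceHandshake(state: HandshakeState, step: HandshakeStep): HandshakeState {
  const next = NEXT_STATE[state][step];
  if (next === undefined) {
    throw new Error(`Step "${step}" is not allowed in state "${state}"`);
  }
  return next;
}
```

Rejecting out-of-order steps before they reach the SFU keeps malformed or replayed signaling messages from creating half-initialized transports. The device auth and stream ownership checks would run before `advanceHandshake` is called.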

Phase 3: Recording on Server

  1. On first producer for a stream, start ffmpeg recording worker.
  2. Record strategy:
    • start with single-track MP4 for simplicity
    • optionally add HLS segment output later
  3. On /streams/:id/end:
    • stop recorder
    • upload result to MinIO
    • call existing recording finalize path
  4. Add retry and orphan cleanup worker for interrupted recordings.
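
One way the recording worker could be wired up is to hand ffmpeg an SDP file describing the RTP stream and let it copy the video into MP4 without re-encoding. The sketch below only builds the SDP text and the argument list; it assumes an H264 video track, and the helper names, port, and payload type are hypothetical. The actual values would have to match whatever the SFU's plain transport negotiates.

```typescript
// Hypothetical description of the RTP stream that ffmpeg should receive.
interface RtpVideoInput {
  listenIp: string; // address ffmpeg listens on, e.g. 127.0.0.1
  port: number; // UDP port the SFU sends RTP to
  payloadType: number; // RTP payload type negotiated for the video codec
  clockRate: number; // 90000 for video
}

// Builds a minimal SDP for a single H264 video stream (assumed codec).
function buildRecordingSdp(input: RtpVideoInput): string {
  const lines = [
    "v=0",
    `o=- 0 0 IN IP4 ${input.listenIp}`,
    "s=recording",
    `c=IN IP4 ${input.listenIp}`,
    "t=0 0",
    `m=video ${input.port} RTP/AVP ${input.payloadType}`,
    `a=rtpmap:${input.payloadType} H264/${input.clockRate}`,
    "a=recvonly",
  ];
  return lines.join("\n") + "\n";
}

// Builds ffmpeg arguments that read the SDP and stream-copy video into one MP4 file.
function buildFfmpegArgs(sdpPath: string, outputPath: string): string[] {
  return [
    "-nostdin", // the worker is non-interactive
    "-protocol_whitelist", "file,udp,rtp", // needed to open an SDP file that points at RTP/UDP
    "-f", "sdp",
    "-i", sdpPath,
    "-map", "0:v:0", // single video track, per the record strategy above
    "-c:v", "copy", // no re-encode
    "-y", outputPath,
  ];
}
```

The resulting array can be passed to `child_process.spawn("ffmpeg", args)`. When stopping the recorder, sending SIGINT rather than SIGKILL gives ffmpeg the chance to finalize the MP4; a hard kill typically leaves a file without its index, which is the kind of case the orphan cleanup worker would need to handle.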

Phase 4: Reliability + Backpressure

  1. Remove JPEG stream:frame fallback from simulator once SFU path is stable.
  2. Add connection timeout, ICE restart, and stream health checks.
  3. Add admission limits per account and global concurrent stream caps.
  4. Add stale session cleanup and worker crash recovery.
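
The admission limits in step 3 can be sketched as a small counter that checks the global cap first and the per-account cap second. The class, its method names, and the reason strings are hypothetical; the limits themselves would come from configuration.

```typescript
interface AdmissionLimits {
  maxGlobalStreams: number;
  maxStreamsPerAccount: number;
}

type AdmissionResult =
  | { admitted: true }
  | { admitted: false; reason: "global_cap" | "account_cap" };

class AdmissionController {
  private readonly limits: AdmissionLimits;
  private readonly activeByAccount = new Map<string, number>();
  private activeTotal = 0;

  constructor(limits: AdmissionLimits) {
    this.limits = limits;
  }

  // Reserves a slot for the account if both caps allow it.
  tryAdmit(accountId: string): AdmissionResult {
    if (this.activeTotal >= this.limits.maxGlobalStreams) {
      return { admitted: false, reason: "global_cap" };
    }
    const current = this.activeByAccount.get(accountId) ?? 0;
    if (current >= this.limits.maxStreamsPerAccount) {
      return { admitted: false, reason: "account_cap" };
    }
    this.activeByAccount.set(accountId, current + 1);
    this.activeTotal += 1;
    return { admitted: true };
  }

  // Frees a slot; a release without a matching admit is ignored.
  release(accountId: string): void {
    const current = this.activeByAccount.get(accountId) ?? 0;
    if (current === 0) return;
    if (current === 1) {
      this.activeByAccount.delete(accountId);
    } else {
      this.activeByAccount.set(accountId, current - 1);
    }
    this.activeTotal -= 1;
  }
}
```

`release` should be called from every path that ends a stream, including the stale session cleanup, otherwise crashed sessions would keep consuming slots until the process restarts.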

Phase 5: Load Test + Exit Criteria

  1. Target load test:
    • 1 publisher + N viewers per stream
    • multiple concurrent streams
  2. Capture:
    • startup latency (request -> first frame)
    • packet loss behavior
    • server CPU/RAM/network saturation points
  3. Define the thresholds at which to migrate to a dedicated media service.
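
Exit criteria are easier to apply if they are written down as a check over the captured metrics. The sketch below uses a nearest-rank p95 for startup latency and returns which limits a run breached. The metric fields, threshold names, and reason strings are hypothetical, and no threshold values are implied by this plan.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile requires at least one sample");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Hypothetical summary of one load test run.
interface LoadTestRun {
  startupLatenciesMs: number[]; // request -> first frame, one entry per viewer
  peakCpuPercent: number;
  packetLossPercent: number;
}

// Hypothetical limits; real values should come out of the load test itself.
interface ExitThresholds {
  maxP95StartupMs: number;
  maxCpuPercent: number;
  maxPacketLossPercent: number;
}

// Returns the list of breached limits; an empty list means the run stayed within bounds.
function migrationReasons(run: LoadTestRun, limits: ExitThresholds): string[] {
  const reasons: string[] = [];
  if (percentile(run.startupLatenciesMs, 95) > limits.maxP95StartupMs) {
    reasons.push("startup_latency");
  }
  if (run.peakCpuPercent > limits.maxCpuPercent) reasons.push("cpu");
  if (run.packetLossPercent > limits.maxPacketLossPercent) reasons.push("packet_loss");
  return reasons;
}
```

Running this against each load step (for example, increasing N viewers per stream) would show the first step at which any reason appears, which is a natural candidate for the migration threshold.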

Immediate Code Changes (Low-Risk First)

  1. Add docs and env scaffolding for TURN and recording worker.
  2. Add media/sfu interfaces with no routing behavior yet (feature-flagged).
  3. Implement one end-to-end stream path behind a flag (MEDIA_MODE=single_server_sfu).
  4. Deprecate frame relay fallback after validation.
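
The `MEDIA_MODE` flag from step 3 could be resolved as sketched below. Only the value `single_server_sfu` comes from this plan; the name `frame_relay` for the existing fallback path, and defaulting to it when the variable is unset, are assumptions.

```typescript
// "single_server_sfu" is from the plan; "frame_relay" is a hypothetical name for the current path.
type MediaMode = "frame_relay" | "single_server_sfu";

function resolveMediaMode(env: Record<string, string | undefined>): MediaMode {
  const raw = env["MEDIA_MODE"]?.trim();
  // Assumption: an unset flag keeps the existing behavior.
  if (!raw) return "frame_relay";
  if (raw === "single_server_sfu" || raw === "frame_relay") return raw;
  // Fail loudly on typos instead of silently falling back.
  throw new Error(`Unsupported MEDIA_MODE: "${raw}"`);
}
```

Throwing on unknown values means a misspelled flag is caught at startup rather than showing up as the SFU path never being exercised.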