StegoTorus ships a fixed set of packet traces and HTTP covertext databases with the software but lets users record their own, so a classifier trained on the distributed covertext will not necessarily generalize to user-generated databases. The paper further notes that replaying a small number of traces creates a statistical fingerprint, because a censor can learn conversation patterns from packet sizes and timings alone. Trace diversity must therefore be maintained over time.
From 2012-weinberg-stegotorus — StegoTorus: A Camouflage Proxy for the Tor Anonymity System
· §4.1.1, §5.2
· 2012
· Computer and Communications Security
Implications
Ship a minimal seed covertext database but mandate per-install or per-session covertext generation so that a censor who intercepts the software distribution cannot pre-train an effective classifier.
Rotate covertext traces frequently (at least once per session) so that the censor cannot fingerprint users who appear to 'have the same conversation' repeatedly.
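The rotation rule above can be sketched in a few lines of Python. This is a minimal illustration, not StegoTorus code: the `TraceRotator` class, its `cooldown` parameter, and the trace identifiers are all hypothetical names introduced here, and the sketch assumes each trace has a unique identifier. It picks one trace per session and refuses to reuse any trace from the last `cooldown` sessions.

```python
import secrets
from collections import deque


class TraceRotator:
    """Hypothetical helper: hand out one covertext trace per session,
    never repeating a trace used in the last `cooldown` sessions."""

    def __init__(self, traces, cooldown):
        self._traces = list(traces)
        if cooldown >= len(self._traces):
            raise ValueError("cooldown must be smaller than the number of traces")
        # Bounded history: the oldest entry is dropped automatically.
        self._recent = deque(maxlen=cooldown)

    def next_trace(self):
        # Only traces outside the recent-history window are eligible.
        candidates = [t for t in self._traces if t not in self._recent]
        # secrets.choice draws from the OS CSPRNG, so an observer cannot
        # predict the next pick from earlier ones.
        chosen = secrets.choice(candidates)
        self._recent.append(chosen)
        return chosen


# Usage: one call per session (trace names are placeholders).
rotator = TraceRotator(["trace-a", "trace-b", "trace-c", "trace-d"], cooldown=2)
session_traces = [rotator.next_trace() for _ in range(6)]
```

A cooldown window only limits back-to-back repetition within the local pool. It does not address the first implication, since a pool that consists solely of the shipped traces is still known to a censor who has a copy of the software.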