FINDING · EVALUATION
State-of-the-art ML classifiers (Deep Fingerprinting, Decision Tree, Random Forest, nPrintML) trained on known UPGen protocols and benign traffic incur high out-of-distribution (OOD) false-positive rates (FPR) when attempting to block unknown UPGen protocols: in the vast majority of experiments the OOD FPR is 100%. In the one exception (SSH OOD, Deep Fingerprinting), the classifier achieved a UPGen true-positive rate (TPR) of only 20%. By contrast, the same classifiers generalize to block unknown Obfs4 flows with near-zero collateral damage in 3 of 4 cases.
From 2025-wails-censorship — Censorship Evasion with Unidentified Protocol Generation · §4.3.3, Table 3, Table 4 · 2025 · USENIX Security Symposium
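The TPR and FPR figures quoted in the finding can be read with the standard definitions: TPR is the fraction of circumvention flows the censor blocks, and FPR is the fraction of benign flows it blocks as collateral damage. The sketch below is a minimal illustration under that assumption. The function name `blocking_rates` and the toy labels are hypothetical, and nothing here reproduces the paper's classifiers, datasets, or evaluation pipeline.

```python
def blocking_rates(is_circumvention, is_blocked):
    """Return (TPR, FPR) for a censor's block/allow decisions.

    is_circumvention: ground-truth flags, True for a circumvention flow.
    is_blocked: classifier decisions, True if the flow would be blocked.
    """
    if len(is_circumvention) != len(is_blocked):
        raise ValueError("label and decision lists must have equal length")

    positives = sum(1 for c in is_circumvention if c)
    negatives = len(is_circumvention) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one circumvention and one benign flow")

    # Circumvention flows that were blocked (true positives).
    true_pos = sum(1 for c, b in zip(is_circumvention, is_blocked) if c and b)
    # Benign flows that were blocked (false positives, i.e. collateral damage).
    false_pos = sum(1 for c, b in zip(is_circumvention, is_blocked) if b and not c)
    return true_pos / positives, false_pos / negatives


# Hypothetical toy test set: 5 held-out circumvention flows, 5 benign flows.
labels = [True] * 5 + [False] * 5

# A censor that blocks everything catches every circumvention flow,
# but also blocks every benign flow.
block_all = [True] * 10
print(blocking_rates(labels, block_all))     # (1.0, 1.0)

# A conservative censor that blocks only one circumvention flow.
conservative = [True] + [False] * 9
print(blocking_rates(labels, conservative))  # (0.2, 0.0)
```

The two toy censors mark the extremes of the trade-off the finding describes: blocking the unknown protocol reliably at the price of 100% FPR, or keeping collateral damage low while letting most circumvention flows through.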
Implications
- Fully encrypted protocols like Obfs4 are structurally distinguishable from benign traffic; UPGen-style structured-but-variable protocols are not. Prefer designs that resemble real encrypted protocols over purely random streams.
- A censor facing UPGen-style protocol diversity cannot train a generalizable classifier without blocking enormous amounts of benign encrypted traffic, raising the cost of censorship to unacceptable collateral-damage levels.
Extracted by claude-sonnet-4-6 — review before relying.