Combining Bayesian methods, mixup data augmentation, and none-of-the-above (NOTA) defensive padding cuts the open-world false positive rate by up to 92% at 0.5 recall on HTTPS-only traffic and 75% on Tor traffic, relative to the deterministic maximum softmax probability (MSP) baseline. Even with these improvements, sustaining a world size in the hundreds of millions (approaching YouTube-scale) requires accepting recall of 0.5–0.6 and precision of only 0.1–0.2. At precision 0.5 and recall 0.5, the maximum workable world size is only 37.5M for HTTPS-only traffic (Table 3), far below YouTube's ~10 billion video catalog.
From 2025-walsh-improved-open-world-fingerprinting — Improved Open-World Fingerprinting Increases Threat to Streaming Video Privacy but Realistic Scenarios Remain Difficult
· §4.5, Tables 1–3
· 2025
· PoPETs 2025
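The link between false positive rate, precision, and workable world size in the finding above is base-rate arithmetic, and a small sketch may make it concrete. The model here is an assumption, not necessarily the one behind Table 3: every video in the world is taken to be equally likely to be watched, and the false positive rate is taken to be constant as the world grows. The function names and all numbers in the example (1,000 monitored videos, a false positive rate of 1e-5) are hypothetical and are not values from the paper.

```python
def precision(recall, fpr, monitored, world):
    """Precision under a uniform viewing prior (an assumption of this sketch)."""
    unmonitored = world - monitored
    true_pos = recall * monitored      # monitored videos correctly flagged
    false_pos = fpr * unmonitored      # unmonitored videos wrongly flagged
    return true_pos / (true_pos + false_pos)


def max_world_size(recall, fpr, monitored, target_precision):
    """Largest world size at which precision is still at least target_precision."""
    true_pos = recall * monitored
    # Rearranging precision = TP / (TP + FP) gives the tolerable number of FPs.
    allowed_false_pos = true_pos * (1 - target_precision) / target_precision
    return monitored + allowed_false_pos / fpr


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the shape of the trade-off.
    monitored, recall, target = 1_000, 0.5, 0.5
    baseline_fpr = 1e-5
    improved_fpr = baseline_fpr * (1 - 0.92)   # a 92% reduction in FPR

    for label, fpr in (("baseline", baseline_fpr), ("improved", improved_fpr)):
        world = max_world_size(recall, fpr, monitored, target)
        print(f"{label}: max world size is about {world:,.0f} videos")
```

Under these assumptions the tolerable number of unmonitored videos scales as 1/FPR, so a 92% cut in false positive rate multiplies it by 1/0.08 = 12.5. That is a large gain, yet a catalog several orders of magnitude larger than the monitored set still pushes precision down.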
Implications
Circumvention tools carrying video traffic gain inherent fingerprinting resistance simply because large-catalog platforms (YouTube, Vimeo) create a very large open world: adversary precision collapses once the effective world size exceeds ~40–350M videos, depending on transport.
NOTA defensive padding (seeding decision boundaries with adversarially crafted near-boundary samples) and mixup augmentation are, together with Bayesian methods, the classifier-side techniques behind the improved open-world results; protocol-level padding strategies should therefore be evaluated against adversaries trained with them.
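For readers unfamiliar with mixup, the following is a minimal sketch of the standard formulation: a convex combination of two training samples and their labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution. The paper's exact variant and hyperparameters are not given in this note, so the function name, the `alpha` value, and the toy burst-size feature vectors are all assumptions made for illustration.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two (features, one-hot label) pairs with a Beta-distributed weight.

    Standard mixup; whether the paper uses exactly this form is an assumption.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)          # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


if __name__ == "__main__":
    # Hypothetical burst-size sequences (in KB) for two different videos.
    video_a = [512.0, 498.0, 730.0, 120.0]
    video_b = [300.0, 310.0, 295.0, 305.0]
    label_a = [1.0, 0.0]                         # one-hot: class 0
    label_b = [0.0, 1.0]                         # one-hot: class 1

    x, y, lam = mixup(video_a, label_a, video_b, label_b,
                      rng=random.Random(0))
    print("lambda:", lam)
    print("mixed features:", x)
    print("soft label:", y)
```

The mixed sample lies on the line segment between the two originals and carries a soft label. Such samples populate the regions between classes during training, which is one plausible reason mixup helps a classifier reject inputs that belong to no monitored class.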