FINDING · EVALUATION

Term frequency clustering of block pages achieves an F-1 measure of 0.98, closely recovering the manually identified block-page templates; page-length clustering performs far worse, with an F-1 measure of 0.64. Across the full OpenNet Initiative (ONI) dataset, which spans five years of measurements, only 37 distinct term frequency vectors were found, indicating that filtering vendors rarely change block-page HTML structure.
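The clustering and scoring described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the "terms" are HTML tag names, pages are grouped only when their tag-count vectors are identical, and F-1 is computed over pairs of pages against manual labels. The names `tag_vector`, `cluster_by_vector`, and `pairwise_f1` are hypothetical, and the paper's actual clustering procedure may differ.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
from itertools import combinations


class TagCounter(HTMLParser):
    """Counts opening HTML tags; tag names are the assumed 'terms'."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def tag_vector(html):
    """Return a hashable term frequency vector (tag, count) for one page."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return frozenset(parser.counts.items())


def cluster_by_vector(pages):
    """Group page indices whose term frequency vectors are identical.

    Exact matching is a simplification chosen for this sketch.
    """
    clusters = defaultdict(list)
    for index, html in enumerate(pages):
        clusters[tag_vector(html)].append(index)
    return list(clusters.values())


def pairwise_f1(clusters, labels):
    """Pairwise F-1 of a clustering against manual template labels."""
    cluster_of = {i: c for c, members in enumerate(clusters) for i in members}
    tp = fp = fn = 0
    for a, b in combinations(range(len(labels)), 2):
        same_cluster = cluster_of[a] == cluster_of[b]
        same_label = labels[a] == labels[b]
        if same_cluster and same_label:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_label:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, the number of distinct vectors in a corpus is simply `len(cluster_by_vector(pages))`, which is the kind of count the finding reports for the ONI data. Page-length clustering could be compared by swapping `tag_vector` for a function that returns `len(html)`.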

From 2014-jones-automated · Automated Detection and Fingerprinting of Censorship Block Pages · §5.1, §5.2 · 2014 · Internet Measurement Conference

Implications

Tags

censors
generic
techniques
measurement-platform
ml-classifier

Extracted by claude-sonnet-4-6 — review before relying.