Only 51,313 of the 139,957 source-list URLs (36.66%) survived sanitization as live, meaningful pages; 18,911 were removed for lack of content and many more for dead links, which underscores how quickly manually curated probe lists decay. In Beijing and Shanghai, over 20% of known domains were consistently inaccessible, versus less than 4.5% at every other vantage point, and over 68% of known domains remained blocked, suggesting that censored topics stay sensitive even as individual URLs go stale.
From 2024-tang-automatic — Automatic Generation of Web Censorship Probe Lists
· §3.1, §5.2
· 2024
· Privacy Enhancing Technologies
Implications
Circumvention tools that rely on static block lists for routing or domain-fronting decisions should treat those lists as perishable; automated freshness checks and continuous regeneration are necessary to maintain coverage.
Domains blocked in China tend to remain blocked long-term even as the underlying content changes; this persistence means historical censorship signals remain predictive and useful for training classifiers or seeding new probe lists.
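The freshness check recommended in the first implication could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: `ListEntry`, `refresh`, `survival_rate`, the `is_live` callback, and the 30-day threshold are all hypothetical names and values chosen for the example. The liveness probe is passed in as a function so that a tool can plug in whatever check it already uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List, Optional


@dataclass
class ListEntry:
    """One URL on a block list (hypothetical structure for this sketch)."""
    url: str
    last_verified: Optional[datetime] = None  # None means never checked


def refresh(
    entries: List[ListEntry],
    is_live: Callable[[str], bool],
    max_age: timedelta,
    now: datetime,
) -> List[ListEntry]:
    """Keep recently verified entries; re-probe stale ones and drop dead URLs."""
    kept = []
    for entry in entries:
        fresh = (
            entry.last_verified is not None
            and now - entry.last_verified <= max_age
        )
        if fresh:
            kept.append(entry)  # verified recently enough, no probe needed
        elif is_live(entry.url):
            kept.append(ListEntry(entry.url, now))  # re-verified just now
        # otherwise the entry is stale and dead, so it is dropped
    return kept


def survival_rate(kept: int, total: int) -> float:
    """Percentage of list entries that survived a cleaning pass."""
    return 100.0 * kept / total if total else 0.0


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    demo = [
        ListEntry("https://a.example/", now - timedelta(days=1)),
        ListEntry("https://b.example/", now - timedelta(days=90)),
        ListEntry("https://c.example/"),
    ]
    # Stub probe for the demo: pretend only c.example still resolves.
    result = refresh(demo, lambda u: u.startswith("https://c"),
                     timedelta(days=30), now)
    print([e.url for e in result])
    # Figures quoted in the finding above: 51,313 of 139,957 URLs survived.
    print(round(survival_rate(51313, 139957), 2))  # 36.66
```

The sketch only tracks whether a URL is still live. Given the second implication, a tool would probably want to keep a dropped URL's domain as a historical signal rather than discard it entirely, since the domain may remain blocked after the page itself disappears.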