FINDING · EVALUATION

Using NLP phrase extraction on Chinese-language censored pages, the system discovered 1,125 new censored domains not present on any publicly available blocklist, producing a list 12.5× larger than the standard Citizen Lab list (220 web pages, 85 domains). Across three evaluations (unigrams, bigrams, trigrams, each capped at 1,000,000 URLs), only 3 of the top 50 discovered domains overlapped with FilteredWeb's top 50.
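The two measurable parts of this finding, phrase extraction at three n-gram sizes and the top-50 overlap between two ranked domain lists, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes the page text has already been segmented into words, since Chinese has no whitespace word boundaries. The helper names `ngrams` and `top_k_overlap`, the tokens, and the `.example` domains are hypothetical placeholders, not data from the study.

```python
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous n-token phrase from an already-segmented page."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_k_overlap(ranked_a: Sequence[str], ranked_b: Sequence[str], k: int = 50) -> int:
    """Count the domains that appear in the top k of both ranked lists."""
    return len(set(ranked_a[:k]) & set(ranked_b[:k]))


# Hypothetical segmented tokens standing in for one censored page.
tokens = ["w1", "w2", "w3", "w4"]
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# Hypothetical ranked domain lists (most frequently discovered first).
ours = ["a.example", "b.example", "c.example"]
theirs = ["c.example", "d.example", "a.example"]
overlap_top3 = top_k_overlap(ours, theirs, k=3)
```

With real data, `k=50` would correspond to the top-50 comparison reported above, where a small overlap indicates that the two approaches surface largely different domains.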

From 2018-hounsel-automatically · Automatically Generating a Large, Culture-Specific Blocklist for China · §5.1, Table 1 · 2018 · Free and Open Communications on the Internet

Implications

Tags

censors
cn
techniques
dns-poisoning
measurement-platform

Extracted by claude-sonnet-4-6 — review before relying.