FINDING · DETECTION
Culturally specific Chinese phrases are strong predictors of censorship. Unigrams naming controversial figures returned the highest block rates: Wang Qishan (74%), Li Hongzhi (64%), Guo Boxiong (62%), and Hu Jintao (56%). Trigrams such as 'Beidaihe meeting' (54%), 'CCP's religious policy' (42%), and 'Tiananmen Square demonstrations' (32%) showed a similar pattern, confirming King et al.'s finding that references to collective political dissent are disproportionately targeted.
From 2018-hounsel-automatically — Automatically Generating a Large, Culture-Specific Blocklist for China · §5.3, Tables 2–4 · 2018 · Free and Open Communications on the Internet
Implications
- Circumvention service operators should keep high-sensitivity phrase categories out of their public-facing content, DNS names, and TLS certificates, since any such association accelerates content-based detection and probing.
- Blocklist maintenance pipelines can use high-block-rate Chinese phrases as iterative seeds to discover newly blocked domains before they appear on curated lists.
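
The seed-based discovery idea in the last implication can be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline: the block rates are the ones quoted in the finding above, while `select_seeds`, `discover_new_blocked`, the 0.5 threshold, and the injected `search` and `is_blocked` callables are hypothetical names and parameters introduced only for this example.

```python
from typing import Callable, Iterable

# Block rates quoted in the finding above, expressed as fractions.
REPORTED_BLOCK_RATES: dict[str, float] = {
    "Wang Qishan": 0.74,
    "Li Hongzhi": 0.64,
    "Guo Boxiong": 0.62,
    "Hu Jintao": 0.56,
    "Beidaihe meeting": 0.54,
    "CCP's religious policy": 0.42,
    "Tiananmen Square demonstrations": 0.32,
}


def select_seeds(rates: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return phrases whose block rate meets the threshold, highest first.

    The threshold is an assumption for illustration, not a value from the paper.
    """
    ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
    return [phrase for phrase, rate in ranked if rate >= threshold]


def discover_new_blocked(
    seeds: Iterable[str],
    search: Callable[[str], Iterable[str]],  # hypothetical: phrase -> candidate domains
    is_blocked: Callable[[str], bool],       # hypothetical: domain -> measured as blocked?
    known: set[str],                         # domains already on a curated list
) -> set[str]:
    """Collect blocked domains surfaced by the seeds that are not yet known."""
    found: set[str] = set()
    for phrase in seeds:
        for domain in search(phrase):
            if domain in known or domain in found:
                continue  # skip domains already listed or already discovered
            if is_blocked(domain):
                found.add(domain)
    return found
```

In practice `search` would wrap a search-engine query and `is_blocked` a censorship measurement. Both are left as injected callables here so the sketch makes no claim about any specific API. With the default threshold, `select_seeds(REPORTED_BLOCK_RATES)` keeps the four names and 'Beidaihe meeting' and drops the two lower-rate trigrams.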
Extracted by claude-sonnet-4-6 — review before relying.