FINDING · EVALUATION
Blocking all homophones of 422 censored keywords would generate approximately 47,000 false-positive weibos per day per keyword, or roughly 20 million false positives daily in total. That is about 20% of Sina Weibo's daily message volume, which makes blanket homophone blocklisting operationally infeasible without massive collateral censorship of innocent traffic.
From 2015-hiruncharoenvate-algorithmically — Algorithmically Bypassing Censorship on Sina Weibo with Nondeterministic Homophone Substitutions · Analysis: Cost to adversaries (RQ3) · 2015 · International Conference on Web and Social Media
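The arithmetic behind the headline figure can be checked with a short sketch. It uses only the numbers quoted in the finding; the variable names are illustrative, and the daily message volume it prints is merely what those rounded figures imply, not a number stated in this note.

```python
# Figures quoted in the finding above.
KEYWORDS = 422
FALSE_POSITIVES_PER_KEYWORD_PER_DAY = 47_000
SHARE_OF_DAILY_VOLUME = 0.20  # "approximately 20%"

# Total collateral blocks per day if every homophone of every keyword is blocked.
total_false_positives = KEYWORDS * FALSE_POSITIVES_PER_KEYWORD_PER_DAY

# Daily message volume implied by the rounded 20% share (a derived estimate only).
implied_daily_volume = total_false_positives / SHARE_OF_DAILY_VOLUME

print(f"{total_false_positives:,}")    # 19,834,000, i.e. roughly 20 million
print(f"{implied_daily_volume:,.0f}")  # 99,170,000, i.e. roughly 100 million
```

Because both inputs are approximations, the outputs should be read as orders of magnitude rather than exact counts.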
Implications
- Design evasion substitutions to maximize overlap with common innocuous usage: the higher the false-positive rate of a censor's countermeasure, the more politically costly that countermeasure becomes to deploy.
- Nondeterministic generation (sampling from the top-20 candidates rather than always taking the single best homophone) is essential; deterministic substitution collapses to a small fixed set that censors can add to a blocklist cheaply.
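The contrast between deterministic and nondeterministic substitution can be sketched as follows. This is a minimal illustration, not the paper's implementation: the candidate strings and scores are placeholders, the function names are hypothetical, and uniform sampling among the top 20 is an assumption, since this note does not say how candidates are weighted.

```python
import random
from typing import List, Sequence, Tuple

Candidate = Tuple[str, float]  # (substitution text, score); higher score ranks first


def top_k(candidates: Sequence[Candidate], k: int) -> List[str]:
    """Return the k highest-scoring substitution texts."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [text for text, _ in ranked[:k]]


def pick_deterministic(candidates: Sequence[Candidate]) -> str:
    """Always return the single best-scoring substitution."""
    return top_k(candidates, 1)[0]


def pick_nondeterministic(
    candidates: Sequence[Candidate], rng: random.Random, k: int = 20
) -> str:
    """Return one substitution drawn uniformly from the top k (assumed weighting)."""
    return rng.choice(top_k(candidates, k))


# Placeholder candidates with made-up scores; not real homophones or real data.
candidates = [(f"homophone_{i:02d}", 1.0 / (i + 1)) for i in range(50)]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

deterministic_outputs = {pick_deterministic(candidates) for _ in range(1000)}
nondeterministic_outputs = {pick_nondeterministic(candidates, rng) for _ in range(1000)}

print(len(deterministic_outputs))     # one string for the censor to blocklist
print(len(nondeterministic_outputs))  # up to k distinct strings per keyword
```

Under these assumptions a censor needs one blocklist entry per keyword to defeat the deterministic variant, but up to 20 entries per keyword for the sampled variant, and each added entry carries its own false-positive cost.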
Extracted by claude-sonnet-4-6 — review before relying.