FINDING · DETECTION

The paper proposes detecting translation censorship in three steps: back-translate the Chinese text to English with Google Translate, embed each paragraph with distiluse-base-multilingual-cased-v1, and pair paragraphs across the two texts by solving a bipartite matching (a linear sum assignment) whose edge costs are negated cosine similarities. Paragraphs whose similarity falls below a threshold are flagged as cut; matched paragraphs are then compared again at the sentence level to detect alterations.
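The matching step can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the paragraph embeddings are already computed and passed in as plain lists of floats, and that the original is the side being checked for cuts. The names `cosine`, `assign` and `match_paragraphs` and the threshold value 0.5 are hypothetical. The sketch uses SciPy's `linear_sum_assignment` when SciPy is installed and otherwise falls back to an exhaustive search that is only practical for toy inputs.

```python
import math
from itertools import permutations

try:
    # SciPy's solver for the assignment problem.
    from scipy.optimize import linear_sum_assignment
except ImportError:
    linear_sum_assignment = None


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign(cost):
    """Return (row, col) pairs that minimise total cost, one pair per row or
    per column, whichever side is smaller."""
    if linear_sum_assignment is not None:
        rows, cols = linear_sum_assignment(cost)
        return [(int(r), int(c)) for r, c in zip(rows, cols)]
    # Fallback: exhaustive search over assignments, toy inputs only.
    n_rows, n_cols = len(cost), len(cost[0])
    if n_rows <= n_cols:
        best = min(permutations(range(n_cols), n_rows),
                   key=lambda p: sum(cost[r][c] for r, c in enumerate(p)))
        return list(enumerate(best))
    best = min(permutations(range(n_rows), n_cols),
               key=lambda p: sum(cost[r][c] for c, r in enumerate(p)))
    return sorted((r, c) for c, r in enumerate(best))


def match_paragraphs(original_vecs, translated_vecs, threshold=0.5):
    """Pair original paragraphs with back-translated ones.

    Returns (matched, cut): matched is a list of (orig_idx, trans_idx, similarity);
    cut lists original indices with no match at or above the threshold.
    The default threshold is a placeholder, not a value from the paper.
    """
    if not original_vecs or not translated_vecs:
        return [], list(range(len(original_vecs)))
    sim = [[cosine(o, t) for t in translated_vecs] for o in original_vecs]
    cost = [[-s for s in row] for row in sim]  # negate: the solver minimises cost
    matched = [(r, c, sim[r][c]) for r, c in assign(cost) if sim[r][c] >= threshold]
    kept = {r for r, _, _ in matched}
    cut = [r for r in range(len(original_vecs)) if r not in kept]
    return matched, cut


# Toy vectors standing in for real embeddings: the middle original
# paragraph has no counterpart in the back-translation.
original = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
translated = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]
matched, cut = match_paragraphs(original, translated)
print([(r, c) for r, c, _ in matched], cut)
```

Negating the similarities turns the maximum-similarity pairing into the minimum-cost problem the solver expects. Applying the same function to sentence embeddings within each matched pair would give the sentence-level pass described above.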

From 2023-streisand-where · Where Have All the Paragraphs Gone? Detecting and Exposing Censorship in Chinese Translation · §2 Methodology · 2023 · Free and Open Communications on the Internet

Implications

Tags

censors
cn
techniques
keyword-filtering
measurement-platform

Extracted by claude-sonnet-4-6 — review before relying.