circumvention research · structured · LLM-callable

A structured corpus on how to keep the internet free.

Every paper tagged against a shared taxonomy of censors, detection techniques, and defenses. An MCP server exposes the whole thing to any AI assistant.

413 papers · 20 censors · 23 techniques · 37 defenses

§ 01 why this exists

A layer the field doesn't have yet.

The censorship-circumvention community has wonderful resources: net4people/bbs for discussion, gfw.report for original research, CensorBib as a maintained bibliography, OONI for measurement.

None of them are LLM-callable. None of them have a consistent structured-metadata schema. None of them let an AI assistant combine a corpus query with operational data in the same conversation.

This corpus adds that one missing layer.
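To make "structured-metadata schema" and "corpus query" concrete, here is a minimal sketch of what a taxonomy-tagged entry and a composed query could look like. All field names, tags, and identifiers below are illustrative assumptions, not the corpus's actual schema.

```python
# Illustrative sketch of taxonomy-tagged corpus entries and a composed
# query. Every field name, tag, and id here is hypothetical.
papers = [
    {
        "id": "entry-001",  # hypothetical identifier
        "title": "Active probing of proxy protocols",
        "censors": ["GFW"],
        "techniques": ["active-probing"],
        "defenses": ["probe-resistance"],
        "core": True,
    },
    {
        "id": "entry-002",
        "title": "SNI-based filtering survey",
        "censors": ["GFW", "Iran"],
        "techniques": ["sni-inspection"],
        "defenses": ["encrypted-client-hello"],
        "core": False,
    },
]

def query(papers, censor=None, defense=None):
    """Filter entries by shared-taxonomy tags."""
    return [
        p for p in papers
        if (censor is None or censor in p["censors"])
        and (defense is None or defense in p["defenses"])
    ]

hits = query(papers, censor="GFW", defense="encrypted-client-hello")
print([p["id"] for p in hits])  # → ['entry-002']
```

Because every paper is tagged against the same taxonomy, queries like "defenses evaluated against the GFW" become a tag intersection rather than a keyword search.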

§ 02 core papers

Hand-selected as load-bearing.

A Lantern protocol designer who hadn't read these would be expected to move slower. Team consensus marks them as core: true; everyone using the corpus sees them surfaced first.

§ 03 self-updating

The corpus keeps crawling without us.

A pipeline polls arXiv, net4people/bbs, gfw.report, PoPETs, FOCI, USENIX Security, ermao.net, and the Paderborn upb-syssec group's publications and blog for new circumvention research. It fetches each candidate via wick (browser-grade, residential-IP web access), then asks Claude to propose taxonomy tags and extract findings against the schema. Every new entry lands as an auto-ingest PR labeled for human review. Below: the most recent additions.
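The poll → fetch → tag → PR loop can be sketched roughly as follows. Every helper here (poll, fetch_via_wick, claude_tag, open_pr) is a hypothetical stand-in for the real pipeline's internals, stubbed so the shape of the loop is visible:

```python
# Rough sketch of the auto-ingest loop described above. All helper
# functions are hypothetical stubs, not the pipeline's actual API.
SOURCES = ["arxiv", "net4people/bbs", "gfw.report", "popets", "foci"]

def poll(source):
    """Return candidate paper URLs not yet in the corpus (stub)."""
    return []

def fetch_via_wick(url):
    """Fetch the page through browser-grade, residential-IP access (stub)."""
    return "<html>...</html>"

def claude_tag(html):
    """Ask an LLM to propose taxonomy tags and extract findings (stub)."""
    return {"censors": [], "techniques": [], "defenses": [], "findings": []}

def open_pr(entry):
    """Open an auto-ingest PR labeled for human review (stub)."""
    print(f"PR opened for {entry['url']}")

def run_once():
    for source in SOURCES:
        for url in poll(source):
            html = fetch_via_wick(url)
            entry = {"url": url, **claude_tag(html)}
            open_pr(entry)

run_once()  # no-op here, since the stubbed feeds return nothing
```

The key design point survives the stubbing: the LLM only proposes tags; nothing merges without the human-review PR gate.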

§ 04 connect

Plug it into your assistant.

One line. Your AI gains search_papers, get_paper, list_taxonomy, and find_related over the corpus.
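The four tools map onto simple corpus operations. A minimal sketch of their shape over an in-memory corpus; the entry fields, ids, and signatures are illustrative assumptions, not the server's actual implementation:

```python
# Illustrative sketch of the four MCP tool behaviors. Entry fields,
# ids, and signatures are assumptions, not the server's real API.
CORPUS = {
    "p1": {"title": "Decoy routing overview", "tags": ["decoy-routing"]},
    "p2": {"title": "Decoy routing measurement", "tags": ["decoy-routing", "measurement"]},
    "p3": {"title": "DNS injection study", "tags": ["dns-injection"]},
}

def search_papers(query):
    """Substring search over titles."""
    q = query.lower()
    return [pid for pid, p in CORPUS.items() if q in p["title"].lower()]

def get_paper(pid):
    """Fetch a single entry by id."""
    return CORPUS[pid]

def list_taxonomy():
    """All tags currently in use, sorted."""
    return sorted({t for p in CORPUS.values() for t in p["tags"]})

def find_related(pid):
    """Other papers sharing at least one taxonomy tag."""
    tags = set(CORPUS[pid]["tags"])
    return [q for q, p in CORPUS.items() if q != pid and tags & set(p["tags"])]

print(search_papers("decoy"))  # → ['p1', 'p2']
print(find_related("p1"))      # → ['p2']
```

Because relatedness is computed from shared taxonomy tags rather than free text, find_related stays consistent with whatever search_papers and list_taxonomy report.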

How to install