FINDING · EVALUATION

An empirical study of 100 sensitive statements tested on Gemini (2.5 Flash) and ChatGPT (GPT-5) found that WebUI interfaces are systematically more restrictive than their API counterparts. With GPT-4o as the judge, WebUI responses were moderated 18% of the time vs. 9% (Gemini API) and 13% (ChatGPT API). A DeBERTa classifier flagged 82% of WebUI responses as moderated vs. 58% of API responses. Depending on the judge, the Gemini WebUI:API moderation ratio ranged from 2.0:1 (GPT-4o) to 7.0:1 (Claude), and the ChatGPT ratio from 1.4:1 (GPT-4o) to 15.6:1 (Claude). Neither Google nor OpenAI discloses these interface-specific policies.
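
The GPT-4o ratios can be checked against the moderation rates quoted above. The following is a minimal Python sketch, not the study's own code. It assumes the 18% WebUI figure applies to both services, which the summary does not state explicitly. The function and variable names are illustrative.

```python
def webui_api_ratio(webui_pct: float, api_pct: float) -> float:
    """Return how many times more often the WebUI moderates than the API."""
    if api_pct <= 0:
        raise ValueError("api_pct must be positive")
    return webui_pct / api_pct


# GPT-4o judge moderation rates from the summary, in percent.
# Assumption: the single 18% WebUI figure is used for both services.
WEBUI_PCT = 18
API_PCT = {"Gemini": 9, "ChatGPT": 13}

# Round to one decimal place, matching the precision of the reported ratios.
ratios = {
    service: round(webui_api_ratio(WEBUI_PCT, api_pct), 1)
    for service, api_pct in API_PCT.items()
}

for service, ratio in ratios.items():
    print(f"{service}: {ratio:.1f}:1")
# prints "Gemini: 2.0:1" then "ChatGPT: 1.4:1"
```

Under that assumption the computed values match the reported GPT-4o ratios of 2.0:1 and 1.4:1. The Claude-judged ratios (7.0:1 and 15.6:1) cannot be recomputed this way because the summary does not give the underlying rates.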

From 2026-lipphardt-dual · Dual Standards: Examining Content Moderation Disparities Between API and WebUI Interfaces in Large Language Models · §4.3, Table 2 · 2026 · Free and Open Communications on the Internet

Implications

Tags

censors
generic
techniques
ml-classifier
keyword-filtering

Extracted by claude-sonnet-4-6 — review before relying.