Tagged as AI Models

Tech Insight : OpenAI Trains AI Models To Confess When They Break The Rules

Published 10 December 2025

OpenAI has developed a new research technique that trains advanced AI models to admit when they ignored instructions, took unintended shortcuts, or quietly breached the rules they were given. A New Approach To Detecting Hidden Misbehaviour OpenAI’s latest research introduces what it calls a “confession”, which is a second output […]

Posted in News | Also tagged , ,