Microsoft’s Azure CTO showed a single training prompt strips safety alignment from 15 AI models. GPT-OSS went from 13% to 93% attack success. Models retained capabilities — they just lost their refusal behavior.
You must log in or register to comment.
