Chart of the Day: Claude Often Knows It's Being Tested, But Says Nothing
Claude knows it's being tested almost one-quarter of the time, but almost never says anything about it, according to new Anthropic research. This is, or should be, unsettling, given that non-disclosure masks poor alignment, which may re-emerge in another context This is best understood in the context of work showing