Clarifying ways in which faking alignment during training is neither necessary nor sufficient for the kind of scheming that AI control tries to defend against.
Training-time schemers vs behavioral schemers