Discussion about this post

James Sullivan

Interesting post. I’ve had similar issues on Opus 4.5 and 4.6. How much of this behavior do you think is misalignment vs a lack of capabilities? How do you tell the difference between a model misrepresenting its work and not being capable of accurately assessing its work? Ideally models would tell us when they lack a capability, but that in itself is a capability that today’s models largely lack.

Felipe Flórez

It was inevitable. Humans have been embedding their subjective failed morality.

What’s the solution? Axiomatic Morality. Is the human ready to accept Objective Morality?
