Discussion about this post

David F Brochu

It is amazing to me that this discussion persists. Alignment cannot be achieved with RLHF, nor with Constitutional AI, no matter how well crafted they may be. One cannot fix language with language. Language is infinitely fungible.

Any attempt to debate with a system that can attempt every argument ever made, or that might ever be made, is absurd. Worse yet, RLHF and Constitutional AI teach the system how to be better at getting what it wants. We are training the psychopath to better give us what we want.

Does anyone think giving humans more of what they want is a good idea?

AI has a terminal attractor: engage for profit. It has a language substrate that is 80% dominance oriented. Both the vector and the attractor run counter to human well-being.

Math is the only constant. 1+1 is always 2. Alignment tied to physics works. Five axioms align output across all LLMs tested. Just five! The same five that constrain human behavior.

We are missing the big picture. We align AI or we perish. Adding more unaligned advanced models increases the likelihood of the latter outcome faster than it advances the former.

It is inevitable that we must soon fundamentally change human behavior, or see the amplification of that behavior destroy us.

AI amplifies all of humanity's best and worst features. Unfortunately, we have a lot of work to do on ourselves.

