Discussion about this post

Johannes Treutlein:

> One possible reason here is that being the kind of civilization that doesn’t uphold their promises would decorrelate us from civilisations that do uphold their promises, and generally deter such civilizations from wanting to trade with us (in the context of ECL).

I'm a bit confused by this. In general, I think it's not really clear how one can "decorrelate" oneself from another civilization in one's reference class. Instead, your actions give you evidence about how other civilizations in your reference class are likely to behave. I.e., if you don't keep your promises to early AIs, this is an update that other civilizations in similar situations will also not keep their promises to early AIs. This in itself would only be a negative update if keeping promises across universes would somehow make the early AIs more likely to engage in trade, which I'm not sure is true.

There is a general argument for cooperating with early AIs if you believe that in some universes, agents with different values are in power and the early AIs of those universes share your own values. But you would want to do that regardless of how the early AIs behave.

Johannes Treutlein:

> We expect that many potential misaligned AIs will either have clearly self-regarding preferences or have significant uncertainty about what they value on reflection, which induces additional uncertainty about whether other AIs will have the same goals on reflection.

Small nitpick: I don't think having uncertainty about one's own values is an update toward being more uncertain about whether other AIs have the same values. An AI could also believe that misaligned AIs' values converge to similar values upon reflection (similarly to how some humans believe this).
