3 Comments
sim gishel

Interesting, I share many of the intuitions here, but come to very different probabilities. The basic flow of the argument is that AIs might care a little bit about us, and that this little bit of caring will make the AI keep us around (though we would only be given a very small fraction of the universe).

I agree this is reasonable, but after thinking about it, I can't honestly give a probability of more than 10-15% that they'd care even "a little bit". The reasons you give for the AI caring are:

1. Preferences they get from training. My basic interpretation of a situation where this would make sense is: the AI has a utility function where killing a living human = -1 utils, whatever humans would do if given a star = +10 utils, and whatever the AI can do with a star = +1000 utils. So the AI deems it worth it to sacrifice a small fraction of the lightcone to avoid killing currently existing humans, e.g. by slowing down industrial expansion, but will not give humans any more resources than what's required to keep them alive and plausibly happy. (Killing all humans costs 8e9 utils, which is much more than the solar system is worth to the AI, but much less than the lightcone is worth.) A toy sketch of this tradeoff is below, after the list.

2. Acausal trade with humans from other Everett branches where we solved alignment.

3. Acausal trade with aliens.

4. It keeps humans around in anticipation of future causal trade with aliens.
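To make the toy arithmetic in (1) concrete, here is a minimal sketch of that utility function. The utils are the illustrative numbers from above; the lightcone star count is a made-up order-of-magnitude figure, and none of this is meant as a real estimate.

```python
N_HUMANS = 8e9             # currently existing humans
N_STARS_LIGHTCONE = 4e20   # assumed order-of-magnitude star count for the reachable universe

UTIL_KILL_HUMAN = -1       # utils per human killed
UTIL_HUMAN_STAR = 10       # utils per star humans get to use
UTIL_AI_STAR = 1000        # utils per star the AI uses itself

def ai_utility(stars_given_to_humans, humans_killed):
    """Total utility of an outcome under the toy model."""
    stars_kept = N_STARS_LIGHTCONE - stars_given_to_humans
    return (stars_kept * UTIL_AI_STAR
            + stars_given_to_humans * UTIL_HUMAN_STAR
            + humans_killed * UTIL_KILL_HUMAN)

# Sparing humanity at the cost of one star beats killing everyone:
spare = ai_utility(stars_given_to_humans=1, humans_killed=0)
kill = ai_utility(stars_given_to_humans=0, humans_killed=N_HUMANS)
print(spare > kill)      # True: ~990 utils of forgone star value vs 8e9 utils of killing

# ...but handing humans even 1% of the stars is not worth it to the AI:
generous = ai_utility(stars_given_to_humans=0.01 * N_STARS_LIGHTCONE, humans_killed=0)
print(spare > generous)  # True: so humans are kept alive but given almost nothing
```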

I'd put like 1% on (1), 1% on (2), and like 5-10% on (3) and (4) jointly, since they're pretty correlated.
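For concreteness, those per-reason numbers combine roughly like this (a crude back-of-the-envelope aggregation that treats the reasons as approximately independent ways the AI could end up caring):

```python
p_training = 0.01                          # reason (1)
p_everett = 0.01                           # reason (2)
p_aliens_low, p_aliens_high = 0.05, 0.10   # reasons (3) and (4) jointly

for p34 in (p_aliens_low, p_aliens_high):
    # P(at least one reason holds) = 1 - prod(1 - p_i); for small
    # probabilities this is close to just summing them.
    p_any = 1 - (1 - p_training) * (1 - p_everett) * (1 - p34)
    print(round(p_any, 3))   # ~0.069 and ~0.118
```

That lands at roughly 7-12%, i.e. within the 10-15% ceiling I gave above.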

The reason (1) seems so unlikely to me is that, from the "standard" alignment perspective, getting AIs to care about humans in themselves, in a way that won't be Goodharted from our perspective by an ASI, requires many things to go very right in our alignment efforts. If we manage that, I think we will not have takeover, and if we have takeover, I don't think our alignment will have worked nearly that well. In the case where the AI does take over but some aspects of human-related alignment goals got into the utility function, it seems it would 99% of the time (made-up number) be of the "low-fidelity humans are simulated and say 'amazing work!!!' 24/7" variety.

Reason (2) seems low to me for reasons basically covered by the discussion in the post you cited. In the cases where we are killed, I expect that outcome to be fairly overdetermined, with the branches where humanity survived being sparse enough that they'd have very, very little to bargain with.

Why reasons (3) and (4) are low would take a long time to explain, and it depends on how you model the existence of aliens, their interactions in the far future, and which equilibrium they converge towards. But the short summary is that I think you'd need a non-negligible fraction of aliens to not just be some flavor of kind, but kind in the very specific sense where they'd be sad about humans not existing, in a way that couldn't be made up for by having a trillion happy beings exist somewhere else.

Artep

I am afraid you lost me halfway through. I mean, I could be run over by a bus tomorrow. What is the point of this speculation, really?

Austin Morrissey

Thanks Ryan; I appreciate your posts.
