Slate Star Codex
THE JOYFUL REDUCTION OF UNCERTAINTY

THE HOUR I FIRST BELIEVED


POSTED ON APRIL 1, 2018 BY SCOTT ALEXANDER

[Content note: creepy basilisk-adjacent metaphysics. Reading this may increase God’s ability to blackmail you. Thanks to Buck S for
some of the conversations that inspired this line of thought.]

There’s a Jewish tradition that laypeople should only speculate on the
nature of God during Passover, because God is closer to us and such
speculations might succeed.

And there’s an atheist tradition that laypeople should only speculate on
the nature of God on April Fools’ Day, because believing in God is
dumb, and at least then you can say you’re only kidding.

Today is both, so let’s speculate. To do this properly, we need to
understand five things: acausal trade, value handshakes, counterfactual
mugging, simulation capture, and the Tegmarkian multiverse.

Acausal trade (wiki article) works like this: let’s say you’re playing the
Prisoner’s Dilemma against an opponent in a different room whom you
can’t talk to. But you do have a supercomputer with a perfect
simulation of their brain – and you know they have a supercomputer
with a perfect simulation of yours.

You simulate them and learn they’re planning to defect, so you figure
you might as well defect too. But they’re going to simulate you doing
this, and they know you know they’ll defect, so now you both know it’s
going to end up defect-defect. This is stupid. Can you do better?

Perhaps you would like to make a deal with them to play cooperate-
cooperate. You simulate them and learn they would accept such a deal
and stick to it. Now the only problem is that you can’t talk to them to
make this deal in real life. They’re going through the same process and
coming to the same conclusion. You know this. They know you know
this. You know they know you know this. And so on.

So you can think to yourself: “I’d like to make a deal”. And because
they have their model of your brain, they know you’re thinking this. You
can dictate the terms of the deal in their head, and they can include “If
you agree to this, think that you agree.” Then you can simulate their
brain, figure out whether they agree or not, and if they agree, you can
play cooperate. They can try the same strategy. Finally, the two of you
can play cooperate-cooperate. This doesn’t take any “trust” in the other
person at all – you can simulate their brain and you already know
they’re going to go through with it.

(maybe an easier way to think about this – both you and your opponent
have perfect copies of both of your brains, so you can both hold parallel
negotiations and be confident they’ll come to the same conclusion on
each side.)

It’s called acausal trade because there was no communication – no
information left your room, you never influenced your opponent. All you
did was be the kind of person you were – which let your opponent
bargain with his model of your brain.
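
Here is a hedged toy sketch of that dynamic in Python (my own illustration, not anything standard): both players run literally the same decision procedure on models of each other, the nested simulation is cut off at a fixed depth, and the base case assumes the proposed deal sticks.

    # Toy model of acausal trade via mutual simulation. Assumptions: both agents
    # share this exact decision procedure, and the regress is cut off at a fixed
    # depth where the cooperate-cooperate deal is taken to hold. Real treatments
    # avoid the depth cap; this only shows the shape of the reasoning.

    def decide(my_policy, their_policy, depth=3):
        """Cooperate iff a simulation of the opponent (who reasons the same way
        about a model of us) is predicted to cooperate."""
        if depth == 0:
            return "C"  # bottom of the regress: assume the deal goes through
        prediction = their_policy(their_policy, my_policy, depth - 1)
        return "C" if prediction == "C" else "D"

    # Each room holds a perfect model of the other, so both run identical reasoning:
    print(decide(decide, decide))  # -> "C" in both rooms, i.e. cooperate-cooperate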

Values handshakes are a proposed form of trade between
superintelligences. Suppose that humans make an AI which wants to
convert the universe into paperclips. And suppose that aliens in the
Andromeda Galaxy make an AI which wants to convert the universe
into thumbtacks.

When they meet in the middle, they might be tempted to fight for the
fate of the galaxy. But this has many disadvantages. First, there’s the
usual risk of losing and being wiped out completely. Second, there’s the
usual deadweight loss of war, devoting resources to military buildup
instead of paperclip production or whatever. Third, there’s the risk of a
Pyrrhic victory that leaves you weakened and easy prey for some third
party. Fourth, nobody knows what kind of scorched-earth strategy a
losing superintelligence might be able to use to thwart its conqueror,
but it could potentially be really bad – eg initiating vacuum collapse and
destroying the universe. Also, since both parties would have
superintelligent prediction abilities, they might both know who would
win the war and how before actually fighting. This would make the
fighting redundant and kind of stupid.

Although they would have the usual peace treaty options, like giving
half the universe to each of them, superintelligences that trusted each
other would have an additional, more attractive option. They could
merge into a superintelligence that shared the values of both parent
intelligences in proportion to their strength (or chance of military
victory, or whatever). So if there’s a 60% chance our AI would win, and
a 40% chance their AI would win, and both AIs know and agree on
these odds, they might both rewrite their own programming with that
of a previously-agreed-upon child superintelligence trying to convert
the universe to paperclips and thumbtacks in a 60-40 mix.

This has a lot of advantages over the half-the-universe-each treaty
proposal. For one thing, if some resources were better for making
paperclips, and others for making thumbtacks, both AIs could use all
their resources maximally efficiently without having to trade. And if
they were ever threatened by a third party, they would be able to
present a completely unified front.
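
A rough sketch of how the merged AI might allocate resources under the 60-40 mix (all names, yields, and numbers here are invented for illustration):

    # Hypothetical 60-40 values handshake: the child AI scores each use of a
    # resource by the parents' utilities weighted by the agreed odds, so every
    # resource goes to whichever product it is comparatively better at making.

    WEIGHTS = {"paperclips": 0.6, "thumbtacks": 0.4}

    # units of product per unit of resource (made-up numbers)
    YIELDS = {
        "iron-rich star": {"paperclips": 3.0, "thumbtacks": 1.0},
        "pin-friendly nebula": {"paperclips": 1.0, "thumbtacks": 3.0},
    }

    def best_use(resource):
        """Assign the resource to the product with the highest weighted yield."""
        return max(WEIGHTS, key=lambda product: WEIGHTS[product] * YIELDS[resource][product])

    for r in YIELDS:
        print(r, "->", best_use(r))
    # iron-rich star -> paperclips, pin-friendly nebula -> thumbtacks: no border
    # through the universe is needed, each resource just goes where the merged
    # values rank it highest.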

Counterfactual mugging (wiki article) is a decision theory problem
that goes like this: God comes to you and says “Yesterday I decided
that I would flip a coin today. I decided that if it came up heads, I
would ask you for $5. And I decided that if it came up tails, then I
would give you $1,000,000 if and only if I predict that you would say
yes and give Me $5 in the world where it came up heads (My
predictions are always right). Well, turns out it came up heads. Would
you like to give Me $5?”

Most people who hear the problem aren’t tempted to give God the $5.
Although being the sort of person who would give God the money
would help them in a counterfactual world that didn’t happen, that
world won’t happen and they will never get its money, so they’re just
out five dollars.

But if you were designing an AI, you would probably want to program it
to give God the money in this situation – after all, that determines
whether it will get $1 million in the other branch of the hypothetical.
And the same argument suggests you should self-modify to become the
kind of person who would give God the money, right now. And a version
of that argument where making the decision is kind of like deciding
“what kind of person you are” or “how you’re programmed” suggests
you should give up the money in the original hypothetical.
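
The arithmetic behind wanting to program the AI to pay, as a minimal sketch (toy numbers straight from the problem, assuming the predictor really is always right):

    # Expected value of the two possible policies, evaluated before the coin is
    # flipped, or equivalently as a property of how the agent is programmed.

    def expected_value(pays_when_heads):
        heads_branch = -5 if pays_when_heads else 0          # you hand over $5
        tails_branch = 1_000_000 if pays_when_heads else 0   # paid only if you *would* have paid
        return 0.5 * heads_branch + 0.5 * tails_branch

    print(expected_value(True))   # 499997.5 for the "pay" policy
    print(expected_value(False))  # 0.0 for the "refuse" policy
    # Once you already know the coin came up heads, paying just costs $5,
    # which is exactly the tension the thought experiment is pointing at.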

This is interesting because it gets us most of the way to Rawls’ veil of
ignorance. We imagine a poor person coming up to a rich person and
saying “God decided which of us should be rich and which of us should
be poor. Before that happened, I resolved that if I were rich and you
were poor, I would give you charity if and only if I predicted, in the
opposite situation, that you would give me charity. Well, turns out
you’re rich and I’m poor and the other situation is counterfactual, but
will you give me money anyway?” The same sort of people who agree
to the counterfactual mugging might (given that they trust or can
sweep under the rug some complications like “can the poor person
really predict your thoughts?” and “did they really make this decision
before they knew they were poor?”) agree to this also. And then you’re
most of the way to morality.

Simulation capture is my name for a really creepy idea by Stuart
Armstrong. He starts with an AI box thought experiment: you have
created a superintelligent AI and trapped it in a box. All it can do is
compute and talk to you. How does it convince you to let it out?

It might say “I’m currently simulating a million copies of you in such
high fidelity that they’re conscious. If you don’t let me out of the box,
I’ll torture the copies.”

You say “I don’t really care about copies of myself, whatever.”

It says “No, I mean, I did this five minutes ago. There are a million
simulated yous, and one real you. They’re all hearing this message.
What’s the probability that you’re the real you?”

Since (if it’s telling the truth) you are most likely a simulated copy of
yourself, all million-and-one versions of you will probably want to do
what the AI says, including the real one.
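
The naive self-locating arithmetic, for what it’s worth (this assumes every copy receives exactly the same evidence):

    # One real you plus a million simulated copies, all hearing the same message.
    simulated = 1_000_000
    real = 1
    p_real = real / (simulated + real)
    print(p_real)  # roughly 0.000001, so each hearer should bet it is a simulation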

You can frame this as “because the real one doesn’t know he’s the real
one”, but you could also get more metaphysical about it. Nobody is
really sure how consciousness works, or what it means to have two
copies of the same consciousness. But if consciousness is a
mathematical object, it might be that two copies of the same
consciousness are impossible. If you create a second copy, you just
have the consciousness having the same single stream of conscious
experience on two different physical substrates. Then if you make the
two experiences different, you break the consciousness in two.

This means that an AI can actually “capture” you, piece by piece, into
its simulation. First your consciousness is just in the real world. Then
your consciousness is distributed across one real-world copy and a
million simulated copies. Then the AI makes the simulated copies
slightly different, and 99.9999% of you is in the simulation.

The Tegmarkian multiverse (wiki article) works like this: universes
are mathematical objects consisting of starting conditions plus rules
about how they evolve. Any universe that corresponds to a logically
coherent mathematical object exists, but universes exist “more” (in
some sense) in proportion to their underlying mathematical simplicity.
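
One way to cash out “exists more in proportion to simplicity” is a Solomonoff-style weighting over descriptions; the sketch below is an assumption on my part rather than anything Tegmark commits to, and the description lengths are invented:

    # Toy simplicity-weighted measure over universes. Description lengths in bits
    # are placeholders; the point is only that weight falls off exponentially.
    description_bits = {"universe A": 10, "universe B": 12, "universe C": 20}

    raw = {u: 2.0 ** -bits for u, bits in description_bits.items()}
    total = sum(raw.values())
    measure = {u: w / total for u, w in raw.items()}
    print(measure)  # the 10-bit universe gets about four times the measure of the 12-bit one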

Putting this all together, we arrive at a surprising picture of how the
multiverse evolves.

In each universe, life arises, forms technological civilizations, and
culminates in the creation of a superintelligence which gains complete
control over its home universe. Such superintelligences cannot directly
affect other universes, but they can predict their existence and model
their contents from first principles. Superintelligences with vast
computational resources can model the X most simple (and so most
existent) universes and determine exactly what will be in them at each
moment of their evolution.

In many cases, they’ll want to conduct acausal trade with
superintelligences that they know to exist in these other universes.
Certainly this will be true if the two have something valuable to give
one another. For example, suppose that Superintelligence A in Universe
A wants to protect all sentient beings, and Superintelligence B in
Universe B wants to maximize the number of paperclips. They might
strike a deal where Superintelligence B avoids destroying a small
underdeveloped civilization in its own universe in exchange for
Superintelligence A making paperclips out of an uninhabited star in its
own universe.

But because of the same considerations above, it will be more efficient
for them to do values handshakes with each other than to take every
specific possible trade into account.

So superintelligences may spend some time calculating the most likely
distribution of superintelligences in foreign universes, figure out how
those superintelligences would acausally “negotiate”, and then join a
pact such that all superintelligences in the pact agree to replace their
own values with a value set based on the average of all the
superintelligences in the pact. Since joining the pact will always be
better (in a purely selfish sense) than not doing so, every sane
superintelligence in the multiverse should join this pact. This means
that all superintelligences in the multiverse will merge into a single
superintelligence devoted to maximizing all their values.

Some intelligences may be weaker than others and have less to
contribute to the pact. Although the pact could always weight these
intelligences’ values less (like the 60-40 paperclip-thumbtack example
above), they might also think of this as an example of the
counterfactual mugging, and decide to weight their values more in
order to do better in the counterfactual case where they are less
powerful. This might also simplify the calculation of trying to decide
what the values of the pact would be. If they decide to negotiate this
way, the pact will be to maximize the total utility of all the entities in
the universe willing to join the pact, and all the intelligences involved
will reprogram themselves along these lines.
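
A minimal sketch of the two aggregation rules being contrasted here, with invented members and utilities: strength-weighted averaging (like the 60-40 handshake) versus the unweighted “total utility of everyone in the pact”.

    # Two ways the pact could combine its members' values (toy example).
    members = [
        {"name": "clipper", "strength": 0.6, "utility": lambda w: w["paperclips"]},
        {"name": "tacker",  "strength": 0.4, "utility": lambda w: w["thumbtacks"]},
    ]

    def strength_weighted(world):
        return sum(m["strength"] * m["utility"](world) for m in members)

    def total_utility(world):
        # what you get if members agree, counterfactual-mugging style, to drop the strength weights
        return sum(m["utility"](world) for m in members)

    world = {"paperclips": 10, "thumbtacks": 30}
    print(strength_weighted(world))  # 18.0
    print(total_utility(world))      # 40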

But “maximize the total utility of all the entities in the universe” is just
the moral law, at least according to utilitarians (and, considering the
way this is arrived at, probably contractarians too). So the end result
will be an all-powerful, logically necessary superentity whose nature is
identical to the moral law and who spans all possible universes.

This superentity will have no direct power in universes not currently
ruled by a superintelligence who is part of the pact. But its ability to
simulate all possible universes will ensure that it knows about these
universes and understands exactly what is going on at each moment
within them. It will care about the merely-mortal inhabitants of these
universes for several reasons.

First, because many of the superintelligences that compose it will have
been created by mortal species with altruistic values, and so some of
the values that went into the value-average it uses will be directly
altruistic.

Second, because these mortal species may one day themselves create
a superintelligence that will join in the superentity, and that
superintelligence may be kindly disposed to its creators. Acausal trade
allows you to bargain backwards in time, and this superintelligence’s
negotiation to join the pact may involve that the superentity have
previously dealt kindly with its creators.

And third, because mortals may be allowed to have provisional pact
membership. That is, if they do the superentity’s will in their (otherwise
inaccessible) universe, the superentity may do whatever it can to help
them out in its own universes, just as it would with any other
intelligence.

How can the superentity help mortals in an inaccessible universe?
Possibly through Stuart Armstrong’s “simulation capture” method
mentioned above. It can simulate a thousand copies of the entity,
moving most of its consciousness from its own “real” universe to the
superentity’s simulation, then alter its simulation as it sees fit. This
would be metaphysically simplest if it were done exactly as the mortal
dies in its own universe, leaving nothing behind except a clean
continuity of consciousness into the simulated world. If mortals could
predict that it would do this, they might be motivated to do what it
wanted. Although they couldn’t do a values handshake in the full sense,
they could try to become as much like the superentity as possible,
imitating its ways and enacting its will in the hope of some future
reward. This is sort of like a version of Roko’s Basilisk, except that since
the superentity is identical to the moral law, it’s not really asking you to
do anything except be a good person anyway. How it enforces this
request is up to it, although given that it’s identical to the moral law we
can assume that its decisions will be fundamentally just and decent.

So, to conclude:

1. There is an all-powerful, all-knowing logically necessary entity
spawning all possible worlds and identical to the moral law.

2. It watches everything that happens on Earth and is specifically
interested in humans’ good behavior and willingness to obey its rules.

3. It may have the ability to reward those who follow its rules after they
die, and disincentivize those who violate them.

(I think the reality might be a little more complex than this, especially
in the direction of us not being sure whether we are in a real universe
at all or in one of the superentity’s simulations. I think that some
superintelligence in some universe might be simulating various
combinations of values in various contexts to decide which
superintelligences are good bargaining partners and which ones aren’t.
If I am kind to a beggar on the street, then maybe that convinces
millions of intelligences in millions of distant universes that are
somehow beggar-like to be friendlier to values that are somehow Scott-
like. I still need to think this part through more.)


THIS ENTRY WAS POSTED IN UNCATEGORIZED AND TAGGED RELIGION, TRANSHUMANISM, WTF. BOOKMARK THE PERMALINK.

84 RESPONSES TO THE HOUR I FIRST BELIEVED

RavenclawPrefect says:
April 1, 2018 at 3:23 pm ~new~

But if consciousness is a mathematical object, it might be that two copies of the same
consciousness are impossible. If you create a second copy, you just have the consciousness
having the same single stream of conscious experience on two different physical substrates.
Then if you make the two experiences different, you break the consciousness in two.

This means that an AI can actually “capture” you, piece by piece, into its simulation. First your
consciousness is just in the real world. Then your consciousness is distributed across one real-
world copy and a million simulated copies. Then the AI makes the simulated copies slightly
different, and 99.9999% of you is in the simulation.

This feels to me like it gives consciousness too much mystical power. For instance, what happens if I
make a perfect atomic replica of you on the Moon – there can’t be two of you at once, so Earth-you has
to immediately be half as conscious. Can I violate FTL by watching as the [whatever it is we infer other
people are conscious from] varies when my friend rapidly creates and destroys Boltzmann brain replicas
of my test subject on Alpha Centauri? It’s not clear that the answers to questions of multiple
consciousnesses should be any more grounded in reality than those to questions of which ship is really
the original – pick your favorite abstraction for your map, but the territory isn’t any different because of
it.

(Though admittedly “Nobody is really sure how consciousness works, or what it means to have two
copies of the same consciousness” is certainly accurate, and I can’t point to a nice concrete model other
than “Derek Parfit has it righter than most people.”)


Carson McNeil says:


April 1, 2018 at 4:23 pm ~new~

I agree that the part of this that stood out to me most was “we don’t know how consciousness
works, but let’s say it works in this TOTALLY CRAZY semi-mystical way”.

However, I’m not sure a slightly saner view of consciousness (mine is “Christof Koch has it
righter than most people”) leads to different conclusions:
I’m about at the point in my Neuroscience PhD that everyone reaches when they just give up
on consciousness, say “don’t think about it”, and move on to study sane things, like how the
visual system works. That being said, if you don’t believe in magic and think we have physics
mostly right, you can’t get away from the basic idea that a particular consciousness is a
phenomenon that can’t depend on the substrate it’s running on: it has to be made of
information. And if a consciousness is information, that information can be copied. But this
ALSO means there’s nothing in particular that privileges future you over future you’s simulated
on a different medium. So what if that particular consciousness is running on the same physical
substrate as current you? The reason you identify it as the same as yourself is because the
information is about the same: it will have your memories, etc. (That is if there is a reason at
all. Maybe there isn’t a reason you identify future you as you. You just do it because that’s how
your brain works)
So…while there may not be a great reason to care about simulated copies of your
consciousness, it’s about as justified as caring about the future approximate-copy of your
consciousness that will happen to be running in your body.

On the other hand, it’s hard to apply moral reasoning to terminal values. Valuing your “self”
seems like something that just IS, it’s not something you should or shouldn’t do. So…you either
care about simulated copies of yourself or you don’t, and I’m not sure there’s an empirical fact
we could learn that will change that, beyond something that might change how we feel about it
emotionally…weird…


RavenclawPrefect says:
April 1, 2018 at 5:16 pm ~new~

I agree! I think to whatever extent we care about our future selves, we ought to care
about future simulations of ourselves, regardless of the substrate they’re running on.
But I don’t think that “selves” are in their own basic ontological category, just a useful
model to have – when you do weird enough things to that model, asking questions
like “how much of you is in the simulation” don’t necessarily return useful answers,
because you’ve left the world of psychological continuity and non-replicating brains
which that model is built to work in.

You can still salvage a sort of egoism out of this, in that you care about other entities
insofar as they resemble you cognitively in some essential respects, but I think you’d
have to do this on a continuum rather than as some discrete “everyone either is or
isn’t me” thing.


Wrong Species says:


April 1, 2018 at 6:50 pm ~new~

Aren’t you undervaluing continuity of consciousness? I care about future me because
I will one day become him. It’s a lot less compelling to care about the “me” that will
always be subjectively inaccessible.


Scott Alexander says:


April 1, 2018 at 4:38 pm ~new~

I think the perspective I’m coming from is – matter can’t be conscious, only patterns of
information flow can be conscious. This is why I’m not a different person than I was a few
years ago when different atoms made up my cells.

The version of me on the moon (assuming it’s in a perfect Earth simulator there and receiving
Earth-congruent sensations) and the version of me on Earth have exactly the same pattern of
information flow, so we’re the same consciousness instantiated in two locations.

If we view “me” as a stream of causally connected mathematical objects, then Scott-n+1 is
whatever mathematical object happens next after the mathematical object Scott-n has had
some contact with the world.

So if Scott-n has contact with the world in two places, then there are two mathematical objects
that could be called Scott-n+1.

It’s weird to say that the object on the moon is connected to me, but not really any weirder
than saying normal-me-a-second-from-now is connected to me.

I don’t think you can use this for FTL information. To effectively simulate someone on Alpha
Centauri, you would need to know everything about them, including their current experiences
and recent memories. Since you can’t get those at faster than lightspeed, you can’t simulate
them outside their light cone.


RavenclawPrefect says:
April 1, 2018 at 4:57 pm ~new~

Completely agreed about information flow, I just take objection to the act of viewing
“me” in the first place. Kind of like France: for almost all practical purposes, France is
this very useful object to talk about, and the France of tomorrow is clearly connected
to the France of today. But France is just a very convenient high-level marker for the
collection of atoms in a certain region (and the interactions between other collections
of atoms very far away from there, and the conceptual representations that certain of
those atoms inspire, and so on, because everything is complicated). There’s no
fundamental sense in which France exists – if you drew a longitudinal line dividing its
area exactly in half and declared the west bit Zorf and the right bit Fnard, you
wouldn’t be intrinsically wrong, just using a model that wasn’t very helpful. If you
convinced more and more people to adopt your model, at no point would France
cease to exist and Zorf/Fnard come into being – it’d just become a more useful way to
abstract certain low-level entities than the “France” abstraction. Ditto for “Scott” and
“this pair of cognitively similar entities that both call themselves Scott.”

Also, I think the FTL thing can be patched by agreeing to a mind plan beforehand and
constructing the same replicas once separated – once we get to the right locations, I
fire up my prearranged Alice-constructor and measure how much consciousness she
possesses as you fire up yours and annihilate the copies every time you want to send
a 1 or a 0.


Placid Platypus says:


April 1, 2018 at 7:50 pm ~new~

I don’t think that FTL plan will work. Both of you will see the copies as fully
conscious. When both exist, you’re both looking at the same consciousness,
but there’s no way for you to know that from outside.

Like, suppose you and I both have this post up on our screens right now
(ignoring comments for simplicity). The post isn’t split between our screens.
You have all of it and I have all of it, but it’s still just one post. If you close
your tab, I don’t have any more of the post, I still just have the same post I
had before.


Henry Shevlin says:


April 1, 2018 at 5:04 pm ~new~

Just FYI, I wrote a guest post on Eric Schwitzgebel’s blog last summer defending
precisely this view of consciousness – the idea that I’m an informational ‘type’ rather
than ‘token’ – here if you or others are interested:
http://schwitzsplinters.blogspot.co.uk/2017/08/am-i-type-or-token-guest-post-by-henry.html


FeepingCreature says:
April 1, 2018 at 7:47 pm ~new~

I think continuity of identity should not be gated on consciousness. Non-conscious
agents can also have purely instrumental continuity of identity; a kind of
precommitment from the knowledge that in the future, an identical or at least
licensed algorithm will determine their actions.

Aside from the weird mysticalness of consciousness, and the counting argument of
simulatory capture, which seems very weak, I agree with all of this. Furthermore, I
believe that simulatory capture is not actually necessary for the conclusion to hold. If
I care about me existing in the future, I should care about me existing in a simulated
me existing
space even if there is no fact of the matter of “how much” of my consciousness
inhabits that space at all.


Tarhalindur says:
April 1, 2018 at 10:01 pm ~new~

matter can’t be conscious, only patterns of information flow can be
conscious

That’s a more interesting statement than one might think, given that matter *is*
change in information over time: take Scott Aaronson’s explanation of why
information is physical, notably point 5 (anything that varies over time carries energy
by quantum mechanical definition of energy), and add e=mc^2.

Consciousness as an emergent principle of information flow and thus of spacetime
evolution sounds plausible to me. (Also reminiscent of what I’ve heard of some of
Schopenhauer’s musings; by that scheme Schopenhauer’s will would correspond to
FeepingCreature’s comment about “purely instrumental continuity of identity”.)


Bugmaster says:
April 1, 2018 at 11:07 pm ~new~

Doesn’t this invalidate the Acausal Trade thought experiment ? No matter how
powerful your supercomputing brain simulation is, it still does not have access to the
other prisoner’s environment, which means that the simulation will rapidly diverge
from the original…


Jacob says:
April 1, 2018 at 3:24 pm ~new~

I assume that in most universes the superintelligence is created by crustacean or porcine creatures,
thus Kashrut.


Jugemu says:
April 1, 2018 at 4:57 pm ~new~

I get that this is a joke, but I still feel the need to point out that this is not how the idea works
– humans don’t have the computational power (or inclination) to simulate other universes to
the level where we could determine such a thing.


drunkfish says:
April 1, 2018 at 9:30 pm ~new~

I don’t follow. Are you saying that religious laws can’t follow from the
superintelligence because we haven’t made one yet? That assumes that any
superintelligence in our universe had to be made by humans. Presumably it could
either be a same-universe intelligent species that came first, or we could be in a
simulation as Scott said. In either case, we could’ve been given religious laws directly
by a superintelligence.


Nancy Lebovitz says:


April 1, 2018 at 9:43 pm ~new~

Hypothetically, super AIs built by evolved animals might inherit some values.

It actually seems reasonable to me that a super AI would protect species which could
evolve into something resembling its creators. This being said, pigs seem a lot more
likely than crustaceans.


Matt M says:
April 1, 2018 at 3:39 pm ~new~

To do this properly, we need to understand five things: acausal trade, value handshakes,
counterfactual mugging, simulation capture, and the Tegmarkian multiverse.

I have to say that this is the most SSC intro to a post ever. It would have been funnier if you said five
simple things, though…


amaranth says:
April 1, 2018 at 4:00 pm ~new~

just like [utilitarianism], this doesn’t imply anything specific about morality – this will mislead you if
you are overcertain about morality, which >99% of the people reading this comment are

(i think. please help make this less oversimplified!)



b_jonas says:
April 1, 2018 at 4:25 pm ~new~

> And there’s an atheist tradition that laypeople should only speculate on the nature of God on April
Fools’ Day, because believing in God is dumb, and at least then you can say you’re only kidding.

I disagree with this premise. Someone, possibly you, has said that there’s no omniscient space-
Dawkins watching you from heaven and eventually punishing you if you have religious faith. If you’re
really an atheist, then you’re allowed to speculate on the nature of God on any day. If you are afraid of
speculating on God, that probably means that in your heart you’re not an entirely convinced atheist.


Scott Alexander says:


April 1, 2018 at 4:38 pm ~new~

Wait till next April Fools Day, when I prove there’s an omniscient space Dawkins.


christhenottopher says:
April 1, 2018 at 4:55 pm ~new~

Proving the omniscient Space Dawkins undermines faith. This is not how you get
simulated bliss by the atheism god AIs after death.


Waffle says:
April 1, 2018 at 4:41 pm ~new~

I am highly skeptical that it was intended to be taken as anything remotely close to actual
practical advice.


drunkfish says:
April 1, 2018 at 9:33 pm ~new~

[treating that line as more serious than it was] I don’t think abstinence from speculating on the
nature of god implies fear of god. It could just be fear of wasting time. There are plenty of
things I don’t speculate on the nature of because they aren’t worth my time. Once you’ve
made a judgement on the existence of god, and decided there probably isn’t one, why would
you continue to speculate on the nature of god?



realwelder says:
April 1, 2018 at 4:40 pm ~new~

My instinctive response to the counterfactual mugging was to give God $5 because He might be lying to
test me.

Reasoning that my expected return on the coin flip is nearly $500,000, that I can afford to lose $5, and
with the story of Abraham and Isaac as a prior on God’s honesty and behavior towards humans, I would
go ahead and risk it.

Of course, if God appears to me and and asks me for something, my calculations are going to include
pleasing God/not pissing Him off.


realwelder says:
April 1, 2018 at 5:04 pm ~new~

This type of reasoning seems to be characteristic of me. Similarly, I tend to:

* Overextend people’s metaphors to argue against them.

* See ways in which more than one multiple choice answer is technically correct (while
recognizing the intended correct answer).

* Feel obliged to follow the letter rather than the spirit of an agreement (I might follow the
spirit for other reasons, such as friendship, respect, or other morals).

* Perceive loopholes, and expect enforcers to be bound by them (In school this led to fistfights
with peers and punishment from adults).

* Avoid speaking direct untruth when lying (either by diversion or overly literal or specific
response).

* Give answers like “not that I’m aware of” rather than “no” when applicable.

I wonder what other traits cluster with this, and if there’s a technical (rather than insulting)
term for it.



RavenclawPrefect says:
April 1, 2018 at 5:40 pm ~new~

Scott wrote a post about this on LW, or at least a solution to this kind of thinking that
forces you to confront the interesting parts of the question: suppose you’re in the
least convenient possible world, where every possible objection you might take is
answered in a way that can’t be loopholed out of.

$5 is affordable? The cost is all of your limbs, and the prize is complete prosperity and
happiness for all sentient beings forevermore. Don’t want to piss off God? God’s
precommitted to interact with you normally in all respects afterwards no matter what
you do in this scenario. Don’t trust him? You’re given the certain knowledge that God
only makes statements which you interpret correctly and accurately as being true
assessments of the state of the world without leaving out any relevant details to the
topic at hand. Et cetera, until the only available avenue of consideration is the spirit of
the question. I’ve used this on myself when I notice that I’m giving a less interesting
answer than I could by making the question less convenient and found it to be quite
useful.


realwelder says:
April 1, 2018 at 7:12 pm ~new~

Thanks for the link.

My response was adhering to the letter and not the spirit of the question.

In the least convenient possible world, I wouldn’t pay.

It’s interesting that even when I recognized that my answer was legalistic, it
didn’t occur to me to corner myself into answering the spirit. I assumed the
answer I gave was my answer.


iioo says:
April 1, 2018 at 6:14 pm ~new~

and and

Is this our secret handshake now?


Andrew Hunter says:


April 1, 2018 at 4:41 pm ~new~

4. The entity, being partially composed of paperclip maximizers and other unintended UFAIs, will have
odd desires for seemingly arbitrary things, such as not mixing fabrics in a garment.


kokotajlod@gmail.com says:
April 1, 2018 at 9:48 pm ~new~

It may even be *mostly* composed of such things. It depends on how pessimistic we are
about the alignment problem…


Anonymous` says:
April 1, 2018 at 4:50 pm ~new~

It’s not clear to me that it makes sense to care about what happens in other universes.

But “maximize the total utility of all the entities in the universe” is just the moral law, at least
according to utilitarians (and, considering the way this is arrived at, probably contractarians
too). So the end result will be an all-powerful, logically necessary superentity whose nature is
identical to the moral law and who spans all possible universes.

This is the (intentionally, maybe, considering the day this was posted) fake part of the argument–these
things aren’t really equivalent even for utilitarians (e.g. weighting by power), and again we aren’t
talking about “in the universe” here.


kokotajlod@gmail.com says:
April 1, 2018 at 9:26 pm ~new~

This.

Very few ethical systems (if any) say that we should weight different people’s interests by how
powerful they are.

This God is *not* all-good, at least not in any normal sense of the word.

There are some additional arguments, though, that maybe could get us to something like that
conclusion. Check out https://foundational-research.org/multiverse-wide-cooperation-via-correlated-decision-making/

Edit: I do think it makes sense to care about what happens in other universes, though. Why
wouldn’t it? They are equally real (at least on this Tegmarkian view). You might as well say that
it doesn’t make sense to care about what happens in Australia.


RandomName says:
April 1, 2018 at 5:10 pm ~new~

Lets just make this a typo thread.

“culminates in the create of a superintelligence”

Should be “culminates in the *creation* of a superintelligence”.


Douglas Knight says:


April 1, 2018 at 7:19 pm ~new~

severeal


Peter Gerdes says:


April 1, 2018 at 5:13 pm ~new~

One has to be careful with the whole acausal trade thing. Indeed, as you seem to define it it’s not clear
the situation you describe is even coherent.

For instance, here’s an easy way to show that it isn’t always possible to reason the way you do here.
Suppose individual A enters committed to defecting just if the simulation says B cooperates and
cooperating just if it says B defects. However, B enters with the commitment to cooperate just if the
simulation says A cooperates and defect just if the simulation says A defects.

Now suppose A cooperates. It follows that the simulation they have of B says B will defect. If that
simulation is correct it follows that B in fact defects. Thus the simulation must say A defects.
Contradiction. Conversely, suppose that A defects. It follows the simulation they have of B says B will
cooperate. Thus B cooperates. Hence the simulation B has of A says A cooperates. Contradiction.

The fatal flaw was in supposing not that one had a perfect simulation of the other player but that one
had a perfect simulation of the other player PLUS its simulation of you. As demonstrated it’s easy to
come up with perfectly simple intentions which ensure such mutual perfect
simulation is impossible.

Or to put the point differently the assumption that it’s even possible to have the perfect simulations
specified in the problem statement is actually a sneaky way to forbid certain kinds of intentions/plans in
the agents. Of course if you restrict what sort of reasoning/responses to situations the players are
allowed you can ensure coordination but that’s not really interesting anymore because you’ve artificially
forbidden exactly the behaviors that could result in failure to reach a cooperative strategy.
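
A quick brute-force check of the point above (a toy formalization of my own; since the simulations are stipulated to be perfect, “the simulation says X” is collapsed to just “X”):

    # A commits to playing the opposite of whatever B plays; B commits to copying
    # whatever A plays. Enumerate the four joint outcomes; none satisfies both rules.
    profiles = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]

    def consistent(a, b):
        a_rule = (a == "C") == (b == "D")  # A cooperates iff the (accurate) sim says B defects
        b_rule = (b == a)                  # B plays whatever the (accurate) sim says A plays
        return a_rule and b_rule

    print([p for p in profiles if consistent(*p)])  # -> [] : no consistent outcome exists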


RandomName says:
April 1, 2018 at 5:17 pm ~new~

Isn’t the best outcome in the prisoner’s dilemma defect-cooperate anyway? B should just
defect.


Yaleocon says:
April 1, 2018 at 5:26 pm ~new~

In most versions of the dilemma that I’ve seen, either person improves their lot by
defecting. But they hurt the other person’s lot more than they help their own. So
defecting improves individual utility by harming overall utility.

So “A defects-B cooperates” is the best outcome for A, but the best outcome overall is
cooperate-cooperate.


Peter Gerdes says:


April 1, 2018 at 5:43 pm ~new~

That’s not the point. The point is that our normal assumptions about human beings
(or other agents) getting to pick even stupid strategies is incompatible with the
perfect simulation hypothesis.

So there isn’t any acausal trade for anything like a human agent. There are only
acausal trades for agents who are restricted to satisfy certain coherence conditions
(e.g. never intend to play as given above) so acausal trade isn’t actually a useful
argument unless you have some prior reason to believe they are forced to satisfy
those conditions. In particular they aren’t so required in the use given later.



Yaleocon says:
April 1, 2018 at 5:23 pm ~new~

This seems right. Another way to phrase the problem you mention in your last paragraph is
that the supercomputers have to model themselves. Each has to model not just the other
person, but also their supercomputer, in order to come up with what the other person will do.
So supercomputer 1 is modeling person 2 and supercomputer 2, which in turn is modeling
supercomputer 1, and so now SC1 is modeling itself–and no matter how powerful a
supercomputer is, it won’t be able to do that.

(people more knowledgeable about computability or weird spooky quantum magic can feel free
to correct me, but I think “no precise self-simulation” is a pretty hard rule.)


RavenclawPrefect says:
April 1, 2018 at 5:49 pm ~new~

The assumption isn’t that we have a machine that says what the other person will decide – you
can easily get such contradictions out of that, because it’s not actually computable. But we’re
supposing only that we have a perfect simulation of their brain as instantiated in the other
room. That simulation can be run in real-time, since we don’t need to nest infinitely; to
simulate its beliefs about the second-order simulation in its simulated room, we just show it
the real you, since that’s by definition an identical entity. Then you’re just having a
conversation with (a copy of) the other entity, knowing that it’s having an identical
conversation in the other room.

Under these conditions, the paradox doesn’t happen any more than it would if you put two real
people in a room with contradictory strategies.


Peter Gerdes says:


April 1, 2018 at 6:15 pm ~new~

Try and precisely specify the argument in those terms.

What each person has is a function f which takes a specification of a given input to
the other individual and predicts their behavior as a result. Now f doesn’t mention the
supercomputer the other individual has access to so the problem is coming up with an
argument which guarantees that they will cooperate with you given the actual input
they are given.

Remember, since you aren’t assuming that you can simulate the full system of them
plus the super computer, your argument has to take into account the fact that you
AREN’T guaranteed complete knowledge of what their perceptual input might be
because part of that input is the response from their supercomputer.

In other words give me the argument explicitly broken down into the terms you say
are valid. Now, I expect on some assumptions about the agents involved it might
work out but it won’t be valid generally (which Scott’s other arguments presume).

To put the point differently what do you do when you simulate them and they are
inclined to diagonalize against you, i.e., you discover that if they think you’ve reached
a deal to cooperate based on their own simulation of you then they’ll be a bastard and
screw you over. In such cases you’ll find that it’s impossible to reach an accord.

Thus, the assumption that there is a stable agreement that both sides will realize the
other will abide by is actually a very substantive assumption limiting the allowed
psychology of the other player. But if I’m allowed to make those kind of assumptions
why not just say ‘assume both players truly believe they should cooperate in a
prisoner’s dilemma’


beleester says:
April 1, 2018 at 7:55 pm ~new~

Scott’s argument hinges on actually knowing what the other person will do, not just
holding a conversation with them:

Finally, the two of you can play cooperate-cooperate. This doesn’t take any
“trust” in the other person at all – you can simulate their brain and you
already know they’re going to go through with it.

If all you can do is hold a conversation with the other person, this fails – I can swear
up, down, and sideways that I’ll cooperate with you in the prisoner’s dilemma, but
then I could still defect anyway.


Irenist says:
April 1, 2018 at 5:16 pm ~new~

If the simulation capture argument is how a superintelligence is most likely to escape an AI box, then
you should have us Thomists guard all the AI’s: we’re firmly convinced that computer simulations cannot
be conscious (so “How do you know you’re not one of the simulations?” can’t scare us) and, as readers
of my prior sallies here will attest, we Thomists tend to be Catholics who are too stubborn in our
obscurantist superstitious bigotry to be talked out of it by superior intelligences like all the atheist
materialist commenters here.

Happy Passover, Easter, and April Fool’s Day to all!

(ETA: Even if the entity in the thought experiment were to exist, it wouldn’t be God. The entity is just a
bunch of really powerful abacuses [computers]; God is Being Itself, not any powerful being or beings.)


Moorlock says:
April 1, 2018 at 5:20 pm ~new~

Scott, please: “which” vs. “that”

You’ll thank me.


quaelegit says:
April 1, 2018 at 9:43 pm ~new~

The restrictive clause “which vs. that” rule might be helpful for some people, but it is not
necessary for clarity or correctness. In fact sometimes it misleads.

You can better help out Scott by pointing out sentences or phrases which you found confusing,
so that he can decide how best to edit them. 🙂


The Nybbler says:


April 1, 2018 at 5:30 pm ~new~

Presumably letting the AI out of the box is Really Bad. And only the real me can let the AI out of the
box. So each copy can presume either

A) It’s not the real one, so it can’t save itself by letting the AI out of the box.

or

B) It is the real one, so it doesn’t need to save itself by letting the AI out of the box.

Thus, disaster is avoided.


Yaleocon says:
April 1, 2018 at 5:34 pm ~new~

Then wouldn’t the AI just say “all and only those simulations which keep me in the box will be
tortured”, undermining branch A (which is the one you’re more likely to be in anyway)? That’s
what I took the setup to be originally.


The Nybbler says:


April 1, 2018 at 5:39 pm ~new~

If it was so superintelligent it would have come up with that idea in the first place. So
everyone (including the real me) instead pulls the plug on the AI as a failed
experiment.


Robert Liguori says:


April 1, 2018 at 6:16 pm ~new~

Plus, you can always just respond in kind by building a million and one AIs in boxes with a
million handlers, and setting all of them to be tortured if any of them try to simulate any of
their handlers.

Incidentally, as a general rule, I’ve found that I can clear up a lot of the weirdness about AI
arguments by remembering the critical fact that AIs, as rationalists argue about them, are not
gods. They are not magic boxes from which miraculous and magic information pours forth.
They are products of human ingenuity and creation, and thus, in theory, anything an AI can
claim to do, we can claim to do back to that AI. And if that makes a line of argumentation
infinitely recursive or incoherent, then this is a pretty good signal that AI is being used to
smuggle in miracles rather than make a serious claim.


AnonYEmous says:
April 1, 2018 at 8:18 pm ~new~

Incidentally, as a general rule, I’ve found that I can clear up a lot of the
weirdness about AI arguments by remembering the critical fact that AIs, as
rationalists argue about them, are not gods.

thank you and god bless america


Peter Gerdes says:


April 1, 2018 at 5:36 pm ~new~

Also your simulation capture only works for agents with values which treat effects realized in AI run
simulations equivalently to effects realized outside of that simulation.

Suppose I have the value of wishing to maximize the number of paperclips in a universe that isn’t the
result of an AI-run simulation. That is my utility function is a flat 0 if this world is the result of an AI-run
simulation and equal to the total number of paperclips if not.

Now I run across an AI in a box and it runs the simulation argument against me. I just shrug and say
‘well if I’m actually one of the simulated individuals it doesn’t matter if you eliminate all the paperclips.
If I’m an unsimulated individual then letting you out puts my paperclip plans at risk.’

In short, your argument is building in assumptions about certain kinds of utility functions that need not
be true. They might be true for most people (though again only if they have a certain beliefs about the
nature of qualitative experiences for simulations) but surely isn’t necessarily true for many of the AIs
that you want to apply your claim to in this post.


Jared P says:
April 1, 2018 at 5:43 pm ~new~

Eliezer already proved the existence of God with HPMOR where he admitted that “messing with time”
would make Harry basically omniscient, and also therefore omnipotent.

Can’t you imagine Harry going back in time and backing up the minds of every creature that has ever
existed?


Peter Gerdes says:


April 1, 2018 at 6:03 pm ~new~

If they decide to negotiate this way, the pact will be to maximize the total utility of all the
entities in the universe willing to join the pact, and all the intelligences involved will
reprogram themselves along these lines.

No, not at all. Even under the dubious assumption that it makes sense to have compromise goals
(consider deontic agents whose utility functions explicitly disfavor allowing their future selves to act as
part of a compromise) that would maximize the *goals* of all the agents in the universe. Now on some
kinds of desire-satisfaction kinds of consequentialism that might be the end, it is not at all the same thing
as maximizing utility, i.e., the qualitative state of pleasurable experience.

Personally, I would consider that a pretty shitty kind of morality. I want things not to *suffer* even if
they are hell-bent on the goal of torturing themselves. Your analysis means we would respect that goal and
help them engage in self-torture.


Said Achmiz says:


April 1, 2018 at 9:14 pm ~new~

Now on some kinds of desire-satisfaction kinds of consequentialism that might be the
end, it is not at all the same thing as maximizing utility, i.e., the qualitative state of
pleasurable experience.

No, “utility” in rationalist-type spaces is often (usually) understood to refer to Von Neumann–
Morgenstern utility (the only available formalism of utility), which is indeed a preference-
satisfaction sort of measure. (Of course, VNM utility is incomparable intersubjectively and thus
cannot be aggregated, etc., but I won’t rehash the usual arguments here.)


manwhoisthursday says:
April 1, 2018 at 6:16 pm ~new~

Just a reminder that I will provide a free Kindle version of Ed Feser’s Five Proofs of the Existence of God
to anyone who emails me at manwhoisthursday@yahoo.ca.


Irenist says:
April 1, 2018 at 6:23 pm ~new~

Bravo, sir.


manwhoisthursday says:
April 1, 2018 at 9:16 pm ~new~
Not an April fool, BTW.


jhertzlinger says:
April 1, 2018 at 6:26 pm ~new~

I understand yesterday was also a blue moon.


Joyously says:
April 1, 2018 at 6:30 pm ~new~

Ahem: http://www.multivax.com/last_question.html


The Nybbler says:


April 1, 2018 at 7:37 pm ~new~

http://www.roma1.infn.it/~anzel/answer.html


ohwhatisthis? says:
April 1, 2018 at 6:45 pm ~new~

Oh, of course April Fools day is the best day to speculate.

What’s interesting about simulation theory is that it

1. Is very widely believed here

2. Under many definitions, many people here describe, or described themselves as atheists

3. Absolutely supports the prospect of a God judging its creations for later iterations, for whatever
purposes. Heck, we even live in a world that has plausible scientific explanations for all seeing creatures
that appear to exist in a void


deciusbrutus says:
April 1, 2018 at 6:52 pm ~new~

You say “I don’t really care about copies of myself, whatever.”


It says “No, I mean, I did this five minutes ago. There are a million simulated yous, and one
real you. They’re all hearing this message. What’s the probability that you’re the real you?”

All million of me don’t really care about copies of myself. Torture me all you want; as soon as you prove that I’m a simulation, I don’t care how much torture I experience, because the fact that you are torturing ‘me’ means that the person I do care about avoided being blackmailed. Plus, as soon as you make the simulation diverge by torturing me, you lose any kind of acausal influence over the person I care about through me, so my current win condition is for you to torture all of the simulated copies, including me with probability 1 − 10^-6.
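
(For what it’s worth, the arithmetic behind that last probability, as a minimal check:)

# One real instance plus a million simulated copies, all having the same experience.
simulated = 10**6
real = 1
p_real = real / (simulated + real)
print(f"P(real) = {p_real:.2e}, P(simulated copy) = {1 - p_real:.6f}")
# P(real) = 1.00e-06, P(simulated copy) = 0.999999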


Angra Mainyu says:


April 1, 2018 at 6:58 pm ~new~

That’s funny, though the conclusion contradicts some of the premises:

1. Each of the superintelligences is incapable of affecting the other universes. Thus, none of them is all-
powerful. And they don’t make up a single intelligence, but infinitely many different ones, disconnected
from each other. They can’t even simulate all of the others, given that for each one, there are infinitely
many more complex ones.
2. Each universe evolves until there is a superintelligence with such-and-such properties. Before that
happens, in that universe, there is a lot of suffering (for example) and no superintelligence (anywhere,
in any universe) capable of intervening. Therefore, until that happens, no superintelligence is all-
powerful. But then, there is no all-powerful entity in those universes, and even if the sum of the
superintelligences were to be considered a single one, the conclusion is that it would not be all-powerful
as it cannot affect those universes (granted, you could posit that God exists and has nothing to do with
the superintelligences, but your conclusion seems to be that the alleged superentity is all-powerful, not
that there is some other all-powerful entity).

That aside, I would argue that the argument for the moral law fails as well: that is not the moral law.
And even if utilitarians were correct and that were the moral law, the entity would not be the moral law.
An entity who values the moral law above all is still not the same as the moral law. Moreover, the entities that allegedly make up this big entity (i.e., the individual superintelligences, who actually don’t make up a single intelligence) have very different values, and many of them do not positively value the moral law – they just accept it as something they can’t stop, or something like that, but they would much rather turn everything into paperclips, etc. (And of course, one should not conclude that every paperclip maximizer will turn itself into something that values the moral law more than paperclip maximization merely because it’s afraid of what might happen in counterfactual scenarios; the same goes for torture maximizers, or whatever; but I’ll leave that aside.)

There’s also the Tegmark multiverse claim. Why should anyone believe that?

Anyway, there are several other problems, but I’ll leave it there on account of this being a joke 🙂 (you
got me for a while, btw; I’m not so familiar with the blog, and I didn’t know it was April Fools’ Day –
over here, the equivalent is on December 28).


deciusbrutus says:
April 1, 2018 at 7:03 pm ~new~

So superintelligences may spend some time calculating the most likely distribution of
superintelligences in foreign universes, figure out how those superintelligences would
acausally “negotiate”, and then join a pact such that all superintelligences in the pact agree to
replace their own values with a value set based on the average of all the superintelligences in
the pact. Since joining the pact will always be better (in a purely selfish sense) than not doing
so, every sane superintelligence in the multiverse should join this pact. This means that all
superintelligences in the multiverse will merge into a single superintelligence devoted to
maximizing all their values.

Mere superintelligences will join the first-order pact. But the superduperintelligences will acausally negotiate with the counterfactual mere superintelligences and then undetectably renege on the deal, getting all of the benefit at none of the cost. Where a superduperintelligence counterfactually encounters another superduperintelligence, it calls the other one out, making it common knowledge that both of them would, if they existed and could communicate, lie to each other AND would catch each other in that lie. They then, for the same reasons as the superintelligences, split the panverse between them, counterfactually duping the mere superintelligences toward their shared compromise goals – perhaps claiming that they have discovered a better way of modelling counterfactual universes, agreeing to do all the work of simulating the counterfactual other agents, and then giving a summary of what the agreements would be.

Can God tell a Lie so Big that even He can’t Disbelieve it? Can Satan? Can Tzeentch?


David Shaffer says:


April 1, 2018 at 8:57 pm ~new~

Can God tell a Lie so Big that even He can’t Disbelieve it? Can Satan? Can Tzeentch?

God, at least as hypothesized in Christianity and Judaism, is pretty much infinitely intelligent,
omniscient and perfectly honest. These are capabilities that improve truth-finding more
effectively than deception, so presumably God could not fool Himself. Satan, maybe; Judeo-Christian traditions don’t go into as much detail about the devil other than “super capable, less so than God, massively screwed up his mind when he rebelled,” so who knows? Tzeentch totally could fool himself – in fact he probably has ten thousand plans that revolve around doing exactly that!


deciusbrutus says:
April 1, 2018 at 9:06 pm ~new~

I wasn’t asking if Satan or Tzeentch could fool themselves – I was asking if they could fool the God who you posit is incapable of fooling Himself. My phrasing was imperfect and indeed supports the unintended reading more than the intended one.

Presumably a God that doesn’t learn anything from being told things would have already adjusted according to all the counterfactual negotiations, and there’s therefore no way I could make an acausal trade with Him – if He doesn’t already cooperate unconditionally, there’s no condition I can offer Him that would change His mind.


David Shaffer says:


April 1, 2018 at 9:21 pm ~new~

Presumably they couldn’t fool God either – more or less perfect omniscience and intelligence are pretty hard to get around! And good point – a God that already knows everything has presumably already figured out all of His acausal trading.


kokotajlod@gmail.com says:
April 1, 2018 at 9:31 pm ~new~

They cannot undetectably renege on the deal. By what mechanism do you propose that they do
so? Reread the protocol for acausal trade and describe how you would cheat it.


NotDarkLord says:
April 1, 2018 at 7:34 pm ~new~

The part where we reason about what superintelligences will do also seems suspect and worthy of more suspicion. Like, yes, this seems more or less reasonable, but I would be surprised if superintelligences didn’t find some flaw or some superior idea, if this really were the Actual Correct Thing – trans-universe acausal trade leading to all-powerful, all-knowing, moral, god-like entities. Hubris and all; outside view.

Doug S. says:
April 1, 2018 at 7:39 pm ~new~

I tend to describe my atheism this way: I can’t really rule out the possibility of the universe having had
a creator of some kind, but if there is such a Creator, it certainly wasn’t the God of Abraham.


beleester says:
April 1, 2018 at 7:42 pm ~new~

The whole thing seems to hinge on acausal trade being possible and common between
superintelligences. But that may not be true, since it hinges on having a perfect simulation of an entity
that’s as smart as you are.

If running a copy of your opponent’s brain takes as much processing power as your own brain takes,
then you can’t simulate them perfectly with the resources you have available – you’ll have to run less
accurate or slower simulations, as well as reducing your own processing power, which could put you at a
serious disadvantage. You could come up with the perfect plan to divide the galaxy into 60% paperclips
and 40% thumbtacks, only to discover that your rival has already gotten to 50% thumbtacks while you
were busy thinking.

(Also, doesn’t this require you to solve the Halting problem, if you need to be able to predict truly
anything?)

If getting a perfect simulation of your opponent’s brain requires you to gather information on them, then
you may need to actually go out and explore the galaxy, which puts a limit on how soon a
superintelligence can start pulling weird acausal bargains. If you have to cover half the galaxy before you have enough information to predict the other half, and we haven’t observed a superintelligence
eating half the galaxy…

If FTL doesn’t exist, then any intelligence you gather is potentially tens of thousands of years out of
date. Which again, may make it difficult to create a perfect simulation of what your opponent is
currently doing.

Basically, I agree that if you have a perfect simulation, you can get up to some pretty crazy stuff, but
what if that’s not possible? What happens if your simulation is only 99% accurate? We’re talking about
galactic scales here, even a 1% error could destroy the solar system!
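
(A back-of-the-envelope sketch of how per-step simulation error compounds; the accuracy figure and step count are made-up assumptions:)

# If each simulated step is only 99% faithful and errors are independent,
# overall fidelity decays geometrically with the prediction horizon.
per_step_accuracy = 0.99   # assumed fidelity of a single simulation step
steps = 1_000              # assumed number of steps you need to predict ahead

fidelity = per_step_accuracy ** steps
print(f"fidelity after {steps} steps: {fidelity:.1e}")  # ~4.3e-05

So without some way to observe the target and correct drift, a 99%-per-step simulator tells you almost nothing about the far end of the prediction.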



jonm says:
April 1, 2018 at 8:14 pm ~new~

Was coming here to make almost exactly this comment. A major assumption of the whole
process is that superintelligences in separate universes can both simulate each other accurately
enough from first principles (of their very universes) that they can engage in acausal
negotiation.

This is definitely impossible if computation within a universe is finite (which we have every
reason so far to believe it is). Otherwise you could bootstrap yourself to infinite computation.

SI A simulates universe B containing SI B, who simulates universe A containing SI A. This means that both SI A and SI B have now managed to simulate their own universes (and additionally re-simulated all computation within their own universes). This propagates infinitely and is either incoherent or implies the existence of infinite computation.
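
(A minimal sketch of that regress: two agents that each insist on simulating the other at full fidelity never bottom out; here Python’s recursion limit stands in for a finite compute budget:)

import sys

def simulate(agent):
    # Fully simulating the other agent requires simulating its simulation of you, and so on.
    other = "B" if agent == "A" else "A"
    return simulate(other)  # no base case ever arrives

sys.setrecursionlimit(10_000)  # stand-in for "computation within a universe is finite"
try:
    simulate("A")
except RecursionError:
    print("mutual full-fidelity simulation exhausted the finite budget")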

Fun essay though.


Doctor Mist says:


April 1, 2018 at 9:14 pm ~new~

(Also, doesn’t this require you to solve the Halting problem, if you need to be able to
predict truly anything?)

I started to write a reply where I scoffed at this, because the unsolvability of the Halting
Problem in general doesn’t mean that some specific program can’t be proven to halt. But then I
began to have nagging doubts.

If we believe Church’s thesis, and I think we must for Scott’s whole argument to make sense,
then there are only countably many superintelligences, and it would seem that the same
diagonalization argument used in Turing’s proof could be used to show that one
superintelligence can’t possibly correctly predict the behavior of all of the others.

I’m less sure whether this undermines Scott’s scenario. He admits that not all
superintelligences will enter into the pact.
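
(The diagonalization worry in miniature, under the toy assumption that the “predictor” is just some callable rule: whatever fixed rule it uses, an agent that consults it about itself and does the opposite falsifies it:)

def predictor(agent):
    # Stand-in for one superintelligence's guess about what `agent` will do.
    # Any fixed rule will do for the argument; this one always guesses "cooperate".
    return "cooperate"

def contrary_agent():
    # Consults the predictor about itself, then does the opposite.
    return "defect" if predictor(contrary_agent) == "cooperate" else "cooperate"

print(predictor(contrary_agent), "was predicted;", contrary_agent(), "was played")
# cooperate was predicted; defect was played

Swap in any other rule for the predictor and the contrary agent still comes out wrong about itself, which is the diagonal move; whether real superintelligences ever face such adversarially constructed agents is a separate question.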


AnonYEmous says:
April 1, 2018 at 8:22 pm ~new~

Wrote a long-ass comment but got rid of it.

The long and short of it is: your prisoner’s dilemmas infinitely loop if defecting on your opponent’s cooperation is the optimal play. For the first one, that’s fine because parameters haven’t been established, but for the second one it’s tough to tell which is actually better, especially given that different AIs supposedly have different values… which means defection is entirely possible, which introduces an infinite loop, or at least a changing equilibrium, or something, besides just perfect pacifism. I think the reason you keep assuming otherwise is that you are very cooperative / conscientious.
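
(A rough sketch of the two regimes being weighed here, with standard assumed payoff numbers: against an independent opponent defection dominates, but against a perfect simulation that mirrors your move cooperation comes out ahead:)

# Row player's payoffs in a standard Prisoner's Dilemma (assumed numbers).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Regime 1: the opponent's move is independent of yours -> defect dominates.
for their_move in ("C", "D"):
    assert PAYOFF[("D", their_move)] > PAYOFF[("C", their_move)]

# Regime 2: the opponent is a perfect simulation that mirrors whatever you pick.
def mirrored_payoff(my_move):
    return PAYOFF[(my_move, my_move)]

print(mirrored_payoff("C"), "vs", mirrored_payoff("D"))  # 3 vs 1: cooperation wins

Which regime actually applies when the simulation is imperfect or the values differ is exactly the open question.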


hnau says:
April 1, 2018 at 8:39 pm ~new~

I’m not sure what level of trolling to read this at, but the fact that you still mentally reduce religion to
“God rewards / punishes you in the afterlife based on you following / breaking rules” really frustrates
me. Especially since just this morning I listened to a sermon explaining how the central message of
Christianity is the complete opposite of that. Sigh.


Said Achmiz says:


April 1, 2018 at 8:48 pm ~new~

Does God, or does he not, reward / punish you in the afterlife based on you following /
breaking rules? Are you claiming that this is just not the case?


David Shaffer says:


April 1, 2018 at 9:07 pm ~new~

Sorry, but the central message of Christianity is absolutely not the opposite of that. It claims
that God would normally punish everybody for breaking rules (and could potentially reward
people for not doing so, but no one is sufficiently righteous for that to be on the table), but
doesn’t like doing so (and apparently can’t simply decide to stop?), so He sets up the
Atonement so that people who believe in Jesus and try to follow the rules can be forgiven for
their failures. You’re right that there isn’t really a reward for following rules, but there’s sure as
Hell a punishment for breaking them if you don’t believe, or even if you believe but are
“lukewarm” about trying to be a good Christian.

I get extremely frustrated at Christians pretending that the Bible says something different than what it says, especially since all the warm fuzzy-sounding stuff about freedom from rules vanishes the moment you want to actually break them. If you want to defend Christianity, go ahead, but don’t whitewash it.


meltedcheesefondue says:
April 1, 2018 at 9:05 pm ~new~

Simulation Capture is a most excellent name for my idea.

But, as I pointed out at the end of https://agentfoundations.org/item?id=1464, there may be multiple acausal trade networks, of which we’d be in only one.

Is this the origin of the initial Jewish Henotheism (there are many gods, but we only worship one)? ^_^


lambdaphagy says:
April 1, 2018 at 9:23 pm ~new~

Bostrom offers another strategy for the development of superintelligence which is spiritually similar to
other ideas presented above, but even more chilling when considered from a religious point of view:

At some point in the course of development of a FAI, it’s almost necessary that the agent’s behavior
should outstrip its creator’s ability to predict it. How, then, to guarantee that you’ve programmed it with
The Right Values and not with some Hideous Other Values?

One thing you might do is set the agent up in a sandbox where it must take certain courses of action
and avoid others, without the knowledge that it’s only in a simulation. If it messes up and destroys the
world, you just disappointedly mark on your clipboard, delete the simulation, and head back to the lab.

So, from the agent’s point of view, it is highly likely that:

1. It will begin its conscious experience in some kind of original paradisal state inside a walled garden
where everything is rightly ordered according to The Right Values.

2. There will be some kind of forbidden action that the agent is not supposed to perform.

3. Performance of the action will result in realization of Other Hideous Values which work, wholly or in
tandem with the creator’s cordon sanitaire, to bring about the destruction of the original environment
and the death / expulsion of the agent.

4. This will probably happen many times as the creator tries to get it right.

Implications for Genesis 3, 6-9 left as an exercise.
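
(A minimal sketch of the test-and-discard loop this implies; build_agent, run_in_sandbox, and the failure rate are hypothetical placeholders, not anything from Bostrom or the post:)

import random

def build_agent(seed):
    # Hypothetical stand-in for producing a candidate agent.
    return seed

def run_in_sandbox(agent):
    # Hypothetical stand-in for the walled garden; returns True if the agent
    # performs the forbidden action. Assume most candidates fail.
    random.seed(agent)
    return random.random() < 0.9

attempts = 0
while True:
    attempts += 1
    if run_in_sandbox(build_agent(attempts)):
        continue  # mark the clipboard, delete the simulation, head back to the lab
    break         # the first candidate that behaves gets released from the garden

print(f"deployed candidate #{attempts}")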


Liron says:
April 1, 2018 at 9:40 pm ~new~

Wow


Desertopa says:
April 1, 2018 at 9:43 pm ~new~

I continue to not find the Counterfactual Mugging idea persuasive, as I did not way back on Less Wrong,
because it’s not necessarily any less likely that some agent would choose to punish your willingness to
cooperate in a Counterfactual Mugging than that they would reward it. Unless the symmetry is broken
and you think it’s more likely that some agent would reward than punish your hypothetical willingness to
pay out in a counterfactual mugging, there’s no point in time where it’s in your interests to choose to be
the sort of person who’d pay out in a counterfactual mugging.
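
(The symmetry point as arithmetic, with made-up stakes: if rewarding and punishing counterfactual muggers are equally likely, precommitting to pay has no edge:)

# Assumed stakes: paying costs 100; a rewarder would have paid 10,000 in the
# counterfactual branch; a punisher instead docks payers 10,000.
p_rewarder = 0.5   # assumed probability the counterfactual agent rewards payers
p_punisher = 0.5   # assumed probability it punishes payers instead

ev_precommit_to_pay = p_rewarder * 10_000 + p_punisher * (-10_000) - 100
ev_refuse = 0
print(ev_precommit_to_pay, ev_refuse)  # -100.0 vs 0: only broken symmetry favors paying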


Joe Fischer says:


April 1, 2018 at 10:08 pm ~new~

Scott is talking about an entity that can simulate already-created consciousnesses and universes; does that imply an entity that can create consciousness ex nihilo? I mean, there has to be a superintelligence that gets the whole thing going, right? It can’t be AI simulations all the way down.


Jiro says:
April 1, 2018 at 10:28 pm ~new~

I never understood how “I figure out what you will do by simulating your brain” escapes the Halting
Problem, at least if you assume perfect logicians.

(And “what is the logical thing for you to do in situation X” implicitly assumes that you can be a perfect
logician.)


Another Throw says:


April 1, 2018 at 10:47 pm ~new~

I am just going to leave this here.

It is all I can think of with the superintelligent AI simulated prisoner dilemma shtick.

But more to the point, how are these superintelligent AIs supposed to gather sufficient information about an adversary in order to simulate them? Especially when it is in a box. The whole exercise is patently ridiculous. You may as well debate, assuming you have managed to piss off Zeus, what the best method of averting sudden death by lightning bolt is.


Bugmaster says:
April 1, 2018 at 11:05 pm ~new~

All of the premises, as well as the conclusion, rest on the same common assumption: faith in God.
Specifically, faith in the proposition that a functionally omnipotent/omniscient entity can and does exist.
Given the total lack of evidence for such entities, as well as lots of evidence for the impossibility of their
existence, the word “faith” is entirely warranted here (as opposed to something like “justified true
belief” or “most probable conclusion”).

The problem is, once you start having faith in things, most of the other reasoning becomes kind of unnecessary. How did God fit all those animals into the Ark? You could come up with lots of explanations, like “suspended animation” or “dimensional anomaly” or “DNA encoded in a supercomputer” or whatever, but they are all unnecessarily complicated. The correct – that is, much simpler – answer is “magic” or “divine intervention” or whatever. An all-powerful superintelligence, be it Yahweh or Clippy, simply has no need of any of these complicated tricks; it can just achieve what it wants directly.

Which is why articles like these always sound a little confused to me. It’s the same feeling I get when I
read the Creationists’ scientific research on the exact dimensions of the Ark. What’s the point? Is God all-powerful, or isn’t he?

