20:02, water fountain, NMT, Socorro

Today was the first day of TA lectures. My fellow course 8 major and course 24 enthusiast gave a talk introducing some problems in philosophy. While I had heard these before (utilitarianism vs. deontology; the meaning of truth; metaphysical basics; etc.), they were presented so clearly that I wish I’d taken notes or recorded the thing.

One surprising result: he went on to speak about decision theory and came to Newcomb’s problem.

In short, there is a predictor, a player, and two boxes. One box is clear and contains a visible $1,000. The other box is opaque and contains either $0 or $1,000,000: if the predictor predicts the player will take both boxes, it contains nothing; if the predictor predicts the player will take only the opaque box, it contains $1,000,000. The contents of the opaque box do not change once the player enters the room with the boxes – but the predictor is very, very good.

For most people, what they would do feels pretty obvious. The trouble is that different people find opposite answers equally obvious.

I remember being confronted with this problem just about five years ago by W. at my own SSP. I remember thinking it beyond obvious to take both boxes – after all, when you walk into the room, it’s not like causality happens backwards. There was no world in which it could be better to take the single box. So when the speaker mentioned “Newcomb”, I groaned internally. (I didn’t remember the formulation of the problem, but I did remember the feeling of confusion and frustration that someone smart could believe the opposite.)

Yet today, I was surprised. The speaker included a sentence about being shown a list of people with whom the predictor had played this game previously – tens of millions of them, without a single mistake (i.e. every one-boxer left with $1,000,000; every two-boxer left with $1,000). And suddenly it was abundantly clear to me that I should one-box!

In the span of five years, I have gone from a “causalist” to an “evidentialist” – that is, I now let myself update on the evidence over what I “knew” to be true: that a predictor can’t change the contents of the box under the laws of physics as I understood them. (And I don’t think this change is a coincidence – it seems like a direct effect of habits of thinking that “rationality” has cultivated in me over these last few years.) I liken it to how 2+2 could turn out to equal 3: if the result seems implausible but the evidence is strong enough to practically guarantee that one-boxing leaves me with the million, then I should one-box, even if I don’t understand how the predictor could be that good.
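To make “strong enough” concrete, here is a minimal expected-value sketch of that update. The 99% accuracy is a placeholder assumption of mine (the track record described above is actually flawless); the evidentialist move is simply to condition the opaque box’s contents on my own choice.

```python
# Expected value of one-boxing vs. two-boxing, conditioning the opaque
# box's contents on the player's own choice (the evidentialist reading).
# ACCURACY = 0.99 is a placeholder; the predictor in the story has a
# perfect record over tens of millions of games.

ACCURACY = 0.99          # P(prediction matches the actual choice)
SMALL, BIG = 1_000, 1_000_000

# If I one-box, the predictor almost certainly foresaw it and filled the box.
ev_one_box = ACCURACY * BIG + (1 - ACCURACY) * 0

# If I two-box, the predictor almost certainly foresaw that too and left it empty.
ev_two_box = ACCURACY * (SMALL + 0) + (1 - ACCURACY) * (SMALL + BIG)

print(f"one-box: ${ev_one_box:,.0f}")   # ~$990,000
print(f"two-box: ${ev_two_box:,.0f}")   # ~$11,000
```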

I’m not sure if 5-years-ago me misunderstood the premise (couldn’t conceptualize a very, very good predictor) or what. Unfortunately, I can’t recapture a past worldview.

Some argue that the “correct” strategy is to commit thoroughly to one-boxing, but then change your mind once you walk into the room. But then it becomes a different problem: can one have a reason to intend to do something without having a reason to actually do it? (This is similar to Kavka’s toxin puzzle – for what it’s worth, I think not really. There’s not enough reason for me not to just thoroughly one-box, rather than trying to throw in confusion for the predictor.)

The one-boxers this year were a mix of people I somewhat admired for their reflectiveness and religious people who were comfortable operating on faith. The majority (~60% of people) here were two-boxers.