[attention conservation notice: Intended for a rationalist audience, and likely to be of little interest to non-rationalists]
[epistemic effort: I discussed the topic with friends and will update this blog post in response to criticism I consider reasonable]
CFAR has recently pivoted to focusing more on AI safety. I feel a lot of qualms about this, which I wanted to write up in a blog post. I previously had qualms about cooperation and rationality as the common interest of many causes; however, I think this post addresses those concerns and has made me significantly less worried. I do have another concern, though.
I want to make my positionality clear here, because I think it does affect my thinking. Most notably, I am highly uncertain about AI risk, which leads me not to consider it one of the most effective cause areas. For obvious reasons, this increases my qualms. I have tried not to say anything that I wouldn’t endorse if CFAR were discussing a pivot to a focus on global poverty rather than existential risk; however, I encourage my readers to account for my bias when they assess my arguments.
I think that it is possible to create a teachable art of rationality (as in domain-general thinking skills which improve one’s ability to reason in a wide variety of fields), but that we currently do not have one. I believe that a teachable art of rationality would positively affect the world by causing people to make fewer dumbass decisions. For this reason, I care about CFAR’s success. However, while I believe that CFAR likely has a positive effect on its attendees, I’m not convinced that this isn’t due to the Dodo bird verdict: that is, I think it’s likely that CFAR has positive effects because its staff are kind, empathetic, high-status people who want other people to improve, not because they have necessarily figured out rationality skills yet. To be clear, this is in no way an insult to CFAR: developing an entire new field of psychology is really hard! It does mean, however, that I am not a CFAR alum and had no particular interest in attending a workshop. For this reason, the fact that their post makes me less likely to want to attend may not be particularly interesting to CFAR; after all, I wasn’t going to attend anyway.
CFAR and MIRI have always been very closely linked; for instance, CFAR and MIRI share an office. To the extent that this blog post makes public what was always true, I think it’s good and virtuous; honesty and transparency are an important part of creating evidence-backed charities. Nevertheless, I think that the fact itself, as distinct from the announcement of it, is likely to be bad.
Imagine that instead of saying that they were focusing on AI risk, CFAR said that they’d decided that CFAR as an organization would plan and act as though deworming charities were the most important ones to donate to. For this reason, they would prioritize recruitment of people working in the public health sphere (in addition to effective altruists and rationality geeks more generally). They would prioritize skills that are useful in deworming, such as doing expected-value calculations in high-risk situations and avoiding scope insensitivity. Their metrics would include the effects their graduates have on deworming.
I have an issue with this hypothetical. Deworming charities being the most important is a specific empirical claim; to believe deworming is one of the most important causes, one must believe that the studies were well-conducted, that it’s likely the effects aren’t spurious, that they generalize to the locations where deworming is happening, etc. Similarly, the importance of AI risk is also a specific empirical claim: to believe that AI risk is one of the most important causes, one must believe that it’s likely that artificial intelligence will be developed, that it is possible for people currently working on AI to have a positive effect, etc.
But it’s possible that deworming isn’t one of the best interventions. For instance, imagine that the deworming studies were very poorly done. While an ordinary person might not be able to tell that they were poorly done, it’s obvious to someone who knows how to read a study. Now imagine that knowing how to read a study is, in fact, a very important rationality skill. (I’m not claiming either that the deworming studies are poorly done or that knowing how to read a study is an important skill; this is strictly a hypothetical.) I think it’s very possible that hypothetical Deworming CFAR would flinch away from the idea “maybe we should teach people to read studies”. Or they’d run a pilot program, notice that teaching people to read studies tends to make them less enthusiastic about deworming, and say “well, guess that learning how to read studies actually makes people more irrational.” Or they’d still teach the how-to-read-studies unit, but they’d alter it so that people continue to be enthusiastic about deworming afterward, thus not teaching an important aspect of the skill. Or it would never occur to them to teach that unit, perhaps because this is a flaw in the rationality of the employees of Deworming CFAR itself. (After all, they aren’t noticing that deworming is a poor cause area…)
To be clear, I’m not suggesting that anyone at CFAR would intentionally teach people poor rationality skills in order to trick them into caring about AI risk. Of course not! CFAR as an organization genuinely cares about developing the art of rationality, and genuinely believes that AI risk is the biggest threat to humanity reaching the stars. But it doesn’t have to be intentional to be harmful. Many biases happen on the subconscious level, as unnoticed flinches or ideas that you happen to overlook.
You might argue that if deworming is not the best cause, surely hypothetical Deworming CFAR would notice this and reorient themselves. But I don’t think that’s true. We don’t have the art of rationality yet. CFAR employees are still vulnerable to confirmation bias, self-deception, and motivated reasoning. I think it’s important to reduce the harm caused by these aspects of human psychology.
I think endorsing specific empirical claims is different from endorsing value sets. For instance, “people in the developing world matter as much as people in the developed world” is a value, as is “it would be bad for humanity to be destroyed.” I do not think that the problems of endorsing empirical claims apply to endorsing values, because values do not generally work in the normal Bayesian way. It is hard to think of evidence that would disprove the claim that people in the developing world matter as much as people in the developed world (…maybe if they’re all p-zombies?). Rationality is, to a very large degree, about gathering evidence and properly updating on it. Therefore, a value poses a much lower risk to the rationality curriculum than an empirical belief does. This is why I was less worried when it seemed to me that CFAR’s mission was to be a general effective altruist organization, because general effective altruism is almost entirely about values rather than facts.
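(To make the contrast concrete, here is a minimal sketch of what “the normal Bayesian way” looks like for an empirical claim; the numbers are invented purely for illustration. Suppose your prior that deworming works is 0.7, and a null replication result E would be 0.2 likely if deworming works and 0.8 likely if it doesn’t. Then:

P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} = \frac{0.2 \times 0.7}{0.2 \times 0.7 + 0.8 \times 0.3} \approx 0.37

A value like “people in the developing world matter as much as people in the developed world” has no likelihood term P(E \mid H) at all, so there is no stream of evidence for an organizational commitment to distort.)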
CFAR made this shift for very good reasons. There are a lot of potential rationality skills; you need some criteria to prioritize which ones you discover and teach. There are a lot of potential audiences for rationality; you need some criteria to pick which ones you will target. It is important to have integrity; secretly having AI risk as one’s mission while passing oneself off as a cause-neutral rationality organization would be wrong. And we’re not trying to develop the art of rationality to make Facebook comment threads better; it makes sense to want to prioritize the most important issues facing the world.
This is probably the part of the post where I am supposed to offer a solution. Unfortunately, I have never run a rationality organization, and I am highly uncertain about what the correct path is. I don’t know how CFAR can get the very real benefits of this pivot while also reducing the risk of their mission distorting their rationality curriculum. But I do think that a first step is keeping in mind the risks.
Paul Crowley said:
– If AI risk is the highest priority cause, this has costs and benefits; presumably CFAR has evaluated those and decided the benefits are higher, which is plausible.
– If AI risk isn’t the highest priority cause, this is all costs: CFAR’s efforts can be misdirected, and they can fail to notice things, or fail to notice them as soon as they otherwise would, for the reasons you set out.
– If the people who run CFAR are pretty confident that AI risk is actually the highest priority cause, then this can be worth it.
– I’m personally pretty confident of that, so I’m OK with this move.
– I encourage you to get elbows-deep into the business of assessing AI risk as a cause area ASAP 🙂
sniffnoy said:
This post seems pretty odd to me because there seems to be an implicit assumption that CFAR is an EA organization. When I think of cause-neutral CFAR, I don’t think of EA; I think of, essentially, “I want to be better at attaining my goals, whether those are altruistic or not.”
raemon777 said:
Things worth noting people may not know:
In the past several EA Globals, CFAR has a) been present, b) specifically talked about a goal/plan, broken down as follows:
– The world has big problems, and needs people who are smart, capable, rational, and altruistic (or at least motivated to solve those problems for other reasons)
– CFAR has a limited number of people they can teach
– People tend to rub off on each other when they hang out with each other
– People vary in how rational, altruistic, and capable they are.
– So, CFAR seeks out people who have SOME combination of high rationality, altruism, and competence. They run workshops with all those people, and one of their hopes is that the rationality, altruism, and competence will rub off from one attendee onto another.
So it is not new that CFAR has (at least) a subgoal of “create people capable of solving the world’s problems, with the motivation to do so.” This may not have been well publicized either, for good or for ill.
I think this was absolutely a worthy goal, and the correct one for them to focus on given their limited resources.
So the new AI announcement is basically them saying “we are refining this a step further, to optimize for AI in particular.”
(Whether you think that is good or bad depends on a lot of things)
Olivia Schaefer said:
I’ve noticed that CFAR has a very similar effect to MIRI, in that the closer you get to it, the more confidence you tend to have that it’s making largely minor mistakes, and that major mistakes are corrected soon after they’re noticed (soon on organizational timescales, which are a bit slower than those of single people). As Anna was saying in the followup post, from CFAR’s perspective, inside-view arguments are far stronger than outside-view arguments. So, while it may not have been your intention in the first place, pointing out failure modes of the reference class of cause-focused rationality organizations isn’t likely to be very moving to anyone sharing CFAR’s view of things.
I think there’s a general point here about misuse of the outside view, the EMH, and similar tools, which I’ve noticed is at the bottom of most arguments among rationalists. I’m tempted to try writing “Contra Type 2 Rationalists”.
John said:
“values do not generally work in the normal Bayesian way”
Yep, I think it was the philosopher David Hume who first pointed this out; check out the “is-ought distinction”.
“we’re not trying to develop the art of rationality to make Facebook comment threads better”
Why do you believe this is a low impact area? Making Facebook comment threads better seems pretty important to me.
Autolykos said:
There’s a saying in Go: “Urgent points before big points.”
You don’t need to make the claim that AI risk is the cause with the highest net utility to make prioritizing it over most EA causes rational. You only need to make the claim that it is an existential risk we are very likely to encounter soon, and can do something about. Which is much, much weaker.
Parasites, for example, may cause a lot more harm in total, but they are still relatively “safe”. They have been with us for as long as animals have existed, and have, to my knowledge, never caused an extinction event. If we ignored them for another century, humanity would survive. Whereas we have zero experience with how dangerous AGI actually is, but plenty of reason to worry. It might not be safe to ignore.
You could argue that another cause is more urgent (e.g. global warming or protection against large asteroids), but if you only claim that another cause has higher expected utility, you might be missing part of the point.
raemon777 said:
Re: Autolykos – yeah, and I think the deworming example listed here isn’t a sufficiently good analogy because you’d (or at least I’d) have to really strain to see it as the most important cause, and especially as an urgent cause.
I think the proper ideological Turing test for evaluating CFAR’s decision is “think of whatever cause seems *most* important and urgent to you, or that you can easily imagine turning out to be most important and urgent, and then see if CFAR’s decision would make sense in that light.”
Even this still doesn’t quite cut it, because one of the reasons AI and rationality are linked in Less Wrong discourse is that AI is a domain that _especially_ needs a good grasp of rationality. It’s not only murky and hard to think about, but you’re working on how to build (or constrain) a process that will itself be thinking about murky, hard-to-think-about things.
ozymandias said:
Autolykos: My argument has nothing to do with whether AI risk is the most important cause. Whether or not AI risk is an important cause, I think endorsing specific empirical claims is a bad idea for a rationality organization.
Raemon: I would say that global poverty is a highly urgent issue to the parents mourning their dead children.
I agree that more rationality is required for AI risk than for deworming, but I am not sure how this relates to my hypothetical, which is about the potential distorting effects that endorsing a specific empirical claim would have on developing an art of rationality. I guess it could make the situation more dire in the AI risk case, because incorrect opinions on deworming touch only a few areas, while incorrect opinions on AI risk could potentially poison the entire art?
Autolykos said:
I do agree with you that CFAR should choose between promoting rationality and promoting AI safety. Serving two masters is usually a bad idea.
But to me it seems they are already shifting their focus from a more value-neutral “raising the sanity waterline” to a more specific “building and recruiting talent for MIRI” anyway. Which is a totally valid choice, even though they could advertise it a bit more clearly.
It also means that I have little motivation to get involved with them unless I want to join MIRI, but it is well within their rights not to care about that.
raemon777 said:
Certainly agree poverty is urgent; I just think there are a lot of comparably urgent problems.
sconn said:
I completely agree with you. I think you can’t be rational and religious either. You can be rational and happen to, for the moment, agree with the tenets of a specific religion, but as soon as you elevate the religion itself to a fixed conclusion, you’re not going to be rational about it anymore.
Rationality can’t be a means to defend a specific conclusion, or else, by definition, it’s not rationality. It has to be a means to find the right conclusions AND must keep itself open to updating those conclusions.
benquo said:
>Now imagine that knowing how to read a study is, in fact, a very important rationality skill. […] I think it’s very possible that hypothetical Deworming CFAR would flinch away from the idea “maybe we should teach people to read studies”. Or they’d run a pilot program, notice that teaching people to read studies tends to make them less enthusiastic about deworming, and say “well, guess that learning how to read studies actually makes people more irrational.”
My impression is that CFAR specifically intends to focus on rationality skills relevant to evaluating AI risk. Obviously this puts them at elevated risk of the second failure mode you mention (mistaking correct updating away from AI risk for AI risk not being a thing), but it seems to me that if CFAR can’t pass so elementary a test of rationality, it fails even without this sort of conflict-of-interest-adjacent problem.