Don’t just not do bad things. Do good things.
The development of AI that is more broadly capable than humans will create a new and serious threat: *AI-enabled coups*. An AI-enabled coup could be staged by a very small group, or just a single person, and could occur even in established democracies. Sufficiently advanced AI will introduce three novel dynamics that significantly increase coup risk. Firstly, military and government leaders could fully replace human personnel with AI systems that are *singularly loyal* to them, eliminating the need to gain human supporters for a coup. Secondly, leaders of AI projects could deliberately build AI systems that are *secretly loyal* to them, for example fully autonomous military robots that pass security tests but later execute a coup when deployed in military settings. Thirdly, senior officials within AI projects or the government could gain *exclusive access* to superhuman capabilities in weapons development, strategic planning, persuasion, and cyber offense, and use these to increase their power until they can stage a coup. To address these risks, AI projects should design and enforce rules against AI misuse, audit systems for secret loyalties, and share frontier AI systems with multiple stakeholders. Governments should establish principles for government use of advanced AI, increase oversight of frontier AI projects, and procure AI for critical systems from multiple independent providers.
The content of this report was written by me, Michael Dickens, and does not represent the views of Rethink Priorities. The executive summary was prep…
This series examines the incoming crisis of human irrelevance and provides a map towards a future where people remain the masters of their destiny.
On a career move, and on AI-safety-focused people working at AI companies.
I wrote a report giving a high-level review of what work people are doing in AI safety. The report specifically focused on two areas: AI policy/advocacy and non-human welfare (including animals and digital minds).
Even if we solve the AI alignment problem, we still face post-alignment problems, which are all the other existential problems that AI may bring (existential in the classic sense of “a permanent loss of most of the potential flourishing of the future”).
Radical optionality is about preserving democratic governments’ ability to make good decisions about how to govern transformative AI systems as circumstances evolve.
As AI systems get more capable, it becomes increasingly uncompetitive and infeasible to avoid deferring to AIs on increasingly many decisions. Furthe…