Hey guys, this is our first check-in from the AI Safety Info distillation fellowship. We are distilling all manner of content to make it easier for people to engage with AI Safety and x-risk topics. Here is the AI Safety Info website (and its more playful clone, Stampy).
If you have a question in mind but couldn't find a clear answer to it, type it in the search box and request an answer. It can be literally any question about AI safety, no matter how basic or advanced. It could also just be a convoluted topic that you think could do with some distillation. One of the distillers will get to work on answering it.
Additionally, if you see content that you feel could be clearer or that has mistakes, you can leave a comment on the document by clicking the edit button at the bottom right of the answer.
We will try to make this post a regular thing, sharing some of the questions that have been answered or topics that have been distilled recently. Since this is the first one, here is a longer list, with links to the answers, of some of the questions we answered in the last month:
- What are the differences between a singularity, an intelligence explosion and a hard takeoff?
- What is the difference between AI Safety, AI alignment, AI Control, Friendly AI, AI Ethics, AI existential safety and AGI safety?
- What are the differences between AGI, superintelligence and transformative AI?
- What is existential risk?
- What is corrigibility?
- What is AI Safety?
- What does prosaic alignment mean?
- What is everyone working on in AI alignment?
- What is mutual information?
- Are corporations superintelligent?
- Can't we limit damage from AI systems in the same ways we limit damage from companies?
- Isn't the real concern technological unemployment?
- What is Iterated Distillation and Amplification (IDA)?
- How might interpretability be helpful?
- What is the probability of extinction from misaligned superintelligence?
- What projects are CAIS working on?
- What are the leading theories in moral philosophy and which of them might be technically the easiest to encode into an AI?
- Isn't the real concern X?
- Wouldn't a superintelligence be smart enough not to make silly mistakes in its comprehension of our instructions?
- What is an alignment tax?
- Won't humans be able to beat an unaligned AI since we have a huge advantage in numbers?
- Even if we are rationally convinced about the urgency of existential AI risk, it can be hard to feel that emotionally because the danger is so abstract. How can this gap be bridged?
- What evidence do experts usually base their timeline predictions on?
This is crossposted to LessWrong: https://www.lesswrong.com/posts/FYRYhkdAQoQibasNB/new-distillations-on-stampy-s-ai-safety-info-expansive