GPT-4 surprised many people with its abilities at coding, creative brainstorming, letter-writing, and other skills. How can we be less surprised by developments in machine learning? In this post, I’ll forecast the properties of large pretrained ML systems in 2030.
Thanks to Collin Burns, Ruiqi Zhong, Cassidy Laidlaw, Jean-Stanislas Denain, and
Erik Jones, who generated most of the considerations discussed in this post.
Previously [https://bounded-regret.ghost.io/ai-forecasting-one-year-in/], I
evaluated the accuracy of forecasts about AI progress one year after they were made.