Note: This post is based on a Google document I created for my research group. It speaks in the first person, but I think the lessons could be helpful for many research groups.
I’ve previously argued that machine learning systems often exhibit emergent capabilities, and that these capabilities could lead to unintended negative consequences. But how can we reason concretely about these consequences?
Thanks to Collin Burns, Ruiqi Zhong, Cassidy Laidlaw, Jean-Stanislas Denain, and
Erik Jones, who generated most of the considerations discussed in this post.
Previously [https://bounded-regret.ghost.io/ai-forecasting-one-year-in/], I evaluated the accuracy of forecasts about AI progress.
Last August, my research group created a forecasting contest [https://bounded-regret.ghost.io/ai-forecasting/] to predict AI progress on four benchmarks. Forecasters were asked to predict state-of-the-art (SOTA) performance on each benchmark at a series of future dates.
Thanks to Hao Zhang, Kayvon Fatahalian, and Jean-Stanislas Denain for helpful
discussions and comments.
Addendum and erratum: see here [https://kipp.ly/blog/transformer-inference-arithmetic/] for an excellent discussion of similar ideas by Kipply.