Thanks to Collin Burns, Ruiqi Zhong, Cassidy Laidlaw, Jean-Stanislas Denain, and Erik Jones, who generated most of the considerations discussed in this post. Previously, I evaluated the accuracy of forecasts about performance on
Last August, my research group created a forecasting contest to predict AI progress on four benchmarks. Forecasters were asked to predict state-of-the-art (SOTA) performance on each benchmark for June 30th 2022, 2023, 2024,
Thanks to Hao Zhang, Kayvon Fatahalian, and Jean-Stanislas Denain for helpful discussions and comments. Addendum and erratum: see here for an excellent discussion of similar ideas by Kipply Chen. In addition, James Bradbury
Last week, I talked about six recent papers from our group and discussed the first two in detail. This week, I'll discuss the remaining four. They fall into two categories: robustness, and science
My students and collaborators have been doing some particularly awesome work over the past several months, and to highlight that, I wanted to summarize their papers here and explain why I’m excited