Jacob Steinhardt - Bounded Regret

Building Technology to Drive AI Governance

2 months ago 7 min read

Technically skilled people who care about AI going well often ask me: how should I spend my time if I think AI governance is important? By governance, I mean the constraints, incentives, and

Oversight Assistants: Turning Compute into Understanding

3 months ago 10 min read

Currently, we primarily oversee AI with human supervision and human-run experiments, possibly augmented by off-the-shelf AI assistants like ChatGPT or Claude. At training time, we run RLHF, where humans (and/or chat assistants)

Analyzing long agent transcripts (Docent)

a year ago 1 min read

This is a brief overview of a recent release by Transluce. You can see the full write-up on the Transluce website. AI systems are increasingly being used as agents: scaffolded systems in which

Introducing Transluce — A Letter from the Founders

a year ago 3 min read

We are launching an independent research lab that builds open, scalable technology for understanding AI systems and steering them in the public interest. Transluce means to shine light through something to reveal its

Analyzing the Historical Rate of Catastrophes

2 years ago 20 min read

To communicate risks, we often turn to stories. Nuclear weapons conjure stories of mutually assured destruction, briefcases with red buttons, and nuclear winter. Climate change conjures stories of extreme weather, cities overtaken by