GeistHaus
log in ยท sign up

The Extreme Inefficiency of RL for Frontier Models โ€” Toby Ord

tobyord.com

The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling.

1 page links to this URL