Most recent progress probably isn't from unsustainable inference scaling
(inference costs on the METR tasks)
Great post.
Shouldn't it be easy to check this?
Just check whether the doubling time for inference costs is shorter than the doubling time for horizon length.
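For concreteness, here is a rough sketch of what that check could look like: fit a straight line to log2 of each series over time and compare the implied doubling times. The function names (`doubling_time`, `cost_outpaces_horizon`) are made up for illustration, and the numbers below are synthetic, generated from known doubling times purely to show the mechanics. They are not METR measurements.

```python
import math


def doubling_time(times, values):
    """Estimate time per doubling via a least-squares fit of log2(values) on times.

    Returns math.inf if the fitted trend is flat or shrinking.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more points, with matching lengths")
    logs = [math.log2(v) for v in values]
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    slope = sxy / sxx  # doublings per unit time
    return 1.0 / slope if slope > 0 else math.inf


def cost_outpaces_horizon(times, costs, horizons):
    """True if inference cost doubles in less time than horizon length does."""
    return doubling_time(times, costs) < doubling_time(times, horizons)


# Synthetic illustration only (hypothetical values, not real data):
# horizon length doubles every 0.5 years, cost doubles every 1.0 year.
years = [0.0, 0.5, 1.0, 1.5, 2.0]
horizons = [2 ** (t / 0.5) for t in years]
costs = [2 ** (t / 1.0) for t in years]

print("horizon doubling time:", doubling_time(years, horizons))
print("cost doubling time:", doubling_time(years, costs))
print("cost outpaces horizon:", cost_outpaces_horizon(years, costs, horizons))
```

In this made-up case costs double more slowly than horizon length, so the check comes back false. With real data the fit would be noisy, so the comparison is only as good as the cost and horizon estimates fed into it.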
If you're interested in this question, I have a more detailed empirical analysis here:
https://blog.redwoodresearch.org/p/ais-capability-improvements-havent