Discussion about this post

Leo C:

It’s worth noting that the progress so far, though it follows a 7-month task doubling time (per METR), is not built on simple scaling laws. It is therefore perhaps not reasonable to expect continued progress just because research time passes. Since scaling laws in pre-training have been exhausted, reasoning tokens, agentic workflows, and scaling RL have provided the additional boosts, and most evidence points to these developments having plateaus of their own.
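For concreteness, here is a minimal sketch (my own illustration in Python, not METR's methodology) of the arithmetic that a 7-month doubling time implies if the trend is naively extrapolated forward:

```python
# Minimal sketch (my illustration, not METR's code): the arithmetic
# implied by a 7-month task-horizon doubling time, naively extrapolated.

DOUBLING_MONTHS = 7  # METR's reported doubling time

def horizon_multiplier(months_ahead: float) -> float:
    """Factor by which the task horizon grows after `months_ahead` months,
    assuming the exponential trend simply continues."""
    return 2 ** (months_ahead / DOUBLING_MONTHS)

for months in (7, 12, 24, 36):
    print(f"{months:>2} months out: x{horizon_multiplier(months):.1f}")
# -> x2.0 at 7 months, x3.3 at 12, x10.8 at 24, x35.3 at 36
```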

I am skeptical of extrapolating priors from a time series without considering the variables that drove the performance increase in the first place.
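To make that concrete: a toy illustration (all numbers synthetic, nothing here is fit to real benchmark data) of how an exponential trend and a saturating one can agree over the observed window yet diverge sharply once the underlying drivers are exhausted:

```python
import math

R = math.log(2) / 7          # growth rate implied by a 7-month doubling

def exponential(t_months: float) -> float:
    """Unbounded trend: capability doubles every 7 months forever."""
    return math.exp(R * t_months)

def logistic(t_months: float, cap: float = 20.0) -> float:
    """Same initial growth rate, but saturating at `cap` -- a hypothetical
    ceiling standing in for exhausted drivers (pre-training, RL scaling)."""
    r = R / (1 - 1 / cap)    # match the exponential's early slope
    return cap / (1 + (cap - 1) * math.exp(-r * t_months))

for t in (0, 7, 14, 28, 42):
    print(f"t={t:>2}mo  exp={exponential(t):6.1f}  logistic={logistic(t):6.1f}")
# The two curves are nearly indistinguishable for the first year; by
# t=42 the pure trend reads ~x64 while the logistic sits near x16,
# bending toward its x20 ceiling. Early data alone cannot tell them apart.
```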

I would expect this progress to plateau by the end of 2025, unless another architectural breakthrough reaches scale.
