Redwood Research blog
The inaugural Redwood Research podcast
With Buck Shlegeris and Ryan Greenblatt
13 hrs ago
•
Buck Shlegeris
3:52:01
Recent LLMs can do 2-hop and 3-hop latent (no CoT) reasoning on natural facts
Recent AIs are much better at chaining together knowledge in a single forward pass
Jan 1
•
Ryan Greenblatt
December 2025
Measuring no CoT math time horizon (single forward pass)
Opus 4.5 has around a 3.5-minute 50%-reliability time horizon
Dec 26, 2025
•
Ryan Greenblatt
Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance
AI can sometimes distribute cognition over many extra tokens
Dec 22, 2025
•
Ryan Greenblatt
BashArena and Control Setting Design
We’ve just released BashArena, a new high-stakes control setting we think is a major improvement over the settings we’ve used in the past.
Dec 18, 2025
•
Adam Kaufman
and
James Lucassen
The behavioral selection model for predicting AI motivations
The basic arguments about AI motivations in one causal graph
Dec 4, 2025
•
Alex Mallen
and
Buck Shlegeris
November 2025
Will AI systems drift into misalignment?
A reason alignment could be hard
Nov 15, 2025
•
Josh Clymer
What's up with Anthropic predicting AGI by early 2027?
I operationalize Anthropic's prediction of "powerful AI" and explain why I'm skeptical
Nov 3, 2025
•
Ryan Greenblatt
October 2025
Sonnet 4.5's eval gaming seriously undermines alignment evals
And this seems caused by training on alignment evals.
Oct 30, 2025
•
Alexa Pan
and
Ryan Greenblatt
Should AI Developers Remove Discussion of AI Misalignment from AI Training Data?
There is some concern that training AI systems on content predicting AI misalignment will hyperstition AI systems into misalignment.
Oct 23, 2025
•
Alek Westover
Is 90% of code at Anthropic being written by AIs?
I'm skeptical that Dario's prediction of AIs writing 90% of code in 3-6 months has come true
Oct 22, 2025
•
Ryan Greenblatt
Reducing risk from scheming by studying trained-in scheming behavior
Can we study scheming by studying AIs trained to act like schemers?
Oct 16, 2025
•
Ryan Greenblatt