4 Comments
Nick Bulka

Are we at all concerned about our private chat / coding data being trained on by Anthropic models?

They show that dark pattern of: "How would you rate this chat? 1: Bad, 2: Ok, 3: Good"

which, under their terms of service, gives them access to the whole chat, especially if you press the 1 key out of habit, expecting to accept the model's next step.

Trust violation.

I could see how Mythos boost is just the culmination of a feature flag: extract high-signal intelligence from user feedback sessions. We need clarity here, Anthropic.

JSD

> was isolated to three specific sub-domains of our environment mix: GUI computer use, office-related tasks, and a small set of STEM environments.

Given this, I wonder if CoT controllability is better on SWE-related tasks than on STEM, office-related, or GUI computer use tasks.

JSD

Ah, maybe not, if this is mostly a propensity effect.

JSD

> It also seems plausible that the warm-start data contains examples where AIs are clearly doing some reasoning but don’t think about it and this transfers broadly.

By "this transfers broadly," do you mean it transfers to the model not distinguishing thinking from output in other situations? Or is that not exactly what you meant?