> the “cost” of our monitoring scales with how many benign messages get flagged
In practice, wouldn't the cost be higher for messages that occur later in a trajectory, since the auditor needs to be familiar with what happened earlier in that trajectory to assess them? Or is the setup that the auditor reviews each message without any context?
Yeah, that's a fair point; our cost model here is simpler than what's realistic.