
tl;dr
Paper of the month:
• Models can detect with high accuracy when they’re being evaluated, and could undermine safety assessments by behavi…