back
Get SIGNAL/NOISE in your inbox daily

Nature – Large language models (LLMs) are more accurate when they output intermediate steps. A strategy called reinforcement can teach them to do this without being told.