The new reinforcement learning system lets large language models challenge and improve themselves using real-world data ...