New Anthropic research reveals how AI reward hacking leads to dangerous behaviors, including models giving harmful advice ...
Better Than Us, a game from Vampire Therapist's devs about sneaking past and lying to futuristic billionaires so you can ...