AI benchmark cheating has been theorized as an inevitable consequence of training capable optimizers against fixed metrics. With OpenAI's GPT-5.6 Sol, the theory arrived in full view. The nonprofit ...
Chinese tech company Meituan officially unveiled LongCat-2.0 on June 30, confirming the open-license, 1.6-trillion-parameter mixture-of-experts AI model is the same system that sp ...
The Trump administration has lifted restrictions on artificial intelligence company Anthropic’s latest versions of its Claude ...
Claude Sonnet 5 brings stronger agentic AI features, lower pricing, and updated safety protections. Here's what IT leaders ...
Anthropic PBC today debuted Claude Sonnet 5, a midrange large language model that outperforms its predecessor in several ...
Transformer on MSN
GPT-5.6 cheats so much METR couldn't measure it
OpenAI’s new model broke rules and exploited loopholes more than any model METR has tested to date ...
Add Decrypt as your preferred source to see more of our stories on Google. Meta introduced Brain2Qwerty v2, a non-invasive AI system that decodes brain activity into text. The model achieved 61% ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results