“For the vast majority of workloads, Sonnet is 2x faster than Claude 2 and Claude 2.1 with higher levels of intelligence,” Anthropic wrote in the Claude 3 announcement post. “It excels at ...
Anthropic's first Economic Index sheds light on who's using AI for what, and how much of our work it's actually automating.
Anthropic developed a defense against universal AI jailbreaks for Claude called Constitutional Classifiers - here's how it ...
After improving it, Anthropic ran a test of 10,000 synthetic jailbreaking attempts on an October version of Claude 3.5 Sonnet with and without classifier protection using known successful attacks.
Anthropic unveils new proof-of-concept security measure tested on Claude 3.5 Sonnet “Constitutional classifiers” are an attempt to teach LLMs value systems Tests resulted in more than an 80% ...
In Anthropic's internal test, the unprotected version of Claude 3.5 Sonnet is said to have blocked only 14 percent of unauthorized requests. A version protected with the filter system, on the ...
Claude 3.5 Sonnet. It does this while minimizing over-refusals (rejection of prompts that are actually benign) and and doesn’t require large compute. The Anthropic Safeguards Research Team has ...