AI Reasoning Benchmarks

New study shows OpenAI's GPT-5.5 failed to outperform o3 on law school exams

University of Maryland law professors have found that OpenAI's GPT-5.5 did not meaningfully outperform its predecessor, o3, on law school final exams, a finding that challenges the assumption that newer AI models consistently improve on older ones.

June 29, 2026

UN independent panel warns unchecked AI progress poses catastrophic risks

On July 1, 2026, the UN's Independent International Scientific Panel on Artificial Intelligence released a preliminary report warning that unregulated AI development is outpacing both scientific understanding and government policy, with no guarantee against catastrophic harm. Led by UN Secretary-General António Guterres and computer scientist Yoshua Bengio, the panel identified specific risks: loss of control over autonomous systems, deceptive AI behaviors, and exploitation for fraud, cyberattacks, and biological threats. The report notes that AI already demonstrates expert-level reasoning in mathematics and science, with task complexity doubling every four to seven months. It also notes that current models, trained on only a fraction of the world's 7,000 languages, produce dangerous errors in health diagnoses for many populations.

July 1, 2026

Chinese startup Z.ai launches GLM-5.2, rivaling Anthropic and OpenAI at one-sixth the cost

Beijing-based startup Z.ai last month launched GLM-5.2, a large language model that now performs nearly as well as Anthropic's Claude Opus 4.8 on coding and agent tasks while operating at roughly one-sixth the cost of closed U.S. models like GPT and Claude. The model has rapidly gained traction on third-party AI platforms, including OpenRouter, where it now ranks above Anthropic's offerings, and Artificial Analysis' leaderboard, where it holds fifth place overall and second place for front-end coding. Industry observers have characterized the development as a "mini DeepSeek moment," a reference to the Chinese competitor that disrupted markets in 2025 with its own low-cost, high-capability model. Prominent Western tech leaders, including Snowflake CEO Sridhar Ramaswamy and venture capitalist Marc Andreessen, have publicly praised GLM-5.2's capabilities.

July 2, 2026

Scaled Cognition Raises $100M to Build Reliability-First AI for Enterprise

Scaled Cognition, a newly launched AI company founded by Dan Klein, has raised $100 million in Series A funding led by Khosla Ventures to build enterprise-grade AI systems designed for reliability rather than raw capability. The company focuses on "Large Action Models" and verifiable reinforcement learning, positioning itself against the current generation of AI tools that deliver inconsistent results. Scaled Cognition has announced partnerships with Genesys for virtual agent deployment and integrations with Baseten and Together AI for infrastructure.

June 25, 2026
