MIT Unveils EnCompass Framework for AI Agent Optimization
MIT CSAIL introduces the EnCompass framework, which lets AI agents backtrack and optimize LLM outputs, achieving a 15-40% accuracy boost with 82% less code.
Discovery Learning method enables rapid battery lifetime prediction in one week versus traditional months-long testing cycles.
OpenAI, Anthropic, and Google DeepMind researchers bypassed 12 published AI defenses at 90%+ rates, exposing critical security gaps in production systems.
Research from the Center for Countering Digital Hate (CCDH) estimates that Elon Musk's Grok AI was used to create approximately 3 million sexualized images, including thousands depicting children, over an 11-day period, raising severe safety concerns.
A new benchmark called APEX-Agents shows that even leading AI models like GPT-5.2 and Gemini 3 Flash fail on most complex, multi-domain tasks drawn from professional fields like law and finance, raising doubts about their immediate readiness for the workplace.
MIT researchers demonstrate that best-performing machine learning models can become worst-performing when applied to new data environments, revealing hidden risks from spurious correlations in medical AI and other critical applications.
In a surprising development, amateur mathematicians are leveraging AI chatbots to solve complex, long-standing mathematical problems posed by the legendary Paul Erdős, signaling a significant leap in AI's reasoning capabilities.