Research
OpenAI Reveals Critical Flaws in AI Image Recognition
New research shows neural networks can be tricked by images from different angles, raising urgent safety concerns for self-driving cars.
OpenAI Studies AI Agents Creating Their Own Language
New research shows AI agents crafting unique languages, pointing to future multi-agent teamwork.
OpenAI Builds Unsupervised Model That Learns Sentiment from Amazon Reviews
OpenAI cuts reliance on labeled data with a new model that understands sentiment through raw text prediction.
AgentMath Sets New Standard by Combining Language Models and Code for Math Mastery
AgentMath blends language models with code interpreters to push math problem-solving accuracy to new heights.
Generative Ontology: Structuring AI-Driven Game Design
New research introduces Generative Ontology, a framework that combines ontologies and large language models to create coherent and creative game designs, demonstrated by the GameGrammar system.
Testing Robot Smarts: The ERIQ Benchmark and GenieReasoner
A new benchmark and action tokenizer tackle the gap between robotic thinking and doing—showing even the smartest AI needs steady hands.

Microsoft Launches Paza to Close Speech Recognition Gap for African Languages
Microsoft Research introduces Paza, a speech pipeline, and PazaBench, a benchmark covering 39 African languages. Both were tested with communities in real-world settings to improve recognition for under-represented languages.
ALIVE Framework Transforms LLM Training with Adversarial Learning and Verbal Feedback
Researchers unveil ALIVE, a new framework that uses adversarial learning and verbal feedback to boost reasoning in large language models—sidestepping key limits of traditional reinforcement learning.
Depth Anything V2 Boosts Accuracy in Robotic Surgery Depth Perception
New research uses synthetic data and DV-LORA to improve monocular depth estimation, a key factor for precision in complex surgical environments. The method achieves top results on the SCARED dataset.
DreamTacVLA Gives Robots a High-Res Sense of Touch
A new framework adds detailed tactile sensing to VLA models, helping robots master contact-heavy tasks with 95% accuracy.
GR-Dexter Advances Bimanual Dexterous-Hand Robot Manipulation
New framework combines custom hardware, intuitive teleoperation, and curated datasets to boost vision-language-action models for bimanual robots.
DiffThinker: Redefining Multimodal AI with Generative Image Reasoning
DiffThinker introduces a new diffusion-based method that treats multimodal reasoning as a generative image-to-image task, surpassing top models in vision-focused challenges.

OpenScholar Beats GPT-4o in Scientific Research Synthesis
An open-source model from UW and AI2 outperforms GPT-4o at summarizing research and citing sources, and scientists prefer its output.

TTT-Discover: AI Trains During Use, Beats Human Experts
A new method lets AI keep learning while running, finding fresh solutions and outperforming humans on tough tasks.
New Method Maps Entropy in Phase-Change Materials to Boost Memory Speed
Researchers combine first-principles calculations, machine learning, and experiments to track entropy in phase-change materials, aiming for faster, more efficient memory devices.