Research
Mesquite: Open-Source Motion Capture for All
Mesquite delivers affordable, accurate motion capture, expanding possibilities in entertainment, healthcare, and more.
Microsoft's AI Innovation: Virtual Populations Illuminate Cancer Mysteries
Microsoft's AI-generated virtual populations uncover hidden cellular patterns, offering new insights into cancer treatment and understanding.
MCI-Net Sets New Benchmark in Point Cloud Registration
Achieving 96.4% recall on 3DMatch, MCI-Net redefines feature learning and registration standards.
Breakthrough Framework Boosts Multi-View Rendering with Real-World Precision
A novel feed-forward approach enhances geometry and material consistency across multiple viewpoints, revolutionizing rendering technology.
AI Models Struggle with Multimodal Tasks in Healthcare Benchmark
The 'Bones and Joints' Benchmark highlights AI's challenges in clinical reasoning, underscoring the need for better multimodal integration.
Fun-Audio-Chat: Transforming Audio Language Models with Dual-Resolution Techniques
Explore Fun-Audio-Chat, the model that elevates speech-text processing through innovative training methods and impressive performance.
CoFi-Dec: Innovating Hallucination Reduction in Vision-Language Models
CoFi-Dec offers a training-free solution to reduce hallucinations in vision-language models, boosting reliability and efficiency.
New AI Framework Boosts Early Detection of Pancreatic Tumors
The SRFA framework merges advanced models, enhancing accuracy in pancreatic tumor imaging.
Stanford AI Lab Illuminates ICLR 2022 with Cutting-Edge Innovations
Reinforcement learning, distribution shifts, and language models take center stage at ICLR 2022, showcasing Stanford's pivotal role in AI research.
OmniDrive-R1: Pioneering Precision in Autonomous Driving with Vision-Language Models
OmniDrive-R1 addresses object hallucination in autonomous vehicles, enhancing accuracy and efficiency without the need for dense labels.
Revolutionizing 3D Scene Manipulation with Multimodal Language Models
Researchers unveil a new API and multi-agent framework, boosting 3D object arrangement with MLLMs.
AI and Neuroscience: Unraveling Genetic Mysteries in Brain Function
Parker Grosjean's AI-driven method explores genetic influences on brain function, promising breakthroughs in neuroscience.
OpenGround: Ushering in a New Era of 3D Visual Grounding
OpenGround employs Active Cognition to enhance 3D object recognition, transcending traditional constraints.
Berkeley AI Research Uses RL to Ease Traffic with Autonomous Vehicles
A fleet of RL-controlled autonomous vehicles shows promise in reducing traffic jams and boosting fuel efficiency.
Envision Framework Transforms Visual Planning for Robots
A new diffusion model tackles spatial drift and goal misalignment, revolutionizing robotic planning.