Research
Revolutionary Depth Estimation Method Uses Visual Autoregressive Priors
VAR-Depth achieves top-tier performance with fewer samples, pushing forward 3D vision technology.
Lamps: Transforming Medical Imaging with Self-Supervised Learning
Lamps uses self-supervised learning to enhance anatomical recognition in chest radiographs, promising breakthroughs in clinical diagnostics.
ViLaCD-R1: Precision Change Detection Revolutionizes Remote Sensing
ViLaCD-R1 sets a new standard in remote sensing, enhancing semantic change detection and spatial accuracy.
CritiFusion: Elevating Text-to-Image AI with Semantic Precision
CritiFusion refines text-to-image AI with semantic critique and frequency enhancement, boosting alignment and quality.
M-ErasureBench and IRECE: Pioneering AI Safety in Diffusion Models
Explore how the M-ErasureBench benchmark and the IRECE module are revolutionizing concept erasure, enhancing AI safety across diverse input modalities.
CLAdapter: Enhancing Vision Models for Data-Scarce Domains
Discover CLAdapter, a breakthrough tool that boosts the adaptability of vision models in specialized, data-limited environments.
Guided Path Sampling: Stabilizing the Future of Diffusion Models
Researchers introduce Guided Path Sampling, enhancing stability and output quality in diffusion models like SDXL and Hunyuan-DiT.
Adaptive Visual Token Pruning: Boosting Efficiency in Multimodal Models
A novel approach reduces token redundancy in Large Multimodal Models, enhancing performance in complex scenarios.
HY-Motion 1.0: A New Era in 3D Motion Generation from Text
HY-Motion 1.0 scales Diffusion Transformers to a billion parameters, setting a new standard in 3D motion generation from text.
YOLO-Master: Transforming Real-Time Object Detection with Adaptive Computation
Discover YOLO-Master, a breakthrough framework using adaptive computation to boost object detection performance and efficiency.
DreamOmni3: Scribble Your Way to Smarter GUI Editing
DreamOmni3 introduces scribble-based GUI editing, with new tasks and benchmarks that promise to transform design workflows.
Freetime FeatureGS: Revolutionizing 4D Scene Reconstruction with Gaussian Primitives
Freetime FeatureGS introduces a groundbreaking method in 4D scene reconstruction, eliminating unstable video segmentation maps for superior results.
WRCFormer: Revolutionizing 3D Object Detection for Autonomous Vehicles
By merging radar and camera data, WRCFormer sets new standards for object detection under adverse weather conditions.
New Framework Transforms Face Anonymization with Diffusion Models
Researchers introduce a method for anonymizing faces while preserving attributes, enhancing privacy without text prompts.
3D Gaussian Framework Transforms Scene Understanding in Autonomous Vehicles
Driving World Models (DWMs) use 3D Gaussian scenes to enhance multi-modal generation and understanding in autonomous driving.