Talks
- 27-03-2026 3D World Model for Robotics
- 20-03-2026 Scaling World Models for Generalist Robots
- 24-02-2026 Towards a science of scaling agent systems: When and why agent systems work?
- 03-02-2026 SAM 3D: Powerful 3D Reconstruction for Physical World Images
- 27-01-2026 A Step for AI Copilot in Medical Diagnosis and Surgery
- 16-12-2025 Towards Spatial Supersensing in Video
- 08-12-2025 Scaling Beyond Autoregression
- 28-10-2025 Video models are zero-shot learners and reasoners
- 23-10-2025 The Tolman-Sherrington Metamorphosis of Intelligence
- 21-05-2025 Towards Packing the Intelligence of Large Foundation Models on Low Resource Devices
- 15-07-2024 Visual Human Motion Analysis
- 04-04-2024 Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense
- 28-03-2024 Physics-informed Modeling of Dynamic Humans and their Interactions
- 28-03-2024 Towards democratising robot learning for all
- 21-03-2024 Learning Humanoid Robots
- 18-03-2024 LLaVA: A Vision-and-Language Approach to Computer Vision in the Wild
- 29-02-2024 Understanding and Mitigating the Pre-training Noise on Downstream Tasks
- 22-02-2024 Distilling Vision-Language Models on Millions of Videos
- 22-02-2024 Video Creation with Diffusion Models
- 19-02-2024 Towards Learning a Driving Simulator from the Real World