About me
Hi, I'm Alessio
I am an AI Researcher at Google XR Zurich in Federico Tombari's team. I lead my own applied research team focused on AI technologies for the current and next generation of XR devices. We work on the whole stack, from integrating AI technologies to improving Gemini for egocentric assistant use cases.
I earned my Ph.D. in Computer Science and Engineering from the University of Bologna in 2019 at the Computer Vision Lab, advised by Professor Luigi Di Stefano.
My research journey began with deep learning for retail environments and depth estimation for autonomous driving. Since then, it has evolved through several topics, from implementing parts of the visual translation stack powering Google Lens and Google Translate to various image understanding tasks, without forgetting some of my early work on 3D reconstruction and generation. My current focus is on multimodal video/image understanding and generation using LLMs and diffusion models.
I'm always excited to discuss new ideas or explore emerging topics. If you find any of our work interesting, please don't hesitate to reach out!

Latest News
Reconstructing 3D motion from a single camera is usually a mathematical nightmare, but Svitlana built MOSAIC-GS to make it surprisingly fast and high-quality by basically pre-calculating the chaos.
I will serve as Area Chair for ECCV 2026!
Since “the” and “and” are apparently just giant attention-hungry magnets, Anna figured out how to filter the noise and turn them into RefAM, a zero-shot segmenter that somehow hits SOTA without us having to train a single thing.
Congratulations to Luca Zanella for having our joint work on online video step grounding, BagLM, presented at NeurIPS 2025.
