Yuguang Li (李宇光)
I am a computer vision researcher working on image-grounded 3D understanding: recovering camera pose, geometry, layout, and scene structure from sparse, noisy, unposed images. I completed my PhD research in the GRAIL Lab at the UW Allen School, advised by Prof. Linda Shapiro, with collaborations across UW and industry, including Alex Colburn, Sing Bing Kang, and Ranjay Krishna, as well as Zillow and other industry research groups.
My research interests span multi-view perception, geometric reconstruction, geometric foundation models, generative models, optimization, graphics, and compact 3D scene representations. I am especially interested in systems that reconstruct the physical world from minimal visual evidence: accurate where images provide strong constraints, and plausible where observations are missing, occluded, or noisy.
My recent work focuses on autonomous multi-view floor-plan reconstruction. BADGR combines layout-pose bundle adjustment with a floor-plan-specific diffusion model for learned outlier rejection and layout inpainting. MvFFN/MOIS reformulates floor-plan recovery as pixel-grounded wall-instance and connectivity reasoning from sparse, unposed, wide-baseline panoramas.
This line of research grew out of Zillow's floor-plan reconstruction platform, which I originated and led, with a consistent mission: recover accurate camera poses, geometry, and structured floor plans from sparse indoor imagery. We shared our dataset and initial details publicly in the ZInD paper; PSM-Net extended the direction to BEV-based multi-view layout estimation; SALVe grew from intern-led work that I mentored from problem framing through technical direction; and CoVisPose advanced the same wide-baseline pose-estimation agenda before I later deployed it as a production localization system.