Title: EgoRoom: egocentric 3D pose estimation through multi-coordinates heatmaps
Author: Jung, Changsoo
Advisor: Blanchard, Nathaniel
Committee members: Beveridge, Ross; Clegg, Benjamin
Dates: 2022-08-29; 2024-08-22; 2022
URI: https://hdl.handle.net/10217/235557
Abstract: Recent head-mounted virtual reality (VR) devices include fisheye lenses oriented toward users' bodies, which enable full-body pose estimation from video. However, traditional joint detection methods fail in this setting because fisheye lenses make joint depth information ambiguous, causing body parts to be self-occluded by the distorted torso. To resolve these problems, we propose a novel architecture, EgoRoom, that uses three different types of 3D heatmaps to predict body joints, even when they are self-occluded. Our approach consists of three main modules. The first module transmutes the fisheye image into feature embeddings via an attention mechanism. The second module utilizes three decoder branches to convert those features into a 3D coordinate system, with each branch corresponding to the xy, yz, and xz planes. Finally, the third module combines the three decoder heatmaps into the predicted 3D pose. Our method achieves state-of-the-art results on the xR-EgoPose dataset.
Format: born digital
Type: masters theses; Text
Language: eng
Rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
Keywords: joint tracking; egocentric pose estimation; XR
Embargo Expires: 08/22/2024
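The abstract's third module fuses per-plane heatmaps (xy, yz, xz) into one 3D pose. As a rough illustration of the idea only — the thesis's actual fusion module is learned, and the array layout below is an assumption — a naive version can take the peak of each plane's heatmap and average the two estimates that each axis receives:

```python
import numpy as np

def pose_from_plane_heatmaps(hm_xy, hm_yz, hm_xz):
    """Naive fusion sketch (not the thesis's learned module).

    Each input has shape (J, H, W): one 2D heatmap per joint,
    with the first array axis of each map taken as the plane's
    first coordinate (e.g. hm_xy[j, x, y]) -- an assumed layout.
    Every world axis appears in two planes, so its two peak
    estimates are averaged.
    """
    def argmax2d(hm):
        # Per-joint (row, col) location of each heatmap's peak.
        J = hm.shape[0]
        idx = hm.reshape(J, -1).argmax(axis=1)
        return np.stack(np.unravel_index(idx, hm.shape[1:]), axis=1).astype(float)

    xy = argmax2d(hm_xy)  # columns: (x, y)
    yz = argmax2d(hm_yz)  # columns: (y, z)
    xz = argmax2d(hm_xz)  # columns: (x, z)

    x = (xy[:, 0] + xz[:, 0]) / 2.0
    y = (xy[:, 1] + yz[:, 0]) / 2.0
    z = (yz[:, 1] + xz[:, 1]) / 2.0
    return np.stack([x, y, z], axis=1)  # (J, 3) joint coordinates
```

A consistency check between the two estimates of each axis (here simply averaged) is one reason a learned fusion module can outperform independent per-plane argmaxes.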