Ryo Yonetani, Kris Kitani, Yoichi Sato: “Ego-Surfing First-Person Videos”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR2015), Boston, MA, USA, Jun 2015 [CVPR2015 version] [Extended version (arXiv)]
We envision a future time when wearable cameras are worn by the masses, recording first-person point-of-view (POV) videos of everyday life. While these cameras can enable new assistive technologies and novel research challenges, they also raise serious privacy concerns. For example, first-person videos passively recorded by wearable cameras will necessarily include anyone who comes into the view of a camera – with or without consent. Motivated by these benefits and risks, we developed a self-search technique tailored to first-person videos. The key observation of our work is that the egocentric head motion of a target person (i.e., the self) is observed in both the target's POV video and the observer's POV video. The motion correlation between the target person’s video and the observer’s video can then be used to identify instances of the self uniquely. We incorporate this feature into the proposed approach, which computes the motion correlation over densely sampled trajectories to search for a target in observer videos. Our approach significantly improves self-search performance over several well-known face detectors and recognizers. Furthermore, we show how our approach can enable several practical applications such as privacy filtering, target video retrieval, and social group clustering.
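The key observation above can be illustrated with a minimal sketch. This is not the implementation from the paper: it assumes the target video has already been reduced to a one-dimensional global motion signal (one value per frame) and that each trajectory in the observer video has been reduced to a local motion signal of the same length and sign convention. The function names `pearson` and `best_matching_trajectory` are hypothetical, and plain Pearson correlation stands in for whatever correlation measure the paper actually uses.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length motion signals."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("signals must have equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant signal carries no motion information
    return cov / math.sqrt(var_a * var_b)


def best_matching_trajectory(target_global_motion, observer_trajectories):
    """Return (index, scores) of the trajectory most correlated with the target."""
    scores = [pearson(target_global_motion, t) for t in observer_trajectories]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data (made up for illustration): per-frame horizontal motion.
target_motion = [0.0, 1.0, 0.5, -1.0, -0.5, 0.0]
trajectories = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],    # static background point
    [0.0, 2.0, 1.0, -2.0, -1.0, 0.0],  # moves in step with the target's head
    [1.0, 0.0, -1.0, 0.5, 0.0, -0.5],  # unrelated motion
]
best, scores = best_matching_trajectory(target_motion, trajectories)
```

In this toy example the second trajectory is a scaled copy of the target's motion, so it receives the highest score. A real pipeline would need to estimate both motion signals from the synchronized videos first.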
This dataset contains synchronized first-person videos of a conversation scene recorded at 60 fps for 30 sec. It includes the following files:
- ppm/: Input videos (sequences of input frames resized to 320x240). Please email firstname.lastname@example.org if you need the videos at the original resolution.
- sv/: Supervoxel hierarchies computed by LIBSVX, as used in our CVPR2015 version.
- gt/: Target masks (bounding boxes annotated every 0.5 sec)
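The frame rate, duration, and annotation interval stated above imply how many frames each sequence has and which frames are annotated. The short calculation below works this out. It assumes zero-based frame indices and that the first annotation falls on the first frame. Neither assumption is confirmed by this description, so check against the actual files in gt/.

```python
FPS = 60                     # recording frame rate stated above
DURATION_S = 30              # sequence length in seconds
ANNOTATION_INTERVAL_S = 0.5  # bounding-box annotation interval

total_frames = FPS * DURATION_S
frames_per_annotation = round(FPS * ANNOTATION_INTERVAL_S)

# Assumed: zero-based indices, first annotation on frame 0.
annotated_frames = list(range(0, total_frames, frames_per_annotation))
```

Under these assumptions each video has 1800 frames, with a bounding box on every 30th frame (60 annotated frames per video).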