Jointly Aligning and Cosegmenting Multiple Photo Streams
People
Publication
Matlab example code
We are working on a journal version. We will post the code after the journal submission.
Description
Motivation of Research
Suppose that we query and download millions of photo streams associated with the keyword "scuba diving" from the photo sharing site Flickr. The photo streams are neither aligned nor calibrated, since they are taken by different users at different times and locations. At the same time, they are likely to share common storylines: sequences of events and activities that recur across the scuba diving photo streams (e.g. riding a boat, putting on equipment, underwater exploration, and so on). Our challenging goal is to build such collective storylines from the photo streams of millions of users.
In this paper, as a first technical step, we propose a method to jointly perform alignment of multiple photo streams and cosegmentation of the aligned images, as shown in the figure below. In the alignment step, the images of different photo streams are matched based on their visual content and associated metadata. In the cosegmentation step, the aligned images are segmented together in order to facilitate image understanding tasks such as pixel-level classification. We close the loop between the two tasks so that solving one helps improve the performance of the other in a mutually rewarding way.
Method
We design a scalable optimization framework based on message passing to achieve both tasks jointly for the whole input image set at once. Please see the paper for details. For evaluation, we collect about 1.5 million images from 13 thousand photo streams covering 15 outdoor recreational activities from Flickr.
Take-home Message
We propose a scalable approach to jointly aligning and segmenting multiple uncalibrated Web photo streams of different users in an unsupervised, bottom-up way. The empirical results suggest that our method can serve as a key component toward our ultimate goal, inferring collective photo storylines from Web images, which is the next direction of our work.
Funding