Discovering Collective Narratives of Theme Parks from Visitors’ Photo Streams
Description

Motivation of Research

Recent technical advances, including the widespread availability of mobile photo-taking devices and ubiquitous network connectivity, are changing the way we tell our stories. Personal storytelling is becoming more data-driven and vision-oriented; people can capture memorable moments as a stream of images and then spontaneously reuse them to tell their own stories. For example, even in a single day, tens of thousands of people visit Disneyland, and many of them take, and are willing to share, large streams of photos recording their experiences with family or friends. Each photo stream tells a slightly different story from its own point of view, but by aggregating the streams, common storylines of the Disneyland experience are likely to emerge. The objective of this research is to develop an approach for creating and exploring spatio-temporal storylines from large collections of photo streams contributed by visitors to Disneyland, along with publicly available information such as the visitor map. Taking advantage of computer vision techniques, we represent the visual content of images in the form of story elements (e.g. human faces, supporting objects, and locations), automatically extract shared key moments, and assemble them into a story graph: a structural summary of branching narratives that visualizes the events and activities recurring across the input photo sets.
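To make the notion of per-photo story elements concrete, here is a minimal Python sketch of one way such descriptors could be represented. The class name and fields are illustrative assumptions, not the project's actual data structures, and the computer vision detectors that would populate them are omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple

@dataclass
class StoryElements:
    """Story-related, high-level descriptors for a single photo.

    Field names are illustrative assumptions; in the actual system these
    would be populated by computer vision detectors (not shown here).
    """
    timestamp: datetime   # when the photo was taken
    location: str         # e.g. a named area on the visitor map
    # face bounding boxes as (x, y, width, height)
    faces: List[Tuple[float, float, float, float]] = field(default_factory=list)
    # detected supporting objects, e.g. "balloon", "parade float"
    objects: List[str] = field(default_factory=list)
```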
Method and Experiments

The output of our algorithm is two-fold. First, we extract story elements from all images of the photo streams using computer vision techniques. Along with two low-level image features, we define four types of story elements as story-related, high-level descriptors of images in the context of theme parks: faces, time, location, and supporting objects. Second, based on the story elements estimated over the photo streams, we infer the spatio-temporal story graph as a sparse time-varying directed graph. Its vertices correspond to dominant image clusters of the photo streams, and its edges link vertices that sequentially recur in many photo streams. Through quantitative evaluation and crowdsourcing-based user studies on Amazon Mechanical Turk, we show that story graphs serve as a more convenient mid-level data structure for photo-based recommendation tasks than alternative representations. We also present storybook-like demo examples of exploration, recommendation, and temporal analysis, which may be among the most beneficial uses of the story graphs for visitors.
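As a rough illustration of how such a graph could be assembled, the sketch below links image clusters that sequentially recur across many photo streams, using a simple support threshold to keep the edge set sparse. This is an assumed frequency-based simplification, not the paper's actual inference procedure for sparse time-varying directed graphs (the time-varying aspect, e.g. a separate graph per time slice of the day, is omitted for brevity), and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_story_graph(
    streams: List[List[int]],  # each stream: cluster ids of its photos, in time order
    min_support: int = 5,      # keep an edge only if it recurs in enough streams
) -> Dict[int, List[Tuple[int, int]]]:
    """Link image clusters that sequentially recur across many photo streams.

    Returns a sparse directed adjacency map: cluster -> [(next_cluster, support)].
    This is a frequency-thresholded simplification shown only to make the
    story-graph data structure concrete.
    """
    # (u, v) -> set of stream indices in which the transition u -> v occurs
    transition_streams = defaultdict(set)
    for s_idx, stream in enumerate(streams):
        for u, v in zip(stream, stream[1:]):
            if u != v:  # ignore consecutive photos within the same cluster
                transition_streams[(u, v)].add(s_idx)

    graph: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for (u, v), supporters in transition_streams.items():
        support = len(supporters)
        if support >= min_support:  # sparsity via a support threshold
            graph[u].append((v, support))
    return dict(graph)
```

Here a photo stream is reduced to the sequence of cluster ids of its photos, and the support count on each edge records in how many distinct streams that transition occurred, so thresholding it retains only the shared, recurring branches of the collective narrative.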