Abstract

Video segmentation is an application of computer vision aimed at automating the extraction of an object from a series of video frames. However, it is a difficult problem, especially at real-time, interactive rates. Although general application to video is difficult because of the wide range of image scenarios, user interaction can help to reduce the problem space and speed up the computation. This thesis presents a fast object-tracking tool that selects an object from a series of frames based on minimal user input. Our Intelligent Rotoscoping tool aims for increased speed and accuracy over other video segmentation tools, while maintaining reproducibility of results. For speed, the tool stays ahead of the user in selecting frames and responding to feedback. For accuracy, it interprets user input such that the user does not have to edit every frame. For reproducibility, it maintains results across multiple iterations.

These goals are realized through the following process. After making a selection in a single frame, the user watches the initial selection propagate rapidly, providing minor nudges where the selection misses its mark. This allows the user to “mold” the selection in certain frames while the tool propagates the fixes to neighboring frames. The tool has a simple interface, minimal preprocessing, and minimal user input. It accepts any sort of film and exploits the spatiotemporal coherence of the object to be segmented, allowing artistic freedom without demanding intensive sequential processing.

This thesis includes three specific extensions to Intelligent Scissors for application to video:

1. Leapfrogging, a robust method to propagate a user's single-frame selection over multiple frames by snapping each selection to its neighboring frame.

2. Histogram snapping, a method for training each frame's cost map based on previous user selections by measuring proximity to pixels in a training set and snapping to the most similar pixel's cost.

3. A real-time feedback and correction loop that provides an intuitive interface for the user to watch and control the selection propagation; the algorithm uses this input to update the training data.
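
To make the histogram-snapping idea concrete, the Python sketch below shows one plausible reading of it: each pixel in a new frame is assigned the cost of the most similar pixel in a training set gathered from earlier user selections, so boundaries that resemble earlier selections become cheap for the live-wire search. The function name, array shapes, and the use of plain color distance are illustrative assumptions, not the thesis implementation.

    import numpy as np

    def histogram_snap_costs(frame_pixels, trained_pixels, trained_costs):
        """Hypothetical sketch of cost-map training by snapping.

        frame_pixels   : (H, W, 3) array of the new frame's pixel colors.
        trained_pixels : (N, 3) array of colors from previously selected
                         boundary pixels (the training set).
        trained_costs  : (N,) array of the costs assigned to those pixels.

        Returns an (H, W) cost map in which every pixel takes the cost of
        its most similar training pixel.
        """
        h, w, _ = frame_pixels.shape
        flat = frame_pixels.reshape(-1, 1, 3).astype(float)   # (H*W, 1, 3)
        train = trained_pixels.reshape(1, -1, 3).astype(float)  # (1, N, 3)

        # Color distance from every frame pixel to every training pixel.
        dist = np.linalg.norm(flat - train, axis=2)            # (H*W, N)

        # Snap each pixel to the cost of its nearest training pixel.
        nearest = dist.argmin(axis=1)                          # (H*W,)
        return trained_costs[nearest].reshape(h, w)

In an Intelligent-Scissors-style pipeline, a map like this would be combined with the usual gradient-based local costs before the shortest-path search; that combination step is outside the scope of this sketch.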

Degree

MS

College and Department

Physical and Mathematical Sciences; Computer Science

Rights

http://lib.byu.edu/about/copyright/

Date Submitted

2007-06-13

Document Type

Thesis

Handle

http://hdl.lib.byu.edu/1877/etd1877

Keywords

video segmentation, tracking, boundary, selection, rotoscope, rotoscoping, training

Language

English
