1. Title and Due Date Image Cutout due February 24, 2006 2. Problem Statement Lazy Snapping successfully cuts out a region of an image when provided with good user-defind label seeds, a sufficient segmented image, and accurate edge weights. However, the process is currently very slow, especially in the parts of the algorithm whose only purpose is to extract and set up the data that graph cut needs to solve the energy optimization problem. Moreover, the results rely very heavily on the quality of the user input; in this case, quality is measured by how well the explicitly marked areas of the image represent the area and color variety of the object we wish to cut out. The problems we wish to solve are to make Lazy Snapping perform better (both in terms of speed and performance), evaluate how well the max-flow algorithm performs against other methods in returning a good graph cut that can then be interpreted as a solution labeling for our image, and also determine if there is a way to eliminate the need for user interactivity (i.e. make lazy snapping a completely autonomous process). 3. Methods There are four distinct parts to Lazy Snapping that need to be tuned for performance. The first part is the image pre-segmentation. Currently, Lazy Snapping is using the mean shift algorithm to get segment the image. This should be replaced with something that can return similar results but in less time, such as the Watershed algorithm. The second part is K-means clustering. Currently, K-means extracts a group of representative colors from the image by iterating through all the pixels of the image. This should be changed so it uses the segmented region information instead. Other clustering algorithms should also be experimented with to find a way of getting the desired information in a more timely fashion. The third part is the setting up of the graph that max-flow will attempt to find the optimal cut of. This involves representing the image as a structure of nodes with edge weights between neighboring nodes and the two flow terminals. There may be ways of speeding up this process by general improvements to the code. The fourth part pertains to the graph cut that lazy snapping uses to find the optimal flow of the graph that was set-up in part 3. The underlying method that graph cut uses to accomplish this allows us to locate regions that may not be connected to the area that was specifically marked as being part of the cutout object. Consequently, if we know that we want to preserve connectivity, then there may be much faster ways to get the optimal cut. One idea is to use a similar graph structure, but instead of max-flow, use a region growing algorithm that finds the shortest path from a seed to a region (the heuristic for the edge weights would be based on color differences). Then use the path costs to calculate which regions are connected to a particular node. The benefit to doing graph cut in this way is that it has the potential of solving the multi-cut problem as well. In other words, this method could be designed to handle an arbitrary number of seeds with very little extra work; something not possible with max-flow. To solve the problem of requiring user input, we shall supplement the seed data with a saliency mapping of the image. Highly-salient regions will be explicitly marked as foreground, and regions with low saliency will be marked as part of the background. It isn't known how well this will work or what the results might be. 4. Plan Milestone 1: Implement Watershed for pre-segmentation Milestone 2: Improve performance of K-means clustering by either optimizing existing code or implemented a different clustering algorithm. Milestone 3: Improve performance of main lazy snapping algorithm (this includes everything not covered by Milestone 1 and 2, or part of graph cut) Milestone 4: Re-structure Lazy Snapping's code into multiple re-usable parts, then add code to the image library; each part that lazy snapping uses should be a stand-alone function that simply solves an arbitrary version of its particular problem. Milestone 5: Implement shortest-path region-growing graph cut algorithm and compare results against max-flow graph cut. Milestone 6: Implement saliency detection as a replacement for the foreground and background seed regions. Evaluate the results. Milestone 7: Enhance the Lazy Snapping UI. This might involve implementing the second half of the Lazy Snapping paper, or creating some other method for allowing the user to make tuned adjustments to the cutout. 5. Evaluation Criteria Lazy Snapping should be a relatively fast algorithm, especially when performing a cutout on smaller images. We'll know it's working when performance is fast and the results are similar to those that the authors of the Lazy Snapping paper were getting. Lazy Snapping should also be easy to use. It should have a simple, yet robust interface for cutting out an object and then interactively fine-tuning the cutout. Each separate function/algorithm that Lazy Snapping uses should be modular and re-usable by anyone. The code for these methods should also be complete and easy-to-understand. We should be able to tell if max-flow or shortest path provides a better way of achieving graph cut. This should be evident in the speed of execution, performance, and the quality of the final results. We should be able to tell if using the saliency of an image is a viable replacement for interactive user input. This will be determined by the results achieved by using saliency to define the seed regions. 6. Deliverables A stand-alone easy-to-use implementation of Lazy Snapping that can do all the major things that the original implementation from the paper can do. Depending on the results of the shortest-path graph cut, it might also be viable to give users the option to preserve connectivity between regions or to use saliency instead of manual user input. Well-structured, well-documented code modules for doing segmentation of an image, color clustering, lazy snapping, and graph cut with an arbitrary number of seeds (or at least an implementation that can handle the basic 2-seed case). 7. Readings Old readings: a) Lazy Snapping b) Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects New readings: c) GrabCut d) Fast Approximate Energy Minimization via Graph Cuts e) Book sections on the topic of Watershed segmentation f) Book sections on the topic of color clustering techniques g) Papers on Image Saliency 8. Risks Tuning performance can become an arbitrarily long project. It is important to balance between making something go really fast, and taking too much time to build it to be that way. Tuning the weights of the edges might require some experimentation to find out what yields the best results. 9. Motivation Having a way of doing fast and accurate image cutout would be a very helpful tool for other image-related projects, such as creating dynamic photo collages, especially the cutout process can be automated using saliency or some other means. Moreover, the individual tools being used to perform the image cutout can be re-used as separate pieces in other projects as well. Being able to quickly segment an image, cluster colors, or perform a fast graph cut would be useful tools to have on hand.