1. Title and Due Date

Image Cutout due February 24, 2006

2. Problem Statement

Lazy Snapping successfully cuts out a region of an image when provided
with good user-defind label seeds, a sufficient segmented image, and
accurate edge weights.  However, the process is currently very slow,
especially in the parts of the algorithm whose only purpose is to
extract and set up the data that graph cut needs to solve the energy
optimization problem.  Moreover, the results rely very heavily on the
quality of the user input; in this case, quality is measured by how
well the explicitly marked areas of the image represent the area and
color variety of the object we wish to cut out.

The problems we wish to solve are to make Lazy Snapping perform better
(both in terms of speed and performance), evaluate how well the
max-flow algorithm performs against other methods in returning a good
graph cut that can then be interpreted as a solution labeling for our
image, and also determine if there is a way to eliminate the need for
user interactivity (i.e. make lazy snapping a completely autonomous
process).

3. Methods

There are four distinct parts to Lazy Snapping that need to be tuned
for performance.

The first part is the image pre-segmentation.  Currently, Lazy
Snapping is using the mean shift algorithm to get segment the image.
This should be replaced with something that can return similar results
but in less time, such as the Watershed algorithm.

The second part is K-means clustering.  Currently, K-means extracts a
group of representative colors from the image by iterating through all
the pixels of the image.  This should be changed so it uses the
segmented region information instead.  Other clustering algorithms
should also be experimented with to find a way of getting the desired
information in a more timely fashion.

The third part is the setting up of the graph that max-flow will
attempt to find the optimal cut of.  This involves representing the
image as a structure of nodes with edge weights between neighboring
nodes and the two flow terminals.  There may be ways of speeding up
this process by general improvements to the code.

The fourth part pertains to the graph cut that lazy snapping uses to
find the optimal flow of the graph that was set-up in part 3.  The
underlying method that graph cut uses to accomplish this allows us to
locate regions that may not be connected to the area that was
specifically marked as being part of the cutout object.  Consequently,
if we know that we want to preserve connectivity, then there may be
much faster ways to get the optimal cut.  One idea is to use a similar
graph structure, but instead of max-flow, use a region growing
algorithm that finds the shortest path from a seed to a region (the
heuristic for the edge weights would be based on color differences).
Then use the path costs to calculate which regions are connected to a
particular node.  The benefit to doing graph cut in this way is that
it has the potential of solving the multi-cut problem as well.  In
other words, this method could be designed to handle an arbitrary
number of seeds with very little extra work; something not possible
with max-flow.

To solve the problem of requiring user input, we shall supplement the
seed data with a saliency mapping of the image.  Highly-salient
regions will be explicitly marked as foreground, and regions with low
saliency will be marked as part of the background.  It isn't known how
well this will work or what the results might be.

4. Plan

Milestone 1: Implement Watershed for pre-segmentation

Milestone 2: Improve performance of K-means clustering by either
optimizing existing code or implemented a different clustering
algorithm.

Milestone 3: Improve performance of main lazy snapping algorithm (this
includes everything not covered by Milestone 1 and 2, or part of graph
cut)

Milestone 4: Re-structure Lazy Snapping's code into multiple re-usable
parts, then add code to the image library; each part that lazy
snapping uses should be a stand-alone function that simply solves an
arbitrary version of its particular problem.

Milestone 5: Implement shortest-path region-growing graph cut
algorithm and compare results against max-flow graph cut.

Milestone 6: Implement saliency detection as a replacement for the
foreground and background seed regions.  Evaluate the results.

Milestone 7: Enhance the Lazy Snapping UI.  This might involve
implementing the second half of the Lazy Snapping paper, or creating
some other method for allowing the user to make tuned adjustments to
the cutout.

5. Evaluation Criteria

Lazy Snapping should be a relatively fast algorithm, especially when
performing a cutout on smaller images.  We'll know it's working when
performance is fast and the results are similar to those that the
authors of the Lazy Snapping paper were getting.

Lazy Snapping should also be easy to use.  It should have a simple,
yet robust interface for cutting out an object and then interactively
fine-tuning the cutout.

Each separate function/algorithm that Lazy Snapping uses should be
modular and re-usable by anyone.  The code for these methods should
also be complete and easy-to-understand.

We should be able to tell if max-flow or shortest path provides a
better way of achieving graph cut.  This should be evident in the
speed of execution, performance, and the quality of the final results.

We should be able to tell if using the saliency of an image is a
viable replacement for interactive user input.  This will be
determined by the results achieved by using saliency to define the
seed regions.

6. Deliverables

A stand-alone easy-to-use implementation of Lazy Snapping that can do
all the major things that the original implementation from the paper
can do.  Depending on the results of the shortest-path graph cut, it
might also be viable to give users the option to preserve connectivity
between regions or to use saliency instead of manual user input.

Well-structured, well-documented code modules for doing segmentation
of an image, color clustering, lazy snapping, and graph cut with an
arbitrary number of seeds (or at least an implementation that can
handle the basic 2-seed case).

7. Readings

Old readings:
   a) Lazy Snapping
   b) Interactive Graph Cuts for Optimal Boundary & Region
Segmentation of Objects

New readings:
   c) GrabCut
   d) Fast Approximate Energy Minimization via Graph Cuts
   e) Book sections on the topic of Watershed segmentation
   f) Book sections on the topic of color clustering techniques
   g) Papers on Image Saliency

8. Risks

Tuning performance can become an arbitrarily long project.  It is
important to balance between making something go really fast, and
taking too much time to build it to be that way.

Tuning the weights of the edges might require some experimentation to
find out what yields the best results.

9. Motivation

Having a way of doing fast and accurate image cutout would be a very
helpful tool for other image-related projects, such as creating
dynamic photo collages, especially the cutout process can be automated
using saliency or some other means.  Moreover, the individual tools
being used to perform the image cutout can be re-used as separate
pieces in other projects as well.  Being able to quickly segment an
image, cluster colors, or perform a fast graph cut would be useful
tools to have on hand.