CS 559: Computer Graphics
Fall 2001

Project 1: Picture Composer

Update: October 9, 2001: the definitions of the compositing operators were unclear. Here is a clarification:

Because Porter and Duff use the same operator name for A x B as B x A (for example "A over B" and "B over A") we run into a naming problem when we are using the second argument for the destination. Therefore, we need to distinguish between
    B = A over B 
      and
    A = A over B 
Since we always put the answer into the second operand in Project 1, we need a different mechanism for specifying the second case. The mechanism that we chose was to invent new names for the "backwards" cases. This is described in the project, but it is not clear (and there is an inconsistency).

So if you want to say A = A over B, you have to write A = B under A.

Here is a set of 12 names (Porter and Duff have only 8) so that every combination is unique. The original project description is a little ambiguous, so here is a Rosetta stone:

Porter Duff	Project 1
Clear	A Clear B
A	A first B
B	A second B
A over B	A over B
B over A	A under B
A in B	A in B
B in A	A heldin B
A out B	A out B
B out A	A heldout B
A atop B	A atop B
B atop A	A below B
A xor B	A xor B

Notice that while Porter and Duff define out to be an abbreviation for "held out" and in to be short for "held in," we treat them differently.

Due date: October 11, 2001, 5:00 pm. Late assignments accepted according to the course late policy. No late assignments accepted after 8:00am, October 15th.

Updates

10/1/01 "Different kinds of blurs" is a bit of a misnomer. What I mean is that there should be at least two different values of the parameters that give different results. For example, you might have the parameter be the frequency limit (kernel size) of a low-pass filter.

Overview

This project challenges you to build a system that allows the user to combine the various image processing methods discussed in class (and some that aren't) to make pictures. The goal is to give you an opportunity to implement some of the methods and see how they can be combined and used to produce useful effects.

Your program will read a list of instructions written in a text file in a simple format. Optionally, it may provide an interactive user interface to allow the user to specify the operations.

The operations your program will support will include reading and writing images, single image transformations (filters), image combination operations (including compositing), and some conversion operations (such as image scaling and quantizing).

Pep Talk

This is a big project. However, what potentially makes it a huge project is that it is open ended. The minimum requirements are not that difficult. However, from there, the sky is the limit. Past experience shows that implementing some of the harder extensions can be quite rewarding (both because you'll learn something, and because you'll do something cool).

Be warned, however: the points for grading in this assignment come from the basic components. In fact, if you don't meet the minimum requirements, you will get a failing grade. You could implement Photoshop or the Gimp, make something that was so good that millions of people would use it, and still get an F! More likely, you might spend three weeks trying to get some advanced feature working, only to find that you haven't left yourself much time to get the basics finished. This happened to a lot of people last year despite this warning.

In reality, this project isn't one huge project. It's a lot of little pieces that all fit together. In fact, we will even give you help in building the system in a modular way, so that you can approach it as putting a lot of little pieces together.

Because of the way grading works, it is much more important to get the basics right than to get some of the fancier features.

Be sure to think and plan BEFORE you write any code. Having a general strategy about how to tackle this project is the key to doing it. It's too big to swallow whole.

In most cases, we specify a minimum requirement. Our hope is that everyone will do more than the minimum. Comments on grading are at the end.

Ground Rules

Course policy permits you to use libraries and code snippets from various places, provided that they can be made available to us for testing your program, and that you give proper attribution to them. For this assignment, however, there is an exception: you must write all of the image processing and manipulation routines yourself. We encourage you to use a library to read and write images, and will even provide you with some sample code to help you get the "core" of the program started. However, you must write all the imaging pieces yourself.

If you are in doubt whether or not something is legal, please ask.

Your program MUST understand the text files that describe imaging operations. We will use these types of files to test your program. We will provide some sample files to test your program out.

Your program should not crash. If the user requests something that you didn't implement, put up an error message. Crashing is worse than an error message, even when it comes to grading.

Your program must read and write Targa (tga) files. You need only support the variants of TGA supported by Alex Mohr's libtarga. We will check to make sure that this library can read all of our test images. Therefore, if you use libtarga, your program will read our images. If you write your own Targa reader, you're on your own.

We will provide you with some test images for your program, but you will probably want to make more.

The Basics: What is this thing anyway?

At the most basic level, you will make a program that takes a single (optional) command line argument that is the name of a text file. Each line of the file has an imaging command on it. Your program reads each line, does whatever the command says, and then moves on to the next line. This goes on until the program either encounters a "STOP" command, or the end of the file. At this point, your program should terminate.

In the event that no command line argument is given (which is what happens if you press "run" in Visual Studio), your program must allow the user to interactively do everything that a script can do. You might provide a nice, fancy, direct manipulation user interface. A much simpler way to do this is to simply use the standard input as the script file, allowing the user to type script commands. While this may not be the fanciest of user interfaces, it does get the job done. Some thoughts on this later.

If you do implement the text file interface, your program will have output as well because you will have to implement the required SHOW command.

You can think of the text files as being programs in a very simple programming language. Ideally, your program should not be case sensitive; however, all of the test scripts we will use with your program will consist of lower case letters only.

Each command manipulates images that are stored in image variables. All variables in the files will have single letter names. Our test scripts will only use variables A-E (at most all 5 of them).
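The read-a-line, run-it, move-on loop described above can be sketched roughly as follows. This is only one possible structure, not the sample solution: the names tokenize and runScript and the handler callback are made up for illustration, and the handler stands in for whatever code actually performs each imaging command.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Split one script line into words. Any run of whitespace separates words,
// so extra spaces are tolerated. Only the command word is lower-cased,
// so that file names keep their original spelling.
std::vector<std::string> tokenize(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    if (!words.empty()) {
        std::transform(words[0].begin(), words[0].end(), words[0].begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    }
    return words;
}

// Hypothetical driver: feeds each line to `handler` until "stop" or end of
// input. The handler returns false for anything it cannot do; that produces
// an error message rather than a crash. Returns the number of commands run.
int runScript(std::istream& script,
              const std::function<bool(const std::vector<std::string>&)>& handler,
              std::ostream& errors) {
    int executed = 0;
    std::string line;
    while (std::getline(script, line)) {
        std::vector<std::string> words = tokenize(line);
        if (words.empty()) continue;      // skip blank lines
        if (words[0] == "stop") break;    // everything after STOP is ignored
        if (handler(words)) {
            ++executed;
        } else {
            errors << "error: cannot run command '" << words[0] << "'\n";
        }
    }
    return executed;
}
```

Because runScript takes any std::istream, the same loop can be handed a file stream when a script name is given on the command line, or std::cin when it is not.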

Here is a sample script:

read a p:/course/cs559-gleicher/public/TGAs/i1.tga
blur a 3
crop a 10 10 50 50
read b p:/course/cs559-gleicher/public/TGAs/i1.tga
composite a over b 25 25
show b
write b sample.tga
stop

Notice that each line begins with the name of a command. The second word on each line is the name of the variable that receives the result of the command. The rest of the line has arguments for the command. Words are separated by spaces, and we will only test your programs with files that contain single spaces between words (that's not to say that your program shouldn't be more robust, we just won't test it).

What this script does is read an image into variable A, blur it, and crop it to a 40x40 square. Then it reads in another image, and performs a compositing operation mixing it with the first image. The resulting image is shown to the user, and then written out to a file.

Some things to notice:

Each command (with the exception of stop) has a variable that it operates on. The operation happens "in place." For example, in line 3, a gets replaced by the smaller version of itself. Be careful when implementing this so that you remember to get rid of the old one, but not until you're done with it.
The two argument commands (like composite) place their result in one of their operands. Think of them as functioning like "+=" in C or C++. For compositing, the result goes into the second image.
All positions are written relative to the upper left corner of the image. So 0,0 is the upper left corner, and 10,20 is 20 pixels down and 10 pixels to the right.

Where this becomes interesting is with interesting commands. Basically, the sky's the limit. We'll give you some minimums, and some suggestions for things to try, but you can be creative and make new and interesting imaging operations to make cool pictures.

Your grade will depend on how many commands you implement, how difficult they are, and how well they are done. You don't get any points for optional parts unless you do the required parts. So, while it is better to implement impressionist painting than image dimming, it is more important to get blurring working first. Doing simple things well is also important: for example, a "half" operation that does proper filtering is better than one that doesn't, but one that doesn't do proper filtering is better than one that doesn't work at all.

You must also be ready to show off what you've done: if you implement a command, you should have an example to show how cool it is.

Basic Commands

The following basic commands are required.

STOP: Stops the program. All commands after the STOP should be ignored. Your program must also stop on the end of file in a script.
READ v filename: (where v is a single letter variable name, and filename is a string that is the name of a file, always ending in ".tga")
WRITE v filename: (where v is a single letter variable name, and filename is a string that is the name of a file, always ending in ".tga")
SHOW v: (where v is a single letter) - this shows the image to the viewer. In a script, the program must wait for a user response before continuing. The easiest way to do this is with a modal dialog box requiring the user to click.
SOLID v x y r g b a: (where v is a single letter variable name, and x y r g b a are all integers. r g b and a are in the range 0->255). This creates a new "blank" image that is a solid color. The size of the image is x pixels across and y pixels down. The image is filled with the color r,g,b,a.
COPY u v: (where u and v are single letter variable names) - Makes a copy of image u into variable v. Note, it should actually make a copy since one of the images might be changed. Also, the order is significant, this command copies u into v.
HELP: This should provide a list of the commands that your program supports.

Single Image Operations

All of these operations destructively affect a single image. The image in a variable is replaced by the result of the operation.

There are 7 required single image operations. You must implement at least 3 image operations in addition to these. You may use your creativity to create ones not on the list, however they should be non-trivial, and preferably interesting. If you're curious as to whether your idea is good enough, feel free to ask.

HALF v: Reduces the image by a factor of 2. Doing good sampling is a valuable feature for this. You should look at my notes on image scaling for some hints on how to do this.
It is unclear whether it is correct to round sizes up or down when you halve the size of an odd-sized image. Whatever you do, you should do it correctly.
DOUBLE v: Enlarges the image by a factor of 2. Better versions might do something other than pixel doubling. See my notes for some hints.
CROP v x y w h: Chops the image to be a smaller size. If any of the cropping rectangle is out of bounds, you should clip it to the size of the original image.
BLUR v n: (where n is an integer) This blurs the image using some kind of low-pass filtering. The extra parameter says how much. It is up to you to decide how to interpret the parameter, however, there must be at least 2 different blurs possible.
Note: "blurring" could mean a lot of things. The blur command should at least do a low-pass filtering. You might implement other blurring operations (such as simulated motion blur) as one of the non-required operations.
DARKEN v a: (where v is a variable name, and a is a floating point number). Multiply all values in v by a. Be careful about pixel values that go out of range!
DISSOLVE v a: (where v is a variable name, and a is a floating point number). Multiply all alpha values in v by a. Be careful about pixel values that go out of range! This is called FADE in Foley and van Dam.
BW v: Convert image v to black and white. Doing this well would use an error diffusion algorithm; a basic version would use simple thresholding.
It is unclear what alpha means in "black-and-whitification," especially with pre-multiplied alpha. We will only test your BW command using an image with opacities of 100% (alpha=255), but you should make your program do something reasonable.
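As an illustration of two of the points above (keeping values in range, and filtering when halving), here is a minimal sketch. It assumes a made-up Image type holding 8-bit RGBA with premultiplied alpha, rounds odd sizes down, and uses a 2x2 box average as the low-pass filter. All of these are choices made for the example, not requirements, and better filters are possible.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image type for illustration: 8-bit RGBA, row-major,
// 4 bytes per pixel, all bytes initially zero.
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h * 4, 0) {}

    std::uint8_t& at(int x, int y, int c) {
        return data[(static_cast<std::size_t>(y) * width + x) * 4 + c];
    }
    std::uint8_t at(int x, int y, int c) const {
        return data[(static_cast<std::size_t>(y) * width + x) * 4 + c];
    }
};

// Multiply an 8-bit value by a factor, rounding to nearest and clamping to
// [0, 255]. Helper of the kind DARKEN and DISSOLVE need to stay in range.
std::uint8_t scaleClamp(std::uint8_t value, double factor) {
    double scaled = value * factor + 0.5;
    if (scaled < 0.0) scaled = 0.0;
    if (scaled > 255.0) scaled = 255.0;
    return static_cast<std::uint8_t>(scaled);
}

// Halve an image by averaging each 2x2 block (a box filter).
// Odd sizes are rounded down here, so the last row/column is dropped.
Image halveImage(const Image& src) {
    Image dst(src.width / 2, src.height / 2);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            for (int c = 0; c < 4; ++c) {
                int sum = src.at(2 * x, 2 * y, c) + src.at(2 * x + 1, 2 * y, c) +
                          src.at(2 * x, 2 * y + 1, c) + src.at(2 * x + 1, 2 * y + 1, c);
                dst.at(x, y, c) = static_cast<std::uint8_t>((sum + 2) / 4);  // rounded mean
            }
        }
    }
    return dst;
}
```

Note that halveImage returns a new image rather than shrinking its input in place; the caller replaces the variable's old image once the new one is complete.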

There are many possible options for your additional operations. Here are some suggestions. In each case, you get to choose how the command works (for example, what the parameters are). Remember that you have to document all of your commands!

SCALE: Scaling an image up by an integer multiple is easy. Scaling an image down by an integer multiple is harder. Scaling by an arbitrary amount (especially if you do sampling correctly) is difficult. Doing non-uniform scaling with correct sampling is very difficult.
ROTATE: Rotating an image by a multiple of 90 degrees is easy. Rotating by an arbitrary angle is difficult to do well, especially if you want to do sampling correctly.
PAINT: Convert an image into a painting using a "painterly" technique, as introduced by Haeberli (who was a UW alum!). To try this out, look at the online Java demo. To do this as a command, you should have the program randomly make enough brush strokes to cover the image.
The original paper about the technique (which is quite a fun paper) is in the reader.
There are many possible "painterly" techniques.
Use your imagination to try to invent some transformations that make interesting images! Here's an image done by a simple Oil-painting algorithm:

EDGEDETECT: Run an edge detector to find edges in the image. There are many methods for doing this.
SHARPEN: Run a filter that makes an image appear sharper (sort of the opposite of blur).
WARP: Do some form of geometric distortion on an image. Swirl, twist, bend, ...

Matte Generation Commands

For images that do not have an alpha channel, you must provide some methods for computing one. Note that most of these will take some number of parameters to tune the algorithm. You must implement at least one of these:

LUMAMATTE v a: All pixels with brightness of value a are changed so that their alpha value is 0 (transparent).
CHROMAMATTE v r g b p1 p2 p3 ...: All pixels with color r,g,b are made transparent. Variants of this command tolerate some difference from the specified color.
BLUESCREEN v p1 p2 p3 ...: Do a bluescreening operation. This differs from Chroma-matting as it tries to find in-between values of alpha to make soft edges, deal with partially transparent objects, ... If you're interested in trying this, there is a paper describing some of the basic techniques.
DIFFERENCE v w p1 ...: Do a difference matting operation (modifying the alpha channel of v to be opaque only if a pixel differs between v and w)
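To make the simplest of these concrete, here is a rough sketch of a luma matte. The Pixel type, the integer brightness formula (an approximation of the common 0.299/0.587/0.114 luma weights), and the extra tolerance parameter are all assumptions made for the example. The command description above only fixes the brightness value a.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical pixel type: 8-bit channels, premultiplied alpha.
struct Pixel {
    std::uint8_t r, g, b, a;
};

// Integer approximation of luma, rounded to nearest, in the range 0..255.
int brightness(const Pixel& p) {
    return (299 * p.r + 587 * p.g + 114 * p.b + 500) / 1000;
}

// Make every pixel whose brightness is within `tolerance` of `target`
// fully transparent. Returns how many pixels were changed.
int lumaMatte(std::vector<Pixel>& pixels, int target, int tolerance) {
    int changed = 0;
    for (Pixel& p : pixels) {
        if (std::abs(brightness(p) - target) <= tolerance) {
            // With premultiplied alpha, a transparent pixel carries no color.
            p = Pixel{0, 0, 0, 0};
            ++changed;
        }
    }
    return changed;
}
```

A tolerance of 0 gives the exact-match behavior described for LUMAMATTE; larger values are one example of the kind of tuning parameter mentioned above.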

COMPOSITING

You must implement the COMPOSITE command that composites two images. The form of the command is:

composite u operator v x y: Composite image u onto image v, and place the result in image v.
The upper left corner of u (0,0) "goes" to position x,y in v. Only the region of v "covered" by u is affected.
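One way to find the "covered" region is to intersect u's placed rectangle with v's bounds. The sketch below assumes half-open rectangles in v's coordinates. The Rect type and the coveredRegion name are invented for the example.

```cpp
#include <algorithm>

// Half-open rectangle in v's coordinates: x0 <= x < x1, y0 <= y < y1.
struct Rect {
    int x0, y0, x1, y1;
    bool empty() const { return x0 >= x1 || y0 >= y1; }
};

// Region of v covered by u when u's upper left corner is placed at (x, y).
// A pixel (vx, vy) in the region corresponds to pixel (vx - x, vy - y) in u.
Rect coveredRegion(int uWidth, int uHeight, int vWidth, int vHeight, int x, int y) {
    Rect r;
    r.x0 = std::max(0, x);
    r.y0 = std::max(0, y);
    r.x1 = std::min(vWidth, x + uWidth);
    r.y1 = std::min(vHeight, y + uHeight);
    return r;
}
```

If the result is empty, u lies entirely outside v and the command has nothing to do. Looping only over the returned rectangle also keeps the compositing code from reading or writing out of bounds.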

UPDATE: this original description was confusing and self-contradictory. See the update at the beginning of the document.

Operator is any one of the Porter and Duff Imaging operators (listed on p 838 of FvD). Because we have no way to reverse the order of operations (it is always u op v), we need to introduce some new terms: under (which is the reverse of over), heldout (which is the reverse of out), heldin (which is the reverse of in), and below (the reverse of atop).

A lot of this naming confusion comes from my initial use of the term "opposite" (what does that mean), and the fact that Foley and van Dam don't use the terms "held in" or "out" (they do use "held out" and "in"). Since we didn't use the Foley and van Dam book this year, I should have stayed truer to the Porter and Duff descriptions.

The 12 operator "names" are: clear, first, second, over, under, in, out, heldout, heldin, atop, below, xor. Notice that since we always give the destination second, we cannot reverse the order of operands and needed to introduce new names. So, B over A is the same as A under B except in the former the result goes into A, and in the latter, the result goes in B.
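The 12 names can be tied back to Porter and Duff's formulation, in which each result channel is Fu * u + Fv * v for premultiplied colors. The sketch below is one reading of the naming table in the update at the top of this page, with u as the first operand and v as the second (destination) operand. The function name and the use of doubles in the range 0 to 1 for alpha are choices made for the example.

```cpp
#include <string>

// Look up the Porter-Duff fractions for a Project 1 operator name.
// au and av are the alphas of the u and v pixels, in [0, 1].
// On success, sets fu and fv and returns true; each output channel is then
//   result = fu * u + fv * v   (premultiplied values, alpha included).
// Returns false for an unknown name so the caller can print an error.
bool fractions(const std::string& op, double au, double av, double& fu, double& fv) {
    if      (op == "clear")   { fu = 0.0;      fv = 0.0;      }
    else if (op == "first")   { fu = 1.0;      fv = 0.0;      }  // PD: A
    else if (op == "second")  { fu = 0.0;      fv = 1.0;      }  // PD: B
    else if (op == "over")    { fu = 1.0;      fv = 1.0 - au; }  // PD: A over B
    else if (op == "under")   { fu = 1.0 - av; fv = 1.0;      }  // PD: B over A
    else if (op == "in")      { fu = av;       fv = 0.0;      }  // PD: A in B
    else if (op == "heldin")  { fu = 0.0;      fv = au;       }  // PD: B in A
    else if (op == "out")     { fu = 1.0 - av; fv = 0.0;      }  // PD: A out B
    else if (op == "heldout") { fu = 0.0;      fv = 1.0 - au; }  // PD: B out A
    else if (op == "atop")    { fu = av;       fv = 1.0 - au; }  // PD: A atop B
    else if (op == "below")   { fu = 1.0 - av; fv = au;       }  // PD: B atop A
    else if (op == "xor")     { fu = 1.0 - av; fv = 1.0 - au; }
    else return false;
    return true;
}
```

With 8-bit channels, au and av would be the stored alpha divided by 255, and the blended result needs rounding back to an integer.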

First and second might seem a little funny - they correspond to the A and B operators from Porter and Duff or the text. We chose to call them first and second to avoid confusing situations like:

composite b a a 10 10

which would actually give image b in the region.

composite b first a 10 10

which specifies that we want the first of the two arguments seems less ambiguous.

You should also implement the plus operator in a similar way. You may either make it another compositing operator, or its own command - just be sure to document what you did. We chose to do it as a compositing operator.

composite b plus a 10 10

Be careful: plus differs from the other compositing operators in that you need to worry about overflow.
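A minimal sketch of the overflow concern, assuming 8-bit channels and that saturating at 255 is the desired behavior (plusChannel is a name made up for the example):

```cpp
#include <cstdint>

// Add two 8-bit channel values, saturating at 255. The sum is formed in an
// int; storing it straight back into 8 bits would wrap around instead
// (200 + 100 would come out as 44).
std::uint8_t plusChannel(std::uint8_t u, std::uint8_t v) {
    int sum = static_cast<int>(u) + static_cast<int>(v);
    return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}
```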

The minimum UI

The key to doing well on this project (in terms of a grade, and probably in terms of learning the most about the imaging operations) is to do a minimal amount of user interface, and focus your energies on the imaging operation.

The minimal interface is quite minimal! You need to do the right thing about the command line arguments, read scripts from the console in addition to a file, and provide a "Show" command that puts up a window and waits for the user. This isn't very hard. In fact, we'll even give you a skeleton driver that shows you how to do this. (it doesn't do any imaging operations - you need to change the windows so they show images, etc)

Samples

We will provide you with plenty of sample images and scripts, as well as a complete working assignment to try out (executable only). This will happen real soon.

A directory of sample images has been placed in
p:/course/cs559-gleicher/public/TGAs

Having your UI and getting a grade too...

If you want to make a nice UI, but still get the scripting right, here's a hint: make a system that does the scripting right, and then have the UI send script commands to the system in response to user actions.

Building a system this way has a number of advantages. For one, you can test the imaging and script interpreting independently from the interface. You can get the minimal UI working (to ensure you get a good grade) and have that to fall back on. There are also nice properties of a system constructed like this: you can record the user actions for later playback, there is separation of UI and "engine", ...
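The idea can be sketched as follows. Engine and onCropDragged are hypothetical names, and the engine here only records commands. A real one would also parse and run them.

```cpp
#include <string>
#include <vector>

// Hypothetical engine: the single entry point is a line of script text.
class Engine {
public:
    // Runs one script line and remembers it so a session can be replayed.
    void execute(const std::string& command) {
        history_.push_back(command);
        // ... parse and run the command here ...
    }
    const std::vector<std::string>& history() const { return history_; }

private:
    std::vector<std::string> history_;
};

// Hypothetical GUI callback: the user dragged out a crop rectangle.
// The UI does no imaging itself; it just builds the equivalent script line.
void onCropDragged(Engine& engine, char variable, int x, int y, int w, int h) {
    engine.execute("crop " + std::string(1, variable) + " " +
                   std::to_string(x) + " " + std::to_string(y) + " " +
                   std::to_string(w) + " " + std::to_string(h));
}
```

Writing the recorded history to a text file yields a script that reproduces the interactive session.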

Some thoughts...

Two years ago, we placed an emphasis on the interactive UI for the system. To get a good grade, you needed to do a direct manipulation UI such that everything could be done with only the mouse. The good news was that we got to see a lot of very clever user interfaces. The bad news was that people often spent a lot of time on their interfaces, and not a lot of time on the imaging operations.

This year, we are specifically trying to place a greater emphasis on imaging operations. It will be possible to get an A implementing only the text file interface (and allowing the user to type text into the program when no command line argument is given). It will not be possible to get a decent grade if you don't do a good job on the imaging aspects, no matter how nice the UI is.

For the purposes of this assignment, it is sufficient to do all operations on 8 bit unsigned values for each color channel (and alpha). Doing things in a linearized intensity space is a lot of extra work, and probably isn't worth the effort given how uncalibrated all of the other aspects of the system are.

How will we do grading?

The majority of grading will be done by an interactive demo session. In this demo session, we will require you to build your program and show it off to us. In order to test the basic functionality of your system, we will give you some scripts to run, and we will look at the results in order to make sure your imaging algorithms work. A large portion of your grade (maybe the largest) comes from being able to run these basic scripts correctly.

But What are the Point Values?

If you do a great job, you'll get a great grade! What else do you need to know?

OK, realistically, you have a finite amount of time and want to know where to best direct your efforts since you won't be able to do everything you want.

In previous years, it has been less about "this many points for this operation" and more "if you got this far, you get this grade." Of course, since everyone does something different, this is a little hard to measure. There are some general guidelines:

It is more important to get basic things working than to get advanced features working. The majority of your grade comes from having the basic things working correctly. You can't get points for the fancy stuff if the basic stuff doesn't work.
It is more important for something to be correct than to be fancy. A very fancy imaging operator is worth almost nothing if it doesn't work.
Understanding what you've done is important. At your demo, we will ask you to show us that you know how to use your program. Your descriptions of what you've done (documentation) are important to let us know what you have done.
We will look at your code, although we will not read it all.
Crashing is bad. Have your program check for error cases and print error messages if there are illegal inputs. Don't crash.

So, roughly speaking, grading works as:

Signs of Life	Assignment turned in properly (project files, documentation, sample pictures). Program reads, writes, and displays images. Some imaging operations work.
Better	All of the above PLUS: All required pieces in place. Most imaging operators working. Able to show that some things work. Good documentation.
Good	All of the above PLUS: All required pieces work. Simple versions of all commands. Basic (text) user interface. Simple choices for optional commands. Able to use system.
Great	All of the above PLUS: All required pieces work well (more "correct" versions of the implementations). Some challenging optional commands implemented.
Excellent!	All of the above PLUS: The required optional commands (at least 3) are all challenging and work well. Some user interface beyond simple text.

There's some opportunity for "or" here. If you do a simple version of a required command (say, simple binary thresholding for the BW command), but do 3 totally amazing optional commands, you still might get an "Excellent" grade.

Also, from past experience, these categories roughly correspond to grades (C->A). I am happy to give everyone in the class an A, if everyone does excellent work.

The Art Assignment

In addition to your code and text files, you are to turn in 2 pictures made with your system. You should turn in the corresponding scripts that generate each one (although we will not be able to try the script since we won't have the source images). If you create your picture using your program's GUI and did not record the commands, please give a detailed description of the steps you took.

The first image is just something that shows off how nice you can make things. Just make something nice and artistic. No larger than 400x300 pixels. You should call this image art-login.tga (where login is your login). You should also have a file called art-login.txt containing the script.

The second image must be one that has a picture of you either with someone famous that you have never met, or in someplace recognizable that you have never been. Get a picture off the web, get a picture of you from someplace, and put the two together with your program. To make this easier, we will take pictures of people in front of a bluescreen backdrop one day after class, in case you want to try doing automatic bluescreening.

You should turn in an image called me-login.tga (where login is your login), a corresponding me-login.txt, and a file called me-readme.txt explaining where you got the picture and any tricks you used (for example, you might want to manually paint the alpha channel using Photoshop if you don't implement bluescreening).

Artistic merit will not influence your grade. However, to inspire you to do something creative, we will be having an Art Show to show off the wonderful works you create. This show will be complete with a jury and prizes! (don't expect a big prize, but there will be something)

What will you hand in?

In addition to your program files, you will also turn in documentation, and the 2 sample pictures made with your system. We will give you specific pictures to make with your system. With each one, you will need to turn in the script that creates the picture as well.

Under no circumstances should your hand-in directory be larger than 2MB. We will check. You should check too. If you feel compelled to hand-in more stuff than this (for example, a large amount of fabulous artwork), you will need to make other arrangements.

Review of the Requirements

A Checklist of the required parts:

A project handed in correctly (with all required files, subject to course policies). Remember, this includes documentation.
The 2 pictures for the art assignment. Be sure to name them correctly and to have the scripts.
The seven "basic" commands
The seven required single image operators
At least 3 optional image operators of your own choosing
A matte generation command
The composite command (with 12 Porter/Duff operators and the PLUS operator)

Some Hints (from the Spring 2000 mailing list)

Someone asked this question, and I thought that I should give the same
set of hints to everybody. I hope this person doesn't mind me
broadcasting their question.

At 08:37 PM 10/4/00 -0500, someone wrote:
> Hello Professor,
>
> In the project description you state: "In reality, this project isn't
> one huge project. It's a lot of little pieces that all fit
> together. In fact, we will even give you help in building the system
> in a modular way, so that you can approach it as putting a lot of
> little pieces together."
> 
> I'm just wondering what you mean by "we will even give you help in
> building the system in a modular way."? Are you still planning on
> providing this help? I can obviously proceed on the project without
> this help, but I would be glad to have it. Even after all of these
> years at UW doing modular programming, I still suck at getting started
> with it and developing a good model.

We decided not to give any more code than the example skeleton
itself. This was the subject of quite a debate between Mark and I, and
we decided that giving more code would probably confuse people more
than it would help. I should probably update the assignment page to
make it clearer that this is all the code we will provide.

However, it should be obvious from how the assignment is written how
to break it into smaller chunks. Each of the operations is an
independent thing, once you get the basic framework in place. Give
some thought to this framework, and try to structure your code so that
you can work on each imaging operation independently. The way that
Mark and my sample solution works, each of the imaging operations can
go into a separate file, and adding a new command simply means adding
a new file to our project. (keeping the files independent was
important to us - since 2 of us were working together, and we
sacrificed some code cleanliness for it).

I made the design decision (for our sample solution) to have the
individual commands do some of the work of parsing. This may not have
been the best approach since it leads to redundancy. However, the
assignment was specifically designed to make command parsing as easy
as possible.

To get started, you want to consider what the core data structures for
this system are, and to decide how you will do command processing. If
you want to build a GUI interface, I recommend that you build your
command processor first, and then have the GUI communicate with the
"engine" by issuing text commands, just as if they were typed to the
console - if nothing else, you can always fall back to the "type-in
commands" UI.

Copyright (C) 2001 by Michael Gleicher		Last modified: 19:10 Nov 15, 2001