CS 559: Computer
Project 1: Picture Composer
Update: October 9, 2001: the definitions of the compositing operators were unclear; here's a clarification:
Due date: October 11, 2001, 5:00 pm. Late assignments accepted according to the course late policy. No late assignments accepted after 8:00am, October 15th.
Updates 10/1/01: "Different kinds of blurs" is a bit of a misnomer. What I mean is that there should be at least two different values of the parameters that give different results. For example, you might have the parameter be the frequency limit (kernel size) of a low-pass filter.
This project challenges you to build a system that allows the user to combine the various image processing methods discussed in class (and some that aren't) to make pictures. The goal is to give you an opportunity to implement some of the methods and see how they can be combined and used to produce useful effects.
Your program will read a list of instructions written in a text file in a simple format. Optionally, it may provide an interactive user interface to allow the user to specify the operations.
The operations your program will support will include reading and writing images, single image transformations (filters), image combination operations (including compositing), and some conversion operations (such as image scaling and quantizing).
This is a big project. However, what potentially makes it a huge project is that it is open ended. The minimum requirements are not that difficult. However, from there, the sky is the limit. Past experience shows that implementing some of the harder extensions can be quite rewarding (both because you'll learn something, and because you'll do something cool).
Be warned, however: the points for grading in this assignment come from the basic components. In fact, if you don't meet the minimum requirements, you will get a failing grade. You could implement Photoshop or the Gimp, make something that was so good that millions of people would use it, and still get an F! More likely, you might spend three weeks trying to get some advanced feature working, only to find that you haven't left yourself much time to get the basics finished. This happened to a lot of people last year despite this warning.
In reality, this project isn't one huge project. It's a lot of little pieces that all fit together. In fact, we will even give you help in building the system in a modular way, so that you can approach it as putting a lot of little pieces together.
Because of the way grading works, it is much more important to get the basics right than to get some of the fancier features.
Be sure to think and plan BEFORE you write any code. Having a general strategy about how to tackle this project is the key to doing it. It's too big to swallow whole.
In most cases, we specify a minimum requirement. Our hope is that everyone will do more than the minimum. Comments on grading are at the end.
Course policy permits you to use libraries and code snippets from various places, providing that they can be made available to us for testing your program, and that you give proper attribution to them. For this assignment, however, there is an exception: you must write all of the image processing and manipulation routines yourself. We encourage you to use a library to read and write images, and will even provide you with some sample code to help you get the "core" of the program started. However, all the imaging pieces you must write yourself.
If you are in doubt whether or not something is legal, please ask.
Your program MUST understand the text files that describe imaging operations. We will use these types of files to test your program. We will provide some sample files to test your program out.
Your program should not crash. If the user requests something that you didn't implement, put up an error message. Crashing is worse than an error message, even when it comes to grading.
Your program must read and write Targa (tga) files. You need only support the variants of TGA supported by Alex Mohr's libtarga. We will check to make sure that this library can read all of our test images. Therefore, if you use libtarga, your program will read our images. If you write your own Targa reader, you're on your own.
We will provide you with some test images for your program, but you will probably want to make more.
The Basics: What is this thing anyway?
At the most basic level, you will make a program that takes a single (optional) command line argument that is the name of a text file. Each line of the file has an imaging command on it. Your program reads each line, does whatever the command says, and then moves on to the next line. This goes on until the program either encounters a "STOP" command, or the end of the file. At this point, your program should terminate.
In the event that no command line argument is given (which is what happens if you press "run" in Visual Studio), your program must allow the user to interactively do everything that a script can do. You might provide a nice, fancy, direct manipulation user interface. A much simpler way to do this is to simply use the standard input as the script file, allowing the user to type script commands. While this may not be the fanciest of user interfaces, it does get the job done. Some thoughts on this later.
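A minimal sketch of the "standard input as the script file" approach, written in Python for brevity. The function name `open_script` is hypothetical and not part of any provided skeleton; it simply picks the script file when one is named and falls back to the console otherwise.

```python
import sys

def open_script(argv):
    """Return a source of script lines.

    argv is the full argument list (program name first). If a script
    file is named, open it; otherwise read commands typed on stdin.
    """
    if len(argv) > 1:
        return open(argv[1], "r")
    return sys.stdin
```

Because both a file and standard input can be read line by line, the rest of the program does not need to know which one it was given.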
If you do implement the text file interface, your program will have output as well because you will have to implement the required SHOW command.
You can think of the text files as being programs in a very simple programming language. Ideally, your program should not be case sensitive; however, all of the test scripts we will use with your program will consist of lower case letters only.
Each command manipulates images that are stored in image variables. All variables in the files will have single letter names. Our test scripts will only use variables A-E (at most all 5 of them).
Here is a sample script:
read a p:/course/cs559-gleicher/public/TGAs/i1.tga
a 3
crop a 10 10 50 50
read b p:/course/cs559-gleicher/public/TGAs/i1.tga
composite a over b 25 25
show b
write b sample.tga
stop
Notice that each line begins with the name of a command. The second word on each line is the name of the variable that receives the result of the command. The rest of the line has arguments for the command. Words are separated by spaces, and we will only test your programs with files that contain single spaces between words (that's not to say that your program shouldn't be more robust; we just won't test it).
What this script does is read an image into variable A, modify it, and crop it to a 40x40 square. Then it reads in another image, and performs a compositing operation mixing it with the first image. This image is shown to the user, and then written out to a file.
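The read-a-line, run-a-command loop described above can be sketched as follows, in Python for brevity. This is only one possible structure, not the provided skeleton: `run_script`, the `commands` dictionary, and the handler signature `(variables, target, args)` are all assumptions made for illustration.

```python
def run_script(lines, commands):
    """Run each script line until "stop" or the end of the input.

    lines:    any iterable of text lines (a file, stdin, or a list).
    commands: dict mapping a lower-case command name to a function
              taking (variables, target, args).
    Returns the dict of image variables so callers can inspect it.
    """
    variables = {}
    for raw in lines:
        words = raw.split()          # also tolerates repeated spaces
        if not words:
            continue                 # skip blank lines
        name = words[0].lower()      # command names are not case sensitive
        if name == "stop":
            break
        handler = commands.get(name)
        if handler is None or len(words) < 2:
            print("error: cannot run '%s'" % raw.strip())
            continue                 # report the problem, do not crash
        try:
            handler(variables, words[1].lower(), words[2:])
        except Exception as exc:     # a failing command must not kill the program
            print("error in '%s': %s" % (raw.strip(), exc))
    return variables
```

Unknown commands and failing commands both produce an error message and the loop carries on, which matches the requirement that the program should not crash.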
Some things to notice:
Where this becomes interesting is with interesting commands. Basically, the sky's the limit. We'll give you some minimums, and some suggestions for things to try, but you can be creative and make new and interesting imaging operations to make cool pictures.
Your grade will depend on how many commands you implement, how difficult they are, and how well they are done. You don't get any points for optional parts unless you do the required parts. So, while it is better to implement impressionist painting than image dimming, it is more important to get the required commands working first. Doing simple things well is also important: for example, a "half" operation that does proper filtering is better than one that doesn't, but one that doesn't do proper filtering is better than one that doesn't work at all.
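To illustrate the difference filtering makes for a "half" operation, here is a small sketch in Python. It assumes a non-empty grayscale image stored as a list of equal-length rows of 0..255 integers, and it uses a 2x2 box average, which is only the simplest possible filter; the function name and image layout are hypothetical.

```python
def half(image):
    """Halve a grayscale image by averaging each 2x2 block.

    image: non-empty list of equal-length rows of ints in 0..255.
    An odd trailing row or column is dropped.
    """
    h, w = len(image) // 2, len(image[0]) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = (image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
                     image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1])
            row.append((total + 2) // 4)  # rounded average of the four pixels
        out.append(row)
    return out
```

On a 2x2 checkerboard of 0 and 255, this returns a single mid-gray pixel, whereas simply keeping every other pixel would return pure black.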
You must also be ready to show off what you've done: if you implement a command, you should have an example to show how cool it is.
Some basic requirements that are needed. These commands are required.
All of these operations destructively affect a single image. The image in a variable is replaced by the result of the operation.
There are 7 required single image operations. You must implement at least 3 image operations in addition to these. You may use your creativity to create ones not on the list, however they should be non-trivial, and preferably interesting. If you're curious as to whether your idea is good enough, feel free to ask.
There are many possible options for your additional operations. Here are some suggestions. In each case, you get to choose how the command works (for example, what the parameters are). Remember that you have to document all of your commands!
Matte Generation Commands
For images that do not have an alpha channel, you must provide some methods for computing one. Note that most of these will take some number of parameters to tune the algorithm. You must implement at least one of these:
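As one illustration of a matte generator with a tuning parameter, here is a very simple blue-screen key sketched in Python. The rule (a pixel is backdrop when blue exceeds both red and green by more than a threshold), the function names, and the default threshold are all assumptions for illustration, not a prescribed algorithm.

```python
def bluescreen_alpha(r, g, b, threshold=40):
    """Return 0 (transparent) for backdrop-blue pixels, 255 (opaque) otherwise.

    threshold is the tuning parameter: how much blue must dominate
    the other two channels before the pixel counts as backdrop.
    """
    return 0 if b - max(r, g) > threshold else 255

def add_matte(pixels, threshold=40):
    """pixels: list of (r, g, b) tuples -> list of (r, g, b, a) tuples."""
    return [(r, g, b, bluescreen_alpha(r, g, b, threshold))
            for r, g, b in pixels]
```

A hard 0-or-255 matte like this leaves jagged edges; a version that produces intermediate alpha values near the boundary would composite more smoothly.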
You must implement the COMPOSITE command that composites two images. The form of the command is:
UPDATE: this original description was confusing and self-contradicting. See the update at the beginning of the document.
Operator is any one of the Porter and Duff imaging operators (listed on p. 838 of FvD). Because we have no way to reverse the order of operations (it is always u op v), we need to introduce some new terms: under (which is the reverse of over), heldout (which is the reverse of out), heldin (which is the reverse of in), and below (the reverse of atop).
A lot of this naming confusion comes from my initial use of the term "opposite" (what does that mean), and the fact that Foley and van Dam don't use the terms "held in" or "out" (they do use "held out" and "in"). Since we didn't use the Foley and van Dam book this year, I should have stayed truer to the Porter and Duff descriptions.
The 12 operator "names" are: clear, first, second, over, under, in, out, heldout, heldin, atop, below, xor. Notice that since we always give the destination second, we cannot reverse the order of operands and needed to introduce new names. So, B over A is the same as A under B except in the former the result goes into A, and in the latter, the result goes in B.
First and second might seem a little funny - they correspond to the A and B operators from Porter and Duff or the text. We chose to call them first and second to avoid confusing situations like:
composite b a a 10 10
which would actually give image b in the region.
composite b first a 10 10
which specifies that we want the first of the two arguments, seems less ambiguous.
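The twelve operator names can be sketched as a table of Porter and Duff fractions, here in Python. This sketch rests on several assumptions: colors are premultiplied by alpha and stored as floats in [0, 1]; the command reads "u op v"; and each new name (under, heldin, heldout, below) is taken to be the standard operator with its operands swapped, as the "reverse of" wording suggests. The names `OPERATORS` and `composite_pixel` are hypothetical.

```python
# For "u op v", each entry returns (F_u, F_v), the fraction of each
# operand that survives, given the two alpha values au and av.
OPERATORS = {
    "clear":   lambda au, av: (0.0, 0.0),
    "first":   lambda au, av: (1.0, 0.0),            # Porter-Duff "A"
    "second":  lambda au, av: (0.0, 1.0),            # Porter-Duff "B"
    "over":    lambda au, av: (1.0, 1.0 - au),
    "under":   lambda au, av: (1.0 - av, 1.0),       # v over u
    "in":      lambda au, av: (av, 0.0),
    "heldin":  lambda au, av: (0.0, au),             # v in u
    "out":     lambda au, av: (1.0 - av, 0.0),
    "heldout": lambda au, av: (0.0, 1.0 - au),       # v out u
    "atop":    lambda au, av: (av, 1.0 - au),
    "below":   lambda au, av: (1.0 - av, au),        # v atop u
    "xor":     lambda au, av: (1.0 - av, 1.0 - au),
}

def composite_pixel(op, u, v):
    """Combine two premultiplied (r, g, b, a) pixels as "u op v".

    Every channel, alpha included, is u * F_u + v * F_v.
    """
    fu, fv = OPERATORS[op](u[3], v[3])
    return tuple(cu * fu + cv * fv for cu, cv in zip(u, v))
```

For example, half-transparent red `(0.5, 0, 0, 0.5)` over opaque blue `(0, 0, 1, 1)` gives `(0.5, 0.0, 0.5, 1.0)`. Which variable receives the result is a separate matter handled by the command itself.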
You should also implement the plus operator in a similar way. You may either make it another compositing operator, or its own command - just be sure to document what you did. We chose to do it as a compositing operator:
composite b plus a 10 10
Be careful: plus is different from the other compositing operators in that you need to worry about overflow.
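A small sketch of the overflow problem, in Python, assuming 8-bit channel values (0..255). With 8-bit unsigned arithmetic, 200 + 100 would wrap around to 44, so the sum has to be clamped instead. The helper names here are hypothetical.

```python
def plus_channel(u, v):
    """Add two 8-bit channel values, saturating at 255 instead of wrapping."""
    return min(255, u + v)

def plus_pixel(u, v):
    """Apply the saturating add to every channel of two pixels."""
    return tuple(plus_channel(cu, cv) for cu, cv in zip(u, v))
```

In a language where channels are stored in an 8-bit type, the addition itself must be done in a wider type before clamping.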
The minimum UI
The key to doing well on this project (in terms of a grade, and probably in terms of learning the most about the imaging operations) is to do a minimal amount of user interface, and focus your energies on the imaging operation.
The minimal interface is quite minimal! You need to do the right thing about the command line arguments, read scripts from the console in addition to a file, and provide a "Show" command that puts up a window and waits for the user. This isn't very hard. In fact, we'll even give you a skeleton driver that shows you how to do this. (it doesn't do any imaging operations - you need to change the windows so they show images, etc)
We will provide you with plenty of sample images and scripts, as well as a complete working assignment to try out (executable only). This will happen real soon.
A directory of sample images has been placed in
Having your UI and getting a grade too...
If you want to make a nice UI, but still get the scripting right, here's a hint: make a system that does the scripting right, and then have the UI send script commands to the system in response to user actions.
Building a system this way has a number of advantages. For one, you can test the imaging and script interpreting independently from the interface. You can get the minimal UI working (to ensure you get a good grade) and have that to fall back on. There are also nice properties of a system constructed like this: you can record the user actions for later playback, there is separation of UI and "engine", ...
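A minimal sketch of this arrangement, in Python. The `Engine` class and the `on_crop_drag` callback are hypothetical names; the point is only that the UI never touches images directly and instead hands the engine the same text a script would contain.

```python
class Engine:
    """Hypothetical script engine: every action arrives as a text command."""

    def __init__(self):
        self.history = []  # every command received, in order

    def execute(self, line):
        self.history.append(line)
        # ... parse `line` and run the imaging operation here ...

def on_crop_drag(engine, var, x1, y1, x2, y2):
    """UI callback: turn a mouse drag into the equivalent script command."""
    engine.execute("crop %s %d %d %d %d" % (var, x1, y1, x2, y2))
```

Writing `engine.history` out to a file, one command per line, gives a script that replays the interactive session.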
Two years ago, we placed an emphasis on the interactive UI for the system. To get a good grade, you needed to do a direct manipulation UI such that everything could be done with only the mouse. The good news was that we got to see a lot of very clever user interfaces. The bad news was that people often spent a lot of time on their interfaces, and not a lot of time on the imaging operations.
This year, we are specifically trying to place a greater emphasis on imaging operations. It will be possible to get an A implementing only the text file interface (and allowing the user to type text into the program when no command line argument is given). It will not be possible to get a decent grade if you don't do a good job on the imaging aspects, no matter how nice the UI is.
For the purposes of this assignment, it is sufficient to do all operations on 8 bit unsigned values for each color channel (and alpha). Doing things in a linearized intensity space is a lot of extra work, and probably isn't worth the effort given how uncalibrated all of the other aspects of the system are.
How will we do grading?
The majority of grading will be done by an interactive demo session. In this demo session, we will require you to build your program and show it off to us. In order to test the basic functionality of your system, we will give you some scripts to run, and we will look at the results in order to make sure your imaging algorithms work. A large portion of your grade (maybe the largest) comes from being able to run these basic scripts correctly.
But What are the Point Values?
If you do a great job, you'll get a great grade! What else do you need to know?
OK, realistically, you have a finite amount of time and want to know where to best direct your efforts since you won't be able to do everything you want.
In previous years, it has been less about "this many points for this operation" and more "if you got this far, you get this grade." Of course, since everyone does something different, this is a little hard to measure. There are some general guidelines:
So, roughly speaking, grading works as:
There's some opportunity for "or" here. If you do a simple version of a required command (say, simple binary thresholding for the BW command), but do 3 totally amazing optional commands, you still might get an "Excellent" grade.
Also, from past experience, these categories roughly correspond to grades (C->A). I am happy to give everyone in the class an A, if everyone does excellent work.
The Art Assignment
In addition to your code and text files, you are to turn in 2 pictures made with your system. You should turn in the corresponding scripts that generate each one (although we will not be able to try the script since we won't have the source images). If you create your picture using your program's GUI and did not record the commands, please give a detailed description of the steps you took.
The first image is just something that shows off how nice you can make things. Just make something nice and artistic. No larger than 400x300 pixels. You should call this image art-login.tga (where login is your login). You should also have a file called art-login.txt containing the script.
The second image must be one that has a picture of you either with someone famous that you have never met, or in someplace recognizable that you have never been. Get a picture off the web, get a picture of you from someplace, and put the two together with your program. To make this easier, we will take pictures of people in front of a bluescreen backdrop one day after class, in case you want to try doing automatic bluescreening.
You should turn in an image called me-login.tga (where login is your login), a corresponding me-login.txt, and a file called me-readme.txt explaining where you got the picture, and any tricks you used. (For example, you might want to manually paint the alpha channel using Photoshop if you don't implement blue-screening.)
Artistic merit will not influence your grade. However, to inspire you to do something creative, we will be having an Art Show to show off the wonderful works you create. This show will be complete with a jury and prizes! (don't expect a big prize, but there will be something)
What will you hand in?
In addition to your program files, you will also turn in documentation, and the 2 sample pictures made with your system. We will give you specific pictures to make with your system. With each one, you will need to turn in the script that creates the picture as well.
Under no circumstances should your hand-in directory be larger than 2MB. We will check. You should check too. If you feel compelled to hand-in more stuff than this (for example, a large amount of fabulous artwork), you will need to make other arrangements.
Review of the Requirements
A Checklist of the required parts:
Someone asked this question, and I thought that I should give the same set of hints to everybody. I hope this person doesn't mind me broadcasting their question.

At 08:37 PM 10/4/00 -0500, someone wrote:
> Hello Professor,
>
> In the project description you state: "In reality, this project isn't
> one huge project. It's a lot of little pieces that all fit
> together. In fact, we will even give you help in building the system
> in a modular way, so that you can approach it as putting a lot of
> little pieces together."
>
> I'm just wondering what you mean by "we will even give you help in
> building the system in a modular way."? Are you still planning on
> providing this help? I can obviously proceed on the project without
> this help, but I would be glad to have it. Even after all of these
> years at UW doing modular programming, I still suck at getting started
> with it and developing a good model.

We decided not to give any more code than the example skeleton itself. This was the subject of quite a debate between Mark and me, and we decided that giving more code would probably confuse people more than it would help. I should probably update the assignment page to make it clearer that this is all the code we will provide.

However, it should be obvious from how the assignment is written how to break it into smaller chunks. Each of the operations is an independent thing, once you get the basic framework in place. Give some thought to this framework, and try to structure your code so that you can work on each imaging operation independently.

In the sample solution that Mark and I wrote, each of the imaging operations can go into a separate file, and adding a new command simply means adding a new file to our project. (Keeping the files independent was important to us, since 2 of us were working together, and we sacrificed some code cleanliness for it.)
I made the design decision (for our sample solution) to have the individual commands do some of the work of parsing. This may not have been the best approach since it leads to redundancy. However, the assignment was specifically designed to make command parsing as easy as possible. To get started, you want to consider what the core data structures for this system are, and to decide how you will do command processing. If you want to build a GUI interface, I recommend that you build your command processor first, and then have the GUI communicate with the "engine" by issuing text commands, just as if they were typed to the console - if nothing else, you can always fall back to the "type-in commands" UI.
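One way to get the "adding a command means adding a file" property is a small registry that each command adds itself to. The sketch below, in Python, is an illustration under assumptions and not the sample solution: the `COMMANDS` table, the `command` decorator, and the `invert` command operating on a flat list of grayscale values are all hypothetical.

```python
COMMANDS = {}  # command name -> function(variables, target, args)

def command(name):
    """Decorator that registers a function under a script command name."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register

# Each command below could live in its own file; the command processor
# only ever looks names up in COMMANDS.
@command("invert")
def invert(variables, target, args):
    """Hypothetical command: replace the image in `target` with its negative.

    The command does its own argument checking, as described above.
    """
    if args:
        raise ValueError("invert takes no arguments")
    variables[target] = [255 - p for p in variables[target]]
```

The command processor then needs no changes when a command is added, which also lets two people work on different commands without editing the same file. In Python, a file containing commands still has to be imported once so that its registrations run.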