This is a quick summary of what happened for project 2.

0. Contents
-----------
1. The aim
2. The methods
3. The implementation
4. The observations
5. The future
6. The references

1. The Aim
----------
At the beginning, my aim was a proof (or disproof) of concept for gaze direction: try it and see where it leads us. I found that, in general, there is some information in there. We do not recover all of it (some information is definitely lost because we do not have the eyes), but we can still do something with what we have. The aim now shifts to what we do with this information. A first aim was to find the points in time where gaze shifts from one thing to another. Another is to change the direction of gaze over certain frames. This can open the road to many other applications.

2. The methods
--------------
We assume we are working with mocap data only, so the only data available to us is the head orientation (plus the chest and hips orientations, of course). The first step was to add cones of vision in front of the characters to check whether we can infer anything from them. This led me in two directions: segmenting the timeline, and changing the actual gaze direction.

3. The Implementation
---------------------
Here is a quick idea of what was implemented and the status of each part.

a. Inserting cones of vision in front of characters: that was an easy part.

b. Adding an IK solver to change the head orientation towards a certain "point" over certain frames. This part was not fully implemented, to be honest; I had to leave it in order to focus on segmentation. The IK solver works per frame. It generates a new head orientation looking at the desired point. If moving the head is not enough, it adds chest rotation. If this is still not enough, we assume the character moves his eyes to the extreme right or left in order to see the target (this covers cases such as a target behind the character). A yaw-only sketch of this priority scheme is given at the end of this section.

My intuition behind this came from a simple experiment: I tried to be conscious of how I look at things in my everyday life. After many observations, I believe this is how humans work. When looking at things, humans try to use the least possible energy. The first step is to move the eyes towards what you want to see (the cheapest thing to do). If that is not enough, you also turn your neck towards the goal. If that is still not enough, you rotate your hips towards the object. If, finally, this is still not enough (or if you want to look at something for a long time), you can also take a few short steps to orient your whole body into a better position for a better look. The IK solver uses only the neck and hip rotations to reach the correct orientation.

Finally, we have a new skeleton that obeys the constraints. We use displacement mapping to blend the two motions, so displacement mapping takes place only at the beginning and the end of the constrained span (a sketch of this blending is also given at the end of this section). There is a third point where we need displacement mapping: when the target I am looking at moves behind my back and I need to rotate my head quickly in order to keep looking at it.

c. The segmentation part. I used a Laplacian of Gaussian, which produced moderate results (a sketch follows at the end of this section). They are not perfect, but they were acceptable, which is expected because some information is already missing. I also tried bilateral filtering, which smoothed the data nicely, but it was not clear how to get the transition points from it.
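To make the per-frame look-at idea in (b) concrete, here is a minimal, yaw-only sketch of the priority scheme: the head takes as much of the turn as it can, the chest takes the next share, and any remainder is assumed to be covered by the eyes at their extreme. The joint limits, names, and function signature are illustrative assumptions of mine, not values taken from the actual solver.

    import numpy as np

    # Illustrative joint limits in degrees -- these particular numbers are
    # assumptions, not values from the project.
    NECK_YAW_LIMIT  = 70.0   # head yaw relative to the chest
    CHEST_YAW_LIMIT = 35.0   # chest yaw relative to the hips
    EYE_YAW_LIMIT   = 45.0   # how much the eyes at their extreme can absorb

    def solve_gaze_yaw(target_yaw_deg):
        """Per-frame, yaw-only version of the look-at solve.

        target_yaw_deg is the yaw from the character's current forward
        direction to the target (0 = straight ahead). Head first, then
        chest, then eyes pinned at their extreme; anything left over is
        unreachable this frame (target truly behind the character)."""
        neck = np.clip(target_yaw_deg, -NECK_YAW_LIMIT, NECK_YAW_LIMIT)
        remaining = target_yaw_deg - neck
        chest = np.clip(remaining, -CHEST_YAW_LIMIT, CHEST_YAW_LIMIT)
        remaining -= chest
        eyes = np.clip(remaining, -EYE_YAW_LIMIT, EYE_YAW_LIMIT)
        unreachable = remaining - eyes
        return neck, chest, eyes, unreachable

For a target at 150 degrees, for example, this returns the neck and chest at their limits and the eyes absorbing the rest, which matches the behaviour described above for targets off to the side or behind the character.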
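The displacement-mapping blend can be pictured with a simple 1-D sketch: keep the edited angles inside the constrained frame span, and ease the edited-minus-original offset in and out over a short window at either end so there is no pop at the constraint boundaries. The ramp length and the cosine ease are my assumptions; the project may blend differently.

    import numpy as np

    def blend_with_displacement(original, edited, start, end, ramp=15):
        """Apply the edited joint angles inside [start, end] and ease the
        offset in/out over 'ramp' frames on each side. 'original' and
        'edited' are 1-D per-frame angle arrays of equal length."""
        out = np.array(original, dtype=float)
        offset = np.asarray(edited, dtype=float) - out
        for f in range(len(out)):
            if start <= f <= end:
                w = 1.0                                             # fully edited
            elif start - ramp <= f < start:
                w = 0.5 - 0.5 * np.cos(np.pi * (f - (start - ramp)) / ramp)  # ease in
            elif end < f <= end + ramp:
                w = 0.5 + 0.5 * np.cos(np.pi * (f - end) / ramp)    # ease out
            else:
                w = 0.0                                             # untouched
            out[f] += w * offset[f]
        return out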
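For the segmentation part, a rough sketch of the Laplacian-of-Gaussian approach on the head-yaw signal might look like the following; the sigma and threshold values are illustrative tuning knobs, not the ones used in the project.

    import numpy as np
    from scipy.ndimage import gaussian_laplace
    from scipy.signal import find_peaks

    def find_gaze_transitions(head_yaw_deg, sigma_frames=8, thresh=0.5):
        """Run a Laplacian of Gaussian over the head-yaw signal and treat
        strong responses as candidate transition frames between fixations
        (gazes) and rapid shifts (saccades)."""
        yaw = np.asarray(head_yaw_deg, dtype=float)
        # The LoG responds strongly where the signal changes curvature,
        # i.e. where the head starts or stops a rapid reorientation.
        response = gaussian_laplace(yaw, sigma=sigma_frames)
        peaks, _ = find_peaks(np.abs(response), height=thresh)
        return peaks  # frame indices of candidate gaze-shift boundaries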
4. The observations
-------------------
Here are some observations.

1. Some information is definitely missing, but not all of it. This is reinforced by the "Eyes Alive" paper, where participants still showed some degree of satisfaction even with static eyes.

2. There is clearly more than one way to segment the data into saccades vs. gazes. Each method will give different results; there is no single correct solution, and more than one solution could be equally good.

3. A per-frame gaze editor is not really that bad. In fact, even using no displacement mapping at all is not that bad (you just get the impression that the character turns his head far too fast). A better gaze editor should produce much better results.

5. The future
-------------
There are several directions this project can go. For example:

a. After segmenting the data, we can use motion blending on the saccades to generate new saccades (thanks to Rachel for the idea).
b. Better ways to segment the data; clustering techniques could probably be used.
c. Better IK solvers that go beyond the per-frame assumption.

6. The references
-----------------
1. "Eyes Alive", Sooha Park Lee et al., SIGGRAPH 2002
2. "Gaze Estimation Using Morphable Models", Thomas D. Rikert et al., 1998
3. "Estimating Focus of Attention Based on Gaze and Sound", Rainer Stiefelhagen et al., 2001
4. "Bilateral Filtering for Gray and Color Images", C. Tomasi et al., 1998
5. "I See What You See: Gaze Perception During Scene Viewing"
6. "Effects of Gaze on Multiparty Mediated Communication", Roel Vertegaal, 2000
7. "Psychologically-Based Vision and Attention for the Simulation of Human Behaviour", Stephen J. Rymill, 2005
8. "The Impact of Eye Gaze on Communication Using Humanoid Avatars", Maia Garau et al., 2001
9. "Real-Time Eye, Gaze, and Face Pose Tracking for Monitoring Driver Vigilance", Qiang Ji et al., 2003
10. "Where to Look? Automating Some Visual Attending Behaviours of Human Characters", PhD Thesis, Sonu Chopra, 1999