Monday, July 07, 2008

EHCI Update - 6 degrees of freedom head tracking

I'm posting some updates on the Google Summer of Code EHCI project. This part of the project deals with head tracking with 6 degrees of freedom, a problem often referred to as finding the pose of an object. Since the head does not emit any light of its own, unlike in some types of infrared tracking, the tracker has to rely on natural features of the head. This implementation tries to follow the excellent work of Luca Vacchetti, Vincent Lepetit, and Pascal Fua, from the Computer Vision Laboratory of the Swiss Federal Institute of Technology (EPFL), "Fusing Online and Offline Information for Stable 3D Tracking in Real-Time". The paper is available here

There's a video on YouTube showing the current progress.


The algorithm starts by automatically looking for a head in the image, using the well-known Viola-Jones algorithm.

After finding the head position, a feature tracking algorithm is started. It uses cvGoodFeaturesToTrack to find features in the region of interest defined by the head width and height. Once these features are found, they are mapped back to a head model (I'm currently using a cylindrical model, but I plan to use the excellent head model by Len Van Der Westhuizen, which is available here, thanks Len!).
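To make the "mapped back to a head model" step more concrete, here is a minimal sketch of one way a 2D feature could be placed on a cylinder. It is not taken from the project source: the function name, the choice of a vertical cylinder through the centre of the region of interest, the radius of half the ROI width, and the sign of z are all assumptions made for illustration.

```python
import math

def map_to_cylinder(u, v, roi_x, roi_y, roi_w, roi_h):
    """Map an image point (u, v) inside the head ROI to a 3D point on a cylinder.

    Assumptions (hypothetical, not from the original code): the cylinder axis
    is vertical and passes through the ROI centre, its radius is half the ROI
    width, and the face is roughly frontal so z points towards the camera.
    """
    radius = roi_w / 2.0
    centre_x = roi_x + roi_w / 2.0
    centre_y = roi_y + roi_h / 2.0

    # Coordinates relative to the cylinder axis.
    x = u - centre_x
    y = v - centre_y

    # Clamp x so points on the silhouette do not produce a negative
    # value under the square root.
    x = max(-radius, min(radius, x))

    # Depth follows from the circle equation x^2 + z^2 = radius^2.
    z = math.sqrt(radius * radius - x * x)
    return (x, y, z)
```

Each tracked feature then has both a 2D image position and a 3D model position, which is the kind of correspondence a pose algorithm needs.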

When the 3D points of the head model are known, as well as their corresponding 2D image points, DeMenthon's POSIT algorithm is used to find the initial pose estimate.
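It may help to see the relation a pose estimator works with. The sketch below is not POSIT itself. It only shows the forward model: given a rotation, a translation and a focal length, 3D model points are projected to 2D, and the reprojection error measures how well a candidate pose explains the observed image points. The function names are hypothetical, and I am assuming a simple pinhole camera with image coordinates measured from the image centre.

```python
import math

def project(points_3d, rotation, translation, focal_length):
    """Project 3D model points into the image with a pinhole camera.

    rotation is a 3x3 nested list (row-major), translation a 3-element
    sequence. Points are assumed to end up in front of the camera (z > 0).
    """
    projected = []
    for p in points_3d:
        # Camera-space coordinates: R * p + t.
        cam = [
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ]
        # Perspective division, scaled by the focal length.
        projected.append((focal_length * cam[0] / cam[2],
                          focal_length * cam[1] / cam[2]))
    return projected

def reprojection_error(points_3d, points_2d, rotation, translation, focal_length):
    """Mean distance between observed 2D points and the projected model points."""
    projected = project(points_3d, rotation, translation, focal_length)
    total = 0.0
    for (px, py), (ox, oy) in zip(projected, points_2d):
        total += math.hypot(px - ox, py - oy)
    return total / len(points_2d)
```

A good pose gives a small reprojection error, so a value like this can also serve as a rough sanity check on the matrix a pose estimator returns.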

After that, the Lucas-Kanade optical flow algorithm is used to track the points across frames. These points are mapped back to the original 3D points and the pose matrix is updated.
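For readers who have not met Lucas-Kanade before, here is a minimal single-step, single-scale sketch of the idea in plain Python. It is far simpler than cvCalcOpticalFlowPyrLK, which works on image pyramids and iterates. The function name and the representation of images as nested lists of grey values are assumptions for illustration only.

```python
def lucas_kanade_step(prev, curr, cx, cy, half_window):
    """Estimate the (dx, dy) motion of the patch centred at (cx, cy).

    prev and curr are greyscale images stored as lists of rows. The window
    must lie at least one pixel inside the image borders.
    Returns None when the patch has too little texture to solve for motion.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half_window, cy + half_window + 1):
        for x in range(cx - half_window, cx + half_window + 1):
            # Spatial gradients by central differences, temporal by subtraction.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it

    # Solve the 2x2 system [sxx sxy; sxy syy] [dx dy]^T = -[sxt syt]^T.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    dx = (-syy * sxt + sxy * syt) / det
    dy = (sxy * sxt - sxx * syt) / det
    return (dx, dy)
```

The determinant check is the same intuition behind picking good features first: a patch with gradients in only one direction (or none) does not constrain the motion.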

The source code shows how to deal with several important OpenCV functions, such as cvGoodFeaturesToTrack, cvCreatePOSITObject, cvPOSIT, and cvCalcOpticalFlowPyrLK, as well as some interesting OpenGL features like loading custom ModelView and Projection matrices through glLoadMatrix.
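One detail that is easy to get wrong when handing a pose to glLoadMatrix is the memory layout: OpenGL expects 16 values in column-major order. The sketch below packs a 3x3 rotation and a translation into such an array. The function name is hypothetical, and it deliberately ignores any axis flips that may be needed between the camera convention of the pose estimator and OpenGL's, which is an assumption you would have to check for your own setup.

```python
def pose_to_gl_modelview(rotation, translation):
    """Pack a pose into a 16-element, column-major list for glLoadMatrix.

    rotation is a 3x3 nested list (row-major), translation has 3 elements.
    No axis convention conversion is applied here.
    """
    matrix = []
    for col in range(3):
        # Each OpenGL column holds one column of the rotation matrix,
        # followed by the bottom-row entry 0.
        for row in range(3):
            matrix.append(float(rotation[row][col]))
        matrix.append(0.0)
    # The last column is the translation, followed by 1.
    matrix.extend([float(translation[0]),
                   float(translation[1]),
                   float(translation[2]),
                   1.0])
    return matrix
```

With the identity rotation and translation (1, 2, 3), the last four values of the result are 1, 2, 3, 1, which is where OpenGL looks for the translation.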

I'd really like to thank God and everyone who has helped me develop this work with invaluable tutorials, papers, 3D models, and e-mails.


POSIT tutorial:

Explanation of the raw format:

The full report is available at


Kenny said...

Awesome application!!!
I'm currently involved in a similar project. But not as cool as yours.

It seems that you only use the Viola-Jones detection algorithm. The default cascade classifier does not work very well for non-frontal faces. Have you ever considered a "detection and then tracking" strategy, using algorithms like the Kalman filter or mean shift to track the detected face window?

Daniel Lélis Baggio said...

Hi Kenny,

thank you for your comment, but I'm not only using Viola-Jones. That's only for bootstrapping (and it's only 2D). I use DeMenthon's POSIT for the pose, and I am actually using "detection and then tracking". I have tried mean shift, but it's not as robust as Viola-Jones, so that's why I've picked it.
I'm about to update the project with some binaries, but current svn looks way easier to use (but I wouldn't assume it's already ok so far :) ).

Stay with God,
best regards!

ps: I think your university was the ACM-ICPC champion back in 2005, wasn't it?

Kenny said...

Yeah, it's quite an honor. SJTU is one of the best engineering schools in China.

I sent you some useful information, please check your e-mail.

Mike Nigh said...

Fucking awesome! just what we needed!

Rahul Kavi said...

I am working on a similar application. I was wondering how you mapped points on the face to points on the 3D object. How did you establish the correspondence? In order to use the POSIT algorithm one needs to know the corresponding 2D and 3D points, right? (I went through the code and understood that you used good features to track to get the 2D points.)