Wednesday, March 31, 2010

Discovering ARToolKit

During my initial days at IIIT-H, I had no definite goals apart from wanting to spend my time here productively. Though I had some ideas in mind, my guide suggested that I put them away for later, as six months was too little time to complete them. He gave me a week off to think of some good things I could do.

I wanted to work on CUDA. CUDA (Compute Unified Device Architecture) is a revolutionary concept from NVIDIA. The principle behind the CUDA architecture is fairly simple. Instead of devoting almost half the transistors on the processor die to cache and registers, CUDA devices (all the latest NVIDIA GPUs and the NVIDIA Tesla line) devote most of their transistors to computing. This means less cache (which matters less on GPUs thanks to the copious amounts of equally fast video memory they ship with these days) and more raw computing power. In a nutshell, CUDA devices are essentially an array of multiprocessors with shared device memory, and on data-parallel workloads they can outperform any CPU on the market today by huge margins! Anyway, I was at IIIT-H to do some constructive research on CUDA, and my guide seemed to think that this would take more time than I had.

During the course of my week off, I spent most of the time learning about programmable shaders and writing a few myself. I even learnt how to program CUDA devices, and I wrote a few image processing routines that harnessed CUDA's multiprocessing power (I have an NVIDIA GeForce 8600 GTS, a CUDA device with compute capability 1.1). All this gave me no new ideas beyond the ones I came here with.

Towards the end of the week I received an email from my guide. In it he described a concept called Augmented Reality and sent me a few links to a toolkit called ARToolKit, developed by Dr. Hirokazu Kato and supported by the Human Interface Technology Laboratory at the University of Washington. As I browsed through the sites and googled a few terms, I found myself increasingly fascinated by this new technology. Though it looked complex and enthralling, after looking through the sources and reading a few books on computer vision (Multiple View Geometry in Computer Vision by R. Hartley and A. Zisserman), I realized that the principle involved was quite simple.

Those of you familiar with the fundamental graphics pipeline will know that at every stage the vertices are transformed from one coordinate system (CS) to another simply by multiplying them with an appropriate transformation matrix. Similarly, ARToolKit finds the transformation that maps coordinates in the real-world CS to coordinates in the camera projection-plane CS, where the image is formed. Once this matrix is obtained, any given virtual object can be placed anywhere in the scene, and its corresponding camera-plane coordinates can be obtained by multiplying its coordinates with that matrix. This matrix is called the camera matrix, or in ARToolKit terms a transformation matrix, and it is a 3 x 4 matrix.

ARToolKit uses markers of known dimensions and shapes to achieve this. The program first looks for a square with a darkened border in the scene. Once it finds one, it matches the pattern inside the square against the pattern templates it has; this can be achieved quite trivially with common image processing routines. Once a marker of known dimensions is detected, the camera matrix is easy to obtain from the marker's known geometry and the camera's focal length.

A typical ARToolKit-style marker, with a dark bounding box and a pattern inside.

Though it sounds simple, this technique is difficult to perfect. Most cameras have glitches in their output called distortions, which arise from imperfections in lens manufacture: radial distortion (barrel or pincushion), for example, appears when the lens bends light more or less strongly away from its centre. It is impossible to manufacture a perfect lens. However, these distortions can be corrected by certain rather involved methods that I won't get my hands dirty with, because thankfully ARToolKit does all that hard work.
Once the camera matrix is known, all the hard work is done as far as the augmented reality part is concerned. All that is left for us to do is place our objects wherever we want them and use the camera matrix to transform them into the camera-plane CS.
The code that follows is an example bundled with ARToolKit. It draws a cube on the detected pattern.


#ifdef _WIN32
#include <windows.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#ifndef __APPLE__
#include <GL/gl.h>
#include <GL/glut.h>
#else
#include <OpenGL/gl.h>
#include <GLUT/glut.h>
#endif
#include <AR/gsub.h>
#include <AR/video.h>
#include <AR/param.h>
#include <AR/ar.h>

/*
 * Camera configuration.
 */
#ifdef _WIN32
char *vconf = "Data\\WDM_camera_flipV.xml";
#else
char *vconf = "";
#endif

int xsize, ysize;
int thresh = 100;
int count = 0;

char *cparam_name = "Data/camera_para.dat";
ARParam cparam;

char *patt_name = "Data/patt.hiro";
int patt_id;
double patt_width = 80.0;
double patt_center[2] = {0.0, 0.0};
double patt_trans[3][4];

static void init(void);
static void cleanup(void);
static void keyEvent(unsigned char key, int x, int y);
static void mainLoop(void);
static void draw(void);

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    init();

    arVideoCapStart();
    argMainLoop(NULL, keyEvent, mainLoop);
    return 0;
}

static void keyEvent(unsigned char key, int x, int y)
{
    /* quit if the ESC key is pressed */
    if (key == 0x1b) {
        printf("*** %f (frames/sec)\n", (double)count / arUtilTimer());
        cleanup();
        exit(0);
    }
}

/* main loop */
static void mainLoop(void)
{
    ARUint8 *dataPtr;
    ARMarkerInfo *marker_info;
    int marker_num;
    int j, k;

    /* grab a video frame */
    if ((dataPtr = (ARUint8 *)arVideoGetImage()) == NULL) {
        arUtilSleep(2);
        return;
    }
    if (count == 0) arUtilTimerReset();
    count++;

    argDrawMode2D();
    argDispImage(dataPtr, 0, 0);

    /* detect the markers in the video frame */
    if (arDetectMarker(dataPtr, thresh, &marker_info, &marker_num) < 0) {
        cleanup();
        exit(0);
    }

    arVideoCapNext();

    /* check for marker visibility: pick the highest-confidence match */
    k = -1;
    for (j = 0; j < marker_num; j++) {
        if (patt_id == marker_info[j].id) {
            if (k == -1) k = j;
            else if (marker_info[k].cf < marker_info[j].cf) k = j;
        }
    }
    if (k == -1) {
        argSwapBuffers();
        return;
    }

    /* get the transformation between the marker and the real camera */
    arGetTransMat(&marker_info[k], patt_center, patt_width, patt_trans);

    draw();

    argSwapBuffers();
}

static void init(void)
{
    ARParam wparam;

    /* open the video path */
    if (arVideoOpen(vconf) < 0) exit(0);
    /* find the size of the window */
    if (arVideoInqSize(&xsize, &ysize) < 0) exit(0);
    printf("Image size (x,y) = (%d,%d)\n", xsize, ysize);

    /* set the initial camera parameters */
    if (arParamLoad(cparam_name, 1, &wparam) < 0) {
        printf("Camera parameter load error !!\n");
        exit(0);
    }
    arParamChangeSize(&wparam, xsize, ysize, &cparam);
    arInitCparam(&cparam);
    printf("*** Camera Parameter ***\n");
    arParamDisp(&cparam);

    if ((patt_id = arLoadPatt(patt_name)) < 0) {
        printf("pattern load error !!\n");
        exit(0);
    }

    /* open the graphics window */
    argInit(&cparam, 1.0, 0, 0, 0, 0);
}

/* cleanup function called when the program exits */
static void cleanup(void)
{
    arVideoCapStop();
    arVideoClose();
    argCleanup();
}

static void draw(void)
{
    double gl_para[16];
    GLfloat mat_ambient[] = {0.0, 0.0, 1.0, 1.0};
    GLfloat mat_flash[] = {0.0, 0.0, 1.0, 1.0};
    GLfloat mat_flash_shiny[] = {50.0};
    GLfloat light_position[] = {100.0, -200.0, 200.0, 0.0};
    GLfloat ambi[] = {0.1, 0.1, 0.1, 0.1};
    GLfloat lightZeroColor[] = {0.9, 0.9, 0.9, 0.1};

    argDrawMode3D();
    argDraw3dCamera(0, 0);
    glClearDepth(1.0);
    glClear(GL_DEPTH_BUFFER_BIT);
    glEnable(GL_DEPTH_TEST);
    glDepthFunc(GL_LEQUAL);

    /* load the camera transformation matrix */
    argConvGlpara(patt_trans, gl_para);
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixd(gl_para);

    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);
    glLightfv(GL_LIGHT0, GL_POSITION, light_position);
    glLightfv(GL_LIGHT0, GL_AMBIENT, ambi);
    glLightfv(GL_LIGHT0, GL_DIFFUSE, lightZeroColor);
    glMaterialfv(GL_FRONT, GL_SPECULAR, mat_flash);
    glMaterialfv(GL_FRONT, GL_SHININESS, mat_flash_shiny);
    glMaterialfv(GL_FRONT, GL_AMBIENT, mat_ambient);
    glMatrixMode(GL_MODELVIEW);
    glTranslatef(0.0, 0.0, 25.0);
    glutSolidCube(50.0);
    glDisable(GL_LIGHTING);

    glDisable(GL_DEPTH_TEST);
}


If you want to play around with this, you should first print out the markers provided with the toolkit.

You can see the output of this program by holding up the "Hiro" marker in front of the camera. A blue cube appears on the pattern and turns in accordance with your movements.



Most of the code is self-explanatory, and the rest will become clear when you write a few programs yourself, so I won't waste time elaborating on it.

So this was my first encounter with ARToolKit.

The next day, after trying out ARToolKit, I met my guide and discussed it with him. We decided that I should develop a demo in which humans can interactively control these objects. What exactly the demo should be didn't surface until a little later, but until then I kept tinkering with ARToolKit. In between, I tried to recreate the functionality of ARToolKit using OpenCV, which I left midway (I hope to resume it soon) as soon as I had a solid idea of what I was going to do about the demo.

5 comments:

  1. Hi,

    I am very, very interested in AR and am doing some projects. I have completed some small 2D AR with OpenCV but am completely lost with ARToolKit.

    Do you have any Windows installer for ARToolKit? I am running Win 7.

    I tried all the possible moves that came my way but the problem still persists.

    The examples run successfully, but how do I compile an ARToolKit program? I am using Code::Blocks. If you suggest any other IDE, I am ready to go for it.

    Please help.

    By the way, I have sent you a friend request on FB.

  2. I run ARToolKit on Windows 7. It's an SDK and not an installer. Whichever IDE you use, just link against the ARToolKit libraries and include files that you download.

  3. Hi,

    I'm trying to install ARToolKit on Windows XP. My IDE is Visual Studio 2010 and I can't get the project to open; it's complaining about failing to upgrade the projects. Any ideas would be appreciated!

  4. Hi.. ARToolKit is not compatible with Visual Studio 2010. Use 2005 or 2008.. it's easy to build with them :) I was using Visual Studio 2008 and ARToolKit was configured for it.. when I upgraded to Visual Studio 2010, the toolkit wasn't converting into that format. So that's the only reason why I'm using Visual Studio 2008 now :)

  5. Though I personally use VS 2008 for my AR projects, I do not see any reason why they won't build with VS 2010. Link the libraries and the headers and you are good to go. I shall soon try it myself and post about it.
