tag:blogger.com,1999:blog-28555026573666427272024-03-14T15:05:32.962+05:30Krishnakanth MallikProjects, Programming, Physics and PhilosophyKrishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.comBlogger10125tag:blogger.com,1999:blog-2855502657366642727.post-45193525322297443362011-02-18T12:46:00.011+05:302011-02-18T22:10:44.055+05:30Using SIMD for Hardware AccelerationMost modern processors can apply a single instruction to multiple instances of data at once. The acronym SIMD, which stands for "Single Instruction, Multiple Data", describes just that.<div><br /></div><div>Intel first introduced the MMX instructions, which could operate on multiple data items, with the Pentium MMX processors in 1997. Intel followed this with an extended instruction set called Streaming SIMD Extensions, or SSE. The SSE instruction set, which first shipped with the Pentium III processor, uses 128-bit registers that can pack four 32-bit values, such as four single-precision floats. The advantage of doing this becomes clear if one tries to add four different sets of operands in succession. Using the basic instruction set one needs to:</div><div><ol><li>Move the two operands to registers.</li><li>Add the operands.</li><li>Move the result to memory.</li><li>Repeat the above steps for the three remaining sets of operands.</li></ol><div>But with just three SSE instructions all of this can be done at once for all four sets of operands, greatly improving performance. </div></div><div><br /></div><div>Before I show you how to use these SSE instructions it is important to know how data is laid out in a SIMD register. A SIMD register is 128 bits wide, which means it can hold 4 x 32-bit values, in other words four ints or four floats. This data, in order, is referred to as [x, y, z, w]. 
The following SSE instruction, addps, adds the four packed single-precision floats in the 128-bit XMM0 register to those in the 128-bit XMM1 register and stores the results back in XMM0.</div><div><br /></div><code>addps xmm0, xmm1</code><br /><br /><div>The four floats loaded into an SSE register can be moved from memory individually, but such operations are slow. Moreover, moving data between the FPU (Floating Point Unit) registers and the CPU registers is particularly slow because the CPU has to wait for the FPU to complete its current operation. Hence it is good practice to leave the data in the SSE registers until the space actually has to be cleared.</div><div><br /></div><div>Let us now see how we can leverage these SIMD instructions from C/C++. Many compilers provide special data types for SIMD operations; here I will discuss only the Microsoft Visual C++ compiler. MSVC provides a predefined datatype <code>__m128</code>, which can be used to declare a variable that holds data in an XMM register. Where possible the compiler keeps a <code>__m128</code> variable directly in an XMM register, without spilling it to memory or to the general-purpose registers. 
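As a rough illustration of the payoff, here is a sketch (the function names are my own, not from any library) contrasting the scalar four-step approach with a single packed add; the loadu/storeu intrinsics are the unaligned variants, so callers need not 16-byte align their data:

```c
#include <xmmintrin.h>

/* scalar: four loads, four adds, four stores */
void add4_scalar(const float *a, const float *b, float *r)
{
    for (int i = 0; i < 4; ++i)
        r[i] = a[i] + b[i];
}

/* SSE: one packed load per operand, one packed add, one packed store */
void add4_sse(const float *a, const float *b, float *r)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(r, _mm_add_ps(va, vb));
}
```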
It is the programmer's responsibility to align the data to a 16-byte address whenever it is loaded from or stored to memory.</div><div><br /></div><div>Here's a sample program which demonstrates the usage of SIMD to perform addition on four floating point values.</div><div><br /></div><div><code>__m128 addMMX(__m128 a, __m128 b)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;__m128 result;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;/* inline assembly */<br />&nbsp;&nbsp;&nbsp;&nbsp;__asm<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;movaps xmm0, xmmword ptr [a]<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;movaps xmm1, xmmword ptr [b]<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;addps xmm0, xmm1<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;movaps xmmword ptr [result], xmm0<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;return result;<br />}</code></div><div><br /></div><div>This is however a bad 
approach because the code is not portable and one has to embed inline assembly into high level code. A better way to do the same thing is to use intrinsics. Intrinsics are special commands that look and behave like C functions but are internally expanded to inline assembly code by the compiler. In order to use intrinsics be sure to include the <code>xmmintrin.h</code> header in your code.</div><div><br /></div><div><code>#include <xmmintrin.h><br /><br />__m128 addSIMDwithIntrinsics(__m128 a, __m128 b)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;/* use intrinsics */<br />&nbsp;&nbsp;&nbsp;&nbsp;__m128 result = _mm_add_ps(a, b);<br />&nbsp;&nbsp;&nbsp;&nbsp;return result;<br />}</code></div><div><br />To load 4 floats into an XMM register simply use the load intrinsic.<br /></div><div><br /></div><div><code>/* 16-byte align your arrays: _mm_load_ps and _mm_store_ps<br />&nbsp;&nbsp;&nbsp;require 16-byte-aligned addresses */<br /><br />__declspec(align(16)) float A[] = {1.0f, 2.0f, 3.0f, 4.0f};<br />__declspec(align(16)) float B[] = {4.0f, 3.0f, 2.0f, 1.0f};<br />__declspec(align(16)) float C[] = {0.0f, 0.0f, 0.0f, 0.0f};<br /><br />int main(int argc, char* argv[])<br />{</code></div><div><code><span class="Apple-tab-span" 
style="white-space:pre"> </span>/* load a and b from the arrays above */</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>__m128 a = _mm_load_ps(&A[0]);</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>__m128 b = _mm_load_ps(&B[0]);</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>__m128 c;</code></div><div><code><br /></code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>/* call the addSIMDwithIntrinsics() function from above */</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>c = addSIMDwithIntrinsics(a, b);</code></div><div><code><br /></code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>/* write the result back to the array */</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>_mm_store_ps(&C[0], c);</code></div><div><code><span class="Apple-tab-span" style="white-space:pre"> </span>return 0;</code></div><div><code>}</code></div><div><code><br /></code></div><div>The next time you set out to write FFT routines or just about any repetitive math operation, be sure to utilize this feature.</div>Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0tag:blogger.com,1999:blog-2855502657366642727.post-48995018892206584342010-04-10T11:38:00.009+05:302010-04-13T17:51:40.185+05:30The limits of our perception<div>The last century can be seen as the golden age of modern civilization. There has been exponential scientific growth and expansion of human knowledge about ourselves and our surroundings. We can now predict the behavior of almost everything with a high degree of precision. Yet despite the extent of our knowledge, uncertainty is still prevalent in our science. Our understanding of the quantum realm and of the obscure phenomena in the vastness of space is still hazy. Theoretically our equations give us a fair picture. Or at least that is what we think. 
The major part of the 20th century was spent trying to unify the most successful theories we have - the general theory of relativity and quantum mechanics. Even now our physicists are in pursuit of a grand unified theory, a theory that, if formulated, would govern the behavior of particles in both the infinitesimally small and the infinitely large realms.</div><div><div><div><br /></div><div>Now here surfaces the major flaw. What exactly is the infinitesimally small and the infinitely large? Our mathematics treats the real number system as the superset (complex numbers are ignored here for the sake of practicality). The real number line is defined such that there are infinitely many real numbers between any two given real numbers. In other words, an infinity is contained within a bounded part of the line. (You might argue that every set is a subset of itself, but remember that here we talk about a bounded subset.) This gives rise to a major anomaly. <b>How can infinity be contained between two well defined finite real numbers, when the distance between them is also a finite real number?</b></div><div><b><br /></b></div><div>Consider a line of finite length, say 10 cm. Let's call this line "line A". Consider another line, "line B", of length 5 cm. By simple comparison one can definitely say that line A is longer than line B.</div><div><br /></div><div>Now, let me start my argument. I argue that both lines are of the same length. My argument is backed by the fact that every point on line A can be paired with a point on line B - for instance, pair each point at distance x along line A with the point at distance x/2 along line B. This one-one correspondence is possible because both line A and line B contain infinitely many points. Due to this one-one correspondence we can say that both lines have the same number of points and hence that they are of the same length. 
</div><div><u><br /></u></div></div></div><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6AOIdiuAyPYQYQvmP3SzsiOaZevba3X684n5GYt727lqseWZInfDUvgLKAl9Yzke4OBqfqR1Ksmbv53dRez3piLdAPLvUr5ko6XweUAWzoKFBHiwuaMFtYgNkyJWvJcQCxX6LFTcxUzw/s1600/1.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 320px; height: 115px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6AOIdiuAyPYQYQvmP3SzsiOaZevba3X684n5GYt727lqseWZInfDUvgLKAl9Yzke4OBqfqR1Ksmbv53dRez3piLdAPLvUr5ko6XweUAWzoKFBHiwuaMFtYgNkyJWvJcQCxX6LFTcxUzw/s320/1.jpg" alt="" id="BLOGGER_PHOTO_ID_5458766833817922642" border="0" /></a><div><div>You may call me insane, but before you do, let me explain why and how this ambiguity prevails. Every object in this universe has an upper and a lower limit of perception. All observations that can be made by that object lie within its own limits of perception, or what I call its region of perception. These limits apply equally to all the dimensions perceivable by that object. We humans can only observe objects that are larger and more massive than photons, because only such objects can reflect photons back to our eyes. Similarly we can extend this argument to the larger scales. After a certain point, any object bigger than the limiting boundary value of perception seems equally big, so we cannot differentiate between them. It is just not humanly possible. You may say that we can build apparatus to make these measurements for us, but remember that any apparatus we build will certainly lie within our boundary of perception and hence cannot make observations beyond our scale.</div><div><br /></div><div>So in order to completely understand our universe we require the ability to perceive at any scale. Un-defining infinity and eliminating it from our system of mathematics is a necessary condition for advancing our knowledge. 
But the limits imposed on our region of perception prevent us from doing so. There may be ways to achieve this indirectly. The Vedas hold countless accounts of how our ancient sages freed their minds and opened the doors to limitless knowledge. But the true essence of the Vedas is scarce in today's world. Translations of the original texts are adulterated with the personal interpretations of the translators. The Vedas must be relished as they are, untouched. It is one of my goals to read them in their true form.</div><div><br /></div><div>I do not assert that the Vedas hold the key to our advancement. But I certainly believe they can point us along the right path and accelerate us towards achieving it. The top scientists and great thinkers of today might just rediscover this knowledge for us.</div></div>Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com4tag:blogger.com,1999:blog-2855502657366642727.post-47254764624058784552010-04-08T21:02:00.006+05:302010-04-08T22:59:27.142+05:30Kick to start!It has been a few days since my so-called review, during which I let myself take a few days off before and after. I am conveniently extending my break until the end of this week. Call me lazy, YES call me LAZY! All that I want to do now is slouch around on my bed and brood over each and every thought that comes by in my head.<div><br /></div><div>Ironically, most of my crazy thoughts occur to me in an effortless, fleeting moment when I am close to sleep and my mind is blank (I am positively sure my mind is blank). I guess in this state I tend to shut down the logical part of my brain and substitute it with a magic 8-ball! With no criteria for selection and with the slightest hint of external stimuli, I rapidly argue with myself, conclude the subject and wait for the next crazy thought. 
Much like a powerful vacuum cleaner that swallows everything in its way.</div><div><br /></div><div>On a hot day in Vellore (the town from which even the devil leaves with sunburns), while waiting for my friend Abhiram at the canteen table with food laid out in front of me, I felt myself falling into this crazy trance again. When I am in it I know I am in it, but I am so comfortable that I want to be in it! I don't remember what I was looking at, but the first thing I noticed was a house fly. It was sitting on the rim of my friend's juice mug, ready to take a sip! In an instinctive motion I unleashed the "human fly swat attack", which the fly dodged easily. A couple more failed attempts aaaaand... </div><div>woooosh! </div><div>First thought:<span class="Apple-tab-span" style="white-space:pre"> </span>The fly must be really good at computing trajectories and judging collisions. Assertion:<span class="Apple-tab-span" style="white-space:pre"> </span>The fly's brain must be quite adept at calculation and recognition, much more powerful in this respect than our first microprocessors. </div><div><br /></div><div>Second thought:<span class="Apple-tab-span" style="white-space:pre"> </span>But the fly must be dumb enough to miss the logic behind all the "fried" flies in the pest-o-flash tray, because it zooms straight toward it!</div><div>Assertion:<span class="Apple-tab-span" style="white-space:pre"> </span>Its brain is fast but cannot process logical information.</div><div><br /></div><div>Third thought:<span class="Apple-tab-span" style="white-space:pre"> </span>Maybe the fly thinks it is not one of the hundreds of others that lie burnt in the tray!</div><div>Assertion:<span class="Apple-tab-span" style="white-space:pre"> </span>The fly doesn't know it's a fly!!</div><div><br /></div><div>Then comes Abhiram to save my brain some effort. The guy is as crazy as I am! 
I just casually mention my new observations and he quips: "Flies have hundreds of eyes and their brains have to be powerful and quick enough to process all that information" - true! We then continue to compare a fly's brain to our early microprocessors, the details of which I should avoid for sanity's sake!</div><div><br /></div>Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com2tag:blogger.com,1999:blog-2855502657366642727.post-23790088561794645922010-03-31T22:52:00.002+05:302010-03-31T23:09:43.572+05:30Problems with DsRenderer.axI would recommend that you do ARToolKit development on Linux. There are many hassles one can face while compiling and building ARToolKit for Windows. Unfortunately I couldn't find a V4L driver for my cheap camera, so I was stuck with no option but to develop on Windows.<br /><br />For those of you who are stuck with Windows like me, I suggest you use MS VC++ 2005 or later; VC++ 6.0 has some known bugs with class templates. Once you build and try to run your first AR application you are most likely to get an error like: <blockquote>"Could not find DsRenderer runtime" or the sort.<br /></blockquote>ARToolKit depends on DSVideoLib 0.0.8 or higher for video-related functions that internally call the VFW or V4L driver. To solve this problem you have to register DsRenderer.ax (a DirectShow filter) with the system. Just open a command prompt as administrator, navigate to<br /><br />C:\Windows\System32\<br /><br />then type Regsvr32 "path\DsRenderer.ax"<br /><br />where path is the absolute path to the folder containing the file DsRenderer.ax. 
If you can't find it anywhere, you are most likely to find it in the "bin" folder of the <a href="http://www.hitl.washington.edu/artoolkit/Software/ARToolKit2.65.zip">ARToolKit 2.65</a> release.<br /><br />Once the component is successfully registered you can run your application smoothly.Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com1tag:blogger.com,1999:blog-2855502657366642727.post-15554728814040754892010-03-31T17:55:00.009+05:302011-02-18T15:28:56.258+05:30Discovering ARToolKitDuring my initial days at IIIT-H, I had no definite goals apart from the fact that I wanted to spend my time here productively. Though I had some ideas in mind, my guide suggested that I put them away for later, as 6 months was too little time to complete them. He gave me a week off to think of some good things that I could do.<br /><br />I wanted to work on CUDA. CUDA (Compute Unified Device Architecture) is a revolutionary concept from nVidia. The principle of the CUDA architecture is fairly simple. Instead of devoting almost half the transistors in the processor IC to cache and registers, CUDA devices (all the latest nVidia GPUs and the nVidia Tesla supercomputer) have most transistors devoted to computing. This means less cache (made less necessary in GPUs by the copious amount of equally fast video memory that they come with these days) and more raw computing power. In a nutshell, CUDA devices are essentially an array of multiprocessors with shared device memory, and on data-parallel workloads they can outperform any conventional processor in the market today by huge margins! Anyway, I was in IIIT-H to do some constructive research on CUDA and my guide seemed to think that this would take more time than I had.<br /><br />During the course of my week off, I spent most of the time learning about programmable shaders and writing a few myself. I even learnt how to program CUDA devices. 
I also wrote a few image processing routines that harnessed CUDA's multiprocessing power (I have an nVidia GeForce 8600 GTS, a CUDA device with a compute capability of 1.1). All this gave me no new ideas other than the ones I came here with.<br /><br />Towards the end of the week I received an email from my guide. In it he described a concept called Augmented Reality and sent me a few links to a toolkit called <a href="http://www.hitl.washington.edu/artoolkit/">ARToolKit</a> developed by Dr. Hirokazu Kato and supported by the Human Interface Technology Laboratory at the University of Washington. As I browsed through the sites and googled a few terms, I found myself increasingly fascinated by this new technology. Though it looked complex and enthralling, by looking through the sources and reading a few books on computer vision (Multiple View Geometry in Computer Vision - R. Hartley and A. Zisserman) I realized that the principle involved was quite simple.<br /><br />Those of you familiar with the fundamental graphics pipeline will know that at every stage the vertices are transformed from one coordinate system (CS) to another simply by multiplying with an appropriate transformation matrix. Similarly, ARToolKit finds the transformation matrix that maps coordinates in the real world CS to coordinates in the camera projection plane CS, where the image is formed. Once this matrix is obtained, any given virtual object can be placed anywhere in the scene and its corresponding camera plane coordinates can be obtained by multiplying its coordinates with that matrix. This matrix is called the camera matrix or, in ARToolKit terms, a Transformation Matrix, and it is a 3 x 4 matrix.<br /><br />ARToolKit uses markers of known dimensions and shapes in order to achieve this. The program first looks for a square box with a darkened border in the scene. Once it finds one, it matches the pattern inside the box against the pattern templates that it has. 
The pattern matching can be achieved quite easily with common image processing routines. Once this marker of known dimensions is detected, it is easy to obtain the camera matrix from the pattern's known dimensions and the camera's focal length.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYzB7lcxRn-y2z3m49dTgAY_8jK5RHSJ3LADUMiKIZPPQ44_eCZwu7PwMh1qSKzneVz6OUUQcjeRqA4T1zizE7fAuyemLcyyYCIFUx4x6awnihhUpvkUTPj-E0soGU-CdbD5cxhwezgJA/s1600/hiro.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 300px; height: 230px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYzB7lcxRn-y2z3m49dTgAY_8jK5RHSJ3LADUMiKIZPPQ44_eCZwu7PwMh1qSKzneVz6OUUQcjeRqA4T1zizE7fAuyemLcyyYCIFUx4x6awnihhUpvkUTPj-E0soGU-CdbD5cxhwezgJA/s320/hiro.jpg" alt="" id="BLOGGER_PHOTO_ID_5454790215150756386" border="0" /></a> A typical ARToolKit-style marker with a dark bounding box and a pattern inside.<br /><br />Though it sounds simple, it is difficult to perfect this technique. Most cameras have glitches in their output called distortions, which arise from imperfections in lens manufacturing: radial distortion when the lens bulges too much, barrel distortion, and so on. It is impossible to manufacture a perfect lens. However, these distortions can be corrected by certain rather involved methods that I won't get my hands dirty with, because thankfully ARToolKit does all that hard work.<br />Once the camera matrix is known, all the hard work is done as far as the augmented reality part is concerned. All that is left for us to do is to place our objects wherever we want them and use this camera matrix to transform them to the camera plane CS.<br />The code that follows is an example given with the ARToolKit. 
It draws a cube on the detected pattern.<br /><br /><code><br />#ifdef _WIN32<br />#include <windows.h><br />#endif<br />#include <stdio.h><br />#include <stdlib.h><br />#ifndef __APPLE__<br />#include <GL/gl.h><br />#include <GL/glut.h><br />#else<br />#include <OpenGL/gl.h><br />#include <GLUT/glut.h><br />#endif<br />#include <AR/gsub.h><br />#include <AR/video.h><br />#include <AR/param.h><br />#include <AR/ar.h><br /><br />//<br />// Camera configuration.<br />//<br />#ifdef _WIN32<br />char *vconf = "Data\\WDM_camera_flipV.xml";<br />#else<br />char *vconf = "";<br />#endif<br /><br />int xsize, ysize;<br />int thresh = 100;<br />int count = 0;<br /><br />char *cparam_name = "Data/camera_para.dat";<br />ARParam cparam;<br /><br />char *patt_name = "Data/patt.hiro";<br />int patt_id;<br />double patt_width = 80.0;<br />double patt_center[2] = {0.0, 0.0};<br />double patt_trans[3][4];<br /><br />static void init(void);<br />static void cleanup(void);<br />static void keyEvent( unsigned char key, int x, int y);<br />static void mainLoop(void);<br />static void draw( void );<br /><br />int main(int argc, char **argv)<br />{<br /> glutInit(&argc, argv);<br /> init();<br /><br /> arVideoCapStart();<br /> argMainLoop( NULL, keyEvent, mainLoop );<br /> return (0);<br />}<br /><br />static void keyEvent( unsigned char key, int x, int y)<br />{<br /> /* quit if the ESC key is pressed */<br /> if( key == 0x1b ) {<br /> printf("*** %f (frame/sec)\n", (double)count/arUtilTimer());<br /> cleanup();<br /> exit(0);<br /> }<br />}<br /><br />/* main loop */<br />static void mainLoop(void)<br />{<br /> ARUint8 *dataPtr;<br /> ARMarkerInfo *marker_info;<br /> int marker_num;<br /> int j, k;<br /><br /> /* grab a video frame */<br /> if( (dataPtr = (ARUint8 *)arVideoGetImage()) == NULL ) {<br /> arUtilSleep(2);<br /> return;<br /> }<br /> if( count == 0 ) arUtilTimerReset();<br /> count++;<br /><br /> argDrawMode2D();<br /> argDispImage( dataPtr, 0,0 );<br /><br /> /* detect the 
markers in the video frame */<br /> if( arDetectMarker(dataPtr, thresh, &marker_info, &marker_num) < 0 ) {<br /> cleanup();<br /> exit(0);<br /> }<br /><br /> arVideoCapNext();<br /><br /> /* check for object visibility */<br /> k = -1;<br /> for( j = 0; j < marker_num; j++ ) {<br /> if( patt_id == marker_info[j].id ) {<br /> if( k == -1 ) k = j;<br /> else if( marker_info[k].cf < marker_info[j].cf ) k = j;<br /> }<br /> }<br /> if( k == -1 ) {<br /> argSwapBuffers();<br /> return;<br /> }<br /><br /> /* get the transformation between the marker and the real camera */<br /> arGetTransMat(&marker_info[k], patt_center, patt_width, patt_trans);<br /><br /> draw();<br /><br /> argSwapBuffers();<br />}<br /><br />static void init( void )<br />{<br /> ARParam wparam;<br /> <br /> /* open the video path */<br /> if( arVideoOpen( vconf ) < 0 ) exit(0);<br /> /* find the size of the window */<br /> if( arVideoInqSize(&xsize, &ysize) < 0 ) exit(0);<br /> printf("Image size (x,y) = (%d,%d)\n", xsize, ysize);<br /><br /> /* set the initial camera parameters */<br /> if( arParamLoad(cparam_name, 1, &wparam) < 0 ) {<br /> printf("Camera parameter load error !!\n");<br /> exit(0);<br /> }<br /> arParamChangeSize( &wparam, xsize, ysize, &cparam );<br /> arInitCparam( &cparam );<br /> printf("*** Camera Parameter ***\n");<br /> arParamDisp( &cparam );<br /><br /> if( (patt_id=arLoadPatt(patt_name)) < 0 ) {<br /> printf("pattern load error !!\n");<br /> exit(0);<br /> }<br /><br /> /* open the graphics window */<br /> argInit( &cparam, 1.0, 0, 0, 0, 0 );<br />}<br /><br />/* cleanup function called when program exits */<br />static void cleanup(void)<br />{<br /> arVideoCapStop();<br /> arVideoClose();<br /> argCleanup();<br />}<br /><br />static void draw( void )<br />{<br /> double gl_para[16];<br /> GLfloat mat_ambient[] = {0.0, 0.0, 1.0, 1.0};<br /> GLfloat mat_flash[] = {0.0, 0.0, 1.0, 1.0};<br /> GLfloat mat_flash_shiny[] = {50.0};<br /> GLfloat light_position[] = 
{100.0,-200.0,200.0,0.0};<br /> GLfloat ambi[] = {0.1, 0.1, 0.1, 0.1};<br /> GLfloat lightZeroColor[] = {0.9, 0.9, 0.9, 0.1};<br /> <br /> argDrawMode3D();<br /> argDraw3dCamera( 0, 0 );<br /> glClearDepth( 1.0 );<br /> glClear(GL_DEPTH_BUFFER_BIT);<br /> glEnable(GL_DEPTH_TEST);<br /> glDepthFunc(GL_LEQUAL);<br /> <br /> /* load the camera transformation matrix */<br /> argConvGlpara(patt_trans, gl_para);<br /> glMatrixMode(GL_MODELVIEW);<br /> glLoadMatrixd( gl_para );<br /><br /> glEnable(GL_LIGHTING);<br /> glEnable(GL_LIGHT0);<br /> glLightfv(GL_LIGHT0, GL_POSITION, light_position);<br /> glLightfv(GL_LIGHT0, GL_AMBIENT, ambi);<br /> glLightfv(GL_LIGHT0, GL_DIFFUSE, lightZeroColor);<br /> glMaterialfv(GL_FRONT, GL_SPECULAR, mat_flash);<br /> glMaterialfv(GL_FRONT, GL_SHININESS, mat_flash_shiny); <br /> glMaterialfv(GL_FRONT, GL_AMBIENT, mat_ambient);<br /> glMatrixMode(GL_MODELVIEW);<br /> glTranslatef( 0.0, 0.0, 25.0 );<br /> glutSolidCube(50.0);<br /> glDisable( GL_LIGHTING );<br /><br /> glDisable( GL_DEPTH_TEST );<br />}<br /></code><br /><br />If you want to play around with this you should first print out all the markers provided with the toolkit.<br /><br />The output of this program can be seen once you display your "Hiro" marker in front of the camera. 
A blue cube appears on the pattern and it turns in accordance with your movements.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtTSJbxpXRYAuIXvRqTVehPv9zXtoEgYYXwTT-CZtkDDYSYu_Typv-XZKFEDfEn3Jyjgb63_0CU7LQL4b12w0fC9PlJ6xtk1pSbG3C_68mpoB1xy3SSHfsm3cFJpnSCfZG5ZqWVhdaBcg/s1600/scr001.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 320px; height: 250px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjtTSJbxpXRYAuIXvRqTVehPv9zXtoEgYYXwTT-CZtkDDYSYu_Typv-XZKFEDfEn3Jyjgb63_0CU7LQL4b12w0fC9PlJ6xtk1pSbG3C_68mpoB1xy3SSHfsm3cFJpnSCfZG5ZqWVhdaBcg/s320/scr001.jpg" alt="" id="BLOGGER_PHOTO_ID_5454797960683027426" border="0" /></a><br /><br />Most of the code is self-explanatory and the rest will become clear once you write a few programs yourself, so I won't waste time elaborating on it.<br /><br />So this was my first encounter with ARToolKit.<br /><br />The next day I met my guide (after trying out ARToolKit) and discussed it with him. We decided that I should develop a demo in which humans can interactively control these objects. Exactly what the demo should be didn't surface until a little later, but until then I was tinkering with ARToolKit. In between I tried to recreate the functionality of ARToolKit using OpenCV, which I left midway (I hope to resume it soon) as soon as I had a solid idea of what I was going to do about the demo.Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com5tag:blogger.com,1999:blog-2855502657366642727.post-58472203068247037792010-03-31T12:28:00.005+05:302010-03-31T19:54:56.405+05:30Augmented RealityI am currently in IIIT-Hyderabad immersed in my project on augmented reality. I was introduced to it by a professor here a few months ago. 
For those of you who haven't heard of augmented reality, it is an upcoming field that will soon play a huge part in revolutionizing human-computer interaction. Augmented Reality, as the term suggests, means adding virtual elements to a real environment<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIbT3C_FNyfyHnm1LOnehYBWmRyXeTH6e44j6I5N9lc0IGK1B7Uzh3CH9nwucI9GKvSOsrSNa5M2hoJ2baZCmqZBxk1lf9wlDmDdKz-Jka5A2pEeMtM4OsyicmF7rAYCMGrZK92GERQ8Y/s1600/augmented-reality-dreamtime.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 224px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIbT3C_FNyfyHnm1LOnehYBWmRyXeTH6e44j6I5N9lc0IGK1B7Uzh3CH9nwucI9GKvSOsrSNa5M2hoJ2baZCmqZBxk1lf9wlDmDdKz-Jka5A2pEeMtM4OsyicmF7rAYCMGrZK92GERQ8Y/s320/augmented-reality-dreamtime.jpg" alt="" id="BLOGGER_PHOTO_ID_5454700811582354194" border="0" /></a>. In other words it can be described as a merger of reality and synthesized virtual realms. The results are truly a feast for the eyes! For instance, in the picture shown here, the two people viewing through their head-mounted displays, fed by head-mounted cameras, can see a virtual city placed on a real table! The city is rendered by the program and placed in the correct position and in the right orientation according to certain calculations which I will describe in detail further ahead.<br /><br /><br /><br />Augmented reality can be seen as a progeny of existing computer vision concepts. The basic principle is simple:<br /><ul><li>First, it is assumed that the camera sees what the user sees, which is almost true given its close proximity to the user's eye, once radial and axial distortions are methodically eliminated.</li><li>Since we wish to place our virtual objects in the scene and make them look convincingly real, we need to know the mapping from the scene to the image plane of the camera. 
Put simply, the virtual objects need to be aligned with the perspective of the scene. For example, a user looking at a real teapot and a virtual teapot placed at the same distance from the camera should be convinced that the scene is real. The virtual teapot should not look "out of place" in the scene. There are several novel methods that can be used to achieve this, which I will go through in much more detail as I describe my project.</li><li>The scene should be rendered efficiently. Since the frame rate of the video should be maintained at a reasonable and constant value, we need to finish our frame-by-frame processing quickly and efficiently.</li></ul>Now my project here at IIIT-Hyderabad is to find ways to add human interaction to this. So I set out to make a demo, a game rather. My game is quite simple. There is a plane that the user holds in front of a camera (like a sheet of paper). A virtual maze will be created on this plane along with a virtual ball. The objective of the game is for the user to guide the ball through the maze by tilting the plane in his hand. Remember that the ball should be under the influence of gravity to make this possible! :)<br /><br />It has been almost 2 months since I started brooding on this project and now I have a working prototype. I now have a 4-sided boundary region in which the user can play with the ball, with collision, gravity and a fully functional force accumulator model included. I regret not blogging my progress all this while, but now I have some catching up to do. I will devote any free time I have to enumerating the progress I made over the past 2 months and other proceedings in my project.
I will soon upload my game so that you can play it too!Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0tag:blogger.com,1999:blog-2855502657366642727.post-2039564479903300572010-03-31T12:04:00.003+05:302010-03-31T12:19:33.357+05:30Back to Blogging!I have been away from blogging for long and I now wish to make up for the lost time.<br />Blogging is a healthy habit. It enables one to maintain a log of one's thoughts and actions. Once you look back after years of blogging, you can see your life since then as a smooth slide show. This thought excites me. I often have lots to share and I feel a surge of enthusiasm to log my work here. But on every such occasion my laziness has successfully taken over. Now I am determined to get over that phase! It helps when I think of my blog as a means of torturing people with my rantings ;) ( Just kidding..hehe). Anyway, I am looking forward to some productive blog activity and urge you to motivate me by posting your queries and arguments.Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0tag:blogger.com,1999:blog-2855502657366642727.post-46438871087117783782009-06-10T22:36:00.003+05:302011-02-18T15:21:41.605+05:30Function-Plot: Sample OpenGL programBefore we sit down to write code we have to set up our environment. You can download the glut library from <a href="http://www.opengl.org/resources/libraries/glut/glut_downloads.php">here</a>. Alternatively, you can use the Dev C++ IDE's Package Manager to download it: go to Tools>Check for Updates/Packages and download glut from there.<br />After the download finishes you can start a new glut project by clicking File>New>Project, selecting the Multimedia tab and choosing glut as your project type.<br />You will instantly see a sample glut program. Just to get a taste of what you can accomplish, try running it.
Select Execute>Compile and Run or press F9.<br /><br />Now let's jump straight into our own glut project. Our objective is to plot a simple two-dimensional sine curve on the screen. This will help familiarize you with OpenGL programming. Try running this program; I will explain the code in detail below.<br /><br /><code><br />#include &lt;GL/glut.h&gt;<br />#include &lt;stdlib.h&gt;<br />#include &lt;math.h&gt;<br /><br />void Init(void)<br />{<br />glClearColor(0.0f,0.0f,0.0f,0.0f); // choose a color to clear the screen with<br />glColor3f(1.0f,1.0f,1.0f); // choose a drawing color<br />glPointSize(1.0f);<br />glMatrixMode(GL_PROJECTION);<br />glLoadIdentity();<br />gluOrtho2D(0.0,640.0,0.0,480.0); // set up the world co-ordinate system<br />}<br /><br />void funcPlot(void)<br />{<br />glClear(GL_COLOR_BUFFER_BIT);<br />GLdouble x,y;<br />glBegin(GL_LINES); // start drawing lines<br />glVertex2i(0,240); // a vertex (a point, well almost) at (0,240) world co-ordinates<br />glVertex2i(640,240);<br />glEnd(); // end drawing<br /><br />glBegin(GL_POINTS); // begin drawing points<br />for(x=0.0;x&lt;640.0;x+=0.5)<br />{<br />y = (480/2) + (100 * sin(x/20)); // scaled so a few periods span the window<br />glVertex2d(x,y); // plot one point of the curve<br />}<br />glEnd(); // end drawing<br />glFlush(); // push the drawing to the display<br />}<br /><br />int main(int argc,char** argv)<br />{<br />glutInit(&amp;argc,argv);<br />glutInitWindowPosition(0,0);<br />glutInitWindowSize(640,480);<br />glutInitDisplayMode(GLUT_SINGLE | GLUT_RGBA);<br />glutCreateWindow("Function Plot Example");<br />Init();<br />glutDisplayFunc(funcPlot);<br />glutMainLoop();<br />return(0);<br />}<br /></code><br /><br /><div>To take you through this I prefer to start with the main() function. The function glutInit() initializes the glut library and negotiates a session with your native window system. The function calls that follow set the various properties of your rendering context and your window instance. The beauty of glut is that you needn't worry about creating window handles and managing contexts; glut does it all for you, and its internal implementation performs these tasks according to your native window system. Hence the platform independence. The glutInitDisplayMode() call defines your rendering context. The parameters passed to it, i.e. GLUT_SINGLE | GLUT_RGBA, denote that you need a single display buffer and that your color mode is RGBA, i.e. Red, Green and Blue plus an alpha channel. What they mean exactly will become clear later, but for now think of the display buffer as a contiguous memory block where your rendered scene is temporarily stored before it is pushed to the display. The color mode needs further explanation. As you know, white light comprises all colors; colored light can be described as a combination of various color components in specific proportions. Thus the RGB color mode is a method in which every color is denoted as a combination of Red, Green and Blue. The next three lines are self-explanatory. glutInitWindowSize() specifies the size, or resolution, of your window as (pixel width) x (pixel height). glutInitWindowPosition() positions your window relative to the screen co-ordinate system. The screen co-ordinate system is defined in two dimensions: the top-left corner of your screen is the origin (0,0), and the x and y co-ordinates increase as we move right and downward respectively. Finally, glutCreateWindow() creates your window with the title specified as the parameter.
The next section is where you register your callback functions. Callback functions, in simple terms, are functions that are called when a specific event occurs. For instance, a screen resize operation gives rise to a resize event (I will introduce this later). To handle this event we implement a function that is to be called every time a resize event occurs. Then, to let glut know that this is the function to be called for such an event, we register it. In the next line we see that the display callback is registered. The Init() function is where you set up the canvas for you to draw on. glClearColor() specifies the color you want to clear your screen with every time you redraw the scene. The first three values are the Red, Green and Blue components of your color and the fourth value is alpha (leave this for now). The glColor3f() function chooses your drawing color. The next few lines are better understood later, so for now think of this operation as setting up your world co-ordinate system. The display callback that we registered in main is implemented as shown. I believe most of you will have no difficulty understanding it, and for now the comments in the code should suffice. </div>Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0tag:blogger.com,1999:blog-2855502657366642727.post-68548614105531141822009-06-06T21:29:00.000+05:302009-06-07T18:40:16.609+05:30OpenGL Introduction sans the nonsenseThis post is intended to make you feel at ease with using the OpenGL API. Here I will cover the functionality of the low-level procedures along with a brief overview of graphics libraries in general.<br /> <br /> It is naturally intimidating to see highly complex computer displays, or the cutting-edge 3D graphics and visuals of a movie scene. But it might surprise you to know that, whatever the case, the end result of that ultra-realistic 3D world is a sequence of flattened images on a 2D screen.
What actually happens is that a set of functions is implemented to calculate the depths, the lighting, the surface normals, etc. (all of which I will explain clearly later) in the virtual world, and finally map the 3D scene onto a 2D plane. Thankfully, the implementation of these functions (which varies from library to library) is abstracted from the programmer.<br />Any known graphics library adheres to the following simple (at times) steps:<br /><ol><li>Establish a co-ordinate system for the screen onto which the image is projected.</li><li>Establish a co-ordinate system for the virtual world.</li><li>Implement a function that allows the programmer to draw a point (the smallest point one can draw is a pixel). It is this basic primitive that allows a programmer to draw complex shapes and objects.</li><li>Allow a way to specify normals to any surface or point (this helps in lighting and also when we want to simulate physics).</li><li>Provide methods to operate on matrices (matrices are used to translate, rotate or scale objects).</li><li>Provide other common methods that might make the programmer's life easy.</li></ol>Now let's jump straight to OpenGL.<br />Most books introduce you to a simple OpenGL program to start off with. But I feel there is a need to break this tradition. So, I'll first do the explaining and then show you the program.<br /><br />OpenGL, like all other graphics libraries, encapsulates the aforesaid points in some way or the other. This will be clear if I explain in plain language the structure of a typical OpenGL program. Keep in mind that I am using GLUT.<br /><br /><ul><li>Include the necessary header files.</li><li>Write a main function like any C/C++ program, and in main call a few glut functions which will set up and handle the window in which you are going to draw.
To be more specific, you will be setting up the display mode, the window size, the window position, menus in your windows (if there are any), etc.</li><li>Also in the main function, after the initialization done in the previous step, you register the callback functions for the necessary events. Further explanation is required here. In any window-based program, the code that is to be executed next is decided based on events. For instance, when you click a mouse button, a mouse button event is generated; to act on this event and perform the necessary action, the programmer implements a callback function for mouse events. But how does GLUT know that the function that is implemented is a mouse callback? Hence the callback function is registered with GLUT as a mouse-event callback function.</li><li>Now these callback functions that were just registered have to be implemented to actually be of any use, so that is done next. Note that since the callbacks are registered inside main, they must be implemented (or at least declared) before your main function.<br /></li></ul>And that's it! That's how simple an OpenGL program is.<br />The next post presents a simple OpenGL program where a sine function is plotted on screen.Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0tag:blogger.com,1999:blog-2855502657366642727.post-43864847210985650142009-06-06T19:44:00.000+05:302009-06-06T21:28:53.709+05:30OpenGL - IntroductionOpenGL is widely considered hard to tame. It is often intimidating to amateur and overenthusiastic programmers whose approach is to take a plunge into it at first sight. Such a plunge may seem encouraging, but it often leaves them terribly confused.<br /><br /> Before GLUT (the GL Utility Toolkit), which conveniently keeps us away from the complexities of window-system-related code, OpenGL programming was a daunting task.
Despite the benefit GLUT gives us, it helps to be guided by a detailed tutorial or walk-through.<br /><br /> The sole aim of this tutorial is to eliminate the hardships faced by wannabe "GLUTtons". This tutorial is tailored the way in which I taught myself this skill. I will try my best to ensure that I leave no doubts in the reader's mind.Krishnakanth Mallikhttp://www.blogger.com/profile/18368128453294368556noreply@blogger.com0