Assignment 3 Lecture Notes

The following will act as lecture notes to help you review the material from lecture for the assignment. You can click on the links below to get directed to the appropriate section or subsection.


Section 1: OpenGL

OpenGL (Open Graphics Library) is a cross-language, multi-platform API for rendering 2D and 3D graphics. It provides a plethora of functionality that automatically handles tedious graphics computations for us. For instance, all the matrix manipulations and the lighting and shading computations that you had to do manually in the previous two assignments are automatically done for us in OpenGL with some simple function calls.

For this class, we will be using OpenGL 3.0. It is not the newest version of OpenGL, but it is stable; and it contains all the necessary functionality for this class. Most of the syntax in OpenGL 3.0 carries over to the newer versions, so you should still be able to use more modern OpenGL without too much difficulty after this class. The main difference between OpenGL 3.0 and the newer versions is that OpenGL 3.0 depends on glut, which has been deprecated on Mac OS. Keep this in mind when looking into newer versions.

To help you learn how to write programs using OpenGL, we have provided a different kind of “lecture notes” than what you are used to. Rather than trying to explain OpenGL syntax on this webpage, we believe that it would be more effective to learn by seeing the syntax used in actual source code. Hence, the “lecture notes” for OpenGL are actually extensive, instructive comments in this demo program. This is the same demo program that we asked you to run previously to test your OpenGL environment, and it is also included within the Assignment 3 files.

As a note, the demo does not cover the OpenGL Shading Language (GLSL for short). GLSL will be covered in the next set of notes.

The comments and code are all in the opengl_demo.cpp file. You may consider the file to be a template for writing your own OpenGL programs.


Section 2: The Arcball

The Arcball is an intuitive method proposed by Ken Shoemake in 1992 for manipulating and rotating a scene with the mouse [1]. Ever since its introduction, the Arcball has become the primary mechanism that people use to rotate scenes and objects in a 3D computer environment. You may have already encountered the Arcball or variants of it if you have ever used popular modeling software such as Blender or many of Autodesk’s products. In this section, we discuss the details behind the Arcball mechanism and how to implement it with basic rotation matrices; the sections after this one discuss an alternative, more elegant way to implement the Arcball using a construct known as the quaternion.

The best way to get a sense of how the Arcball works is by seeing a demonstration, so let us begin by considering the following movie (if a picture is worth 1000 words, how many words is a movie worth?):

The above movie shows an OpenGL program that allows us to rotate a given scene with the mouse via the Arcball. In this case, the scene that we are rotating is a simple one that has the Stanford bunny mesh centered at the origin. You can see that, usually, when we click and drag the mouse from one point on the screen to another, the scene rotates in a direction that intuitively corresponds to the direction in which we move the mouse. And when we click and drag the mouse near the corners of the screen, the scene rotates only around an axis that perpendicularly intersects with the center of the screen. This gives us two different ways to rotate the scene with the mouse.

The Arcball works by trying to map a click-and-drag motion of the mouse on the screen to a rotation about the surface of a sphere of radius 1 that is inscribed into our viewing cube in NDC. Figure 1 shows a visual of this:

pict
Figure 1: A click-and-drag motion of the mouse on our screen (left) is mapped to a rotation about the surface of a sphere inscribed into our viewing cube in NDC (right). The points in the blue region in the left diagram are mapped to points on the surface of the sphere in the right diagram. Here, (px,py) and (px,p y) are mapped from screen coordinates to (x,y) and (x,y) in NDC respectively; the z-NDC are computed using a function that we discuss below. The Arcball method computes the rotation that would take us from the position vector (x,y,z) to the position vector (x,y,z) and uses it to rotate the scene.

Suppose we click our mouse when it is at point p = (px,py) on our screen in screen coordinates, and we move the mouse to another point p = (p x,p y) on our screen before we release the mouse button. The Arcball method first converts points p and p from screen coordinates to NDC, ignoring the missing z-component from the screen coordinates for now. Let us denote the NDC of p and p with pndc = (x,y) and pndc = (x,y) respectively. From here, the Arcball computes z-coordinates for pndc and pndc by trying to map them to points on the surface of a sphere of radius 1, centerted at the origin of our NDC system. The z-coordinates of the two points are computed with the following function:

z = f(x,y) = + 1 - x2 - y2 0 ,if(x2 + y2 1) ,if(x2 + y2 > 1)

where the + in front of the square root means that we take the positive root. Let us designate our two points now as pndc = (x,y,z) and pndc = (x,y,z), where z and z were computed using the above function.

The idea of the function is to map, if possible, the start and end points of our mouse motion on the screen to points on the surface of this inscribed sphere. Suppose both our start and end points (pndc and pndc respectively) are mapped onto the surface of the sphere successfully; i.e. (x2 + y2 1) and ((x)2 + (y)2 1). The Arcball method then computes the rotation that transforms the position vector pndc to the position vector pndc. Both these position vectors are shown in Figure 1.

As we have seen in computer graphics, a rotation can be defined in terms of an angle θ and a unit vector u to rotate about. It is pretty straightforward to see that the rotation angle θ and rotation vector v for our rotation of pndc to pndc are given by:

θ = arccos( pndc pndc |pndc||pndc|)

u = pndc × pndc

Remember to normalize u to make it a unit vector. Also, be careful with floating and double point precision when computing arccos! Imperfect precision can cause the dot product within the arccos to slightly go above 1 (e.g. 1.00001), which would trigger a NAN from the arccos function. To safely compute arccos, we should take care to compute it as: arccos( min( 1 , arg ) ).

Once we have θ and u, we can form our usual axis-angle rotation matrix:

R = ux2 + (1 - u x2) cos θ u xuy(1 - cos θ) - uz sin θuxuz(1 - cos θ) + uy sin θ0 uyux(1 - cos θ) + uz sin θ uy2 + (1 - u y2) cos θ u yuz(1 - cos θ) - ux sin θ0 uzux(1 - cos θ) - uy sin θuzuy(1 - cos θ) + ux sin θ uz2 + (1 - u z2) cos θ 0 0 0 0 1

which we can use to rotate our scene.

One nice way to think about the Arcball is to imagine the scene that we see on our screen to be enclosed and set in-place within a sphere (or ellipsoid if the screen is not perfectly square). When we click and drag the mouse across the screen, we are essentially grabbing the sphere and rotating it about the center of the screen; and anything within the sphere rotates along with it.

In the event that the start and end points of our mouse motion fall outside the sphere in NDC, we still use the same process above for determining the rotation matrix. What will happen is that a z-component of 0 for the position vectors just simply results in a rotation about the z-axis (i.e. an axis coming out of the screen and through its center). This is what causes the two dimensional rotations that occur in the above movie when we click and drag near the corners of the gray screen.

In practice, when we implement the Arcball, we rotate the scene after every motion (i.e. change in screen coordinates) of the mouse that takes place between the initial click and the release of the mouse button. Using the method above, the OpenGL code structure for the Arcball would look something like the following:


Algorithm 1:
1:  function Init( ... )
2:   ...
3:   last_rotation Identity()
4:   current_rotation Identity()
5:   ...
6:  end function
7:  
8:  function Mouse_Click( ... ,px,py)
9:   (pxstart,pystart) (px,py)
10:   ...
11:  end function
12:  
13:  function Mouse_Motion( ... ,px,py)
14:   (pxcurrent,pycurrent) (px,py)
15:   current_rotation Compute_Rotation_Matrix(pxcurrent,pycurrent,pxstart,pystart)
16:   ...
17:  end function
18:  
19:  function Get_Current_Rotation( ... )
20:   return current_rotation * last_rotation
21:  end function
22:  
23:  function Mouse_Release( ... ,px,py)
24:   last_rotation current_rotation * last_rotation
25:   current_rotation Identity()
26:   ...
27:  end function
28:  
29:  function Draw_Scene( ... )
30:   // code to set up screen, camera, etc
31:   glMultMatrixd( Get_Current_Rotation() )
32:   // code to set up lights and then draw
33:   ...
34:  end function

where Compute_Rotation_Matrix() on line 15 is the procedure we describe above for computing the rotation matrix; and glMultMatrixd() on line 31 is the OpenGL function for transforming the current matrix mode (i.e. projection or modelview) by an arbitrary matrix, which is passed in as an argument. The passed-in matrix must be a pointer to a 4x4 column-major matrix structure. Luckily, the C++ matrix library, Eigen, already stores matrices in column-major, so we can simply pass in the data array of an Eigen matrix (e.g. my_matrix.data()) whenever we want to use glMultMatrixd().

The above code structure has the Arcball rotate scenes about the origin. This means though that scenes with translated cameras in the x and y directions will not rotate around the center of the screen, since the world origin wouldn’t correspond with the screen origin. This is often fine, because when we want to use the Arcball, it is usually to examine individual meshes; and in these situations, we would find it most convenient to center the mesh at the origin. Indeed, the Stanford bunny in the demo video above is centered at the origin to make it easier to examine with the Arcball.

As a final note for this section, we would like to point out that there is an alternative, more elegant way to implement the Arcball than the above-mentioned method. The alternative method is to compute and combine the rotations using a construct known as the quaternion instead of the usual axis-angle rotation matrices. One advantage that we can point out right now to using quaternions over rotation matrices is that it allows us to avoid the computational cost of constantly multiplying rotation matrices every render (line 31, which calls line 20, of Algorithm 1). We will see in the following sections that it is computationally more efficient to combine rotations using quaternions than rotation matrices.


Section 3: Quaternions

The quaternion is an extension of complex numbers that is often used in computer graphics for representing rotations. Its main uses in computer graphics are in interpolating rotations – which we will cover later in the class – and, of course, in the Arcball. The theory of quaternions is a large area of study, and we unfortunately cannot cover all of it within the scope of this class. Rather, we only present a cursory overview of the concept for the purposes of computer graphics. We do encourage those who are curious to look more into the matter on their own independent study.

Quaternion Definitions

A quaternion extends the idea of a complex number by adding two more “imaginary” components. While a complex number simply has a real and an imaginary (i) component, quaternions have a real component (s) and three “imaginary” components (i,j,k). We generally express quaternions in the following form:

q = s + xi + yj + zk

where s,x,y,z . People also often refer to the real component as the w-component instead of the s-component. Both w and s are commonly used, but for this class, we will use s.

The i,j,k components of a quaternion satisfy the conditions:

i2 = j2 = k2 = ijk = -1

ij = k ji = -k ,jk = i ,kj = -i ,ki = j ,ik = -j

We don’t need to worry about these conditions for the purposes of this class, but they are necessary for the definition of the quaternion. For those who are curious, the above relationships between the i,j,k components correspond to the cross product relationships for the unit Cartesian vectors:

x × y = z y × x = -z ,y × z = x ,z × y = -x ,z × x = y ,x × z = -y

We often find it convenient to represent quaternions as pairs of [s,v], with s ,v 3, and v = xi + yi + zk. Using this notation, we introduce the following basic arithmetic operations in the context of quaternions. Given two quaternions qa = [sa,va] and qb = [sb,vb], we can perform quaternion addition and subtraction as follows:

qa + qb = [sa + sb,va + vb]

qa - qb = [sa - sb,va - vb]

Given a constant c , we also have scalar multiplication of a quaternion as follows:

cqa = [csa,cva]

The product between two quaternions is given by:

qaqb = (sa + xai + yaj + zak)(sb + xbi + ybj + zbk)

= [sasb - va vb,savb + sbva + va × vb]

where the first line can be simplified to the second line with some algebra.

Just like how a complex number has a complex conjugate, a quaternion q has its own defined quaternion conjugate:

q* = [s,-v]

As one might expect, the norm of a quaternion is given by:

|q| = s2 + x2 + y2 + z2

To normalize a quaternion into a unit quaternion, we simply divide the quaternion by its norm. Note that, similar to complex numbers, we also have the following relationship involving the conjugate and norm for quaternions:

qq* = |q|2

Finally, the inverse of a quaternion is defined as:

q-1 = q* |q|2

If we multiply a quaternion by its own inverse, then we get the expected identity:

qq-1 = [1, 0] = 1

Rotations with Quaternions

In the context of rotations, quaternions give us a simple way to encode both the axis and angle of a rotation in a set of 4 numbers. Let u = (ux,uy,uz) = uxi + uyj + uzk be the unit vector that represents our rotation axis of interest in 3, and let θ be the rotation angle. We define the corresponding quaternion for this rotation using an extension of Euler’s formula (i.e. eiθ = cos θ + i sin θ) as so:

r = eθ 2(uxi+uyj+uzk) = cos θ 2 + (uxi + uyj + uzk) sin θ 2

= [cos θ 2,(uxi + uyj + uzk) sin θ 2]

To apply our rotation r on a point p = (px,py,pz) 3, we first represent p as the quaternion q = [0,pxi + pyj + pzk], and then evaluate the conjugation of q by r, defined to be:

q = rqr-1

We can show through some algebra that rqr-1 evaluates to q = [0,p xi + p yj + p zk], which contains the new, rotated position of our original point p within the vector component of the quaternion. As a side note, it can also be shown that the conjugation operation can be algebraically simplified into the general axis-angle rotation matrix that we have been using throughout this class (i.e. one way to prove the correctness of our usual axis-angle rotation matrix is through quaternions).

To clarify the conversion between the axis-angle and quaternion representations of rotations, we explicitly list out the conversion equations below. Let our unit rotation axis be denoted with u = (ux,uy,uz), our rotation angle be θ, and our unit quaternion be q = [qs,qxi + qyj + qzk]. Then:

qs = cos θ 2 qx = ux sin θ 2 ,qy = uy sin θ 2 ,qz = uz sin θ 2

And taking note that sin 2(θ 2) = 1 - cos 2(θ 2) = 1 - qs2, we have:

θ = 2 arccos(qs) ux = qx1 - qs 2 ,uz = qz1 - qs 2 ,uy = qy1 - qs 2

The square root, arccos, and division in the last four equations are a bit discerning because they are functions known to lead to singularities. Thankfully, only the division can actually lead to a singularity.

Because q is a unit quaternion, qs 1, so we do not need to worry about the square root or inverse cosine returning anything but a real number. However, it is possible for the square root to be 0. If that were the case, then we would get a divide-by-zero when computing any of the axis components. This would be an issue if it were not for the fact that 0 = 2 arccos(1) = θ. For the square root to be 0, qs must be 1. But if qs = 1, then θ = 0. This allows us to specify any arbitrary axis for u, since a rotation angle of θ = 0 means that we have no rotation at all. So in the event that the square root is zero, we can set u to be any arbitrary axis.

It can also be shown that we can directly express a quaternion q = [qs,qxi + qyj + qzk] as the following rotation matrix:

R = 1 - 2qy2 - 2q z2 2(q xqy - qzqs) 2(qxqz + qyqs) 0 2(qxqy + qzqs) 1 - 2qx2 - 2q z2 2(q yqz - qxqs) 0 2(qxqz - qyqs) 2(qyqz + qxqs) 1 - 2qx2 - 2q y20 0 0 0 1

Finally, just like how we can combine rotations by multiplying their matrix representations together, we can also combine rotations by multiplying their quaternion representations together. That is, given two rotations represented by quaternions q1 and q2, the product:

q = q 2q1

corresponds to the rotation obtained by first applying the rotation represented by q1 and then applying the rotation represented by q2. This is a much more elegant way of combining rotations than multiplying their axis-angle rotation matrices, which is more computationally expensive.


Section 4: The Arcball with Quaternions

Using what we know about quaternions, we can modify Algorithm 1 to produce a more efficient Arcball implementation:


Algorithm 2:
1:  function Init( ... )
2:   ...
3:   last_rotation Identity()
4:   current_rotation Identity()
5:   ...
6:  end function
7:  
8:  function Mouse_Click( ... ,px,py)
9:   (pxstart,pystart) (px,py)
10:   ...
11:  end function
12:  
13:  function Mouse_Motion( ... ,px,py)
14:   (pxcurrent,pycurrent) (px,py)
15:   current_rotation Compute_Rotation_Quaternion(pxcurrent,pycurrent,pxstart,pystart)
16:   ...
17:  end function
18:  
19:  function Get_Current_Rotation( ... )
20:   return current_rotation * last_rotation
21:  end function
22:  
23:  function Mouse_Release( ... ,px,py)
24:   last_rotation current_rotation * last_rotation
25:   current_rotation Identity()
26:   ...
27:  end function
28:  
29:  function Draw_Scene( ... )
30:   // code to set up screen, camera, etc
31:   glMultMatrixd( Get_Current_Rotation() )
32:   // code to set up lights and then draw
33:   ...
34:  end function

This time, last_rotation and current_rotation are quaternions rather than rotation matrices. This makes line 20, which frequently gets called by line 31, less computationally expensive. Notice how the only matrix operation that occurs now is the mandatory OpenGL call to glMultMatrixd().

The structure of the code remains the same with only a couple of changes in the form of new functions. The Compute_Rotation_Quaternion() function computes the unit rotation axis and rotation angle based on the passed-in points and returns the quaternion that represents the rotation. The Get_Rotation_Matrix() function converts a given unit quaternion to a rotation matrix. For the latter function, it would be best to directly express the quaternion as the following rotation matrix:

R = 1 - 2qy2 - 2q z2 2(q xqy - qzqs) 2(qxqz + qyqs) 0 2(qxqy + qzqs) 1 - 2qx2 - 2q z2 2(q yqz - qxqs) 0 2(qxqz - qyqs) 2(qyqz + qxqs) 1 - 2qx2 - 2q y20 0 0 0 1

This would be faster than converting the quaternion to its axis-angle representation and then forming the axis-angle rotation matrix; it also avoids the need to call the sine and cosine functions.


References

[1] http://dl.acm.org/citation.cfm?id=155312


Written by Kevin (Kevli) Li (Class of 2016).
Links: Home Assignments Contacts Policies Resources