The following will act as lecture notes to help you review the material from lecture for the assignment. You can click on the links below to get directed to the appropriate section.

Superquadrics are a family of inside-outside functions (defined below) which allow us to generate a variety of implicit three-dimensional shapes that we can easily compute normals and gradients for; perfect for a simple raytracing program to work with.

In order to define the shapes we’ll be ray tracing, we use an inside-outside function, defined to be some

f : ℝ^{3} \to ℝ

such that

f (x, y, z) \{\begin{matrix} < 0 & inside the object \\ = 0 & on the object’s surface \\ > 0 & outside the object \end{matrix}

Many common solids can be defined this way, such as the unit cube, unit sphere, octahedron, unit cylinder, and double cone:

\begin{array}{l} cube (x, y, z) & = - 1 + \max (| x |, | y |, | z |) \\ sphere (x, y, z) & = - 1 + x^{2} + y^{2} + z^{2} \\ octahedron (x, y, z) & = - 1 + | x | + | y | + | z | \\ cylinder (x, y, z) & = - 1 + \max (x^{2} + y^{2}, z^{2}) \\ double cone (x, y, z) & = - 1 + \sqrt{x^{2} + y^{2}} + | z | \end{array}

One can see that by setting the left-hand side of side of these equations to 0, we get an implicit definition for each of these shapes.

Instead of having to use a variety of different functions for different shapes however, we introduce superquadrics, which themeselves are a three-dimensional generalization made by Prof. Barr in 1981, of Lamé curves (also known as Piet Hein’s “superellipses”). The general superquadric function can then be defined to mimic the surfaces described above, amongst others. It is even possible to extend the math to create toroids; however, we limit the scope to dealing solely with ellipsoids, defined by a longitudinal shape parameter

e

and latitudinal parameter

n

S (x, y, z, e, n) = - 1 + {(z^{2})}^{\frac{1}{n}} + {({(x^{2})}^{\frac{1}{e}} + {(y^{2})}^{\frac{1}{e}})}^{\frac{e}{n}}

Looking at the general form, we can verify that when

e = n = 1

this become the same inside-outside function as the unit sphere, and similarly, when

e = n = 2

, we have an octahedron.

Keep in mind that when computing the result of this function, the squaring of the

x

y

and

z

values must occur prior to the other exponentiation since negative coordinates would cause computational errors.

With an implicit definition of a particular superquadric, we can also compute the surface normal at a given point

(x, y, z)

by computing the gradient of the superquadric function. We therefore have:

The following images are rendered images of superquadric surfaces. (They were actually rendered with the command line and rendering tool that you will be using in Assignment 7.) By varying the longitudinal and latitudinal parameters,

e

, and

n

respectively, we can create the shapes shown below.

pict pict pict
Figure 1: $\begin{align} Cube & e = 0.0001, n = 0.0001 & (1) \\ Cylinder & e = 1, n = 0.0001 & (2) \\ Double Cone & e = 1, n = 2 & (3) \end{align}$

So far, we’ve only considered superquadrics centered at the origin, but in order to make scenes, we want to be able to position our objects arbitrarily.

Specifically, given some scalar inside-outside function (little)

f (\vec{x})

of point little

\vec{x}

, we want to know how to find the resulting scalar function (big)

F (\vec{X})

as a function of the point, big

\vec{X}

How can we make something that serves as an inside-outside function for the same surface under some transformation

O

that takes one point to another, i.e., which takes (little)

\vec{x}

to (big)

\vec{X}

where

T

R

, and

S

are some translation, rotation, and scaling operations respectively. Then, we have the inverse transformation

O^{- 1}

\vec{x} = O^{- 1} (\vec{X}) = S^{- 1} (R^{- 1} (T^{- 1} (\vec{X})))

We can then plug this into the relation between our original and transformed inside-outside functions to get an expression for (big)

F

F (\vec{X}) = f (\vec{x}) = f (O^{- 1} (\vec{X})) .

Thus, in order to calculate the inside-outside function of a transformed superquadric, you must first apply the inverse transformation to the position vector or point in question, and feed this into the original untransformed function.

Keep in mind that when using this technique to make sure that you are clear on what space you are in when you compute different things.

For example, if you had a point

\vec{x}

that you want to find the surface normal for on some inside-outside function

f (\vec{x})

under some transformation

O

, then you should transform the point

\vec{x}

to the point

\vec{X}

, compute the normal vector for

\vec{X}

using (little)

f

in body coordinates, and then transform the resulting normal back by applying a version of

O

for normals: don’t forget that transformations work differently for normal vectors than points, and you’ll also have to normalize these normal vectors, so they’d have magnitude 1.

For (forward) transforming normal vectors, you use the inverse transpose of the 3x3 matrix part of the transform. (See the Assignment 2 lecture notes if you need a refresher).

On the same note, if you have a directional vector

\vec{a}

such as the direction for the ray, (perhaps if you parameterized a point to be an origin plus a directional vector times

t

), in order to compute an inside-outside function

f

with transformation

O

properly, the origin point of the ray needs to be transformed as stated above with the inverse transformation matrix of

O

But the direction for the ray,

\vec{a}

should be transformed with the inverse transformation matrix without translations in order to preserve the correct resulting point that corresponds to adding the two. Alternatively, you could compute the actual point by adding the directional vector to the origin prior to doing any transformation and transform the point as stated above.

In order to understand what we mean by “a ray for each pixel,” I find that it helps to imagine putting a wire screen, as you might find on a window, some distance in front of your face. You’d still be able to tell what’s behind it, but your view would be divided up into a grid.

If you then chose a point in each square of the grid and colored the entire square to match that point, then you’d end up with a similar-looking image, relative to what you were actually seeing, without the screen.

If the squares were large, it might look highly “pixelated” like an old videogame, but as you increased their density you’d get a sharper image.

This is essentially how scenes are represented on computer screens: we can’t represent the light from the (by definition) infinite number of points, so we sample them at regular intervals and call it good enough. To ray trace an image, we think about the actual grid of pixels on the monitor as being the wire screen from before, and send a ray out from our camera through a point in each pixel in order to fill it in with a color.

Now that we know how to express the surface of a superquadric and compute its normal vectors, how do we follow light from the camera back to the surface of objects? For simple contours such as planes we can solve for the intersection between a vector and the surface explicitly, but for superquadrics we must use an iterative solver to produce an estimation. One such technique that produces accurate results but is still fairly straightforward is that of Newton’s method.

Many of you will recall from calculus that a Taylor series

T (g (t), t_{0})

is a method of approximating the function

g (t)

around the point

t = t_{0}

, given by

g (t) = g (t_{0}) + g^{'} (t_{0}) (t - t_{0}) + \dots

If we ignore the higher order terms as they are comparatively small in the neighborhood of

t_{0}

, then we can say that if

g (t) = 0

, we have

0 = g (t_{0}) + g^{'} (t_{0}) (t - t_{0}) \Rightarrow t = t_{0} - \frac{g (t_{0})}{g^{'} (t_{0})}

This tells us that if we’re at some point

t_{0}

and want to get to another point

t

where

t

is a root of

g (t)

, then we can approximate

t

using

t_{0}

g (t_{0})

, and

g^{'} (t_{0})

This can be made into an iterative process by replacing

t

with

t_{new}

and

t_{0}

with

t_{old}

, so that if we start from some initial value of

t_{old}

, we can compute our next approximation of

t

t_{new} = t_{old} - \frac{g (t_{old})}{g^{'} (t_{old})},

and then plug this again in as

t_{old}

to get another approximation of

t

, and so on and so forth until we hit some “stopping condition.” We label this value as

t_{final}

, and take it to be that

g (t_{final}) \approx 0

. As you may have guessed, we will be using the superquadric function (

S

) and its derivative for

g

and

g^{'}

, since finding the intersection of a vector and a superquadric’s surface is the same as finding where along the vector we will have

S (x, y, z, e, n) = 0

It is of course possible that

g^{'} (t_{0}) = 0

, which would cause divide-by-zero errors at runtime - you should check for this.

So what do we use as our initial value of

t_{old}

? A simple way to bound superquadrics is to use a sphere larger than the largest unit superquadric, which is a cube. The radius of the bounding sphere must therefore be

\sqrt{1^{2} + 1^{2} + 1^{2}} = \sqrt{3}

, which in turn gives us an inside-outside function for it:

bsphere (\vec{x}) = \vec{x} \cdot \vec{x} - 3

Now, we know we can express the equation of a 3D line parallel to the 3D vector

\vec{a}

that passes through the 3D point

\vec{b}

ray (t) = \vec{a} t + \vec{b}

Where

t

varies across

ℝ

. If we let the point

\vec{b}

be the location of our camera, then we can solve for the intersection of a ray exiting the camera and the bounding sphere of a superquadric by solving

$\begin{array}{l} \vec{a} t + \vec{b} = \vec{x} \Rightarrow (\vec{a} t + \vec{b}) & \cdot (\vec{a} t + \vec{b}) - 3 \\ = (\vec{a} \cdot \vec{a}) t^{2} + 2 (\vec{a} \cdot \vec{b}) t + (\vec{b} \cdot \vec{b}) - 3 \\ = 0 \end{array}$

This is just a quadratic equation in

t

with coefficients where

\begin{array}{l} a & = \vec{a} \cdot \vec{a} \\ b & = 2 (\vec{a} \cdot \vec{b}) and \\ c & = \vec{b} \cdot \vec{b} - 3 \end{array}

But before you do this, note that this ray,

\vec{a} t + \vec{b}

, needs to be first transformed into the body coordinate system, from world coordinates, before it can be plugged into the above bounding sphere equation! The above sphere equation is in its body coordinates at the origin.

but this has numerical problems when

4 a c

is small, relative to

b^{2}

, so we would be subtracting two values that are very close to each other, losing many bits of precision.

Numerically Improved Version of the Quadratic Equation Solutions. You can create a numerically improved quadratic equation solver by never canceling the square root term with the

- b

term, so these can always add.

When

b \geq 0

, and

4 a c

is small relative to

b^{2}

, a numerical problem takes place with the “plus” term in front of the square root term, i.e., for

t^{+}

, where

- b

and the square root are almost the same magnitude, canceling and losing many bits of precision.

This problem doesn’t take place with the “minus” sign in front of the square root term. So one of the above algebraic solutions is numerically “bad” and the other is “good.”

To derive a better expression, you can multiply numerator and denominator by

- b \mp \sqrt{b^{2} - 4 a c}

to cancel out many terms in the original numerator, initially where

4 a c

is smaller than

b^{2}

\begin{array}{l} t^{\pm} & = \frac{- b \pm \sqrt{b^{2} - 4 a c}}{2 a} \frac{- b \mp \sqrt{b^{2} - 4 a c}}{- b \mp \sqrt{b^{2} - 4 a c}} \\ = \frac{(- b \pm \sqrt{b^{2} - 4 a c}) (- b \mp \sqrt{b^{2} - 4 a c})}{2 a (- b \mp \sqrt{b^{2} - 4 a c})} \\ = \frac{b^{2} - | (b^{2} - 4 a c) |}{2 a (- b \mp \sqrt{b^{2} - 4 a c})} where 4 a c < b^{2} \\ = \frac{b^{2} - b^{2} + 4 a c}{2 a (- b \mp \sqrt{b^{2} - 4 a c})} \\ = \frac{4 a c}{2 a (- b \mp \sqrt{b^{2} - 4 a c})} \\ = \frac{2 c}{- b \mp \sqrt{b^{2} - 4 a c}} \end{array}

So when

b \geq 0

the two solutions to select are:

\begin{array}{l} t^{-} & = \frac{- b - \sqrt{b^{2} - 4 a c}}{2 a} \\ t^{+} & = \frac{2 c}{- b - \sqrt{b^{2} - 4 a c}} \end{array}

When

b \geq 0

you’ll never be canceling out a square root term that has the opposite sign from the

b

term in the solution, so you will not be losing as much accuracy with these versions of the solutions.

Now, to handle the case when

b < 0

you can multiply both sides of the quadratic equation by

sign (b)

which is

\pm 1

, so you can get a new quadratic equation that never has a negative

b

value, since

sign (b) b = | b |

Just don’t multiply both sides of the equation by zero! There is a similar

s i g n u m (b)

function, which is zero when

b = 0

. Don’t use that similar function!

The above equation is equivalent to our original equation, where we’ve multiplied both sides of the equation by a nonzero number,

sign (b)

, which is

\pm 1

There are two roots for the new quadratic equation, which we will call

t_{1}

and

t_{2}

. If we care which root is which, if

b \geq 0

then

t_{1} = t^{-}

and

t_{2} = t^{+}

. If

b < 0

then

t_{1} = t^{+}

and

t_{2} = t^{-}

. This can most easily be seen in the final expression for

t_{1}

which is the same as the conventional quadratic equation formula, but with an extra

sign (b)

factor on the square root, reversing its sign when

b < 0

\begin{array}{l} t_{1} & = \frac{sign (b) (- | b | - \sqrt{b^{2} - 4 a c})}{2 a} & = \frac{- b - sign (b) \sqrt{b^{2} - 4 a c}}{2 a} \\ t_{2} & = \frac{2 sign (b) c}{- | b | - \sqrt{b^{2} - 4 a c}} & = \frac{2 c}{- b - sign (b) \sqrt{b^{2} - 4 a c}} \end{array}

This is an “improved” version for the two numerical solutions of the quadratic equation. The

| b |

term never cancels with the square root, since both items have the same plus or minus sign.

And then the expressions were further simplified, for the final expressions to use for

t_{1}

and

t_{2}

Both for negative and non-negative values of

b

, these final expressions numerically work for values of

a

b

, and

c

for the original quadratic equation; you can algebraically test the two roots of the new final expressions of

t_{1}

and

t_{2}

to see that they are still valid, surprisingly, even when

4 a c

is or is not smaller than

b^{2}

And again note that the value of

sign (b)

must never be zero! It should be +1 or -1.

In computer graphics, the improved numerical values for

t

can remove jagged artifacts in ray tracing at the silhouette boundaries of shapes like spheres, when the quadratic equation needs to be solved. (Oops, our bounding spheres will usually not need the extra accuracy.)

The above cases were covered, because in computer graphics, it can be important to know how to solve the quadratic equation correctly!

Next, there are two more things to consider. The first is that our discriminant may be negative, leading to imaginary values of

t

. This corresponds with cases where the ray misses the bounding sphere entirely, so you should check for this and handle it accordingly.

The second is that we have two solutions for

t

t^{+}

and

t^{-}

. The physical interpretation of this fact is intuitive, as a ray may intersect a sphere in at most two locations. If both values of

t

are positive, then starting point

\vec{b}

is outside the sphere and the smaller value

t^{-}

is on the side of the sphere closer to the camera, and thus closer to the ray’s actual intersection with the superquadric. In this case, your initial guess should be

t^{-}

If one value is positive and the other negative, then

\vec{b}

is inside the sphere, and the ray may or may not hit the superquadric while traveling away from the eye. In this case, both

t^{-}

and

t^{+}

could be tested as initial guesses. If either produces a positive solution for

t

through the Newton’s method solver, then the trick is to know whether the ray intersects the superquadric on the way in or the way out. If one solution is positive and one is negative, then we know

\vec{b}

is inside the superquadric, and the outside surface can’t be seen. If both are positive, however, then we know the ray goes in one side and comes out the other. If both are negative, then the superquadric is entirely behind the camera, and can’t be seen.

As with the Newton’s method solutions in the previous case, we know that if both

t^{+}

and

t^{-}

are negative, then the bounding sphere is behind the camera and thus invisible, since the ray has to travel backwards to hit it.

Note that for the assignment, using

t^{-}

when it is positive should be sufficient for the scenes we provide.

Between floating point math and the inherent approximation made by our Newton’s method solver, we can’t simply iterate until we find a value of

t

such that

S (\vec{a} t + \vec{b}) = 0

. Instead, we can stop when the value of a superquadric’s inside-outside function is close enough to zero. A good approximation of this is when we are within one twentieth or so of a pixel’s width away from the actual surface, but the gist is that

| S (\vec{a} t + \vec{b}) |

must be sufficiently close to 0.

However, we can also take advantage of the fact that we know some information about

g^{'} (t) = \vec{a} \cdot \nabla S (\vec{a} t + \vec{b}) .

As long as we are approaching the superquadric along a ray, the inside-outside function should be positive and decreasing, and thus

g^{'} (t)

should be negative. If ever it becomes positive, then traveling further along the ray should take us farther away from the superquadric. Thus, if the sign of

g^{'} (t)

ever changes from

-

+

, we know we’ve missed the superquadric.

Additionally, on the surface of a superquadric we will have

g^{'} (t) = 0

, so if this is ever the case we should also check if

g (t) = 0

. If this is satisfied as well, then we know we’re on the superquadric’s surface already, and can just use our current value for

t_{old}

without computing

t_{new}

. On the other hand, if

g^{'} (t) = 0

and

g (t)

is nowhere near 0, the math is effectively the same as the sign change case mentioned above.

Above all else, remember that

\vec{b}

is the position of the camera,

\vec{a}

is the direction the camera is looking, so

ray (t) = \vec{a} t + \vec{b}

To “plug” it into the scalar inside outside function

S ()

, however, we need to transform the ray into the local coordinate system first. The matrix transforms of a straight ray are still a straight ray. So we need to use the body coordinate system version of the ray, using vector

\vec{a}

in body coordinates and the body coordinate system version of the point

\vec{b}

is the function to use in Newton’s method as it tests whether the ray is inside a superquadric, and

g^{'} (t) = \vec{a} \cdot \nabla S ({ray}_{body} (t))

Now that we know what to do, the question is how to do it: how do we express each ray in the form

\vec{a} t + \vec{b}

, so that we can use our Newton’s method solver? This is actually just a poorly-disguised exercise in linear algebra.

First, we define the screen to be the front plane of the camera frustum. In this class so far, you’ve likely only dealt with frustums given by their near, far, left, right, top, and bottom components, but our program uses another common representation based on the near and far components, along with the camera’s vertical field-of-view (FOV) and aspect ratio. The FOV is defined to be the angle between the vector from the camera to the horizontal midpoint of the top of the screen and the vector from the camera to the horizontal midpoint of the bottom of the screen, while the aspect ratio is simply the x-resolution divided by the y-resolution.

From this, we can easily calculate the height and width of the front plane of the frustum, assuming that the camera is centered within it. If its distance from the camera (i.e. the near component) is

n

, the FOV is

θ

, and the height of the plane is

h

, then from basic trigonometry we know that

\frac{h ∕ 2}{n} = \tan (\frac{θ}{2}) \Leftrightarrow h = 2 n \tan (\frac{θ}{2}),

where

θ

is the unique angle in an isosceles triangle of height

n

and base

h

. If the aspect ratio is then

a

, and the width of the plane is

w

, we have

\frac{w}{h} = a \Leftrightarrow w = a h .

Now we need to find the vector from the camera through each point on the grid, which can easily be done by finding the vector which would translate us from the camera’s position to that point.

We begin by defining three basis vectors,

\vec{e_{1}}

\vec{e_{2}}

, and

\vec{e_{3}}

\vec{e_{1}}

is simply the direction the camera is looking, which is

(0, 0, - 1)

before the camera rotation is applied. We then take

\vec{e_{2}}

to be the one pointing directly to the right relative to the camera, which is initially

(1, 0, 0)

, and

\vec{e_{3}}

to be the one pointing directly upwards relative to the camera, which is initially

(0, 1, 0)

Adding

n \cdot \vec{e_{1}}

to the camera’s position takes us to the exact center of the screen. If we then translated again by

\frac{w}{2} \cdot \vec{e_{2}}

we’d be at the midpoint of the right edge of the screen, and then another translation by

\frac{h}{2} \cdot \vec{e_{3}}

would take us to the top right corner. From this, it should make sense that adding or subtracting

\frac{w}{x} \cdot \vec{e_{2}}

takes us right or left by a pixel, while adding or subtracting

\frac{h}{y} \cdot \vec{e_{3}}

takes us up or down by a pixel.

This is really all we need. If we define

x_{i}

to be

(i - \frac{x}{2}) \frac{w}{x}

, with

i

being an integer in

[0, x)

, and

y_{j}

to be

(j - \frac{y}{2}) \frac{h}{y}

, with

j

being an integer in

[0, y)

, then pixel

(i, j)

can be expressed as

\vec{b} + n \vec{e_{1}} + x_{i} \vec{e_{2}} + y_{j} \vec{e_{3}}

, where

\vec{b}

is the position of the camera. You can verify that this produces sane results by plugging in

(i, j) = (0, 0)

, which should take us to the bottom left corner, or

(i, j) = (x, y)

, which should take us to the top right. Note that for the purposes of this derivation, I have defined the

y

-axis to be increasing along the upwards direction, which is the opposite of graphics convention.

Now we recall that our rays should be expressed in the form

\vec{a} t + \vec{b}

. It doesn’t really matter how long

\vec{a}

is, since whatever value of

t

we solve for will be correct for the given

\vec{a}

. Thus, we can simply take

\vec{a} = n \vec{e_{1}} + x_{i} \vec{e_{2}} + y_{j} \vec{e_{3}}

, with

t = 1

taking us to a point on the front plane of the frustum.

As you should have demonstrated in your intersection test from part a, we can find the normal at any point on the surface of a superquadric. Because the Phong model calculates lighting at any point on a surface given the normal, we can easily reuse it here to figure out the color at a ray-superquadric intersection.

However, ray tracing allows us to incorporate further complexities into the Phong model, the simplest of which is shadowing. Let us define an object to be illuminated by a point light as long as the path from the light to the object is unobstructed. At each point we apply the Phong model to, we can check each light to see if it’s obstructed by sending a ray out from the light to the intersection point (or vice-versa) and testing whether it hits any other superquadrics before hitting the one we initially intersected with. If the intersection point is

\vec{p}

and the position of the light is

\vec{l}

, then

\vec{a}

is given as

\vec{a} = \vec{p} - \vec{l}

, and when

t = 1

we’ve reached the intersection point. You must do this in your lighting calculations.

Other more advanced capabilities when using raytracing to render images include handling reflections and refractions, but will not be in the scope for this class. You could imagine that given we implemented all these different extensions, the rendered images become very realistic and life-like since we are effectively modelling light as we see it in our physical world.

Original written by Parker Won and Nailen Matschke (Class of 2017).
Modified and adapted by Lokbondo (Loko) Kung (Class of 2018). Also Al Barr, 2020.
Links: Home Assignments Contacts Policies Resources