More 3-D Animation
Journal: Dr. Dobb's Journal Feb 1992 v17 n2 p125(8)
-----------------------------------------------------------------------------
Title: More 3-D animation. (Graphics Programming)(column) (Tutorial)
Author: Abrash, Michael.
AttFile: Program: 3D.EXE Self extracting archive.
Program: GP-FEP92.ASC Source code listing.
Abstract: Computer animation, especially three-dimensional animation, is as
much the art of illusion as a mathematical procedure. The
programmer must compensate for such imperfections as 'jaggies' and
flicker which the viewer would notice, but others are 'deleted' by
the human brain which makes certain assumptions when viewing a
scene. One-sided polygons, drawn as if they were not visible to
the viewer in two-dimensional space, are discussed. Removing
polygons that are not facing the viewer would solve all 'hidden
surface' problems for a single convex polyhedron, but overlapping
polyhedrons require other methods such as determining the
cross-product of two vectors. Sample programs that perform
backface removal, incremental transformation and other animation
algorithms are presented.
-----------------------------------------------------------------------------
Descriptors..
Topic: Animation
Tutorial
Programming Instruction
Three-Dimensional Graphics
Programming
Graphics Systems.
Feature: illustration
program.
-----------------------------------------------------------------------------
Full Text:
As I'm fond of pointing out, computer animation isn't a matter of
mathematically exact modeling or raw technical prowess, but rather of fooling
the eye and the mind. That's especially true for 3-D animation, where we're
not only trying to convince viewers that they're seeing objects on a screen
-- when in truth that screen contains no objects at all, only gaggles of
pixels--but we're also trying to create the illusion that the objects exist
in three-space, possessing four dimensions (counting movement over time) of
their own. To make this magic happen, we must provide cues for the eyes not
only to pick out boundaries, but also to detect depth, orientation, and
motion. This involves perspective, shading, proper handling of hidden
surfaces, and rapid and smooth screen updates; the whole deal is considerably
more difficult to pull off on a PC than 2-D animation.
And yet, in some senses, 3-D animation is easier than 2-D. Because there's
more going on in 3-D animation, the eye and brain tend to make more
assumptions, and so are more apt to see what they expect to see, rather than
what's actually there. If you're piloting a (virtual) ship through a field
of thousands of asteroids at high speed, you're unlikely to notice if the
more distant asteroids occasionally seem to go right through each other, or
if the topographic detail on the asteroids' surfaces sometimes shifts about a
bit. You'll be busy viewing the asteroids in their primary role, as objects
to be navigated around, and the mere presence of topographic detail will
suffice; without being aware of it, you'll fill in the blanks. Your mind
will see the topography peripherally, recognize it for what it is supposed to
be, and, unless that landscape does something really obtrusive such as
vanishing altogether or suddenly shooting a spike miles into space, you will
see what you expect to see: a bunch of nicely detailed asteroids tumbling
around you.
To what extent can you rely on the eye and mind to make up for imperfections
in the 3-D animation process? In some areas, hardly at all; for example,
jaggies crawling along edges stick out like red flags, and likewise for
flicker. In other areas, though, the human perceptual system is more
forgiving than you'd think. Consider this. At the end of Return of the
Jedi, in the battle to end all battles around the Death Star, there is a
sequence of about five seconds in which several spaceships are visible in the
background. One of those spaceships (and it's not very far in the
background, either) looks a bit unusual. What it looks like is a sneaker.
In fact, it is a sneaker -- but unless you know to look for it, you'll never
notice it, because your mind is busy making simplifying assumptions about the
complex scene it's seeing--and one of those assumptions is that medium-sized
objects floating in space are spaceships, unless proven otherwise. (Thanks
to Chris Hecker for pointing this out. I'd never have noticed the sneaker,
myself, without being tipped off--which is, of course, the whole point.)
If it's good enough for George Lucas, it's good enough for us. And with
that, let's resume our quest for real-time 3-D animation on the PC.
One-sided Polygons: Backface Removal
Last month, we implemented the basic polygon drawing pipeline, transforming a
polygon all the way from its basic definition in object space, through the
shared 3-D world space, and into the 3-D space as seen from the viewpoint,
called "view space." From view space, we performed a perspective projection
to convert the polygon into screen space, then mapped the transformed and
projected vertices to the nearest screen coordinates and filled the polygon.
Armed with code that implemented this pipeline, we were able to watch as a
polygon rotated about its Y axis, and were able to move the polygon around in
space freely.
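As a refresher, the projection end of that pipeline boils down to a perspective divide by Z followed by a mapping to the nearest integer screen coordinate. The sketch below is only illustrative: the names, the projection ratio, and the 320x240 resolution are assumptions, not last month's actual listings.

#include <math.h>

#define PROJECTION_RATIO 2.0   /* assumed ratio of viewing distance to screen size */
#define SCREEN_WIDTH     320
#define SCREEN_HEIGHT    240

typedef struct { double X, Y, Z; } Point3;
typedef struct { int X, Y; } Point2;

/* Perspective-project a view-space vertex into screen space, then map it to
   the nearest screen coordinate, with (0,0) at the center of the screen. */
void ProjectVertex(Point3 *View, Point2 *Screen)
{
    double ScreenX = View->X * PROJECTION_RATIO / View->Z;
    double ScreenY = View->Y * PROJECTION_RATIO / View->Z;

    Screen->X = (int)floor(ScreenX * (SCREEN_WIDTH / 2)  + 0.5) + SCREEN_WIDTH / 2;
    Screen->Y = (int)floor(ScreenY * (SCREEN_HEIGHT / 2) + 0.5) + SCREEN_HEIGHT / 2;
}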
One of the drawbacks of last month's approach was that the polygon had two
visible sides. Why is that a drawback? Well, it isn't, necessarily, but in
our case we want to use polygons to build solid objects with continuous
surfaces, and in that context, only one side of a polygon is ever visible; the
other side always faces the inside of the object, and can never be seen. It
would save time and simplify the process of hidden surface removal if we
could quickly and easily determine whether the inside or outside face of each
polygon was facing us, so that we could draw each polygon only if it were
visible (that is, had the outside face pointing toward the viewer). On
average, half the polygons in an object could be instantly rejected by a test
of this sort. Such testing of polygon visibility goes by a number of names
in the literature, including backplane culling, backface removal, and
assorted variations thereon; I'll refer to it as backface removal.
For a single convex polyhedron, removal of polygons that aren't facing the
viewer would solve all hidden surface problems. In a convex polyhedron, any
polygon facing the viewer can never be obscured by any other polygon in that
polyhedron; this falls out of the definition of a convex polyhedron.
Likewise, any polygon facing away from the viewer can never be visible.
Therefore, in order to draw a convex polyhedron, if you draw all polygons
facing toward the viewer but none facing away from the viewer, everything
will work out properly, with no additional checking for overlap and hidden
surfaces needed.
Unfortunately, backface removal completely solves the hidden surface problem
only for convex polyhedrons, and only if there's a single convex polyhedron
involved; when convex polyhedrons overlap, other methods must be used.
Nonetheless, backface removal does instantly halve the number of polygons to
be handled in rendering any particular scene. Backface removal can also
speed hidden-surface handling if objects are built out of convex polyhedrons,
as we'll see in a future column. This month, though, we have only one convex
polyhedron to deal with, so backface removal alone will do the trick.
Given that I've convinced you that backface removal would be a handy thing to
have, how do we actually do it? A logical approach, often implemented in the
PC literature, would be to calculate the plane equation for the plane in
which the polygon lies, and see which way the normal (perpendicular) vector
to the plane points. That works, but there's a more efficient way to
calculate the normal to the polygon: as the cross-product of two of the
polygon's edges.
The cross-product of two vectors is defined as the vector shown in Figure 1.
One interesting property of the cross-product vector is that it is
perpendicular to the plane in which the two original vectors lie. If we take
the cross-product of the vectors that form two edges of a polygon, the result
will be a vector perpendicular to the polygon; then, we'll know that the
polygon is visible if and only if the cross-product vector points toward the
viewer. We need one more thing to make the cross-product approach work,
though. The cross-product can actually point either way, depending on which
edges of the polygon we choose to work with and the order in which we
evaluate them, so we must establish some conventions for defining polygons
and evaluating the cross-product. We'll define only convex polygons, with
the vertices defined in clockwise order, as viewed from the outside; that is,
if you're looking at the visible side of the polygon, the vertices will
appear in the polygon definition in clockwise order. With those assumptions,
the cross-product becomes a quick and easy indicator of polygon orientation
with respect to the viewer; we'll calculate it as the cross-product of the
first and last vectors in a polygon, as shown in Figure 2, and if it's
pointing toward the viewer, we'll know that the polygon is visible.
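Since Figure 1 isn't reproduced here, a minimal sketch of the calculation it defines may help; the Point3 type is just an illustrative stand-in, not a type from the listings.

/* Cross-product of vectors U and V: a vector perpendicular to both. */
typedef struct { double X, Y, Z; } Point3;

Point3 CrossProduct(Point3 U, Point3 V)
{
    Point3 Result;

    Result.X = U.Y * V.Z - U.Z * V.Y;
    Result.Y = U.Z * V.X - U.X * V.Z;
    Result.Z = U.X * V.Y - U.Y * V.X;
    return Result;
}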
Actually, we don't even have to calculate the entire cross-product vector,
because the Z component alone suffices to tell us which way the polygon is
facing: positive Z means visible, negative Z means not. The Z component can
be calculated very efficiently, with only two multiplies and a subtraction.
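Here's roughly what that test boils down to, using the first and last edges of the polygon as shown in Figure 2. This is a sketch only: the vertex layout and the sign of the comparison are assumptions that depend on the axis orientation and vertex-ordering conventions of the actual listings.

/* Returns nonzero if the polygon faces the viewer. Verts holds the polygon's
   transformed vertices in the defined order; only the Z component of the
   cross-product of the first and last edges is needed: two multiplies and a
   subtraction. */
typedef struct { double X, Y; } ProjectedVertex;

int PolygonIsVisible(ProjectedVertex *Verts, int NumVerts)
{
    double Edge1X = Verts[1].X - Verts[0].X;
    double Edge1Y = Verts[1].Y - Verts[0].Y;
    double Edge2X = Verts[NumVerts - 1].X - Verts[0].X;
    double Edge2Y = Verts[NumVerts - 1].Y - Verts[0].Y;

    /* Z component of Edge1 x Edge2; its sign tells which way the polygon
       faces (the sign flips if the Y axis or vertex ordering is reversed) */
    return (Edge1X * Edge2Y - Edge1Y * Edge2X) > 0;
}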
The question remains of the proper space in which to perform backface
removal. There's a temptation to perform it in view space, which is, after
all, the space defined with respect to the viewer, but view space is not a
good choice. Screen space--the space in which perspective projection has
been performed--is the best choice. The purpose of backface removal is to
determine whether each polygon is visible to the viewer, and, despite its
name, view space does not provide that information; unlike screen space, it
does not reflect perspective effects.
Backface removal may also be performed using the polygon vertices in screen
coordinates, which are integers. This is less accurate than using the screen
space coordinates, which are floating point, but is, by the same token,
faster. In Listing Three, backface removal is performed in screen
coordinates in the interest of speed.
Backface removal, as implemented in Listing Three, will not work reliably if
the polygon is not convex, if the vertices don't appear in counterclockwise
order, if either the first or last edge in a polygon has zero length, or if
the first and last edges are congruent. These latter two points are the
reason it's preferable to work in screen space rather than screen coordinates
(which suffer from rounding problems), speed considerations aside.
Backface Removal in Action
Listings One through Five together form a program that rotates a solid cube
in real time under user control. Listing One (page 147) is the main program;
Listing Two (page 147) performs transformation and projection; Listing Three
(page 147) performs backface removal and draws visible faces; Listing Four
(page 148) concatenates incremental rotations to the object-to-world
transformation matrix; Listing Five (page 150) is the general header file.
Also required from previous columns are: Listings One and Two from last month
(draw clipped line list, matrix math functions); Listings One and Six from
July 1991 (mode X mode set, rectangle fill); Listing Six from September
1991; Listing Four from March 1991 (polygon edge scan); and the
FillConvexPolygon() function from Listing One from February 1991. All
necessary modules, along with a make file, will be available as part of the
code from this issue.
The sample program places a cube, floating in three-space, under the complete
control of the user. The arrow keys may be used to move the cube left,
right, up, and down, and the A and T keys may be used to move the cube away
from or toward the viewer. The F1 and F2 keys perform rotation around the Z
axis, the axis running from the viewer straight into the screen. The 4 and 6
keys perform rotation around the Y (vertical) axis, and the 2 and 8 keys
perform rotation around the X axis, which runs horizontally across the
screen; the latter four keys are most conveniently used by flipping the
keypad to the numeric state.
The demo involves six polygons, one for each side of the cube. Each of the
polygons must be transformed and projected, so it would seem that 24 vertices
(four for each polygon) must be handled, but some steps have been taken to
improve performance. All vertices for the object have been stored in a
single list; the definition of each face contains not the vertices for that
face themselves, but rather indexes into the object's vertex list, as shown
in Figure 3. This reduces the number of vertices to be manipulated from 24
to 8, for there are, after all, only eight vertices in a cube, with three
faces sharing each vertex. In this way, the transformation burden is
lightened by two-thirds. Also, as mentioned earlier, backface removal is
performed with integers, in screen coordinates, rather than with
floating-point values in screen space. Finally, the RecalcXForm flag is set
whenever the user changes the object-to-world transformation. Only when this
flag is set is the full object-to-view transformation recalculated and the
object's vertices transformed and projected again; otherwise, the values
already stored within the object are reused. In the sample application, this
brings no visible improvement, because there's only the one object, but the
underlying mechanism is sound: In a full-blown 3-D animation application,
with multiple objects moving about the screen, it would help a great deal to
flag which of the objects had moved with respect to the viewer, performing a
new transformation and projection only for those that had.
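In outline, that flag-driven retransformation might look something like the following. Aside from the RecalcXForm flag mentioned above, the types and function names here are assumptions standing in for the actual listings, not the column's code.

/* Redraw an object, retransforming and reprojecting its vertices only if
   its object-to-world transformation has changed since the last frame. */
typedef double Xform[4][4];

typedef struct Object {
    Xform XformToWorld;   /* updated whenever the user moves or rotates it */
    int   RecalcXForm;    /* set whenever XformToWorld changes             */
    /* vertex and face lists omitted in this sketch */
} Object;

extern Xform WorldViewXform;                 /* world-to-view transformation */
void ConcatXforms(Xform Source1, Xform Source2, Xform Dest);
void XformAndProjectObject(Object *Obj, Xform XformToView);
void DrawVisibleFaces(Object *Obj);          /* backface removal + fill      */

void DrawObject(Object *Obj)
{
    Xform XformToView;

    if (Obj->RecalcXForm) {
        ConcatXforms(WorldViewXform, Obj->XformToWorld, XformToView);
        XformAndProjectObject(Obj, XformToView);
        Obj->RecalcXForm = 0;
    }
    DrawVisibleFaces(Obj);
}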
With the above optimizations, the sample program is certainly adequately
responsive on a 20-MHz 386 (sans 387; I'm sure it's wonderfully responsive
with a 387). Still, it couldn't quite keep up with the keyboard when I
modified it to read only one key each time through the loop--and we're
talking about only eight vertices here. This indicates that we're already
near the limit of animation complexity possible with our current approach.
It's time to start rethinking that approach; over two-thirds of the overall
time is spent in floating-point calculations, and it's there that we'll begin
to attack the performance bottleneck we find ourselves up against.
Incremental Transformation
Listing Four contains three functions; each concatenates an additional
rotation around one of the three axes to an existing rotation. In order to
improve performance, only the matrix entries that are affected in a rotation
around each particular axis are recalculated (all but four of the entries in
a single-axis rotation matrix are either 0 or 1, as shown last month). This
cuts the number of floating-point multiplies from the 64 required for the
multiplication of two 4x4 matrices to just 12, and floating-point adds from
48 to 6.
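For example, concatenating an incremental rotation about the Y axis only has to touch the six affected entries of the rotation block of the matrix, along these lines. The matrix layout and multiplication order here are assumptions; Listing Four may differ in detail.

#include <math.h>

/* Concatenate a rotation of Angle radians about the Y axis onto an existing
   4x4 transformation matrix. Only rows 0 and 2 of the 3x3 rotation block
   change, so this takes 12 multiplies and 6 adds rather than the 64 and 48
   needed for a general 4x4 matrix multiplication. */
void AppendRotationY(double XformToChange[4][4], double Angle)
{
    double CosTemp = cos(Angle), SinTemp = sin(Angle);
    int j;

    for (j = 0; j < 3; j++) {
        double Temp0 =  CosTemp * XformToChange[0][j] + SinTemp * XformToChange[2][j];
        double Temp2 = -SinTemp * XformToChange[0][j] + CosTemp * XformToChange[2][j];
        XformToChange[0][j] = Temp0;
        XformToChange[2][j] = Temp2;
    }
}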
Be aware that Listing Four performs an incremental rotation on top of
whatever rotation is already in the matrix. The cube may already have been
turned left, right, up, down, and sideways; regardless, Listing Four just
tacks the specified rotation onto whatever already exists. In this way, the
object-to-world transformation matrix contains a history of all the rotations
ever specified by the user, concatenated one after another onto the original
matrix. Potential loss of precisionis a problem associated with using such
matrix. Potential loss of precision is a problem associated with using such
an approach to represent a very long concatenation of transformations,
especially with fixed-point arithmetic; that's not a problem for us yet, but
we'll run into it eventually.
A Note on Rounding Negative Numbers
Last month, I added 0.5 and truncated in order to round values from
floating-point to integer format. In Listing Two this month, I've switched
to adding 0.5 and using the floor() function. For positive values, the two
approaches are equivalent; for negative values, only the floor() approach
works properly.
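A quick example (the value is mine, not one from the listings): rounding -1.3 should give -1. Adding 0.5 yields -0.8; truncation turns that into 0, while floor() correctly yields -1.

#include <math.h>
#include <stdio.h>

int main(void)
{
    double Value = -1.3;   /* should round to -1 */

    printf("truncation: %d\n", (int)(Value + 0.5));        /* prints  0 (wrong)   */
    printf("floor():    %d\n", (int)floor(Value + 0.5));   /* prints -1 (correct) */
    return 0;
}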
Object Representation
Each object consists of a list of vertices and a list of faces, with the
vertices of each face defined by pointers into the vertex list; this allows
each vertex to be transformed exactly once, even though several faces may
share a single vertex. Each object contains the vertices not only in their
original, untransformed state, but in three other forms as well: transformed
to view space, transformed and projected to screen space, and converted to
screen coordinates. Earlier, we saw that it can be convenient to store the
screen coordinates within the object, so that if the object hasn't moved with
respect to the viewer, it can be redrawn without the need for recalculation,
but why bother storing the view and screen space forms of the vertices as
well?
The screen space vertices are useful for some sorts of hidden surface
removal. For example, in order to determine whether two polygons overlap as
seen by the viewer, you must first know how they look to the viewer,
accounting for perspective; screen space provides that information. (So do
the final screen coordinates, but with less accuracy, and without any Z
information.) The view space vertices are useful for collision and proximity
detection; screen space can't be used here, because objects are distorted by
the perspective projection into screen space. World space would serve as
well as view space for collision detection, but because it's possible to
transform directly from object space to view space with a single matrix, it's
often preferable to skip over world space altogether. It's not mandatory that
vertices be stored for all these different spaces, but the coordinates in all
those spaces have to be calculated as intermediate steps anyway, so we might
as well keep them around for those occasions when they're needed.
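Pulling the above together, the object might be laid out roughly as follows. The field names and the four-vertex face limit are illustrative assumptions, not the actual definitions in Listing Five.

typedef struct { double X, Y, Z; } Point3;
typedef struct { int X, Y; } Point2;

typedef struct {
    int NumVerts;
    int VertIndexes[4];    /* indexes into the object's vertex lists,
                              clockwise as viewed from the outside    */
    int Color;
} Face;

typedef struct {
    int     NumVerts, NumFaces;
    Point3 *ObjectVerts;   /* original, untransformed vertices           */
    Point3 *ViewVerts;     /* transformed into view space                */
    Point3 *ScreenVerts;   /* perspective-projected into screen space    */
    Point2 *ScreenCoords;  /* rounded to integer screen coordinates      */
    Face   *Faces;         /* faces defined by indexes into vertex lists */
    int     RecalcXForm;   /* set when the object-to-world xform changes */
} Object;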
Coming up: shading, hidden surfaces, and performance.