Fast 3-D Animation: Meet X-Sharp
Journal: Dr. Dobb's Journal March 1992 v17 n3 p119(7)
-----------------------------------------------------------------------------
Title: Fast 3-D animation: meet X-Sharp. (Graphics Programming) (Column)
Author: Abrash, Michael.
AttFile: Program: GP-MAR92.ASC Source code listing.
Abstract: Three elements come into play to enable 12 three-dimensional
graphics cubes to rotate at an update rate of about 15 frames per
second on a 20-MHz 386-based machine with a slow video graphics
array. One of them is fixed-point arithmetic, which provides the
graphics with an immediate order-of-magnitude performance boost.
The second element is the use of the 386-based machine's 32-bit
multiply and divide instructions. The third performance
enhancement key is to maintain and operate on only the relevant
sections of the transformation coordinates and matrices. The
techniques are very powerful, but they can cause some problems. For
example, points outside of the 64K by 64K by 64K space cannot be
handled, and multiple matrix concatenations may turn out to be
imprecise.
-----------------------------------------------------------------------------
Descriptors..
Topic: Three-Dimensional Graphics
Rendering
Tutorial
Performance Improvement
Animation
Programming.
Feature: illustration
program.
-----------------------------------------------------------------------------
Full Text:
Across the lake from Vermont, a few miles into upstate New York, the Ausable
River has carved out a fairly impressive gorge known as "Ausable Chasm."
Impressive for the East, anyway; you might think of it as the poor man's
Grand Canyon. This summer, I did the tour with my wife and five-year-old,
and it was fun, although I confess that I didn't loosen my grip on my
daughter's hand until we were on the bus and headed for home; that gorge is
deep, and the railings tend to be of the single-bar, rusted-out variety.
New Yorkers can drive straight to this wonder of nature, but Vermonters must
take their cars across on the ferry; the alternative is driving three hours
around the south end of Lake Champlain. No problem; the ferry ride is an
hour well spent on a beautiful lake. Or, rather, no problem -- once you're
on the ferry. Getting to New York is easy, but, as we found out, the line of
cars waiting to come back from Ausable Chasm gets lengthy about
mid-afternoon. The ferry can hold only so many cars, and we wound up
spending an unexpected hour exploring the wonders of the ferry docks. Not a
big deal, with a good-natured kid and an entertaining mom; we got ice cream,
explored the beach, looked through binoculars, and told stories. It was a
fun break, actually, and before we knew it, the ferry was steaming back to
pick us up.
A friend of mine, an elementary-school teacher, helped take 65 sixth graders
to Ausable Chasm. Never mind the potential for trouble with 65 kids loose on
a ferry. Never mind what it was like trying to herd that group around a
gorge that looks like it was designed to swallow children and small animals
without a trace. The hard part was getting back to the docks and finding
they'd have to wait an hour for the next ferry. As my friend put it, "Let me
tell you, an hour is an eternity with 65 sixth graders screaming the song 'You
Are My Sunshine.'"
Apart from reminding you how lucky you are to be working in a quiet,
air-conditioned room, in front of a gently humming computer, free to think
deep thoughts and eat Cheetos to your heart's content, this story provides a
useful perspective on the malleable nature of time. An hour isn't just an
hour -- it can be forever, or it can be the wink of an eye. Just think of
the last hour you spent working under a deadline; I bet it went past in a
flash. Which is not to say, mind you, that I recommend working in a bus full
of screaming kids in order to make time pass more slowly; there are quality
issues here, as well.
In our 3-D animation work so far, we've used floating-point arithmetic.
Floating-point arithmetic, even with a floating-point processor but
especially without, is the microcomputer animation equivalent of working in a
school bus: It takes forever to do anything, and you just know you're never
going to accomplish as much as you want to. This month, it's time for
fixed-point arithmetic, which will give us an instant order-of-magnitude
performance boost. We'll also give our 3-D animation code a much more
powerful and extensible framework, making it easy to add new and different
sorts of objects. Taken together, these alterations will let us start to do
some really interesting animation. Unfortunately, they take a lot of code,
so I'll have to keep the text short. Therefore, without further ado, I give
you real real-time animation.
Fixed Point, Native 386, and More
As of last month, we were at the point where we could rotate, move, and draw
a solid cube in real time. This month's program, shown in Listings One
through Ten (pages 134 through 138), goes a bit further, rotating 12 solid
cubes at an update rate of about 15 frames per second (fps) on a 20-MHz 386
with a slow VGA. That's 12 transformation matrices, 72 polygons, and 96
vertices being handled in real time; not Star Wars, granted, but a giant step
beyond a single cube. Run the program if you get a chance; you may be
surprised at just how effective this level of animation is. I'd like to
point out, in case anyone missed it, that this is fully general 3-D. I'm not
using any shortcuts or tricks, like prestoring coordinates or pregenerating
bitmaps; if you were to feed in different rotations or vertices, the
animation would change accordingly.
The keys to this month's performance are three. The first key is fixed-point
arithmetic. In the last two months, we've worked with floating-point
coordinates and transformation matrices. Those values are now stored as
32-bit fixed-point numbers, in the form 16.16 (16 bits of whole number, 16
bits of fraction). 32-bit fixed-point numbers allow sufficient precision for
3-D animation, but can be manipulated with fast integer operations, rather
than slow floating-point processor operations or excruciatingly slow
floating-point emulator operations. Although the speed advantage of
fixed-point varies depending on the operation, the processor, and whether a
coprocessor is present, fixed-point multiplication can be as much as 100
times faster than the emulated floating-point equivalent. (I'd like to take
a moment to thank Chris Hecker for his invaluable input in this area.)
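To make the 16.16 format concrete, here is a minimal sketch in portable C. The type and function names (Fixed, int_to_fixed, and so on) are illustrative assumptions and are not taken from the listings. The point is that a 16.16 number is just a 32-bit integer scaled by 65,536, so addition and subtraction are ordinary integer operations.

```c
#include <stdint.h>

/* 16.16 fixed point: the upper 16 bits hold the signed whole part,
   the lower 16 bits hold the fraction. "Fixed" is an illustrative name. */
typedef int32_t Fixed;

#define FIXED_ONE 65536L   /* 1.0 in 16.16 */

/* Convert an integer to 16.16 by scaling; valid for -32768..32767. */
static Fixed int_to_fixed(int n)
{
    return (Fixed)(n * FIXED_ONE);
}

/* Convert a double to 16.16, rounding to the nearest representable value. */
static Fixed double_to_fixed(double d)
{
    return (Fixed)(d * FIXED_ONE + (d >= 0 ? 0.5 : -0.5));
}

/* Convert 16.16 back to a double, for printing or checking. */
static double fixed_to_double(Fixed f)
{
    return (double)f / FIXED_ONE;
}

/* Round a 16.16 value to the nearest integer, for example to get a
   screen coordinate. Assumes >> on a negative value is an arithmetic
   shift, which holds on mainstream compilers. */
static int fixed_round_to_int(Fixed f)
{
    return (int)((f + FIXED_ONE / 2) >> 16);
}
```

With these helpers, int_to_fixed(2) + double_to_fixed(0.5) yields the same bit pattern as double_to_fixed(2.5); no special add routine is needed. Only multiplication and division require extra care, because their intermediate results do not fit in 32 bits.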
The second performance key is the use of the 386's native 32-bit multiply and
divide instructions. Real-mode C compilers, such as Borland C++ and
Microsoft C, call library routines to perform multiplications and divisions
involving 32-bit values, and those library functions are fairly slow,
especially for division. On a 386, 32-bit multiplication and division can be
handled with the bit of code in Listing Nine -- and most of even that code is
only for rounding.
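Listing Nine is not reproduced here, so the following is only a portable C sketch of what a rounded 16.16 multiply and divide have to do; it is not the assembler from the listing. A 64-bit intermediate stands in for the EDX:EAX register pair that the 386's 32-bit IMUL and IDIV work with. The lowercase names fixed_mul and fixed_div are assumptions chosen to avoid implying they match FixedMul() and FixedDiv() exactly.

```c
#include <stdint.h>

typedef int32_t Fixed;   /* 16.16 fixed point (illustrative name) */

/* Multiply two 16.16 values. The 64-bit product is in 32.32 format,
   so add half of the lowest surviving bit and drop 16 fraction bits.
   Assumes >> on a negative value is an arithmetic shift. */
static Fixed fixed_mul(Fixed a, Fixed b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (Fixed)((product + 0x8000) >> 16);
}

/* Divide two 16.16 values. Pre-scale the dividend by 2^16 so the
   quotient comes out in 16.16, and bias by half the divisor so the
   result rounds to nearest. The caller must ensure b != 0 and that
   the quotient fits in 16.16. */
static Fixed fixed_div(Fixed a, Fixed b)
{
    int64_t dividend = (int64_t)a * 65536;
    int64_t half = b / 2;
    if ((dividend < 0) != (b < 0))
        half = -half;            /* bias toward the sign of the quotient */
    return (Fixed)((dividend + half) / b);
}
```

Apart from the rounding adjustments, each function is a single widening multiply or a single divide, which is why so little code is needed when the processor provides those operations directly.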
The third performance key is maintaining and operating on only the relevant
portions of transformation matrices and coordinates. The bottom row of every
transformation matrix we'll use (at least for the near future) is [0 0 0 1],
so why bother using or recalculating it when concatenating transforms and
transforming points? Likewise for the fourth element of a 3-D vector in
homogeneous coordinates, which is always 1. Basically, transformation
matrices are treated as consisting of a 3 X 3 rotation matrix and a 3 X 1
translation vector, and coordinates are treated as 3 X 1 vectors. This saves
a great many multiplications in the course of transforming each point.
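As a rough illustration of the savings, the sketch below stores only the top three rows of each transform and treats points as three-element vectors. The structure and function names (Xform, Point3, xform_point, xform_concat) are hypothetical, and the matrix layout is an assumption rather than the layout used in the listings. Transforming a point takes 9 multiplies instead of the 16 a full 4 X 4 product would need, and concatenating two transforms takes 36 instead of 64.

```c
#include <stdint.h>

typedef int32_t Fixed;                 /* 16.16 fixed point */
#define FX(n) ((Fixed)((n) * 65536L))  /* integer constant to 16.16 */

static Fixed fixed_mul(Fixed a, Fixed b)
{
    return (Fixed)(((int64_t)a * b + 0x8000) >> 16);
}

/* Rows 0..2 of a 4x4 transform: columns 0..2 are the rotation part,
   column 3 is the translation. The bottom row [0 0 0 1] is implied. */
typedef struct { Fixed m[3][4]; } Xform;
typedef struct { Fixed x, y, z; } Point3;   /* implied w == 1 */

/* Return xf * src. The translation is simply added, because the
   point's implied fourth element is 1. */
static Point3 xform_point(const Xform *xf, Point3 src)
{
    Point3 d;
    d.x = fixed_mul(xf->m[0][0], src.x) + fixed_mul(xf->m[0][1], src.y)
        + fixed_mul(xf->m[0][2], src.z) + xf->m[0][3];
    d.y = fixed_mul(xf->m[1][0], src.x) + fixed_mul(xf->m[1][1], src.y)
        + fixed_mul(xf->m[1][2], src.z) + xf->m[1][3];
    d.z = fixed_mul(xf->m[2][0], src.x) + fixed_mul(xf->m[2][1], src.y)
        + fixed_mul(xf->m[2][2], src.z) + xf->m[2][3];
    return d;
}

/* Return a * b, treating both as 4x4 matrices whose bottom row is
   [0 0 0 1]. Applying the result is the same as applying b, then a. */
static Xform xform_concat(const Xform *a, const Xform *b)
{
    Xform out;
    int r, c;
    for (r = 0; r < 3; r++) {
        for (c = 0; c < 4; c++) {
            out.m[r][c] = fixed_mul(a->m[r][0], b->m[0][c])
                        + fixed_mul(a->m[r][1], b->m[1][c])
                        + fixed_mul(a->m[r][2], b->m[2][c]);
        }
        out.m[r][3] += a->m[r][3];  /* a's translation times b's implied 1 */
    }
    return out;
}
```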
Just for fun, I reimplemented the animation of Listings One through Ten with
floating-point instructions. Together, the preceding optimizations improve
the performance of the entire animation -- including drawing time and
overhead, not just math -- by more than ten times over the code that uses the
floating-point emulator. Amazing what one can accomplish with a few dozen
lines of assembler and a switch in number format, isn't it? Note that no
assembly code other than the native 386 multiply and divide is used in
Listings One through Ten, although the polygon fill code is of course mostly
in assembler; we've achieved 12 cubes animated at 15 fps while doing the 3-D
work almost entirely in Borland C++ -- and we're still doing sine and cosine
via the floating-point emulator. Happily, we're still nowhere near the upper
limit on the animation potential of the PC.
Drawbacks
The techniques we've used to turbo-charge 3-D animation are very powerful,
but there's a dark side to them as well. Obviously, native 386 instructions
won't work on 8088 and 286 machines. That's rectifiable; equivalent
multiplication and division routines could be implemented for real mode (and
I may just do that one of these months, especially if enough of you give me a
hard time for taking the easy way out with 386 instructions), and performance
would still be reasonable. It sure is nice to be able to plug in a 32-bit
IMUL or DIV and be done with it, though. More importantly, 32-bit
fixed-point arithmetic has limitations in range and accuracy. Points outside
a 64K X 64K X 64K space can't be handled, imprecision tends to creep in over
the course of multiple matrix concatenations, and it's quite possible to
generate the dreaded divide by 0 interrupt if Z coordinates with absolute
values less than one are used. I don't have space to discuss these issues in
detail now, but here are some brief thoughts. The working 64K X 64K X 64K
fixed-point space can be paged into a larger virtual space. Imprecision of a
pixel or two rarely matters in terms of display quality, and deterioration of
concatenated rotations can be corrected by restoring orthogonality, for
example by periodically calculating one row of the matrix as the
cross-product of the other two (forcing it to be perpendicular to both). 3-D
clipping with a front clip plane of -1 or less can prevent divide overflow.
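The cross-product repair mentioned above might look something like the following sketch. It assumes the rotation is held as a 3 X 3 array of 16.16 values whose rows form a right-handed basis, so that the third row should equal the first row crossed with the second; the function name restore_row2 is hypothetical.

```c
#include <stdint.h>

typedef int32_t Fixed;   /* 16.16 fixed point */

static Fixed fixed_mul(Fixed a, Fixed b)
{
    return (Fixed)(((int64_t)a * b + 0x8000) >> 16);
}

/* Recompute row 2 of a rotation matrix as row 0 x row 1, which forces
   it to be perpendicular to both of the other rows. */
static void restore_row2(Fixed m[3][3])
{
    m[2][0] = fixed_mul(m[0][1], m[1][2]) - fixed_mul(m[0][2], m[1][1]);
    m[2][1] = fixed_mul(m[0][2], m[1][0]) - fixed_mul(m[0][0], m[1][2]);
    m[2][2] = fixed_mul(m[0][0], m[1][1]) - fixed_mul(m[0][1], m[1][0]);
}
```

This repairs only the third row. Rows 0 and 1 can still drift relative to each other and in length, so a fuller fix would presumably also renormalize the rows from time to time.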
A New Animation Framework
Listings One through Ten represent not merely faster animation, but also a
mostly complete, extensible, data-driven animation framework. Where earlier
animation code was hardwired to demonstrate certain concepts, this month's
code is intended to serve as the basis for a solid animation package.
Objects are stored, in their entirety, in customizable structures; new
structures can be devised for new sorts of objects. Drawing, preparing for
drawing, and moving are all vectored functions, so that variations such as
shading or texturing, or even radically different sorts of graphics objects,
such as scaled bitmaps, could be supported. The cube initialization is
entirely data driven; more or different cubes, or other sorts of convex
polyhedrons, could be added by simply changing the initialization data in
Listing Eight.
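One way to picture "vectored functions" is a structure that carries its own function pointers, as in the sketch below. Everything here is hypothetical: the Object layout, the member names, and the cube_ functions are stand-ins chosen for illustration, not the structures from the listings.

```c
#include <stddef.h>

/* Hypothetical object header: each object carries pointers to its own
   move, prepare-for-drawing, and draw routines, followed by its data. */
typedef struct Object Object;
struct Object {
    void (*move)(Object *self);     /* update position or rotation */
    void (*prepare)(Object *self);  /* transform and project for drawing */
    void (*draw)(Object *self);     /* render the object */
    int z;                          /* stand-in for real object state */
    int draw_count;                 /* counts draws, for demonstration */
};

static void cube_move(Object *o)    { o->z -= 1; }
static void cube_prepare(Object *o) { (void)o; /* nothing in this sketch */ }
static void cube_draw(Object *o)    { o->draw_count++; }

/* Initialization data: a new kind of object would supply different
   function pointers here without any change to the main loop. */
static const Object cube_template = {
    cube_move, cube_prepare, cube_draw, -100, 0
};

/* One animation frame: the loop never needs to know what kind of
   object it is handling. */
static void animate_frame(Object *objects, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        objects[i].move(&objects[i]);
        objects[i].prepare(&objects[i]);
        objects[i].draw(&objects[i]);
    }
}
```

A shaded cube or a scaled bitmap would plug in by pointing draw at a different routine, which is the kind of variation the vectoring is meant to allow.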
The animation framework is not yet complete. Movement is supported only
along the Z axis, and then in a non-general fashion. More interesting
movement isn't supported at this point because of the one gaping hole in the
package: hidden-surface removal. Until this is implemented -- and it will
be, soon -- nothing can safely overlap. It would actually be easy enough to
perform hidden-surface removal by keeping the cubes in different Z bands and
drawing them back to front, but this gets into sorting and list issues, and
is not a complete solution -- and I've crammed as much as will fit into this
month's code, anyway.
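For a sense of the sorting involved, here is a minimal sketch of ordering objects back to front with the standard qsort(). The DrawEntry structure is hypothetical. The sketch also assumes the viewer looks down the negative Z axis, as the front clip plane of -1 mentioned earlier suggests, so the most negative Z is the farthest away and must be drawn first. As noted above, this is not a complete solution, since it says nothing about objects whose Z ranges overlap.

```c
#include <stdlib.h>
#include <stdint.h>

typedef int32_t Fixed;   /* 16.16 fixed point */

/* Hypothetical per-object record used only to decide drawing order. */
typedef struct {
    int id;            /* which object this entry refers to */
    Fixed center_z;    /* view-space Z of the object's center */
} DrawEntry;

/* Ascending Z: with the viewer looking down negative Z (an assumption),
   the farthest object sorts first. */
static int compare_back_to_front(const void *a, const void *b)
{
    Fixed za = ((const DrawEntry *)a)->center_z;
    Fixed zb = ((const DrawEntry *)b)->center_z;
    return (za > zb) - (za < zb);
}

/* Sort the list so that drawing in array order paints far objects
   first and near objects over them. */
static void sort_back_to_front(DrawEntry *list, size_t count)
{
    qsort(list, count, sizeof list[0], compare_back_to_front);
}
```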
Where the Time Goes
The distribution of execution time in the animation code is no longer wildly
biased toward transformation, but sine and cosine are certainly still sucking
up cycles. Likewise, the overhead in the calls to FixedMul() and FixedDiv()
is costly. Much of this is correctable with a little carefully crafted
assembly language and a lookup table; expect that soon. When all that is
firmly in place, we'll take a look at the number of pixels being drawn versus
the bandwidth of display memory; that'll give us an idea of how close we are
to the theoretical limit of VGA animation. Probably not too close, even with
those optimizations; a faster 2-D clipping approach and still faster
polygon-fill code will most likely be in order. (Yes, Virginia, there is an
even faster way to fill polygons!)
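As a sketch of how a lookup table can replace emulated sine and cosine calls, the code below fills a table of 16.16 sine values once and then answers every later request with an array index. The one-degree resolution, the table layout, and all of the names are assumptions for illustration only. The table is filled with a rotation recurrence so that the sketch needs only the standard integer header; calling sin() once per entry at startup would do equally well.

```c
#include <stdint.h>

typedef int32_t Fixed;       /* 16.16 fixed point */

#define TABLE_SIZE 360       /* one entry per whole degree (assumed) */
static Fixed sin_table[TABLE_SIZE];

/* Fill the table once at startup. Each entry follows from the previous
   one by rotating the (cos, sin) pair through one degree. */
static void init_sin_table(void)
{
    const double s1 = 0.017452406437283512;  /* sin(1 degree) */
    const double c1 = 0.99984769515639124;   /* cos(1 degree) */
    double s = 0.0, c = 1.0;
    int i;
    for (i = 0; i < TABLE_SIZE; i++) {
        double next_s = s * c1 + c * s1;
        double next_c = c * c1 - s * s1;
        sin_table[i] = (Fixed)(s * 65536.0 + (s >= 0 ? 0.5 : -0.5));
        s = next_s;
        c = next_c;
    }
}

/* Reduce any integer angle in degrees to the range 0..359. */
static int wrap_degrees(int degrees)
{
    int d = degrees % 360;
    return d < 0 ? d + 360 : d;
}

/* Sine and cosine become a wrap and an array read. */
static Fixed fixed_sin(int degrees)
{
    return sin_table[wrap_degrees(degrees)];
}

static Fixed fixed_cos(int degrees)
{
    return sin_table[wrap_degrees(degrees + 90)];  /* cos(a) = sin(a + 90) */
}
```

init_sin_table() must be called once before the first lookup. After that, no floating-point work is done per frame.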
Regardless, this month we have made the critical jump to a usable level of
performance and a serviceable framework. From here on out, it's the fun
stuff.
X-Sharp
Three-dimensional animation is a complicated business, and it takes an
astonishing amount of functionality just to get off the launching pad: page
flipping, polygon filling, clipping, transformations, list management, and so
forth. There's no way all of that could fit in a single column, and in fact
I've been building toward a critical mass of animation functionality since my
very first column in DDJ. This month builds on code from no fewer than five
previous columns. The code that's required in order to link this month's
animation program is: Listing One from January (draw clipped line list);
Listings One and Six from July 1991 (mode X mode set, rectangle fill);
Listing Six from September 1991; Listing Four in March 1991 (polygon edge
scan); and the FillConvexPolygon() function from Listing One from February
1991. The struct keywords in FillConvexPolygon() must be removed to reflect
the switch to typedefs in the animation header file.
This is the last time I'm going to list all the code needed to build the
animation package, and I am not going to print every change to every module
from now on. The code is simply getting too large to show every bit of it,
and the scope is only going to grow as I add functions such as shading, 3-D
clipping, and hidden surfaces; also, I'm going to be tinkering with stuff
such as converting key code to assembler and modifying functions slightly as
structures change, and I hate to take up valuable space in DDJ for what's
basically fine-tuning and housekeeping.
I think we've reached the point where we can call this an ongoing project and
give it a name. In the spirit of Al Stevens's wonderful D-Flat, I hereby dub
the animation package X-Sharp. (X for mode X, sharp because who wants a flat
animation package?) From now on, I will make the full source for X-Sharp
available, complete with make files, online, and otherwise. It will be
available in the file XSHARPn.ARC in the DDJ Forum on CompuServe and on M&T
Online. Alternatively, you can send me a 360K or 720K formatted diskette and
an addressed, stamped diskette mailer, care of DDJ (411 Borel Ave., San
Mateo, CA 94403-3522), and I'll send you the latest copy of X-Sharp. There's
no charge, but, in the spirit of Al Stevens's "careware," it'd be very much
appreciated if you'd slip in a dollar or so for the folks at the Vermont
Association for the Blind and Visually Impaired, who help the visually
impaired build productive, self-sufficient lives, and are amazingly
successful at it. Imagine for a moment trying to do your work if you lost
your sight (and it can be done; I got a request for the text of my book Zen
of Assembly Language in computer-readable form from a blind programmer the
other day)--heck, imagine trying to cross the street--and I suspect you'll
understand why it matters so much. As Al says, it's purely voluntary, but
both you and I will feel good.
I'm available on a daily basis to discuss X-Sharp on M&T Online and Bix (user
name MABRASH in both cases).