How do 3D graphics engines work?


As soon as a discussion about graphics cards starts, it quickly turns to terms that have less to do with the graphics cards themselves than with what they process: polygons, vectors, matrices, rasterizers, graphics engines, and so on. A graphics card is always just a means to an end, in this case accelerating the graphics engine of a game. But how do today's graphics engines actually work? This article is intended to provide a brief insight ...

Introduction - or: a bit of history

At the latest since the first computer games saw the light of day, graphics have had to be drawn on the screen. If you follow the definition of the term "graphics engine" given by Wikipedia, a graphics engine is "a sometimes independent part of a computer program or computer hardware that is responsible for its display of computer graphics [...]". Seen in this way, even games like Pong, Space Invaders or Pac-Man had graphics engines.

These and almost all other games that were available on home computers like the C64 or on consoles like the NES used 2D graphics. Practically everything was drawn "by hand": a cloud here, a tree there, a stick figure over there. The graphics required for this were generated by code or came from bitmaps. Sprites were also used. Such sprite engines were continuously developed, improved, and refined. They arguably culminated in games like those of the Turrican series.

On later systems like the Amiga 500 or the SNES, however, 3D graphics were gradually supposed to replace the traditional sprite-based 2D graphics. Sprite technology was powerful, but completely unsuitable for 3D. The SNES in particular made the dilemma clear: while its hardware was still heavily geared toward sprite display, successes like the legendary Mario Kart showed how well 3D was received by players.

However, the hardware of the time was far too slow for a "real" 3D display. Game developers did what they had always done: they used tricks!

Tricks were used on the PC as well: very different approaches to generating three-dimensional graphics emerged, and none of them delivered "real" 3D. The notorious Wolfenstein 3D used a technique called "raycasting", which was later developed considerably further and for a long time seemed to be establishing itself as the technology for first-person shooters.

Voxel-based graphics, used by Comanche among others, offered an alternative. And of course polygon-based 3D graphics also existed early on. Stunt Car Racer was even available on the C64!

It is a matter of dispute at what point "real" 3D is achieved. At the moment Intel is intensifying research on ray tracing engines for computer games, because ray tracing is often seen as the only "real" 3D calculation method. "Avatar" is playing in cinemas, and the entertainment industry is hoping for "3D" as the next box-office draw.

The fact is, however, that polygon graphics (more precisely: polygon rasterization) emerged from the 1980s and 1990s as the winning principle for graphics calculation in computer games. Virtually every 3D game today uses polygon graphics. The main reasons for this victory are, on the one hand, that polygon graphics can be calculated quickly and easily using massively parallel number crunchers (GPUs), and on the other hand, that polygon graphics come very close to "real" 3D: at least as a mathematical construct, a real three-dimensional space is provided. Nevertheless, even today people still use tricks wherever possible ...

Mathematics - or: The world is that simple

But this article is not about clever trickery at all. Polygon graphics engines have a long history of development: From Stunts ...

... up to Crysis ...

... and beyond. But how does the technology work?

As the name suggests, the image in polygon graphics is composed of polygons, so-called "primitives". If several such polygons are put together to form a mesh, the result is a wireframe model of a three-dimensional object, which is then rendered. How this is done is described below.

First of all, it must be made clear that the three-dimensional world takes place "in" or "behind" the screen. If one imagines the screen as one of the surfaces of a cube, the three-dimensional world is located inside the cube. The screen is just our window through which we can look into the world in the cube.

The computer stores the three-dimensional world inside the cube in its main memory. More precisely, the memory contains the three-dimensional coordinates of each corner point of each polygon in the three-dimensional world. If an object moves through space, the coordinates of the polygons of the object in question are adjusted accordingly. The renderer can then calculate an updated view through our window into the world on the basis of the new data. A projection takes place: the three-dimensional world is projected onto the two-dimensional screen surface.
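As a minimal sketch of this idea in Python, assume an object is simply a list of (x, y, z) corner points. The names `translate` and `triangle` and the coordinates are hypothetical and chosen only for illustration; real engines organize this data differently.

```python
# Hypothetical example: an object stored as a list of 3D corner points.
def translate(vertices, dx, dy, dz):
    """Return a new list of vertices shifted by (dx, dy, dz)."""
    return [(x + dx, y + dy, z + dz) for (x, y, z) in vertices]

# One triangle (coordinates chosen arbitrarily for illustration).
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# Moving the object two units along x only changes the stored coordinates;
# the renderer would then project the updated points again.
moved = translate(triangle, 2.0, 0.0, 0.0)
print(moved)
```

The original list is left untouched here, which makes it easy to compare the positions before and after the move.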

3D models are usually created using modeling software such as Blender. Here, instead, a simple model is defined "by hand": a square pyramid. This requires a square and only two triangles (!), as the sides are not to be filled (nobody should notice the trick, we just don't tell anyone). This results in exactly three polygons, which together form a mesh.

Note: You wouldn't get very far in Direct3D with this mesh. Direct3D assumes that every polygon has exactly three corners (i.e. is a triangle). The base of the pyramid, however, is a square. It would therefore first be necessary to split the square into two triangles ("triangulate"). Direct3D makes this requirement for good reasons: on the one hand it allows uniform calculation algorithms, on the other hand it ensures that polygons are always convex.
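Triangulating a convex quadrilateral can be sketched in a few lines of Python. The function name `triangulate_quad` and the corner coordinates are assumptions made for this illustration; this is not how Direct3D itself handles meshes.

```python
# Hypothetical example: split a convex quad into two triangles.
def triangulate_quad(quad):
    """Split a convex quad (a, b, c, d), listed in order around its outline,
    into the triangles (a, b, c) and (a, c, d) along the diagonal a-c."""
    a, b, c, d = quad
    return [(a, b, c), (a, c, d)]

# Assumed corner coordinates: a square in the x-y plane, centered on the origin.
square = [(1, 1, 0), (-1, 1, 0), (-1, -1, 0), (1, -1, 0)]
triangles = triangulate_quad(square)
print(triangles)
```

Both resulting triangles share the diagonal from the first to the third corner, so together they cover exactly the area of the square.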

Here are the four coordinates of the square Q:

The square therefore lies in the x-y plane; the height z is zero for all points. The center of the square is exactly the origin (0, 0, 0). The top of the pyramid is a point exactly above the center, at a suitable height. The two side faces of the pyramid are defined in the same way.

Triangle A:


and triangle B:


Scoring - or: get down to business

So far, the world is made up of five points represented by three-dimensional vectors. To represent these three-dimensional vectors meaningfully on the screen, a projection is required, and such a projection requires a projection matrix. This matrix maps the three-dimensional vectors into two-dimensional space when the vectors are multiplied by it.

Matrices are multiplied by multiplying their elements in pairs and adding up in rows / columns. This short example already reveals the whole secret.
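The rule can also be written out in a few lines of plain Python. This is only a sketch: the helper name `matmul` and the numbers are chosen for illustration and are not the values used in the article.

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# A 1x3 row vector times a 3x2 matrix gives a 1x2 row vector.
v = [[1, 2, 3]]
m = [[1, 0],
     [0, 1],
     [2, 2]]
result = matmul(v, m)
print(result)  # [[7, 8]]
```

Each element of the result takes three multiplications and two additions here: 1*1 + 2*0 + 3*2 = 7 and 1*0 + 2*1 + 3*2 = 8.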

Note: Polygon graphics require an enormous number of such matrix multiplications. As you can see, only the basic arithmetic operations of multiplication and addition are needed. Early graphics accelerators (in the 3dfx days) were simple multiply/add machines. Today's graphics cards can do more because they have more tasks, but their main task is still massive multiplication and addition. The calculations are usually independent of one another, so many of them can run in parallel. Modern GPUs calculate a MUL and an ADD (also combined into a MADD) per clock and parallel unit. This adds up to many hundreds of billions of MADDs every second.

If the (transposed) vector A has the dimension 1x3 and the result of the multiplication should have the dimension 1x2, what must the dimension of the projection matrix P be?

A brief look makes it clear that 3x2 is the only possible dimension for P: a 1x3 vector multiplied by a 3x2 matrix yields a 1x2 result. Every matrix P that fulfills this property is a projection matrix. Which matrix you choose determines which perspective distortion you get. In this way you can, for example, also set the field of view (FOV).

A simple projection matrix is the cavalier projection. This matrix looks like this:

So if, for example, the first point of the base square of the pyramid is to be projected onto the plane, the result is:

The point (1.5; 0.25) can be drawn directly in the two-dimensional coordinate system. If you do this for all points in the scene, you get a whole set of points on the two-dimensional plane, which only have to be connected with each other according to a rule. For each polygon, this rule is usually "connect point one with point two, point two with point three, and point three with point one". For the base square used here, the rule must of course be extended to four points, which is another reason why Direct3D only allows triangles: it avoids any confusion in this regard.
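The whole procedure can be sketched in Python under clearly stated assumptions. The corner coordinates, the choice of the two side faces, and the matrix `P` below are not taken from the article: they are hypothetical values, with `P` chosen only so that an assumed first corner (1, 1, 0) lands on the point (1.5; 0.25) quoted above.

```python
# Assumed data for illustration: five corner points of the pyramid.
points = [
    (1.0, 1.0, 0.0),    # 0: base corner
    (-1.0, 1.0, 0.0),   # 1: base corner
    (-1.0, -1.0, 0.0),  # 2: base corner
    (1.0, -1.0, 0.0),   # 3: base corner
    (0.0, 0.0, 2.0),    # 4: apex above the center (height assumed)
]

# Three polygons as lists of point indices: the base square and two
# opposite side faces (an assumption about which faces are meant).
polygons = [[0, 1, 2, 3], [0, 1, 4], [2, 3, 4]]

# Assumed 3x2 oblique projection matrix (hypothetical values).
P = [[1.0, 0.0],
     [0.5, 0.25],
     [0.0, 1.0]]

def project(point, matrix):
    """Multiply a 3D row vector by a 3x2 matrix, giving a 2D point."""
    return tuple(
        sum(point[k] * matrix[k][j] for k in range(3)) for j in range(2)
    )

def edges(polygon):
    """Connect each point to the next one and the last back to the first."""
    return [(polygon[i], polygon[(i + 1) % len(polygon)])
            for i in range(len(polygon))]

projected = [project(p, P) for p in points]

# Collect every edge once, regardless of direction or shared polygons.
wireframe = sorted({tuple(sorted(e)) for poly in polygons for e in edges(poly)})

print(projected[0])    # (1.5, 0.25)
print(len(wireframe))  # 8
```

With these assumptions the three polygons already yield all eight edges of the pyramid, so the two missing side faces are not noticeable in a wireframe drawing. The `edges` helper works for any number of corners, so the base square needs no special case.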
