Showing posts with label planet rendering.

Saturday, November 17, 2012

View Frustum Culling of Sphere-mapped Terrain

View frustum culling is quite an important part of any 3D engine, and even more so of engines focusing on large-scale terrain rendering. For objects we are using a variant of the “p/n vertex” approach for culling object bounding boxes, represented by a center and a half-extent vector. The algorithm is described, for example, in the Real-Time Rendering book. Fabian Giesen also has a nice comparison and evolution of view frustum culling methods on his blog.

The idea is to check the intersection of individual frustum planes with the axis-aligned box by computing the distance of the box corner that lies farthest “into” the plane. There’s an elegant way to do that. Provided you have an axis-aligned box represented by a center and half-extents for x, y and z, you can compute the signed distance from the center to the plane (using a dot product with the plane normal). The plane normal points inside the frustum, so for centers inside of it you get a positive distance, and a negative one otherwise.

The box corner that lies farthest into the plane (the one we need to check against) is the one that maximizes the signed distance from the plane. We can get that extra distance by projecting the half-extents onto the plane normal and summing up the absolute values (or, since the extents are positive, by dotting them with the absolute values of the plane normal's components).
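A minimal sketch of this test, using GLM-style vector math (the function name and the plane convention - normal pointing into the frustum, with dot(normal, p) + offset giving the signed distance of a point p - are illustrative, not Outerra's code):

    #include <glm/glm.hpp>

    // Returns true when the whole box lies behind the plane, i.e. outside the frustum
    // with respect to this particular plane.
    bool aabb_outside_plane(const glm::vec3& center, const glm::vec3& half_extents,
                            const glm::vec4& plane)
    {
        glm::vec3 n(plane);                              // plane normal (plane.w is the offset)
        float d = glm::dot(center, n) + plane.w;         // signed distance of the box center
        float r = glm::dot(half_extents, glm::abs(n));   // center-to-farthest-corner distance along n
        return d + r < 0.0f;                             // even the farthest corner is behind the plane
    }

The full test repeats this for all six frustum planes and rejects the box as soon as it lies fully behind any one of them.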


This method can be generalized to oriented bounding boxes by rotating the plane into the OBB space first.
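As a sketch, the OBB variant only adds one step - expressing the plane normal in the box's local frame (here 'axes' is assumed to hold the box's unit axes as matrix columns):

    #include <glm/glm.hpp>

    // OBB variant: rotate the plane normal into the box's local space first,
    // then run the same center/half-extent test.
    bool obb_outside_plane(const glm::vec3& center, const glm::vec3& half_extents,
                           const glm::mat3& axes, const glm::vec4& plane)
    {
        glm::vec3 n(plane);
        float d = glm::dot(center, n) + plane.w;                   // signed distance of the box center
        glm::vec3 local_n = glm::abs(glm::transpose(axes) * n);    // |dot(n, axis)| for each box axis
        float r = glm::dot(half_extents, local_n);                 // support radius along the plane normal
        return d + r < 0.0f;
    }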

Culling terrain tiles


The algorithm is of course applicable to culling a tiled terrain. For a plane-mapped terrain the situation would be much simpler - one would certainly choose the terrain tiles to be axis-aligned from the start, avoiding the need to rotate the frustum planes into the oriented bounding-box space.

However, for spherical worlds it’s more complicated. Not only are the tiles on the sphere's surface arbitrarily oriented, but because there’s no way to wrap a rectangular grid over a sphere seamlessly, the tiles are also deformed.

Outerra uses a variant of quadrilateralized spherical cube mapping which looks like this:


You’ll notice that the individual tiles generally aren’t rectangular. The major deformation component is the shear. Upper-level tiles are also deformed in other, non-symmetric ways, but the lower we go in the quad-tree hierarchy, the more the shear becomes the dominant one.

We could wrap the tiles in oriented or even axis-aligned bounding boxes, but in many cases that would be wasteful, leading to false positives.

But here’s an interesting thing. Since the shear is an affine transform, one can actually transform the frustum planes directly into the skewed space and perform the culling test there as usual.


This can be done by multiplying the rotation matrix by our shear matrix, and transforming the planes with the transpose of the inverse of the resulting matrix.

(Update: simplified the math)

The resulting rotation matrix can simply be made from the normalized basis vectors of the skewed tile space itself, like this:


The following code tests whether a skewed box lies in the frustum:

    float d = dot(center, plane);    // signed distance of the box center from the plane
    float3 npv = abs(R * plane);     // plane normal expressed in the skewed tile basis, absolute values
    float r = dot(extents, npv);     // projection of the half-extents onto the plane normal

    if(d + r > 0)       // partially inside
        if(d - r >= 0)  // fully inside

The npv = abs(R*plane) term can be precomputed: from a certain level of the quad-tree the u and v vectors of the tiles don't change much, so the matrix R changes only marginally and npv can be cached and reused.
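Putting it together, here's a hedged sketch of the skewed-tile test in GLM-style code (the names u, v, n and the way the bounding volume is parameterized are illustrative, not Outerra's actual API):

    #include <glm/glm.hpp>

    // u, v are the tile's sheared tangent directions and n its surface normal, all in
    // world space; 'extents' holds the half-sizes of the tile's bounding volume measured
    // along their normalized directions.
    bool skewed_tile_outside_plane(const glm::vec3& center, const glm::vec3& extents,
                                   const glm::vec3& u, const glm::vec3& v, const glm::vec3& n,
                                   const glm::vec4& plane)
    {
        // rows of R are the normalized skewed basis vectors, so R * normal expresses
        // the plane normal in the tile's local (sheared) frame
        glm::mat3 R = glm::transpose(glm::mat3(glm::normalize(u), glm::normalize(v), glm::normalize(n)));

        glm::vec3 npv = glm::abs(R * glm::vec3(plane));            // the cacheable term from above
        float d = glm::dot(center, glm::vec3(plane)) + plane.w;    // signed distance of the tile center
        float r = glm::dot(extents, npv);                          // support radius of the skewed box along the normal

        return d + r < 0.0f;    // even the farthest corner is behind the plane
    }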

Lastly, here’s a short video showing it in action on an extreme case of shear:

Sunday, July 3, 2011

Book: 3D Engine Design for Virtual Globes

3D Engine Design for Virtual Globes is a book by Patrick Cozzi and Kevin Ring describing the essential techniques and algorithms used in the design of planetary-scale 3D engines. It's interesting to note that even though virtual globes gained popularity a long time ago with software like Google Earth or NASA World Wind, there wasn't any book dealing with this topic until now.


As the topic of the book is also relevant for planetary engines like Outerra, I would like to do a short review here.
I was initially contacted by Patrick to review the chapter about depth precision, and later he also asked for permission to include some images from Outerra. You can check out the sample chapters, for example the one on Level of Detail.

Behind the simple title you'll find an almost surprisingly in-depth analysis of techniques essential for the design of virtual globe and planetary-scale 3D engines. After the intro, the book starts with the fundamentals: the basic math apparatus and the basic building blocks of a modern, hardware-friendly 3D renderer. The fundamentals conclude with a chapter about globe rendering, on the ways of tessellating the globe so that it can be fed to the renderer, together with appropriate globe texturing and lighting.

Part II of the book guides you to an area that you cannot afford to neglect if you don't want to hit a wall further along in your design - precision. Regardless of what spatial units you are using, it's the range of detail expressible in the floating point values supported by 3D hardware that limits you. If you want to achieve both a global view of a planet from space and a ground-level view of its surface, then without handling the precision you'll get jitter as you zoom in, and it soon becomes unusable. The book introduces several approaches used to solve these vertex precision issues, each possibly suited to different areas.

Another precision issue that affects the rendering of large areas is the precision of the depth buffer. Because of an old, non-ideal hardware design that reuses values from the perspective division also for the depth values it writes, depth buffer issues show up even in games with larger outdoor levels. In planetary engines that also want human-scale detail this problem grows beyond all bounds. The chapter on depth buffer precision compares several algorithms that more or less solve this problem, including the algorithm we use in Outerra - the logarithmic depth buffer. Who knows, maybe one day we'll get direct hardware support for it, as per Thatcher Ulrich's suggestion, and it will become a thing of the past.

The third part of the book deals with the rendering of vector data in virtual globes, used to render things like country boundaries or rivers, or polygon overlays to highlight areas of interest. It also deals with the rendering of billboards (marks) on terrain, and the rendering of text labels on virtual globes.

The last chapter in this part, Exploiting Parallelism in Resource Preparation, deals with an important issue popping up in virtual globes: utilizing parallelism in the management of content and resources. Being able to load data in the background without interfering with the main rendering is one of the crucial requirements here.

The last part of the book talks about the rendering of massive terrains in a hardware-friendly manner: the representation of terrain, preprocessing, and level of detail. Two major rendering approaches have their own dedicated chapters: geometry clipmapping and chunked LOD, together with a comparison. Of course, each chapter also comes with a comprehensive list of external resources.


We've received many questions from people who wanted to know how we started programming our engine, what problems we encountered, or how we solved this or that. Many of them I can now direct to this book, which really covers the essential things one needs to know here.

Sunday, October 31, 2010

Speed of Light

We had this idea some time ago - to experience how fast the speed of light actually is by flying away from the planet in Outerra at that speed. Now I've made a short video showing exactly that - flying away from the surface into space, and then returning back (but overshooting past the planet).


And while the speed is indeed great, one can feel that it's also terribly slow when considering the extents of space...

Thursday, December 31, 2009

Floating Point Depth Buffer

Update: more complete and updated info about the use of the reverse floating point buffer can be found in the post Maximizing Depth Buffer Range and Precision.

I had always thought that using a floating point depth buffer on modern hardware would solve all the depth buffer problems, in a similar way to the logarithmic depth buffer but without requiring any changes in the shader code, and without the artifacts and potential performance losses of its workarounds. So I was quite surprised when swiftcoder mentioned in this thread that he had found the floating point depth buffer to have insufficient resolution for a planetary renderer.

The value that gets written into the Z-buffer is the value of z/w after projection, and it has an unfortunate shape that gives enormous precision to a very narrow range close to the near plane. In fact, almost half of the possible values lie within two times the distance of the near plane. In the picture below it's the red curve. The logarithmic distribution (blue curve), on the other hand, is optimal with regard to the object sizes that can be visible at a given distance.


A floating point depth buffer should be able to handle the original z/w curve, because the exponent part corresponds to the logarithm of the number.

However, here's the catch. Depth buffer values converge towards 1.0 - in fact most of the depth range gives values very close to it. The resolution of a floating point number around 1.0 is entirely given by the mantissa, and it's approximately 1e-7. That is not enough for planetary rendering, given the ugly shape of z/w.
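A quick numeric check of both observations (a sketch assuming a standard projection that maps depth to the 0..1 range, a 1 m near plane and a 10,000 km far plane):

    #include <cmath>
    #include <cstdio>

    int main()
    {
        // post-projection depth z/w of a standard projection mapping near to 0 and far to 1
        auto zw = [](double z, double zn, double zf) { return zf * (z - zn) / (z * (zf - zn)); };

        double zn = 1.0, zf = 1e7;                                   // 1 m near plane, 10,000 km far plane
        std::printf("z/w at 2*near: %f\n", zw(2.0 * zn, zn, zf));    // ~0.5: half the range spent within 2 m

        // spacing of representable 32-bit floats just below 1.0
        std::printf("float step below 1.0: %g\n", 1.0 - std::nextafterf(1.0f, 0.0f));    // ~6e-8
    }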

However, the solution is quite easy. Swapping the values of the far and near planes and changing the depth function to "greater" inverts the z/w curve so that it converges towards zero with rising distance, where there is plenty of floating point resolution.
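A minimal sketch of that setup (assuming a GLM-style perspective helper that simply plugs the swapped plane distances into the usual formula, and a floating point depth attachment such as GL_DEPTH_COMPONENT32F):

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>
    #include <GL/gl.h>

    // Reversed floating point depth: near and far are swapped in the projection,
    // the buffer is cleared to 0 instead of 1, and the depth test is flipped.
    glm::mat4 setup_reversed_depth(float fovy, float aspect, float znear, float zfar)
    {
        glm::mat4 proj = glm::perspective(fovy, aspect, zfar, znear);    // far and near swapped

        glClearDepth(0.0);            // "far" now lies at depth 0
        glDepthFunc(GL_GREATER);      // nearer fragments produce greater depth values
        glEnable(GL_DEPTH_TEST);
        return proj;
    }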

I've also found an earlier post by Humus where he says the same thing, and also gives more insight into the old W-buffers and various Z-buffer properties and optimizations.

Monday, August 31, 2009

Short flight sim video

It's been one year since we released the first videos from the Outerra engine, so it's about time to release another short one, featuring a lifeless Jalapeno and a pilotless Cessna plane with a broken propeller.

Wednesday, August 12, 2009

Logarithmic Depth Buffer

I assume pretty much every 3D programmer runs into Z-buffer issues sooner or later, especially when doing planetary rendering: the distant stuff can be a thousand kilometers away, but you would still like to see fine details right in front of the camera.

Previously I dealt with the problem by splitting the depth range in two, using one part for near geometry and the other for distant geometry. The boundary was floating, somewhere around 5 km - quad-tree tiles up to a certain level used the distant part, and the more detailed tiles, which by the law of LOD occur nearer the camera, used the other part.
Most of the time this worked. But in one case it failed miserably - when a more detailed tile appeared behind a less detailed one.
I was thinking about ways to fix it, grumbling about why we can't have a Z-buffer with a better distribution, when it occurred to me that maybe we can.

Steve Baker's document explains the common problems with the Z-buffer. In short, the depth values are proportional to the reciprocal of Z. This gives enormous precision near the camera but little in the distance. A common remedy is to move the near clip plane further away, which helps but also brings its own problem, mainly that... the near clip plane is too far.

A much better Z-value distribution is a logarithmic one. It also plays nicely with the LOD schemes used in large-scale terrain rendering.
The following equations modify the depth value after it's been transformed by the projection matrix:

    z = log(C*w + 1) / log(C*Far + 1) * w      //DirectX with depth range 0..1
or 

    z = (2*log(C*w + 1) / log(C*Far + 1) - 1) * w   //OpenGL, depth range -1..1
 
Note: you can use the value of w after your vertices are transformed by your model-view-projection matrix, since the w component ends up holding the view-space depth. Hence w is used in the equations above.

Update: Logarithmic depth buffer optimizations & fixes

Here C is a constant that determines the resolution near the camera, and the multiplication by w undoes in advance the implicit division by w later in the pipeline.
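For illustration, this is how the OpenGL variant might be applied to a clip-space position right after the model-view-projection transform (a sketch in C++ for readability; in the engine this would be the last step of the vertex shader):

    #include <cmath>
    #include <glm/glm.hpp>

    // Applies the logarithmic depth remapping (OpenGL convention, depth range -1..1)
    // to a clip-space position. C controls the resolution near the camera and zfar
    // is the far plane distance.
    glm::vec4 apply_log_depth(glm::vec4 clip, float C, float zfar)
    {
        float w = clip.w;    // after the MVP transform, w holds the view-space depth
        clip.z = (2.0f * std::log(C * w + 1.0f) / std::log(C * zfar + 1.0f) - 1.0f) * w;
        return clip;         // the later division by w restores the logarithmic value
    }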
The resolution at distance x, for a given C and an n-bit Z-buffer, can be computed as

    Res = log(C*Far + 1) / ((2^n - 1) * C/(C*x+1))

So for example for a far plane at 10,000 km and 24-bit Z-buffer this gives the following resolutions:
            1m      10m     100m    1km     10km    100km   1Mm     10Mm
------------------------------------------------------------------------
C=1         1.9e-6  1.1e-5  9.7e-5  0.001   0.01    0.096   0.96    9.6     [m]
C=0.001     0.0005  0.0005  0.0006  0.001   0.006   0.055   0.549   5.49    [m]
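The table can be reproduced with a few lines of code (a sketch assuming the natural logarithm, a 24-bit Z-buffer and a far plane at 10,000 km):

    #include <cmath>
    #include <cstdio>

    // Resolution of the logarithmic depth buffer at distance x, for constant C,
    // far plane distance zfar and an n-bit depth buffer.
    double log_depth_resolution(double x, double C, double zfar, int bits)
    {
        double steps = std::pow(2.0, bits) - 1.0;    // number of representable depth values
        return std::log(C * zfar + 1.0) * (C * x + 1.0) / (steps * C);
    }

    int main()
    {
        // reproduces the C=1 row above: far plane at 10,000 km, 24-bit Z-buffer
        const double xs[] = { 1.0, 10.0, 100.0, 1e3, 1e4, 1e5, 1e6, 1e7 };
        for (double x : xs)
            std::printf("%8g m -> %.2g m\n", x, log_depth_resolution(x, 1.0, 1e7, 24));
    }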

Along with the better utilization of the z-value space, it also (almost) gets rid of the near clip plane.

And here comes the result.


Looking into the nose while keeping an eye on distant mountains...


10 thousand kilometers, no near Z clipping and no Z-fighting! HOORAY!

More details

The constant C basically changes the resolution near the camera; I used C=1 for the screenshots, giving a theoretical resolution of 1.9e-6 m. However, the resolution near the camera cannot be fully utilized unless the geometry is finely tessellated too, because the depth is interpolated linearly and not logarithmically. On models such as the guy in the screenshots it is perfectly fine to put the camera on his nose, but on models with long strips whose vertices are a few meters apart the interpolation artifacts can become visible. We will be dealing with it by requiring a certain minimum tessellation.


Fragment shader interpolation fix

Ysaneya suggested a fix for the artifacts occurring with thin or large triangles close to the camera, where the perspectively interpolated depth values diverge too much from the logarithmic ones: writing the correct Z-value at the pixel shader level. This disables the fast-Z mode, but he found the performance hit to be negligible.

Update: more complete and updated info about the logarithmic depth buffer and the reverse floating point buffer can be found in the post Maximizing Depth Buffer Range and Precision.