Welcome to SSAO v0.6 for Gamestudio A7 & A8! The purpose of this shader solution is to approximate ambient occlusion in real time as a post-processing effect for arbitrary 3D scenes. Screen-Space Ambient Occlusion (SSAO) helps add realism to a 3D scene: it darkens creases, holes and closely aligned surfaces and gives the viewer a better perception of depth, geometry and curvature.

Overview

Unlike other DX 9 / HLSL implementations, this SSAO solution offers both high quality and high performance: it auto-recognizes the surface types of most objects, handles soft-alpha transparent objects, creates a clean AO map without edge bleeding, doesn't bleed ambient occlusion across surfaces whose normals differ strongly, works with the Gamestudio particle system, supports GPU-animated characters and ignores decals. The solution is standalone and can easily be integrated into any existing Gamestudio project for A7 or A8. It fits well into existing shader chains, adapts automatically to screen resolution changes, and can produce depth and normal maps even when SSAO is turned off, so that they can be reused in other post-processing effects. You can customize it with a rich set of variables and compiler defines, so that it fits your project and uses as little memory and as few CPU/GPU cycles as possible.

Core features

  • Compatible with A7.86.6 & A8.30.3; shader model 3 required
  • Easy integration into any existing renderchain
  • Supports fog and particles; static meshes; solid-, alphatesting and softalpha entities; decals; animated sprites
  • Automatic GPU bone support (A8 pro only)
  • Works with (blurred) stencil shadows and PSSM shadows
  • Can provide depth and normal maps even if SSAO is disabled
  • Lots of showcase demos included + source; extensive CHM manual
  • Free to use in any project (MIT license)

Latest release

All releases include the corresponding Lite-C and HLSL shader sources and the demos with their sources and binaries (compiled and ready to run), as well as all artwork used.

SSAO for Gamestudio A7 & A8

V 0.6 - ∼44 MB - October 2011

Demos

This SSAO solution comes with a lot of demos. Some of them are embedded as videos below; if you want to see them live, please download the latest version and run them yourself.

Testscene Demo

The Testscene Demo is the most feature complete demo. It represents all aspects of a typical Gamestudio level. SSAO is activated in standalone mode on top of the camera.


Sponza Demo

The architecture, textures and coloring of the publicly available Sponza Demo (Frank Meinl, Crytek) are perfectly suited for global illumination tests. It is made entirely of high quality models and textures.


Humanoid Demo

The Humanoid Demo uses the most complex post-processing chain of all demos. Sources of the SSAO stage (here: depth) are recycled in another post-processing effect (here: Depth of Field). It also adds Gamestudio's PSSM (Parallel Split Shadow Mapping) shader. Under A8 Pro the crowd is GPU-animated, which shows the automatic support of GPU bones animation; otherwise the crowd is CPU-animated.


Ikosaeder Demo

The Ikosaeder Demo was made to test the compatibility of SSAO with stencil shadows and with blurring the stencil shadow map in a post-processing shader.


Animated Sprite Demo

The Animated Sprite Demo is used to test animated sprites and serves also as an example for the usefulness of softalpha rendering.


Integration

It is very easy to add SSAO to your project. When you are developing a standard Gamestudio game, it is in almost all cases sufficient to call runSsao(); once. Nevertheless, a game is an individual piece of software with a wide range of features and complex requirements that may need special treatment. Luckily, this SSAO solution is designed with both target audiences in mind: beginners who just want to add this cool effect to their game, and advanced users who already have high quality shaders and effects running in an established game environment.

Quickstart

To try it quickly in your project, do the following:

  • Copy the ppSsao.h & ppSsao.c files as well as all shaders (ao_*.fx) and the file ao_pp_encode.tga into your project folder.
  • Include ppSsao.h in your project via #include "ppSsao.h"
  • After loading the level, run runSsao();
  • Set toggleSsaoState on a key, like on_h = toggleSsaoState; and press the key (here: H) to toggle the SSAO effect on and off.

Example

#include 
#include "ppSsao.h" 

int main () 
{
   level_load("level.wmb"); 

   runSsao();
   on_h = toggleSsaoState; 
}
							

Methodology

The following sections briefly describe the techniques used to generate the SSAO in this implementation.

G-Buffer

Screen-Space Ambient Occlusion (SSAO) is by definition a post-processing effect: the shading is achieved by processing only bitmaps that were generated during rendering. Classic SSAO implementations, like the published Crytek shader [Kajalin2008], require only a depth map as scene description. This implementation also requires a normal map for upscaling the final ambient occlusion (AO) map, because AO is calculated at half the screen size. Soft alpha transparency is hard to deal with in all post-processing techniques, and SSAO is no exception. To allow soft alpha sprites that should not be affected by SSAO (like explosions, halos, fake light shafts, drop shadows) to be blended into the scene, soft alpha mapping is used.

These requirements lead to a G-Buffer (G for geometry) that stores depth, normal and soft alpha information. The scene is rendered twice, once for depth only and once for normals + alpha, so this implementation uses a 2-pass G-Buffer. After the scene has been rendered with the user camera, the two SSAO views render it again from the same position with identical camera parameters (angle, arc, aspect ratio, etc.) and assign each entity a shader, depending on the surface type of each mesh and on whether the mesh is transformed by GPU bones or not. Point particles are automatically suppressed in the user view and in both G-Buffer views, since particles are instantiated into the scene later, in a post-postprocessing rendering view.

SSAO approximation

This implementation uses the same technique as the published Crytek shader, except that the rotation vectors are not point-sampled from a texture but stored in the shader itself, and that the SSAO is calculated at half the resolution. This common approach in real-time computer graphics accelerates the post processing dramatically, because halving the width and the height reduces the number of shaded pixels to 25%. It requires a sophisticated upscaling technique to avoid edge bleeding, though. Using half the screen resolution for approximating SSAO is nevertheless a good trade-off, because AO terms that are generated for a block of four pixels instead of one are likely to be stable: AO surfaces contain few high-frequency details. Edges caused by prominent geometry details and depth discontinuities can be treated with the integrated blur and upscaling techniques. So, using only half the resolution causes additional shader calculations for repairing undersampled regions, but in the end the same visual quality can be achieved with a net performance gain compared to processing the full screen size.
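To make the 25% figure concrete, here is a minimal sketch that counts the pixels an SSAO pass has to shade for a given downscale divisor. The function name `ssao_pixels` is made up for this illustration and is not part of ppSsao.h.

```c
#include <assert.h>

/* Number of pixels the SSAO pass shades when its render target is
   scaled down by `divisor` in both dimensions (1 = full, 2 = half).
   Hypothetical helper, for illustration only. */
long ssao_pixels(long width, long height, long divisor)
{
    assert(width > 0 && height > 0 && divisor > 0);
    return (width / divisor) * (height / divisor);
}
```

For a 1280 x 720 screen this gives 921600 pixels at full size and 230400 pixels at half size, i.e. one quarter of the work per sample.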

Surface types

The approach of [Kajalin2008] assumes that a correct depth map is provided, and since the approach uses only the depth map, this works. A generic SSAO solution that has to cope with unknown game scenarios needs more information than that. The biggest uncertainty is which kind of surface a game object has. In games, everything is faked and worked around, so it is unclear whether an object A) is completely solid, B) has transparent parts that are treated with an alphatesting (or "cut out") material, C) uses a real soft alpha channel, D) is a decal or E) is something that cannot receive any shading, like the sky. The answer to this question is crucial: if the SSAO solution knows, for example, that an object is a decal, it can clip the decal from the G-Buffer, and if a pixel covers the sky, no AO is calculated for it. If the object uses an alphatesting material, the shader that replaces the object's own shader in the G-Buffer views can apply alpha testing, too, and if it is an object with real soft alpha channel transparency, it can be treated differently as well.

This implementation automatically registers for each game object an identifier indicating its surface type, if possible. If that is not possible, the object is treated as solid. The user can override this, though. This approach is also used in other solutions, like in the Unity engine.
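The surface type decision described above can be sketched in plain C as a small lookup. All identifiers here (`SurfaceType`, `GBufferAction`, `gbuffer_action`) are hypothetical and chosen for illustration; they are not the names used in ppSsao.h, and the real solution assigns shaders rather than returning an enum.

```c
#include <assert.h>

/* Hypothetical surface type identifiers (not taken from ppSsao.h). */
typedef enum {
    SURFACE_UNKNOWN = 0,  /* type could not be detected automatically */
    SURFACE_SOLID,
    SURFACE_ALPHATEST,    /* "cut out" material */
    SURFACE_SOFTALPHA,    /* real soft alpha channel */
    SURFACE_DECAL,
    SURFACE_SKY
} SurfaceType;

/* What the G-Buffer views do with an object of a given type. */
typedef enum {
    GBUFFER_WRITE,           /* write depth + normal */
    GBUFFER_WRITE_CLIPPED,   /* write depth + normal, discard cut-out texels */
    GBUFFER_SOFTALPHA_ONLY,  /* no depth/normal, accumulate into soft alpha map */
    GBUFFER_NO_AO            /* clipped or excluded, receives no AO */
} GBufferAction;

/* A user override wins over the detected type; anything that stays
   unknown falls back to "solid", as described in the text. */
GBufferAction gbuffer_action(SurfaceType detected, SurfaceType user_override)
{
    assert(detected >= SURFACE_UNKNOWN && detected <= SURFACE_SKY);
    SurfaceType type =
        (user_override != SURFACE_UNKNOWN) ? user_override : detected;

    switch (type) {
    case SURFACE_ALPHATEST: return GBUFFER_WRITE_CLIPPED;
    case SURFACE_SOFTALPHA: return GBUFFER_SOFTALPHA_ONLY;
    case SURFACE_DECAL:
    case SURFACE_SKY:       return GBUFFER_NO_AO;
    case SURFACE_SOLID:
    case SURFACE_UNKNOWN:
    default:                return GBUFFER_WRITE;
    }
}
```

Calling `gbuffer_action(SURFACE_UNKNOWN, SURFACE_UNKNOWN)` yields `GBUFFER_WRITE`, which mirrors the fallback to a solid object when detection fails.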

Soft alpha mapping

In this context, the term soft alpha mapping describes the exclusion of certain soft alpha meshes & sprites from the G-Buffer depth + normal sampling and the writing of their accumulated alpha texture into a third buffer, the so-called soft alpha map. After the final SSAO image has been calculated, the AO terms are suppressed with the soft alpha map like this:

soft alpha map

SSAO map of a scene

suppression by soft alpha

This way, meshes and sprites that are registered as soft alpha surfaces can overlap, with their alpha channel, AO terms that lie behind them without getting shaded. This is very useful for explosions, light shafts, halos, text sprites and all kinds of things that are "not solid".

scene without SSAO + soft alpha sprite

with SSAO
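The text does not state the exact blend operation used for the suppression. A plausible minimal sketch, assuming that an AO value of 1.0 means "unoccluded" and that the soft alpha map stores accumulated opacity in [0, 1], is a linear interpolation towards 1.0. The function name `suppress_ao` is hypothetical.

```c
#include <assert.h>

/* Fades an AO term towards 1.0 (no occlusion) where a soft alpha
   surface covers the pixel. Assumes ao = 1.0 is unoccluded and
   soft_alpha = 1.0 is fully opaque; both are assumptions, since the
   actual shader math is not given in the text. */
float suppress_ao(float ao, float soft_alpha)
{
    assert(ao >= 0.0f && ao <= 1.0f);
    assert(soft_alpha >= 0.0f && soft_alpha <= 1.0f);
    return ao + (1.0f - ao) * soft_alpha;
}
```

With this blend, a pixel that is not covered by any soft alpha sprite keeps its AO term, while a fully covered pixel ends up without any darkening.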

Thresholded blur

[Kajalin2008] uses a 4x4 box blur kernel to smooth the quite noisy SSAO image. This is a very basic approach, since it immediately introduces edge bleeding. Since the shader from [Kajalin2008] works at the full screen resolution, the bleeding is less noticeable, but it is there. Because this implementation works at half the screen resolution, it is very important that the blurred SSAO image, which is upscaled afterwards, A) has a high quality and B) avoids edge bleeding as much as possible.

To achieve this, a technique from [Aras] has been modified and adapted: thresholded box blurring. To put it in simple terms: instead of convolving the screen with a plain box blur kernel, two independent thresholds are calculated and used to weight each sampled pixel. The thresholds are based upon 1) the dot product of the normals of the to-be-blurred pixel and the to-be-weighted pixel and 2) the depth delta between the same two pixels. The normal dot product makes sure that a pixel which receives the averaged AO value of its neighbours doesn't receive AO from surfaces that are not similar, like in the corners of perpendicular walls. The depth delta makes sure that the AO is separated from surfaces that are far away; otherwise, the AO of a surface in the background (e.g. a dark part of a room) would blend into the AO of a surface in the foreground, like the head of the avatar, which is likely to be bright.

The threshold for normals is fixed. The threshold for depth discontinuities is calculated dynamically, using a custom parabola equation. To achieve smooth AO surfaces, a high number of samples are taken for the box blur.
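The idea of a thresholded box blur can be sketched on the CPU as follows. This is a simplified illustration and not the shader shipped with this package: the weights are binary (a neighbour is either accepted or rejected), and the depth threshold is passed in as a fixed value instead of being derived from the custom parabola mentioned above. The type `AoSample` and the function `blur_pixel` are hypothetical names.

```c
#include <assert.h>

typedef struct {
    float ao;          /* raw ambient occlusion term */
    float depth;       /* linear depth of the pixel */
    float nx, ny, nz;  /* unit-length surface normal */
} AoSample;

/* Averages the AO of all neighbours inside a (2*radius+1)^2 box whose
   normal and depth are similar enough to the center pixel. */
float blur_pixel(const AoSample *img, int width, int height,
                 int x, int y, int radius,
                 float normal_threshold, float depth_threshold)
{
    assert(img && radius >= 0);
    assert(x >= 0 && x < width && y >= 0 && y < height);

    const AoSample *c = &img[y * width + x];
    float sum = 0.0f, weight = 0.0f;

    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            int sx = x + dx, sy = y + dy;
            if (sx < 0 || sx >= width || sy < 0 || sy >= height)
                continue;  /* outside the image */

            const AoSample *s = &img[sy * width + sx];
            float ndot = c->nx * s->nx + c->ny * s->ny + c->nz * s->nz;
            float ddelta = c->depth - s->depth;
            if (ddelta < 0.0f)
                ddelta = -ddelta;

            /* Reject samples from dissimilar or distant surfaces. */
            if (ndot < normal_threshold || ddelta > depth_threshold)
                continue;

            sum += s->ao;
            weight += 1.0f;
        }
    }
    /* Fall back to the unblurred value if every sample was rejected. */
    return (weight > 0.0f) ? sum / weight : c->ao;
}
```

In a corner between a floor and a wall, the floor pixels only average with other floor pixels, because the dot product of the perpendicular normals is 0 and falls below the normal threshold.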

Reversed bilateral upscaling

Since this implementation works at half the screen resolution and smart blurring is used after the SSAO stage, some sort of upscaling technique is required to make sure that no edge bleeding occurs due to bilinear sampling. As shown by Jeremy Shopf in a GDC workshop in 2009 [Shopf2009], bilateral upscaling [Peter-PikeSloan2007] can be used to upscale lower resolution post-processing results to a high resolution version while keeping a high image quality at geometry-based edges and at normal and depth discontinuities.

In his original implementation, he samples at full screen size, for each hires pixel, the corresponding low resolution AO value, normal and depth (both bilinearly downsampled beforehand; here this is done in the SSAO stage) and the 4 neighbouring normal and depth values. Then he determines in which corner of the 2x2 pixel block of the hires image, which is the target of the corresponding low resolution AO pixel, the current pixel lies. Using the known bilinear weights for downscaling, he weights the normal and depth values to decide how the lowres AO pixel gets upscaled. This technique works very well, has a high image quality and is used in several commercial games, e.g. to calculate particle effects at half the resolution.

The proposed bilateral upscaling technique has some unnecessary overhead, though. For each pixel of a 2x2 block, you sample the same hires normal and depth values and the same lowres AO, depth and normal values. This implementation introduces a different take on the approach, called Reversed Bilateral Upscaling. Since you essentially calculate for each 2x2 hires block four upscaled grayscale AO terms from the same sources, you can precalculate the upscaled image in the domain of the half screen size and reconstruct the hires upscaled image in a second pass via a simple texture lookup.

In the first pass (which uses a render target with half the screen size), you sample the lowres AO term, normal and depth and then the four neighbouring normals and depths of the 2x2 hires block. Then you calculate the values of the upscaled 2x2 block and write them in a linear RGBA sequence into the render target. In the second pass, you sample twice: 1) the RGBA vector with the four upscaled AO terms for the current 2x2 block and 2) a 2x2 RGBA texture which encodes the corner of the current pixel, so it stores (1,0,0,0), (0,1,0,0) and so on. The RGBA AO vector is then multiplied elementwise with the corner RGBA vector, which masks out the AO term that belongs to the current pixel. Finally, the sum of the components yields the final upscaled AO term.
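The second pass can be illustrated with a small CPU sketch. It assumes that the four AO terms of a 2x2 block are packed row by row (R = top left, G = top right, B = bottom left, A = bottom right); the text only says "linear RGBA sequence", so this ordering is an assumption. `corner_mask` stands in for the 2x2 lookup texture, and both function names are hypothetical.

```c
#include <assert.h>

typedef struct { float r, g, b, a; } Rgba;

/* Stand-in for the 2x2 encode texture: returns a one-hot RGBA vector
   that marks the corner of the 2x2 block the hires pixel lies in. */
Rgba corner_mask(int x, int y)
{
    static const Rgba masks[4] = {
        {1.0f, 0.0f, 0.0f, 0.0f},  /* top left     */
        {0.0f, 1.0f, 0.0f, 0.0f},  /* top right    */
        {0.0f, 0.0f, 1.0f, 0.0f},  /* bottom left  */
        {0.0f, 0.0f, 0.0f, 1.0f}   /* bottom right */
    };
    assert(x >= 0 && y >= 0);
    return masks[(y & 1) * 2 + (x & 1)];
}

/* Reconstructs the AO term of hires pixel (x, y) from the half-size
   image `packed`, which holds four precalculated AO terms per texel. */
float reconstruct_ao(const Rgba *packed, int half_width, int x, int y)
{
    assert(packed && half_width > 0);
    Rgba ao4 = packed[(y / 2) * half_width + (x / 2)];
    Rgba m = corner_mask(x, y);
    /* Elementwise multiply, then sum: only the masked term survives. */
    return ao4.r * m.r + ao4.g * m.g + ao4.b * m.b + ao4.a * m.a;
}
```

The multiply-and-sum is a dot product, so a shader can express the whole reconstruction with two texture reads and one `dot` instruction, without any branching on the pixel position.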

If W and H are the width and height of the hires image, the original technique from [Peter-PikeSloan2007] uses W x H x (3 + 4) lookups, while the reversed technique uses (W/2 x H/2 x (3 + 4)) + (W x H x 2) lookups. This equals a ratio of 7 : 3.75 lookups per hires pixel (independent of the screen size), i.e. about 46.4% fewer texture lookups than the original implementation.
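The two lookup formulas can be checked with a few lines of C. The function names are made up for this example, and even image dimensions are assumed so that W/2 and H/2 are exact.

```c
#include <assert.h>

/* Texture lookups of the original bilateral upscaling:
   3 lowres + 4 hires samples for every hires pixel. */
long long lookups_original(long long w, long long h)
{
    assert(w > 0 && h > 0);
    return w * h * (3 + 4);
}

/* Texture lookups of the reversed technique: the 3 + 4 samples are
   taken once per half-size pixel, plus 2 samples per hires pixel
   in the reconstruction pass. */
long long lookups_reversed(long long w, long long h)
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    return (w / 2) * (h / 2) * (3 + 4) + w * h * 2;
}
```

For 1280 x 720 this gives 6451200 lookups for the original technique and 3456000 for the reversed one, which matches the ratio 7 : 3.75.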

For now, a file ao_pp_encode.tga has to be used for encoding. This will be replaced by a procedural bitmap in a future update.

Post postprocessing particle instantiation

Since the scene is rendered again for the G-Buffer, the Z-Buffer remains available even after the post-processing stages. This is used to suppress particles in all rendering views (including the user view) and to instantiate them afterwards in the final SSAO + diffuse image.

References

Peter-PikeSloan2007
Sloan, P.-P., Govindaraju, N. K., Nowrouzezahrai, D. & Snyder, J.
Image-Based Proxy Accumulation for Real-Time Soft Global Illumination (PDF)
15th Pacific Conference on Computer Graphics and Applications, 2007, 97-105

Kajalin2008
Kajalin, V., Engel, W. (Ed.)
Screen-Space Ambient Occlusion (VI)
ShaderX 7 - Advanced Rendering Techniques (URL), Charles River Media, 2008, 413-423

Shopf2009
Shopf, J.
Mixed Resolution Rendering (PPT)
Game Developers Conference, 2009

Aras
Aras, R. & Shen, Y.
GPU Accelerated Stylistic Augmented Reality (PDF)

License

SSAO v0.6 for Gamestudio A7 & A8
Copyright © 2010, 2011 by Christian Behrenberg

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Credits

SSAO v0.6 for Gamestudio A7 & A8
Copyright © 2010, 2011 by Christian Behrenberg

Christian Behrenberg
Lite-C code, manual, demos, shaders
Reversed Bilateral Upscaling


Additional work and Kudos

Nils Daumann
Port of the Crytek fragment shader from ShaderX7 to Gamestudio A8; Manual revision

Johann Christian Lotter
Custom Gamestudio A8 engine modifications and bug fixing

Sidney Just, Felix Pohl, Robert Jäger, Jonas Freiknecht, Kitsune Horstmann
Testing, Manual revision

Christopher Bläsius
Notes on hardware compatibility issues

André Weinhold
Feature suggestions and testing


Manual and Software are protected under the copyright laws of Germany. Any trademarks used in this manual are trademarks of their respective owners. Any reproduction of the material and artwork printed herein without the written permission of Christian Behrenberg is prohibited. We undertake no guarantee for the accuracy of this manual. Christian Behrenberg reserves the right to make alterations or updates without further announcement.
