Planet Gamedev

Game From Scratch

Unity 5 Vs. Unreal Engine 4. The next major tutorial project on

by at March 29, 2015 04:10 PM


I’ve been trying to decide what major project to embark on here at  All of the current tutorial series are at a stage that I feel they are “good enough” to get anyone started and I will keep adding to them over time.  Once I reach that point I need to decide on what project to work on next.  It’s both a fun and frustrating problem to have!


I’ve been thinking about it long and hard, then recently there were some major announcements I simply couldn’t afford to ignore.  First the folks over at Unreal announced that Unreal Engine would no longer have a monthly subscription.  This is actually something I called for ( with a fair bit of hyperbole… ) when Unreal Engine 4 was released.  Immediately after, and with much thunder stolen, Unity made a very similar announcement.  These lowered barriers of entry vastly increased the appeal of both engines.


I’ve long intended to cover both engines in more detail.  I immediately subscribed to Unreal 4 on its release and did a bit of an overview post.  I never got the opportunity to get much deeper, as frankly, there is a pretty steep learning curve attached and I simply didn’t have the time.  Way back when I launched this game I was intending to “create a game from scratch” using Unity.  Somewhere along the way I got distracted and we ended up with a series of LibGDX, Phaser, Blender, HTML, C++, JavaScript and more game development tutorials.  Oops.


So, basically I’ve always intended to cover both for the longest time, but which one should I cover first?


I struggled with this for a long time, going back and forth between the two so many times I got nowhere.  Then I had a thought…


Lots of you have got to be asking “What should I use, Unity or Unreal?”.  It’s a fair and difficult question, as I obviously can’t decide myself!  So essentially I am going to do a Unity and Unreal tutorial series at the same time, learning both and documenting the process in both video and text form as I go.


[Image: Untitled 7]


That all said…  this thread title and the above image are both a bit on the sensational side.  I am not actually comparing the two engines; there will never be a “Unity is greater than Unreal” or vice versa conclusion.  Both engines are obviously quite viable and popular, and each has its own strengths and weaknesses.  Determining which engine is better than the other all comes down to your own preferences and requirements.  At this point arguing which engine is better is about as useful as the endless programming language wars.


Instead I will be going through the process of creating a typical 2D game ( at least initially, 2D only ) in each game engine, documenting the process as thoroughly as possible.  So by the time I am done I should have a fairly comprehensive tutorial series covering creating a 2D game in each engine, and you should have a nice side by side comparison of how each engine works, which should aid in your selection process.


I intend to cover subjects such as the following, for each engine, in both video and text tutorial form:

  • Engine overview
  • Learning resources
  • Simple graphics
  • Game loop/Event processing
  • Input
  • Audio
  • Animation
  • Level composition
  • Collisions
  • Physics
  • AI
  • Networking ( maybe )
  • etc…


So basically all the pieces that go into making a simple game.  I will learn it in one engine, document the process, learn it in the other engine, document the process, then continue on to the next item in the list.  Obviously if there is something you think I should cover, let me know and I’ll do my best.


There are a few caveats of course…  first, this might take a very long time.  I’ve a lot of learning to do here, so there might be a bit of lag between posts.  The biggest catch though is I’ll be documenting things I’ve only just learned!  Expect some mistakes, inefficiencies and other hiccups as I go.  Obviously as I go I will try to be as “right” as possible, but I am no subject matter expert here!  I have a small bit of experience with both engines and tons of experience with game programming in general, but I don’t for a second claim to be an expert with either technology!


One other aspect of a game project like this is obviously game assets.  For a programmer, often getting art assets is as much of a time sink as programming!  Therefore I am going to be implementing this project in parallel with another very interesting art tutorial over at 2dgameartforprogrammers to create and release all of the assets to create a game called BotBox.


So, essentially I am going to attempt to create that game using both the Unity and Unreal game engines.  Wish me luck and I hope you enjoy it!


Of course, any and all feedback is highly appreciated.  Please have patience with me… this might take a while!

Phaser HTML5 Game Dev Library release version 2.3

by at March 29, 2015 02:57 PM


The popular HTML5 library Phaser has just released version 2.3.  It’s not a huge release in terms of new functionality, but it has some major architectural changes that will affect you greatly if you are a Phaser user.  First off, they took a more modular approach, leading to fewer god classes.  This is a good thing™.  Thanks to this redesign, they also changed the Phaser build process, allowing you to leave out portions you don’t need.


From the release notes:


Significant Updates



All of the core Game Objects have received an important internal restructuring. We have moved all of the common functions to a new set of Component classes. They cover functionality such as 'Crop', 'Physics Body', 'InCamera' and more. You can find the source code to each component in the src/gameobjects/components folder of the repo.

All of the Game Object classes have been restructured to use the new component approach. This puts an end to the "God classes" structure we had before and removes literally hundreds of lines of duplicate code. It also allowed us to add features to Game Objects; for example Bitmap Text objects are now full-class citizens with regard to physics capabilities.

Although this was a big internal shift from an API point of view not much changed - you still access the same methods and properties in the same way as before. Phaser is just a lot leaner under the hood now.

It's worth mentioning that from a JavaScript perspective components are mixins applied to the core game objects when Phaser is instantiated. They are not added at run-time or are dynamic (they never get removed from an object once added for example). Please understand that this is by design.

You can create your own custom Phaser classes, with your own set of active components by copying any of the pre-existing Game Objects and modifying them.



As a result of the shift to components we went through the entire source base and optimised everything we could. Redundant paths were removed, debug flags removed and new stub classes and hooks were created. What this means is that it's now easier than ever to "disable" parts of Phaser and build your own custom version.

We have always included a couple of extra custom builds with Phaser. For example a build without P2 Physics included. But now you can strip out lots of additional features you may not require, saving hundreds of KB from your build file in the process. Don't use any Sound in your game? Then you can now exclude the entire sound system. Don't need Keyboard support? That can be stripped out too.

As a result of this work the minimum build size of Phaser is now just 83KB (minified and gzipped).

Please see this tutorial on how to create custom builds.



We've updated the core of Arcade Physics in a number of significant ways.

First we've dropped lots of internal private vars and moved to using non-cached local vars. Array lengths are no longer cached and we've implemented physicsType properties on Game Objects to speed-up the core World collideHandler. All of these small changes have led to a nice improvement in speed as a result, and also allow us to now offer things like physics enabled BitmapText objects.

More importantly we're now using a spatial pre-sort for all Sprite vs. Group and Group vs. Group collisions. You can define the direction the sort will prioritize via the new sortDirection property. By default it is set to Phaser.Physics.Arcade.LEFT_RIGHT. For example if you are making a horizontally scrolling game, where the player starts on the left of the world and moves to the right, then this sort order will allow the physics system to quickly eliminate any objects to the right of the player bounds. This cuts down on the sheer volume of actual collision checks needing to be made. In a densely populated level it can improve the fps rate dramatically.

There are 3 other directions available (RIGHT_LEFT, TOP_BOTTOM and BOTTOM_TOP) and which one you need will depend on your game type. If you were making a vertically scrolling shoot-em-up then you'd pick BOTTOM_TOP so it sorts all objects above and can bail out quickly. There is also SORT_NONE if you would like to pre-sort the Groups yourself or disable this feature.

Another handy feature is that you can switch the sortDirection at run-time with no loss of performance. Just make sure you do it before running any collision checks. So if you had a large 8-way scrolling world you could set the sortDirection to match the direction the player was moving in and adjust it in real-time, getting the benefits as you go. My thanks to Aaron Lahman for inspiring this update.
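
To make the pre-sort idea in the release notes concrete, here is a minimal sketch in C. It is not Phaser's code (Phaser is JavaScript), and the names `Box`, `collide_left_right` and the `checks` counter are hypothetical, introduced only for illustration. It assumes axis-aligned boxes and a left-to-right sort: once a group member starts to the right of the player's bounds, everything after it in the sorted array can be skipped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical axis-aligned box: top-left corner plus size. */
typedef struct { float x, y, w, h; } Box;

/* qsort comparator: order boxes by their left edge (a LEFT_RIGHT style sort). */
static int cmp_left_edge(const void *a, const void *b)
{
    float ax = ((const Box *)a)->x;
    float bx = ((const Box *)b)->x;
    return (ax > bx) - (ax < bx);
}

/* Full overlap test between two boxes. */
static int overlaps(const Box *a, const Box *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Sorts the group left to right, then tests the player against it.
 * Returns the number of overlapping boxes; *checks receives how many
 * full overlap tests were actually performed. */
static int collide_left_right(const Box *player, Box *group, size_t n, size_t *checks)
{
    int hits = 0;
    size_t i;

    assert(player != NULL && group != NULL && checks != NULL);
    *checks = 0;
    qsort(group, n, sizeof *group, cmp_left_edge);

    for (i = 0; i < n; i++) {
        /* This box, and every box after it, starts to the right of the
         * player's right edge, so none of them can overlap: bail out. */
        if (group[i].x >= player->x + player->w)
            break;
        (*checks)++;
        hits += overlaps(player, &group[i]);
    }
    return hits;
}
```

With a player on the left of a wide level, most of the group sits beyond the early-out point, which is where the saving described above comes from. Objects far to the left of the player are still tested in this sketch.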



The Phaser.Loader has been updated to support parallel downloads which is now enabled by default (you can toggle it via the Loader.enableParallel flag) as well as adding future extensibility points with a pack/file unified filelist and an inflight queue.

There are no known incompatibilities with the previous Loader. Be aware that with parallel downloading enabled the order of the Loader events may vary (as can be seen in the "Load Events" example).

The parallel file concurrency limit is available in Loader.maxParallelDownloads and is set to 4 by default. Under simulated slower network connections parallel loading was a good bit faster than sequential loading. Even under a direct localhost connection parallel loading was never slower, but benefited most when loading many small assets (large assets are more limited by bandwidth); both results are fairly expected.

The Loader now supports synchronization points. An asset marked as a synchronization point must be loaded (or fail to load) before any subsequent assets can be loaded. This is enabled by using the withSyncPoint and addSyncPoint methods. Packs ('packfile' files) and Scripts ('script' files) are treated as synchronization points by default. This allows parallel downloads in general while allowing synchronization of select resources if required (packs, and potentially other assets in the future, can load-around synchronization points if they are written to delay final 'loading').

Additional error handling / guards have been added, and the reported error message has been made more consistent. Invalid XML (when loading) no longer throws an exception but fails the particular file/asset that was being loaded.

Some public methods/properties have been marked as protected, but no (except in case of a should-have-been-private-method) public-facing interfaces have been removed. Some private methods have been renamed and/or removed.

A new XHR object is created for each relevant asset (as there must be a different XHR for each asset loaded in parallel). Online searches indicated that there was no relevant benefit of XHR (as a particular use-case) re-use; and time will be dominated with the resource fetch. With the new flight queue an XHR cache could be re-added, at the cost of some complexity.

The URL is always transformed through transformUrl, which can make adding some one-off special cases like #1355 easier to deal with.

This also incorporates the fast-cache path for Images tags that can greatly speed up the responsiveness of image loading.

Loader.resetLocked is a boolean that allows you to control what happens when the loader is reset, which happens automatically on a State change. If you set resetLocked to true it allows you to populate the loader queue in one State, then swap to another State without having the queue erased, and start the load going from there. After the load has completed you could then disable the lock again as needed.

Thanks to @pnstickne for vast majority of this update.



We are now using our own custom build of Pixi v2. The Pixi project has moved all development resources over to Pixi v3, but it wasn't ready in time for the release of Phaser 2.3 so we've started applying our own fixes to the version of Pixi that Phaser uses.

As a result we have removed all files from the src/pixi folder that Phaser doesn't use, in order to make this distinction clearer. This includes EventTarget, so if you were relying on that in your game you'll need to add it back in to your local build.

We've also removed functions and properties from Pixi classes that Phaser doesn't require: such as the Interaction Manager, Stage.dirty, etc. This has helped us cut down the source code size and make the docs less confusing, as they no longer show properties for things that weren't even enabled.

We've rolled our own fixes into our version of Pixi, ensuring we keep it as bug-free as possible.

You can read the entire release notes here.

c0de517e Rendering et alter

Being more wrong: Parallax corrected environment maps

by DEADC0DE at March 29, 2015 11:58 AM


A follow-up to my article on how wrong we do environment map lighting, or how to get researchers excited and engineers depressed. 
Here I'll have a look at the errors we incur when we want to adopt "parallax corrected" (a.k.a. "localized" or "proxy geometry") pre-filtered cube-map probes, a technique so very popular nowadays.

I won't explain the base technique here, for that please refer to the following articles:

Errors, errors everywhere...

All these are in -addition- to the errors we commit when using the standard cubemap-based specular lighting.

1) Pre-filter shape

Let's imagine we're in an empty rectangular room, with diffuse walls. In this case the cubemap can be made to accurately represent radiance from the room.
We want to prefilter the cubemap to be able to query irradiance in a fast way. What shape does the filter kernel have?
  • The cubemap is not at infinite distance anymore -> the filter doesn't depend only on angles!
  • We have to look at how the BRDF lobe "hits" the walls, and that depends on many dimensions (view vector, normal, surface position, surface parameters)
  • Even in the easy case where we assume the BRDF lobe to be circularly symmetric around the reflection, and we consider the reflection to hit a wall perpendicularly, the footprint won't be exactly identical to one computed only on angles.
  • More worryingly, that case won't actually happen often, the BRDF will often hit a wall, or many walls, at an angle, creating an anisotropic footprint!
  • Pre-filtering "from the center", using angles, will skew the filter size near the cube vertices, but unlike infinite cubemaps, this is not exactly justified in this case, it optimizes for a single given point of view (query position)
The pre-filter kernel for parallax-corrected cubes should be seen more as -a- kernel we applied and know...
It doesn't have a direct, one-to-one relationship with the material roughness... We can try, knowing we have a prefiltered cube, to approximate what fetch or fetches best approximate the actual BRDF footprint on the proxy geometry.

This problem can be seen also from a different point of view:
  • Let's assume we have a perfectly prefiltered cube for a given surface location in space (query point or "point of view"). 
  • Let's compute a new cubemap for a different point in space, by re-projecting the information in the first cubemap to the new point of view via the proxy geometry (or even the actual geometry for that matter...).
  • Let's imagine the filter kernel we applied at a given cubemap location in the original pre-filter. 

How will it become distorted after the projection we do to obtain the new cubemap? This is the distortion that we need to compensate somehow...

This issue is quite apparent with rougher objects near the proxy geometry: it results in a reflection that looks sharper, less rough than it should be, usually because we underfilter compared to the actual footprint.
A common "solution" is to not use parallax projection as the surfaces get rougher, which creates lighting errors.

I made this BRDF/plane intersection visualization
while working on area lights, the problem with cubemaps is identical

2) Visibility

In most real-world applications, the geometry we use for the parallax-correction (commonly a box) doesn't exactly match the real world geometry. Environments made only of perfectly rectangular, perfectly empty rooms might be a bit boring.
As soon as we place an object on the ground, its geometry won't be captured by the reflection proxy, and we will be effectively raytracing the reflection past it, thus creating a light leak.
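
The lookup being discussed traces the reflection ray against the proxy box only, which is why anything sitting inside the box is skipped. A minimal sketch in C, assuming an axis-aligned box proxy and a shaded point inside it; the names `Vec3`, `axis_exit` and `parallax_corrected_dir` are hypothetical and not taken from any particular engine:

```c
#include <assert.h>
#include <float.h>

typedef struct { float x, y, z; } Vec3;

/* Distance along the ray, on one axis, to the box face the ray is heading towards. */
static float axis_exit(float pos, float dir, float box_min, float box_max)
{
    if (dir > 0.0f) return (box_max - pos) / dir;
    if (dir < 0.0f) return (box_min - pos) / dir;
    return FLT_MAX; /* parallel to this pair of faces: never exits through them */
}

static float min3(float a, float b, float c)
{
    float m = a < b ? a : b;
    return m < c ? m : c;
}

/* pos:       shaded point, assumed to lie inside the box
 * refl:      reflection direction (need not be normalized)
 * probe_pos: position the cubemap was baked from
 * Returns the unnormalized direction to use for the cubemap fetch. */
static Vec3 parallax_corrected_dir(Vec3 pos, Vec3 refl,
                                   Vec3 box_min, Vec3 box_max, Vec3 probe_pos)
{
    /* The ray leaves the box through the nearest of the three exit faces. */
    float t = min3(axis_exit(pos.x, refl.x, box_min.x, box_max.x),
                   axis_exit(pos.y, refl.y, box_min.y, box_max.y),
                   axis_exit(pos.z, refl.z, box_min.z, box_max.z));
    Vec3 hit = { pos.x + refl.x * t, pos.y + refl.y * t, pos.z + refl.z * t };
    Vec3 dir = { hit.x - probe_pos.x, hit.y - probe_pos.y, hit.z - probe_pos.z };

    assert(t >= 0.0f); /* holds when pos is inside the box */
    return dir;
}
```

Nothing in this function knows about objects placed inside the box, so the fetch always returns what the probe saw on the box walls along that direction.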

This is really quite a hard problem, light leaks are one of the big issues in rendering, they are immediately noticeable and they "disconnect" objects. Specular reflections in PBR tend to be quite intense, and so it's not easy even to just occlude them away with standard methods like SSAO (and of course considering only occlusion would be per se an error, we are just subtracting light).

An obvious solution to this issue is to just enrich somehow the geometrical representation we have for parallax correction, and this could be done in quite a lot of ways, from having richer analytic geometry to trace against, to using signed distance fields and so on.
All these ideas are neat, and will produce absolutely horrible results. Why? Because of the first problem we analyzed! 
The more complex and non-smooth your proxy geometry is, the more problems you'll have pre-filtering it. In general if your proxy is non-convex your BRDF can splat across different surfaces at different distances and will horribly break pre-filtering, resulting in sharp discontinuities on rough materials.
Any solution to this that wants to use non-convex proxies needs to have a notion of prefiltered visibility, not just irradiance, and the ability to do multiple fetches (blending them based on the prefiltered visibility).

A common trick to partially solve this issue is to "renormalize" the cube irradiance based on the ratio between the diffuse irradiance at the cube center and the diffuse irradiance at the surface (commonly known via lightmaps). 
The idea is that this ratio expresses reasonably well how different in intensity (due to occlusions/other reflections) the cubemap would be if it was baked from the surface point.
This trick works for rough materials, as the cubemap irradiance gets more "similar" to diffuse irradiance, but it breaks for sharp reflections... Somewhat ironically here the parallax cubemap is "best" with rough reflections, but we saw the opposite is true when it comes to filter footprint...
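
Written out, the renormalization trick amounts to a single scale factor. The symbols below are my own labels, not the author's: $L_{\text{cube}}$ is the prefiltered cubemap fetch, $E_d(\mathbf{x})$ the diffuse irradiance at the shaded surface point (for example from a lightmap) and $E_d(\mathbf{x}_{\text{probe}})$ the diffuse irradiance at the point the cube was baked from.

```latex
L_{\text{spec}}(\mathbf{x}, \mathbf{r}) \;\approx\;
  L_{\text{cube}}(\mathbf{r}) \cdot
  \frac{E_d(\mathbf{x})}{E_d(\mathbf{x}_{\text{probe}})}
```

The factor carries no directional information, which is consistent with the observation that it behaves best when the specular lobe is wide enough to resemble the diffuse one.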

McGuire's Screen Space Raytracing

3) Other errors

For completeness, I'll mention here some other relatively "minor" errors:
  • Interpolation between reflection probes. We can't have a single probe for the entire environment, likely we'll have many that cover everything. Commonly these are made to overlap a bit and we interpolate while transitioning from one to another. This interpolation is wrong: note that if the two probes reprojected identically at a border between them, we wouldn't need to interpolate to begin with...
  • These reflection proxies capture, for each texel, only the radiance scattered in one specific direction. If the scattering is not purely diffuse, you'll have another source of error.
  • Baking the scattering itself can be complicated: without a path tracer you risk missing some light due to multiple scattering.
  • If you have fog (atmospheric scattering), its influence has to be considered, and it can't really be pre-baked in the probes correctly (it depends on how much fog the reflection ray traverses, and it's not just attenuation: it will scatter the reflection rays, altering the way they hit the proxy).
  • Question: what is the best point inside the proxy geometry volume from which to bake the cubemap probe? This is usually hand authored and artists tend to place it as far away as possible from any object (this could indeed be a heuristic, easy to implement).
  • Another way of seeing parallax-corrected probes is to think of them really as textured area lights.

A common solution to mitigate many issues is to use screen space reflections (especially if you have the performance to do so), fading to baked cubemap proxies only where the SSR doesn't have data to work with.
I won't delve into the errors and issues of SSR here, it would be off-topic, but take care that the two methods represent the same radiance. Even when that's done correctly, the transition between the two techniques can be very noticeable and distracting; it might be better to use one or the other based on location.

From GPU-Based Importance Sampling.


If you think you are not committing large errors in your PBR pipeline, you didn't look hard enough. You should be aware of many issues, most of them having a real, practical impact and you should assume many more errors exist that you haven't discovered yet.
Do your own tests, compare with real-world, be aware, critical, use "ground truth" simulations.

Remember that in practice artists are good at hiding problems and working around them, often asking for non-physical adjustment knobs they will use to tune down/skew certain effects.
Listen to these requests as they probably "hide" a deep problem with your math and assumptions.

Finally, some tips on how to try to solve these issues:
  • PBR is not free from hacks (not even offline...), there are many things we can't derive analytically. 
  • The main point of PBR is that now we can reason about physics to do "well motivated" hacks. 
  • That requires having references and ground truth to compare and tune.
    • A good idea for this problem is to write an importance sampled shader that does glossy reflections via many taps (doing the filtering part in realtime, per shaded point, instead of pre-filtering).
    • A full raytraced ground truth is also handy, and you don't need to recreate all the features of your runtime engine...
  • Experimentation requires fast iteration and a fast and accurate way to evaluate the error against ground truth.
  • If you have a way of programmatically computing the error from the realtime solution to the ground truth, you can figure out models with free parameters that can be then numerically optimized (fit) to minimize the error...

Geeks3D Forums

From Fortran to performance via transformation and substitution rules

March 29, 2015 10:25 AM

"A large amount of numerically-oriented code is written and is being written in legacy languages. Much of this code could, in principle, make good use of data-parallel throughput-oriented computer architectures., a transformation-based programmi...

Kingston unveils high-speed HyperX Predator PCIe SSD

March 29, 2015 10:20 AM

"Kingston's new HyperX Predator PCIe SSD is an upgrade part that brings read speeds of up to 1400MB/s and write speeds hitting 1000MB/s to desktop and notebook systems."

GNU Nano Gets New Stable Release

March 29, 2015 10:18 AM

GNU Nano 2.4.0 has been released as the first stable update to this UNIX command line text editor in a number of years. The release codenamed "Lizf" brings a wide variety of changes: full undo system, Vim-compatible file locking, linter support, format...

Microsoft - The Zombie DirectX SDK

March 29, 2015 06:17 AM

Over the past five years, I've devoted significant time and effort to explaining the state of affairs with the legacy DirectX SDK. Developers...

Timothy Lottes

Stills from my Talk

by Timothy Lottes at March 29, 2015 12:52 AM

cbloom rants

03-25-15 - My Chameleon

by cbloom at March 28, 2015 05:22 PM

I did my own implementation of the Chameleon compression algorithm. (the original distribution is via the density project)

This is the core of Chameleon's encoder :

    cur = *fm32++; h = CHAMELEON_HASH(cur); flags <<= 1;
    if ( c->hash[h] == cur ) { flags ++; *to16++ = (uint16) h; }
    else { c->hash[h] = cur; *((uint32 *)to16) = cur; to16 += 2; }

This is the decoder :

    if ( (int16)flags < 0 ) { cur = c->hash[ *fm16++ ]; }
    else { cur = *((const uint32 *)fm16); fm16 += 2; c->hash[ CHAMELEON_HASH(cur) ] = cur; }
    flags <<= 1; *to32++ = cur;
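
For readers who want to see the two inner loops in context, here is a minimal round-trip sketch in C. It is not Chameleon.h and not Density's format: the hash constant, the 16-bit table size, the layout (one 16-bit flag word stored ahead of each group of up to 16 words) and all `sketch_` names are assumptions made for illustration. It uses `memcpy` instead of pointer casts to sidestep alignment issues, and like the original it is not endian-invariant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HASH_BITS 16
/* Hypothetical multiplicative hash; the real CHAMELEON_HASH may differ. */
#define SKETCH_HASH(x) ((uint32_t)((x) * 2641295638u) >> (32 - SKETCH_HASH_BITS))

typedef struct { uint32_t hash[1u << SKETCH_HASH_BITS]; } Sketch;

/* Encodes n 32-bit words. Worst-case output is 2 bytes per group of 16
 * words plus 4 bytes per word. Returns the number of bytes written. */
static size_t sketch_encode(Sketch *c, uint8_t *out, const uint32_t *in, size_t n)
{
    uint8_t *to = out;
    size_t i, k;

    assert(c != NULL && out != NULL && in != NULL);
    memset(c, 0, sizeof *c);
    for (i = 0; i < n; i += 16) {
        size_t count = (n - i < 16) ? (n - i) : 16;
        uint8_t *flag_pos = to;      /* flag word is filled in after the group */
        uint16_t flags = 0;
        to += 2;
        for (k = 0; k < count; k++) {
            uint32_t cur = in[i + k];
            uint32_t h = SKETCH_HASH(cur);
            flags = (uint16_t)(flags << 1);
            if (c->hash[h] == cur) {              /* predicted: emit 16-bit hash */
                uint16_t h16 = (uint16_t)h;
                flags++;
                memcpy(to, &h16, 2); to += 2;
            } else {                              /* miss: emit 32-bit literal */
                c->hash[h] = cur;
                memcpy(to, &cur, 4); to += 4;
            }
        }
        flags = (uint16_t)(flags << (16 - count)); /* left-align a short last group */
        memcpy(flag_pos, &flags, 2);
    }
    return (size_t)(to - out);
}

/* Decodes n 32-bit words from the stream produced by sketch_encode. */
static void sketch_decode(Sketch *c, uint32_t *out, size_t n, const uint8_t *in)
{
    const uint8_t *fm = in;
    size_t i, k;

    assert(c != NULL && out != NULL && in != NULL);
    memset(c, 0, sizeof *c);
    for (i = 0; i < n; i += 16) {
        size_t count = (n - i < 16) ? (n - i) : 16;
        uint16_t flags;
        memcpy(&flags, fm, 2); fm += 2;
        for (k = 0; k < count; k++) {
            uint32_t cur;
            if (flags & 0x8000u) {                /* top bit set: same as (int16)flags < 0 */
                uint16_t h16;
                memcpy(&h16, fm, 2); fm += 2;
                cur = c->hash[h16];
            } else {
                memcpy(&cur, fm, 4); fm += 4;
                c->hash[SKETCH_HASH(cur)] = cur;
            }
            flags = (uint16_t)(flags << 1);
            out[i + k] = cur;
        }
    }
}
```

Both sides start from a zeroed table and update it only on misses, so the decoder's table stays in lockstep with the encoder's. The table is 256 KB, so declaring the `Sketch` as static or heap-allocated is safer than putting it on the stack.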

I thought it deserved a super-simple STB-style header-only dashfuly-described implementation :


My Chameleon.h is not portable or safe or any of that jizzle. Maybe it will be someday. (Update : now builds on GCC & clang. Tested on PS4. Still not Endian-invariant.)

// Usage :

#include "Chameleon.h"

Chameleon c;


size_t comp_buf_size = CHAMELEON_MAXIMUM_OUTPUT_SIZE(in_size);

void * comp_buf = malloc(comp_buf_size);

size_t comp_len = Chameleon_Encode(&c, comp_buf, in_buf, in_size );


Chameleon_Decode(&c, out_buf, in_size, comp_buf );

int cmp = memcmp(in_buf,out_buf,in_size);
assert( cmp == 0 );

ADD : Chameleon2 SIMD prototype now posted : (NOTE : this is not good, do not use)

Chameleon2.h - experimental SIMD wide Chameleon
both Chameleons in a zip

The SIMD encoder is not fast. Even on SSE4 it only barely beats scalar Chameleon. So this is a dead end. Maybe some day when we get fast hardware scatter/gather it will be good (*).

(* = though use of hardware scatter here is always going to be treacherous, because hashes may be repeated, and the order in which collisions resolve must be consistent)

03-25-15 - Density - Chameleon

by cbloom at March 28, 2015 05:22 PM

Casey pointed me at Density.

Density contains 3 algorithms, from super fast to slower : Chameleon, Cheetah, Lion.

They all attain speed primarily by working on U32 quanta of input, rather than bytes. They're sort of LZPish type things that work on U32's, which is a reasonable way to get speed in this modern world. (Cheetah and Lion are really similar to the old LZP1/LZP2 with bit flags for different predictors, or to some of the LZRW's that output forward hashes; the main difference is working on U32 quanta and no match lengths)

The compression ratio is very poor. The highest compression option (Lion) is around LZ4-fast territory, not as good as LZ4-hc. But, are they Pareto? Is it a good space-speed tradeoff?

Well, I can't build Density (I use MSVC) so I can't test their implementation for space-speed.

Compressed sizes :

lzt99 :
uncompressed       24,700,820

density :
c0 Chameleon       19,530,262
c1 Cheetah         17,482,048
c2 Lion            16,627,513

lz4 -1             16,193,125
lz4 -9             14,825,016

Oodle -1 (LZB)     16,944,829
Oodle -2 (LZB)     16,409,913

Oodle LZNIB        12,375,347

(lz4 -9 is not competitive for encode time, it's just to show the level of compression you could get at very fast decode speeds if you don't care about encode time ; LZNIB is an even more extreme case of the same thing - slow to encode, but decode time comparable to Chameleon).

To check speed I did my own implementation of Chameleon (which I believe to be faster than Density's, so it's a fair test). See the next post to get my implementation.

The results are :

comp_len = 19492042
Chameleon_Encode_Time : seconds:0.0274 ticks per: 1.919 mbps : 901.12
Chameleon_Decode_Time : seconds:0.0293 ticks per: 2.050 mbps : 843.31

round trip time = 0.05670

I get a somewhat smaller file size than Density's version, for unknown reasons.

Let's compare to Oodle's LZB (an LZ4ish) :

Oodle -1 :

24,700,820 ->16,944,829 =  5.488 bpb =  1.458 to 1
encode           : 0.061 seconds, 232.40 b/kc, rate= 401.85 mb/s
decode           : 0.013 seconds, 1071.15 b/kc, rate= 1852.17 mb/s

round trip time = 0.074

Oodle -2 :

24,700,820 ->16,409,913 =  5.315 bpb =  1.505 to 1 
encode           : 0.070 seconds, 203.89 b/kc, rate= 352.55 mb/s
decode           : 0.014 seconds, 1008.76 b/kc, rate= 1744.34 mb/s

round trip time = 0.084
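
The Oodle lines report size as bits per byte (bpb) and as a ratio, while the Chameleon output only gives comp_len. A small helper, sketched here under the assumption that bpb means 8 × compressed / uncompressed (which reproduces the Oodle figures above), puts them in the same units; the function names are hypothetical.

```c
#include <assert.h>

/* Bits of compressed output per byte of input. */
static double bits_per_byte(double comp_len, double raw_len)
{
    assert(raw_len > 0.0);
    return 8.0 * comp_len / raw_len;
}

/* Compression ratio expressed as "N to 1". */
static double ratio_to_one(double comp_len, double raw_len)
{
    assert(comp_len > 0.0);
    return raw_len / comp_len;
}
```

Applied to the Chameleon run on lzt99 (19,492,042 bytes from 24,700,820), this gives roughly 6.313 bpb, or about 1.267 to 1, against 5.488 bpb for Oodle -1.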

lzt99 is a collection of typical game data files.

We can test on enwik8 (text/html) too :

Chameleon :

enwik8 :
Chameleon_Encode_Time : seconds:0.1077 ticks per: 1.862 mbps : 928.36
Chameleon_Decode_Time : seconds:0.0676 ticks per: 1.169 mbps : 1479.08
comp_len = 61524068

Oodle -1 :

enwik8 : 
100,000,000 ->57,267,299 =  4.581 bpb =  1.746 to 1 
encode           : 0.481 seconds, 120.17 b/kc, rate= 207.79 mb/s
decode           : 0.083 seconds, 697.58 b/kc, rate= 1206.19 mb/s

Here Chameleon is much more compelling. It's competitive for size & decode speed, not just encode speed.

Commentary :

Any time you're storing files on disk, this is not the right algorithm. You want something more asymmetric (slow compress, fast decompress).

I'm not sure if Cheetah and Lion are Pareto for round trip time. I'd have to test speed on a wider set of sample data.

When do you actually want a compressor that's this fast and gets so little compression? I'm not sure.

c0de517e Rendering et alter

Oh envmap lighting, how do we get you wrong? Let me count the ways...

by DEADC0DE at March 28, 2015 04:35 PM

Environment map lighting via prefiltered cubemaps is very popular in realtime CG.

The basics are well known:

  1. Generate a cubemap of your environment radiance (a probe, either offline or in realtime).
  2. Blur it with a cosine hemisphere kernel for diffuse lighting (irradiance) and with a number of phong lobes of varying exponent for specular. The various convolutions for phong are stored in the mip chain of the cubemap, with rougher exponents placed in the coarser mips.
  3. At runtime we fetch the diffuse cube using the surface normal and the specular cube using the reflection vector, forcing the latter fetch to happen at a mip corresponding to the material roughness.
Many engines stop at that, but a few extensions emerged (somewhat) recently:
Especially the last extension allowed a huge leap in quality and applicability, it's so nifty it's worth explaining a second.

The problem with Cook-Torrance BRDFs is that they depend on three functions: a distribution function that depends on N.H, a shadowing function that depends on N.H, N.L and N.V and the Fresnel function that depends on N.V.

While we know we can somehow solve functions that depend on N.H by fetching a prefiltered cube in the reflection direction (not really the same, but it's the same difference that there is between the Phong and Blinn specular models), if something depends on N.V it would add another dimension to the preintegrated solution (requiring an array of cubemaps) and we completely wouldn't know what to do with N.L as we don't have a single light vector in environment lighting.

The cleverness of the solution that was found can be explained by observing the BRDF and how its shape changes when manipulating the Fresnel and shadowing components.
You should notice that the BRDF shape, thus the filtering kernel on the environment map, is mostly determined by the distribution function, that we know how to tackle. The other two components don't change much of the shape but scale it and "shift" it away from the H vector. 

So we can imagine an approximation that integrates the distribution function with a preconvolved cubemap mip pyramid, and the other components are somehow relegated into a scaling component by preintegrating them against an all-white cubemap, ignoring specifically how the lighting is distributed. 
And this is the main extension we employ today: we correct the cubemap that has been preintegrated only with the distribution lobe with a (very clever) biasing factor.
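To make the "preintegrate against an all-white cubemap" idea concrete, here is a small Python sketch. It is not taken from any particular engine: I am assuming a GGX distribution, a Smith shadowing term with k = roughness^2 / 2 and Schlick's Fresnel approximation, none of which the text above names. It follows the commonly used formulation in which the result is a scale and a bias applied to the specular color at normal incidence, and it samples the half-vector on a fixed grid instead of a random or low-discrepancy sequence to keep things deterministic.

```python
import math


def g1_smith(n_dot_x, k):
    """One-directional Smith shadowing term (Schlick-style approximation, assumed)."""
    return n_dot_x / (n_dot_x * (1.0 - k) + k)


def integrate_scale_bias(roughness, n_dot_v, n=64):
    """Integrate shadowing and Fresnel against a white environment.

    Returns (scale, bias). n_dot_v must be > 0. Half-vectors are importance
    sampled from an assumed GGX lobe around the normal N = (0, 0, 1).
    """
    a = roughness * roughness
    k = a / 2.0
    v = (math.sqrt(max(0.0, 1.0 - n_dot_v * n_dot_v)), 0.0, n_dot_v)
    scale = 0.0
    bias = 0.0
    for i in range(n):
        phi = 2.0 * math.pi * (i + 0.5) / n
        for j in range(n):
            u = (j + 0.5) / n
            cos_t = math.sqrt((1.0 - u) / (1.0 + (a * a - 1.0) * u))
            sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
            h = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            v_dot_h = min(1.0, v[0] * h[0] + v[2] * h[2])
            n_dot_l = 2.0 * v_dot_h * h[2] - v[2]   # z of L = 2 (V.H) H - V
            if n_dot_l > 0.0 and v_dot_h > 0.0:
                g = g1_smith(n_dot_v, k) * g1_smith(n_dot_l, k)
                g_vis = g * v_dot_h / (h[2] * n_dot_v)
                fc = (1.0 - v_dot_h) ** 5           # Schlick Fresnel weight
                scale += (1.0 - fc) * g_vis
                bias += fc * g_vis
    total = n * n
    return scale / total, bias / total
```

Under these assumptions, the specular term at runtime would be approximated as prefiltered_color * (f0 * scale + bias), with the pair typically stored in a small 2D table indexed by roughness and N.V.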

All good, and works, but now, is all this -right-? Obviously not! I won't offer (just yet) solutions here but can you count the ways we're wrong?
  1. First and foremost the reflection vector is not the half-vector, obviously.
    • The preconvolved BRDF expresses a radially symmetric lobe around the reflection vector, but a half-vector BRDF is not radially symmetric at grazing angles (when H!=N); it becomes stretched.
    • It's also different from its reflection-vector-based counterpart when R=H=N, but there it can be adjusted with a simple constant roughness modification (just remember to do it!).
  2. As we said, Cook-Torrance is not based only on a half-vector lobe.
    • We have a solution that works well but it's based only on a bias, and while that accounts for the biggest difference between using only the distribution and using the full CT formulation, it's not the only difference.
    • Fresnel and shadowing also "push" the BRDF lobe so it doesn't reach its peak value on the reflection direction.
  3. If we bake lighting from points close enough that perspective matters, then discarding position dependence is wrong. 
    • It's true that perceptually it is hard for us to judge where lighting comes from when we see a specular highlight (good!), but for reflections of nearby objects the error can be easy to spot.
    • We can employ warping as we mentioned, but then the preconvolution is warped as well.
    • If, for example, we warp the cubemap by treating it as light coming from a box placed in the scene, what we should do is trace the BRDF against the box and see how it projects onto it. That projection won't be a radially symmetric filtering kernel in most cases.
    • In the "box" localized environment map scenario the problem is closely related to texture card area lights.
  4. We disregard occlusions.
    • Any form of shadowing of the preconvolved environment lighting that just scales it down is wrong, as occlusion should happen before prefiltering.
    • Still -DO- shadow environment map lighting somehow. A good way is to use screen-space (or voxel-traced) computed occlusion by casting a cone emanating from the reflection vector, even if that's done without considering roughness for the cone size, or somehow precomputing and baking some form of directional occlusion information.
    • Really this is still due to the fact that we use the envmap information at a point that is not the one from which it was baked.
    • Another good alternative to try to fix this issue is renormalization as shown by Call of Duty.
  5. We disregard surface normal variance.
    • Forcing a given miplevel (texCubeLod) is needed as mips in our case represent different lobes at different roughnesses, but that means we don't antialias that texture considering how normals change inside the footprint of a pixel (note: some HW gets that wrong even with regular texCube fetches)
    • The solution here is "simple" as it's related to the specular antialiasing we do by pushing normal variance into specular roughness.
    • But that line of thought, no matter the details, is also provably wrong (still -do- that). The problem is closely related to the "roughness modification" solution for spherical area lights and it suffers from the same issue: the proper integral of the BRDF with a normal cone is flatter than what we get at any roughness on the original BRDF.
    • Also, the footprint of the normals won't be a cone with a circular base, and even what we get with the finite difference ddx/ddy approximation would be elliptical.
  6. Bonus: compression issues for cubemaps and dx9 hardware.
    • Older hardware couldn't properly do bilinear filtering across cubemap edges, leading to visible artifacts that some corrected by making sure the edge texels were the same across faces.
    • What most don't consider though is that if we use a block-compression format on the cubemap (DXT, BCn and so on) there will be discontinuities between blocks which will make the edge texels different again. Compressors in these cases should be modified so the edge blocks share the same reference colors.
    • Adding borders is better.
    • These techniques are also relevant for hardware that does filter bilinearly across cubemap edges, as that might be slower... Also, avoid using the very bottom mips...
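For point 5 above, one well-known way of pushing normal variance into specular roughness is Toksvig's factor, which shortens the Phong exponent according to how much the averaged (mipmapped) normal has shrunk. The text does not say which method is meant, so take this Python sketch as one possible instance rather than the intended one; the function name is made up for illustration.

```python
import math


def toksvig_exponent(avg_normal, spec_power):
    """Lower a Phong exponent based on the length of the filtered normal.

    avg_normal is the unnormalized average of the normals inside the footprint;
    the shorter it is, the more the normals disagree and the rougher the result.
    """
    length = math.sqrt(sum(c * c for c in avg_normal))
    length = min(max(length, 1e-6), 1.0)
    ft = length / (length + spec_power * (1.0 - length))
    return ft * spec_power
```

With a unit-length average normal the exponent is unchanged; as the average shortens, the exponent drops and the lookup would move to a coarser mip. As noted in the list, this still produces a lobe of the original family and not the flatter shape the true integral has.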
I'll close with some links that might inspire further thinking:
#phyiscallybasedrenderingproblems #naturesucks

Timothy Lottes

Other CRT Options

by Timothy Lottes at March 27, 2015 11:36 PM

29" Makvision CRT SVGA Arcade Monitor
Link: XGaming has these for roughly $500 and around $60 shipping to where I live.
Uses VGA input. Looks like there are three kHz spec ranges depending on the version of the display: {90 MHz, 15-40 kHz or 30-40 kHz or 30-50 kHz (model C2929D1), 47-90 Hz, 800x600 max}. The peak kHz model might have capacity for 800x600 @ 80 Hz. Wonder what kind of persistence this display has.

Sony GDM-FW900 24" Widescreen CRT Monitor
Possible to find on eBay. Does {30-121 kHz, 48-160 Hz, 2304x1440 max}. Seems possible to do 960x600 at 160 Hz, and peak resolution around 80 Hz.
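The estimates above come from dividing the horizontal scan rate by the refresh rate, which gives the total number of scanlines (visible plus blanking) available per frame. A small Python sketch of that arithmetic follows; the 5% vertical blanking overhead is a hypothetical round figure I picked for illustration, and real video timings will differ.

```python
def max_total_lines(h_khz, v_hz):
    """Total scanlines per frame (visible + blanking) at a given scan rate."""
    return int(h_khz * 1000 // v_hz)


def fits(visible_lines, h_khz, v_hz, blanking_percent=5):
    """True if the mode fits, assuming a hypothetical blanking overhead in percent."""
    # Integer ceiling of visible_lines * (1 + blanking_percent / 100).
    needed = -(-visible_lines * (100 + blanking_percent) // 100)
    return needed <= max_total_lines(h_khz, v_hz)


makvision_budget = max_total_lines(50, 80)    # 30-50 kHz model at 80 Hz
fw900_budget = max_total_lines(121, 160)      # FW900 at 160 Hz
```

For the 30-50 kHz Makvision at 80 Hz the budget is 625 lines, leaving only 25 lines of blanking around 600 visible ones, so 800x600 @ 80 Hz looks very tight. The FW900 at 160 Hz has 756 lines to spend, which leaves comfortable room for 600 visible lines.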

Gamasutra Feature Articles

One Life Left vs. Gamasutra GDC Podcast #4: Who in the world is 'Richard Lemarchand'?

March 27, 2015 08:27 PM

Relive GDC 2015 through the magic of podcasting! Special guests include IGDA's Kate Edwards, Harmonix creative director Matt Boch, and an impromptu appearance by a man named Richard Lemarchand. ...

10 can't-miss video postmortems from the GDC Vault

March 27, 2015 06:51 PM

From Steamworld Dig to Shenmue, this collection of video postmortems covers a variety of games. What they have in common is that they're free to watch, and can't-miss viewing. ...

Get a job: Sucker Punch seeks a Narrative Writer

March 27, 2015 06:39 PM

InFamous developer Sucker Punch is seeking a narrative writer for "story development, game dialogue, and general narrative contribution" to the company's new project. ...

Don't Miss: Boss battle design and structure

March 27, 2015 06:26 PM

Designer Mike Stout breaks down the boss battle into eight different beats, and runs two notable ones -- Ocarina of Time's Ganon and Portal's GladOS -- through a thorough design analysis. ...

How analytics and ads can work together in your game

March 27, 2015 06:18 PM

"With data-driven models, you can see exactly what source is bringing consumers in, which source they're passing over, which source finally drives them to buy." ...

Soon you'll be able to sign up for Valve's Vive VR dev kit

March 27, 2015 05:54 PM

Company spokesman Doug Lombardi says that Valve plans to launch a sign-up site for free (and likely extremely limited) dev kits as early as next week. ...

Quebec reinstates video game tax breaks

March 27, 2015 05:38 PM

A new budget means new tax breaks: The Quebec government has restored the 37.5 percent credit for game development, which was slashed last June. ...

How to implement in-game player feedback forms

March 27, 2015 04:46 PM

"As developers we need to make it as frictionless as possible for players to communicate feedback to us - otherwise most won't bother. This is the beauty of this system." ...

Running an indie studio: Biz lessons, one year in

March 27, 2015 03:56 PM

A year in to running SassyBot Studio, Elwin Verploegen shares tips on running a business -- from the initial founding agreements to accounting and planning development. ...

Geeks3D Forums

Humus' Clustered Shading demo

March 27, 2015 03:52 PM

Clustered Shading is a technique for efficient lighting on modern GPUs ...
The main motivation for Clustered Shading is performance, flexibility, and simplicity. It normally out-pe...

Gamasutra Feature Articles

GDC Europe 2015 call for papers ends soon!

March 27, 2015 03:14 PM

Game makers, take note: The call for submissions to present lectures and panel sessions at the 2015 Game Developers Conference Europe closes Monday, March 30. ...

5 awesome hacks for exhibiting your indie game at PAX or GDC

March 27, 2015 01:40 PM

"With a limited budget, it's critical to be as efficient as possible, and small edges really add up. These tricks likely made hundreds of real dollars of difference in the effectiveness of our booth." ...

Free-to-play: What about player skill?

March 27, 2015 12:03 PM

"Those 'near misses' keep the player feeling like they can beat the level, they just need to play a few more times or convert. With higher player skill, this becomes much more difficult to achieve." ...

Geeks3D Forums

Rasterization: A Practical Implementation

March 27, 2015 10:17 AM

The rasterization rendering technique is surely the most commonly used technique to render images of 3D scenes, and yet, that is probably the least understood and the least properly documented technique of all (especially compared to ray-tracin...

Gamasutra Feature Articles

5 best practices for cost-efficient user acquisition

March 27, 2015 09:01 AM

"Building and executing a cost-efficient user acquisition strategy is a cornerstone of mobile game monetization." Here are "the best -- and most economical -- ways to attract new players." ...

Game Design Deep Dive: Traffic systems in Cities: Skylines

March 27, 2015 09:01 AM

Colossal Order launched Cities: Skylines earlier this year to remarkable acclaim. Here, the developers explain how they went about designing and coding the city builder's robust traffic systems. ...

iPhone Development Tutorials and Programming Tips

Open Source Component For Creating Sophisticated Animated Text Labels

by Johann at March 27, 2015 06:30 AM

I’ve mentioned some excellent open source projects from Yalantis, including a component for great looking scrolling iconized top menus, an iconized side menu component with slick animations, and a great example of a pull-to-refresh implementation with an expanding animation.

Here’s an open source component submitted by Nikita of Yalantis that provides a custom label allowing you to easily apply animations to the text, and even individual layers within the label.

Yalantis includes a nice example project showing how to create a sophisticated animated label using Ophiucus.

Here’s an animation from the readme showing Ophiucus in action:


You can find Ophiucus on Github here.

You can also read more about Ophiucus on the Yalantis blog.

A nice component for creating sophisticated text animations.

Original article: Open Source Component For Creating Sophisticated Animated Text Labels

©2015 iOS App Dev Libraries, Controls, Tutorials, Examples and Tools. All Rights Reserved.

Gamasutra Feature Articles

Zynga faces revived fraud lawsuit: Did its execs lie to investors?

March 26, 2015 11:41 PM

In a ruling in San Francisco, U.S. District Judge Jeffrey White ruled that shareholders can pursue claims alleging that its execs took advantage of investors. ...

Game From Scratch

An Hour with Blender and

by at March 26, 2015 07:51 PM


I just started a new concept today; please let me know if you like it.  Basically it’s a fixed duration (one hour) overview on a specific topic, in this case Blender.  The idea is to give a cross between an introduction and a tutorial on getting started with using a certain product.  In this video we look at Blender, how to configure it, how to navigate and customize the interface, what it’s composed of and the basics of operating it.


If there is interest, I can do “An hour with” topics that are much more focused, such as “An hour Modelling” or “An hour texturing”, etc.


Additionally, this is not a deep dive Blender tutorial.  Fortunately I already have one of those!  If you are looking at specifics of learning Blender, the hotkeys, etc, please start here.


Below is an embedded version of the video.  It is also available in full 1080p on YouTube here.


The Video


iPhone Development Tutorials and Programming Tips

Tool: Xcode Plugin For Easy Tweaking Of CAMediaTimingFunction Control Points

by Johann at March 26, 2015 06:33 AM

Earlier this month I mentioned a nice Xcode plugin that adds an action bar you can bring up with a hotkey providing easy access to nearly every Xcode action.

Here’s an Xcode plugin called CATweakerSense from Xu Lian that provides a nice interface directly within Xcode so you can visually set your CAMediaTimingFunction animation timing curve.

CATweakerSense will pop up its interface for tweaking directly within your code within a popover while editing CAMediaTimingFunction control points.

CATweakerSense also includes a tool you can use on your desktop for tweaking CAMediaTimingFunction values.

Here’s an image showing CATweakerSense in action:


You can find CATweakerSense and CATweaker on Github here.

A handy Xcode plugin.

See More: Xcode Plugins

Original article: Tool: Xcode Plugin For Easy Tweaking Of CAMediaTimingFunction Control Points


David Perry Blog

Big Day for PS4 Firmware Updates!

by David Perry at March 26, 2015 04:41 AM

Yes, I know video game console firmware updates can get annoying, but not when they have really cool new features!

Tomorrow (3/25/15) the Yukimura update 2.50 will be available on the PlayStation 4.

This will add Suspend and Resume, a feature gamers have been excited about.

Our Gaikai team has slipped in something as well…

Remote Play and Share Play — For games that support 60 fps, users will be able to automatically enjoy those games with 60 fps for both Remote Play and Share Play** on supported devices.


You can find a list of the new features here: Firmware update 2.50

Filed under: Uncategorized Tagged: firmware update, gaikai, playstation 4, PS4, remoteplay, shareplay

David Perry Blog

An Idea worth a MILLION dollars! Really!

by David Perry at March 25, 2015 09:11 PM

This TED Talk by David Isay of StoryCorps won the 2015 TED Prize. He got a million dollars to help him realize this idea, and he got a well-deserved standing ovation.

How powerful can a simple idea be?   Check it out:

Filed under: Uncategorized Tagged: david isay, storycorps, TED 2015, ted prize winner