Planet Gamedev

Gamasutra Feature Articles

Action, death, and catharsis: Call of Duty 4: Modern Warfare

August 05, 2015 11:03 AM

"By the time Modern Warfare was in development the binary view of World War II explored in past games no longer felt important. The wars of 2007 weren't being waged against villainous armies, but insurgent groups." ...

App store optimization for game developers

August 05, 2015 08:03 AM

"Once you have implemented App Store Optimization, downloads will increase, users will find your app more easily and you will rank higher compared to your competitors." ...

Feasible interface adventures: Designing the complex UI of Affordable Space Adventures

August 05, 2015 08:02 AM

How do you get players to understand an overly complicated UI -- when that's the point? This design article covers how the devs reached a feeling of "in distress and slightly overwhelmed" yet comprehensible and usable. ...

Geeks3D Forums

HMD benchmark for Oculus

August 05, 2015 04:08 AM


CAPCOM Monster Hunter Online (2015) benchmark

August 05, 2015 03:46 AM

Quote
Monster Hunter Online, NVIDIA and Crytek have joined forces to produce this stunning benchmark program. The program is based on the client game Monster Hunter Online, and was produc...

CAPCOM Dragon's Dogma Online benchmark

August 05, 2015 03:28 AM

Download from 4gamer

As usual, don't take Japanese benchimakus seriously.

It doubles my VRAM and shows way too many loading screens.




(WebGL) OORTONLINE.GL benchmark

August 05, 2015 03:02 AM

Quote
ABOUT OORTONLINE.GL

WebGL enables graphically rich games and applications on the web without the need for plug-ins. OortOnline.gl is designed to test the capability of your hardware to deliver the latest WebGL g...

(WebGL) Superformula mesh generator

August 05, 2015 02:31 AM

Superformula mesh generator (mysterydate.github.io)


Autodesk Announces Cross-Platform Stingray Game Engine

August 05, 2015 01:45 AM

Game developers have a new player in the game engine market, and it's one most of them already know quite well: Autodesk. At GDC Europe, the software company behind some of the most popular 3D modeling tools in the industry – 3ds Max and Maya – has ann...

Gamasutra Feature Articles

Disney Infinity sales slow ahead of sequel

August 04, 2015 10:03 PM

Disney Interactive today posted break-even operating income, reflecting a decline in sales of its popular toy-to-game franchise, Disney Infinity. ...

Activision Blizzard revenues climb, even as World of Warcraft subs tank

August 04, 2015 08:42 PM

WoW shed 1.5 million players during the quarter, but Destiny, Hearthstone, and Call of Duty are making up for any downturn in the decade-old MMO. ...

Don't miss: The fight to finish Galak-Z

August 04, 2015 07:32 PM

17-Bit CEO Jake Kazdal speaks to Gamasutra about the development of Galak-Z and why 17-Bit decided to switch gears and embrace procedural generation, throwing out months of work in the process. ...

Get a job: DrinkBox Studios seeks a 2D Animator

August 04, 2015 07:30 PM

The studio behind Guacamelee and the upcoming Severed is seeking a 2D animator in its Toronto, Canada-based studio. ...

Don't miss: Evaluating game mechanics for depth

August 04, 2015 06:55 PM

Ratchet and Skylanders designer Mike Stout writes on depth: "To me, it describes a sweet spot -- that point during a game where the player can repeatedly display his mastery of a game mechanic." ...

Women in the game industry share stories of improving diversity

August 04, 2015 06:52 PM

A panel of women from companies like King, NaturalMotion and Gameloft came together at GDC Europe today to share their experiences as women in the industry and offer advice for increasing diversity. ...

Facebook cancels plans to change how ads work

August 04, 2015 06:50 PM

In the face of fierce opposition from ad networks and game developers, the company scraps plans to force sharing of information with the platform. ...

Chinese developers have the West in their sights

August 04, 2015 06:31 PM

In terms of being able to compete globally, that is -- and they'll get there in time, both the government and the CEO of Perfect World promise. ...

China has 366 million mobile game players, as revenues skyrocket

August 04, 2015 06:02 PM

... and that's not all. According to a new report from Chinese media company Sina, the country has 134 million PC gamers, too. ...

Bitsquid

Allocation Adventures 3: The Buddy Allocator

by Niklas (noreply@blogger.com) at August 04, 2015 06:38 PM

Hello, allocator!

The job of a memory allocator is to take a big block of memory (from the OS) and chop it up into smaller pieces for individual allocations:

void *A = malloc(10);
void *B = malloc(100);
void *C = malloc(20);

------------------------------------------------------
| A | free | B | C | free |
------------------------------------------------------

The allocator needs to be fast at serving an allocation request, i.e. finding a suitable piece of free memory. It also needs to be fast at freeing memory, i.e. making a previously used piece of memory available for new allocations. Finally, it needs to prevent fragmentation -- more about that in a moment.

Suppose we put all free blocks in a linked list and allocate memory by searching that list for a block of a suitable size. That makes allocation an O(n) operation, where n is the total number of free blocks. There could be thousands of free blocks and following the links in the list will cause cache misses, so to make a competitive allocator we need a faster method.

Fragmentation occurs when the free memory cannot be used effectively, because it is chopped up into little pieces:

------------------------------------------------------
| A | free | B | C | free |
------------------------------------------------------

Here, we might not be able to service a large allocation request, because the free memory is split up in two pieces. In a real world scenario, the memory can be fragmented into thousands of pieces.

The first step in preventing fragmentation is to ensure that we have some way of merging free memory blocks together. Otherwise, allocating blocks and freeing them will leave the memory buffer in a chopped up state where it is unable to handle any large requests:

-------------------------------------------------------
| free | free | free | free | free | free |
-------------------------------------------------------

Merging needs to be a quick operation, so scanning the entire buffer for adjacent free blocks is not an option.

Note that even if we merge all neighboring free blocks, we can still get fragmentation, because we can't merge the free blocks when there is a piece of allocated memory between them:

-----------------------------------------------------------
| free | A | free | B | free | C | free | D | free |
-----------------------------------------------------------

Some useful techniques for preventing this kind of fragmentation are:

  • Use separate allocators for long-lived and short-lived allocations, so that the short-lived allocations don't create "holes" between the long lived ones.
  • Put "small" allocations in a separate part of the buffer so they don't interfere with the big ones.
  • Make the memory blocks relocatable (i.e. use "handles" rather than pointers).
  • Allocate whole pages from the OS and rely on the page mapping to prevent fragmentation.

The last approach can be surprisingly efficient if you have a small page size and follow the advice suggested earlier in this series, to try to have a few large allocations rather than many small ones. On the other hand, a small page size means more TLB misses. But maybe that doesn't matter so much if you have good data locality. Speculation, speculation! I should provide some real numbers instead, but that is too much work!

Three techniques used by many allocators are in-place linked lists, preambles and postambles.

In-place linked lists is a technique for storing linked lists of free memory blocks without using any extra memory. The idea is that since the memory in the blocks is free anyway, we can just store the prev and next pointers directly in the blocks themselves, which means we don't need any extra storage space.
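
A minimal illustration of the idea (the FreeBlock struct is mine, not from the original allocator): since the block is free anyway, its first bytes can be reused to hold the list node, so keeping a block on the free list costs no extra memory.

struct FreeBlock
{
    FreeBlock *prev;
    FreeBlock *next;
};

// A free block is simply reinterpreted as a list node:
//     FreeBlock *node = (FreeBlock *)some_free_block;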

A preamble is a little piece of data that sits just before the allocated memory and contains some information about that memory block. The allocator allocates extra memory for the preamble and fills it with information when the memory is allocated:

void *A = malloc(100);

------------------------
| pre | A | post|
------------------------

In C we pretty much need to have a preamble, because when the user calls free(void *p) on a pointer p, we get no information about how big the memory block allocated at p is. That information needs to come from somewhere and a preamble is a reasonable option, because it is easy to access from the free() code:

struct Preamble
{
    unsigned size;
    ...
};

void free(void *p)
{
    Preamble *pre = (Preamble *)p - 1;
    unsigned size = pre->size;
}

Note that there are other options. We could use a hash table to store the size of each pointer. We could reserve particular areas in the memory buffer for allocations of certain sizes and use pointer compares to find the area (and hence the size) for a certain pointer. But hash tables are expensive, and having certain areas for allocations of certain sizes only really works if you have a limited number of different sizes. So preambles are a common option.

They are really annoying though. They increase the size of all memory allocations and they mess with alignment. For example, suppose that the user wants to allocate 4 K of memory and that our OS uses 4 K pages. Without preambles, we could just allocate a page from the OS and return it. But if we need a four byte preamble, then we will have to allocate 8 K from the OS so that we have somewhere to put those extra four bytes. So annoying!

And what makes it even more annoying is that in most cases storing the size is pointless, because the caller already knows it. For example, in C++, when we do:

delete x;

The runtime knows the actual type of x, because otherwise it wouldn't be able to call the destructor properly. But since it knows the type, it knows the size of that type and it could provide that information to the allocator when the memory is freed.

Similarly, if the memory belongs to an std::vector, the vector class has a capacity field that stores how big the buffer is, so again the size is known.

In fact, you could argue that whenever you have a pointer, some part of the runtime has to know how big that memory allocation is, because otherwise, how could the runtime use that memory without causing an access violation?

So we could imagine a parallel world where instead of free(void *) we would have free(void *, size_t) and the caller would be required to explicitly pass the size when freeing a memory block. That world would be a paradise for allocators. But alas, it is not the world we live in.

(You could enforce this parallel world in a subsystem, but I'm not sure if it is a good idea to enforce it across the board in a bigger project. Going against the grain of the programming language can be painful.)

A postamble is a similar piece of data that is put at the end of an allocated memory block.

Postambles are useful for merging. As mentioned above, when you free a memory block, you want to merge it with its free neighbors. But how do you know what the neighbors are and if they are free or not?

For the memory block to the right it is easy. That memory block starts where yours ends, so you can easily get to it and check its preamble.

The neighbor to the left is trickier. Since you don't know how big that memory block might be, you don't know where to find its preamble. A postamble solves that problem, since the postamble of the block to the left will always be located just before your block.

Again, the alternative to using preambles and postambles to check for merging is to have some centralized structure with information about the blocks that you can query. And the challenge is to make such queries efficient.

If you require all allocations to be 16-byte aligned, then having both a preamble and a postamble will add 32 bytes of overhead to your allocations. That is not peanuts, especially if you have many small allocations. You can get around that by using slab or block allocators for such allocations, or even better, avoid them completely and try to make fewer and bigger allocations instead, as already mentioned in this series.

The buddy allocator

With that short introduction to some general allocation issues, it is time to take a look at the buddy allocator.

The buddy allocator works by repeatedly splitting memory blocks in half to create two smaller "buddies" until we get a block of the desired size.

If we start with a 512 K block allocated from the OS, we can split it to create two 256 K buddies. We can then take one of those and split it further into two 128 K buddies, and so on.

When allocating, we check to see if we have a free block of the appropriate size. If not, we split a larger block as many times as necessary to get a block of a suitable size. So if we want 32 K, we split the 128 K block into 64 K and then split one of those into 32 K.

At the end of this, the state of the allocator will look something like this:

Buddy allocator after 32 K allocation:

-----------------------------------------------------------------
512 | S |
-----------------------------------------------------------------
256 | S | F |
-----------------------------------------------------------------
128 | S | F |
---------------------------------
64 | S | F | S - split
----------------- F - free
32 | A | F | A - allocated
---------

As you can see, this method of splitting means that the block sizes will always be a power of two. If you try to allocate something smaller, say 13 K, the allocation will be rounded up to the nearest power of two (16 K) and then get assigned a 16 K block.

So there is a significant amount of fragmentation happening here. This kind of fragmentation is called internal fragmentation since it is wasted memory inside a block, not wasted space between the blocks.

Merging in the buddy allocator is dead simple. Whenever a block is freed, we check if its buddy is also free. If it is, we merge the two buddies back together into the single block they were once split from. We continue to do this recursively, so if this newly created free block also has a free buddy, they get merged together into an even bigger block, etc.

The buddy allocator is pretty good at preventing external fragmentation, since whenever something is freed there is a pretty good chance that we can merge, and if we can't the "hole" should be filled pretty soon by a similarly sized allocation. You can still imagine pathological worst-case scenarios. For example, if we first allocate every leaf node and then free every other of those allocations we would end up with a pretty fragmented memory. But such situations should be rare in practice.

Worst case fragmentation, 16 K block size

-----------------------------------------------------------------
512 | S |
-----------------------------------------------------------------
256 | S | S |
-----------------------------------------------------------------
128 | S | S | S | S |
-----------------------------------------------------------------
64 | S | S | S | S | S | S | S | S |
-----------------------------------------------------------------
32 | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S | S |
-----------------------------------------------------------------
16 |A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|A|F|
-----------------------------------------------------------------

I'm being pretty vague here, I know. That's because it is quite hard in general to say something meaningful about how "good" an allocator is at preventing fragmentation. You can say how good it does with a certain allocation pattern, but every program has a different allocation pattern.

Implementing the buddy allocator

Articles on algorithms and data structures are often light on implementation details. For example, you can find tons of articles describing the high-level idea behind the buddy allocator as I've outlined it above, but not much information about how to implement the bloody thing!

This is a pity, because the implementation details can really matter. For example, it's not uncommon to see someone carefully implement the A*-algorithm, but using a data structure for the open and closed sets that completely obliterates the performance advantages of the algorithm.

So let's get into a bit more detail.

We start with allocation. How can we find a free block of a requested size? We can use the technique described above: we put the free blocks of each size in an implicit linked list. To find a free block we just take the first item from the list of blocks of that size, remove it from the list and return it.

If there is no block of the right size, we take the block of the next higher size and split that. We use one of the two blocks we get and put the other one on the free list for that size. If the list of blocks of the bigger size is also empty, we can go to the even bigger size, etc.

To make things easier for us, let's introduce the concept of levels. We say that the single block that we start with, representing the entire buffer, is at level 0. When we split that we get two blocks at level 1. Splitting them, we get to level 2, etc.

We can now write the pseudocode for allocating a block at level n:

if the list of free blocks at level n is empty
    allocate a block at level n-1 (using this algorithm)
    split the block into two blocks at level n
    insert the two blocks into the list of free blocks for level n
remove the first block from the list at level n and return it

The only data structure we need for this is a list of pointers to the first free block at each level:

static const int MAX_LEVELS = 32;
void *_free_lists[MAX_LEVELS];

The prev and next pointers for the lists are stored directly in the free blocks themselves.
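
Here is a hedged C++ sketch of that allocation path, not the engine's actual code (all names are made up for illustration, and it returns one half directly instead of inserting and immediately removing it, which is equivalent to the pseudocode above):

struct FreeBlock { FreeBlock *prev; FreeBlock *next; };

static const int MAX_LEVELS = 32;
static FreeBlock *_free_lists[MAX_LEVELS];   // head of the in-place free list per level
static size_t _total_size;                   // assumed set up at initialization

static size_t size_of_level(int n) { return _total_size >> n; }

static void push_free(int n, void *p)
{
    FreeBlock *b = (FreeBlock *)p;           // the node lives inside the free block
    b->prev = nullptr;
    b->next = _free_lists[n];
    if (_free_lists[n]) _free_lists[n]->prev = b;
    _free_lists[n] = b;
}

static void *pop_free(int n)
{
    FreeBlock *b = _free_lists[n];
    _free_lists[n] = b->next;
    if (b->next) b->next->prev = nullptr;
    return b;
}

// Allocate a block at level n, splitting a bigger block when needed.
// (A real implementation would also update the split and merge bits here.)
static void *allocate_at_level(int n)
{
    if (n < 0)
        return nullptr;                      // no bigger block to split: out of memory
    if (!_free_lists[n]) {
        void *block = allocate_at_level(n - 1);
        if (!block)
            return nullptr;
        push_free(n, (char *)block + size_of_level(n));  // second buddy goes on the free list
        return block;                        // first buddy is our result
    }
    return pop_free(n);
}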

We can also note some mathematical properties of the allocator:

total_size == (1<<num_levels) * leaf_size
size_of_level(n) == total_size / (1<<n)
max_blocks_of_level(n) = (1<<n)

Note that MAX_LEVELS = 32 is probably enough since that gives a total size of leaf_size * 4 GB and we know leaf_size will be at least 16. (The leaf nodes must have room for the prev and next pointers of the linked list and we assume a 64 bit system.)

Note also that we can create a unique index for each block in the buddy allocator as (1<<level) + index_in_level - 1. The node at level 0 will have index 0. The two nodes at level 1 will have indices 1 and 2, etc.:

Block indices

-----------------------------------------------------------------
512 | 0 |
-----------------------------------------------------------------
256 | 1 | 2 |
-----------------------------------------------------------------
128 | 3 | 4 | 5 | 6 |
-----------------------------------------------------------------
64 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
-----------------------------------------------------------------
32 |15 |16 |17 |18 |19 |20 |21 |22 |23 |24 |25 |26 |27 |28 |29 |30 |
-----------------------------------------------------------------

The total number of entries in the index is (1 << num_levels) - 1. So if we want to store some data per block, this is how much memory we will need. For the sake of simplicity, let's ignore the - 1 part and just round it off to (1 << num_levels).
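
As a hedged one-liner (the function name is mine), the unique index from the formula above:

// Unique tree index for a block: 0 for the root, 1..2 at level 1, 3..6 at level 2, ...
static size_t block_index(int level, size_t index_in_level)
{
    return ((size_t)1 << level) + index_in_level - 1;
}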

What about deallocation?

The tricky part is the merging. Performing the merge itself is simple: we just take the two blocks, remove them from the free list at level n and insert the merged block into the free list at level n-1.

The tricky part is knowing when we should merge. I.e., when we are freeing a block p, how do we know if its buddy is also free, so that we can merge them?

First, note that we can easily compute the address of the buddy. Suppose we are freeing a block p at level n. We can compute its index within the level as:

index_in_level_of(p,n) == (p - _buffer_start) / size_of_level(n)

If the index i is even, then the buddy is at index i+1, otherwise the buddy is at i-1, and we can use the formula above to solve for the pointer, given the index.

So given the address of the buddy, let's call it buddy_ptr, how can we know if it is free or not? We could look through the free list for level n. If we find it there we know it is free and otherwise it's not. But there could be thousands of blocks and walking the list is hard on the cache.

To do better, we need to store some kind of extra information.

We could use preambles and postambles as discussed earlier, but that would be a pity. The buddy allocator has such nice, even block sizes: 1 K, 2 K, 4 K, we really don't want to mess that up with preambles and postambles.

But what we can do is to store a bit for each block, telling us if that block is free or allocated. We can use the block index as described above to access this bitfield. This will require a total of (1 << num_levels) bits. Since the total size of the tree is (1 << num_levels) * leaf_size bytes, we can see that the overhead of storing these extra bits is 1 / 8 / leaf_size. With a decent leaf_size of say 128 (small allocations are better handled by a slab allocator anyway) the overhead of this table is just 0.1 %. Not too shabby.

But in fact we can do even better. We can get by with just half a bit per block. That sounds impossible, but here is how:

For each pair of buddies A and B we store the single bit is_A_free XOR is_B_free. We can easily maintain the state of that bit by flipping it each time one of the buddies is freed or allocated.

When we consider making a merge we know that one of the buddies is free, because it is only when a block has just been freed that we consider a merge. This means we can find out the state of the other block from the XORed bit. If it is 0, then both blocks are free. If it is 1, then it is just our block that is free.

So we can get by with just one bit for every pair of blocks, that's half a bit per block, or an overhead of just 1 / 16 / leaf_size.
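
A hedged sketch of that half-bit bookkeeping (the _merge_bits array and the helper names are assumptions for illustration; it reuses size_of_level from the earlier sketch, and level 0 has no buddy, so n > 0 here):

static unsigned char *_merge_bits;    // one bit per buddy pair: is_A_free XOR is_B_free
static char *_buffer_start;           // assumed: start of the managed buffer

static size_t pair_index(void *p, int n)
{
    size_t index_in_level = ((char *)p - _buffer_start) / size_of_level(n);
    size_t block = ((size_t)1 << n) + index_in_level - 1;
    return (block - 1) / 2;           // buddies are siblings: they share a parent
}

// Flip the pair's bit on every allocation and every free of a block at level n.
static void toggle_merge_bit(void *p, int n)
{
    size_t i = pair_index(p, n);
    _merge_bits[i / 8] ^= (unsigned char)(1u << (i % 8));
}

// Call after freeing p (and toggling): a zero bit means the buddy is free too,
// so the two blocks can be merged into one block at level n-1.
static bool buddy_is_free(void *p, int n)
{
    size_t i = pair_index(p, n);
    return (_merge_bits[i / 8] & (1u << (i % 8))) == 0;
}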

At this point, careful readers may note that I have been cheating.

All this time I have assumed that we know the level n of the block that we are freeing. Otherwise we cannot compute the address of the buddy or its index in the node tree.

But to know the level n of ptr we must know the size of its allocated block. So this only really works if the user passes the size of the allocation when freeing the block, i.e., the free(void *, size_t) interface that we discussed earlier.

If we want to support the simpler and more common API free(void *p), the allocator needs to somehow store the size of each allocation.

Again, using a preamble is possible, but we don't really want to.

We could use an array, indexed by (p - _buffer_start) / leaf_size to store the sizes. Note that this is not the same as the block index. We can't use the block index, since we don't know the level. Instead this is an index of size 1 << (num_levels - 1) with one entry for each possible pointer that the buddy allocator can return.

We don't have to store the full size (32 bits) in the index, just the level. That's 5 bits assuming that MAX_LEVELS = 32. Since the number of entries in this index is half that of the block index, this amounts to 2.5 bits per block.

But we can do even better.

Instead of storing the size explicitly, we can use the block index and store a single bit to keep track of whether the block at that level has been split or not.

To find the level n of an allocated block we can use the algorithm:

n = num_levels - 1
while n > 0
    if block_has_been_split(ptr, n-1)
        return n
    n = n - 1
return 0
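
In C++ this could look like the following (again just a sketch; _split_bits, holding one bit per block indexed as described above, and the other names are assumptions):

static unsigned char *_split_bits;    // one bit per block: has this block been split?
static int _num_levels;

static bool block_has_been_split(void *p, int n)
{
    size_t index_in_level = ((char *)p - _buffer_start) / size_of_level(n);
    size_t block = ((size_t)1 << n) + index_in_level - 1;
    return (_split_bits[block / 8] >> (block % 8)) & 1;
}

// Walk up from the leaf level; the first level whose parent block was split
// is the level the allocation at p was made at.
static int level_of_allocation(void *p)
{
    for (int n = _num_levels - 1; n > 0; --n) {
        if (block_has_been_split(p, n - 1))
            return n;
    }
    return 0;    // nothing was ever split: p is the whole buffer
}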

Since the leaf blocks can't be split, we only need 1 << (num_levels - 1) entries in the split index. This means that the cost of the split index is the same as for the merge index, 0.5 bits per block. It's a bit amazing that we can do all this with a total overhead of just 1 bit per block.

The price of the memory savings is that we now have to loop a bit to find the allocated size. But num_levels is usually small (in any case <= 32) and since we only have 1 bit per entry the cache usage is pretty good. Furthermore, with this approach it is easy to offer both a free(void *) and a free(void *, size_t) interface. The latter can be used by more sophisticated callers to avoid the loop to calculate the block size.

Memory arrangements

Where do we store this 1 bit of metadata per block? We could use a separate buffer, but it is not that elegant. It would mean that our allocator would have to request two buffers from the system, one for the data and one for the metadata.

Instead, let's just put the metadata in the buffer itself, at the beginning where we can easily find it. We mark the blocks used to store the metadata as allocated so that they won't be used by other allocations:

Initial state of memory after reserving metadata:

-----------------------------------------------------------------
512 | S |
-----------------------------------------------------------------
256 | S | F |
-----------------------------------------------------------------
128 | S | F |
---------------------------------
64 | S | S |
-----------------
32 | S | S | S | F |
-----------------
16 |A|A|A|A|A|F|
-------------
********** Metadata

Note that when allocating the metadata we can be a bit sneaky and not round up the allocation to the nearest power of two. Instead we just take as many leaf blocks as we need. That is because when we allocate the metadata we know that the allocator is completely empty, so we are guaranteed to be able to allocate adjacent leaf blocks. In the example above we only have to use 5 * 16 = 80 K for the metadata instead of the 128 K we would have used if we rounded up.

(The size of the metadata has been greatly exaggerated in the illustration above to show this effect. In reality, since the tree in the illustration has only six levels, the metadata is just 1 * (1 << 6) = 64 bits, that's 8 bytes, not 80 K.)

Note that you have to be a bit careful when allocating the metadata in this way, because you are allocating memory for the metadata that your memory allocation functions depend on. That's a chicken-and-egg problem. Either you have to write a special allocation routine for this initial allocation, or be very careful with how you write your allocation code so that this case is handled gracefully.

We can use the same technique to handle another pesky issue.

It's a bit irritating that the size of the buddy allocator has to be the leaf size multiplied by a power of two. Say that we happen to have 400 K of memory lying around somewhere. It would be really nice if we could use all of that memory instead of just the first 256 K.

We can do that using the same trick. For our 400 K, we can just create a 512 K buddy allocator and mark the first 112 K of it as "already allocated". We also offset the start of the buffer, so that the start of the usable memory coincides with the start of the buffer in memory. Like this:

    -----------------------------------------------------------------
512 | S |
-----------------------------------------------------------------
256 | S | F |
-----------------------------------------------------------------
128 | S | S |
---------------------------------
64 | S | S | S | F |
---------------------------------
32 | S | S | S | S | S | F |
-------------------------
16 |A|A|A|A|A|A|A|A|A|A|
---------------------
******************* Unusable, unallocated memory
MET * Metadata
^
+-- Usable memory starts here

Again, this requires some care when writing the code that does the initial allocation, so that it doesn't write into the unallocated memory and cause an access violation.

The buddy allocator and growing buffers

As mentioned in the previous post, the buddy allocator is perfect for allocating dynamically growing buffers, because what we want there is allocations that progressively double in size, which is exactly what the different levels of the buddy allocator offer.

When a buffer needs to grow, we just allocate the next level from the buddy allocator and set the capacity of the buffer so that it fills up all that space.
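
As a rough illustration of the pattern (Buffer, buddy_allocate and buddy_free are hypothetical names, not the engine's API):

#include <string.h>   // memcpy

// Assumed buddy allocator interface:
void *buddy_allocate(size_t size);
void  buddy_free(void *p, size_t size);

struct Buffer
{
    char  *data;
    size_t size;       // bytes in use
    size_t capacity;   // bytes allocated from the buddy allocator
};

// Grow to the next buddy level and adopt the whole block as the new capacity,
// so none of the allocated space is wasted.
void grow(Buffer &b, size_t leaf_size)
{
    size_t new_capacity = b.capacity ? b.capacity * 2 : leaf_size;
    char *new_data = (char *)buddy_allocate(new_capacity);
    if (b.data) {
        memcpy(new_data, b.data, b.size);
        buddy_free(b.data, b.capacity);    // the free(ptr, size) interface mentioned earlier
    }
    b.data = new_data;
    b.capacity = new_capacity;
}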

Note that this completely avoids the internal fragmentation issue, which is otherwise one of the biggest problems with the buddy allocator. There will be no internal fragmentation because the dynamic buffers will make use of all the available space.

In the next post, I'll show how all of this ties together.

Gamasutra Feature Articles

5 lessons learned building Vainglory, a MOBA for touchscreens

August 04, 2015 04:59 PM

"I wanted to share five lessons we've learned while developing the game and bringing a super core, traditionally PC-only genre to touch screens." ...

Ninja Theory's tips for surviving and thriving as a 'AAA indie'

August 04, 2015 04:24 PM

Ninja Theory's Tameem Antoniades offers an update at GDC Europe on how the studio is developing Hellblade with a 15-man team, and offers tips for fellow devs on making indie games with AAA polish. ...

A narrative fallacy: It's all about Aristotle

August 04, 2015 04:05 PM

Warren Spector explains how crucial it is for game narrative to matter -- and shares examples from Deus Ex and Epic Mickey that showcase how you do it. ...

Mobile game startup Seriously lands $18M

August 04, 2015 04:04 PM

As the mobile industry continues to go from strength to strength, more investors are entering the space in the hopes of unearthing the next Angry Birds or Candy Crush Saga. ...

Sponsored: From hindsight to insight to foresight in the Azure cloud

August 04, 2015 03:52 PM

Microsoft Azure offers a number of benefits for people planning to build analytics into their games. ...

Rocket League 'definitely' headed to other platforms

August 04, 2015 03:12 PM

Rocket League has been downloaded over 5 million times, and developer Psyonix is now looking to capitalize on that meteoric momentum by bringing the title to other platforms. ...

A newbie's guide to exhibiting at public game events

August 04, 2015 03:07 PM

"I've had some trouble in the past finding good resources for first-time exhibitors, so after wrapping up a second year of con seasons, I thought it might be useful to write up some of my own initial impressions." ...

Real-Time Rendering

Freezing Time at SIGGRAPH

by Eric at August 04, 2015 02:48 PM

Andrew Glassner and I are running a fun little workshop called “Freezing Time” this Sunday, as part of Making @ SIGGRAPH. Details: 12:15-1:45 PM, South Hall G – Studio Workstation Area

We’ll be teaching how to use T2Z, “Time To Z”, a program that lets you generate a 2D animation and then turn it into a 3D printable sculpture. Participants will be provided workstations, and there will be high-speed 3D printers available after the workshop. Can’t make it? Read on… Can make it? Get the code now and have fun on the plane ride to Los Angeles.

T2Z takes the frames of your animation and stacks them to form a 3D sculpture. This three.js program shows the transition for a number of animations – use the mouse to change the camera’s view. It’ll also get your cooling fan cranking, if your GPU is like mine. Turning “cycles” down to 1 keeps it sane.

Here’s a simple example. This animation:

squib animation

gives this sculpture when you stack the frames:

squib sculpture

The (self-imposed) challenge is to create an interesting, looping animation that also creates a visually-pleasing and printable sculpture. This example is pretty good, though the animation doesn’t quite perfectly loop. It would be easy enough to make it loop, but then we lose the base it can sit on. Tricky! If you want to hack on this code, it’s the Wobbly animation in T2Z.

There are many more examples in our gallery. I’ve been playing with the idea of data translation in general; you’ll see some experiments there. It’s been a great excuse for me to learn to use various tools at the local makerspace, Artisan’s Asylum, though I’ve not worked up the courage to actually use the plasma cutter yet. There are also plenty of fun & free tools for data manipulation, such as 123D Make, third-generation Photosynth, Sketchfab, 123D Catch, and on and on.

Even if you can’t attend the workshop, you can easily do this sort of experimentation at home. The T2Z code is free and open source, and well-documented. Companies such as Shapeways give you the ability to print high-quality models. We have lots of little animations in Animations.pde – go mess with them! There are also super-hacky “animations” at the end of this file: AnimatedGifReader turns a GIF into an STL 3D print file, FolderOfFramesReader does the same for a set of PNGs, and HeightField takes a grayscale image and uses the gray as a measure of height, e.g.

Skull3dRelief

skull_relief

Processing is entirely fun to hack on (Debugger? We don’t need no stinkin’ debugger, println() is our only friend). It’s Java plus stuff to make graphics easy. I like the fact that to run the program you are faced with the code – the system invites you to start poking at the program from the outset. Andrew wrote most of the code, being a Processing pro (he wrote a book and teaches a course in it; the first half of his course is free). Me, I translated the Marching Cubes code to Processing: each pixel of each image is treated as a voxel, the 3D model is from the isosurface formed between the objects and the background.

We hope to see you on Sunday! Or better yet, online, where we hope to see you sending us animations for the gallery and pull requests for code you’ve added.

 

Where did all this come from? Last year around May Andrew started making a series of looping GIFs using Processing, taking after the Bees & Bombs Tumblr feed. His goal was to make animations worth posting. These can now be found on Andrew’s Tumblr feed. Steve Drucker and I were the critics, over more than half a year.

During this time I was attending and organizing 3D printer meetups in the Boston area. Mark Stock pointed out a fascinating way of modeling: instead of explicitly using union operations on 3D models, the traditional CAD approach, he instead deposited objects into a large voxel grid. It’s much simpler and faster to figure out if a voxel is inside some given primitive vs. performing a union or other constructive solid geometry operation on a set of models. For example, computing the union of thousands upon thousands of spheres will bring most CAD modelers to their knees. Voxel in/out functions are trivial to compute for spheres, and Marching Cubes then guarantees a watertight, well-formed model with no geometric singularities, precision problems, etc. 3D printers themselves have limits to precision, so using voxels is a good match. Here’s an example of Mark’s work:

Dendrite by Mark Stock

So, for me, these two things combined: animations could be used to define voxels, and Marching Cubes used to generate 3D representations. I made an exceedingly slow GIF to STL converter in Perl and ran a bunch of Andrew’s GIFs through it. A few interesting forms turned up and that got me started on playing with what I call “323,” converting from some three-dimensional form of data (an animation being 2D plus time) to another (a sculpture).

Seeing the call for Making @ SIGGRAPH, we decided to go further and give a workshop on the process. The T2Z program that resulted is massively faster than my original Perl program, generating sculptures in a few seconds. It’s also much more usable, allowing you to make your own animations, hook up sliders to variables, and easily export them as GIFs, a set of PNGs, or a 3D STL model. Programming all this sucked up way more time than expected, and of course was highly addictive. Andrew made this Processing program do things that Nature did not intend (e.g., binary STL output and multi-window UI).

Personally, I find this whole design process entertaining. In idle moments (or at the dentist) I imagine what might make both an interesting animation and a worthwhile sculpture. It’s a fun way to think about modeling and animation, and one where my intuition doesn’t always pay off. The more I play, the more I learn.

Here’s a screenshot, to whet your appetite – click it for the full-size readable version:

screenshot

So download the thing, install Processing and three little libraries (easy!), and start sliding sliders, pushing buttons, and hacking code! And let us know what you find.

BTW, if you want just one link to bookmark, it’s this: http://bit.ly/t2zspot



Gamasutra Feature Articles

Keiji Inafune's clumsy Kickstarter misses target

August 04, 2015 02:48 PM

Proving that more can indeed be less, Red Ash, Keiji Inafune's spiritual successor to Mega Man Legends, has fallen $279k short of its sizable $800k Kickstarter goal. ...

c0de517e Rendering et alter

Smoothen your functions

by DEADC0DE (noreply@blogger.com) at August 04, 2015 02:28 PM

Do you have an "if", "step" or such? Replace with a saturate(multiply-add(x)).
Do you have a mad-saturate? Replace with a smoothstep.
Do you have a smoothstep? Replace with smootherstep...

Ok, kidding, but only sort of. I actually do often end up using ramps (saturate/mad... the fairy dust of shading, I love sprinkling mads in shader code) instead of steps. I remember years ago turning a pretty much by-the-book Crysis 1-style SSAO into a much better SSAO just by "feathering" the hard in/out tests (which is kinda what line sampling SSAO does, btw).

If you think about it, it's a bit of a "code smell". What shading functions should be discontinuous? True, most lighting has a max or saturate, right? But why? Really, we're considering infinitesimal lights; for physically realistic lights we would have an area of emission, and that area would be fractionally shadowed by a surface, so even there the shadowing function wouldn't just be a step of the dot product. This might not be evident on diffuse, but already when you're trying to use half-angle based specular, care has to be taken when handling the transition to the "night side".

And of course, even when a step is reasonable, any "step" function (well, -any- function!) in a shader should be anti-aliased... And of course everybody knows what the convolution of a step with a box (the pixel footprint) is... Texturing and Modeling: A Procedural Approach is the canonical text for this, but it's funny that, googling around, one of the first hits is a Renderman documentation page on antialiasing whose slides are horribly aliased. The OpenGL Orange Book also has examples, and I really want to mention IQ's article on ray differentials even if it doesn't do analytic convolution...

Many times the continuity of derivatives is not that important (visible); that's why we can use saturated ramps (discontinuous in the first derivative) or saturated smoothsteps (discontinuous in the second), with the big exception of manipulating inputs to specular shading. In that case even second-derivative discontinuities can show very clearly, thus the need for the famous "smootherstep".

Anyhow. I usually have a bunch of functions around to help with ramps, triangle ramps, smoothsteps and so on; most of them are trivial and can be derived on paper in a second or so. Lately I had to use a few I didn't know before, so I'll be writing them down here.

Yes, all this introduction was useless. :)

- Smooth Min/Max

log(pow(pow(exp(x),s) + pow(exp(y),s),1/s))

This will result in a smooth "min" between x and y for negative values of s (which controls the smoothness of the transition), and a smooth "max" for positive values.

For s=-1 this results in the "smoothest" min:

log(exp(x+y)/(exp(x)+exp(y)))

If you know that x,y are always positive a simpler formulation can be employed, as we don't need to go through the exponential mapping:

pow(pow(x,s) + pow(y,s),1/s)


Note also that if you need a soft minimum of more than two values, your expressions simplify, e.g. pow(pow(pow(pow(x,s) + pow(y,s),1/s),s) + pow(z,s),1/s) = pow(pow(x,s) + pow(y,s) + pow(z,s),1/s).

Note also the link between softmax and norm-infinity.
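
In C, the positive-input variant above could look something like this (the function name is mine; negative s gives a soft min, positive s a soft max, and larger |s| makes the transition sharper):

#include <math.h>

/* Soft min/max of two positive values. */
float smooth_minmax(float x, float y, float s)
{
    return powf(powf(x, s) + powf(y, s), 1.0f / s);
}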

- A few notes on smoothsteps

Deriving smoothstep and smootherstep is trivial, just create a polynomial of the right degree (cubic or quintic) and impose f(0)=0, f(1)=1 and f'(0)=0, f'(1)=0 (and the same for f'' in case of smootherstep), solve and voila'.
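
Solving those constraints gives the familiar polynomials; a plain C rendering (function names are mine, x assumed already clamped to 0..1):

/* Cubic and quintic that fall out of the constraints above. */
float smoothstep01(float x)   { return x * x * (3.0f - 2.0f * x); }
float smootherstep01(float x) { return x * x * x * (x * (x * 6.0f - 15.0f) + 10.0f); }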


Once you do that, it's equally trivial to start toying around and derive polynomials with other properties. E.g. imposing derivatives only at one extreme:


You can have a "smoothstep" with non-zero derivatives at the extremes:


Or a quartic that shifts the midpoint:


It would seem that the more "properties" you need, the higher the degree of the polynomial you need to craft. Until you remember that you can do everything piecewise...
Which is basically making small, specialized splines. For example, a piecewise quadratic smoothstep can look like this:


This is also helpful because there are certain tradeoffs based on the application, especially as having continuous derivatives doesn't automatically mean the result will be nice looking...
You can make functions that impose more and more derivatives (and did you know that smoothsteps can be chained? smoothstep(smoothstep(x))...), but that doesn't mean the derivatives will "behave", as they can vary wildly over the domain and result in visible "wobbling" in shading.


Another thing that you might not have noticed is how close smoothstep is to a (shifted) cosine; I didn't, before a coworker of mine, the all-knowing Paul Edelstein, mentioned it. Probably not too useful, but you never know, in certain situations it might be applicable and cheaper.


- Sigmoid functions

Another class of functions that is widely useful are sigmoids, "s-shaped" functions.

Smooth Sigmoid: x/pow((pow(abs(x),s)+1),1/s)
Logistic: 1/(1+exp(-x))

Sigmoids are similar to smoothsteps, but usually reach zero derivatives at infinity instead of at the 0,1 endpoints.


They make nice "replacements" for "step" as they approach nicely their limits as they go to infinity:


But also for saturated ramps, especially the smooth sigmoid as it has f'(0)=1 as we have shown before.
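
For reference, a small C sketch of the two sigmoids above (function names are mine):

#include <math.h>

/* Rational "smooth sigmoid": f(0) = 0, f'(0) = 1, approaches -1/+1 at infinity.
   s controls how quickly it saturates. */
float smooth_sigmoid(float x, float s)
{
    return x / powf(powf(fabsf(x), s) + 1.0f, 1.0f / s);
}

/* Logistic sigmoid: approaches 0 and 1 at -/+ infinity. */
float logistic(float x)
{
    return 1.0f / (1.0f + expf(-x));
}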


Another sigmoid is the Gompertz function, which has nice and clear parameters:

asymptote*exp(-displacement*exp(-rate*x))

Beware though, it's not symmetric around its midpoint:


There are a ton more, but I'd say not as generic. If you look at the various tonemapping curves, most of them are sigmoids, but most of them are in exponential space and not symmetric.
In fact, at one point I made tonemapping curves out of sigmoids, piecewise sigmoids or other weird things glued together :)



- Bias and Gain (thanks to Steve Worley for reminding me of these)

Bias: pow(x, -log2(a))
Gain: if x < 0.5 then 0.5*bias(2*x, a) else 1 - 0.5*bias(2-2*x, a)

Schlick's Bias: x/((1/a-2)*(1-x)+1)
Schlick's Gain: if x < 0.5 then SBias(2*x, a)*0.5 else 1 - 0.5*SBias(2-2*x, a)

Bias is just a power (the -log2(a) simply remaps the 0..1 parameter to an exponent), and Gain places one power curve next to a mirrored copy around the midpoint, which is the easiest way to construct a piecewise sigmoid (without imposing conditions on the derivatives and so on).

Schlick's versions were published in Graphics Gems 2, and are not only an optimization of the original Bias/Gain formulas (credited to Perlin's Hypertexture paper), but are also symmetric over the diagonal, which is a nifty property (it also means that for parameter a the inverse curve is given by the same formula with 1-a).
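
A small C sketch of Schlick's versions, straight from the formulas above (function names are mine):

/* Schlick's bias: cheap rational alternative to pow(x, -log2(a)),
   symmetric over the diagonal (the inverse uses parameter 1 - a). */
float schlick_bias(float x, float a)
{
    return x / ((1.0f / a - 2.0f) * (1.0f - x) + 1.0f);
}

/* Schlick's gain: two mirrored bias curves joined at x = 0.5. */
float schlick_gain(float x, float a)
{
    if (x < 0.5f)
        return schlick_bias(2.0f * x, a) * 0.5f;
    return 1.0f - 0.5f * schlick_bias(2.0f - 2.0f * x, a);
}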



- Smooth Abs


Obviously if you have any "smoothstep" you can shift it around zero to create a "smoothsign" and multiply by the original value to get a smoothed absolute. The rational polynomial sigmoid works quite well for that:


SmoothAbsZero: d*x*x/sqrt(1+d*d*x*x)

If you don't need to reach zero at x=0, you can simply add a small epsilon inside the square root of the square of your input, yielding this:

SmoothAbs: sqrt(x*x+e)


And that's all I have for now; if you've encountered other nifty functions for modelling and tinkering with procedurals and so on, let me know in the comments!
I'm always looking for nifty functions that can be useful for sculpting mathematical shapes :)

- Bonus example: Soft conditional assignment



Gamasutra Feature Articles

Cities: Skylines dev: Don't punish your players; teach them

August 04, 2015 12:57 PM

Cities: Skylines lead designer Karoliina Korppoo offers a postmortem of sorts at GDC Europe, deconstructing her design philosophy and making a case for designing games that teach, rather than punish. ...



Hooking up Elias' adaptive game music to Unity

August 04, 2015 11:03 AM

"I am going to show you a real world example of how adaptive music is used in a game, how simple it is for programmers to get Elias up and running in Unity." ...

Valve shares advice on designing great VR game interactions

August 04, 2015 10:23 AM

At GDC Europe today, Valve's Yasser Malaika shared some key lessons learned about how to make great VR game interactions by both Valve's own devs and external dev partners like Owlchemy Labs. ...

5 video game translation tools

August 04, 2015 08:05 AM

A list of five tools used by a professional localization house when translating video games -- ranging from free to paid tools. ...

Spry Fox's Alphabear cracks the code of mobile puzzle popularity

August 04, 2015 08:04 AM

How Alphabear braved the dark waters of the free-to-play ocean with irresistible viral sharing and a sensible business model -- and made it through the journey 1 million downloads strong. ...

Geeks3D Forums

(Android) Adreno Trepn profiler 6.1

August 04, 2015 07:09 AM

Quote
We wanted to let you know that version 6.1 is now available and can be downloaded from QDN...

Timothy Lottes

Hydrokinetics by Prismbeings

by Timothy Lottes (noreply@blogger.com) at August 04, 2015 07:19 AM



Geeks3D Forums

NVIDIA GeForce driver 355.06 for Linux, FreeBSD and Solaris

August 04, 2015 06:15 AM

Linux, Solaris, and FreeBSD driver 355.06 (beta)

 Release highlights since 352.30:
 

  • Fixed a bug...

Gamasutra Feature Articles

Game devs negative on job outlook, positive on diversity, says IGDA survey

August 03, 2015 11:47 PM

Game developers optimistic about growth, less so about the industry itself, says new IGDA survey of game dev professionals -- which also touches on quality of life, diversity, and opportunity. ...

Video: 'The future of the PlayStation' from Phil Harrison, in 2000

August 03, 2015 09:09 PM

Former and longtime Sony exec Phil Harrison's fascinating GDC 2000 Sony keynote, given ahead of PlayStation 2's Western launch -- right before the console went on to win the world. ...

Don't Miss: Fixing Final Fantasy XIV

August 03, 2015 08:56 PM

This candid 2011 interview sees producer Naoki Yoshida setting out on his journey to redevelop Final Fantasy XIV into A Realm Reborn: a quixotic quest, ultimately successful. ...

Get a job: ToyTalk is seeking a Gameplay and Prototyping Engineer

August 03, 2015 08:05 PM

Kids' game studio ToyTalk is looking for a high-level Gameplay and Prototyping Engineer to work alongside the company's Chief Creative Officer. ...

True virtual reality: The race to build a 'metaverse'

August 03, 2015 06:54 PM

Want to know what companies are gunning to build the next-generation virtual meeting space? A new profile looks at all of them. ...

$18m for Dota 2's The International makes for the richest prize in eSports

August 03, 2015 06:28 PM

The biggest prize in eSports gets bigger than ever, as the finals kick off at Seattle's Key Arena. The winning team will take home more than $6 million. ...

Summer Games Done Quick speedrun marathon raises $1.2m for charity

August 03, 2015 05:51 PM

Almost doubling last year's total, speedrunners raise well over a million dollars for Doctors Without Borders. ...

PlayStation exclusive DriveClub goes double platinum

August 03, 2015 05:15 PM

Troubled PlayStation exclusive DriveClub bounces back to go double platinum -- even if it's caused the company, and its developer, no end of headaches. ...

Timothy Lottes

GCN Optimization Slides by Emil Persson

by Timothy Lottes (noreply@blogger.com) at August 03, 2015 06:09 PM

A repost of one of the most useful shader optimization presentations,

Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson from AMD Developer Central

Gamasutra Feature Articles

Key takeaways from Epic's open development of Unreal Tournament

August 03, 2015 04:47 PM

Senior designer Jim Brown took the stage at GDC Europe today to talk about some of the lessons the Unreal Tournament team has learned about open development and turning players into contributors. ...

Reducing difficulty dynamically and invisibly

August 03, 2015 04:47 PM

"The key is that dynamic difficulty, when it's done right, is about tailoring the overall curve of the entire game to each player, and not simply removing a challenge if they fail at it a few times." ...

Autodesk Stingray game engine to launch later this month

August 03, 2015 04:09 PM

You can get your hands on the tool, by subscription, on August 19; later in the summer, Maya LT subscribers will get it bundled for free. ...

Lessons learned while fixing memory leaks in Unity

August 03, 2015 03:28 PM

Boss Fight Entertainment CTO Rich Geldreich shares a list of practical tips and lessons learned when fixing a major, game-crashing memory leak in Unity. ...

No place for third-party exclusives in Microsoft's long term plan

August 03, 2015 03:07 PM

Third-party deals are left behind as Microsoft's Phil Spencer reveals new first-party focus. ...

What does player retention look like for a week-long game?

August 03, 2015 02:51 PM

Ron Carmel on Subterfuge's retention: "We know this is not a mass market game. Those who love it will happily stick around and those who don't are better off not playing." ...

Cambridge researchers use brain training game to treat schizophrenia

August 03, 2015 02:27 PM

A research team out of Cambridge is using a brain training game to combat day-to-day schizophrenia symptoms. ...

Game From Scratch

Autodesk’s Stingray Game Engine Arrives Soon

by Mike@gamefromscratch.com at August 03, 2015 02:23 PM

 

It was becoming clear that Autodesk was entering the game market when they purchased BitSquid back in June of last year.  In addition to making the Magicka series of games, they also created the BitSquid game engine.  In March of 2015 Autodesk announced that Bitsquid was now the Stingray Game Engine and that it was coming soon.  Today more details emerged, including pricing and a release date.  Here is the official press release:


 

 

Autodesk Launches Stingray Game Engine at GDC Europe 2015


COLOGNE, Germany, August 3, 2015 --
At the Game Developers Conference (GDC) Europe 2015, Autodesk (NASDAQ: ADSK) announced that its new Stingray game engine will be available to game developers worldwide beginning August 19, 2015. Later this summer, Autodesk will also offer Autodesk Maya LT Desktop Subscription customers access to Autodesk Stingray as part of their subscription.


Built on the powerful, data-driven architecture of the Bitsquid engine, which Autodesk acquired in 2014, Stingray is a comprehensive new platform for making 3D games. The engine supports a host of industry-standard game development workflows and includes powerful connectivity to Autodesk 3D animation software that simplifies game development across a wide range of platforms.


"Between Augmented Reality, Virtual Reality and the proliferation of mobile platforms, the games industry is undergoing a major transition, which poses new complexities for both AAA and indie game developers. Autodesk developed Stingray with these challenges in mind, and we're excited to share its debut with the game developer community," said Autodesk senior vice president, Media & Entertainment, Chris Bradshaw. "Stingray makes it easy and intuitive for artists with varying skill sets and programming expertise to create the next generation of 3D blockbuster games, entertainment and even architecture."

 


Stingray feature highlights include:


-- Seamless Art-to-Engine Workflow:
Import, create, iterate, test and review 3D assets and gameplay faster with a one-click workflow and live link between Stingray and Autodesk 3D animation software.


-- Modern Data-Driven Architecture: A lightweight code base gives game developers the freedom to make significant changes to the engine and renderer without requiring source code access.


-- Advanced Visuals and Rendering: Produce visually stunning games with a powerful rendering pipeline, physically-based shading, advanced particle effects, post processed visual effects, lightmap baking and a high-performance reflection system.


-- Proven Creative Toolset: Stingray includes proven solutions like Beast, HumanIK, Navigation, Scaleform Studio (UI technology built on Scaleform), FBX, Audiokinetic Wwise and NVIDIA PhysX.


-- Versatile Game Logic Creation: Stingray includes a wide range of development tools, making game creation more accessible for game makers with varying levels of experience - including visual node-based-scripting and Lua scripting. C++ source code will also be available as an additional purchase upon request.


-- Multiplatform Deployment and Testing: Quickly make and apply changes to gameplay and visuals across supported platforms: Apple iOS, Google Android, Microsoft Windows 7 and Windows 8, Oculus Rift DevKit 2, Sony PlayStation 4 and Microsoft Xbox One.


Autodesk previewed Stingray at GDC 2015 earlier this year in San Francisco. Since then, game developers around the world have signed up for Autodesk's beta program, shipped games using this technology and provided feedback including:


"Stingray's data-driven architecture and flexibility have helped us build a broad portfolio of games, and quick iteration times for both code and content creators has boosted our productivity significantly. The engine has been a key success factor for us because we're able to produce high quality games in a shortened timeframe. We're excited to see how Autodesk will continue to evolve the engine," shared Martin Wahlund, CEO of Fatshark.


"We never know what kind of games we're going to create, and the engine is good for that. It really allows us to just make anything. We can make an FPS or an RTS, or a top-down shooter, or a role-playing game, or whatever. It's not tied to a specific genre," explained Johan Pilestedt, CEO, Arrowhead Game Studios.


The Stingray engine can also be used in design environments and is an informative next step to further understand design data before anything is physically built. The engine's real-time digital environment, on a powerful, data-driven architecture, is programmed to look and feel like the physical world. Through the high-end development tools and visual scripting system, customers can program objects, light effects, environmental elements, materials, and entourage elements to behave and react as they would in the physical world.


Connected to Autodesk 3ds Max, architecture, engineering and construction customers can import Autodesk Revit data into 3ds Max, add content to the 3ds Max scene and then place that scene in the Stingray engine to explore, animate, and interact in the designed space.


Pricing and Availability


Autodesk Stingray runs on Windows and will be available via Autodesk Subscription starting August 19, 2015 for $30 US MSRP per month. Later this summer, Autodesk plans to offer Maya LT Desktop Subscription customers access to the engine as part of Maya LT. For more details about Stingray, visit: www.autodesk.com/stingrayengine


About Autodesk


Autodesk helps people imagine, design and create a better world. Everyone--from design professionals, engineers and architects to digital artists, students and hobbyists--uses Autodesk software to unlock their creativity and solve important challenges. For more information visit autodesk.com or follow @autodesk.

 

So there you have it: Stingray will be available for $30 a month starting later this month.  Interestingly, it also seems to be included with the Maya LT subscription, which is also $30 a month or $240 a year, so it's effectively free for Maya LT users.  That's certainly a boon for existing Maya LT subscribers, but in a world full of free game engines, will a subscription-based engine fly?

 

You can learn more about the Stingray Engine at http://stingrayengine.com/ or by watching the video below.

 

Introducing the Autodesk Stingray 3D game engine from Autodesk Media and Entertainment on Vimeo.



Gamasutra Feature Articles

GDC Europe is this week! Follow Gamasutra's coverage right here

August 03, 2015 02:17 PM

Follow Gamasutra's coverage of news, interviews, and presentations from this week's GDC Europe 2015, straight from Cologne, Germany. ...

This Week in Video Game Criticism: From cat collection to Metal Gear codec calls

August 03, 2015 02:03 PM

This week, our partnership with game criticism site Critical Distance brings us picks from Kris Ligman on topics ranging from the zen patience of Metal Gear Solid's codec calls to the feline indifference of mobile hit Neko Atsume. ...

Cat and mouse: How Bohemia bags cheaters in DayZ

August 03, 2015 01:57 PM

After studying them intently for some time, DayZ associate producer Eugen Harton shares lessons learned on how to detect and combat cheaters in your own game. ...

Yearly revisions the secret to Nintendo's handheld biz

August 03, 2015 01:37 PM

Analyst Matt Matthews charts how the company's flagship console products keep on keeping on, thanks to a yearly upgrade cycle. ...

Advice on making believable VR games from Sony's Heist designer

August 03, 2015 12:19 PM

At GDC Europe today, SCEE's John Foster shared advice for fellow devs on making believable, enjoyable VR games based on his experiences crafting Project Morpheus demos like The London Heist. ...

How CD Projekt Red made crafting work with Witcher 3's dynamic economy

August 03, 2015 10:58 AM

At GDC Europe, CDPR's Matthew Steinke offers a behind-the-scenes look at how he led design of The Witcher 3: Wild Hunt's crafting system and how he made it play nice with the game's dynamic economy. ...