03-03-09 - Oodle Update

Oodle currently has two modes - "Incremental Build" and "Final". In Final mode it's assumed that you have made all the packs the way you want, and all the content is valid. This is presumably how you will actually ship your game. In Final mode, I don't check bundles against the original files or any of that stuff, it's much simpler and faster.

The problem of course is that Final is a big pain during development. Incremental lets you change any individual file at any time. It lets you delete any of the packed bundles. It automatically tries to use the best packed bundle - but only if it contains the latest version of the file. As I've written about before, we can't use modtime for this reliably, so I now use the hash of the source file (and use modtime to tell when I need to recompute the hash).
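In sketch form, that looks something like this (names are made up for illustration, not the actual Oodle code) - modtime is just a cheap hint for when to rehash, and the content hash is what actually decides whether a bundle is stale :

    #include <cstdint>
    #include <string>

    struct SourceFileRecord
    {
        uint64_t cachedModTime = 0;  // last modtime we observed for this file
        uint64_t cachedHash    = 0;  // hash of the file contents at that time
    };

    // Assumed helpers, provided elsewhere :
    uint64_t GetFileModTime(const std::string & path);
    uint64_t HashFileContents(const std::string & path);

    // Recompute the hash only when the modtime changed; the hash, not the
    // modtime, decides whether a packed bundle holds the latest version.
    uint64_t GetCurrentSourceHash(const std::string & path, SourceFileRecord & rec)
    {
        uint64_t modTime = GetFileModTime(path);
        if (modTime != rec.cachedModTime)
        {
            rec.cachedModTime = modTime;
            rec.cachedHash    = HashFileContents(path);
        }
        return rec.cachedHash;
    }

    bool BundleHasLatestVersion(uint64_t hashStoredInBundle, uint64_t currentSourceHash)
    {
        return hashStoredInBundle == currentSourceHash;
    }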

The goal of the incremental build system is that you can just do whatever you want during dev, and everything "just works" - you never see the wrong version of the content or anything weird. I consider it better to be correct and robust than to be hyper efficient. But I also have the simultaneous goal that performance should be as close to the Final mode as possible. There are two big reasons for this - one is just speed, you still want your loads and iteration to be fast during development, but the other is even more significant : the way you run during dev should be as close as possible to the way you run when you ship. It also hopefully makes it easier to write code for Oodle, because you just write to one API and it automatically acts the right way.

Okay, that all sounds good, but dear god it gets complex in practice.

The general thing that makes it really tricky is that I want to allow the client to do absolutely anything to their content at any time, without telling me, and things should still work. Oodle maintains a lightweight database of what it thinks of the content, but at any time that database can be wrong because the client may have done things without telling me. So I constantly need to be verifying it and reupdating it.

For example, clients might go and change source content while the game's not running. When the game runs next I want it to automatically update to the newer source content. (if they change it while the game is running, I see it and reload immediately). Clients might just delete bundles. They might sync to bundles that were made on a server. They might run the bundler tool and add or remove files from individual bundles. The next time the game runs it should just all work (and it shouldn't have to spin over every file and load everything up and validate it either).

I think I have this largely fixed for most normal cases. Basically it involves a lot of "lazy evaluation" of database correctness; as things are requested or brought in, inconsistencies are detected and fixed. A typical load pattern goes like this :

Client code asks to make resource A resident. (there are various ways that clients can make resources resident, this is the "loosest" path - you could also manually tell a whole Bundle to come resident, or you could ask for a named "paging set" to come resident).

First I check if A is already resident and just give it back. But if there's been an error in the previous load of A, I invalidate the database entry that was used and don't return the bad version.

I see if A has a valid map in the database. If so I try to use one of the existing bundles. Now there's a heuristic for which bundle to load. If any of the bundles that contain A is already in memory or already in the process of loading, I use that. Else, if A is in a header-aggregate bundle, I use that to get the header. (header aggregates contain the headers of lots of resources and generally are made resident very early so you can get the headers before the data comes in). Next if A is in a compiled bundle, I use that (compiled bundles are presumably made by the packing tool so are preferred if possible). Finally I fall back to the single bundle.
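Something like this, in sketch form (types and names are illustrative, not the real API) :

    #include <vector>

    enum BundleKind { kHeaderAggregate, kCompiled, kSingle };

    struct Bundle
    {
        BundleKind kind;
        bool       residentOrLoading;  // already in memory, or a load is in flight
    };

    // Pick which of the bundles containing resource A to load from.
    const Bundle * ChooseBundle(const std::vector<const Bundle *> & containingA)
    {
        // 1. anything already resident or already loading wins
        for (const Bundle * b : containingA)
            if (b->residentOrLoading) return b;

        // 2. header aggregate, so the header is available early
        for (const Bundle * b : containingA)
            if (b->kind == kHeaderAggregate) return b;

        // 3. compiled bundle made by the packing tool
        for (const Bundle * b : containingA)
            if (b->kind == kCompiled) return b;

        // 4. fall back to the single bundle
        for (const Bundle * b : containingA)
            if (b->kind == kSingle) return b;

        return nullptr;  // no usable bundle; caller falls through to the baker
    }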

If there is no valid database map, I invoke the "baker" for A which knows how to load the data raw and turn it into a single bundle. This is the only pathway that knows how to load raw data, and this is where the client must plug in their file format parsers.
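The plug-in point looks roughly like this (hypothetical interface, just to show where the client's format parsing lives) :

    #include <string>

    // The baker is the only code path that reads raw source formats (BMP,
    // OBJ, whatever); the client implements one of these per content type.
    struct Baker
    {
        virtual ~Baker() {}

        // Parse the raw source file, process it, and write out a single
        // bundle containing just this resource.
        virtual bool BakeToSingleBundle(const std::string & sourcePath,
                                        const std::string & outBundlePath) = 0;
    };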

That all catches a lot of cases. For example, if some bundle has been corrupted, I will go ahead and fire the load; when the load fails I will invalidate the map from the resource to that bundle. If you retry the load, it will see there's no mapping to a bundle and fire the baker to make a new one.
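Putting that together, the whole request path is roughly this (again, made-up names, not real Oodle calls) :

    struct ResourceId { unsigned id; };
    struct BundleRef  { /* handle to a packed bundle */ };

    // Assumed helpers, provided elsewhere :
    bool IsResident(ResourceId a);
    bool Database_FindBundleFor(ResourceId a, BundleRef * out);
    bool FireLoad(const BundleRef & b);                        // false on failure
    void Database_InvalidateMapping(ResourceId a, const BundleRef & b);
    bool InvokeBaker(ResourceId a);                            // raw load -> single bundle

    bool MakeResident(ResourceId A)
    {
        if (IsResident(A))
            return true;

        BundleRef b;
        if (Database_FindBundleFor(A, &b))
        {
            if (FireLoad(b))
                return true;

            // corrupt or stale bundle : drop the A -> bundle mapping so the
            // retry doesn't go back to the bad bundle
            Database_InvalidateMapping(A, b);
            return false;  // on retry there's no mapping, so we hit the baker path below
        }

        // no valid mapping in the database : bake from raw source
        return InvokeBaker(A);
    }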

When a bundle loads, it contains various objects; it should contain the resource that caused the load request, but it might also contain others. Those others may have also been previously loaded another way. Our new data may or may not be intended to replace the old data. The heuristic that I use is that the raw source file on disk is always authoritative. I believe that is intuitive and what artists and users would expect - if you have a certain BMP on disk, then that is what you will see in the game. If you have some packed bundles that contain older versions of that file, they will be ignored - when they try to register their objects they will be rejected. If you have packed bundles that contain newer versions (eg. from a server or other developer), those will also be ignored if you don't have the newer source files. I'm not sure if this is good or bad, but it's an inevitable consequence of not trusting modtime.
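In code terms the registration check is just a hash compare against the source file on disk (illustrative sketch) :

    #include <cstdint>

    // When a loaded bundle registers its objects, anything whose stored
    // source hash doesn't match the current hash of the source file on disk
    // is rejected - the raw file on disk always wins.
    bool TryRegisterObject(uint64_t hashStoredInBundle, uint64_t currentSourceHash)
    {
        if (hashStoredInBundle != currentSourceHash)
            return false;  // older (or newer, unsynced) version; ignore it

        // ... actually register the object as resident here ...
        return true;
    }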

That's all relatively simple (LOL?).

One problem remains - because I'm primarily keying off the source file hash, I can't distinguish between different processings of the same original source file. This happens if you're doing something nontrivial in the baker, which I believe is a common case. The baker might convert textures to DXTC, it might optimize geometry for vertex cache, that kind of stuff. You might want to have a "fast bake" that just gets the data into some usable format, and a "heavy bake" that really does a good job. If there are bundles which contain the content in various levels of bake, I always want to load the heaviest bake.

Currently I'm using the heuristic that newer bakes are "heavier", so I always prefer the newest bundle that contains the valid content. But there are various problems with that. One is that of course it's not necessarily correct and modtime isn't reliable. But even if it *was* always correct, it wouldn't work for me, because I can't tell when two bakings are the same. eg. say I load up bundle X that contains resources {A,B,C}. Now the client asks for resource C. I see that I already have it because I loaded bundle X. But I also see in the game content map that bundle X is not the newest bundle for resource C. There's some newer bundle Y. Can I return the X->C that I already have? Or do I have to fire a load of Y to get the newer version? I can't tell.

I guess the easy solution is to add an extra field to the baking of each resource that indicates the baking level, and require the client to manually set that field. That field should really be a bit-field, so that the client can mark up to 32 types of processing the resource might have, and then I insist that you get the "most processed" resource at any time.
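Something like this is what I have in mind (the specific flag names are obviously up to the client; purely illustrative) :

    #include <cstdint>

    // Client-defined processing bits, up to 32 of them :
    enum BakeFlags
    {
        kBake_FastLoadable  = 1 << 0,  // "fast bake" : just usable in-game
        kBake_TexturesDXTC  = 1 << 1,  // textures converted to DXTC
        kBake_VCacheOptim   = 1 << 2,  // geometry optimized for vertex cache
        // ...
    };

    // "More processed" here means a strict superset of processing bits; when
    // two bakings of the same source hash compete, prefer the superset.
    bool IsHeavierBake(uint32_t a, uint32_t b)
    {
        return ((a & b) == b) && (a != b);
    }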

I hate exposing more fields that the client has to set right for things to work. My goal is that all this complexity is completely hidden, the client doesn't really have to worry about it, you just do things and it all magically works.


Semi-related : I wrote a while ago about the speeds that I think you should have with an LZ decompressor for file IO. In that post I assume that the game has some CPU work to do for loading, so the CPU used by the decompressor is stealing from someone else.

It turns out that's changing fast. On a new 8-core chip, it's highly unlikely that the game is actually using all the CPU time *ever*, and even more unlikely that it's using it during load. That means CPU is basically free, which totally changes the equation. Having a slow decompressor like LZMA might still hurt your *latency*, but it will improve your throughput, because the higher compression ratio means fewer bytes coming off the disk and the extra decode work is essentially free.

Of course this is only true on modern CPUs. We sort of have a problem now with CPUs similar to the problem with GPUs - there's a huge difference in capabilities between high spec and low spec, and the way you target them is quite different. The biggest step is probably from 1 core to 2 cores, because the difference between having a true simultaneous background thread and just time-slicing on one core is huge. But the difference from 2 to 8 is also pretty huge. With 8 cores you can do a *lot* very fast in the background. For example you could just do masses of decompression and procedural content generation.
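For example, a trivial sketch of that kind of background work, using generic C++ threads rather than any particular engine's job system (the decompressor hook is hypothetical) :

    #include <cstddef>
    #include <cstdint>
    #include <thread>
    #include <vector>

    // Assumed decompressor hook (LZMA, LZH, whatever you ship with) :
    void DecompressChunk(const std::vector<uint8_t> & packed,
                         std::vector<uint8_t> & out);

    void LoadInBackground(const std::vector< std::vector<uint8_t> > & packedChunks,
                          std::vector< std::vector<uint8_t> > & outChunks)
    {
        outChunks.resize(packedChunks.size());

        std::thread worker([&]()
        {
            // on a multi-core machine this runs truly in parallel with the
            // game, so even a slow high-ratio codec costs little wall time
            for (std::size_t i = 0; i < packedChunks.size(); ++i)
                DecompressChunk(packedChunks[i], outChunks[i]);
        });

        // ... main thread keeps simulating / rendering here ...

        worker.join();
    }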

The Steam Hardware Survey shows that single-core CPUs are going away pretty fast, so hopefully soon we will be able to target a min spec of >= 2 cores and >= dx9.
