11-26-08 - Oodle

OMG I finally got hot loading sort of working with bundles and override bundles. So you can build a package of a bunch of files. The game runs and grabs the package, which pages in. You touch some file, the game sees the change, and that causes it to make a single-file bundle for the changed file. It loads in the new bundle, and then patches the resource table so that the new version is used instead of the old one in the package. Then the Game Dictionary is updated so that future runs will also load the new version.

Phew. It works but it's fragile and I'm not happy with how careful you have to be. There are lots of ways it can fail. For example, the packages all load one by one asynchronously. If you fire the big package load first, it may come in and get hooked up before the override package. Then if you hook up your game resources, you have hooked up to the old (wrong) data. You can prevent this by doing a call on each resource that you get to say "is this the right version of this resource" before hooking up to it. But currently there's just a ton of stuff like that you have to be aware of and make sure to do the right call, etc. etc.
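To make the ordering problem concrete, here is a minimal sketch of a resource table that tolerates packages arriving in either order and exposes the "is this the right version" check. All names here (`ResourceTable`, `on_bundle_loaded`, `is_right_version`) are hypothetical and are not Oodle's API; the sketch also assumes the Game Dictionary can tell you which bundle each resource is expected to come from.

```python
# Hypothetical sketch; none of these names come from Oodle.

class ResourceTable:
    def __init__(self):
        # resource name -> (bundle name, is_override, data)
        self._entries = {}

    def on_bundle_loaded(self, bundle_name, resources, is_override=False):
        """Hook up a bundle's resources when its async load completes."""
        for name, data in resources.items():
            current = self._entries.get(name)
            # A base package must never replace an override that already arrived.
            if current is not None and current[1] and not is_override:
                continue
            self._entries[name] = (bundle_name, is_override, data)

    def get(self, name):
        return self._entries[name][2]

    def is_right_version(self, name, expected_bundle):
        """Check to make before the game hooks up to a resource.

        expected_bundle is what the (assumed) Game Dictionary says should
        supply this resource.
        """
        entry = self._entries.get(name)
        return entry is not None and entry[0] == expected_bundle


table = ResourceTable()
# The big package happens to arrive first, carrying the stale copy.
table.on_bundle_loaded("big.pkg", {"hero.tex": "old", "level.geo": "geo"})
if not table.is_right_version("hero.tex", "hero.override"):
    pass  # not safe to hook up yet; wait for the override bundle
table.on_bundle_loaded("hero.override", {"hero.tex": "new"}, is_override=True)
```

The table itself can be made order-independent, but the game still has to ask before hooking up, which is exactly the kind of call that is easy to forget.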

I keep coming back to the problem that I need to know whether I have the right version of the file in a package. There are three options I can see :

1. Fuck it. Do nothing. This requires the game to load the right versions all the time. Clients could, for example, always just work unpackaged while editing, then when they make paging packages they would simply not be editable or incremental-linkable in a nice way. This is not a happy solution but it "works" in the sense that it is *reliable* about not working.

2. Use mod times. This works fine and is easy and fast *as long as you only touch and make files on your local machine*. Which would fail for major game devs since they tend to compile files on some kind of build server, or check in and share compiled files. But if you think about the way programmers work - we don't ever check in our .OBJ files, we just check in sources and everybody has to rebuild on their local machine. If you make your artists do the same thing - only share source files and always local rebuild - OR sync to a full compiled set from the server and only run compiled, but never try to mix local changes with server compiles - then it works fine.

3. Use CRC's. This is the most robust and reliable. The packages store the CRC of the files they were made from. If the source file on disk has a different CRC, we assume that the source file is better. (we don't try to use mod times to tell who's newer because of the server-client mod time problem). If the CRC is different we repackage using the local source file. Then when we load we always prefer packages that have content that matches the local CRC. This works and is stable and good and all. The only problem is all the time you spend doing CRC's on files, which may or may not be a big deal. Obviously running over your whole tree and doing CRC's all the time would be ridiculous, but you can use mod time to tell when to redo the CRC, so you only incrementally redo it. Even getting all the mod times is too slow, so you can use a persistent Watcher app to cache the drive dir listing and file infos.
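Option 3 with the mod-time shortcut might look roughly like the sketch below. It is an illustration under assumptions, not the actual implementation: `CrcCache` and `needs_repackage` are made-up names, the cache lives in memory rather than in a persistent Watcher app, and file size is compared alongside mod time as an extra guard.

```python
import os
import zlib


class CrcCache:
    """Recompute a file's CRC32 only when its mod time or size has changed."""

    def __init__(self):
        self._cache = {}      # path -> ((mtime_ns, size), crc)
        self.recomputed = 0   # how many times a file was actually read

    def crc(self, path):
        st = os.stat(path)
        key = (st.st_mtime_ns, st.st_size)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == key:
            return cached[1]  # unchanged since last look, skip the read
        crc = 0
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                crc = zlib.crc32(chunk, crc)
        self.recomputed += 1
        self._cache[path] = (key, crc)
        return crc


def needs_repackage(package_crcs, path, cache):
    """True if the source on disk no longer matches the CRC stored in the package.

    package_crcs is a hypothetical dict of source path -> CRC recorded at
    package build time. Note that no "newer than" comparison is made.
    """
    return package_crcs.get(path) != cache.crc(path)
```

Mod time is used here only as a cheap "did anything change" signal for the local file, never to decide which of two versions is newer, so the server-client mod time problem does not come into it.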

The "CRC" method is not necessarily "right" in the sense that it doesn't load the newest version of the content, but it is reliable, it always does the same thing - it loads content that corresponds to the source version on the user's disk.

BTW this last thing (caching the dir listings and file infos) is something the OS should really do but totally fails on. With the amount of memory we have now, the OS should just keep a listing of every dir you've ever visited in memory at all times. Even if you visit every dir on the disk it would only be 100 Megs. You could cap it at 64 MB or something and LRU, and obviously having 64 MB of dir listings is going to make your dir access instant all the time. I might just write an app to do this myself.
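A rough sketch of such a capped LRU cache of dir listings follows. The class name and methods are hypothetical, and for simplicity the cap is a count of directories rather than a byte budget like 64 MB. A real version would also need change notifications from the file system to know when to invalidate; here that is left to the caller.

```python
import os
from collections import OrderedDict


class DirListingCache:
    """Keep the most recently used directory listings in memory."""

    def __init__(self, max_dirs=1024):
        self._max_dirs = max_dirs
        self._listings = OrderedDict()  # dir path -> {file name: (mtime_ns, size)}

    def listing(self, dir_path):
        if dir_path in self._listings:
            self._listings.move_to_end(dir_path)  # mark as most recently used
            return self._listings[dir_path]
        entries = {}
        with os.scandir(dir_path) as it:
            for entry in it:
                st = entry.stat()
                entries[entry.name] = (st.st_mtime_ns, st.st_size)
        self._listings[dir_path] = entries
        if len(self._listings) > self._max_dirs:
            self._listings.popitem(last=False)  # evict least recently used
        return entries

    def invalidate(self, dir_path):
        """Drop a listing, e.g. when a watcher reports a change in that dir."""
        self._listings.pop(dir_path, None)

    def cached_dirs(self):
        """Cached directories, least recently used first."""
        return list(self._listings)
```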

I'm kind of tempted to offer #1 and #3 to clients and not offer #2. I don't want to offer any features that sort of work or will have weird behavior that looks like "bugs".

Also BTW yes of course there will also just be a mode to load what's in the packages and not even look at what's newest or anything. This is what the real shipping game will use of course, once you have your final packages set up it just loads them and assumes you made them right.


won3d said...

Couldn't you synchronously load some bundle metadata that lets you reliably tell the directory which package to expect each resource?

I'm a fan of using hashes as object identity, but I would go for something stronger than CRC. I suppose you could use a big CRC, but some HMAC or just MD5 would probably work better. Also, using strong hashes means you can do some cool caching and distributed stuff pretty easily.

cbloom said...

"Couldn't you synchronously load some bundle metadata that lets you reliably tell the directory which package to expect each resource?"

That's exactly what I'm trying to do (that metadata is the "Game Dictionary" in my lingo), but correctly maintaining that metadata is not as trivial as it seems due to the issues outlined in the previous post on this topic.

cbloom said...

Yeah, also I guess I should some day look into what hash to use. I'm not actually sure that even CRC32 would be a bad thing since I don't care about global hash collisions, only hash collisions on the exact same file, which has extremely low probability.

Anyway it would be cool to have a stronger hash in my toolkit that's also very fast. The ghetto MD5 and SHA code I've used in the past has been sick slow, but I'm sure it could be made fast.

Assen said...

We found the Windows Crypto API MD5 to be brutally faster than the reference MD5 code we used (3-4x), so on Windows we use that; in any case, it's considerably faster than the actual reading of the file itself.
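For comparison, here is a small sketch of hashing a file in chunks with a stronger hash. It uses Python's standard `hashlib` rather than the Windows Crypto API mentioned above, and `hash_file` is a made-up helper name; no timing claims are implied.

```python
import hashlib


def hash_file(path, algorithm="md5", chunk_size=1 << 16):
    """Return the hex digest of a file, read in chunks to bound memory use.

    algorithm is any name hashlib.new accepts, e.g. "md5" or "sha1".
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Swapping this in for a CRC32 only changes the size of the stored value; the "does the package match the local source" comparison stays the same.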

Sean Barrett said...

There's sha-1 code in stb.h
