9/04/2010

09-04-10 - Holy fuck balls

MSVC 2005 x64 turns this :

static RADFORCEINLINE void rrMemCpySmall(void * RADRESTRICT to,const void * RADRESTRICT from, SINTa size)
{
    U8 * RADRESTRICT pto = (U8 * RADRESTRICT) to;
    const U8 * RADRESTRICT pfm = (const U8 * RADRESTRICT) from;
    for(SINTa i=0;i < size;i++) pto[i] = pfm[i];
}

into

call memcpy

god damn you, you motherfuckers. The whole reason I wrote it out by hand is because I know that size is "small" ( < 32 or so) so I don't want the function call overhead and the rollup/rolldown of a full memcpy. I just want you to generate rep movsb. If I wanted memcpy I'd fucking call memcpy god dammit. Friggle fracking frackam jackam.
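
One way to ask MSVC for the x86 byte-copy string instruction (rep movsb) directly is the `__movsb` intrinsic from `<intrin.h>`. This is a minimal sketch, not what the code above does: `copy_small` is a made-up name, and the non-MSVC branch is just the plain byte loop, which an optimizer is still free to turn back into a memcpy call.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   /* __movsb */
#endif

/* Hypothetical small-copy helper: meant for sizes known to be tiny at runtime. */
static void copy_small(void *to, const void *from, size_t size)
{
    assert(size == 0 || (to != NULL && from != NULL));
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    /* MSVC on x86/x64: the intrinsic expands to an inline rep movsb. */
    __movsb((unsigned char *)to, (const unsigned char *)from, size);
#else
    /* Other compilers: plain byte loop (may still be recognized as memcpy). */
    unsigned char *pto = (unsigned char *)to;
    const unsigned char *pfm = (const unsigned char *)from;
    for (size_t i = 0; i < size; i++) {
        pto[i] = pfm[i];
    }
#endif
}
```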


In other news, I know it's "cool" to run with /W4 on MSVC or -Wall on GCC (or even -Wextra), but I actually think it's counterproductive. The difference between /W3 and /W4 is almost entirely warnings that I don't give a flying fuck about, shit like "variable is unused". Okay fine, it's not used, fucking get rid of it and don't tell me about it.

Shit like "variable initialized but not used" , "conditional is constant", "unused static function removed" is completely benign and I don't want to hear about it.

I've always been peer-pressured into running with max warning settings because it's the "right" thing to do, but actually I think it's just a waste of my time.


MoveFileTransacted is pretty good. There's other transactional NTFS shit but really this is the only one you need. You can write your changes to a temp file (on your ram drive, for example) then use MoveFileTransacted to put them back on the real file, and it's nice and safe.
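
A rough sketch of that write-to-temp-then-move pattern. `replace_file` is a hypothetical helper. The Windows branch assumes the kernel transaction manager calls from `<ktmw32.h>` (link with KtmW32.lib). The non-Windows branch swaps in plain `rename()`, which is a different mechanism that also replaces the target in one step on POSIX systems.

```c
#include <assert.h>
#include <stdio.h>

#if defined(_WIN32)
#include <windows.h>
#include <ktmw32.h>   /* CreateTransaction, CommitTransaction, RollbackTransaction */
#endif

/* Hypothetical helper: move temp_path over final_path.
   Returns 1 on success, 0 on failure. */
static int replace_file(const char *temp_path, const char *final_path)
{
    assert(temp_path != NULL && final_path != NULL);
#if defined(_WIN32)
    HANDLE txn = CreateTransaction(NULL, NULL, 0, 0, 0, 0, NULL);
    if (txn == INVALID_HANDLE_VALUE) {
        return 0;
    }
    /* COPY_ALLOWED because the temp file may live on another volume. */
    BOOL ok = MoveFileTransactedA(temp_path, final_path, NULL, NULL,
                                  MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED,
                                  txn);
    if (ok) {
        ok = CommitTransaction(txn);
    } else {
        RollbackTransaction(txn);
    }
    CloseHandle(txn);
    return ok ? 1 : 0;
#else
    /* Assumption for non-Windows builds: POSIX rename() replaces atomically. */
    return rename(temp_path, final_path) == 0;
#endif
}
```

Usage is just: write the new contents to the temp file, close it, then call `replace_file(temp, real)` and check the result.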

BTW while I'm ranting about ram drives; how the fuck is that not in the OS? And my god people are getting away with charging $80 for them. AND even those expensive ones don't support the only feature I actually want - dynamic sizing. It should be the fucking size of the data on it, not some static predetermined size.

20 comments:

  1. Are you looking at the body of the call, or are you looking at an inline call site?

    Those annoying warnings you cite are less about catching likely coding problems and more about style or legibility. Having superfluous things (like an if() that only ever goes one way) is confusing. It might seem like a waste of time to the programmer, but it is useful to whoever is doing a cold read (which is likely you, a month from now).

    MoveFileTransacted looks good, but (to quote a wise man) holy fuck balls! Most of the time, you want to do this within a filesystem (not across disks, network or whatever) so you can atomically revise a file. How come MoveFileEx doesn't already do this? It should be simple file metadata manipulation. Blech.

  2. I guess you cannot use #pragma intrinsic for memcpy in your case?

  3. "How come MoveFileEx doesn't already do this? It should be simple file metadata manipulation. Blech."

    If it's a rename on the same volume and the target file does not already exist, then regular MoveFile is atomic. But if you're overwriting an existing file, it's not atomic.

  4. "I guess you cannot use #pragma intrinsic for memcpy in your case? "

    I am using intrinsic memcpy all the time of course. That doesn't do anything magical unless the size of the memcpy is known at compile time. In this case I know size is < 48 but not exactly what it is.

    BTW a related thing I discovered a while ago when I made my backspace-key-changing app. I wanted to make it not use any CRT to get the exe size tiny. One annoyance I found was that if you do :

    type array[size] = { 0 };

    that actually generates "call memset" in MSVC, which gives you link errors of course when you have no CRT. Similarly,

    struct A,B;
    A = B;

    generates "call memcpy".
    (these cases however may get inlined if you use the intrinsics)

    It's a little bit weird the way memcpy/memset are now treated as being part of the language as opposed to part of the standard library.

  5. "It's a little bit weird the way memcpy/memset are now treated as being part of the language as opposed to part of the standard library."
    The way most modern compilers see it, the standard library is a part of the language. GCC does even more special-case handling for C standard library functions along that vein. There's compile-time type checking of printf arguments based on the format string. GCC knows that strlen, memcmp, strchr etc. are pure functions, and will optimize away repeated invocations on constant strings (exploited here). It will also evaluate such functions on constant values at compile time (e.g. strlen("Hello") will be constant-folded into a literal integer 5). And so on...

  6. Continuing my tangent (since maybe you really do just want MoveFileTransacted): even if MoveFileEx does the right thing sometimes, the way NTFS is designed makes atomic file updates difficult.

    http://blogs.msdn.com/b/adioltean/archive/2005/12/28/507866.aspx

    Then again, hey look at this:

    http://msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx

    One kinda neat thing about GCC is that the special cases for those methods are accessible as function attributes. You can tell which fields correspond to a printf format string + input arguments. You can specify which fields are non-null. You can declare the function as pure or const, etc.

  7. All the intrinsic stuff needs to be cleaned up and made a first class thing...

  8. "It will also evaluate such functions on constant values at compile time (e.g. strlen("Hello") will be constant-folded into a literal integer 5). And so on... "

    Yeah, a lot of this stuff is really cool and generally it's good, but I have a few complaints.

    I don't really like the promotion of the stdlib into a special status. The compiler should let me do that markup to *any* function.

    When I'm actually calling a stdlib function and it does some mojo on it, that's pretty okay, but when I'm just writing machine code and it compiles my code into a stdlib call, that's a bit bonkers. At least that should be exposed and I should be able to disable it, or get the same treatment for my functions.

    I just think this whole approach to making C into a semi-high-level language is backwards. You shouldn't be letting people continue to write lowish-level-looking plain C and then magically making it high level behind their back through figuring out their intent. You should just have a richer stdlib and encourage people to call good functions more.

  9. They're getting there, but the non-standardness sucks, since they don't really overlap.

    http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

    http://msdn.microsoft.com/en-us/library/aa383701.aspx

  10. http://en.wikipedia.org/wiki/ReadyBoost

    This has been in Windows since Vista

  11. What does ReadyBoost have to do with anything?

  12. Sorry, this was in reference to "BTW while I'm ranting about ram drives; how the fuck is that not in the OS? And my god people are getting away with charging $80 for them."

    I probably should have replied to the last post (re: RAM drives) to be clear.

  13. Yeah, I think the outrageous thing is that they are charging money for a piece of software that ought to be pretty simple to write. Also, cb has the SSD hotness, so ReadyBoost is not really helpful. Not that it is particularly helpful to anyone, anyway.

    That being said: when you make a temporary file with the right flag, the OS will delay writing it back, presumably as long as it thinks it is profitable to keep it around in buffer cache. That's kinda like a dynamically-sized ram disk.

    http://en.wikipedia.org/wiki/Tmpfs#Microsoft_Windows

  14. Hmm, yeah FILE_ATTRIBUTE_TEMPORARY is pretty cool, it's unclear to me if you actually have to use DELETE_ON_CLOSE, if so that makes it useless though.

    Actually the whole windows disk cache is very nice, and I wish it was exposed as a generic place to store cached junk. For example perhaps procedurally generated texture data. It should just be names that point to bytes and don't necessarily actually have disk files backing them.

  15. DELETE_ON_CLOSE only does what it says. The documentation is pretty clear (bottom of this section):

    http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx#caching_behavior

    This thread is drifting towards the comments I posted for:

    http://cbloomrants.blogspot.com/2010/05/05-26-10-windows-page-cache.html

    For procedurally-generated data (or decompressed images) why not just mmap temporary files? If you do it right, it should be available across processes and process invocations while minimally hitting the disk. You'd probably need a helper process that goes and GCs old or stale tmp files. Then the circle will be complete, and you, too, can be a bastard who has written a background process that mysteriously touches disk.

  16. "why not just mmap temporary files? If you do it right, it should be available across processes and process invocations while minimally hitting the disk."

    Yeah that's probably okay in practice, but in theory it bugs me.

    In particular, what I want is mem-mapped temporary files that *never* get actually written to disk, so that if I try to open one by name, if it's in cache I get it, and if it's not in cache I get a failure.

    This is important because if you ever actually wind up opening a file from a hard disk, it can be a 10 ms seek delay, and when your whole frame is 16 ms that's catastrophic.

    One hacky cover would be to do your file open on a thread and if it doesn't return within 1000 clocks treat as a failure.

    This would be the correct way for web browsers to do their mem cache of downloaded pages for example, so that it could be persistent across runs and play well with other users of ram & cache.

  17. In Linux, you can use mincore to see whether a page is resident. Maybe you can do something like that with SEH. On most platforms, you can lock some memory so that it is guaranteed to be paged in. For example:

    http://msdn.microsoft.com/en-us/library/aa366895(v=VS.85).aspx

    You can ration your memory sanely by asking the OS:

    http://msdn.microsoft.com/en-us/library/aa366589
    http://msdn.microsoft.com/en-us/library/aa965224

    And you can dynamically adjust using memory notifications:

    http://msdn.microsoft.com/en-us/library/aa366541

  18. No no no, none of that is what I want, and in fact I would say using that stuff is why we have badly behaving apps, because people implement their own caches and memory locking and low-mem handling, and they do it wrong or it doesn't play nice with others. I don't want to manage locking my own pages, I want the cache to do it for me.

  19. You have part of what you want with temporary files: they are stored in the cache and only write back under memory pressure. The wrinkle you want is to prevent even those write backs. It would be nice to have OS support for this sort of thing; in fact something like this exists within the Linux kernel. What I'm suggesting is that you prevent/avoid this situation.

    Querying beforehand gives you a rough budget of how many temp pages you can map in. Using the notification mechanism lets you free your temps before they get written out. The missing part of the equation is what is your eviction policy, and the fact that you operate on file, rather than page/cache block granularity. I guess that's where you reinvent the wheel. :-/

    http://www.youtube.com/watch?v=WX8Du9pusdA

    Then again, you also have much more information about how you intend to use the cached data. Like, what you're likely to access next, what the storage cost in RAM is vs. the rematerialization cost in CPU, etc.

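
The GCC function attributes mentioned in the comments (printf-style format checking, non-null arguments, pure) can be attached to your own functions. A small sketch: the macro and function names are made up for illustration, and the attributes compile away on compilers that do not define `__GNUC__`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__)
#define ATTR_PRINTF(fmt, first) __attribute__((format(printf, fmt, first)))
#define ATTR_PURE               __attribute__((pure))
#define ATTR_NONNULL(...)       __attribute__((nonnull(__VA_ARGS__)))
#else
#define ATTR_PRINTF(fmt, first)
#define ATTR_PURE
#define ATTR_NONNULL(...)
#endif

/* pure: the result depends only on the arguments and the memory they point to,
   so the compiler may merge repeated calls with the same inputs. */
ATTR_PURE ATTR_NONNULL(1)
static size_t count_byte(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c) {
            n++;
        }
    }
    return n;
}

/* format(printf, 3, 4): argument 3 is the format string and the variadic
   arguments start at 4, so mismatched arguments get compile-time warnings. */
ATTR_PRINTF(3, 4) ATTR_NONNULL(1, 3)
static int format_into(char *buf, size_t cap, const char *fmt, ...)
{
    assert(cap > 0);
    va_list args;
    va_start(args, fmt);
    int written = vsnprintf(buf, cap, fmt, args);
    va_end(args);
    return written;
}
```

With these in place, something like `format_into(buf, sizeof buf, "%s", 42)` should draw a format warning from GCC under -Wall, the same way a bad printf call does.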