6/20/2010

06-20-10 - Struct annoyance

It's a very annoying fact of coding life that references through a struct are much slower than loading the variable out to temps. So for example this is slow :

void rrMTF_Init(rrMTFData * pData)
{
    for (int i = 0; i < 256;  i++)
    {
        pData->m_MTFimage[i] = pData->m_MTFmap[i] = (U8) i;
    }
}

and you have to do this by hand :

void rrMTF_Init(rrMTFData * pData)
{
    U8 * pMTFimage = pData->m_MTFimage;
    U8 * pMTFMap = pData->m_MTFmap;
    for (int i = 0; i < 256;  i++)
    {
        pMTFimage[i] = pMTFmap[i] = (U8) i;
    }
}

Unfortunately there are plenty of cases where this is actually a significant big deal.

5 comments:

  1. Can you use 'restrict' to get it to automatically optimize it?

    ReplyDelete
  2. That said, there is just something wrong with our programming languages that you almost always *mean* for that sort of thing to be optimized out of the loop, and the "best" way we have to do this is using strict type aliasing. Maybe we need some metaphor for "locking" a data structure (meaning we know it won't get changed behind the compilers back) or something.

    ReplyDelete
  3. Yeah, in theory "restrict" does this for you, though I don't think that restrict will apply to pointers inside the struct (?). That is, if you put a "restrict" on pData, you would still have to load the array pointers out to temporaries so that you can mark them resitrict as well. (I'm unclear on this) (... actually it appears the correct solution is to put "restrict" in the declaration of the struct)

    I agree it definitely feels like a flaw in C.

    The solution I'd like to see is something like

    __non_volatile {
    ... code ...
    }

    which lets you mark a whole area of code as being "non volatile" which tells the compiler it could do all memory reads at the open brace and app writes at the close brace and it wouldn't change behavior (eg. nothing within the braces communicates through memory in any hidden way).

    But really a lot of the cases where this hits me are so simple that just the tiniest bit of static analysis would figure it out for me.

    ReplyDelete
  4. Our Java compiler will do this. Pointer analysis for C is generally pretty tricky, though I would have thought that the points-to analysis in gcc would occasionally extract the non-aliasing properties you need for CSE.

    ReplyDelete
  5. Your "__non_volatile" thing is reminding me of transactional memory.

    Yeah, Java arrays never alias each other. Re: points-to analysis, it depends on whether the compiler can see the assignments of the pointer fields. I think if the compiler is really aggressive and this is a hot area, then the compiler might emit pointer range checks and dispatch to a restricted or non-restricted version. ICC might do this, but I have not firsthand experience.

    ReplyDelete