8/22/2010

08-22-10 - AutoPrintf v1

Well autoprintf v1 appears to be all working. The core element is a bunch of functions like this :

template < typename T1, typename T2, typename T3, typename T4 >
inline String autoToString( T1 arg1, T2 arg2, T3 arg3, T4 arg4 )
{
    return ToString(arg1) + autoToString( arg2,arg3,arg4);
}


template < typename T2, typename T3 >
inline String autoToString( const char *fmt, T2 arg2, T3 arg3 )
{
    autoFormatInfo fmtInfo = GetAutoFormatInfo(fmt);
    if ( fmtInfo.autoArgI )
    {
        String newFmt = ChangeAtoS(fmt,fmtInfo);
        if ( 0 ) ;
        else if ( fmtInfo.autoArgI == 1 ) return autoToString(newFmt.CStr(), ToString(arg2).CStr(),arg3);
        else if ( fmtInfo.autoArgI == 2 ) return autoToString(newFmt.CStr(), arg2,ToString(arg3).CStr());
        else return autoPrintf_BadAutoArgI(fmt,fmtInfo);
    }

         if ( fmtInfo.numPercents == 0 )    return ToString(fmt) + autoToString(arg2,arg3);
    else if ( fmtInfo.numPercents == 1 )    return StringPrintf(fmt,arg2) + autoToString(arg3);
    else if ( fmtInfo.numPercents == 2 )    return StringPrintf(fmt,arg2,arg3);
    else return autoPrintf_TooManyPercents(fmt,fmtInfo);
};

You have an autoToString that takes various numbers of template args. If the first arg is NOT a char *, it calls ToString on it then repeats on the remaining args. Any time the first arg is a char *, it uses the other overload, which looks in fmt to see if it's a printf format string, then splits the args based on how many percents there are. I also added the ability to use "%a" to mean auto-typed args, which is what the first part of the function is doing.
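As a rough illustration of that dispatch, here is a cut-down sketch for up to three args. It is not the actual v1 code: std::string stands in for String, ToString and StringPrintf are assumed shapes, CountPercents is a hypothetical replacement for GetAutoFormatInfo, and the "%a" handling is left out entirely.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Stand-ins for the post's String, ToString and StringPrintf (assumed shapes).
inline std::string ToString(int v)         { return std::to_string(v); }
inline std::string ToString(const char *s) { return std::string(s); }

inline std::string StringPrintf(const char *fmt, ...)
{
    char buf[256];
    va_list vl;
    va_start(vl, fmt);
    std::vsnprintf(buf, sizeof(buf), fmt, vl);
    va_end(vl);
    return std::string(buf);
}

// Hypothetical replacement for GetAutoFormatInfo: counts conversion
// specifiers, treating "%%" as a literal percent sign.
inline int CountPercents(const char *fmt)
{
    assert(fmt != nullptr);
    int n = 0;
    for (const char *p = fmt; *p; ++p)
    {
        if (*p != '%') continue;
        if (p[1] == '%') { ++p; continue; }
        ++n;
    }
    return n;
}

// One arg: just convert it.
template <typename T1>
inline std::string autoToString(T1 a1) { return ToString(a1); }

// Two args, first is not a format string: convert it, recurse on the rest.
template <typename T1, typename T2>
inline std::string autoToString(T1 a1, T2 a2) { return ToString(a1) + autoToString(a2); }

// Two args, first is a const char *: split on the number of percents.
template <typename T2>
inline std::string autoToString(const char *fmt, T2 a2)
{
    int n = CountPercents(fmt);
    if (n == 0) return ToString(fmt) + autoToString(a2);
    if (n == 1) return StringPrintf(fmt, a2);
    return "<too many percents>";
}

// Three args, same two cases again.
template <typename T1, typename T2, typename T3>
inline std::string autoToString(T1 a1, T2 a2, T3 a3) { return ToString(a1) + autoToString(a2, a3); }

template <typename T2, typename T3>
inline std::string autoToString(const char *fmt, T2 a2, T3 a3)
{
    int n = CountPercents(fmt);
    if (n == 0) return ToString(fmt) + autoToString(a2, a3);
    if (n == 1) return StringPrintf(fmt, a2) + autoToString(a3);
    if (n == 2) return StringPrintf(fmt, a2, a3);
    return "<too many percents>";
}
```

With these overloads, autoToString("x=%d, ", 5, 7) takes the const char * path, formats the 5 with the format string, and appends the 7 via ToString. Note that every branch is compiled for every combination of arg types even though only one runs, which is why this sketch only accepts int and const char * args.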

That's all dandy, but you should be able to see that for large numbers of args, it generates a massive amount of code.

The real problem is that even though the format string is usually a compile-time constant, I can't parse it at compile time, so I generate code for each arg being %a or not being %a, and for each possible number of percents. The result is something like 2^N codegens for N args. That's bad.

So, I know how to fix this, so I don't think I'll publish v1. I have a method for v2 that moves most of the work out of the template. It's much simpler actually, and it's a very obvious idea: all you have to do is make a template like :


autoprintf(T1 a1, T2 a2, T3 a3)
{
    autoPrintfSub( autoType(a1), autoArg(a1) ,autoType(a2), autoArg(a2) , .. )
}

where autoType is a template that gives you the type info of the arg, and autoArg does conversions on non-basic types for you, and then autoPrintfSub can be a normal varargs non-template function and take care of all the hard work.
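A minimal sketch of that shape, under several assumptions: the post does not show what autoType returns or how autoPrintfSub finds the end of its args, so the integer tags, the leading count parameter, and the name autoprintfToString are all made up here. This version also just concatenates the args and returns a string, with no format-string handling.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical type tags (the real autoType result is not shown in the post).
enum { kAutoInt, kAutoDouble, kAutoCStr };

inline int autoType(int)                 { return kAutoInt; }
inline int autoType(double)              { return kAutoDouble; }
inline int autoType(const char *)        { return kAutoCStr; }
inline int autoType(const std::string &) { return kAutoCStr; }

// autoArg passes basic types through and converts non-basic ones to
// something that is safe to push through "...".
inline int          autoArg(int v)                { return v; }
inline double       autoArg(double v)             { return v; }
inline const char * autoArg(const char *s)        { return s; }
inline const char * autoArg(const std::string &s) { return s.c_str(); }

// Plain varargs function, compiled once: reads numArgs (tag, value) pairs.
inline std::string autoPrintfSub(int numArgs, ...)
{
    std::string out;
    char buf[64];
    va_list vl;
    va_start(vl, numArgs);
    for (int i = 0; i < numArgs; ++i)
    {
        int tag = va_arg(vl, int);
        switch (tag)
        {
        case kAutoInt:
            std::snprintf(buf, sizeof(buf), "%d", va_arg(vl, int));
            out += buf;
            break;
        case kAutoDouble:
            std::snprintf(buf, sizeof(buf), "%g", va_arg(vl, double));
            out += buf;
            break;
        case kAutoCStr:
            out += va_arg(vl, const char *);
            break;
        default:
            assert(!"unknown type tag");
            break;
        }
    }
    va_end(vl);
    return out;
}

// The only templated part: a thin wrapper that tags each arg.
template <typename T1, typename T2, typename T3>
inline std::string autoprintfToString(const T1 &a1, const T2 &a2, const T3 &a3)
{
    return autoPrintfSub(3, autoType(a1), autoArg(a1),
                            autoType(a2), autoArg(a2),
                            autoType(a3), autoArg(a3));
}
```

The point of the design is that the template body is a single call per arg count, so the per-instantiation code is tiny, and all the branching on types happens at runtime inside one non-template function.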

... yep new style looks like it will work. It requires a lot more fudging with varargs; the old style didn't need any of that. And I'm now using undefined behavior, though I think it always works in all real-world cases. In particular, in v2 I'm now relying on the fact that I can do :


  va_start(vl)
  va_arg(vl) .. a few types to grab some args from vl
  vsnprintf(  vl);

that is, I rely on the fact that va_arg advances me one step in the va_list, and that I then still have a valid va_list for remaining args which I can pass on. As far as I know this is technically not allowed by the standard, but I've never seen a case where it doesn't work (unless GCC decided to get pedantic and forcibly make it fail for no good reason).
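Written out in full, the pattern looks something like the sketch below. PrefixedPrintf and its leading int argument are made up for illustration and are not part of autoprintf. Note that it calls va_end right after vsnprintf returns and never reads vl again, which is the constraint the comment thread below arrives at.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: the first variadic arg is an int indent width,
// the remaining variadic args are consumed by fmt.
inline std::string PrefixedPrintf(const char *fmt, ...)
{
    char buf[256];
    va_list vl;
    va_start(vl, fmt);

    // Pull one argument off the list ourselves.
    int indent = va_arg(vl, int);
    assert(indent >= 0);

    // Hand the partially consumed list to vsnprintf for the rest.
    std::vsnprintf(buf, sizeof(buf), fmt, vl);

    // vl is not read again after the callee has used it; just end it.
    va_end(vl);

    return std::string(static_cast<std::size_t>(indent), ' ') + buf;
}
```

For example, PrefixedPrintf("%d-%s", 2, 7, "ok") consumes the 2 as the indent and lets vsnprintf format the 7 and "ok".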

10 comments:

  1. Are you sure it's not allowed by the standard? I thought that the design WAS that the va_list is an iterator, not just a list. But I've never read the relevant parts of the standard.

  2. " Are you sure it's not allowed by the standard? I thought that the design WAS that the va_list is an iterator, not just a list. But I've never read the relevant parts of the standard. "

    Eh.. I read somewhere that you aren't allowed to do that, but I haven't actually read the standard.

    .. reading it now ..

    actually it looks like it's fine. The standard doesn't clearly say that when you pass the va_list through it will act like the remaining args, it just says you can pass it.

    But it also says that after passing it, your copy of the va_list is invalidated and you must immediately call va_end to stop your iteration, then restart from the beginning.

    So I could do that but it's a bit annoying.

  3. See

    http://www.opengroup.org/onlinepubs/009695399/basedefs/stdarg.h.html

  4. God I hate that kind of spec writing. Anyway, the description of how ap advances and the cross-function comments seem to mean (though hard to be sure through spec speak) that it's supposed to do the right thing. This seems strongly the implication of va_copy, which lets you copy the current state and pass one or the other to another function, and then pick up again where you left off. It's probably legit to let the called function return you the va_list and continue where it left off, but it's unclearly written given the way they talk about multiple functions.

  5. It's not even clear to me from the spec what va_copy is supposed to do after you've called va_arg. Does it copy the whole list, or just the portion of the list you are now pointing at?

    I mean that spec is just terrible. It's not actually restrictive enough to give the implementers much freedom, but it's also not clear or generous enough to guarantee the clients the functionality they need.

    Which is why I get so pissed off at standards wonks. In practice I know how va_list works on every real platform, and I know what I can and can't do, and that is a way more useful "de facto spec" than what's written.

  6. va_copy just copies the pointer--it gives you a copy of the iterator into the same data.

    i think.

  7. Hmm... what happened with my previous comment? Is it awaiting moderation or was it sent directly to /dev/null?

    Anyway, the problem with va_copy is that in the x64 calling convention some arguments are always passed in registers, even with variadic functions. GCC handles that using a structure holding the registers and the stack pointer; va_list is a pointer to that structure, so in order to save the state you have to use va_copy. Here's an explanation of how that works on MSVC and why it does not require va_copy.

  8. Yeah I looked at the implementation of va_arg on x64 MSVC and saw it was just doing stack pointer stepping so figured they had to be copying the args somewhere. I think their solution to do it inside the callee is nice.

    I might do the GCC port some day. Or somebody who actually cares about GCC compatibility can do it.

  9. "if ( 0 ) ;"

    You are cracking me up :)

    Did you do that because it makes the "else if" clauses line up nicely, or because it's faster on PS3?

  10. Lol. It's actually because that code is all autogen'ed and it was just easier to not generate a special case for the first if.

    see:


    template < typename T2, typename T3, typename T4, typename T5, typename T6, typename T7, typename T8, typename T9, typename T10 >
    inline String autoToString( const char *fmt, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10 )
    {
        autoFormatInfo fmtInfo = GetAutoFormatInfo(fmt);
        if ( fmtInfo.autoArgI )
        {
            String newFmt = ChangeAtoS(fmt,fmtInfo);
            if ( 0 ) ;
            else if ( fmtInfo.autoArgI == 1 ) return autoToString(newFmt.CStr(), ToString(arg2).CStr(),arg3,arg4,arg5,arg6,arg7,arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 2 ) return autoToString(newFmt.CStr(), arg2,ToString(arg3).CStr(),arg4,arg5,arg6,arg7,arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 3 ) return autoToString(newFmt.CStr(), arg2,arg3,ToString(arg4).CStr(),arg5,arg6,arg7,arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 4 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,ToString(arg5).CStr(),arg6,arg7,arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 5 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,arg5,ToString(arg6).CStr(),arg7,arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 6 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,arg5,arg6,ToString(arg7).CStr(),arg8,arg9,arg10);
            else if ( fmtInfo.autoArgI == 7 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,arg5,arg6,arg7,ToString(arg8).CStr(),arg9,arg10);
            else if ( fmtInfo.autoArgI == 8 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,arg5,arg6,arg7,arg8,ToString(arg9).CStr(),arg10);
            else if ( fmtInfo.autoArgI == 9 ) return autoToString(newFmt.CStr(), arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9,ToString(arg10).CStr());
            else return autoPrintf_BadAutoArgI(fmt,fmtInfo);
        }

        if ( fmtInfo.numPercents == 0 ) return ToString(fmt) + autoToString(arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 1 ) return StringPrintf(fmt,arg2) + autoToString(arg3,arg4,arg5,arg6,arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 2 ) return StringPrintf(fmt,arg2,arg3) + autoToString(arg4,arg5,arg6,arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 3 ) return StringPrintf(fmt,arg2,arg3,arg4) + autoToString(arg5,arg6,arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 4 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5) + autoToString(arg6,arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 5 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5,arg6) + autoToString(arg7,arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 6 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5,arg6,arg7) + autoToString(arg8,arg9,arg10);
        else if ( fmtInfo.numPercents == 7 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5,arg6,arg7,arg8) + autoToString(arg9,arg10);
        else if ( fmtInfo.numPercents == 8 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9) + autoToString(arg10);
        else if ( fmtInfo.numPercents == 9 ) return StringPrintf(fmt,arg2,arg3,arg4,arg5,arg6,arg7,arg8,arg9,arg10);
        else return autoPrintf_TooManyPercents(fmt,fmtInfo);
    };
