An omnivorous information addict. Still, graphics, rendering and programming languages are my favorite areas. A non-conformist wannabe.
Posts by Jaewon Jung
  1. A Fast, Type-safe & Minimal String Formatting Library, miniformat ( Counting comments... )
  2. The less the code, the better ( Counting comments... )
  3. Smoothsort vs. Timsort ( Counting comments... )
  4. Generating Uniformly Distributed Points on Sphere ( Counting comments... )
  5. Simulating a Loaded Dice in a Constant Time ( Counting comments... )
  6. All the nuances of static_cast (and others) ( Counting comments... )
  7. Getting the Projected Extent of a Sphere to the Near Plane ( Counting comments... )
  8. Search by a Changelist Description in Perforce ( Counting comments... )
  9. GUI Test Automation using SIKULI ( Counting comments... )
  10. Sudoku Solver in Haskell ( Counting comments... )
Technology/ Code /

The inspiration for this came from tinyformat. Unfortunately it relies on the C++ iostream, basically std::cout, which is somewhat slow and definitely not everyone's favorite. So I bit the bullet and got around to implement my own with minimal dependency and speed in mind.

Origin

I didn't want to implement all the nitty-gritty of formatting a floating-point number to a string. Fortunately there was an awesome library called stringencoders, so I used it for the number-to-string part. The original miniformat came out like that(By the way, you can find the latest code of miniformat with improvements mentioned in this article in the new repository. I created the new one with git back-end to use SourceTree, which currently doesn't support Mercurial on Windows). Its usage was like:

std::string str;
mini::format(str, "String: %1 Int: %0, Float: %(.3)2\n", 100, "JJ", 3.141592);

Originally it supported only std::string as an output type by default, although one can support other string types by specializing function templates in string_adaptor namespace. By the way, keep in mind the caveat that it's not supporting all the features of printf format specifiers(for instance, no hexadecimal/octal integer support).

Improvements

Recently I saw Andrei Alexandrescu's great post on C++ optimizations(video, slides) and decide to apply the lessons there (as for number-to-string conversion) to the formatting library and see how it fares. I also wanted to add a support for raw char arrays as an output type by default since a potential heap allocation cannot be an option in some situations.

Here's the original code for converting 32bit unsigned integer to a string:

template <typename String>
void miniformat::detail::render(String& wstr, uint32_t value, int width, int)
{
    int beginIndex = string_adaptor::length(wstr);
    // Conversion. Number is reversed.
    do 
    {
        string_adaptor::append(wstr, 1, (char)(48 + (value % 10))); 
    }
    while (value /= 10);

    // Handle the 'width' parameter.
    int spaceCnt = width - (string_adaptor::length(wstr)-beginIndex);
    if(spaceCnt > 0)
        string_adaptor::append(wstr, spaceCnt, ' ');

    int endIndex = string_adaptor::length(wstr)-1;
    // Reverse string.
    strreverse(string_adaptor::at(wstr, beginIndex), string_adaptor::at(wstr, endIndex));
}

Pretty straightforward, other than the consideration for the 'width' parameter(the unnamed fourth parameter is a precision value, which is irrelevant for integers).

Here is the one after Andrei's optimizations applied:

inline int miniformat::detail::digits10(uint32_t v)
{
    static const uint32_t P01 = 10;
    static const uint32_t P02 = 100;
    static const uint32_t P03 = 1000;
    static const uint32_t P04 = 10000;
    static const uint32_t P05 = 100000;
    static const uint32_t P06 = 1000000;
    static const uint32_t P07 = 10000000;
    static const uint32_t P08 = 100000000;
    static const uint32_t P09 = 1000000000;

    if(v < P01) return 1;
    if(v < P02) return 2;
    if(v < P03) return 3;
    if(v < P08)
    {
        if(v < P06)
        {
            if(v < P04) return 4;
            return 5 + (v >= P05);
        }
        return 7 + (v >= P07);
    }
    return 9 + (v >= P09);
}
template <typename String>
void miniformat::detail::render(String& wstr, uint32_t value, int width, int)
{
    static const char digits[201] =
        "0001020304050607080910111213141516171819"
        "2021222324252627282930313233343536373839"
        "4041424344454647484950515253545556575859"
        "6061626364656667686970717273747576777879"
        "8081828384858687888990919293949596979899";

    int next = string_adaptor::length(wstr);
    const int length = digits10(value);

    // Handle the 'width' parameter.
    int spaceCnt = width - length;
    if(spaceCnt > 0)
    {
        string_adaptor::append(wstr, spaceCnt, ' ');
        next += spaceCnt;
    }

    string_adaptor::append(wstr, length, '0');
    next += length-1;

    while(value >= 100)
    {
        const uint32_t i = (value % 100) * 2;
        value /= 100;
        *string_adaptor::at(wstr, next) = digits[i+1];
        *string_adaptor::at(wstr, next-1) = digits[i];
        next -= 2;
    }

    // Handle last 1-2 digits.
    if(value < 10)
    {
        *string_adaptor::at(wstr, next) = '0'+static_cast<char>(value);
    }
    else
    {
        const uint32_t i = static_cast<uint32_t>(value) * 2;
        *string_adaptor::at(wstr, next) = digits[i+1];
        *string_adaptor::at(wstr, next-1) = digits[i];
    }
}

It now consists of two passes(and two corresponding functions) instead of one, but due to strength reduction, array writes minimization and other optimizations from Andrei's post, should perform faster than before.

Here are timings of converting some random integers into std::string, before & after this optimization:

Win32 Release/before Release/after Debug/before Debug/after
uint32_t 0.85 0.57(1.5x) 4.64 2.44(1.9x)
int32_t 0.89 0.62(1.4x) 4.63 2.44(1.9x)
uint64_t 1.47 1.03(1.4x) 4.71 2.5(1.9x)
int64_t 1.49 1.08(1.4x) 4.71 2.5(1.9x)

 

x64 Release/before Release/after Debug/before Debug/after
uint32_t 0.93 0.54(1.7x) 1.38 0.78(1.8x)
int32_t 0.94 0.56(1.7x) 1.39 0.77(1.8x)
uint64_t 0.93 0.57(1.6x) 1.43 0.79(1.8x)
int64_t 0.94 0.63(1.5x) 1.45 0.82(1.8x)
  • Numbers are in seconds.
  • Iteration count: 10,000,000 for Release, 1,000,000 for Debug
  • Compiler used: VS2012
  • One can find the code for this in benchmark2() function template of miniformat.cpp in the project repository.

One can observe 1.4x ~ 1.9x performance improvements overall. Not bad. You can also notice that the performance degrades very much in Debug(keep in mind that the iteration count differs between Release and Debug), especially in Win32. It seems mainly due to the fact that I used std::string as an output in this benchmark, which has much more overhead in Debug.

As I said, I had bolted on a support for raw char arrays recently, but it wasn't very efficient because I had originally structured the code on the implicit assumption that getting a length of a string is a constant operation. This holds true for std::string and similars, but not for a raw char array. For that, one should pass the current length of the output string through functions. You can see that the render() code above calls string_adaptor::length() in the beginning to get the current length. That's far from optimal. So I fixed it.

Here are some timings of a benchmark comparing sprintf vs. miniformat(for the both cases of outputting to std::string and to char array), before & after this fix:

Win32 Release/before Release/after Debug/before Debug/after
Speed-up when outputting to std::string 4.42 4.20 0.26 0.24
Speed-up when outputting to char array 2.59 3.84 1.58 1.27

 

x64 Release/before Release/after Debug/before Debug/after
Speed-up when outputting to std::string 3.81 3.81 1.13 1.06
Speed-up when outputting to char array 2.19 5.13 3.91 3.51
  • Numbers mean a speed-up factor of miniformat compared to sprintf.
  • Compiler used: VS2012
  • One can find the code for this in benchmark() function of miniformat.cpp in the project repository.

Before this fix, the speed-ups in the case of outputting to char array were less than the ones in the case of outputting to std::string because of the inefficiency mentioned above(4.42 vs 2.59 on Win32/Release, 3.81 vs 2.19 on x64/Release). With the fix, the speed-ups of outputting to char array has come close to or exceeded the ones of outputting to std::string. Again, you can see the inefficiency of std::string on Debug, which actually makes the speed-up factor of miniformat less than one on Win32/Debug.

Conclusion

I think miniformat is now in a decent state. As I said, it's not exactly functionally equivalent to sprintf and some people don't like possible code bloat that an extensive use of C++ template brings. I definitely won't say benchmarks above are scientific. Still, it brings type-safety and speed-up of up to 4~5x (than sprintf) as just a header-only library with no dependency.

It's not yet using C++11 stuff. In a related node, one should still append .c_str() when putting std::string as an input argument. I tried to do without it, but without move constructor stuff in C++11, it seemed very difficult to not introduce any copying. std::ref)(also, available in C++11) can do the trick, but it requires more typing than adding .c_str()! If anyone knows a solution for this without introducing a dependency, please enlighten me.

I welcome suggestions!

(I posted this article to my personal blog, too.)