format_it: Iterator based string formatting

by Malte Skarupke

format_it is my attempt at writing string formatting that is good enough that it could make it into the C++ standard. It’s not quite there yet but I haven’t gotten to work on it in a few months, so I will publish what I have so far. It is fast (much faster than using snprintf to write to the stack), type safe, memory safe, doesn’t truncate, extensible, backwards compatible for std::ostream and somewhat composable.

The syntax is a mix of printf, ostream and varargs. I’ll just show an example:

// format "100 and 0.5" into 1024 bytes of stack memory
fmt::stack_format<1024> format("%0 and %1", 100, 0.5f);

// print "Hello World" into 1024 bytes of stack memory
fmt::stack_print<1024> print("Hello", "World");

// prints "100 and 0.5\nHello World\n"
auto format_to_cout = fmt::make_format_it(std::ostreambuf_iterator<char>(std::cout));
*format_to_cout++ = format;
*format_to_cout++ = '\n';
*format_to_cout++ = print;
*format_to_cout++ = '\n';

The first two structs mimic snprintf, and the third example introduces the formatting iterator that they use behind the scenes.

First, one clarification: I promised memory safety and no truncation, yet I am writing to a fixed-size buffer on the stack. I guarantee safety by having a heap fallback in there. As long as you don’t need the heap fallback, that code is much faster than snprintf. For the case where the heap fallback is used, I can construct cases where it’s faster than snprintf and cases where it’s slower. When I explain the concept of a heap fallback to C++ programmers, they inevitably get angry at me (even if it’s faster than snprintf), which is why it’s optional. The core of the library is the iterator that I use in the third example, which in the example writes straight to std::cout without formatting to temporary stack or heap memory.
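To make the heap fallback idea concrete, here is a minimal sketch. It is not the code from format_it: the class name fallback_buffer and all of its members are made up for illustration, and it simply leans on std::string for the fallback storage (which may itself keep very short strings inline).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration, not format_it's implementation.
// Text goes into N bytes of inline storage first and moves into a
// std::string only once the inline storage is full. N must be > 0.
template <std::size_t N>
class fallback_buffer
{
public:
    void push_back(char c)
    {
        if (on_heap_)
            heap_.push_back(c);
        else if (size_ < N)
            inline_[size_++] = c;
        else
        {
            // Inline storage is full: copy what we have so far and keep
            // appending to the fallback, so nothing is ever truncated.
            heap_.assign(inline_, size_);
            heap_.push_back(c);
            on_heap_ = true;
        }
    }

    void append(const char * text)
    {
        for (; *text != '\0'; ++text)
            push_back(*text);
    }

    bool used_fallback() const { return on_heap_; }

    std::string str() const
    {
        return on_heap_ ? heap_ : std::string(inline_, size_);
    }

private:
    char inline_[N];
    std::size_t size_ = 0;
    bool on_heap_ = false;
    std::string heap_;
};

// Usage: the first append fits inline, the second one triggers the fallback.
inline void fallback_buffer_example()
{
    fallback_buffer<8> buffer;
    buffer.append("short");
    assert(!buffer.used_fallback());
    buffer.append(" but growing");
    assert(buffer.used_fallback());
    assert(buffer.str() == "short but growing");
}
```

The fast path is a bounds check and a store into memory that is already there; the cost of the fallback is only paid by strings that outgrow the inline storage.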

Here’s another example:

std::vector<int> numbers = { 1, 10, 100 };
std::string a_string;
auto format_to_string = fmt::make_format_it(std::back_inserter(a_string));
// writes "{ 1, 10, 100 }" into the string
*format_to_string++ = numbers;
// also possible:
format_to_string.format("\n%0: %1\n", numbers.size(), numbers);
// that wrote "\n3: { 1, 10, 100 }\n" to the string. so the string is now
// { 1, 10, 100 }
// 3: { 1, 10, 100 }
//

This formats into a string. As you can see, the iterator has more functions than just the iterator assignment. The format() function is the core of the library. The stack_format<> struct from above just uses this. Besides format(), the other available functions are print(), print_separated() and printpacked():


// prints "1 2 3"
fmt::cout.print(1, 2, 3).print('\n');
// prints "1, 2, 3"
fmt::cout.print_separated(", ", 1, 2, 3).print('\n');
// prints "123"
fmt::cout.printpacked(1, 2, 3).print('\n');

Where fmt::cout is a format_it iterator that wraps std::cout.
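If you want a feeling for how "%0"-style formatting through an output iterator can work, here is a small sketch. It is an assumption-laden toy and not how format_it is implemented: write_formatted and to_text are hypothetical names, only the placeholders %0 to %9 are handled, and every argument takes a detour through std::ostringstream, which is exactly the kind of overhead the real library is trying to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn any streamable value into text.
template <typename T>
std::string to_text(const T & value)
{
    std::ostringstream stream;
    stream << value;
    return stream.str();
}

// Hypothetical sketch of positional formatting. Copies format to out and
// replaces "%0" .. "%9" with the argument at that index. A placeholder
// without a matching argument is copied through unchanged.
template <typename OutputIt, typename... Args>
OutputIt write_formatted(OutputIt out, const char * format, const Args &... args)
{
    const std::vector<std::string> texts = { to_text(args)... };
    for (const char * c = format; *c != '\0'; ++c)
    {
        if (*c == '%' && c[1] >= '0' && c[1] <= '9')
        {
            const std::size_t index = static_cast<std::size_t>(c[1] - '0');
            if (index < texts.size())
            {
                for (char ch : texts[index])
                    *out++ = ch;
                ++c; // skip the digit as well
                continue;
            }
        }
        *out++ = *c;
    }
    return out;
}

// Usage: any output iterator works, here a back_inserter into a string.
inline std::string write_formatted_example()
{
    std::string out;
    write_formatted(std::back_inserter(out), "%0 and %1", 100, 0.5f);
    assert(out == "100 and 0.5");
    return out;
}
```

Because the placeholders are numbered, the order in the format string does not have to match the order of the arguments.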

Since the iterator can write to everything, you might wonder why I even have the stack_format<> struct from my first example. The reason is simple: This has to be faster than snprintf. With all the downsides of std::stringstream, the main reason why C++ programmers don’t use it is that snprintf is faster. You can use format_it to write to a stringstream, but it would remain slower than snprintf. The default behavior, using stack_format<>, can be several times faster than snprintf, and it has to be for people to start using this.

You may have already noticed that this can print more than snprintf or std::ostream can print: It can print containers like the vector that I used in a previous example. It can also print vectors of vectors. If you include the “format_stl.hpp” header it can print all stl containers as well as std::pair and std::tuple. And it’s pretty easy to make it print your structs. But before going into extending format_it, I would like to illustrate one more benefit of it being an iterator: You can use the standard algorithms on it:

std::vector<int> numbers = { 1, 10, 100 };
// prints "{ 1, 10, 100 }"
fmt::cout.print(numbers);
// prints "110100"
std::copy(numbers.begin(), numbers.end(), fmt::cout);
// prints
// 1
// 10
// 100
std::copy(numbers.begin(), numbers.end(), fmt::with_separator(fmt::cout, "\n"));
// prints
// 1
// a
// 64
std::transform(numbers.begin(), numbers.end(), fmt::with_separator(fmt::cout, "\n"), &fmt::hex<int>);

This makes it pretty composable, but this is also one area where I’m not yet happy with what I have. I can explain why at the end.
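For readers who have not written an output iterator before, this is roughly the shape of an iterator that puts a separator between values. It is a sketch under my own assumptions and not the library's with_separator: separated_appender, join_lines and join_hex are hypothetical names, and the iterator only knows how to append to a std::string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: an output iterator that appends each assigned value
// to a std::string and writes a separator between (not after) the values.
class separated_appender
{
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    separated_appender(std::string & target, std::string separator)
        : target_(&target), separator_(std::move(separator))
    {
    }

    // Assigning through the iterator is what performs the write.
    template <typename T>
    separated_appender & operator=(const T & value)
    {
        std::ostringstream stream;
        stream << value;
        if (!first_)
            *target_ += separator_;
        first_ = false;
        *target_ += stream.str();
        return *this;
    }

    // Dereference and increment do nothing, as with std::back_insert_iterator.
    separated_appender & operator*() { return *this; }
    separated_appender & operator++() { return *this; }
    separated_appender & operator++(int) { return *this; }

private:
    std::string * target_;
    std::string separator_;
    bool first_ = true;
};

inline std::string to_hex_text(int value)
{
    std::ostringstream stream;
    stream << std::hex << value;
    return stream.str();
}

// Usage with standard algorithms.
inline std::string join_lines(const std::vector<int> & numbers)
{
    std::string out;
    std::copy(numbers.begin(), numbers.end(), separated_appender(out, "\n"));
    return out;
}

inline std::string join_hex(const std::vector<int> & numbers)
{
    std::string out;
    std::transform(numbers.begin(), numbers.end(), separated_appender(out, ", "), to_hex_text);
    assert(numbers.empty() || !out.empty());
    return out;
}
```

The "is this the first value" flag lives inside the iterator, so the algorithm that drives it needs no knowledge of separators at all.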

Another function that I use in there is fmt::hex, which prints an integer as hexadecimal. There’s also fmt::upperhex, fmt::oct and fmt::boolalpha. For floating point numbers there is fmt::precise_float, fmt::nodrift_float and fmt::float_as_fixed. There’s also fmt::pad_int and fmt::pad_float as well as fmt::pad_left, fmt::pad_right and fmt::pad_both (which work on all types). All of these do roughly what they say. precise_float lets you say how many digits you want; nodrift_float determines the necessary number of digits by itself.
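One common way to build helpers like these is a tiny wrapper type that carries the value together with the formatting intent. The sketch below shows that pattern on top of std::ostream; it is only a guess at the general idea, and hex_wrapper and as_hex are hypothetical names, not the types behind fmt::hex.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: remembers the value and that it should print as hex.
template <typename T>
struct hex_wrapper
{
    T value;
};

template <typename T>
hex_wrapper<T> as_hex(T value)
{
    return { value };
}

// Prints the wrapped value in hexadecimal, then restores the stream flags
// so that the next value is not affected.
template <typename T>
std::ostream & operator<<(std::ostream & out, hex_wrapper<T> wrapped)
{
    const std::ios_base::fmtflags saved = out.flags();
    out << std::hex << wrapped.value;
    out.flags(saved);
    return out;
}

inline std::string as_hex_example()
{
    std::ostringstream stream;
    stream << as_hex(100) << ' ' << 100;
    assert(stream.str() == "64 100");
    return stream.str();
}
```

The point of the wrapper is that the formatting choice travels with one value instead of becoming sticky state on the stream.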

But finally let’s talk about how to make it possible to print your struct. If your struct can print to a std::ostream, then format_it can already print it. Alternatively you can print your struct by writing a format() function in your namespace:

struct print_through_ostream
{
    int id = 0;
    std::string name = "hank";
};

std::ostream & operator<<(std::ostream & lhs, const print_through_ostream & rhs)
{
    return lhs << "id: " << rhs.id << ", name: " << rhs.name;
}

struct print_through_format_it
{
    int id = 0;
    std::string name = "hank";
};
template<typename C, typename It>
fmt::format_it<C, It> format(fmt::format_it<C, It> it, const print_through_format_it & self)
{
    return it.format("id: %0, name: %1", self.id, self.name);
    // alternatively:
    //return it.printpacked("id: ", self.id, ", name: ", self.name);
    // alternatively:
    // *it++ = "id: ";
    // *it++ = self.id;
    // *it++ = ", name: ";
    // *it++ = self.name;
    // return it;
}

std::string print_both(const print_through_ostream & first, const print_through_format_it & second)
{
    std::string out;
    auto it = fmt::make_format_it(std::back_inserter(out));
    it.print_separated('\n', first, second);
    return out;
}

In this example I use both the method of printing to std::ostream, and the method of formatting directly. In the template format_it<C, It> the argument C is the character type, meaning char or wchar_t or char32_t or whatever else you want to print as, and the argument It is the output iterator. You never need to know what your output iterator is. It could be a std::back_insert_iterator or a std::ostreambuf_iterator or just a char *. You simply need to forward this argument.

Printing to std::ostream has a virtual method call, but it is the only way to print structs that use the pimpl pattern. The fact that format_it supports this means that all of your old structs can already print. But I like to use the function that prints directly to format_it because you have all the format functions available. (That being said, I showed you how to make them available for a std::ostream in my first example: it’s just one (long) line of code.)

If you are not satisfied with either the format method or printing to std::ostream because you have some complex overload resolution problem, there is a third way: There is a struct called fmt::formatter<> which allows you to do SFINAE tricks. I try to avoid SFINAE because of the compilation time overhead and I don’t use it in the code that ships with the library. But struct overload rules are a bit more powerful than function overload rules, so if you want to use them feel free to specialize that struct.
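To illustrate what "struct overload rules are more powerful" means in practice, here is a sketch of the general technique: a struct template with a defaulted second parameter that partial specializations can constrain with std::enable_if. This shows the pattern only. The interface of the real fmt::formatter<> is not described in this post, so text_formatter and its to_text member are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical primary template: used when no specialization matches.
template <typename T, typename Enable = void>
struct text_formatter
{
    static std::string to_text(const T &) { return "<unprintable>"; }
};

// Partial specialization chosen via SFINAE for every integral type.
template <typename T>
struct text_formatter<T, typename std::enable_if<std::is_integral<T>::value>::type>
{
    static std::string to_text(const T & value) { return std::to_string(value); }
};

// Partial specialization chosen via SFINAE for every enum type.
template <typename T>
struct text_formatter<T, typename std::enable_if<std::is_enum<T>::value>::type>
{
    static std::string to_text(const T & value)
    {
        using underlying = typename std::underlying_type<T>::type;
        return "enum(" + std::to_string(static_cast<long long>(static_cast<underlying>(value))) + ")";
    }
};

// Single entry point that dispatches to whichever specialization matches.
template <typename T>
std::string format_value(const T & value)
{
    return text_formatter<T>::to_text(value);
}

enum class color { red = 0, green = 1 };
struct opaque {};

inline void text_formatter_example()
{
    assert(format_value(42) == "42");
    assert(format_value(color::green) == "enum(1)");
    assert(format_value(opaque{}) == "<unprintable>");
}
```

A whole family of types (all enums, all integers) is covered by one specialization each, which is awkward to express with plain function overloads.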

So in summary: Compared to snprintf this is

Faster
Memory Safe
Type Safe
Doesn’t Truncate
Extensible
Works with std::ostream
Composes better
Compiles more slowly

That last point is the main downside of this, but ultimately there isn’t much you can do: The reason for snprintf compiling faster is that it drops type information. And that’s why snprintf is the source of so many errors and why snprintf is not extensible.

The main benefit of this over stringstream is that it is faster and that you can use printf style syntax, which is useful in several situations. For example some people use this for translation, where you can have the string “Hello %0, how’s it going?” in English and “Hallo %0, wie gehts?” in German.

OK so why am I not happy with this yet? It’s not stl quality yet. One issue is how it doesn’t compose quite as well as I would like it to. For example there are four different ways to print values with a separator. Every one is used for different situations. I feel like there should be a better way though. And code should only be in the standard library if it composes well and is easily extensible. I think I already beat stringstream in that regard, but the iterator based approach could be more powerful still.

I also wanted to explore the idea of getting some simple scripting in there. I don’t remember where I got this from, but I saw that someone had support for “How is the Commander? Is %{0 male:he, female:she} recovering?” That is useful for translation because in German that would have to be “Wie geht es %{0 male:dem Kommandanten, female: der Kommandantin}? Erholt %{0 male:er, female:sie} sich?”, which for example the Mass Effect series always gets wrong. The printf style syntax already is a simple form of scripting (because you can move text around) and I feel like adding a tiny layer of text replacement on top of that could improve things a lot. This has to be in the core feature set because users can’t add this. But I need to come up with a good design first. Maybe text replacement is enough, but maybe users should be able to write their own macros. (And maybe I should skip all this because it should be handled at a higher level. But I have seen enough poor translations where a sentence was gender neutral in English but required a different sentence in German depending on whether your character was male or female that I think it would be valuable if this was a basic feature.)
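Since there is no design for this yet, the following is only one possible reading of the "%{0 male:he, female:she}" syntax, written as a standalone string function. Everything here is an assumption: the names expand_choices, choose_text and trim_left are hypothetical, arguments are plain strings, and the replacement texts may not contain ',' or '}'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

inline std::string trim_left(const std::string & text)
{
    const std::size_t first = text.find_first_not_of(' ');
    return first == std::string::npos ? std::string() : text.substr(first);
}

// body looks like "0 male:he, female:she". Returns the text whose key
// equals arguments[0], or an empty string if nothing matches.
inline std::string choose_text(const std::string & body, const std::vector<std::string> & arguments)
{
    const std::size_t space = body.find(' ');
    if (space == std::string::npos || space == 0)
        return "";
    std::size_t index = 0;
    for (std::size_t i = 0; i < space; ++i)
    {
        if (body[i] < '0' || body[i] > '9')
            return "";
        index = index * 10 + static_cast<std::size_t>(body[i] - '0');
    }
    if (index >= arguments.size())
        return "";
    std::size_t start = space + 1;
    for (;;)
    {
        const std::size_t comma = body.find(',', start);
        const std::string option = body.substr(start, comma == std::string::npos ? comma : comma - start);
        const std::size_t colon = option.find(':');
        if (colon != std::string::npos && trim_left(option.substr(0, colon)) == arguments[index])
            return trim_left(option.substr(colon + 1));
        if (comma == std::string::npos)
            return "";
        start = comma + 1;
    }
}

// Replaces every "%{...}" group in format with the chosen text.
inline std::string expand_choices(const std::string & format, const std::vector<std::string> & arguments)
{
    std::string out;
    std::size_t pos = 0;
    for (;;)
    {
        const std::size_t open = format.find("%{", pos);
        const std::size_t close = open == std::string::npos ? open : format.find('}', open);
        if (close == std::string::npos)
        {
            out.append(format, pos, std::string::npos);
            return out;
        }
        out.append(format, pos, open - pos);
        out += choose_text(format.substr(open + 2, close - open - 2), arguments);
        pos = close + 1;
    }
}

inline void expand_choices_example()
{
    const std::string english = "Is %{0 male:he, female:she} recovering?";
    assert(expand_choices(english, { "female" }) == "Is she recovering?");
}
```

Even this toy keeps the whole sentence in one translatable string, which is the property that matters for translators.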

Finally performance numbers. I ran each of these lines two million times to get nice big numbers:

Printing strings
fmt::stack_format<1024>("[%0, %1, %2]", "foo", "bar", "baz");	182 ms
snprintf(buffer, 1024, "[%s, %s, %s]", "foo", "bar", "baz");	408 ms
std::stringstream() << '[' << "foo" << ", " << "bar" << ", " << "baz" << ']';	1444 ms
Printing ints
fmt::stack_format<1024>("[%0, %1, %2]", i, i * 2, i / 2);	267 ms
snprintf(buffer, 1024, "[%d, %d, %d]", i, i * 2, i / 2);	397 ms
std::stringstream() << '[' << i << ", " << i * 2 << ", " << i / 2 << ']';	1598 ms
Printing floats
fmt::stack_format<1024>("[%0, %1, %2]", static_cast<float>(i), i * 2.0f, i / 2.0f);	1125 ms
snprintf(buffer, 1024, "[%g, %g, %g]", static_cast<float>(i), i * 2.0f, i / 2.0f);	3198 ms
std::stringstream() << '[' << static_cast<float>(i) << ", " << i * 2.0f << ", " << i / 2.0f << ']';	5473 ms

The floating point code is so much faster because I use Google’s double-conversion library. The other code is faster because I keep type information and because printf is such a huge beast: it handles a ton of possible flags and combinations. In my code I handle things like hexadecimal printing as extensions which don’t need to be part of the core format_it. I think printf could be sped up, but since it’s written in C it has to drop type information and can never be as fast as C++ code. The code used for the profiling is at the bottom of this file.
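If you want to reproduce this kind of measurement for the two standard library contenders, a harness along these lines is enough to get started. This is a sketch and not the profiling code that ships with the library: milliseconds_for, with_snprintf and with_stringstream are hypothetical names, and returning a std::string from each candidate adds a small cost of its own.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>

inline std::string with_snprintf(int i)
{
    char buffer[1024];
    std::snprintf(buffer, sizeof(buffer), "[%d, %d, %d]", i, i * 2, i / 2);
    return buffer;
}

inline std::string with_stringstream(int i)
{
    std::stringstream stream;
    stream << '[' << i << ", " << i * 2 << ", " << i / 2 << ']';
    return stream.str();
}

// Calls function(i) for i in [0, iterations) and returns the elapsed
// wall-clock time in milliseconds.
template <typename Function>
double milliseconds_for(int iterations, Function function)
{
    const auto start = std::chrono::steady_clock::now();
    std::size_t total_length = 0;
    for (int i = 0; i < iterations; ++i)
        total_length += function(i).size();
    const auto stop = std::chrono::steady_clock::now();
    // Store the result somewhere observable so the loop is not optimized away.
    volatile std::size_t sink = total_length;
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

inline void benchmark_example()
{
    // Both candidates must produce the same text for the comparison to be fair.
    assert(with_snprintf(10) == with_stringstream(10));
    std::printf("snprintf:     %.1f ms\n", milliseconds_for(2000000, with_snprintf));
    std::printf("stringstream: %.1f ms\n", milliseconds_for(2000000, with_stringstream));
}
```

The absolute numbers depend heavily on compiler, optimization level and standard library, so only compare results from the same build on the same machine.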

And once again here is the link to the library. Feel free to take this and to build something standard quality with this. I haven’t had time recently and I don’t think I will have time to finish this for at least the next year.

9 Comments to “format_it: Iterator based string formatting”

vondleblog says:

August 31, 2014 at 08:26

Awesome! It looks already like it is more idiomatic than streams. I’m interested in what kind of scripting solution you’ll come up with.

John Calsbeek (@jcalsbeek) says:

August 31, 2014 at 09:28

You’re comparing against glibc’s snprintf, right? I believe one of the reasons most snprintf implementations are slow is that they reuse the fprintf implementation by creating a buffer-backed FILE.

- Malte Skarupke says:
  
  August 31, 2014 at 12:50
  
  Yes, I compare against glibc, and yes, they write to a fake FILE object which redirects output to a buffer. So there is some more overhead in that. Still I think you’d run into language barriers if you tried to make snprintf as fast as this library. It’s the same problem as qsort vs std::sort. At some point you need to give the compiler more information about what’s going on if you want your code to be fast.
  
  That being said they could speed up their floating point code by using the same library that I use. I don’t remember the details because I wrote all of this quite a while ago, but there are basically two important algorithms for printing floats. One of them can handle all floats, the other one is faster but breaks on something like 0.1% of all possible values. glibc uses the algorithm that always works, Google’s library uses the faster version and detects failure and falls back to the slower version. Which seems smart and everyone should be doing that. My default float format is the same as “%g” from printf, and for that I had to write a wrapper around double-conversion that imitates printf behavior. They can use that code to switch more easily if they want to.
  
  I’m actually curious what the global energy savings are if something as fundamental as glibc improves performance slightly. I also seem to recall making a small optimization to double-conversion which I should probably give back to Google. They use this code in V8 so any performance gain there would also have big impacts. But it’s been too long and I don’t have a history of my changes…
  
  - John Calsbeek (@jcalsbeek) says:
    
    August 31, 2014 at 18:41
    
    Sure, snprintf discards type information, but that’s one switch statement over every possible character that can come after a %. That has to be weighed against the additional code bloat (which, in theory, is forcing potentially useful code out of the I$—but that assumes that the snprintf implementation is not disastrous in terms of I$).
    
    qsort calls the comparator n log n times. snprintf has to dispatch over the format specifier a very limited number of times.
    
    Of course there’s also probably a lot of cruft in the average snprintf, like the slow numeric formatting you mention. But it also has a lot going for it. Sure, it isn’t type-safe, but basically every compiler either has a warning or static analysis to check format strings. format_it is type-safe so at least you won’t crash, but it doesn’t statically verify that the number of parameters used in the format string matches the number of parameters passed. This is probably a better solution in a case where the format string is provided at runtime (like localization), but I’m not sure I’m ready to give up printf for regular untranslated text that my compiler can check.
    
    (I just want string interpolation! Is that too much to ask?!)
  - Malte Skarupke says:
    
    September 1, 2014 at 10:28
    
    OK yeah to be honest I never measured why snprintf is slow. It could be that you can make it equally fast without keeping type information. That being said the first benchmark, which just does string concatenation, is probably the one that comes closest to only measuring the overhead, and the difference there is huge.
    I stepped through it and there isn’t actually a lot of inlining going on, so it’s not that. It would be interesting to profile snprintf there and see where the time goes, but I’m afraid somebody else is going to have to do that since I don’t have the time.
    
    The static validation is a huge bonus for snprintf. That being said I think it’s inevitable that C++ is going to get better string formatting eventually. This one could be it but I don’t have the time to finish it. (this is going to follow the old 80/20 rule in terms of I’m going to have to spend four times more time on this to bring it to the necessary quality) And once this is in the C++ standard, the static validation will come.
    
    Funny story: In our codebase at work I actually turned off the static analysis for printf errors. We just had too many. And I wanted to get static analysis turned on for more important warnings so I had to turn off some warnings to finish in time. Also many of our strings actually get passed around a fair bit before they reach snprintf. All of those functions have to be annotated for the compiler to pick up errors, which we also haven’t done yet. I’ll eventually do that because printf errors are so easy to fix, but as always I lack time. That being said the sheer volume of errors that we had convinced me that snprintf is just too problematic.
    
    I also fixed errors like this:
    char buffer[32];
    sprintf(buffer, "%f", foo);
    which is non-obvious, and I don’t think static analysis picks this up. This code should have used snprintf in which case the error would have been truncation instead of stomping the stack, but in my opinion truncation is equally bad. Truncation errors can hide too easily.
    
    And my main argument to convince you to use this would be this: Think about how often you have written code for printing a Vector3. printf("(%g, %g, %g)", vec.x, vec.y, vec.z); Wouldn’t it be nice to only write this once? And after it was written once for the code to be automatically reusable so that if you have a vector of Vector3s, you can just print that? Also have you tried to print a matrix using printf? I have seen it. I bet if I were to do a search through our codebase at work I would find it at least five times. And every time it is ugly.
  - John Calsbeek (@jcalsbeek) says:
    
    September 1, 2014 at 20:54
    
    Yeah, use-printf-because-static-analysis is an inertia argument. Ideally “but compilers know how to warn about it” wouldn’t be a valid argument. But use-C++-because-it’s-industry-standard is also an inertia argument, so… time to label all the printf-style functions, hooray! But from another point of view, static analysis is a huge improvement over rewriting all the string formatting using a saner API: it finds all these old bugs in all this old code for free. It’s money under the couch cushions!
    
    Silent truncation isn’t great, I agree. I haven’t found a much better option yet, though… calling an “invalid parameter” function is all right if you remembered to override the default function so you actually get debuggable information. If you go with a heap fallback you aren’t in a memory-constrained environment, so why not just allocate on the heap to start with?
    
    Life would definitely be much better if there was a way to define printf formatters for different types. But there are still ways to avoid duplicating “(%f, %f, %f)” everywhere… you could have a simple FixedSizeString Printable(vec3) function, you just have an extra copy and lose a bit of syntactic sugar.
Vinipsmaker says:

September 3, 2014 at 22:36

I gave my feedback at reddit. I hope you are following discussion there: http://www.reddit.com/r/cpp/comments/2f2r8h/format_it_iterator_based_string_formatting/ck8qa4c

- Malte Skarupke says:
  
  September 4, 2014 at 18:47
  
  Hi, thanks for the feedback. It’s really good feedback :-).
  
  I was reading the discussion on reddit but I probably won’t respond to anything on there. The reason is simple: A few years ago I found myself wasting too much time on reddit, so now I try to stay away. I’m not fanatical about this so I check in every now and then, but I try to go there rarely. Having an active account wouldn’t help that.
  
  That being said responses:
  
  “I loved what you did. Keep going, please.”
  
  I can’t. I have no time. And I won’t for at least the next year. That being said I think this is in a pretty good state where I prefer this over printf and stringstream. But my goal was to get this into the standard and it’s not there yet.
  
  “I understand input string is not always captured at compile time and this is required to some forms of translatable strings that aren’t available at compile-time, but do you think it would be possible to add some template trick to guess the optimum value for stack_print?”
  
  It’s possible, but I’ve been bit by long compile times too often. Especially for something like this which will be used many times in a large codebase. I purposely don’t use any SFINAE in this. What you’re asking for doesn’t require SFINAE, it would be enough to just add the maximum length of all arguments of a known type. (Meaning start with the length of the input const char[], then add 20 for each integer etc) But I don’t know what the impact of that on compile time is and it just seems like something that could lead to unexpected slowdowns. (actually the current design is already worse for compile time than I want, so I’ll have to improve it before I make it worse again)
  Also stack memory is free. Just use one megabyte if you’re worried about hitting the heap fallback. I’ve found that if your string is long enough where you do need the heap fallback, the heap fallback is not what slows you down, so don’t worry about it.
  
  “I suggest you forget about the scripting idea. Translatable strings should be entire sentences. And take a look at the gettext manual on plural forms.”
  (sorry, I lost the links while copy pasting)
  
  I read those linked texts as arguments for scripting. At least text replacement scripting solves the gender issue. The number issue is more complicated. To solve that for all languages you would need real scripting like they have, I don’t want that for this, but if I allowed people to write their own macros they could provide the same solution. But even for the number case text replacement gets you pretty far. If you allowed regexp based text replacement you could handle all the cases that they handle but now we’re going crazy. User provided macros seem better.
  There is a real problem here in that I’ve never actually worked with translations of programs. I just went by poor translations that I had seen. And those suggest that if there’s one line in english, there will be one line in German. So it would be good if that one line was a bit more powerful.
  And something like this needs to be solved if I want this at a high quality. If you put out a library that has poor translation support, there are going to be poor translations. If you don’t support translations at all then people will fall back to printf for any text that the user sees, which defeats the whole point of this.
  
  “Is there a delimiter to indicate the end of the numeric arg? I mean, “%0 %{1}0 %2”, “a”, “b”, “c” should print “a b0 c”. Is there something similar? Do not use a unreliable design like bash.”
  
  I actually don’t support the “%{1}” syntax yet. I was going to add curly braces for the scripting solution. So you actually can’t do what you want to do. There is no way to append a 0 to a formatted value. But that’s another good reason to add the curly brace syntax.
  That being said I think I will keep the “%1” syntax for the simple case. I can’t think of any security issues that this would cause because I never use user-provided strings to determine what the number is. Meaning I only use the numbers that are in the format string initially. Or am I missing a clever exploit?
  
Vinipsmaker says:

September 4, 2014 at 20:17

I didn’t mention before, but I also liked the fmt::hex and family. It keeps composability and kind of cleanly separates string conversion and string format. There are a number of formatting string libraries that love to mix the job of string-conversion and string formatting.

> I can’t. I have no time. And I won’t for at least the next year.

Be sure to add your library to the following list. Maybe it will give you more visibility (and maybe another programmer can continue your work).

> But my goal was to get this into the standard and it’s not there yet.

I also want the glory to have text contributed to a published C++ standard. *Maybe* I’ll help you. And I just loved your design.

> I read those linked texts as arguments for scripting. At least text replacement scripting solves the gender issue. The number issue is more complicated. To solve that for all languages you would need real scripting like they have, I don’t want that for this, but if I allowed people to write their own macros they could provide the same solution. But even for the number case text replacement gets you pretty far. If you allowed regexp based text replacement you could handle all the cases that they handle but now we’re going crazy. User provided macros seem better.
> There is a real problem here in that I’ve never actually worked with translations of programs. I just went by poor translations that I had seen. And those suggest that if there’s one line in english, there will be one line in German. So it would be good if that one line was a bit more powerful.
> And something like this needs to be solved if I want this at a high quality. If you put out a library that has poor translation support, there are going to be poor translations. If you don’t support translations at all then people will fall back to printf for any text that the user sees, which defeats the whole point of this.

Well, programmers need a library to format strings. Your library does that. Your library does not solve the problem for translatable applications (nor does it need to). Localizing applications is another layer to work on, and it is not the same problem as formatting strings. If you want to extend your library to also handle translations, remember to stick to the most important rule: “Translatable strings should be entire sentences”. The library is not of much help if the programmers themselves do a poor job.

That said, the scripting proposal you mentioned/created is the first and only one I like. It keeps the entire context in one string and, if used well, can actually help programmers and translators.

> That being said I think I will keep the “%1” syntax for the simple case. I can’t think of any security issues that this would cause because I never use user-provided strings to determine what the number is. Meaning I only use the numbers that are in the format string initially. Or am I missing a clever exploit?

Then I think you’re good. Just one question: how does the error handling happen? Exceptions?

Probably Dance