There comes a time in most complex programs where you want to ask a simple question like, ‘have I already processed a string with this id’? Linear searches through an array are easy to write and work well enough for small array sizes. Plus, the memory overhead of linear searches is fantastic, since it basically has none. But when your arrays can contain many elements, it is time to ditch those linear searches and go with an ordered map or unordered map.
This information can already be found easily, so I’ll just sum this up. As far as your C++ STL is concerned a map is, by default, ordered. That is to say that it will probably implement a tree structure internally and represent your elements in alphabetical order. Each time you insert an element, an algorithm figures out where the element belongs and stores it there, and possibly performs a tree rebalance. Conversely, when looking up an element in a map, the tree is traversed until the element is found. The more elements in the map, the longer it takes to insert a new element, or look up an old one. Both of these take time to complete, but the advantage of ordered maps is that you can iterate over them in alphabetical order easily.
Unordered maps are a whole different beast. Instead of a tree structure, they use a hash table. That means they have constant insertion and lookup times per element. The only downside is that you can’t directly iterate over the elements in alphabetical order. Of course, that’s only a downside if you need that feature. Okay, enough of the differences, let’s get into the performance numbers!
The first two graphs are completed by inserting a random alphanumeric string between 10 and 20 characters long. These strings are already computed and stored in an array so as to not interfere with the map performance measurements. The CPU is an i7-930, running Windows 7, so these are with the STL supplied with Visual Studio 2010.
As you can see, using the unordered_map is substantially faster than the map implementation, even for small numbers of elements. I also graphed the insertion performance on a per element basis with a logarithmic scale for number of elements in the test. Notice that as the regular map contains more elements, the insertion performance becomes slower. At 8M elements, the cost to insert into a map is 4x that of inserting into an unordered map.
There is no point of using a map unless you’re planning on eventually looking up elements, or at least seeing if they exist. I performed lookups on maps and unordered maps with different numbers of elements already inserted into them. I also performed the test using random alphanumeric strings between 40 and 80 characters long. There is a penalty for performing a hash, and also testing alphabetical order of two strings.
The result is surprising. The unordered_map took slightly longer to compute a hash needed to lookup each element in the hash table, but the ordered map suffered more greatly. Whatever the case may be, I’ve shown that unordered maps are several times faster to insert elements into, and lookup elements. Even for lists of only a million elements, ordered maps are twice as slow to insert into, and even slower to lookup elements. So next time you decide between maps and unordered_maps, ask yourself if keeping the list alphabetical really worth the performance hit. If you don’t need to iterate alphabetically, you’re much better off using an unordered map.