Making efficient use of multiple processing cores in a symmetric multiprocessing (SMP) environment can be tricky, but Intel's Threading Building Blocks (TBB) aims to ease the challenge.
TBB is a library of C++ templates that provides thread-safe parallel control and data objects suitable for writing efficient parallel algorithms. Compared with approaches like OpenMP, these class templates make it straightforward to map conventional loop constructs, such as for and while, and data structures, such as vectors and queues, onto TBB objects that operate in parallel. The programmer does not have to manage threads explicitly.
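To make the idea of mapping a loop onto a template concrete, here is a minimal sketch in standard C++. It is not TBB code: `parallel_for_index` and `double_all` are hypothetical names invented for this illustration, and the helper simply gives each thread one contiguous block of the index range. TBB's own loop template, `tbb::parallel_for`, plays the corresponding role in the real library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parallel loop template (not TBB's API):
// calls body(i) for every i in [first, last), one block of indices per thread.
template <typename Body>
void parallel_for_index(std::size_t first, std::size_t last, Body body) {
    assert(first <= last);

    // hardware_concurrency() may return 0 when the core count is unknown.
    unsigned threads = std::thread::hardware_concurrency();
    if (threads == 0) threads = 1;

    const std::size_t total = last - first;
    const std::size_t block = (total + threads - 1) / threads;  // round up

    std::vector<std::thread> pool;
    for (std::size_t begin = first; begin < last; begin += block) {
        const std::size_t end = std::min(begin + block, last);
        // The threads are created and joined here, not in the caller's code.
        pool.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();
}

// Serial original:  for (std::size_t i = 0; i < v.size(); ++i) v[i] *= 2;
// Parallel mapping: the loop body becomes a callable passed to the template.
void double_all(std::vector<int>& v) {
    parallel_for_index(0, v.size(), [&v](std::size_t i) { v[i] *= 2; });
}
```

The caller writes only the loop body. Each index is touched by exactly one thread, so no locking is needed in this particular example.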
TBB doesn't create a plethora of threads. Instead, it creates one worker thread per logical core. (A hyperthreading core may support two or more logical threads.) Each worker thread manages its own work queue (see Figure). The main thread creates TBB objects, such as a vector, and performs operations on them, such as scanning all of the vector's entries.
In this case, the operation is ordinary program code. The vector is divided into chunks, each covering a subset of its entries, and every chunk refers to that same operation. The chunks are spread across the work queues of the worker threads. An idle worker thread takes the next chunk from the queue and performs the operation on that chunk's block of data.
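The chunking scheme described above can be sketched in standard C++ as follows. This is an illustration under simplifying assumptions, not TBB's implementation: `chunked_sum` is a hypothetical helper, and a single shared atomic counter stands in for the per-thread work queues, so every idle worker claims its next chunk from one common pool.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper, not part of TBB: sums `data` by splitting it into
// fixed-size chunks that idle worker threads claim one at a time.
long long chunked_sum(const std::vector<int>& data, std::size_t chunk_size) {
    assert(chunk_size > 0);

    // One worker per logical core; fall back to one if the count is unknown.
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 1;

    const std::size_t chunk_count = (data.size() + chunk_size - 1) / chunk_size;
    std::atomic<std::size_t> next_chunk{0};      // simplified shared "queue"
    std::vector<long long> partial(workers, 0);  // one result slot per worker

    auto worker = [&](unsigned id) {
        long long local = 0;
        for (;;) {
            // Claim the next unprocessed chunk; stop when none remain.
            const std::size_t c = next_chunk.fetch_add(1);
            if (c >= chunk_count) break;
            const std::size_t begin = c * chunk_size;
            const std::size_t end = std::min(begin + chunk_size, data.size());
            for (std::size_t i = begin; i < end; ++i) local += data[i];
        }
        partial[id] = local;  // each worker writes only its own slot
    };

    std::vector<std::thread> pool;
    for (unsigned id = 0; id < workers; ++id) pool.emplace_back(worker, id);
    for (auto& t : pool) t.join();

    // Combine the per-worker results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Because workers claim chunks on demand, a thread that finishes early simply takes more chunks, which keeps the cores busy even when chunks take uneven amounts of time. The chunk size is a tuning choice: smaller chunks balance the load better but add more claiming overhead.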
This approach resembles that of distributed Web-based servers such as Java 2 Enterprise Edition (J2EE), except that J2EE fills the queue with incoming Web requests. In general, TBB is much more data-structure-driven, whereas J2EE is more I/O-oriented. Likewise, J2EE threads typically are allocated based on the estimated I/O load rather than on the number of cores in the system. That's because J2EE threads often have to wait on other operations, while a TBB thread has all of its data available when it processes a chunk.
Complementary products for Linux with 64-bit support include Thread Checker and Thread Profiler. These tools can be used with TBB, but they're just as useful for any multithreaded application. Thread Checker uses binary instrumentation and can automatically find problems such as race conditions and deadlocks. New command-line support is ideal for batch-mode and regression testing. Thread Profiler presents TBB information in a more abstract fashion commensurate with TBB's source abstraction.
TBB costs $299. It's available for Windows, Linux, and Mac operating systems. Thread Checker costs $999, and Thread Profiler is $299.