Each AMD Opteron core has a floating point addition unit and a floating point multiplication unit. The two units are independent of each other, so an addition and a multiplication can take place simultaneously, and each unit can complete one floating point operation per cycle. At the clock speed of 2.8 GHz this gives a theoretical peak performance of 2 * 2.8 = 5.6 Gflops per core, or 11.2 Gflops per dual-core processor, for double precision floating point operations.
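The peak figure is simply the clock rate multiplied by the number of floating point results completed per cycle. The short Python sketch below reproduces the arithmetic; the function name peak_gflops and its parameters are illustrative only and are not part of any HECToR tool.

```python
def peak_gflops(clock_ghz, flops_per_cycle, cores=1):
    """Theoretical peak in Gflops: clock rate (GHz) x flops per cycle x cores."""
    return clock_ghz * flops_per_cycle * cores


# One add and one multiply can complete each cycle, so 2 flops per cycle.
per_core = peak_gflops(2.8, 2)
per_processor = peak_gflops(2.8, 2, cores=2)

print(f"{per_core:.1f} Gflops per core")                      # 5.6 Gflops per core
print(f"{per_processor:.1f} Gflops per dual-core processor")  # 11.2 Gflops per dual-core processor
```

This is an upper bound: it is reached only when every cycle issues both an addition and a multiplication, which real codes rarely sustain.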
The caches on each core are private. Unlike many systems, HECToR has no caches shared between cores. Each core has its own 2-way set associative level 1 cache of 64 kB. The level 2 cache is a 16-way set associative combined data and instruction cache with a capacity of 1 MB. Both the level 1 and level 2 caches use 64-byte cache lines, each holding eight double precision words. The level 2 cache acts as a victim cache for the level 1 cache, which means that data evicted from the level 1 cache is placed in the level 2 cache.
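The cache parameters above fix the number of lines and sets in each cache, which is useful when reasoning about conflict misses. The sketch below derives them, assuming that 1 kB = 1024 bytes, 1 MB = 1024 kB and a double precision word is 8 bytes; the helper name cache_geometry is hypothetical.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (number of cache lines, number of sets) for a set associative cache."""
    lines = size_bytes // line_bytes
    sets = lines // ways
    return lines, sets


LINE_BYTES = 64    # cache line size for both levels
DOUBLE_BYTES = 8   # assumed size of a double precision word

l1_lines, l1_sets = cache_geometry(64 * 1024, LINE_BYTES, ways=2)
l2_lines, l2_sets = cache_geometry(1024 * 1024, LINE_BYTES, ways=16)

print(LINE_BYTES // DOUBLE_BYTES)  # 8 doubles per cache line
print(l1_lines, l1_sets)           # 1024 512
print(l2_lines, l2_sets)           # 16384 1024
```

Under these assumptions, two addresses compete for the same level 1 set when they differ by a multiple of 512 * 64 bytes = 32 kB, and only two such lines can be resident at once.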