Jan 2nd, 2014

Here’s the graph of a toy benchmark [1] of page-aligned vs. misaligned accesses; it shows the ratio of performance between the two at different working set sizes. If this benchmark seems contrived, it actually comes from a real-world example of the disastrous performance implications of using nice power-of-2 alignment, or page alignment, in an actual system [2].

Except for very small working sets (1-8), the unaligned version is noticeably faster than the page-aligned version, and there’s a large region up to a working set size of 512 where the ratio in performance is somewhat stable, though more so on our Sandy Bridge chip than on our Westmere chip.

To understand what’s going on here, we have to look at how caches organize data. By way of analogy, consider a 1,000 car parking garage that has 10,000 permits...
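
As a rough sketch of why alignment can matter at all, we can do the set-index arithmetic for a hypothetical cache. The parameters below (64-byte lines, 64 sets, 4 KiB pages) are assumptions chosen to resemble a 32 KiB, 8-way L1 data cache. The "unaligned" stride of one page plus one cache line is also just an illustrative choice and not necessarily what the benchmark uses. The names `cache_set` and `sets_touched` are made up for this example.

```python
# Assumed cache geometry (illustrative, not measured from any particular chip).
LINE_SIZE = 64    # bytes per cache line
NUM_SETS = 64     # 32 KiB / (64-byte lines * 8 ways)
PAGE_SIZE = 4096  # bytes per page


def cache_set(addr):
    """Return the set an address maps to in a simple set-indexed cache."""
    return (addr // LINE_SIZE) % NUM_SETS


def sets_touched(stride, count):
    """Return the distinct sets hit by `count` accesses spaced `stride` bytes apart."""
    return {cache_set(i * stride) for i in range(count)}


# Page-aligned accesses: every address is a multiple of 4096.
aligned = sets_touched(PAGE_SIZE, 64)

# Hypothetical misaligned accesses: each one is shifted by one extra cache line.
unaligned = sets_touched(PAGE_SIZE + LINE_SIZE, 64)

print(len(aligned), len(unaligned))  # prints "1 64"
```

Under these assumptions, all 64 page-aligned accesses compete for a single set, while the shifted accesses spread across all 64 sets. That's the kind of crowding the parking garage analogy is getting at.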