Increases processing speed while reducing energy consumption

For decades, computer chips have increased efficiency by using “caches,” small, local memory banks that store frequently used data and cut down on time- and energy-consuming communication with off-chip memory.

Today’s chips generally have three or even four different levels of cache, each of which is more capacious but slower than the last. The sizes of the caches represent a compromise between the needs of different kinds of programs, but it’s rare that they’re exactly suited to any one program.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have designed a system that reallocates cache access on the fly, to create new “cache hierarchies” tailored to the needs of particular programs.

The researchers tested their system on a simulation of a chip with 36 cores, or processing units. They found that, compared to its best-performing predecessors, the system increased processing speed by 20 to 30 percent while reducing energy consumption by 30 to 85 percent.

“What you would like is to take these distributed physical memory resources and build application-specific hierarchies that maximize the performance for your particular application,” says Daniel Sanchez, an assistant professor in the Department of Electrical Engineering and Computer Science (EECS), whose group developed the new system.

“And that depends on many things in the application. What’s the size of the data it accesses? Does it have hierarchical reuse, so that it would benefit from a hierarchy of progressively larger memories? Or is it scanning through a data structure, so we’d be better off having a single but very large level? How often does it access data? How much would its performance suffer if we just let data drop to main memory? There are all these different tradeoffs.”

Sanchez and his coauthors — Po-An Tsai, a graduate student in EECS at MIT, and Nathan Beckmann, who was an MIT graduate student when the work was done and is now an assistant professor of computer science at Carnegie Mellon University — presented the new system, dubbed Jenga, at the International Symposium on Computer Architecture last week.

Staying local

For the past 10 years or so, improvements in computer chips’ processing power have come from the addition of more cores. The chips in most of today’s desktop computers have four cores, but several major chipmakers have announced plans to move to six cores in the next year or so, and 16-core processors are not uncommon in high-end servers. Most industry watchers assume that the core count will continue to climb.

Each core in a multicore chip usually has two levels of private cache. All the cores share a third cache, which is actually broken up into discrete memory banks scattered around the chip. Some new chips also include a so-called DRAM cache, which is etched into a second chip that is mounted on top of the first.