True advances in technology are rare. The expense and difficulty of launching brand-new initiatives mean that companies tend to prefer iterative improvements. Every now and then, however, we get the best of both worlds: an iterative improvement that could deliver enormous gains to a wide slice of the consumer market. At Hot Chips, Samsung unveiled a pair of initiatives that could revolutionize computer memory by pushing High Bandwidth Memory further on the one hand, while cutting costs and introducing the technology to all-new markets on the other.
Let’s take them one at a time.
Low-cost HBM clears the path for less expensive devices
As we’ve discussed previously, HBM stacks memory chips on top of each other around a central core. The dies in each stack are connected by wires that run through each memory die (these are called through-silicon vias, or TSVs), and the entire chip structure sits on an interposer layer. The resulting configuration is sometimes referred to as a 2.5D architecture. The advantage is vastly increased memory bandwidth and much lower power consumption compared with GDDR5. The disadvantage is cost. While HBM proved competitive with GDDR5 at high frequencies and loadouts, the technology is currently limited to the top of the graphics market. AMD’s upcoming Vega is expected to use HBM rather than GDDR5X, but that chip will target the $300+ segment.
Samsung is proposing a low-cost HBM that would reduce costs in multiple ways. The number of connections per die would shrink, reducing the number of vias required for each chip. The company wants to replace the large silicon interposer with an organic layer, and believes it can cut costs by removing the on-die buffer as well (how this would impact the overall design remains uncertain). While the resulting HBM variant would have less overall bandwidth than HBM2, Samsung believes it can compensate by increasing the clock rate (presumably without compromising HBM’s overall design, which emphasizes low clock rates and extremely wide buses).
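To make the width-versus-clock trade-off concrete, here is a minimal sketch of the usual peak-bandwidth arithmetic: interface width times per-pin data rate, divided by eight bits per byte. The HBM2 figures (a 1024-bit interface at 2 Gb/s per pin) are the commonly quoted ones. The low-cost figures are purely hypothetical, chosen only to illustrate the idea, and are not numbers from Samsung's presentation. The function name is likewise an illustration.

```python
def stack_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s: pins * Gb/s per pin / 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8


# HBM2 as commonly quoted: 1024-bit interface at 2 Gb/s per pin.
hbm2 = stack_bandwidth_gbs(1024, 2.0)

# Hypothetical low-cost variant (assumed values, for illustration only):
# half as many data pins, each running 50% faster.
low_cost = stack_bandwidth_gbs(512, 3.0)

print(f"HBM2 stack:          {hbm2:.0f} GB/s")
print(f"Low-cost HBM sketch: {low_cost:.0f} GB/s")
```

Under these assumed numbers, halving the pin count while raising the per-pin rate by half recovers three quarters of HBM2's per-stack bandwidth, which is the kind of compromise the proposal describes.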
If successful, this low-cost HBM could drive the memory into markets where it can’t currently compete, including low-end graphics cards and the APU market. Right now, Intel has a potent GPU competitor with its Crystal Well, which puts 64-128MB of EDRAM on-package with the CPU. AMD doesn’t really have an answer to Crystal Well at present, and the company’s on-die graphics are already bandwidth-limited. One potential solution is to adopt HBM for APUs and offer a chip with a unified memory pool for CPU and GPU in a single package — but that can only happen if HBM prices drop enough to justify its inclusion. Any push to cut these costs could result in a much improved HBM technology deploying on APUs and other types of SoCs. But it’s not clear how power consumption would compare with other low-power technologies or whether or not we’d see the technology in 15-25W laptops.
HBM3: More capacity, more bandwidth
Samsung’s HBM3 is a straightforward improvement on HBM2 that would debut in 2019 or 2020 and offer higher densities, higher stacks (more RAM per chip, more chips per stack), and 2x the maximum bandwidth of HBM2. The goal is to reduce the core voltage (currently 1.2V) and the I/O signaling power, according to Ars Technica, while improving maximum performance.
HBM3 could allow for 64GB of memory on-package and 512GB/s of memory bandwidth per stack. A four-stack HBM3 configuration would offer 2048GB/s of memory bandwidth in aggregate, compared with 1024GB/s for HBM2 and 512GB/s for first-generation HBM (all figures assume a four-stack configuration). This kind of increase would give graphics cards or other peripherals far more memory and bandwidth than even the highest-end cards offer today and could be critical to driving next-generation VR systems.
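The aggregate figures follow directly from the per-stack numbers. As a quick sketch, the snippet below multiplies per-stack bandwidth by the stack count. The HBM3 per-stack value is the projection quoted above. The HBM and HBM2 per-stack values are simply the four-stack totals divided by four, and the names used are illustrative.

```python
# Per-stack bandwidth in GB/s. The HBM3 value is the projected figure;
# the HBM and HBM2 values are the four-stack totals divided by four.
PER_STACK_GBS = {"HBM": 128, "HBM2": 256, "HBM3": 512}


def aggregate_bandwidth(per_stack_gbs: int, stacks: int = 4) -> int:
    """Total bandwidth in GB/s for a package carrying `stacks` memory stacks."""
    return per_stack_gbs * stacks


totals = {gen: aggregate_bandwidth(bw) for gen, bw in PER_STACK_GBS.items()}
for gen, total in totals.items():
    print(f"{gen}: {total} GB/s across four stacks")
```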
The memory industry, however, isn’t as unified on HBM as you might think. As Anandtech details, both Micron and Samsung unveiled proposals for next-generation desktop and graphics memory (DDR5 and GDDR6, respectively). Xilinx is more commonly associated with FPGAs than with RAM, but the company used its own presentation to discuss how proper cooling technology is essential to large-scale die stacking and to call for the development of materials that can operate well at higher temperatures.
While many of these proposals are just that — proposals — they point the way to a potential revolution in gaming and high-end applications, while reduced cost and lower power options could extend those revolutions into form factors and power envelopes they currently can’t touch.