AMD’s Flagship RDNA3 GPU Rumored to Feature 384-bit Memory Bus
As we head into summer, more information about AMD’s upcoming GPU architecture is finally coming to light. So far there hasn’t been much to peruse, despite a deluge of leaks about Nvidia’s plans. To rectify this imbalance, internet sleuths have been poring over AMD’s drivers looking for any morsel of info they can find. They recently hit pay dirt courtesy of AMD’s Linux GPU drivers. AMD has since patched the information out of the driver, seemingly verifying it in the process.
A Twitter user named Kepler (ironically, the name of an older Nvidia GPU architecture) was first to spot the detail. The driver contained a line labeled MCD_INSTANCE_NUM with a value of six, which seems to confirm six memory controllers. Assuming each one is 64 bits wide, that adds up to a 384-bit memory bus. This is an upgrade over the 256-bit bus on AMD’s flagship RDNA2 GPUs, the RX 6900 XT and RX 6950 XT. What’s interesting is AMD put those GPUs up against Nvidia’s RTX 3090, which has a 384-bit bus. AMD explained a wider bus wasn’t necessary, as it had a trick up its sleeve: Infinity Cache. Overall, AMD was right. It was able to go toe-to-toe with Nvidia in rasterization this round. Despite reaching parity with its rival, it seems AMD isn’t taking any chances with RDNA3. According to Videocardz, AMD replaced the line of code with different text a week later. As always, deleting the offending text just heightens the intrigue.
This leak seems to confirm the previous speculation about the design of the chip as well. It’s long been rumored to be a seven-chiplet GPU. That means a main graphics chiplet and six multi-cache dies, or MCDs. This could mean it will sport as much as 192MB of Infinity Cache, assuming 32MB per die. Kepler also predicts AMD could use 3D stacking on its flagship GPU, doubling that number to 384MB. If so, that would mark a radical boost in the amount of Infinity Cache it’s using. The current RX 6950 XT has just 128MB.
Using the 6950 XT as a benchmark, we can also expect raw memory bandwidth to be 50 percent higher for RDNA3. If it uses the same 18Gbps GDDR6 as the current GPU, it would be capable of 864GB/s, compared to the 6950 XT’s 576GB/s maximum. That figure doesn’t take into account the benefits of Infinity Cache, either. With those included, an RDNA3 GPU could easily achieve 1TB/s of effective memory bandwidth. This would match the memory bandwidth of Nvidia’s RTX 3090 Ti.
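For readers who want to check the arithmetic, here is a minimal Python sketch of the peak-bandwidth calculation. The 64 bits per MCD and the 18Gbps data rate are the rumor’s assumptions rather than confirmed specifications, and the function and variable names are purely illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each pin transfers data_rate_gbps gigabits per second; dividing by 8
    converts bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


MCD_INSTANCE_NUM = 6   # value reportedly found in the Linux driver
BITS_PER_MCD = 64      # assumption: one 64-bit controller per MCD
GDDR6_GBPS = 18        # assumption: same memory speed as the RX 6950 XT

rdna3_bus_bits = MCD_INSTANCE_NUM * BITS_PER_MCD
rdna3_bw = peak_bandwidth_gb_s(rdna3_bus_bits, GDDR6_GBPS)
rx6950xt_bw = peak_bandwidth_gb_s(256, GDDR6_GBPS)

print(f"Rumored RDNA3: {rdna3_bus_bits}-bit bus, {rdna3_bw:.0f} GB/s")
print(f"RX 6950 XT:    256-bit bus, {rx6950xt_bw:.0f} GB/s")
print(f"Ratio:         {rdna3_bw / rx6950xt_bw:.2f}x")
```

These are raw DRAM numbers only. They say nothing about the effective bandwidth once Infinity Cache hits are factored in.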
One potential explanation for AMD’s bandwidth boost lies in the overall size of the GPU. Top-end RDNA3 cards have been rumored to field up to 12,288 cores. The top-end Radeon RX 6950 XT fields 128MB of L3 cache to back up 5,120 GPU cores. If AMD bumps core counts this high, even a 192MB L3 cache might not be sufficient. A 384MB L3 would increase the amount of L3 available per core, while a 192MB L3 would represent a decrease.
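The cache-per-core comparison can be worked out in a few lines of Python. This is only a sketch built on the rumored figures above (12,288 cores, 192MB or 384MB of L3); none of those RDNA3 numbers are confirmed, and the names used here are illustrative.

```python
def cache_kb_per_core(cache_mb: int, cores: int) -> float:
    """Infinity Cache (L3) capacity per GPU core, in KB."""
    return cache_mb * 1024 / cores


# (cache in MB, core count); the RDNA3 entries are rumored values only
configs = {
    "RX 6950 XT": (128, 5120),
    "RDNA3, 192MB (rumored)": (192, 12288),
    "RDNA3, 384MB stacked (rumored)": (384, 12288),
}

per_core = {name: cache_kb_per_core(mb, cores) for name, (mb, cores) in configs.items()}

for name, kb in per_core.items():
    print(f"{name}: {kb:.1f} KB of L3 per core")
```

Per-core capacity is a crude yardstick, since cache hit rates depend on resolution and workload rather than core count alone, but it shows why 192MB would not keep pace with the rumored jump in cores.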
Tests of AMD’s memory bandwidth have consistently shown that Infinity Cache does reduce pressure on memory bandwidth, so regardless of how much cache AMD fields, one thing is clear: If these rumors are true, the company decided it needed to use both memory bandwidth and Infinity Cache to catch up with Nvidia’s overall performance rather than substituting one for the other.
For its part, Nvidia is also rumored to be increasing the cache sizes on its upcoming Ada Lovelace GPUs. Previous reports indicated Nvidia would be bumping L2 amounts by 16x, at least on some models. It’s speculated to be adding 16MB of L2 per 64-bit memory controller, for a total of 96MB. Its current GA102 die pairs just 512KB of L2 with each 32-bit memory controller, for 6MB in total. This would mark a significant increase in L2 amounts, as Nvidia attempts to blunt AMD’s cache offensive.
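The 16x figure follows from the per-controller numbers. The sketch below assumes the rumored Ada flagship keeps GA102’s 384-bit bus, which is what a 96MB total at 16MB per 64-bit controller implies; the helper name is hypothetical.

```python
KB_PER_MB = 1024


def total_l2_mb(bus_width_bits: int, controller_width_bits: int,
                l2_kb_per_controller: int) -> float:
    """Total L2 in MB, given the amount of L2 attached to each memory controller."""
    controllers = bus_width_bits // controller_width_bits
    return controllers * l2_kb_per_controller / KB_PER_MB


BUS_WIDTH_BITS = 384  # GA102's bus width; assumed unchanged for the rumored Ada die

ga102_l2 = total_l2_mb(BUS_WIDTH_BITS, 32, 512)            # 512KB per 32-bit controller
ada_l2 = total_l2_mb(BUS_WIDTH_BITS, 64, 16 * KB_PER_MB)   # rumored 16MB per 64-bit controller

print(f"GA102:       {ga102_l2:.0f} MB of L2")
print(f"Rumored Ada: {ada_l2:.0f} MB of L2 ({ada_l2 / ga102_l2:.0f}x)")
```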
As always, we will have to wait and see where the chips fall when these two titanic GPUs go head-to-head later this year. What’s especially interesting this time around is both companies are using the same TSMC N5 process. This will make for an unprecedented battle of MCM versus monolithic designs on the same fabrication node. One concern was raised recently, though: TSMC customers were reportedly looking to reduce their existing orders. This has been in response to the recent dumping of GPUs onto the market, as well as global economic jitters. However, that report stated AMD wasn’t asking to cut its order of 5nm products, but Nvidia was. This could lead to a delay for the RTX 40-series launch. TSMC reportedly told Nvidia it can’t reduce its order, but it can push it back a bit.