AMD Responds to Claims of EPYC Genoa Memory Bug, Says Update On Track
At a recent financial conference, AMD CTO Mark Papermaster was asked about a report of a memory bug with the company’s EPYC Genoa processors that would ostensibly require a lengthy redesign/respin process to fix. His answer was a bit vague, so we followed up with AMD for more details. The company repudiated the claims of a memory bug, telling Tom’s Hardware that all fourth-gen EPYC processors shipped to date fully support the coming 2DPC memory configuration and that no respin is needed. Additionally, the company has already issued BIOS updates to its OEM partners to enable the promised support for 2DPC configurations by the end of Q1 2023. AMD also shared other details we’ll cover below. But first, a bit of background info.
As you can see in our EPYC Genoa review, AMD’s new data center chips exhibit market-leading performance and come with several new interfaces, with support for 12 channels of DDR5 memory being one of the most important. However, Genoa only launched with support for DDR5 memory in a one DIMM per channel (1DPC) configuration. This type of configuration supports only one memory stick connected to each of the twelve DDR5 memory controllers inside the processor.
At launch, AMD said it would release a BIOS update in the first quarter of 2023 to enable support for two DIMMs per channel (2DPC), allowing two memory sticks to be connected to each memory channel to boost capacity. AMD said it was still characterizing and tuning the 2DPC memory configurations, so it would publish the supported 2DPC memory speeds when the update became available.
In the interim, SemiAccurate (partially paywalled) reported a purported problem with AMD’s Genoa processors last month. The report cited unnamed industry sources who claimed that Genoa has a bug in the memory subsystem and that AMD had to embark on a costly respin of the processors to support 2DPC memory configurations. That would inevitably lead to delays of several months as the new chips worked their way through the redesign and manufacturing process.
Naturally, a bug in the memory subsystem would mean that the currently shipping Genoa processors could not support the forthcoming 2DPC spec. So to determine whether a respin was needed, we asked AMD if all of the Genoa processors already in circulation would support the 2DPC memory configuration when it launches. The company assured us that is the case.
Additionally, AMD went on the record to say that no respin is required for 2DPC support. Instead, the company says 2DPC support only requires the BIOS update it has already issued to its OEM customers. As a result, OEMs are already designing motherboards with enough slots to support the feature.
AMD also clarified Papermaster’s comments at the recent Morgan Stanley investor conference, which have been misinterpreted. At the conference, Papermaster said, “And the 2 DIMM per channel, which is I think what you’re referring to is following. So that is for a targeted – a much smaller targeted set of customers. Those speeds will be announced later this quarter, and that will ramp as well, but this number of customers for 2 DIMMs per channel is much smaller.” AMD says the “ramp” comment is in reference to systems that support 2DPC configurations (they need more physical slots), not to a newer revision of the processor.
Genoa’s support for 12 channels of DDR5 is the highest on the market for an x86 processor. Genoa has 50% more channels than Sapphire Rapids’ eight, and both chips support a peak of DDR5-4800 memory in a 1DPC configuration. Intel has specced its 2DPC configuration at DDR5-4400, but as mentioned, AMD hasn’t finished qualifying its 2DPC transfer rates.
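To put those channel counts and transfer rates in perspective, here is a rough Python sketch of theoretical peak memory bandwidth per socket. It assumes each DDR5 channel moves 8 bytes (64 data bits) per transfer, a figure that does not come from AMD or Intel's statements above, and the function name is ours. Real-world throughput will be lower than these ceilings.

```python
# Assumption (not from the article): each DDR5 channel carries
# 64 data bits, i.e. 8 bytes, per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for one socket."""
    # channels * MT/s * bytes gives MB/s; divide by 1000 for GB/s.
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


genoa_1dpc = peak_bandwidth_gbs(12, 4800)           # Genoa, DDR5-4800
sapphire_rapids_1dpc = peak_bandwidth_gbs(8, 4800)  # Sapphire Rapids, DDR5-4800
sapphire_rapids_2dpc = peak_bandwidth_gbs(8, 4400)  # Sapphire Rapids, DDR5-4400

print(f"Genoa 1DPC:           {genoa_1dpc:.1f} GB/s")
print(f"Sapphire Rapids 1DPC: {sapphire_rapids_1dpc:.1f} GB/s")
print(f"Sapphire Rapids 2DPC: {sapphire_rapids_2dpc:.1f} GB/s")
```

Under that assumption, Genoa's 50% channel advantage carries straight through to a 50% higher theoretical ceiling at the same DDR5-4800 speed. Genoa's 2DPC figure is left out because AMD has not yet published those speeds.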
AMD’s decision to launch Genoa before it had finalized 2DPC support is sound: it is reasonable to expect that demand for 2DPC configs will be dramatically lower than we’ve seen in the past. The 2DPC config is typically used to increase capacity (there can also be small performance improvements with certain rank configs). But with 12 memory channels in a 1DPC configuration, AMD can already support up to 3TB of memory per chip with 256GB sticks. That’s plenty for the broadest cross-section of users. Support for 2DPC boosts that capacity to 6TB of DDR5 per socket, but AMD is already running into space constraints packing 12 channels of memory per socket into regular two-socket servers.
As you can see in the above image of our Genoa test server, cramming in 24 total DIMM slots for a 1DPC config already creates plenty of issues due to space constraints. Frankly, it’s hard to imagine packing in twice the number of pictured slots for a 2DPC configuration — a dual-socket server would need 48 total slots. As such, we believe that most 2DPC configs will likely either be for single-socket servers or use a reduced number of channels in dual-socket servers.
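The capacity and slot counts quoted above are simple multiplication. A minimal Python sketch makes the arithmetic explicit; the function names are ours, and the only inputs are the figures already cited (12 channels per socket, 256GB DIMMs, one or two DIMMs per channel).

```python
def max_capacity_gb(channels: int, dimms_per_channel: int, dimm_gb: int) -> int:
    """Maximum memory per socket in GB."""
    return channels * dimms_per_channel * dimm_gb


def total_dimm_slots(sockets: int, channels: int, dimms_per_channel: int) -> int:
    """Number of physical DIMM slots the motherboard needs."""
    return sockets * channels * dimms_per_channel


CHANNELS = 12   # Genoa memory channels per socket
DIMM_GB = 256   # largest DIMM capacity cited in the article

for dpc in (1, 2):
    capacity_tb = max_capacity_gb(CHANNELS, dpc, DIMM_GB) // 1024
    slots = total_dimm_slots(2, CHANNELS, dpc)
    print(f"{dpc}DPC: {capacity_tb}TB per socket, {slots} slots in a dual-socket server")
```

Doubling the DIMMs per channel doubles both numbers, which is why the 6TB-per-socket ceiling comes with a 48-slot board in a full dual-socket design.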
There are already plenty of challenges in enabling the pictured 1DPC config. In fact, AMD had to use special ‘skinny’ memory slots for Genoa motherboards to help pack 12 slots per socket into the chassis. AMD cautioned us that, due to the skinny slots and other accommodations for the denser arrangement, it has had several incidents where lateral pressure applied while installing the DDR5 DIMMs stripped the DIMM socket off the board. This is an edge case and not indicative of an issue with the platform, but it does point to the challenges AMD already faces with ‘just’ 12 memory slots per socket.
The challenges for 2DPC extend beyond the space needed for more slots. As we’ve seen with DDR4 memory, adding more DIMMs per channel results in reduced memory speeds, and more channels add even more complexity. Additionally, even having extra empty slots can result in lower peak memory speeds, as seen with the complicated DDR4 and DDR5 support matrix for the consumer platforms. Those problems become even more vexing with DDR5, as it has much tighter tolerances and requires more complex motherboard designs with more layers and better materials, which adds cost. This will become even more challenging with the higher transfer rates needed for next-gen memory. Market insiders have even predicted that support for 2DPC could end with the DDR6 standard.
Due to the normal step-down in speeds with 2DPC configs, Intel’s Sapphire Rapids drops from DDR5-4800 to DDR5-4400 when in a 2DPC config. We can also expect Genoa’s 2DPC speeds to be reduced when the company releases the final spec, but it remains to be seen how much penalty it will incur.
AMD says it will release the details of Genoa’s 2DPC support this month, and we’ll update once we receive the details.