Apple M1 Macs May Be Writing Far More Data Than They Should
Concerning reports from Apple M1 Mac users have surfaced in the past few days, as different folks compare notes on how often their systems are writing to disk. Some comparisons display eye-popping levels of drive writes, especially given how long some of these systems have been in use. Evidence of a truly systemic problem, however, is limited — and I’m not sure how much we can trust some of the counters people are using to report data.
The top-line results are somewhat alarming. Users like @David_Rysk report over 150TB of drive writes already performed in less than two months on a system with 16GB of RAM and a 2TB SSD.
2TB 16GB model. 3% used.
That means that for a 256GB model, proportionally, you’d expect ~30% usage.
If this is accurate, some of these machines aren’t going to last half a year to 100%.
And that’s a 16GB model. 8GB should be worse.
Holy shit. https://t.co/9HcmaYgJPT
— Hector Martin (@marcan42) February 15, 2021
So… that’s a lot. Burning 3 percent of “percentage used” in two months does not suggest these drives will live very long. Here’s how that SMART Attribute is defined according to Kingston:
Percentage Used: Contains a vendor specific estimate of the percentage of NVM subsystem life used based on the actual usage and the manufacturer’s prediction of NVM life. A value of 100 indicates that the estimated endurance of the NVM in the NVM subsystem has been consumed, but may not indicate an NVM subsystem failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state).
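For readers checking their own machines, the raw counters behind these reports can be turned into familiar units with a little arithmetic. The sketch below assumes the NVMe convention that one "data unit" equals 1,000 blocks of 512 bytes, and applies the saturation rule from the definition above. The function names and the sample counter value are illustrative, not taken from any user's report.

```python
# NVMe health-log convention (assumed here): one "data unit" is 1,000 blocks of 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_to_tb(data_units_written: int) -> float:
    """Convert a raw 'Data Units Written' counter to decimal terabytes."""
    return data_units_written * BYTES_PER_DATA_UNIT / 1e12


def reported_percentage_used(estimate: float) -> int:
    """Apply the quoted rule: values may exceed 100, and anything above 254 is reported as 255."""
    value = max(int(estimate), 0)
    return 255 if value > 254 else value


if __name__ == "__main__":
    # Hypothetical counter value, chosen only to illustrate the conversion.
    sample_units = 2_000_000
    print(f"{data_units_to_tb(sample_units):.3f} TB written")
    print(reported_percentage_used(3), reported_percentage_used(300))
```

If a tool uses a different unit for its write counter, the totals it reports will not be comparable with these, which is one reason to treat screenshots from different utilities with some caution.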
There are a lot of follow-on reports from these numbers. But there’s not much consistency in how much data is being written to these drives relative to how long they’ve been powered up. I gathered the data from multiple tweets and compared the total amount of data written to each drive against the amount of time the system had been powered on, but could establish no clear trend.
If you divide the total amount of data written by the total number of powered-on hours, the values range from a low of 21.5GB/hour to a high of 347GB/hour. That last figure comes from the system quoted above. It’s also an extreme outlier. Out of 10 reports, only three were higher than 100GB/hour, and the other two of those came in at 109GB/hour and 118GB/hour. Five systems reported between 20GB/hour and 40GB/hour.
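The comparison described above is a single division per machine. Here is a minimal sketch of it, assuming decimal units (1TB = 1,000GB). The machine names and readings are hypothetical placeholders, not the figures from the tweets.

```python
def write_rate_gb_per_hour(terabytes_written: float, power_on_hours: float) -> float:
    """Average host writes per powered-on hour, in decimal gigabytes."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return terabytes_written * 1000 / power_on_hours


if __name__ == "__main__":
    # Hypothetical (TB written, power-on hours) pairs for illustration only.
    reports = {"machine_a": (10.0, 400), "machine_b": (45.0, 150)}
    for name, (tb_written, hours) in reports.items():
        print(f"{name}: {write_rate_gb_per_hour(tb_written, hours):.1f} GB/hour")
```

The result is only as good as both inputs. If either counter is misreported, the rate is meaningless.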
This could make perfect sense if the highest-reporting machines had the least RAM or always ran the heaviest workloads, but user reports suggest this is not the case. Some users claim to have only been using their systems for lightweight activities. There is no apparent rhyme or reason to these reports. SSD write amplification has been floated as a possible cause:
What’s iosnoop saying? Could be some process constantly writing in 4kb chunks with O_DIRECT, thereby causing SSD write amplification effect. (50 writes per second, each 4kb in size becomes 512KB full rewrite of SSD block, and 50 × 512KB per second easily becomes your 2211GB/day)
— HMage (@hmage) February 23, 2021
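The arithmetic in that tweet is easy to check. The sketch below takes the tweet's own assumptions at face value (50 writes per second, 4KB requests, and a 512KB block rewritten for each one) and uses decimal units, so 1GB = 1,000,000KB. None of these numbers are measurements from an M1 system.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def writes_gb_per_day(writes_per_second: int, kb_per_write: int) -> float:
    """Data written per day at a steady rate, in decimal gigabytes (1 GB = 10**6 KB)."""
    return writes_per_second * kb_per_write * SECONDS_PER_DAY / 1e6


def amplification_factor(request_kb: int, block_kb: int) -> float:
    """How many times more data hits the NAND than the application asked to write."""
    return block_kb / request_kb


if __name__ == "__main__":
    requested = writes_gb_per_day(50, 4)    # what the process asked to write
    physical = writes_gb_per_day(50, 512)   # what the NAND absorbs under the tweet's assumption
    print(f"requested: {requested:.2f} GB/day")
    print(f"physical:  {physical:.2f} GB/day")
    print(f"amplification: {amplification_factor(4, 512):.0f}x")
```

Under those assumptions the physical figure works out to roughly 2,211GB per day, matching the tweet, from only about 17GB per day of requested writes.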
A system consuming 3 percent of its rated NAND endurance every two months would see 90 percent of that endurance exhausted within five years, assuming linear rates of progression. Many SSDs can run well past their rated lifetimes, but the risk of hitting the manufacturer-specified limit is going to make a lot of people antsy, no matter what. The fact that these SSDs are soldered down and effectively impossible to replace also concerns some folks. Others have chimed in, claiming this issue affects both x86 and ARM, that it began after Catalina (as opposed to Big Sur), or that it affects x86, but to a lesser degree. At least one user has claimed his power-on hours are incorrect. If that value isn’t accurate, it would wreck any basis for comparison on these systems.
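The linear projection at the start of the paragraph above can be reproduced in a few lines. This is a sketch under the same assumption of a constant wear rate, which real drives are not guaranteed to follow. The function names are illustrative.

```python
def percent_after(months: float, percent_per_period: float, months_per_period: float) -> float:
    """Wear accumulated after a given number of months at a constant rate."""
    return months / months_per_period * percent_per_period


def months_until(percent_target: float, percent_per_period: float, months_per_period: float) -> float:
    """Months needed to reach a target wear percentage at a constant rate."""
    return percent_target / percent_per_period * months_per_period


if __name__ == "__main__":
    # 3 percent of "percentage used" consumed every two months, as in the report above.
    print(f"after 5 years: {percent_after(60, 3, 2):.0f}% used")
    print(f"100% reached after about {months_until(100, 3, 2) / 12:.1f} years")
```

At that rate the counter reaches 100 a little after the five-year mark. Per the attribute definition, that marks the end of the estimated endurance rather than a guaranteed failure.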
Right now, there doesn’t seem to be a consistent explanation for what’s going on here, and some changes Apple made to the M1 storage system may also be concerning. An M1 Mac, unlike an x86 Mac, cannot be booted from external storage if the internal storage completely fails. This, to our eye, is a problem that needs more attention than it’s gotten. SSDs fail for reasons other than hitting their write limits.
It is possible that some tools are reporting incorrect values for certain fields. It’s also possible that Apple has a low-level storage bug that’s drastically inflating drive writes. There’s no sign of this being a near-term problem, since even the most aggressive usage we’ve charted would support more than five years of use. But people keep laptops for longer than they used to, and Apple has aggressively traded on the idea of the M1 as a step up from Intel in every respect. For now, we’re still trying to find a cause, or to learn whether Apple considers this standard operating procedure and what that implies for the longevity of these drives.