How RAM Works & Why RAM Is So Expensive?? RAM Explained
Thanks almost exclusively to AI, the entire consumer tech market is being flushed down the drain. Why? Well, that’s just one of the questions I hope to answer in this video. To understand why the entire tech market is in the crapper, we first need to understand how RAM is made, and to understand that, I think it’d be a good idea to first understand how RAM even works – so let’s start there!
RAM, or random access memory, is different to solid state storage like NAND flash in a few very key ways. The biggest one is volatility. With an SSD (which is made up of NAND flash chips), you can remove the drive from one PC, leave it on a shelf for a week, then plug it into a different computer, and all your data will still be on there and accessible, making it non-volatile storage. As an aside, leaving SSDs unplugged for long periods and expecting the data to stay intact is a bad idea – NAND flash does lose data over time when powered off, so keep that in mind. RAM, on the other hand, is volatile, meaning the second power is lost, so is the data. It requires constant power to keep anything stored in it. That difference comes from how each technology physically stores data. NAND flash holds charge in floating-gate (or charge-trap) transistors – the “NAND” in the name actually refers to how the cells are wired together, resembling a NAND logic gate, not to the storage mechanism itself – whereas RAM uses transistors and capacitors. It’s worth making the distinction between DRAM and SRAM here. SRAM, or static random access memory, uses a flip-flop (yes, that’s a real electronic component name) to hold each bit – a six-transistor cell – and that’s what most CPU cache is made of. It’s more stable: as long as power is supplied, the bit stays locked as a one or a zero without needing to be refreshed. DRAM, or dynamic random access memory, is a much simpler design, just a single transistor and a capacitor per bit. That makes it much more storage-dense and theoretically simpler to manufacture, although to reach the sorts of capacities we want from our RAM, the structures have to be built in 3D to keep each capacitor’s capacitance high enough to be useful. And because DRAM stores its charge in a capacitor, which naturally leaks through both the cap and the transistor, the data in each cell needs to be refreshed – essentially, all the data you are storing in your RAM has to be re-written roughly every 64 milliseconds (or every 32 ms when the chips run hot). Non-stop.
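To make the leak-and-refresh cycle concrete, here’s a toy model of a single DRAM cell: a capacitor whose charge decays exponentially, and which reads back correctly only if it’s refreshed before the charge drops too far. The time constant and threshold are illustrative assumptions, not real device parameters.

```python
import math

# Assumed (illustrative) leakage time constant and read threshold --
# real DRAM cells vary by process, temperature, and cell design.
LEAK_TIME_CONSTANT_MS = 200.0   # how quickly the stored charge leaks away
READ_THRESHOLD = 0.5            # fraction of full charge still sensed as a 1

def charge_after(ms_since_write: float) -> float:
    """Remaining fraction of charge ms_since_write milliseconds after a write."""
    return math.exp(-ms_since_write / LEAK_TIME_CONSTANT_MS)

def read_bit(ms_since_write: float) -> int:
    """A stored 1 reads back correctly only while enough charge remains."""
    return 1 if charge_after(ms_since_write) >= READ_THRESHOLD else 0

# Refreshed on a 64 ms schedule, the cell is rewritten long before
# the charge decays past the threshold...
print(read_bit(64))    # prints 1 -- refreshed in time
# ...but skip the refresh for too long and the stored 1 is gone.
print(read_bit(500))   # prints 0 -- the bit has leaked away
```

This is why the refresh can never stop: every cell holding a 1 is always somewhere on that decay curve, and the memory controller has to rewrite it before it crosses the threshold.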
Yeah, right now your memory controller is re-writing every single bit of data that’s in your RAM around sixteen times every second. It’s a lot!
The RAM’s structure is a grid – a matrix – with a word line per row and a bit line per column. To access a bit, you activate the corresponding word line and then sense or drive the cell through its bit line. It’s actually remarkably complicated, and there are whole sections on the DRAM Wikipedia page just about bitline designs, so just know I’m skipping over a whole lot here. If you were wondering about ECC, or error correcting code, that uses extra bits to store parity data, which lets the memory controller (built into your CPU) check, validate, and even reconstruct corrupted data. Pretty cool!

While we are here, it’s worth pointing out the extra circuitry built into RAM chips, because your CPU interacts with a memory module over a 64 bit data bus (split into two 32 bit sub-channels on DDR5) plus command, address and clock signals, and the chips need to translate the addresses they’re given into physical rows and columns in those grids, then get the data back out onto the bus. That’s the job of the address decoders, and because the row and column addresses share the same pins, two strobe signals tell the chip which half it’s receiving: RAS (row address strobe) and CAS (column address strobe), both with very precise timings. Ever heard of CAS latency? That’s this! It’s the number of clock cycles between requesting a column and the data actually appearing – the lower the latency, the faster you can physically enter or retrieve data. The transfer rate, the mega-transfers-per-second speed we often talk about too, like 5600MT/s, is how many of those transfers the bus can do per second. It might seem like we’re talking about the same thing, or that one informs the other, but it’s actually slightly different. The transfer rate is how many chunks of data can be moved per second, whereas the latency is how long it takes, once you’ve actually requested a specific chunk, before it starts arriving. It’s all well and good being able to transfer twice as many chunks per second, but if it takes twice as many cycles to actually fetch the one you need… That’s why the balance between CAS latency and frequency is so important.
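You can actually put a number on that balance. CAS latency is counted in clock cycles, and DDR memory transfers twice per clock, so one cycle lasts 2000 / (MT/s) nanoseconds – multiply that by the CL figure and you get the real-world first-word latency. A quick sketch (the example kits are typical retail numbers, not any specific product):

```python
# First-word latency in nanoseconds, from CAS latency (in clock cycles)
# and data rate (in MT/s). DDR transfers twice per clock cycle, so the
# clock period in nanoseconds is 2000 / MT/s.

def cas_latency_ns(cl_cycles: int, mt_per_s: int) -> float:
    """Absolute CAS latency: CL cycles multiplied by the clock period."""
    return cl_cycles * 2000 / mt_per_s

# A higher transfer rate doesn't automatically mean lower latency:
print(round(cas_latency_ns(16, 3200), 2))  # DDR4-3200 CL16 -> 10.0 ns
print(round(cas_latency_ns(36, 5600), 2))  # DDR5-5600 CL36 -> 12.86 ns
print(round(cas_latency_ns(30, 6000), 2))  # DDR5-6000 CL30 -> 10.0 ns
```

Notice the DDR5-5600 CL36 kit moves far more data per second than the DDR4 kit, yet each individual request actually takes longer to start coming back – which is exactly the frequency-versus-latency trade-off described above.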
Anyway, back to RAM. When it comes to actually manufacturing it, there are only four companies worldwide that make the chips. Loads of companies take the DRAM chips and stick them on PCBs, but only Micron, SK Hynix, Samsung and CXMT actually fabricate the memory itself. Interestingly, the process for manufacturing DRAM is remarkably similar to most other chip manufacturing, using ultraviolet light to imprint designs onto silicon wafers, although much to my surprise, RAM manufacturers are only just (and I mean just, like in the last year or two) moving to EUV (extreme ultraviolet) machines, something TSMC has been using for the better part of a decade for high end chip fabrication. Per Micron, anyway, only their most recent 1-gamma node (DDR5 at up to 9200MT/s) uses EUV and a sub-10nm-class process. For context, CPUs are on 2nm-class nodes now, having gone through 3, 4, 5, 6, 7 and 10nm already. That’s how far RAM manufacturing lags behind high end parts like CPUs and GPUs. Why? Well, the simplest answer is that RAM is a commodity item. It’s manufactured in such high quantities that, much like grain and other commodities, the price is really just dictated by supply and demand, not by technological innovation and progress. That is one of the big reasons why there are so few RAM manufacturers – there just isn’t (or wasn’t) any money in it, and the setup costs would eat most companies alive if they even tried. That fierce price competition is also what drove most RAM manufacturers out of the market, either into the hands of the big three (who between them produce roughly 95 percent of the world’s supply of memory), or into bankruptcy.
To give you some context, Intel – who you might think is in a perfect position to start manufacturing its own RAM – actually kind of invented RAM as we know it. In 1970 they launched the Intel 1103, the first commercially successful DRAM chip, and through the early 70s Intel was THE RAM maker. But by the 80s, Japanese companies like NEC, Toshiba and Hitachi – heavily subsidised by the Japanese government – had basically priced Intel out of its own market. By 1985, Intel had abandoned its memory division. They did try to re-enter the memory market with their hybrid memory-and-storage technology, 3D XPoint, sold under the Optane brand. That didn’t go so well – the whole line was wound down in 2022. Hell, they sold off their NAND division to SK Hynix too – that’s how bad the money is in the memory and storage game.
Of course, the other reason new players can’t enter the DRAM game is patents. Between Samsung, SK Hynix and Micron, they hold basically all the patents for RAM manufacturing, and unless you’re backed by a state that doesn’t care much for international patent law (Japan in the 80s, or China today), you’re kinda SOL if you want to make RAM. Any money left in the pot would be gobbled up by licensing fees. Doesn’t sound like such a great investment, right? It is worth focusing on CXMT – ChangXin Memory Technologies – because they might just be our saviour. CXMT was only founded in 2016, but through heavy state backing (sound familiar?) they’ve been able to do a hefty bit of damage to the big three’s triopoly. CXMT started by making the lowest-end RAM that was still commercially viable, which was low-spec DDR4. They’ve been ramping up capacity (68 percent year over year) to basically flood the market, and thanks to state backing and generally low manufacturing costs, they’ve been selling their memory for up to 50 percent less than Samsung and SK Hynix – which has pushed those two to effectively abandon the lower-end market, which in turn has left the door wide open for CXMT’s expansion and profits. Even after that success, most industry experts balked at the idea that CXMT could get anywhere near the big three’s technology, but they’ve already mass-produced DDR5, and have shrunk their process node significantly. They are at 16nm now, and working to drop it even smaller – and that’s all without EUV machines too! They’ve closed what was over a decade of technological gap to just three or four years, real fast. The big AI memory is HBM – high bandwidth memory – and CXMT is working on making that too. The bottom line is CXMT’s market share was 0 percent in 2020, but is estimated to hit 10-12 percent this year. If CXMT can pull this off, in the same way that China has ultra-commodified solar panels and EV batteries, the big three are in trouble.
But why exactly are we in such a DRAM drought? Well, the short answer is AI; the long answer… well, that’s more complicated. It is true that AI companies are hoarding like Smaug, which is driving up the price, but that’s mostly because RAM manufacturing is a pretty delicate balance, and the AI companies are royally upsetting it. What is most frustrating is that the demand from AI companies – be that Microsoft, OpenAI, Oracle, NVIDIA or Google – is almost entirely fabricated. Not just because AI is a venture capital fever dream in search of a use case that no one on earth actually cares about, but because the datacentres they are all racing to build take years to even break ground, let alone kit out with next gen “AI” servers. OpenAI’s multi-year orders – reportedly covering as much as 40 percent of global DRAM production – are literally there to hoard it: stockpile at a lower price and sit on it until you might, maybe, have a use for it. It’s insane. But, thanks to the slim margins and high complexity of manufacturing, spinning up more capacity to meet this increased demand is tricky, and even trickier when you realise everyone knows it’s only a temporary spike in demand. Once the AI crowd gets its fill, the manufacturers would be left overproducing against nonexistent demand, and then they’d be in trouble. So they keep their production steady, or increase it slightly, but they aren’t exactly moving heaven and earth to double or triple their capacity, despite currently being fully sold out for the next couple of years.
Which, unfortunately, means that beyond hoping to get your hands on some Chinese-made DDR4 or DDR5, we are kinda stuck, and if you can’t get memory, then there isn’t much point in trying to build a system, and that is putting a drag on the whole industry. I wish there were some secret or hack I could share, but this is a supply and post-capitalism problem, not something us individuals can control. Beyond helping to put an end to the AI craze by just not using (and specifically not paying for) their AI stuff, there isn’t much we can do. One thing I’ve been mulling over is Micron’s decision to take Crucial into the back alley for a mob-style execution, because on the face of it, that doesn’t make much sense. Just limit production of Crucial’s products for the next year or two. Slow down releases. Everyone else is! You don’t need to announce anything – just say ‘supply issues’ and let stuff go out of stock for a bit. It wouldn’t stand out at all. But instead they publicly shot themselves in the foot. Once the AI craze is over, they’re going to have to resurrect the Crucial brand and try to win back all the goodwill they’ve torched with this announcement. Why? It didn’t make sense, until I realised that it’s just business. Specifically, if Micron can be seen to be so committed to supplying the AI monsters with memory that they’d cut off their own (profitable) leg, well, that’s great for the short-sighted investors who are pumping money into everything AI-related! And that’s great for the share price, and it shows the kind of commitment that might mean OpenAI spends an extra million or ten with them over SK Hynix or Samsung. And that’s worth it – for now, anyway.
