How NVIDIA Became a Trillion Dollar AI Titan
NVIDIA has become an AI titan, and the third largest company in the world, just $50 billion behind Microsoft at $2.84 trillion – at the time of writing, anyway. Nearly $3 trillion in value, and spearheading, or perhaps underpinning, the AI tech revolution. Their chips are what the majority of people game on – the disastrous 50 series notwithstanding – and they are what all the AI tools use not only to learn, but to function. No NVIDIA, no ChatGPT or GitHub Copilot. But how did they get here? How did they go from a scrappy fabless GPU designer once slated to power the Sega Dreamcast to a trillion dollar titan of not just the tech industry, but the whole world? Let’s rewind the clock a little and find out…
NVIDIA started, seemingly as all giant tech companies that are still around did, with a group of guys who weren’t all too happy with how their current tech-based employers were running things. Jensen Huang, the leather jacket man himself, left his job at LSI Logic, while Chris Malachowsky and Curtis Priem left theirs at Sun Microsystems, and together they formed NVIDIA in 1993. The name comes from invidia, the Latin word for envy, chosen because they’d been saving new files with “NV” meaning ‘new version’ and couldn’t think of anything else – they stripped off the leading ‘i’ and set up NVidia. Seriously, Intel was founded by two of the traitorous eight who left Shockley to found Fairchild, and AMD was founded by a guy who left Fairchild too. I guess that’s why companies keep putting non-compete clauses in their job contracts now. Anyway, after famously deciding in a Denny’s to form a company, they got $20 million in capital from Sequoia Capital (after the boss of LSI Logic gave them a glowing recommendation) and set about making 3D graphics chips.
NVIDIA has always been fabless – meaning they don’t manufacture their own chip designs. In 1993 that was still a pretty radical idea – it wouldn’t be until 2009 that AMD sold off their fabs, and I think 2023/24 until Intel started using TSMC for their own mainstream chips. NVIDIA’s first chip was the NV1, which was theoretically an absolute killer in the market. It was a Sega Saturn compatible card that was designed to replace your 2D graphics card, a Sound Blaster style sound card, AND a joystick card, all while providing 3D graphics too. Sounds amazing, right? Well, no. Out of the 250,000 chips they sold to Diamond Multimedia, Diamond returned 249,000. The NV1 was a complete and utter flop. Why? In part, thanks to Microsoft. The NV1 was designed around quad primitives – four-sided shapes as the basic building block – but the industry as a whole, and pointedly Microsoft with their new DirectX and Direct3D pipeline, had decided triangular primitives were the best choice – the three-sided shapes we know today as tris, or often just polygons. The failure of the NV1 hit NVIDIA hard, forcing them to lay off the majority of their staff and shrink from around 100 people to just 40 – which stung all the more because NVIDIA had initially been contracted to provide the graphics chip for the Sega Dreamcast, only for Sega to eventually opt for NEC’s PowerVR2 chip instead. Still, Sega’s president, Shoichiro Irimajiri, believed in NVIDIA and convinced Sega’s management to invest $5 million, which Jensen said was what kept them afloat through that difficult time: “[he] gave us six months to live”.
Not to be dissuaded, NVIDIA came back swinging just over a year later with the RIVA 128, a full-service graphics card that worked with pretty much everything. The 128 in its name referred to its 128-bit memory bus – with just 4MB of single data rate memory, a 100 MHz clock speed, and a whopping 4 million transistors, it wasn’t what you’d call flash by modern standards, but hey, it only drew 4 watts over the AGP interface, so I guess that’s something. When NVIDIA launched the RIVA 128 they had just one month’s worth of payroll money left. Their unofficial motto became, “our company is thirty days from going out of business”. Luckily the RIVA 128 sold incredibly well, shifting a million units in just four months. That kept them afloat long enough to make the RIVA TNT, a 7 million transistor, 16MB VRAM card that doubled the number of pixel pipelines for, well, a healthy bump in performance. The TNT went toe to toe with the 3dfx Voodoo 2, and depending on who you ask, either won, lost, or it was a wash between them. Either way, it was a decent card, although the Voodoo 2 did have one major advantage… SLI. Yep, that’s right, SLI, the famous NVIDIA feature was actually made by their competitor, 3dfx, for their Voodoo line. SLI stood for Scan-Line Interleave, meaning each card would render every second horizontal line of pixels, and those two half-images would then be combined to output the final frame. NVIDIA later rebranded SLI as Scalable Link Interface, opting primarily to render alternate frames per card rather than alternate lines, but the catch has always been game support. For games that supported it, it was amazing, but many, if not most, didn’t, and in those cases you got little to no performance benefit.
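If the difference between the two flavours of SLI is hard to picture, here’s a toy sketch – plain host code, nothing from any actual driver, and the function names are mine – of how the work gets divvied up between two cards in each scheme:

```cuda
// Toy illustration: 3dfx-style Scan-Line Interleave vs NVIDIA-style
// Alternate Frame Rendering, reduced to "which GPU owns this bit of work?"
#include <cstdio>

int sli_owner_of_line(int y)      { return y % 2; }      // Scan-Line Interleave: even lines -> GPU 0, odd lines -> GPU 1
int afr_owner_of_frame(int frame) { return frame % 2; }  // Alternate Frame Rendering: whole frames take turns

int main() {
    // 3dfx SLI: both GPUs contribute to every single frame, half the lines each,
    // and the two halves get merged on output.
    for (int y = 0; y < 6; ++y)
        printf("SLI: scanline %d rendered by GPU %d\n", y, sli_owner_of_line(y));

    // NVIDIA SLI (AFR): each GPU renders complete frames, alternating, which is
    // simpler for modern pipelines but leans heavily on per-game support.
    for (int frame = 0; frame < 4; ++frame)
        printf("AFR: frame %d rendered by GPU %d\n", frame, afr_owner_of_frame(frame));
    return 0;
}
```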
At the same time, the third graphics card competitor in the market was finding its feet with their own 3D graphics cards – ATI. ATI were actually almost a decade older than NVIDIA, and launched their own 3D cards starting in 1996 with the 3D Rage. The Rage family evolved throughout the late ’90s, with the last being the Rage Fury MAXX, which was actually a dual-GPU card that ran alternate frame rendering to compete with NVIDIA’s new offering – “GeForce”. Specifically the GeForce 256. That was a huge leap forward in a number of ways. The 256 was one of the largest dies being manufactured at the time, a whopping 139mm2 – for context a new RTX 5090 has a die area of 750mm2, or more than FIVE times larger, but for the time it was massive. It also featured double-data-rate memory, and Direct3D 7 support thanks to a hardware transform and lighting engine that handled all the geometry calculations, offloading that work from the CPU – making it the first PC card to support that in hardware (although consoles like the PlayStation and Nintendo 64 had already been doing their geometry on dedicated hardware).
Just before the GeForce 256 came out, NVIDIA went public, raking in $42 million by selling 3.5 million shares at $12 each – although the stock surged to $19.6875 by the end of that week, giving the company a huge $626 million valuation. In 2001, Standard & Poor’s picked NVIDIA to replace the scandal-ridden Enron in the S&P 500 index, helping NVIDIA’s stock price even more.
NVIDIA refined the GeForce line for the second generation, and again for the third – this time adding programmable vertex and pixel shaders, which sped the cards up considerably in games. And this is where the biggest deal of NVIDIA’s life so far comes in: the Xbox. NVIDIA was contracted to provide the graphics core for the original Xbox, with Microsoft giving NVIDIA a $200 million advance on the deal. NVIDIA created the NV2A based on the GeForce 3 family, with a touch of fourth gen in there too it seems (especially since the fourth generation was a bit of an incremental improvement). The deal left NVIDIA flush with cash, and combined with their strong valuation, they bought out one of their two main competitors – 3dfx. They paid $70 million in cash plus 1 million shares of common stock, with the deal concluding in 2002. And then there were two…
Interestingly, the GeForce 4 series is where the Ti branding really took hold (it had technically shown up on GeForce 2 and GeForce 3 refreshes first), although here it sat in front of the model number rather than after it, with the Ti 4600 being the top end option at the time. Fun fact, this is also the generation where ATI created the XT name. The 9700 Pro, 9600 XT, and 9800 XT are all from this 2002/2003 time period, and were competitors to NVIDIA’s fourth gen and fifth gen FX cards. Those FX cards pushed performance forward again, although interestingly this is also where NVIDIA standardised having a video engine built into the chip design. The Video Processing Engine, or VPE – first deployed in the lower end MX cards from the fourth gen – sped up MPEG-2 playback and helped NVIDIA compete with ATI’s offerings at the time.
The 6 series more closely resembles a modern GPU, with heatsinks getting a little bigger, multiple outputs, sometimes a power connector, and on some models even a PCIe interface. The 6 series introduced NVIDIA’s version of SLI – thanks to buying out 3dfx a few years earlier – along with NVIDIA PureVideo, an upgrade to the on-chip video decoding that now supported H.264, WMV and more, plus DirectX 9.0c support. The following year NVIDIA launched the 7th generation, which was the last generation to even partially support the old AGP interface. Importantly though, a modified version of the 7800 GTX found its way into the PlayStation 3 – they legit called it the “RSX ‘Reality Synthesizer’”. Like come on.
2006 brought about a number of major changes – both in NVIDIA and in the wider market. The first major market shift was AMD buying ATI for a whopping $5.4 billion – but not before asking NVIDIA! According to a Forbes article from 2012, the deal was killed by Jensen as he “insisted on being chief executive of the combined company” – AKA he wanted to be CEO of AMD after the acquisition. AMD didn’t fancy that, so opted to buy ATI instead. The second major change was an API from NVIDIA called CUDA, or Compute Unified Device Architecture, which was basically a way to use the graphics card not just as a gaming card, but as a general purpose processor. This is the single most influential change in NVIDIA’s value proposition, at least in my opinion, because it meant that the potential market for NVIDIA GPUs changed from tens of millions of gamers to billions of workstations, servers and users. CUDA is basically the heart of GPU acceleration, even to this day. There are alternative APIs like OpenCL, although they are nowhere near as popular and often not as performant. Since CUDA is proprietary, ATI – now AMD – cards have been on the back foot for anything compute heavy. While they might keep up with – or even outperform – NVIDIA’s cards in games, they almost never do as well in productivity benchmarks thanks to CUDA.
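To give a sense of what “general purpose processor” actually means here, below is a minimal CUDA sketch – using modern conveniences like unified memory that didn’t exist back in 2006, so treat it as illustrative rather than period-accurate – that adds two big arrays together with one GPU thread per element:

```cuda
// Minimal CUDA example: the GPU as a general purpose processor.
// Each GPU thread handles exactly one element of the arrays.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's global index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                     // ~1 million elements
    const size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);              // unified memory: visible to CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(a, b, c, n);  // launch ~1 million threads on the GPU
    cudaDeviceSynchronize();

    printf("c[0] = %.1f\n", c[0]);             // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The point isn’t the arithmetic, it’s the shape of the problem: anything you can split into thousands of independent little tasks maps onto a GPU this way, which is exactly why scientific computing, and later AI, latched onto CUDA.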
We’ll come back to CUDA soon, but following the timeline, 2008 saw another swath of changes – the GTX line including the GTX 280, switching GTX from being a suffix to a prefix compared to previous generations – and NVIDIA bought another company. This time, the maker of PhysX, Ageia. PhysX was a discrete PCI card that handled in-game physics calculations. Bear in mind that in the early 2000s it wasn’t uncommon to find multiple add-on cards installed, everything from sound and video cards, to USB and SATA cards, and yes, even a physics accelerator – but still, PhysX wasn’t exactly immensely popular, as requiring people to buy a PPU (physics processing unit) as well as a GPU (graphics processing unit) and a CPU (central processing unit) wasn’t a strong draw. NVIDIA reworked the PhysX software to work on existing NVIDIA GPUs, and, fun fact, you could even run NVIDIA PhysX with an AMD GPU. How do I know that? Well, we’ve got a now 13 year old guide on how to do just that. It’s an absolutely terrible video, recorded by a 15 year old idiot with a cold. It’s not an enjoyable watch, but it is still live if you want to torture yourself. 2008 also saw NVIDIA having to write down $200 million, plus a further $13 million in legal fees from a class action suit, as their 8600M GPUs in laptops from Dell, HP and Apple had manufacturing defects that caused them to have an “abnormally high failure rate”. So some wins and losses there, huh?
NVIDIA skipped over a generation number, at least externally – the GeForce 300 series did technically exist as DirectX 10.1 compatible versions of the 200 series cards, but they were OEM-only and never available for consumers to buy directly. The next generation, the GTX 400 series, well, that was something special. The GTX 480 was the hottest thing on the block – no seriously, that thing was a nuclear reactor’s worth of heat and power consumption. While it was the fastest single GPU card at the time, TechPowerUp summed it up nicely, reporting “high power draw, noisy cooler, high temperature, and a paper launch”. Where have I heard that before… Anyway, it was fast as hell, and the revised 500 series, including the dual GPU GTX 590, did soften that power consumption a touch, at least over the 400 series. It is kind of funny looking back at this though, as much like AMD’s FX 9590, which was called a power hog and a thermal wasteland, these cards really pale in comparison to today’s cards. Take a guess how much power the 480 drew. Seriously, leave your guess in a comment below. 500 watts? 600 watts? 1000 watts?? Nah, 250W. That’d be considered a low power budget GPU today. Hell, it only cost $499 too – even adjusted for inflation that’s under $750, a damn sight less than the $1,500 an RTX 5080 costs today…
Something we haven’t discussed here is NVIDIA’s nForce chipsets. NVIDIA used to make chipsets for both Intel and AMD motherboards, with a bunch of for-the-time unique features, like having a decently powerful graphics core built right in. For some context, right up until the Core i days for Intel – and a fair bit earlier for AMD – everything up to and including the memory controller lived on the motherboard, not the CPU. Back then CPUs were really pretty basic, at least in terms of onboard functionality: it was really just the compute, plus a few relatively simple interfaces to talk to the northbridge and southbridge – AKA the two onboard chipsets. Generally speaking, the northbridge housed the memory controller (and the integrated GPU, when there was one), while the southbridge handled IO and connectivity. NVIDIA made those chipsets. But when Intel in particular moved to their Core i series chips and the Nehalem architecture, they moved the memory controller onto the CPU and swapped to their own DMI interface – basically PCIe – to talk to the chipset. NVIDIA designed a chipset to work with DMI, but Intel said they didn’t have the right to do that, and with Intel also accused of infringing on NVIDIA’s GPU core design patents with their HD Graphics, the two sued each other. That was settled in 2011 with Intel paying NVIDIA $1.5 billion under a patent cross licensing agreement: Intel got the right to use NVIDIA’s patented GPU core designs, and NVIDIA got… well, basically $1.5 billion over the six year term of the deal. NVIDIA was still shut out of making a competing chipset for platforms like H67, and stopped producing chipsets for both Intel and AMD.
Back on the graphics front, NVIDIA was in strong form going into the 2010s with their new Kepler architecture, which powered both the 600 and 700 series cards. Some fun ones: the GTX 690, a dual GTX 680 card – and its 700 series twin, the GTX Titan Z – along with the GTX Titan, a semi-workstation level card thanks to its (at the time) monstrous 6 gigabytes of video memory and strong double precision floating point performance, which gave it exceptional workstation chops and somewhat undercut NVIDIA’s own Tesla workstation cards at $1,000 instead of $2,500. This generation is also where another NVIDIA proprietary technology comes in: GSYNC. GSYNC is a form of adaptive sync that uses a specialised module, which at least originally was built from an FPGA (field programmable gate array, basically a chip you can program to function however you want), and that made it expensive. GSYNC versions of monitors were easily £100 more than non-GSYNC counterparts, but at least initially offered an exclusive feature – the display would only refresh when the GPU had a new frame ready, which eliminates tearing. This exclusivity was fairly short-lived though, as AMD essentially rebranded the existing adaptive sync feature in the VESA standards as FreeSync, a royalty free adaptive sync implementation, and NVIDIA eventually capitulated, allowing their own cards to run adaptive sync on displays without a GSYNC module in 2019. To be clear, there’s more nuance to FreeSync than a simple rebrand, like the ability to run adaptive sync over HDMI (something GSYNC didn’t support – in fact GSYNC monitors originally didn’t even come with HDMI ports until a little later, and even then it’d be one DisplayPort and one HDMI, compared to often two of each on a FreeSync display, as those used more standard hardware), plus features like low framerate compensation, and later revisions have added support for HDR among other things. Still, AMD went more FOSS than proprietary – a trend that only continues to this day.
Around the same time GSYNC launched, NVIDIA also released a new software feature that made use of the onboard video encoding engine (NVENC), called ShadowPlay. This was quite the revolution for recording and sharing games, and I have to imagine it was a key factor in spearheading the explosion of game streamers and content creators. It reduced – if not outright eliminated – the need for a second computer and a capture card to record and stream your gameplay, and that’s a huge deal.
And that brings us nicely onto the GTX 900 series, and another class action lawsuit. While the 900 series cards were pretty great outright, and the GTX 970, the second-from-top card, sold incredibly well thanks to its 4GB of VRAM and solid performance, it turned out that that 4GB was… deceptive. See, 3.5GB of that memory was addressable via the expected 224-bit wide high speed bus, but the last 0.5GB was only addressable via a slow 32-bit bus, meaning if the card tried to use that last 500MB, performance would utterly tank as accessing anything there was glacially slow. On top of that, NVIDIA outright lied on the specs: the card was meant to have 64 ROPs (render output units) but actually only had 56 – wait, where have I heard this before… Wow, déjà vu! Oh, and it was meant to have 2048KB of L2 cache but only had 1792KB. All that meant when a class action suit was filed, NVIDIA had to pay out $30 per card to anyone who claimed. While I’m not sure how many cards were sold, nor how many owners claimed their pound of flesh, it can’t have been great for NVIDIA’s balance sheet. Also, fun fact, the 900 series were the last cards to support analogue video output via DVI-I. It’s digital only these days, and DVI quickly dropped off cards after that too, in favour of just DisplayPort and HDMI.
The 10 series was pretty massive, with the GTX 1060 6GB being incredibly influential – and also one of the most frustrating naming schemes NVIDIA had brought out, at least at the time. The problem? They launched a 3GB version of the 1060. That doesn’t sound like such a big deal though – there have been multiple VRAM configurations of GPUs before, who cares? Well, the problem here is that it wasn’t just the VRAM amount that changed, no, the problem is that THEY AREN’T THE SAME DAMN GPU! The 6GB card used a fully enabled GP106 die with 1280 CUDA cores, but the 3GB card used a cut-down version of that same die with only 1152 CUDA cores. The 6GB card has 11% more cores, and is therefore a decent bit faster, despite being named the same damn thing. (And to muddy the waters further, NVIDIA later released a 1060 6GB variant with faster GDDR5X memory, built on the larger GP104 die no less, still under the same name.)
And that brings us onto the most recent branch of the NVIDIA GPU family, RTX, and their key feature, ray tracing. To be clear, NVIDIA didn’t invent ray tracing – that’s a technique that’s been around for about as long as computer graphics has been a thing – but they did collaborate with Microsoft to make DXR, or DirectX Raytracing, an extension of DirectX 12. Thanks to the RT cores onboard the RTX cards, real-time ray tracing became possible, along with a number of optimisations like limiting the number of bounces and intersections a ray will make before the calculation stops. The key thing that DXR does – which I should stress is not unique to DXR as a rendering platform – is reverse ray tracing: instead of simulating some number of rays leaving a light source and seeing what they bounce off of, DXR casts rays out from each virtual pixel and sees which ones reach a light source and which ones don’t, which informs both the final brightness of that pixel and the reflections and shadows. This is considerably (ie computationally) cheaper than the incredibly expensive method of tracing rays forward from the lights. Ray tracing can also be only partially implemented – many games choose to make just reflections or shadows ray-traced, to augment the conventional pre-baked lighting in game – or they can go whole-hog and do fully ray traced lighting, at a considerable performance cost. Ray tracing, even on the most recent 50 series, can almost halve performance in games, even with NVIDIA’s upscaling tech active.
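To make the “rays start at the pixel, not the light” idea a bit more concrete, here’s a stripped-down sketch – my own toy code, not DXR and not how any real engine structures this – where one GPU thread per pixel builds the ray that would be fired out from the camera through that pixel:

```cuda
// Toy sketch of camera-first ("reverse") ray tracing setup: one thread per pixel
// builds a ray starting at the eye and heading out through that pixel. A real
// renderer would then intersect each ray with the scene, bouncing at most a fixed
// number of times before giving up - that bounce cap is one of the optimisations
// that makes real-time ray tracing feasible at all.
#include <cstdio>
#include <cuda_runtime.h>

struct Vec3 { float x, y, z; };

__device__ Vec3 normalize(Vec3 v) {
    float len = sqrtf(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

__global__ void primary_rays(Vec3* ray_dirs, int width, int height) {
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px >= width || py >= height) return;

    // Map the pixel centre onto a virtual image plane one unit in front of the camera.
    float u = (px + 0.5f) / width  * 2.0f - 1.0f;
    float v = (py + 0.5f) / height * 2.0f - 1.0f;

    // The ray starts at the eye, not at a light source - that's the "reverse" part.
    ray_dirs[py * width + px] = normalize({ u, v, 1.0f });
}

int main() {
    const int width = 1920, height = 1080;   // one ray per pixel: ~2 million rays per frame
    Vec3* ray_dirs;
    cudaMallocManaged(&ray_dirs, width * height * sizeof(Vec3));

    dim3 threads(16, 16);
    dim3 blocks((width + 15) / 16, (height + 15) / 16);
    primary_rays<<<blocks, threads>>>(ray_dirs, width, height);
    cudaDeviceSynchronize();

    Vec3 d = ray_dirs[0];
    printf("top-left pixel fires a ray towards (%.2f, %.2f, %.2f)\n", d.x, d.y, d.z);
    cudaFree(ray_dirs);
    return 0;
}
```

Even this empty-scene version is over two million rays per frame at 1080p before a single bounce or shading calculation, which is why the dedicated RT cores (and the upscaling trick coming up next) matter so much.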
When NVIDIA launched their RTX cards, with ray tracing being the killer new feature, they were clearly aware of the significant performance impact said feature had, and opted to, well, cheat. DLSS, or Deep Learning Super Sampling, is an upscaling technology that renders the image at a lower resolution than requested and then upscales it to give you close-to-native image quality while getting the benefits of lower resolution rendering (ie better performance). The render resolution differs depending on the output resolution and the DLSS quality mode, with the more performance focused options rendering at lower resolutions and upscaling more, and the more quality focused modes rendering closer to native res. As an example, in the game Control, if your output resolution is 1440p, DLSS Quality mode renders at 960p, Balanced at 835p, Performance at 720p, and Ultra Performance at 480p. The Deep Learning part of DLSS comes from the upscaler being a neural network that – barring an intermediary version made to run on the regular CUDA cores – uses the other specialised core design NVIDIA included in the RTX cards, the Tensor Cores, AI accelerator cores that in this case run the neural network to upscale each frame in real time. The various versions of DLSS have added more features, like 2.0, which added temporal anti-aliasing upsampling, looking at previous frames to better upscale the current one, or DLSS 3.0, which added frame generation, a way to create intermediary ‘fake’ frames to artificially increase the perceived frame rate. 4.0 added multi-frame generation, where you can have up to 3 generated frames in between the actually rendered frames.
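Those Control numbers all fall out of simple per-axis scale factors, and the real performance win is that the pixel count drops with the square of that factor. Here’s a quick back-of-the-envelope sketch (plain host code; the factors are just the ones that reproduce the figures quoted above, not anything official from NVIDIA):

```cuda
// Back-of-the-envelope DLSS render resolution maths for a 1440p output.
// Pixel count scales with the square of the per-axis factor, which is where
// the performance headroom for ray tracing comes from.
#include <cstdio>

int main() {
    struct Mode { const char* name; float scale; };
    const Mode modes[] = {
        { "Quality",           2.0f / 3.0f },  // 1440p -> 960p
        { "Balanced",          0.58f       },  // 1440p -> ~835p
        { "Performance",       0.5f        },  // 1440p -> 720p
        { "Ultra Performance", 1.0f / 3.0f },  // 1440p -> 480p
    };

    const int output_height = 1440;
    for (const Mode& m : modes) {
        int render_height = (int)(output_height * m.scale + 0.5f);
        float pixel_fraction = m.scale * m.scale;   // fraction of output pixels actually rendered
        printf("%-17s renders at %4dp (~%2.0f%% of the output pixels)\n",
               m.name, render_height, pixel_fraction * 100.0f);
    }
    return 0;
}
```

So Performance mode, for example, only shades a quarter of the pixels your monitor ultimately displays – the Tensor Core upscaler is responsible for the rest.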
One little thing we skipped over around the RTX 20 series launch in 2018 was the GeForce Partner Program. GPP was a short lived but long impactful marketing program through which NVIDIA attempted to both reward and restrict their ‘partners’ – AKA add-in board partners, the people who actually make the cards we all buy, that being Gigabyte, MSI, Zotac, formerly EVGA and more – to basically shut out AMD. Kyle Bennett from HardOCP did some amazing reporting at the time, and the crux of the matter was that NVIDIA was essentially forbidding their partners from selling competitors’ cards under the same gaming branding. While the legality of these deals was questioned, Asus in particular ran with it anyway: rather than selling a “Republic of Gamers” branded AMD card alongside an equivalent NVIDIA one, Asus moved its AMD cards to a new “AREZ” brand, keeping the more premium “ROG” branding for NVIDIA. That has since stopped – Asus’ 9070 XT and 5070 Ti cards, for example, are both available under the “Prime” and “TUF” brands.
Of course, that isn’t NVIDIA’s only strong-arm mishap. In 2020, Steve from Hardware Unboxed revealed NVIDIA had decided to punish him for not covering ray tracing (an area NVIDIA was, and still is, generally stronger in, albeit one with somewhat limited user interest) by not providing him any NVIDIA Founders Edition GPUs at launch. While of course NVIDIA isn’t required to give anyone anything, the move was plainly petty and controlling. Clearly once it was made public, the top brass thought so too, because two days later NVIDIA retracted their statement and apologised.
I think it’s also worth mentioning cryptocurrency, as that helped NVIDIA sell an awful lot of GPUs in 2017. As a brief explanation, cryptocurrencies like Bitcoin and Ethereum (pre proof of stake, anyway) use a method for determining legitimate transactions called proof of work. Proof of work is basically a way of attaching a cost to an action. To steal an example from Dan Olson: imagine that to send an email, your PC had to solve a simple maths puzzle first. If you only send a couple of emails a day, you’d never notice any slowdown. But if you wanted to send thousands or millions of emails a day, you’d struggle. Cryptocurrencies use that proof of work method to add new blocks to their chain (hence the term ‘blockchain’), and crucially the difficulty level increases as computing power on the network increases, and as more coins are created (by solving the proof of work puzzles and adding blocks to the chain). GPUs proved to be a perfect fit for that work, and people quickly industrialised the process, buying hundreds of cards to run simultaneously. These buyers, buying in bulk, drove the price of cards through the roof as stock was basically permanently unavailable. While that did improve a little as the price of major cryptocurrencies crashed in 2018, the second wave spike in 2021 didn’t help things. NVIDIA launched LHR, or Lite Hash Rate, versions of their cards that were supposedly meant to have half the hash rate – the speed at which a card can churn through candidate answers to the proof of work puzzle – although that was repeatedly bypassed, so its effectiveness may have been limited. In 2022 NVIDIA dropped the LHR options, deciding it wasn’t worth it now the craze had died down.
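If the email analogy feels too abstract, here’s roughly what the “work” looks like in code – a toy sketch using std::hash as a stand-in for a real cryptographic hash like SHA-256, with a made-up target value:

```cuda
// Toy proof-of-work: keep guessing nonces until the hash of (block data + nonce)
// lands below a target. Lowering the target makes valid answers rarer, which is
// exactly the difficulty knob the network turns as more compute joins in.
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>

int main() {
    const std::string block_data = "previous-hash|transactions|timestamp";
    const uint64_t target = UINT64_MAX / 1000000;  // roughly a 1-in-a-million chance per guess

    std::hash<std::string> hasher;                 // stand-in for SHA-256 etc.
    uint64_t nonce = 0;
    while (hasher(block_data + std::to_string(nonce)) > target) {
        ++nonce;  // this brute-force guessing loop IS the "work"
    }
    printf("found a valid nonce after %llu guesses\n", (unsigned long long)(nonce + 1));
    return 0;
}
```

Every guess is completely independent of every other guess, so a chip that can run thousands of them in parallel demolishes one that runs a handful – which is the whole reason miners were buying gaming GPUs by the pallet.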
And speaking of crazes, that brings us nicely onto the elephant in the silicon: AI. AI is an incredibly wide topic to cover, so I’ll stick to NVIDIA’s part in it, although it is important to know what AI is, and what it isn’t. What we are mostly talking about here is deep learning and neural networks. Deep learning is the process of training neural networks – networks loosely modelled on how our brains work, with neurons building connections to each other, hence the name. Neural networks aren’t new, with designs going back to the 1960s, and in fact here’s a demo of an optical character recognition neural network from 1989. It recognises handwritten and typed numbers – it does just fine even with somewhat difficult to read handwriting, and bubble writing is no problem. This should serve to prove this tech has been around for decades – but it was absolutely supercharged by GPUs. Andrew Ng appears to have been first to publish about using GPUs for this – 30 GTX 280s specifically – reporting that their GPU server trained their network 70 times faster than the CPUs they had been using.
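As for why GPUs were such a supercharger: the bulk of the work in a neural network is embarrassingly parallel. The forward pass of a single fully connected layer, for example, is just one independent weighted sum per output neuron, so each neuron can be handed to its own GPU thread. Here’s a minimal sketch of that one step (training also needs a backward pass, but the shape of the work is the same – lots of big, independent multiply-adds):

```cuda
// Sketch of one dense (fully connected) layer's forward pass on the GPU:
// each thread computes the weighted sum + ReLU for a single output neuron.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dense_layer_forward(const float* weights, const float* bias,
                                    const float* input, float* output,
                                    int n_inputs, int n_outputs) {
    int neuron = blockIdx.x * blockDim.x + threadIdx.x;
    if (neuron >= n_outputs) return;

    float sum = bias[neuron];
    for (int i = 0; i < n_inputs; ++i)
        sum += weights[neuron * n_inputs + i] * input[i];

    output[neuron] = fmaxf(sum, 0.0f);  // ReLU activation
}

int main() {
    const int n_inputs = 784, n_outputs = 128;  // e.g. a 28x28 digit image feeding 128 hidden units
    float *weights, *bias, *input, *output;
    cudaMallocManaged(&weights, n_inputs * n_outputs * sizeof(float));
    cudaMallocManaged(&bias,    n_outputs * sizeof(float));
    cudaMallocManaged(&input,   n_inputs * sizeof(float));
    cudaMallocManaged(&output,  n_outputs * sizeof(float));

    // Dummy values so the example actually runs end to end.
    for (int i = 0; i < n_inputs * n_outputs; ++i) weights[i] = 0.01f;
    for (int i = 0; i < n_outputs; ++i)            bias[i]    = 0.1f;
    for (int i = 0; i < n_inputs; ++i)             input[i]   = 1.0f;

    dense_layer_forward<<<(n_outputs + 127) / 128, 128>>>(weights, bias, input, output,
                                                          n_inputs, n_outputs);
    cudaDeviceSynchronize();
    printf("first neuron's activation: %f\n", output[0]);  // expect 784 * 0.01 + 0.1 = 7.94

    cudaFree(weights); cudaFree(bias); cudaFree(input); cudaFree(output);
    return 0;
}
```

Scale that up to millions of neurons across many layers and you can see why a chip with thousands of little cores beats a handful of big ones.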
NVIDIA’s big break in the market was thanks to AlexNet, the winner of the 2012 ImageNet competition. It was able to correctly identify the most images thanks to the orders of magnitude more training it could do on GPUs, and its use of convolutional neural networks. This was the match that lit the fuse on GPU-based training, which has helped shoot NVIDIA into the stratosphere in terms of value. NVIDIA capitalised on this new found need for their GPUs, developing the first “AI server”, the $149,000 DGX-1, an eight GPU server that offered the performance of “250 CPU based servers in a single box”. It was based on the Pascal architecture, the same as the 10 series GPUs, and had 128GB of HBM2 memory (16GB per GPU), along with dual Xeon E5 CPUs, 512GB of system memory, and four 1.92TB SSDs. NVIDIA gifted the first DGX-1 to OpenAI, which helped them drop their training time from six days to two hours. That’s insane!
NVIDIA updated the DGX-1 to the DGX-2 with sixteen Volta V100 32GB GPUs for a staggering total of 512GB of HBM2 memory onboard, and apparently it can draw up to 10 kilowatts under load. That one started at $400,000. The newer Ampere generation brought the DGX A100, moving to dual AMD EPYC 7742 64 core CPUs with 8 Ampere generation A100 GPUs (the same architecture family as the RTX 30 series). I believe those GPUs could come with either 40GB of HBM2 or 80GB of HBM2e memory, for either 320GB or 640GB of VRAM in total. The next architecture, launched in 2022, was Hopper, which funnily enough brought about the DGX H100. That moved to HBM3 for almost double the data rate – an important feature for AI models – and moved back to Intel with two Xeon Platinum 8480Cs. That’d set you back £379,000. An alteration to the H100 came in mid 2023 with the GH200, which was legitimately insane. NVIDIA calls these “superchips”, and basically it’s a 72 core NVIDIA Grace CPU tied directly to an NVIDIA H100 GPU, meaning the Hopper GPU has access to not only the 96GB of HBM3 or 144GB of HBM3e memory onboard, but also the 480GB of LPDDR5X ECC RAM connected to that CPU. Each 4U chassis can then hold TWO of these boards, and the full GH200 server package can have up to 256 of these GPUs all acting as one, making training or running huge AI models no problem at all. The most recent iteration of that is the newest Blackwell architecture, which is even faster, plus you can now get two of those GPU dies to one CPU, meaning 36 CPUs to a cluster, but 72 GPUs. That’s madness.
These systems are what’s keeping NVIDIA’s stock price and valuation at an all time high, and why they couldn’t care less about us gamers now. Their revenue, their profit, and their insane valuation are all driven by these chips and systems. NVIDIA is putting everything into designing these new chips for their AI customers – the likes of OpenAI, Google and Meta – so it’s no wonder that for the 50 series the “killer feature” was 3x more fake frames than last time, to convince you they’ve made a significant leap forward in performance – you know, the claim that the 5070 is faster than the 4090, when in reality it’s barely faster than a 4070? Yeah. This is why. It sure seems like AI is here to stay, so we can only hope AMD keeps chipping away at NVIDIA’s lead while they focus on their AI stuff (and that AMD doesn’t get sucked into that mess too), but we’ll have to wait and see.