AI data center: A look into the heart of Elon Musk's Tesla AI supercomputer Cortex

By Susanne Braun | Translated by AI | 3 min reading time

Cortex is one of several AI training clusters that Tesla CEO Elon Musk is having his companies build. On X, Musk shared a video from inside the Cortex cluster at Tesla's headquarters, where 50,000 Nvidia GPUs are to be installed in the first stage.

In June 2024, Musk claimed that Optimus would go into limited production in 2025, with plans for more than 1,000 units to be used in Tesla facilities and the possibility of production for other companies in 2026. Optimus is trained by the Tesla supercomputer.
(Image: Tesla)

Artificial intelligence, billed as the driver of the fourth industrial revolution, is supposedly doing wonderful things today and even more in the future; at least if you believe the speeches of entrepreneurs who have invested a great deal of money and time in AI. Yet artificial intelligence is hard to visualize. A few screens running algorithms, or a slightly uncanny picture from a less-than-high-end image generator: is that all there is?

The sheer cost of training and operating AI also rarely gets the spotlight in reporting. That a single AI superchip costs between 30,000 and 70,000 US dollars is already hard to grasp, though still understandable to industry insiders given the technology involved. So is it really so hard to picture tens of thousands of these AI accelerators running simultaneously in server racks? Or is it? You no longer have to imagine it: Elon Musk has paid a visit to Tesla's Cortex AI training cluster and shared a video of it on his social platform X (formerly Twitter).
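To put those per-chip prices into perspective, a quick back-of-the-envelope calculation helps (a sketch: the price range and the 50,000-GPU first stage come from this article; ignoring servers, networking, cooling, and buildings is a simplifying assumption):

```python
# Illustrative cost envelope for the accelerators alone, using the
# per-chip price range and the GPU count cited in the article.
gpu_count = 50_000                        # Cortex, first stage
price_low_usd, price_high_usd = 30_000, 70_000

cost_low = gpu_count * price_low_usd      # 1.5 billion USD
cost_high = gpu_count * price_high_usd    # 3.5 billion USD
print(f"GPUs alone: ${cost_low / 1e9:.1f}B to ${cost_high / 1e9:.1f}B")
# → GPUs alone: $1.5B to $3.5B
```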


Cortex supercluster in Austin

Musk describes Cortex as an AI training supercluster. It is currently being set up on the grounds of Tesla's headquarters in Austin, Texas, to implement "real-world AI". Cortex will work on Tesla's Full Self-Driving (FSD) system and on the system for the autonomous, humanoid Optimus robot, which is to be used in Tesla's production. When completed, Cortex will link 70,000 GPUs. In the first stage, 50,000 Nvidia H100s will be installed; 20,000 chips developed by Tesla itself are to follow later.

As a reminder: the H100 is a GPU developed specifically for AI and high-performance computing. The chip comprises 80 billion transistors, is paired with HBM3 memory delivering several terabytes per second of bandwidth, and has a TDP of up to 700 watts. Nvidia's NVLink interconnect links it to multiple other GPUs.
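Those per-chip figures also allow a rough sanity check of the cluster's overall power draw (a sketch: the roughly 700-watt TDP is Nvidia's public figure for the H100 SXM module; treating all 50,000 GPUs as running at full TDP simultaneously is an assumption):

```python
gpu_count = 50_000
tdp_watts = 700                           # approx. H100 SXM TDP

gpu_power_mw = gpu_count * tdp_watts / 1e6
print(f"GPU TDP alone: {gpu_power_mw:.0f} MW")   # → GPU TDP alone: 35 MW
# The gap up to the cluster's quoted 130 MW leaves room for CPUs,
# networking, storage, power-conversion losses, and cooling.
```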

How loud is a supercomputer?

Back to Cortex. Musk's video from August 26, 2024 shows the work in progress. Based on the video, the authors at Tom's Hardware estimate the following: "The racks appear to be arranged in an array of 16 per row, with about four non-GPU racks dividing the rows. Each computer rack holds eight servers. The 20-second clip shows between 16 and 20 rows of server racks. Roughly speaking, 2,000 GPU servers can be seen, which is less than three percent of the estimated total number." Even this visible fraction alone generates considerable noise.
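The figures in the quote can be cross-checked with simple arithmetic (a sketch: all counts are Tom's Hardware's visual estimates from the clip, not official numbers, and the comparison of visible units against the planned GPU total follows the quote's own framing):

```python
gpu_racks_per_row = 16        # per the quote; ~4 non-GPU racks sit between rows
servers_per_rack = 8
rows_low, rows_high = 16, 20  # rows visible in the 20-second clip

servers_low = rows_low * gpu_racks_per_row * servers_per_rack    # 2048
servers_high = rows_high * gpu_racks_per_row * servers_per_rack  # 2560
print(servers_low, servers_high)   # → 2048 2560, i.e. "roughly 2,000"

# Roughly 2,000 visible units against the planned 70,000 GPUs:
print(f"{2_000 / 70_000:.1%}")     # → 2.9%, "less than three percent"
```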

And in cooled operation the servers consume a great deal of energy, one of the most overlooked costs of AI. Once the first stage of Cortex is complete, the cluster will draw 130 megawatts, roughly the continuous power demand of a small city. When all 70,000 GPUs are in operation, the demand is expected to rise to around 500 megawatts.
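Megawatts measure power draw, not a quantity of energy, so the figures translate most naturally into household equivalents (a sketch: the average household demand of about 1.2 kW is an assumption for illustration, not a figure from the article):

```python
stage1_mw, full_mw = 130, 500
avg_household_kw = 1.2        # assumed average US household power draw

homes_stage1 = stage1_mw * 1_000 / avg_household_kw
homes_full = full_mw * 1_000 / avg_household_kw
print(f"~{homes_stage1:,.0f} homes now, ~{homes_full:,.0f} when complete")
# → ~108,333 homes now, ~416,667 when complete
```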

"Video of the inside of Cortex today, the giant new AI training supercluster being built at Tesla HQ in Austin to solve real-world AI pic.twitter.com/DwJVUWUrb5" — Elon Musk (@elonmusk), August 26, 2024

One supercluster alone is not enough ...

Tesla does at least use Supermicro's liquid cooling technology for Cortex. The company claims that direct liquid cooling can cut electricity costs for the cooling infrastructure by up to 89 percent compared to air cooling. Supermicro CEO Charles Liang drew a somewhat puzzling comparison in July 2024: he said 20 billion trees could be saved if liquid cooling were to catch on in large data centers, presumably referring to all giant data centers combined.
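What an 89 percent cut in cooling electricity could mean for a data center's power usage effectiveness (PUE) can be sketched as follows (the air-cooled baseline PUE of 1.5 is purely an illustrative assumption, not a Supermicro figure, and treating all non-IT overhead as cooling is a simplification):

```python
it_load_mw = 100              # arbitrary IT load for illustration
baseline_pue = 1.5            # assumed air-cooled baseline

cooling_mw = it_load_mw * (baseline_pue - 1)      # 50 MW of overhead
liquid_cooling_mw = cooling_mw * (1 - 0.89)       # 5.5 MW after the cut
liquid_pue = (it_load_mw + liquid_cooling_mw) / it_load_mw
print(f"PUE {baseline_pue} -> {liquid_pue:.3f}")  # → PUE 1.5 -> 1.055
```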

Speaking of other AI training clusters: Cortex, part of the supercomputer cluster at Tesla's Gigafactory, is, as mentioned, not the only supercomputer being built within Musk's ventures. The xAI supercomputer is somewhat better known, and somewhat bigger: there, 100,000 Nvidia H100 GPUs are to train the Grok AI for X Premium users. The xAI training cluster is also set to be expanded by 300,000 B200 GPUs in the coming year (2025). (sb)
