A researcher claims Microsoft and OpenAI may have cracked multi-datacenter distributed training for their AI models based on their 'actions': "Microsoft has signed deals north of $10 billion with fiber companies to connect data centers"

Microsoft CEO Satya Nadella (R) speaks as OpenAI CEO Sam Altman (L) looks on during the OpenAI DevDay event on November 06, 2023 in San Francisco, California. (Image credit: Getty Images | Justin Sullivan)

What you need to know

  • Microsoft and OpenAI may have already cracked multi-datacenter distributed training for their LLMs.
  • The news comes amid rising investor concern over AI's heavy demand for resources, including funding, electricity, and cooling water for data centers.
  • Permits have also reportedly been filed that point to plans to dig between specific data centers.

With AI advancing rapidly and being adopted widely, there's a pressing need to scale up infrastructure to support further growth and more powerful AI systems. However, those efforts have often stalled because of insufficient funds, long construction timelines, permit restrictions, regulations, and limited electricity supply.

Against this backdrop, billionaire Elon Musk recently shared progress on his Tesla Cortex AI supercluster project. The project will reportedly feature 50,000 NVIDIA H100s and 20,000 of the company's custom Dojo AI hardware to support autonomous driving, energy management, and more. However, early projections show the cluster will require an additional 500 MW for power and cooling by 2026.

Major tech corporations in the AI landscape, including Microsoft and OpenAI, have invested heavily in training AI models. However, the process has been held back because training is limited to a single data center. Though late to the AI party, Google owns the most advanced computing systems, placing it miles ahead of competitors such as Microsoft, OpenAI, and Anthropic.

However, Microsoft and OpenAI have reportedly cracked multi-datacenter distributed training, which could be vital to unlocking further advances in AI. In a clip shared on X (formerly Twitter) by James Campbell, a tech enthusiast well-versed in the AI landscape, boutique AI and semiconductor researcher Dylan Patel claims Microsoft and OpenAI have finally figured out a plausible way to train LLMs across multiple data centers.

Patel bases his deduction on Microsoft and OpenAI's actions. "Microsoft has signed deals north of 10 billion dollars with fiber companies to connect their data centers together," the researcher added. "There are some permits already filed to show people they are digging between certain data centers."

The researcher further claims, with "fairly high accuracy," that there are at least five massive data centers across regions that the tech giant is actively trying to connect. Patel estimates their total power usage at north of a gigawatt, depending on the timeframe.

According to Patel:

"Well, each GPU is getting higher power consumption too. The rule of thumb is that a H100 is like 700 watts, but then total power per GPU all-in is like 1200-1400 watts. But next-generation NVIDIA GPUs are like 1200 watts for the GPU. It actually ends up being like 2000 watts all in. There's a little bit of scaling of power per GPU.

You already have 100K clusters. OpenAI in Arizona, xAI in Memphis. Many others are already building 100K clusters of H100s. You have multiple, at least five, I believe GB200 100K clusters being built by Microsoft/OpenAI/their partners for them. It’s potentially even more. 500K GB200s is like a gigawatt and that's online next year.

The year after that, if you aggregate all the data center sites, and how much power… You only look at net adds since 2022, instead of the total capacity at each data center, then you're still north of multi-gigawatt."
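
Patel's rule-of-thumb figures can be sanity-checked with simple arithmetic. The sketch below is only a rough illustration, not his methodology: it multiplies the GPU counts he quotes by his all-in power-per-GPU estimates. The function name is hypothetical, and applying the 2,000-watt "all in" figure to each GB200 is an assumption drawn from the way the quote pairs the two numbers.

```python
# Back-of-the-envelope check of the power figures quoted above.
# All inputs are the rough rule-of-thumb numbers from the quote, not measurements.

H100_ALL_IN_WATTS_LOW = 1200    # quoted lower bound, all-in power per H100
H100_ALL_IN_WATTS_HIGH = 1400   # quoted upper bound, all-in power per H100
NEXT_GEN_ALL_IN_WATTS = 2000    # quoted all-in power per next-generation GPU
                                # (assumed here to apply to each GB200)


def cluster_power_megawatts(gpu_count: int, watts_per_gpu: float) -> float:
    """Total cluster power in megawatts for a given all-in wattage per GPU."""
    return gpu_count * watts_per_gpu / 1_000_000


# One 100K-GPU H100 cluster, using both ends of the quoted range.
h100_low_mw = cluster_power_megawatts(100_000, H100_ALL_IN_WATTS_LOW)
h100_high_mw = cluster_power_megawatts(100_000, H100_ALL_IN_WATTS_HIGH)

# Five 100K-GPU GB200 clusters, i.e. the 500K GB200s mentioned in the quote.
gb200_total_mw = cluster_power_megawatts(5 * 100_000, NEXT_GEN_ALL_IN_WATTS)

print(f"100K H100 cluster: {h100_low_mw:.0f}-{h100_high_mw:.0f} MW")
print(f"500K GB200s: {gb200_total_mw:.0f} MW ({gb200_total_mw / 1000:.1f} GW)")
```

Under these assumptions, a single 100K H100 cluster lands at roughly 120 to 140 MW, and 500K GB200s come to about 1,000 MW, which matches the "like a gigawatt" figure in the quote.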

Taken together, Microsoft's multibillion-dollar fiber deals, made despite investors raising concerns, and the data centers where it's reportedly building 100K clusters suggest that Microsoft and OpenAI might have cracked multi-datacenter distributed training.

Kevin Okemwa
Contributor

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You'll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.

  • Roccy
    It will be interesting to see how this power-hungry tech fares with the looming degradation of the US power grid as green power takes over from the current grid's more reliable sources that are already starting to be shut down.
  • fjtorres5591
    Well, MS is looking at nuclear (reactivating Three Mile Island according to some reports), fusion (the deal with Helion for 2028) and lots of co-located solar. They're placing bets on lots of potential solutions.
    Google is betting on geothermal fracking.
    The others?
    Unclear.
  • nocturn9x
    Honestly, the more I look into Helion the more it seems like it's a lot of marketing/PR nonsense. But I'm in no way a fusion expert so I guess I'll just see what happens
  • fjtorres5591
    The thing about Helion is their approach to fusion isn't the miniature star approach of the tokamaks but more of a first principles approach. Think airplanes and helicopters versus ornithopters.

    Instead of trying to keep a mini star of super-hot plasma stable long enough to extract useful heat, the Helion system uses pulsing electromagnetic forces to collide plasma clouds, generate energy, and extract it (rather like an electromagnetic diesel-style engine). It's not reliant on thermodynamic cycles or long containment times but on many tiny fusion "explosions" in sequence.

    Much like the Farnsworth fusors and Bussard Polywell wiffle balls, their approach already works, but it hasn't yet produced enough energy to be useful. They need to scale up their device pulse rates successfully to produce more energy than they consume rather than grow physically bigger or increase containment times. To date their prototypes have scaled according to theory. They may or may not succeed but they haven't failed yet.

    The PR hype is actually necessary because, remember, it is a private tech startup rather than a multinational jobs program like ITER or a government lab project. They need to lure investors to support them with the promise of big profits, and in today's economic environment venture capital is harder to raise for anything other than "AI".

    Whatever they have, they convinced Microsoft to sign up for *production* output, with economic penalties if they don't deliver.

    https://www.windowscentral.com/microsoft/microsoft-is-reportedly-eyeing-nuclear-energy-for-its-ai-ventures-following-the-techs-exorbitant-power-consumption
  • dougpaw57
    SKYNET