Windows Central

    A researcher claims Microsoft and OpenAI may have cracked multi-datacenter distributed training for their AI models based on their 'actions': "Microsoft has signed deals north of $10 billion with fiber companies to connect data centers"

By Kevin Okemwa

1 day ago


    What you need to know

    • Microsoft and OpenAI may have already cracked multi-datacenter distributed training for their LLMs.
    • This news comes amid rising concern among investors over AI's heavy demand for resources, including capital, cooling water for data centers, and electricity to power the hardware.
    • Filed permits also point to the companies' plans to dig fiber routes between specific data centers.

    With the rapid advances and widespread adoption of AI, there's a dire need to scale infrastructure to support growth and more powerful AI systems. However, efforts toward these goals have been hampered by insufficient funding, construction timelines, permit restrictions, regulations, and limited electricity supply.

    To this end, billionaire Elon Musk recently shared progress on his Tesla Cortex AI supercluster project. The project will reportedly feature 50,000 NVIDIA H100s and 20,000 of the company's custom Dojo AI chips to foster autonomous driving, energy management, and more. However, early projections show the cluster will require an additional 500 MW for power and cooling by 2026.

    Major tech corporations in the AI landscape, including Microsoft and OpenAI, have heavily invested in training AI models. However, the process has been constrained by the fact that training is typically limited to a single data center. Though seen as late to the generative AI party, Google owns some of the most advanced computing systems, placing it miles ahead of competitors like Microsoft, OpenAI, and Anthropic.

    However, Microsoft and OpenAI have reportedly cracked multi-datacenter distributed training, which could be vital to unlocking greater heights for AI. According to a clip shared on X (formerly Twitter) by James Campbell, a tech enthusiast well-versed in the AI landscape, Dylan Patel, a boutique AI and semiconductor researcher, claims Microsoft and OpenAI have finally figured out a plausible way to train LLMs across multiple data centers.

    Patel attributes his deductions to Microsoft and OpenAI's actions. "Microsoft has signed deals north of 10 billion dollars with fiber companies to connect their data centers together," the researcher added. "There are some permits already filed to show people they are digging between certain data centers."

    The researcher further claims, with "fairly high accuracy," that there are at least five massive data centers across regions that the tech giant is actively trying to connect. Patel estimates total power usage north of a gigawatt, depending on the timeframe.
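Patel doesn't describe the training scheme itself, but cross-datacenter training is commonly discussed as hierarchical gradient averaging: GPUs synchronize frequently within a site over fast local interconnect, and only occasionally across sites over the slower fiber links that deals like Microsoft's would provide. A minimal sketch of that pattern, with all names and numbers invented for illustration and no claim that this is Microsoft's or OpenAI's actual design:

```python
def average(grads):
    """Element-wise mean of a list of gradient vectors."""
    n = len(grads)
    return [sum(vals) / n for vals in zip(*grads)]

def hierarchical_allreduce(sites):
    """sites: one entry per data center, each a list of per-GPU gradient vectors.

    Step 1: reduce within each site over the fast local interconnect.
    Step 2: reduce the per-site results over the slow inter-site fiber link,
            which is the step dedicated fiber between data centers exists to serve.
    """
    site_means = [average(site) for site in sites]   # intra-site (fast, frequent)
    return average(site_means)                       # inter-site (slow, infrequent)

# Two toy "data centers", two "GPUs" each, 2-element gradients.
site_a = [[1.0, 2.0], [3.0, 4.0]]
site_b = [[5.0, 6.0], [7.0, 8.0]]
print(hierarchical_allreduce([site_a, site_b]))  # [4.0, 5.0]
```

Note that averaging the per-site means equals the global mean here only because both toy sites have the same number of GPUs; a real system would weight sites by batch size.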

    According to Patel:

    "Well, each GPU is getting higher power consumption too. The rule of thumb is that an H100 is like 700 watts, but then total power per GPU all-in is like 1200-1400 watts. But next-generation NVIDIA GPUs are like 1200 watts for the GPU. It actually ends up being like 2000 watts all-in. There's a little bit of scaling of power per GPU.

    You already have 100K clusters. OpenAI in Arizona, xAI in Memphis. Many others are already building 100K clusters of H100s. You have multiple, at least five, I believe, GB200 100K clusters being built by Microsoft/OpenAI or their partners for them. It's potentially even more. 500K GB200s is like a gigawatt, and that's online next year.

    The year after that, if you aggregate all the data center sites and how much power they draw, and you only look at net adds since 2022 instead of the total capacity at each data center, then you're still north of multi-gigawatt."
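Patel's figures hold up to a back-of-envelope check. Using his stated per-GPU numbers, with the midpoint assumed where he gives a range:

```python
# Rough arithmetic on the per-GPU power figures quoted above.
H100_ALL_IN_W = 1300    # Patel: ~1200-1400 W all-in per H100; midpoint assumed
GB200_ALL_IN_W = 2000   # Patel: next-gen GPU is "like 2000 watts all-in"

cluster_100k_mw = 100_000 * H100_ALL_IN_W / 1e6   # 100K H100 cluster, in megawatts
gb200_500k_gw = 500_000 * GB200_ALL_IN_W / 1e9    # 500K GB200s, in gigawatts

print(f"100K H100 cluster: ~{cluster_100k_mw:.0f} MW")  # ~130 MW
print(f"500K GB200s: ~{gb200_500k_gw:.1f} GW")          # ~1.0 GW, matching the quote
```

At 2,000 watts all-in per GPU, 500K GB200s works out to exactly the "like a gigawatt" Patel cites.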

    Factoring in Microsoft's billions of dollars in fiber deals, signed despite investor concern, together with the data centers where it's reportedly building 100K-GPU clusters, it's plausible that Microsoft and OpenAI have indeed cracked multi-datacenter distributed training.

