Tom's Hardware

Apple skips Nvidia's GPUs for its AI models, uses thousands of Google TPUs instead

By Mark Tyson

5 hours ago

Apple has disclosed that it did not use Nvidia’s hardware accelerators to develop its recently announced Apple Intelligence features. According to an official Apple research paper (PDF), it instead relied on Google TPUs to crunch the training data behind the Apple Intelligence Foundation Language Models.

Systems packing Google TPUv4 and TPUv5 chips were instrumental to the creation of the Apple Foundation Models (AFMs). These models, AFM-server and AFM-on-device, were designed to power the online and offline Apple Intelligence features announced at WWDC 2024 in June.

https://img.particlenews.com/image.php?url=0WKw77_0uhj5amF00 — (Image credit: Apple research paper)

AFM-server is Apple’s biggest LLM, and thus it remains online only. According to the recently released research paper, Apple’s AFM-server was trained on 8,192 TPUv4 chips “provisioned as 8 × 1,024 chip slices, where slices are connected together by the data-center network (DCN).” Pre-training was a three-stage process: 6.3T tokens in the first stage, a further 1T tokens in the second, and 100B tokens for context lengthening in the third.
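To put the quoted figures side by side, here is a minimal Python sketch of the arithmetic. The numbers come from the article; the variable names and stage labels are illustrative and are not taken from Apple's paper.

```python
# Figures quoted in the article; names below are illustrative only.
SLICES = 8
CHIPS_PER_SLICE = 1_024

# Pre-training stages (in tokens), in the order the article lists them.
# The dictionary keys are labels chosen for this sketch.
STAGE_TOKENS = {
    "first_stage": 6_300_000_000_000,         # 6.3T
    "second_stage": 1_000_000_000_000,        # 1T
    "context_lengthening": 100_000_000_000,   # 100B
}


def total_chips(slices=SLICES, chips_per_slice=CHIPS_PER_SLICE):
    """Number of TPUv4 chips across all slices."""
    return slices * chips_per_slice


def total_tokens(stages=STAGE_TOKENS):
    """Sum of tokens over all pre-training stages."""
    return sum(stages.values())


if __name__ == "__main__":
    print("TPUv4 chips:", total_chips())
    print("Pre-training tokens (trillions):", total_tokens() / 1e12)
```

By this tally, the eight slices account for the 8,192 chips quoted, and the three stages add up to 7.4T tokens.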

Apple said the data used to train its AFMs included information gathered by the Applebot web crawler (which heeds robots.txt) plus various licensed “high-quality” datasets. It also used carefully chosen code, math, and public datasets.
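For readers unfamiliar with what heeding robots.txt means in practice, the sketch below uses Python's standard `urllib.robotparser` module to check whether a given user agent may fetch a URL. The robots.txt content and the example.com URLs are made up for illustration, and this is not a description of how Applebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Applebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Applebot", "https://example.com/articles/1"),
        ("Applebot", "https://example.com/private/page"),
        ("OtherBot", "https://example.com/articles/1"),
    ]:
        print(agent, url, is_allowed(EXAMPLE_ROBOTS_TXT, agent, url))
```

A well-behaved crawler runs a check like this before each request and skips any URL the site owner has disallowed for its user agent.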

Of course, the AFM-on-device model is significantly pruned, but Apple reckons its knowledge distillation techniques have optimized this smaller model’s performance and efficiency. The paper reveals that AFM-on-device is a 3B-parameter model, distilled from the 6.4B server model, which was trained on the full 6.3T tokens.

Unlike AFM-server, which was trained on TPUv4 chips, the AFM-on-device model was prepared on Google TPUv5 clusters. The paper reveals that “AFM-on-device was trained on one slice of 2,048 TPUv5p chips.”
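As a rough illustration of the general idea behind knowledge distillation, the sketch below computes a textbook soft-target loss, in which a small "student" model is penalized for diverging from a larger "teacher" model's output distribution. The article does not describe Apple's actual loss function, so the formulation, the temperature value, and the function names here are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the teacher's.

    Generic soft-target formulation, not Apple's documented method.
    Assumes moderate logits so no student probability underflows to zero.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    print("matching student:", distillation_loss([2.0, 1.0, 0.1], teacher))
    print("mismatched student:", distillation_loss([0.1, 1.0, 2.0], teacher))
```

The loss is lowest when the student reproduces the teacher's distribution, which is the signal a smaller model can train against in addition to, or instead of, the raw training labels.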

It is interesting that Apple has released such a detailed paper, revealing the techniques and technologies behind Apple Intelligence. The company isn’t renowned for its transparency but seems to be trying hard to impress in AI, perhaps because it has been late to the game.

https://img.particlenews.com/image.php?url=15vsX1_0uhj5amF00 — (Image credit: Apple research paper)

According to Apple’s in-house testing, AFM-server and AFM-on-device excel in benchmarks such as Instruction Following, Tool Use, Writing, and more. We’ve embedded the Writing Benchmark chart above as one example.

If you are interested in deeper details on the training and optimizations used by Apple, as well as further benchmark comparisons, check out the PDF linked in the intro.
