From Cloud Provider to Platform Builder: AWS Targets Nvidia With Its Own AI Chips and Racks for Data Centers

By Manuel Christa | Translated by AI


New hardware, proprietary models, racks for customer data centers, and an agent stack are meant to tie the development and operation of AI systems more closely to AWS. For developers in industrial and embedded environments, this offers plenty of opportunity but also creates new dependencies.

Trainium3: With its own AI accelerator built on a 3-nm process, AWS aims to become more independent from Nvidia and scale training workloads more efficiently. (Image: Amazon)

AWS is using its re:Invent 2025 conference to position itself not just as a cloud provider but as a full AI infrastructure supplier. With Trainium3, AWS introduces the third generation of its AI accelerators. The chips are manufactured on TSMC's 3-nm process and are integrated into the new Trn3 UltraServers, which connect up to 144 Trainium3 chips. According to AWS, these systems deliver more than four times the performance of the previous Trainium2 generation while cutting energy consumption by around 40 percent. They target training and inference workloads for large language models, mixture-of-experts architectures, and long-context processing.

This is a strategically important point for the semiconductor world: AWS is trying to avoid relying solely on Nvidia GPUs. As a dedicated accelerator, Trainium3 competes directly with Google's TPUs and with GPU clusters built on H100 or Blackwell cards. At the same time, AWS remains a major Nvidia customer but uses its own chips as a price-performance lever.

In parallel, AWS is introducing Graviton5, the fifth generation of its Arm server CPUs. The chip offers up to 192 cores and roughly five times the L3 cache capacity, and is expected to deliver up to 25 percent more compute in EC2 M9g instances compared to Graviton4. For workloads such as EDA, simulation, databases, or big-data analytics, this could be highly relevant to electronics developers: compile jobs and simulation runs can execute faster and more cost-effectively on Graviton instances.
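In day-to-day use, little changes for developers: Graviton instances are launched through the same EC2 APIs as x86 instances. A minimal boto3 sketch; the m9g.xlarge type name is an assumption based on the announced M9g family, and the AMI ID is a placeholder for any Arm64 image:

```python
# Minimal sketch: launching a Graviton-based EC2 instance via boto3.
# "m9g.xlarge" is an assumed M9g type name and the AMI ID is a
# placeholder; check the EC2 catalog for your region before use.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: any arm64 AMI
    InstanceType="m9g.xlarge",        # assumed M9g instance type
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```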

Nova 2: AWS Catches Up in the LLM Race

In its model stack, AWS aims to keep pace with OpenAI, Google, and Anthropic with the Nova 2 family. Nova 2 Lite, Pro, Sonic, and Omni cover different use cases: Lite addresses cost-sensitive standard workloads, Pro targets complex reasoning and serves as a "teacher" model for smaller variants, Sonic handles real-time speech interaction, and Omni integrates text, image, video, and audio processing into a single model. All models run on Amazon Bedrock and can be enriched with proprietary data or further customized using Nova Forge.

Bedrock is positioned as AWS's counter to Azure OpenAI Service and Google's Vertex AI. For embedded and electronics companies, this means they can build AI functions (service assistants, code generation, document analysis, edge device management) on AWS model families without having to train their own foundation models.
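For orientation, a minimal sketch of such an AI function using boto3's Converse API against Bedrock; the Nova 2 model ID is an assumption, the exact identifiers appear in the Bedrock model catalog for each region:

```python
# Minimal sketch: calling a Nova model on Amazon Bedrock with the
# Converse API. The model ID below is an assumed placeholder for
# Nova 2 Lite; look up the real ID in the Bedrock model catalog.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="amazon.nova-2-lite-v1:0",  # assumed Nova 2 Lite ID
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize this device service log: ..."}],
    }],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```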

AI Factories: Cloud Stack in the Customer's Own Data Center

This addresses data sovereignty: with the new AI Factories, AWS provides dedicated AI infrastructure directly in the customer's data center. The company provides space, power, and network connectivity, while AWS delivers and operates complete racks with Trainium3 UltraServers and Nvidia GPUs, including network infrastructure, storage, and services such as Bedrock and SageMaker.

From the perspective of regulated industries, which include many electronics and automation customers, this is significant: sensitive production, development, or device data never leaves the company's own data center, yet a scalable AI stack remains available. AWS positions AI Factories as a sovereign mini-region, particularly suited to industrialized countries with strict data residency requirements and critical-infrastructure regulations.

Frontier Agents: Autonomous AI "Employees"

The fourth focus is called "Frontier Agents." AWS describes these as a new class of autonomous AI agents that are not limited to answering individual prompts but are designed to carry out projects over hours or days. In the first wave, AWS introduces three: the Kiro Autonomous Agent as a virtual developer, a Security Agent, and a DevOps Agent. They are built on Bedrock AgentCore, which provides memory functions, policies, and automated quality checks for agents.

In parallel, AWS is launching Nova Act, a service designed specifically for browser and UI automation. Developers define workflows; Nova Act controls applications in the browser, calls APIs, and escalates to a human when necessary. According to AWS, these agents achieve a success rate of around 90 percent on early workloads. This is particularly interesting for industrial and electronics environments where manual work is still done in web portals, PLM systems, cloud consoles, or support tools.
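How such a workflow is wired up in code: a minimal sketch based on the Python SDK from the Nova Act research preview (pip install nova-act); class names and signatures are assumptions and may differ in the version shipping with this release, and the portal URL is hypothetical.

```python
# Minimal sketch: scripting a browser workflow with the Nova Act SDK.
# Based on the research-preview package; the exact API may differ in
# the generally available version.
from nova_act import NovaAct

# The portal URL and the task are hypothetical examples.
with NovaAct(starting_page="https://portal.example.com") as nova:
    # The agent drives the browser itself; the task is described in
    # natural language and can be split into several act() steps.
    nova.act("log in, open the list of open support tickets, "
             "and export it as a CSV file")
```

(mc)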
