Hardware Health

Microsoft Corporation
📍 Multiple Locations, United States, United States 💼 Full-time 🕒 Posted June 23, 2026

Job Description

**Overview**

Microsoft AI operates one of the world’s most advanced AI training infrastructures, featuring multi-gigawatt clusters spanning tens of thousands of high-performance GPUs, ultra-low-latency NVLink/NVSwitch networks, and innovative liquid-cooling systems. Our team is seeking a Member of Technical Staff, Hardware Health, to ensure these systems deliver sustained reliability, performance, and availability across exascale-class deployments.

We work closely with research, hardware, datacenter, and platform engineering teams to develop predictive health models, failure detection frameworks, and autonomous remediation systems that keep our AI clusters operating at frontier scale.

Our newly formed organization, Microsoft AI, is dedicated to advancing Copilot and other consumer AI products and research. The team is responsible for Copilot, Bing, Edge, and generative AI research. Join us and help shape the future of personal computing.

Microsoft’s ...

Ready to Apply?

Submit your application today and join our talented team at Microsoft Corporation.

Submit Application

Job Details

  • Location Multiple Locations, United States
  • Job Type Full-time
  • Category other-general
  • Posted Date June 23, 2026
  • Application Deadline July 02, 2026