AI Cluster Design Engineer

Advanced Micro Devices, Inc.
$157,040.00/Yr.-$235,560.00/Yr.
United States, Texas, Austin
7171 Southwest Parkway
Oct 01, 2025

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE

We are seeking a highly skilled systems engineer to architect and design scalable AI/HPC clusters. This role involves evaluating and selecting compute, storage, networking, power delivery, and cooling solutions to optimize performance and reliability across global deployments. You will collaborate with cross-functional teams to deliver cutting-edge infrastructure for AI and high-performance computing workloads.

THE PERSON

An experienced systems architect with a strong background in HPC, AI infrastructure, and data center engineering. You bring deep technical knowledge of compute and networking components, a strategic mindset for system-level design, and the ability to collaborate across diverse technical domains. You thrive in fast-paced environments and are passionate about building efficient, scalable, and reliable compute platforms.

KEY RESPONSIBILITIES

Design scalable AI/HPC clusters including compute, storage, networking, power delivery, and cooling
Evaluate and select CPUs, GPUs, accelerators, interconnects, and memory configurations for optimal performance
Develop advanced thermal and power delivery strategies for high-density deployments
Understand global power delivery and regulatory requirements (U.S., EMEA, Asia, etc.)
Define power budgets, redundancy schemes, and fault tolerance mechanisms
Design network topologies to maximize cluster performance across workload types
Assess trade-offs and performance characteristics of various network architectures
Design and optimize storage solutions (e.g., Lustre, Ceph) for AI/HPC clusters
Collaborate with hardware, software, network, data center, and operations teams to deliver robust infrastructure

PREFERRED EXPERIENCE

Extensive experience in HPC, AI infrastructure, or data center systems engineering
Experience with liquid cooling or advanced thermal management
Experience with rack-level power distribution
Contributions to open-source HPC or AI infrastructure projects

ACADEMIC CREDENTIALS

Bachelor's or Master's degree in Electrical Engineering, Computer Engineering, Computer Science, or related field or equivalent work experience

#LI-KW1 #Remote

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.