Expoint – all jobs in one place

Apple Systems Architect 
United States, California, Cupertino 
304890760

Today
As a Technical Specialist, the individual will negotiate and model Data Center infrastructure solution details from a computer architecture and performance perspective. Collaborate and leverage domain expertise to provide guidance and leadership to multi-functional engineering teams to integrate cluster network architectures into the overall system architecture to ensure efficient data flow, impact product definitions, and meet scalability requirements. Contribute to the definition of rack and cluster capabilities, configurations, and scale-out requirements to support the deployment of dense compute and specialty compute workloads and applications, including but not limited to the following:
  • Pathfinding novel Data Center cluster and node architecture choices with a broad group of architects and system engineers, networking, technical leads, and HW/SW partners.
  • Providing guidance in optimized network designs for large-scale AI/ML clusters, considering factors like bandwidth, latency, and scalability.
  • Influencing networking hardware and software component selection for the cluster, including switches, adapters, and protocols.
  • Analyzing network traffic patterns and implementing strategies to improve data transfer speeds within the cluster for target topologies and choice configurations.
  • Exploring and championing new product-level features and workflows.
  • Mentoring junior engineers in best practices and data-driven processes.
  • BS/MS in Computer Engineering or equivalent with 10+ years of relevant industry experience.
  • Possesses functional experience in defining and deploying datacenter cluster networking architectures over highly dense mesh networks and interconnected nodes for AI/ML based workloads.
  • Proven track record of deploying AI/ML experiences at scale in large data centers; has strong experience with deployment of modern ML architectures.
  • Possesses strong technical breadth across several computer subsystem technologies, e.g., CPU, GPU, TPU, storage, memory, power delivery, power management, high-speed networking, I/O, thermal management.
  • Has core competence and subject matter expertise in architecting complex systems for general-purpose compute or specialty compute (GPUs, TPUs) running datacenter workloads for AI/ML applications.
  • Detailed knowledge of network protocols, with expertise in Ethernet, InfiniBand, RoCE, UE, UAL, or other relevant networking protocols.
  • Has strong analytical skills and strong verbal and written communication skills. Ability to summarize and effectively communicate technical issues and actions to key partners and leadership teams.
  • Ability to comprehend the roles of HW/FW/SW layers and how they interact in system design.
  • Ability to create, review, and approve engineering requirement specification documents.
  • Prior experience with data center and large-scale cluster systems is desired.
  • Machine Learning experience is desired.
  • Prior experience in performance modeling is desired.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.