Expoint - all jobs in one place

The place where the best experts and companies meet

KLA Linux Engineer - HPC Design 
United States, Michigan, Ann Arbor 
801703965

12.03.2025

Responsibilities for this exciting role will include:
  • Design, implementation & support of high-performance compute clusters
  • Solid understanding of HPC systems, including CPU/GPU architecture, scalable and robust storage, high-bandwidth interconnects, and cloud-based computing architectures
  • Apply attention to detail to generate HW BOMs for HPC clusters, manage vendors, and coordinate HW release activities
  • Use strong Linux skills to configure appropriate operating systems for the HPC system
  • Understand and assemble the project specifications and performance requirements at the subsystem and system levels. Adhere to project timelines to ensure program milestones are completed on time.
  • Support design and release of new products to manufacturing and ultimately the customer, providing quality golden images, procedures, scripts and documentation to the manufacturing team and customer support team.
  • Lead EOL Parts Re-Qualification for long term system deployments
  • Support in-house as well as in-field critical issues
Required Qualifications:
  • Proven, in-depth, distribution-agnostic knowledge of Linux systems (SUSE, Red Hat, Rocky, Ubuntu)
  • Experience designing and maintaining robust storage
  • Strong HPC HW knowledge, especially in the server, GPU, networking, storage, scheduler, BIOS, and BMC areas
  • Experience with systemd, network boot (PXE), and Linux HA
  • Strong understanding of TCP/IP fundamentals and knowledge of protocols such as DNS, DHCP, HTTP, LDAP, and SMTP
  • Strong knowledge of storage file shares: NFS/CIFS
  • Ability to code and develop Shell and Python scripts.
  • Experience with one or more configuration management utilities (Ansible, Salt, Chef, Puppet, etc.)
Preferred Qualifications:
  • Strong DevOps focus: knowledge of setting up continuous development pipelines and Git-based repository software
  • Hypervisor knowledge: VMware, Proxmox, or XCP-ng
  • Knowledge of Apache/Nginx, including setting up proxies and reverse proxies, application server routing, and load balancing (HAProxy)
  • HPC Schedulers: SGE/SLURM
  • Monitoring tools: Prometheus, Grafana, Nagios
  • Database Technologies: MySQL
  • BS or MS degree with 5+ years of proven experience
  • Degree in Computer Engineering, Electrical Engineering, or a related field
Skills and Abilities:
  • Team Orientation & Interpersonal – Highly motivated teammate with the ability to develop and maintain collaborative relationships at all levels within and external to the organization.
  • Organization & Time Management – Able to plan, schedule, prioritize, and follow up on tasks related to the job to achieve goals within or ahead of established time frames.
  • Multitasking – Able to organize, coordinate, prioritize, and perform multiple tasks simultaneously, and to swiftly assess a situation, determine a logical course of action, and apply the appropriate response.
  • Adaptability to Change – Able to be flexible and encouraging, and to assimilate change positively and proactively in a rapid-growth environment.
  • Outstanding teammate with excellent written and verbal communication skills.

Minimum Qualifications

Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years