Expoint - all jobs in one place

Where the best experts and companies meet

KLA HPC Performance Engineer 
United States, California, Milpitas 
617616310

05.05.2024

Responsibilities for this exciting role will include:
  • Design, implementation & support of high-performance compute clusters
  • Solid knowledge of HPC systems, including CPU/GPU architecture, scalable and robust storage, high-bandwidth interconnects, and cloud-based computing architectures
  • Ability to dive deep to find and resolve processing inefficiencies and drive optimizations that improve cluster utilization. This may include job scheduling improvements, infrastructure optimizations, algorithm and code optimizations, vector processing, parallel processing, data transfer timing analysis, and detailed timing evaluation of cluster operation.
  • Use strong Linux skills to configure appropriate operating systems for the HPC system
  • Understand and assemble the project specifications and performance requirements at the subsystem and system levels. Adhere to and drive project timelines to ensure program milestones are completed on time.
  • Support design and release of new products to manufacturing and ultimately the customer, providing quality golden images, procedures, scripts and documentation to the manufacturing team and customer support team.
Required Qualifications:
  • Proven, in-depth, distribution-agnostic knowledge of Linux systems (SUSE, Red Hat, Rocky, Ubuntu)
  • In-depth knowledge of parallel programming, vector-based processing, distributed computing, and code optimization on CPUs and GPUs
  • Experience with vector processing and multi-threading technologies and libraries such as SIMD, AVX, IPP, MKL, OpenCV, OpenMP, OpenCL, MPI, TBB, and CUDA
  • Knowledge of performance profilers such as Intel VTune, NVIDIA Nsight Compute, AMD uProf, and perf, as well as experience using custom profiling and telemetry tools to identify pinch points. Coding skills to implement parsing utilities and graphing functions for visualizing performance data.
  • Knowledge of HPC job schedulers and how they function.
  • Ability to find bottlenecks and drive them to closure, whether in data movement, code execution timing, or job scheduling.
  • Strong HPC hardware knowledge, especially of servers, GPUs, networking, storage, BIOS, and BMC.
  • Ability to develop Shell and Python scripts for building test environments
Preferred Qualifications:
  • Kubernetes, Harbor, Prometheus & Grafana experience
  • BS or MS degree plus 3 to 5 years of proven experience
  • Degree in Computer Engineering, Electrical Engineering, or a related field
Skills and Abilities:
  • Team Orientation & Interpersonal – Highly motivated teammate with the ability to develop and maintain collaborative relationships at all levels, both within and outside the organization.
  • Organization & Time Management – Able to plan, schedule, organize, and follow up on tasks related to the job to achieve goals within or ahead of established time frames.
  • Multitasking – Ability to organize, coordinate, prioritize, and perform multiple tasks simultaneously, and to swiftly assess a situation, determine a logical course of action, and apply the appropriate response.
  • Adaptability to Change – Able to be flexible and supportive, and to assimilate change positively and proactively in a rapid-growth environment.
  • Outstanding teammate with excellent written and verbal communication skills.

Minimum Qualifications

Typically requires a Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years