KLA Product Development Engineer, High-Performance Computing
United States, California, Milpitas
383972886

12.03.2025

We are seeking passionate individuals to join our High-Performance Computing (HPC) team in the Broadband Plasma (BBP) Division. Our team provides a cutting-edge HPC computational platform for executing image processing algorithms and enabling real-time wafer inspection. As a Product Development Engineer for an embedded Linux HPC cluster, which is part of the KLA wafer inspection tool, you will:

  • Review requirements and translate them into an optimized HPC cluster design.

  • Create Operating System Golden Images to enable HPC Application workloads.

  • Work with multiple stakeholders to drive Hardware/OS stack qualification.

  • Work on performance tuning, compute optimization, and diagnostics development.

  • Manage design development efforts, assist with documentation for Manufacturing/Service teams, and support L4 escalations.

  • Travel internationally as needed, approximately 2-3 times per year.

Required Qualifications:

  • In-depth knowledge of one or more Linux distributions (SUSE, Red Hat, CentOS, Ubuntu), including experience with systemd, network boot/PXE, and Linux HA.

  • Experience with one or more configuration management utilities (Salt, Chef, Puppet, etc.).

  • Proficiency in shell scripting (Bash) and Python, with a strong understanding of object-oriented concepts.

  • Strong understanding of TCP/IP fundamentals and knowledge of DNS, DHCP, and InfiniBand fabric troubleshooting.

  • Good working knowledge of x86 hardware platforms and proven ability to benchmark performance across various hardware platforms with GPUs.

  • Familiarity with observability tools and proven ability to collect metrics, create visuals, and analyze them for data-based decision-making.

  • Excellent written and verbal communication skills.

This position offers flexibility and will require at least three days in the office.

Skills and Abilities:

  • Team Orientation & Interpersonal Skills: Highly motivated team player with the ability to develop and maintain collaborative relationships at all levels within and outside the organization.

  • Organization & Time Management: Able to plan, schedule, organize, and follow up on tasks to achieve goals within or ahead of established time frames.

  • Multi-tasking: Ability to efficiently organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously, swiftly assess situations, determine logical courses of action, and apply appropriate responses.

  • Adaptability to Change: Flexible and supportive, able to positively and proactively assimilate change in a rapid growth environment.

Preferred Qualifications:

  • Degree in Computer Science, Data Science, Computer Engineering, Electrical Engineering, or related fields.

    • BS: 6+ years of experience

    • MS: 3+ years of experience

    • PhD: 1+ year of experience

Optional Qualifications:

  • DevOps focus: Knowledge of setting up a continuous development pipeline (Jenkins), repository software (Git-based), and Docker containers.

  • Knowledge of Apache/Nginx, setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy).

  • Working knowledge of Prometheus/Grafana.

  • Knowledge of PKI & SSL/TLS certificate management.

Minimum Qualifications:

  • Bachelor's degree + 5 years of experience

  • Master's degree + 3 years of experience

  • PhD + 0 years of experience