We are seeking passionate individuals to join our High-Performance Computing (HPC) team in the Broadband Plasma Division (BBP). Our team provides a cutting-edge HPC Computational Platform for executing image processing algorithms and enabling real-time wafer inspections. As a Product Development Engineer for an Embedded Linux HPC Cluster, which is part of the KLA Wafer inspection tool, you will:
Review requirements and translate them to design an optimized HPC cluster.
Create Operating System Golden Images to enable HPC Application workloads.
Work with multiple stakeholders to drive Hardware/OS stack qualification.
Drive performance tuning, compute optimization, and diagnostics development.
Manage design development efforts, assist with documentation for Manufacturing and Service teams, and support L4 escalations.
Travel internationally as needed, approximately 2-3 times per year.
Required Qualifications:
In-depth knowledge of one or more Linux distributions (SUSE, Red Hat, CentOS, Ubuntu), including experience with systemd, network boot/PXE, and Linux HA.
Experience with one or more configuration management utilities (Salt, Chef, Puppet, etc.).
Proficiency in shell scripting (Bash) and Python, with a strong understanding of object-oriented concepts.
Strong understanding of TCP/IP fundamentals and knowledge of DNS, DHCP, and InfiniBand fabric troubleshooting.
Good working knowledge of x86 hardware platforms and proven ability to benchmark performance across various hardware platforms with GPUs.
Familiarity with observability tools and proven ability to collect metrics, create visuals, and analyze them for data-based decision-making.
Excellent written and verbal communication skills.
This position offers flexibility and will require at least three days in the office.
Skills and Abilities:
Team Orientation & Interpersonal Skills: Highly motivated team player with the ability to develop and maintain collaborative relationships at all levels within and outside the organization.
Organization & Time Management: Able to plan, schedule, organize, and follow up on tasks to achieve goals within or ahead of established time frames.
Multi-tasking: Ability to efficiently organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously, swiftly assess situations, determine logical courses of action, and apply appropriate responses.
Adaptability to Change: Flexible and supportive, able to positively and proactively assimilate change in a rapid growth environment.
Preferred Qualifications:
Degree in Computer Science, Data Science, Computer Engineering, Electrical Engineering, or related fields.
BS: 6+ years of experience
MS: 3+ years of experience
PhD: 1+ year of experience
Optional Qualifications:
DevOps focus: Knowledge of setting up a continuous development pipeline (Jenkins), repository software (Git-based), and Docker containers.
Knowledge of Apache/Nginx, setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy).
Working knowledge of Prometheus/Grafana.
Knowledge of PKI & SSL/TLS certificate management.
Minimum Qualifications:
Bachelor's degree + 5 years of experience
Master's degree + 3 years of experience
PhD + 0 years of experience