You will be part of the global Performance Lab team, improving our capacity to expertly and accurately benchmark state-of-the-art datacenter applications and products. We also develop new scripts that enhance the team’s ability to gather data through automation, and we design efficient processes for testing a wide variety of applications and hardware. The data that we collect drives marketing and sales collateral as well as engineering studies for current and future products. You will have the opportunity to work with multi-functional teams in a dynamic environment where multiple projects are active at once and priorities may shift frequently.
What you’ll be doing:
Benchmark, profile, and analyze the performance of AI workloads specifically tailored for large-scale LLM training and inference, as well as High-Performance Computing (HPC) on NVIDIA supercomputers and distributed systems.
Aggregate and produce written and visual reports with the testing data for internal sales, marketing, SW, and HW teams
Set up and configure systems with appropriate hardware and software to run benchmarks
Collaborate with internal teams to debug and resolve performance issues
Develop Python scripts to automate the testing of various applications
Assist with the development of tools and processes that improve our ability to perform automated testing
What we need to see:
Currently pursuing a Bachelor's degree (or higher) in Computer Science, Electrical Engineering, or a related field.
Experienced in programming and debugging with scripting languages such as Python or Unix shell.
Strong data analysis skills and the ability to summarize findings in a written report
Hands-on experience with Linux-based systems. Familiarity with a container platform such as Docker or Singularity. Experience compiling and running software from source code.
Ability to learn quickly and independently, with strong analytical and problem-solving skills.
Good verbal and written English communication skills for effective collaboration with coworkers
Ways to stand out from the crowd:
Background with GPU/CPU benchmarking
Familiarity with ML/DL techniques, algorithms, and frameworks such as TensorFlow or PyTorch.
Experience in AI model development, training, evaluation, and deployment in cloud, cluster, or on-premises environments. Familiarity with cloud provisioning and scheduling tools (Kubernetes, SLURM).
Exposure to test automation for various applications.