We are looking for a software engineering intern to work on cuPyNumeric, a distributed, accelerated drop-in replacement for NumPy. As a member of our team, you will use your design abilities, coding expertise, and creativity to develop distributed and GPU-accelerated versions of NumPy and SciPy methods and of other scientific computing libraries. You will also have the opportunity to enhance the functionality and performance of the runtime systems that underlie distributed GPU computing at NVIDIA.
What you'll be doing:
- Improve performance by developing highly optimized, innovative algorithms for high-performance numerical computing.
- Architect, prioritize, and develop new features in cuPyNumeric and the runtime stack.
- Design future-proof APIs for accelerated numerical and scientific computing libraries.
- Contribute to the development of the runtime systems that underlie multi-GPU computing at NVIDIA.
- Write effective, maintainable, and well-tested code for production use.
What we need to see:
- Pursuing a BS, MS, or PhD degree in Computer Science, Electrical Engineering, or a related field.
- Strong foundation in modern C++ best practices and object-oriented programming.
- Experience using Python for numerical computing (e.g. NumPy, SciPy).
- Experience with CUDA C++.
- Academic knowledge of tasking or asynchronous runtimes, especially data-centric initiatives such as Legion.
- Good written communication, teamwork, and presentation skills.
Ways to stand out from the crowd:
- Proficiency in C++17 and beyond.
- Experience in Python binding technology for C++, particularly pybind11 or nanobind.
- Experience using C++ tooling and linters such as clang-tidy, libclang, or similar.
- Experience building, debugging, profiling, and optimizing distributed applications on supercomputers or in the cloud.
- Prior experience in open-source HPC software development.
You will also be eligible for Intern