NVIDIA is building a new category of products by intersecting our prowess in deep learning and computing with industry-leading technology. You will harness groundbreaking inference acceleration from NVIDIA, and design inference microservices that span from multi-modal, to protein folding, to weather prediction. You will influence and drive advances in many NVIDIA teams and external partners to solve the hardest problems in AI inference. In this role, you will design and improve our NIM architecture using the most recent improvements in AI inference, improving and specifying the underlying libraries, optimization techniques, containerization details, and easy-to-use APIs that capture metrics and traceability.
What you'll be doing:
You are the engineer's engineer, demonstrating good engineering practice and mentoring others to follow suit. You will architect and build NIM software to support modularity and high performance using the latest GPU-enhanced technology.
Design scalable and maintainable microservices and define APIs for different inference use cases, covering both local and distributed processing, while providing observability. You deliver software through rapid iterations and support many different teams in re-using scalable architecture and components.
This role requires collaboration with multiple AI model teams to build an efficient architecture that improves inference performance and re-usability. You will define metrics and drive improvements based on user feedback and industry expectations.
You are a great communicator! Through your partnership with our NIM leadership, you will deliver a cohesive and enticing architecture to our engineers that is reflected in customer satisfaction.
What we need to see:
Experience delivering high-performance inference for large language models or generative AI, and deep expertise in microservices, PyTorch, ONNX, REST and gRPC APIs, and multiple inference backends.
Technical leadership experience providing designs and building scalable microservices in an agile software development environment. You demonstrate the ability to lead multi-functional efforts, working effectively with teams, principals, and architects across organizational boundaries.
You are a mentor and coach in all your interactions with all your colleagues.
BS or MS in Computer Science, Computer Engineering, or a related field (or equivalent experience).
12+ years of experience building, debugging, analyzing and optimizing runtime performance of distributed services.
Ways to stand out from the crowd:
Experience building inference systems.
Prior experience providing full-stack development that scales to large numbers of nodes.
Prior MLOps experience and experience using ML and AI technologies.
CUDA experience and an ability to use GPU-optimized libraries for improved performance.
You will also be eligible for equity.