What You Will Be Doing:
Guide, mentor, and develop an inclusive and collaborative engineering team focused on delivering robust model serving solutions.
Drive planning, prioritization, and execution for projects that improve Triton’s scalability, performance, and reliability in non-generative AI deployments.
Foster partnerships with Product and Program Management to create feature roadmaps, manage cross-team dependencies, and balance project resources for both cloud and on-premises platforms.
Work with internal partners and external customers to understand use cases and convert their needs into product features.
Promote engineering excellence through modern, agile development practices and a culture of quality and accountability.
What We Need to See:
Master’s or PhD, or equivalent experience, in Computer Science, Computer Engineering, or a related field.
Eight or more years of overall hands-on software development experience in customer-facing environments.
At least three years building, mentoring, and leading software engineering teams delivering production-grade solutions.
Deep background in scalable serving architectures, with direct experience building cloud-native inference APIs, REST/gRPC/protobuf-based services, or similar technologies.
Advanced C/C++ and Python development skills, demonstrating clean, object-oriented design, as well as proficiency in debugging, performance optimization, and testing.
Track record of contributing to or leading large open-source projects—using GitHub for code reviews, bug tracking, and release management.
Strong knowledge of agile methodologies and tools such as JIRA and Linear.
Ability to communicate technical topics with clarity and empathy to colleagues, partners, and diverse audiences.
Ways to stand out from the crowd:
Experience working within distributed, global teams.
Practical knowledge of machine learning model deployment with frameworks such as TensorRT, TRT-LLM, PyTorch, ONNX, Python, or similar platforms.
Understanding of CPU and GPU architectures.
Skills in GPU programming (for example, CUDA or OpenCL).
You will also be eligible for equity.