What you will be doing:
Be part of a DGX Cloud Lepton team responsible for developing the two-sided marketplace, including integrating compute providers and developing discovery and bidding experiences to match supply with demand.
Design and implement IaaS API integrations, including collaborating with external engineering teams to ensure reliable, scalable, and consistent connectivity across a diverse set of cloud environments.
Shape integration strategies, develop stateful workflow orchestration, and drive improvements in testing, observability, and automation to ensure high-quality, fault-tolerant solutions.
What we need to see:
12+ years of experience in developing software infrastructure for large-scale AI systems.
Direct experience in a software engineering role within a highly technical organization with demonstrable impact from your work. Software development experience with Kubernetes APIs and frameworks.
Familiarity with setting up cloud infrastructure environments (VMaaS, VPCs, RDMA, shared file systems).
Proven track record with third-party API integrations: communicating with external teams, writing API clients, and improving integration reliability.
Comfortable working in a fast-paced environment and collaborating with external engineering teams to test and debug integrations.
Technical knowledge of a systems programming language (strong preference for experience writing production code in Go) and a solid understanding of software design patterns for stateful workflow orchestration.
BS in Computer Science, Engineering, Physics, Mathematics, or a comparable degree, or equivalent experience.
2+ years in a similar role and experience with large-scale production systems. Experience with common software engineering principles, tools, and techniques.
You will also be eligible for equity and .