What you’ll be doing:
Architect, implement, and optimize reliable, low-latency, full-duplex conversation pipelines and dialog systems that excel across diverse application areas and challenging environments.
Build and benchmark cascaded and unified speech-to-speech models and systems that reflect real human conversations.
Design, implement, and test domain-specific agents, workflows, and a framework that supports multi-turn, multi-modal, multi-user conversations with LLM-driven agents.
Analyze the end-to-end accuracy and limitations of RAG and conversational AI agents, and recommend next steps and improvements.
Characterize performance and quality metrics across platforms for various AI and system components.
Collaborate with various teams on new product features and improvements to existing products. Customize and integrate the conversational AI framework with other NVIDIA products.
Participate in code development and review, design document reviews, use-case reviews, and test-plan reviews; help innovate, identify problems, recommend solutions, and perform triage in a collaborative team environment.
What we need to see:
Bachelor's degree or Master’s degree (or equivalent experience) in Computer Science, Electrical Engineering, Artificial Intelligence, or Applied Math
10+ years of experience, with strong hands-on exposure to building solutions across Speech, LLM, RAG, and Agent technologies.
Excellent programming skills in Python and/or C++, with the ability to debug complex asynchronous systems.
Deep understanding of speech technologies such as VAD, ASR, TTS, translation, and end-to-end speech models, and how to apply them to build conversation systems.
Experience working with RAG and LLM-based applications as a key part of building dialog and Q&A systems. Additional exposure to LLM function calling, information retrieval, vector databases, embedding and rerank models, autonomous agents, etc. is welcome.
Understanding of scalable deployment of multiple microservices involving speech components and LLM-driven RAG and agent applications in production environments.
Experience working with protocols and transports such as HTTP REST, gRPC, WebSockets, WebRTC, etc.
Hands-on experience building microservices and client-server applications.
Familiarity with Docker, Helm, Kubernetes, etc.
Experience working on the end-to-end software lifecycle, release packaging, and CI/CD pipelines.
General background with version control and code review tools such as Git, Gerrit, and GitLab.
Ways to stand out from the crowd:
Strong fundamentals in programming, optimization, and software design.
Experience working with open-source frameworks such as LangChain and LlamaIndex for building LLM-driven applications.
Strong knowledge of ML/DL techniques, algorithms, and tools, with exposure to speech and language models.
Familiarity with GPU-based technologies such as CUDA, cuDNN, and TensorRT.
Background in deploying machine learning models on data center, cloud, and embedded systems.