

You will collaborate closely with researchers to design and scale agents: enabling them to reason, plan, call tools, and write code just like human engineers. You will build and maintain the core infrastructure for deploying and running these agents in production, powering all our agentic tools and applications and ensuring their seamless, efficient performance. If you're passionate about the latest research and cutting-edge technologies shaping generative AI, this role and team offer an exciting opportunity to be at the forefront of innovation.
What you'll be doing:
Design, develop, and improve scalable infrastructure to support the next generation of AI applications, including copilots and agentic tools.
Drive improvements in architecture, performance, and reliability, enabling teams to leverage LLMs and advanced agent frameworks at scale.
Collaborate across hardware, software, and research teams, mentoring and supporting peers while encouraging best engineering practices and a culture of technical excellence.
Stay informed of the latest advancements in AI infrastructure and contribute to continuous innovation across the organization.
What we need to see:
Master's or PhD in Computer Science or a related field, or equivalent experience, with a minimum of 5 years in large-scale distributed systems or AI infrastructure.
Advanced expertise in Python (required), strong experience with JavaScript, and deep knowledge of software engineering principles, OOP/functional programming, and writing high-performance, maintainable code.
Demonstrated expertise in crafting scalable microservices, web apps, SQL, and NoSQL databases (especially MongoDB and Redis) in production with containers, Kubernetes, and CI/CD.
Solid experience with distributed messaging systems (e.g., Kafka), and integrating event-driven or decoupled architectures into robust enterprise solutions.
Practical experience integrating and fine-tuning LLMs or agent frameworks (e.g., LangChain, LangGraph, AutoGen, OpenAI Functions, RAG, vector databases, prompt engineering).
Demonstrated end-to-end ownership of engineering solutions, from architecture and development to deployment, integration, and ongoing operations/support.
Excellent communication skills and a collaborative, proactive approach.
You will also be eligible for equity and benefits.
This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.
Gross pay salary: $135,800–$203,600 USD
What you'll be doing:
Lead complex programs focused on improving the quality and efficiency of data center infrastructure, hardware, and software domains, with multi-year strategic roadmaps and cross-functional coordination
Drive technical execution from requirements gathering through production launch, including writing technical specifications, coordinating release schedules, and ensuring operational readiness across multiple team dependencies
Own server hardware development, testing, and integration efforts for computing products, working closely with original design manufacturers and contract manufacturers on new product introductions at global manufacturing scale
Partner with software development teams to build automation programs for large-scale infrastructure testing and develop solutions that enhance operational performance across highly concurrent, high-throughput distributed systems
Guide enterprise network infrastructure and data center operations initiatives covering servers, storage, networking, power, and cooling systems while serving as domain leader for manufacturing test infrastructure
Lead continuous improvement initiatives for engineering processes, quality management, and operational excellence while owning risk mitigation strategies and critical path oversight
Build trusted partnerships across hardware teams, security professionals, supply chain, operations, and product management to drive technical decisions and resolve complex cross-functional dependencies
What we need to see:
Bachelor's degree in Engineering, Computer Science, Electrical Engineering, Mechanical Engineering, or related technical field, or equivalent experience
12+ years working directly with engineering teams with demonstrated technical program management experience
7+ years of hands-on program or project management experience owning complex technology programs involving cross-functional teams
5+ years of software development experience with proficiency in one or more programming languages
5+ years leading hardware product development and new product introduction on a global manufacturing scale
Deep technical expertise in server, network, or storage product architecture and manufacturing test development
Strong understanding of large-scale distributed systems, data center infrastructure, and enterprise network architecture
Experience with Linux/Unix or Windows system administration, database management, and infrastructure automation
Demonstrated ability to lead programs across multiple teams, manage project scope, schedule, budget, and quality, and maintain executive-level relationships
Ways to stand out from the crowd:
8+ years directly leading complex technology projects with experience designing and architecting highly reliable, scalable systems
Track record launching AI or ML server products with new technology enablement such as Liquid Cooling
Experience leading manufacturing test engineering teams within the server, network, or storage sector with expertise in Design for Excellence methodologies
Knowledge of security engineering, cryptography, quality management systems, and supply chain operations
Demonstrated single-threaded ownership of strategic programs and the ability to deliver groundbreaking systems independently in fast-paced, ambiguous environments
You will also be eligible for equity and benefits.

Gross pay salary: $153,400–$230,200 USD
Gross pay salary: $135,800–$203,600 USD
What we need to see:
MS or PhD in Electrical Engineering or Computer Science, or equivalent experience.
Minimum of 8 years of relevant experience
Proven theoretical knowledge of communication systems, communication theory, linear algebra, detection and estimation theory, baseband signal processing algorithms, and channel coding
Deep expertise in LTE/5G NR L1 (PHY) and L2 (MAC scheduler) algorithm design and optimization
Experience with wireless algorithm performance characterization and analysis
Wireless base station design, development, and commercialization experience
Experience building system models with MATLAB, C/C++, and/or Python for algorithm design and link-level simulation
Solid knowledge of 3GPP 5G NR standard
Strategic understanding of the telecom industry and wireless technology evolution
Ways to stand out from the crowd:
Research experience in AI/ML and its applications to wireless systems
Demonstrated experience in software development for commercial RAN products
Knowledge of SIMD computing architecture
Background in GPU or CUDA programming
Knowledge of Wireless Protocols and E2E Deployment Architecture
You will also be eligible for equity and benefits.

What you’ll be doing:
Contribute features to vLLM that empower the newest models with the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods like speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.
Define and build inference benchmarking methodologies and tools; contribute both new benchmarks and NVIDIA's submissions to the industry-leading MLPerf Inference benchmark suite.
Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.
Conduct and publish original research that pushes the Pareto frontier of the ML systems field; survey recent publications and integrate research ideas and prototypes into NVIDIA's software products.
What we need to see:
Bachelor’s degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE), or Software Engineering (SE) with 7+ years of experience; alternatively, a Master’s degree in CS/CE/SE with 5+ years of experience, or a PhD with a thesis and top-tier publications in ML systems, GPU architecture, or high-performance computing.
Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms and data structures, operating systems, computer architecture, parallel programming, distributed systems, and deep learning theory.
Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).
Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, multi-functional setting.
Ways to stand out from the crowd:
Experience building and optimizing LLM inference engines (e.g., vLLM, SGLang).
Hands-on work with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA), GPU libraries (e.g., CUTLASS), and features (e.g., CUDA Graphs, Tensor Cores).
Experience contributing to containerization/virtualization technologies such as containerd, CRI-O, or CRIU.
Experience with cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability.
Contributions to open-source projects and/or publications; please include links to GitHub pull requests, published papers and artifacts.
You will also be eligible for equity and benefits.
