

What you’ll be doing:
Contribute features to vLLM that empower the newest models with the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods like speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.
Define and build inference benchmarking methodologies and tools; contribute both new benchmarks and NVIDIA’s submissions to the industry-leading MLPerf Inference benchmark suite.
Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.
Conduct and publish original research that pushes the Pareto frontier of the field of ML Systems; survey recent publications and integrate research ideas and prototypes into NVIDIA’s software products.
What we need to see:
Bachelor’s degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE), or Software Engineering (SE) with 7+ years of experience; alternatively, a Master’s degree in CS/CE/SE with 5+ years of experience; or a PhD with a thesis and top-tier publications in ML Systems, GPU architecture, or high-performance computing.
Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms & data structures, operating systems, computer architecture, parallel programming, distributed systems, and deep learning theory.
Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).
Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, multi-functional setting.
Ways to stand out from the crowd:
Experience building and optimizing LLM inference engines (e.g., vLLM, SGLang).
Hands-on work with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA), GPU libraries (e.g., CUTLASS), and features (e.g., CUDA Graphs, Tensor Cores).
Experience contributing to containerization/virtualization technologies such as containerd, CRI-O, or CRIU.
Experience with cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability.
Contributions to open-source projects and/or publications; please include links to GitHub pull requests, published papers and artifacts.
You will also be eligible for equity and .

What you'll be doing:
Design intuitive data models and semantic layers to enable self-service and AI apps, reducing ad-hoc query friction for business users.
Enrich data products with a business glossary and metadata to reduce AI hallucinations and improve user adoption, searchability, and governance.
Lead multi-site integrations across new manufacturing plants and operations applications, standardizing schemas and controls and enabling cross-plant insights.
Engineer scalable pipelines with data integrity functions and audit features. Automate measuring and monitoring data quality for improved decision making.
Explain data designs, system changes, and enhancements to stakeholders, and address any questions or issues effectively.
Partner with stakeholders to solve business problems, train users, and help with data and queries.
Optimize Lakehouse systems to deliver high-performing solutions while controlling operational costs.
What we need to see:
BS, MS, or PhD in EE/CS or a related field (or equivalent experience).
5+ years of programming experience (Python, PySpark, SQL, etc.).
5+ years of experience with big data technologies and cloud platforms (AWS, Databricks, Snowflake).
12+ years of overall experience in Data Warehousing, implementing projects with data Lakehouse solutions.
Experience with enterprise BI databases like SAP BW/HANA, ERP/CRM systems like SAP/Salesforce, and planning applications like IBP, APO, etc.
Knowledge of operational processes in chips, boards, systems, and networking.
Proficiency in Tableau, Power BI, and SAP reporting applications.
Ways to stand out from the crowd:
Strong analytical skills with the ability to collect, organize, and disseminate significant amounts of information with attention to detail and accuracy.
Highly independent; able to lead key technical decisions, influence the project roadmap, and work effectively with team members.
Proven experience leading multiple analytics projects in a dynamic, fast-paced environment.
Data science and AI/ML experience.
Positive interpersonal skills with strong verbal and written communication.
You will also be eligible for equity and .

What you will be doing:
Collaborate with networking teams to plan, implement, and evaluate performance benchmarks on NVLINK, NVSwitch, and InfiniBand powered infrastructures.
Assess findings and work closely with framework, hardware, and support teams to improve system performance across various deep learning workloads.
Act as a primary resource for resolving networking and hardware integration issues, focusing on scalable multi-node systems.
Maintain high communication standards across multiple engineering, support, and R&D teams, ensuring technical and performance goals are met.
Offer technical mentorship and documentation for internal teams and external partners on standard methodologies in HPC networking deployments.
Share insights on improving networking strategies for large-scale AI and deep learning infrastructure.
What we need to see:
BS/MS or PhD in Computer Science, Engineering, or related field, or equivalent experience.
8+ years of proven experience in AI/HPC Infrastructure.
Familiarity with AI/HPC job schedulers and orchestrators like Slurm, K8s, or LSF. Practical exposure to AI/HPC workflows employing MPI and NCCL.
Familiarity with High-Speed Networking pertaining to HPC including InfiniBand, RDMA, RoCE, and Amazon EFA.
An understanding of PyTorch, Megatron-LM, and deep learning inference frameworks such as vLLM/SGLang is essential.
Proven experience with InfiniBand, NVLINK, and high-speed networking technologies in HPC or large-scale datacenter environments.
Experience investigating and evaluating performance in multi-node systems, especially for deep learning or scientific computing tasks.
Strong analytical, debugging, and technical communication skills.
Comfortable working in collaborative, multi-faceted teams.
Ways to stand out from the crowd:
Mastery in deep learning frameworks or distributed training systems.
Familiarity with datacenter automation, advanced network protocols, and supporting large HPC or AI clusters in production environments.
Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC workloads.
Experience with networking and communications libraries like NCCL, NIXL, NVSHMEM, UCX.
Experience developing or maintaining cluster management and monitoring tools, e.g., Ansible for infrastructure automation, Prometheus and Grafana for monitoring.
You will also be eligible for equity and .

NVIDIA is seeking an analytical Business Analyst for End User Support, a critical function in Information Technology. As a Business Analyst, you will drive the design, delivery, management, and improvement of IT services to meet Employee Experience needs. You will play a pivotal role in aligning IT services with business requirements, ensuring effective service delivery, improving the end-user experience, and championing clear, consistent communication and reporting to all relevant team members. At times, you will step in to lead key areas of IT End User Support such as Service Efficiency, Productivity, and Operational Excellence. Agility is key for this role, along with strong communication skills and the ability to drive our programs.
What you’ll be doing:
Develop, document, and coordinate adaptable operational processes that align with Support Engineering and business strategies.
Identify bottlenecks, inefficiencies, and automation opportunities to drive speed and scalability.
Proactively establish and drive collaborative partnerships with NVIDIA business leaders and partners to ensure IT alignment, robust support for requirements, and enable future growth.
Execute against roadmaps and improve operational mechanisms that drive a highly effective and efficient Support Engineering organization.
Champion product and service innovations using data insights, analysis, and metrics.
Drive efficiency in the IT processes, in partnership with the business, engineering team, and the IT Organization.
What we need to see:
Bachelor's degree or equivalent experience.
5+ years of experience.
Experience driving process efficiency or related work.
Experience in building trusted advisor relationships, influencing others and working effectively with people at all levels in an organization.
Proven success in transformative process alignment, translating strategy into actionable steps.
Exhibits strong verbal and written communication skills accompanied by the ability to influence both directly and indirectly and build consensus throughout the business.
Knowledge of standard communications processes and willingness to develop strategies and complete plans.
Ways to stand out from the crowd:
Operates with a high level of intellectual curiosity, demonstrating a startup mentality that is proactive, hands-on, driven, strategic, accountable, ethical, collaborative, hard-working, and productive.
Track record of driving business outcomes and successful implementations, sales, and strategies requiring collaboration across different team members.
You will also be eligible for equity and .

Job Description Summary
Requires in-depth knowledge and experience. Uses best practices and knowledge of internal or external business issues to improve products or services. Solves complex problems; takes a new perspective using existing solutions. Works independently, receives minimal guidance. Acts as a resource for colleagues with less experience.
Job Description – Senior Financial Analyst – Global Financial Planning & Analysis (B4)
This Sr. FP&A Analyst role in the Corporate Planning organization will support GFP&A activities for the entire company. The position involves analyzing and developing content for the CFO, product groups, investor relations, and treasury, and presenting that content effectively. It requires a high level of diligence, as the majority of the reporting and presentations will serve the company’s executive management.
The right individual will have strong analytical skills and a strong work ethic, capable of executing against deadlines and requests. The person should also demonstrate a solid grasp of financials and GAAP reporting, including an understanding of balance sheet and cash flow statements and their drivers.
Primary Responsibilities:
Requirements

Key Responsibilities
Qualifications & Preferences

Job Description Summary
Requires knowledge and experience in own discipline; still acquiring higher-level knowledge and skills. Builds knowledge of the company, processes, and customers. Solves a range of problems. Analyzes possible solutions using standard procedures. Receives a moderate level of guidance and direction.
Key Responsibilities
Qualifications & Preferences
Bachelor’s degree in a related field.
4-7 years of finance experience; Master’s degree preferred.
Full time, Assignee / Regular