

What you'll be doing:
Onboard new repair partners and manage day-to-day repair operations across our sites, ensuring all facilities achieve turnaround time and quality targets.
Lead daily operational reviews with repair partners to track key operational metrics, resolve bottlenecks, and enforce accountability for on-time delivery and repair commitments.
Function as the key intermediary between repair partners and NVIDIA internal teams — Engineering, Quality, Procurement, Logistics, and Planning — to set shared goals and overcome barriers to meeting targets.
Drive continuous improvement in repair processes, capacity utilization, yield, and quality through structured root-cause and corrective-action initiatives.
Coordinate repair commit performance (planned vs. actual), monitoring metrics such as on-time completion rate, FA/bonepile recovery time, turnaround time, and test yield.
Evaluate capital expenditures and repair expenses, and hold quarterly business meetings with repair collaborators.
Partner with Engineering and Quality to implement repair process controls, engineering change management, and compliance audits.
Work with Planning and Procurement to ensure supply chain readiness for new service offerings and ongoing repair builds.
Champion operational excellence by ensuring sites meet quality and delivery commitments and by continuously improving key metrics.
What we need to see:
Bachelor’s degree in Engineering, Operations, Supply Chain, or equivalent experience.
15+ overall years of hands-on operations experience managing technology repair, refurbishment, or production environments.
8+ years of management/leadership experience in reverse logistics or repair operations, including managing multiple sites and vendors (domestic and international).
Established record of managing daily factory performance, tracking critical metrics, and ensuring vendor accountability.
Strong understanding of OEM server, PCBA, and system test processes.
Experience implementing process improvements, capacity planning, and inventory control systems at repair or manufacturing facilities.
Demonstrated ability to lead cross-functional teams (Engineering, Quality, Planning, Procurement) and deliver measurable performance gains.
Data-driven decision maker with excellent problem-solving, communication, and vendor management skills.
You will also be eligible for equity and .

NVIDIA is seeking a Senior Technical Program Manager to lead the Infrastructure and Product Security and Compliance program for DGX Cloud. In this role, you will ensure our platforms and partner ecosystem meet the highest standards of trust, resilience, and governance.
As a Senior TPM focused on Cloud Security, you will own the design and execution of a DGXC-wide infrastructure security program that strengthens how DGXC operates with Cloud Service Providers (CSPs) and NVIDIA Cloud Partners (NCPs). You will drive security initiatives by embedding compliance controls, governance frameworks, and best practices across infrastructure, platform, and product teams. This role also ensures Product Security is integrated into product roadmap planning and the software development lifecycle, aligning product and infrastructure priorities. You will work closely with senior leaders and cross-functional teams in Security, Compliance, DevOps, and Engineering to continuously enhance and scale the DGX Cloud Security Posture.
What You’ll Be Doing:
Lead alignment across engineering, product, security, and partner teams to deliver against cloud security guidelines with CSP and NCP partners.
Drive programs that strengthen vulnerability management, access control, patching, and compliance readiness for SOC 2, ISO 27001, and related certifications.
Operate DGXC-wide security engineering forums and processes, establishing security KPIs, dashboards, and “run safe” SRE practices.
Partner with the CISO organization to define and assess emerging cloud providers against DGX Cloud security requirements, driving measurable improvements and action plans.
Implement and evolve security controls frameworks (e.g., SSH hardening, IAM, secret rotation) in CI/CD pipelines to ensure continuous compliance.
Lead certification readiness and audit cycles, including SOC 2 Type 1 & 2 and ISO 27001, from control mapping through evidence collection and remediation.
Chair the DGX Cloud Security & Compliance Working Group, managing governance reviews, risk dashboards, and executive reporting on posture and metrics.
Develop training programs to build security and compliance awareness across Product, DevOps, and Engineering teams.
Create playbooks and automation frameworks that streamline certification renewals, patching cycles, and vulnerability management workflows.
Maintain and continuously improve technical compliance documentation, including system diagrams, process flows, and control mappings.
What We Need to See:
12+ years of Program Management experience driving the planning and execution of large programs and software engineering projects in a fast-paced environment.
Consistent track record delivering successful Security, Risk, and/or Compliance programs, particularly in cloud IaaS and SaaS environments, resulting in full certification of a suite of products and services.
Experience leading SOC 2 (Type 1 and Type 2) audits and readiness efforts, including control implementation (e.g., access controls, change management, vulnerability management).
Experience operationalizing vulnerability management, patch management, SSH key governance, and access controls across distributed systems.
Ability to think strategically and tactically and to build consensus in making programs successful; ability to resolve technical issues and resource constraints across cross-functional teams.
Demonstrated ability to define metrics, dashboards, and risk indicators that measure posture improvement and audit readiness.
Proficiency with tools like Jira to guide engineering teams on execution in an Agile/Scrum manner and ensure accurate governance artifacts are delivered.
Excellent executive communication and presentation skills, with the ability to distill complex technical and compliance topics for senior leadership.
MS EE or CS degree, or equivalent experience.
Ways to Stand Out from the Crowd:
Highly motivated with strong interpersonal skills and a proven track record of working successfully with multi-functional teams and coordinating effectively across organizational boundaries and geographies.
Experience implementing security features in a multi-cloud environment.
Experience with sophisticated compliance programs, such as FedRAMP, SOC 2, or ISO certification efforts.
Solid understanding of tier 1 cloud technologies (AWS, GCP, Azure, OCI).
Experience with productivity tools and process automation.
You will also be eligible for equity and .

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by incredible technology—and amazing people.
What you'll be doing:
You will invent a full range of items and compositions that connect with significant themes, narratives, and product launches. From concept to production, you'll define the art direction for our global employee Gear Store and company merchandise, including the aesthetic, color, materials, and overall style. Work with the merchandise buyer and operations lead to establish roadmaps for our product collections. Direct the user experience and development of our on-site store, event pop-ups, and website.
What we need to see:
A portfolio showcasing excellence in crafting purposeful, beautiful designs for merchandise—apparel, soft and hard goods. Actual examples of your hands-on illustration and design work, including production.
A deep knowledge bank of materials, production processes, decoration techniques, complexities, and caveats.
Proven examples demonstrating your ability to art direct, invent, construct, and conceptualize innovative creative concepts, particularly relating to abstract and technical subjects, transforming ideas into sought-after gear.
Evidence of how you stay on top of the latest trends in the apparel and merchandise space and in the tech industry, and how you embrace change.
Demonstration of the fundamental principles of art direction, graphic composition for brand, illustration, typography, video, photography, and even 3D.
Your passion to take intent and direction from your peers and leaders and apply your art direction skills to build a vision, iterate quickly, revise, and produce final artwork.
10+ years of proven experience working with established brands, building or extending apparel or merchandise collections, collaborating with senior executive level leadership teams.
Proficiency in Microsoft Office and/or Google Suite applications, tracking your work in project tracking tools such as Workfront.
A BFA, BS, or MFA in Graphic Development, Illustration, Fashion Composition, or equivalent field is required (or equivalent experience).
Ways to stand out from the crowd:
Are you curious and passionate about technology, design, and the rapidly evolving AI landscape? Show us you embrace AI, are an AI expert, and have successfully integrated AI tools into your creative process or workflow.
Have experience working at a global apparel or premium lifestyle brand? Show us how you understood and integrated established brand systems and technologies into coveted items.
Do you have hands-on skills in illustration, infographics, 3D, motion graphics, video, photography or storyboarding? Can you sketch, or develop retail displays? Share your creative passions with us!
You will also be eligible for equity and .

What you’ll be doing:
Contribute features to vLLM that empower the newest models with the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods like speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.
Define and build inference benchmarking methodologies and tools; contribute both new benchmarks and NVIDIA’s submissions to the industry-leading MLPerf Inference benchmarking suite.
Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.
Conduct and publish original research that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find ways to integrate research ideas and prototypes into NVIDIA’s software products.
What we need to see:
Bachelor’s degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE), or Software Engineering (SE) with 7+ years of experience; alternatively, a Master’s degree in CS/CE/SE with 5+ years of experience; or a PhD with a thesis and top-tier publications in ML Systems, GPU architecture, or high-performance computing.
Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms & data structures, operating systems, computer architecture, parallel programming, distributed systems, deep learning theories.
Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).
Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, multi-functional setting.
Ways to stand out from the crowd:
Experience building and optimizing LLM inference engines (e.g., vLLM, SGLang).
Hands-on work with ML compilers and DSLs (e.g., Triton, TorchDynamo/Inductor, MLIR/LLVM, XLA), GPU libraries (e.g., CUTLASS) and features (e.g., CUDA Graph, Tensor Cores).
Experience contributing to containerization/virtualization technologies such as containerd/CRI-O/CRIU.
Experience with cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability.
Contributions to open-source projects and/or publications; please include links to GitHub pull requests, published papers and artifacts.
You will also be eligible for equity and .

What you'll be doing:
Design intuitive data models and semantic layers to enable self-service and AI apps, reducing ad-hoc query friction for business users.
Enrich data products with a business glossary and metadata to reduce AI hallucinations and improve user adoption, searchability, and governance.
Lead multi-site integrations across new manufacturing plants and ops applications, standardizing schemas and controls and enabling cross-plant insights.
Engineer scalable pipelines with data integrity functions and audit features. Automate measuring and monitoring data quality for improved decision making.
Explain data designs, system changes, and enhancements to stakeholders, and address their questions or issues effectively.
Partner with stakeholders to solve business problems, train users, and help with data and queries.
Optimize Lakehouse systems to deliver high-performing solutions while controlling operational costs.
What we need to see:
BS, MS, or PhD in EE/CS or related field of education (or equivalent experience).
5+ years of programming experience (Python, PySpark, SQL, etc.).
5+ years of experience with big data technologies and cloud platforms (AWS, Databricks, Snowflake).
12+ overall years in Data Warehousing, implementing projects with data Lakehouse solutions.
Experience with enterprise BI databases like SAP BW/HANA, ERP/CRM systems like SAP and Salesforce, and planning applications like IBP and APO.
Knowledge of operational processes in chips, boards, systems, and networking.
Proficiency in Tableau, PowerBI, and SAP reporting applications.
Ways to stand out from the crowd:
Strong analytical skills with the ability to collect, organize, and disseminate significant amounts of information with attention to detail and accuracy.
Highly independent, able to lead key technical decisions, influence the project roadmap, and work effectively with team members.
Proven experience leading multiple analytics projects in a dynamic, fast-paced environment.
Data science and AI/ML experience.
Strong interpersonal skills and clear verbal and written communication.
You will also be eligible for equity and .

What you will be doing:
Collaborate with networking teams to plan, implement, and evaluate performance benchmarks on NVLink, NVSwitch, and InfiniBand powered infrastructures.
Assess findings and work closely with framework, hardware, and support teams to improve system performance across various deep learning workloads.
Act as a primary resource for fixing networking and hardware integration issues, focusing on scalable multi-node systems.
Maintain high communication standards across multiple engineering, support, and R&D teams, ensuring technical and performance goals are met.
Offer technical mentorship and documentation for internal teams and external partners on standard methodologies in HPC networking deployments.
Share insights on improving networking strategies for substantial AI and deep learning infrastructure.
What we need to see:
BS/MS or PhD in Computer Science, Engineering, or related field, or equivalent experience.
8+ years of proven experience in AI/HPC Infrastructure.
Familiarity with AI/HPC job schedulers and orchestrators like Slurm, K8s, or LSF. Practical exposure to AI/HPC workflows employing MPI and NCCL.
Familiarity with High-Speed Networking pertaining to HPC including InfiniBand, RDMA, RoCE, and Amazon EFA.
Understanding of PyTorch, Megatron-LM, and deep learning inference frameworks such as vLLM and SGLang is essential.
Proven experience with InfiniBand, NVLink, and high-speed networking technologies in HPC or large-scale datacenter environments.
Experience investigating and evaluating performance in multi-node systems, especially in deep learning or scientific computing tasks.
Strong analytical, debugging, and technical communication skills.
Comfortable working in collaborative, multi-faceted teams.
Ways to stand out from the crowd:
Mastery in deep learning frameworks or distributed training systems.
Familiarity with datacenter automation, advanced network protocols, and supporting large HPC or AI clusters in production environments.
Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC workloads.
Experience with networking and communications libraries like NCCL, NIXL, NVSHMEM, UCX.
Experience developing or maintaining cluster management and monitoring tools (e.g., Ansible for infrastructure as a service, Prometheus and Grafana for monitoring).
You will also be eligible for equity and .

NVIDIA is seeking a Sr. Systems Software Engineer for the Apache Spark Acceleration group. Over the past five years, GPU-accelerated data processing has moved from proof of concept to production deployments. Many enterprises now recognize the need for accelerated computing to handle their large data processing workloads. Multi-node GPU deployments will reduce cloud computing costs and lower the latency of batch ETL workloads.
At NVIDIA, we have invested in accelerating Apache Spark, the most popular data processing engine in data centers, by providing an open source plugin for it. We strive to accelerate Spark applications on GPUs without any code changes. We are passionate about working on hard problems that have an impact. You will need strong programming skills and a deep understanding of C++ software development. You will work with a team that uses open source libraries like RAPIDS to accelerate reading, writing, and batch data operations in Spark.
What you'll be doing:
Develop CUDA/C++ libraries to accelerate DataFrames and I/O operations on common file formats such as Parquet, ORC and JSON
Collaborate with distributed systems teams to craft solutions to distributed processing challenges at large scale
Work with open source communities to enhance libraries like RAPIDS, CCCL and UCX through technical discussion and code contributions
Provide recommendations and feedback to teams regarding decisions surrounding topics such as infrastructure, continuous integration and testing strategy
Build, test and optimize CUDA/C++ libraries across different platforms
What we need to see:
BS, MS, or PhD in Computer Science, Computer Engineering, or closely related field (or equivalent experience)
12+ years of work experience in software development
Outstanding technical skills in designing and implementing high-quality distributed systems
Excellent programming skills in C++, Java, and/or Scala
Ability to work with teams across organizational boundaries and geographies
Highly motivated with strong interpersonal skills
OS kernel dev experience is a strong plus
You will also be eligible for equity and .