Nvidia Senior Manager Storage Production Engineering
United States, California
945123687

14.04.2025

US, CA, Santa Clara

What You Will Be Doing:

Lead and mentor a team of highly skilled Storage Production Engineers, fostering a culture of innovation, collaboration, and technical excellence.
Oversee the design, deployment, and optimization of large-scale storage systems, including distributed storage, parallel file systems, and object storage platforms.
Partner with cross-functional teams to drive storage automation, monitoring, and predictive analytics to enhance reliability and efficiency.
Establish best practices for capacity planning, data lifecycle management, and cost optimization for storage infrastructure.
Implement high-availability and disaster recovery strategies, ensuring minimal downtime and data loss across mission-critical storage environments.
Drive the adoption of modern storage architectures, including NVMe over Fabrics (NVMe-oF), RDMA, high-speed interconnects, and cloud-based storage solutions.
Lead incident response and root cause analysis efforts, implementing proactive measures to enhance system stability and resilience.
Work closely with engineering, DevOps, and AI/ML teams to optimize data pipelines, storage access patterns, and workflow performance.
Advocate for continuous improvements in automation, operational efficiency, and performance tuning within the storage infrastructure.

What We Need To See:

BS/MS in Computer Science, Storage Systems, or a related technical field (or equivalent experience).
10+ years of overall experience in large-scale storage architecture, production engineering, or infrastructure roles.
5+ years of management experience, leading high-performing storage, infrastructure, or site reliability engineering teams.
Proven expertise in scalable storage architectures, including parallel file systems (Lustre, GPFS), distributed storage (Ceph, MinIO), and enterprise-scale object storage (S3, NetApp, Pure Storage, etc.).
Strong background in block, file, and object storage technologies, including their performance tuning, high-availability strategies, and data protection mechanisms.
Experience with storage networking protocols, such as NFS, SMB, iSCSI, Fibre Channel, RDMA, and NVMe-oF.
Hands-on experience with automation and infrastructure as code using Terraform, Ansible, Puppet, or similar tools.
Deep understanding of capacity planning, performance tuning, and troubleshooting large-scale storage systems.
Expertise in monitoring and observability tools like Prometheus, InfluxDB, and Elastic stack for storage infrastructure.

Ways To Stand Out From The Crowd:

Experience in designing and scaling storage infrastructure for AI/ML workloads and high-performance computing (HPC).
Familiarity with hybrid cloud and multi-cloud storage solutions, including AWS S3, Azure Blob, and Google Cloud Storage.
Proven ability to drive cross-functional initiatives, aligning storage strategies with broader business and engineering objectives.
Experience with software-defined storage (SDS), cloud-native storage, and Kubernetes-based storage orchestration.
Passion for mentoring engineers, fostering career growth, and creating a high-performance team culture.

You will also be eligible for equity.
